Introducing recipes to bootstrap annotation via OpenAI GPT-3

We're looking forward to publishing more along these lines, and I'm sure there will be lots to tweak and refine in the prompts. We haven't carried out very thorough experiments with this, so please let us know what you find.

Below I've reproduced the text of my Twitter/Mastodon thread, explaining what this is and why :slightly_smiling_face:

We've been working on new Prodigy workflows that let you use the @OpenAI API to kickstart your annotations, via zero- or few-shot learning. We've just published the first recipe, for NER annotation :tada: . Here's what, why and how. :thread:

Let's say you want to do some 'traditional' NLP thing, like extracting information from text. The information you want to extract isn't on the public web; it's in this pile of documents you have sitting in front of you.

So how can models like GPT-3 help? One answer is zero- or few-shot learning: you prompt the model with something like "Annotate this text for these entities", and you append your text to the prompt. This works surprisingly well! It was one of the headline results of the original GPT-3 paper.
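To make that concrete, here's a rough sketch of what such a prompt could look like. This is a hypothetical template, not the exact prompt the recipe sends, and the label names are just examples:

```python
# Hypothetical sketch of a zero-shot NER prompt. This is NOT the
# exact prompt the Prodigy recipe sends; the wording and the
# requested output format are made up for illustration.
def build_ner_prompt(text: str, labels: list[str]) -> str:
    label_list = ", ".join(labels)
    return (
        f"From the text below, extract the following entities: {label_list}.\n"
        "Answer with one line per label, formatted as 'Label: span1, span2'.\n\n"
        f"Text:\n{text}"
    )

prompt = build_ner_prompt(
    "Roast the goose for two hours, basting with butter.",
    ["ingredient", "dish", "equipment"],
)
print(prompt)
```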

However, zero-shot classifiers really aren't good enough for most applications. The prompt just doesn't give you enough control over the model's behaviour.

Machine learning is basically programming by example: instead of specifying a system's behaviour with code, you (imperfectly) specify the desired behaviour with training data.

Well, zero-shot learning is like that, but without the training data. That does have some advantages: you don't have to tell it much about what you want it to do. But it's also pretty limiting. You can't tell it much about what you want it to do.

So, let's compromise. We'll pipe our data through the OpenAI API, prompting it to suggest entities for us. But instead of just shipping whatever it suggested, we're going to go through and correct its annotations. Then we'll save those out and train a much smaller supervised model.
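One fiddly part of this is mapping the model's textual answer back onto character offsets in the original text, so the suggestions can be rendered as spans in the annotation UI. A simplified, hypothetical version of that step (the recipe's actual parsing logic differs):

```python
# Hypothetical sketch: map entity strings suggested by the model
# back to character offsets, so they can be shown as spans. Only
# the first occurrence of each phrase is matched here.
def find_spans(text: str, suggestions: dict[str, list[str]]):
    spans = []
    lowered = text.lower()
    for label, phrases in suggestions.items():
        for phrase in phrases:
            start = lowered.find(phrase.lower())
            if start != -1:
                spans.append(
                    {"start": start, "end": start + len(phrase), "label": label}
                )
    return spans

spans = find_spans(
    "Whisk the eggs in a copper bowl.",
    {"ingredient": ["eggs"], "equipment": ["copper bowl"]},
)
print(spans)
# [{'start': 10, 'end': 14, 'label': 'ingredient'},
#  {'start': 20, 'end': 31, 'label': 'equipment'}]
```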

This workflow looks pretty promising from initial testing. The model provides useful suggestions for categories like "ingredient", "dish" and "equipment" just from the labels, with no examples. And the precision isn't bad โ€” I was impressed that it avoided marking "Goose" here.

I especially like this zero-shot learning workflow because it's a great example of what we've always set out to achieve with Prodigy. Two distinct features of Prodigy are its scriptability and the ease with which you can scale down to a single-person workflow.

Modern neural networks are very sample efficient, because they use transfer learning to acquire most of their knowledge. You just need enough examples to define your problem. If annotation is mostly about problem definition, iteration is much more important than scaling.

The key to iteration speed is letting a small group of people โ€” ideally just you! โ€” annotate faster. That's where the scriptability comes in. Every problem is different, and we can't guess exactly what tool assistance or interface will be best. So we let you control that.

We didn't have to make any changes to Prodigy itself for this workflow โ€” everything happens in the "recipe" script. You can build other things at least this complex for yourself, or you can start from one of our scripts and modify it according to your requirements.

If you don't have Prodigy, you can get a copy here. We sell Prodigy in a very old-school way, with a one-off fee for software you run yourself. There's no free download, but we're happy to issue refunds, and we can host trials for companies.


Hi, I'm trying to follow this command, but I'm getting the following error:

Traceback (most recent call last):
  File "/home/poa3/anaconda3/envs/prodigy/lib/python3.10/", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/poa3/anaconda3/envs/prodigy/lib/python3.10/", line 86, in _run_code
    exec(code, run_globals)
  File "/home/poa3/anaconda3/envs/prodigy/lib/python3.10/site-packages/prodigy/", line 62, in <module>
    controller = recipe(*args, use_plac=True)
  File "cython_src/prodigy/core.pyx", line 384, in prodigy.core.recipe.recipe_decorator.recipe_proxy
  File "cython_src/prodigy/core.pyx", line 73, in prodigy.core.Controller.from_components
  File "cython_src/prodigy/core.pyx", line 170, in prodigy.core.Controller.__init__
  File "cython_src/prodigy/components/feeds.pyx", line 104, in prodigy.components.feeds.Feed.__init__
  File "cython_src/prodigy/components/feeds.pyx", line 150, in prodigy.components.feeds.Feed._init_stream
  File "cython_src/prodigy/components/stream.pyx", line 107, in
  File "cython_src/prodigy/components/stream.pyx", line 58, in
  File "/home/poa3/", line 191, in format_suggestions
    for example in stream:
  File "cython_src/prodigy/components/preprocess.pyx", line 165, in add_tokens
  File "/home/poa3/anaconda3/envs/prodigy/lib/python3.10/site-packages/spacy/", line 1545, in pipe
    for doc in docs:
  File "/home/poa3/anaconda3/envs/prodigy/lib/python3.10/site-packages/spacy/", line 1589, in pipe
    for doc in docs:
  File "/home/poa3/anaconda3/envs/prodigy/lib/python3.10/site-packages/spacy/", line 1586, in <genexpr>
    docs = (self._ensure_doc(text) for text in texts)
  File "/home/poa3/anaconda3/envs/prodigy/lib/python3.10/site-packages/spacy/", line 1535, in <genexpr>
    docs_with_contexts = (
  File "cython_src/prodigy/components/preprocess.pyx", line 158, in genexpr
  File "/home/poa3/", line 169, in stream_suggestions
    prompts = [
  File "/home/poa3/", line 171, in <listcomp>
    eg["text"], labels=self.labels, examples=self.examples
TypeError: list indices must be integers or slices, not str

My examples file is formatted the same way as the one on GitHub:

  text: "Current symptoms of dyspnea are consistent with NYHA Class II-III and also has occasional exertional chest pressure"

      - NYHA Class II-III

Interesting. Could you share the command that you ran? Maybe with an example of the data that you gave? I just ran openai.ner.fetch locally and it didn't have this error message, but it can be that OpenAI returns a response that's unexpected. I can try to reproduce it locally if I have your example though!
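For what it's worth, that particular TypeError usually means `eg` was a list where a dict was expected. One hypothetical way to end up there is an input file that contains a single JSON array instead of JSONL with one object per line; the snippet below only reconstructs the error, it's not taken from the recipe:

```python
import json

# `eg["text"]` failing with "list indices must be integers or
# slices, not str" suggests `eg` was a list, not a dict. A JSON
# array on one line (instead of JSONL) produces exactly that shape.
jsonl_line = '{"text": "Roast the goose"}'
json_array = '[{"text": "Roast the goose"}]'

eg = json.loads(jsonl_line)
print(eg["text"])  # prints "Roast the goose"

eg = json.loads(json_array)
try:
    eg["text"]
except TypeError as err:
    print(err)  # prints "list indices must be integers or slices, not str"
```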

Hello! I am encountering a rate limit issue. I have seen some content on the OpenAI site that suggests using exponential backoff, or requesting a rate limit increase. I tried using the full dataset in the repo, and also a smaller dataset. I also tried adjusting the batch size.

My command:

PRODIGY_ALLOWED_SESSIONS=cheyanne prodigy ner.openai.correct my_data ./data/reddit_r_cooking_sample.jsonl ingredient -F ./recipes/ -b 2

And this is the message I get in return:

Retrying call (retries left: 10, timeout: 20s). Previous call returned: 429 (Too Many Requests)

Any thoughts?


We've hit this issue ourselves as well at times, and it seems to be something that we don't have too much control over. OpenAI can rate-limit however they see fit, and it is a very busy API these days. As is explained here, a 429 might also indicate that their backends are swamped.

It surprises me that you're seeing this issue with ner.openai.correct though; I usually experience this with the ner.openai.fetch recipe (which can send way more traffic).

One thing you might try, though: you could adapt the retry time in the code directly. It might still take a while, but at least the error should happen less often.
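For reference, here's a hedged sketch of what exponential backoff with jitter could look like. None of these names come from the recipe; `flaky` just simulates an API call that returns 429 a couple of times:

```python
import random
import time

# Hedged sketch of exponential backoff with jitter around a call
# that may fail with a rate-limit error. `call_api` stands in for
# the actual HTTP request.
def with_backoff(call_api, max_retries=10, base_delay=1.0, max_delay=60.0):
    for attempt in range(max_retries):
        try:
            return call_api()
        except RuntimeError:  # e.g. raised on a 429 response
            if attempt == max_retries - 1:
                raise
            delay = min(max_delay, base_delay * 2 ** attempt)
            time.sleep(delay + random.uniform(0, delay))

calls = {"n": 0}

def flaky():
    """Simulated API call: fails twice with a 429, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("429 Too Many Requests")
    return "ok"

print(with_backoff(flaky, base_delay=0.01, max_delay=0.05))  # prints "ok"
```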

Another thing to keep in mind is that the time-out depends a bit on how you use their API. Their docs have a page that explains that new accounts have different rate limits compared to accounts that have been around for a while.

Some of these issues might go away after you've had an account for 48 hours. Might that be the case for you?

Thank you @koaning! So I did a few things, and this seemed to work:

  • Adjusted the retry code.
  • Added a sleep function to the code.
  • I was using a free account (older than 48 hrs) and hitting the rate limit, so I am now using a paid account. This worked.



Ah, good to know. My gut says it was probably the free account, but I'm happy to hear that it's working for you now :slightly_smiling_face: !

Let us know if you have any feedback! We're researching the effectiveness of large language models for annotation use cases, and we'd certainly appreciate anything you can share. Let us know if something did or didn't work well for you!


@cheyanneb We've decided to add a --resume flag to these *.openai.fetch recipes in the upcoming Prodigy v1.12 release. If you want to be kept in the loop for an alpha version: let me know :smile:.

Would love to be kept in the loop for an alpha release!


Any plans to integrate this with text classification, particularly textcat.teach?

We already support textcat recipes via textcat.openai.correct and textcat.openai.fetch. Do these not cover your use case?

Note that you can see a preview for these here.


I was not able to call this model in the recipes:

model: str = "gpt-3.5-turbo"

Should this be available? Thanks!

That's ... strange. Could you share the error message?

I've been able to reproduce this error message:

requests.exceptions.HTTPError: 404 Client Error: Not Found for url:

Found the reason: this model is a chat-based model, which uses a different API endpoint.

{
	"error": {
		"message": "This is a chat model and not supported in the v1/completions endpoint. Did you mean to use v1/chat/completions?",
		"type": "invalid_request_error",
		"param": "model",
		"code": null
	}
}
That said, it is something we might want to support, or at least throw a better warning for. I'll make an internal ticket for this. Thanks for reporting!
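For anyone hitting this: the two endpoints expect differently shaped request bodies. The completions endpoint takes a flat `prompt`, while the chat endpoint takes a list of `messages`. A minimal sketch of both payloads (the parameter values are just illustrative):

```python
# The completions endpoint takes a flat `prompt`; the chat endpoint
# takes a list of `messages`. The URLs are OpenAI's real endpoints;
# model names and max_tokens values are illustrative.
COMPLETIONS_URL = "https://api.openai.com/v1/completions"
CHAT_URL = "https://api.openai.com/v1/chat/completions"

def completions_payload(prompt: str) -> dict:
    return {"model": "text-davinci-003", "prompt": prompt, "max_tokens": 256}

def chat_payload(prompt: str) -> dict:
    return {
        "model": "gpt-3.5-turbo",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }

print(sorted(chat_payload("hi")))  # prints "['max_tokens', 'messages', 'model']"
```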


Curious if the issue above was fixed. Thanks!

Hey, I am using Prodigy's span categorization instead of NER because my entities cover larger spans of text. I have used spancat.manual to annotate 400 examples of my data manually and want to annotate 2000 examples in total. Are there any GPT-3 recipes for spancat? I also want to exclude the data I have already annotated. Thank you.

The current recipes don't support spancat specifically, but the spans that are added by the ner recipe can still be used to train a spancat system as long as the spans don't overlap. Have you tried that?
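To illustrate: a saved NER task and a spancat task carry spans in the same `"spans"` format, so the check that matters before reusing NER output for spancat training is simply whether any spans overlap. A small sketch with a made-up record:

```python
# A Prodigy NER task already carries spans in the format a spancat
# workflow uses; the record below is made up. The one thing to
# verify before training spancat on NER output is that no two
# spans overlap.
ner_task = {
    "text": "Simmer the broth in a stock pot",
    "spans": [
        {"start": 11, "end": 16, "label": "ingredient"},
        {"start": 22, "end": 31, "label": "equipment"},
    ],
}

def spans_overlap(spans):
    ordered = sorted(spans, key=lambda s: s["start"])
    return any(a["end"] > b["start"] for a, b in zip(ordered, ordered[1:]))

print(spans_overlap(ner_task["spans"]))  # prints "False"
```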

Note that you're free to adapt the prompt as you see fit, so you're also able to bend the prompt to accept longer spans of text.

That said, since you've already gotten some annotations, wouldn't it be helpful to consider the spans.correct recipe as well? This way you can leverage a spaCy model to pre-label the texts without spending OpenAI credits. Have you tried this?

@cheyanneb to reply to your question: we are working on a refactor for this now, but it's not ready just yet. We're exploring supporting not just this GPT endpoint, but also other providers.
