Introducing recipes to bootstrap annotation via OpenAI GPT3

I am planning to use spans.correct but I thought GPT3 would be better. Hence, was looking for a way to implement it. Will try spacy first then if it does not work then will look at other options.
Thank you for the prompt response :slight_smile:

Figured I'd mention it here to folks: last week Explosion released spacy-llm which makes it easy to integrate large language models in a spaCy pipeline. This should also make it much easier to re-use spaCy pipelines in Prodigy.

Feel free to check it out here:


Just wanted to share some Prodigy x LLM annotation side projects I did!

  • The first blog post involves using an LLM-assisted textcat annotation interface for argument mining! Here, I explored how we can use language models to augment the annotation process on tasks that require some nuance and chains of reasoning. I tried different prompting "styles" such as standard zero-shot and chain-of-thought reasoning.

  • The second blog post attempts to ingest a large annotation guideline (a PDF document) and incorporate it into a prompt. Aside from Prodigy, I also used langchain. Here, I discussed a method on how I was able to fit a very long document given a smaller token limit. In the future, I find it interesting to explore how well annotation guidelines actually "capture" the phenomena itself.

Hope these blog posts inspire you to try out some LLM-enhanced annotation workflows!


Update May 19: We've recently released v1.12 alpha, which includes LLM components like OpenAI recipes into Prodigy. Let us know if you have any feedback!

There is also a preview docs site:

We're excited to see what you can build with them :rocket:

I'm unable to reference gpt-4 in and The latest model I can call is legacy text-davinci-003. Can I reference gpt-4 or gpt-3.5? Or something we can call here: GitHub - explosion/spacy-llm: ๐Ÿฆ™ Integrating LLMs into structured NLP pipelines?

def textcat_openai_correct(
    dataset: str,
    filepath: Path,
    labels: List[str],
    lang: str = "en",
    model: str = "gpt-4",
    batch_size: int = 10,
    segment: bool = False,
    prompt_path: Path = DEFAULT_PROMPT_PATH,
    examples_path: Optional[Path] = None,
    max_examples: int = 2,
    exclusive_classes: bool = False,
    verbose: bool = False,

I'm also looking for a way to combine llm NER predictions with spaCy ner.correct default entities in the same recipe -- basically one task that takes an utterance and predicts the list of default spaCy entities with additional custom labelsI have defined that are predicted by gpt-3.5 or gpt-4. The annotator would then review and correct.

@cheyanneb have you seen this blogpost?

It uses the spacy-llm project in Prodigy to help generate these kinds of review interfaces.

At the moment, and also for the long term, I would recommend using spacy-llm. The next version of Prodigy (v1.13, which shouldn't take too long) will support spacy-llm directly as an alternative for these OpenAI recipes. The spacy-llm project has a bunch of benefits, like support for more backends as well as a proper caching mechanism. But it's also a project that's easier to update when, as shown recently, OpenAI decides to deprecate some of their main endpoints.

Let me know if the blogpost does not help in the meantime!