Introducing recipes to bootstrap annotation via OpenAI GPT-3

Ah, good to know. My gut says it must have been the fact that you were on a free account, but I'm happy to hear that it's working for you now :slightly_smiling_face:!

Let us know if you have any feedback! We're researching the effectiveness of large language models for annotation use cases, and we'd certainly appreciate any feedback you might have. Let us know if something did or didn't work out well for you!


@cheyanneb We've decided to add a --resume flag to these *.openai.fetch recipes in the upcoming Prodigy v1.12 release. If you want to be kept in the loop for an alpha version: let me know :smile:.

Would love to be kept in the loop for an alpha release!


Any plans to integrate this with text classification, particularly textcat.teach?

We already support textcat recipes via textcat.openai.correct and textcat.openai.fetch. Do these not cover your use case?

Note that you can see a preview for these here:


I was not able to call this model in the recipes:

model: str = "gpt-3.5-turbo"

Should this be available? Thanks!

That's ... strange. Could you share the error message?

I've been able to reproduce this error message:

requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://api.openai.com/v1/completions

Found the reason: this model is a chat-based model, which uses a different API endpoint.

{
	"error": {
		"message": "This is a chat model and not supported in the v1/completions endpoint. Did you mean to use v1/chat/completions?",
		"type": "invalid_request_error",
		"param": "model",
		"code": null
	}
}

That said, it is something we might want to support, or at least raise a clearer warning for. I'll make an internal ticket for this. Thanks for reporting!
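To make the distinction concrete: chat models such as gpt-3.5-turbo must be sent to the v1/chat/completions endpoint with a list of messages, while legacy completion models such as text-davinci-003 take a bare prompt at v1/completions. Below is a minimal sketch of routing by model name; the set of chat-model prefixes is an assumption based on the error above, so check OpenAI's API docs for the authoritative list.

```python
# Sketch: pick the right OpenAI endpoint and payload shape for a model.
# CHAT_MODEL_PREFIXES is an assumption for illustration, not an exhaustive list.
CHAT_MODEL_PREFIXES = ("gpt-3.5-turbo", "gpt-4")


def build_request(model: str, prompt: str) -> tuple[str, dict]:
    """Return (endpoint_url, json_payload) for the given model."""
    if model.startswith(CHAT_MODEL_PREFIXES):
        # Chat models expect a list of messages, not a bare prompt.
        return (
            "https://api.openai.com/v1/chat/completions",
            {"model": model, "messages": [{"role": "user", "content": prompt}]},
        )
    # Legacy completion models (e.g. text-davinci-003) take a prompt string.
    return (
        "https://api.openai.com/v1/completions",
        {"model": model, "prompt": prompt},
    )
```

Sending a chat model's name to the legacy endpoint is exactly what produces the 404/invalid_request_error shown above.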


Curious if the issue above was fixed. Thanks!

Hey, I am using Prodigy span categorization instead of NER because my entities cover longer spans of text. I have used spancat.manual to annotate 400 examples of my data manually and want to annotate 2,000 examples in total. Are there any GPT-3 recipes for spancat? I also want to exclude the data I have already annotated. Thank you.

The current recipes don't support spancat specifically, but the spans that are added by the ner recipe can still be used to train a spancat system as long as the spans don't overlap. Have you tried that?
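The "no overlap" condition is easy to verify up front. Here's a minimal sketch of the check over plain (start, end) character offsets; the "start"/"end" dict keys follow Prodigy-style span annotations, but treat the exact schema as an assumption for illustration.

```python
def spans_overlap(spans):
    """Return True if any two spans in the list overlap.

    Spans are dicts with "start"/"end" character offsets, as in
    Prodigy-style annotations (exact schema assumed for illustration).
    """
    ordered = sorted(spans, key=lambda s: (s["start"], s["end"]))
    for prev, curr in zip(ordered, ordered[1:]):
        # After sorting, an overlap means the next span starts
        # before the previous one has ended.
        if curr["start"] < prev["end"]:
            return True
    return False
```

If this returns False for all your examples, the NER-style annotations should be safe to reuse for training a spancat component.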

Note that you're free to adapt the prompt as you see fit, so you're also able to bend the prompt to accept longer spans of text.

That said, since you've already gotten some annotations, wouldn't it be helpful to consider the spans.correct recipe as well? This way you can leverage a spaCy model to pre-label the texts without spending OpenAI credits. Have you tried this?

@cheyanneb to reply to your question: we are working on a refactor for this now, but it's not ready just yet. We're exploring supporting not just this GPT endpoint, but also other providers.


I am planning to use spans.correct, but I thought GPT-3 would be better, hence I was looking for a way to implement it. I will try spaCy first, and if that does not work I will look at other options.
Thank you for the prompt response :slight_smile:

Figured I'd mention it here to folks: last week Explosion released spacy-llm which makes it easy to integrate large language models in a spaCy pipeline. This should also make it much easier to re-use spaCy pipelines in Prodigy.

Feel free to check it out here:


Just wanted to share some Prodigy x LLM annotation side projects I did!

  • The first blog post involves using an LLM-assisted textcat annotation interface for argument mining! Here, I explored how we can use language models to augment the annotation process on tasks that require some nuance and chains of reasoning. I tried different prompting "styles" such as standard zero-shot and chain-of-thought reasoning.

  • The second blog post attempts to ingest a large annotation guideline (a PDF document) and incorporate it into a prompt. Aside from Prodigy, I also used langchain. Here, I discussed how I managed to fit a very long document within a smaller token limit. In the future, it would be interesting to explore how well annotation guidelines actually "capture" the phenomenon itself.

Hope these blog posts inspire you to try out some LLM-enhanced annotation workflows!


Update May 19: We've recently released the v1.12 alpha, which brings LLM components such as the OpenAI recipes into Prodigy. Let us know if you have any feedback!

There is also a preview docs site:

We're excited to see what you can build with them :rocket:

I'm unable to reference gpt-4 in openai_textcat.py and openai_ner.py. The latest model I can call is the legacy text-davinci-003. Can I reference gpt-4 or gpt-3.5? Or something we can call from here: GitHub - explosion/spacy-llm: 🦙 Integrating LLMs into structured NLP pipelines?

from pathlib import Path
from typing import List, Optional

def textcat_openai_correct(
    dataset: str,
    filepath: Path,
    labels: List[str],
    lang: str = "en",
    model: str = "gpt-4",
    batch_size: int = 10,
    segment: bool = False,
    prompt_path: Path = DEFAULT_PROMPT_PATH,
    examples_path: Optional[Path] = None,
    max_examples: int = 2,
    exclusive_classes: bool = False,
    verbose: bool = False,
):

I'm also looking for a way to combine LLM NER predictions with spaCy ner.correct default entities in the same recipe -- basically one task that takes an utterance and predicts the list of default spaCy entities along with additional custom labels I have defined that are predicted by gpt-3.5 or gpt-4. The annotator would then review and correct.

@cheyanneb have you seen this blogpost?

It uses the spacy-llm project in Prodigy to help generate these kinds of review interfaces.

At the moment, and also for the long term, I would recommend using spacy-llm. The next version of Prodigy (v1.13, which shouldn't take too long) will support spacy-llm directly as an alternative for these OpenAI recipes. The spacy-llm project has a bunch of benefits, like support for more backends as well as a proper caching mechanism. But it's also a project that's easier to update when, as shown recently, OpenAI decides to deprecate some of their main endpoints.
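For a sense of what spacy-llm looks like in practice: the pipeline is driven by a spaCy config file. The fragment below is an illustrative sketch only; the registered task and model names (e.g. "spacy.NER.v2", "spacy.GPT-3-5.v1") and the exact section layout vary between spacy-llm versions, so please check the spacy-llm documentation for the registry entries that match your installed version.

```ini
[nlp]
lang = "en"
pipeline = ["llm"]

[components.llm]
factory = "llm"

[components.llm.task]
@llm_tasks = "spacy.NER.v2"
labels = ["PERSON", "ORG", "PRODUCT"]

[components.llm.model]
@llm_models = "spacy.GPT-3-5.v1"
```

Because the task and model are just config entries, swapping OpenAI for another backend is a one-line change rather than a new recipe.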

Let me know if the blogpost does not help in the meantime!