Ah, good to know! My gut says it must have been the free account, but I'm happy to hear that it's working for you now!
Let us know if you have any feedback! We're researching how effective large language models are for annotation use cases, and we'd definitely appreciate hearing what did or didn't work well for you.
Found the reason: this is a chat-based model, which uses a different API endpoint.
"message": "This is a chat model and not supported in the v1/completions endpoint. Did you mean to use v1/chat/completions?",
That said, this is something we might want to support, or at least warn about more clearly. I'll make an internal ticket for it. Thanks for reporting!
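To illustrate the error: the two endpoints expect differently shaped requests. This is a minimal sketch (the function names are illustrative, not Prodigy internals, and the model names are just examples): completion models take a single `prompt` string, while chat models take a list of role-tagged `messages`.

```python
# Sketch of the payload difference between OpenAI's two endpoints.
# Sending a completions-style payload to a chat model triggers the
# "Did you mean to use v1/chat/completions?" error quoted above.

def build_completions_payload(prompt: str) -> dict:
    """Payload shape for the legacy v1/completions endpoint."""
    return {
        "model": "text-davinci-003",  # example completion model
        "prompt": prompt,
        "max_tokens": 256,
    }

def build_chat_payload(prompt: str) -> dict:
    """Payload shape for the v1/chat/completions endpoint."""
    return {
        "model": "gpt-3.5-turbo",  # example chat model
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }
```

So supporting chat models would mean wrapping the same prompt text in a `messages` list rather than passing it as `prompt`.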
Hey, I'm using Prodigy span categorization instead of NER because my entities cover larger spans of text. I've used spancat.manual to annotate 400 examples of my data manually and want to annotate 2000 examples in total. Are there any GPT-3 recipes for spancat? I'd also like to exclude the data I have already annotated. Thank you.
The current recipes don't support spancat specifically, but the spans that are added by the ner recipe can still be used to train a spancat system as long as the spans don't overlap. Have you tried that?
Note that you're free to adapt the prompt as you see fit, so you're also able to bend the prompt to accept longer spans of text.
That said, since you've already gotten some annotations, wouldn't it be helpful to consider the spans.correct recipe as well? This way you can leverage a spaCy model to pre-label the texts without spending OpenAI credits. Have you tried this?
I am planning to use spans.correct, but I thought GPT-3 would be better, hence I was looking for a way to implement it. I'll try spaCy first, and if that doesn't work I'll look at other options.
Thank you for the prompt response
Figured I'd mention it here to folks: last week Explosion released spacy-llm which makes it easy to integrate large language models in a spaCy pipeline. This should also make it much easier to re-use spaCy pipelines in Prodigy.
Just wanted to share some Prodigy x LLM annotation side projects I did!
The first blog post involves using an LLM-assisted textcat annotation interface for argument mining! Here, I explored how we can use language models to augment the annotation process on tasks that require some nuance and chains of reasoning. I tried different prompting "styles" such as standard zero-shot and chain-of-thought reasoning.
The second blog post attempts to ingest a large annotation guideline (a PDF document) and incorporate it into a prompt. Aside from Prodigy, I also used langchain. Here, I discussed how I fit a very long document into a prompt given a smaller token limit. In the future, I'd find it interesting to explore how well annotation guidelines actually "capture" the phenomena themselves.
Hope these blog posts inspire you to try out some LLM-enhanced annotation workflows!
I'm also looking for a way to combine LLM NER predictions with spaCy ner.correct default entities in the same recipe -- basically one task that takes an utterance and predicts the list of default spaCy entities, plus additional custom labels I have defined that are predicted by gpt-3.5 or gpt-4. The annotator would then review and correct.
It uses the spacy-llm project in Prodigy to help generate these kinds of review interfaces.
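The core of such a combined review task is a merge step. Here's a hypothetical sketch (names are illustrative, and the "keep the spaCy span on overlap" policy is just one possible choice): combine the spaCy model's entity spans with extra LLM-predicted spans before showing them to the annotator. Span dicts follow Prodigy's character-offset format.

```python
# Merge spaCy-predicted entity spans with LLM-predicted custom-label
# spans, keeping the spaCy span whenever the two overlap.

def merge_spans(spacy_spans: list, llm_spans: list) -> list:
    def overlaps(a: dict, b: dict) -> bool:
        return a["start"] < b["end"] and b["start"] < a["end"]

    merged = list(spacy_spans)  # spaCy entities take priority
    for span in llm_spans:
        if not any(overlaps(span, kept) for kept in merged):
            merged.append(span)
    return sorted(merged, key=lambda s: s["start"])

spacy_spans = [{"start": 0, "end": 9, "label": "ORG"}]
llm_spans = [
    {"start": 0, "end": 9, "label": "COMPANY"},    # overlaps ORG -> dropped
    {"start": 16, "end": 24, "label": "PRODUCT"},  # no overlap -> kept
]
merged = merge_spans(spacy_spans, llm_spans)
```

You could just as well flip the priority, or surface conflicting spans to the annotator instead of resolving them automatically.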
At the moment, and also for the long term, I would recommend using spacy-llm. The next version of Prodigy (v1.13, which shouldn't take too long) will support spacy-llm directly as an alternative to these OpenAI recipes. The spacy-llm project has a bunch of benefits, like support for more backends and a proper caching mechanism. It's also a project that's easier to update when, as we've seen recently, OpenAI decides to deprecate some of their main endpoints.
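For reference, spacy-llm pipelines are defined in a spaCy config file. This is only a rough sketch: the exact registry names (task and model versions, and whether labels are a list or a comma-separated string) depend on the spacy-llm release you have installed, so check the docs for your version.

```ini
# Sketch of a spacy-llm NER pipeline config (registry names may differ
# between spacy-llm versions).
[nlp]
lang = "en"
pipeline = ["llm"]

[components]

[components.llm]
factory = "llm"

[components.llm.task]
@llm_tasks = "spacy.NER.v2"
labels = ["PERSON", "ORG", "PRODUCT"]

[components.llm.model]
@llm_models = "spacy.GPT-3-5.v1"
```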
Let me know if the blog post doesn't help in the meantime!