Prodigy 1.12.0rc2 release candidate available for download!

Hi Andrey,

You can definitely achieve similar results with a custom NER correction recipe and the new spacy-llm library. spacy-llm lets you integrate an LLM (hosted or local) as a spaCy component. It takes care of prompt generation and response parsing, and stores the LLM's annotation results on the Doc object just like any other spaCy pipeline component.
To use an LLM with spaCy, you'll need to start by creating a configuration file that tells spacy-llm how to construct a prompt for your task. Please see the spacy-llm docs for details, but for NER it could look like this:

[nlp]
lang = "en"
pipeline = ["llm"]

[components]

[components.llm]
factory = "llm"

[components.llm.task]
@llm_tasks = "spacy.NER.v2"
labels = ["PERSON", "ORGANISATION", "LOCATION"]

[components.llm.backend]
@llm_backends = "spacy.Dolly_HF.v1"
# For better performance, use databricks/dolly-v2-12b instead
model = "databricks/dolly-v2-3b"
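
Note that spacy.Dolly_HF.v1 downloads and runs the model locally via Hugging Face, so you'll also need the relevant dependencies (e.g. transformers and torch) installed alongside spacy-llm.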

Then, from your custom recipe you could assemble the nlp pipeline like so:

from spacy_llm.util import assemble

# Assemble a spaCy pipeline from the config
nlp = assemble("config.cfg")

# Use this pipeline as you would normally
doc = nlp("Jack lives in London and works for Apple.")
print(doc.ents)  # e.g. (Jack, London, Apple)

Once you have processed your examples with the LLM-powered pipeline, you can feed the predictions into the ner_manual interface so annotators can correct the LLM annotations, just like it's done in ner.openai.correct.
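
To make this concrete, here's a minimal sketch of what such a custom recipe could look like. The recipe name ner.llm.correct, the JSONL source format and the hardcoded label set are just assumptions for illustration, so adapt them to your setup:

import prodigy
from prodigy.components.loaders import JSONL
from prodigy.components.preprocess import add_tokens
from spacy_llm.util import assemble

@prodigy.recipe(
    "ner.llm.correct",  # hypothetical name, not a built-in recipe
    dataset=("Dataset to save annotations to", "positional", None, str),
    source=("Path to a JSONL file with a 'text' key", "positional", None, str),
    config_path=("Path to the spacy-llm config", "positional", None, str),
)
def ner_llm_correct(dataset: str, source: str, config_path: str):
    # Assemble the LLM-powered pipeline from the config shown above
    nlp = assemble(config_path)
    labels = ["PERSON", "ORGANISATION", "LOCATION"]

    def add_llm_spans(stream):
        for eg in stream:
            doc = nlp(eg["text"])
            # Pre-fill the spans with the LLM's predictions so the
            # annotator only has to correct them
            eg["spans"] = [
                {"start": ent.start_char, "end": ent.end_char, "label": ent.label_}
                for ent in doc.ents
            ]
            yield eg

    stream = add_llm_spans(JSONL(source))
    # add_tokens aligns the character-offset spans to tokens, which the
    # ner_manual interface needs for highlighting
    stream = add_tokens(nlp, stream, skip=True)

    return {
        "dataset": dataset,
        "view_id": "ner_manual",
        "stream": stream,
        "config": {"labels": labels},
    }

You could then run it like any other custom recipe, e.g. prodigy ner.llm.correct my_dataset ./data.jsonl ./config.cfg -F recipe.py.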

In the very near future we are going to ship built-in spacy-llm recipes in Prodigy, but for now the same results can be achieved with just a little bit of custom scripting, thanks to spacy-llm.
Let us know how it goes and if you need any assistance!