Hi Prodigy Team,
I am using spaCy's DBpedia Spotlight pipeline for NER and would like to run a Prodigy ner.correct session based on that model. I know you can specify the component within the ner.correct recipe, but since the DBpedia Spotlight pipeline isn't included by default, it isn't recognized as an option even when it is installed. Is there a simpler way to call this pipeline within the standard ner.correct recipe, or is it necessary to write a custom recipe to support it?
Thanks for your help!
Hi @danalynn, welcome to Prodigy!
It should be possible by loading the model, saving it to disk, and pointing the ner.correct recipe to that path. Something like this:
import spacy
nlp = spacy.load("en_core_web_lg")
nlp.add_pipe("dbpedia_spotlight")
print(nlp.pipe_names) # ['tok2vec', 'tagger', 'parser', 'ner', 'attribute_ruler', 'lemmatizer', 'dbpedia_spotlight']
nlp.to_disk("pipe_with_dbpedia") # c.f. https://spacy.io/api/language#to_disk
Afterwards, you can pass that path as the spacy_model positional argument of ner.correct. Something like this:
prodigy ner.correct my-dataset pipe_with_dbpedia ...
Assuming that the spacy-dbpedia-spotlight component sets doc.ents, it should work out of the box.
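If you want to verify that assumption before starting the annotation session, here is a minimal sketch that loads the saved pipeline and counts the labels found in doc.ents. The helper names label_counts and check_saved_pipeline are made up for illustration; the path is the one used in the to_disk call above. It assumes the spacy-dbpedia-spotlight package is installed in the same environment, since spaCy needs it to recreate the component when loading from disk.

```python
from collections import Counter


def label_counts(docs):
    """Count the entity labels found in doc.ents across processed docs."""
    counts = Counter()
    for doc in docs:
        for ent in doc.ents:
            counts[ent.label_] += 1
    return counts


def check_saved_pipeline(path, texts):
    """Load a pipeline saved with nlp.to_disk and summarize its entity labels.

    Hypothetical helper: `path` is assumed to be the directory written
    earlier (e.g. "pipe_with_dbpedia").
    """
    # Imported here so label_counts stays usable without spaCy installed.
    import spacy

    nlp = spacy.load(path)
    return label_counts(nlp.pipe(texts))
```

Calling check_saved_pipeline("pipe_with_dbpedia", ["Google LLC is an American multinational technology company."]) should return a non-empty Counter if the pipeline sets doc.ents. The labels it reports are also the ones you would likely want to pass to the recipe's --label option. As far as I know the component queries a DBpedia Spotlight web service by default, so this check may need network access.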