Hello,
Instead of creating a recipe from scratch, after reading the docs one more time I saw that I could just “modify an existing recipe”, nice! However, I got the following error:
TypeError: teach() got an unexpected keyword argument 'source'
My use case is that I want to add a PhraseMatcher and a custom stream. I followed the “Modifying existing recipes” section of the docs, but maybe I did something wrong… My recipe looks like this:
@recipe('ner.teach',
        dataset=prodigy.recipe_args['dataset'],
        mongo_uri=("MongoDB URL", "positional"),
        spacy_model=("spacy model name or path", "option", "sm", str),
        label=("Label to annotate", "option", "l", str),
        ents_path=("path to entity text file", "option", "p", str),
        sources=("Comma separated list of news sources", "option", "s", str))
def teach(dataset, mongo_uri, spacy_model="en", label='ORG', sources="", ents_path=""):
    """
    Annotate texts to train a NER model
    """
    nlp = spacy.load("en")
    if ents_path:
        ent_matcher = EntityMatcher(nlp, ents_path=ents_path)
        nlp.add_pipe(ent_matcher)
    # model = EntityRecognizer(nlp, label=label)
    db = MongoClient(mongo_uri)["data"]
    sources = sources.split(",") if sources else []
    stream = split_sentences(nlp, news_stream(db, s=sources))
    components = teach(dataset=dataset, spacy_model=nlp, source=stream, label=label)
    return components
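
One thing that might explain the error: my function is also named teach, so the teach(...) call on the second-to-last line may resolve to my own function (which takes sources, not source) rather than the built-in one. Here is a minimal sketch of that name-shadowing effect in plain Python, without Prodigy. The names builtin_teach and custom_teach are hypothetical stand-ins, not Prodigy's real code:

```python
# Hypothetical stand-in for the built-in recipe function (not Prodigy's real code).
def builtin_teach(dataset, spacy_model, source=None, label=None):
    return {"dataset": dataset, "stream": source, "label": label}

# Plays the role of importing the built-in function under the name `teach`.
teach = builtin_teach

# Defining a custom function with the same name rebinds `teach` in this module.
def teach(dataset, sources=""):
    # This call now resolves to the custom function itself,
    # which has a `sources` parameter but no `source` parameter.
    return teach(dataset=dataset, source=sources)

try:
    teach("my_dataset")
    error_message = None
except TypeError as err:
    error_message = str(err)

print(error_message)

# Giving the custom function a different name keeps both callables reachable.
def custom_teach(dataset, sources=""):
    return builtin_teach(dataset=dataset, spacy_model=None, source=sources)

components = custom_teach("my_dataset", sources="a,b")
print(components["stream"])
```

If this is what happens in my recipe, renaming my function (or importing the built-in one under an alias) should be enough, assuming the built-in function accepts the arguments I pass to it.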