Hi
I have a large dataset of 500,000 patterns that I want to add to an EntityRuler with add_patterns. I have tried disabling the other pipes as suggested on the website, but it still doesn't fix the issue. Please find my code below; any help would be appreciated.
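import spacy
from spacy.pipeline import EntityRuler

# df is a pandas DataFrame with "feature code" and "asciiname" columns (loaded elsewhere)
nlp = spacy.load("en_core_web_sm")
# Disable the other pipes while the patterns are added
disabled = nlp.disable_pipes("tagger", "parser", "ner")
ruler = EntityRuler(nlp, overwrite_ents=True)
reader = df.to_dict("records")
# One add_patterns call per row (500,000 calls in total)
for row in reader:
    ruler.add_patterns([{"label": row["feature code"], "pattern": row["asciiname"]}])
disabled.restore()
nlp.add_pipe(ruler, before="ner")
with open("city.txt", encoding="latin-1") as f:
    text = f.read()
doc = nlp(text)
print([(ent.text, ent.label_) for ent in doc.ents])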
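One variation that may be worth comparing against the per-row loop above is to build the whole pattern list first and pass it to add_patterns in a single call. This is only a minimal sketch, not a confirmed fix. The helper names build_patterns and add_ruler and the sample rows are made up for illustration. Skipping duplicate and non-string rows is an added assumption, and add_ruler uses the same spaCy v2-style calls as the code above (it is defined but not run here).

```python
def build_patterns(records, label_key="feature code", text_key="asciiname"):
    """Turn row dicts into EntityRuler pattern dicts, skipping blanks and duplicates."""
    seen = set()
    patterns = []
    for row in records:
        label = row.get(label_key)
        text = row.get(text_key)
        # Skip rows whose label or name is missing or not a string (e.g. NaN from pandas).
        if not isinstance(label, str) or not isinstance(text, str) or not text.strip():
            continue
        key = (label, text)
        if key in seen:
            continue
        seen.add(key)
        patterns.append({"label": label, "pattern": text})
    return patterns


def add_ruler(nlp, patterns):
    """Add all patterns in one call (spaCy v2-style API, as in the code above)."""
    from spacy.pipeline import EntityRuler  # imported lazily; needs spaCy installed

    ruler = EntityRuler(nlp, overwrite_ents=True)
    with nlp.disable_pipes("tagger", "parser", "ner"):
        ruler.add_patterns(patterns)  # one batched call instead of one per row
    nlp.add_pipe(ruler, before="ner")
    return ruler


# Tiny made-up rows standing in for df.to_dict("records")
sample_records = [
    {"feature code": "PPL", "asciiname": "Springfield"},
    {"feature code": "PPL", "asciiname": "Springfield"},  # duplicate row
    {"feature code": "ADM1", "asciiname": "Ohio"},
    {"feature code": "PPL", "asciiname": float("nan")},  # missing name
]
patterns = build_patterns(sample_records)
print(patterns)
```

With the real data this would be called as add_ruler(nlp, build_patterns(df.to_dict("records"))). Whether it actually helps depends on the spaCy version in use, so it would need to be timed against the loop.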
Thank you so much for your help.