Hi, unfortunately I'm having problems training a custom NER model. Here's what I'm seeing during training, and I'm pretty sure it's wrong (no scores and no per-label P/R/F details):
Recipe:
python -m prodigy train --ner correct_PER_MAIL_NG_data --base-model custom_model_email01 --label-stats
If I change the base model to the standard spaCy model, it works:
Recipe:
python -m prodigy train --ner correct_PER_MAIL_NG_data --base-model en_core_web_md --label-stats
I have created the custom model as follows:
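import spacy

# Start from the pretrained pipeline; add_pipe appends the ruler at the end, after the existing "ner" component
nlp = spacy.load("en_core_web_md")
ruler = nlp.add_pipe("entity_ruler")
# Label any token that looks like an email address as EMAIL
patterns = [{"label": "EMAIL", "pattern": [{"LIKE_EMAIL": True}]}]
ruler.add_patterns(patterns)
nlp.to_disk("custom_model_email01")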
And it also seems to work:
doc = nlp("Apple is opening XXX.XXX@uni.com office.")
print([(ent.text, ent.label_) for ent in doc.ents])
[('Apple', 'ORG'), ('XXX.XXX@uni.com', 'EMAIL')]
I can use the custom model to annotate and create a dataset with ner.correct:
Recipe:
python -m prodigy ner.correct correct_PER_MAIL_NG_data custom_model_email01 ./NG_data_meta.jsonl --label PERSON,EMAIL --update
But something seems to go wrong during the training process. Any idea what I am doing wrong?
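In case it helps to rule out the dataset itself, here is a rough sketch of how the exported annotations could be checked for label counts. It assumes the usual Prodigy JSONL layout (one task per line, with "spans" and "answer" keys); the helper name label_counts and the two sample records are made up for illustration and are not taken from my data.

```python
import json
from collections import Counter


def label_counts(jsonl_lines):
    """Count span labels per answer value ("accept", "reject", "ignore")."""
    counts = Counter()
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        task = json.loads(line)
        answer = task.get("answer", "missing")
        # "spans" can be absent or null when nothing was annotated
        for span in task.get("spans") or []:
            counts[(answer, span.get("label"))] += 1
    return counts


# Two made-up records standing in for the real export (assumption)
sample = [
    '{"text": "Mail XXX.XXX@uni.com", "spans": [{"start": 5, "end": 20, "label": "EMAIL"}], "answer": "accept"}',
    '{"text": "No entities here", "spans": [], "answer": "accept"}',
]

for (answer, label), n in sorted(label_counts(sample).items()):
    print(answer, label, n)
```

For the real data, the same function can be given an open file handle for correct_PER_MAIL_NG_data.jsonl instead of the sample list. If PERSON or EMAIL has few or no accepted spans, that would at least narrow down where the problem is.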
Thanks, Alfred
============================== ✨ Prodigy Stats ==============================
Version 1.11.2
Location C:\Users\xxx\Miniconda3\lib\site-packages\prodigy
Prodigy Home C:\Users\xxx\.prodigy
Platform Windows-10-10.0.18362-SP0
Python Version 3.8.3
Database Name SQLite
Database Id sqlite
Total Datasets 5
Total Sessions 16
correct_PER_MAIL_NG_data.jsonl (87.7 KB)