Hello, I have a dataset of 50k designations, converted to Prodigy's spans format. I ran batch-train for 10 iterations and saved the model to Titles_Model. The command, some test predictions and the training output are below.
python -m prodigy ner.batch-train titles_dataset en_core_web_sm --output Titles_Model --label TITLE --n-iter 10 --eval-split 0.2 --dropout 0.2 --unsegmented
import spacy
nlp1 = spacy.load('Titles_Model')
doc = nlp1("My client is looking for Assistant General Manager in well established company")
entities = [(ent.text, ent.label_) for ent in doc.ents]
print(entities)
[('Assistant General Manager', 'TITLE')] -- correct
doc = nlp1("My client is looking for assistant General Manager in well established company")
[('General Manager', 'TITLE')] -- expecting assistant General Manager
doc = nlp1("looking for a Childcare Assessor SYM Tutor in Early Years in a nursery")
[('Childcare Assessor SYM Tutor in Early Years', 'TITLE')] -- correct
doc = nlp1("looking for a Childcare Assessor SYM Tutor in early years in a nursery")
[('Childcare Assessor SYM Tutor', 'TITLE')] -- expecting Childcare Assessor SYM Tutor in early years
Is there anything I am missing here? How can I make the matching case-insensitive?
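One workaround I am considering is to add lowercased copies of the annotated examples to the dataset before training, so the model also sees titles without capitalisation. Below is a minimal sketch of that idea, not anything taken from the Prodigy docs. It assumes each example is a dict with only "text" and character-offset "spans" (no "tokens"), and the helper name add_lowercase_variants is my own. The sample offsets are made up for illustration.

```python
import copy


def add_lowercase_variants(examples):
    """Return the examples plus a lowercased copy of each one whose text changes.

    Assumes every example has a "text" string and "spans" given as
    character offsets. Hypothetical helper, not part of Prodigy or spaCy.
    """
    augmented = []
    for eg in examples:
        augmented.append(eg)
        lowered = eg["text"].lower()
        # Skip texts that are already lowercase, and the rare Unicode cases
        # where lowercasing changes the length and would shift the offsets.
        if lowered == eg["text"] or len(lowered) != len(eg["text"]):
            continue
        variant = copy.deepcopy(eg)
        variant["text"] = lowered
        # Drop stale hashes so they can be recomputed for the new text.
        variant.pop("_input_hash", None)
        variant.pop("_task_hash", None)
        augmented.append(variant)
    return augmented


examples = [
    {
        "text": "looking for Assistant General Manager in a company",
        "spans": [{"start": 12, "end": 37, "label": "TITLE"}],
    }
]

augmented = add_lowercase_variants(examples)
for eg in augmented:
    span = eg["spans"][0]
    print(eg["text"][span["start"]:span["end"]], span["label"])
```

The span offsets stay valid because the lowercased text has the same length as the original. The augmented examples would then need to be saved to a new dataset and the model retrained on it. I have not verified how much this helps.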
03 1299780.668 8574 1831 9875 0 0.824
17:06:20 - MODEL: Merging entity spans of 9947 examples
17:06:20 - MODEL: Using 9947 examples (without 'ignore')
17:07:00 - MODEL: Evaluated 9947 examples
04 1301417.082 8858 1547 9881 0 0.851
17:45:38 - MODEL: Merging entity spans of 9947 examples
17:45:39 - MODEL: Using 9947 examples (without 'ignore')
17:46:19 - MODEL: Evaluated 9947 examples
05 1292085.172 8968 1437 9887 0 0.862
18:19:14 - MODEL: Merging entity spans of 9947 examples
18:19:14 - MODEL: Using 9947 examples (without 'ignore')
18:19:58 - MODEL: Evaluated 9947 examples
06 1299365.891 9034 1371 9876 0 0.868
18:48:29 - MODEL: Merging entity spans of 9947 examples
18:48:29 - MODEL: Using 9947 examples (without 'ignore')
18:49:14 - MODEL: Evaluated 9947 examples
07 1291458.556 9044 1361 9867 0 0.869
19:18:13 - MODEL: Merging entity spans of 9947 examples
19:18:20 - MODEL: Using 9947 examples (without 'ignore')
19:19:05 - MODEL: Evaluated 9947 examples
08 1289640.442 9066 1339 9866 0 0.871
19:50:08 - MODEL: Merging entity spans of 9947 examples
19:50:08 - MODEL: Using 9947 examples (without 'ignore')
19:50:52 - MODEL: Evaluated 9947 examples
09 1298395.806 9074 1331 9903 0 0.872
20:19:49 - MODEL: Merging entity spans of 9947 examples
20:19:50 - MODEL: Using 9947 examples (without 'ignore')
20:20:34 - MODEL: Evaluated 9947 examples
10 1301209.275 9069 1336 9923 0 0.872
Correct 9074
Incorrect 1331
Baseline 0.000
Accuracy 0.872