Trouble training for Portuguese

ines · November 19, 2018, 5:20pm

Yay, glad it works now!

Okay, this probably explains a lot! You should definitely reject incomplete entities and any other suggestions that are wrong (even if it's sometimes painful when the model almost got it right). My comment on this thread explains the reasoning behind this in more detail:

So if you've been annotating differently, I'd definitely suggest converting your existing annotations to gold-standard, pre-training your model on that and then trying the binary workflow again, starting with a fresh dataset. You could also try adding some patterns when you run ner.teach, to make sure the model sees enough positive examples during annotation. For example, some street names or abstract patterns of possible street names could work well (e.g. any token + "-" + any token + "avenue").
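To make the patterns idea a bit more concrete, here's a minimal sketch of how such a patterns file could be generated. It assumes the usual JSONL patterns format (one object per line with a "label" and a "pattern", where the pattern is a list of spaCy Matcher-style token descriptions). The label STREET, the example street names and the file name street_patterns.jsonl are placeholders I made up for illustration, so swap in whatever fits your data.

```python
import json

# Hypothetical label name; use the label from your own annotation scheme.
LABEL = "STREET"


def build_patterns(label, street_names, suffix="avenue"):
    """Build token-based match patterns for known and abstract street names."""
    patterns = []
    for name in street_names:
        # One token description per whitespace-separated word, matched
        # case-insensitively. Assumption: the model's tokenizer splits these
        # names on whitespace in the same way.
        tokens = [{"lower": word.lower()} for word in name.split()]
        patterns.append({"label": label, "pattern": tokens})
    # Abstract pattern: any token + "-" + any token + suffix.
    # An empty dict acts as a wildcard that matches any single token.
    patterns.append(
        {"label": label, "pattern": [{}, {"orth": "-"}, {}, {"lower": suffix}]}
    )
    return patterns


def to_jsonl(patterns):
    """Serialize patterns as JSONL: one JSON object per line."""
    return "".join(json.dumps(p, ensure_ascii=False) + "\n" for p in patterns)


if __name__ == "__main__":
    # Placeholder street names, purely for illustration.
    examples = ["Avenida Paulista", "Rua Augusta"]
    with open("street_patterns.jsonl", "w", encoding="utf-8") as f:
        f.write(to_jsonl(build_patterns(LABEL, examples)))
```

The resulting file can then be passed to ner.teach via its --patterns argument. Since the patterns are matched against tokens, it's worth checking a few of your real examples against the tokenizer first, because hyphens and abbreviations may be split differently than you'd expect.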

This is interesting and definitely something I'd keep an eye on! (Also a nice example of why it's always super important to reason about the data and be familiar with both the language and the domain!)
