Correct order in Named Entity Recognition

Hi. I am starting out with Prodigy and I have doubts about whether the order we are following is correct. What would a recommended order be?
In the first instance, we created 5 entity labels (Place, Person, Organization, Area, Position) and ran ner.manual to be able to teach the model.
Then we ran ner.make-gold on approximately 300 cases and saw that the model improved in its recognition. The problem is that we then ran ner.teach on approximately 1000 cases and, as a consequence, we observed that the model got worse.
Could you recommend a correct order? For example, first ner.manual, second ner.teach, etc.

In general, your approach sounds reasonable: you first created some gold-standard training data manually to bootstrap the new entity types and then went on to improve the model with an active learning-powered recipe.

Could you share some more details on the exact commands you ran? And what did you use as a base model?

Thanks for your answer. The entities are in Spanish. Here are the steps we followed:

prodigy dataset bora_dataset_2018

#GOLD1
PRODIGY_PORT=8000 prodigy ner.make-gold bora_dataset_2018 es_core_news_sm --output /opt/prodigy/data/modelo /opt/prodigy/data/dataset11.txt --label AREA
PRODIGY_PORT=8001 prodigy ner.make-gold bora_dataset_2018 es_core_news_sm --output /opt/prodigy/data/modelo /opt/prodigy/data/dataset11.txt --label ORGANISMO
PRODIGY_PORT=8002 prodigy ner.make-gold bora_dataset_2018 es_core_news_sm --output /opt/prodigy/data/modelo /opt/prodigy/data/dataset11.txt --label PERSONA
PRODIGY_PORT=8003 prodigy ner.make-gold bora_dataset_2018 es_core_news_sm --output /opt/prodigy/data/modelo /opt/prodigy/data/dataset11.txt --label CARGO
PRODIGY_PORT=8004 prodigy ner.make-gold bora_dataset_2018 es_core_news_sm --output /opt/prodigy/data/modelo /opt/prodigy/data/dataset11.txt --label LUGAR
PRODIGY_PORT=8005 prodigy ner.make-gold bora_dataset_2018 es_core_news_sm --output /opt/prodigy/data/modelo /opt/prodigy/data/dataset11.txt --label LEY, DECRETO, DA

#TRAIN
prodigy ner.batch-train bora_dataset_2018 /opt/prodigy/data/salidamodelo_bora2018 --output /opt/prodigy/data/salidamodelo_bora2018 --label "AREA, ORGANISMO, PERSONA, CARGO, LUGAR, LEY, DECRETO, DA" --eval-split 0.2 --n-iter 15

#GOLD2 WITH MODEL
PRODIGY_PORT=8000 prodigy ner.make-gold bora_dataset_2018 /opt/prodigy/data/modelo /opt/prodigy/data/dataset11.txt --label AREA
PRODIGY_PORT=8001 prodigy ner.make-gold bora_dataset_2018 /opt/prodigy/data/modelo /opt/prodigy/data/dataset11.txt --label ORGANISMO
PRODIGY_PORT=8002 prodigy ner.make-gold bora_dataset_2018 /opt/prodigy/data/modelo /opt/prodigy/data/dataset11.txt --label PERSONA
PRODIGY_PORT=8003 prodigy ner.make-gold bora_dataset_2018 /opt/prodigy/data/modelo /opt/prodigy/data/dataset11.txt --label CARGO
PRODIGY_PORT=8004 prodigy ner.make-gold bora_dataset_2018 /opt/prodigy/data/modelo /opt/prodigy/data/dataset11.txt --label LUGAR
PRODIGY_PORT=8005 prodigy ner.make-gold bora_dataset_2018 /opt/prodigy/data/modelo /opt/prodigy/data/dataset11.txt --label LEY, DECRETO, DA

#TRAIN
prodigy ner.batch-train bora_dataset_2018 /opt/prodigy/data/salidamodelo_bora2018 --output /opt/prodigy/data/salidamodelo_bora2018 --label "AREA, ORGANISMO, PERSONA, CARGO, LUGAR, LEY, DECRETO, DA" --eval-split 0.2 --n-iter 15

#TEACH
PRODIGY_PORT=8000 prodigy ner.teach bora_dataset_2018 /opt/prodigy/data/modelo /opt/prodigy/data/dataset11.txt --label AREA
PRODIGY_PORT=8001 prodigy ner.teach bora_dataset_2018 /opt/prodigy/data/modelo /opt/prodigy/data/dataset11.txt --label ORGANISMO
PRODIGY_PORT=8002 prodigy ner.teach bora_dataset_2018 /opt/prodigy/data/modelo /opt/prodigy/data/dataset11.txt --label PERSONA
PRODIGY_PORT=8003 prodigy ner.teach bora_dataset_2018 /opt/prodigy/data/modelo /opt/prodigy/data/dataset11.txt --label CARGO
PRODIGY_PORT=8004 prodigy ner.teach bora_dataset_2018 /opt/prodigy/data/modelo /opt/prodigy/data/dataset11.txt --label LUGAR
PRODIGY_PORT=8005 prodigy ner.teach bora_dataset_2018 /opt/prodigy/data/modelo /opt/prodigy/data/dataset11.txt --label LEY, DECRETO, DA

When you say the model got worse, are you basing that on the evaluation printed by ner.batch-train?

In your commands there, it looks like you’re using the --eval-split option to conduct the evaluation. This means that each time you continue training the model, you’ll be evaluating over different data, especially as you continue annotating and the dataset grows.
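One way around that is to keep a dedicated evaluation dataset and pass it in explicitly, so every training run is scored on the same examples. As a rough sketch, assuming a hypothetical evaluation dataset called bora_eval_2018 (more on creating one below) and that your Prodigy version’s ner.batch-train supports the --eval-id argument:

#TRAIN WITH A FIXED EVALUATION SET (sketch)
prodigy ner.batch-train bora_dataset_2018 es_core_news_sm --output /opt/prodigy/data/salidamodelo_bora2018 --label "AREA, ORGANISMO, PERSONA, CARGO, LUGAR, LEY, DECRETO, DA" --eval-id bora_eval_2018 --n-iter 15

With --eval-id, the accuracy printed after each run refers to the same held-out examples, so the numbers stay comparable between runs.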

The ner.teach recipe uses active learning, and by default the strategy is to select cases the model is most unsure about. This means you’re biasing the sample towards hard cases. This is good for training, but may be misleading for evaluation.

I would recommend separating out some of your examples and making a dedicated evaluation set. Take care that the texts in your evaluation set do not also occur in your training data, so that accuracy on the evaluation set is a better indication of accuracy on unseen data.
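As a rough way to set that up and sanity-check the overlap, assuming the same hypothetical bora_eval_2018 dataset and that you have jq available for the comparison step:

#CREATE A SEPARATE EVALUATION DATASET (sketch)
prodigy dataset bora_eval_2018 "Held-out evaluation data"

#CHECK THAT NO TEXT OCCURS IN BOTH SETS
prodigy db-out bora_dataset_2018 > train.jsonl
prodigy db-out bora_eval_2018 > eval.jsonl
# any line printed here is a text that appears in both training and evaluation data
comm -12 <(jq -r .text train.jsonl | sort -u) <(jq -r .text eval.jsonl | sort -u)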

When annotating the evaluation data, you want to use either the ner.make-gold or ner.manual recipe, rather than ner.teach. The goal is to get complete and correct annotations for a random sample of text. ner.teach skips through the text asking the questions the model can learn the most from, which isn’t the right strategy for annotating evaluation data.
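The annotation itself could look roughly like the ner.make-gold commands you already ran, just writing into the evaluation dataset and reading from a source file that you keep out of your training data (the file name eval_texts.txt below is a placeholder):

#ANNOTATE EVALUATION DATA (sketch)
prodigy ner.make-gold bora_eval_2018 es_core_news_sm /opt/prodigy/data/eval_texts.txt --label "AREA, ORGANISMO, PERSONA, CARGO, LUGAR, LEY, DECRETO, DA"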

As a rule of thumb, you’ll want at least 10 entities in your evaluation data behind the smallest accuracy difference you want to measure. So if you want to distinguish, say, 70% accuracy from 71% accuracy, that 1% difference should correspond to at least 10 entities, which works out to an evaluation set of at least 1000 entities.