Results not perfect after training with ner.manual and ner.batch-train

Hi Team,

We are done with installation and started training with the following command

prodigy ner.manual jd en_core_web_sm /var/www/html/role_jd.jsonl --label "ROLE,SKILL,PLACE,Education,Certification"

We are done with training and saved the annotations.

After building the model with the following command

prodigy ner.batch-train jd en_core_web_sm --output jd --label ROLE,SKILL,PLACE,Education,Certification --eval-split 0.2 --n-iter 6 --batch-size 8

we were able to build the model, but once we integrate it with Python, the results we saw in the first training step are not appearing.

Please help me out


I’m not 100% sure I’m understanding the question correctly: is the problem that the model you trained after ner.batch-train isn’t being loaded correctly in spaCy? Or is the problem that your annotations aren’t producing a usefully accurate model?

If the problem is that the model isn’t being loaded correctly, could you give some more details on what you run in Python, and what happens?
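For reference, loading the output directory from ner.batch-train in spaCy usually looks something like the sketch below. The function name, path, and example text here are just placeholders, not anything from your setup:

```python
import spacy


def extract_entities(model_dir, text):
    """Load a trained model directory and return its predicted entities."""
    nlp = spacy.load(model_dir)  # pass the path you gave to --output
    doc = nlp(text)
    return [(ent.text, ent.label_) for ent in doc.ents]


# Hypothetical usage -- replace the path with your own --output directory:
# extract_entities("./jd", "We are hiring a senior developer in London.")
```

If your code differs much from this, please paste it so we can see what's going on.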

If the problem is that the results are not accurate, you might have better results with the workflow described here:

The ner.manual interface is designed primarily to create evaluation data, or to do statistical analysis so you can figure out how frequent a category is in order to plan your approach. It can be a lot of work to train a new statistical model this way, especially if you label for five entity types at once.

How many examples did you annotate, and roughly how many instances of the different entity types did you collect? Finally, what sort of accuracy did the ner.batch-train command say you received?

Thanks for the reply,

Using ner.manual, we trained the skills and, when we check against the dataset, we see the results. But when we move the created dataset to the model, we are not getting the accurate results that we see in the printed dataset output.

We tried with 50 examples but are still not getting the results.
Steps we are following:

  1. Creating the dataset
    prodigy dataset skill_test
  2. Annotating the dataset
    prodigy ner.manual skill_test en_core_web_sm /var/www/html/role_jd.jsonl --label "ROLE,SKILL,PLACE,Education,Certification"
  3. Saving the annotations to the dataset
  4. Batch training with the dataset
    prodigy ner.batch-train skill_test en_core_web_sm --output skill_list --label SKILL,PLACE,Education,Certification --eval-split 0.2 --n-iter 6 --batch-size 8

The output model is created with both correct and incorrect entities.

When we build the model and load it in our testing code, the results we saw in the dataset print do not appear in the output.

Please help me out

First, the steps you describe all look good, so you’ve been doing this part correctly :blush:

The problem is that 50 examples are nowhere near enough to train a model from scratch. Keep in mind that you’re starting off with a model that knows nothing about the categories you’re training. You also want the model to be able to learn generalised weights based on the examples you give it, so it can detect other, similar entities in context as well.

You also never want your model to just memorize the training data, because this would mean that it could only detect those exact examples. So the training algorithm actively prevents the model from doing this – for example, by shuffling the examples and setting a dropout rate. This is why the model you’ve trained on 50 examples doesn’t perform well on the training data either. It’s tried to generalise based on the examples, but didn’t get enough data to do so successfully.
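To make the dropout part concrete, here's a toy sketch (not Prodigy's actual implementation) of what a dropout rate of 0.2 does to a set of values during an update:

```python
import random


def apply_dropout(values, rate=0.2, seed=0):
    """Zero out each value with probability `rate` (toy illustration)."""
    rng = random.Random(seed)
    return [0.0 if rng.random() < rate else v for v in values]


# Roughly 20% of the values come back as 0.0 on any given update
dropped = apply_dropout([1.0] * 1000, rate=0.2)
```

Because a different random 20% is dropped on every update, the model can't rely on any single feature to memorize a training example.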

If you’re only labelling data with ner.manual, you need a lot of examples – ideally thousands or more.

Because this takes a long time and is often inefficient, Prodigy also comes with active learning-powered workflows that make it easier to train a model with less data. Instead of labelling everything from scratch, you can work with the model. You can also use some tricks, like working with seed terms and match patterns, to give the model more examples upfront, without having to label every single example by hand. For more details on this, check out this example workflow (including the video that @honnibal already linked above).

Ideas for a solution

In your case, you could, for example, start off with a list of examples of ROLE, like “engineering manager”, “senior developer”, “CEO” etc. You can then create a patterns.jsonl file that looks like this:

{"label": "ROLE", "pattern": [{"lower": "engineering"}, {"lower": "manager"}]}
{"label": "ROLE", "pattern": [{"lower": "ceo"}]}

Each entry in "pattern" describes one token, just like in the patterns for spaCy’s Matcher. You can find more details and examples of this in the PRODIGY_README.html. Ideally, you want a lot of examples for each label, which can all live in the same patterns.jsonl file.
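If you have longer seed lists, you can also generate the patterns file with a small script. This is just a sketch with made-up seed terms; it splits on whitespace, which only lines up with spaCy's tokenization for simple multi-word terms:

```python
import json

# Hypothetical seed terms -- replace with your own lists per label
seeds = {"ROLE": ["engineering manager", "senior developer", "CEO"]}

with open("patterns.jsonl", "w", encoding="utf8") as f:
    for label, terms in seeds.items():
        for term in terms:
            # one {"lower": ...} token pattern per whitespace-separated word
            pattern = [{"lower": word.lower()} for word in term.split()]
            f.write(json.dumps({"label": label, "pattern": pattern}) + "\n")
```

For terms with punctuation or hyphens, double-check that the token split matches what spaCy's tokenizer produces.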

Next, you can use the ner.teach recipe with the --patterns argument pointing to your patterns file. This will tell Prodigy to find matches of those terms in your data, and ask you whether they are instances of that entity type. This is especially important for ambiguous entities – for example, “bachelor” can refer to a Bachelor’s degree, but also to a person or the show “The Bachelor” :wink:

prodigy ner.teach skill_dataset en_core_web_sm /var/www/html/role_jd.jsonl --label ROLE --patterns /path/to/patterns.jsonl

As you click accept or reject, the model in the loop will be updated, and will start learning about your new entity type. Once you’ve annotated enough examples, the model will also start suggesting entities based on what it’s learned so far. By default, the suggestions you’ll see are the ones that the model is most uncertain about – i.e. the ones with a prediction closest to 50/50. Those are also the most important ones to annotate, since they will produce the most relevant gradient for training. So don’t worry if they seem a little weird at first – this is good, because your model is still learning and by rejecting the bad suggestions, you’re able to improve it.
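A toy sketch of that “closest to 50/50” selection, with made-up spans and scores (not Prodigy's internal code):

```python
def most_uncertain(candidates, n=3):
    """Rank (span, score) candidates by how close the score is to 0.5."""
    return sorted(candidates, key=lambda c: abs(c[1] - 0.5))[:n]


# Hypothetical model scores for candidate ROLE spans
candidates = [
    ("senior developer", 0.55),  # fairly uncertain
    ("Python", 0.95),            # confident accept
    ("bachelor", 0.48),          # very uncertain -- most useful to annotate
    ("London", 0.10),            # confident reject
]
queue = most_uncertain(candidates, n=2)
```

The confident predictions (0.95, 0.10) would produce a near-zero gradient either way, so the uncertain ones are shown to you first.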

Because you’re only clicking accept and reject, you’ll be able to collect training data much faster. So you can repeat this process for each entity type, until you have a few hundred annotations for each type. You can then start training again and the results should be much better. You can still keep your skill_test dataset with the 50 manual annotations btw, and use it as an evaluation set. ner.manual is actually a really good recipe for creating evaluation sets.

So, in summary:

  1. Create a patterns.jsonl file with examples of each entity type.
  2. Train each entity type with a model in the loop using ner.teach and your patterns, to get over the “cold start” problem.
  3. Repeat for each type until you have enough annotations.
  4. Train a model again and test it.

Thanks for the detailed reply.

Please look at the following output:
Loaded model en_core_web_lg
Using 20% of accept/reject examples (42) for evaluation
Using 100% of remaining examples (169) for training
Dropout: 0.2 Batch size: 8 Iterations: 6

BEFORE 0.000
Correct 0
Incorrect 84
Entities 139
Unknown 0


#    LOSS      RIGHT  WRONG  ENTS  SKIP  ACCURACY
01   47.051    15     69     58    0     0.179
02   44.887    25     59     194   0     0.298
03   36.563    24     60     76    0     0.286
04   37.948    30     54     172   0     0.357
05   51.274    31     53     120   0     0.369
06   39.567    34     50     160   0     0.405

Correct 34
Incorrect 50
Baseline 0.000
Accuracy 0.405

Model: /root/skills-pat-list-model
Training data: /root/skills-pat-list-model/training.jsonl
Evaluation data: /root/skills-pat-list-model/evaluation.jsonl

How can we fix the 50 incorrect predictions?


How can we fix the 50 incorrect predictions?

You need to create more data. The best solution is to use the ner.teach recipe, possibly using the patterns.jsonl approach as @ines described.

I actually think the progress on the annotations you’ve collected so far looks pretty promising! It looks like the model will be able to learn your problem — you just need to give it more examples.
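For reference, the accuracy figure in that output appears to be simply correct / (correct + incorrect) on the held-out evaluation examples:

```python
# Figures from the ner.batch-train output above
correct, incorrect = 34, 50
accuracy = correct / (correct + incorrect)
print(round(accuracy, 3))  # prints 0.405
```

So the model is already getting 40% of the evaluated entity decisions right after six iterations on 169 training examples; more annotations should push that number up considerably.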