Tune existing spaCy NER model


I have a question about good practice for improving an existing NER model.

I have created a first model, version_1, and I want to improve it with data that is already pre-annotated in the format recognized by Prodigy.

I would like to know how I can create version_2 on top of version_1 and analyze whether performance is better or not.

Thanks!

Hi @xD3CODER ,

If I understand your question correctly, then this should be possible. If you already have annotated data and the version 1 model, you can "continue" training on that previous model directly from spaCy. However, if you're keen on using Prodigy, you can also provide the path to that model in the train recipe.

To analyze whether version 2 is better, the best way is to run spacy evaluate on both models against the same held-out evaluation data and compare the scores.
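As a rough sketch of that comparison: spacy evaluate can write its scores to a JSON file via --output, and for an NER pipeline that file should include the overall ents_p, ents_r and ents_f values. Assuming you have evaluated both models on the same evaluation data and saved the results (the file names v1_metrics.json and v2_metrics.json below are hypothetical), a small helper like this shows the difference per score:

```python
import json
from pathlib import Path


def load_metrics(path):
    """Read a metrics JSON file, e.g. one written by `spacy evaluate --output`."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def compare_ner_metrics(v1, v2, keys=("ents_p", "ents_r", "ents_f")):
    """Return the change (v2 minus v1) for each overall NER score."""
    return {key: round(v2[key] - v1[key], 4) for key in keys}


# Hypothetical usage, assuming both files exist:
# deltas = compare_ner_metrics(load_metrics("v1_metrics.json"),
#                              load_metrics("v2_metrics.json"))
# print(deltas)
```

A positive value means version 2 scored higher on that metric. The comparison is only meaningful if neither model saw the evaluation examples during training.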

Hello @ljvmiranda921

I have trouble understanding the process to follow.
Until now, to reinforce my predictions, I created a new model each time by merging my previous annotations with the new ones (the corrections) using the data-to-spacy command.
Now I only have a base model (my current NER model) and corrected annotations that I want to use to reinforce my model.

What I meant is that you have two options:

  • Train your version 1 model with your newer annotations (don't include the data that you used to create your v1):
    prodigy train my-v2-model --base-model my-v1-model --ner new_annotations
  • Train the base model you used for v1 with the combined annotations (previous + new):
    prodigy train my-v2-model --base-model my-base-model --ner combined_annotations

Since you corrected your previous data, it might be better to do the second option.
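If you go with the second option, the combined set should contain only the corrected version of any example that appears in both datasets. Here is a minimal sketch of that merge, assuming both datasets were exported to JSONL (for example with db-out) and that each record carries the _input_hash that Prodigy assigns to the input text. The function names are just for illustration:

```python
import json


def read_jsonl(path):
    """Load one JSON record per non-empty line."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def merge_annotations(previous, corrections, key="_input_hash"):
    """Combine two lists of examples; a correction replaces the
    earlier annotation of the same input."""
    merged = {eg[key]: eg for eg in previous}
    for eg in corrections:
        merged[eg[key]] = eg  # later (corrected) version wins
    return list(merged.values())
```

The merged records can then be written back out and imported as the combined dataset. If your corrections don't overlap with the earlier examples, the merge simply concatenates the two sets.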

Thanks @ljvmiranda921!

What's the advantage of passing the combined annotations over just the newer ones?

The primary advantage is that the model is trained on the whole corrected dataset at once, rather than only on a delta of it. The model can also be tuned more reliably, since it's exposed to the final dataset.