spans.correct --update

I understood that --update should update the model I am using. However, I don't see any changes to the timestamps in the models folder (Windows). Am I missing something?

hi @dsr2021!

Great question! I think the confusing aspect is that we can use "update" for different types of learning: incremental and batch learning.

The --update flag in model-in-the-loop recipes (e.g., ner.correct, ner.teach) uses spaCy's nlp.update, which performs incremental learning (similar to online learning). This is not the same thing as fully (batch) training your model. While updating in the loop helps adjust the model's predictions slightly as you correct them, it does not replace the need to fully retrain your model afterwards.

This post helps explain that even if you use --update, you should still retrain your model after you've completed your annotation session:

We also cover it briefly in the NER Prodigy docs:

When you annotate with a model in the loop, the model is also updated in the background. So why do you still need to train your model on the annotations afterwards, and can’t just export the model that was updated in the loop? The main reason is that the model in the loop is only updated once with each new annotation. This is never going to be as effective as batch training a model on the whole dataset, making multiple passes over the data, shuffling on each epoch and using other deep learning tricks like dropout rates, compounding batch sizes and so on. If you batch train your model with the collected annotations afterwards, you should receive the same model you had in the loop, just better.
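
To make the difference concrete, here is a toy sketch in plain Python. It is not Prodigy or spaCy code: the tiny logistic-regression model, the synthetic data, the learning rate and the helper names (sgd_step, mean_loss) are all assumptions made up for illustration. It only contrasts "one update per example, in arrival order" with "many shuffled passes over the full dataset".

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sgd_step(params, x, y, lr=0.1):
    """Apply a single gradient step of logistic regression on one example."""
    w, b = params
    error = sigmoid(w * x + b) - y
    return (w - lr * error * x, b - lr * error)


def mean_loss(params, data):
    """Average log loss of the model over the whole dataset."""
    w, b = params
    total = 0.0
    for x, y in data:
        p = min(max(sigmoid(w * x + b), 1e-12), 1 - 1e-12)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(data)


# Hypothetical "annotations": 40 one-dimensional examples with binary labels.
rng = random.Random(0)
data = [(x, 1 if x > 0.2 else 0) for x in (rng.uniform(-1, 1) for _ in range(40))]

# Model in the loop: updated exactly once per annotation, in arrival order.
online = (0.0, 0.0)
for x, y in data:
    online = sgd_step(online, x, y)

# Batch training afterwards: several passes, reshuffled on every epoch.
batch = (0.0, 0.0)
for epoch in range(30):
    shuffled = data[:]
    rng.shuffle(shuffled)
    for x, y in shuffled:
        batch = sgd_step(batch, x, y)

online_loss = mean_loss(online, data)
batch_loss = mean_loss(batch, data)
print(f"loss after one in-the-loop pass: {online_loss:.3f}")
print(f"loss after 30 shuffled epochs:   {batch_loss:.3f}")
```

In this toy setup the model trained for multiple shuffled epochs should end up with a lower loss on the collected examples than the one that saw each example only once, which is the intuition behind retraining after the annotation session.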

I could see --update helping when you have a model in the loop (as in active learning) and you're passing in unlabeled examples that are very different from the data the model was trained on. After the model has seen several examples, incremental learning via update may give you slightly more accurate predictions, which in turn may improve active learning (that is, the order in which unlabeled examples are presented for labeling) compared to not updating the model at all.

Let me know if you have any further questions!

Thanks. That cleared it up