Start with a New Model When Starting a New Session

yusun · July 11, 2018, 2:17pm

Hi all,

When I run prodigy textcat.teach dataset spacy_model source to annotate with active learning, which model is loaded if I exit the process and restart it with the same command? Is it the same model as at the start of the first session, without any of the updates from those annotations? And what if I want to continue with the updated model?

ines · July 11, 2018, 3:02pm

The model you pass in via the command line will only be updated in memory and won’t be overwritten on disk. So if you re-start with the en_core_web_sm model, it will be the same initial model (not the one you updated in the loop).

If you want to start the process with an updated model, it’s usually recommended to run the textcat.batch-train command first. This will update the model with the annotations (just like the active learning recipe), but it will use multiple iterations and other training tricks, so you usually end up with a better and more accurate model.

Here’s an example:

# first session
prodigy textcat.teach dataset en_core_web_sm source.jsonl --label SOME_LABEL

# train model from annotations
prodigy textcat.batch-train dataset en_core_web_sm /path/to/new-model --label SOME_LABEL

# next session
prodigy textcat.teach dataset /path/to/new-model source.jsonl --label SOME_LABEL
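
If you restart sessions often, a small shell helper can choose which model to pass to the recipe. This is only a sketch and not part of Prodigy: pick_model is a hypothetical function name, and it assumes the trained model was saved to a directory such as /path/to/new-model by the training step above. It prints that directory if it exists and otherwise falls back to the base model name.

```shell
# Hypothetical helper (not a Prodigy command): print the trained model
# directory if it exists on disk, otherwise print the base model name.
pick_model() {
    trained_dir="$1"
    base_model="$2"
    if [ -d "$trained_dir" ]; then
        printf '%s\n' "$trained_dir"
    else
        printf '%s\n' "$base_model"
    fi
}

# Example usage, commented out so the snippet runs without Prodigy installed:
# prodigy textcat.teach dataset "$(pick_model /path/to/new-model en_core_web_sm)" source.jsonl --label SOME_LABEL
```

The first session then starts from en_core_web_sm, and any later session picks up the trained model once the training step has written it to disk.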
