prodigy train ner: log performance metrics on hard drive


In our projects, we run a lot of training jobs (prodigy train ner ...). Is there a way to store the performance metrics (overall F-score, plus F-score, precision and recall per entity) on the hard drive, rather than only having the results in the notebook?


Hi! There are different ways you could do this, depending on what your goal is. If you just want to save the exact terminal output to a file, you could redirect the output from stdout to a file:

prodigy train ... > log.txt

This should work in a notebook cell as well if you prefix the command with ! (although I haven't tried it). Alternatively, you can use IPython's %%capture cell magic. This gives you the raw text, which you can then save to a file.
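If you'd rather stay in plain Python than rely on shell redirection or cell magic, here is a minimal sketch using the standard library's subprocess module. The helper name run_and_log is made up for illustration, and you'd pass your own command and arguments.

```python
import subprocess
import sys
from pathlib import Path


def run_and_log(command, log_path):
    """Run a command, save its combined stdout/stderr to log_path, return the text.

    `run_and_log` is a hypothetical helper, not part of Prodigy or spaCy.
    """
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the same log
        text=True,
        check=False,  # still write the log if the command fails
    )
    Path(log_path).write_text(result.stdout, encoding="utf-8")
    return result.stdout


# Example call (fill in your own arguments after "ner"):
# run_and_log(["prodigy", "train", "ner", ...], "log.txt")
```

Since the function returns the text as well, you can still print it in the notebook while keeping a copy on disk.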

Another option is to use data-to-spacy to export your dataset in spaCy's format and then train with spacy train directly. spacy train adds the metrics to the exported model's meta.json by default. You'll then also be able to access the raw scores programmatically, so you could have a script that takes directories of models and compares their scores side by side, or something like that.
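A rough sketch of such a comparison script could look like this. It assumes the scores live under "accuracy" (spaCy v2) or "performance" (spaCy v3) in meta.json and that the NER scores use the keys ents_p, ents_r and ents_f, so check one of your own meta.json files first. The function names are made up for illustration.

```python
import json
from pathlib import Path


def load_scores(model_dir):
    """Read the scores dict from a model directory's meta.json."""
    meta_path = Path(model_dir) / "meta.json"
    meta = json.loads(meta_path.read_text(encoding="utf-8"))
    # Assumption: spaCy v3 uses "performance", spaCy v2 uses "accuracy".
    return meta.get("performance") or meta.get("accuracy") or {}


def compare_models(model_dirs, keys=("ents_p", "ents_r", "ents_f")):
    """Return one row per model with the selected scores (None if missing)."""
    rows = []
    for model_dir in model_dirs:
        scores = load_scores(model_dir)
        row = {"model": str(model_dir)}
        for key in keys:
            row[key] = scores.get(key)
        rows.append(row)
    return rows


def print_table(rows, keys=("ents_p", "ents_r", "ents_f")):
    """Print the rows side by side as tab-separated text."""
    print("\t".join(("model",) + tuple(keys)))
    for row in rows:
        print("\t".join(str(row[col]) for col in ("model",) + tuple(keys)))
```

The per-entity breakdown, if present, should be under a key like ents_per_type in the same dict, which you could add to the rows in the same way.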

Finally, for completeness: in spaCy v3, you can fully customise the logging and provide a custom function that does whatever you want with the scores, including saving them to disk in a custom format. The spaCy v3 training documentation has an example of a custom logger. If you've exported your Prodigy annotations with data-to-spacy, you can use spaCy v3's spacy convert to convert the data so you can train with spaCy v3 directly.
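To give an idea of what such a custom logger might look like, here's a sketch that appends one JSON line per evaluation step to a file. The name jsonl_file_logger.v1 and the log_path argument are made up for illustration. The overall shape (a registered function returning setup_logger, which returns log_step and finalize) follows my understanding of spaCy v3's logger interface, so double-check it against the docs for your version.

```python
import json
import sys
from pathlib import Path


def jsonl_file_logger(log_path):
    """Hypothetical logger factory: writes scores to a JSONL file."""

    def setup_logger(nlp, stdout=sys.stdout, stderr=sys.stderr):
        path = Path(log_path)
        path.parent.mkdir(parents=True, exist_ok=True)
        handle = path.open("a", encoding="utf-8")

        def log_step(info):
            # Assumption: info is None (or empty) when no evaluation ran
            if not info:
                return
            record = {
                "epoch": info.get("epoch"),
                "step": info.get("step"),
                "score": info.get("score"),
                "other_scores": info.get("other_scores"),
            }
            handle.write(json.dumps(record, default=str) + "\n")
            handle.flush()

        def finalize():
            handle.close()

        return log_step, finalize

    return setup_logger


# Register with spaCy v3 if it is installed; skipped otherwise.
try:
    import spacy

    spacy.registry.loggers("jsonl_file_logger.v1")(jsonl_file_logger)
except (ImportError, AttributeError):
    pass
```

You'd then point the [training.logger] block of your config at "jsonl_file_logger.v1", set log_path there, and pass the file containing this function to spacy train via --code.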
