I am looking to replicate the command below via Python. Is there a way to recreate that exact training?
Command Line --> prodigy train --ner ds_<dataset_name> ./models --eval-split 0.25 -L
hi @wertzhayden!
Not sure what you mean by "recreate" -- do you just mean you want to look at the raw Python code for prodigy train? You can view all the built-in recipes by running prodigy stats, looking for Location:, then finding the recipes folder. train.py includes prodigy train.
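If you'd rather find that folder from Python than read it off the prodigy stats output, something like the sketch below should work. It only assumes what's described above, namely that the recipes sit in a recipes subfolder of the installed package. package_subdir is a helper name made up for this example, not part of Prodigy.

```python
from importlib.util import find_spec
from pathlib import Path
from typing import Optional


def package_subdir(package: str, subdir: str) -> Optional[Path]:
    """Return <package install dir>/<subdir>, or None if the package can't be found.

    Hypothetical helper: it looks up where the package is installed
    without importing it, then appends the subfolder name.
    """
    spec = find_spec(package)
    if spec is None or spec.origin is None:
        return None
    # spec.origin points at the package's __init__ file; its parent is the package folder.
    return Path(spec.origin).parent / subdir
```

Calling package_subdir("prodigy", "recipes") in the environment where Prodigy is installed should give you the folder that holds train.py.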
As you'll see, prodigy train is just a wrapper for spacy train. The issue you may get is that prodigy train will redo the partitioning when doing --eval-split, which isn't ideal -- that is, each time you run prodigy train, the evaluation set changes and you'll get a different result. That's why we typically only recommend prodigy train for quick-and-simple training but suggest using data-to-spacy and then spacy train for more sophisticated workflows.
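As a rough sketch of that data-to-spacy then spacy train workflow driven from Python: the example below assumes Prodigy v1.11+ and spaCy v3, and that data-to-spacy writes config.cfg, train.spacy and dev.spacy into the output folder. The function names, the ./corpus folder and the defaults are placeholders for illustration, so adjust them to your setup.

```python
import subprocess
import sys
from pathlib import Path
from typing import List


def data_to_spacy_command(dataset: str, corpus_dir: str, eval_split: float = 0.25) -> List[str]:
    """Build the `prodigy data-to-spacy` call that exports a train/dev split to disk."""
    return [
        sys.executable, "-m", "prodigy", "data-to-spacy", corpus_dir,
        "--ner", dataset,
        "--eval-split", str(eval_split),
    ]


def export_and_train(dataset: str, corpus_dir: str = "./corpus",
                     output_dir: str = "./models", eval_split: float = 0.25) -> None:
    """Export the annotations once, then train on the exported files."""
    corpus = Path(corpus_dir)

    # Step 1: export the data. The split is written to disk, so later
    # training runs can reuse exactly the same train/dev partition.
    subprocess.run(data_to_spacy_command(dataset, corpus_dir, eval_split), check=True)

    # Step 2: train with spaCy's Python entry point (imported here so the
    # helper above can be used without spaCy installed).
    from spacy.cli.train import train

    train(
        corpus / "config.cfg",  # assumed to be generated by data-to-spacy
        output_dir,
        overrides={
            "paths.train": str(corpus / "train.spacy"),
            "paths.dev": str(corpus / "dev.spacy"),
        },
    )
```

You would call export_and_train("ds_<dataset_name>") once. To retrain on the same split later, skip step 1 and only rerun the train call against the existing ./corpus folder. The -L flag from the original command is not covered here.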
There are a lot of support issues that discuss this in more detail. For example, in this one I show how to replicate prodigy train with spacy train using a sample project: