understanding the different terminology in the command line output of a training pipeline

Hi @nanyasrivastav!

Thank you for your questions! These concepts are very important, and I'm glad you asked, because it's hard to use Prodigy effectively without understanding them :slight_smile:.

Initial learn rate: The initial learning rate used by the optimizer. The learning rate is a tuning parameter of an optimization algorithm that determines the step size taken at each iteration while moving toward a minimum of the loss function. It starts at a default value but can be tuned for your training run. Tuning the learning rate is an advanced step, since changing it trades off the rate of convergence against the risk of overshooting the minimum. See Wikipedia's article on Learning Rate for more background.
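To make this concrete, here's a minimal sketch of where that number comes from. spaCy uses the thinc library under the hood, and the optimizer's `learn_rate` is the value reported at the start of training. The 0.001 below is just the common default, assumed here for illustration; your own config may set something else under `[training.optimizer]`:

```python
from thinc.api import Adam

# Minimal sketch: the optimizer used for the gradient-descent updates.
# learn_rate=0.001 is the usual default initial learning rate; your
# training config may override it, so treat this value as illustrative.
optimizer = Adam(learn_rate=0.001)
```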

E: The number of completed epochs. An epoch is one full pass over the training data. The training data is used multiple times because the optimization algorithm (e.g., gradient descent) typically doesn't reach a (global or local) minimum of the loss after a single pass. See this Stack Exchange comment for more details.

#: The number of iterations, i.e. documents, passed through during training. This count keeps increasing across epochs, even though the same documents are re-used. For example, if you have 114 training documents, your first epoch will be complete after iteration 114, your second epoch after iteration 228, and so on (see the small worked example below).
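As a small worked example (assuming, as above, 114 training documents and one document per iteration), here's how the # column maps onto completed epochs:

```python
# Worked sketch of the relationship between the "#" column and the "E" column,
# assuming 114 training documents and one document per iteration.
n_docs = 114

for iteration in (114, 228, 342):
    completed_epochs = iteration // n_docs
    print(f"iteration {iteration} -> {completed_epochs} completed epoch(s)")

# iteration 114 -> 1 completed epoch(s)
# iteration 228 -> 2 completed epoch(s)
# iteration 342 -> 3 completed epoch(s)
```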

LOSS: The value of the loss function for the associated pipeline component. spaCy models can be made up of multiple components in a pipeline. As you can see in your output, your pipeline has two components: tok2vec (which produces a vector embedding for each token, used as input by the other components) and ner (the named entity recognizer). By default in spaCy 3.0, the components in a pipeline share the same tok2vec layer. You can instead give each component its own independent tok2vec layer, which is more modular but leads to larger and slower-to-train pipelines.
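If you want to check which components your pipeline contains (and therefore which LOSS columns to expect during training), you can inspect it in Python. The model name below is just an example; use the name or path of the pipeline you're actually loading or training:

```python
import spacy

# Sketch: list the components of a pipeline. Replace "en_core_web_sm" with
# your own pipeline name or the output directory of your training run.
nlp = spacy.load("en_core_web_sm")
print(nlp.pipe_names)
# ['tok2vec', 'tagger', 'parser', 'attribute_ruler', 'lemmatizer', 'ner']
# Each trainable component gets its own LOSS column, e.g. LOSS TOK2VEC and LOSS NER.
```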

Here are a few general resources I would recommend reading:

There are also a few helpful Prodigy Support and spaCy community discussions that may help:

Thanks again, and let us know if you have any other questions!
