Why is the prodigy train result different from the spacy train result?

Hi,

I created an annotated NER dataset using Prodigy. After finishing the annotation, I exported the dataset using Prodigy's data-to-spacy CLI.

Then I ran the training process with the CLI on both spaCy and Prodigy. After training finished, I found that they produce different results. The best F-score from prodigy train is 81.792, while spacy train gives 82.323. Why is this happening? From what I know, they should produce the same results.

For your information, I am using Prodigy v1.10.8 and spaCy v2.3.5. Here are the steps I followed:

1. python -m prodigy data-to-spacy ./train-70.json ./dev-30.json --lang id --ner my-dataset --eval-split 0.3
2. prodigy train ner my-dataset blank:id --output ./prodigy-model --eval-split 0.3 --n-iter 10
3. python -m spacy train id ./spacy-model ./train-70.json ./dev-30.json -p ner -n 10

The config file for both processes is the same:

{
  "beam_width":1,
  "beam_density":0.0,
  "beam_update_prob":1.0,
  "cnn_maxout_pieces":3,
  "nr_feature_tokens":6,
  "nr_class":46,
  "hidden_depth":1,
  "token_vector_width":96,
  "hidden_width":64,
  "maxout_pieces":2,
  "pretrained_vectors":null,
  "bilstm_depth":0,
  "self_attn_depth":0,
  "conv_depth":4,
  "conv_window":1,
  "embed_size":2000
}

Thank you

Hi! Older versions of Prodigy (v1.10 and below) used their own training loop implementation with its own default settings, so a small difference in dropout, learning rate or batching can easily account for a difference in accuracy of +/- 1%.

This was actually one of the main motivations for standardising the training process in v1.11+ to call into spacy train directly. It's also a good example of why the config system in spaCy v3 is useful for reproducible experiments, because it prevents hidden defaults.
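For example, in a v3 config, settings that would previously have been hidden defaults are all spelled out. A trimmed excerpt of an auto-generated config.cfg might look like this (the exact values depend on how the config was generated):

[training]
dropout = 0.1
patience = 1600
max_epochs = 0
max_steps = 20000
eval_frequency = 200

[training.optimizer]
@optimizers = "Adam.v1"
learn_rate = 0.001

[training.batcher]
@batchers = "spacy.batch_by_words.v1"
discard_oversize = false
tolerance = 0.2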

Hi Ines

Thank you. So, is it okay to use the data exported by the data-to-spacy command in spaCy v3?

Yes, that's the recommended workflow once you're serious about training your model :slightly_smiling_face:
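With the v1.11+ data-to-spacy, which exports a corpus/ directory containing train.spacy, dev.spacy and a config.cfg, training in spaCy v3 then looks like this:

python -m spacy train corpus/config.cfg --paths.train corpus/train.spacy --paths.dev corpus/dev.spacy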

Hello Ines

I want to use the spacy train CLI to run experiments varying the number of iterations, like the --n-iter option in the prodigy train CLI, and then observe the metric scores. Is that possible? I have already tried spacy train in spaCy v3, but I cannot find how to do this.

Thank you

I'm not sure I understand what you're trying to do here. What do you mean by the score of the metrics? If you're looking for per-label stats, you can get the same results and more using spacy evaluate.
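For example, to score a trained pipeline on a held-out .spacy file and write the results (including per-label precision, recall and F-score) to a JSON file, where the paths are placeholders for wherever your model and evaluation data live:

python -m spacy evaluate ./training/model-best ./corpus/dev.spacy --output metrics.json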

Hi Ines

What I did with Prodigy and spaCy v2 was run several experiments varying the train/eval split and the number of iterations: for example, a 70:30 split with 50, 75, 100 and 500 iterations, plus an 80:20 split with 50, 75, 100 and 500 iterations. From those experiments, I observed the overall F-score, recall and precision across all entities to determine the best model (the highest F-score). After that, I will look at the per-entity metrics in more detail.

I want to migrate those experiments to spaCy v3.

hi @sigitpurnomo!

Sorry for the delayed response. We're trying to close out old issues.

Regarding comparing prodigy train and spacy train, I recommend that anyone interested check out the Prodigy sample project:

If you clone this repo, you can run two examples to compare spacy train and prodigy train.

Using the sample fashion data, you can run spacy train with:

python -m spacy project run all 

This will load the data (db-in), export the data and config file (data-to-spacy), and run spacy train (see the project.yml).
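As a rough sketch, reconstructed from the commands in the log below (the actual file in the repo is the source of truth), the project.yml looks something like this:

workflows:
  all:
    - db-in
    - data-to-spacy
    - train_spacy
  all_prodigy:
    - db-in
    - train_prodigy

commands:
  - name: train_spacy
    script:
      - "python -m spacy train configs/config.cfg --output training/ --paths.train corpus/train.spacy --paths.dev corpus/dev.spacy --gpu-id -1"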

python -m spacy project run all
ℹ Running workflow 'all'

=================================== db-in ===================================
Running command: /opt/homebrew/opt/python@3.10/bin/python3.10 -m prodigy db-in fashion_brands_training assets/fashion_brands_training.jsonl
✔ Created dataset 'fashion_brands_training' in database SQLite
✔ Imported 1235 annotations to 'fashion_brands_training' (session
2023-02-03_15-10-32) in database SQLite
Found and keeping existing "answer" in 1235 examples
Running command: /opt/homebrew/opt/python@3.10/bin/python3.10 -m prodigy db-in fashion_brands_eval assets/fashion_brands_eval.jsonl
✔ Created dataset 'fashion_brands_eval' in database SQLite
✔ Imported 500 annotations to 'fashion_brands_eval' (session
2023-02-03_15-10-33) in database SQLite
Found and keeping existing "answer" in 500 examples

=============================== data-to-spacy ===============================
Running command: /opt/homebrew/opt/python@3.10/bin/python3.10 -m prodigy data-to-spacy corpus/ --ner fashion_brands_training,eval:fashion_brands_eval
ℹ Using language 'en'

============================== Generating data ==============================
Components: ner
Merging training and evaluation data for 1 components
  - [ner] Training: 1235 | Evaluation: 500 (from datasets)
Training: 1235 | Evaluation: 500
Labels: ner (1)
✔ Saved 1235 training examples
corpus/train.spacy
✔ Saved 500 evaluation examples
corpus/dev.spacy

============================= Generating config =============================
ℹ Auto-generating config with spaCy
✔ Generated training config

======================== Generating cached label data ========================
✔ Saving label data for component 'ner'
corpus/labels/ner.json

============================= Finalizing export =============================
✔ Saved training config
corpus/config.cfg

To use this data for training with spaCy, you can run:
python -m spacy train corpus/config.cfg --paths.train corpus/train.spacy --paths.dev corpus/dev.spacy

================================ train_spacy ================================
Running command: /opt/homebrew/opt/python@3.10/bin/python3.10 -m spacy train configs/config.cfg --output training/ --paths.train corpus/train.spacy --paths.dev corpus/dev.spacy --gpu-id -1
ℹ Saving to output directory: training
ℹ Using CPU
ℹ To switch to GPU 0, use the option: --gpu-id 0

=========================== Initializing pipeline ===========================
[2023-02-03 15:10:39,792] [INFO] Set up nlp object from config
[2023-02-03 15:10:39,799] [INFO] Pipeline: ['tok2vec', 'ner']
[2023-02-03 15:10:39,801] [INFO] Created vocabulary
[2023-02-03 15:10:39,802] [INFO] Finished initializing nlp object
[2023-02-03 15:10:41,529] [INFO] Initialized pipeline components: ['tok2vec', 'ner']
✔ Initialized pipeline

============================= Training pipeline =============================
ℹ Pipeline: ['tok2vec', 'ner']
ℹ Initial learn rate: 0.0
E    #       LOSS TOK2VEC  LOSS NER  ENTS_F  ENTS_P  ENTS_R  SCORE 
---  ------  ------------  --------  ------  ------  ------  ------
  0       0          0.00     46.17    0.00    0.00    0.00    0.00
  0     200         10.44  14143.08    0.00    0.00    0.00    0.00                                                                                                                               
  0     400         17.36    921.48    0.00    0.00    0.00    0.00                                                                                                                               
  1     600         18.70    517.74    0.00    0.00    0.00    0.00                                                                                                                               
  1     800         22.26    619.64    0.83   50.00    0.42    0.01                                                                                                                               
  2    1000         26.61    656.45    4.84   60.00    2.52    0.05                                                                                                                               
  3    1200         29.64    745.70    9.67   41.94    5.46    0.10                                                                                                                               
  4    1400         37.73    754.50   20.98   47.76   13.45    0.21                                                                                                                               
  6    1600         82.65    884.78   30.59   46.96   22.69    0.31                                                                                                                               
  7    1800        391.86    984.87   36.60   49.64   28.99    0.37                                                                                                                               
  9    2000        354.41   1072.19   39.60   48.19   33.61    0.40                                                                                                                               
 12    2200        107.65    988.55   41.21   51.25   34.45    0.41                                                                                                                               
 15    2400        138.04   1029.82   47.12   55.06   41.18    0.47                                                                                                                               
 19    2600        149.17    955.62   50.24   57.61   44.54    0.50                                                                                                                               
 22    2800        124.06    703.44   50.84   59.22   44.54    0.51                                                                                                                               
 25    3000        121.32    583.64   53.72   62.57   47.06    0.54                                                                                                                               
 29    3200        112.32    431.85   54.55   63.33   47.90    0.55                                                                                                                               
 32    3400        115.82    384.64   55.77   65.17   48.74    0.56                                                                                                                               
 35    3600        122.27    307.42   55.50   64.44   48.74    0.56                                                                                                                               
 38    3800        124.70    295.25   57.84   69.41   49.58    0.58                                                                                                                               
 42    4000        153.26    254.92   57.56   68.60   49.58    0.58                                                                                                                               
 45    4200        183.82    225.83   57.63   68.00   50.00    0.58                                                                                                                               
 48    4400        191.45    206.76   57.62   66.48   50.84    0.58                                                                                                                               
 52    4600        183.82    170.08   57.42   66.67   50.42    0.57                                                                                                                               
 55    4800        104.11    106.09   57.76   66.85   50.84    0.58                                                                                                                               
 58    5000        132.83     96.88   57.97   68.18   50.42    0.58                                                                                                                               
 62    5200        104.80     78.27   59.51   70.93   51.26    0.60                                                                                                                               
 65    5400         94.62     77.89   59.66   71.35   51.26    0.60                                                                                                                               
 68    5600         88.30     58.62   59.51   70.93   51.26    0.60                                                                                                                               
 72    5800         91.84     43.24   60.00   71.51   51.68    0.60                                                                                                                               
 75    6000        132.88     50.87   59.86   68.85   52.94    0.60                                                                                                                               
 78    6200         77.27     42.82   60.59   73.21   51.68    0.61                                                                                                                               
 82    6400         73.68     33.23   60.78   72.94   52.10    0.61                                                                                                                               
 85    6600         79.77     29.21   61.65   72.99   53.36    0.62                                                                                                                               
 88    6800        125.10     44.11   61.69   72.32   53.78    0.62                                                                                                                               
 91    7000         62.31     29.18   61.95   73.84   53.36    0.62                                                                                                                               
 95    7200         44.03     19.51   61.99   73.14   53.78    0.62                                                                                                                               
 98    7400         46.05     15.76   60.98   72.67   52.52    0.61                                                                                                                               
101    7600         43.38     10.81   62.20   72.22   54.62    0.62                                                                                                                               
105    7800         25.63     10.48   58.65   72.67   49.16    0.59                                                                                                                               
108    8000         92.39     25.84   62.35   72.63   54.62    0.62                                                                                                                               
111    8200         27.62      9.18   62.65   73.45   54.62    0.63                                                                                                                               
115    8400         40.35     11.85   62.14   73.56   53.78    0.62                                                                                                                               
118    8600         24.75      8.94   62.05   71.82   54.62    0.62                                                                                                                               
121    8800         32.70     10.96   61.72   71.67   54.20    0.62                                                                                                                               
125    9000         23.91      7.12   61.24   71.11   53.78    0.61                                                                                                                               
128    9200         31.73     10.01   61.24   71.11   53.78    0.61                                                                                                                               
131    9400         65.21     20.19   61.72   71.67   54.20    0.62                                                                                                                               
134    9600         11.40      3.41   61.54   71.91   53.78    0.62                                                                                                                               
138    9800         21.41      6.48   61.69   72.32   53.78    0.62                                                                                                                               
✔ Saved pipeline to output directory
training/model-last

Alternatively, you can run prodigy train on the same data by running all_prodigy:

$ python3 -m spacy project run all_prodigy
ℹ Running workflow 'all_prodigy'

=================================== db-in ===================================
Running command: /opt/homebrew/opt/python@3.10/bin/python3.10 -m prodigy db-in fashion_brands_training assets/fashion_brands_training.jsonl
✔ Imported 1235 annotations to 'fashion_brands_training' (session
2023-02-03_15-19-02) in database SQLite
Found and keeping existing "answer" in 1235 examples
Running command: /opt/homebrew/opt/python@3.10/bin/python3.10 -m prodigy db-in fashion_brands_eval assets/fashion_brands_eval.jsonl
✔ Imported 500 annotations to 'fashion_brands_eval' (session
2023-02-03_15-19-04) in database SQLite
Found and keeping existing "answer" in 500 examples

=============================== train_prodigy ===============================
Running command: /opt/homebrew/opt/python@3.10/bin/python3.10 -m prodigy train training/ --ner fashion_brands_training,eval:fashion_brands_eval --config configs/config.cfg --gpu-id -1
ℹ Using CPU
ℹ To switch to GPU 0, use the option: --gpu-id 0

========================= Generating Prodigy config =========================
✔ Generated training config

=========================== Initializing pipeline ===========================
[2023-02-03 15:19:05,519] [INFO] Set up nlp object from config
Components: ner
Merging training and evaluation data for 1 components
  - [ner] Training: 2470 | Evaluation: 1000 (from datasets)
Training: 1235 | Evaluation: 500
Labels: ner (1)
[2023-02-03 15:19:05,818] [INFO] Pipeline: ['tok2vec', 'ner']
[2023-02-03 15:19:05,820] [INFO] Created vocabulary
[2023-02-03 15:19:05,821] [INFO] Finished initializing nlp object
[2023-02-03 15:19:06,960] [INFO] Initialized pipeline components: ['tok2vec', 'ner']
✔ Initialized pipeline

============================= Training pipeline =============================
Components: ner
Merging training and evaluation data for 1 components
  - [ner] Training: 2470 | Evaluation: 1000 (from datasets)
Training: 1235 | Evaluation: 500
Labels: ner (1)
ℹ Pipeline: ['tok2vec', 'ner']
ℹ Initial learn rate: 0.0
E    #       LOSS TOK2VEC  LOSS NER  ENTS_F  ENTS_P  ENTS_R  SCORE 
---  ------  ------------  --------  ------  ------  ------  ------
  0       0          0.00     46.17    0.00    0.00    0.00    0.00
  0     200         10.44  14143.08    0.00    0.00    0.00    0.00
  0     400         17.36    921.48    0.00    0.00    0.00    0.00
  1     600         18.70    517.74    0.00    0.00    0.00    0.00
  1     800         22.26    619.64    0.83   50.00    0.42    0.01
  2    1000         26.61    656.45    4.84   60.00    2.52    0.05
  3    1200         29.64    745.70    9.67   41.94    5.46    0.10
  4    1400         37.73    754.50   20.98   47.76   13.45    0.21
  6    1600         82.65    884.78   30.59   46.96   22.69    0.31
  7    1800        391.86    984.87   36.60   49.64   28.99    0.37
  9    2000        354.41   1072.19   39.60   48.19   33.61    0.40
 12    2200        107.65    988.55   41.21   51.25   34.45    0.41
 15    2400        138.04   1029.82   47.12   55.06   41.18    0.47
 19    2600        149.17    955.62   50.24   57.61   44.54    0.50
 22    2800        124.06    703.44   50.84   59.22   44.54    0.51
 25    3000        121.32    583.64   53.72   62.57   47.06    0.54
 29    3200        112.32    431.85   54.55   63.33   47.90    0.55
 32    3400        115.82    384.64   55.77   65.17   48.74    0.56
 35    3600        122.27    307.42   55.50   64.44   48.74    0.56
 38    3800        124.70    295.25   57.84   69.41   49.58    0.58
 42    4000        153.26    254.92   57.56   68.60   49.58    0.58
 45    4200        183.82    225.83   57.63   68.00   50.00    0.58
 48    4400        191.45    206.76   57.62   66.48   50.84    0.58
 52    4600        183.82    170.08   57.42   66.67   50.42    0.57
 55    4800        104.11    106.09   57.76   66.85   50.84    0.58
 58    5000        132.83     96.88   57.97   68.18   50.42    0.58
 62    5200        104.80     78.27   59.51   70.93   51.26    0.60
 65    5400         94.62     77.89   59.66   71.35   51.26    0.60
 68    5600         88.30     58.62   59.51   70.93   51.26    0.60
 72    5800         91.84     43.24   60.00   71.51   51.68    0.60
 75    6000        132.88     50.87   59.86   68.85   52.94    0.60
 78    6200         77.27     42.82   60.59   73.21   51.68    0.61
 82    6400         73.68     33.23   60.78   72.94   52.10    0.61
 85    6600         79.77     29.21   61.65   72.99   53.36    0.62
 88    6800        125.10     44.11   61.69   72.32   53.78    0.62
 91    7000         62.31     29.18   61.95   73.84   53.36    0.62
 95    7200         44.03     19.51   61.99   73.14   53.78    0.62
 98    7400         46.05     15.76   60.98   72.67   52.52    0.61
101    7600         43.38     10.81   62.20   72.22   54.62    0.62
105    7800         25.63     10.48   58.65   72.67   49.16    0.59
108    8000         92.39     25.84   62.35   72.63   54.62    0.62
111    8200         27.62      9.18   62.65   73.45   54.62    0.63
115    8400         40.35     11.85   62.14   73.56   53.78    0.62
118    8600         24.75      8.94   62.05   71.82   54.62    0.62
121    8800         32.70     10.96   61.72   71.67   54.20    0.62
125    9000         23.91      7.12   61.24   71.11   53.78    0.61
128    9200         31.73     10.01   61.24   71.11   53.78    0.61
131    9400         65.21     20.19   61.72   71.67   54.20    0.62
134    9600         11.40      3.41   61.54   71.91   53.78    0.62
138    9800         21.41      6.48   61.69   72.32   53.78    0.62
✔ Saved pipeline to output directory
training/model-last

From these two examples, you should get the same results!
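And on your earlier question about varying the number of iterations: in spaCy v3 this is controlled by the [training] block of the config (max_epochs and max_steps), and any config setting can be overridden on the command line. A sketch, assuming the corpus/ layout from the export above (the output directory name is arbitrary):

python -m spacy train corpus/config.cfg --output training/e50 --paths.train corpus/train.spacy --paths.dev corpus/dev.spacy --training.max_epochs 50

Running that with different max_epochs values, and re-exporting with different data-to-spacy --eval-split ratios, should reproduce the kind of grid you ran in v2.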

Here are the versions:

$ python -m prodigy stats
============================== ✨  Prodigy Stats ==============================

Version          1.11.10                       
Location         /opt/homebrew/lib/python3.10/site-packages/prodigy   
Platform         macOS-13.0.1-arm64-arm-64bit  
Python Version   3.10.8

$ python -m spacy info

============================== Info about spaCy ==============================

spaCy version    3.5.0                         
Location         /opt/homebrew/lib/python3.10/site-packages/spacy
Platform         macOS-13.0.1-arm64-arm-64bit  
Python version   3.10.8