prodigy train result is different from the spacy train result, why?

sigitpurnomo · February 17, 2022, 12:29pm

Hi,

I have created an annotated NER dataset using Prodigy. After finishing the annotation, I exported the dataset using Prodigy's data-to-spacy CLI.

Then I ran training from the CLI with both spaCy and Prodigy. After training finished, I found that the two produce different results: the best F-score from prodigy train is 81.792, while spacy train gives 82.323. Why is this happening? As far as I know, they should produce the same results.

For your information, I am using Prodigy v1.10.8 and spaCy v2.3.5. Here are the steps I followed:

1. python -m prodigy data-to-spacy ./train-70.json ./dev-30.json --lang id --ner my-dataset --eval-split 0.3

2. prodigy train ner my-dataset blank:id --output ./prodigy-model --eval-split 0.3 --n-iter 10

3. python -m spacy train id ./spacy-model ./train-70.json ./dev-30.json -p ner -n 10

The config file for both processes is the same:

{
  "beam_width":1,
  "beam_density":0.0,
  "beam_update_prob":1.0,
  "cnn_maxout_pieces":3,
  "nr_feature_tokens":6,
  "nr_class":46,
  "hidden_depth":1,
  "token_vector_width":96,
  "hidden_width":64,
  "maxout_pieces":2,
  "pretrained_vectors":null,
  "bilstm_depth":0,
  "self_attn_depth":0,
  "conv_depth":4,
  "conv_window":1,
  "embed_size":2000
}

Thank you

ines · February 22, 2022, 9:48am

Hi! Older versions of Prodigy (v1.10 and below) used their own training loop implementation with its own default settings, so a small difference in dropout, learning rate or batching can easily account for a difference in accuracy of +/- 1%.

This was actually one of the main motivations for standardising the training process in v1.11+ to call into spacy train directly, and it's also a good example of why the config system in spaCy v3 is useful for reproducible experiments, because it prevents hidden defaults.
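As a rough illustration of how such hidden defaults can be spotted once both runs are described by a config, here is a minimal Python sketch that flattens two spaCy v3 style config texts and lists the settings that differ. The helper names and the two config fragments are made up for the example; the values are not the actual defaults of any Prodigy or spaCy version.

```python
from configparser import ConfigParser


def flatten(cfg_text):
    """Turn an INI-style config text into a flat {"section.key": "value"} dict."""
    parser = ConfigParser(interpolation=None)  # leave ${...} references untouched
    parser.read_string(cfg_text)
    return {
        f"{section}.{key}": value
        for section in parser.sections()
        for key, value in parser[section].items()
    }


def diff_settings(cfg_a, cfg_b):
    """Return {setting: (value_in_a, value_in_b)} for every setting that differs."""
    a, b = flatten(cfg_a), flatten(cfg_b)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


# Made-up fragments for illustration only, not real defaults.
RUN_A = """
[training]
dropout = 0.1
max_epochs = 10
"""
RUN_B = """
[training]
dropout = 0.2
max_epochs = 10
"""

for setting, (left, right) in diff_settings(RUN_A, RUN_B).items():
    print(f"{setting}: {left} vs {right}")
```

With a fully written-out config, a diff like this comes back empty when two runs really use the same settings, which is not something that can be checked when the defaults live inside the training loop.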

sigitpurnomo · February 22, 2022, 10:04am

Hi Ines

Thank you. So, is it okay to use the output of the data-to-spacy command with spaCy v3?

ines · February 22, 2022, 5:07pm

Yes, that's the recommended workflow once you're serious about training your model.

sigitpurnomo · February 23, 2022, 2:06pm

Hello Ines

My purpose in using the spacy train CLI is to run experiments that vary the number of iterations, as I did with the prodigy train CLI, and observe the resulting metric scores. Is that possible? I have already tried spacy train in spaCy v3, but I cannot find how to do this.

Thank you

ines · February 28, 2022, 9:56am

I'm not sure I understand what you're trying to do here? What do you mean by the score of the metrics? If you're looking for per-label stats, you can get the same results and more using spacy evaluate.
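If the per-label numbers are needed in a script, one option is to have spacy evaluate write its scores to a JSON file with --output and read that file back. The sketch below assumes the file contains an ents_per_type mapping with p, r and f per label, which should match the spaCy v3 NER scores but is worth checking against your own file. The function names, label names and numbers are placeholders, not real results.

```python
import json


def load_metrics(path):
    """Read the JSON written by:
    python -m spacy evaluate <model> <data> --output <path>
    """
    with open(path, encoding="utf8") as f:
        return json.load(f)


def per_label_rows(metrics):
    """Return (label, precision, recall, f-score) rows, best f-score first."""
    per_type = metrics.get("ents_per_type") or {}
    rows = [(label, s["p"], s["r"], s["f"]) for label, s in per_type.items()]
    return sorted(rows, key=lambda row: row[3], reverse=True)


# Placeholder structure standing in for load_metrics("metrics.json").
SAMPLE_METRICS = {
    "ents_per_type": {
        "LABEL_B": {"p": 0.25, "r": 0.5, "f": 0.33},
        "LABEL_A": {"p": 0.75, "r": 0.5, "f": 0.6},
    }
}

for label, p, r, f in per_label_rows(SAMPLE_METRICS):
    print(f"{label:<10} P={p:.2f} R={r:.2f} F={f:.2f}")
```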

sigitpurnomo · February 28, 2022, 10:26am

Hi Ines

What I have done with Prodigy and spaCy v2 is run several experiments that vary the train/eval split and the number of iterations: for example, a 70:30 split with 50, 75, 100, and 500 iterations, plus an 80:20 split with 50, 75, 100, and 500 iterations. From those experiments, I observed the overall F-score, recall, and precision across all entities to determine the best model (the one with the highest F-score). After that, I look at the metrics for each entity in more detail.

I want to migrate those experiments to spaCy v3.
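One possible way to reproduce that grid with spaCy v3 is to keep a single config and override the training length on the command line. The sketch below only builds the spacy train commands. It assumes the corpus for each split was already exported with data-to-spacy into directories named corpus-70-30 and corpus-80-20 (hypothetical names), and that "iterations" maps to training.max_epochs, with training.max_steps and training.patience set to 0 so that neither the step limit nor early stopping ends a run first.

```python
from itertools import product


def build_train_commands(config_path, splits, epoch_counts):
    """Build one `spacy train` command per (split, epochs) combination."""
    commands = []
    for split, epochs in product(splits, epoch_counts):
        corpus = f"corpus-{split}"  # hypothetical directory layout
        commands.append([
            "python", "-m", "spacy", "train", config_path,
            "--output", f"training/split-{split}-epochs-{epochs}",
            "--paths.train", f"{corpus}/train.spacy",
            "--paths.dev", f"{corpus}/dev.spacy",
            "--training.max_epochs", str(epochs),
            "--training.max_steps", "0",  # 0 = no step limit
            "--training.patience", "0",   # 0 = no early stopping
        ])
    return commands


COMMANDS = build_train_commands(
    "corpus/config.cfg", ["70-30", "80-20"], [50, 75, 100, 500]
)
for cmd in COMMANDS:
    print(" ".join(cmd))
```

Each command can then be run with subprocess.run(cmd, check=True). spaCy writes both model-best and model-last to every output directory, so the highest-scoring checkpoint of each run is kept for a later spacy evaluate.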

ryanwesslen · February 3, 2023, 8:47pm

hi @sigitpurnomo!

Sorry for the delayed response. We're trying to close out old issues.

Regarding comparing prodigy train and spacy train, I recommend anyone interested to check out the Prodigy sample project:

If you clone this repo, you can run two examples to compare spacy train and prodigy train.

Using the sample fashion data, you can run spacy train with:

python -m spacy project run all

This will load the data (db-in), export the data and config file (data-to-spacy), and run spacy train (see the project.yml).

python -m spacy project run all
ℹ Running workflow 'all'

=================================== db-in ===================================
Running command: /opt/homebrew/opt/python@3.10/bin/python3.10 -m prodigy db-in fashion_brands_training assets/fashion_brands_training.jsonl
✔ Created dataset 'fashion_brands_training' in database SQLite
✔ Imported 1235 annotations to 'fashion_brands_training' (session
2023-02-03_15-10-32) in database SQLite
Found and keeping existing "answer" in 1235 examples
Running command: /opt/homebrew/opt/python@3.10/bin/python3.10 -m prodigy db-in fashion_brands_eval assets/fashion_brands_eval.jsonl
✔ Created dataset 'fashion_brands_eval' in database SQLite
✔ Imported 500 annotations to 'fashion_brands_eval' (session
2023-02-03_15-10-33) in database SQLite
Found and keeping existing "answer" in 500 examples

=============================== data-to-spacy ===============================
Running command: /opt/homebrew/opt/python@3.10/bin/python3.10 -m prodigy data-to-spacy corpus/ --ner fashion_brands_training,eval:fashion_brands_eval
ℹ Using language 'en'

============================== Generating data ==============================
Components: ner
Merging training and evaluation data for 1 components
  - [ner] Training: 1235 | Evaluation: 500 (from datasets)
Training: 1235 | Evaluation: 500
Labels: ner (1)
✔ Saved 1235 training examples
corpus/train.spacy
✔ Saved 500 evaluation examples
corpus/dev.spacy

============================= Generating config =============================
ℹ Auto-generating config with spaCy
✔ Generated training config

======================== Generating cached label data ========================
✔ Saving label data for component 'ner'
corpus/labels/ner.json

============================= Finalizing export =============================
✔ Saved training config
corpus/config.cfg

To use this data for training with spaCy, you can run:
python -m spacy train corpus/config.cfg --paths.train corpus/train.spacy --paths.dev corpus/dev.spacy

================================ train_spacy ================================
Running command: /opt/homebrew/opt/python@3.10/bin/python3.10 -m spacy train configs/config.cfg --output training/ --paths.train corpus/train.spacy --paths.dev corpus/dev.spacy --gpu-id -1
ℹ Saving to output directory: training
ℹ Using CPU
ℹ To switch to GPU 0, use the option: --gpu-id 0

=========================== Initializing pipeline ===========================
[2023-02-03 15:10:39,792] [INFO] Set up nlp object from config
[2023-02-03 15:10:39,799] [INFO] Pipeline: ['tok2vec', 'ner']
[2023-02-03 15:10:39,801] [INFO] Created vocabulary
[2023-02-03 15:10:39,802] [INFO] Finished initializing nlp object
[2023-02-03 15:10:41,529] [INFO] Initialized pipeline components: ['tok2vec', 'ner']
✔ Initialized pipeline

============================= Training pipeline =============================
ℹ Pipeline: ['tok2vec', 'ner']
ℹ Initial learn rate: 0.0
E    #       LOSS TOK2VEC  LOSS NER  ENTS_F  ENTS_P  ENTS_R  SCORE 
---  ------  ------------  --------  ------  ------  ------  ------
  0       0          0.00     46.17    0.00    0.00    0.00    0.00
  0     200         10.44  14143.08    0.00    0.00    0.00    0.00                                                                                                                               
  0     400         17.36    921.48    0.00    0.00    0.00    0.00                                                                                                                               
  1     600         18.70    517.74    0.00    0.00    0.00    0.00                                                                                                                               
  1     800         22.26    619.64    0.83   50.00    0.42    0.01                                                                                                                               
  2    1000         26.61    656.45    4.84   60.00    2.52    0.05                                                                                                                               
  3    1200         29.64    745.70    9.67   41.94    5.46    0.10                                                                                                                               
  4    1400         37.73    754.50   20.98   47.76   13.45    0.21                                                                                                                               
  6    1600         82.65    884.78   30.59   46.96   22.69    0.31                                                                                                                               
  7    1800        391.86    984.87   36.60   49.64   28.99    0.37                                                                                                                               
  9    2000        354.41   1072.19   39.60   48.19   33.61    0.40                                                                                                                               
 12    2200        107.65    988.55   41.21   51.25   34.45    0.41                                                                                                                               
 15    2400        138.04   1029.82   47.12   55.06   41.18    0.47                                                                                                                               
 19    2600        149.17    955.62   50.24   57.61   44.54    0.50                                                                                                                               
 22    2800        124.06    703.44   50.84   59.22   44.54    0.51                                                                                                                               
 25    3000        121.32    583.64   53.72   62.57   47.06    0.54                                                                                                                               
 29    3200        112.32    431.85   54.55   63.33   47.90    0.55                                                                                                                               
 32    3400        115.82    384.64   55.77   65.17   48.74    0.56                                                                                                                               
 35    3600        122.27    307.42   55.50   64.44   48.74    0.56                                                                                                                               
 38    3800        124.70    295.25   57.84   69.41   49.58    0.58                                                                                                                               
 42    4000        153.26    254.92   57.56   68.60   49.58    0.58                                                                                                                               
 45    4200        183.82    225.83   57.63   68.00   50.00    0.58                                                                                                                               
 48    4400        191.45    206.76   57.62   66.48   50.84    0.58                                                                                                                               
 52    4600        183.82    170.08   57.42   66.67   50.42    0.57                                                                                                                               
 55    4800        104.11    106.09   57.76   66.85   50.84    0.58                                                                                                                               
 58    5000        132.83     96.88   57.97   68.18   50.42    0.58                                                                                                                               
 62    5200        104.80     78.27   59.51   70.93   51.26    0.60                                                                                                                               
 65    5400         94.62     77.89   59.66   71.35   51.26    0.60                                                                                                                               
 68    5600         88.30     58.62   59.51   70.93   51.26    0.60                                                                                                                               
 72    5800         91.84     43.24   60.00   71.51   51.68    0.60                                                                                                                               
 75    6000        132.88     50.87   59.86   68.85   52.94    0.60                                                                                                                               
 78    6200         77.27     42.82   60.59   73.21   51.68    0.61                                                                                                                               
 82    6400         73.68     33.23   60.78   72.94   52.10    0.61                                                                                                                               
 85    6600         79.77     29.21   61.65   72.99   53.36    0.62                                                                                                                               
 88    6800        125.10     44.11   61.69   72.32   53.78    0.62                                                                                                                               
 91    7000         62.31     29.18   61.95   73.84   53.36    0.62                                                                                                                               
 95    7200         44.03     19.51   61.99   73.14   53.78    0.62                                                                                                                               
 98    7400         46.05     15.76   60.98   72.67   52.52    0.61                                                                                                                               
101    7600         43.38     10.81   62.20   72.22   54.62    0.62                                                                                                                               
105    7800         25.63     10.48   58.65   72.67   49.16    0.59                                                                                                                               
108    8000         92.39     25.84   62.35   72.63   54.62    0.62                                                                                                                               
111    8200         27.62      9.18   62.65   73.45   54.62    0.63                                                                                                                               
115    8400         40.35     11.85   62.14   73.56   53.78    0.62                                                                                                                               
118    8600         24.75      8.94   62.05   71.82   54.62    0.62                                                                                                                               
121    8800         32.70     10.96   61.72   71.67   54.20    0.62                                                                                                                               
125    9000         23.91      7.12   61.24   71.11   53.78    0.61                                                                                                                               
128    9200         31.73     10.01   61.24   71.11   53.78    0.61                                                                                                                               
131    9400         65.21     20.19   61.72   71.67   54.20    0.62                                                                                                                               
134    9600         11.40      3.41   61.54   71.91   53.78    0.62                                                                                                                               
138    9800         21.41      6.48   61.69   72.32   53.78    0.62                                                                                                                               
✔ Saved pipeline to output directory
training/model-last

Alternatively, you can run prodigy train on the same data by running all_prodigy:

$ python3 -m spacy project run all_prodigy
ℹ Running workflow 'all_prodigy'

=================================== db-in ===================================
Running command: /opt/homebrew/opt/python@3.10/bin/python3.10 -m prodigy db-in fashion_brands_training assets/fashion_brands_training.jsonl
✔ Imported 1235 annotations to 'fashion_brands_training' (session
2023-02-03_15-19-02) in database SQLite
Found and keeping existing "answer" in 1235 examples
Running command: /opt/homebrew/opt/python@3.10/bin/python3.10 -m prodigy db-in fashion_brands_eval assets/fashion_brands_eval.jsonl
✔ Imported 500 annotations to 'fashion_brands_eval' (session
2023-02-03_15-19-04) in database SQLite
Found and keeping existing "answer" in 500 examples

=============================== train_prodigy ===============================
Running command: /opt/homebrew/opt/python@3.10/bin/python3.10 -m prodigy train training/ --ner fashion_brands_training,eval:fashion_brands_eval --config configs/config.cfg --gpu-id -1
ℹ Using CPU
ℹ To switch to GPU 0, use the option: --gpu-id 0

========================= Generating Prodigy config =========================
✔ Generated training config

=========================== Initializing pipeline ===========================
[2023-02-03 15:19:05,519] [INFO] Set up nlp object from config
Components: ner
Merging training and evaluation data for 1 components
  - [ner] Training: 2470 | Evaluation: 1000 (from datasets)
Training: 1235 | Evaluation: 500
Labels: ner (1)
[2023-02-03 15:19:05,818] [INFO] Pipeline: ['tok2vec', 'ner']
[2023-02-03 15:19:05,820] [INFO] Created vocabulary
[2023-02-03 15:19:05,821] [INFO] Finished initializing nlp object
[2023-02-03 15:19:06,960] [INFO] Initialized pipeline components: ['tok2vec', 'ner']
✔ Initialized pipeline

============================= Training pipeline =============================
Components: ner
Merging training and evaluation data for 1 components
  - [ner] Training: 2470 | Evaluation: 1000 (from datasets)
Training: 1235 | Evaluation: 500
Labels: ner (1)
ℹ Pipeline: ['tok2vec', 'ner']
ℹ Initial learn rate: 0.0
E    #       LOSS TOK2VEC  LOSS NER  ENTS_F  ENTS_P  ENTS_R  SCORE 
---  ------  ------------  --------  ------  ------  ------  ------
  0       0          0.00     46.17    0.00    0.00    0.00    0.00
  0     200         10.44  14143.08    0.00    0.00    0.00    0.00
  0     400         17.36    921.48    0.00    0.00    0.00    0.00
  1     600         18.70    517.74    0.00    0.00    0.00    0.00
  1     800         22.26    619.64    0.83   50.00    0.42    0.01
  2    1000         26.61    656.45    4.84   60.00    2.52    0.05
  3    1200         29.64    745.70    9.67   41.94    5.46    0.10
  4    1400         37.73    754.50   20.98   47.76   13.45    0.21
  6    1600         82.65    884.78   30.59   46.96   22.69    0.31
  7    1800        391.86    984.87   36.60   49.64   28.99    0.37
  9    2000        354.41   1072.19   39.60   48.19   33.61    0.40
 12    2200        107.65    988.55   41.21   51.25   34.45    0.41
 15    2400        138.04   1029.82   47.12   55.06   41.18    0.47
 19    2600        149.17    955.62   50.24   57.61   44.54    0.50
 22    2800        124.06    703.44   50.84   59.22   44.54    0.51
 25    3000        121.32    583.64   53.72   62.57   47.06    0.54
 29    3200        112.32    431.85   54.55   63.33   47.90    0.55
 32    3400        115.82    384.64   55.77   65.17   48.74    0.56
 35    3600        122.27    307.42   55.50   64.44   48.74    0.56
 38    3800        124.70    295.25   57.84   69.41   49.58    0.58
 42    4000        153.26    254.92   57.56   68.60   49.58    0.58
 45    4200        183.82    225.83   57.63   68.00   50.00    0.58
 48    4400        191.45    206.76   57.62   66.48   50.84    0.58
 52    4600        183.82    170.08   57.42   66.67   50.42    0.57
 55    4800        104.11    106.09   57.76   66.85   50.84    0.58
 58    5000        132.83     96.88   57.97   68.18   50.42    0.58
 62    5200        104.80     78.27   59.51   70.93   51.26    0.60
 65    5400         94.62     77.89   59.66   71.35   51.26    0.60
 68    5600         88.30     58.62   59.51   70.93   51.26    0.60
 72    5800         91.84     43.24   60.00   71.51   51.68    0.60
 75    6000        132.88     50.87   59.86   68.85   52.94    0.60
 78    6200         77.27     42.82   60.59   73.21   51.68    0.61
 82    6400         73.68     33.23   60.78   72.94   52.10    0.61
 85    6600         79.77     29.21   61.65   72.99   53.36    0.62
 88    6800        125.10     44.11   61.69   72.32   53.78    0.62
 91    7000         62.31     29.18   61.95   73.84   53.36    0.62
 95    7200         44.03     19.51   61.99   73.14   53.78    0.62
 98    7400         46.05     15.76   60.98   72.67   52.52    0.61
101    7600         43.38     10.81   62.20   72.22   54.62    0.62
105    7800         25.63     10.48   58.65   72.67   49.16    0.59
108    8000         92.39     25.84   62.35   72.63   54.62    0.62
111    8200         27.62      9.18   62.65   73.45   54.62    0.63
115    8400         40.35     11.85   62.14   73.56   53.78    0.62
118    8600         24.75      8.94   62.05   71.82   54.62    0.62
121    8800         32.70     10.96   61.72   71.67   54.20    0.62
125    9000         23.91      7.12   61.24   71.11   53.78    0.61
128    9200         31.73     10.01   61.24   71.11   53.78    0.61
131    9400         65.21     20.19   61.72   71.67   54.20    0.62
134    9600         11.40      3.41   61.54   71.91   53.78    0.62
138    9800         21.41      6.48   61.69   72.32   53.78    0.62
✔ Saved pipeline to output directory
training/model-last

From these two examples, you should get the same results!
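To check this without comparing the tables by eye, the score rows of both logs can be parsed and compared. This is a minimal sketch: the function name is made up, and the rows in the two strings are copied from the logs above (only three per log, to keep it short).

```python
def parse_rows(log_text):
    """Extract the numeric rows of a spaCy training table as tuples:
    (epoch, step, loss_tok2vec, loss_ner, ents_f, ents_p, ents_r, score)
    """
    rows = []
    for line in log_text.splitlines():
        parts = line.split()
        # Score rows have exactly 8 columns and start with two integers.
        if len(parts) == 8 and parts[0].isdigit() and parts[1].isdigit():
            rows.append(
                (int(parts[0]), int(parts[1])) + tuple(float(x) for x in parts[2:])
            )
    return rows


SPACY_LOG = """
E    #       LOSS TOK2VEC  LOSS NER  ENTS_F  ENTS_P  ENTS_R  SCORE
---  ------  ------------  --------  ------  ------  ------  ------
  0       0          0.00     46.17    0.00    0.00    0.00    0.00
111    8200         27.62      9.18   62.65   73.45   54.62    0.63
138    9800         21.41      6.48   61.69   72.32   53.78    0.62
"""

PRODIGY_LOG = """
  0       0          0.00     46.17    0.00    0.00    0.00    0.00
111    8200         27.62      9.18   62.65   73.45   54.62    0.63
138    9800         21.41      6.48   61.69   72.32   53.78    0.62
"""

spacy_rows = parse_rows(SPACY_LOG)
prodigy_rows = parse_rows(PRODIGY_LOG)
best = max(spacy_rows, key=lambda row: row[4])  # index 4 is ENTS_F

print("identical:", spacy_rows == prodigy_rows)
print("best ENTS_F:", best[4], "at step", best[1])
```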

Here are the versions:

$ python -m prodigy stats
============================== ✨  Prodigy Stats ==============================

Version          1.11.10                       
Location         /opt/homebrew/lib/python3.10/site-packages/prodigy   
Platform         macOS-13.0.1-arm64-arm-64bit  
Python Version   3.10.8

$ python -m spacy info

============================== Info about spaCy ==============================

spaCy version    3.5.0                         
Location         /opt/homebrew/lib/python3.10/site-packages/spacy
Platform         macOS-13.0.1-arm64-arm-64bit  
Python version   3.10.8
