I just noticed that v1.11 yields worse results than v1.10.8. I know the accuracy is crazy high and the task is quite easy, but looking at the outputs below I'm a little concerned about using the new training script. Any comments? Actually, I'm not even sure whether the same scores are being used.
I only have one label currently, but each document can have multiple labels in the future.
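That's why I'm training with --textcat-multilabel. My understanding (a minimal sketch of the spaCy v3 API, not taken from the Prodigy recipe; the EARNINGS label name is just a placeholder) is that the old textcat is now split into two components:

import spacy

nlp = spacy.blank("en")

# "textcat": mutually exclusive classes, one softmax over all labels
# nlp.add_pipe("textcat")

# "textcat_multilabel": an independent sigmoid score per label,
# so a document can end up with several labels at once
textcat = nlp.add_pipe("textcat_multilabel")
textcat.add_label("EARNINGS")  # placeholder label name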
v1.10.8
❯ prodigy train textcat tags-earnings blank:en
✔ Loaded model 'blank:en'
Created and merged data for 30340 total examples
Using 24272 train / 6068 eval (split 20%)
Component: textcat | Batch size: compounding | Dropout: 0.2 | Iterations: 10
ℹ Baseline accuracy: 0.573
=========================== ✨ Training the model ===========================
# Loss F-Score
-- -------- --------
1 90.53 0.998
2 0.05 0.999
3 0.04 0.999
...
v1.11.1
❯ prodigy train model --textcat-multilabel tags-earnings --base-model blank:en --gpu-id 0
ℹ Using GPU: 0
========================= Generating Prodigy config =========================
ℹ Auto-generating config with spaCy
ℹ Using config from base model
✔ Generated training config
=========================== Initializing pipeline ===========================
[2021-08-19 10:59:23,965] [INFO] Set up nlp object from config
Components: textcat_multilabel
Merging training and evaluation data for 1 components
- [textcat_multilabel] Training: 24850 | Evaluation: 6212 (20% split)
Training: 24348 | Evaluation: 6162
Labels: textcat_multilabel (1)
[2021-08-19 10:59:44,940] [INFO] Pipeline: ['textcat_multilabel']
[2021-08-19 10:59:44,942] [INFO] Created vocabulary
[2021-08-19 10:59:44,942] [INFO] Finished initializing nlp object
[2021-08-19 11:04:03,859] [INFO] Initialized pipeline components: ['textcat_multilabel']
✔ Initialized pipeline
============================= Training pipeline =============================
Components: textcat_multilabel
Merging training and evaluation data for 1 components
- [textcat_multilabel] Training: 24850 | Evaluation: 6212 (20% split)
Training: 24348 | Evaluation: 6162
Labels: textcat_multilabel (1)
ℹ Pipeline: ['textcat_multilabel']
ℹ Initial learn rate: 0.001
E # LOSS TEXTC... CATS_SCORE SCORE
--- ------ ------------- ---------- ------
0 0 0.25 23.22 0.23
.venv/lib/python3.9/site-packages/thinc/backends/ops.py:575: RuntimeWarning: overflow encountered in exp
return cast(FloatsType, 1.0 / (1.0 + self.xp.exp(-X)))
0 200 29.42 26.12 0.26
0 400 24.75 47.24 0.47
0 600 14.49 82.22 0.82
0 800 9.31 94.85 0.95
0 1000 9.09 95.81 0.96
0 1200 6.06 95.97 0.96
0 1400 8.99 93.77 0.94
0 1600 5.46 98.40 0.98
0 1800 4.66 96.37 0.96
0 2000 6.60 97.91 0.98
0 2200 6.03 98.20 0.98
0 2400 1.88 99.06 0.99
0 2600 4.84 99.30 0.99
0 2800 5.86 99.04 0.99
0 3000 2.18 98.79 0.99
0 3200 1.81 96.60 0.97
0 3400 6.08 98.86 0.99
0 3600 5.32 97.38 0.97
0 3800 1.12 98.75 0.99
0 4000 2.05 99.10 0.99
0 4200 6.30 99.47 0.99
0 4400 2.72 98.94 0.99
0 4600 1.83 98.82 0.99
0 4800 1.85 97.64 0.98
0 5000 4.61 98.63 0.99
0 5200 2.04 98.21 0.98
0 5400 2.85 99.18 0.99
0 5600 3.72 97.34 0.97
0 5800 2.62 98.71 0.99
✔ Saved pipeline to output directory
model/model-last
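One thing I notice comparing the two tables: they don't seem to report the same metric. v1.10.8 prints an F-Score, while v1.11 prints CATS_SCORE, which for textcat_multilabel is macro AUC by default, as far as I can tell from the spaCy v3 config. If that's right, the two columns aren't directly comparable: F1 depends on a decision threshold, while AUC depends only on the ranking of the scores. A toy illustration with made-up numbers (scikit-learn, my own sketch):

from sklearn.metrics import f1_score, roc_auc_score

y_true = [1, 0, 1, 1, 0, 1]
y_prob = [0.90, 0.40, 0.80, 0.55, 0.10, 0.45]  # made-up model scores
y_pred = [int(p >= 0.5) for p in y_prob]       # thresholded at 0.5

print(f1_score(y_true, y_pred))       # ~0.857: threshold-dependent
print(roc_auc_score(y_true, y_prob))  # 1.0: perfect ranking despite the miss at 0.45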
Also, what do E and # denote in the header?
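One more data point: the RuntimeWarning in the middle of the run comes from the plain sigmoid on the thinc line in the traceback, which overflows in exp() for large-magnitude negative inputs. A minimal numpy reproduction, plus the usual numerically stable formulation (my own sketch, not thinc's actual code):

import numpy as np

x = np.array([-1000.0, 0.0, 1000.0])

# same expression as the thinc line above: exp(1000) overflows to inf
# and raises "RuntimeWarning: overflow encountered in exp"
naive = 1.0 / (1.0 + np.exp(-x))

# stable variant: only ever exponentiate non-positive values
def stable_sigmoid(x):
    out = np.empty_like(x)
    pos = x >= 0
    out[pos] = 1.0 / (1.0 + np.exp(-x[pos]))
    e = np.exp(x[~pos])
    out[~pos] = e / (1.0 + e)
    return out

print(stable_sigmoid(x))  # [0.  0.5 1. ] with no warning

The results still come out fine in the naive case (the inf propagates to a 0.0 score), so I assume the warning is harmless, but it would be good to confirm.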