train --spancat questions

Hello, this is my first post :slight_smile: in the forum. Apologies in advance for any mistakes, so here goes.

I'm using prodigy 1.11.6, spacy 3.1.4, spacy-transformers 1.0.6 and python 3.7.3 for this project.

I'll begin with some questions about training with spancat.

python -m prodigy train dest_path --spancat dataset -m en_core_web_lg --gpu-id 0
  1. Why are the LOSS TOK2VEC column values always zero? Am I missing something?

  2. When I load a model trained using transformers, I get an error. Here is how the model was trained and loaded, and the error:
python -m prodigy train dest_path --spancat dataset -m en_core_web_lg --gpu-id 0
nlp_span = spacy.load(path)
ValueError: Cannot deserialize model: mismatched structure

Any help would be most appreciated!

Thanks so much, best!

Hi @mhlucero , welcome to Prodigy!

For the first question, I'm curious whether you're combining models in some way. Is there anything in your pipeline that you're customizing in particular?

For the second question, can you try upgrading your spacy-transformers version? Perhaps the error comes from incompatible deserialization of an older model. Upgrading to v1.1.x should work!
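As a quick sanity check before reloading, you could verify the installed spacy-transformers version programmatically. This is just a minimal sketch, assuming the `packaging` library is available (it ships with most Python environments); the helper name `transformers_at_least` is made up for illustration:

```python
from importlib.metadata import PackageNotFoundError, version

from packaging.version import Version


def transformers_at_least(minimum: str = "1.1.0") -> bool:
    """Check whether the installed spacy-transformers meets a minimum version."""
    try:
        installed = version("spacy-transformers")
    except PackageNotFoundError:
        # Not installed at all in this environment.
        return False
    return Version(installed) >= Version(minimum)


# A pipeline saved under spacy-transformers 1.0.x may fail to deserialize
# with a mismatched install, so comparing versions first can save a retrain.
print(Version("1.0.6") < Version("1.1.0"))  # True
```

If the check fails, `pip install -U spacy-transformers` followed by retraining (as you did) is the usual path.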

Hello Miranda! Thanks so much for your answer!

  • First question: Nope, I'm using the standard spancat training, with no modifications to the model or config.
  • Second, about transformers: I upgraded the spacy-transformers library and retrained, with the same result.
    Here is the output. As you can see, the values in the first loss column are always zero (using either en_core_web_trf or en_core_web_lg):
>python -m prodigy train ./spans --spancat spans_adq -m en_core_web_trf --gpu-id 0
ℹ Using GPU: 0

========================= Generating Prodigy config =========================
ℹ Auto-generating config with spaCy
Using 'spacy.ngram_range_suggester.v1' for 'spancat' with sizes 1 to 14 (inferred from data)
ℹ Using config from base model
✔ Generated training config

=========================== Initializing pipeline ===========================
[2022-01-26 01:54:36,369] [INFO] Set up nlp object from config
Components: spancat
Merging training and evaluation data for 1 components
  - [spancat] Training: 5604 | Evaluation: 1400 (20% split)
Training: 5247 | Evaluation: 1376
Labels: spancat (4)
[2022-01-26 01:54:37,147] [INFO] Pipeline: ['transformer', 'tagger', 'parser', 'attribute_ruler', 'lemmatizer', 'ner', 'spancat']
[2022-01-26 01:54:37,148] [INFO] Resuming training for: ['transformer']
[2022-01-26 01:54:37,154] [INFO] Created vocabulary
[2022-01-26 01:54:37,155] [INFO] Finished initializing nlp object
[2022-01-26 01:54:38,254] [INFO] Initialized pipeline components: ['spancat']
✔ Initialized pipeline

============================= Training pipeline =============================
Components: spancat
Merging training and evaluation data for 1 components
  - [spancat] Training: 5604 | Evaluation: 1400 (20% split)
Training: 5247 | Evaluation: 1376
Labels: spancat (4)
ℹ Pipeline: ['transformer', 'tagger', 'parser', 'attribute_ruler', 'lemmatizer',
'ner', 'spancat']
ℹ Frozen components: ['tagger', 'parser', 'attribute_ruler', 'lemmatizer',
'ner']
ℹ Initial learn rate: 0.0
E    #       LOSS TRANS...  LOSS SPANCAT  SPANS_SC_F  SPANS_SC_P  SPANS_SC_R  SCORE
---  ------  -------------  ------------  ----------  ----------  ----------  ------
  0       0           0.00       7624.78        1.00        0.50       52.42    0.01
 16    1000           0.00    1308630.16       50.70       80.95       36.91    0.51
 32    2000           0.00      92991.15       66.33       77.73       57.84    0.66
 49    3000           0.00      74833.84       71.21       79.33       64.59    0.71
 65    4000           0.00      64684.00       73.33       80.59       67.27    0.73
 81    5000           0.00      57694.04       74.62       81.72       68.66    0.75
 98    6000           0.00      52803.86       75.42       82.19       69.68    0.75
114    7000           0.00      48827.18       76.11       82.89       70.34    0.76
130    8000           0.00      45509.48       76.48       83.13       70.82    0.76
147    9000           0.00      42590.01       76.94       83.36       71.44    0.77
163   10000           0.00      39692.86       77.30       83.40       72.03    0.77
180   11000           0.00      36962.59       77.37       82.78       72.62    0.77
196   12000           0.00      34527.88       77.95       82.69       73.72    0.78
212   13000           0.00      32684.88       78.36       82.35       74.74    0.78
229   14000           0.00      30843.14       78.41       81.80       75.29    0.78
245   15000           0.00      29531.76       78.61       81.81       75.66    0.79
261   16000           0.00      28393.27       78.80       81.86       75.95    0.79