ValueError: [T003] Resizing pretrained Tagger models is not currently supported.

Sab · February 11, 2020, 1:49pm

Hello, I used pos.correct to manually correct spaCy's predictions. After completing the annotations, I tried to train the model using the annotated dataset (pos_correct_v2) but it gave me the following error:

ValueError: [T003] Resizing pretrained Tagger models is not currently supported.

Could you please help me in determining what I am doing wrong? Below is the full code:

python -m prodigy train tagger pos_correct_v2 en_core_web_md

Loaded model 'en_core_web_md'
Created and merged data for 46 total examples
Traceback (most recent call last):
  File "C:\Users\Name\AppData\Local\Continuum\anaconda3\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "C:\Users\Name\AppData\Local\Continuum\anaconda3\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\Name\AppData\Local\Continuum\anaconda3\lib\site-packages\prodigy\__main__.py", line 60, in <module>
    controller = recipe(*args, use_plac=True)
  File "cython_src\prodigy\core.pyx", line 213, in prodigy.core.recipe.recipe_decorator.recipe_proxy
  File "C:\Users\Name\AppData\Local\Continuum\anaconda3\lib\site-packages\plac_core.py", line 328, in call
    cmd, result = parser.consume(arglist)
  File "C:\Users\Name\AppData\Local\Continuum\anaconda3\lib\site-packages\plac_core.py", line 207, in consume
    return cmd, self.func(*(args + varargs + extraopts), **kwargs)
  File "C:\Users\Name\AppData\Local\Continuum\anaconda3\lib\site-packages\prodigy\recipes\train.py", line 102, in train
    pipe.add_label(label)
  File "pipes.pyx", line 565, in spacy.pipeline.pipes.Tagger.add_label
ValueError: [T003] Resizing pretrained Tagger models is not currently supported.

ines · February 11, 2020, 9:26pm

Hi! The error could maybe be a little more specific, sorry about that. What it's trying to tell you is that spaCy currently doesn't support adding more labels to an existing pretrained tagger. So your training data seems to include labels that are not among the tagger's existing labels.
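A quick way to check this is to compare the labels in the annotations against the tagger's label set. The sketch below is only an illustration: `find_unknown_labels` is a hypothetical helper, and it assumes the annotations are Prodigy-style task dicts with an `"answer"` key and a `"spans"` list whose entries carry a `"label"`. The sample data is made up so the snippet runs without spaCy or Prodigy installed.

```python
from collections import Counter


def find_unknown_labels(examples, model_labels):
    """Count span labels in the annotations that the tagger does not know."""
    known = set(model_labels)
    unknown = Counter()
    for eg in examples:
        # Only accepted answers are counted; rejected or ignored tasks are skipped.
        if eg.get("answer", "accept") != "accept":
            continue
        for span in eg.get("spans", []):
            if span["label"] not in known:
                unknown[span["label"]] += 1
    return unknown


# Hypothetical stand-ins for the real data. With spaCy and Prodigy installed,
# the inputs could instead come from something like:
#   model_labels = spacy.load("en_core_web_md").get_pipe("tagger").labels
#   examples = prodigy.components.db.connect().get_dataset("pos_correct_v2")
model_labels = ("NN", "VBZ", "VBD", "DT")
examples = [
    {"text": "She runs", "answer": "accept",
     "spans": [{"start": 4, "end": 8, "label": "VERB"}]},
    {"text": "The dog", "answer": "accept",
     "spans": [{"start": 0, "end": 3, "label": "DT"},
               {"start": 4, "end": 7, "label": "NOUN"}]},
    {"text": "Skipped", "answer": "reject",
     "spans": [{"start": 0, "end": 7, "label": "ADJ"}]},
]

unknown = find_unknown_labels(examples, model_labels)
print(dict(unknown))
```

If the result contains coarse-grained tags such as `VERB` or `NOUN`, those are the labels the train recipe would try to add to the pretrained tagger.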

One possible explanation is that the annotations were collected using coarse-grained tags like VERB (and not the fine-grained tags like VBZ etc. that are the underlying labels in the model). If that's the case, the easiest workaround would be to set the --binary flag to use Prodigy's annotation model, which can handle coarse-grained tags.

You could also use Prodigy to turn your coarse-grained POS tag annotations into fine-grained annotations by streaming in the data again and adding multiple-choice options based on the possible options in the tag map (nlp.Defaults.tag_map) – for instance, VBD, VBP, VBZ and so on for VERB. This makes sense if you want the fine-grained distinction in your data – if not, it's probably overkill.
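As a rough sketch of that second idea, the snippet below groups fine-grained tags by their coarse-grained tag and attaches them to a task as multiple-choice options. Several things here are assumptions: `TAG_MAP_EXCERPT` is a small hand-written excerpt with plain string values (the real `nlp.Defaults.tag_map` stores attribute dicts per fine-grained tag), and `fine_tags_by_coarse` and `add_fine_grained_options` are hypothetical helper names. The `"options"` list with `"id"` and `"text"` keys follows the format of Prodigy's `choice` interface.

```python
from collections import defaultdict

# Hand-written excerpt for illustration only, simplified to fine tag -> coarse tag.
TAG_MAP_EXCERPT = {
    "VB": "VERB", "VBD": "VERB", "VBG": "VERB",
    "VBN": "VERB", "VBP": "VERB", "VBZ": "VERB",
    "NN": "NOUN", "NNS": "NOUN",
    "JJ": "ADJ",
}


def fine_tags_by_coarse(tag_map):
    """Invert a fine -> coarse mapping into coarse -> sorted list of fine tags."""
    grouped = defaultdict(list)
    for fine, coarse in tag_map.items():
        grouped[coarse].append(fine)
    return {coarse: sorted(fines) for coarse, fines in grouped.items()}


def add_fine_grained_options(task, span, grouped):
    """Return a copy of the task focused on one span, with one option per fine tag."""
    new_task = dict(task)
    new_task["spans"] = [span]
    new_task["options"] = [
        {"id": tag, "text": tag} for tag in grouped.get(span["label"], [])
    ]
    return new_task


grouped = fine_tags_by_coarse(TAG_MAP_EXCERPT)
task = {"text": "She runs", "spans": [{"start": 4, "end": 8, "label": "VERB"}]}
choice_task = add_fine_grained_options(task, task["spans"][0], grouped)
print([opt["id"] for opt in choice_task["options"]])
# ['VB', 'VBD', 'VBG', 'VBN', 'VBP', 'VBZ']
```

Each resulting task asks one question per coarse-grained span, so the annotator only has to pick among the fine-grained tags that are compatible with the tag already assigned.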

Sab · February 12, 2020, 9:49am

Hi Ines, thank you for your quick reply!

So pos.correct should only be used with fine-grained tags - is that correct? Because when I used pos.correct in Prodigy with the coarse-grained tags (which is the default), I did not add any new tags - I only updated the existing pre-selected annotations.

I don't really need the fine-grained tags, but if I understand correctly, that's the only way to add/change the annotations suggested by the model - is that correct?

ines · February 12, 2020, 3:50pm

You can use both – it's just that for the coarse-grained tags, we need a bit of extra "magic", which is only available in Prodigy, to be able to update the model with the information. (For instance, if we know that something is a VERB, but we don't know if it's actually VBD, VBZ etc., we can still update the model towards the VERB.) If you add the --binary flag when you run the train command, it will use Prodigy's annotation model with the extra logic needed for coarse-grained tags.
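To illustrate the general idea of updating "towards the VERB", here is one possible way to turn a coarse-grained annotation into a training signal for a fine-grained tagger. This is not Prodigy's actual implementation, which isn't shown in this thread; it is a minimal sketch that assumes a softmax tagger and simply renormalizes the predicted probabilities over the fine-grained tags compatible with the coarse tag. The function names and the toy probabilities are hypothetical.

```python
def coarse_update_target(fine_probs, fine_to_coarse, coarse_label):
    """Build a target distribution that keeps only tags compatible with coarse_label."""
    compatible = [t for t in fine_probs if fine_to_coarse.get(t) == coarse_label]
    if not compatible:
        raise ValueError(f"No fine-grained tag maps to {coarse_label!r}")
    mass = sum(fine_probs[t] for t in compatible)
    target = {}
    for tag in fine_probs:
        if tag not in compatible:
            # Incompatible tags should receive no probability.
            target[tag] = 0.0
        elif mass > 0.0:
            # Keep the model's relative preferences among compatible tags.
            target[tag] = fine_probs[tag] / mass
        else:
            # Fall back to a uniform split if the model gave them zero mass.
            target[tag] = 1.0 / len(compatible)
    return target


def softmax_gradient(fine_probs, target):
    """Gradient of cross-entropy w.r.t. the logits of a softmax: probs - target."""
    return {tag: fine_probs[tag] - target[tag] for tag in fine_probs}


# Toy example: the model prefers NN, but the annotator said the token is a VERB.
fine_to_coarse = {"VBD": "VERB", "VBZ": "VERB", "NN": "NOUN"}
probs = {"VBD": 0.2, "VBZ": 0.2, "NN": 0.6}

target = coarse_update_target(probs, fine_to_coarse, "VERB")
grad = softmax_gradient(probs, target)
print(target)
print(grad)
```

In this toy case the gradient pushes probability away from `NN` and towards `VBD` and `VBZ` equally, without committing to either of them, which is the kind of partial update a coarse-grained label allows.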

(We should probably make this clear in the docs and also solve this more elegantly in the future – parts of this were a little tricky while we needed to preserve backwards-compatibility.)

Sab · February 13, 2020, 9:05am

Gotcha, now I understand. Definitely would be helpful to have this explained in the docs.

...but unfortunately, when I added the --binary flag when I ran the train command, I still got the same error.

python -m prodigy train tagger pos_correct_v2 en_core_web_md --binary

Is there something I am doing wrong? Or something more I should be doing?

ines · February 14, 2020, 1:25pm

Ahhh I think there's a second problem here that I missed earlier: the train recipe also adds all labels present in the data to the model, which makes sense for all other scenarios – except the one where you have coarse-grained part-of-speech tags. What happens if you comment out those lines (lines 101-102 in prodigy/recipes/train.py)?

Alternatively, you could also just use the previous pos.batch-train recipe. It's still included with Prodigy and the plan is to replace it with the new train recipe. But since this one use case isn't fully covered yet, there's nothing wrong with using the old recipe.

Sab · February 18, 2020, 10:01am

    Component: tagger | Batch size: compounding | Dropout: 0.2 | Iterations: 10
Traceback (most recent call last):
  File "C:\Users\Name\AppData\Local\Continuum\anaconda3\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "C:\Users\Name\AppData\Local\Continuum\anaconda3\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\Name\AppData\Local\Continuum\anaconda3\lib\site-packages\prodigy\__main__.py", line 60, in <module>
    controller = recipe(*args, use_plac=True)
  File "cython_src\prodigy\core.pyx", line 213, in prodigy.core.recipe.recipe_decorator.recipe_proxy
  File "C:\Users\Name\AppData\Local\Continuum\anaconda3\lib\site-packages\plac_core.py", line 328, in call
    cmd, result = parser.consume(arglist)
  File "C:\Users\Name\AppData\Local\Continuum\anaconda3\lib\site-packages\plac_core.py", line 207, in consume
    return cmd, self.func(*(args + varargs + extraopts), **kwargs)
  File "C:\Users\Name\AppData\Local\Continuum\anaconda3\lib\site-packages\prodigy\recipes\train.py", line 136, in train
    eval_data = [(doc.text, annot) for doc, annot in eval_data]
  File "C:\Users\Name\AppData\Local\Continuum\anaconda3\lib\site-packages\prodigy\recipes\train.py", line 136, in <listcomp>
    eval_data = [(doc.text, annot) for doc, annot in eval_data]
ValueError: too many values to unpack (expected 2)

Traceback (most recent call last):
  File "C:\Users\Name\AppData\Local\Continuum\anaconda3\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "C:\Users\Name\AppData\Local\Continuum\anaconda3\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\Name\AppData\Local\Continuum\anaconda3\lib\site-packages\prodigy\__main__.py", line 60, in <module>
    controller = recipe(*args, use_plac=True)
  File "cython_src\prodigy\core.pyx", line 213, in prodigy.core.recipe.recipe_decorator.recipe_proxy
  File "C:\Users\Name\AppData\Local\Continuum\anaconda3\lib\site-packages\plac_core.py", line 328, in call
    cmd, result = parser.consume(arglist)
  File "C:\Users\Name\AppData\Local\Continuum\anaconda3\lib\site-packages\plac_core.py", line 207, in consume
    return cmd, self.func(*(args + varargs + extraopts), **kwargs)
  File "C:\Users\Name\AppData\Local\Continuum\anaconda3\lib\site-packages\prodigy\deprecated\train.py", line 536, in pos_batch_train
    losses = model.batch_train(examples, batch_size=batch_size, drop=dropout)
  File "cython_src\prodigy\models\pos.pyx", line 87, in prodigy.models.pos.Tagger.batch_train
  File "cython_src\prodigy\models\pos.pyx", line 141, in prodigy.models.pos.Tagger.update
  File "cython_src\prodigy\models\pos.pyx", line 150, in prodigy.models.pos.Tagger.inc_gradient
  File "cython_src\prodigy\models\pos.pyx", line 75, in prodigy.models.pos.Tagger.get_label_index
ValueError: tuple.index(x): x not in tuple

@ines Could you kindly help in resolving this?

ines · February 18, 2020, 11:13am

The first error here looks like a very different problem and not related to adding the labels to the model. Are you using the latest version of Prodigy?

And how are you running the previous pos.batch-train recipe? What's the command and labels you're setting? That workflow should work, because nothing here changed.
