Can't find component 'spancat' in pipeline.

Mohammad · January 20, 2023, 1:35pm

python3 -m prodigy spans.correct correctd_DBCVDOBV1 ./CV_DOBV1/model-best/ ./12.01.23_100_Cv_Format.jsonl

✘ Can't find component 'spancat' in pipeline. Make sure that the
pipeline you're using includes a trained span categorizer that you can correct.
If your component has a different name, you can use the --component option to
specify it.

Can you help me resolve this issue?

koaning · January 20, 2023, 3:13pm

Just to check, how did you train your model? Did you train for a NER task? If so, that might explain why it doesn't have a spancat component.

Note that you can always check the components in your pipeline from Python. Here's the standard en_core_web_sm model and its components:

import spacy 

nlp = spacy.load("en_core_web_sm")
print(nlp.pipeline)

This yields:

[('tok2vec', <spacy.pipeline.tok2vec.Tok2Vec at 0x7fea5e251fa0>),
 ('tagger', <spacy.pipeline.tagger.Tagger at 0x7fea5e251ee0>),
 ('parser', <spacy.pipeline.dep_parser.DependencyParser at 0x7fea9610f2e0>),
 ('attribute_ruler',
  <spacy.pipeline.attributeruler.AttributeRuler at 0x7fea5e1e7140>),
 ('lemmatizer',
  <spacy.lang.en.lemmatizer.EnglishLemmatizer at 0x7fea5e1de140>),
 ('ner', <spacy.pipeline.ner.EntityRecognizer at 0x7fea9610f350>)]

Similarly, you should be able to load your own model and see its pipeline components via:

nlp = spacy.load("path/to/model")
print(nlp.pipeline)
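
If you only care whether a specific component is present, comparing against the component names can be easier to read than the full listing. Here's a minimal sketch, assuming spaCy v3's nlp.pipe_names attribute. The helper names missing_components and check_model are made up for this example and are not part of spaCy or Prodigy:

```python
def missing_components(pipe_names, required=("spancat",)):
    """Return the required component names that are absent from the pipeline."""
    return [name for name in required if name not in pipe_names]


def check_model(path, required=("spancat",)):
    """Load a pipeline from disk and report which required components it lacks."""
    # spaCy is imported here so missing_components() can be used without it.
    import spacy

    nlp = spacy.load(path)
    return missing_components(nlp.pipe_names, required)


# Component names of en_core_web_sm, copied from the listing above.
names = ["tok2vec", "tagger", "parser", "attribute_ruler", "lemmatizer", "ner"]
print(missing_components(names))  # ['spancat']
```

For your own model you would call check_model("path/to/model"). An empty list means the pipeline has everything the recipe is looking for.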

Mohammad · January 23, 2023, 6:35am

I printed the pipeline and got the output below:

[('tok2vec', <spacy.pipeline.tok2vec.Tok2Vec object at 0x0000027FAA5EEB60>), ('ner', <spacy.pipeline.ner.EntityRecognizer object at 0x0000027FAA525310>)]

koaning · January 23, 2023, 10:03am

Then it seems like you have a model that was trained to detect named entities (NER), while the recipe you're running, prodigy spans.correct, is trying to predict spans. That also explains the error you see, because a spancat component is needed to predict spans.

To learn more about the difference between spancat and NER, you might appreciate this blogpost:

One of the main differences is that spancat can predict overlapping spans, which NER doesn't allow in spaCy. That means that you probably need to train a model that can handle spans, which you can do via the train command in Prodigy. It will probably look something like:

python -m prodigy train --spancat <datasets>

Alternatively, if you can solve your task with NER instead of spancat, you may also consider using the ner.correct recipe instead.
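
To make that choice explicit, here's a small sketch that maps the component names of a pipeline to the matching recipe. The function suggest_recipe is hypothetical and only encodes the rule of thumb from this thread; it is not something Prodigy provides:

```python
def suggest_recipe(pipe_names):
    """Map a pipeline's component names to the matching Prodigy correct recipe."""
    if "spancat" in pipe_names:
        return "spans.correct"
    if "ner" in pipe_names:
        return "ner.correct"
    return None  # neither component is present, so one has to be trained first


# Component names from the pipeline printed earlier in this thread.
print(suggest_recipe(["tok2vec", "ner"]))  # ner.correct
```

With the pipeline you printed, this points to ner.correct unless you train a spancat component first.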
