I am trying to do Coref tagging.
This is the command I'm running:
python -m prodigy coref.manual coref_dataset en_core_web_sm d:\training\spacy_rel\assets\text_fragments.jsonl --label COREF
The error message I get confuses me, since I more or less copied the command from the documentation, and I can't imagine that the pre-trained model is incomplete...
ValueError: [E155] The pipeline needs to include a morphologizer or tagger+attribute_ruler in order to use Matcher or PhraseMatcher with the attribute POS. Try using 'nlp()' instead of 'nlp.make_doc()' or 'list(nlp.pipe())' instead of 'list(nlp.tokenizer.pipe())'.
I assume this is a silly mistake on my side, but I can't see it...
In case it is important: I'm running Prodigy 1.14.12 and spaCy 3.7 (incl. the 3.7 models).
Cheers,
Kai
Hi @akimotode,
Have you modified the spaCy pipeline in any way, e.g. by adding an EntityRuler or a custom NER component?
If so, you should make sure the entity_ruler and ner components come after the attribute_ruler, so that the POS labels produced by tagger and attribute_ruler are available to entity_ruler and ner.
You can see your current order like so:
import spacy
nlp = spacy.load("en_core_web_sm")
print(nlp.pipe_names)
To change the order, if necessary:
# move the NER component to the end of the pipeline: remove and then reload from the same source in the new position
nlp.remove_pipe("ner")
nlp.add_pipe("ner", source=spacy.load("en_core_web_sm"))
# add entity ruler
nlp.add_pipe("entity_ruler", before="ner")
print(nlp.pipe_names)
# ['tok2vec', 'tagger', 'parser', 'attribute_ruler', 'lemmatizer', 'entity_ruler', 'ner', ... ]
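If you'd rather not eyeball the list, a small check along these lines might help. It is only a sketch: `runs_before_pos` is a hypothetical helper (not part of spaCy or Prodigy), and it assumes the components keep their default names (`tagger`, `attribute_ruler`, `morphologizer`, `entity_ruler`, `ner`). It returns the components that would run before the POS labels are available.

```python
def runs_before_pos(pipe_names, needs_pos=("entity_ruler", "ner")):
    """Return the components in needs_pos that run before POS tags are complete.

    Hypothetical helper, not part of spaCy or Prodigy. Assumes the default
    component names; custom-named components will not be recognised.
    """
    names = list(pipe_names)
    if "morphologizer" in names:
        # A morphologizer sets POS on its own
        pos_ready = names.index("morphologizer")
    elif "tagger" in names and "attribute_ruler" in names:
        # POS is complete once both tagger and attribute_ruler have run
        pos_ready = max(names.index("tagger"), names.index("attribute_ruler"))
    else:
        # Nothing sets POS at all, so every listed component is affected
        return [n for n in names if n in needs_pos]
    return [n for n in names[:pos_ready] if n in needs_pos]


# Example orders, written out by hand for illustration
good = ["tok2vec", "tagger", "parser", "attribute_ruler", "lemmatizer", "entity_ruler", "ner"]
bad = ["tok2vec", "tagger", "parser", "ner", "attribute_ruler", "lemmatizer"]

print(runs_before_pos(good))  # []
print(runs_before_pos(bad))   # ['ner']
```

With a loaded pipeline you would call it as `runs_before_pos(nlp.pipe_names)`; an empty list means nothing in `needs_pos` runs too early.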