No tagger in pre-trained models?

I am trying to do Coref tagging.

My CLI is this:
python -m prodigy coref.manual coref_dataset en_core_web_sm d:\training\spacy_rel\assets\text_fragments.jsonl --label COREF

The error message I get confuses me, since I more or less copied this command from the documentation, and I can't imagine that the pre-trained model is incomplete:
ValueError: [E155] The pipeline needs to include a morphologizer or tagger+attribute_ruler in order to use Matcher or PhraseMatcher with the attribute POS. Try using 'nlp()' instead of 'nlp.make_doc()' or 'list(nlp.pipe())' instead of 'list(nlp.tokenizer.pipe())'.

I assume this is a silly mistake on my side, but I can't see it...

In case it is important: I'm running Prodigy 1.14.12 and spaCy 3.7 (incl. the 3.7 models).


Hi @akimotode,

Have you modified the spaCy pipeline in any way, e.g. by adding an EntityRuler or a custom NER component?
If so, make sure the entity_ruler and ner components come after the attribute_ruler, so that the POS labels produced by the tagger and attribute_ruler are available to entity_ruler and ner.
You can see your current order like so:

import spacy
nlp = spacy.load("en_core_web_sm")
# list the component names in the order they run
print(nlp.pipe_names)

To change the order, if necessary:

# move the NER component to the end of the pipeline: remove it, then re-add it from the same source in the new position
nlp.remove_pipe("ner")
nlp.add_pipe("ner", source=spacy.load("en_core_web_sm"))

# add the entity ruler directly before NER
nlp.add_pipe("entity_ruler", before="ner")
print(nlp.pipe_names)
# ['tok2vec', 'tagger', 'parser', 'attribute_ruler', 'lemmatizer', 'entity_ruler', 'ner', ... ]
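
If you want to double-check the order without reading the list by eye, something like the sketch below might help. It only looks at the names in nlp.pipe_names and mirrors the condition from the E155 message (a morphologizer, or tagger plus attribute_ruler, running earlier in the pipeline). The helper name pos_providers_before is made up for this example and is not part of spaCy or Prodigy, and it assumes your components use the default names:

```python
def pos_providers_before(pipe_names, component):
    """Return True if POS-setting components run before `component`.

    Hypothetical helper, not a spaCy API. Assumes default component names.
    """
    if component not in pipe_names:
        raise ValueError(f"{component!r} is not in the pipeline: {pipe_names}")
    # everything that runs before the component we care about
    earlier = set(pipe_names[: pipe_names.index(component)])
    # E155 asks for a morphologizer, or tagger + attribute_ruler
    return "morphologizer" in earlier or {"tagger", "attribute_ruler"} <= earlier


good = ["tok2vec", "tagger", "parser", "attribute_ruler", "lemmatizer", "entity_ruler", "ner"]
bad = ["tok2vec", "entity_ruler", "tagger", "parser", "attribute_ruler", "lemmatizer", "ner"]

print(pos_providers_before(good, "entity_ruler"))
print(pos_providers_before(bad, "entity_ruler"))
```

With a loaded pipeline you would pass nlp.pipe_names as the first argument. Note that this checks names only, so it cannot tell you whether a component has been disabled or excluded at load time.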