Merge Entities Error

Thanks for the report! The problem here is that the terms.train-vectors recipe adds a new merge_entities component to the pipeline, which is then listed in the model’s meta.json. So when you load the model back in, spaCy tries to find a factory for that component to initialise it (just like it does for the 'tagger' or 'parser'), and no such factory is registered.

Sorry about that. The way this is currently handled isn’t ideal, and we need to go back and think about how best to solve it. For now, you can remove the 'merge_entities' component from the "pipeline" setting of your model’s meta.json and add the component manually after loading the model:

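import spacy
from prodigy.components.preprocess import merge_entities

# load the model (with 'merge_entities' removed from its meta.json)
nlp = spacy.load('your_model')
# add the component back in manually, at the end of the pipeline
nlp.add_pipe(merge_entities, name='merge_entities')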

This ensures that the entities are merged so the vectors you’ve trained for the merged entities are available. Here’s the function for reference:

def merge_entities(doc):
    """Preprocess a spaCy doc, merging entities into a single token.
    Best used with nlp.add_pipe(merge_entities).

    doc (spacy.tokens.Doc): The Doc object.
    RETURNS (Doc): The Doc object with merged noun entities.
    """
    spans = [(e.start_char, e.end_char, e.root.tag, e.root.dep, e.label)
             for e in doc.ents]
    for start, end, tag, dep, ent_type in spans:
        doc.merge(start, end, tag=tag, dep=dep, ent_type=ent_type)
    return doc
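
If you’d rather not edit meta.json by hand, the removal step described above could be scripted. This is only a minimal sketch: the helper name remove_pipeline_component is made up for illustration, and it assumes your model is a directory on disk with meta.json at its top level and a "pipeline" list inside it.

```python
import json
from pathlib import Path


def remove_pipeline_component(model_dir, name="merge_entities"):
    """Drop `name` from the "pipeline" list in <model_dir>/meta.json.

    Hypothetical helper, not part of spaCy or Prodigy.
    Returns True if meta.json was rewritten, False if `name` wasn't listed.
    """
    meta_path = Path(model_dir) / "meta.json"
    meta = json.loads(meta_path.read_text(encoding="utf8"))
    pipeline = meta.get("pipeline", [])
    if name not in pipeline:
        # nothing to do, so leave the file untouched
        return False
    # keep every other component in its original order
    meta["pipeline"] = [pipe for pipe in pipeline if pipe != name]
    meta_path.write_text(json.dumps(meta, indent=2), encoding="utf8")
    return True
```

You would call it once with the path to the model directory, e.g. remove_pipeline_component('/path/to/your_model'), and then load the model and add the component manually as shown earlier.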

Alternatively, you could package your model using the spacy package command and add an entry to Language.factories that initialises the pipeline component. My comments on this thread have more details on this solution.