Combining the results from two different Models

I have a well-performing, self-trained NER model as well as a well-performing spancat model. However, they do not share the same Tok2Vec / Vocab, so just adding one to the other with nlp.add_pipe does not give the same results.
Can I apply them to the same sentence independently and then just add the span information from the spancat doc to the NER doc, so that I end up with one final doc that includes both entities and spans?

hi @DerDiego13,

Thanks for your question.

If your vectors are different, then you can try this:

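import spacy
from spacy.tokens import Doc

nlp1 = spacy.load("ner_model")
nlp2 = spacy.load("spancat_model")
text = "..."  # the sentence you want to process
# Run the NER pipeline first
doc1 = nlp1(text)
# Re-create the doc (including its entities) on top of the spancat vocab
doc2 = Doc(nlp2.vocab).from_bytes(doc1.to_bytes())
# Run the spancat pipeline on the copied doc; doc2 now has both ents and spans
doc2 = nlp2(doc2)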
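
If you want to convince yourself that the entities survive the to_bytes / from_bytes round trip, here is a minimal sketch. It assumes two blank English pipelines as stand-ins for the trained models, and the sample sentence and hand-set entities are made up purely for illustration:

```python
import spacy
from spacy.tokens import Doc, Span

# Blank pipelines stand in for the trained NER and spancat models (assumption)
nlp1 = spacy.blank("en")
nlp2 = spacy.blank("en")

doc1 = nlp1("Apple opened an office in Berlin.")
# Set entities by hand to mimic what a trained NER component would predict
doc1.ents = [Span(doc1, 0, 1, label="ORG"), Span(doc1, 5, 6, label="GPE")]

# Copy the doc onto the second pipeline's vocab, then run that pipeline on it
doc2 = Doc(nlp2.vocab).from_bytes(doc1.to_bytes())
doc2 = nlp2(doc2)

ents = [(ent.text, ent.label_) for ent in doc2.ents]
print(ents)
```

With real models, doc2.spans should additionally contain whatever the spancat component predicts.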

Just for completeness, if you're assuming the same vectors it would be:

nlp_ner = spacy.load("ner_model")
nlp_spancat = spacy.load("spancat_model", vocab=nlp_ner.vocab)
doc = nlp_ner(text)
doc = nlp_spancat(doc)

Or use nlp.add_pipe but replace the listeners first:

nlp_spancat.replace_listeners("tok2vec", "spancat", ["model.tok2vec"])
nlp_ner.add_pipe("spancat", source=nlp_spancat)
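
And if you would rather run the two pipelines fully independently, as you describe in your question, one option might be to copy the spans over by character offsets. This is only a rough sketch: copy_spans is a hypothetical helper (not part of spaCy), the blank pipelines and hand-set annotations stand in for your trained models, and "sc" is assumed to be your spans key (it is the spancat default):

```python
import spacy
from spacy.tokens import Doc, Span


def copy_spans(source: Doc, target: Doc, key: str = "sc") -> Doc:
    """Copy the span group `key` from `source` onto `target` via character offsets."""
    copied = []
    for span in source.spans.get(key, []):
        # Map the character offsets onto the target's tokens; "expand" widens
        # the span to full tokens if the two tokenizations disagree
        new_span = target.char_span(
            span.start_char, span.end_char, label=span.label_, alignment_mode="expand"
        )
        if new_span is not None:
            copied.append(new_span)
    target.spans[key] = copied
    return target


# Stand-ins for the two trained pipelines (assumption)
nlp_ner = spacy.blank("en")
nlp_spancat = spacy.blank("en")

text = "Apple opened an office in Berlin."
doc_ner = nlp_ner(text)
doc_ner.ents = [Span(doc_ner, 0, 1, label="ORG")]  # mimics an NER prediction
doc_spancat = nlp_spancat(text)
doc_spancat.spans["sc"] = [Span(doc_spancat, 2, 4, label="FACILITY")]  # mimics spancat

merged = copy_spans(doc_spancat, doc_ner)
print(merged.ents, merged.spans["sc"])
```

Note that this copies only the span boundaries and labels onto the NER doc; the source doc must have been created from the same text for the offsets to line up.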

FYI, for questions like this that are spaCy-specific, make sure to check out the spaCy GitHub discussions first and consider posting there. This forum is for Prodigy-specific questions and while there can sometimes be overlap, you'll likely get a faster response by posting spaCy questions there.

Hope this helps!
