Combining the results from two different Models

DerDiego13 · December 13, 2023, 8:53am

I have a well-performing NER model (self-trained) as well as a well-performing spancat model. However, they do not share the same Tok2Vec / Vocab, so just adding one to the other with nlp.add_pipe does not give the same results.
Can I apply them to the same sentence independently and then just add the span information from the spancat doc to the NER doc, so that I end up with one final doc that includes both entities and spans?

ryanwesslen · December 14, 2023, 2:09pm

Hi @DerDiego13,

Thanks for your question.

If your vectors are different, then you can try this:

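import spacy
from spacy.tokens import Doc

text = "Your text here"  # placeholder input

nlp1 = spacy.load("ner_model")
nlp2 = spacy.load("spancat_model")
doc1 = nlp1(text)
# Rebuild the doc on nlp2's vocab (the entities are serialized with it),
# then run the spancat pipeline on top
doc2 = Doc(nlp2.vocab).from_bytes(doc1.to_bytes())
doc2 = nlp2(doc2)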

Just for completeness, if both pipelines can use the same vectors, it would be:

nlp_ner = spacy.load("ner_model")
nlp_spancat = spacy.load("spancat_model", vocab=nlp_ner.vocab)
doc = nlp_ner(text)
doc = nlp_spancat(doc)

Or use nlp.add_pipe but replace the listeners first:

nlp_spancat.replace_listeners("tok2vec", "spancat", ["model.tok2vec"])
nlp_ner.add_pipe("spancat", source=nlp_spancat)
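
For reference, the literal approach from the question (run both pipelines independently, then copy the spans across) could look roughly like the sketch below. It is only an illustration under a few assumptions: copy_spans is a hypothetical helper, not a spaCy function; the spans are assumed to live under the default spancat key "sc"; and blank pipelines with hand-set annotations stand in for the two trained models. Spans are copied by character offsets so that they still line up if the two pipelines tokenize slightly differently.

```python
import spacy
from spacy.tokens import Doc, Span


def copy_spans(source: Doc, target: Doc, key: str = "sc") -> Doc:
    """Copy the span group `key` from `source` onto `target` via character offsets."""
    copied = []
    for span in source.spans.get(key, []):
        # "expand" snaps the offsets to the enclosing tokens of `target`
        # in case the two tokenizations differ.
        new_span = target.char_span(
            span.start_char, span.end_char,
            label=span.label_, alignment_mode="expand",
        )
        if new_span is not None:
            copied.append(new_span)
    target.spans[key] = copied
    return target


# Blank pipelines stand in for the trained models (assumption for this demo).
nlp_ner = spacy.blank("en")
nlp_spancat = spacy.blank("en")
text = "Apple opened a new office in Berlin."

doc_ner = nlp_ner(text)
doc_ner.ents = [Span(doc_ner, 0, 1, label="ORG")]  # pretend NER output

doc_spancat = nlp_spancat(text)
doc_spancat.spans["sc"] = [Span(doc_spancat, 3, 5, label="FACILITY")]  # pretend spancat output

# doc_ner now carries both the entities and the copied spans.
merged = copy_spans(doc_spancat, doc_ner)
print([(e.text, e.label_) for e in merged.ents])
print([(s.text, s.label_) for s in merged.spans["sc"]])
```

With real models, doc_ner and doc_spancat would simply be the outputs of the two loaded pipelines on the same text, and the hand-set annotations would be dropped.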

FYI, for questions like this that are spaCy-specific, make sure to check out the spaCy GitHub discussions first and consider posting there. This forum is for Prodigy-specific questions and while there can sometimes be overlap, you'll likely get a faster response by posting spaCy questions there.

Hope this helps!
