Use SpaCy textcat weights in a Prodigy TextClassifier

timothyjlaurent · September 17, 2019, 11:54pm

Hi,

I'm saving a spaCy model from a batch_train run, and now I'd like to use its weights in a textcat.teach workflow.

If I initialize the TextClassifier with a spaCy model that already has a textcat pipe, will it use the previously trained weights, or is there some other way to make this happen?

timothyjlaurent · September 18, 2019, 10:28pm

OK, I think I might have the answer to my question (but would appreciate confirmation).

So as far as I can tell, when you initialize Prodigy's TextClassifier model, it adds sentencizer and textcat pipes to the spaCy model if they don't exist.

If they do exist though, it looks like it just uses the ones that are present.

    textcat = nlp.get_pipe('textcat')
    textcat.foo = 'bar'

    model = TextClassifier(
        nlp,
        labels,
        long_text=long_text,
        low_data=len(examples) < 1000,
        init_tok2vec=init_tok2vec,
        exclusive_classes=exclusive,
    )

    assert model.nlp.get_pipe('textcat').foo == 'bar'

Is this right?

ines · September 19, 2019, 10:08am

Thanks for updating and yes, that's correct. What Prodigy calls the TextClassifier class (and what could have probably been named better) is the "annotation model" that takes the nlp object, makes sure that everything is set up correctly and handles updating the model in the loop.

One small thing to note about pre-defined text classifiers is that spaCy currently doesn't support resizing an existing pre-trained text classifier, so you can't add more labels to it if the model's already pre-trained. So if you're using an existing text classifier, the labels in the data all need to be in the model already.
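
A minimal sketch of how you could check this before starting the recipe. The helper name `missing_labels` and the label names are made up for illustration. In a real setup, the model's labels would come from `nlp.get_pipe("textcat").labels`.

```python
def missing_labels(model_labels, data_labels):
    """Return the labels used in the data that the model doesn't know, sorted."""
    return sorted(set(data_labels) - set(model_labels))


# Hypothetical usage with a loaded spaCy pipeline (not run here):
#   textcat = nlp.get_pipe("textcat")
#   unknown = missing_labels(textcat.labels, labels)
#   if unknown:
#       raise ValueError(f"Labels not in the pre-trained textcat: {unknown}")

# Standalone illustration with made-up label names
print(missing_labels(["POSITIVE", "NEGATIVE"], ["POSITIVE", "NEUTRAL"]))
# → ['NEUTRAL']
```

If the result is non-empty, those labels can't be added to the existing pre-trained text classifier, so you'd need to drop them from the data or train a new textcat component that includes them.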

timothyjlaurent · September 19, 2019, 8:07pm

Thanks for adding clarity and for the note about adding new labels.
