Using a saved textcat trained model for a new data set

red3d · May 1, 2020, 2:22am

Hello,

Sorry for my basic question, but I could not figure it out even though I spent hours on the documentation and here on the forum. Can you please explain how I can use my saved trained model on a new, large dataset to classify its texts? The insult classifier video shows how to do this for a single sentence or sample, but I want to label a whole unlabeled dataset with the saved model. Again, sorry for taking your time with this basic question, which I am really stuck on.

Thanks in advance.

ines · May 1, 2020, 11:59am

Hi! The model you've trained is a regular spaCy model, so you can apply it to your large dataset just like you would with any other spaCy model. So you'd load your texts and your model, process the texts with your model, and use the doc.cats property to access the predicted categories (a dict mapping each label to its score). Check out the spaCy docs on efficient processing with nlp.pipe. The result could look something like this:

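import spacy

# Load the model from the directory you saved it to (placeholder path)
nlp = spacy.load("/path/to/your/model")
your_data = load_list_or_stream_with_lots_of_texts()
for doc in nlp.pipe(your_data):
    # Do something with the predicted categories here
    print(doc.cats)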
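
If you want one label per text rather than the full score dict, a minimal sketch could look like the following. The helper name `label_texts` and the `batch_size` default are assumptions for illustration and not part of spaCy or Prodigy; the only spaCy calls relied on are `nlp.pipe` and `doc.cats`. Picking the single highest-scoring label only makes sense if your categories are mutually exclusive.

```python
from typing import Iterable, Iterator, Tuple


def label_texts(nlp, texts: Iterable[str], batch_size: int = 64) -> Iterator[Tuple[str, str, float]]:
    """Yield (text, top_label, score) for each text.

    `nlp` is assumed to be a loaded spaCy pipeline with a trained text
    classifier, e.g. the result of spacy.load() on your saved model directory.
    """
    for doc in nlp.pipe(texts, batch_size=batch_size):
        if not doc.cats:
            # No text classifier scores available for this doc
            yield doc.text, "", 0.0
            continue
        # doc.cats maps each label to its predicted score
        top_label = max(doc.cats, key=doc.cats.get)
        yield doc.text, top_label, doc.cats[top_label]
```

You would call this as `for text, label, score in label_texts(nlp, your_data): ...` and write each row to a file or database. For overlapping categories, you could instead keep every label whose score is above a threshold you choose.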
