TextCategorizer

mcharest1135 · July 9, 2020, 12:52am

The documentation says: "Default: Stacked ensemble of a bag-of-words model and a neural network model. The neural network uses a CNN with mean pooling and attention. The 'ngram_size' and 'attr' arguments can be used to configure the feature extraction for the bag-of-words model." How do I add the ngram_size and attr arguments to the config?

honnibal · July 9, 2020, 4:06pm

Hi @mcharest1135,

In general we can't support spaCy usage questions on this forum, so in future you'll need to ask this type of question on a forum such as StackOverflow. I've answered it this time but I've also unlisted the thread, and in future we won't be able to help here.

spaCy v2 doesn't really make it easy to configure the models, but you should be able to pass ngram_size as a config setting to nlp.create_pipe, like this:

nlp.add_pipe(
    nlp.create_pipe("textcat", config={"ngram_size": 3, "architecture": "bow"}))
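To make it a bit more concrete what ngram_size controls, here's a rough pure-Python sketch of bag-of-words n-gram counting. This is only an illustration, not spaCy's implementation: the function name extract_ngrams and the use of a plain callable for attr are assumptions made for the example. As far as I understand it, ngram_size=3 means unigrams, bigrams and trigrams are all included, and attr selects which token attribute (for example the lowercase form) is used as the feature.

```python
from collections import Counter


def extract_ngrams(tokens, ngram_size=1, attr=str):
    """Count every n-gram of length 1..ngram_size over a token list.

    `attr` is a hypothetical stand-in for the token attribute used as
    the feature key; here it is just a function applied to each token.
    """
    keys = [attr(tok) for tok in tokens]
    counts = Counter()
    for n in range(1, ngram_size + 1):
        # Slide a window of width n across the sequence of keys.
        for i in range(len(keys) - n + 1):
            counts[" ".join(keys[i:i + n])] += 1
    return counts


tokens = ["The", "cat", "sat"]
unigrams = extract_ngrams(tokens, ngram_size=1)
up_to_bigrams = extract_ngrams(tokens, ngram_size=2, attr=str.lower)
```

With ngram_size=1 only the individual tokens are counted; with ngram_size=2 and a lowercasing attr, the features also include "the cat" and "cat sat".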

If you're working from Prodigy, you'll want to do this inside your recipe script. You can also modify the cfg dictionary of the component after it's created, like this:

textcat = nlp.get_pipe("textcat")
textcat.cfg["ngram_size"] = 3
textcat.cfg["architecture"] = "bow"

You'll need to make these modifications before calling nlp.begin_training, because the model instance is created from the config during begin_training. Changes made to cfg after that point won't affect the model that has already been built.
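The ordering issue can be shown with a toy class. This is a hypothetical stand-in, not spaCy code: ToyComponent and its dictionary "model" are invented for the example, and only mimic the idea that the model is built once from cfg when begin_training is called.

```python
class ToyComponent:
    """Hypothetical stand-in for a pipeline component (not spaCy code)."""

    def __init__(self, **cfg):
        self.cfg = dict(cfg)
        self.model = None  # nothing is built until begin_training

    def begin_training(self):
        # The model is created once, from whatever cfg holds right now.
        self.model = {"ngram_size": self.cfg.get("ngram_size", 1)}
        return self.model


# Setting the value before begin_training: the model picks it up.
early = ToyComponent()
early.cfg["ngram_size"] = 3
early.begin_training()

# Setting the value afterwards: the model was already built with the default.
late = ToyComponent()
late.begin_training()
late.cfg["ngram_size"] = 3
```

Here early.model ends up with ngram_size 3, while late.model keeps the default of 1 even though its cfg was updated.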

If you need further help with spaCy usage, you can also post on the spaCy issue tracker.
