Same task presented for every pattern match

nix411 · November 29, 2019, 10:41pm

Running textcat.teach with patterns, I am getting the same task presented multiple times (once for every pattern match). As far as I know, textcat.teach labels the document and not the tokens, so is this a bug?

ines · November 30, 2019, 9:22am

This is currently an expected limitation, because the matcher will just yield out every match and doesn't do any filtering or make assumptions about what the matches "mean". We do want to add an option to change the default for text classification, it just requires some rewrites of how the pattern matcher works internally.

In the meantime, see this thread for details and how to add your own filter function:

Topic	Tags	Replies	Views	Activity
textcat.teach presents same annotation task if text snippet contains multiple patterns	enhancement, usage, textcat, solved	11	1668	June 3, 2019
Seeds for text classification appearing multiple times	usage, textcat	1	667	June 27, 2019
What is the reasoning behind duplicate labeling per pattern?	textcat	4	898	May 25, 2019
textcat.teach repeatedly annotating the same text, not annotating entire text at once	usage, textcat	1	625	November 22, 2019
Same text appearing twice (with matches and without)	textcat	5	464	December 13, 2022
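
For readers who just want the general shape of such a filter function, here is a minimal sketch in plain Python. It is not Prodigy's built-in API and not necessarily what the linked thread recommends. It assumes each task in the stream is a dict with a "text" key, and the name filter_repeated_texts is made up for illustration. Where you hook it in depends on your own custom recipe. If your tasks carry an input hash, that may be a better deduplication key than the raw text.

```python
from typing import Dict, Iterable, Iterator


def filter_repeated_texts(stream: Iterable[Dict]) -> Iterator[Dict]:
    """Yield each task only the first time its text is seen.

    Hypothetical helper: assumes every task is a dict with a "text" key.
    """
    seen = set()
    for task in stream:
        key = task["text"]
        if key in seen:
            # Same document matched by another pattern: skip it.
            continue
        seen.add(key)
        yield task


# Made-up example data: the first text is matched by two patterns.
tasks = [
    {"text": "I love cats and dogs", "meta": {"pattern": 0}},
    {"text": "I love cats and dogs", "meta": {"pattern": 1}},
    {"text": "Birds are fine too", "meta": {"pattern": 2}},
]
unique = list(filter_repeated_texts(tasks))
```

With this approach only the first match per text survives, so any information about the other matching patterns is dropped. The set of seen texts also grows with the stream, which is usually fine for an annotation session but worth keeping in mind for very large inputs.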
