Seeding text categorization with phrases

ines · February 15, 2018, 9:20am

Just posted an update on this thread with a new version of textcat.teach that uses the PatternMatcher. (It still includes the entity labels, but you can easily filter them out using a function like yours above.)

You could write a little wrapper for your stream that checks the _input_hash, which will be identical for tasks with the same text, and either merges the spans, or removes the duplicates. (This depends on how you want the tasks to look – i.e. if you want all matches to be highlighted, or just the first one.)
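Here's a rough sketch of what such a wrapper could look like, covering both options. It assumes each task is a plain dict that already has an "_input_hash" and, optionally, a "spans" list whose entries have "start", "end" and "label" keys. The function names drop_duplicates and merge_spans are made up for this example and aren't part of Prodigy.

```python
import copy


def drop_duplicates(stream):
    """Yield only the first task seen for each _input_hash."""
    seen = set()
    for eg in stream:
        key = eg["_input_hash"]
        if key in seen:
            continue  # same text was already sent out, skip this match
        seen.add(key)
        yield eg


def merge_spans(stream):
    """Combine the spans of all tasks that share an _input_hash.

    Note: this buffers the whole stream before yielding anything,
    so it only makes sense for finite streams.
    """
    merged = {}  # dicts keep insertion order, so task order is preserved
    for eg in stream:
        key = eg["_input_hash"]
        if key not in merged:
            # Copy so the original task isn't modified in place
            merged[key] = copy.deepcopy(eg)
            merged[key].setdefault("spans", [])
            continue
        existing = merged[key]["spans"]
        known = {(s["start"], s["end"], s.get("label")) for s in existing}
        for span in eg.get("spans", []):
            ident = (span["start"], span["end"], span.get("label"))
            if ident not in known:  # don't add the exact same span twice
                existing.append(dict(span))
                known.add(ident)
    for eg in merged.values():
        # Keep the spans in document order
        eg["spans"].sort(key=lambda s: (s["start"], s["end"]))
        yield eg
```

In a custom recipe you'd then wrap the stream before returning it, e.g. stream = drop_duplicates(stream). The merged version keeps every other field, including any existing _task_hash, from the first task it saw for that text, which may or may not be what you want.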

Ah yes – sorry if this was confusing. We just ended up using eg because it's short.
