Match recipe: docs and distinction from ner.manual

adamkgoldfarb · March 15, 2020, 3:02am

Great to see the new match recipe!

1. Directly below the introductory recipe text, the terminal command shows prodigy mark rather than prodigy match. Super minor, but I wanted to raise it in case it throws some folks.
2. It looks like ner.correct doesn't have a --patterns argument. Was this reference in the match docs intended to refer to ner.manual?
3. Is the only major difference between ner.manual with match patterns and match that ner.manual requires a model? Just making sure I wrap my head around the different functions!

Thanks,

Adam

ines · March 15, 2020, 10:30am

Thanks for the heads-up – 1 and 2 were both typos / copy-paste mistakes and I also noticed that the docs didn't list the spacy_model argument. So sorry if this made things confusing! Already fixed this and should be live in a second.

The main difference between the new match and ner.manual with --patterns is that match will only show you the matches, with different options for how to present them, and lets you accept or reject each one. If you use ner.manual with --patterns, you're still going through every single example, and if a pattern matches, the match is pre-highlighted.
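
To make the difference concrete, here's a rough sketch in plain Python. It is not Prodigy's actual implementation: the function names (find_spans, match_only, prehighlight_all), the simple phrase-based patterns and the example texts are all made up for illustration, and real patterns are token-based rather than regular expressions.

```python
import re

def find_spans(text, patterns):
    """Return character-offset spans for every pattern hit in the text."""
    spans = []
    for label, phrase in patterns:
        # Illustrative only: case-insensitive whole-word phrase matching
        regex = r"\b" + re.escape(phrase) + r"\b"
        for m in re.finditer(regex, text, flags=re.IGNORECASE):
            spans.append({"start": m.start(), "end": m.end(), "label": label})
    return sorted(spans, key=lambda s: s["start"])

def match_only(stream, patterns):
    """match-style: only yield examples that contain at least one hit."""
    for eg in stream:
        spans = find_spans(eg["text"], patterns)
        if spans:
            yield {**eg, "spans": spans}

def prehighlight_all(stream, patterns):
    """ner.manual-with-patterns style: yield every example, hits pre-highlighted."""
    for eg in stream:
        yield {**eg, "spans": find_spans(eg["text"], patterns)}

# Hypothetical data, just for the sketch
stream = [
    {"text": "I bought an apple today."},
    {"text": "The weather is nice."},
    {"text": "Apple pie and banana bread."},
]
patterns = [("FRUIT", "apple"), ("FRUIT", "banana")]

only_matches = list(match_only(stream, patterns))
everything = list(prehighlight_all(stream, patterns))

print(len(only_matches), "examples shown when only matches are streamed")
print(len(everything), "examples shown when every example is streamed")
```

In the first case the annotator only ever sees texts with a hit and decides whether each one is right. In the second case the text without a hit still comes through, just with nothing highlighted.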

However, if what you want to do is find examples via matches, that type of workflow isn't a good fit. This was kind of a gap in the API that I noticed when working on a small project. For example, you might be working on a text classification project with very imbalanced categories that make it difficult to get over the "cold start problem". So you could start by using match with a few patterns to quickly find enough positive examples for your category, then pretrain a model on that dataset and improve it further, e.g. using textcat.teach. (Early versions of the matching logic in textcat.teach tried to do this all in one by preprocessing the stream and starting with only matches – but this wasn't very transparent and a bit too "magical". So match lets you do this more explicitly in a separate step.)
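
As a small illustration of the "few patterns" starting point, here's a sketch that writes a patterns file from Python. It assumes the usual JSONL layout for pattern files (one JSON object per line with a "label" and a "pattern", where the pattern is either a list of token attribute dicts or an exact string). The SPORTS label, the patterns themselves and the file name are hypothetical.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical seed patterns for a rare text classification category
patterns = [
    {"label": "SPORTS", "pattern": [{"lower": "football"}]},
    {"label": "SPORTS", "pattern": [{"lower": "world"}, {"lower": "cup"}]},
    {"label": "SPORTS", "pattern": "Premier League"},  # exact string match
]

def write_patterns(path, patterns):
    """Write one JSON object per line (JSONL)."""
    with open(path, "w", encoding="utf-8") as f:
        for entry in patterns:
            f.write(json.dumps(entry) + "\n")

# Written to a temporary directory so the sketch has no side effects
patterns_path = Path(tempfile.mkdtemp()) / "sports_patterns.jsonl"
write_patterns(patterns_path, patterns)
print(patterns_path.read_text(encoding="utf-8"))
```

A file like this is what you'd pass in via --patterns. A handful of entries is usually enough to start surfacing positive examples, and you can add more as you see what the matches look like.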

adamkgoldfarb · March 15, 2020, 2:47pm

Super clear explanation. Thanks, as always! Stay safe.
