Annotation Flowchart: Named Entity Recognition

ines · May 21, 2019, 10:41am

Just shared this on Twitter – if you find flowcharts like this useful, I'm happy to make more.

Updated on October 18, 2022:

New versions of the NER flowcharts! See our Twitter thread for more details.

Simpan · June 13, 2019, 1:55pm

This is really helpful @ines! Thanks a lot. Would love one for textcat as well.

frownbreaker · November 11, 2019, 2:51pm

Really useful. @ines are there any more in this series?

Ali · December 4, 2019, 4:18pm

When I try to get the flowchart pdf from the following link:

I get this error:

"Failed to load PDF document."

ines · December 4, 2019, 8:24pm

@Ali What happens when you right click > "Save as"? The file is quite big and depending on your browser, it might not display in the browser.

kevinruder · December 17, 2019, 1:36pm

Just stopping by to say that this is amazing! Really helpful!

ines · December 17, 2019, 8:38pm

@kevinruder Thank you, that's nice to hear!

ronaldcturner · December 26, 2019, 12:31am

1. Thanks. For us, NER training is tough and is therefore Prodigy's (and spaCy's) central attraction. We want to add our thanks for this flowchart and our strong encouragement for additional ones to support the most critical areas of Prodigy. As both learners and early deployers of NLP-based solutions, we treat the flowchart as the authoritative springboard into the large corpora of Prodigy (and spaCy) documentation.

2. High-level reality check re DOCUMENTATION. We applaud you for the current release of the documentation set, which is much more coherent than before. Are we generally safe to assume that there are no blatant inconsistencies among the various NER-related API and USAGE segments for both Prodigy and spaCy? Improvements such as the "collapsed" train functionality are not (yet) reflected in the (current) flowchart, but such gaps are only minor. We just hope to continue relying on the generous guidance you've provided with formal definitions, explanations, code examples, warnings, links, ... We are repurposing all of these as internal cookbooks to support our development and production.

ines · December 27, 2019, 1:34pm

Thanks!

Yes, the recommendations here are still accurate and fully backwards/forwards-compatible. v1.9 just provides a bunch of workflow improvements – e.g. instead of ner.match, you can now use ner.manual with --patterns, or instead of ner.batch-train, you can now also use train.

(That said, best practices and strategies can always change over time. Especially now that transfer learning works so well for NLP, we may end up recommending slightly different workflows in the future.)
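As a rough sketch of the ner.manual with --patterns workflow mentioned above: the patterns file is JSONL with one match pattern per line. The label, the terms, and the file and dataset names below are made up for illustration, and the helper functions are not part of Prodigy.

```python
import json

def make_patterns(label, terms):
    """Build one case-insensitive token pattern per term."""
    return [
        {"label": label, "pattern": [{"lower": tok} for tok in term.lower().split()]}
        for term in terms
    ]

def to_jsonl(rows):
    """Serialize rows as JSONL: one JSON object per line."""
    return "\n".join(json.dumps(row) for row in rows)

# Hypothetical label and terms, for illustration only.
patterns = make_patterns("FRUIT", ["apple", "blood orange"])
patterns_jsonl = to_jsonl(patterns)
print(patterns_jsonl)

# After saving the output as e.g. fruit_patterns.jsonl, a call could look like:
#   prodigy ner.manual fruit_data en_core_web_sm ./news.jsonl --label FRUIT --patterns ./fruit_patterns.jsonl
```

Matches from the patterns are pre-highlighted in the manual interface, so you correct suggestions instead of labelling everything from scratch.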

kinghuang · January 28, 2020, 4:43pm

I'm just looking at Prodigy to help with generating NER labels. The flowchart is super useful!

How do I learn more about workflow improvements like the ner.match to ner.manual with --patterns noted above? Will there be an updated flowchart with the latest best practices and strategies?

ines · January 28, 2020, 8:17pm

The NER docs include more detailed descriptions and examples of the different workflows: Named Entity Recognition · Prodigy · An annotation tool for AI, Machine Learning & NLP

You can also check out the NER recipe documentation here: Built-in Recipes · Prodigy · An annotation tool for AI, Machine Learning & NLP

For now, the flowchart is fully backwards-compatible with earlier versions of Prodigy, and you can still use all the same recipes and workflows. The new version just introduces more convenient versions of them. If there are more breaking changes and new recommendations, we'll definitely update the flowchart.

nrjanjanam · February 7, 2020, 12:57pm

This is awesome. Visual representations help us make decisions wisely. Thank you so much.
We would love to get more flowcharts if possible.

baxtersapp · June 7, 2020, 9:31pm

Hey, @ines! Any chance we could get an updated version of this flowchart (and others, if you've got the time!) reflecting the shift from sub-recipe training to the top-level train command? I can follow the changes via the documentation of the recipes, but I'd love to be able to walk through this with a coworker who is slightly newer than me to ML and Prodigy.

ines · June 10, 2020, 7:56am

@baxtersapp Ah yes, I've had this on my list for a while! Maybe the v1.10 release will be a good opportunity to also update the flowchart. The main changes would be:

ner.batch-train → train
ner.make-gold → ner.correct
ner.match → ner.manual with patterns / match
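If you have older notes or shell scripts that follow the current flowchart, a small lookup like the one below could flag the outdated recipe names. This is only a sketch: the helper is hypothetical, and it reports names without rewriting the command, since the arguments of the newer recipes differ.

```python
import re

# Old recipe name -> replacement, as listed above.
RENAMED_RECIPES = {
    "ner.batch-train": "train",
    "ner.make-gold": "ner.correct",
    "ner.match": "ner.manual with --patterns (or match)",
}

def find_old_recipes(command):
    """Return (old, new) pairs for every outdated recipe name in a command string."""
    hits = []
    for old, new in RENAMED_RECIPES.items():
        # Only match the full recipe name, not a longer name that contains it.
        pattern = r"(?<![\w.-])" + re.escape(old) + r"(?![\w.-])"
        if re.search(pattern, command):
            hits.append((old, new))
    return hits

# Hypothetical dataset and model names.
for old, new in find_old_recipes("prodigy ner.batch-train my_dataset en_core_web_sm"):
    print(f"{old} -> {new}")
```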

Rmsharks4 · December 4, 2020, 8:20pm

Hi @ines - this is really helpful! I was wondering - do you have a workflow designed for doing Multi-Intent Classification and Slot Tagging in Dialog Conversations simultaneously with Prodigy with active learning? Or anything similar to this? Thanks a lot!

ines · December 8, 2020, 11:08pm

You can definitely put a workflow together for something like this in Prodigy – it just depends on the specifics of the tasks and the structured data you want to collect for it. For intent classification, a simple choice UI should work well, and you can then use span highlighting to annotate the slots and combine the two interfaces? The active learning depends on your model, so you'd have to implement the update callback to update your model in the loop. Here are some relevant docs links:

Custom interfaces with blocks: Custom Interfaces · Prodigy · An annotation tool for AI, Machine Learning & NLP
Custom recipes: Custom Recipes · Prodigy · An annotation tool for AI, Machine Learning & NLP
Custom active learning for text classification: Text Classification · Prodigy · An annotation tool for AI, Machine Learning & NLP
Custom active learning for NER: Named Entity Recognition · Prodigy · An annotation tool for AI, Machine Learning & NLP
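To make the combined interface a bit more concrete, here is a minimal sketch of the components a custom recipe could return for it. The intent names, slot labels, and helper functions are placeholders, and the whitespace tokenizer only stands in for a real one (in an actual recipe you would probably add tokens with a spaCy tokenizer). The block is kept free of Prodigy imports so it runs on its own; in a real recipe, the dictionary would be returned from a function decorated with @prodigy.recipe.

```python
INTENTS = ["BOOK_FLIGHT", "CANCEL_BOOKING"]  # hypothetical intent labels
SLOT_LABELS = ["CITY", "DATE"]               # hypothetical slot labels

def whitespace_tokens(text):
    """Split on whitespace and record character offsets for each token."""
    tokens = []
    offset = 0
    for i, word in enumerate(text.split()):
        start = text.index(word, offset)
        end = start + len(word)
        tokens.append({"text": word, "start": start, "end": end, "id": i})
        offset = end
    return tokens

def make_task(text):
    """One annotation task: text and tokens for the spans, options for the intents."""
    return {
        "text": text,
        "tokens": whitespace_tokens(text),
        "options": [{"id": intent, "text": intent} for intent in INTENTS],
    }

def recipe_components(dataset, texts):
    """The dictionary a custom recipe function would return."""
    return {
        "dataset": dataset,
        "stream": (make_task(text) for text in texts),
        "view_id": "blocks",
        "config": {
            "labels": SLOT_LABELS,        # labels for the span highlighting block
            "choice_style": "multiple",   # allow more than one intent per example
            "blocks": [
                {"view_id": "ner_manual"},
                {"view_id": "choice", "text": None},  # don't render the text twice
            ],
        },
    }

components = recipe_components("dialog_data", ["Fly to Berlin on Friday"])
first_task = next(components["stream"])
print(first_task["tokens"][2])
```

For the active learning part, the returned dictionary could also include an "update" entry: a function that receives the batch of annotated answers and updates your model with them.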

marino · May 5, 2021, 1:23pm

Excellent how-to!
I'm just wondering why the flowchart suggests "Train new model from scratch" if the number of new entities is > 3?
