Customizing Prodigy for NER and relationship extraction

ines · December 20, 2017, 12:52pm

Prodigy does come with an ner.mark recipe that uses the boundaries interface, which lets you highlight spans of text. You can see an example of this in the recipes overview. However, since marking entities manually is often unnecessarily tedious, you should only have to use this for edge cases or if your goal is to create gold-standard annotations.

To get over the “cold start problem” when training a new entity label, Prodigy lets you pass in a list of match patterns describing examples of the entities you’re looking for. Match patterns can include all properties available for spaCy’s rule-based matcher, so you can match single- or multi-token phrases or use other linguistic annotations like part-of-speech tags. You can also use the terms.teach and terms.to-patterns recipes to create a terminology list from a number of seed terms using word vectors, and convert the list to match patterns.
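
To make the pattern format a bit more concrete, here's a rough sketch of turning a few seed terms into match patterns and writing them out as JSONL, one pattern per line. The helper names, the DRUG label and the seed terms are made up for illustration, and this is not how terms.to-patterns is implemented internally. It also splits on whitespace, whereas spaCy's tokenizer may split some terms differently.

```python
import json


def terms_to_patterns(terms, label):
    """Build one token-based match pattern per term (hypothetical helper)."""
    patterns = []
    for term in terms:
        # One dict per token; "lower" matches on the lowercase form of the token.
        # Naive whitespace split: an assumption, spaCy may tokenize differently.
        tokens = [{"lower": word.lower()} for word in term.split()]
        patterns.append({"label": label, "pattern": tokens})
    return patterns


def to_jsonl(patterns):
    """Serialize patterns as newline-delimited JSON, one pattern per line."""
    return "\n".join(json.dumps(pattern) for pattern in patterns)


# Illustrative seed terms and label, not taken from a real dataset
seed_terms = ["heroin", "crystal meth", "Xanax"]
patterns = terms_to_patterns(seed_terms, "DRUG")

# Patterns can also use other token attributes, e.g. a part-of-speech tag
patterns.append({"label": "DRUG", "pattern": [{"lower": "synthetic"}, {"pos": "NOUN"}]})

print(to_jsonl(patterns))
```

You'd then save the output to a file (for example patterns.jsonl) and pass it in as the patterns when you start annotating.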

When you start training, Prodigy uses the patterns to start suggesting entities and will collect the first set of examples to update the model in the loop. As the model improves, it will also start suggesting entities based on what it’s learned so far from the pattern matches.

We actually just recorded another video tutorial that shows an end-to-end example of training a new entity type from scratch, starting off with only 3 seed terms.

You can find more details in the docs and this thread. I’ve also posted a quick TL;DR version of the workflow in this comment.
