Skip Functionality

Prodigy intentionally keeps interaction with the database limited, as explained here, because it can easily get messy: if users are able to change annotations, you probably also need a way to track who made what change and when.

So instead, here's how I've dealt with this in the past. I make two datasets, say ner_v1 and ner_v2. When I start annotating, everything goes into ner_v1. I'm fully aware that this v1 data will be a first draft: many annotations will be correct, but some might need to change later, once I understand the problem better.

Then, once a few examples have been flagged, or once some bad labels have been detected, I re-label the relevant candidates and save those corrected annotations to ner_v2.
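
For illustration, here's roughly what collecting those candidates can look like. This is a minimal sketch, assuming the v1 data has been exported with `prodigy db-out ner_v1 > ner_v1.jsonl` and that the flagged examples carry Prodigy's `flagged` field; the file names are just placeholders, not my actual script.

```python
# Sketch: collect flagged v1 examples into a re-annotation queue.
# Assumes ner_v1 was exported with `prodigy db-out ner_v1 > ner_v1.jsonl`.
import srsly

examples = srsly.read_jsonl("ner_v1.jsonl")

# Keep only the examples that were flagged during annotation.
to_relabel = [eg for eg in examples if eg.get("flagged")]

# Write them out so they can be re-annotated into ner_v2, e.g. with
# `prodigy ner.manual ner_v2 en_core_web_sm relabel_queue.jsonl --label ...`
srsly.write_jsonl("relabel_queue.jsonl", to_relabel)
```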

When it's time to train a model, I have a custom script that collects the examples from ner_v1 and ner_v2. If an example appears in both datasets, I always prefer the annotation from ner_v2. This gives me a final dataset that can be used to train a model.
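
The merge script itself can be very small. Here's a minimal sketch of the idea, assuming both datasets have been exported to JSONL with `prodigy db-out` and that the examples carry the usual `_task_hash` key to identify the same task; it's not the exact script I use, just the shape of it.

```python
# Sketch: merge ner_v1 and ner_v2, preferring v2 when both annotated the same task.
# Assumes exports via `prodigy db-out ner_v1 > ner_v1.jsonl` (and likewise for v2),
# and that each example carries Prodigy's `_task_hash`.
import srsly

v1 = list(srsly.read_jsonl("ner_v1.jsonl"))
v2 = list(srsly.read_jsonl("ner_v2.jsonl"))

# Index both sets by task hash; inserting v2 second means it overwrites v1
# for any task that appears in both datasets.
merged = {}
for eg in v1 + v2:
    merged[eg["_task_hash"]] = eg

final = list(merged.values())
srsly.write_jsonl("ner_final.jsonl", final)
print(f"{len(final)} examples in the merged dataset")
```

From there, the merged file can be loaded back into a fresh dataset with `prodigy db-in` or fed into whatever training setup you use.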

Other people might have another way to handle their data, but for my projects, this approach has worked quite well.
