Correction of annotation in UI

Sometimes I need to make a small correction to the boundaries of an annotated NER span before I can approve the annotation.
What’s the best way to do it now?

Will I be able to correct annotations in upcoming versions?

Hey! This is definitely something we’ve been thinking about a lot. One of the things that makes Prodigy so powerful is the binary interface, and being able to move through examples quickly. This is also the reason we’ve decided against a click-and-drag-style interface for annotating entities manually.

So for the stable v1.0 release, we’ve been working on a new recipe and interface that combines Prodigy’s philosophy (don’t waste the annotator’s time and attention) with a more interactive and manual workflow for creating boundary annotations. (We hope that this will also be a good answer to similar issues and questions, like the ones described in this or this thread).

Here’s the concept: Just like ner.teach, the new ner.mark recipe will let you stream in examples from a text source. Prodigy will step through the texts sentence by sentence (or sentence slice by sentence slice, depending on the length), and show each token as a selectable element within the annotation card.

You can then click on the start and end token (or use the number keys on your keyboard) to select the span. At the moment, the recipe only supports annotating one label at a time – but it also means that ideally, it’ll take you only three clicks or key presses to create an entity annotation.

When you exit Prodigy, the collected data will be used to export ready-to-use entity annotations to your dataset (e.g. for training a model with ner.batch-train).
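To make the start/end token idea a bit more concrete, here is a minimal sketch of how two selected token indices could be turned into a character-offset span. This is only an illustration, not how ner.mark is implemented: the whitespace tokenizer, the helper names `tokenize` and `span_from_tokens`, the span dictionary layout and the example sentence are all assumptions made up for this example.

```python
def tokenize(text):
    """Split on whitespace and record character offsets for each token.

    Hypothetical stand-in for a real tokenizer, used only for illustration.
    """
    tokens = []
    pos = 0
    for word in text.split():
        start = text.index(word, pos)  # search from the end of the previous token
        end = start + len(word)
        tokens.append({"text": word, "start": start, "end": end})
        pos = end
    return tokens


def span_from_tokens(tokens, first, last, label):
    """Build a character-offset span from the first and last selected token (inclusive)."""
    if not 0 <= first <= last < len(tokens):
        raise ValueError("token selection out of range")
    return {
        "start": tokens[first]["start"],
        "end": tokens[last]["end"],
        "label": label,
    }


text = "Apple opened an office in New York last year"
tokens = tokenize(text)

# Select token 5 ("New") as the start and token 6 ("York") as the end.
span = span_from_tokens(tokens, 5, 6, "GPE")
print(text[span["start"]:span["end"]])  # prints "New York"
```

The point of the sketch is that one label plus two token selections is enough to recover the exact character boundaries, which is why a single entity can be annotated in about three clicks or key presses.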

The recipe uses a new interface, boundaries, which might be useful for other tasks as well – essentially, anything that requires selecting spans of text within a document, and assigning optional labels. So instead of token spans, you could also annotate phrase spans or sentence spans, e.g. to correct a sentence boundary detection system.

We’re currently in the process of testing the new interface and recipe with more data – but it’ll definitely be included in the stable v1.0 release :blush:


+1 – I’m very interested in this as well. Great to hear that it’ll be included in the 1.0 release!

Maybe I’m missing something obvious, but I looked around and couldn’t find an ETA. Is there one right now?

Yes, we actually launched Prodigy v1.0.0 last week and the latest version is now v1.1.0 :tada: If you signed up for the beta, you should have also received a discount code via email (if not, maybe check your promotions or spam folder). For more details on the available licensing options, see this page.

Btw, another update on the thread topic: We’re also working on additional annotation interfaces for “unguided annotation” from scratch. The NER workflow and “cold start” already work pretty well without manual annotation – but there might still be cases where users need to label data in a more static way. This is especially relevant for image annotation, e.g. for object detection or image segmentation tasks. I also want to experiment with options for audio annotation – not sure what the best strategy is, but it could be really cool to have a built-in interface included with Prodigy.

That’s awesome to hear! I wasn’t in the beta, but I’ve gone ahead and purchased a license and I confirmed I do have v1.1.0.

Not sure how I missed that, but ner.mark is exactly what I was looking for. Thanks!

Edit: So just to confirm, this is how my workflow would look?

  1. Run ner.mark and annotate a couple hundred examples of my entity.
  2. Run ner.batch-train on these annotations.
  3. Now run ner.teach to further improve the model.

This depends on what you’re trying to do – in some cases, it can make more sense to only annotate edge cases with ner.mark. So if you’re looking to improve the model, you could run ner.teach, reject all entities with wrong boundaries, extract the rejected examples from the dataset using the db-out command with --answer reject and re-annotate them with ner.mark.
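If you already have a full export rather than one filtered with --answer reject, the same selection can be done in a few lines of Python. This is a rough sketch under some assumptions: that the export is newline-delimited JSON and that each record stores the decision in an "answer" field. The function name `rejected_examples` and the sample records are made up for illustration.

```python
import json


def rejected_examples(jsonl_lines):
    """Yield parsed records whose "answer" field is "reject".

    Assumes one JSON object per line; blank lines are skipped.
    """
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if record.get("answer") == "reject":
            yield record


# Made-up sample records standing in for an exported dataset.
sample = [
    '{"text": "Apple opened an office", "answer": "accept"}',
    '{"text": "in New York last year", "answer": "reject"}',
    "",
    '{"text": "no answer recorded yet"}',
]

to_redo = list(rejected_examples(sample))
print([r["text"] for r in to_redo])  # prints ['in New York last year']
```

With a real export you would pass an open file handle instead of the `sample` list, write the filtered records back out one per line, and use that file as the source for re-annotating with ner.mark.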