Correction of annotation in UI

Hey! This is definitely something we’ve been thinking about a lot. One of the things that makes Prodigy so powerful is the binary interface, and being able to move through examples quickly. This is also the reason we’ve decided against a click-and-drag-style interface for annotating entities manually.

So for the stable v1.0 release, we’ve been working on a new recipe and interface that combines Prodigy’s philosophy (don’t waste the annotator’s time and attention) with a more interactive and manual workflow for creating boundary annotations. (We hope that this will also be a good answer to similar issues and questions, like the ones described in this or this thread.)

Here’s the concept: Just like ner.teach, the new ner.mark recipe will let you stream in examples from a text source. Prodigy will step through the texts sentence by sentence (or sentence slice by sentence slice, depending on the length), and show each token as a selectable element on the annotation card.

You can then click on the start and end token (or use the number keys on your keyboard) to select the span. At the moment, the recipe only supports annotating one label at a time – but it also means that ideally, it’ll take you only three clicks or key presses to create an entity annotation.

When you exit Prodigy, the collected data will be used to export ready-to-use entity annotations to your dataset (e.g. for training a model with ner.batch-train).
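
To make the idea more concrete, here is a rough sketch of how a start and end token selection could be turned into an entity annotation with character offsets. This is not the actual ner.mark implementation: the tokenize and span_from_tokens helpers, the whitespace tokenizer and the exact dictionary keys are all assumptions made for illustration.

```python
def tokenize(text):
    """Whitespace tokenizer (a stand-in for a real one) that keeps character offsets."""
    tokens = []
    pos = 0
    for word in text.split():
        start = text.index(word, pos)
        end = start + len(word)
        tokens.append({"id": len(tokens), "text": word, "start": start, "end": end})
        pos = end
    return tokens


def span_from_tokens(tokens, first, last, label):
    """Build one labelled span from the first and last selected token (inclusive)."""
    if not 0 <= first <= last < len(tokens):
        raise ValueError("token selection out of range")
    return {
        "start": tokens[first]["start"],  # character offset where the span begins
        "end": tokens[last]["end"],       # character offset just past the span
        "label": label,
    }


text = "Apple is opening an office in San Francisco"
tokens = tokenize(text)

# Hypothetical selection: the annotator picks token 6 ("San") and token 7 ("Francisco").
span = span_from_tokens(tokens, 6, 7, "GPE")
example = {"text": text, "spans": [span]}

print(text[span["start"]:span["end"]])  # prints "San Francisco"
```

Since only one label is annotated at a time, the label can be fixed up front, so each annotation only needs the two token indices.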

The recipe uses a new interface, boundaries, which might be useful for other tasks as well – essentially, anything that requires selecting spans of text within a document, and assigning optional labels. So instead of token spans, you could also annotate phrase spans or sentence spans, e.g. to correct a sentence boundary detection system.

We’re currently in the process of testing the new interface and recipe with more data – but it’ll definitely be included in the stable v1.0 release :blush:
