Custom and dynamic label sets within the same session

Hello. I am trying to use Prodigy to prepare a dataset for object detection. I have created a custom recipe where a model assists in annotating each image. I need help with the following tasks. Any help will be appreciated!

  1. There are over 50 classes that the model can detect. Instead of loading all 50 classes as the label set at session startup, is there a way to use only the classes detected by the model as the label set? For example, if the model detects a dog and a cat in an image, the label set for that image would be dog and cat; if it detects a person and a car in the next image, the label set for that image would be person and car.
  2. I also want to let the annotator add a custom label. For example, if an image contains a cat but 'cat' is not in the label set, the annotator could add the custom label 'cat' and use it to annotate.
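For what it's worth, here's a rough sketch of how the first idea could look inside a custom recipe's stream generator: the model's detections are attached as pre-drawn spans, and the per-image label set is derived from them. `detect_objects` is a hypothetical stand-in for your model's inference call, and whether the UI actually respects a per-task `"labels"` key depends on the interface and version, so treat this as an illustration of the data flow rather than a supported feature:

```python
# Sketch: derive a per-image label set from model predictions in the
# stream generator of a custom recipe. `detect_objects` is a hypothetical
# stand-in for your model's inference call.

def detect_objects(image_path):
    # Placeholder: your model would return (label, x, y, width, height) tuples.
    demo = {
        "img_001.jpg": [("dog", 10, 20, 80, 60), ("cat", 120, 40, 50, 50)],
        "img_002.jpg": [("person", 5, 5, 40, 100), ("car", 60, 30, 90, 45)],
    }
    return demo.get(image_path, [])

def make_stream(image_paths):
    for path in image_paths:
        detections = detect_objects(path)
        spans = [
            {"label": label, "x": x, "y": y, "width": w, "height": h}
            for label, x, y, w, h in detections
        ]
        # The per-image label set is just the unique labels the model found.
        labels = sorted({span["label"] for span in spans})
        yield {"image": path, "spans": spans, "labels": labels}

tasks = list(make_stream(["img_001.jpg", "img_002.jpg"]))
```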

Hi! Prodigy is mostly designed for machine learning use cases where the label scheme is critical – you typically wouldn't want the annotator to decide which labels to train, because the label scheme is tied to the machine learning model and is one of the most important decisions for getting good results. Annotators' perceptions can also change over time as they annotate, so they might start with "animal" and later switch to "cat", which gives you inconsistent results. So we always recommend developing the categories upfront, and that's also what Prodigy is designed to expect and encourage.

If you have 50+ classes, showing them all at once puts a lot of cognitive load on the annotator, because they constantly have to keep every option in mind (and have a limited "cache" :sweat_smile:). A more efficient approach is to break up the task and focus on one label, or a small subset of labels, at a time. While it might sound like more work to make several passes over the data, we've actually seen over 10x overall speedups with this workflow – see this case study and example for an NLP use case!
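The per-label passes could be driven by a small filter over your stream. This is a minimal sketch, assuming each task carries model-predicted `"spans"` as in the question; the helper name is made up:

```python
# Sketch: make several passes over the data, one label (or small subset)
# at a time, instead of exposing all 50 classes at once.

def filter_stream_by_label(tasks, target_labels):
    # Keep only tasks where the model predicted at least one target label,
    # and narrow each task's spans down to those labels.
    target = set(target_labels)
    for task in tasks:
        spans = [s for s in task.get("spans", []) if s["label"] in target]
        if spans:
            yield {**task, "spans": spans}

tasks = [
    {"image": "a.jpg", "spans": [{"label": "dog"}, {"label": "car"}]},
    {"image": "b.jpg", "spans": [{"label": "person"}]},
]
dog_pass = list(filter_stream_by_label(tasks, ["dog"]))
```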

Another approach would be to focus on the objects and boundaries first, since this is usually the most tedious task. So instead of creating bounding boxes with labels, the annotator would first only create bounding boxes for all objects using a single label. In the next step, you could then stream in the data again with multiple choice options for a subset of labels, or even just a single label. So the task becomes something like "Is this a cat?", which is a straightforward decision the annotator can make in a second, without switching between clicking, dragging and selecting labels. This also gives you more individual data points and lets you intervene early if there are problems or misunderstandings.
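To make the two-step idea concrete, here's a rough sketch of turning the output of the first "boxes only" pass into one labelling question per box for a second pass. The `"options"` format follows Prodigy's choice interface; the label list and helper name here are illustrative, not part of any built-in recipe:

```python
# Sketch: turn annotated bounding boxes from a first "boxes only" pass
# into one choice question per box for a second pass.

def boxes_to_choice_tasks(annotated, candidate_labels):
    # "options" is the format used by the choice interface.
    options = [{"id": label, "text": label} for label in candidate_labels]
    for task in annotated:
        for i, span in enumerate(task.get("spans", [])):
            # One task per box: the annotator only decides the label,
            # without any clicking or dragging.
            yield {
                "image": task["image"],
                "spans": [span],
                "options": options,
                "meta": {"box": i},
            }

annotated = [{"image": "a.jpg", "spans": [{"x": 0, "y": 0, "width": 10, "height": 10}]}]
choice_tasks = list(boxes_to_choice_tasks(annotated, ["cat", "dog"]))
```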

I hope this helps!