Pytorch based bounding box annotations

ines · January 12, 2021, 6:08am

Hi and sorry you were having trouble with TensorFlow! If you already have a PyTorch model, it's probably better to just use that. (Also, the modelling is typically the "hard part", so if you have a model that's trained and that predicts something, it should hopefully be no problem to integrate it with Prodigy.) For your annotation workflow, you probably want to use a custom recipe with the image_manual interface.

Here's a basic template you can use to get started:

https://github.com/explosion/prodigy-recipes/blob/master/image/image_manual.py

import prodigy
from prodigy.components.loaders import Images
from prodigy.util import split_string
from typing import List, Optional


# Recipe decorator with argument annotations: (description, argument type,
# shortcut, type / converter function called on value before it's passed to
# the function). Descriptions are also shown when typing --help.
@prodigy.recipe(
    "image.manual",
    dataset=("The dataset to use", "positional", None, str),
    source=("Path to a directory of images", "positional", None, str),
    label=("One or more comma-separated labels", "option", "l", split_string),
    exclude=("Names of datasets to exclude", "option", "e", split_string),
    darken=("Darken image to make boxes stand out more", "flag", "D", bool),
)
def image_manual(
    dataset: str,
    source: str,

This file has been truncated.

Fundamentally, there are two things you need:

The stream of examples to annotate, typically a Python generator. Prodigy's JSON format is pretty straightforward (see here) and if you can stream in examples in this format, you can render them with the image_manual UI. So you'd just need to write a function that yields dictionaries in this format, with the bounding box coordinates predicted by your model, either as x/y/width/height or a list of points representing the pixel coordinates. That's likely what your model already outputs. Here's a pseudocode example showing a stream: https://prodi.gy/docs/computer-vision#custom-model (the Images loader converts the image data to base64, so you can easily get the byte representation to feed to your model)
An optional update callback that updates your model in the loop. Whenever a batch of answers is available, they're sent back to the server and your update callback is called. It receives data in the same JSON format as the input. Because the stream is a generator, only a batch is processed at a time, and the updated model will be used to process future batches.
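To make the two pieces above more concrete, here's a minimal sketch in plain Python. It's not taken from the recipe template: box_to_span, add_predictions, make_update and dummy_predict are hypothetical names, and it assumes your model can be wrapped in a predict function that returns (label, x, y, width, height) tuples in pixel coordinates. The dummy model only stands in for your PyTorch model.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

# Assumed model output: (label, x, y, width, height) in pixel coordinates.
Box = Tuple[str, float, float, float, float]


def box_to_span(label: str, x: float, y: float, width: float, height: float) -> Dict:
    """Convert an x/y/width/height box into a span with four corner points."""
    return {
        "label": label,
        "points": [[x, y], [x, y + height], [x + width, y + height], [x + width, y]],
    }


def add_predictions(
    stream: Iterable[Dict], predict: Callable[[str], List[Box]]
) -> Iterator[Dict]:
    """Generator that attaches the model's predicted boxes to each example."""
    for eg in stream:
        # eg["image"] holds the image data provided by the loader
        boxes = predict(eg["image"])
        eg["spans"] = [box_to_span(*box) for box in boxes]
        yield eg


def make_update(train_step: Callable[[List[Dict]], None]) -> Callable[[List[Dict]], None]:
    """Build an update callback that passes accepted answers to train_step."""

    def update(answers: List[Dict]) -> None:
        accepted = [eg for eg in answers if eg.get("answer") == "accept"]
        if accepted:
            train_step(accepted)

    return update


def dummy_predict(image: str) -> List[Box]:
    """Hypothetical stand-in for a PyTorch model: always predicts one box."""
    return [("OBJECT", 10, 20, 30, 40)]
```

In a custom recipe, you'd typically wrap the loader's stream with something like add_predictions and return it together with the update callback and "view_id": "image_manual" in the dictionary of components. How the accepted spans are converted back into training targets depends entirely on your model.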

There are a few things that are pretty model/data-specific and that you might have to experiment with to find the best and most efficient workflow:

The right batch size. This depends on your model, and you can configure it by setting Prodigy's batch_size config setting (the number of examples fetched from the stream). Streams are generators and Prodigy will only ever ask for the next batch.
The best updating strategy. Ideally, your model should be sensitive enough to small updates so you can see the impact of your annotations quickly, but also not too sensitive so it doesn't overfit. That's a slightly unusual requirement and not something you'd typically optimise your implementation for. It's also possible that the updating ends up being a bit slow, especially if you need to make several passes over the data. You can also consider not updating in the loop at all and just retraining your model and restarting the annotation after every X examples.
Make sure PyTorch isn't launching multiple threads under the hood. Otherwise, your generator may end up getting stuck or running out prematurely. (If you're having problems with this, move your image processing out into a separate process and out of the main thread, write to stdout and pipe the data forward to Prodigy.)
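As one possible middle ground between updating on every batch and retraining offline, here's a rough sketch of an update callback that buffers accepted answers and only calls the training function once enough have accumulated. This is an assumption about what might work for your setup, not a built-in Prodigy feature: make_buffered_update, retrain and the threshold are hypothetical, and the batch size of 10 is just a placeholder value. For the threading issue, calling torch.set_num_threads(1) before loading the model is one thing you could try.

```python
from typing import Callable, Dict, List


def make_buffered_update(
    retrain: Callable[[List[Dict]], None], every: int = 100
) -> Callable[[List[Dict]], None]:
    """Build an update callback that retrains once `every` accepted answers are buffered."""
    buffer: List[Dict] = []

    def update(answers: List[Dict]) -> None:
        # Only keep the examples the annotator accepted
        buffer.extend(eg for eg in answers if eg.get("answer") == "accept")
        if len(buffer) >= every:
            # Pass a copy so clearing the buffer doesn't affect the trainer
            retrain(list(buffer))
            buffer.clear()

    return update


# Placeholder recipe config: number of examples fetched from the stream at a time
config = {"batch_size": 10}
```

A larger threshold means fewer, slower updates and a model that lags further behind your annotations, so it's worth trying a few values against your own data.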
