Pytorch based bounding box annotations

Daniel-R-Armstrong · January 10, 2021, 10:17pm

I am brand new to Prodigy. I have been trying to use a model-in-the-loop approach to create bounding boxes, but I was not able to get the TensorFlow variant to work. I tried the fork mentioned in the tutorial and also attempted to upgrade the code to TensorFlow 2, but I haven't been successful.

Since I use fastai and PyTorch, I was wondering if anyone has a PyTorch-based alternative, as I already have a PyTorch model trained on my data.

ines · January 12, 2021, 6:08am

Hi, and sorry you were having trouble with TensorFlow! If you already have a PyTorch model, it's probably better to just use that. (Also, the modelling is typically the "hard part", so if you have a model that's trained and predicts something, it should hopefully be no problem to integrate it with Prodigy.) For your annotation workflow, you probably want to use a custom recipe with the image_manual interface.

Here's a basic template you can use to get started:

https://github.com/explosion/prodigy-recipes/blob/master/image/image_manual.py

import prodigy
from prodigy.components.loaders import Images
from prodigy.util import split_string
from typing import List, Optional


# Recipe decorator with argument annotations: (description, argument type,
# shortcut, type / converter function called on value before it's passed to
# the function). Descriptions are also shown when typing --help.
@prodigy.recipe(
    "image.manual",
    dataset=("The dataset to use", "positional", None, str),
    source=("Path to a directory of images", "positional", None, str),
    label=("One or more comma-separated labels", "option", "l", split_string),
    exclude=("Names of datasets to exclude", "option", "e", split_string),
    darken=("Darken image to make boxes stand out more", "flag", "D", bool),
)
def image_manual(
    dataset: str,
    source: str,

This file has been truncated.

Fundamentally, there are two things you need:

1. The stream of examples to annotate, typically a Python generator. Prodigy's JSON format is pretty straightforward (see here), and if you can stream in examples in this format, you can render them with the image_manual UI. So you'd just need to write a function that yields dictionaries in this format, with the bounding box coordinates predicted by your model, either as x/y/width/height or as a list of points representing the pixel coordinates. That's likely what your model already outputs. Here's a pseudocode example showing a stream: https://prodi.gy/docs/computer-vision#custom-model (the Images loader converts the image data to base64, so you can easily get the byte representation to feed to your model).
2. An optional update callback that updates your model in the loop. Whenever a batch of answers is available, it's sent back to the server and your update callback is called. It receives data in the same JSON format as the input. Because the stream is a generator, only one batch is processed at a time, and the updated model will be used to process future batches.
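
As a rough sketch of those two pieces, the stream wrapper and the update callback could look something like the following. It is deliberately model-agnostic: `predict` and `train_step` are hypothetical callables standing in for your own PyTorch inference and training code, and the `(label, x, y, width, height)` box tuple is an assumed output shape, not something Prodigy prescribes. The span keys (`label`, `points`, `type`) are meant to follow the image_manual JSON format, so check them against the docs before relying on them. The helpers use only the standard library so they can be tried without Prodigy installed.

```python
import base64
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

# Hypothetical prediction type: (label, x, y, width, height) in pixels.
Box = Tuple[str, float, float, float, float]


def data_uri_to_bytes(uri: str) -> bytes:
    """Decode a base64 data URI ("data:image/png;base64,...") to raw bytes."""
    return base64.b64decode(uri.split(",", 1)[1])


def box_to_span(box: Box) -> Dict:
    """Convert one predicted box into a rectangle span with four corner points."""
    label, x, y, w, h = box
    points = [[x, y], [x, y + h], [x + w, y + h], [x + w, y]]
    return {"label": label, "type": "rect", "points": points}


def add_predictions(
    stream: Iterable[Dict], predict: Callable[[bytes], List[Box]]
) -> Iterator[Dict]:
    """Yield each task with the model's boxes attached as "spans".

    Assumes task["image"] is a base64 data URI, as produced by an image
    loader that encodes the image data.
    """
    for task in stream:
        image_bytes = data_uri_to_bytes(task["image"])
        task["spans"] = [box_to_span(box) for box in predict(image_bytes)]
        yield task


def make_update(
    train_step: Callable[[List[Dict]], None]
) -> Callable[[List[Dict]], None]:
    """Build an update callback that passes only accepted answers to train_step."""

    def update(answers: List[Dict]) -> None:
        accepted = [eg for eg in answers if eg.get("answer") == "accept"]
        if accepted:
            train_step(accepted)

    return update
```

In a recipe, you would wrap the loader's output with `add_predictions(stream, predict)` and return the result as the stream. Because it is a generator, predictions are only computed when Prodigy asks for the next batch, so a model updated by the callback is used for later batches.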

There are a few things that are pretty model/data-specific and that you might have to experiment with to find the best and most efficient workflow:

- The right batch size. This depends on your model, and you can configure it by setting Prodigy's batch_size config setting (the number of examples fetched from the stream). Streams are generators and Prodigy will only ever ask for the next batch.
- The best updating strategy. Ideally, your model should be sensitive enough to small updates that you can see the impact of your annotations quickly, but not so sensitive that it overfits. That's a slightly unusual requirement and not something you'd typically optimise your implementation for. It's also possible that the updating ends up being a bit slow, especially if you need to make several passes over the data. You can also consider not updating in the loop at all, and instead retraining your model and restarting the annotation after every X examples.
- Make sure PyTorch isn't launching multiple threads under the hood. Otherwise, your generator may end up getting stuck or run out prematurely. (If you're having problems with this, move your image processing out of the main thread and into a separate process, write to stdout and pipe the data forward to Prodigy.)
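
To make the batch size and threading points concrete, here is a minimal sketch. `build_components` is a hypothetical helper name; the dictionary it returns mirrors what a custom recipe function returns, with batch_size set under "config". Capping the thread pools at one thread is an assumption about what keeps PyTorch from spawning extra threads, and it may cost some inference speed. The `try`/`except` is only there so the snippet also runs where PyTorch isn't installed.

```python
import os

# Assumption: one thread per pool is enough to avoid the threading problem.
# Set these before importing torch so the OpenMP/MKL runtimes pick them up.
os.environ["OMP_NUM_THREADS"] = "1"
os.environ["MKL_NUM_THREADS"] = "1"

try:
    import torch

    torch.set_num_threads(1)  # limit intra-op parallelism on the CPU
except ImportError:
    torch = None  # the rest of the sketch does not need PyTorch


def build_components(dataset, stream, update=None, batch_size=5):
    """Return the components dict a custom recipe function would return."""
    components = {
        "dataset": dataset,
        "stream": stream,
        "view_id": "image_manual",
        # Number of examples fetched from the stream per batch.
        "config": {"batch_size": batch_size},
    }
    if update is not None:
        # Leave the callback out entirely to annotate without in-the-loop updates.
        components["update"] = update
    return components
```

A smaller batch size means an updated model reaches the stream sooner, at the cost of more frequent requests. Passing `update=None` corresponds to the "retrain and restart after every X examples" approach.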
