Image Classification Example?

titani0us · September 6, 2019, 3:48pm

Hello,

I've been looking at the example recipes, and I see there is a documented recipe for annotating images with bounding boxes. Is there a similar recipe for regular image labeling, e.g. "This is a photo of X", where you classify the entire image rather than drawing bounding boxes within it?

It isn't clear from the documentation (at least at this point) how we could build a custom recipe that uses something like MobileNet in TensorFlow to learn from our annotations in Prodigy.

I primarily bought a license hoping Prodigy would help me with an image classification problem, but the solution does not seem obvious at this point.

Any help would be appreciated.

ines · September 6, 2019, 4:13pm

Hi! The specifics of how you integrate the model of course come down to the implementation – but ultimately, you probably want to use the classification interface and stream in examples that look like this:

{"image": "https://example.com/image.jpg", "label": "SOME_LABEL"}
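As a minimal sketch of producing such a stream in Python, assuming you have a list of image URLs and a single label to confirm or reject: the helper name `make_classification_stream` and the URLs below are hypothetical and not part of Prodigy.

```python
def make_classification_stream(image_urls, label):
    """Yield one task dict per image, in the {"image", "label"} format above."""
    for url in image_urls:
        yield {"image": url, "label": label}


# Hypothetical URLs, for illustration only.
urls = ["https://example.com/image1.jpg", "https://example.com/image2.jpg"]
tasks = list(make_classification_stream(urls, "SOME_LABEL"))
```

A custom recipe would typically return a generator like this as its `stream`, with `view_id` set to `classification`.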

If you want to assign multiple labels at the same time, you could also use the choice interface and do something similar to this example – just with an image instead of a text: https://prodi.gy/docs/workflow-custom-recipes#example-choice

You can find more details on the data formats in the "Annotation task formats" section of your PRODIGY_README.html btw.
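For the choice interface, each task needs an `options` list in addition to the image. A rough sketch, assuming every image gets the same set of labels; the helper `add_options`, the label names and the URL are made up for illustration:

```python
def add_options(stream, labels):
    """Attach the same list of choice options to every image task."""
    options = [{"id": label, "text": label} for label in labels]
    for task in stream:
        task = dict(task)  # copy so the incoming task is not mutated
        task["options"] = options
        yield task


# Hypothetical image task and labels, for illustration only.
images = [{"image": "https://example.com/image.jpg"}]
choice_tasks = list(add_options(images, ["CAT", "DOG", "OTHER"]))
```

The recipe would then use `choice` as its `view_id`. As far as I know, allowing several options per image is a matter of setting `"choice_style": "multiple"` in the recipe's `config`, but check the README for your version.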

To label with a model in the loop, you essentially want two main components in your recipe: a function that uses the model to make predictions and yields scored examples, and an update callback that receives answers and updates the model. You might find this example recipe useful, which shows this using a dummy model that "predicts" random numbers:

https://github.com/explosion/prodigy-recipes/blob/master/textcat/textcat_custom_model.py

# coding: utf8
from __future__ import unicode_literals

import prodigy
from prodigy.components.loaders import JSONL
from prodigy.components.sorters import prefer_uncertain
from prodigy.util import split_string
import random


class DummyModel(object):
    # This is a dummy model to help illustrate how to use Prodigy with a model
    # in the loop. It currently "predicts" random numbers – but you can swap
    # it out for any model of your choice, for example a text classification
    # model implementation using PyTorch, TensorFlow or scikit-learn.

    def __init__(self, labels=None):
        # The model can keep arbitrary state – let's use a simple random float
        # to represent the current weights
        self.weights = random.random()

(This file has been truncated.)

In your case, that'd be an image model instead of a text classifier. One thing to keep in mind when choosing a model implementation is that you want your model to be sensitive enough to updates and small batches. After you submit one or two batches, you ideally already want to see a result and see different suggestions. That's not always the default configuration for computer vision models, as they typically expect to be updated with larger batches. (On the other hand, of course, you also don't want it to be too sensitive so that one small mistake immediately ruins your model.)
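To make the two components concrete for images, here is a rough sketch with a stand-in model. `DummyImageModel`, its single toy "weight" and the `learning_rate` value are all assumptions for illustration; a real recipe would wrap an actual image classifier behind the same two methods.

```python
import random


class DummyImageModel:
    """Stand-in for a real image classifier (hypothetical, scores are noise)."""

    def __init__(self, label, learning_rate=0.1, seed=0):
        self.label = label
        self.learning_rate = learning_rate
        self.bias = 0.5  # toy "weight" that drifts toward the accept rate
        self.rng = random.Random(seed)

    def predict(self, stream):
        """Yield (score, task) tuples for each incoming image task."""
        for task in stream:
            noise = self.rng.uniform(-0.2, 0.2)
            score = min(1.0, max(0.0, self.bias + noise))
            task = dict(task, label=self.label)
            task["meta"] = {"score": round(score, 3)}
            yield score, task

    def update(self, answers):
        """Nudge the toy weight toward the annotator's accept/reject answers."""
        for eg in answers:
            if eg.get("answer") not in ("accept", "reject"):
                continue  # skip ignored examples
            target = 1.0 if eg["answer"] == "accept" else 0.0
            self.bias += self.learning_rate * (target - self.bias)
        return self.bias


model = DummyImageModel("SOME_LABEL")
scored = list(model.predict([{"image": "https://example.com/image.jpg"}]))
model.update([{"answer": "accept"}, {"answer": "accept"}])
```

In a recipe, the scored stream would be passed through a sorter such as `prefer_uncertain` (imported in the recipe above) and `model.update` would be returned as the `update` callback. The `learning_rate` here is the knob that plays the role of "sensitivity": larger values make each small batch move the model further.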

titani0us · September 6, 2019, 8:51pm

Hello!

Your response is super-helpful.

It does raise the question, however, of the resource requirements for the machine this is running on. Is there guidance for that?

Additionally, are you aware of any solid vision models that are that sensitive? Even using transfer learning approaches from smaller models like MobileNet, I'm not sure 2-3 training batches would get it done. Are you aware of any good examples elsewhere on the internet that show a practical implementation of what you have described?

honnibal · September 6, 2019, 10:51pm

Regarding the choice of vision models, you might find this comparison helpful (it was done as part of the object detection example in the same repository): https://github.com/explosion/prodigy-recipes/blob/master/image/tf_odapi/docs/model_performance.md

I do think you should be able to find an image classification model that works acceptably on CPU. The benchmarks in the repo refer to object detection, so I think classification should be a bit more efficient.

Regarding the batch sizing and number of examples to update: ultimately you'll need to try this out on your problem and see what's working well.
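One cheap way to get a feel for this before wiring up a real model is to simulate it. The sketch below is only an illustration under strong assumptions: it trains a small logistic-regression "head" with plain SGD on synthetic 2-d features (standing in for embeddings from a frozen network), and reports held-out log loss after each small batch for two learning rates. All names and numbers are made up.

```python
import math
import random


def sigmoid(x):
    x = max(-60.0, min(60.0, x))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-x))


def score(weights, features):
    return sigmoid(sum(w * x for w, x in zip(weights, features)))


def log_loss(weights, examples):
    total = 0.0
    for features, target in examples:
        p = min(max(score(weights, features), 1e-12), 1.0 - 1e-12)
        total -= math.log(p if target else 1.0 - p)
    return total / len(examples)


def simulate(train, held_out, batch_size, learning_rate, n_batches):
    """Update on n_batches small batches; return held-out loss after each."""
    weights = [0.0] * len(train[0][0])
    history = []
    for i in range(n_batches):
        for features, target in train[i * batch_size:(i + 1) * batch_size]:
            error = target - score(weights, features)
            for j, x in enumerate(features):
                weights[j] += learning_rate * error * x
        history.append(log_loss(weights, held_out))
    return history


# Synthetic, roughly separable "embeddings" (made up for illustration).
rng = random.Random(0)
data = []
for _ in range(200):
    target = rng.randint(0, 1)
    center = 1.0 if target else -1.0
    data.append(([rng.gauss(center, 0.5), rng.gauss(center, 0.5)], target))
train, held_out = data[:100], data[100:]

results = {
    lr: simulate(train, held_out, batch_size=10, learning_rate=lr, n_batches=3)
    for lr in (0.01, 0.5)
}
```

On real data you would swap in features from your own model and your own annotations. The point is just to compare how quickly the loss moves after two or three batches at different settings.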

titani0us · September 9, 2019, 7:55pm

Thank you, @honnibal!

Responses and feedback have been stellar here. So cool to see a project that has an actually active community supporting it.
