Classification interface for images

Hi, I’m trying to set up Prodigy to verify image classification results.

I made the file image_verify.py

 import prodigy
 from prodigy.components.loaders import JSON

 @prodigy.recipe('image.verify',
     dataset=("The dataset to use", "positional", None, str),
     source=("Path to json", "positional", None, str)
 )
 def image_verify(dataset, source):
     # Load the tasks (image path + label) from the JSON file
     stream = JSON(source)

     return {
         'view_id': 'classification',
         'dataset': dataset,
         'stream': stream,
     }

and then an images.json file of the form

[{
    "image": "/absolute/path/to/image.jpg",
    "label": "class"
},
...]

Then I ran prodigy image.verify bdd images.json -F image_verify.py, but the images do not show up (though the labels do). I also tried a relative path instead of an absolute one. Am I misunderstanding the data format for the classification interface?

Your format is correct, but I think the problem is this: By default, modern browsers will block images and other resources from local file paths for security reasons. So if you open the developer tools, you probably see something like "blocked local resource". See this thread for more details:

To work around this, you could either serve the files locally on a different port, upload them to an S3 bucket or something similar, or use Prodigy's fetch_images helper, which takes a stream of image tasks and converts all file paths (local and URLs!) to base64-encoded data URIs.

from prodigy.components.preprocess import fetch_images
stream = fetch_images(stream)

Just keep in mind that this will add the image data to the annotation tasks and store it all in the database. So if you're working with lots of really large images, your database can potentially get quite large, too.
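
To make the size trade-off more concrete, here is a rough sketch of what turning a file path into a base64 data URI involves. This is not Prodigy's actual implementation, just an illustration using the standard library; the names to_data_uri and inline_images are hypothetical, and the sketch only handles local paths, not URLs.

```python
import base64
import mimetypes
from pathlib import Path


def to_data_uri(path):
    """Hypothetical helper: encode a local image file as a base64 data URI."""
    # Guess the MIME type from the file extension, e.g. image/jpeg for .jpg
    mime, _ = mimetypes.guess_type(str(path))
    mime = mime or "application/octet-stream"
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def inline_images(stream):
    """Illustration only: swap each task's "image" path for a data URI."""
    for task in stream:
        task = dict(task)  # copy so the incoming task is left unchanged
        task["image"] = to_data_uri(task["image"])
        yield task
```

Base64 represents every 3 bytes of input as 4 characters, so each embedded image ends up roughly a third larger than the file on disk, and that encoded string is what would be saved with each annotation.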

Thanks so much for the quick response! I used fetch_images and all is working now.
