How to export annotated data of an image classification labeling job

Hi,

I used the (adjusted) example code from the documentation with the mark recipe for a binary image classification task:

prodigy mark finger_lens_images ./images --loader images --label FINGER_OVER_LENS --view-id classification

and labeled my images. It worked fine and I saved the annotations.
When I use:

python -m prodigy db-out dataset_name > datafile_name.json

a file with that name is written to the defined location. It is not empty, and it is surprisingly large: about 45,000 KB for 45 labeled images. Opening it with WordPad, I don't see the expected data structure that I know from text classification tasks (e.g. image, label given, etc.). Am I missing something? Is there another way to export the data, or did I do something wrong in the initial command when setting up the labeling environment?

The output looks like one huge entry for just a single image file:

Start of output:
{"image":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAVgAAAFeCAYAAADANQddAAAAAXNSR0IArs4c6QAAAARnQU1BAACxjwv8YQUAAAAJcEhZcwAAFiUAABYlAUlSJPAAAP

In between, there is just encoded data like this: 0IArs4c6QAAAARnQU1BAACxjwv8YQUAAAAJcEhZcwAAFiUAABYlAUlSJPAAAP

End of output:
zOxkwvNmp6IkPcA9tlMML7fIPQoZ7ImLYYaYvWIWPtTijnr0VswEycG+KJ092G4kqXgYji8tGdeiOkQ2+Klan4i0GU7UBc5j4I6zQA/x/wvBelQzk//QAAAABJRU5ErkJggg==","text":"FILENAME_COPY","meta":{"file":"FILENAME:PNG"},"label":"LABEL_DEFINED","_input_hash":857793485,"_task_hash":859886761,"_session_id":null,"_view_id":"classification","answer":"accept"}

Hi! By default, Prodigy will encode the images loaded from disk as base64, so that the image data is saved with the annotations and you never lose the reference to the data that was annotated. You can read more about this here: https://prodi.gy/docs/computer-vision#box-data-format

So the data you're seeing here is the actual encoded images, stored together with the annotations. If the file name in the meta is enough to identify the images you annotated, you can remove the encoded image data.
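
In case it helps, here's a minimal Python sketch for that (not an official Prodigy utility, and the file names are just placeholders):

    import json

    in_path = "datafile_name.json"        # the file written by db-out (newline-delimited JSON)
    out_path = "datafile_no_images.json"  # hypothetical output name

    with open(in_path, encoding="utf8") as f_in, open(out_path, "w", encoding="utf8") as f_out:
        for line in f_in:                 # db-out writes one JSON task per line
            task = json.loads(line)
            # Drop the base64-encoded image data, keeping the file name in "meta"
            if str(task.get("image", "")).startswith("data:"):
                task.pop("image")
            f_out.write(json.dumps(task) + "\n")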

During annotation, you can also use the image-server loader to start a local web server to host the images, or load in JSON(L) data with image URLs instead (e.g. from an S3 bucket). In those cases, you just need to make sure you never lose or rename the original image files, because otherwise your annotations may go to waste: you won't be able to reproduce what you/the annotator saw (and won't be able to train anything from the data etc.).
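
For example, a JSONL input file with image URLs (the bucket and file names below are just made up) could look like this:

    {"image": "https://my-bucket.s3.amazonaws.com/img_001.png", "meta": {"file": "img_001.png"}}
    {"image": "https://my-bucket.s3.amazonaws.com/img_002.png", "meta": {"file": "img_002.png"}}

You'd then point the recipe at that file instead of the image directory, e.g.:

prodigy mark finger_lens_images ./images.jsonl --loader jsonl --label FINGER_OVER_LENS --view-id classification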

Hi and thanks a lot for your answer! This helped a lot.
The point is that I thought the whole file was the output for just one image.