How to export annotated data of an image classification labeling job

Hi,

I used the (adjusted) example code from the documentation with the mark recipe for a binary image classification task:

prodigy mark finger_lens_images ./images --loader images --label FINGER_OVER_LENS --view-id classification

and labeled the images. It worked fine and I saved the annotations.
When I use:

python -m prodigy db-out dataset_name > datafile_name.json

a file with this name is written to the specified location. It is not empty; in fact, it is very large (45,000 KB for 45 labeled images). Opening it in WordPad, I don't see the data structure I expected from text classification tasks (e.g. image, label, etc.). Am I missing something? Is there another way to export the data, or did I do something wrong in the initial command when setting up the labeling environment?

The output looks like one huge entry for just a single image file:

Start of output:
{"image":"

In between, just encoded data like this: 0IArs4c6QAAAARnQU1BAACxjwv8YQUAAAAJcEhZcwAAFiUAABYlAUlSJPAAAP

End of output:
zOxkwvNmp6IkPcA9tlMML7fIPQoZ7ImLYYaYvWIWPtTijnr0VswEycG+KJ092G4kqXgYji8tGdeiOkQ2+Klan4i0GU7UBc5j4I6zQA/x/wvBelQzk//QAAAABJRU5ErkJggg==","text":"FILENAME_COPY","meta":{"file":"FILENAME:PNG"},"label":"LABEL_DEFINED","_input_hash":857793485,"_task_hash":859886761,"_session_id":null,"_view_id":"classification","answer":"accept"}

Hi! By default, Prodigy will encode the images loaded from disk as base64, so that the image data is saved with the annotations and you never lose the reference to the data that was annotated. You can read more about this here: https://prodi.gy/docs/computer-vision#box-data-format

So the data you're seeing here are the actual encoded images, stored together with the annotations. If the file name in the meta is enough to identify the images you annotated, you can remove the encoded image data.
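If you want to strip the encoded images from an export, something along these lines could work. This is only a minimal sketch using the Python standard library, not a built-in Prodigy feature. It assumes the export has one JSON record per line, that the encoded image is stored under the "image" key as a string starting with "data:", and that the file is UTF-8 encoded. The function names strip_image_data and convert are made up for illustration.

```python
import json


def strip_image_data(lines):
    """Yield one record per non-empty JSONL line, without the encoded image.

    Assumption: the base64 image sits under the "image" key as a data URI
    (a string starting with "data:"). Other values, such as URLs, are kept.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        image = record.get("image")
        if isinstance(image, str) and image.startswith("data:"):
            # The file name in "meta" remains as the reference to the image.
            del record["image"]
        yield record


def convert(in_path, out_path):
    """Write a copy of the export with the encoded images removed."""
    with open(in_path, encoding="utf-8") as src, \
            open(out_path, "w", encoding="utf-8") as dst:
        for record in strip_image_data(src):
            dst.write(json.dumps(record) + "\n")
```

For example, convert("datafile_name.json", "datafile_name_small.json") would write a much smaller file that still contains the label, the answer and the meta for every annotated image. Keep the original export or the original image files around, since the stripped file no longer contains the image data itself.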

During annotation, you can also use the image-server loader to start a local web server to host the images, or load in JSON(L) data with image URLs instead (e.g. from an S3 bucket). In those cases, you just need to make sure you never lose or rename the original image files, otherwise your annotations may go to waste because you won't be able to reproduce what you/the annotator saw (and won't be able to train anything from the data etc.).
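As a rough illustration of the JSON(L) option, here is a small sketch that builds one task per image URL, with the URL under the "image" key and the file name kept in the meta. The example.com URLs and the helper names build_tasks and to_jsonl are placeholders for illustration, not part of Prodigy.

```python
import json


def build_tasks(urls):
    """Create one task dict per image URL, keeping the file name in "meta"."""
    tasks = []
    for url in urls:
        file_name = url.rsplit("/", 1)[-1]  # last path segment of the URL
        tasks.append({"image": url, "meta": {"file": file_name}})
    return tasks


def to_jsonl(tasks):
    """Serialize the tasks as JSONL text: one JSON object per line."""
    return "".join(json.dumps(task) + "\n" for task in tasks)


if __name__ == "__main__":
    # Hypothetical URLs; replace with wherever your images are hosted.
    urls = [
        "https://example.com/images/img_001.png",
        "https://example.com/images/img_002.png",
    ]
    with open("images.jsonl", "w", encoding="utf-8") as f:
        f.write(to_jsonl(build_tasks(urls)))
```

With data like this, only the URL and the meta end up in the saved annotations, so the exports stay small, but the images need to remain available at those URLs.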

Hi, and thanks a lot for your answer! This helped a lot.
The thing is, I thought it was the output for just one image.