Using image.manual to correct bounding box annotations

Hello,

I want to use Prodigy to correct the annotations produced by a vision ML algorithm.
I currently have a JSON file that is structured almost exactly like what's shown in the documentation. The only difference is that each top-level key corresponds to the frame number of the video:

{
    "0": [
        {
            "x": 309,
            "y": 380,
            "width": 73,
            "height": 37,
            "points": [[309, 380], [309, 453], [346, 380], [346, 453]],
            "center": [327.5, 416.5],
            "label": 3
        },
        {
            [...]
        }
    ],
    "1": [
        [...]
    ]
}

I have every frame of that video exported to a folder (each named "[FrameNumber].png").
I want to use image.manual to review and correct the annotations.

I'm facing two problems:

  1. How do I format the JSON file in order to import it into the Prodigy database?

Prodigy doesn't seem to like the way I use a dictionary to link each frame number to its annotations.
Instead of using a dictionary, am I forced to store every image (as base64-encoded image data) directly in the imported JSON file? Isn't there a way to link directly to a local file?

  2. How do I export the database without the base64 image data associated with each annotation?

When I use the db-out command on a dataset, it gives me the annotations in the correct format, but the file also contains the base64 image data.
Is there a way to export only the annotation data from the database and not the image data?

Thanks,

Best,
Gautier

You don't need to import anything upfront to annotate it with Prodigy – you can also write your own custom recipe that takes the input data in your format and generates annotation examples from it. Or you can generate them as a preprocessing step – that's up to you :slightly_smiling_face:
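If you go the recipe route, a minimal sketch could look something like this – the recipe and dataset names are made up, and it assumes you've already converted your data to a JSONL file of tasks (a sketch of that conversion follows below):

import prodigy
from prodigy.components.loaders import JSONL

@prodigy.recipe("image-boxes")
def image_boxes(dataset, source):
    stream = JSONL(source)  # pre-converted tasks, one JSON object per line
    return {
        "dataset": dataset,        # dataset the annotations are saved to
        "stream": stream,          # iterable of annotation tasks
        "view_id": "image_manual"  # manual bounding box interface
    }

You'd then start it with something like: prodigy image-boxes frame_boxes ./tasks.jsonl -F recipe.py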

The only thing that's important is that the data that gets sent out by your stream follows the expected JSON format. You can find an example of the data format here: Annotation interfaces · Prodigy · An annotation tool for AI, Machine Learning & NLP. Each example should have a key "image" and a key "spans" containing a list of bounding boxes, just like the ones you already have in your data.
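For example, a preprocessing script that converts your per-frame dictionary into that format could look roughly like this – the file paths, the localhost URL and the numeric-to-string label mapping are all assumptions you'd adapt to your setup:

import json

LABELS = {3: "CAR"}  # hypothetical mapping of your numeric label IDs to names

with open("annotations.json", encoding="utf8") as f:
    frames = json.load(f)

with open("tasks.jsonl", "w", encoding="utf8") as f:
    for frame_number, boxes in frames.items():
        task = {
            # URL the frame is served under, e.g. by a local web server
            "image": f"http://localhost:8000/{frame_number}.png",
            # one span per box, with the four corners in clockwise order
            "spans": [
                {
                    "points": [
                        [b["x"], b["y"]],
                        [b["x"] + b["width"], b["y"]],
                        [b["x"] + b["width"], b["y"] + b["height"]],
                        [b["x"], b["y"] + b["height"]],
                    ],
                    "label": LABELS.get(b["label"], str(b["label"])),
                }
                for b in boxes
            ],
        }
        f.write(json.dumps(task) + "\n")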

You don't have to use base64 – you can also set "image" to a URL (e.g. hosted in an S3 bucket or a local web server). The problem with local file paths is mainly that modern browsers typically block them for security reasons – so you either want to send the image data with each task (works well for smaller images) or serve them somewhere. Prodigy also comes with Images and ImageServer loaders that help you load/serve files from a directory: https://prodi.gy/docs/api-loaders#loaders-file
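For example, if you'd rather send the image data along with each task, you could combine the Images loader with your existing boxes – this sketch assumes the loader records the file name in each task's "meta", and reuses the same corner conversion as above:

import json
from pathlib import Path
from prodigy.components.loaders import Images

with open("annotations.json", encoding="utf8") as f:
    frames = json.load(f)

def add_spans(stream):
    for eg in stream:
        # assumption: the loader stores the file name in eg["meta"]["file"]
        frame_number = Path(eg["meta"]["file"]).stem  # "42.png" -> "42"
        eg["spans"] = [
            {
                "points": [
                    [b["x"], b["y"]],
                    [b["x"] + b["width"], b["y"]],
                    [b["x"] + b["width"], b["y"] + b["height"]],
                    [b["x"], b["y"] + b["height"]],
                ],
                "label": str(b["label"]),
            }
            for b in frames.get(frame_number, [])
        ]
        yield eg

stream = add_spans(Images("./frames"))  # tasks with base64 image data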

The --remove-base64 flag on the built-in image.manual recipe will remove the base64-encoded data before the examples are placed in the database. See here: Built-in Recipes · Prodigy · An annotation tool for AI, Machine Learning & NLP
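For example (the dataset name and labels here are made up):

prodigy image.manual frame_boxes ./frames --label CAR,PERSON --remove-base64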

In a custom recipe, you can use the before_db callback to implement this (or make any other modifications to the JSON data before saving it): Custom Recipes · Prodigy · An annotation tool for AI, Machine Learning & NLP
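A minimal sketch of such a callback, assuming your stream stored the original file path in each task's "meta" when the examples were created:

def before_db(examples):
    # replace the base64 data URI with the stored file path, so the
    # database only keeps the annotation data, not the image itself
    for eg in examples:
        if eg["image"].startswith("data:"):
            eg["image"] = eg["meta"]["file"]
    return examples

You'd then add "before_db": before_db to the components dictionary returned by your recipe.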

Thank you very much for your detailed answers! It now works perfectly. :slightly_smiling_face:

Regards,

Gautier
