How to visualise annotated images with their corresponding labels after annotation finishes in Prodigy?


(rohit) #1

I have annotated images with 5 choices in total for the image classification task.

Now I want to iterate over these annotated images again, with the
selected label shown for each one, so I can visualize the dataset and
verify whether I have assigned the correct label.

Please help me out with how to do this.

(Ines Montani) #2

Hi! Prodigy’s input and output formats are pretty much identical, so if you’ve annotated data, you can always export it and load it back in using a recipe like mark, which just shows you exactly what’s in the data.

So for example, let’s say you have a dataset image_choice and you’ve used the choice interface to collect annotations. You could then export the dataset and load it back in with mark and the given interface name. For example:

prodigy db-out image_choice > image_choice_data.jsonl
prodigy mark image_choice_verification image_choice_data.jsonl --view-id choice

Don’t forget to use a different dataset name for the second round – you definitely want a clear separation between the two. Also, if you’ve rejected examples during the initial annotation, you might want to filter the exported JSONL data and only show the annotations that were accepted. For example, like this:

from prodigy.util import read_jsonl, write_jsonl

# Wrap in list() so the examples can be iterated over more than once
data = list(read_jsonl('/path/to/image_choice_data.jsonl'))
accepted = [eg for eg in data if eg['answer'] == 'accept']
write_jsonl('/path/to/image_choice_data_accepted.jsonl', accepted)
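
Before starting the second pass, a quick per-label count of the exported data can help spot obvious problems. Here is a minimal sketch using only the standard library. It assumes each exported record has an "answer" key and that the choice interface stores the selected option IDs as a list under "accept". The function name count_selected_labels and the sample records are hypothetical and only there for illustration.

```python
import json
from collections import Counter


def count_selected_labels(jsonl_lines):
    """Count how often each option ID was selected among accepted examples."""
    counts = Counter()
    for line in jsonl_lines:
        line = line.strip()
        if not line:  # skip blank lines
            continue
        eg = json.loads(line)
        if eg.get("answer") != "accept":
            continue  # ignore rejected or skipped examples
        # Assumption: selected option IDs are stored as a list under "accept"
        counts.update(eg.get("accept", []))
    return counts


# Hypothetical sample records standing in for lines of the exported file
sample = [
    '{"image": "a.jpg", "accept": ["CAT"], "answer": "accept"}',
    '{"image": "b.jpg", "accept": ["DOG"], "answer": "accept"}',
    '{"image": "c.jpg", "accept": ["CAT"], "answer": "reject"}',
]
label_counts = count_selected_labels(sample)
print(label_counts)
```

With a real export, you could pass an open file handle instead of the sample list, since iterating over a file yields one line at a time.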

Finally, you could also use a script like this to slightly modify the data and display it differently. For instance, let’s say you’ve annotated the images in the choice view, but you want to view them as one image with the selected label on top. You could then format each task to have the original "image" and a "label", using the one that was selected:

converted_data = []
for eg in data:  # Your loaded annotations
    # Get the label(s) selected in the choice interface. This will
    # be a list like ['LABEL_ID'] or multiple if multiple choice is allowed
    accepted = eg['accept']
    for label in accepted:  # Create one new example per label
        new_eg = {"image": eg["image"], "label": label}