How to access and edit an image dataset that has already been annotated with Prodigy?

I annotated a dozen images with bounding boxes in Prodigy and accepted each one. I would now like to access this annotated dataset again and go back over each image to decide whether to modify it, but I cannot return to the images once they have been accepted.

I tried running these commands, but nothing happens.

prodigy db-out test > ./Documents/test.jsonl
prodigy mark retest ./Documents/test.jsonl --view-id choice

hi @Mat!

Thanks for your question and welcome to the Prodigy community :wave:

Have you seen this post:

As it mentions, you may want to load images by URL using the image-server loader.

Also related, are you aware of base64 encoding?

Using base64-encoded data URIs and storing the image data with the annotation task is the safest way to ensure that you never lose the reference to the original data. If you only store URLs or file names and the original files are ever renamed or get lost, your annotations will be useless. However, keep in mind that all task data will also be stored in the database – including the base64-encoded images. In some cases, this can lead to unexpected results and database bloat.
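
To make that trade-off concrete, here is a minimal sketch of what a base64 data URI contains and how it compares in size to the raw bytes. The helper name `to_data_uri` and the fake payload are hypothetical; it uses only the Python standard library, not Prodigy's own loader.

```python
import base64


def to_data_uri(raw: bytes, mime_type: str = "image/jpeg") -> str:
    """Encode raw bytes as a base64 data URI (hypothetical helper)."""
    encoded = base64.b64encode(raw).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# Stand-in for the bytes of an image file (not a real JPEG).
raw = bytes(range(256)) * 12
uri = to_data_uri(raw)

# Base64 turns every 3 bytes into 4 characters, so the stored string is
# roughly a third larger than the file, and it is saved with every task.
print(len(raw), len(uri))
```

With real photos of a few megabytes each, that extra string ends up in every saved task, which is where the database bloat comes from.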

Therefore, you may want to turn off base64 encoding when annotating. If you do that, your annotations will look like this (that is, "image" is the URL, not the base64-encoded data):

{
  "image": "",
  "spans": [{"points": [[155, 15], [305, 15], [305, 160], [155, 160]], "label": "LAPTOP"}]
}
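
One way to check which form your existing annotations use is to inspect the file written by db-out. This is a minimal sketch, assuming each line of the export is a JSON object with an "image" key and an optional "spans" list; the function name `summarize_task` and the sample records are made up for illustration.

```python
import json


def summarize_task(line: str) -> dict:
    """Report how one exported task stores its image (hypothetical helper)."""
    task = json.loads(line)
    image = task.get("image", "")
    return {
        "embedded": image.startswith("data:"),  # True for base64 data URIs
        "n_spans": len(task.get("spans", [])),
    }


# Two made-up records standing in for lines of an exported .jsonl file.
sample_lines = [
    json.dumps({"image": "data:image/jpeg;base64,AAAA", "spans": []}),
    json.dumps({
        "image": "http://example.com/laptop.jpg",
        "spans": [{"points": [[155, 15], [305, 15], [305, 160], [155, 160]],
                   "label": "LAPTOP"}],
    }),
]

summaries = [summarize_task(line) for line in sample_lines]
for summary in summaries:
    print(summary)
```

To run it on a real export, you would loop over the lines of the .jsonl file instead of `sample_lines`.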

Here's a related post that includes a snippet you can use in a custom recipe to prevent base64 encoding, so that you can use the mark recipe:
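
The core of such a recipe is a before_db callback. Here is a minimal sketch, which may differ from the snippet in that post: it assumes each task keeps the original file location under a "path" key, and the sample tasks are made up for illustration.

```python
def before_db(examples):
    """Swap embedded base64 image data for the original file path."""
    for eg in examples:
        # Only rewrite tasks that embed the image and also record where
        # it came from (the "path" key is an assumption in this sketch).
        if eg["image"].startswith("data:") and "path" in eg:
            eg["image"] = eg["path"]
    return examples


# Made-up tasks: one embedded image with a path, one already a URL.
tasks = [
    {"image": "data:image/jpeg;base64,AAAA", "path": "images/a.jpg"},
    {"image": "http://example.com/b.jpg"},
]
cleaned = before_db(tasks)
print([eg["image"] for eg in cleaned])
```

In a custom recipe, this function would be returned under the "before_db" key of the recipe's component dictionary, so it runs on the annotated examples just before they are saved.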

Hopefully these two posts can help you out. Let me know if you have any further questions!

Hello, thank you for your help. I tried to make my own custom recipe but I can’t get what I want. Here’s my code:

import prodigy
from prodigy.components.loaders import Images
from prodigy.util import b64_uri_to_bytes
from prodigy.components.db import connect
import glob
import os
import sys

# Lets you modify the examples before they are placed in the database.
# examples (list): list of annotated example dictionaries.

def before_db(examples,data_source_directory):
    liste_image = glob.glob(data_source_directory + '/*.jpg')

    for eg in examples:
        # If the image is a base64 string and the path to the original file
        # is present in the task, remove the image data
        if eg["image"].startswith("data:"):
            for image in liste_image:
                if eg["meta"]["file"] in os.path.basename(image):
                    eg["image"] = data_source_directory+"/"+eg["meta"]["file"]
    return examples

@prodigy.recipe(
    "modify-images",
    dataset_name=("The dataset to use", "positional", None, str),
    data_source_directory=("data_image_directory", "positional", None, str),
    loader=("Comma-separated label(s)", "option", "i", str),
    label=("Comma-separated label(s)", "option", "l", list),
)
def modify_images(dataset_name: str, data_source_directory: str, loader, label):
    db = connect()
    examples = db.get_dataset(dataset_name)
    return {
        "stream": examples,
        "view_id": loader,
    }

The problem I am trying to solve is this: when I annotate photos from my local files with the "image.manual" recipe and save the annotations, I no longer have any access to them, so I cannot make changes or corrections.
With this code, I would like to access my annotated dataset again and be able to modify it.
I find it strange that Prodigy does not offer a function or recipe for going back to modify or correct annotations after the annotated images have been saved.

I executed the following command:

prodigy modify-images testv ../Downloads/OneDrive_1_29_08_2022 --loader image --label FENETRE_OUVERTE,FENETRE_FERMEE,PORTE_OUVERTE,PORTE_FERMEE -F ./

Thank you in advance.