How to access and remodel an image dataset that has already been annotated with Prodigy?

hi @Mat!

Thanks for your question and welcome to the Prodigy community :wave:

Have you seen this post:

As it mentions, you may want to load images by URL using the image-server loader.

Also related, are you aware of base64 encoding?

Using base64-encoded data URIs and storing the image data with the annotation task is the safest way to ensure that you never lose the reference to the original data. If you only store URLs or file names and the original files are ever renamed or get lost, your annotations will be useless. However, keep in mind that all task data will also be stored in the database, including the base64-encoded images. In some cases, this can lead to unexpected results and database bloat.

Therefore, you may want to turn off base64 encoding when annotating. If you do that, your annotations will only look like this (that is, "image" is the URL, not the base64-encoded data):

{
  "image": "https://images.unsplash.com/photo-1554415707-6e8cfc93fe23?w=400",
  "spans": [{"points": [[155, 15], [305, 15], [305, 160], [155, 160]], "label": "LAPTOP"}]
}
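
Once the annotations look like that, they're fairly easy to post-process. Here's a rough sketch of reading them back and turning each span's polygon into a bounding box. It assumes you've exported the dataset to JSONL first (for example with `prodigy db-out`); the helper names `polygon_to_bbox` and `read_boxes` are just made up for illustration and aren't part of Prodigy:

```python
import json


def polygon_to_bbox(points):
    """Return (x, y, width, height) for a list of [x, y] points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)


def read_boxes(jsonl_lines):
    """Yield (image, label, bbox) for every span in every exported task."""
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the export
        task = json.loads(line)
        # Tasks without annotations may have no "spans" key at all
        for span in task.get("spans", []):
            yield task["image"], span["label"], polygon_to_bbox(span["points"])


# The example task from above, as a single JSONL line
example = (
    '{"image": "https://images.unsplash.com/photo-1554415707-6e8cfc93fe23?w=400", '
    '"spans": [{"points": [[155, 15], [305, 15], [305, 160], [155, 160]], '
    '"label": "LAPTOP"}]}'
)
boxes = list(read_boxes([example]))
```

With a real export you'd pass an open file handle instead of the one-item list, since `read_boxes` only needs an iterable of lines.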

Here's a related post that includes a snippet you can use in a custom recipe to prevent base64 encoding while still using the mark recipe:
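
The snippet from that post isn't reproduced here, but the general idea can be sketched roughly as follows. This is only a minimal sketch: it assumes each task still carries the original file location under a `"path"` key (which depends on the loader you use), and that your custom recipe returns this function as its `before_db` callback so it runs just before examples are saved:

```python
def before_db(examples):
    """Swap base64 data URIs for the original path before saving."""
    for eg in examples:
        image = eg.get("image", "")
        # Assumption: the loader stored the source location under "path"
        if isinstance(image, str) and image.startswith("data:") and "path" in eg:
            eg["image"] = eg["path"]
    return examples
```

Tasks whose `"image"` is already a URL or file name are left untouched, so it should be safe to apply to mixed data.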

Hopefully these two posts can help you out. Let me know if you have any further questions!