What is Prodigy's behaviour in annotating two identical images?

Hypothetically, in a computer vision job, there can be two identical images (i.e. identical base-64 encoded image strings) in the input jsonl file. In this scenario, does Prodigy repeat the same images twice or does it only provide the task once?

Apologies if the answer is available in an FAQ page somewhere. I spent about 15 minutes browsing past topics and the documentation and was not able to find it myself.

Thank you!

Hi! By default, those two images (assuming they're actually identical) would receive the same _input_hash values so Prodigy would consider them identical. If you're using a workflow like image.manual or another recipe that excludes based on input, you would only see this image once and the second image would be skipped. You can also assign your own hashes if you want to, e.g. if you want to treat two images as identical, even though their bytes are slightly different.
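To make the idea concrete, here is a minimal sketch of input-based deduplication using only the standard library. It is not Prodigy's actual implementation: the `input_hash` and `dedupe_by_input` helpers are hypothetical, SHA-256 hex strings are just a stand-in for whatever hash values Prodigy generates internally, and the sample base64 strings are made up. The only thing taken from the answer above is that tasks sharing an `_input_hash` are treated as the same input, and that you can set that value yourself.

```python
import hashlib
import json


def input_hash(task, keys=("image",)):
    """Hash only the input fields of a task (here: the base64 image string)."""
    payload = json.dumps({k: task.get(k) for k in keys}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def dedupe_by_input(stream):
    """Yield each task once per distinct input hash, skipping later duplicates."""
    seen = set()
    for task in stream:
        # Keep a hash that was assigned by hand, otherwise compute one.
        task.setdefault("_input_hash", input_hash(task))
        if task["_input_hash"] in seen:
            continue
        seen.add(task["_input_hash"])
        yield task


tasks = [
    {"image": "data:image/png;base64,AAAA"},
    {"image": "data:image/png;base64,AAAA"},  # byte-identical duplicate
    # Different bytes, but the same hand-assigned hash, so treated as one image.
    {"image": "data:image/png;base64,AAAB", "_input_hash": "custom"},
    {"image": "data:image/png;base64,AAAC", "_input_hash": "custom"},
]
unique = list(dedupe_by_input(tasks))
```

In this sketch `unique` ends up with two tasks: the first of the byte-identical pair, and the first of the pair that shares the custom hash.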

There are of course workflows where you want to exclude based on the task hash instead (a combination of the input + annotations), for example if you're classifying images with binary labels and want to ask multiple questions about the same image. This section has some more details on how hashing and deduplication work: https://prodi.gy/docs/api-loaders#hashing
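As a rough illustration of the difference between the two hashes, the sketch below asks several binary questions about one image. Again, this is an assumption-laden example rather than Prodigy's own code: `stable_hash` and `add_hashes` are hypothetical helpers, the labels and image string are invented, and the task hash is simply derived here from the input hash plus the label to mirror the "input + annotations" description.

```python
import hashlib
import json


def stable_hash(value):
    """Deterministic hash of a JSON-serialisable value."""
    encoded = json.dumps(value, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def add_hashes(task):
    """Attach an input hash (image only) and a task hash (image + label)."""
    task["_input_hash"] = stable_hash({"image": task["image"]})
    task["_task_hash"] = stable_hash(
        {"input": task["_input_hash"], "label": task.get("label")}
    )
    return task


image = "data:image/png;base64,AAAA"
# Two distinct questions about the same image, plus one repeated question.
questions = [
    add_hashes({"image": image, "label": label})
    for label in ("CAT", "DOG", "CAT")
]

input_hashes = {q["_input_hash"] for q in questions}
task_hashes = {q["_task_hash"] for q in questions}
```

All three questions share one input hash, so excluding by input would show the image only once. Excluding by task hash keeps the CAT and DOG questions as separate tasks and drops only the repeated CAT question.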