Yes, that's pretty much what I had in mind: for each example annotated with bounding boxes, you create a new example for each bounding box that only contains that one bounding box. In code, it would look something like this:
import copy
from prodigy.components.db import connect

db = connect()  # connect to the database Prodigy is configured to use
dataset = db.get_dataset("your_dataset_here")
examples = []
for task in dataset:
    for span in task.get("spans", []):  # create one example per bounding box
        eg = copy.deepcopy(task)  # deepcopy the example for each bounding box
        eg["spans"] = [span]  # add single bounding box
        examples.append(eg)  # collect the new single-box example
You could then use examples as the input stream and add a key "options" to each example with the multiple choice options to choose from (color, other attributes etc.). You could also sort the list of examples before you send them out, for example, by span["label"], so you do all shirts first, then all pants, and so on.
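Here's a rough sketch of what adding the options and sorting could look like. The option IDs and texts are made up, and the sample list only stands in for the single-box examples, so treat the names as placeholders. It assumes each option is a dict with an "id" and a "text", which is the shape the choice interface expects as far as I know:

```python
import copy

# Hypothetical multiple choice options; replace with your own attributes
OPTIONS = [
    {"id": "blue", "text": "Blue"},
    {"id": "red", "text": "Red"},
    {"id": "other", "text": "Other"},
]


def add_options_and_sort(examples, options=OPTIONS):
    """Attach the same options to every single-box example and sort by label."""
    with_options = []
    for eg in examples:
        eg = copy.deepcopy(eg)  # don't modify the original examples
        eg["options"] = copy.deepcopy(options)
        with_options.append(eg)
    # Each example holds exactly one span here, so sort on that span's label
    return sorted(with_options, key=lambda eg: eg["spans"][0]["label"])


# Tiny made-up examples standing in for the single-box examples
sample = [
    {"image": "a.jpg", "spans": [{"label": "SHIRT"}]},
    {"image": "a.jpg", "spans": [{"label": "PANTS"}]},
    {"image": "b.jpg", "spans": [{"label": "SHIRT"}]},
]
stream = add_options_and_sort(sample)
print([eg["spans"][0]["label"] for eg in stream])
```

Python's sort is stable, so examples with the same label keep their original order relative to each other.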
It could also be cool to output some stats at this stage: for example, how many items of clothing are there, which types are common? Which types are super rare, and what could potentially be a mistake?
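For the stats, a Counter over the span labels probably gets you most of the way. This is just a sketch: the sample data is invented, and rare_threshold is an arbitrary cutoff I picked for illustration, not anything built into Prodigy:

```python
from collections import Counter


def label_stats(examples, rare_threshold=1):
    """Count labels across all spans and list the ones seen very few times."""
    counts = Counter(
        span["label"] for eg in examples for span in eg.get("spans", [])
    )
    # Labels at or below the threshold may be genuinely rare, or mistakes
    rare = sorted(label for label, n in counts.items() if n <= rare_threshold)
    return counts, rare


# Made-up single-box examples for illustration
sample = [
    {"spans": [{"label": "SHIRT"}]},
    {"spans": [{"label": "SHIRT"}]},
    {"spans": [{"label": "SHIRT"}]},
    {"spans": [{"label": "PANTS"}]},
    {"spans": [{"label": "PANTS"}]},
    {"spans": [{"label": "SCARF"}]},
]
counts, rare = label_stats(sample)
for label, n in counts.most_common():
    print(f"{label}: {n}")
print("Rare labels worth a second look:", rare)
```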
If you wanted to add more automation, you could also do that here when you create and sort the examples. Each span gives you access to the bounding box information, and eg["image"] is typically the base64-encoded image data or the image URL. So using an image library, you could already try to guess the dominant color (e.g. like this) and group the examples by it. That way, your annotators can go through lots of examples in a row, and all they have to think about is something like "are these pants blue? (and if not, what are they?)".
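To make the color idea a bit more concrete, here's a dependency-free sketch. It assumes the pixels are already available as rows of (r, g, b) tuples (with real data you'd decode eg["image"] with an image library first), and that each span has a "points" list of [x, y] pairs. The palette, the helper names and the toy image are all hypothetical:

```python
from collections import Counter, defaultdict

# Hypothetical reference palette; extend it with the colors you care about
PALETTE = {
    "black": (0, 0, 0),
    "white": (255, 255, 255),
    "red": (255, 0, 0),
    "green": (0, 128, 0),
    "blue": (0, 0, 255),
}


def bounding_rect(points):
    """Turn a list of [x, y] points into (left, top, right, bottom)."""
    xs = [int(x) for x, _ in points]
    ys = [int(y) for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


def nearest_color(rgb, palette=PALETTE):
    """Name of the palette color with the smallest squared RGB distance."""
    return min(
        palette,
        key=lambda name: sum((a - b) ** 2 for a, b in zip(rgb, palette[name])),
    )


def guess_color(pixels, rect):
    """Most common palette color inside rect, or None for an empty box."""
    left, top, right, bottom = rect
    votes = Counter(
        nearest_color(pixels[y][x])
        for y in range(top, bottom)
        for x in range(left, right)
    )
    return votes.most_common(1)[0][0] if votes else None


# Toy 4x4 "image": left half bluish, right half reddish
pixels = [[(10, 20, 200)] * 2 + [(220, 30, 30)] * 2 for _ in range(4)]
sample = [
    {"spans": [{"label": "SHIRT", "points": [[2, 0], [4, 0], [4, 4], [2, 4]]}]},
    {"spans": [{"label": "PANTS", "points": [[0, 0], [2, 0], [2, 4], [0, 4]]}]},
]

# Group the single-box examples by their guessed color
groups = defaultdict(list)
for eg in sample:
    rect = bounding_rect(eg["spans"][0]["points"])
    groups[guess_color(pixels, rect)].append(eg)
print({color: len(egs) for color, egs in groups.items()})
```

Voting per pixel is only one simple way to define "dominant"; averaging the box or clustering the colors would be reasonable alternatives, and the guess only needs to be good enough to pre-sort the queue, since the annotator still confirms or corrects it.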