Prodigy saved annotations labeled incorrectly

I am going back through my saved annotations in Prodigy and realized that some of them were saved incorrectly. Any reason this is happening? For instance, a light blue region should be labeled speaker_01 but is instead marked as overlap in certain areas. The same thing is happening for another speaker: the region is shown in light green but is marked as speaker_01, speaker_02 and speaker_03 in different areas, when it should all say speaker_03. This did not happen in every saved annotation, but it affects a large portion of the files.

Welcome to the forum @Chan_cL! :waving_hand:

Thanks for reporting this — we were able to identify the issue.
Unfortunately, there's a bug where resizing or dragging an existing region can overwrite its label with whatever label is currently selected in the sidebar.
So if you had "overlap" selected and dragged a "speaker_01" region even slightly, that region would silently become "overlap". This also explains why some annotations are fine: the bug only triggers when you interact with a region while a different label is active.

Can you confirm whether you were resizing or repositioning any of the regions during annotation? That would match the pattern we're seeing.

Regarding the saved data: once the annotations were saved, the incorrect labels were persisted to the database, and Prodigy doesn't keep a revision history for individual annotations.
That said, you should be able to determine the correct labels from the color field saved with each region in the database, which will correspond to the original label's color.

To build a recovery script, we need to know:

  1. Did you use custom label_colors in your config/prodigy.json? If so, what were the values?
  2. What was the exact --label argument you passed to the recipe, and in what order? (e.g., --label speaker_01,speaker_02,speaker_03,overlap) — the order matters because colors are assigned by position.
  3. What's the dataset name?

With that info we can provide you with a script that maps the colors back to the correct labels and patches the dataset.
We're working on a fix for the underlying issue. Sorry for the trouble.
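
In the meantime, here is a rough sketch of what the color-based recovery could look like. It assumes each saved region sits under an `audio_spans` key and carries `label` and `color` fields. The `COLOR_TO_LABEL` values are made-up placeholders, not Prodigy's actual defaults; the real mapping depends on your answers to the questions above. `restore_labels` and `normalize_color` are hypothetical helper names, not part of Prodigy.

```python
from copy import deepcopy

# Placeholder mapping (assumption): replace these with the colors that were
# actually assigned to each label in your setup.
COLOR_TO_LABEL = {
    "rgba(0,150,255,0.3)": "speaker_01",
    "rgba(255,120,0,0.3)": "speaker_02",
    "rgba(0,200,100,0.3)": "speaker_03",
    "rgba(200,0,200,0.3)": "overlap",
}


def normalize_color(color):
    """Strip whitespace and lowercase so equivalent color strings compare equal."""
    return "".join((color or "").split()).lower()


def restore_labels(examples, color_to_label, spans_key="audio_spans"):
    """Return patched copies of the examples plus the number of spans changed.

    A span's label is overwritten only when its color is found in the mapping
    and the stored label disagrees with it. The input list is left untouched.
    """
    lookup = {normalize_color(c): label for c, label in color_to_label.items()}
    fixed, n_changed = [], 0
    for eg in examples:
        eg = deepcopy(eg)
        for span in eg.get(spans_key, []):
            expected = lookup.get(normalize_color(span.get("color")))
            if expected is not None and span.get("label") != expected:
                span["label"] = expected
                n_changed += 1
        fixed.append(eg)
    return fixed, n_changed


if __name__ == "__main__":
    # Tiny made-up example: a speaker_01-colored region that was saved as "overlap".
    demo = [{"audio_spans": [
        {"start": 0.0, "end": 1.5, "label": "overlap", "color": "rgba(0, 150, 255, 0.3)"},
    ]}]
    patched, changed = restore_labels(demo, COLOR_TO_LABEL)
    print(changed, patched[0]["audio_spans"][0]["label"])
```

One way to apply this without touching the original data: export the dataset with `prodigy db-out`, run the examples through `restore_labels`, and load the result into a new dataset with `prodigy db-in`. That way the original dataset stays intact until you've checked the output.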