Restarting a custom recipe without overwriting annotations stored in the database

joebuckle · November 5, 2022, 4:12am

Good day,

We created a custom recipe where we annotate audio files with categories for text classification, but also at the same time, allow editing the transcript of the audio file. After finishing an entire dataset and saving it (no more tasks), is there a way for us to restart reviewing the dataset, to allow us to go through the transcripts again to update them, without overwriting what is already in the prodigy database?

Regards,
Joe

ryanwesslen · November 7, 2022, 2:35pm

hi @joebuckle!

Thanks for your question!

I'm a little confused by you using both "reviewing" and "restart".

Typically, when I hear "review", I think of something like using the review recipe, where a person reviews others' annotations.

Did you mean that? If so, then the review recipe would be your best starting place.

Or did you mean that you simply want to "annotate again" using your same custom recipe (hence "restart")?

If it's the second, are you aware of multi-user sessions?

To do this, you can restart your process (e.g., python -m prodigy custom_recipe ...) saving into the same dataset, but instead of opening the launched URL as-is, append ?session=new_session to it (where new_session can be any name you like). Your new annotations will then be saved in the same dataset, each with a _session_id key that lets you tell them apart from the earlier ones.

If you don't specify a session name, Prodigy automatically uses a timestamp. So any earlier annotations in that dataset that were saved without an explicit session name can still be identified; they just have a timestamp session ID rather than a named one.
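
As a rough sketch of how the session key can be used afterwards: assuming the annotations have been exported from the dataset as a list of dicts (for instance with prodigy db-out), they can be grouped by _session_id. The session_url and group_by_session helpers and the http://localhost:8080 address are illustrative assumptions, not part of Prodigy's API.

```python
from collections import defaultdict
from urllib.parse import urlencode


def session_url(base_url, session_name):
    """Append the ?session=... query parameter to the app URL (hypothetical helper)."""
    return f"{base_url.rstrip('/')}/?{urlencode({'session': session_name})}"


def group_by_session(examples):
    """Group exported annotations by their _session_id key."""
    groups = defaultdict(list)
    for eg in examples:
        # Examples saved without a session id are collected under None.
        groups[eg.get("_session_id")].append(eg)
    return dict(groups)


# Assumed local address; use whatever URL Prodigy prints on startup.
print(session_url("http://localhost:8080", "second_pass"))
```

Grouping this way keeps the first pass and the second pass side by side in one dataset while still letting you pull out either one on its own.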

Let us know if this answers your question or if you have further questions!

joebuckle · November 7, 2022, 4:12pm

We want to annotate again using our custom recipe from the beginning of the data, but we also want to see what annotations/labels/categories have been saved before in the UI, not annotate again from scratch.

ryanwesslen · November 7, 2022, 5:08pm

Ah. It sounds like you want a recipe to "correct" your annotations.

Do you have a model?

Prodigy's typical workflow (manual -> correct recipes) is based on the idea that the correct recipe has a model in the loop and you correct that model's predictions.

If not, that's okay. Without a model, the custom recipe may be more like review than correct, but the same idea applies.

I would suggest creating a second custom recipe. From a workflow-organization standpoint, a separate recipe will make things easier down the road than trying to add this new correcting task to the same recipe.
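
A minimal sketch of the stream such a second recipe could use, assuming the saved annotations are already loaded as a list of dicts and carry the usual _input_hash, _timestamp and answer keys. The function names and the "transcript"/"accept" fields in the usage note are hypothetical; your recipe's task format may differ. The idea is to keep the latest saved version of each input and send it back out with its earlier labels and transcript still in place, so the UI can show them pre-filled.

```python
import copy

# Keys written when an answer is saved; dropped here so the task is
# presented as unanswered again (assumed field names).
META_KEYS = ("answer", "_timestamp", "_session_id", "_annotator_id")


def latest_per_input(examples):
    """Keep only the most recent saved annotation for each input."""
    latest = {}
    for eg in examples:
        key = eg.get("_input_hash")
        seen = latest.get(key)
        if seen is None or eg.get("_timestamp", 0) >= seen.get("_timestamp", 0):
            latest[key] = eg
    return list(latest.values())


def correction_stream(examples):
    """Yield copies of the latest annotations with the answer metadata removed."""
    for eg in latest_per_input(examples):
        task = copy.deepcopy(eg)  # leave the loaded originals untouched
        for key in META_KEYS:
            task.pop(key, None)
        yield task
```

In a recipe, the generator would be returned as the "stream". For example, correction_stream(saved) on records that contain "transcript" and "accept" yields tasks with those two fields intact. As far as I know, Prodigy skips tasks whose hashes are already in the target dataset by default, so saving the corrected round to a separate dataset is the simpler setup.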

For the new recipe, check out this thread:

If, after the second (correct) round, you want a third/final round to compare, you may find the diff UI helpful. There you could combine the first (manual/current recipe) and the second (corrected) annotations at the same time.
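
A rough sketch of how the two rounds could be paired up for that comparison, assuming both are available as lists of dicts that share _input_hash values and store the text under a "transcript" key (a hypothetical field name). The "removed"/"added" keys are what I understand the diff interface to expect, so please check them against the docs for your version.

```python
def diff_tasks(first_round, second_round, field="transcript"):
    """Pair both rounds by _input_hash and emit one task per changed text."""
    before = {eg["_input_hash"]: eg for eg in first_round}
    for eg in second_round:
        old = before.get(eg["_input_hash"])
        # Skip inputs missing from the first round or left unchanged.
        if old is None or old.get(field) == eg.get(field):
            continue
        yield {
            "removed": old.get(field, ""),  # text from the first round
            "added": eg.get(field, ""),     # text from the corrected round
            "meta": {"input_hash": eg["_input_hash"]},
        }
```

Only inputs whose text actually changed are emitted, which keeps the final round short.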

Example from docs:

Hope this helps!

joebuckle · November 17, 2022, 8:59pm

Thanks!
