How do I reload a dataset to access and continue labelling?

jamesaphoenix · July 27, 2020, 8:02pm

Hi everyone

I've created my first dataset and have saved some annotations to it. After closing the first session, how can I reload the dataset/web app and continue labelling the examples that I didn't get to during the previous session?

Best regards,

jamesaphoenix · July 27, 2020, 8:03pm

import subprocess

# Start the Prodigy annotation server for the "linkedin_posts" dataset
subprocess.run(
    "prodigy textcat.manual linkedin_posts ./data.json --label Evergreen",
    shell=True, check=True, capture_output=True,
)

ines · July 28, 2020, 9:00am

Hi! I hope I understand your question correctly – but you should be able to just restart the server, and Prodigy will start where you left off. By default, Prodigy will skip examples that were already annotated, so you're not asked the same question twice.

If you want to re-annotate the examples in your stream, you can save the result to a new dataset, and you'll be able to start at the beginning again.
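
As a rough sketch of the "new dataset" option, the snippet below builds the same command from this thread with a different dataset name. The helper `annotation_command` and the name `linkedin_posts_v2` are made up for illustration; only the recipe, source file and label come from the original command.

```python
import shlex


def annotation_command(dataset, source="./data.json", label="Evergreen"):
    """Build the textcat.manual command from this thread for a given dataset name.

    Hypothetical helper: it only assembles the argument list, it does not run Prodigy.
    """
    return ["prodigy", "textcat.manual", dataset, source, "--label", label]


# "linkedin_posts_v2" is an assumed name for the fresh dataset
cmd = annotation_command("linkedin_posts_v2")
command_line = shlex.join(cmd)
print(command_line)
```

Passing `cmd` to `subprocess.run(cmd, check=True)` would then start the server against the new, empty dataset, so annotation begins from the first example again.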

jamesaphoenix · July 28, 2020, 10:06am

Yes Ines! That's exactly what I was looking for

What is the exact syntax for restarting the server? Do I still need to pass in the original source file?

prodigy textcat.manual linkedin_posts ./data.json --label Evergreen

ines · July 28, 2020, 1:49pm

You can just restart with the exact same command. Prodigy reads your data as a stream (if possible) and doesn't need to import anything upfront; it just saves the annotations you collect. So when you restart the server, Prodigy will read in the file and start again with the next example that's not yet annotated in the dataset.
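
To make the idea concrete, here is a minimal sketch of how "skip what's already annotated" can work when the source is read as a stream. It is not Prodigy's actual implementation: the functions `task_hash` and `remaining_examples`, the SHA-256 fingerprint and the sample data are all assumptions made for illustration.

```python
import hashlib


def task_hash(example):
    """Fingerprint an example by its text (illustrative, not Prodigy's own hashing)."""
    return hashlib.sha256(example["text"].encode("utf-8")).hexdigest()


def remaining_examples(stream, annotated):
    """Yield only the examples whose fingerprint is not already in the dataset."""
    seen = {task_hash(eg) for eg in annotated}
    for eg in stream:
        if task_hash(eg) not in seen:
            yield eg


# Hypothetical source file contents and previously saved annotations
source = [{"text": "Post A"}, {"text": "Post B"}, {"text": "Post C"}]
saved = [{"text": "Post A", "answer": "accept"}]

todo = [eg["text"] for eg in remaining_examples(source, saved)]
print(todo)
```

Because the source is re-read and filtered on every start, nothing has to be imported upfront, and the same command can be reused for each session.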

jamesaphoenix · July 28, 2020, 3:54pm

Thank you Ines
