How do I reload a dataset to access and continue labelling?

Hi everyone :slight_smile:

I've created my first dataset and have annotated some examples. After closing the first session, how can I reload the dataset/web app and continue labelling the examples that I didn't get to during the previous session?

Best regards,

import subprocess

# Launch the annotation server; output is left uncaptured so the server URL is printed.
cmd = ["prodigy", "textcat.manual", "linkedin_posts", "./data.json", "--label", "Evergreen"]
subprocess.run(cmd, check=True)

Hi! I hope I understand your question correctly – but you should be able to just restart the server, and Prodigy will start where you left off. By default, Prodigy will skip examples that were already annotated, so you're not asked the same question twice.

If you want to re-annotate the examples in your stream, you can save the result to a new dataset, and you'll be able to start at the beginning again.
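To make the difference concrete, here is a small sketch that only builds the two command lines and prints them. The helper `build_command` and the dataset name `linkedin_posts_v2` are made up for illustration; the recipe, source file and label are the ones from this thread.

```python
import shlex


def build_command(dataset, source="./data.json", label="Evergreen"):
    """Return the argument list for a textcat.manual session (hypothetical helper)."""
    return ["prodigy", "textcat.manual", dataset, source, "--label", label]


# Same dataset as before: the session continues with the unannotated examples.
resume = build_command("linkedin_posts")

# Hypothetical new dataset name: nothing is saved in it yet,
# so the stream starts from the first example again.
fresh = build_command("linkedin_posts_v2")

print(shlex.join(resume))
print(shlex.join(fresh))
```

Either list can be passed to `subprocess.run(..., check=True)` or typed into the terminal as printed.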

Yes Ines! That's exactly what I was looking for :grinning:

What is the exact syntax for restarting the server? Do I still need to pass in the original source file?

prodigy textcat.manual linkedin_posts ./data.json --label Evergreen

You can just restart with the exact same command :smiley: Prodigy reads your data as a stream (if possible), so it doesn't need to import anything upfront; it only saves the annotations you collect. When you restart the server, Prodigy reads the file again and continues with the next example that's not yet annotated in the dataset.
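If it helps to picture it, the idea can be sketched in plain Python. This is not Prodigy's actual implementation: `task_hash` and `stream_unannotated` are hypothetical helpers, the hashing is a stand-in, and the source is assumed to be one JSON object per line with a `"text"` field.

```python
import hashlib
import json


def task_hash(example):
    """Stable hash of an example's text (illustrative, not Prodigy's own hashing)."""
    return hashlib.sha1(example["text"].encode("utf8")).hexdigest()


def stream_unannotated(lines, seen_hashes):
    """Lazily yield examples whose hash is not among the saved annotations."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines in the source
        example = json.loads(line)
        if task_hash(example) not in seen_hashes:
            yield example


# Stand-in for the source file: it is read again on every restart.
source = ['{"text": "Post A"}', '{"text": "Post B"}', '{"text": "Post C"}']

# Stand-in for what the dataset holds after the first session.
saved = [{"text": "Post A", "answer": "accept"}]
seen = {task_hash(eg) for eg in saved}

remaining = [eg["text"] for eg in stream_unannotated(source, seen)]
print(remaining)  # ['Post B', 'Post C']
```

The source is never modified; only the saved annotations decide what gets skipped, which is why the same command works for every restart.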

Thank you Ines :slight_smile:
