Accessing (un)finished annotations after local host internet outages

Hi team,

I tunneled my local port so annotators could tag dataset_A. Unfortunately, my local machine's internet connection went down, so I had to restart the Jupyter Notebook session.

After restarting, how can I access the annotations that were already finished for dataset_A? And how can the annotators continue with the unfinished part of dataset_A?

Will "!python -m prodigy textcat.manual dataset_A input.jsonl --label X Y Z" allows annotators to continue tagging dataset_A from where they left off?

Thanks a lot!
Taylor

Hi Taylor,

In theory, your annotators should be able to continue once you restart the server, because Prodigy comes with a hashing mechanism that keeps track of which examples have already been annotated. Prodigy checks the database for existing hashes and uses them to filter the incoming stream of examples. Effectively, that means examples with the same hash are simply skipped.
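If you want to double-check what has already been saved, you can also inspect the dataset from Python. Here's a minimal sketch using Prodigy's database helper; depending on your Prodigy version, the method for fetching examples may be called get_dataset or get_dataset_examples, so treat the exact name as an assumption and check the docs for your install:

from prodigy.components.db import connect

# Connect using the database settings from your prodigy.json (SQLite by default)
db = connect()

# Fetch everything saved for dataset_A so far
# (on newer Prodigy versions this may be db.get_dataset_examples("dataset_A"))
examples = db.get_dataset("dataset_A")
print(f"{len(examples)} annotations saved so far")

# Each saved example carries the hashes Prodigy uses to skip
# already-annotated tasks when the stream is restarted
for eg in examples[:5]:
    print(eg["_input_hash"], eg["_task_hash"], eg.get("answer"))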

It's explained in more detail here:

If you have follow-up questions related to this, feel free to ask!

Thanks for the response. I'd appreciate further guidance on how my annotators can reconnect to dataset_A.

Should I use the following command and share the link with them so they can continue from where they left off? Or should I use a different command? I'm worried that the command I'm using will overwrite the existing annotations and create a whole new dataset_A.

python -m prodigy textcat.manual dataset_A input.jsonl --label X Y Z

Thanks a lot!
Taylor

First, one comment about your line of code.

python -m prodigy textcat.manual dataset_A input.jsonl --label X Y Z

I think it should be this:

python -m prodigy textcat.manual dataset_A input.jsonl --label "X,Y,Z"

The --label parameter needs a comma-separated string to denote the labels.

Should I use the following command and share the link with them so they can continue from where they left off?

That should just work, yes. One note: are you annotating with or without overlap at the moment? You might get duplicate annotations if feed_overlap is set to true, but that can also be a good thing: it lets you check whether your annotators agree on the same examples. More details on this can be found here:
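If you do end up with overlapping annotations, a quick way to eyeball agreement is to group the saved examples by their input hash and compare what each session selected. This is only a rough sketch: it assumes your annotators connect via named sessions (so each example carries a _session_id) and that, as with textcat.manual's choice interface, the selected labels end up in the accept field:

from collections import defaultdict
from prodigy.components.db import connect

db = connect()
examples = db.get_dataset("dataset_A")  # or db.get_dataset_examples("dataset_A")

# Group the saved annotations by the input they refer to
by_input = defaultdict(list)
for eg in examples:
    by_input[eg["_input_hash"]].append(eg)

# Show inputs annotated more than once and what each session chose
for input_hash, egs in by_input.items():
    if len(egs) > 1:
        choices = {eg.get("_session_id", "default"): eg.get("accept", []) for eg in egs}
        print(input_hash, choices)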

I'm worried that the command I'm using will overwrite the existing annotations and create a whole new dataset_A.

Do you have evidence of this?
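As far as I know, re-using an existing dataset name just appends new annotations to it; nothing gets deleted unless you explicitly run prodigy drop dataset_A. If you want extra peace of mind, you could export a backup before restarting, either with prodigy db-out dataset_A or with a small script like this sketch (same caveat as above about the exact method name):

import json
from prodigy.components.db import connect

db = connect()
examples = db.get_dataset("dataset_A")  # or db.get_dataset_examples("dataset_A")

# Write a JSONL backup you can diff against later, roughly what
# "prodigy db-out dataset_A" would give you
with open("dataset_A_backup.jsonl", "w", encoding="utf8") as f:
    for eg in examples:
        f.write(json.dumps(eg) + "\n")

print(f"Backed up {len(examples)} annotations from dataset_A")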