Finished files

7fc2efeac2ef416 · September 13, 2021, 11:32am

Hey there

My name is Roy and I'm a data engineer.
We are starting to use Prodigy in our department.
I was wondering how file handling is managed, i.e. is there any mechanism to move files that have already been checked completely to another location, or to mark them as checked?

ines · September 15, 2021, 12:37am

Hi! Prodigy doesn't really have a concept of a file being "finished" – whether or not you're done with it depends on a lot of factors that you decide. Maybe you want to re-annotate a file, collect more annotations from a different annotator, or experiment with a different label scheme. Or maybe you're using a workflow with a model in the loop that selects some examples from your file and skips others. So it's not always very useful to think of a file as being "finished".

That said, if you want to check whether all examples in a file have been annotated, you can export your annotations or load them via the Database API and use the _input_hash to identify unique examples referring to the same input (e.g. text). This tells you which examples have been annotated. The same check also happens internally when Prodigy decides whether to show an example for annotation. If no examples are available anymore, all data has been annotated.
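As a rough sketch of that check: the snippet below assumes the annotations were exported to JSONL (for example with `prodigy db-out`) and that the source examples already carry an `_input_hash` (Prodigy's `set_hashes` helper can assign one). The function names `read_jsonl`, `annotations_per_input` and `unannotated` are made up for this illustration and are not part of Prodigy.

```python
import json
from collections import Counter


def read_jsonl(path):
    """Load one JSON object per non-empty line of a JSONL file."""
    with open(path, encoding="utf8") as f:
        return [json.loads(line) for line in f if line.strip()]


def annotations_per_input(annotations):
    """Count how many annotations exist for each _input_hash."""
    return Counter(eg["_input_hash"] for eg in annotations if "_input_hash" in eg)


def unannotated(source_examples, annotations):
    """Return the source examples whose _input_hash has no annotation yet.

    Assumes every source example already has an _input_hash assigned.
    """
    seen = annotations_per_input(annotations)
    return [eg for eg in source_examples if eg["_input_hash"] not in seen]
```

If `unannotated(...)` returns an empty list, every input in the file has at least one annotation. Whether that means the file is "finished" still depends on your workflow.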

You can also stream in JSON data with your own internal IDs added to the examples. This will be passed through and saved with the annotations in the database. Based on the collected annotations, you can then check whether you have annotated all examples in a given file, how many annotations you have for each example, and so on.
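For the custom ID approach, a minimal sketch could look like the following. It assumes each example in your source file has a top-level `doc_id` key that you added yourself (the key name is hypothetical) and that the same key shows up in the exported annotations. Both helper functions are illustrative and not part of Prodigy.

```python
from collections import Counter


def coverage_by_id(source_examples, annotations, id_key="doc_id"):
    """Map each source ID to the number of annotations collected for it.

    id_key is a hypothetical name for your own internal ID field.
    """
    counts = Counter(eg[id_key] for eg in annotations if id_key in eg)
    return {eg[id_key]: counts.get(eg[id_key], 0) for eg in source_examples}


def file_is_finished(source_examples, annotations, id_key="doc_id", min_annotations=1):
    """True if every source ID has at least min_annotations annotations."""
    coverage = coverage_by_id(source_examples, annotations, id_key)
    return all(n >= min_annotations for n in coverage.values())
```

Once `file_is_finished(...)` returns `True` under your own definition of "finished" (for instance `min_annotations=2` if you want two annotators per example), your pipeline could move the source file to another location or record it as checked.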
