Give a specific order of presentation depending on the annotator

Hello!

I am new to Prodigy, and first I'd like to thank the developers: you did great work. It is easy to understand, and the documentation and support are really complete and helpful!

Here is my question: I'd like to set up an annotation task with several annotators, and I'd like to know if there is a way to give each annotator a different order of presentation. The main idea is to avoid a possible order effect in our experiment.

Thanks in advance for your answer and the attention given to this question,

I wish you happy holidays.

Bests,

Estelle

Hi and thanks, that's nice to hear :smiley:

The easiest way to implement something like this would be to start a separate instance for each annotator and give each instance a custom stream with a different order of examples (either random, or a specific order per annotator). Since the only difference between the instances is the stream, you might not even need a custom recipe and can just use a custom loader. See here for an example: https://prodi.gy/docs/api-loaders#loaders-custom

If you want your loader to be more elegant, you could use a library like typer to let it take arguments on the command line, so you could do something like loader.py --annotator estelle | prodigy ... or loader.py --random --n-examples 10 | prodigy ... etc.
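As a rough illustration of the loader idea, here is a minimal sketch that prints one JSON task per line to stdout, in an order that depends on the annotator. It uses argparse from the standard library instead of typer to stay dependency-free. The `EXAMPLES` list, the `ordered_examples` helper and the `"default"` fallback name are hypothetical placeholders, and seeding the shuffle with the annotator's name is just one possible way to get a stable per-annotator order.

```python
import argparse
import json
import random

# Hypothetical placeholder data; in practice, load your own examples here.
EXAMPLES = [{"text": f"example {i}"} for i in range(10)]


def ordered_examples(examples, annotator):
    """Return a copy of the examples, shuffled deterministically per annotator."""
    # A string seed gives the same order every time for the same name.
    rng = random.Random(annotator)
    shuffled = list(examples)
    rng.shuffle(shuffled)
    return shuffled


def main(argv=None):
    parser = argparse.ArgumentParser(description="Print tasks as JSON lines.")
    parser.add_argument("--annotator", default="default",
                        help="annotator name, used as the shuffle seed")
    args = parser.parse_args(argv)
    for example in ordered_examples(EXAMPLES, args.annotator):
        # One JSON object per line, so the output can be piped to a recipe.
        print(json.dumps(example))


if __name__ == "__main__":
    main()
```

Saved as loader.py, this could be run as python loader.py --annotator estelle and its output piped forward to the recipe.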

(If your data is in JSON and you know jq, there's probably also a super elegant way to do the shuffling/ordering in a single line on the CLI, then pipe that forward to the recipe and set --loader json. But I'm not a jq wizard, so I couldn't give you any code example for the jq part :sweat_smile:)

Hello,

Thank you so much for your fast answer (sorry, I was on holiday so I didn't see it earlier).

I will give it a try following your suggestions, and I will let you know if I manage to implement something good!

Bests,

Estelle

Hello,

I wanted to give some feedback on the solution I implemented to customize the order of presentation depending on the annotator.

In the end it was quite easy, using this little trick:
I use a CSV file containing the name of each annotator and their order of files (I only work with 10 files, so it's not a heavy file to process), and I use sys.argv to read the command-line arguments.

The input command is the same as always:
prodigy sort-video estelle utils/order_files.csv -F test_recipe.py

except that the name of the dataset is also the name of the annotator, and I replaced the source with my CSV file.
To get the appropriate stream for the annotator, my "get_stream" function first checks whether the annotator is in the CSV file. If not, it creates a file order for this annotator. It then writes a JSON file with the right order of presentation and loads that JSON file as the stream.
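For anyone wanting to try the same thing, the lookup described above might look roughly like the sketch below. This is not my exact code: it assumes a CSV layout with one row per annotator (name in the first column, file order in the remaining columns), it yields the tasks directly instead of writing an intermediate JSON file, and the `get_order` helper and the `"video"` task key are illustrative assumptions.

```python
import csv
import random
from pathlib import Path


def get_order(csv_path, annotator, files):
    """Return the annotator's file order, creating and saving one if missing."""
    path = Path(csv_path)
    if path.exists():
        with path.open(newline="", encoding="utf-8") as f:
            for row in csv.reader(f):
                # Assumed layout: annotator name, then the files in order.
                if row and row[0] == annotator:
                    return row[1:]
    # Unknown annotator: draw a random order and append it to the CSV,
    # so the same order is reused the next time this annotator starts.
    order = list(files)
    random.shuffle(order)
    with path.open("a", newline="", encoding="utf-8") as f:
        csv.writer(f).writerow([annotator, *order])
    return order


def get_stream(csv_path, annotator, files):
    """Yield one task per file, in the order assigned to the annotator."""
    for name in get_order(csv_path, annotator, files):
        yield {"video": name}  # task key is an assumption for this sketch
```

Inside the recipe, the dataset name passed on the command line would be used as the `annotator` argument.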

In the end, each annotator has their own instance, and I dedicate one port to each annotator.
(Tip: I am currently using ngrok to deploy my solution.)

I don't know if this is very clear or whether the solution is ideal, but I hope it will help other people who want to do the same thing!

Bests,

Estelle


Thanks for the update, that sounds like a great solution :+1:
