hi @miladrogha!
If you know which examples you want each annotator to work on, is there a reason why you couldn't create two separate files, batch1.jsonl and batch2.jsonl, before running Prodigy?
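If it helps, splitting a JSONL file into two batches is a few lines of Python. A rough sketch (the file names and the alternating assignment are just examples, not something Prodigy requires):

```python
import json

def split_jsonl(lines, assign):
    """Split JSONL records into two batches.

    `assign` takes (index, record) and returns True for batch 1.
    """
    batch1, batch2 = [], []
    for i, line in enumerate(lines):
        record = json.loads(line)
        (batch1 if assign(i, record) else batch2).append(line)
    return batch1, batch2

# Example: alternate examples between the two annotators
lines = [json.dumps({"text": f"example {i}"}) for i in range(6)]
batch1, batch2 = split_jsonl(lines, lambda i, rec: i % 2 == 0)

with open("batch1.jsonl", "w", encoding="utf8") as f:
    f.write("\n".join(batch1))
with open("batch2.jsonl", "w", encoding="utf8") as f:
    f.write("\n".join(batch2))
```

You could just as easily assign by slicing (first half / second half) if annotation order matters.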
Also - do you absolutely need to run them simultaneously? Let's say you want to run ner.manual. The simplest approach would be to run:
python -m prodigy ner.manual ner_dataset blank:en batch1.jsonl --label label1
Annotate and close the server. Then run:
python -m prodigy ner.manual ner_dataset blank:en batch2.jsonl --label label1
If you do need to run simultaneously, you could assign each a different port:
PRODIGY_PORT=8081 python -m prodigy ner.manual ner_dataset blank:en batch1.jsonl --label label1
PRODIGY_PORT=8082 python -m prodigy ner.manual ner_dataset blank:en batch2.jsonl --label label1
I put these on different ports, but you could run one of them on the default port.
One thing to note - be careful that your dataset doesn't contain duplicates across the two example sets. Also, if each of the two batches has multiple annotators, be sure to use unique session names (e.g. by appending ?session=name to the app URL), ideally created at the start.
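On the duplicates point: a quick way to check before annotating is to compare the texts across the two files. A minimal sketch, assuming each JSONL record carries a "text" field (as ner.manual input does):

```python
import json

def find_cross_duplicates(lines_a, lines_b):
    """Return the set of texts that appear in both JSONL batches."""
    texts_a = {json.loads(line)["text"] for line in lines_a}
    texts_b = {json.loads(line)["text"] for line in lines_b}
    return texts_a & texts_b

# Example data: "c" appears in both batches
batch1 = [json.dumps({"text": t}) for t in ["a", "b", "c"]]
batch2 = [json.dumps({"text": t}) for t in ["c", "d"]]
dupes = find_cross_duplicates(batch1, batch2)
```

In practice you'd read `lines_a` and `lines_b` from batch1.jsonl and batch2.jsonl and fix any overlap before starting the servers.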
Alternatively - to be even safer - I would recommend saving the annotations (at first) to two separate datasets. The reasoning is to avoid any edge case where both processes are writing to the DB at the same time. Then, if you want them in the same dataset, you can run db-merge batch1,batch2 to combine the two datasets into one.
Also, FYI: the post above is from 2019. While at first glance a lot of its content still applies, there have been big changes to Prodigy since then, most notably our recent release of Task Routing. I'm not suggesting you need a custom task router; I mention it for other readers so they're aware of the more advanced ways we've developed to handle task routing.