Deployment of multiple recipes


(C Swart) #1


We have multiple tasks we’d like to deploy with one Prodigy server: a couple of choice tasks, some classification tasks and potentially a NER task. How would I go about this? If I understand correctly, the current way to do this is to spawn an instance for each task on a different port? I’m not sure how scalable this solution is, and I’m curious what you’d recommend.

Best wishes,

(Ines Montani) #2

Yes, using different ports or hosts is definitely the easiest solution. You can define the port and host globally in your user-level prodigy.json, locally via a prodigy.json or .prodigy.json in the current working directory, or in your code via the "config" setting returned by a recipe. This also means you can populate the values from your own custom environment variables.
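For instance, a working-directory prodigy.json might pin one task to its own port (the port number here is just an illustrative value):

```json
{
  "host": "localhost",
  "port": 8081
}
```

Starting each recipe from a different directory with its own prodigy.json (or pointing `PRODIGY_HOME` at a different config directory) then lets every task bind a separate port under one deployment.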

If you’re looking to have multiple annotators working on the same tasks, you can solve this via a custom recipe. There are several threads on the forum discussing different approaches. One would be to set up a “single producer, multiple consumers” architecture and use your own service to orchestrate the stream and send out batches of tasks to the individual annotators. (This thread might be especially relevant, as it contains a successful end-to-end solution.)
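The core of such an orchestrating service can be quite small. Here’s a minimal, hypothetical sketch of the “single producer, multiple consumers” idea: one shared stream, with a lock guaranteeing that no two annotators ever receive the same batch. (The `TaskFeed` class and its names are my own illustration, not part of Prodigy’s API; in a real setup you’d expose `get_batch` over HTTP and have each annotator’s recipe pull from it.)

```python
from itertools import islice
from threading import Lock


class TaskFeed:
    """Single producer, multiple consumers: hand out non-overlapping
    batches from one stream to any number of annotators."""

    def __init__(self, stream, batch_size=10):
        self._stream = iter(stream)      # the single shared source of tasks
        self._batch_size = batch_size
        self._lock = Lock()              # serialize access across consumers

    def get_batch(self):
        # Each call atomically consumes the next batch from the stream,
        # so concurrent annotators never see overlapping examples.
        with self._lock:
            return list(islice(self._stream, self._batch_size))


# Example: 25 tasks split into batches of 10 for whichever annotator asks next
feed = TaskFeed(({"text": f"example {i}"} for i in range(25)), batch_size=10)
first = feed.get_batch()
print(len(first))  # 10
```

Each annotator-facing Prodigy instance would then use a custom loader that requests batches from this service instead of reading the source file directly.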

If you’re annotating with a model in the loop, it’s important to keep in mind that an annotation session will always be somewhat stateful. So you’ll usually want a separate instance of Prodigy for each annotator. (If you have multiple users connecting to the same instance, this can easily lead to worse results. In the best case, all annotators make similar decisions and move the model in the same direction. In the worst case, different annotators move the model in opposite directions, making its predictions useless and resulting in a much worse selection of annotation examples.)

(Akshita Sood) #3

What if a similar thing needs to be done with one task and multiple users?
I want to run the same recipe in different threads, and save the output in one place for comparison.
What would be the workflow for that?