Hi,
I'm afraid this might be a stupid question, and not primarily related to Prodigy.
I'm working on a quality assurance system for a text categorization model. A separate dataset is created for each review task, each of my annotators has their own dedicated port to work on, and only one annotator works on each task. So really nothing fancy here.
I have a ticket system where annotators can pick their tasks. The ticket system then calls a Flask REST API that formats the recipe string and starts a web server via prodigy.serve(). Everything works fine. The only problem is that once prodigy.serve() is called, the application is blocked and no other sessions can be started from the Python program. So basically: how do I start multiple Prodigy sessions from within a single Python program?
What's the error message you're seeing? If you open your browser's developer tools, do you see a request error there? Maybe CORS related? Is there any traceback in the terminal?
It seems your goal is to use Prodigy from a Flask application. Here's a past post on why to be cautious about running Prodigy inside another web server (like Flask):
One option we recommend is to divide up the annotation work so that each annotator only needs to deal with a small part of the annotation scheme. For instance, if you’re working with many labels, you would start a number of different Prodigy services, each specifying a different label, and each advertising to a different URL. Prodigy can be easily run under automation, for instance within a Kubernetes cluster, to make this approach more manageable. If you do want to have multiple annotators working on one feed, Prodigy has support for that as well via named multi-user sessions. You can create annotator-specific queues using query parameters, or use the query parameters to distinguish the work of different annotators so you can run inter-annotator consistency checks.
I can start one Prodigy session without problems. But as pointed out in the linked comment, once a session is started via prodigy.serve(), there is a blocking process running and my endpoint won't respond, because the return statement is never reached.
My endpoint looks somewhat like this (I'm using flask-restful):
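(The original snippet wasn't included in the post. Below is a hedged reconstruction of what such an endpoint typically looks like; the resource name `StartSession`, the route, the recipe string, and the file paths are all assumptions, not the poster's actual code. It requires flask, flask-restful, and prodigy installed.)

```python
# Hypothetical reconstruction of the endpoint described above.
import prodigy
from flask import Flask
from flask_restful import Api, Resource

app = Flask(__name__)
api = Api(app)

class StartSession(Resource):  # assumed name
    def post(self, task_id, port):
        # prodigy.serve() blocks here until the server is shut down,
        # so the return statement below is never reached.
        prodigy.serve(f"textcat.manual task_{task_id} ./task_{task_id}.jsonl",
                      port=int(port))
        return {"status": "started"}, 200

api.add_resource(StartSession, "/start/<task_id>/<port>")
```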
Is there a way to start the process "in the background"? This is the first time I'm developing a complicated web service. I'm a career changer with a solid grasp of NLP but very limited knowledge of web development, so I might be missing something very obvious here.
I thought that by making each task its own dataset and giving each annotator a fixed port to work on, I could get around the more complicated solutions.
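(For reference, one generic pattern for running a blocking call "in the background" is to hand it to a separate process so the endpoint can return immediately. This is a sketch of the pattern only, not Prodigy-specific advice; `blocking_server` below is a stand-in for a call like prodigy.serve().)

```python
import multiprocessing
import time

def blocking_server(port):
    # Stand-in for a blocking call such as prodigy.serve();
    # in a real service this would run until shut down.
    time.sleep(0.5)

def start_session(port):
    # Launch the blocking call in its own process and return
    # immediately, so the caller (e.g. a Flask endpoint) is not blocked.
    proc = multiprocessing.Process(target=blocking_server, args=(port,), daemon=True)
    proc.start()
    return proc

if __name__ == "__main__":
    # Two "sessions" on different ports, started from one Python program.
    procs = [start_session(p) for p in (9001, 9002)]
    print(all(p.is_alive() for p in procs))
    for p in procs:
        p.join()
```

Note that processes started this way still need lifecycle management (stopping them, noticing crashes), which is part of why separate terminals or an orchestrator are often simpler.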
Thanks for the background! Your approach makes sense -- running each task on its own port is definitely best given the different data/tasks, but it's challenging if you're trying to run them simultaneously. Running simultaneous processes is even harder without containers and/or an orchestration engine like Kubernetes, which is likely out of scope.
Another idea you may want to try is to run these multiple processes in separate terminals rather than from Python. You can do this manually (e.g., open up different terminal windows) or you could use a terminal multiplexer like screen (see the comment below):
Or tmux, another common option, as mentioned below.
You can run each command on a different port in the terminal, just like with prodigy.serve(), by prefixing each command:
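For example (the recipe, dataset names, file paths, and labels below are placeholders; Prodigy reads the `PRODIGY_PORT` environment variable as a config override):

```shell
# Each annotator gets their own port; run these in separate
# terminal windows, screen, or tmux sessions.
PRODIGY_PORT=9001 prodigy textcat.manual task_1 ./task_1.jsonl --label LABEL_A
PRODIGY_PORT=9002 prodigy textcat.manual task_2 ./task_2.jsonl --label LABEL_B
```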