Hi! If you're looking for user management and monitoring at that scale, this is probably outside of the scope of Prodigy standalone. You might want to have a look at the upcoming Prodigy Scale, which is a separate product we're developing for exactly that use case. See here for details:
While Prodigy includes some features you can use to build a multi-user system, you still have to decide how you want it to be set up. For instance, how should users authenticate? Should they all annotate the same stream, or should everyone annotate different examples? And do you want to use one instance, or start a new Prodigy server for different users? These are all things you can build around the prodigy Python library, and how you do this depends on your requirements and your use case. For example, you could run a web server that lets the user log in and then starts up a new annotation task for them by executing a recipe script.
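To make the "start a new annotation task per user" idea a bit more concrete, here is a minimal sketch of what the server-side part could look like. It only assembles and launches a `prodigy` command in a subprocess. The helper names, the dataset naming scheme and the use of the `PRODIGY_PORT` environment variable are assumptions for illustration, so check them against your Prodigy version before relying on them.

```python
import os
import subprocess


def build_command(recipe, dataset, recipe_args):
    """Return the argument list for a `prodigy <recipe> <dataset> ...` call."""
    return ["prodigy", recipe, dataset, *recipe_args]


def build_env(port):
    """Copy the current environment and set the port for the new server.

    Assumption: your Prodigy version reads the port from PRODIGY_PORT.
    """
    env = dict(os.environ)
    env["PRODIGY_PORT"] = str(port)
    return env


def start_task_for_user(username, port, recipe, recipe_args):
    """Start one Prodigy server for one logged-in user (hypothetical helper)."""
    dataset = f"annotations_{username}"  # hypothetical naming scheme
    return subprocess.Popen(
        build_command(recipe, dataset, recipe_args), env=build_env(port)
    )
```

Your login handler would then call something like `start_task_for_user("alex", 8081, recipe, recipe_args)` after authenticating the user, where the recipe name and its arguments are whatever you would normally type on the command line. How you authenticate users and hand out ports is still up to you.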
As of v1.7, the library ships with a built-in feature to name user sessions. Here's the relevant part in the PRODIGY_README.html:
Multi-user sessions
This update was shipped in preparation for the upcoming Prodigy Scale, a full-featured, standalone application for large-scale multi-user annotation projects powered by Prodigy.
As of v1.7.0, Prodigy supports multiple named sessions within the same instance. This makes it easier to implement custom multi-user workflows and to control the data that's sent out to individual annotators.
To create a custom named session, add ?session=xxx to the annotation app URL. For example, annotator Alex may access a running Prodigy project via http://localhost:8080/?session=alex. Internally, this will request and send back annotations with a session identifier consisting of the current dataset name and the session ID – for example, ner_person-alex. Every time annotator Alex labels examples for this dataset, their annotations will be associated with this session identifier.
The "feed_overlap" setting in your prodigy.json or recipe config lets you configure how examples should be sent out across multiple sessions. By default (true), each example in the dataset will be sent out once for each session, so you'll end up with overlapping annotations (e.g. one per example per annotator). Setting "feed_overlap" to false will send out each example in the data once to whoever is available. As a result, your data will have each example labelled only once in total.
As of v1.8.0, the PRODIGY_ALLOWED_SESSIONS environment variable lets you define comma-separated string names of sessions that are allowed to be set via the app. For instance, PRODIGY_ALLOWED_SESSIONS=alex,jo would only allow ?session=alex and ?session=jo, and other parameters would raise an error.
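If you're handing out links to several annotators, a few small helpers can keep the URLs, the expected session identifiers and the allowed-sessions value in sync. This is just a sketch based on the README text above: the function names are made up for illustration, and the identifier format simply mirrors the ner_person-alex example rather than calling anything in Prodigy itself.

```python
from urllib.parse import urlencode


def session_url(base_url, session_name):
    """Build the annotation app URL for one named session."""
    query = urlencode({"session": session_name})
    return f"{base_url.rstrip('/')}/?{query}"


def session_identifier(dataset, session_name):
    """Mirror the '<dataset>-<session>' identifier described in the README."""
    return f"{dataset}-{session_name}"


def allowed_sessions_value(session_names):
    """Comma-separated value to export as PRODIGY_ALLOWED_SESSIONS."""
    return ",".join(session_names)


annotators = ["alex", "jo"]
urls = [session_url("http://localhost:8080", name) for name in annotators]
```

You would export the result of `allowed_sessions_value(annotators)` as PRODIGY_ALLOWED_SESSIONS before starting the server, and send each annotator their own entry from `urls`.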
So, in summary, you can direct different annotators to custom URLs ending with ?session=name and control whether annotators are seeing the same examples or not using the "feed_overlap" setting.
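To get a feel for what the "feed_overlap" choice means for your data, here is a small sketch that writes the setting as a prodigy.json fragment and estimates how many annotations you'd end up with in each mode. The helper names are hypothetical, and the count is a simplified model of the behaviour described above (every annotator finishes their whole feed), not something Prodigy computes for you.

```python
import json


def prodigy_config(feed_overlap):
    """Return a prodigy.json fragment with the feed_overlap setting."""
    return json.dumps({"feed_overlap": feed_overlap}, indent=2)


def expected_annotation_count(n_examples, n_sessions, feed_overlap):
    """Simplified model: with overlap, every session sees every example;
    without overlap, each example is sent out once in total."""
    if feed_overlap:
        return n_examples * n_sessions
    return n_examples


print(prodigy_config(False))
```

For example, with 100 examples and 3 named sessions, this model gives 300 annotations with overlap and 100 without, which is worth keeping in mind when planning how much annotator time you need.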