Prodigy multi-user session access

Hi Prodigy team,

Currently, when I run the Prodigy server and expose it via ngrok, multiple users cannot annotate at the same time.
We want to make this annotation application accessible to 100+ users at the same time and monitor the annotations made by each user. I also want to set up seamless multi-user authentication. Could you please provide some detailed information on this?

P.S. I read the documentation, but I didn't find it sufficient to understand how to set up what I need.

Thanks a lot,
Vandana

Hi! If you're looking for user management and monitoring at that scale, this is probably outside of the scope of Prodigy standalone. You might want to have a look at the upcoming Prodigy Scale, which is a separate product we're developing for exactly that use case. See here for details:

While Prodigy includes some features you can use to build a multi-user system, you still have to decide how you want it to be set up. For instance, how should the users authenticate? Should they all annotate the same stream, or should everyone annotate different examples? And do you want to use one instance, or start a new Prodigy server for each user? These are all things you can add around the prodigy Python library, and how you do this depends on your requirements and your use case. For example, you could run a web server that lets the user log in and then starts up a new annotation task for them by executing a recipe script.

As of v1.7, the library ships with a built-in feature to name user sessions. Here's the relevant part of the PRODIGY_README.html:

Multi-user sessions

This update was shipped in preparation for the upcoming Prodigy Scale, a full-featured, standalone application for large-scale multi-user annotation projects powered by Prodigy.

As of v1.7.0, Prodigy supports multiple named sessions within the same instance. This makes it easier to implement custom multi-user workflows and to control the data that's sent out to individual annotators.

To create a custom named session, add ?session=xxx to the annotation app URL. For example, annotator Alex may access a running Prodigy project via http://localhost:8080/?session=alex. Internally, this will request and send back annotations with a session identifier consisting of the current dataset name and the session ID – for example, ner_person-alex. Every time annotator Alex labels examples for this dataset, their annotations will be associated with this session identifier.
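
As a rough illustration of the naming scheme described above, here is a small Python sketch that builds one URL per annotator and the matching session identifier. The helper names session_url and session_identifier are made up for this example and are not part of Prodigy. The host, port and dataset name are taken from the README example.

```python
from urllib.parse import urlencode


def session_url(base_url: str, session: str) -> str:
    """Build the annotation app URL for one named session (hypothetical helper)."""
    return f"{base_url.rstrip('/')}/?{urlencode({'session': session})}"


def session_identifier(dataset: str, session: str) -> str:
    """Mirror the '{dataset}-{session}' identifier format described in the README."""
    return f"{dataset}-{session}"


if __name__ == "__main__":
    # Annotator names are placeholders; send each person their own link.
    for name in ["alex", "jo"]:
        print(session_url("http://localhost:8080", name),
              session_identifier("ner_person", name))
```

Each annotator then only needs their own link. The identifier is what you would look for later when separating annotations by person.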

The "feed_overlap" setting in your prodigy.json or recipe config lets you configure how examples should be sent out across multiple sessions. By default (true), each example in the dataset will be sent out once for each session, so you'll end up with overlapping annotations (e.g. one per example per annotator). Setting "feed_overlap" to false will send out each example in the data once to whoever is available. As a result, your data will have each example labelled only once in total.
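
To make the difference between the two modes concrete, here is a toy Python model of what each setting means for who sees which example. It is only an illustration of the behaviour described above, not Prodigy's actual feed logic. The function name distribute is made up, and the round-robin assignment is an assumption standing in for "whoever is available".

```python
import json
from itertools import cycle


def distribute(examples, sessions, feed_overlap=True):
    """Toy model of the two modes; NOT how Prodigy implements its feed."""
    if feed_overlap:
        # Every session receives every example (one annotation per annotator).
        return {session: list(examples) for session in sessions}
    # Each example goes to exactly one session. Round-robin is an assumption
    # used here in place of "whoever is available".
    assigned = {session: [] for session in sessions}
    for example, session in zip(examples, cycle(sessions)):
        assigned[session].append(example)
    return assigned


if __name__ == "__main__":
    print(distribute(["a", "b", "c"], ["alex", "jo"], feed_overlap=True))
    print(distribute(["a", "b", "c"], ["alex", "jo"], feed_overlap=False))
    # The setting itself is a single key in prodigy.json or the recipe config:
    print(json.dumps({"feed_overlap": False}, indent=2))
```

With overlap you get one annotation per example per annotator, which is what you want for comparing annotators. Without overlap the work is divided up and each example is labelled once.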

As of v1.8.0, the PRODIGY_ALLOWED_SESSIONS environment variable lets you define comma-separated string names of sessions that are allowed to be set via the app. For instance, PRODIGY_ALLOWED_SESSIONS=alex,jo would only allow ?session=alex and ?session=jo, and other parameters would raise an error.
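
If you put your own login page or proxy in front of Prodigy, you could read the same variable to decide which links to hand out. Here is a minimal Python sketch of that idea. The helper names are made up, and it runs on your side only, so it does not show how Prodigy validates sessions internally. Treating an unset variable as "no restriction" is an assumption of this sketch.

```python
import os


def allowed_sessions(env=os.environ):
    """Parse the comma-separated PRODIGY_ALLOWED_SESSIONS value into a set."""
    raw = env.get("PRODIGY_ALLOWED_SESSIONS", "")
    return {name.strip() for name in raw.split(",") if name.strip()}


def is_allowed(session, env=os.environ):
    """Check a session name against the allow-list (hypothetical helper).

    Assumption: an unset or empty variable means any session name is accepted.
    """
    allowed = allowed_sessions(env)
    return not allowed or session in allowed


if __name__ == "__main__":
    env = {"PRODIGY_ALLOWED_SESSIONS": "alex,jo"}
    for name in ["alex", "jo", "sam"]:
        print(name, is_allowed(name, env))
```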

So, in summary, you can direct different annotators to custom URLs ending with ?session=name and control whether annotators are seeing the same examples or not using the "feed_overlap" setting.

Hi,

I want the same dataset to be annotated repeatedly by multiple users (basically to capture multiple annotations for each task and analyse all of them).
Currently I'm handling this through multi-user sessions, where ?session=xxx creates a separate session for each user. The problem with this approach: let's say one of the users has finished annotating all of the tasks while the other users are still annotating. Whenever any of the other users refreshes the URL, it shows that no tasks are available.

How do I solve this issue, and how should I enable multiple users to tag the same data?

@akshayklr057 Hi, check out the answer on your other thread here – setting "force_stream_order" will make sure that all examples are re-sent until they're answered, and always sent out in the same order:

How can I add my fellow annotators to my project?

You don't have to do anything to add people – Prodigy doesn't have a concept of a "user", so you can just start an instance and send your annotators the link. You can either start multiple instances on different hosts and/or ports, or a single instance that people can access with named sessions by adding ?session=... to the URL.

Hi,

I hooked up Postgres to Prodigy and am figuring out the data schema. I don't see the session identifier anywhere in the database. Is it stored? I'd like to distinguish test users from actual ones, for instance.

Hi! The session ID should be stored with every example as the _session_id key in the JSON. Prodigy will also create session datasets for each session, e.g. {dataset}-{session_id}.
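
For example, once you've loaded the annotated examples as a list of dicts (from the database or an export), you could separate test users from real ones with something like the Python sketch below. The function names and the sample session IDs are made up for illustration. The only thing taken from the above is the _session_id key.

```python
from collections import Counter


def count_by_session(examples):
    """Count annotated examples per '_session_id' value."""
    return Counter(eg.get("_session_id", "(no session)") for eg in examples)


def exclude_sessions(examples, excluded):
    """Drop examples whose '_session_id' is in `excluded` (e.g. test users)."""
    excluded = set(excluded)
    return [eg for eg in examples if eg.get("_session_id") not in excluded]


if __name__ == "__main__":
    # Placeholder records; real examples carry many more keys.
    examples = [
        {"text": "one", "answer": "accept", "_session_id": "ner_person-alex"},
        {"text": "two", "answer": "reject", "_session_id": "ner_person-alex"},
        {"text": "one", "answer": "accept", "_session_id": "ner_person-test"},
    ]
    print(count_by_session(examples))
    print(exclude_sessions(examples, ["ner_person-test"]))
```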