How to manage multiple annotators?

hi @sudarshan85!

First off - great work! This is a fascinating workflow, and it's especially impressive given the highly secure environment.

Several thoughts. Let's start with your core question:

Step 1: Enabling flagging

Have you considered using flagging for annotators to "flag" problems with each record?

Just add this to your global prodigy.json: "show_flag": true

@koaning has a great tutorial on this:

With this, annotators can flag problems in the moment and then keep moving.

Step 2: (Optional) Add text inputs for which annotator to send the example to and a message for them

You could also create a custom interface with blocks, adding two text inputs: one for who to send the example to (the other annotator) and one for the message/explanation.

Something like this:

import prodigy
from prodigy.components.preprocess import add_tokens
import requests
import spacy

@prodigy.recipe("cat-facts")
def cat_facts_ner(dataset, lang="en"):
    # Use blocks to combine the classification UI with two free-text inputs:
    # one for the target annotator and one for the explanation
    blocks = [
        {"view_id": "classification", "label": "concept_of_interest"},
        {"view_id": "text_input", "field_id": "send_to", "field_label": "Send annotation to:", "field_suggestions": ["Steve", "Cindy", "Oliver", "Deepak"]},
        {"view_id": "text_input", "field_id": "comments", "field_rows": 3, "field_label": "Explain your decision"}
    ]

    def get_stream():
        res = requests.get("https://cat-fact.herokuapp.com/facts").json()
        for fact in res:
            yield {"text": fact["text"]}

    nlp = spacy.blank(lang)           # blank spaCy pipeline for tokenization
    stream = get_stream()             # set up the stream
    stream = add_tokens(nlp, stream)  # tokenize the stream (only needed if you add a manual block)

    return {
        "dataset": dataset,          # the dataset to save annotations to
        "view_id": "blocks",         # set the view_id to "blocks"
        "stream": stream,            # the stream of incoming examples
        "config": {
            "blocks": blocks         # add the blocks to the config
        }
    }
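
If you save this as a recipe file - say cat_facts_recipe.py (a hypothetical name) - you could serve it with something like: prodigy cat-facts my_dataset -F cat_facts_recipe.py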

Vincent has created another awesome video just on that:

Perhaps you could also try some custom JavaScript to only reveal the text_input blocks when the example is flagged.

Step 3: Route the flagged annotations to other annotators

Now that each flagged example is saved in the DB along with who to send it to and the comment, you'd need to write a script that serves those examples as a stream to that annotator. This will depend a bit on your workflow, but the simplest way is to export the flagged examples into a separate .jsonl file per annotator for secondary review.
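
Here's a minimal sketch of what that export script could look like, assuming the dataset name and file names below (all hypothetical), that flagged examples carry "flagged": true, and that the "send_to" value comes from the text_input block above:

from collections import defaultdict
from prodigy.components.db import connect
import srsly

db = connect()                           # uses the database settings from your prodigy.json
examples = db.get_dataset("my_dataset")  # hypothetical dataset name
flagged = [eg for eg in examples if eg.get("flagged")]

# Group flagged examples by the annotator they should be sent to
by_annotator = defaultdict(list)
for eg in flagged:
    target = (eg.get("send_to") or "unassigned").strip().lower()
    by_annotator[target].append(eg)

# Write one .jsonl file per annotator for secondary review
for annotator, egs in by_annotator.items():
    srsly.write_jsonl(f"flagged_for_{annotator}.jsonl", egs)

Each of those files could then be used as the source of a new Prodigy session for that annotator.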

You could then repeat this, and if you still get no resolution (e.g., the 2nd annotator agrees there's something wrong), create a new route that uses the review recipe to review the 2+ annotations. The reviewer could be some arbiter like a manager or the most senior teammate.

Step 4: Update the annotation guidelines

One other suggestion: examples that meet some quality bar (e.g., the 2nd reviewer also flags that example) could be saved and added to your annotation guidelines, which can be shown in Prodigy with "instructions": "path/to/my_page.html". Perhaps you could use a Jinja template to automatically populate the instructions HTML.
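
A minimal sketch of that idea, assuming a Jinja template file and an export of the agreed-upon edge cases (both hypothetical names):

from jinja2 import Template
import srsly

# Template with a {% for eg in examples %} loop that renders each example
template = Template(open("guidelines_template.jinja").read())

# e.g. examples that were flagged by both annotators and resolved by an arbiter
examples = list(srsly.read_jsonl("resolved_edge_cases.jsonl"))

# Write the HTML file referenced by "instructions" in your config
with open("my_page.html", "w") as f:
    f.write(template.render(examples=examples))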

We wrote a case study where the Guardian had a similar follow-up process that ultimately helped improve their annotation guidelines:

Perhaps that post can help SMEs who are new to an NLP workflow appreciate it (e.g., what annotation guidelines are used for, an example of flagging, etc.).

Just curious - are you aware of the difference between your global and local prodigy.json and overrides?

When you run Prodigy, it will first check if a global configuration file exists. It will also check the current working directory for a prodigy.json or .prodigy.json. This allows you to overwrite specific settings on a project-by-project basis.

Make sure to use all three for the three levels of your service.

  • global: Settings for all users (e.g., all users are connected to the right database)
  • user: Settings for each user; can also be thought of as "project" based, since it applies to any tasks run by that user
  • task: Settings on the task level for each user

Sure, that's definitely an option. Just want to make sure - are you aware of named multi-user sessions (i.e., annotators open the same instance with ?session=their_name appended to the URL)? This approach tends to be the more popular and default behavior for multiple annotators, but running 1 dataset / 1 port per user is another option. Here's a good pro/con of each:

It's worth noting that we're planning some of our biggest changes to Prodigy with next week's v1.12 release. We currently have a release candidate out now:

First, I'd be curious about your thoughts on task routing:

One of the motivations for this is partial_overlap, where you may specify that all annotators review x% of the data. That data can be used for calibration, like inter-annotator agreement. Alternatively, you may set up routes based on different criteria -- one being the flagged information mentioned previously. With a little trial and error, I bet you could build a sophisticated routing system with multiple stages.
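
For example, here's a rough sketch of a custom task router combining both ideas. It follows the v1.12 task routing API, where a function receives the controller, the session ID and the example and returns the list of session names that should see it, and a recipe returns it under "task_router" - but treat the session names and percentages below as assumptions to adapt:

import hashlib
from typing import List

SESSIONS = ["steve", "cindy", "oliver", "deepak"]  # hypothetical session names
OVERLAP_PERCENT = 10                               # share of examples everyone annotates

def bucket(text: str, n: int) -> int:
    # Deterministic hash so the same example always routes the same way,
    # even across server restarts (unlike Python's built-in hash())
    return int(hashlib.md5(text.encode("utf8")).hexdigest(), 16) % n

def custom_router(ctrl, session_id: str, item: dict) -> List[str]:
    # Examples flagged upstream go straight to the annotator named in "send_to"
    if item.get("flagged") and item.get("send_to"):
        return [item["send_to"].strip().lower()]
    # A fixed percentage goes to everyone, for inter-annotator agreement
    if bucket(item.get("text", ""), 100) < OVERLAP_PERCENT:
        return list(SESSIONS)
    # Everything else is assigned to exactly one annotator
    return [SESSIONS[bucket(item.get("text", ""), len(SESSIONS))]]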

Just curious - while you don't have internet access, would it be possible to run Docker? This may help you run instances that could even be scaled up (say, with Kubernetes) on premise.

We also wrote new deployment docs with ideas on how to deploy Prodigy via Docker (among other ways):

We're also planning to release new "Metrics" components soon that include built-in and custom metrics like IAA. We decided to ship it as a follow-up to our initial v1.12.0 release, but I can post back once it's available (either through an alpha or a full release). One benefit of having IAA integrated with Prodigy would be computing the metrics on the fly as a means of ongoing annotation QA, which could make task routing even more powerful!

In v1.12, check out the new filter_by_patterns recipe. Perhaps it can help :slight_smile:

Those are my initial thoughts, but the team can look over your post next week and see if there are additional suggestions.

Hope this helps!