Working with annotating teams in Prodigy

I am a new Prodigy user with a team of annotators:

  • If I have annotators A, B, C, etc. accessing Prodigy via a web app, is it possible for annotators to have access to other annotators' ignore lists?
  • Is it possible to perform sentiment analysis and POS tagging on the same data in the same session? Or should the data be loaded twice on different ports, one for sentiment categorization and the other for POS tagging?

Thanks for any help.

Hi @miketg!

If I have annotators A, B, C, etc. accessing Prodigy via a web app, is it possible for annotators to have access to other annotators' ignore lists?

In principle, one session cannot access another session's annotations. In most cases this is also not recommended, as annotators should work independently to prevent bias.
Technically, you could implement a custom router that makes calls to the DB to check for ignored questions, but before recommending anything I'd like to understand why you want this access and what you would like to do with this information.
The short answer, though, is that it is not possible out of the box.
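To make the DB-lookup idea a bit more concrete, here is a minimal sketch, not a built-in Prodigy feature. It assumes saved annotations are dicts carrying the answer, _task_hash and _annotator_id keys; the helper names ignored_by_annotator and ignored_by_others and the sample data are made up for illustration. Reading the examples from the database and wiring the check into a custom router are left out.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def ignored_by_annotator(examples: Iterable[dict]) -> Dict[str, Set[int]]:
    """Map each annotator ID to the task hashes they ignored."""
    ignored: Dict[str, Set[int]] = defaultdict(set)
    for eg in examples:
        if eg.get("answer") == "ignore":
            # Assumption: annotations carry an "_annotator_id" key.
            annotator = eg.get("_annotator_id", "unknown")
            ignored[annotator].add(eg["_task_hash"])
    return dict(ignored)


def ignored_by_others(task_hash: int, annotator: str,
                      ignored: Dict[str, Set[int]]) -> bool:
    """True if any annotator other than `annotator` ignored this task."""
    return any(
        task_hash in hashes
        for name, hashes in ignored.items()
        if name != annotator
    )


# Hypothetical saved annotations; real ones would be read from the Prodigy DB.
saved = [
    {"_task_hash": 1, "_annotator_id": "ds-A", "answer": "ignore"},
    {"_task_hash": 2, "_annotator_id": "ds-A", "answer": "accept"},
    {"_task_hash": 2, "_annotator_id": "ds-B", "answer": "ignore"},
]
ignored = ignored_by_annotator(saved)
print(ignored_by_others(1, "ds-B", ignored))  # True: ds-A ignored task 1
print(ignored_by_others(1, "ds-A", ignored))  # False: only ds-A ignored it
```

A router could use a check like ignored_by_others to decide whether to send a question to a given session, but whether that is a good idea depends on the workflow.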

Is it possible to perform sentiment analysis and POS tagging on the same data in the same session? Or should the data be loaded twice on different ports, one for sentiment categorization and the other for POS tagging?

You should be able to define a task with two sub-tasks: one for POS tagging and the other one for sentiment, via the pages UI.
It could look something like this:

from typing import Dict, List, Any

import prodigy
import spacy
from prodigy.core import Arg, recipe
from prodigy.components.stream import get_stream
from prodigy.components.preprocess import add_tokens
from prodigy.types import StreamType
from prodigy.util import set_hashes


# Constants
DEFAULT_OPTIONS = [
    {"id": "option_1", "text": "BAZ"},
    {"id": "option_2", "text": "QUX"}
]

DEFAULT_POS_LABELS = ["adj", "noun", "verb"]


def create_choice_page(
    text: str,
    options: List[Dict] = DEFAULT_OPTIONS,
    choice_style: str = "multiple"
) -> Dict:
    """Create a choice page configuration."""
    return set_hashes({
        "text": text,
        "view_id": "choice",
        "options": options,
        "config": {"choice_style": choice_style}
    })


def create_pos_page(
    text: str,
    tokens: List[Dict],
    labels: List[str] = DEFAULT_POS_LABELS
) -> Dict:
    """Create a POS tagging page configuration."""
    return set_hashes({
        "text": text,
        "tokens": tokens,
        "view_id": "pos_manual",
        "config": {"labels": labels}
    })


def create_pages(example: Dict[str, Any]) -> Dict[str, Any]:
    """Create all pages for a given example."""
    pages = [
        create_choice_page(text=example["text"]),
        create_pos_page(
            text=example["text"],
            tokens=example.get("tokens", [])
        )
    ]
    return set_hashes({"pages": pages})


def add_pages(stream: StreamType) -> StreamType:
    """Process the input stream and generate pages."""
    for example in stream:
        paginated_example = create_pages(example)
        yield set_hashes(paginated_example)


@prodigy.recipe(
    "test-recipe",
    dataset=Arg(help="Dataset to save answers to."),
    source=Arg(help="Input source")
)
def test_recipe(dataset: str, source: str) -> Dict[str, Any]:
    """
    Process text files and create a multi-page annotation interface.
    
    Args:
        dataset: Name of the dataset to save annotations
        source: Input source
        
    Returns:
        Dictionary containing recipe configuration
    """
    stream = get_stream(source)
    nlp = spacy.blank("en")
    
    # Process stream
    stream = add_tokens(stream=stream, nlp=nlp)
    stream = add_pages(stream=stream)
    
    return {
        "dataset": dataset,
        "view_id": "pages",
        "stream": stream,
    }

This should result in a UI that lets annotators switch between the two sub-tasks:
[Animated GIF: switching between the two pages in the pages UI]

That said, it is usually recommended not to mix unrelated annotation tasks in a single session, so as not to break the annotators' focus. It's usually more efficient and less error-prone to do two passes over the data, each dedicated to a single task. If you want to parallelize this workflow, then yes, spinning up two Prodigy servers, each on a different port, would be the way.
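As a rough sketch of the two-server setup, the snippet below only builds the command and environment for each server; it does not start anything. It assumes the port is set through the PRODIGY_PORT environment variable and uses the built-in textcat.manual and pos.manual recipes. The dataset names, input file, labels and the server_command helper are hypothetical.

```python
import os
from typing import Dict, List, Tuple


def server_command(recipe_args: List[str], port: int) -> Tuple[List[str], Dict[str, str]]:
    """Return the argv and environment for one Prodigy server on `port`."""
    # Assumption: Prodigy picks up the port from PRODIGY_PORT.
    env = dict(os.environ, PRODIGY_PORT=str(port))
    return ["prodigy", *recipe_args], env


# Hypothetical dataset names, input file and labels.
sentiment_cmd, sentiment_env = server_command(
    ["textcat.manual", "sentiment_data", "./data.jsonl",
     "--label", "POSITIVE,NEGATIVE"],
    port=8080,
)
pos_cmd, pos_env = server_command(
    ["pos.manual", "pos_data", "blank:en", "./data.jsonl",
     "--label", "ADJ,NOUN,VERB"],
    port=8081,
)

# To actually start both servers, each pair could be passed to
# subprocess.Popen(cmd, env=env); that step is omitted here.
for cmd, env in [(sentiment_cmd, sentiment_env), (pos_cmd, pos_env)]:
    print(f"PRODIGY_PORT={env['PRODIGY_PORT']} {' '.join(cmd)}")
```

Each server saves to its own dataset, so the sentiment and POS annotations stay separate.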

Technically, you could implement a custom router that makes calls to the DB to check for ignored questions, but before recommending anything I'd like to understand why you want this access and what you would like to do with this information.
The short answer, though, is that it is not possible out of the box.

Ah okay. The scenario I have in my head is one where annotator A does not know the correct label to apply to the present example, so they flag it for another reviewer to look at. Of course, in typing this out, that may not be ideal, so I wonder what the recommendation is for handling those questions which are flagged as 'odd'.

You should be able to define a task with two sub-tasks: one for POS tagging and the other one for sentiment

This is awesome! Just want to confirm - it can be any combination of text labeling? POS tagging, multi-classification, named-entity recognition.

Thank you for the reply and help!

I think one reasonable approach could be flagging them and reviewing them in a separate pass.
More on flagging tasks in Prodigy can be found in the Web Application section of the Prodigy docs.
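A separate review pass could be prepared along these lines. This is only a sketch: it assumes the flag button is enabled (the show_flag setting) and that flagged tasks are saved with "flagged": true. The helper names split_flagged and to_review_task and the sample data are hypothetical.

```python
from typing import Iterable, List, Tuple


def split_flagged(examples: Iterable[dict]) -> Tuple[List[dict], List[dict]]:
    """Separate flagged annotations from the rest."""
    flagged: List[dict] = []
    rest: List[dict] = []
    for eg in examples:
        if eg.get("flagged"):
            flagged.append(eg)
        else:
            rest.append(eg)
    return flagged, rest


def to_review_task(eg: dict) -> dict:
    """Copy an example without the first annotator's answer or flag."""
    return {k: v for k, v in eg.items() if k not in ("answer", "flagged")}


# Hypothetical first-pass annotations; real ones would come from the dataset.
first_pass = [
    {"text": "Fine, I guess.", "answer": "ignore", "flagged": True},
    {"text": "Great product!", "answer": "accept"},
]
flagged, rest = split_flagged(first_pass)
review_stream = [to_review_task(eg) for eg in flagged]
print(review_stream)  # [{'text': 'Fine, I guess.'}]
```

The review_stream list could then be written out and loaded as the source for a second session handled by a reviewer.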

it can be any combination of text labeling? POS tagging, multi-classification, named-entity recognition.

Yes, each page can have a different view_id, and task-level settings can be set or overridden at the page level.
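For instance, a single task could combine NER, POS tagging and classification pages. The sketch below follows the same page structure as the recipe earlier in the thread; the labels, options and the whitespace_tokens helper (a naive stand-in for add_tokens) are illustrative assumptions.

```python
from typing import Dict, List


def whitespace_tokens(text: str) -> List[Dict]:
    """Naive stand-in for add_tokens: split on single spaces."""
    tokens: List[Dict] = []
    start = 0
    for i, word in enumerate(text.split(" ")):
        tokens.append(
            {"text": word, "start": start, "end": start + len(word), "id": i, "ws": True}
        )
        start += len(word) + 1  # skip the space after the word
    if tokens:
        tokens[-1]["ws"] = False  # no trailing whitespace after the last token
    return tokens


def make_task(text: str) -> Dict:
    """Build one pages task with three differently configured pages."""
    return {
        "pages": [
            {"text": text, "tokens": whitespace_tokens(text),
             "view_id": "ner_manual", "config": {"labels": ["PERSON", "ORG"]}},
            {"text": text, "tokens": whitespace_tokens(text),
             "view_id": "pos_manual", "config": {"labels": ["ADJ", "NOUN", "VERB"]}},
            {"text": text, "view_id": "choice",
             "options": [{"id": "pos", "text": "Positive"},
                         {"id": "neg", "text": "Negative"}],
             "config": {"choice_style": "single"}},
        ]
    }


task = make_task("Apple hired Ann")
print([page["view_id"] for page in task["pages"]])
# ['ner_manual', 'pos_manual', 'choice']
```

Here each page carries its own config, so the label set and choice style differ per page.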