Add labeled gold data in between unlabeled data

Hello,
I am planning to add labeled gold data in between unlabeled data.

The goal is to check whether the annotators are annotating randomly or not.
If they annotate the gold data incorrectly, I want to notify them that they need to focus during annotation.

How can I do that? I have searched but couldn't find any script that supports this.
Thank you

Hi @Asma,

We currently do not have a built-in feature that supports the behavior you describe.
There are a couple of ways you could go about it:

  1. post-hoc validation
    You could db-out your annotators' datasets and run a script that compares their answers to the gold annotations and outputs a report for you (see the sketch at the end of this reply).
    Alternatively, if you format your gold annotations so that they have an _annotator_id field (e.g. "gold_annotator") and concatenate this gold dataset with each of your annotators' datasets, you should also be able to run Prodigy's inter-annotator agreement metrics to get a report. Note, though, that this would have to be run once per annotator, as the IAA metrics do not provide a per-annotator breakdown.

  2. online validation
    You could leverage Prodigy's validate_answer callback to compare the submitted answer to the gold annotation and show a message to the annotator while they annotate. For this to work, you'd have to read the gold annotations in the custom recipe and key them by some identifier, e.g. the _input_hash, so that the validation function could look something like:

from typing import Dict

def correct_answers(eg: Dict, gold_answer: Dict) -> bool:
    # Compare the submitted answer to the gold annotation and return a boolean.
    # What to compare depends on your task - for a choice interface, for
    # example, the selected option ids are typically stored under "accept".
    return eg.get("accept") == gold_answer.get("accept")

def validate_answer(eg: Dict, gold_answers: Dict):
    if eg["_input_hash"] in gold_answers:
        result = correct_answers(eg, gold_answers[eg["_input_hash"]])
        assert result is True, "You need to pay more attention!"

Since the eg argument will be provided by the controller, you'll want to pre-fill the gold_answers argument using functools.partial. In other words, the validate_answer callback with the extra argument bound would be returned from the recipe like so:

{
...
"validate_answer": partial(validate_answer, gold_answers=gold_answers),
...
}

You would then need to look at the logs to see how often this validation fired and for which annotators; that could, of course, also be automated by a script.
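For option (1), a minimal sketch of such a comparison script could look like the one below. It assumes each annotator's dataset has been exported with db-out to a JSONL file (the file names here are hypothetical) and that the answer to compare lives in a field like "accept" (what the choice interface stores) - adjust the comparison to whatever your task records:

import srsly

GOLD_FILE = "gold.jsonl"             # gold annotations, keyed by _input_hash
ANNOTATOR_FILE = "annotator1.jsonl"  # output of `prodigy db-out <dataset>` (hypothetical name)

def same_answer(eg, gold):
    # Example comparison - for a choice UI the selected option ids are
    # typically stored under "accept"; adjust to your task's fields.
    return eg.get("accept") == gold.get("accept")

gold_answers = {eg["_input_hash"]: eg for eg in srsly.read_jsonl(GOLD_FILE)}

n_gold = n_correct = 0
for eg in srsly.read_jsonl(ANNOTATOR_FILE):
    gold = gold_answers.get(eg.get("_input_hash"))
    if gold is None:
        continue  # not a gold example
    n_gold += 1
    n_correct += same_answer(eg, gold)

print(f"{n_correct}/{n_gold} gold examples answered correctly")

And if you'd rather go the IAA route, adding the _annotator_id field to the gold file before concatenating could be as simple as:

import srsly

gold = list(srsly.read_jsonl("gold.jsonl"))
for eg in gold:
    eg["_annotator_id"] = "gold_annotator"
srsly.write_jsonl("gold_tagged.jsonl", gold)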

Let us know if you need help with the implementation of any of these solutions!

I am using the script below (recipe.py):

import prodigy
import srsly
from typing import Dict
from functools import partial
import logging

# Set up logging
logging.basicConfig(level=logging.DEBUG)

@prodigy.recipe(
    "my-custom-recipe",
    dataset=("Dataset to save answers to", "positional", None, str),
    jsonl_file=("Jsonl File to Label", "positional", None, str),
    gold_file=("Gold Answers Jsonl File", "positional", None, str)
)
def my_custom_recipe(dataset, jsonl_file, gold_file):
    try:
        # Load your stream from the JSONL file
        stream = list(srsly.read_jsonl(jsonl_file))
        logging.info(f"Loaded {len(stream)} examples from {jsonl_file}")
        if len(stream) == 0:
            logging.error("No examples found in the JSONL file.")
            return {"dataset": dataset, "stream": [], "view_id": "blocks"}
    except Exception as e:
        logging.error(f"Failed to load stream from {jsonl_file}: {e}")
        return {"dataset": dataset, "stream": [], "view_id": "blocks"}

    try:
        # Load gold answers from the JSONL file
        gold_data = list(srsly.read_jsonl(gold_file))
        gold_answers = {item["_input_hash"]: item for item in gold_data}
        logging.info(f"Loaded {len(gold_answers)} gold answers from {gold_file}")
        if len(gold_answers) == 0:
            logging.error("No gold answers found in the JSONL file.")
            return {"dataset": dataset, "stream": [], "view_id": "blocks"}
    except Exception as e:
        logging.error(f"Failed to load gold answers from {gold_file}: {e}")
        return {"dataset": dataset, "stream": [], "view_id": "blocks"}

    def correct_answers(eg: Dict, gold_answer: Dict) -> bool:
        logging.debug(f"Evaluating example: {eg}")
        logging.debug(f"Gold answer: {gold_answer}")

        # Check if the necessary fields exist in the example and gold answer
        required_fields = ['text', 'label1', 'label2']
        for field in required_fields:
            if field not in gold_answer:
                logging.warning(f"Gold answer missing '{field}' field: {gold_answer}")
                return False

        # Ensure that the example has 'label1' and 'label2' keys with default values if they don't exist
        example_label1 = eg.get('label1', None)
        example_label2 = eg.get('label2', None)

        return (eg['text'] == gold_answer['text'] and
                example_label1 == gold_answer['label1'] and
                example_label2 == gold_answer['label2'])

    def validate_answer(eg: Dict, gold_answers: Dict):
        logging.debug(f"Validating example: {eg}")
        if "_input_hash" in eg:
            input_hash = eg["_input_hash"]
            if input_hash in gold_answers:
                result = correct_answers(eg, gold_answers[input_hash])
                assert result is True, "You need to pay more attention!"
            else:
                logging.warning(f"No gold answer found for _input_hash: {input_hash}")
        else:
            logging.warning("Example does not contain '_input_hash'")

    # Ensure each example in the stream contains the necessary fields
    for example in stream:
        if "_input_hash" not in example or "text" not in example:
            logging.error(f"Example missing required fields: {example}")
            continue
        # Add default values for 'label1' and 'label2' if they don't exist
        example.setdefault('label1', None)
        example.setdefault('label2', None)
        example.setdefault('accept', [])
        example.setdefault('reject', [])
        example.setdefault('ignore', [])

    blocks = [
        {"view_id": "html", "html_template": "{{text}}"},
        {
            "view_id": "choice",
            "field_id": "label1",
            "text": "اختر المستوى الأول:",
            "choices": [
                {"id": "easy", "text": "سهل"},
                {"id": "medium", "text": "متوسط"},
                {"id": "hard", "text": "صعب"}
            ]
        },
        {
            "view_id": "choice",
            "field_id": "label2",
            "text": "اختر المستوى الثاني:",
            "choices": [
                {"id": "easy", "text": "سهل"},
                {"id": "medium", "text": "متوسط"},
                {"id": "hard", "text": "صعب"}
            ]
        }
    ]

    return {
        "dataset": dataset,
        "view_id": "blocks",
        "stream": stream,
        "config": {
            "blocks": blocks
        },
        "validate_answer": partial(validate_answer, gold_answers=gold_answers)
    }

However, I faced an error.

My data samples look like this:
(gold.jsonl)

{"_input_hash": 3, "text": "ميرزاجن3 هو واحد من مجموعة من الأدوية تسمى مضادات الاكتئاب.تستخدم أقراص ميرزاجن لعلاج مرض الاكتئاب.", "label1": "صعب", "label2": "متوسط"}

(sample.jsonl)

{"_input_hash": 1, "text": "ميرزاجن1 هو واحد من مجموعة من الأدوية تسمى مضادات الاكتئاب.تستخدم أقراص ميرزاجن لعلاج مرض الاكتئاب."}
{"_input_hash": 2, "text": "ميرزاجن2 هو واحد من مجموعة من الأدوية تسمى مضادات الاكتئاب.تستخدم أقراص ميرزاجن لعلاج مرض الاكتئاب."}
{"_input_hash": 3, "text": "ميرزاجن3 هو واحد من مجموعة من الأدوية تسمى مضادات الاكتئاب.تستخدم أقراص ميرزاجن لعلاج مرض الاكتئاب."}
{"_input_hash": 4, "text": "ميرزاجن4 هو واحد من مجموعة من الأدوية تسمى مضادات الاكتئاب.تستخدم أقراص ميرزاجن لعلاج مرض الاكتئاب."}

!python -m prodigy my-custom-recipe testing_new_1 sample.jsonl gold.jsonl -F recipe.py

Hi @Asma ,

This error usually means that the structure of the dictionary returned by the recipe is not what the front-end expects.
Looking at your recipe (thanks so much for providing a reproducible example - it always helps a lot!),
the name of the key for the options in the choice blocks is incorrect. It should be "options", not "choices":

{
    "view_id": "choice",
    "field_id": "label2",
    "text": "اختر المستوى الثاني:",
    "options": [
        {"id": "easy", "text": "سهل"},
        {"id": "medium", "text": "متوسط"},
        {"id": "hard", "text": "صعب"}
    ]
}

With this change, the recipe loads correctly. The validation logic looks correct to me, but I haven't tested it.
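If you do want to sanity-check the callback without spinning up the server, one option is to lift correct_answers and validate_answer out of the recipe function to module level so they can be imported, and then call the callback directly on a known gold example. A rough sketch under that assumption (the "..." texts and the deliberately wrong label are made up for the test):

from recipe import validate_answer  # assumes the callbacks were moved to module level

gold_answers = {3: {"_input_hash": 3, "text": "...", "label1": "صعب", "label2": "متوسط"}}

# A wrong answer for a gold example should trigger the AssertionError:
try:
    validate_answer(
        {"_input_hash": 3, "text": "...", "label1": "سهل", "label2": "متوسط"},
        gold_answers=gold_answers,
    )
except AssertionError as e:
    print("Validation fired:", e)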