Gold standard feature (similar to MTurk)

In Mechanical Turk, there is a feature to prevent annotators from gaming the labeling process: the idea is to randomly present a gold standard question where the correct answer is already known. If the annotator misses the gold standard question, we give feedback and record that they missed it.

With Prodigy, do you have suggestions on how this can be implemented?

We were thinking about doing the following:

  • configure the answer to a gold standard question in the meta field & use hide_meta in the prodigy.json (see the sketch after this list)
  • inject JavaScript to check whether or not the gold standard was answered correctly
  • record in JavaScript how many times the user failed the gold standard question
  • at the end of the task, we log the number of failures; if this exceeds some threshold, we throw out the user’s results
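For concreteness, here's a rough sketch of what we have in mind. The field names ("is_gold", "expected_spans") are just placeholders we made up, and the span format assumes an ner_manual-style task:

{
  "text": "Apple is looking at buying a U.K. startup",
  "meta": {
    "is_gold": true,
    "expected_spans": [{"start": 0, "end": 5, "label": "ORG"}]
  }
}

With hide_meta enabled in the prodigy.json, the annotator shouldn't see any of this:

{
  "hide_meta": true
}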

Stray questions:

  • is it possible to prevent the user from moving forward until they have answered the gold standard correctly?
  • is it possible to modify the meta field after they answer something? for example, it isn’t useful for us to know that they got the right answer eventually; instead, we want to know if they got the gold standard question wrong at least once.

Thank you!

Hi! That's an interesting idea – there might be some design decisions in Prodigy that are not perfectly optimised for those "low trust" use cases, but I'm pretty sure your idea should work with an approach similar to the one you describe 🙂

You could do that, yes! You could also just add it as any other field in the task data, like "correct_answer" or something. This wouldn't be surfaced to the annotator, and you could still use the "meta" field for information that you do want the annotator to see.
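So, just to illustrate, a task could look something like this ("correct_answer" is an arbitrary key, and the spans format assumes a spans-based interface like ner_manual):

{
  "text": "Apple is looking at buying a U.K. startup",
  "correct_answer": {"spans": [{"start": 0, "end": 5, "label": "ORG"}]},
  "meta": {"source": "info the annotator may see"}
}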

A function to update the current task data is exposed as window.prodigy.update, and the current task content as window.prodigy.content. So you could use those to keep a counter of the wrong attempts on the task dict, e.g. "wrong_attempts": 5:

const { update, content } = window.prodigy
// Fall back to 0 if the counter hasn't been set on the task yet
update({ wrong_attempts: (content.wrong_attempts || 0) + 1 })

What types of annotations are you collecting? The only problem I can think of is that when the user hits the accept/reject/ignore buttons, the answer is submitted, and while you can listen to the event, there's no easy way to stop it. So you might have to add your own "submit" button that calls a custom function that validates the answer against your gold data, updates the task (e.g. increments the counter of failed attempts) and then either pops up an alert, or calls window.prodigy.answer with "accept".

For example, something like this:

function onCustomSubmit() {
    const { content, update, answer } = window.prodigy
    // Validate the answer by comparing the annotated spans on the
    // current task against the stored gold spans. Disclaimer:
    // comparing JSON strings is still a crude equality check (span
    // order and extra keys matter), but it illustrates the idea.
    const given = JSON.stringify(content.spans)
    const gold = JSON.stringify(content.correct_answer.spans)
    if (given !== gold) {
        alert('This was wrong. Please try again.')
        update({ wrong_attempts: (content.wrong_attempts || 0) + 1 })
    }
    else {
        answer('accept')
    }
}

I guess you could also use JavaScript to add your custom Submit button and hide the existing buttons. It's not very elegant and not usually something I'd recommend, but it seems like the best option in your case.
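A rough sketch of what that could look like. The prodigymount event and the class names (.prodigy-buttons, .prodigy-content) are assumptions about Prodigy's markup, so double-check them against your version:

document.addEventListener('prodigymount', () => {
    // Hide the built-in accept/reject/ignore buttons via injected CSS
    const style = document.createElement('style')
    style.textContent = '.prodigy-buttons { display: none }'
    document.head.appendChild(style)

    // Add a custom button that runs the validation in onCustomSubmit()
    const button = document.createElement('button')
    button.textContent = 'Submit'
    button.onclick = onCustomSubmit
    document.querySelector('.prodigy-content').appendChild(button)
})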

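And since the wrong_attempts counter lives on the task dict, it gets saved along with the answer, so you could also just listen for the answer event to log gold questions that were missed at least once. This assumes the prodigyanswer event exposes the submitted task on event.detail:

document.addEventListener('prodigyanswer', event => {
    // Log gold tasks that were answered wrongly at least once,
    // so they can be counted per annotator afterwards
    const task = event.detail.task
    if (task.correct_answer && task.wrong_attempts > 0) {
        console.log('Missed gold question:', task)
    }
})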

Great, thank you so much for your support! This helps a lot.