Update history_text after annotation

Hi,

For one of my use cases, it is helpful for the history sidebar to show the annotation for each sample.

We cluster similar examples together, so our annotation process is generally very quick. It's reasonably likely that a differing label is an error. Right now, we log when this happens, but that means annotators need to have the terminal open. It would be much cleaner for annotators to only focus on the UI.

I tried populating the history_text value in the validate_answer callback (in other words, after an annotation is made), but it seems that history_text is only used if provided in the streaming function. Is there any way to populate the history text in the validate_answer callback?

If that's not possible, I have another idea that might be more broadly useful: have an option to show users warnings based on the validate_answer function. Right now, validate_answer can only raise errors, but there might be conditions that aren't always errors but should be double checked.

Thanks!

Hi @zkl!

Prodigy recipe callbacks are called when a batch of answers is received by the server. So if you need to modify the content of the current task based on the content of the previous task, you'd need to set the batch_size to 1 so that you can "intercept" each answer.

Also, as you rightly point out, it only makes sense to modify history_text in the streaming function, i.e. before the task is sent to the UI. Modifying it in a callback such as update or validate_answer wouldn't affect the task shown in the UI, because callbacks operate on tasks that have already been answered and sent back to the server.

You could store the previous answer in a shared variable that gets updated in a callback (e.g. update or validate_answer) and make your streaming function refer to the same variable while creating the task structure:

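    # This snippet lives inside the recipe function, which is why the nested
    # functions below can declare PREV_ANSWER as nonlocal.
    PREV_ANSWER = None
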
    def custom_get_stream(source):
        nonlocal PREV_ANSWER
        stream = get_stream(source)
        stream.apply(add_labels_to_stream, stream=stream, labels=["A"])
        for eg in stream:
            if PREV_ANSWER:
                eg["history_text"] = PREV_ANSWER
            yield eg

    stream = custom_get_stream(source)

    def update(answers):
        nonlocal PREV_ANSWER
        for eg in answers:
            answer = eg.get("answer")
            if PREV_ANSWER is not None:
                assert PREV_ANSWER == answer, "The previous answer was different!"
            else:
                PREV_ANSWER = answer
            eg["history_text"] = PREV_ANSWER
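To see the shared-state pattern in isolation, here is a minimal sketch that runs without Prodigy. The names make_stream_and_update and prev_answer, as well as the toy tasks, are hypothetical and only stand in for the recipe's stream and update components; this is an illustration of the closure mechanics, not of how Prodigy schedules the calls.

```python
def make_stream_and_update(tasks):
    """Return a (stream, update) pair that share one piece of state."""
    prev_answer = None  # shared by both closures below

    def stream():
        for task in tasks:
            task = dict(task)  # copy so the input tasks are not mutated
            if prev_answer is not None:
                # Reading the enclosing variable needs no nonlocal declaration
                task["history_text"] = prev_answer
            yield task

    def update(answers):
        nonlocal prev_answer  # required because we rebind the variable
        for eg in answers:
            prev_answer = eg.get("answer")

    return stream(), update


stream, update = make_stream_and_update([{"text": "a"}, {"text": "b"}])
first = next(stream)                      # created before any answer exists
update([{**first, "answer": "accept"}])   # simulate one answered task
second = next(stream)                     # created after the callback ran
print(first)
print(second)
```

Only the second task carries history_text, because it was created after update ran.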

That said, controlling the feedback between adjacent annotations this tightly is difficult, because Prodigy maintains an internal buffer of tasks to ensure a smooth annotation experience. So even if you're using a batch_size of 1, there may always be at least one example "in transit" that's sent back to the server while Prodigy asks for more examples in the background. As a result, the update to history_text will only be reflected in the next batch. That's probably not what you want (I still wanted to explain a bit more how the callbacks work).
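The lag described above can be reproduced with a plain generator. This is a deliberately simplified model, assuming a single prefetched task; it does not reflect Prodigy's actual buffering internals, and state, take_batch and the toy texts are hypothetical names used only for illustration.

```python
from itertools import islice

# Shared state that a callback would update once an answer arrives
state = {"prev_answer": None}


def stream(texts):
    for text in texts:
        # history_text is fixed at the moment the task is generated
        yield {"text": text, "history_text": state["prev_answer"]}


def take_batch(gen, batch_size):
    """Pull the next batch_size tasks from the generator."""
    return list(islice(gen, batch_size))


gen = stream(["a", "b", "c", "d"])
on_screen = take_batch(gen, 1)    # task the annotator is looking at
in_transit = take_batch(gen, 1)   # prefetched before the first answer arrives
state["prev_answer"] = "accept"   # callback runs for the first answer
next_batch = take_batch(gen, 1)   # first task generated after the callback

print(in_transit[0]["history_text"])
print(next_batch[0]["history_text"])
```

The prefetched task was generated before the callback ran, so it still has no history_text; only the task after it picks up the update.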

In your case, you're probably better off implementing this behaviour in JavaScript, so that you're not restricted to processing information in batches but can react to each prodigyanswer event.
Here's an example of such a simple "custom validation warning alert" that (I think) implements the last idea you put forward:

// define global variable PREV_ANSWER that stores previous answer
let PREV_ANSWER;

document.addEventListener('prodigyanswer', event => {
    const { answer, task } = event.detail;
    
    // if PREV answer is defined compare with the current answer
    if (PREV_ANSWER !== undefined && PREV_ANSWER !== answer) {
        // Ask user what to do with a confirm dialog
        const confirmUpdate = confirm(`Warning: The current answer is different from the previous answer.\n\nPrevious answer: ${PREV_ANSWER}\n\nCurrent answer: ${answer}\n\nClick 'OK' to accept anyway or 'Cancel' to update your answer.`);
        
        if (confirmUpdate) {
            // User chose "Accept anyway" - update PREV_ANSWER
            PREV_ANSWER = answer;
            // Allow the event to continue normally
        } else {
            // User chose "Cancel" - let the event complete, then undo
            
            // Use setTimeout to wait for the event to complete before clicking undo
            setTimeout(() => {
                const undoButton = document.querySelector('.prodigy-button-undo');
                if (undoButton) {
                    undoButton.click();
                    // Don't update PREV_ANSWER since we're undoing
                } else {
                    console.error("Undo button not found");
                    alert("Could not find undo button. Please manually adjust your answer.");
                }
            }, 100); // Short delay to ensure the answer is processed first
            
            // Allow the event to continue so the answer gets registered before we undo it
            return;
        }
    } 
    
    // Update PREV_ANSWER unless we're in the process of undoing
    PREV_ANSWER = answer;
});

It's a bit hacky in that it emulates the undo action, which is necessary to allow for task modification, but it also lets annotators accept an answer that does not pass validation, i.e. one that differs from the previous answer.
I used binary labels but you should be able to adapt it easily to other kinds of Prodigy annotations.

This should result in a confirm pop-up. Annotators can hit OK to accept the current answer or Cancel to go back to editing the current question.


Sorry for the delay. This was super helpful and worked for me with just a few tweaks, thanks so much!
