Hi @zkl!
Prodigy recipe callbacks are called when a batch of answers is received by the server. So if you need to modify the content of the current task based on the content of the previous task, you'd need to set the batch_size to 1 so that you can "intercept" each answer.
Also, as you rightly point out, it only makes sense to modify the history_text in the streaming function, i.e. before it is sent to the UI. Modifying it in a callback such as update or validate_answer wouldn't affect the task shown in the UI, as those callbacks operate on tasks that have already been answered and sent back to the server.
You could store the previous answer in a global variable that gets updated in the callback, e.g. update or validate_answer, and make your streaming function refer to the same variable while creating the task structure:
    # This snippet lives inside the recipe function body, which is why the
    # nested functions declare PREV_ANSWER as nonlocal.
    PREV_ANSWER = None

    def custom_get_stream(source):
        nonlocal PREV_ANSWER
        stream = get_stream(source)
        stream.apply(add_labels_to_stream, stream=stream, labels=["A"])
        for eg in stream:
            # Attach the most recent answer before the task is sent to the UI
            if PREV_ANSWER:
                eg["history_text"] = PREV_ANSWER
            yield eg

    stream = custom_get_stream(source)

    def update(answers):
        nonlocal PREV_ANSWER
        for eg in answers:
            answer = eg.get("answer")
            if PREV_ANSWER is not None:
                assert PREV_ANSWER == answer, "The previous answer was different!"
            else:
                PREV_ANSWER = answer
                eg["history_text"] = PREV_ANSWER
That said, controlling the feedback between adjacent annotations this tightly is difficult because Prodigy maintains an internal buffer of tasks to ensure a smooth annotation experience. So even if you're using a batch_size of 1, there may always be at least one example "in transit" that's sent back to the server while Prodigy asks for more examples in the background. So the update to the history_text will only be reflected in the next batch. That's probably not what you want (I still wanted to explain a bit more how the callbacks work).
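To make the lag concrete, here is a toy simulation in plain Python. It is only a sketch of the idea and not Prodigy's actual queueing code: the simulate function, its items argument, and the assumption that exactly one batch is prefetched before the current one is answered are all made up for illustration.

```python
from itertools import islice
from typing import Dict, Iterator, List, Optional, Tuple


def simulate(items: List[Tuple[str, str]], batch_size: int = 1) -> List[Optional[str]]:
    """Return the history_text each task carried when it was shown.

    `items` pairs each (unique) task text with the answer the annotator gives.
    """
    prev_answer: Optional[str] = None
    scripted = dict(items)

    def stream() -> Iterator[Dict[str, Optional[str]]]:
        for text, _ in items:
            # history_text is fixed at the moment the task is pulled
            yield {"text": text, "history_text": prev_answer}

    def update(answers: List[Dict[str, Optional[str]]]) -> None:
        # Stand-in for the recipe's update callback
        nonlocal prev_answer
        for eg in answers:
            prev_answer = eg["answer"]

    tasks = stream()
    shown: List[Optional[str]] = []
    current = list(islice(tasks, batch_size))
    while current:
        # Assumption: the next batch is requested before the current
        # batch's answers have reached the server.
        upcoming = list(islice(tasks, batch_size))
        answered = []
        for task in current:
            shown.append(task["history_text"])
            answered.append({**task, "answer": scripted[task["text"]]})
        update(answered)
        current = upcoming
    return shown


items = [("t1", "accept"), ("t2", "reject"), ("t3", "accept"), ("t4", "reject")]
print(simulate(items, batch_size=1))  # [None, None, 'accept', 'reject']
```

Under these assumptions the second task never sees the answer to the first one, and the third task sees the answer to the first rather than the second, which is the one-batch delay described above.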
In your case you're probably better off implementing this behaviour in JavaScript, so that you're not restricted to processing information in batches but can react to each prodigyanswer event.
Here's an example of such a simple "custom validation warning alert" that (I think) implements the last idea you put forward:
    // define global variable PREV_ANSWER that stores previous answer
    let PREV_ANSWER;
    document.addEventListener('prodigyanswer', event => {
      const { answer, task } = event.detail;
      // if PREV answer is defined compare with the current answer
      if (PREV_ANSWER !== undefined && PREV_ANSWER !== answer) {
        // Ask user what to do with a confirm dialog
        const confirmUpdate = confirm(`Warning: The current answer is different from the previous answer.\n\nPrevious answer: ${PREV_ANSWER}\n\nCurrent answer: ${answer}\n\nClick 'OK' to accept anyway or 'Cancel' to update your answer.`);
        if (confirmUpdate) {
          // User chose "Accept anyway" - update PREV_ANSWER
          PREV_ANSWER = answer;
          // Allow the event to continue normally
        } else {
          // User chose "Cancel" - let the event complete, then undo
          // Use setTimeout to wait for the event to complete before clicking undo
          setTimeout(() => {
            const undoButton = document.querySelector('.prodigy-button-undo');
            if (undoButton) {
              undoButton.click();
              // Don't update PREV_ANSWER since we're undoing
            } else {
              console.error("Undo button not found");
              alert("Could not find undo button. Please manually adjust your answer.");
            }
          }, 100); // Short delay to ensure the answer is processed first
          // Allow the event to continue so the answer gets registered before we undo it
          return;
        }
      }
      // Update PREV_ANSWER unless we're in the process of undoing
      PREV_ANSWER = answer;
    });
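To get this script into the browser, one option is to pass it through the "javascript" setting in the config your recipe returns. A minimal sketch, assuming the script is saved in a file next to the recipe; the build_recipe_config helper and the validation.js file name are hypothetical and only there for illustration:

```python
from pathlib import Path
from typing import Any, Dict


def build_recipe_config(js_path: str) -> Dict[str, Any]:
    """Read a JavaScript file and return a config dict that embeds it."""
    script = Path(js_path).read_text(encoding="utf-8")
    # "javascript" is the config setting used for custom scripts
    return {"javascript": script}
```

In the recipe you would then return something like "config": build_recipe_config("validation.js") alongside the usual "dataset", "stream" and "view_id" components.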
It's a bit hacky in that it emulates the undo action, which is necessary to allow for task modification, but it also permits accepting an answer that does not pass validation, i.e. one that differs from the previous answer.
I used binary labels but you should be able to adapt it easily to other kinds of Prodigy annotations.
This should result in the following confirm pop-up:
Annotators can hit OK to accept the current answer or Cancel to go back to editing the current question.