This is a good point and an interesting suggestion. The only aspect here that could be tricky is that the task data can easily get very verbose if it's tracking the entire task on every update.
Mostly thinking out loud, but you might be able to implement something like this by listening to the prodigyupdate event and adding a task property "history" with timestamped versions of the given task:
    document.addEventListener('prodigyupdate', event => {
        const { task } = event.detail
        const history = task.history || {}
        // Add entry with timestamp and selected properties you want to track
        history[Date.now()] = { spans: task.spans, answer: task.answer }
        window.prodigy.update({ ...task, history })
    })
This way, you can decide which values you care about (e.g. "spans" and "answer", or "label" etc.). Later on, you can even run some automated diagnostics. For example, if the goal is to annotate named entities and the history is significantly longer than the final number of "spans", it can indicate that the annotator went back and forth a lot. You can then look at the result in more detail.
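As a rough sketch of what such a diagnostic could look like: the snippet below assumes the annotated tasks have already been loaded into an array of objects (for instance by parsing exported JSONL), each with the "history" object keyed by timestamp as written by the listener above. The function name findHeavilyEditedTasks and the ratio threshold are made up for illustration and not part of Prodigy, so you'd want to tune the threshold to your own data.

```javascript
// Hypothetical helper (not a Prodigy API): flag tasks whose edit history is
// much longer than the final number of spans.
// `ratio` is an arbitrary assumption; adjust it for your annotation task.
function findHeavilyEditedTasks(tasks, ratio = 3) {
  return tasks.filter(task => {
    const edits = Object.keys(task.history || {}).length;
    const spans = (task.spans || []).length;
    // Use at least 1 so tasks that end up with no spans are still comparable
    return edits > ratio * Math.max(spans, 1);
  });
}

// Made-up example data in the shape produced by the listener above
const examples = [
  {
    text: 'first example',
    spans: [{ start: 0, end: 5, label: 'X' }],
    history: { 1000: {}, 2000: {} },
  },
  {
    text: 'second example',
    spans: [{ start: 0, end: 6, label: 'X' }],
    history: { 1000: {}, 2000: {}, 3000: {}, 4000: {}, 5000: {} },
  },
];

// Print the text of every task that looks like it was edited a lot
const flagged = findHeavilyEditedTasks(examples);
console.log(flagged.map(task => task.text));
```

The flagged tasks are only candidates for a closer look: a long history can also just mean the example was genuinely difficult.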