Hello! I've been deploying and adapting some Prodigy tasks for my team, but a new need has come up that we need to address:
We pull examples from a database based on SQL criteria, and after the team tags those examples, we push them back to the database where our workflow continues. But the supervisor of this team needs to review and QA those labels before the examples are allowed into a model.
So we would need to plug something in after Prodigy that lets us review in bulk, faster than an actual annotator would. This could be some kind of editable spreadsheet or similar (we don't want to introduce human error at this point, so it would just be about rejecting/accepting or changing the label for a group of examples).
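To make the bulk step concrete, here's a rough sketch of what we have in mind after exporting annotations to JSONL with `db-out` (the `accept`/`answer` field names follow Prodigy's choice format as I understand it, and `bulk_decide` is just a hypothetical helper, not anything built in):

```python
import json
from collections import Counter

def load_jsonl(path):
    """Read one annotation dict per line, as exported by `prodigy db-out`."""
    with open(path, encoding="utf8") as f:
        return [json.loads(line) for line in f if line.strip()]

def label_counts(examples):
    """Quick overview of how often each label was chosen (the choice
    interface stores the selected option ids in the "accept" list)."""
    counts = Counter()
    for eg in examples:
        for label in eg.get("accept", []):
            counts[label] += 1
    return counts

def bulk_decide(examples, label, answer):
    """Apply one supervisor decision to every example carrying a given
    label; answer is "accept" or "reject"."""
    for eg in examples:
        if label in eg.get("accept", []):
            eg["answer"] = answer
    return examples
```

The supervisor would glance at `label_counts`, then accept or reject whole groups at once instead of clicking through examples one by one.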
I've seen this: Feature Request: Bulk Dataset Drop - #4 by justindujardin
So my question is: what would be the best practice for this kind of pipeline? Is there a solution or a tool that could work for this? Is customizing Prodigy with HTML reasonable?
Thank you so much!
Have you seen our review recipe? It doesn't offer a table view, but it should allow somebody to review the work of others. If this doesn't suffice, could you explain why? If I understand what's missing, I might be better able to suggest an alternative.
Hi! Thank you for your reply!
I've seen that recipe but haven't tried it. The main issue with this approach, I think, is the balance between quality and speed. We have 3 annotators and 1 supervisor. The domain is rather specific, so the supervisor often needs to explain new criteria to the team after a mistake was made. He is confident that a quick overview, in a table for example, would be enough in most cases, and fast enough.
Thanks again for helping me think this through!
There are a few downsides to a "label table" view in my experience. Part of it has to do with shortcuts: I'm much faster hitting my keyboard than I am moving my cursor. Another downside is that you typically aren't as precise when dealing with bulk batches. The final segment of my "find bad labels in image data" video (found here) demonstrates what I mean by this. This is my personal experience, though; it may not apply to your use case. You could certainly spend the effort of making a custom interface if you really feel it would be better.
One thing that might save a lot of time, though: are there moments where the supervisor is not needed? I can imagine that a supervisor is required if 1 of the 3 annotators disagrees with the label, but what if they all agree? Is it really that likely that the supervisor would disagree? If the supervisor is the bottleneck, it may also be an option to give the supervisor only a subset of the data points.
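That routing idea could be sketched roughly like this (an illustration, not a built-in recipe; it assumes each exported example carries Prodigy's `_input_hash`, which is the same for identical inputs, plus the `accept` list from the choice interface):

```python
from collections import defaultdict

def split_by_agreement(examples):
    """Group annotations of the same input together and route only the
    disagreements to the supervisor. Unanimous examples pass straight
    through; the rest become the supervisor's review queue."""
    by_input = defaultdict(list)
    for eg in examples:
        by_input[eg["_input_hash"]].append(eg)

    unanimous, needs_review = [], []
    for group in by_input.values():
        # One distinct label set across all annotators means agreement
        labels = {tuple(sorted(eg.get("accept", []))) for eg in group}
        (unanimous if len(labels) == 1 else needs_review).append(group[0])
    return unanimous, needs_review
```

With 3 annotators who mostly agree, the `needs_review` pile should be a small fraction of the total, which may already solve the bottleneck.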
Also, what kind of data are you labelling? Text classification? NER?
Right now we are working with multiple choice text classification. But we will be doing NER in the future.
You are right about the supervisor working on a subset. I'm thinking of giving them some kind of overall visibility (yes, a spreadsheet of sorts) and letting them identify a problematic subset from that. If the supervisor can quickly see that some labels are good, or that some annotator is generally fine, they could keep a smaller set of samples to review.
I now need to think about how the workflow would look, so that I can launch a Prodigy review task using that input.
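Something like a per-annotator summary could feed that spreadsheet. A rough sketch (assuming `_annotator_id` from named sessions and `accept` from the choice interface; agreement with the majority vote is just one possible quality proxy, and it oversimplifies examples with several selected options):

```python
import csv
from collections import Counter, defaultdict

def annotator_summary(examples, out_path):
    """Write one CSV row per annotator: number of examples labelled and
    how often their label matched the majority vote for that input."""
    # Majority label per input, pooled across all annotators
    votes = defaultdict(Counter)
    for eg in examples:
        votes[eg["_input_hash"]].update(eg.get("accept", []))
    majority = {h: c.most_common(1)[0][0] for h, c in votes.items() if c}

    stats = defaultdict(lambda: [0, 0])  # annotator -> [total, matched]
    for eg in examples:
        counts = stats[eg["_annotator_id"]]
        counts[0] += 1
        if majority.get(eg["_input_hash"]) in eg.get("accept", []):
            counts[1] += 1

    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["annotator", "examples", "majority_agreement"])
        for annotator, (total, matched) in sorted(stats.items()):
            writer.writerow([annotator, total, round(matched / total, 2)])
```

The supervisor could sort that CSV, spot which annotator (or label) drifts from the majority, and only pull those examples into a review task.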
(the review recipe supports multiple choice, right? I'm not finding examples)
I'd assume you should be able to do multiple choice reviews there, yes.
It may depend a bit on your version, though. The docs here remind me that if you're using an earlier version of Prodigy (v1.7.1 or earlier), you may need to pass the