As I dig deeper into active learning in Prodigy, I'm increasingly confused about the `prefer_uncertain` function. The docs say that `prefer_uncertain` reranks the examples, but does the sort operate within a single batch or across several batches?
I assume it doesn't sort the whole dataset. If that's the case, after the uncertain items are chosen for the user to annotate, what happens to the rest of the examples in those batches? Are they just thrown away? I'm also curious because, as the model updates, those previously certain items may get different scores. Shouldn't they be considered again?
What's more, what exactly is the rule in `prefer_uncertain`? Does it prefer scores close to 0, or scores close to 0.5 (and -0.5)? I ask because when I looked at the scores of the annotated items in order, I couldn't find any pattern.
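To make my question concrete, here is a toy sketch of what I *think* "prefer uncertain" means: keep the examples whose score is closest to 0.5, i.e. where the model is least sure. All names and the fixed threshold here are my own invention, not Prodigy's actual implementation — is this mental model roughly right?

```python
def uncertainty(score):
    # Distance from maximal uncertainty: 0.5 -> 1.0, 0.0 or 1.0 -> 0.0
    return 1.0 - abs(score - 0.5) * 2

def toy_prefer_uncertain(scored_stream, threshold=0.5):
    """Yield examples whose uncertainty exceeds a (here: fixed) threshold."""
    for score, example in scored_stream:
        if uncertainty(score) >= threshold:
            yield example

stream = [(0.95, "A"), (0.52, "B"), (0.10, "C"), (0.48, "D")]
print(list(toy_prefer_uncertain(stream)))  # ['B', 'D']
```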
By the way, I'm running Prodigy with a PyTorch model. Could you give me any ideas on how to verify that the active learning is really working?
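For context, one sanity check I was considering (outside Prodigy, just plain NumPy on synthetic data) is to train on an uncertainty-sampled subset versus a same-sized random subset and compare held-out accuracy. Everything below is my own sketch, not anything from Prodigy or my real PyTorch setup:

```python
import numpy as np

def fit_logreg(X, y, lr=0.1, steps=500):
    # Minimal logistic regression via gradient descent
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1 / (1 + np.exp(-(X @ w)))
        w -= lr * X.T @ (p - y) / len(y)
    return w

def accuracy(w, X, y):
    return np.mean(((X @ w) > 0) == y)

rng = np.random.default_rng(0)
# Synthetic linearly separable-ish binary task
X = rng.normal(size=(2000, 10))
true_w = rng.normal(size=10)
y = (X @ true_w + rng.normal(scale=1.0, size=2000) > 0).astype(float)
X_pool, y_pool = X[:1500], y[:1500]
X_test, y_test = X[1500:], y[1500:]

# Seed model on a small random labelled set
seed = rng.choice(1500, 50, replace=False)
w0 = fit_logreg(X_pool[seed], y_pool[seed])

# Uncertainty sampling: pool items with predicted probability closest to 0.5
p = 1 / (1 + np.exp(-(X_pool @ w0)))
uncertain = np.argsort(np.abs(p - 0.5))[:200]
random_pick = rng.choice(1500, 200, replace=False)

w_active = fit_logreg(X_pool[np.r_[seed, uncertain]], y_pool[np.r_[seed, uncertain]])
w_random = fit_logreg(X_pool[np.r_[seed, random_pick]], y_pool[np.r_[seed, random_pick]])
print(accuracy(w_active, X_test, y_test), accuracy(w_random, X_test, y_test))
```

If the uncertainty-sampled model tends to match or beat the random one with the same annotation budget, that would suggest the selection is doing something useful — though I'm not sure this transfers cleanly to my actual setup, which is why I'm asking.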
Thanks a lot!