Why is there no active learning for manual/gold standard annotations?

Hi there,

Apologies in case this was brought up before or in the documentation, I haven’t found a good answer.

From what I understand (and I might be wrong, I haven’t studied the details of the active learning strategy yet), the binary active learning workflow performs one weight update for each batch of binary annotations. The model then scores the upcoming examples and asks the user about the entities it’s most uncertain about.
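To make the idea concrete, here is a schematic sketch of that loop in plain Python. It only illustrates uncertainty sampling (scores closest to 0.5 get queued first); the function names and the toy scores are made up for illustration and are not Prodigy internals.

```python
# Schematic uncertainty-sampling sketch; names and scores are illustrative,
# not Prodigy's actual internals.

def uncertainty(score: float) -> float:
    """Distance from 0.5: smaller means the model is less certain."""
    return abs(score - 0.5)

def most_uncertain(scored_examples, n):
    """Pick the n examples whose scores are closest to 0.5."""
    return sorted(scored_examples, key=lambda ex: uncertainty(ex["score"]))[:n]

examples = [
    {"text": "Acme Corp",  "score": 0.92},
    {"text": "blue sky",   "score": 0.48},
    {"text": "John Smith", "score": 0.55},
    {"text": "the table",  "score": 0.10},
]

# The two examples nearest 0.5 are shown to the annotator first.
queue = most_uncertain(examples, 2)
print([ex["text"] for ex in queue])  # ['blue sky', 'John Smith']
```

After each batch of yes/no answers, the model would be updated and the remaining examples re-scored, so the queue keeps shifting toward whatever the model is currently unsure about.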

While you have made a compelling point on why the binary workflow is great for quick annotation, it requires starting with a model that already has at least some idea of what the entities are.

I am wondering why the gold-standard-creation recipe does not update the model along the way. It already uses the model to suggest entities (making it easier for the user to quickly correct them instead of marking everything manually). Updating the model after each batch (or even after every single annotated paragraph) would seem to make it better and better as annotation proceeds, reducing the number of manual corrections the user needs to make.

You can definitely provide an update() callback function in a recipe that uses the manual interface. Custom recipes are very useful in general, as the Python API is much more powerful than the recipe arguments.
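For anyone reading along, the shape of such a recipe can be sketched in plain Python: a recipe returns a dict of components, including a stream of tasks and an update callback that receives each batch of answers. The sketch below mimics that shape without importing Prodigy; `FakeModel`, `suggest` and `train_on_batch` are illustrative stand-ins, not the real API.

```python
# Plain-Python sketch of a manual recipe with an update callback.
# FakeModel and its methods are illustrative stand-ins, not Prodigy's API.

class FakeModel:
    """Stand-in for an NER model that can suggest spans and be updated."""
    def __init__(self):
        self.updates = 0

    def suggest(self, text):
        # Pretend to pre-highlight entity spans for the annotator to correct.
        return []

    def train_on_batch(self, answers):
        self.updates += 1  # one weight update per annotated batch

def manual_recipe(dataset, source, model):
    def stream():
        for text in source:
            yield {"text": text, "spans": model.suggest(text)}

    def update(answers):
        # Called with each batch of corrected annotations, so the
        # suggestions improve as annotation proceeds.
        model.train_on_batch(answers)

    return {
        "dataset": dataset,
        "stream": stream(),
        "view_id": "ner_manual",
        "update": update,
    }

model = FakeModel()
components = manual_recipe("gold_ner", ["Some text.", "More text."], model)
for task in components["stream"]:
    components["update"]([task])  # simulate one answered batch per task
print(model.updates)  # 2
```

The key point is just that the update callback closes over the same model that generates the stream, so corrections made in the manual interface can feed back into the suggestions.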

Ultimately we decided to avoid trying to cram every combination of possible options into the built-in CLI. If you make the CLI too complicated, you end up programming via the CLI and learning too many arbitrary details; at some point it’s better to switch over to Python. The source for the built-in recipes ships with Prodigy, and you can find starter custom recipes here: https://github.com/explosion/prodigy-recipes

You’re right, I’ve been relying on the CLI so far since it’s easier to get started with, but it’s probably time I switched to using the API directly.

Thanks for your quick answer as usual.