Apologies if this has been brought up before or is covered in the documentation; I haven't found a good answer.
From what I understand (and I might be wrong, I haven't studied the details of the active learning strategy yet), binary active learning works by doing one weight update for each batch of binary annotations. The model then predicts entities on the next batch of examples and asks the user about the ones it's most uncertain of.
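To make sure I'm understanding the selection step correctly, here is a toy sketch of what I imagine happens: the model scores candidate entities, and the annotator is only asked about the ones closest to total uncertainty (a score near 0.5). The scoring function, the example data, and the 0.5 threshold are all my assumptions for illustration, not the tool's actual API.

```python
def uncertainty(score):
    """Distance from total uncertainty: 0.0 means maximally unsure."""
    return abs(score - 0.5)

def select_most_uncertain(scored_examples, k=2):
    """Pick the k candidate entities the model is least confident about."""
    return sorted(scored_examples, key=lambda ex: uncertainty(ex["score"]))[:k]

candidates = [
    {"text": "Berlin", "score": 0.95},  # confident entity: not worth asking
    {"text": "apple", "score": 0.48},   # very uncertain: ask the user
    {"text": "May", "score": 0.55},     # uncertain: ask the user
    {"text": "the", "score": 0.02},     # confident non-entity: not worth asking
]

for ex in select_most_uncertain(candidates):
    print(ex["text"])  # prints "apple" then "May"
```

If that's roughly right, then after each batch of yes/no answers on such uncertain candidates, one weight update is applied and the scores (and therefore the next questions) shift accordingly.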
While you have made a compelling point about why the binary workflow is great for quick annotation, it requires starting with a model that already has at least some idea of what the entities are.
I am wondering why the gold-standard-creation recipe does not update the model along the way. It already uses the model to suggest entities (making it easier for the user to quickly correct them instead of marking everything manually). It seems like updating the model after each batch (or even after every single annotated paragraph) would make its suggestions better and better as annotation progresses, reducing the number of manual corrections the user needs to make.
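To illustrate what I mean, here is a minimal sketch of the feedback loop I have in mind: each corrected batch is folded back into the model, so later paragraphs need fewer manual fixes. The `ToyEntityModel` below is a deliberately naive stand-in (it just memorises entity strings), not a claim about how the recipe's real model works.

```python
class ToyEntityModel:
    """Stand-in for an NER model: memorises entity strings it has seen."""

    def __init__(self):
        self.known_entities = set()

    def predict(self, tokens):
        """Suggest which tokens look like entities."""
        return [t for t in tokens if t in self.known_entities]

    def update(self, annotated_batch):
        """Learn from the user's corrected annotations."""
        for tokens, gold_entities in annotated_batch:
            self.known_entities.update(gold_entities)

model = ToyEntityModel()

# First paragraph: the model knows nothing, so every entity is a manual fix.
print(model.predict(["Acme", "hired", "Ada"]))  # prints []
model.update([(["Acme", "hired", "Ada"], ["Acme", "Ada"])])

# Next paragraph: previously corrected entities are now pre-suggested,
# leaving the user less to mark by hand.
print(model.predict(["Ada", "joined", "Acme"]))  # prints ['Ada', 'Acme']
```

Obviously a real model generalises beyond memorised strings, which is exactly why in-the-loop updates seem like they would pay off even more there.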