I am creating a custom recipe, similar to textcat, that uses a trained scikit-learn model for active learning, and I am trying to figure out how to display a custom progress bar to the annotators.
I am defining a function progress with the arguments session, total and loss, but the problem I am running into is that session and total seem to be passed initially, while on any subsequent iterations only loss is passed. What I want to display is the percentage of annotated examples or, even better, the percentage of positive annotated examples (out of a fixed number that I would like to collect).
Hi! I just had a quick look and I think you might be hitting an edge case here: Prodigy assumes that if a custom recipe exposes an update callback that returns a non-None loss together with a custom progress function, the progress function should only receive the loss. That might be true for the built-in recipes, but obviously not for all custom recipes. I'll see if we can add a fix for this!
In the meantime, you could probably work around this by storing the counts and everything else you need within your recipe function and then using that to calculate the progress in your progress callback. For instance, something like this:
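# In your recipe function
goal = 1234
total_accepted = 0

def update(answers):
    # Tell Python to use the variable defined in the recipe function
    nonlocal total_accepted
    # Your other update logic here
    accepted = [a for a in answers if a["answer"] == "accept"]
    total_accepted += len(accepted)

def progress(*args, **kwargs):
    # Fraction of the goal reached so far
    return total_accepted / goal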
You could even take other metrics into account, like examples that are already present in the dataset. The progress is calculated on each update, so if your batch size isn't super low, fetching all examples from the dataset on each call shouldn't be a problem.
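To make the idea of counting examples that are already in the dataset a bit more concrete, here is a minimal sketch. It is only an illustration under some assumptions: `make_callbacks` and `existing_examples` are hypothetical names, not part of Prodigy, and the existing examples are passed in as a plain list of dicts and counted once at startup. In a real recipe you would load them from your dataset instead.

```python
def make_callbacks(existing_examples, goal):
    """Build update/progress callbacks that share a running count (hypothetical helper)."""
    # Start from the accepted examples that are already in the dataset
    total_accepted = sum(1 for eg in existing_examples if eg.get("answer") == "accept")

    def update(answers):
        nonlocal total_accepted
        total_accepted += sum(1 for eg in answers if eg.get("answer") == "accept")

    def progress(*args, **kwargs):
        # Cap at 1.0 so the bar does not overshoot once the goal is reached
        return min(total_accepted / goal, 1.0)

    return update, progress


# Stand-in data: one accepted example already exists, one more arrives in a batch
existing = [{"answer": "accept"}, {"answer": "reject"}]
update, progress = make_callbacks(existing, goal=4)
update([{"answer": "accept"}, {"answer": "ignore"}])
print(progress())  # 2 accepted out of a goal of 4
```

Because `progress` accepts `*args` and `**kwargs`, it should not matter whether it is called with session, total and loss or with the loss alone.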
Thanks for the answer. This does the trick.
From your answer I also understood that removing update should work, which I tried, and it did. The update is not necessary for me at this point, since I mostly use the model to pick the most promising examples to reduce annotation time. Also, the current model does not have a partial_fit method for incremental updates.
It turns out that the solution you suggested is closer to what I initially wanted to achieve. Removing update allowed me to inform the user how many examples of the goal have been annotated, but what I really want is to communicate how many positive examples have been annotated.
When I try to implement your recommendation, though, it throws a "referenced before assignment" error for total_accepted. I am not sure where I should initialise the variable so that it is available: I tried inside the recipe, at the top of the file and even in the init of the class, with no success. Can you give me some direction on how the state of the variable can be shared between the progress function and the place where it is initialised?
Thanks again in advance
@nsorros Ah, I think you might have to add nonlocal total_accepted to the top of your function to tell Python that you mean the existing variable.
Edit: Edited my above post to add this.
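For anyone hitting the same error, here is a small standalone sketch of why it happens, independent of Prodigy. The function names are hypothetical and only there for illustration. Assigning to a variable inside a nested function makes Python treat it as local to that function, unless it is declared nonlocal.

```python
def recipe_without_nonlocal():
    total_accepted = 0

    def update(answers):
        # "+=" is an assignment, so Python treats total_accepted as local to update()
        total_accepted += len(answers)

    return update


def recipe_with_nonlocal():
    total_accepted = 0

    def update(answers):
        nonlocal total_accepted  # refer to the variable in the enclosing function
        total_accepted += len(answers)
        return total_accepted

    return update


try:
    recipe_without_nonlocal()([1, 2, 3])
except UnboundLocalError as err:
    print("Failed as expected:", type(err).__name__)

print(recipe_with_nonlocal()([1, 2, 3]))  # 3
```

Only reading the variable, as the progress function does, works without the declaration. It is the assignment in update that needs it.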
Thanks for the super quick response, that did the trick indeed.