Custom progress bar

Hello :wave:

I am creating a custom recipe, similar to textcat, that uses a trained scikit-learn model for active learning, and I am trying to figure out how to display a custom progress bar to the annotators.

I am defining a function progress with the arguments session, total and loss, but the problem I am running into is that session and total seem to be passed initially, while on any subsequent iteration only loss is passed. What I want to display is the percentage of annotated examples or, even better, the percentage of positively annotated examples (out of a fixed number that I would like to collect).

Hi! I just had a quick look and I think you might be hitting an edge case here: Prodigy assumes that when a custom recipe exposes both an update callback that returns a non-None loss and a custom progress function, the progress function should only receive the loss. That might be true for the built-in recipes, but obviously not for all custom recipes. I'll see if we can add a fix for this!

In the meantime, you could probably work around this by storing the counts and everything else you need within your recipe function and then using that to calculate the progress in your progress callback. For instance, something like this:

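# In your recipe function
goal = 1234
total_accepted = 0

def update(answers):
    # Rebind the counter from the enclosing recipe function; without this,
    # the += below raises UnboundLocalError (use global at module level)
    nonlocal total_accepted
    # Your other update logic here
    accepted = [a for a in answers if a["answer"] == "accept"]
    total_accepted += len(accepted)

def progress(*args, **kwargs):
    # Ignore whatever Prodigy passes in and use our own count
    return total_accepted / goal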

You could even take other metrics into account, like examples that are already present in the dataset. The progress is calculated on each update, so if your batch size isn't super low, fetching all examples from the dataset on each call shouldn't be a problem.
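As a rough sketch of that idea, the counter could be seeded with the accepted examples that are already in the dataset. Everything here is an assumption for illustration: make_progress_tracker is a hypothetical helper, not part of Prodigy, and existing_examples stands in for whatever you load from your dataset (the loading itself is left out). The progress value is also capped at 1.0 in case more examples than the goal are accepted.

```python
def make_progress_tracker(goal, existing_examples=()):
    """Hypothetical helper: build update/progress callbacks that share a count."""
    # Seed the count with accepted examples that are already in the dataset.
    state = {
        "accepted": sum(1 for eg in existing_examples if eg.get("answer") == "accept")
    }

    def update(answers):
        # Count newly accepted answers; other update logic would go here.
        state["accepted"] += sum(1 for eg in answers if eg.get("answer") == "accept")

    def progress(*args, **kwargs):
        # Ignore the arguments passed in and report accepted / goal, capped at 1.0.
        return min(state["accepted"] / goal, 1.0)

    return update, progress


# Example: one accepted example already stored, one more arrives in a batch.
existing = [{"answer": "accept"}, {"answer": "reject"}]
update, progress = make_progress_tracker(goal=4, existing_examples=existing)
update([{"answer": "accept"}, {"answer": "ignore"}])
print(progress())  # 0.5
```

Keeping the count in a dict avoids the need for a nonlocal declaration, and the two returned callbacks can then be passed on from the recipe as its update and progress functions.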



Thanks for the answer. This does the trick. :pray:

From your answer I also understood that removing update should work, which I tried, and it did. I don't need update at this point, since I mostly use the model to pick the most promising examples to reduce annotation time. Also, the current model does not have a partial_fit method for incremental updates.
