This is a feature request for a ‘nice-to-have’, as this is definitely not the most important thing out there.
It might be nice to have an estimate of the number of labels generated per minute. It helps to know how much time might yield how many labels. This is very similar to the tqdm package in Python.
This could even be used to generate an ETA, but obviously only on datasets that are finite.
Yes, I love that idea! We’ve been doing this manually sometimes and then calculating the average seconds per annotation. It’s actually pretty motivating and I’m still surprised how fast some of the annotation can get once you’re in a good flow.
We could make this an optional feature and maybe even display it in the UI underneath the progress or something. It won’t have to be updated in real time and could just be returned by the REST API periodically, just like the progress.
In the meantime, you could probably implement a simple version of this via the update method in a custom recipe, or just on a per-session basis in the on_exit callback. So you log the time on load, calculate the difference and then use the session_annotated attribute of the controller to get the total number of annotations in that session. Something like this:
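# assumes start_time and end_time are time.time() timestamps in seconds
total_mins = (end_time - start_time) / 60
count = ctrl.session_annotated
if count:  # avoid dividing by zero if nothing was annotated
    print(total_mins * 60 / count, "seconds per annotation")
    print(count / total_mins, "annotations per minute")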
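To make the timing logic a bit more reusable, here is a minimal sketch of a small tracker that also covers the ETA idea for finite datasets. The `RateTracker` class and its method names are hypothetical helpers, not part of Prodigy; it only assumes you can tell it when the session starts and how many answers came in.

```python
import time


class RateTracker:
    """Track annotation throughput for one session (hypothetical helper)."""

    def __init__(self, clock=time.monotonic):
        # The clock is injectable so the tracker can be tested without waiting.
        self._clock = clock
        self._start = None
        self.count = 0

    def start(self):
        """Record the session start time and reset the counter."""
        self._start = self._clock()
        self.count = 0

    def add(self, n):
        """Register n newly received annotations."""
        self.count += n

    def elapsed_minutes(self):
        if self._start is None:
            return 0.0
        return (self._clock() - self._start) / 60

    def per_minute(self):
        mins = self.elapsed_minutes()
        return self.count / mins if mins > 0 else 0.0

    def seconds_per_annotation(self):
        if not self.count:
            return None
        return self.elapsed_minutes() * 60 / self.count

    def eta_minutes(self, total):
        """Estimated minutes left for a finite dataset of `total` examples."""
        rate = self.per_minute()
        if rate <= 0:
            return None
        return max(total - self.count, 0) / rate
```

In a custom recipe, one way to wire this up would be to call `start()` when the recipe loads, `add(len(answers))` inside the update method, and print the numbers in the on_exit callback. The ETA only makes sense if you know the total size of the input up front.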
The controller also gives you access to the database, so if you want even more detailed stats, you could fetch all annotations for the current session_id dataset, look at their answers, the total number of spans etc. (depending on the task) and calculate stats from that.
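As a rough sketch of what those more detailed stats could look like: the hypothetical `summarize` function below takes a list of annotation dicts and counts answers and spans. It assumes each example has an "answer" key and, for span-based tasks, an optional "spans" list; how you fetch the examples from the database is left out here, since that depends on your setup.

```python
from collections import Counter


def summarize(examples):
    """Count answers and spans in a list of annotation dicts (hypothetical helper)."""
    # Examples without an "answer" key are counted under "unknown".
    answers = Counter(eg.get("answer", "unknown") for eg in examples)
    # "spans" may be missing or None for tasks that don't use spans.
    total_spans = sum(len(eg.get("spans") or []) for eg in examples)
    return {
        "total": len(examples),
        "answers": dict(answers),
        "spans": total_spans,
    }
```

Combined with the elapsed session time, this would let you report things like accepted annotations per minute or average spans per example.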