Hi! I have a pretty complex dataset (hundreds of labels) that I am building a custom recipe for. Before running Prodigy, I cluster like samples together (so when they are streamed to Prodigy, class A appears in a group, class B appears in a group, etc). One thing that would be really helpful would be bringing the last used label to the top of the pile.
While the below logic works, the sorting order only gets updated every time new samples get streamed. It's not a huge pain, but it would be nice if the options could lazily evaluate when the annotation screen actually appears. Any thoughts or ideas? Here's my current code:
# Defined inside the custom recipe function, hence the `nonlocal` statements.
last_used_label = None

def get_sorted_options():
    nonlocal last_used_label
    labels_list = load_labels()
    # Move the most recently used label to the front, if there is one yet.
    if last_used_label in labels_list:
        labels_list.remove(last_used_label)
        labels_list.insert(0, last_used_label)
    return [{"id": label, "text": label} for label in labels_list]

def get_stream():
    for eg in load_b64_from_paths(image_paths):
        ## other logic
        eg["options"] = get_sorted_options()
        yield eg

def update(answers):
    nonlocal last_used_label
    for answer in answers:
        last_used_label = answer["label"]
While the below logic works, the sorting order only gets updated every time new samples get streamed. It's not a huge pain, but it would be nice if the options could lazily evaluate when the annotation screen actually appears.
I understand that you want to update last_used_label after every single annotation, as opposed to after every batch, and re-sort the options when necessary.
By default, the update callback is called when a batch of answers is saved to the DB. If you want to update after every single annotation, you can try setting batch_size to 1. You won't be able to undo annotated examples anymore (they are stored in the DB immediately after you hit accept), but update will run on each annotation.
Then, inside get_stream, you could call get_sorted_options only when the first option differs from last_used_label, which would cover the lazy evaluation requirement.
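To make the idea concrete, here is a minimal sketch of how the stream and update callback could fit together. It is not tested against Prodigy itself: build_components and move_to_front are hypothetical helper names, the examples and labels are passed in rather than loaded with load_b64_from_paths and load_labels, and it assumes the answer carries the chosen label under "label" as in the question. The returned dict only shows the "stream", "update" and "config" entries; a real recipe would also return its dataset and view_id.

```python
from typing import Dict, Iterable, Iterator, List, Optional


def move_to_front(labels: List[str], last_used: Optional[str]) -> List[str]:
    """Return a copy of `labels` with `last_used` first, if it is present."""
    if last_used is None or last_used not in labels:
        return list(labels)
    return [last_used] + [label for label in labels if label != last_used]


def build_components(examples: Iterable[dict], labels: List[str]) -> Dict[str, object]:
    """Hypothetical helper: build the stream/update pair a recipe would return."""
    last_used_label: Optional[str] = None
    options = [{"id": label, "text": label} for label in labels]

    def get_stream() -> Iterator[dict]:
        nonlocal options
        for eg in examples:
            # Re-sort only when the first option is not already the last used label.
            if (
                options
                and last_used_label is not None
                and options[0]["id"] != last_used_label
            ):
                ordered = move_to_front(labels, last_used_label)
                options = [{"id": label, "text": label} for label in ordered]
            # Hand out a copy of the list so each example owns its own options.
            yield {**eg, "options": list(options)}

    def update(answers: List[dict]) -> None:
        nonlocal last_used_label
        for answer in answers:
            # Same key as in the question; adapt if the label is stored elsewhere.
            last_used_label = answer["label"]

    return {
        "stream": get_stream(),
        "update": update,
        # Send answers back one at a time so `update` runs after each annotation.
        "config": {"batch_size": 1},
    }
```

Because get_stream is a generator, the options for an example are only computed when that example is actually pulled from the stream, so the ordering reflects whatever update has recorded up to that point. How far ahead the stream is consumed is outside the control of this sketch, so the new order may still appear an example or so late.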
If setting batch_size to 1 is not really an option, you could try sorting the labels in the front end instead: listen for the prodigyanswer event to intercept the answer, then update the options in the options container (I haven't tested this solution yet). Here you can find some info on using custom JavaScript.