This is great - thank you Ines! Annotating is already so much faster without these obvious rejections. As suggested, I mark those examples as rejected on exit; however, I am a little unsure how to save them back to my dataset. The relevant part of my recipe is:
def reject_stopwords(removed_examples):
    for eg in removed_examples:
        eg['answer'] = 'reject'
        print(eg)

def on_exit(ctrl):
    nonlocal removed_examples
    # this is called when you exit the Prodigy server
    dataset_name = ctrl.dataset
    database = ctrl.db
    # do something here
    reject_stopwords(removed_examples)
I am not sure whether I should collect the updated rejected examples into a list and pass that back, or whether I can update the dataset on the fly.
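To make the first option concrete, here is a rough sketch of what I have in mind: build a list of rejected copies and hand it to the database in one call on exit. This is only my guess at the shape of it. `InMemoryDB` and `make_on_exit` are hypothetical names I made up so the snippet runs without Prodigy, and I am assuming `ctrl.db` accepts something like `add_examples(examples, datasets=[...])`; I have not confirmed that against the real database class.

```python
from types import SimpleNamespace


class InMemoryDB:
    """Hypothetical stand-in for ctrl.db so the sketch runs without Prodigy."""

    def __init__(self):
        self.datasets = {}

    def add_examples(self, examples, datasets):
        # Assumed to mirror the real database method: append the
        # examples to every dataset named in `datasets`.
        for name in datasets:
            self.datasets.setdefault(name, []).extend(examples)


def reject_stopwords(removed_examples):
    """Return copies of the filtered-out examples, marked as rejected."""
    return [dict(eg, answer="reject") for eg in removed_examples]


def make_on_exit(removed_examples):
    """Build an on_exit callback that closes over the removed examples."""
    def on_exit(ctrl):
        rejected = reject_stopwords(removed_examples)
        # Assumption: with the real database the examples may also need
        # their input/task hashes set before they are stored.
        ctrl.db.add_examples(rejected, datasets=[ctrl.dataset])
    return on_exit


# Simulate the server shutting down with two filtered-out examples.
removed = [{"text": "the"}, {"text": "and"}]
ctrl = SimpleNamespace(dataset="my_dataset", db=InMemoryDB())
make_on_exit(removed)(ctrl)
saved = ctrl.db.datasets["my_dataset"]
```

Returning a new list instead of mutating `removed_examples` in place keeps the original stream data untouched, which seemed safer to me, but I am happy to be corrected if updating on the fly is the intended approach.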
Thanks.