Exporting not_flagged annotation

I have all my examples in one dataset. From there I can export only the flagged ones with the -F flag (and I can even re-add them as a new dataset).

However, how can I export only the unflagged examples? Is there a recipe that could be used for this?

Edit:
The use case is that for training I need only the unflagged examples, so I can run ner.gold-to-spacy on them.

When you flag an example, Prodigy will add "flagged": True to the annotation example. So you could write a script that gets the examples in a dataset and filters out examples that have "flagged" set to True. For example:

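from prodigy.components.db import connect

# Connect to the database configured for Prodigy
db = connect()
# Load all annotated examples of the dataset as a list of dicts
examples = db.get_dataset("your_dataset")
# Keep only the examples where "flagged" is missing or falsy
filtered_examples = [eg for eg in examples if not eg.get("flagged")]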

You could then save out the filtered examples to a JSONL file, or use db.add_dataset and db.add_examples to create a new dataset and add the filtered examples to it. You can find more details on the available database methods in the “Database” section of your PRODIGY_README.html.
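
As a rough sketch of that last step (not an official recipe): the helper names filter_unflagged, write_jsonl and export_unflagged, the file name, and the "_unflagged" dataset suffix are all made up for illustration. The datasets keyword of db.add_examples is written from memory, so please double-check it against the "Database" section of your README.

```python
import json
from pathlib import Path


def filter_unflagged(examples):
    """Return only the examples where "flagged" is missing or falsy."""
    return [eg for eg in examples if not eg.get("flagged")]


def write_jsonl(examples, path):
    """Write one JSON object per line (JSONL)."""
    with Path(path).open("w", encoding="utf-8") as f:
        for eg in examples:
            f.write(json.dumps(eg, ensure_ascii=False) + "\n")


def export_unflagged(dataset, out_path):
    """Export the unflagged examples of `dataset` (needs Prodigy installed)."""
    # Imported here so the helpers above also work without Prodigy
    from prodigy.components.db import connect

    db = connect()
    unflagged = filter_unflagged(db.get_dataset(dataset))

    # Option 1: save the filtered examples to a JSONL file
    write_jsonl(unflagged, out_path)

    # Option 2: store them in a new dataset (name is an assumption)
    new_name = dataset + "_unflagged"
    db.add_dataset(new_name)
    db.add_examples(unflagged, datasets=[new_name])
    return unflagged
```

With Prodigy installed, calling export_unflagged("your_dataset", "unflagged.jsonl") should give you both a JSONL file and a new dataset containing only the unflagged examples, and you could then point ner.gold-to-spacy at that new dataset.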


Thanks for the fast reply. I tried the script, but I am getting an error on

examples = db.get_examples("dataset_name")

it dumps some lines and ends with

ValueError: invalid literal for int() with base 10: 'd'

Sorry, that was a typo in my example – I meant the get_dataset method that takes the name of a dataset and returns the examples. Just edited the code in my previous post.


Now it works, thanks!

However, it is odd, because the README does document a method called Database.get_examples, although it might do something else.

Yes, Database.get_examples is the method that gets individual examples from a list of example IDs. Database.get_dataset gets all examples in a dataset, given the dataset ID.
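
That difference would also explain the error above. The snippet below is only an illustration and not Prodigy's actual code: get_examples_by_id is a hypothetical stand-in that assumes each ID gets converted to an integer. If a dataset name is passed where a list of IDs is expected, Python iterates over the string character by character, and the first character "d" cannot be converted.

```python
def get_examples_by_id(ids):
    """Hypothetical stand-in (not Prodigy's implementation):
    treats every item in `ids` as an integer example ID."""
    return [int(example_id) for example_id in ids]


# A list of IDs works as expected
print(get_examples_by_id([1, 2, 3]))

# A string is iterated one character at a time, so "d" is the first "ID"
try:
    get_examples_by_id("dataset_name")
except ValueError as err:
    message = str(err)
    print(message)  # invalid literal for int() with base 10: 'd'
```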
