How to edit existing texts that were added to a dataset using db-in

Hey, and thanks! :smiley:

Datasets in Prodigy are append-only by design – I've written some more about that concept on this thread:

Prodigy gives you direct access to the datasets via its Python API – so you can use that to implement any filtering or search logic you need, over any fields in the JSON records. You could do a simple keyword search over the "text" values, or do something more complex with regular expressions (or even spaCy if you want a more advanced NLP-powered search :smiley:).

from prodigy.components.db import connect

db = connect()
examples = db.get_dataset("dataset_name")
for eg in examples:
    # do something here, e.g. inspect or filter on the text
    print(eg["text"])

Here, examples is a list of dictionaries, one per annotated example. Once you've found the examples you want to edit, you can either export them to a file and re-annotate them (the better option if you need to change entity spans or anything more complex), or edit them directly in your script. Either way, save the result (the previously correct examples plus the edited ones) to a new dataset rather than trying to modify the existing one.
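To make the "edit in your script" route a bit more concrete, here's a rough sketch of how the search-and-edit step could look. The helper names (split_and_edit, fix_typo), the sample records and the "teh" typo are all made up for illustration and aren't part of Prodigy. The helper only works on plain dictionaries, so you can try it without a database:

```python
import copy
import re


def split_and_edit(examples, pattern, edit_fn):
    """Split examples into (untouched, edited) based on a regex over "text".

    Hypothetical helper for illustration, not part of Prodigy.
    """
    regex = re.compile(pattern)
    untouched, edited = [], []
    for eg in examples:
        if regex.search(eg.get("text", "")):
            # Work on a copy so the original record stays as it was
            edited.append(edit_fn(copy.deepcopy(eg)))
        else:
            untouched.append(eg)
    return untouched, edited


def fix_typo(eg):
    # Example edit: correct a typo in the text (made-up use case)
    eg["text"] = eg["text"].replace("teh", "the")
    return eg


# Made-up records standing in for what db.get_dataset() returns
sample = [
    {"text": "This is teh first text", "answer": "accept"},
    {"text": "This one is fine", "answer": "accept"},
]
untouched, edited = split_and_edit(sample, r"\bteh\b", fix_typo)

# With Prodigy installed, this could be wired up roughly as follows
# ("new_dataset_name" is a placeholder):
#
#   from prodigy import set_hashes
#   from prodigy.components.db import connect
#
#   db = connect()
#   examples = db.get_dataset("dataset_name")
#   untouched, edited = split_and_edit(examples, r"\bteh\b", fix_typo)
#   # The text changed, so recompute the hashes on the edited examples
#   edited = [set_hashes(eg, overwrite=True) for eg in edited]
#   db.add_dataset("new_dataset_name")
#   db.add_examples(untouched + edited, datasets=["new_dataset_name"])
```

One thing to watch out for: if an example has "spans", changing the text can shift the character offsets, so those examples are usually safer to export and re-annotate instead.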