Automatically accept NER

hi @AakankshaP!

Thanks for your question. Sorry for the delay - our team has been pretty busy :slight_smile:

Our team has been thinking about this because we had a similar question a few weeks ago.

Some initial thoughts: if you know the criteria for when to auto-accept (e.g., based on metadata) and have the rules/patterns, could you write a script that takes your input file and partitions it into two files?

  • File 1: what should be annotated (perhaps passing that through a standard loader)
  • File 2: what would be "auto accepted", written out as a .jsonl file.
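
The partitioning step could be sketched roughly like this. It's a minimal sketch in plain Python, not a Prodigy feature: the `should_auto_accept` rule, the `meta`/`source` field it checks, and the function and file names are all hypothetical placeholders for your own criteria.

```python
import json
from pathlib import Path


def should_auto_accept(example: dict) -> bool:
    # Hypothetical rule: replace with your own criteria or patterns.
    return example.get("meta", {}).get("source") == "trusted"


def partition_jsonl(input_path, annotate_path, accept_path):
    """Split a .jsonl file into examples to annotate and examples to auto-accept.

    Returns a tuple (n_annotate, n_accept) with the number of examples
    written to each output file.
    """
    n_annotate = n_accept = 0
    with Path(input_path).open(encoding="utf8") as f_in, \
         Path(annotate_path).open("w", encoding="utf8") as f_annotate, \
         Path(accept_path).open("w", encoding="utf8") as f_accept:
        for line in f_in:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            example = json.loads(line)
            if should_auto_accept(example):
                f_accept.write(json.dumps(example) + "\n")
                n_accept += 1
            else:
                f_annotate.write(json.dumps(example) + "\n")
                n_annotate += 1
    return n_annotate, n_accept
```

You could then point your usual recipe at the "annotate" file and handle the "accept" file separately.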

You could then have a script that appends the appropriate metadata (e.g., view_id, accept keys) and adds those examples to the database.

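from prodigy.components.db import connect

# hashes are hardcoded here for illustration only
examples = [{"text": "hello world", "_task_hash": 123, "_input_hash": 456}]

db = connect()  # uses settings from prodigy.json
if "test_dataset" not in db:
    db.add_dataset("test_dataset")  # create the dataset if it doesn't exist yet
db.add_examples(examples, ["test_dataset"])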

The key point, though, is that you'd also need to run set_hashes and add_tokens on those examples before adding them.
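
Putting those pieces together, a rough sketch might look like the following. I'm assuming here that an accepted example is marked with `"answer": "accept"` and that the view ID is stored under `"_view_id"` (e.g. `"ner_manual"`), so please double-check both against an example exported from one of your existing datasets. `mark_accepted` and `save_auto_accepted` are hypothetical helper names, and the blank English pipeline is just a stand-in for whatever tokenizer your recipe uses. I haven't run this against your data, so treat it as a starting point.

```python
def mark_accepted(examples, view_id="ner_manual"):
    """Return copies of the examples flagged as accepted (plain Python, no Prodigy needed)."""
    accepted = []
    for eg in examples:
        eg = dict(eg)             # shallow copy so the input isn't mutated
        eg["answer"] = "accept"   # assumption: key that records the decision
        eg["_view_id"] = view_id  # assumption: should match your recipe's interface
        accepted.append(eg)
    return accepted


def save_auto_accepted(examples, dataset, lang="en"):
    """Tokenize, hash and store the examples. Requires Prodigy and spaCy."""
    import spacy
    from prodigy import set_hashes
    from prodigy.components.db import connect
    from prodigy.components.preprocess import add_tokens

    nlp = spacy.blank(lang)  # tokenizer only, no trained components
    stream = add_tokens(nlp, mark_accepted(examples))
    hashed = [set_hashes(eg) for eg in stream]  # sets _input_hash and _task_hash

    db = connect()  # uses settings from prodigy.json
    if dataset not in db:
        db.add_dataset(dataset)
    db.add_examples(hashed, [dataset])
    return len(hashed)
```

The Prodigy and spaCy imports live inside `save_auto_accepted` so that `mark_accepted` can be tried out on its own first.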

Long story short, I agree there could be a better way to do this. Let me check with a few teammates to see whether we can get started on a cleaner option.

If this is a blocker, in the short term I'd recommend removing the examples you want "auto-accepted" first so you can keep labeling. Then we can hopefully put together a script that takes the "auto-accept" .jsonl file and saves those examples to the database as accepted.