Automatically accept NER

Hi, I'm currently working on a custom NER recipe that uses regex pattern matching to automatically highlight spans for the annotator.
For example, if there is a pattern match, the label is assigned and the span is highlighted during annotation.

I also want to accept examples automatically, without explicitly clicking the accept button (:white_check_mark:) during annotation.
Any example that matches the regex pattern should be accepted directly and added to the database; I don't want to look at those examples during annotation.
How do I implement this automatic accept/reject functionality?

hi @AakankshaP!

Thanks for your question. Sorry for the delay - our team has been pretty busy :slight_smile:

Our team has been thinking about this because we had a similar question a few weeks ago:

Some initial thoughts. If you know the criteria for when to auto-accept (e.g., based on metadata) and already have the rules/patterns, could you just have a script that takes your input file and partitions it into two files?

  • File 1: what should be annotated (perhaps passing that through a standard loader)
  • File 2: what would be "auto accepted", outputted as a .jsonl file.
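To make the partitioning idea concrete, here is a minimal sketch in plain Python. It assumes the input is JSONL with a "text" key per line; the regex, the function names (`partition`, `write_jsonl`) and the file layout are all hypothetical placeholders, not part of Prodigy.

```python
import json
import re
from typing import Dict, Iterable, List, Tuple

# Hypothetical rule: replace with the regex your recipe already uses.
PATTERN = re.compile(r"\b\d{3}-\d{4}\b")


def partition(lines: Iterable[str], pattern: "re.Pattern") -> Tuple[List[Dict], List[Dict]]:
    """Split JSONL lines into (to_annotate, auto_accept) by regex match on "text"."""
    to_annotate: List[Dict] = []
    auto_accept: List[Dict] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        task = json.loads(line)
        if pattern.search(task.get("text", "")):
            auto_accept.append(task)
        else:
            to_annotate.append(task)
    return to_annotate, auto_accept


def write_jsonl(path: str, tasks: Iterable[Dict]) -> None:
    """Write one JSON object per line."""
    with open(path, "w", encoding="utf8") as f:
        for task in tasks:
            f.write(json.dumps(task) + "\n")
```

You would read your source file line by line, call `partition`, and write each list out with `write_jsonl`: the first file goes to the annotation recipe as usual, the second is the "auto-accept" file.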

You could then have a second script that appends the appropriate metadata (e.g., view_id, accept keys) to the File 2 examples and adds them to the database.

from prodigy.components.db import connect

examples = [{"text": "hello world", "_task_hash": 123, "_input_hash": 456}]

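db = connect()                                     # uses settings from prodigy.json
if "test_dataset" not in db:                       # create the dataset if it doesn't exist yet
    db.add_dataset("test_dataset")
db.add_examples(examples, ["test_dataset"])        # add the examples to the dataset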

The key point, though, is that instead of hard-coding the hashes as above, you'd need to run the examples through set_hashes, and through add_tokens too.
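As a rough sketch of the "append the metadata" step, the function below turns a raw example into an accepted record with regex-derived spans. The key names "answer" and "_view_id", the default view ID "ner_manual", the pattern and the label are my assumptions about what the saved tasks should look like; compare them against an example you annotated by hand before relying on this. The sketch deliberately does not add hashes or tokens, so the set_hashes and add_tokens step still applies before writing to the database.

```python
import re
from typing import Dict

# Hypothetical rule and label: replace with the ones from your recipe.
PATTERN = re.compile(r"\b\d{3}-\d{4}\b")
LABEL = "PHONE"


def to_accepted(task: Dict, pattern: "re.Pattern", label: str,
                view_id: str = "ner_manual") -> Dict:
    """Return a copy of the task with regex spans and an accept answer attached."""
    text = task["text"]
    spans = [
        {"start": m.start(), "end": m.end(), "label": label}  # character offsets
        for m in pattern.finditer(text)
    ]
    # Assumed key names; hashes and tokens are NOT added here.
    return {**task, "spans": spans, "answer": "accept", "_view_id": view_id}
```

The original task dict is left unmodified, so the same input can still be written out or inspected separately.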

Long story short, I agree there could be a better way to do this. Let me check with a few teammates to see whether we can get started on a cleaner option.

If this is a blocker, I'd recommend, in the short term, removing the examples you want "auto-accepted" first so you can keep labeling. Then we can hopefully put together a script that converts the "auto-accept" .jsonl file into accepted examples.

Thank you so much, and sorry for the delayed reply. I figured it out through this issue. Thanks for helping me out, I appreciate it!