How to capture discrepancies between two annotators

I want to capture and count the number of samples that had differences between the annotators' annotations, specifically for an NER task. For my purposes, a discrepancy is a difference in the highlighted text. I understand there is a review recipe with an auto-accept function. This auto-accept seems to do almost exactly what I want, but rather than just skipping examples that have no differences, I also want to count the ones that do have differences. How can I do this?

One approach I thought about was to pull the database into Python and match up the tokens. That seems cumbersome, though, since there may be a lot of highlighted text.

Maybe this code:

from typing import Any, Dict, Iterator

from prodigy.components.db import Database
from prodigy.types import StreamType
from prodigy.util import TASK_HASH_ATTR


def filter_auto_accept_stream(
    stream: Iterator[Dict[str, Any]], db: Database, dataset: str
) -> StreamType:
    """Automatically add examples with no conflicts to the database and skip
    them during annotation."""
    task_hashes = db.get_task_hashes(dataset)
    for eg in stream:
        # Skip examples that are already in the dataset
        if TASK_HASH_ATTR in eg and eg[TASK_HASH_ATTR] in task_hashes:
            continue
        versions = eg["versions"]
        if len(versions) == 1:  # no conflicts, only one version
            sessions = versions[0]["sessions"]
            if len(sessions) > 1:  # multiple identical versions
                # Add example to dataset automatically
                eg["answer"] = "accept"
                db.add_examples([eg], [dataset])
                # Don't send anything out for annotation
                continue
        yield eg
(Found by exploring the package: `python -c "import prodigy; print(prodigy.__file__)"`.)
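For the counting part, the key observation in the code above is that an example with more than one entry in its `versions` list is one where the annotators disagreed. A minimal sketch of a counter built on that structure (the helper name `count_conflicts` is mine, and it assumes the `"versions"` shape shown above; it is not part of Prodigy's API):

```python
from typing import Any, Dict, Iterable, Tuple


def count_conflicts(examples: Iterable[Dict[str, Any]]) -> Tuple[int, int]:
    """Return (n_conflicts, n_agreements) over a review-style stream.

    Assumes each example carries a "versions" list with one entry per
    distinct annotation, so len(versions) > 1 means the annotators
    produced different highlighted spans for the same input.
    """
    conflicts = 0
    agreements = 0
    for eg in examples:
        if len(eg["versions"]) > 1:
            conflicts += 1
        else:
            agreements += 1
    return conflicts, agreements
```

You could call this on the stream before (or instead of) the auto-accept filter to get the totals you're after.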

Hi @klopez !

I think you're on the right track, and using an "offline" Python script should do the trick. You can use db-out, which gives you a JSONL export of your database including the highlighted spans. In my opinion, it's more convenient to work with that export if you just want to obtain the differences.
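As a rough sketch of that offline script: group the exported records by text and compare the span sets across annotators. This assumes each db-out record has a `"text"` field and a `"spans"` list with `start`/`end`/`label` keys, which is the usual NER export shape; the function names are mine, not Prodigy's. (Prodigy also stores an `_input_hash` you could group by instead of the raw text.)

```python
import json
from collections import defaultdict
from typing import Any, Dict, List, Tuple


def span_key(span: Dict[str, Any]) -> Tuple[int, int, Any]:
    # Compare annotations by character offsets and label only
    return (span["start"], span["end"], span.get("label"))


def count_discrepancies(examples: List[Dict[str, Any]]) -> int:
    """Count texts whose annotators highlighted different span sets."""
    by_text = defaultdict(list)
    for eg in examples:
        # One frozenset of spans per annotation of this text
        by_text[eg["text"]].append(frozenset(span_key(s) for s in eg.get("spans", [])))
    # A text is a discrepancy if its annotations aren't all identical
    return sum(1 for versions in by_text.values() if len(set(versions)) > 1)


# Usage, assuming an export like: prodigy db-out my_dataset > annotations.jsonl
# with open("annotations.jsonl") as f:
#     examples = [json.loads(line) for line in f]
# print(count_discrepancies(examples))
```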
