I have created a custom filter_auto_accept_stream that checks whether the two annotators' datasets contain the same answer for an example and, if they do, automatically adds that example to the review dataset. The code is simple:
elif len(versions) == 2:
    if versions[0]["answer"] == versions[1]["answer"]:
        eg["answer"] = versions[0]["answer"]
        db.add_examples([eg], [dataset])
    else:
        yield eg
This code works perfectly with datasets of about 2000 rows, but with datasets of 3000-4000 rows the Prodigy server starts before the review filter has finished, and a lot of annotations are lost. Any tips or tricks?
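To rule out the comparison itself, the agreement check can be pulled into a small helper and run without Prodigy. This is only a sketch: `agreed_answer` is a name I made up, and the sample dicts carry only the "answer" key, unlike real review examples.

```python
from typing import Any, Dict, List, Optional


def agreed_answer(versions: List[Dict[str, Any]]) -> Optional[str]:
    """Return the shared answer if every version agrees, otherwise None."""
    answers = {v["answer"] for v in versions}
    return answers.pop() if len(answers) == 1 else None


# Hypothetical version lists shaped like the "versions" entries in my stream.
same = [{"answer": "reject"}, {"answer": "reject"}]
different = [{"answer": "accept"}, {"answer": "reject"}]

print(agreed_answer(same))
print(agreed_answer(different))
```

With a helper like this, the `elif` branch would only need to test whether the result is `None` before deciding to save or yield the example.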
I have added print statements inside filter_auto_accept_stream. Here is my code and part of the output:
Code inside my recipes/review.py:
def filter_auto_accept_stream(
    stream: Iterator[Dict[str, Any]], db: Database, dataset: str
) -> StreamType:
    """
    Automatically add examples with no conflicts to the database and skip
    them during annotation.
    """
    for num, eg in enumerate(stream):
        versions = eg["versions"]
        if len(versions) == 1:  # no conflicts, only one version
            sessions = versions[0]["sessions"]
            if len(sessions) > 1:  # multiple identical versions
                # Add example to dataset automatically
                eg["answer"] = "accept"
                db.add_examples([eg], [dataset])
            # Don't send anything out for annotation
        elif len(versions) == 2:
            print(f"V2: {num} {versions[0]['answer']} == {versions[1]['answer']}")
            if versions[0]["answer"] == versions[1]["answer"]:
                eg["answer"] = versions[0]["answer"]
                db.add_examples([eg], [dataset])
            else:
                yield eg
        else:
            yield eg
Output:
V2: 0 reject == reject
V2: 1 reject == reject
V2: 2 reject == reject
... # entries omitted to keep the log short
V2: 1752 reject == reject
V2: 1753 accept == reject
V2: 1754 reject == reject
✨ Starting the web server at http://0.0.0.0:8080 ...
Open the app in your browser and start annotating!
V2: 1755 reject == reject
V2: 1756 reject == reject
... # entries omitted to keep the log short
V2: 2662 reject == reject
V2: 2663 reject == reject # This is the last entry printed, but my datasets have 4000 lines
I run the following command: env/bin/python -u -m prodigy review example_referee example_annotator1,example_annotator2 --label example --show-skipped --auto-accept
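The interleaving in the output (the server starting while the filter is still printing) looks to me like ordinary generator laziness, although I have not confirmed how Prodigy consumes the stream internally. The sketch below reproduces the pattern in plain Python. `auto_accept`, `make_example`, and the `saved` list standing in for the database are all made up for illustration.

```python
from itertools import islice
from typing import Any, Dict, Iterator, List


def auto_accept(
    stream: Iterator[Dict[str, Any]], saved: List[Dict[str, Any]]
) -> Iterator[Dict[str, Any]]:
    """Save agreeing examples to `saved` and yield the conflicting ones."""
    for eg in stream:
        first, second = eg["versions"]
        if first["answer"] == second["answer"]:
            eg["answer"] = first["answer"]
            saved.append(eg)  # stand-in for db.add_examples([eg], [dataset])
        else:
            yield eg


def make_example(i: int) -> Dict[str, Any]:
    # Examples 3 and 7 are conflicts; all others have matching answers.
    first = "accept" if i % 4 == 3 else "reject"
    return {"id": i, "versions": [{"answer": first}, {"answer": "reject"}]}


saved: List[Dict[str, Any]] = []
stream = auto_accept((make_example(i) for i in range(10)), saved)

# Creating the generator does not run its body, so nothing is saved yet.
print(len(saved))

# Pull one conflict, as a consumer asking for a single item would.
first_batch = list(islice(stream, 1))
print([eg["id"] for eg in first_batch], len(saved))

# Only exhausting the stream processes the remaining examples.
rest = list(stream)
print([eg["id"] for eg in rest], len(saved))
```

If this is what is happening in my recipe, the matching examples that come after the last requested batch would only be saved once something consumes the rest of the stream.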