The docs and https://github.com/explosion/prodigy-recipes/blob/master/ner/ner_silver_to_gold.py#L49
suggest that make_best will merge the silver annotations with the analyses produced by the model, to find the best possible parse given the constraints.
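My mental model of that merge is roughly the following (a hand-rolled sketch of how I read the docs, not Prodigy's actual implementation; is_consistent and best_consistent are names I've made up):

def is_consistent(parse, accepted_spans):
    # A candidate parse is consistent if it contains every accepted span.
    spans = {(s["start"], s["end"], s["label"]) for s in parse["spans"]}
    return all(
        (a["start"], a["end"], a["label"]) in spans for a in accepted_spans
    )

def best_consistent(candidate_parses, accepted_spans):
    # Pick the highest-scoring model parse that respects the annotations.
    consistent = [p for p in candidate_parses if is_consistent(p, accepted_spans)]
    return max(consistent, key=lambda p: p["score"]) if consistent else None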
To reproduce, I took some values captured while running the ner.silver-to-gold recipe:
import spacy
from prodigy.models.ner import EntityRecognizer

nlp = spacy.load("en_core_web_lg")
ner = EntityRecognizer(nlp)

text = "SECURITY BANK CORP <SBKC.O> SAYS EARNINGS INCREASE"

# The silver annotation: an accepted ORG span covering characters 0-18.
task = {
    "text": text,
    "spans": [{"text": "SECURITY BANK CORP", "label": "ORG", "start": 0, "end": 18}],
    "no_missing": True,
    "_input_hash": -124546672,
    "_task_hash": 1252032485,
    "answer": "accept",
}

list(ner.make_best([task]))
gives:
[{'text': 'SECURITY BANK CORP <SBKC.O> SAYS EARNINGS INCREASE',
'spans': [{'start': 0,
'end': 20,
'text': 'SECURITY BANK CORP <',
'rank': 0,
'label': 'ORG',
'score': 1.0}],
'_input_hash': -124546672,
'_task_hash': 1252032485}]
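For contrast, here's the check I expected to pass, continuing from the snippet above (that make_best preserves the accepted span verbatim is my assumption about the intended behaviour):

result = list(ner.make_best([task]))
span = result[0]["spans"][0]
# I expected the silver span back unchanged:
assert (span["start"], span["end"], span["text"]) == (0, 18, "SECURITY BANK CORP")

On 1.8.3 that assertion fails: the span comes back widened to (0, 20, "SECURITY BANK CORP <").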
Would you expect make_best to take my annotation in preference to the one produced by the model, or have I got the wrong idea?
(This is with Prodigy 1.8.3. Possibly related: the thread “Binary annotated data missed out in making gold data” also suggests the silver annotations don’t show up in the gold data.)
Thanks for any help!
Jamie