Is there a way to show a score together with the labels for each entity occurrence in the ner_manual view?

Hi!

What I would like to do is show the prediction score for each entity, together with the highlighted labels, inside the ner_manual view (using the ner.make-gold recipe).

So, I am trying to modify the ner.make-gold recipe code, and create my own recipe which does the following:

1 - Calculating the score of each span should not be an issue; I was thinking of using the logic in this post: Accessing probabilities in NER

2 - Then, I would like to add the “score” field to each span dict inside the make_tasks function, as follows:

    def make_tasks(nlp, stream):
        """Add a 'spans' key to each example, with predicted entities."""
        texts = ((eg["text"], eg) for eg in stream)
        for doc, eg in nlp.pipe(texts, as_tuples=True):
            task = copy.deepcopy(eg)
            spans = []
            for ent in doc.ents:
                if labels and ent.label_ not in labels:
                    continue
                spans.append(
                    {
                        "token_start": ent.start,
                        "token_end": ent.end - 1,
                        "start": ent.start_char,
                        "end": ent.end_char,
                        "text": ent.text,
                        "label": ent.label_,
                        "source": spacy_model,
                        "input_hash": eg[INPUT_HASH_ATTR],
                        "score": MY_SCORE,
                    }
                )
            task["spans"] = spans
            task = set_hashes(task)
            yield task

Now, the most unclear step is: how do I modify the current ner_manual view to show the scores?

1 - Do you have an existing way to do this that I am missing?
2 - If not, could you suggest a way to change the ner_manual view? Unfortunately I cannot access the code of the template…
3 - If option 1 and option 2 are not feasible, what else can I do?

Actually, extending the scope of the question a little: it would be nice to have a way to personalize and slightly modify the existing Prodigy views, but at the moment I cannot find a way to achieve this.

Thanks,
Kasra

Hi! It’s definitely possible to write your own custom view using the html_template, javascript and global_css settings. CSS and JavaScript overrides are also available for existing views, so you can customise them without having to build everything from scratch.
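
For example, here’s a rough, untested sketch of where those settings could go in the components returned by a custom recipe (the CSS and JavaScript values here are just placeholders, and dataset, stream and labels are assumed to come from the recipe scope):

    # Untested sketch: the CSS/JS strings are placeholders for your own overrides.
    return {
        "dataset": dataset,
        "stream": stream,
        "view_id": "ner_manual",
        "config": {
            "labels": labels,
            "global_css": "/* custom CSS overrides go here */",
            "javascript": "// custom JS goes here, e.g. reading window.prodigy.content",
        },
    }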

That said, it sounds like the simplest way to achieve this would be to just append the score to the label? I haven’t tried it yet, but it should work. For instance, like this:

"label": "{} ({})".format(ent.label_, MY_SCORE)

This will show something like PERSON (0.75). You probably want to round the score to a fixed number of decimals here if you’re not already doing this.
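
For example, something like this would give you two decimals (MY_SCORE stands in for however you compute the score):

"label": "{} ({:.2f})".format(ent.label_, MY_SCORE)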

If you do want to style it differently, you could add some custom JavaScript that selects the highlighted span and then applies some custom styling to the scores, or inserts them based on the task data (available as window.prodigy.content). You can find more details on this in your PRODIGY_README.html.

Thank you! Your solution of appending the score in parentheses is very good.
Now I have another issue, actually: I tried to adapt the make_tasks function with the logic in this post (Accessing probabilities in NER) to calculate the scores.
Here is the code:

    def make_tasks(nlp, stream):
        """Add a 'spans' key to each example, with predicted entities."""
        texts = ((eg["text"], eg) for eg in stream)
        for doc, eg in nlp.pipe(texts, as_tuples=True):
            task = copy.deepcopy(eg)
            spans = []
            (beams, _) = nlp.entity.beam_parse([doc], beam_width=16, beam_density=0.0001)

            for score, ents in nlp.entity.moves.get_beam_parses(beams[0]):
                for ent in doc.ents:
                    if labels and ent.label_ not in labels:
                        continue

                    spans.append(
                        {
                            "token_start": ent.start,
                            "token_end": ent.end - 1,
                            "start": ent.start_char,
                            "end": ent.end_char,
                            "text": ent.text,
                            "label": "{} ({})".format(ent.label_, score),
                            "source": spacy_model,
                            "input_hash": eg[INPUT_HASH_ATTR],
                        }
                    )
            task["spans"] = spans
            task = set_hashes(task)
            yield task

However, I notice that the model always returns score = 1.0 for all entities.
I tried some of my own trained models, plus I tried to run the recipe with the standard “en_core_web_lg-2.0.0” and “en_core_web_sm” models on an English text. The result is always score=1.0.

Am I doing anything wrong in calculating the scores? Or is this expected?
If it is wrong, what can I do to get the model score of each entity prediction?

Thanks
Kasra

That code looks right, but there must be more, right? The snippet you pasted doesn’t show the part where you actually sum the scores and calculate the probabilities. Maybe that’s where the bug is?
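
For reference, that accumulation step would look roughly like this (same idea as the post you linked, reusing the beams from your snippet):

    from collections import defaultdict

    entity_scores = defaultdict(float)
    for score, ents in nlp.entity.moves.get_beam_parses(beams[0]):
        for start, end, label in ents:
            # Sum the scores of every beam parse that contains this candidate
            # entity, so each (start, end, label) gets its total probability mass.
            entity_scores[(start, end, label)] += score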

Hi!
Yes, you were right: one problem is that I was not adding up the scores. The other problem, however, is that if the doc is too small, the score returned by nlp.entity.moves.get_beam_parses(beams[0]) is always 1.0.

So here is my solution:

    def make_tasks(nlp, stream):
        """Add a 'spans' key to each example, with predicted entities."""
        texts = []
        texts2 = []
        for eg in stream:
            texts.append((eg["text"], eg))
            texts2.append(eg["text"])
        docs = list(nlp.pipe(texts2, disable=['ner']))
        (beams, _) = nlp.entity.beam_parse(docs, beam_width=16, beam_density=0.0001)
        entity_scores = defaultdict(float)
        for doc, beam in zip(docs, beams):
            for score, ents in nlp.entity.moves.get_beam_parses(beam):
                for start, end, label in ents:
                    if label in labels:
                        entity_scores[(doc.text, start, end, label)] += score

        for doc, eg in nlp.pipe(texts, as_tuples=True):
            task = copy.deepcopy(eg)
            spans = []

            for ent in doc.ents:
                if labels and ent.label_ not in labels:
                    continue

                spans.append(
                    {
                        "token_start": ent.start,
                        "token_end": ent.end - 1,
                        "start": ent.start_char,
                        "end": ent.end_char,
                        "text": ent.text,
                        "label": "{} ({})".format(ent.label_, round(entity_scores[(doc.text, ent.start, ent.end, ent.label_)], 3)),
                        "source": spacy_model,
                        "input_hash": eg[INPUT_HASH_ATTR],
                    }
                )
            task["spans"] = spans
            task = set_hashes(task)
            yield task

This code works, but it has the drawback that with big input text files (more than 2-3 MB) the annotation process takes too long to start.

This, instead, is the solution that always returns score=1.0:

    def make_tasks(nlp, stream):
        """Add a 'spans' key to each example, with predicted entities."""
        texts = ((eg["text"], eg) for eg in stream)
        for doc, eg in nlp.pipe(texts, as_tuples=True):
            task = copy.deepcopy(eg)
            spans = []

            (beams, _) = nlp.entity.beam_parse([doc], beam_width=16, beam_density=0.0001)
            entity_scores = defaultdict(float)
            for score, ents in nlp.entity.moves.get_beam_parses(beams[0]):
                for start, end, label in ents:
                    if label in labels:
                        entity_scores[(start, end, label)] += score

            for ent in doc.ents:
                if labels and ent.label_ not in labels:
                    continue
                spans.append(
                    {
                        "token_start": ent.start,
                        "token_end": ent.end - 1,
                        "start": ent.start_char,
                        "end": ent.end_char,
                        "text": ent.text,
                        "label": "{} ({})".format(ent.label_, round(entity_scores[(ent.start, ent.end, ent.label_)], 3)),
                        "source": spacy_model,
                        "input_hash": eg[INPUT_HASH_ATTR],
                    }
                )
            task["spans"] = spans
            task = set_hashes(task)
            yield task

Why do I always get score=1 when running nlp.entity.beam_parse(...) on each doc individually? Also, do you maybe have a better suggestion to achieve the same result?

Thanks,
Kasra

It looks like you’ve got your loop scopes wrong in the first solution? I’m pretty sure it doesn’t make sense to accumulate the entity scores across all the documents. You want to accumulate over one beam, not across the whole corpus. This is also why it’s taking so long to start: you’re parsing the entire dataset to get those scores before you can yield out a single example.

I haven’t run this code, but this looks more correct to me:


    def make_tasks(nlp, stream):
        """Add a 'spans' key to each example, with predicted entities."""
        text_and_eg = ((eg["text"], eg) for eg in stream)
        for doc, eg in nlp.pipe(text_and_eg, disable=["ner"], as_tuples=True):
            eg = copy.deepcopy(eg)
            # This gives up on parsing a batch of documents, for convenience. If we wanted, we could
            # loop over a batch of examples, and handle the batching that way. But this should be fine.
            beam = nlp.entity.beam_parse([doc], beam_width=16, beam_density=0.0001)[0][0]
            # Count the entity scores.
            entity_scores = defaultdict(float)
            for score, ents in nlp.entity.moves.get_beam_parses(beam):
                for start, end, label in ents:
                    if label in labels:
                        entity_scores[(doc.text, start, end, label)] += score
            spans = []
            for ent in doc.ents:
                if labels and ent.label_ not in labels:
                    continue
                spans.append(
                    {
                        "token_start": ent.start,
                        "token_end": ent.end - 1,
                        "start": ent.start_char,
                        "end": ent.end_char,
                        "text": ent.text,
                        "label": "{} ({})".format(ent.label_, round(entity_scores[(doc.text, ent.start, ent.end, ent.label_)], 3)),
                        "source": spacy_model,
                        "input_hash": eg[INPUT_HASH_ATTR],
                    }
                )
            eg["spans"] = spans
            eg = set_hashes(eg)
            yield eg

Hi! Your solution is almost correct. The reason why I was always getting score = 1.0 is that I was not disabling the ner in the call to nlp.pipe(...), which (I guess) is rounding the scores to 1.
So, the only way I could get it to work is the following:

    def make_tasks(nlp, stream):
        """Add a 'spans' key to each example, with predicted entities."""
        text_and_eg = ((eg["text"], eg) for eg in stream)
        for doc, eg in nlp.pipe(text_and_eg, disable=["ner"], as_tuples=True):
            eg = copy.deepcopy(eg)
            # This gives up on parsing a batch of documents, for convenience. If we wanted, we could
            # loop over a batch of examples, and handle the batching that way. But this should be fine.
            beam = nlp.entity.beam_parse([doc], beam_width=16, beam_density=0.0001)[0][0]
            # Count the entity scores.
            entity_scores = defaultdict(float)
            for score, ents in nlp.entity.moves.get_beam_parses(beam):
                for start, end, label in ents:
                    if label in labels:
                        entity_scores[(doc.text, start, end, label)] += score
            spans = []
            # Re-parse with the full pipeline (ner enabled) to get doc.ents.
            doc = nlp(eg['text'])
            for ent in doc.ents:
                if labels and ent.label_ not in labels:
                    continue
                spans.append(
                    {
                        "token_start": ent.start,
                        "token_end": ent.end - 1,
                        "start": ent.start_char,
                        "end": ent.end_char,
                        "text": ent.text,
                        "label": "{} ({})".format(ent.label_, round(entity_scores[(doc.text, ent.start, ent.end, ent.label_)], 3)),
                        "source": spacy_model,
                        "input_hash": eg[INPUT_HASH_ATTR],
                    }
                )
            eg["spans"] = spans
            eg = set_hashes(eg)
            yield eg

Basically, I added disable=['ner'] and then added the line doc = nlp(eg['text']) with the ner activated before looping through the document entities. I had to do this because the ents coming from nlp.entity.moves.get_beam_parses(beam) do not contain all the fields needed by the span.

Thanks for the help.
Kasra

Just an additional question: if I add the score in parentheses for each label in the ner.make-gold recipe as we discussed above, will this affect the logic of the batch-train recipe? I saw that the labels with the scores in parentheses are stored as new labels, and I am wondering whether this will lead ner.batch-train to consider PERSON (0.5) different from, for example, PERSON (1.0).

Thanks,
Kasra

Yes, I was actually going to mention this before (sorry if I didn’t!). I’d suggest maybe including a property like "orig_label" with each span and then converting that back before you train.

It’s a bit hacky, but should work. I’ll think about ways to make it a bit easier to do stuff like this out-of-the-box in the future.
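
For example, a quick sketch of what that conversion could look like before training (restore_labels is just an illustrative helper name, assuming each span stores the original label under "orig_label"):

    def restore_labels(examples):
        """Put the original label back on each span before training."""
        for eg in examples:
            for span in eg.get("spans", []):
                if "orig_label" in span:
                    span["label"] = span["orig_label"]
        return examples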

ok thanks! I will do that for now :slight_smile:
