prefer_uncertain in ner.teach?

First of all, congratulations on a great tool! It’s really going to help me.

I’ve been trying to figure out how to get prefer_uncertain working in ner.teach. It looks like it’s in the recipe code, but I’m still getting a lot of score = 1.0 entities coming up during training. Is this a bug, or do I have something misconfigured?

Also, a small documentation issue: The very last code block in the NER workflow has text classification code instead of NER code (here).

Thanks! We’re glad you like it – we’re pretty happy with how it’s shaping up! That said, there’s still lots of tweaking and testing to do, so we’re glad to have you testing it :slight_smile:

The sorting dynamics are something that can particularly benefit from more “playtesting”. It’s also possible there are underlying bugs in the spaCy beam-search code that backs the NER probability estimates. I’ll explain a little about how this works so you can do some digging and figure out where the problem might be in your specific case. That will hopefully reveal which knob to twiddle.

To get the confidence of the entities, the sentence is analysed using beam-search to produce k different parses – each parse being a list of (start, end, label) triples, with each triple describing an entity. To figure out the probability of a particular (start, end, label) triple, we normalize the parse scores and sum the scores of each parse that contains the entity. So if an entity is in 15/16 parses, and the only parse it’s not in has probability 0.01, we say the entity has probability 0.99. If the entity is in 9 parses whose scores sum to 0.73, we say its probability is 0.73.
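To make that concrete, here’s a toy version of the calculation. This is a hypothetical helper, not the actual spaCy/Prodigy code – the `entity_probs` function and the toy parse scores are made up for illustration:

```python
from collections import defaultdict

def entity_probs(parses):
    """Each parse is (score, [(start, end, label), ...]).
    Normalize the parse scores, then give each entity triple
    the summed mass of every parse that contains it."""
    total = sum(score for score, _ in parses)
    probs = defaultdict(float)
    for score, ents in parses:
        for ent in ents:
            probs[ent] += score / total
    return dict(probs)

# Toy beam of 4 parses: the PERSON entity appears in 3 of them,
# so its probability is the summed normalized score of those 3.
parses = [
    (0.5,  [(0, 2, "PERSON")]),
    (0.3,  [(0, 2, "PERSON"), (3, 5, "ORG")]),
    (0.15, [(0, 2, "PERSON")]),
    (0.05, []),
]
print(entity_probs(parses))
# → {(0, 2, 'PERSON'): 0.95, (3, 5, 'ORG'): 0.3}
```

Note that an entity appearing in every parse gets probability 1.0 – which is exactly the score you’re seeing.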

An entity that’s in all parses has probability 1.0. So, one possibility is that the model simply isn’t returning much diversity on your data, at all. Here’s how you could check that:

docs = list(nlp.pipe(texts, disable=['ner']))
beams = nlp.entity.beam_parse(docs, [d.tensor for d in docs], beam_width=32, beam_density=0.001)
for doc, beam in zip(docs, beams):
    for score, parse in nlp.entity.moves.get_beam_parses(beam):
        print(score, [(label, doc[start:end])
                      for start, end, label in parse])

The relevant bits of spaCy we’re using can be found here.

The two parameters to adjust here are the beam_width and the beam_density. The code above asks for at most 32 analyses, but prunes the beam so that the lowest-ranking analysis has a score of at least 0.1% of the top one.
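To see how those two knobs interact, here’s a toy sketch of the pruning logic (not spaCy’s actual implementation – just the idea):

```python
def prune_beam(scores, beam_width=32, beam_density=0.001):
    """Keep at most beam_width analyses, then drop any whose
    score falls below beam_density * the top score."""
    scores = sorted(scores, reverse=True)[:beam_width]
    cutoff = beam_density * scores[0]
    return [s for s in scores if s >= cutoff]

# Cutoff is 0.001 * 5.0 = 0.005, so the 0.0001 analysis is pruned.
print(prune_beam([5.0, 1.0, 0.01, 0.0001]))
# → [5.0, 1.0, 0.01]
```

Widening the beam or lowering the density threshold keeps more low-scoring analyses around, which gives the probability estimates more diversity.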

If the model’s probabilities are looking good, then the problem really is in the sorting dynamics. One parameter you could adjust for that is the bias parameter. A bias of 0.0 means that the uncertainty is used as the priority to sort the examples. A negative bias will skew towards lower scores, and a positive bias will skew towards higher scores.
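Here’s a sketch of how a biased uncertainty priority could work – this is a hypothetical illustration of the idea, not Prodigy’s actual sorter code:

```python
def uncertainty_priority(score, bias=0.0):
    """Priority for sorting examples. With bias=0.0 the priority
    peaks at score=0.5 (maximum uncertainty); a positive bias
    shifts the peak towards high-scoring entities, a negative
    bias towards low-scoring ones. Hypothetical sketch only."""
    return 1.0 - abs((score - bias) - 0.5) * 2

# With bias=0.0, a score=1.0 entity gets priority 0.0,
# so it should sort to the back of the queue.
examples = [0.1, 0.5, 0.9, 1.0]
ranked = sorted(examples, key=uncertainty_priority, reverse=True)
print(ranked)
# → [0.5, 0.1, 0.9, 1.0]
```

If score = 1.0 entities are coming up first despite this kind of priority, that points back at the probability estimates themselves rather than the sorting.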

Thanks! Should be fixed now.


That’s really helpful and interesting. Thank you! I’ll play around with it over the next few days and see if I can get it optimized for what I’m trying to do.