How to compute tag SCORE in pos.teach?


(Jeff) #1

With pos.teach, you show a score in the bottom right corner that indicates the confidence in the highlighted tag. Could you let me know how this is computed?

I tried:

doc[0].prob

since the docs describe Token.prob as a log probability, but the result is very different from the score in pos.teach.

I’d like to use this score for my own recipes.

(Matthew Honnibal) #2

@jeff doc[0].prob is the unigram probability of the word, i.e. its frequency. To get the probabilities from spaCy’s tagger, you would do:

doc = nlp.make_doc(eg['text'])                 # tokenize only, without running the pipeline
tagger = nlp.get_pipe('tagger')
token_vectors = tagger.model.tok2vec([doc])    # encode the tokens
scores = tagger.model.softmax(token_vectors)   # per-token probabilities over the tag set

You can read more details in the Tagger implementation here:
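To turn those score rows into a single confidence per token, like the number shown in pos.teach, you can take the highest probability in each row. A minimal sketch with NumPy, using a made-up probability matrix and tag list in place of real tagger output:

```python
import numpy as np

# Hypothetical values: one row of probabilities per token, one column per tag.
# Real rows would come from tagger.model.softmax(...) as above.
tags = ['NOUN', 'VERB', 'ADJ']
probs = np.array([
    [0.10, 0.85, 0.05],   # token 0: most likely a VERB
    [0.70, 0.20, 0.10],   # token 1: most likely a NOUN
])

for i, row in enumerate(probs):
    best = row.argmax()
    print(f"token {i}: {tags[best]} ({row[best]:.2f})")
# → token 0: VERB (0.85)
# → token 1: NOUN (0.70)
```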

(Jeff) #3

Perfect. Just what I needed.

(Jeff) #4

@honnibal, could you please provide a similar code chunk to get the probabilities for the dep labels of the parser?

I’ve been looking through the spaCy code but haven’t been able to figure it out.

(Matthew Honnibal) #5

Getting probabilities out of the parser is much harder, because of the way the model’s objective works. There have been a few discussions about this on the issue tracker. I think this thread is the most comprehensive on the issue:
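Roughly, the difficulty is that a transition-based parser scores actions (shift, reduce, arc with a label) at each step, not dependency labels directly, so no single arc ever gets a normalized probability. One crude approximation sometimes discussed is to multiply the softmaxed score of the chosen action at each step of the greedy parse. A hypothetical sketch with made-up action scores, not real parser internals:

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

# Hypothetical raw action scores at each step of a greedy parse.
# Columns might correspond to actions like SHIFT, LEFT-ARC(nsubj), RIGHT-ARC(dobj), ...
step_scores = [
    np.array([2.0, 0.5, 0.1]),
    np.array([0.3, 3.0, 0.2]),
]

# Probability mass the model assigned to the chosen (argmax) action at each step.
chosen_probs = [softmax(s).max() for s in step_scores]

# A rough "parse confidence": product of the per-step action probabilities.
parse_conf = float(np.prod(chosen_probs))
print(round(parse_conf, 3))
```

This is only a heuristic: it conflates the probability of the action sequence with the probability of any individual label, which is exactly why the issue-tracker discussions below exist.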