How to compute tag SCORE in pos.teach?

pos
spacy
solved

(Jeff) #1

With pos.teach, you show a score in the bottom right corner that indicates the confidence in the highlighted tag. Could you let me know how this is computed?

I tried:

math.exp(doc[0].prob)

since the docs describe Token.prob as a log probability, but the result is very different from the score in pos.teach.

I’d like to use this score for my own recipes.


(Matthew Honnibal) #2

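@jeff doc[0].prob is the log unigram probability of the word, i.e. a smoothed estimate of how frequent the word is. It is a property of the word itself and has nothing to do with the tagger's prediction, which is why it doesn't match the score in pos.teach. To get the probabilities from spaCy’s tagger, you would do: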

 
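# `nlp` is a loaded spaCy pipeline; `eg` is a task dict with a 'text' key
doc = nlp.make_doc(eg['text'])
tagger = nlp.get_pipe('tagger')
# Run the tagger's two layers by hand: token vectors first, then the softmax over tags
token_vectors = tagger.model.tok2vec([doc])
scores = tagger.model.softmax(token_vectors)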

You can read more details in the Tagger implementation here: https://github.com/explosion/spaCy/blob/master/spacy/pipeline.pyx
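Once you have the scores, turning them into a per-token confidence is mostly a matter of indexing. Here is a minimal sketch, assuming each token gets one row of probabilities with one column per tag. The tag names, the numbers and the helper best_tag_with_confidence are made up for illustration and are not part of spaCy or Prodigy; I'm also not claiming this is exactly the value pos.teach displays.

```python
# Hypothetical stand-in for one doc's tagger output: one row per token,
# one probability per tag. Real rows would come from the softmax layer.
tag_names = ["NOUN", "VERB", "ADJ"]
token_probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.85, 0.05],
]

def best_tag_with_confidence(prob_row, names):
    """Return the highest-scoring tag and its probability for one token."""
    best_index = max(range(len(prob_row)), key=lambda i: prob_row[i])
    return names[best_index], prob_row[best_index]

# One (tag, confidence) pair per token
results = [best_tag_with_confidence(row, tag_names) for row in token_probs]
```

In a custom recipe you could attach the confidence to each task and sort or filter on it, for example to show the least certain tags first.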


(Jeff) #3

Perfect. Just what I needed.


(Jeff) #4

@honnibal, could you please provide a similar code chunk to get the probabilities for the dep labels of the parser?

I’ve been looking through the spaCy code but haven’t been able to figure it out.


(Matthew Honnibal) #5

Getting probabilities out of the parser is much harder, because of the way the model’s objective works. There have been a few discussions about this on the issue tracker: https://github.com/explosion/spaCy/issues?utf8=✓&q=probabilities . I think this thread is the most comprehensive about the issue: https://github.com/explosion/spaCy/issues/881