Currently I use spaCy only for lemmatization/parsing of colloquial language, without the word vector capabilities. I have a problem with the POS tagging. My code
tokens = [tok.lemma_.lower().strip() for tok in doc if tok.pos_ != 'PRON']
recognizes “my”, “your”, etc. as PRON, but not “mine” or “yours”. My current hack is to replace all occurrences of “mine” with “my”, but it’s hardly elegant. (The word “mine” never occurs in the sense of “the explosive device” in my document.)
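A less invasive version of that hack might be to leave the text alone and filter on a small word list instead. This is only a minimal sketch: the names EXTRA_PRONOUNS and content_lemmas are made up for illustration, the word list is hand-picked and surely incomplete, and the demo uses stand-in tokens rather than a real spaCy doc so that it runs without a model.

```python
from types import SimpleNamespace

# Hypothetical, hand-picked list of possessive forms to drop
# regardless of the POS tag they were given.
EXTRA_PRONOUNS = {"mine", "yours", "hers", "ours", "theirs"}


def content_lemmas(doc):
    """Return lowercased lemmas, skipping PRON tokens and the listed forms."""
    return [
        tok.lemma_.lower().strip()
        for tok in doc
        if tok.pos_ != "PRON" and tok.lower_ not in EXTRA_PRONOUNS
    ]


# Stand-in tokens (an assumption for the demo); tokens in a real spaCy doc
# expose the same pos_, lemma_ and lower_ attributes.
demo = [
    SimpleNamespace(lower_="my", lemma_="my", pos_="PRON"),
    SimpleNamespace(lower_="book", lemma_="book", pos_="NOUN"),
    SimpleNamespace(lower_="is", lemma_="be", pos_="AUX"),
    SimpleNamespace(lower_="mine", lemma_="mine", pos_="NOUN"),  # mistagged
]

print(content_lemmas(demo))  # ['book', 'be']
```

It is still hardcoding, but the word list lives in one place and the original text stays untouched.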
Any suggestions, or should I just keep hardcoding?
Thanks,
Andreas