TextCat outcome depends on words that are not in the vocabulary

All of spaCy's models use hash embeddings: Can you explain how exactly HashEmbed works?

This embedding strategy avoids having to fix a vocabulary before training starts. Each word is hashed to rows of a fixed-size embedding table, so there is no specific bound on the number of distinct words the model can learn from. Words that show up for the first time later on still get a vector and will continue to influence the training.
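To make the idea concrete, here is a minimal pure-Python sketch of a hash embedding. It is not thinc's actual HashEmbed code: the class name `ToyHashEmbed`, the table size, the use of SHA-256 and the plain SGD update are all assumptions chosen for illustration. The point is only that any string, seen before or not, hashes to rows of a fixed-size table, so it gets a vector and can receive updates. The sketch uses several hash seeds per word and sums the rows, which makes it unlikely that two words collide on every row.

```python
import hashlib
import random


class ToyHashEmbed:
    """Illustrative hash embedding (hypothetical, not the thinc implementation)."""

    def __init__(self, n_rows=1000, width=8, n_seeds=4, rng_seed=0):
        rng = random.Random(rng_seed)
        self.n_rows = n_rows
        self.width = width
        self.n_seeds = n_seeds
        # Fixed-size table: its size does not depend on how many words exist.
        self.table = [
            [rng.uniform(-0.1, 0.1) for _ in range(width)] for _ in range(n_rows)
        ]

    def rows_for(self, word):
        """Map a word to one table row per seed by hashing the string."""
        rows = []
        for seed in range(self.n_seeds):
            digest = hashlib.sha256(f"{seed}:{word}".encode("utf8")).digest()
            rows.append(int.from_bytes(digest[:8], "little") % self.n_rows)
        return rows

    def embed(self, word):
        """Sum the rows selected for this word. Works for any string."""
        vector = [0.0] * self.width
        for row in self.rows_for(word):
            for i in range(self.width):
                vector[i] += self.table[row][i]
        return vector

    def update(self, word, grad, lr=0.1):
        """Plain SGD step on every row the word hashes to."""
        for row in self.rows_for(word):
            for i in range(self.width):
                self.table[row][i] -= lr * grad[i]


emb = ToyHashEmbed(n_rows=50, width=4)
vector = emb.embed("some-word-never-seen-before")  # no vocabulary lookup needed
emb.update("some-word-never-seen-before", [1.0] * 4)  # and it can still be trained
```

Because there is no word-to-index dictionary, nothing is ever "out of vocabulary" in the usual sense. The trade-off is that unrelated words can share rows, so an update for one word can slightly shift the vector of another.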

Incidentally, even without hash embeddings most models would learn from unseen words: pretty much all models have some vector for unknown words that is updated during training. Most models also have subword features, so unknown words will still cause the model to be updated.
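As a rough illustration of those two mechanisms, here is a small sketch with a fixed vocabulary. The `<unk>` token name, the helper names and the particular subword features (first character, last three characters, a simple word shape) are assumptions for the example, not a description of any specific model.

```python
UNK = "<unk>"


def build_vocab(words):
    """Fixed vocabulary with one shared row reserved for unknown words."""
    return {word: index for index, word in enumerate([UNK] + sorted(set(words)))}


def lookup(vocab, word):
    """Unknown words fall back to the shared <unk> row, which is still trained."""
    return vocab.get(word, vocab[UNK])


def word_shape(word):
    """Coarse shape: uppercase -> X, lowercase -> x, digit -> d, rest unchanged."""
    return "".join(
        "X" if c.isupper() else "x" if c.islower() else "d" if c.isdigit() else c
        for c in word
    )


def subword_features(word):
    """Features that an unseen word can share with words seen in training."""
    return {"prefix": word[:1], "suffix": word[-3:], "shape": word_shape(word)}


vocab = build_vocab(["running", "walked", "Berlin"])
unk_row = lookup(vocab, "jogging")  # not in the vocabulary, maps to <unk>
shared = subword_features("jogging")["suffix"] == subword_features("running")["suffix"]
```

In both cases a word that is missing from the vocabulary still touches parameters: the shared `<unk>` row, and whatever embeddings are attached to its prefix, suffix and shape.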