Working with languages not yet supported by spaCy

My first guess was that the spaCy model you’re loading hasn’t loaded the vectors correctly. If the vectors aren’t loaded, the similarity would be 0.0 for all words. I think the recipe would then loop over the whole vocabulary without ever finding a term to suggest. However, your script looks correct.
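
If you want to rule the vectors out completely, here’s a rough sketch of how you could check them. The helper name vector_report, the function check_model and the model path are just placeholders I made up; the only spaCy calls it relies on are spacy.load, vocab.vectors.shape and vocab.has_vector.

```python
def vector_report(vocab, probe_words):
    """Summarize whether a spaCy-style vocab carries word vectors.

    `vocab` is expected to behave like `nlp.vocab`: it needs a
    `vectors.shape` tuple and a `has_vector(word)` method.
    """
    n_rows, n_dims = vocab.vectors.shape
    # Words from your seed terms that have no vector at all
    missing = [word for word in probe_words if not vocab.has_vector(word)]
    return {"rows": n_rows, "dims": n_dims, "missing": missing}


def check_model(model_path, probe_words):
    """Load a model from a (hypothetical) path and print the report."""
    import spacy  # imported here so the helper above works without spaCy

    nlp = spacy.load(model_path)
    print(vector_report(nlp.vocab, probe_words))
```

You could call it as check_model("/path/to/your_model", ["term1", "term2"]) with a few of your own seed terms. If rows or dims comes back as 0, or all of your seed terms end up in missing, the vectors are the problem after all.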

I then had another look at the terms.teach recipe. It looks like it checks the vocabulary entries for is_lower and is_alpha. Could you check whether these attributes are set correctly in your vocabulary?
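
Something like the following should give you a quick overview of those attributes. It’s only a sketch: flag_report is a name I made up, and I’m assuming the recipe only keeps entries where both is_lower and is_alpha are true, which is how I read it but may not be the exact filter.

```python
def flag_report(lexemes):
    """Count how many vocabulary entries have is_lower / is_alpha set.

    `lexemes` can be any iterable of objects with `is_lower` and
    `is_alpha` attributes, for example `nlp.vocab` itself.
    """
    counts = {"total": 0, "is_lower": 0, "is_alpha": 0, "passes_both": 0}
    for lex in lexemes:
        counts["total"] += 1
        counts["is_lower"] += bool(lex.is_lower)
        counts["is_alpha"] += bool(lex.is_alpha)
        # Assumed filter: an entry is only a candidate if both flags are set
        counts["passes_both"] += bool(lex.is_lower and lex.is_alpha)
    return counts
```

With your model loaded as nlp, print(flag_report(nlp.vocab)) shows the totals. If passes_both is 0 or very small compared to total, the flags probably weren’t set when the vocabulary was created, and the recipe would have nothing to suggest.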

You should also be able to add some print statements in the recipe to help figure out what’s going wrong. If you want to see how the recipe is supposed to work, try it with some English terms using the en_core_web_lg model.