We are using an entity annotated with Prodigy in your similarity function to suggest to a user the most appropriate product code out of 4.5k codes, i.e. the code whose description best fits the entity.
Before applying the similarity function, we manipulate the word frequencies of the 4.5k code descriptions to surface the most important words. This gives us a huge column in which each row has to be run through nlp, which takes about 14 minutes.
Can you recommend a more efficient way of doing this?
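
For illustration, the word-frequency step mentioned above could look roughly like the sketch below. It assumes an inverse-document-frequency style weighting, which may differ from the actual weighting used; `top_words`, the parameter `k`, and the sample descriptions are hypothetical and only there to make the example runnable.

```python
import math
from collections import Counter


def top_words(descriptions, k=3):
    """Keep the k rarest (highest-IDF) words of each description."""
    tokenized = [d.lower().split() for d in descriptions]
    # Document frequency: in how many descriptions does each word occur?
    df = Counter(word for tokens in tokenized for word in set(tokens))
    n = len(descriptions)
    reduced = []
    for tokens in tokenized:
        unique = list(dict.fromkeys(tokens))  # de-duplicate, keep word order
        # Highest IDF first; sorted() is stable, so ties keep their word order.
        ranked = sorted(unique, key=lambda w: -math.log(n / df[w]))
        keep = set(ranked[:k])
        reduced.append(" ".join(w for w in unique if w in keep))
    return reduced


# Made-up descriptions standing in for the 4.5k code descriptions.
descriptions = [
    "steel bolt for machine",
    "steel nut for machine",
    "rubber hose for pump",
]
print(top_words(descriptions, k=2))
# → ['steel bolt', 'steel nut', 'rubber hose']
```

The reduced strings produced this way are what then have to be run through nlp one by one, which is the slow part.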