Attributes of trained model

cwix · January 30, 2022, 1:16pm

We are using the Span Annotator to train a model that recognizes medical concepts contained in unstructured medical documentation.

Given that the model vectorizes each entity based on a multi-token context window (4 tokens on each side of the entity, the default setting), and assuming that we are using a very large training corpus,

do the resulting nearest-neighbor vectors in the trained model possess some form of relatedness?

For example, would the vectors for the entities "heart attack" and "myocardial infarction" (these are synonyms) likely be found close to each other when compared by cosine similarity?

Thanks very much

C

ljvmiranda921 · February 2, 2022, 5:02am

Hi @cwix, welcome to Prodigy!

If you're using spaCy's entity recognizer, then yes, the vectors should possess some form of relatedness. You can look into spaCy's model architectures to see how they affect the word vectors and their similarity.
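
For the cosine-similarity check in the question, here is a minimal sketch using only the Python standard library. It assumes you have already extracted one vector per entity from your pipeline (in spaCy, `Span.vector` is one place such a vector can come from). The vectors and the `nearest_neighbors` helper below are hypothetical placeholders chosen only so the example runs; they say nothing about what your trained model will actually produce.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Convention used here: a zero vector has no direction, so report 0.
        return 0.0
    return dot / (norm_a * norm_b)


# Placeholder vectors (NOT real model output), one per entity text.
entity_vectors = {
    "heart attack": [0.9, 0.1, 0.3],
    "myocardial infarction": [0.8, 0.2, 0.35],
    "ankle sprain": [-0.2, 0.9, 0.1],
}


def nearest_neighbors(query, vectors, top_n=2):
    """Rank every other entity by cosine similarity to `query`."""
    scores = [
        (name, cosine_similarity(vectors[query], vec))
        for name, vec in vectors.items()
        if name != query
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]


if __name__ == "__main__":
    for name, score in nearest_neighbors("heart attack", entity_vectors):
        print(f"{name}: {score:.3f}")
```

Running the same ranking over vectors taken from your own trained pipeline is a quick way to see whether synonyms such as these actually end up near each other.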
