I am trying to use spaCy's Scorer to calculate precision, recall, and F-score for the predictions I got from my spaCy model. Using the code below, I load the trained spaCy model and the test data (rows of text) and predict the labels.
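    import pandas as pd
    import spacy

    # Load the custom-trained spaCy model and the test data (rows of text)
    sp_model = spacy.load('/Folder/CustommodeltrainedinSpacy/')
    test_data = pd.read_csv('/data.csv')
    data_target = test_data['Text_column']  # text column

    # One row per input text, one column per custom label
    df = pd.DataFrame(columns=['NAME', 'TAG'], index=range(len(data_target)))

    for i in range(len(data_target)):
        doc = sp_model(str(data_target[i]))
        for ent in doc.ents:
            if ent.label_ in df.columns:
                # If a label occurs more than once in a row, the last entity wins
                df.loc[i, ent.label_] = ent.text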
I annotated data for two custom labels, ‘NAME’ and ‘TAG’, and trained the model on it. The above snippet gives me a dataframe with one column per label, holding the text that was tagged.
       NAME      TAG
    0  John   Author
    1  Mike  Student
Now that I have the predictions from the trained model, how do I evaluate them using the Scorer?
    import spacy
    from spacy.gold import GoldParse
    from spacy.scorer import Scorer

    def evaluate(ner_model, examples):
        scorer = Scorer()
        for input_, annot in examples:
            # Tokenize only; I pass my trained model (sp_model) as ner_model
            doc_gold_text = ner_model.make_doc(input_)
            gold = GoldParse(doc_gold_text, entities=annot['entities'])
            pred_value = ner_model(input_)  # trained model's prediction on the input
            scorer.score(pred_value, gold)
        return scorer.scores
My test data is just a column of text, but here the ‘for’ loop iterates over ‘input_, annot’ pairs. In the ‘for’ loop where I do the prediction, I already create the ‘doc’ element with doc = sp_model(str(data_target[i])).
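For reference, the loop in the evaluate function unpacks each item of examples as a (text, annotations) pair, where annotations['entities'] holds (start_char, end_char, label) character offsets, the same layout as spaCy v2 training data. A minimal sketch of building such pairs from plain text rows, assuming hand-labelled values exist for each row (the rows and the make_example helper are hypothetical and only illustrate the shape):

```python
def make_example(text, labelled_values):
    """Build a (text, {"entities": [...]}) pair from known entity strings.

    labelled_values maps a label to the exact substring it covers.
    Hypothetical helper: it assumes each substring occurs in the text
    and uses the first occurrence.
    """
    entities = []
    for label, value in labelled_values.items():
        start = text.find(value)
        if start == -1:
            raise ValueError(f"{value!r} not found in {text!r}")
        # Character offsets: start inclusive, end exclusive
        entities.append((start, start + len(value), label))
    return text, {"entities": sorted(entities)}


# Made-up rows, not taken from the real test set.
examples = [
    make_example("John is the Author", {"NAME": "John", "TAG": "Author"}),
    make_example("Mike is a Student", {"NAME": "Mike", "TAG": "Student"}),
]
```

A list shaped like this could then be passed as the second argument of evaluate.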
Also, is it necessary for me to use the ‘GoldParse’ function to get the scores?
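To make the last question concrete: precision and recall compare predicted entities against reference (gold) entities, so some gold annotation seems unavoidable whichever API produces the numbers. A rough sketch of exact-match entity scoring in plain Python, independent of spaCy (the prf function and the spans are hypothetical, and this is not spaCy's own implementation):

```python
def prf(gold_spans, pred_spans):
    """Exact-match precision, recall and F1 over (start, end, label) spans."""
    gold, pred = set(gold_spans), set(pred_spans)
    true_pos = len(gold & pred)  # spans that match in offsets and label
    precision = true_pos / len(pred) if pred else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up spans: the NAME span matches, the TAG span has a wrong start offset.
gold = [(0, 4, "NAME"), (12, 18, "TAG")]
pred = [(0, 4, "NAME"), (8, 18, "TAG")]
p, r, f = prf(gold, pred)
```

Here one of two predicted spans is correct and one of two gold spans is found, so precision, recall and F1 all come out at 0.5.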