How are answers from active learning used in training?

I have what I think is a basic question (if this has already been addressed on the forum, please point me to it; poking around, I couldn't find an answer).

The annotations from active learning have a label and an answer. What role does the answer play in the supervision of the learner that Prodigy invokes when it trains? I had assumed that all rejections are thrown away for training, but I'm not sure, given the accept/reject matrix that Prodigy displays after training.

Here is how I invoke the teach and train phases:

prodigy textcat.teach direction en_vectors_web_lg data/filtered.jsonl -t2v vectors/lmao_vectors.bin -pt patterns/direction_patterns.jsonl -l BUY,SELL

prodigy textcat.batch-train direction en_vectors_web_lg -o models/direction_cat_model -n 10 -t2v vectors/lmao_vectors.bin -l SELL,BUY

(FYI, the lmao_vectors.bin file is the result of training on our corpus using en_vectors_web_lg.)

and here are the results:

Loaded model en_vectors_web_lg
Using 20% of examples (471) for evaluation
Using 100% of remaining examples (1886) for training
Dropout: 0.2  Batch size: 10  Iterations: 10  

#            LOSS         F-SCORE      ACCURACY  
01           0.243        0.967        0.955                                    
02           0.064        0.972        0.962                                    
03           0.038        0.966        0.953                                    
04           0.031        0.971        0.960                                    
05           0.022        0.971        0.960                                    
06           0.031        0.966        0.953                                    
07           0.026        0.967        0.955                                    
08           0.048        0.972        0.962                                    
09           0.042        0.969        0.958                                    
10           0.038        0.967        0.955                                    

accept   accept   295
accept   reject   1  
reject   reject   136
reject   accept   16 

Correct     431
Incorrect   17

Baseline    0.35              
Precision   1.00              
Recall      0.95              
F-score     0.97              
Accuracy    0.96

In that context, how is the reported baseline calculated?

Much thanks

When you collect binary annotations, you'll only have incomplete information about the text – but Prodigy can still use that information to update the model accordingly. spaCy's models were specifically designed to be updated with sparse annotations (which isn't true of all NLP model implementations). The data you train on doesn't have to be a complete gold standard: we can still update the model and move it in the right direction, even if all we know is "these tokens are not an ORG, but could be anything else" or "this text is not about buying, but could be about any of the other labels".

I'm showing some examples in my slides here: https://speakerdeck.com/inesmontani/belgium-nlp-meetup-rapid-nlp-annotation-through-binary-decisions-pattern-bootstrapping-and-active-learning?slide=12

It basically works like this: To update the model, we need the gradient of the loss function, which is calculated from the predicted distribution and the target distribution. If we don't know the full target distribution, and only that some labels are wrong, we can assign those a probability of 0, and then split the rest proportionally. So if we know that label A is wrong, and nothing about labels B and C, but the model predicted a much higher probability for B, we can reflect this in the update we make. That's essentially how Prodigy updates the model with binary annotations.
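
To make that concrete, here's a minimal sketch of that proportional split in Python. The numbers, the OTHER label and the rejection_target helper are made up for illustration; this shows the idea, not Prodigy's internal code:

import numpy as np

def rejection_target(pred, labels, rejected):
    # Build a target distribution from a binary "reject" answer:
    # the rejected label gets probability 0.0 and the remaining
    # probability mass is split across the other labels in
    # proportion to the model's own predictions.
    target = np.asarray(pred, dtype=float).copy()
    target[labels.index(rejected)] = 0.0
    target /= target.sum()
    return target

labels = ["BUY", "SELL", "OTHER"]     # hypothetical label set
pred = np.array([0.7, 0.2, 0.1])      # model is fairly sure it's BUY

# Annotator rejects BUY: BUY drops to 0, SELL/OTHER share the mass 2:1
target = rejection_target(pred, labels, "BUY")
# target is approx. [0.0, 0.667, 0.333]

# For a softmax + cross-entropy loss, the gradient with respect to the
# scores is prediction minus target; this is what drives the update.
gradient = pred - target
# gradient is approx. [0.7, -0.467, -0.233]

So the rejected label is pushed down, while everything the annotation says nothing about is left roughly where the model put it, just renormalised.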

Here's an NER example that shows the calculation:
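
As a sketch of that calculation, with made-up entity labels and scores and reusing the rejection_target helper from above:

labels = ["ORG", "PERSON", "GPE", "PRODUCT"]   # hypothetical entity labels
pred = np.array([0.5, 0.3, 0.15, 0.05])        # model leans towards ORG

# The annotator rejects ORG for this span
target = rejection_target(pred, labels, "ORG")
# target is approx. [0.0, 0.6, 0.3, 0.1]

gradient = pred - target
# gradient is approx. [0.5, -0.3, -0.15, -0.05]
# ORG is pushed down; the other labels are nudged up, most of all
# PERSON, which the model already considered the runner-up.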

Hi Ines, thanks for your prompt reply (as always!) and the presentation...very clear.