I have what I think is a basic question (if this has already been addressed on the forum, please point me to it; I poked around but couldn't find an answer).
The annotations from active learning have a label and an answer. What role does the answer play in the supervision of the learner that Prodigy invokes when it trains? I had assumed that all rejections are thrown away for training, but I'm not sure, given the accept/reject matrix that Prodigy displays after training.
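To make the question concrete, this is how I've been tallying the answers per label from a db-out export of the dataset (just a quick script of mine; the file path is a placeholder and I'm assuming the usual "label"/"answer" fields in the exported records):

import json
from collections import Counter

# Count (label, answer) pairs in a db-out export of the dataset.
# The path is a placeholder for my own export file.
counts = Counter()
with open("data/direction_export.jsonl", encoding="utf8") as f:
    for line in f:
        eg = json.loads(line)
        counts[(eg.get("label"), eg.get("answer"))] += 1

for (label, answer), n in counts.most_common():
    print(label, answer, n)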
Here is how I invoke the teach and train phases:
prodigy textcat.teach direction en_vectors_web_lg data/filtered.jsonl -t2v vectors/lmao_vectors.bin -pt patterns/direction_patterns.jsonl -l BUY,SELL
prodigy textcat.batch-train direction en_vectors_web_lg -o models/direction_cat_model -n 10 -t2v vectors/lmao_vectors.bin -l SELL,BUY
(FYI, the lmao_vectors.bin file is the result of training on the corpus using en_vectors_web_lg.)
and here are the results:
Loaded model en_vectors_web_lg
Using 20% of examples (471) for evaluation
Using 100% of remaining examples (1886) for training
Dropout: 0.2 Batch size: 10 Iterations: 10
#    LOSS    F-SCORE   ACCURACY
01   0.243   0.967     0.955
02   0.064   0.972     0.962
03   0.038   0.966     0.953
04   0.031   0.971     0.960
05   0.022   0.971     0.960
06   0.031   0.966     0.953
07   0.026   0.967     0.955
08   0.048   0.972     0.962
09   0.042   0.969     0.958
10   0.038   0.967     0.955
accept   accept   295
accept   reject     1
reject   reject   136
reject   accept    16

Correct     431
Incorrect    17
Baseline    0.35
Precision   1.00
Recall      0.95
F-score     0.97
Accuracy    0.96
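For what it's worth, if I read the first column of that matrix as the model's prediction and the second as my annotation answer, the reported metrics line up with the counts, so I think I follow that part:

# Sanity check on the matrix above, reading rows as (prediction, answer):
tp, fp, tn, fn = 295, 1, 136, 16
precision = tp / (tp + fp)                               # 295/296 ≈ 1.00
recall = tp / (tp + fn)                                  # 295/311 ≈ 0.95
f_score = 2 * precision * recall / (precision + recall)  # ≈ 0.97
accuracy = (tp + tn) / (tp + fp + tn + fn)               # 431/448 ≈ 0.96
print(precision, recall, f_score, accuracy)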
In that context, how is the reported baseline calculated?
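My naive guess was something like a majority-answer baseline over the evaluation set, i.e. the accuracy you'd get by always giving the most common answer, along these lines (purely my guess at the formula, not anything from Prodigy's code). That doesn't obviously produce 0.35 from the counts above, so I suspect the real calculation is different:

from collections import Counter

# My guess at "Baseline": accuracy of always predicting the most
# common answer in the evaluation set (counts taken from the matrix above).
def majority_baseline(answers):
    counts = Counter(answers)
    return max(counts.values()) / sum(counts.values())

print(majority_baseline(["accept"] * 296 + ["reject"] * 152))  # ≈ 0.66, not 0.35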
Many thanks!