Problem with annotation

Hello,
Could you please help me with my annotation problem? I annotated 3 classes with "prodigy textcat.manual id_dataset filename.txt --loader TXT --label Positive", which produces records like:
{"text":"...my text ....","_input_hash":173569426,"_task_hash":-2060165333,"label":"Positive","_session_id":null,"_view_id":"classification","answer":"accept"}
I also added Negative and Neutral examples with the same command. When I started to train, I got wrong results: with textcat.batch-train, accuracy is 0 with the -E flag and 1 without it. I understand that something is wrong with my annotation, but I do not know what. I loaded plain text.

I also tried with --label Positive,Negative,Neutral. Then the structure looks like:
{"text":"...text....","_input_hash":-1984147309,"_task_hash":297950317,"options":[{"id":"POSITIVE","text":"POSITIVE"},{"id":"NEGATIVE","text":"NEGATIVE"},{"id":"Neutral","text":"Neutral"}],"_session_id":null,"_view_id":"choice","accept":["NEGATIVE"],"answer":"accept"}
But this also gave an accuracy of 1. Could you please help me with it?
Thank you.

Hi! It looks like you're training from binary examples, but all your annotations have "answer": "accept"? Is that correct? If you're training from binary annotations, you typically want examples where the label applies (e.g. "label": "POSITIVE", "answer": "accept") and examples where the label doesn't apply (e.g. "label": "POSITIVE", "answer": "reject").

That's likely the problem here: the model seems to have learned to always predict that the binary decision is "accept", and since that's always the correct answer in your data, you end up with 100% accuracy.
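For illustration, a binary dataset would typically contain records along these lines (the texts here are placeholders, and the hash and session fields are left out):

{"text": "a clearly positive example ...", "label": "Positive", "answer": "accept"}
{"text": "an example where Positive does not apply ...", "label": "Positive", "answer": "reject"}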

(Btw, just a note: if you want the data to be consistent, make sure you use consistent capitalisation in the labels. Labels are case-sensitive, so you may end up with incompatible data if some examples are annotated Negative and others NEGATIVE.)
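If you do end up with mixed casing, one option is to normalise the labels in a db-out export before training. Here's a minimal sketch, assuming the export lives in a file called annotations.jsonl and that the mapping below matches your labels (both the filename and the mapping are just examples):

import json

# Map each lowercased label to one canonical spelling (example mapping, adjust to your labels)
CANONICAL = {"positive": "Positive", "negative": "Negative", "neutral": "Neutral"}

def fix(label):
    return CANONICAL.get(label.lower(), label)

with open("annotations.jsonl", encoding="utf8") as f_in, \
     open("annotations_fixed.jsonl", "w", encoding="utf8") as f_out:
    for line in f_in:
        eg = json.loads(line)
        if "label" in eg:                    # binary-style records
            eg["label"] = fix(eg["label"])
        if "accept" in eg:                   # choice-style records
            eg["accept"] = [fix(label) for label in eg["accept"]]
        for opt in eg.get("options", []):
            opt["id"] = fix(opt["id"])
            opt["text"] = fix(opt["text"])
        f_out.write(json.dumps(eg) + "\n")

The fixed file could then be imported into a fresh dataset with db-in.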

Thank you for your answer. I tried both (binary and selection from several options).
Interestingly, when I used a dataset with the flags --label Positive,Negative,Neutral -E but only annotated 2 classes (Positive + Negative), it worked fine. At least, I think it worked with 2 classes, the -E flag, and multiclass annotation.

But when I added a third class, Neutral, to the same dataset, I got strange output.

So it seems to be OK for 2 classes, but not for 3, even though I added it in the same way as the 2 before:
{"text":"....text.... .","_input_hash":1435220674,"_task_hash":1052675403,"options":[{"id":"Positive","text":"Positive"},{"id":"Negative","text":"Negative"},{"id":"Neutral","text":"Neutral"}],"_session_id":null,"_view_id":"choice","accept":["Neutral"],"answer":"accept"}
{"text":"-.....text.....","_input_hash":-1153493732,"_task_hash":-1670976934,"options":[{"id":"Positive","text":"Positive"},{"id":"Negative","text":"Negative"},{"id":"Neutral","text":"Neutral"}],"_session_id":null,"_view_id":"choice","accept":["Positive"],"answer":"accept"}
Classes were annotated with the command "prodigy textcat.manual datesetname neutral.json --label Positive,Negative,Neutral -E".
I am not sure how to add the first class correctly.

How exactly are you adding the other class? Are you re-annotating all your data from scratch, or are you just adding more annotations to the same dataset?

If you're adding to the same dataset, that could explain what's going on, because you'd essentially end up with some annotations with 2 labels and some annotations with 3 labels, which can lead to inconsistent results.
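One way to confirm this is to export the dataset with db-out and check whether all records share the same option set. A minimal sketch, assuming the export is saved as dataset.jsonl (the filename is hypothetical):

import json
from collections import Counter

option_sets = Counter()

with open("dataset.jsonl", encoding="utf8") as f:
    for line in f:
        eg = json.loads(line)
        # Collect the set of label options each example was annotated with
        ids = tuple(sorted(opt["id"] for opt in eg.get("options", [])))
        option_sets[ids] += 1

for ids, count in option_sets.items():
    print(count, "examples with options:", list(ids))
# More than one line of output means the dataset mixes different label sets.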

Thank you for your answer.
I added more to the same dataset.
But even when I started with Positive/Negative, I always wrote --label Positive,Negative,Neutral. At least, when I do db-out, I can see that the text from the first part also lists all 3 classes in its options, but the accept values are only Negative or Positive, e.g.:
[{"id":"Positive","text":"Positive"},{"id":"Negative","text":"Negative"},{"id":"Neutral","text":"Neutral"}],"_session_id":null,"_view_id":"choice","accept":["Negative"],"answer":"accept"}

I will try to annotate all three classes at once. Thank you.

It works after re-annotating as you suggested. Thank you for your help!