I am testing out the latest release (v1.14.4) and the new IAA metrics. I tried a few commands for a spans task and got the following errors. The dataset has been annotated by two annotators and contains multiple labels.
```
prodigy metric.iaa.span dataset:ner_task_v2 multilabel -l ACCOUNT,ACTIVITY,AMOUNT,BANK,CARDINAL,CURRENCY,DATE,FAC,FREQUENCY,GPE,LANGUAGE,ORDINAL,ORG,PERCENT,PERSON,STT_ERROR,VEHICLE

Using 17 label(s): ACCOUNT, ACTIVITY, AMOUNT, BANK, CARDINAL, CURRENCY, DATE,
FAC, FREQUENCY, GPE, LANGUAGE, ORDINAL, ORG, PERCENT, PERSON, STT_ERROR, VEHICLE
unrecognized arguments: multilabel
```
```
prodigy metric.iaa.doc dataset:ner_task_v2 multilabel -l ACCOUNT,ACTIVITY,AMOUNT,BANK,CARDINAL,CURRENCY,DATE,FAC,FREQUENCY,GPE,LANGUAGE,ORDINAL,ORG,PERCENT,PERSON,STT_ERROR,VEHICLE

ℹ Using 2 annotator IDs: ner_task_v2-annotator1, ner_task_v2-annotator2
✘ Requested labels: FREQUENCY, GPE, CARDINAL, FAC, BANK, STT_ERROR,
LANGUAGE, ACCOUNT, ORG, VEHICLE, AMOUNT, CURRENCY, ACTIVITY, PERCENT, PERSON,
DATE, ORDINAL were not found in the dataset. Found labels: .
```
```
prodigy metric.iaa.span dataset:ner_task_v2 -l ACCOUNT,ACTIVITY,AMOUNT,BANK,CARDINAL,CURRENCY,DATE,FAC,FREQUENCY,GPE,LANGUAGE,ORDINAL,ORG,PERCENT,PERSON,STT_ERROR,VEHICLE

Using 17 label(s): ACCOUNT, ACTIVITY, AMOUNT, BANK, CARDINAL, CURRENCY, DATE,
FAC, FREQUENCY, GPE, LANGUAGE, ORDINAL, ORG, PERCENT, PERSON, STT_ERROR, VEHICLE
ℹ Using 2 annotator IDs: ner_task_v2-annotator1, ner_task_v2-annotator2
✘ Requested labels: CURRENCY were not found in the dataset. Found
labels: ACCOUNT, VEHICLE, LANGUAGE, FAC, TIME, STT_ERROR, FREQUENCY, CARDINAL,
DATE, GPE, PERCENT, BANK, ORG, PERSON, AMOUNT, ACTIVITY, ORDINAL.
```
Is this saying that just `DATE`, `ORDINAL`, and `CURRENCY` are not present in the dataset? Not all of these labels are necessarily represented in the dataset. Is there a way to ignore labels that are not present?
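In the meantime, I can pull the labels that actually occur in the saved annotations and pass only those to `-l`. Here's a minimal sketch, assuming the default Prodigy database connection, the `get_dataset_examples` helper available in recent versions, and the dataset name from my commands:

```python
from prodigy.components.db import connect

# Connect to the default Prodigy database and load the saved examples.
db = connect()
examples = db.get_dataset_examples("ner_task_v2")

# Collect every label that actually occurs in a "spans" entry.
present = sorted({span["label"] for eg in examples for span in eg.get("spans", [])})

# Comma-separated string, ready to paste into the -l argument.
print(",".join(present))
```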
This is what the recipe produces. Each annotator reviews the output and adjusts/removes/adds spans by label. I want to compare their answers based on the `spans` field.
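For reference, a simplified example of the task structure (illustrative text, offsets, and labels, not my actual data):

```json
{
  "text": "Transfer 100 dollars to my savings account on Friday",
  "spans": [
    {"start": 9, "end": 12, "label": "AMOUNT"},
    {"start": 27, "end": 42, "label": "ACCOUNT"},
    {"start": 46, "end": 52, "label": "DATE"}
  ],
  "answer": "accept",
  "_annotator_id": "ner_task_v2-annotator1"
}
```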