Hello!
I'm running multiclass classification tasks using "view_id": "choice", and I've defined the "options" for each task as
[
  {"id": 0, "text": "A"},
  {"id": 1, "text": "B"},
  {"id": -1, "text": "C"}
]
as suggested in the Computer Vision section of the Prodigy documentation.
Everything seems to work.
Then running prodigy metric.iaa.doc dataset:<dataset> multiclass fails with:
Traceback (most recent call last):
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "/venv/lib/python3.11/site-packages/prodigy/__main__.py", line 50, in <module>
main()
File "/venv/lib/python3.11/site-packages/prodigy/__main__.py", line 44, in main
controller = run_recipe(run_args)
^^^^^^^^^^^^^^^^^^^^
File "cython_src/prodigy/cli.pyx", line 123, in prodigy.cli.run_recipe
File "cython_src/prodigy/cli.pyx", line 124, in prodigy.cli.run_recipe
File "/venv/lib/python3.11/site-packages/prodigy/recipes/metric.py", line 137, in metric_iaa_doc
m.measure(stream)
File "cython_src/prodigy/components/metrics/iaa_doc.pyx", line 92, in prodigy.components.metrics.iaa_doc.IaaDoc.measure
File "cython_src/prodigy/components/metrics/_util.pyx", line 53, in prodigy.components.metrics._util._validate_dataset
File "cython_src/prodigy/components/metrics/_util.pyx", line 143, in prodigy.components.metrics._util._validate_labels
TypeError: sequence item 0: expected str instance, int found
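My guess (just from the message) is that the labels are being joined into a string somewhere in the validation step, and integer ids break that. The same TypeError is easy to reproduce in plain Python:

```python
# Hypothetical reproduction of the failure: str.join() over a list
# that contains integer label ids raises exactly this TypeError.
labels = [0, 1, -1]  # integer option ids, as in my tasks

try:
    ", ".join(labels)
except TypeError as e:
    print(e)  # sequence item 0: expected str instance, int found
```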
After some time I figured out that changing the task options to
[
  {"id": "a", "text": "A"},
  {"id": "b", "text": "B"},
  {"id": "c", "text": "C"}
]
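For anyone hitting this with already-annotated data, a possible workaround (my own sketch, not an official recipe) is to export with db-out, cast the ids to strings, and re-import with db-in. Something like this per task, assuming the standard "options" and "accept" keys produced by the choice interface:

```python
import json

def stringify_option_ids(task: dict) -> dict:
    """Cast integer option ids (and accepted answers) to strings.

    Sketch only: assumes the exported tasks use the "options" and
    "accept" keys of the choice interface.
    """
    for opt in task.get("options", []):
        opt["id"] = str(opt["id"])
    task["accept"] = [str(a) for a in task.get("accept", [])]
    return task

# Example task shaped like a `prodigy db-out` export
task = {
    "text": "some example",
    "options": [{"id": 0, "text": "A"}, {"id": -1, "text": "C"}],
    "accept": [-1],
}
print(json.dumps(stringify_option_ids(task)))
```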
and metric.iaa.doc started working properly.
I assume this is a bug but maybe I misunderstand something.
- It would be great if either the examples stopped using integer ids (if they are unsupported) or metric.iaa.doc started working with them.
- I think Prodigy would really benefit from user input validation (or at least typing) with something like Pydantic, so there would never be an inconsistency between what the user can enter and what Prodigy expects and accepts.
Thanks!