Hi there, new to Prodigy but very excited to get training models to see how predictive my current labels are and try some active learning from there!
Could anyone help me with the correct format of the annotations needed for textcat-multilabel?
I have several labels, and more than one of them can apply to the same text.
I am currently trying to read in the JSONL which contains my existing annotations (not made in Prodigy), where each dict has the following format:
{"text": " I can't complain, you've got to take the rough with the smooth.", "cats": {"30": false, "12": false, "24": false, "19": true, "25": false, "23": false, "11": false, "32": false, "36": false, "13": false, "33": false, "15": false, "28": false, "14": false, "20": false, "17": false, "16": false, "27": false, "37": true, "38": false, "21": false, "10": false, "31": false, "29": false, "22": false}}
Running prodigy db-in reads these into my database, but reports:
✔ Created unstructured dataset 'verbal' in database SQLite
✔ Imported 984 annotated examples and saved them to 'verbal' (session
2024-01-17_20-32-34) in database SQLite
Found and keeping existing "answer" in 0 examples
Then, indeed, when I try to train I get an error:
TypeError: [E930] Received invalid get_examples callback in
MultiLabel_TextCategorizer.initialize. Expected function that returns an iterable of Example objects but got: []
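My current guess is that the "cats" key is simply not recognised, and that training expects examples shaped like the output of the choice interface, with "options", "accept" and "answer" keys. That is an assumption on my part, not something I have confirmed. Here is a minimal sketch of the conversion I would try; the function names and the file names in the usage note are hypothetical, and I am using the raw label ids as the option text because that is all my data contains:

```python
import json


def cats_to_choice(example):
    """Turn {"text": ..., "cats": {label: bool}} into a choice-style task.

    Assumption: multilabel training reads "options" / "accept" / "answer".
    """
    cats = example["cats"]
    return {
        "text": example["text"],
        # Every known label becomes an option, in the order it appears.
        "options": [{"id": label, "text": label} for label in cats],
        # Only the labels marked true are treated as selected.
        "accept": [label for label, is_set in cats.items() if is_set],
        "answer": "accept",
    }


def convert_file(src_path, dst_path):
    """Rewrite a JSONL file line by line using cats_to_choice."""
    with open(src_path, encoding="utf8") as src, \
            open(dst_path, "w", encoding="utf8") as dst:
        for line in src:
            if line.strip():  # skip blank lines
                task = cats_to_choice(json.loads(line))
                dst.write(json.dumps(task) + "\n")
```

I would then call something like convert_file("verbal.jsonl", "verbal_choice.jsonl") and import the new file into a fresh dataset with prodigy db-in before training again.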
Any help on how to adjust my JSONL to the correct format would be greatly appreciated. Thanks!