What is the syntax for textcat.teach for multi-class?


I can label manually with:

prodigy textcat.manual email ./subject-text-stratified-990.jsonl --exclusive --label x,y,z

What is the equivalent for teach?

prodigy textcat.teach email blank:en ./subject-text-stratified-990.jsonl --label x,y,z

I get this when I try the above.

Using 3 label(s): x, y, z
Added dataset email to database SQLite.
TypeError: 'str' object does not support item assignment

Any help appreciated.

Hello @jdewsnip,

Thank you for your question and welcome to the Prodigy forum!

Could you post an example from your .jsonl file? I suspect that your data uses the key "meta" with a string value, whereas Prodigy expects a dict there. During prediction, Prodigy tries to add a score to the meta dictionary, which could cause the error you see.
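As a rough illustration of that suspicion, the same TypeError can be reproduced in plain Python by assigning a key on a string. This is only a sketch: the task contents and the "score" key are assumptions made for the example, not Prodigy's actual internals.

```python
# Hypothetical task whose "meta" value is a JSON string instead of a dict.
task = {"text": "RE: la la la", "meta": "{\"topic_name\": \"foo\"}"}

message = None
try:
    # Assumed to resemble what happens when a score is written into "meta".
    task["meta"]["score"] = 0.5
except TypeError as err:
    message = str(err)

print(message)
```

With a dict as the "meta" value, the same assignment would succeed.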

Ah ok that makes sense.

"meta":"{\"uuid\":\"<xxx@mail.gmail.com>\",\"subject\":\"RE: la la la\",\"date\":\"2020-03-07 09:21:24\",\"to\":\"someguy@somewhere.com\",\"from\":\"a@b.com\",\"n_attachments\":4,\"topic_name\":\"foo\",\"topic_num\":6}"

I have "validate": false in prodigy.json, and I guess this cannot be ignored for teach because it updates the meta?

Sorry @jdewsnip, our system flagged your replies as spam, which is why they were hidden. I unmarked one of them as spam, so hopefully it is visible again.

textcat.manual does not access the "meta" key, which is why the error does not occur there.

You could change your meta-value to a dictionary with a key like "additional_information" or "mail_header" having the string as value:

"meta": {"additional_information": "{\"uuid\":\"<xxx@mail.gmail.com>\",\"subject\":\"RE: la la la\",\"date\":\"2020-03-07 09:21:24\",\"to\":\"someguy@somewhere.com\",\"from\":\"a@b.com\",\"n_attachments\":4,\"topic_name\":\"foo\",\"topic_num\":6}"}

Alternatively, you could store the metadata directly as a dictionary instead of a string, similar to this:

"meta": {"uuid": "<xxx@mail.gmail.com>","subject":"RE: la la la","date":"2020-03-07 09:21:24","to": "someguy@somewhere.com", "from":"a@b.com","n_attachments":4,"topic_name":"foo","topic_num":6}

I hope one of the two approaches works for you. Please let me know if not or if you have any further questions.
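If the file already exists with string-valued "meta" fields, a small script could apply either approach to every line. The following is a minimal sketch, assuming one JSON object per line; the helper names fix_meta and convert_file are made up for this example and are not part of Prodigy.

```python
import json


def fix_meta(task):
    """Return the task with "meta" as a dict (hypothetical helper)."""
    meta = task.get("meta")
    if not isinstance(meta, str):
        return task  # already a dict, or no meta at all
    try:
        parsed = json.loads(meta)
    except json.JSONDecodeError:
        parsed = None
    if not isinstance(parsed, dict):
        # Fall back to wrapping the raw string under a single key.
        parsed = {"additional_information": meta}
    return {**task, "meta": parsed}


def convert_file(src_path, dst_path):
    """Rewrite a JSONL file so that every "meta" value is a dict."""
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            if line.strip():
                dst.write(json.dumps(fix_meta(json.loads(line))) + "\n")
```

Calling convert_file with the original .jsonl path and a new output path (any name you choose) should produce a file that can then be passed to textcat.teach in place of the original.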