prodigy train tagger not working

I am using Prodigy to train a POS tagger. However, when I try to run the train command, I get the following output:

Created and merged data for 0 total examples
Using 0 train / 0 eval (split 50%)
ValueError: not enough values to unpack (expected 2, got 0)

The dataset was created using the prodigy pos.correct recipe, and when I run the db-out command the data looks correctly formatted:

{"text":"cfos","_input_hash":-1934297327,"_task_hash":1222700193,"tokens":[{"text":"cfos","start":0,"end":4,"id":0,"ws":false}],"spans":[{"start":0,"end":4,"token_start":0,"token_end":0,"label":"NNS"}],"_session_id":null,"_view_id":"pos_manual","answer":"ignore"}

Any clue what's going on, or is this a bug?

Hi! Which version of Prodigy are you using and what's the exact command you're running? I just tried reproducing it by annotating a few examples with pos.correct and then training a model, but it all ran as expected :thinking:

Hi @ines - thanks for the quick reply! I'm using the latest version of Prodigy (1.10.4), and here is the exact command I ran:

prodigy train tagger pos_oct_21_fine_grained en_core_web_md -o ./test

If I run prodigy db-out pos_oct_21_fine_grained, I see the examples correctly printed in the terminal. I'm only using 13 examples, if that helps with debugging.

If I train with coarse-grained tags using batch_train, it works correctly, but if I use fine-grained tags with the all-purpose train command, I get the above error. If I try the coarse-grained tags (e.g. VERB) with train, it complains that new labels can't be added to the model.
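For context (as far as I understand it), en_core_web_md's tagger predicts fine-grained Penn Treebank tags (token.tag_, e.g. NNS), while VERB is a coarse-grained tag (token.pos_), which might be why it's treated as a new label. A quick way to compare the two tag sets:

    import spacy

    nlp = spacy.load("en_core_web_md")
    doc = nlp("The cats chase mice")
    for token in doc:
        # token.pos_ is the coarse-grained tag (e.g. VERB),
        # token.tag_ is the fine-grained Penn Treebank tag (e.g. VBP)
        print(token.text, token.pos_, token.tag_)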

Thanks for the details, and sorry about the delay! I've been trying to reproduce this, but it always trains as expected for me :thinking: I used the example you provided above (changing "answer" to "accept"), and also used it for evaluation.

Are you sure you're passing in the right dataset and that it actually contains accepted annotations (i.e. with "answer": "accept")? The example you posted above has "answer": "ignore", so it would be skipped. I just double-checked, and the "0 total examples" reported on the CLI is the number of examples in the dataset that were accepted and contain "tokens".
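If you want to double-check, something like this should show what's actually in the dataset (a quick sketch using the database API, assuming the default database config):

    from collections import Counter
    from prodigy.components.db import connect

    # Connect to the default Prodigy database and load the dataset
    db = connect()
    examples = db.get_dataset("pos_oct_21_fine_grained")

    # Tally the answers; only accepted examples with "tokens" are used for training
    print(Counter(eg.get("answer") for eg in examples))
    print("usable:", sum(1 for eg in examples
                         if eg.get("answer") == "accept" and eg.get("tokens")))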

Btw, one small change I made for the upcoming release: train now fails more gracefully if no training or evaluation examples are available, so you won't get a cryptic traceback anymore.