Yes, this definitely looks suspicious! Could you show some examples of the data you collected? And how many instances of the MATERIAL terms from your patterns were in your data? Did they come up a lot? 200 examples is still a pretty small set, so it's possible that you simply didn't use enough data.
One quick note on this: When you tried out the model in Prodigy, did you use ner.teach? Because what you see here can potentially be very misleading: Prodigy will get all possible analyses for the sentence and present you with the examples the model is most uncertain about, i.e. the ones with predictions closest to 0.5. So those aren't necessarily the entities with the highest scores or the most "correct" ones.
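Just to illustrate the idea (this is not Prodigy's internal code, only a sketch of the sorting principle with made-up candidates and scores):

```python
# Illustration only: uncertainty sampling surfaces the candidates whose
# scores are closest to 0.5, not the highest-scoring ones.
candidates = [
    ("graphene", 0.92),  # model fairly confident this is MATERIAL
    ("steel", 0.51),     # model very unsure -> shown first in ner.teach
    ("banana", 0.08),    # model fairly confident this is *not* MATERIAL
]

for text, score in sorted(candidates, key=lambda c: abs(c[1] - 0.5)):
    print(text, score)
```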
If you want to see how the model performs "in real life", it'd make more sense to load it with spaCy, process a bunch of (unseen) text and look at the MATERIAL entities in doc.ents. Those are the ones that the model will actually predict.
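For example, something like this (the model path and the texts are just placeholders for your trained model and your own unseen data):

```python
import spacy

# Load the model you trained (adjust the path to wherever it was saved)
nlp = spacy.load("/path/to/your/trained-model")

texts = [
    "The frame is made of carbon fiber and aluminium.",
    "We coated the sample with a thin layer of titanium dioxide.",
]

for doc in nlp.pipe(texts):
    # doc.ents contains only the entities the model actually predicts
    print([(ent.text, ent.label_) for ent in doc.ents if ent.label_ == "MATERIAL"])
```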
The latest screenshot of training you’ve posted still shows a very small dataset. Is that the correct run? It shows only 206 training examples and 14 examples used for evaluation, which is probably just not enough data to train with.
Another problem is that it looks like you’re training on top of a model that already gets 13/14 correct. It’s better to start with a blank model each time you run batch train. Finally, try lowering the batch size: with very little data, you usually want a small batch size.
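One simple way to get a blank base model is to create it with spaCy and save it to disk, and then pass that directory to batch train instead of the already trained model (the output path here is just an example):

```python
import spacy

# Create a blank English pipeline and save it to disk so it can be
# used as the base model for batch training
nlp = spacy.blank("en")
nlp.to_disk("./blank_en_model")
```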
I think the reason the number is lower is that a) you might have ignored some examples and b) before training, Prodigy will combine all annotations on the same sentence into one training example. So if you're using ner.teach and accept / reject multiple entities on the same text, all of those annotations become one training example later on.
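A simplified sketch of that merging idea (not Prodigy's actual code, just to show why the counts end up lower):

```python
from collections import defaultdict

# Multiple accept/reject decisions on the same sentence end up as a
# single training example with all annotated spans attached.
annotations = [
    {"text": "The beam was made of steel.", "span": ("steel", "MATERIAL"), "answer": "accept"},
    {"text": "The beam was made of steel.", "span": ("beam", "MATERIAL"), "answer": "reject"},
    {"text": "We used a copper wire.", "span": ("copper", "MATERIAL"), "answer": "accept"},
]

merged = defaultdict(list)
for eg in annotations:
    merged[eg["text"]].append((eg["span"], eg["answer"]))

# 3 annotations -> 2 training examples after merging
print(len(annotations), "annotations ->", len(merged), "training examples")
```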
I think it might be a good idea to include a message about this in the output of the training recipes. Even just something like "Merging X examples..." or "Merged X annotations into Y training examples".