Hi,
I’m trying to demonstrate that I can get reasonably similar results by training an NER model with Prodigy as with spaCy, but I’m failing. I have several thousand fully annotated examples in the spaCy format, with several additional entity types beyond the pre-trained ones. My spaCy training loop is pretty straightforward: I use a compounding batch size (4 → 16), train for just ~5 epochs, and my accuracy is around 90% (this is an artificially generated dataset, so it may be a bit homogeneous).
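For reference, the loop is essentially the standard spaCy v2 update loop. A simplified sketch (`TRAIN_DATA`, the model name, and the dropout value here are placeholders, not my exact settings):

```python
import random
import spacy
from spacy.util import minibatch, compounding

# TRAIN_DATA: list of (text, {"entities": [(start, end, label), ...]})
nlp = spacy.load("en_core_web_sm")  # placeholder; I start from a pre-trained model
ner = nlp.get_pipe("ner")
for _, annotations in TRAIN_DATA:
    for start, end, label in annotations["entities"]:
        ner.add_label(label)  # register the additional entity types

# train only the NER component
other_pipes = [pipe for pipe in nlp.pipe_names if pipe != "ner"]
with nlp.disable_pipes(*other_pipes):
    optimizer = nlp.resume_training()  # keep the pre-trained weights (spaCy v2.1-style)
    for epoch in range(5):
        random.shuffle(TRAIN_DATA)
        losses = {}
        # compounding batch size 4 -> 16
        for batch in minibatch(TRAIN_DATA, size=compounding(4.0, 16.0, 1.001)):
            texts, golds = zip(*batch)
            nlp.update(texts, golds, sgd=optimizer, drop=0.2, losses=losses)
```

I took the same data and converted it to Prodigy’s format: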
```json
{
  "text": "who are W Chen and Jane M Doe",
  "label": "GetPersonInfo",
  "spans": [
    {
      "start": 8,
      "end": 14,
      "label": "PERSON"
    },
    {
      "start": 19,
      "end": 29,
      "label": "PERSON"
    }
  ],
  "answer": "accept",
  "_input_hash": -615591970,
  "_task_hash": 1411334382
}
```
and generated an equal number of “reject” examples by randomising all the entity labels.
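The reject generation is roughly this (a sketch; the label set here is illustrative):

```python
import copy
import random

ENTITY_LABELS = ["PERSON", "ORG", "GPE", "DATE"]  # illustrative label set

def make_reject(task):
    # Copy an accepted task and swap each span's label for a different,
    # randomly chosen one, so the annotation becomes wrong.
    bad = copy.deepcopy(task)
    for span in bad["spans"]:
        other = [l for l in ENTITY_LABELS if l != span["label"]]
        span["label"] = random.choice(other)
    bad["answer"] = "reject"
    return bad
```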
Feeding this dataset to prodigy ner.batch-train does not work very well. The “before” accuracy is ~0.25, and right after the first epoch it’s quite close to 0.5 and stays there, so I’m guessing the model just predicts “accept” or “reject” for everything. Note that this is when using the --no-missing flag; if I don’t use it, the accuracy jumps to ~0.6-0.7 and then moves back towards 0.5.
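For completeness, the command I’m running is along these lines (dataset and model names are placeholders):

```
prodigy ner.batch-train my_ner_dataset en_core_web_sm --no-missing
```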
I then saw a post (which I can’t currently find again) where @honnibal mentioned that an "answer": "accept" field should be added to the individual entity spans, so I tried that. But that resulted in 0.0 accuracy, so I think that must have been wrong.
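That is, each span ended up looking like:

```json
{"start": 8, "end": 14, "label": "PERSON", "answer": "accept"}
```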
Any idea what could be going wrong? I’ve tried reducing the learning rate by an order of magnitude and changing the batch size. I’m happy to give more detailed information about my data/problem on request.