I used Prodigy to annotate almost 1,000 texts for an NER task with only five entity categories. I trained with spaCy's default transformer and ner pipeline, but the accuracy is only about 60. Where can I see the network structure of the ner component in spaCy, so that I can modify it to improve the accuracy? Or do you have any other suggestions for improving accuracy? Thank you.
Hi there!
Prodigy uses spaCy under the hood, which comes with an elaborate configuration system that lets you tune hyperparameters. If you'd like to learn more about the components themselves, you can find more information in the model architectures section of the spaCy documentation.
That said, my usual advice is not to dive into hyperparameter tuning right away. Issues with data quality might also explain the low accuracy. I can't judge whether you've already done this, but as a first exercise, I usually try to understand when the model makes mistakes. If it becomes clear that the model often makes one kind of error, the quickest fix is to find more relevant examples for it to train on.
Are there certain entities that the model gets wrong more often than other ones?
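One way to check this is to break the score down per entity label. Below is a minimal sketch in plain Python, not spaCy's own scorer: it assumes you have already collected gold and predicted entities for each text as `(start, end, label)` tuples (with spaCy you could build these from `doc.ents` via `ent.start_char`, `ent.end_char` and `ent.label_`). The function name `per_label_scores` and the example labels are made up for illustration, and only exact span matches count as correct.

```python
from collections import Counter


def per_label_scores(gold_docs, pred_docs):
    """Compute exact-match precision, recall and F-score per entity label.

    gold_docs and pred_docs are parallel lists with one collection of
    (start, end, label) tuples per text.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for gold, pred in zip(gold_docs, pred_docs):
        gold, pred = set(gold), set(pred)
        for _, _, label in gold & pred:   # predicted and annotated
            tp[label] += 1
        for _, _, label in pred - gold:   # predicted but not annotated
            fp[label] += 1
        for _, _, label in gold - pred:   # annotated but missed
            fn[label] += 1

    scores = {}
    for label in sorted(set(tp) | set(fp) | set(fn)):
        n_pred = tp[label] + fp[label]
        n_gold = tp[label] + fn[label]
        p = tp[label] / n_pred if n_pred else 0.0
        r = tp[label] / n_gold if n_gold else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        scores[label] = {"p": p, "r": r, "f": f}
    return scores


# Hypothetical example data: two texts, labels invented for illustration.
gold = [{(0, 2, "PERSON"), (5, 6, "ORG")}, {(1, 2, "ORG")}]
pred = [{(0, 2, "PERSON"), (5, 7, "ORG")}, {(1, 2, "ORG")}]

for label, s in per_label_scores(gold, pred).items():
    print(f"{label}: P={s['p']:.2f} R={s['r']:.2f} F={s['f']:.2f}")
```

If one label scores clearly lower than the rest, that is a good place to look at the individual mistakes and to add more training examples.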