Hyperparameter tweaking for custom NER

Hello, we're continuing work on NER to identify children's names in court documents. We've hit a plateau on accuracy, so I wanted to try a grid search as a next step. We were looking at the flowchart from https://prodi.gy/docs/named-entity-recognition and saw the path "Are your annotations internally consistent?" -> "Are you sure?" -> "Tweak hyperparameters".

So, my questions are: what exactly does "internally consistent" mean in this context? And why should I be doubly sure of it before trying a grid search or similar? I had been interpreting "internally consistent" to mean "correctly annotated" but wanted to verify.

Thank you!

P.S. Since I grabbed that flowchart, the link seems to have broken.

Annotations are internally consistent if, basically, you'd annotate them all the same way again. You'll likely have at least some errors in your annotations, whether through careless mistakes, policies you revised partway through annotating, differences of opinion between annotators, or unclear edge cases.

A simple way to look for inconsistencies is to run a model over its own training data. Look for cases where the model's decision disagrees with the annotation. Often those cases are actually incorrectly annotated.
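Here's a minimal sketch of that check, assuming your trained model is a spaCy pipeline saved to disk and your annotations are available as (text, spans) pairs. The `./my_ner_model` path, the `CHILD` label, and the example data are all placeholders; adapt the loading code to wherever your annotations actually live.

```python
import spacy

# Load the trained pipeline and replay it over its own training data.
# The model path and example data below are placeholders.
nlp = spacy.load("./my_ner_model")

train_data = [
    ("The court appointed a guardian for Jane Doe.", [(35, 43, "CHILD")]),
    # ... the rest of your annotated examples
]

for text, gold_spans in train_data:
    doc = nlp(text)
    predicted = {(ent.start_char, ent.end_char, ent.label_) for ent in doc.ents}
    gold = set(gold_spans)
    if predicted != gold:
        # Disagreements between the model and its own training data are
        # prime candidates for annotation errors -- review these by hand.
        print(text)
        print("  gold:     ", sorted(gold))
        print("  predicted:", sorted(predicted))
```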

If you do want to adjust the hyperparameters, the main ones to fiddle with are the batch size and dropout. I wouldn't really bother with a full grid search; just try a few different values and see whether the accuracy is sensitive to them. Try a batch size of 2 as a first test. You can also try setting the dropout to 0. The sketch below shows where both knobs live.
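For concreteness, here's a rough sketch of a plain spaCy v3 training loop with both knobs marked. This assumes you're training directly with spaCy; if you train through Prodigy or a spaCy config file instead, look for the equivalent batch-size and dropout settings there. The data and label are again placeholders.

```python
import random
import spacy
from spacy.training import Example
from spacy.util import minibatch

# Placeholder data in spaCy's (text, {"entities": [...]}) format --
# substitute your real annotations.
train_data = [
    ("The court appointed a guardian for Jane Doe.",
     {"entities": [(35, 43, "CHILD")]}),
]

nlp = spacy.blank("en")
ner = nlp.add_pipe("ner")
for _, annots in train_data:
    for start, end, label in annots["entities"]:
        ner.add_label(label)

optimizer = nlp.initialize()
for epoch in range(10):
    random.shuffle(train_data)
    # Knob 1: batch size. Try a small value like 2 first.
    for batch in minibatch(train_data, size=2):
        examples = [
            Example.from_dict(nlp.make_doc(text), annots)
            for text, annots in batch
        ]
        # Knob 2: dropout. Try 0.0 and see whether accuracy moves.
        nlp.update(examples, sgd=optimizer, drop=0.0)
```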

Thanks for the explanation and suggestions!