I am training a model with around 100k annotations using the train ner recipe.
On the first iteration I get the following error:
ValueError: [E024] Could not find an optimal move to supervise the parser. Usually, this means that the model can't be updated in a way that's valid and satisfies the correct annotations specified in the GoldParse... ... For details, run: python -m spacy debug-data --help
Based on this support inquiry, I understand I need to check that there are no entity spans with leading/trailing whitespace, so I create JSON files (not JSONL files) using db-out. Then, as suggested by the above error, I run:
$ python -m spacy debug-data ja ./train.json ./dev.json -b ./models/my_model/
This gives me the following error:
=========================== Data format validation ===========================
✘ Training data cannot be loaded: too many values to unpack (expected 2)
✘ Development data cannot be loaded: too many values to unpack (expected 2)
Below is a sample document from my annotation data.
I believe I have correctly formatted the data to run debug-data, but is there something I have missed?
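For context, here is a minimal sketch of the kind of leading/trailing whitespace check mentioned above. It assumes each record is a dict with a `text` string and a `spans` list holding `start`/`end` character offsets; the function name `find_whitespace_spans` and the example records are hypothetical and not taken from my actual data.

```python
def find_whitespace_spans(records):
    """Collect spans whose covered text starts or ends with whitespace.

    Assumes each record looks like:
    {"text": "...", "spans": [{"start": int, "end": int, "label": str}]}
    """
    problems = []
    for i, record in enumerate(records):
        text = record["text"]
        for span in record.get("spans", []):
            span_text = text[span["start"]:span["end"]]
            # A span is suspicious if stripping whitespace changes it.
            if span_text != span_text.strip():
                problems.append((i, span["start"], span["end"], span_text))
    return problems


# Hypothetical records, for illustration only.
examples = [
    {"text": "Tokyo is big", "spans": [{"start": 0, "end": 5, "label": "LOC"}]},
    {"text": "Visit Osaka now", "spans": [{"start": 5, "end": 11, "label": "LOC"}]},
]

for index, start, end, span_text in find_whitespace_spans(examples):
    print(f"record {index}: span {start}-{end} covers {span_text!r}")
```

In this sketch the second record is flagged because its span begins on the space before "Osaka". Running something like this over the exported annotations would at least rule out the whitespace cause before looking at the data format.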