That's great to hear!
I'd recommend all uppercase. I think spaCy can handle either case, but Prodigy only shows labels in the UI as uppercase, so sticking to uppercase avoids any mismatch.
Also, three other tips.
First, to keep your command short (and avoid misspellings in a long list of labels), you can also specify your labels in a local text file. For example, if your folder contains a file named labels.txt with one label per line,
then you can run it with:
python -m prodigy ner.manual test_ner model-best-all_text ./test_text.jsonl --label ./labels.txt
I mention this because I noticed that in your original command you accidentally listed REGULATOR twice (REGULATOR,REGULATOR). I don't think this will cause any problems, but it goes to show that with such a long list, it's really easy to misspell or misspecify a label. If you put the labels once in a local .txt file, you'll always be consistent.
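If it helps, here's a minimal sketch of building that labels file from the comma-separated list you already have, dropping accidental duplicates along the way. The function name `write_labels_file` and the example labels are hypothetical, and the uppercasing just follows my recommendation above. It assumes the one-label-per-line layout that `--label` accepts for a text file.

```python
from __future__ import annotations

from pathlib import Path


def write_labels_file(raw_labels: str, path: str = "labels.txt") -> list[str]:
    """Write comma-separated labels to a text file, one label per line."""
    seen = set()
    labels = []
    for part in raw_labels.split(","):
        # Normalize whitespace and case so "person" and "PERSON" match.
        label = part.strip().upper()
        # Skip empty entries and labels that were already kept.
        if label and label not in seen:
            seen.add(label)
            labels.append(label)
    Path(path).write_text("\n".join(labels) + "\n", encoding="utf-8")
    return labels
```

For example, `write_labels_file("ORG,REGULATOR,REGULATOR,person")` would write ORG, REGULATOR and PERSON to labels.txt, each on its own line, and you can then pass `--label ./labels.txt` as in the command above.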
The second tip is that if you run into issues like this and need to debug built-in recipes like ner.correct, you can view the source code of all built-in recipes. To find them, run prodigy stats, go to the path shown under Location: for your Prodigy installation, and open the recipes folder (e.g., ner.correct is defined in one of the files in that folder).
By adding custom logging, you can better diagnose any issues or questions you may have with any of the recipes. Better yet, once you learn some of the common syntax and conventions, you can reuse them and begin developing your own custom recipes.
If you had still been having issues, this was going to be my next recommendation. I mention it in case it helps you debug your next recipe question.
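If you'd rather find that recipes folder from Python than from the prodigy stats output, something like the sketch below might work. The helper `find_package_subfolder` is hypothetical (it isn't part of Prodigy), and it assumes the recipes folder sits directly inside the installed prodigy package directory. It only looks the package up and doesn't import it.

```python
from __future__ import annotations

from importlib.util import find_spec
from pathlib import Path


def find_package_subfolder(package: str, subfolder: str) -> Path | None:
    """Return the path of `subfolder` inside an installed package, if present."""
    try:
        spec = find_spec(package)
    except (ImportError, ValueError):
        return None
    # Packages that aren't installed, or have no file origin, can't be located.
    if spec is None or spec.origin is None:
        return None
    candidate = Path(spec.origin).parent / subfolder
    return candidate if candidate.is_dir() else None


if __name__ == "__main__":
    # Assumption: the recipes live in a "recipes" folder inside the package.
    recipes_dir = find_package_subfolder("prodigy", "recipes")
    if recipes_dir is None:
        print("No recipes folder found; check Location: in `prodigy stats`.")
    else:
        for path in sorted(recipes_dir.glob("*.py")):
            print(path)
```

From there you can open the recipe file you're interested in and read through it, or copy it as a starting point for a custom recipe.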
Last, you have a really cool workflow. If you want to make it more reproducible, consider converting your project into a spaCy project (which is being renamed to the weasel package). For example, here's an example template for Prodigy. This makes follow-up steps much easier to reproduce, such as converting your model into a spaCy package or running that model as a Streamlit or FastAPI app.
Let me know if you have any questions!