Hi there,
I just downloaded spacy-streamlit. I have a model that I finished training with the latest version of Prodigy (prodigy-1.11.4), and it works well in spaCy. However, when I try to download and use that model with spacy-streamlit, it starts complaining about tokenization, like this:
ValueError: [E1005] Unable to set attribute 'POS' in tokenizer exception for ' '. Tokenizer exceptions are only allowed to specify ORTH and NORM.
Hi! It looks like this isn't really related to the Streamlit app and more an issue with the model (you'd likely see this error in any environment when running your model). It basically means that for some reason, you ended up with outdated tokenizer exceptions that are setting part-of-speech tags (which isn't allowed via the tokenizer), so spaCy complains.
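To make the rule concrete, here's a minimal sketch of the kind of check the error message describes. It's plain Python and not spaCy's actual implementation. I'm assuming the exceptions are available as a dict that maps each string to a list of per-token attribute dicts with string keys (in a real pipeline the keys may be integer IDs instead). The function name and the sample data are hypothetical.

```python
# Attributes a tokenizer exception may set, per the E1005 error message.
ALLOWED_ATTRS = {"ORTH", "NORM"}


def find_invalid_exceptions(rules):
    """Return {chunk: [disallowed attribute names]} for offending exceptions.

    Assumption: `rules` maps each exception string to a list of per-token
    attribute dicts with string keys, e.g. {"don't": [{"ORTH": "do"}, ...]}.
    """
    invalid = {}
    for chunk, token_attrs in rules.items():
        bad = set()
        for attrs in token_attrs:
            for name in attrs:
                normalized = str(name).upper()
                if normalized not in ALLOWED_ATTRS:
                    bad.add(normalized)
        if bad:
            invalid[chunk] = sorted(bad)
    return invalid


# Hypothetical sample data, not taken from the model in this thread.
sample_rules = {
    " ": [{"ORTH": " ", "POS": "SPACE"}],
    "don't": [{"ORTH": "do"}, {"ORTH": "n't", "NORM": "not"}],
}
print(find_invalid_exceptions(sample_rules))
```

In this made-up example, only the `' '` entry is reported, because it sets `POS` in addition to `ORTH`.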
Could you share some more details on how the model was trained? Which base language did you use? And did you do anything custom, e.g. add custom tokenizer exceptions?
Thanks for the details. This is really strange, and I have no idea where that tokenizer exception could be coming from. I just tried it with some test data and I can't reproduce the problem.
Just a random idea, but could you try it again with a clean install (new virtual environment) of Prodigy and spaCy? Maybe your installation ended up in a weird state?
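If it helps with comparing the old and new environments, here's a small sketch that prints which versions of the relevant packages are installed. It only uses the standard library's `importlib.metadata`. The helper name is hypothetical, and I'm assuming the packages are installed under the distribution names `spacy`, `prodigy` and `spacy-streamlit`.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Assumed distribution names; adjust them if your setup differs.
for name, ver in installed_versions(["spacy", "prodigy", "spacy-streamlit"]).items():
    print(f"{name}: {ver or 'not installed'}")
```

Running this in both the environment where the model was trained and the one where the Streamlit app runs should show whether the two are using different versions.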
If that still doesn't solve it, are you able to share your data? It's fine if you can only do it privately (you can email me at ines@explosion.ai). Then I can try and reproduce it with the exact data so we can maybe track down the problem.