SpanCat Training Error on Custom Preprocessed Dataset

Thanks for your response.

Were your annotations altered with any pre- or post-processing? You should avoid any modifications to your annotations if you want to use prodigy train.

That message is too vague for me to diagnose without more details. Was that the entire error message? If not, can you provide the full stack trace?

The closest I found was relating to tokenization:

I'm wondering if this is a tokenization problem because of some pre- or post-processing you may have done.
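
If you want to check this yourself, here is a minimal sketch for spotting spans whose character offsets no longer line up with token boundaries. It assumes each exported example is a dict with `text`, `tokens` (each with `start` and `end`) and `spans` (each with `start`, `end` and `label`). The helper name `find_misaligned_spans` and the sample data are made up for illustration, so adjust them to whatever your data actually looks like.

```python
def find_misaligned_spans(example):
    """Return spans whose character offsets do not sit on token boundaries."""
    tokens = example.get("tokens", [])
    token_starts = {tok["start"] for tok in tokens}
    token_ends = {tok["end"] for tok in tokens}
    return [
        span
        for span in example.get("spans", [])
        if span["start"] not in token_starts or span["end"] not in token_ends
    ]


# Hypothetical example: the second span stops one character short of its token.
example = {
    "text": "Apple updates its iPhone",
    "tokens": [
        {"text": "Apple", "start": 0, "end": 5, "id": 0},
        {"text": "updates", "start": 6, "end": 13, "id": 1},
        {"text": "its", "start": 14, "end": 17, "id": 2},
        {"text": "iPhone", "start": 18, "end": 24, "id": 3},
    ],
    "spans": [
        {"start": 0, "end": 5, "label": "ORG"},
        {"start": 18, "end": 23, "label": "PRODUCT"},
    ],
}

misaligned = find_misaligned_spans(example)
for span in misaligned:
    # Show the label and the exact text the span covers.
    print(span["label"], repr(example["text"][span["start"]:span["end"]]))
```

If this flags any spans, it would suggest the text or offsets were changed after annotation. You could run the same check over each line of your exported JSONL file.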

If not, can you provide a small sample of your data, like you did previously?

Also, moving forward, please avoid screenshots of code. You can copy and paste it directly instead. This makes it searchable for the next user (e.g., others could search for the same error message and find this post) :slight_smile: