Encountering an error with custom IOB2 format in the 'ner_spancat_compare' project


I am having trouble adapting a project to my custom IOB2 data. The project is hosted on Explosion's GitHub, and I was able to run it successfully with the provided data/assets. I had to fix a few parameter errors along the way, but I could understand and resolve those. However, when I replace the provided assets with my own custom IOB2 data, I get errors like the one below, and I am not sure how to fix them.

KeyError: "[E018] Can't retrieve string for hash '6379371356491057537'. This usually refers to an issue with the `Vocab` or `StringStore`."

I have attached my example IOB2 file to this discussion and I am wondering if you could help me fix these issues. Thank you.

Example datasets: requirement-datasets/assets at main · daffahilmyf/requirement-datasets (github.com)
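
Before converting a custom IOB2 file, it can help to confirm that the tag sequence itself is well-formed. The sketch below is only an illustration: the helper name `find_iob2_errors` and the `REQ` label are hypothetical, it assumes the tags have already been read into a list of strings per sentence, and nothing in this thread confirms that malformed tags are what triggers E018.

```python
from typing import List, Tuple


def find_iob2_errors(tags: List[str]) -> List[Tuple[int, str]]:
    """Return (index, message) pairs for tags that break IOB2 rules.

    Hypothetical helper: assumes one sentence's tags, e.g. ["B-REQ", "I-REQ", "O"].
    """
    errors = []
    prev = "O"  # treat the start of a sentence as "outside"
    for i, tag in enumerate(tags):
        if tag == "O":
            prev = tag
            continue
        prefix, sep, label = tag.partition("-")
        if prefix not in ("B", "I") or not sep or not label:
            errors.append((i, f"malformed tag {tag!r}"))
            prev = "O"  # a malformed tag cannot start or continue a span
            continue
        if prefix == "I":
            # In IOB2 an I- tag must continue a span with the same label.
            prev_label = prev.partition("-")[2]
            if prev == "O" or prev_label != label:
                errors.append((i, f"{tag!r} does not continue a {label!r} span"))
        prev = tag
    return errors


if __name__ == "__main__":
    # Assumed example sentence; the second I-REQ has no preceding B-REQ.
    for index, message in find_iob2_errors(["B-REQ", "I-REQ", "O", "I-REQ"]):
        print(index, message)
```

An empty result only means the tags follow the IOB2 scheme; it does not rule out other problems with the data or the config.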

Hi @daffahilmyf!

Thanks for your question and sorry you're having issues.

Could you repost this on the spaCy discussions forum? Sorry for the confusion, but your question is spaCy-specific and the spaCy core team can help you there. This forum is Prodigy-specific.

You may want to include your spaCy version too (i.e., the output of python -m spacy info).
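
If the command-line tool is not convenient, the installed versions can also be collected from Python. This is a minimal sketch using only the standard library; the function name `environment_summary` is made up for illustration, and the package list is an assumption you can adjust.

```python
import platform
from importlib.metadata import PackageNotFoundError, version


def environment_summary(packages=("spacy",)):
    """Collect the Python version and the installed version of each package."""
    info = {"python": platform.python_version()}
    for name in packages:
        try:
            info[name] = version(name)
        except PackageNotFoundError:
            # Report missing packages instead of raising.
            info[name] = "not installed"
    return info


if __name__ == "__main__":
    for key, value in environment_summary().items():
        print(f"{key}: {value}")
```

Pasting this output alongside the full traceback gives whoever answers a clearer picture of the environment.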