Hi, we've been running into an error when trying to use train-curve. I found the thread "ner.batch-train after ner.maual results error (Value error : [E024])" and some other answers (regarding both Prodigy and spaCy) that say this is caused by tokens that begin or end with whitespace, and that the solution is to remove the bad spans, since they would be "reject" annotations anyway. However, we labeled the dataset in question exclusively by hand, so it contains only accepts and this solution doesn't seem right. I was wondering if you could help me understand: does Prodigy create tokens for ner.manual that begin or end with whitespace? If so, wouldn't that mean those tokens are unusable for training without additional processing?
Follow-up: if I understand correctly, this only affects NER spans, but the Prodigy JSONL format includes references to the original tokens in addition to the character indices in the text. When I correct the whitespace issue, do I also have to change the start/end of the original token, or is it enough to adjust the start/end character indices of the NER span?
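For concreteness, here is a rough sketch of the kind of correction I have in mind. It assumes each record is a dict with a "text" string and a "spans" list whose entries carry "start" and "end" character offsets. The helper names `trim_span` and `fix_example` are my own and not part of Prodigy. The sketch only moves the character offsets inward and deliberately leaves any token references on the span (e.g. "token_start"/"token_end", if present) untouched, since whether those need updating is exactly what I'm unsure about.

```python
import copy


def trim_span(text, span):
    """Return a copy of `span` with leading/trailing whitespace trimmed off.

    Only the character offsets are adjusted; any token references on the
    span are copied over unchanged (assumption: that may not be sufficient).
    """
    start, end = span["start"], span["end"]
    # Move the start offset right past any leading whitespace.
    while start < end and text[start].isspace():
        start += 1
    # Move the end offset left past any trailing whitespace.
    while end > start and text[end - 1].isspace():
        end -= 1
    fixed = dict(span)
    fixed["start"], fixed["end"] = start, end
    return fixed


def fix_example(example):
    """Return a copy of a record with every span trimmed.

    Spans that consist only of whitespace end up empty and are dropped.
    """
    fixed = copy.deepcopy(example)
    text = fixed["text"]
    trimmed = [trim_span(text, span) for span in fixed.get("spans", [])]
    fixed["spans"] = [s for s in trimmed if s["end"] > s["start"]]
    return fixed


# Hypothetical record: the span covers "Cupertino " including a trailing space.
example = {
    "text": "Apple is based in Cupertino .",
    "spans": [{"start": 18, "end": 28, "label": "GPE"}],
    "answer": "accept",
}
fixed_example = fix_example(example)
```

Running this over every line of the exported JSONL would leave the text and tokens as they are and change only the span boundaries.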