Token indices sequence length is too long

This warning is emitted internally by transformers or tokenizers. You don't see an actual error because long sequences are truncated internally before they are passed to the model.

If it happens rarely, you can probably ignore it. If it happens frequently, you may want to adjust the window and stride of the transformer span getter in your config. See the spaCy discussion "Receiving the warning message 'Token indices are too long' even after validating doc length is under max sequence length" (explosion/spaCy, Discussion #9277 on GitHub).
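As a sketch of what adjusting the span getter looks like: in a spaCy transformers pipeline, the `spacy-transformers.strided_spans.v1` span getter accepts `window` and `stride` settings that control how each Doc is sliced into overlapping spans before tokenization. The values below are illustrative, not recommendations; smaller windows reduce the chance of any one span exceeding the model's maximum sequence length.

```ini
# Excerpt from a spaCy config; values are examples only.
[components.transformer.model.get_spans]
@span_getters = "spacy-transformers.strided_spans.v1"
# Number of tokens per span passed to the transformer.
window = 128
# Offset between consecutive spans; stride < window gives overlap.
stride = 96
```

A stride smaller than the window makes consecutive spans overlap, so tokens near span boundaries still get context from both sides.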