Token indices sequence length is longer than the specified maximum sequence length for this model

Hi @foscraft!

Does this help? See the linked spaCy discussions post too for more details.

You may also want to search the spaCy discussions or spacy-llm discussions pages for tips and similar examples.

For example, for your out-of-memory issue:

Since your problem is really a spaCy issue, you'll likely find many more helpful posts there (not to mention that the spaCy core dev team answers questions there).

Hope this helps!