It's hard to say for sure, but given that everything works fine with one dataset fewer, my guess is indeed that it's a memory issue.
Are you able to export the datasets to the .spacy format via the data-to-spacy recipe? If so, we might be able to pick things up from the spaCy side.
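
Once the export has run, a quick sanity check on the resulting file could look something like the sketch below. It only uses spaCy's `DocBin` API; the `count_docs` helper, the `train.spacy` file name and the blank English pipeline are assumptions for illustration, and the demo writes its own tiny file so it runs without your data.

```python
import tempfile
from pathlib import Path

import spacy
from spacy.tokens import DocBin


def count_docs(path):
    """Load a .spacy file and return (number of docs, number of tokens)."""
    # Assumption: English data; only the vocab of the blank pipeline is used.
    nlp = spacy.blank("en")
    doc_bin = DocBin().from_disk(path)
    docs = list(doc_bin.get_docs(nlp.vocab))
    return len(docs), sum(len(doc) for doc in docs)


# Self-contained demo: build a tiny DocBin, save it, and read it back.
nlp = spacy.blank("en")
demo_bin = DocBin()
for text in ["Berlin is a city.", "I like spaCy."]:
    demo_bin.add(nlp(text))

with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "train.spacy"  # hypothetical file name
    demo_bin.to_disk(demo_path)
    n_docs, n_tokens = count_docs(demo_path)

print(f"{n_docs} docs, {n_tokens} tokens")
```

Pointing `count_docs` at your exported file would give a rough idea of how large the corpus is, which may help narrow down whether memory is really the bottleneck.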