Hello @koaning,
I realized a couple of things:
- `data-to-spacy` managed to build the `.spacy` files required for training in spaCy.
- When training the model, however, I ran into the `Killed` message again.
Having the `.spacy` files already generated, I decided to move to another cloud-based VM, this time with more memory... and the training completed successfully. That indirectly confirms the root cause of my issue.
Still, it would be awesome to have an updated code snippet to diagnose this problem (i.e., a snippet that can tell whether you are actually running short of memory for your training dataset[s]), and some suggestions for avoiding it with "big" training datasets.
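For instance, something along these lines might serve as a rough first check. This is just a sketch I put together (stdlib-only, Linux-specific, and the 4x safety factor is purely a guess, since deserialized `DocBin`s can take several times their on-disk size):

```python
# Rough, hypothetical diagnostic: compare free RAM against the on-disk
# size of the .spacy corpora before launching training.
import os
from pathlib import Path


def enough_memory_for_training(corpus_dir: str, safety_factor: float = 4.0) -> bool:
    """Return True if free RAM looks sufficient for the .spacy files in corpus_dir.

    safety_factor is an assumption: loaded corpora may need several
    times their serialized size once deserialized in memory.
    """
    corpus_bytes = sum(p.stat().st_size for p in Path(corpus_dir).glob("*.spacy"))
    # Available physical memory via POSIX sysconf (Linux only).
    free_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_AVPHYS_PAGES")
    print(f"corpus on disk: {corpus_bytes / 1e6:.1f} MB")
    print(f"free RAM:       {free_bytes / 1e6:.1f} MB")
    return free_bytes > corpus_bytes * safety_factor
```

Something like that, run against the directory holding the `.spacy` files, would at least have warned me before the OOM killer sent `Killed`.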
Thank you.