Prodigy NER train recipe getting killed for no apparent reason

Hello everyone,

I am trying to train an NER model in Prodigy, for which I have 6 datasets available (there could be more later), obtained through ner.manual. Some details about this data:

  • Each file has 1000 samples.
  • The text in each sample is "lengthy": ~3500 characters (~450 words) on average (I know that shorter texts would be better, but for my application, I need them to remain as long as they currently are).
  • 4 labels are being recognized.

Then I use the train recipe to start training, but it suddenly stops with the following (rather uninformative) output:

---  ------  ------------  --------  ------  ------  ------  ------
Killed

BTW, when I reduce the training data to 5 datasets or fewer, training runs normally. I was guessing at some memory issue, and this post seems to confirm it; however, that post does not explain clearly what to do to diagnose and confirm such a problem (Ines' suggestion only extends a list, while Guillaume briefly mentions the psutil library to confirm a memory issue, without showing any code snippet).

What should I do?

It's hard to say for sure, but given that training with one dataset fewer works fine, I'm indeed guessing it's a memory issue.

Are you able to export the datasets to the .spacy format via the data-to-spacy recipe? If so, we might be able to pick it up from spaCy.
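In the meantime, to check whether you're actually running short of memory, here's a minimal sketch using the psutil library mentioned in that other thread (assuming it's installed via pip install psutil; the function and label names are just examples):

```python
# Minimal sketch: log memory headroom around a memory-hungry step.
# Assumes `pip install psutil`; adapt the labels to your own workflow.
import psutil

def log_memory(label: str) -> None:
    """Print this process's resident memory and the system's available memory."""
    rss_gb = psutil.Process().memory_info().rss / 1024**3
    vm = psutil.virtual_memory()
    print(f"{label}: process uses {rss_gb:.2f} GB, "
          f"{vm.available / 1024**3:.2f} GB available ({vm.percent}% of RAM in use)")

log_memory("before loading corpus")
# ... load your annotations / run a training step here ...
log_memory("after loading corpus")
```

If the available memory drops toward zero right before the process dies with a bare Killed, the Linux OOM killer is the usual culprit; on most systems dmesg will then show a corresponding "Out of memory" entry for the process.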

Hello @koaning ,

I realized a couple of things:

  1. data-to-spacy managed to build the .spacy files required for training in spaCy.
  2. When training the model, however, I ran into that Killed message again.

Having the .spacy files already generated, however, I decided to move to another cloud-based VM, this time with more memory... and the training completed successfully. That indirectly confirms the root cause of my issue.

Still, it would be awesome to have some updated code snippet to diagnose this problem (i.e., a snippet which can tell you whether you are actually running short of memory for your training dataset[s]), and some suggestions to avoid this problem with "big" training datasets.

Thank you.

When you're training locally, you can pass a custom config.cfg file to train a spaCy model. It has a few parameters that might be worth exploring further. For example, it allows you to pick smaller weights, which could help, but this setting might be the most useful:

[training.batcher.size]
@schedules = "compounding.v1"
start = 100
stop = 1000
compound = 1.001

It could be that the batch size is too big for your machine, so you could set it to be much smaller.
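For example, a more conservative batcher section in your config.cfg could look like this (the section and function names follow spaCy's default config; the exact numbers below are just a guess for you to experiment with):

```ini
[training.batcher]
@batchers = "spacy.batch_by_words.v1"
discard_oversize = false
tolerance = 0.2
get_length = null

[training.batcher.size]
@schedules = "compounding.v1"
start = 10
stop = 100
compound = 1.001
```

With spacy.batch_by_words.v1, the size schedule is measured in words per batch: training starts at start words and compounds up to stop, so lowering stop is the quickest way to cap peak memory during training.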

Hello @koaning ,

Thank you, I'll keep an eye on it. In general, is there any section in the documentation where all the parameters in the config.cfg file are detailed / explained? (e.g., in your code segment, which variable controls the batch size?). The only pieces of information I have found so far are this, this and, more informally, this, but even though they're nicely documented for aspects closely related to the spaCy architecture, they miss some other aspects more related to the modeling itself.

It would be awesome to know if I am missing some other section in the documentation that could add more info regarding that.

Best regards.

I understand where you're coming from. The config.cfg can be a bit intimidating, just because there are so many settings in ML models these days.

I usually rely on the Model Architectures section of the spaCy docs to understand the hyperparameters a bit better. There are some ideas for better educational content in this domain, but for now that part of the docs is the best reference for understanding all the settings.


Hi @dave-espinosa

I am facing a similar memory problem while creating a model with NER train.

My database with annotated data is relatively small, 200 MB. I have never used a VM for computing before.

Can you share which cloud-based VM you used and how you set up such an environment?

If you can share your experience, that would be great help.


Hello @rahul1 ,

Allow me to answer your questions:

My company uses Google Cloud Platform products; specifically for VMs, we usually use either Compute Engine (my current choice for Prodigy) or Vertex AI Workbench. I think Google grants new users USD 300 in credits, which in my own experience is enough to run ~3 months' worth of experiments (important to say that Google charges by hourly rate, so my estimate might fluctuate greatly depending on the intensity of your own experiments).

I used Prodigy's official documentation for the installation and setup.

Hope it helps, and sorry about the delay!

Hi @dave-espinosa
Thank you very much for the information.
I will go for this.


Hi @dave-espinosa,

It worked! Thanks for the help. I used Compute Engine with 64 GB RAM and an Ubuntu boot disk. Inside the VM instance over SSH, installing Prodigy is similar to doing it on any local Ubuntu laptop.

gr. Rahul