db-out killed

When running the db-out command on a dataset containing a large number of images, the command is killed and no data is written to the specified file.

$ prodigy db-out insurance_img_annotation > insurance_img_annotation.jsonl

/home/COMPUTE/eadkins/anaconda3/bin/prodigy: line 1:  8221 Killed                  python -m prodigy "$@"

I realized (belatedly) that I was writing the actual image data into each task. I suspect this is the cause, and I am changing it for future jobs (db-out works fine for large datasets containing only text). But for this one dataset, how can I recover my annotations?

This seems to be a memory issue. I copied the entire database file, prodigy.db, from the .prodigy/ directory to another computer with more memory and was able to run db-out successfully there. It is a rather clunky workaround, but at least I have the annotations.
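For anyone who cannot move the file to a bigger machine, reading the SQLite file row by row might avoid the memory spike. The sketch below is not part of Prodigy: `stream_dataset` is a hypothetical helper, and the table and column names (`dataset.name`, `example.content`, `link.example_id`, `link.dataset_id`) are an assumption about the database layout that should be checked against your own file first. Run it on a copy of prodigy.db.

```python
import json
import sqlite3


def stream_dataset(db_path, dataset_name, out_path):
    """Write one JSON line per example, one row at a time.

    ASSUMPTION: the database has tables `dataset`, `example` and `link`
    with the columns used in the query below. Verify the schema of your
    own prodigy.db before relying on this.
    """
    query = """
        SELECT example.content
        FROM example
        JOIN link ON link.example_id = example.id
        JOIN dataset ON dataset.id = link.dataset_id
        WHERE dataset.name = ?
    """
    count = 0
    conn = sqlite3.connect(db_path)
    try:
        with open(out_path, "w", encoding="utf8") as out:
            # Iterating the cursor fetches rows lazily, so only the
            # current example is decoded and held in Python at once.
            for (content,) in conn.execute(query, (dataset_name,)):
                if isinstance(content, bytes):
                    content = content.decode("utf8")
                # Re-serialize so every example ends up on a single line.
                out.write(json.dumps(json.loads(content)) + "\n")
                count += 1
    finally:
        conn.close()
    return count
```

Under those assumptions, `stream_dataset("prodigy_copy.db", "insurance_img_annotation", "insurance_img_annotation.jsonl")` would return the number of examples written. Row order is whatever SQLite returns, since the query has no ORDER BY.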


Thanks for updating with your solution!

Even though it’s clunky, one thing I like about SQLite is that it gives you one straightforward file that you can back up and move around easily. Btw, if you’re ever in a situation where you want to change things manually (e.g. remove the encoded image data), you could also give DB Browser for SQLite a try. Just make sure to back up the .db file first.
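If you would rather script the cleanup than edit by hand, something like the following could drop the encoded image from each task once it is loaded as a dict. This is only a sketch: `strip_image_data` is a hypothetical helper, and it assumes the image sits under an `"image"` key as a base64 data URI (a string starting with `data:`). Paths and URLs are left untouched.

```python
def strip_image_data(task, key="image", placeholder=None):
    """Return a copy of the task with an inline data-URI image removed.

    ASSUMPTION: the encoded image is stored under `key` as a string
    beginning with "data:". Anything else (file path, URL) is kept.
    """
    cleaned = dict(task)  # shallow copy, so the original task is not modified
    value = cleaned.get(key)
    if isinstance(value, str) and value.startswith("data:"):
        cleaned[key] = placeholder
    return cleaned
```

The annotations themselves (spans, labels, answer) stay in place, so the stripped tasks are much smaller but still usable, provided you keep some other way to map each task back to its image.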
