Feature Request: db-in accepts a directory and imports all jsonl files in the directory

I have a workflow where I run db-out programmatically into a directory, and I would like to be able to create new Prodigy datasets from all the jsonl files in that directory. Currently db-in does not support passing a directory, so I have to run db-in once for each file in the directory.

I realize I can write a Python script that iterates through the directory and calls db-in for me, but it would be nice to have this built into Prodigy and I think it's a small feature (it would be basically the same code, just within Prodigy).

Thank you for your consideration!

Hi @jspinella,

We really appreciate the suggestion! We have thought about this before and came to the conclusion that input reading can get complex really quickly, as there are lots of different scenarios that would need to be supported, e.g. filtering by name or extension. For this reason, we believe it's more useful to provide a simple, atomic command that is easy to integrate with any custom script, so that users can orchestrate the file handling in the way best suited to their needs (just like you did). Hope that makes sense?
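For anyone landing on this thread, here is a minimal sketch of what such a wrapper script could look like. It is only an illustration, not part of Prodigy: the function names `build_db_in_commands` and `import_directory` are made up for this example, and it assumes that each dataset should be named after its file (without the `.jsonl` extension) and that Prodigy is installed in the same Python environment, so that `python -m prodigy db-in <dataset> <file>` works.

```python
import subprocess
import sys
from pathlib import Path


def build_db_in_commands(directory, pattern="*.jsonl"):
    """Build one `prodigy db-in` command per matching file, sorted by file name.

    Hypothetical helper: the dataset name is assumed to be the file name
    without its extension, e.g. `reviews.jsonl` -> dataset `reviews`.
    """
    commands = []
    for path in sorted(Path(directory).glob(pattern)):
        if not path.is_file():
            continue  # skip directories that happen to match the pattern
        dataset = path.stem  # assumption: dataset name mirrors the file name
        commands.append(
            [sys.executable, "-m", "prodigy", "db-in", dataset, str(path)]
        )
    return commands


def import_directory(directory, pattern="*.jsonl"):
    """Run db-in once per file; stop at the first command that fails."""
    for command in build_db_in_commands(directory, pattern):
        subprocess.run(command, check=True)
```

Calling `import_directory("./exports")` would then run db-in once for every jsonl file in `./exports`. Keeping the command building separate from the execution makes it easy to adjust the filtering (the `pattern` argument) or the dataset naming, which is exactly the part that tends to differ between workflows.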


Understandable! Thank you for the quick reply.