I would like to define my database connection in code, so I can read the connection details from an encrypted file. I know I can specify the database details in prodigy.json, but I don't like to have sensitive information in there unencrypted, given that it will be committed to version control.
With the teach recipe I could just overwrite the db component, but how do I do the same when calling batch_train?
Hi, can't you just apply the method described in this Gist (it is also posted on the forum) and use the prodigy.json.tpl to set up your DB connection via env vars? That is generally considered best practice for keeping credentials out of version control.
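For anyone who wants to try the template idea without Docker, here is a minimal sketch of rendering prodigy.json from prodigy.json.tpl with plain Python. It is not the Gist's code: the `${NAME}` placeholder syntax, the environment variable names and the helper functions are all assumptions made for illustration, and the `db_settings` layout should be checked against your PRODIGY_README.html.

```python
import json
import os
from string import Template


def render_config(template_text, env):
    """Fill ${NAME} placeholders from `env` and check the result is valid JSON."""
    # substitute() raises KeyError if a placeholder has no matching variable,
    # so a missing credential fails loudly instead of writing a broken config.
    rendered = Template(template_text).substitute(env)
    # Values are inserted verbatim, so a password containing a double quote
    # or backslash would produce invalid JSON; json.loads catches that here.
    json.loads(rendered)
    return rendered


def write_config(template_path, output_path, env=os.environ):
    """Read the template file, render it, and write the final config."""
    with open(template_path, encoding="utf8") as f:
        rendered = render_config(f.read(), env)
    with open(output_path, "w", encoding="utf8") as f:
        f.write(rendered)


# Hypothetical template contents (what prodigy.json.tpl might look like).
# The variable names are made up for this example.
EXAMPLE_TEMPLATE = """{
  "db": "postgresql",
  "db_settings": {
    "postgresql": {
      "host": "${PRODIGY_DB_HOST}",
      "dbname": "${PRODIGY_DB_NAME}",
      "user": "${PRODIGY_DB_USER}",
      "password": "${PRODIGY_DB_PASSWORD}"
    }
  }
}"""
```

You would commit only prodigy.json.tpl, add prodigy.json to .gitignore, and call `write_config("prodigy.json.tpl", "prodigy.json")` after exporting the variables in your shell or conda environment.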
Thanks, that may be a workaround. However I'm working in a conda environment rather than a Docker container. Maybe there is an easier way?
Maybe a solution would be to use dotEnv and apply the
Another option could be to edit the recipe source and replace DB = connect() with your custom database. The DB here is expected to be a regular Prodigy
If you want things to be really elegant, you could also wrap your custom database class in a Python package and expose it via the prodigy_db entry point group. You can find more details about this in the "Entry Points" section of your PRODIGY_README.html. All your prodigy.json would then need to do is specify "db": "your_custom_db" and the entry point will tell Prodigy how to resolve that name and what database to initialize.
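As a rough sketch of what the packaging side of this could look like, here is a setup.py that registers a name in the prodigy_db group. The package name, module name and the `db` attribute are hypothetical, and whether the entry point should resolve to a database class or an already initialized object is something to confirm in the "Entry Points" section of PRODIGY_README.html.

```python
# setup.py for a hypothetical package that exposes a custom database to Prodigy
from setuptools import setup

setup(
    name="your_custom_db",          # hypothetical package name
    version="0.1.0",
    py_modules=["your_custom_db"],  # hypothetical module holding the database code
    entry_points={
        # Group name as given in the thread. The left-hand side is the name
        # referenced from prodigy.json; the right-hand side is
        # "module:attribute". The attribute name "db" is an assumption.
        "prodigy_db": ["your_custom_db = your_custom_db:db"],
    },
)
```

After `pip install -e .` in the same conda environment as Prodigy, the module can read and decrypt the credentials itself, so prodigy.json only ever contains the name "your_custom_db".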
Thanks @ines, the entry point solution is very elegant indeed and works perfectly.