I'm practicing with Vincent's tutorial on spaCy LLM and Prodigy.
When I run the command below:
python3 -m prodigy ner.llm.correct annotated-food config.cfg examples.json
I get the following error:
UserWarning: Could not find the API key to access the OpenAI API. Ensure you have an API key set up via https://platform.openai.com/account/api-keys, then make it available as an environment variable 'OPENAI_API_KEY'.
~/.local/lib/python3.10/site-packages/spacy_llm/models/rest/openai/model.py:61: UserWarning: Authentication with provided API key failed. Please double-check you provided the correct credentials.
Working locally with an environment file containing the API org and key, the NER step was successful, but the next step, using it with Prodigy, failed. I think I know the reason: there is nowhere to provide the API key for the ner.llm.correct recipe.
Please help me with how to provide the API key on the command line, or in any other way, with Prodigy.
To be honest, the tutorials lack coverage and always assume that some requirements are satisfied which are not stated anywhere. Also, your Prodigy documentation is not thorough or clear. I mention this only to help you improve your documentation and tutorials.
Did you see the section on setting secrets for LLM in the docs?
You might be using a vendor, like OpenAI, as a backend for your LLM. In such cases you’ll need to set up secrets so that you can identify yourself.
These secrets really need to be kept safe, which is why we recommend storing them as environment variables in a .env file. Here’s an example of such a file. You can consult the expected environment variable names for different providers in the spaCy documentation.
OPENAI_API_ORG = "org-..."
OPENAI_API_KEY = "sk-..."
While this is a recommended way, Prodigy shouldn’t make assumptions on how the environment variables are managed. The user needs to make sure these variables are loaded before executing Prodigy. One way to do that is to load them via python-dotenv:
dotenv run -- python -m prodigy recipe arguments
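If you would rather not install python-dotenv, the same idea can be sketched with the standard library only. This is a minimal sketch, not how Prodigy or python-dotenv do it: `read_env_file` and `run_prodigy` are hypothetical helper names, the parser only handles simple `KEY = "value"` lines, and it assumes the .env file sits in the directory you run it from.

```python
import os
import subprocess
import sys
from pathlib import Path


def read_env_file(path):
    """Parse simple KEY = "value" lines (a simplified stand-in for python-dotenv)."""
    values = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and one layer of quote characters.
        values[key.strip()] = value.strip().strip("\"'")
    return values


def run_prodigy(recipe_args, env_path=".env"):
    """Run `python -m prodigy ...` with the .env values added to the environment."""
    # Values from the file take precedence over variables that are already set.
    env = {**os.environ, **read_env_file(env_path)}
    command = [sys.executable, "-m", "prodigy", *recipe_args]
    return subprocess.run(command, env=env, check=False)


# Example call, using the command from the question:
# run_prodigy(["ner.llm.correct", "annotated-food", "config.cfg", "examples.json"])
```

The variables are passed only to the Prodigy process, so nothing is written to your shell profile or exported globally.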
Please note that the OpenAI recipes do make this assumption and load the variables internally, so this step should not be needed, but the variables need to be stored in
If there are environment variables missing you should see a helpful warning message that tells you which variables you need to add.
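To see for yourself which variables are visible before starting a recipe, a quick check like the following can help. It is only a sketch: `missing_env_vars` is a hypothetical helper, and the two names are simply the ones from the example .env file above; whether both are mandatory for your account is an assumption.

```python
import os

# Variable names taken from the example .env file for xx.llm.xx recipes.
REQUIRED_VARS = ("OPENAI_API_ORG", "OPENAI_API_KEY")


def missing_env_vars(required=REQUIRED_VARS, environ=None):
    """Return the names in `required` that are unset or empty in `environ`."""
    environ = os.environ if environ is None else environ
    return [name for name in required if not environ.get(name)]


missing = missing_env_vars()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
else:
    print("All required environment variables are set.")
```

Run it in the same shell, and in the same way, that you start Prodigy, since a variable set in one terminal session is not visible in another.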
If you use an .env file, you should make sure that it is added to a .gitignore such that it never gets uploaded to a central repository. If somebody were to gain access to this key they might incur costs on your behalf with it.
This note is important too:
Please note that the environment variable names for OpenAI API keys are different for spacy-llm recipes. For xx.llm.xx recipes they are OPENAI_API_ORG and OPENAI_API_KEY, while for xx.openai.xx recipes they are