Provide llm keys while running prodigy from terminal to use with spacy llm

Fantahun · December 22, 2023, 12:43am

Hello,
I'm practicing with the tutorial by Vincent - spaCy LLM and prodigy.
When I run the command below:
python3 -m prodigy ner.llm.correct annotated-food config.cfg examples.json

I get the following error:
UserWarning: Could not find the API key to access the OpenAI API. Ensure you have an API key set up via https://platform.openai.com/account/api-keys, then make it available as an environment variable 'OPENAI_API_KEY'.
warnings.warn(
~/.local/lib/python3.10/site-packages/spacy_llm/models/rest/openai/model.py:61: UserWarning: Authentication with provided API key failed. Please double-check you provided the correct credentials.

Working locally with an environment file containing the API org and key, the NER step succeeded, but the next step, using it with Prodigy, failed. I think I know the reason: there is no place where the API key is provided for the ner.llm.correct recipe to work with.

How can I provide the API key on the command line, or in any other way, with Prodigy?

To be honest, the tutorials lack coverage and often assume that requirements are satisfied without stating them anywhere. The Prodigy documentation is also not thorough or clear. I mention this only to help you improve your documentation and tutorials.

Thanks

ryanwesslen · December 22, 2023, 4:30am

Hi @Fantahun,

Did you see the section on setting secrets for LLM in the docs?

Importance of `.env` files

You might be using a vendor, like OpenAI, as a backend for your LLM. In such cases you’ll need to set up secrets so that you can identify yourself.

These secrets really need to be kept safe, which is why we recommend storing them as environment variables in a .env file. Here’s an example of such a file. You can consult the expected environment variable names for different providers in the spaCy documentation.

OPENAI_API_ORG = "org-..."
OPENAI_API_KEY = "sk-..."

While this is a recommended way, Prodigy shouldn’t make assumptions on how the environment variables are managed. The user needs to make sure these variables are loaded before executing Prodigy. One way to do that is to load them via python-dotenv:

Example

dotenv run -- python -m prodigy recipe arguments

Please note that the OpenAI recipes do make this assumption and load the variables internally, so this step should not be needed for them, but the variables still need to be stored in .env.

If there are environment variables missing you should see a helpful warning message that tells you which variables you need to add.

If you use an .env file, you should make sure that it is added to a .gitignore such that it never gets uploaded to a central repository. If somebody were to gain access to this key they might incur costs on your behalf with it.
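If you would rather not install python-dotenv, the same idea can be sketched with the standard library alone. The helper names below (`read_env_file`, `env_with_secrets`) are hypothetical and not part of Prodigy or spacy-llm, and the parser only handles the simple `KEY = "value"` lines shown in the example file above, so treat it as a simplified stand-in rather than a replacement for python-dotenv:

```python
import os
from pathlib import Path


def read_env_file(path):
    """Parse simple KEY = "value" lines from a .env file into a dict.

    Hypothetical helper: it skips blank lines and # comments and strips
    surrounding quotes, but supports none of python-dotenv's other features.
    """
    values = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("\"'")
    return values


def env_with_secrets(path=".env"):
    """Return a copy of the current environment with the .env values added."""
    env = dict(os.environ)
    env.update(read_env_file(path))
    return env
```

The resulting dict could then be passed as the `env` argument of `subprocess.run` when launching `python -m prodigy ...`, so the keys are visible to the recipe without being exported in your shell.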

This note is important too:

Please note that the environment variable names for OpenAI API keys are different for spacy-llm recipes. For xx.llm.xx recipes they are OPENAI_API_KEY and OPENAI_API_ORG, while for xx.openai.xx recipes they are PRODIGY_OPENAI_KEY and PRODIGY_OPENAI_ORG.
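To see quickly which names a given recipe expects, a small pre-flight check along these lines may help. It is only a sketch: `EXPECTED_VARS` and `missing_vars` are hypothetical names, the mapping is copied from the note above, and it assumes recipe names follow the three-part `xx.llm.xx` / `xx.openai.xx` pattern. It reports which of the listed variables are unset and does not verify that the key itself is valid:

```python
import os

# Variable names taken from the note above, keyed by the middle part
# of the recipe name (an assumption about how recipes are named).
EXPECTED_VARS = {
    "llm": ("OPENAI_API_KEY", "OPENAI_API_ORG"),
    "openai": ("PRODIGY_OPENAI_KEY", "PRODIGY_OPENAI_ORG"),
}


def missing_vars(recipe, environ=None):
    """Return the expected variable names that are unset or empty."""
    environ = os.environ if environ is None else environ
    parts = recipe.split(".")
    family = parts[1] if len(parts) == 3 else None
    return [
        name
        for name in EXPECTED_VARS.get(family, ())
        if not environ.get(name)
    ]


if __name__ == "__main__":
    # Check the recipe from the original question against the real environment.
    print(missing_vars("ner.llm.correct"))
```

An empty list means every listed name is set for that recipe family; any names printed are the ones to add to your .env file or export before running Prodigy.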
