Injecting environment variable in Weasel run script

TimothePearce · November 22, 2023, 2:18pm

Dear spaCy Team,

I am currently working on a project that uses Prodigy with a spaCy project (weasel), and I've run into a problem setting up environment variables.

I am using weasel run <command> to execute a Prodigy recipe that needs the PRODIGY_ALLOWED_SESSIONS variable.

My project.yml file is defined as follows:

...
env:
  PRODIGY_ALLOWED_SESSIONS: "user_1,user_2,user_3"
...

commands:
  - name: "man"
    help: "Start the Prodigy manual annotation recipe"
    script:
      - "echo PRODIGY_ALLOWED_SESSIONS=${env.PRODIGY_ALLOWED_SESSIONS}"
      - "python -m prodigy textcat.manual ${vars.database} assets/${vars.document_paths} --label ${vars.labels}"
...

When I execute the weasel run man command, here is my output:

==================================== man ====================================
Running command: echo PRODIGY_ALLOWED_SESSIONS=
PRODIGY_ALLOWED_SESSIONS=
Running command: /opt/homebrew/Caskroom/miniforge/base/envs/annotations/bin/python3.11 -m prodigy textcat.manual ...
Using 11 label(s): ...

✨ Starting the web server at http://localhost:8080 ...
Open the app in your browser and start annotating!

As you can see, the PRODIGY_ALLOWED_SESSIONS environment variable is not being injected correctly.

I also tried using a vars variable, but that resulted in an error like:

Running command: PRODIGY_ALLOWED_SESSIONS=user_1,user_2,user_3 python -m prodigy textcat.manual ...'
Traceback (most recent call last):
...

FileNotFoundError: [E501] Can not execute command 'PRODIGY_ALLOWED_SESSIONS=user_1,user_2,user_3 python -m prodigy textcat.manual ... 
Do you have 'PRODIGY_ALLOWED_SESSIONS=user_1,user_2,user_3' installed?

How am I supposed to inject an environment variable like PRODIGY_ALLOWED_SESSIONS in a weasel project?

Thank you for your assistance.

ryanwesslen · November 22, 2023, 2:37pm

Hi @TimothePearce,

The env section exists so that you can refer to environment variables directly in your command definitions. The values themselves aren't stored in project.yml; the section is just a mapping between environment variable names and the variable names you can use within a project command.
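To make that concrete, here is a minimal Python sketch of the lookup. It is a simplified model and not weasel's actual code: interpolate_env and the SESSIONS variable name are hypothetical, and the sketch assumes that an unset environment variable resolves to an empty string, which matches the empty echo output in the question.

```python
import os
import re


def interpolate_env(script_line, env_mapping):
    """Replace ${env.NAME} with the value of the environment variable
    that NAME maps to. Hypothetical helper, not part of weasel."""

    def lookup(match):
        # The mapping value is treated as the NAME of an environment variable.
        env_var_name = env_mapping.get(match.group(1), "")
        # Assumption: a variable that is not set resolves to an empty string.
        return os.environ.get(env_var_name, "")

    return re.sub(r"\$\{env\.(\w+)\}", lookup, script_line)


# Set in-process for the demo; normally this is exported in the shell.
os.environ["PRODIGY_ALLOWED_SESSIONS"] = "user_1,user_2,user_3"

# As in the question: the mapping value holds the sessions themselves, so the
# lookup searches for a variable literally named "user_1,user_2,user_3".
as_in_question = interpolate_env(
    "echo PRODIGY_ALLOWED_SESSIONS=${env.PRODIGY_ALLOWED_SESSIONS}",
    {"PRODIGY_ALLOWED_SESSIONS": "user_1,user_2,user_3"},
)

# Mapping a project variable name to the environment variable's name instead.
as_mapping = interpolate_env(
    "echo PRODIGY_ALLOWED_SESSIONS=${env.SESSIONS}",
    {"SESSIONS": "PRODIGY_ALLOWED_SESSIONS"},
)

print(as_in_question)
print(as_mapping)
```

In project.yml terms, the second call corresponds to an env entry whose value is the name PRODIGY_ALLOWED_SESSIONS, with the variable itself set outside the project file before calling weasel run.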

What I've typically done is to install python-dotenv and prefix the command with dotenv run --.

For example, I'd have a .env file (for example here using LLM keys) in my root folder and then could run:

  - name: "prodigy-ner-fewshot"
    help: "Prodigy ner few shot"
    script:
      - "dotenv run -- python -m prodigy ner.llm.fetch ${vars.config-fewshot} ${vars.input} ${vars.output-fewshot}"
      - "python -m prodigy db-in ${vars.dataset-fewshot} ${vars.output-fewshot}"
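For intuition, here is a rough stdlib-only approximation of what a dotenv run -- <command> wrapper does: read KEY=VALUE pairs from a file, merge them into a copy of the current environment, and start the command with that environment. This is a sketch and not python-dotenv's implementation; parse_env_file and run_with_env_file are hypothetical names, the parser only handles simple KEY=VALUE lines, and letting file values take precedence over existing variables is an assumption of the sketch.

```python
import os
import subprocess
import sys
import tempfile


def parse_env_file(path):
    """Parse simple KEY=VALUE lines; skip blank lines and # comments."""
    values = {}
    with open(path, encoding="utf-8") as f:
        for raw in f:
            line = raw.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # Drop optional surrounding quotes from the value.
            values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def run_with_env_file(path, command):
    """Run command (a list of tokens, no shell) with the file's variables
    added to the child's environment."""
    env = {**os.environ, **parse_env_file(path)}
    return subprocess.run(
        command, env=env, capture_output=True, text=True, check=True
    )


with tempfile.TemporaryDirectory() as tmp:
    env_path = os.path.join(tmp, ".env")
    with open(env_path, "w", encoding="utf-8") as f:
        f.write('PRODIGY_ALLOWED_SESSIONS="user_1,user_2,user_3"\n')

    # Stand-in for the Prodigy process: print the variable it receives.
    child = [
        sys.executable,
        "-c",
        "import os; print(os.environ['PRODIGY_ALLOWED_SESSIONS'])",
    ]
    sessions_seen_by_child = run_with_env_file(env_path, child).stdout.strip()

print(sessions_seen_by_child)
```

The variables travel through the child's environment and are not part of the command line. The E501 error in the question suggests that the first token of a script line is looked up as an executable, so a shell-style VAR=value prefix is not interpreted as an assignment, whereas a wrapper executable such as dotenv is a valid first token.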

Also, just a heads up: if you have weasel-specific questions, you're likely better off posting them directly on the weasel GitHub issues page. We're still a small team with multiple libraries, and each teammate typically answers questions about the library they maintain (e.g., the spaCy core team typically focuses on spaCy GitHub issues, not this forum, which is for Prodigy-specific problems).

TimothePearce · November 22, 2023, 2:54pm

Hi @ryanwesslen

Thank you very much for your response.

I followed your suggestion to use python-dotenv, and it works perfectly.

I appreciate your help and the heads-up about where to post specific questions. Next time, I'll make sure to direct my queries to the appropriate GitHub issues page for more targeted assistance.

Thanks again for your support!
