Deploying Prodigy on Google Cloud Standard App Engine

How do I start Prodigy from a Python script instead of running

python -m prodigy my_recipe -F recipes.py

I have to use a copy of Prodigy that has been installed into a local lib directory with

pip install prodigy*.whl -t lib

instead of

pip install prodigy*.whl
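One way to approach this is a small launcher script that puts the lib directory on the import path and then runs the prodigy module the same way python -m would. This is a minimal sketch using only the standard library, not an official Prodigy entry point. The helper name prepare_launch is hypothetical, and copying PORT into PRODIGY_PORT assumes Prodigy picks that variable up at startup, as the entrypoint later in this thread relies on.

```python
import os
import runpy
import sys


def prepare_launch(recipe, recipes_file, lib_dir="lib", environ=None):
    """Set up the import path and port, and return the argv for the prodigy module."""
    environ = os.environ if environ is None else environ
    # Assumption: the platform provides the port in PORT and Prodigy reads PRODIGY_PORT.
    if "PORT" in environ:
        environ["PRODIGY_PORT"] = environ["PORT"]
    # Same effect as PYTHONPATH=lib: make packages installed with `-t lib` importable.
    lib_path = os.path.abspath(lib_dir)
    if lib_path not in sys.path:
        sys.path.insert(0, lib_path)
    # Mirrors the command line: python -m prodigy my_recipe -F recipes.py
    return ["prodigy", recipe, "-F", recipes_file]


if __name__ == "__main__":
    sys.argv = prepare_launch("my_recipe", "recipes.py")
    # Run the package's __main__ module, as `python -m prodigy` would.
    runpy.run_module("prodigy", run_name="__main__", alter_sys=True)
```

The setup lives in a function so that the path and environment handling can be checked without actually starting the server.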

The reason is that I want to deploy it on Google Cloud Standard App Engine. I solved it by adding

entrypoint: PRODIGY_PORT=$PORT PYTHONPATH=lib python -m prodigy my_recipe -F recipes.py

to my app.yaml file. I also had to install Prodigy using

pip install --no-deps prodigy*.whl -t lib

and manually add the Prodigy dependencies to a requirements.txt file.
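Collecting the dependencies by hand can be partly automated, because a wheel is a zip archive whose *.dist-info/METADATA file lists them as Requires-Dist headers. The sketch below reads those headers with the standard library. The helper name wheel_requirements is hypothetical, and skipping entries tied to optional extras is an assumption about what a deployment needs.

```python
import zipfile
from email.parser import Parser


def wheel_requirements(wheel_path):
    """Return the Requires-Dist entries declared in a wheel's METADATA file."""
    with zipfile.ZipFile(wheel_path) as wheel:
        # A wheel contains a <name>-<version>.dist-info/METADATA file.
        metadata_name = next(
            name for name in wheel.namelist()
            if name.endswith(".dist-info/METADATA")
        )
        raw = wheel.read(metadata_name).decode("utf-8")
    # METADATA uses email-style headers, so the email parser can read it.
    metadata = Parser().parsestr(raw)
    requirements = metadata.get_all("Requires-Dist") or []
    # Assumption: dependencies that only apply to optional extras are not needed.
    return [req for req in requirements if "extra ==" not in req]
```

The returned strings can then be written to requirements.txt, one per line, and reviewed before deploying.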

Okay, just to confirm: your last post includes the solution, right? :)

Btw, you might also want to look into .pex files: https://pex.readthedocs.io/en/stable/whatispex.html

We use them a lot internally, and they basically let you package a whole Python executable into one file (the only thing you need pre-installed is Python itself, that’s it). Instead of python -m prodigy ..., you could then run something like prodigy_env.pex -m prodigy .... The nice thing here is that you can really package everything with it, including Prodigy itself, any spaCy models, custom modules etc. All you ever have to deploy is one single file.

Yes, I solved it. .pex sounds pretty damn cool. Thanks for that one!

I have another challenge though. Sometimes the same task is presented twice. Looking at _task_hash and _input_hash, I do indeed see multiple occurrences of some values. Shouldn’t the _task_hash be unique?

I am wrapping the generic recipe and passing each task dict through prodigy.set_hashes.

EDIT

Just noticed that I needed to set exclude. That might be the solution - I’ll wait and see.
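Since the hashes are derived from the task content, identical tasks end up with identical values, so repeated hashes point to repeated content in the stream. Apart from exclude, one option is to filter the stream on _task_hash before it is served. This is a minimal sketch in plain Python, assuming each task is a dict that already carries _task_hash (for example after prodigy.set_hashes). The helper drop_repeated_tasks is hypothetical and not part of Prodigy.

```python
def drop_repeated_tasks(stream, key="_task_hash"):
    """Yield each task once, skipping tasks whose hash has already been seen."""
    seen = set()
    for task in stream:
        task_hash = task[key]
        if task_hash in seen:
            # Same hash as an earlier task: treat it as a duplicate.
            continue
        seen.add(task_hash)
        yield task


tasks = [
    {"text": "first", "_task_hash": 1},
    {"text": "second", "_task_hash": 2},
    {"text": "first", "_task_hash": 1},
]
unique_tasks = list(drop_repeated_tasks(tasks))
```

Because it is a generator, the filter keeps only the set of hashes in memory and works on streams that are consumed lazily.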