Using the Prodigy PyPI server in requirements.txt or as an index URL

Great news indeed! Congrats on the huge release-- very much looking forward to trying out the textcat.correct recipe.

I ran into one small issue with the new installation procedure from your repo. This line in requirements.txt fails with the error below:

prodigy @ https://${PRODIGY_LICENSE}@download.prodi.gy

(I should note that I've confirmed PRODIGY_LICENSE is set in the env.)

ERROR: Cannot unpack file /private/var/folders/jk/ld57c3v168129sjtmxtc4tjw0000gn/T/pip-unpack-g384q1u4/download.prodi.gy (downloaded from /private/var/folders/jk/ld57c3v168129sjtmxtc4tjw0000gn/T/pip-install-4d0ymt67/prodigy_a174278a6dd44ae8a5fd993455d58a24, content-type: text/html; charset=utf-8); cannot detect archive format
ERROR: Cannot determine archive format of /private/var/folders/jk/ld57c3v168129sjtmxtc4tjw0000gn/T/pip-install-1yytf_xl/prodigy_b6c40481ec7749bfaecc02e5c84e120b

Editing in real time!

Using this line in requirements.txt:

--find-links https://${PRODIGY_LICENSE}@download.prodi.gy

Gets us through the error above, but pip spits out this warning:

WARNING: Skipping page https://****@download.prodi.gy because the GET request got Content-Type: application/json.The only supported Content-Type is text/html

Will update when I find the proper requirements.txt incantation...

Hi! I moved your comment to a separate thread because this makes it easier to discuss it separately (and the nightly thread was getting kinda large) 🙂

Definitely keep us updated if you've found a solution. I'm a bit confused why it doesn't work with --find-links – maybe it's the authentication? It definitely works in other cases, e.g. PyTorch, but that index URL is public. Is there a way you can make pip show you the response it gets? Since it's a JSON object, I suspect it might be some error message.
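If pip's verbose output is hard to read, one way to look at the raw response is a small script like the one below. This is only a sketch: it assumes the server accepts HTTP Basic auth with the credentials embedded in the URL (which is how pip treats `https://KEY@host` URLs), and the function names are made up for illustration.

```python
import base64
import urllib.error
import urllib.request
from urllib.parse import urlsplit, urlunsplit


def split_credentials(url):
    """Return (url_without_userinfo, username, password) for https://USER[:PASS]@host/path."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if parts.port:
        host = f"{host}:{parts.port}"
    clean = urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))
    return clean, parts.username or "", parts.password or ""


def basic_auth_header(username, password=""):
    """Build an HTTP Basic Authorization header value (assumed auth scheme)."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def inspect(url):
    """Print the status, Content-Type and first bytes of the response body."""
    clean_url, user, password = split_credentials(url)
    request = urllib.request.Request(clean_url)
    if user:
        request.add_header("Authorization", basic_auth_header(user, password))
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            status, headers, body = response.status, response.headers, response.read(500)
    except urllib.error.HTTPError as err:
        # Error responses (e.g. 403) still carry headers and a body worth reading.
        status, headers, body = err.code, err.headers, err.read(500)
    print("status:", status)
    print("content-type:", headers.get("Content-Type"))
    print(body.decode("utf-8", errors="replace"))
```

Calling `inspect(f"https://{os.environ['PRODIGY_LICENSE']}@download.prodi.gy")` (after `import os`) would then print whatever the server sends back, without putting the key in the script itself.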

At the moment, our PyPI server is set up to provide a list of wheels that works with -f / --find-links, but that's not actually a real index URL (which would return a link named prodigy pointing to a list of wheels). The main reason for this is simplicity: -f is nice and short, matches what other libraries like PyTorch do, and making the prodigy index the default means you don't have to type download.prodi.gy/prodigy.

An alternative solution would be for us to also provide a "real" index URL that you can specify via --extra-index-url. This basically comes down to just exposing another endpoint via our download service.
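To make the difference between the two layouts concrete, here is a rough sketch that renders both kinds of page. It follows the general shape of a "simple" package index (a root page with one link per project, then a per-project page listing files) and is not the actual output of download.prodi.gy. The base URL and helper names are placeholders; the wheel filename is the one from the pip log in this thread.

```python
from html import escape

# Wheel filename copied from the pip log in this thread; the base URL is a placeholder.
WHEELS = ["prodigy-1.11.0-cp39-cp39-macosx_10_14_x86_64.whl"]
BASE_URL = "https://files.example.invalid/dist"


def anchor(href, text):
    """Render one link line."""
    return f'<a href="{escape(href)}">{escape(text)}</a><br>'


def page(body_lines):
    """Wrap link lines in a minimal HTML document."""
    return "<!DOCTYPE html>\n<html><body>\n" + "\n".join(body_lines) + "\n</body></html>"


def find_links_page(filenames, base_url=BASE_URL):
    """Flat page for -f / --find-links: every wheel is linked directly."""
    return page([anchor(f"{base_url}/{name}", name) for name in filenames])


def index_root_page(projects):
    """Root of an index for --extra-index-url: one link per project, e.g. prodigy/."""
    return page([anchor(f"{name}/", name) for name in projects])


def index_project_page(filenames, base_url=BASE_URL):
    """Per-project page of an index: the same file list, one level below the root."""
    return page([anchor(f"{base_url}/{name}", name) for name in filenames])


print(index_root_page(["prodigy"]))
print(index_project_page(WHEELS))
```

The per-project page ends up looking just like the flat find-links page. The extra piece an index adds is the root page, which lets pip (or poetry) look a package up by name.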

We also got a question about poetry the other day, which supports installing from a private index, but doesn't implement an equivalent of --find-links. So having a separate index URL available would likely also be very useful for making the PyPI downloads work out-of-the-box with poetry.

Hi Ines,

Thanks for splitting out the comment. I ran the installation in a fresh venv and piped the output to a log file. I can share it with you, but it contains some AWS key info that I'm hesitant to post, so let me know if I should send it via email.

The error does indeed seem to come from authentication. The relevant output from pip install -r requirements.in -vvv is:

ERROR: HTTP error 403 while getting https://s3.eu-west-1.amazonaws.com/data.prodi.gy/dist/prodigy-1.11.0-cp39-cp39-macosx_10_14_x86_64.whl?AWSAccessKeyId=
[...long access keys and tokens...]
Could not install requirement prodigy from https://s3.eu-west-1.amazonaws.com/data.prodi.gy/dist/prodigy-1.11.0-cp39-cp39-macosx_10_14_x86_64.whl?AWSAccessKeyId=
[...]
because of HTTP error 403 Client Error: Forbidden for url: https://s3.eu-west-1.amazonaws.com/data.prodi.gy/dist/prodigy-1.11.0-cp39-cp39-macosx_10_14_x86_64.whl?AWSAccessKeyId=
[...]

I'm motoring along with the traditional local install at the moment but definitely see the utility in the PyPI server and alternative installation approach! I've been around the block with conda and pipenv and am right back to vanilla pip so don't have much insight into poetry, but that does seem to be where the wind is blowing these days. I'd give that a try, certainly, and the --extra-index-url seems like a fine way to cover more use cases-- but I'm not speaking from a ton of knowledge here.

Thanks again for all the great stuff,

Adam

Update: Just deployed a new version of our PyPI server that now also exposes an /index endpoint to use with --extra-index-url. So you should be able to add the following on the CLI or to your requirements.txt:

--extra-index-url https://****@download.prodi.gy/index

Let me know if this works now! Also, if there's a situation where you need to configure a private PyPI index with a URL + username + password, you can use the license key as the username and no password.
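For tools that take the URL, username and password as separate settings, it can help to see how the pieces map onto the single URL form used above. A minimal sketch, assuming the index URL has no credentials in it yet; `with_credentials` is a hypothetical helper and `XXXX-XXXX` stands in for a real license key:

```python
from urllib.parse import quote, urlsplit, urlunsplit


def with_credentials(index_url, username, password=""):
    """Embed username (and optional password) into a URL as https://USER[:PASS]@host/path."""
    parts = urlsplit(index_url)
    # Percent-encode so characters like "@" or ":" in the credentials don't break the URL.
    userinfo = quote(username, safe="")
    if password:
        userinfo += ":" + quote(password, safe="")
    netloc = f"{userinfo}@{parts.netloc}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


# License key as the username, no password:
print(with_credentials("https://download.prodi.gy/index", "XXXX-XXXX"))
```

With an empty password, the result has the same shape as the --extra-index-url line above.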

Works a treat! The following at the bottom of requirements.txt installs prodigy successfully on a bare venv:

--extra-index-url https://${PRODIGY_LICENSE}@download.prodi.gy/index
-f https://${PRODIGY_LICENSE}@download.prodi.gy
prodigy
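For anyone wanting to double-check that the variable is picked up before running pip: pip substitutes `${NAME}` placeholders in requirements files from the environment (only this braced form, with uppercase letters, digits and underscores) and leaves the placeholder untouched if the variable is unset or empty. The sketch below imitates that behaviour as I understand it. It is not pip's own code, and the function names are made up.

```python
import os
import re

# Matches ${NAME}, where NAME consists of uppercase letters, digits and underscores.
PLACEHOLDER = re.compile(r"\$\{([A-Z0-9_]+)\}")


def expand(line, env=None):
    """Replace ${NAME} with its value; keep the placeholder if NAME is unset or empty."""
    env = os.environ if env is None else env

    def substitute(match):
        value = env.get(match.group(1))
        return value if value else match.group(0)

    return PLACEHOLDER.sub(substitute, line)


def unset_variables(line, env=None):
    """List placeholder names in the line that have no (non-empty) value."""
    env = os.environ if env is None else env
    return [name for name in PLACEHOLDER.findall(line) if not env.get(name)]


line = "--extra-index-url https://${PRODIGY_LICENSE}@download.prodi.gy/index"
print(unset_variables(line))  # an empty list means the variable is set
```

If `unset_variables` returns `["PRODIGY_LICENSE"]`, pip would request the URL with the literal placeholder still in it, which is worth ruling out before debugging the server side.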

Awesome, thanks for reporting back! I think you might be able to leave out the -f in this case? Pretty sure --extra-index-url already has you covered.

Thanks for the tip! You are very much correct! Confirmed that removing the -f line works fine on a bare env.

Again, much appreciated!

Adam