Hi all,
This might sound really stupid, but I am trying to reinstall Prodigy from scratch in a new, clean environment (I messed up my previous Prodigy environment, which was more or less working), and I keep receiving error messages right after the installation.
My setup is as follows:
windows 10
Anaconda / Jupyter lab / Python 3.8
I created a new environment in Anaconda, then moved to the Jupyter terminal, where I ran the following:
pip install prodigy-1.10.8-cp36.cp37.cp38-cp36m.cp37m.cp38-win_amd64.whl --user --force-reinstall
conda install -n gio_prodigy_new -c conda-forge spacy
python -m spacy download en_core_web_sm
All of this completes without errors.
In a Jupyter notebook I can then successfully import prodigy and call prodigy.get_config().
However, when I run !python -m prodigy stats -l or !python -m prodigy ner.manual test_dataset blank:en ./test.jsonl --label TEST, I end up with this error message:
ImportError: cannot import name 'get_array_module' from 'thinc.api' (C:\Users\Giovanni\AppData\Roaming\Python\Python38\site-packages\thinc\api.py)
Are there any steps I am missing? Any help would be much appreciated, thanks!
g.
Hi! Not stupid at all, managing virtual environments and various installation setups can sometimes be a headache.
The fact that get_array_module can't be imported from thinc.api indicates that some code is trying to use functionality from Thinc v8, which goes together with spaCy v3. However, neither of these is compatible with Prodigy 1.10.
Long story short, to use Prodigy 1.10, you need to make sure that you have spaCy 2.x installed on your system, and Thinc 7.x. It looks like your conda install statement will pull in spaCy v3, which won't work. Do you actually need this line? I would expect the pip install prodigy... line to pull in the correct spacy version for you?
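A quick way to check this from inside the environment is to look at the installed versions directly. The following is only a minimal sketch: the helper names (installed_versions, find_problems) are made up for illustration, and the required major versions are simply the ones mentioned above (spaCy 2.x and Thinc 7.x for Prodigy 1.10).

```python
from importlib import metadata

# Major versions that, according to this thread, Prodigy 1.10 needs.
REQUIRED_MAJOR = {"spacy": 2, "thinc": 7}


def installed_versions(names):
    """Return {name: version string, or None if not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def find_problems(versions, required=REQUIRED_MAJOR):
    """Compare the found versions against the required major versions."""
    problems = []
    for name, major in required.items():
        found = versions.get(name)
        if found is None:
            problems.append(f"{name} is not installed")
        elif int(found.split(".")[0]) != major:
            problems.append(f"{name} {found} found, expected {major}.x")
    return problems


if __name__ == "__main__":
    issues = find_problems(installed_versions(REQUIRED_MAJOR))
    for line in issues or ["spaCy and Thinc versions look compatible"]:
        print(line)
```

Run it with the same interpreter that launches Prodigy (python, not a different kernel), otherwise it reports on the wrong environment.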
If you can, it would be worthwhile to start again from a fresh new virtual environment to avoid any conflicts.
It could also be a good idea to double check your system in a Python console/IDE first, before moving to Jupyter. In the past, we've seen some issues where Jupyter didn't start in the correct environment, or was using system-wide installs instead of env-specific, which makes everything even more confusing.
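To compare what a plain Python console and a Jupyter kernel are actually using, something like the sketch below can be run in both places. It relies only on the standard library; the function name environment_report and the default list of modules are illustrative choices, not part of Prodigy.

```python
import importlib.util
import site
import sys


def environment_report(modules=("prodigy", "spacy", "thinc")):
    """Collect the interpreter path, env prefix and module locations."""
    report = {
        "executable": sys.executable,          # which python is running
        "prefix": sys.prefix,                  # root of the active env
        "user_site_enabled": bool(site.ENABLE_USER_SITE),
        "user_site": site.getusersitepackages(),
    }
    for name in modules:
        try:
            # find_spec locates the module without importing it
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            spec = None
        report[f"module:{name}"] = spec.origin if spec else None
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

If the executable or the module paths differ between the console and the notebook, the notebook is not running in the environment you think it is.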
Thanks Sofie!
Ok so following your advice I did the following:
created a fresh environment;
launched the anaconda prompt, and from there conda list -n test_prodigy, which showed (among a few other packages) that the python version is 3.8.8
activated the new environment, then ran pip install prodigy[...].whl, which surprisingly seemed to recognize a previous installation, and gave
[..] requirement already satisfied... prodigy is already installed with the same version of the provided wheel. use force-reinstall to force an installation of the wheel.
In the same environment, following your hunch about system-wide installs, I ran where python, and I got the following:
When I move to the fresh environment in Jupyter, I get the same error as before when I try to launch prodigy:
Traceback (most recent call last):
  File "C:\Anaconda\envs\test_prodigy\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Anaconda\envs\test_prodigy\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\Giovanni\AppData\Roaming\Python\Python38\site-packages\prodigy\__main__.py", line 48, in <module>
    registry.recipes.get_entry_points()
  File "C:\Users\Giovanni\AppData\Roaming\Python\Python38\site-packages\catalogue.py", line 127, in get_entry_points
    result[entry_point.name] = entry_point.load()
  File "C:\Anaconda\envs\test_prodigy\lib\importlib\metadata.py", line 77, in load
    module = import_module(match.group('module'))
  File "C:\Anaconda\envs\test_prodigy\lib\importlib\__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 783, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "C:\Users\Giovanni\AppData\Roaming\Python\Python38\site-packages\sense2vec\__init__.py", line 1, in <module>
    from .sense2vec import Sense2Vec # noqa: F401
  File "C:\Users\Giovanni\AppData\Roaming\Python\Python38\site-packages\sense2vec\sense2vec.py", line 9, in <module>
    from .util import registry, cosine_similarity
  File "C:\Users\Giovanni\AppData\Roaming\Python\Python38\site-packages\sense2vec\util.py", line 5, in <module>
    from thinc.api import get_array_module
ImportError: cannot import name 'get_array_module' from 'thinc.api' (C:\Users\Giovanni\AppData\Roaming\Python\Python38\site-packages\thinc\api.py)
I am relatively new to Python, but so far I have always managed to get things (more or less) working. This is indeed a bit confusing, so any help is welcome!
g.
Hm, that's weird. When you say you created a new environment, do you mean a new conda environment? If so, could you try installing all this just with pip and not use conda? And create a new virtual environment with python -m venv your_venv? I think the two may sometimes not play well together...
Also it would be informative to see the output of pip list if you keep running into issues, so we can verify where the culprit lies...
Hi Sofie, thanks a lot for your help. I tried following your advice, and I managed to get Prodigy working in a Python venv. I did not make any progress on the Anaconda side, but it still feels like moving forward.
In detail:
1_
Yes, the new environment was a conda one. I activated it in conda and I installed prodigy with pip: pip install prodigy-1.10.8-cp36.cp37.cp38-cp36m.cp37m.cp38-win_amd64.whl. Calling prodigy commands gets me the same error msg as yesterday. In the same environment, I also tried to install a 7.x version of thinc with pip install thinc==7.4.2, which gave me an error message at the end of the process:
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
sense2vec 2.0.0 requires catalogue<2.1.0,>=2.0.1, but you have catalogue 1.0.0 which is incompatible.
sense2vec 2.0.0 requires spacy<4.0.0,>=3.0.0, but you have spacy 2.3.5 which is incompatible.
sense2vec 2.0.0 requires srsly<3.0.0,>=2.4.0, but you have srsly 1.0.5 which is incompatible.
Successfully installed thinc-7.4.2
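The conflict messages above point at sense2vec 2.0.0 as the package that asks for spaCy v3. To see which installed packages declare a requirement on a given package, a sketch along these lines can help. It only reads the metadata that pip already stored and prints the raw requirement strings; it does not evaluate version specifiers. The helper names (requirement_name, requirements_on, who_requires) are made up for illustration.

```python
import re
from importlib import metadata


def normalize(name):
    """Normalize a distribution name (case, '-', '_' and '.')."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_name(requirement):
    """Extract the package name from a raw requirement string."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    return normalize(match.group(1)) if match else None


def requirements_on(requirements, target):
    """Keep only the requirement strings that refer to `target`."""
    target = normalize(target)
    return [r for r in requirements if requirement_name(r) == target]


def who_requires(target):
    """Map 'name version' of each installed dist to its requirements on target."""
    result = {}
    for dist in metadata.distributions():
        hits = requirements_on(dist.requires or [], target)
        if hits:
            result[f"{dist.metadata['Name']} {dist.version}"] = hits
    return result


if __name__ == "__main__":
    for dist, reqs in who_requires("spacy").items():
        print(dist, "->", reqs)
```

Any entry that asks for spacy>=3 in an environment meant for Prodigy 1.10 is a candidate for downgrading or removal.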
Ok cool, thanks for reporting back. I'm happy to hear things work with pip at least, and you can get Prodigy v1.10 running.
For the conda environment, it definitely looks like there are pre-installed packages there. Have you tried creating a new conda env, with a name you haven't used before? And then run pip list right after creating it just to be 100% sure nothing is installed yet before you start installing Prodigy?
We can also try fixing your current conda environment. From the pip list of your conda env, two things jump out at me that you might want to correct to see if it works then:
spacy-legacy should not be installed, so you can remove it. It is only relevant when using spaCy v3 with Prodigy v1.11.
sense2vec should be downgraded to 1.0.3 to work well with spaCy v2 and Prodigy v1.10.
I corrected those and now Prodigy is working fine in the conda env as well.
However, I don't understand why I get pre-installed packages in conda. This still puzzles me.
I promise, I did this also the first time around but anyway, I tried once more to be 1000% sure. Here's what I did:
1. go to command shell
2. conda create --name final_prodigy_test python=3.8
3. conda activate final_prodigy_test
4. pip list (before installing prodigy)
As you can see prodigy is already there (together with a bunch of other stuff). I reckon this is not the appropriate place, but is there a reason why these packages are preinstalled in a fresh conda env?
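One possible explanation, judging only from what is visible in this thread: the very first install used pip's --user flag, and the traceback paths point to C:\Users\Giovanni\AppData\Roaming\Python\Python38\site-packages, which is the per-user site-packages directory on Windows. Python adds that directory to sys.path for every Python 3.8 interpreter run by the same user, including one in a freshly created conda env, so packages installed there show up in pip list everywhere. The sketch below lists the distributions that live in the user site directory; the function names are made up for illustration.

```python
import site
from importlib import metadata
from pathlib import Path


def is_under(path, root):
    """True if `path` is `root` itself or located somewhere below it."""
    try:
        Path(path).resolve().relative_to(Path(root).resolve())
        return True
    except ValueError:
        return False


def user_site_packages():
    """Return {name: version} for distributions installed in the user site."""
    user_site = site.getusersitepackages()
    found = {}
    for dist in metadata.distributions():
        # locate_file("") gives the directory that holds the distribution
        location = Path(str(dist.locate_file("")))
        name = dist.metadata["Name"]
        if name and is_under(location, user_site):
            found[name] = dist.version
    return found


if __name__ == "__main__":
    print("user site:", site.getusersitepackages())
    for name, version in sorted(user_site_packages().items()):
        print(f"{name}=={version}")
```

If this prints prodigy, thinc, sense2vec and friends inside a brand-new env, they come from the user site and not from conda. Starting Python with the -s option, or setting the PYTHONNOUSERSITE environment variable, keeps the user site directory off sys.path, which is one way to confirm this.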
It's been a while since I used conda myself, and I stopped using it because I also ran into weird behaviour from time to time, which doesn't happen anymore when just sticking to pip.
One last question and then I'll let you go. In case I were to drop Anaconda entirely, is there a notebook-like interface that you would recommend? (i.e., I've read somewhere that Jupyter can be installed via pip.)
I'm afraid I'm not really an expert on notebooks either; I typically use PyCharm as my IDE for programming in Python. The "Installing the Jupyter Software" page on the Project Jupyter site should have some information for you, and in general I guess StackOverflow is a better place to ask these general-purpose Python questions.