Running Prodigy for the First Time: Problem with thinc.api?

Steps thus far:

  1. Created a brand new virtual environment from the command shell with the following:
    conda create --no-default-packages -n test_new2 python=3.8

  2. Upon creation, the following packages were downloaded and installed:
    [screenshot: list of downloaded and installed packages]

  3. Ran pip install ./prodigy*.whl, which worked like a charm this time, with no errors.

  4. Ran python -m spacy download en_core_web_sm with no errors.

I should be good to go now, right? So, I open up https://prodi.gy/docs/named-entity-recognition to see how this thing works. I copy the news_headlines.jsonl file into my virtual environment's folder and enter this command in the terminal:

prodigy ner.manual ner_news_headlines blank:en ./news_headlines.jsonl --label PERSON,ORG,PRODUCT,LOCATION

However, Windows pops up a "How do you want to open this file?" dialog. When I view the file as text, I see:

#!/bin/sh
python -m prodigy "$@"

I assume Prodigy should run automatically after issuing this command? Or is there some additional setup I need to perform?

When I run the python -m prodigy stats command, I get the following:

[screenshot: output of python -m prodigy stats]

When I run the python -m prodigy "$@" command I get this:

PS C:\Users\AndrewG\anaconda3\envs\test_new2\code_files> python -m prodigy "$@"
Traceback (most recent call last):
  File "C:\Users\AndrewG\anaconda3\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\AndrewG\anaconda3\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\AndrewG\anaconda3\lib\site-packages\prodigy\__main__.py", line 48, in <module>
    registry.recipes.get_entry_points()
  File "C:\Users\AndrewG\anaconda3\lib\site-packages\catalogue.py", line 127, in get_entry_points
    result[entry_point.name] = entry_point.load()
  File "C:\Users\AndrewG\anaconda3\lib\importlib\metadata.py", line 77, in load
    module = import_module(match.group('module'))
  File "C:\Users\AndrewG\anaconda3\lib\importlib\__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 783, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "C:\Users\AndrewG\anaconda3\lib\site-packages\sense2vec\__init__.py", line 1, in <module>
    from .sense2vec import Sense2Vec  # noqa: F401
  File "C:\Users\AndrewG\anaconda3\lib\site-packages\sense2vec\sense2vec.py", line 9, in <module>
    from .util import registry, cosine_similarity
  File "C:\Users\AndrewG\anaconda3\lib\site-packages\sense2vec\util.py", line 5, in <module>
    from thinc.api import get_array_module
ImportError: cannot import name 'get_array_module' from 'thinc.api' (C:\Users\AndrewG\anaconda3\lib\site-packages\thinc\api.py)  

The spaCy version is 2.3.7.

My pip list:

Package                            Version
---------------------------------- -------------------
aiofiles                           0.7.0
alabaster                          0.7.12
anaconda-client                    1.7.2
anaconda-navigator                 2.0.3
anaconda-project                   0.9.1
anyio                              2.2.0
appdirs                            1.4.4
argh                               0.26.2
argon2-cffi                        20.1.0
asn1crypto                         1.4.0
astroid                            2.5
astropy                            4.2.1
async-generator                    1.10
atomicwrites                       1.4.0
attrs                              20.3.0
autopep8                           1.5.6
Babel                              2.9.0
backcall                           0.2.0
backports.functools-lru-cache      1.6.4
backports.shutil-get-terminal-size 1.0.0
backports.tempfile                 1.0
backports.weakref                  1.0.post1
bcrypt                             3.2.0
beautifulsoup4                     4.9.3
bitarray                           1.9.2
bkcharts                           0.2
black                              19.10b0
bleach                             3.3.0
blis                               0.7.4
bokeh                              2.3.2
boto                               2.49.0
Bottleneck                         1.3.2
brotlipy                           0.7.0
cachetools                         4.2.2
catalogue                          1.0.0
certifi                            2020.12.5
cffi                               1.14.5
chardet                            4.0.0
click                              7.1.2
cloudpickle                        1.6.0
clyent                             1.2.2
colorama                           0.4.4
comtypes                           1.1.9
conda                              4.10.1
conda-build                        3.21.4
conda-content-trust                0+unknown
conda-package-handling             1.7.3
conda-repo-cli                     1.0.4
conda-token                        0.3.0
conda-verify                       3.4.2
contextlib2                        0.6.0.post1
cryptography                       3.4.7
cycler                             0.10.0
cymem                              2.0.5
Cython                             0.29.23
cytoolz                            0.11.0
dask                               2021.4.0
decorator                          5.0.6
defusedxml                         0.7.1
diff-match-patch                   20200713
distributed                        2021.4.0
docutils                           0.17
en-core-web-md                     2.3.1
en-core-web-sm                     2.3.1
entrypoints                        0.3
et-xmlfile                         1.0.1
fastapi                            0.44.1
fastcache                          1.1.0
filelock                           3.0.12
flake8                             3.9.0
Flask                              1.1.2
fsspec                             0.9.0
future                             0.18.2
gevent                             21.1.2
glob2                              0.7
greenlet                           1.0.0
h11                                0.9.0
h5py                               2.10.0
HeapDict                           1.0.1
html5lib                           1.1
idna                               2.10
imagecodecs                        2021.3.31
imageio                            2.9.0
imagesize                          1.2.0
importlib-metadata                 3.10.0
iniconfig                          1.1.1
intervaltree                       3.1.0
ipykernel                          5.3.4
ipython                            7.22.0
ipython-genutils                   0.2.0
ipywidgets                         7.6.3
isort                              5.8.0
itsdangerous                       1.1.0
jdcal                              1.4.1
jedi                               0.17.2
Jinja2                             2.11.3
joblib                             1.0.1
json5                              0.9.5
jsonschema                         3.2.0
jupyter                            1.0.0
jupyter-client                     6.1.12
jupyter-console                    6.4.0
jupyter-core                       4.7.1
jupyter-packaging                  0.7.12
jupyter-server                     1.4.1
jupyterlab                         3.0.14
jupyterlab-pygments                0.1.2
jupyterlab-server                  2.4.0
jupyterlab-widgets                 1.0.0
keyring                            22.3.0
kiwisolver                         1.3.1
lazy-object-proxy                  1.6.0
libarchive-c                       2.9
llvmlite                           0.36.0
locket                             0.2.1
lxml                               4.6.3
MarkupSafe                         1.1.1
matplotlib                         3.3.4
mccabe                             0.6.1
menuinst                           1.4.16
mistune                            0.8.4
mkl-fft                            1.3.0
mkl-random                         1.2.1
mkl-service                        2.3.0
mock                               4.0.3
more-itertools                     8.7.0
mpmath                             1.2.1
msgpack                            1.0.2
multipledispatch                   0.6.0
murmurhash                         1.0.5
mypy-extensions                    0.4.3
navigator-updater                  0.2.1
nbclassic                          0.2.6
nbclient                           0.5.3
nbconvert                          6.0.7
nbformat                           5.1.3
nest-asyncio                       1.5.1
networkx                           2.5
nltk                               3.6.1
nose                               1.3.7
notebook                           6.3.0
numba                              0.53.1
numexpr                            2.7.3
numpy                              1.20.1
numpydoc                           1.1.0
olefile                            0.46
openpyxl                           3.0.7
packaging                          20.9
pandas                             1.2.4
pandocfilters                      1.4.3
paramiko                           2.7.2
parso                              0.7.0
partd                              1.2.0
path                               15.1.2
pathlib2                           2.3.5
pathspec                           0.7.0
pathy                              0.6.0
patsy                              0.5.1
peewee                             3.14.4
pep8                               1.7.1
pexpect                            4.8.0
pickleshare                        0.7.5
Pillow                             8.2.0
pip                                21.0.1
pkginfo                            1.7.0
plac                               1.1.3
pluggy                             0.13.1
ply                                3.11
preshed                            3.0.5
prodigy                            1.10.8
prometheus-client                  0.10.1
prompt-toolkit                     3.0.17
psutil                             5.8.0
ptyprocess                         0.7.0
py                                 1.10.0
pycodestyle                        2.6.0
pycosat                            0.6.3
pycparser                          2.20
pycurl                             7.43.0.6
pydantic                           1.8.2
pydocstyle                         6.0.0
pyerfa                             1.7.3
pyflakes                           2.2.0
Pygments                           2.8.1
PyJWT                              1.7.1
pylint                             2.7.4
pyls-black                         0.4.6
pyls-spyder                        0.3.2
PyNaCl                             1.4.0
pyodbc                             4.0.0-unsupported
pyOpenSSL                          20.0.1
pyparsing                          2.4.7
pyreadline                         2.1
pyrsistent                         0.17.3
PySocks                            1.7.1
pytest                             6.2.3
python-dateutil                    2.8.1
python-jsonrpc-server              0.4.0
python-language-server             0.36.2
pytz                               2021.1
PyWavelets                         1.1.1
pywin32                            227
pywin32-ctypes                     0.2.0
pywinpty                           0.5.7
PyYAML                             5.4.1
pyzmq                              20.0.0
QDarkStyle                         2.8.1
QtAwesome                          1.0.2
qtconsole                          5.0.3
QtPy                               1.9.0
regex                              2021.4.4
requests                           2.25.1
rope                               0.18.0
Rtree                              0.9.7
ruamel-yaml-conda                  0.15.100
scikit-image                       0.18.1
scikit-learn                       0.24.1
scipy                              1.6.2
seaborn                            0.11.1
Send2Trash                         1.5.0
sense2vec                          2.0.0
setuptools                         52.0.0.post20210125
simplegeneric                      0.8.1
singledispatch                     0.0.0
sip                                4.19.13
six                                1.15.0
smart-open                         5.1.0
sniffio                            1.2.0
snowballstemmer                    2.1.0
sortedcollections                  2.1.0
sortedcontainers                   2.3.0
soupsieve                          2.2.1
spacy                              2.3.7
spacy-legacy                       3.0.8
Sphinx                             4.0.1
sphinxcontrib-applehelp            1.0.2
sphinxcontrib-devhelp              1.0.2
sphinxcontrib-htmlhelp             1.0.3
sphinxcontrib-jsmath               1.0.1
sphinxcontrib-qthelp               1.0.3
sphinxcontrib-serializinghtml      1.1.4
sphinxcontrib-websupport           1.2.4
spyder                             4.2.5
spyder-kernels                     1.10.2
SQLAlchemy                         1.4.7
srsly                              1.0.5
starlette                          0.12.9
statsmodels                        0.12.2
sympy                              1.8
tables                             3.6.1
tblib                              1.7.0
terminado                          0.9.4
testpath                           0.4.4
textdistance                       4.2.1
thinc                              7.4.5
threadpoolctl                      2.1.0
three-merge                        0.1.1
tifffile                           2021.4.8
toml                               0.10.2
toolz                              0.11.1
tornado                            6.1
tqdm                               4.59.0
traitlets                          5.0.5
typed-ast                          1.4.2
typer                              0.3.2
typing-extensions                  3.7.4.3
ujson                              4.0.2
unicodecsv                         0.14.1
urllib3                            1.26.4
uvicorn                            0.11.8
wasabi                             0.8.2
watchdog                           1.0.2
wcwidth                            0.2.5
webencodings                       0.5.1
websockets                         8.1
Werkzeug                           1.0.1
wheel                              0.36.2
widgetsnbextension                 3.5.1
win-inet-pton                      1.1.0
win-unicode-console                0.5
wincertstore                       0.2
wrapt                              1.12.1
xlrd                               2.0.1
XlsxWriter                         1.3.8
xlwings                            0.23.0
xlwt                               1.3.0
xmltodict                          0.12.0
yapf                               0.31.0
zict                               2.0.0
zipp                               3.4.1
zope.event                         4.5.0
zope.interface                     5.3.0

I'm stuck.

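Hi! From looking at your traceback, it seems like the error happens when the entry points registered by the sense2vec package are loaded: importing sense2vec fails because it tries to import get_array_module from thinc.api, which your installed version of Thinc doesn't provide. This all happens in the background at startup. Your pip list shows sense2vec 2.0.0 next to spaCy 2.3.7 and thinc 7.4.5, so you might have ended up with an incompatible version of sense2vec that expects a newer version of spaCy. Uninstalling sense2vec (pip uninstall sense2vec) should solve the problem.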
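
If you want to double-check the installed versions before and after uninstalling, here is a minimal sketch that uses only the Python standard library. The helper names installed_versions and has_attr are made up for this example and are not part of Prodigy, spaCy or sense2vec. The sketch only reports what is installed in the environment you run it from; it doesn't decide which combinations are compatible.

```python
import importlib
from importlib import metadata


def installed_versions(packages=("prodigy", "spacy", "thinc", "sense2vec")):
    """Return {package name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def has_attr(module_name, attr):
    """True if the module imports cleanly and exposes the given attribute."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Module is missing, or one of its own imports failed.
        return False
    return hasattr(module, attr)


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name:<10} {version or 'not installed'}")
    # The name that sense2vec tried to import in the traceback:
    print("thinc.api.get_array_module available:",
          has_attr("thinc.api", "get_array_module"))
```

Run it with the same python you use for python -m prodigy, so that it inspects the same site-packages directory.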

The prodigy command is implemented via a shell script (the #!/bin/sh file you saw), and depending on the shell you use on Windows, the shell may not be able to interpret it correctly. Using python -m prodigy is the best solution, and it's something we also recommend more generally, because it ensures that you're executing the correct version of a package CLI (and not something else that may be registered globally).
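
To make the "correct version" point concrete, here is a small sketch, again using only the standard library, that shows which interpreter is running and builds the equivalent python -m prodigy command for it. The function names environment_summary and prodigy_command are hypothetical helpers for this example. It may be worth running because the paths in the traceback above point at C:\Users\AndrewG\anaconda3\lib\site-packages rather than at the test_new2 environment, which could mean a different Python than the one you intended was picked up.

```python
import sys


def environment_summary():
    """Describe the interpreter that is executing this script."""
    return {
        "executable": sys.executable,  # full path to the running python
        "prefix": sys.prefix,          # root folder of its environment
    }


def prodigy_command(*recipe_args):
    """Build an argument list that runs Prodigy with the current interpreter."""
    return [sys.executable, "-m", "prodigy", *recipe_args]


if __name__ == "__main__":
    for key, value in environment_summary().items():
        print(f"{key}: {value}")

    # Same recipe call as in the question, but launched via "python -m".
    command = prodigy_command(
        "ner.manual", "ner_news_headlines", "blank:en",
        "./news_headlines.jsonl", "--label", "PERSON,ORG,PRODUCT,LOCATION",
    )
    print(" ".join(command))
    # To actually start the server, pass the list to subprocess.run(command).
```

If the printed prefix is not the folder of the environment you created, activate that environment first and run the script again.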

Thank you! It's working now.