How can I use a GPU to accelerate training for NER tasks?

Hello there. Thanks for creating Prodigy, it's a wonderful tool. Today I ran into a problem when I tried to use a GPU to accelerate my NER training. Could you give me some advice on how to troubleshoot it? Thanks in advance.

(venv) C:\Users\Jayshow\PycharmProjects\prodigy>python -m prodigy train ner ner_algorithms_final en_core_web_sm --gpu-id 0
usage: prodigy train [-h] [-t2v None] [-o None] [-e None] [-es None] [-n 10]
                     [-b -1] [-d 0.2] [-f 1.0] [-TE] [-NM] [-B] [-S]
                     {ner,textcat,tagger,parser} datasets spacy_model
prodigy train: error: unrecognized arguments: --gpu-id 0

@ines Could you please help me with the question above? Many thanks!

Hi! It looks like you're running Prodigy v1.10.x, which doesn't yet have the --gpu-id setting. If you upgrade to Prodigy v1.11, you'll be able to set the GPU via the CLI.

Btw, once you're serious about training and want to train on a GPU machine, it often makes sense to export your data with data-to-spacy and then train with spaCy directly. It doesn't really make a difference if your local development machine has a GPU, but if you want to train on a separate machine / in the cloud, it means you won't need to install Prodigy separately and import your data – you can just upload the exported corpus and install spaCy.
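As a rough sketch of that workflow, the two steps can be written down as command lines. The helper name `export_and_train_commands`, the `./corpus` and `./output` directories, and the 0.2 evaluation split are assumptions for illustration. It also assumes Prodigy v1.11, where `data-to-spacy` takes an output directory and writes `train.spacy`, `dev.spacy` and a `config.cfg` into it. The dataset name is the one from the command earlier in this thread.

```python
def export_and_train_commands(dataset, corpus_dir="./corpus", eval_split=0.2, gpu_id=0):
    """Build (but do not run) the export and training commands as argument lists."""
    # Step 1: export the Prodigy annotations as a spaCy corpus (needs Prodigy installed).
    export_cmd = [
        "python", "-m", "prodigy", "data-to-spacy", corpus_dir,
        "--ner", dataset,
        "--eval-split", str(eval_split),
    ]
    # Step 2: train with spaCy only; this part can run on a separate GPU machine.
    # Assumes the export step wrote config.cfg into corpus_dir.
    train_cmd = [
        "python", "-m", "spacy", "train", f"{corpus_dir}/config.cfg",
        "--output", "./output",
        "--paths.train", f"{corpus_dir}/train.spacy",
        "--paths.dev", f"{corpus_dir}/dev.spacy",
        "--gpu-id", str(gpu_id),
    ]
    return export_cmd, train_cmd


if __name__ == "__main__":
    for cmd in export_and_train_commands("ner_algorithms_final"):
        print(" ".join(cmd))
```

Only the first command needs Prodigy. The second one only needs spaCy and the exported corpus directory, so it can run on the GPU machine after you upload that directory.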

Thanks for getting back to me. Yes, that's right! I have two different versions of Prodigy installed on my laptop, v1.10.8 and v1.11.2. The error above came from v1.10.8.

Your advice is great. I was about to train a new NER model with spaCy to automatically extract algorithm names from text, but I ran into another problem with the GPU. The answers I found on GitHub didn't solve it. Do you have any ideas about what's going wrong? Any advice is welcome. Thanks!

(venv) C:\Users\Jayshow\PycharmProjects\spacy_transform>pip list
Package              Version
-------------------- ----------------
aiofiles             0.5.0
altair               4.1.0
argon2-cffi          20.1.0
astor                0.8.1
async-generator      1.10
attrs                21.2.0
backcall             0.2.0
base58               2.1.0
beautifulsoup4       4.9.3
bleach               3.3.1
blinker              1.4
blis                 0.4.1
bs4                  0.0.1
cachetools           4.1.1
catalogue            2.0.4
certifi              2020.6.20
cffi                 1.14.6
chardet              3.0.4
click                7.1.2
colorama             0.4.4
cupy-cuda90          9.0.0a1
cycler               0.10.0
cymem                2.0.3
debugpy              1.3.0
decorator            5.0.9
defusedxml           0.7.1
en-core-sci-lg       0.2.5
en-core-sci-sm       0.2.5
en-core-web-sm       2.3.1
en-core-web-trf      3.1.0
entrypoints          0.3
et-xmlfile           1.1.0
fastapi              0.68.0
fastrlock            0.6
filelock             3.0.12
gitdb                4.0.7
GitPython            3.1.18
grobid-client-python 0.0.2
h11                  0.9.0
httplib2             0.19.1
huggingface-hub      0.0.8
idna                 2.10
importlib-metadata   1.7.0
ipykernel            6.0.2
ipython              7.25.0
ipython-genutils     0.2.0
ipywidgets           7.6.3
jedi                 0.18.0
jiagu                0.2.3
Jinja2               3.0.1
joblib               0.16.0
jsonschema           3.2.0
jupyter              1.0.0
jupyter-client       6.2.0
jupyter-console      6.4.0
jupyter-core         4.7.1
jupyterlab-pygments  0.1.2
jupyterlab-widgets   1.0.0
kiwisolver           1.3.1
MarkupSafe           2.0.1
matplotlib           3.4.2
matplotlib-inline    0.1.2
mistune              0.8.4
murmurhash           1.0.2
nbclient             0.5.3
nbconvert            6.1.0
nbformat             5.1.3
nest-asyncio         1.5.1
nmslib               2.0.6
notebook             6.4.0
numpy                1.19.1
openpyxl             3.0.7
packaging            21.0
pandas               1.3.0
pandocfilters        1.4.3
parso                0.8.2
pathy                0.6.0
peewee               3.13.3
pickleshare          0.7.5
Pillow               8.3.1
pip                  21.2.4
plac                 1.1.3
preshed              3.0.2
prodigy              1.11.2
prometheus-client    0.11.0
prompt-toolkit       3.0.19
protobuf             3.17.3
psutil               5.7.2
pyarrow              4.0.1
pybind11             2.5.0
pycparser            2.20
pydantic             1.8.2
pydeck               0.6.2
Pygments             2.9.0
PyJWT                2.1.0
pyparsing            2.4.7
pyrsistent           0.18.0
pysbd                0.3.2
python-dateutil      2.8.2
pytz                 2021.1
pywin32              301
pywinpty             1.1.3
pyzmq                22.1.0
qtconsole            5.1.1
QtPy                 1.9.0
regex                2021.8.3
requests             2.24.0
sacremoses           0.0.45
scikit-learn         0.23.2
scipy                1.5.2
scispacy             0.2.5-unreleased
seaborn              0.11.1
selenium             3.141.0
Send2Trash           1.7.1
setuptools           57.4.0
six                  1.16.0
smart-open           5.1.0
smmap                4.0.0
soupsieve            2.2.1
spacy                3.1.2
spacy-alignments     0.8.3
spacy-legacy         3.0.8
spacy-lookups-data   1.0.2
spacy-transformers   1.0.3
srsly                2.4.1
starlette            0.14.2
streamlit            0.84.1
terminado            0.10.1
testpath             0.5.0
thinc                8.0.8
threadpoolctl        2.1.0
tokenizers           0.10.3
toml                 0.10.2
toolz                0.10.0
torch                1.1.0
tornado              6.1
tqdm                 4.48.2
traitlets            5.0.5
transformers         4.6.1
typer                0.3.2
typing-extensions    3.10.0.0
tzlocal              2.1
urllib3              1.25.10
uvicorn              0.13.4
validators           0.18.2
wasabi               0.8.2
watchdog             2.1.3
wcwidth              0.2.5
webencodings         0.5.1
websockets           8.1
wheel                0.37.0
widgetsnbextension   3.5.1
xlrd                 2.0.1
xlwt                 1.3.0
zipp                 3.1.0

(venv) C:\Users\Jayshow\PycharmProjects\spacy_transform>python -m spacy train config.cfg --output ./output --paths.train ./corpus/train.spacy --paths.dev ./corpus/dev.spacy --gpu-id 0
ℹ Saving to output directory: output
ℹ Using GPU: 0
Traceback (most recent call last):
  File "C:\Python37\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "C:\Python37\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\Jayshow\PycharmProjects\spacy_transform\venv\lib\site-packages\spacy\__main__.py", line 4, in <module>
    setup_cli()
  File "C:\Users\Jayshow\PycharmProjects\spacy_transform\venv\lib\site-packages\spacy\cli\_util.py", line 69, in setup_cli
    command(prog_name=COMMAND)
  File "C:\Python37\lib\site-packages\click\core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "C:\Python37\lib\site-packages\click\core.py", line 782, in main
    rv = self.invoke(ctx)
  File "C:\Python37\lib\site-packages\click\core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "C:\Python37\lib\site-packages\click\core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\Python37\lib\site-packages\click\core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "C:\Python37\lib\site-packages\typer\main.py", line 497, in wrapper
    return callback(**use_params)  # type: ignore
  File "C:\Users\Jayshow\PycharmProjects\spacy_transform\venv\lib\site-packages\spacy\cli\train.py", line 55, in train_cli
    setup_gpu(use_gpu)
  File "C:\Users\Jayshow\PycharmProjects\spacy_transform\venv\lib\site-packages\spacy\cli\_util.py", line 515, in setup_gpu
    require_gpu(use_gpu)
  File "C:\Python37\lib\site-packages\thinc\util.py", line 187, in require_gpu
    raise ValueError("GPU is not accessible. Was the library installed correctly?")
ValueError: GPU is not accessible. Was the library installed correctly?

I tried my best to find a solution to this problem but failed. :thinking:

This is difficult to debug from afar because there can be many reasons. Ultimately, this means that cupy doesn't work correctly or isn't installed with the correct version for your CUDA, or that it can't detect or doesn't support your GPU.
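One way to narrow this down is to check each layer separately from Python: whether cupy imports, which CUDA runtime it reports, how many devices it can see, and whether Thinc can activate the GPU. The snippet below is only a minimal diagnostic sketch. The function name `diagnose_gpu` and the layout of the report are made up for illustration. It also helps to compare the installed cupy build with your CUDA version, since the `pip list` above shows `cupy-cuda90`, which is a build for CUDA 9.0.

```python
import importlib


def diagnose_gpu(gpu_id=0):
    """Collect basic facts about the cupy / CUDA / Thinc setup without raising."""
    report = {
        "cupy_version": None,
        "cuda_runtime": None,
        "device_count": None,
        "thinc_gpu": None,
        "errors": [],
    }
    try:
        # Fails if cupy is missing or was built for a CUDA version that isn't installed.
        cupy = importlib.import_module("cupy")
        report["cupy_version"] = cupy.__version__
        report["cuda_runtime"] = cupy.cuda.runtime.runtimeGetVersion()
        report["device_count"] = cupy.cuda.runtime.getDeviceCount()
    except Exception as err:
        report["errors"].append(f"cupy: {type(err).__name__}: {err}")
    try:
        # prefer_gpu returns False instead of raising when no GPU can be used.
        thinc_api = importlib.import_module("thinc.api")
        report["thinc_gpu"] = bool(thinc_api.prefer_gpu(gpu_id))
    except Exception as err:
        report["errors"].append(f"thinc: {type(err).__name__}: {err}")
    return report


if __name__ == "__main__":
    for key, value in diagnose_gpu().items():
        print(f"{key}: {value}")
```

If the cupy step already reports an error, the problem is in the cupy or CUDA installation rather than in spaCy or Prodigy. If cupy looks fine but `thinc_gpu` is `False`, check that cupy is installed in the same environment that Thinc is loaded from.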

The suggestions in these threads should help you narrow in on the problem: