Suggestions for running on Google Compute Engine

solved
install
(Jeff) #1

I’m trying to install spaCy/Prodigy on Google Compute Engine and ran into problems, so I was hoping someone here could share what works for them.

This is what I tried:

  • Created a “Debian GNU/Linux 9 (stretch)” image with 1 vCPU (n1-standard-1). It has Python 3.5.3 which isn’t a great sign.
  • Installed pip
  • Tried to install spaCy with pip install spacy, but it looks like the compiler isn’t available. Below is my error message.

Is there another instance type that works better?

  x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fdebug-prefix-map=/build/python3.5-3.5.3=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I./python -I./lib -I/usr/include/python3.5m -c ./python/ujson.c -o build/temp.linux-x86_64-3.5/./python/ujson.o -D_GNU_SOURCE
  unable to execute 'x86_64-linux-gnu-gcc': No such file or directory
  error: command 'x86_64-linux-gnu-gcc' failed with exit status 1
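
The last two lines of that output say pip tried to compile a C extension (ujson) and could not find gcc. On Debian and Ubuntu, the build-essential package provides it. A minimal sketch for checking whether a compiler is on the PATH before running pip; the helper name find_missing_tools is made up for illustration:

```python
import shutil


def find_missing_tools(tools=("gcc", "x86_64-linux-gnu-gcc")):
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Not on PATH:", ", ".join(missing))
    else:
        print("A C compiler appears to be installed.")
```

If anything is reported as missing, pip install spacy will most likely fail in the same way whenever it has to build from source.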
(Justin Du Jardin) #2

I’ve had success running Prodigy/spaCy on the Ubuntu 18.04 set of images.

(Matthew Honnibal) #3

The following script should install all the system dependencies you need for spaCy and Prodigy on a clean Ubuntu 18.04 VM. It also installs a few other useful things:


#!/usr/bin/env bash

sudo apt-get update
sudo apt-get install -y build-essential
sudo apt-get install -y unzip libssl-dev zlib1g-dev libbz2-dev \
    libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev libncursesw5-dev \
    xz-utils python-pip python-virtualenv python3-pip python3-venv \
    python-dev python3-dev libopenblas-base libopenblas-dev

You can then create a virtualenv and install your Prodigy wheel as follows:

python3 -m venv spacy-env
source spacy-env/bin/activate
python3 -m pip install $path_to_your_Prodigy_wheel
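
Before installing the wheel, it can be worth confirming that the environment is actually active, so that packages do not end up in the system Python. A small sketch using only the standard library; the function name in_virtualenv is illustrative, not part of any tool mentioned here:

```python
import sys


def in_virtualenv():
    """True when the interpreter runs inside a venv-style environment."""
    # In a venv, sys.prefix points at the environment while
    # sys.base_prefix points at the Python it was created from.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    print("Inside a virtual environment:", in_virtualenv())
```

Run with the spacy-env environment activated, the interpreter path should point inside spacy-env/bin.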

If you prefer to use Miniconda instead of pip, the following script will install Miniconda and create a conda environment with an optimized version of numpy:


#!/usr/bin/env bash

set -e

MINICONDA_URL="https://repo.continuum.io/miniconda"
MINICONDA_FILENAME="Miniconda3-latest-Linux-x86_64.sh"

function print_usage {
  echo
  echo "Usage: ./install-miniconda"
  echo
  echo "This script installs Miniconda to /opt/miniconda and creates a conda environment with the optimized numpy"
  echo
}

function run {
  # (re)create /opt/miniconda directory, give it to user
  sudo rm -rf /opt/miniconda
  sudo mkdir -p /opt/miniconda
  sudo chmod -R a+rwx /opt/miniconda
  # Download and run installer.
  wget $MINICONDA_URL/$MINICONDA_FILENAME -O /opt/miniconda/$MINICONDA_FILENAME
  sudo chmod a+rx /opt/miniconda/$MINICONDA_FILENAME
  /opt/miniconda/$MINICONDA_FILENAME -f -b -p /opt/miniconda
  # Allow others to execute miniconda
  sudo chmod -R a+rx /opt/miniconda/bin
  # Set up an environment and install optimized numpy
  /opt/miniconda/bin/conda create --prefix /opt/miniconda/numpy-mkl --copy -y numpy
}

print_usage
run "$@"

If you want to use a GPU, you’ll need to install CUDA and ideally the cuDNN library. You need to log in to download the cuDNN installer, so there’s an extra step or two you need to do from your own computer. Once you have the installer on the VM (the script expects it at /tmp/binaries/cudnn-9.2-linux-x64-v7.1.tgz), the following script should work:


#!/usr/bin/env bash

set -e

# Install driver (-y skips the interactive confirmation prompt)
sudo add-apt-repository -y ppa:graphics-drivers/ppa
sudo apt-get update

sudo apt-get install -y nvidia-driver-396 libnvidia-compute-396 libnvidia-common-396 nvidia-utils-396

# Download toolkit and patch
wget -O cuda_9.2.88_396.26_linux.run -c https://developer.nvidia.com/compute/cuda/9.2/Prod/local_installers/cuda_9.2.88_396.26_linux --quiet
sudo mv cuda_9.2.88_396.26_linux.run /etc/install-cuda-9.2
wget https://developer.nvidia.com/compute/cuda/9.2/Prod/patches/1/cuda_9.2.88.1_linux --quiet
sudo mv cuda_9.2.88.1_linux /etc/patch-cuda-9.2
sudo chmod a+rx /etc/install-cuda-9.2
sudo chmod a+rx /etc/patch-cuda-9.2

sudo /etc/install-cuda-9.2 --toolkit --silent --verbose
sudo /etc/patch-cuda-9.2 --silent --accept-eula

sudo cp /tmp/binaries/cudnn-9.2-linux-x64-v7.1.tgz /etc/cudnn.tgz
sudo cp /tmp/runtime/cuda_bashrc /home/ubuntu/.bashrc
sudo chmod a+rwx /home/ubuntu/.bashrc

cd /tmp/binaries
tar -xzf cudnn-9.2-linux-x64-v7.1.tgz
sudo cp -r cuda/include/* /usr/local/cuda/include
sudo cp -r cuda/lib64/* /usr/local/cuda/lib64
sudo chmod -R a+rx /usr/local/cuda/include/*
sudo chmod -R a+rx /usr/local/cuda/lib64/*

Once you have CUDA installed, you need to add the following lines to your .bashrc script:

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib64
export PATH=$PATH:/usr/local/cuda/bin
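
A quick way to confirm the two exports took effect in the current shell is to inspect the environment from Python. This is only a sketch: the directories are the ones from the export lines above, and the helper name missing_cuda_paths is hypothetical.

```python
import os

CUDA_LIB_DIR = "/usr/local/cuda/lib64"
CUDA_BIN_DIR = "/usr/local/cuda/bin"


def missing_cuda_paths(environ=None):
    """Return the variable names that lack the expected CUDA directory."""
    environ = os.environ if environ is None else environ
    expected = {"LD_LIBRARY_PATH": CUDA_LIB_DIR, "PATH": CUDA_BIN_DIR}
    missing = []
    for name, directory in expected.items():
        # .bashrc uses ":" as the separator on Linux
        entries = [entry.rstrip("/") for entry in environ.get(name, "").split(":")]
        if directory not in entries:
            missing.append(name)
    return missing


if __name__ == "__main__":
    print("Variables missing the CUDA directory:", missing_cuda_paths() or "none")
```

Remember that changes to .bashrc only apply to new shells, or after running source ~/.bashrc.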

You can then run:

pip install cupy
pip install thinc_gpu_ops
(Jeff) #4

Thank you Justin and Matthew!

Do you have recommendations for the number of vCPUs?

(Jeff) #5

I ran a few experiments to compare multiple vCPUs and a GPU for training a tagger. Here are the results:

  • 2 vCPUs: 12h15m
  • 4 vCPUs: 11h33m
  • 8 vCPUs: 9h37m
  • GPU: 3h43m

So you are much better off using a GPU, and increasing the number of vCPUs is not worth the extra cost.
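
To put numbers on that, the timings above can be converted to minutes and compared against the 2 vCPU run. This is just arithmetic on the four figures listed; the post gives no further detail about the GPU machine, so none is assumed here.

```python
def to_minutes(hours, minutes):
    """Convert an h/m duration to total minutes."""
    return hours * 60 + minutes


timings = {
    "2 vCPUs": to_minutes(12, 15),
    "4 vCPUs": to_minutes(11, 33),
    "8 vCPUs": to_minutes(9, 37),
    "GPU": to_minutes(3, 43),
}

# Speedup relative to the slowest configuration (2 vCPUs).
baseline = timings["2 vCPUs"]
speedups = {name: baseline / minutes for name, minutes in timings.items()}

for name, factor in speedups.items():
    print(f"{name}: {timings[name]} min, {factor:.2f}x vs 2 vCPUs")
```

Quadrupling the vCPU count from 2 to 8 gives roughly a 1.27x speedup, while the GPU run is roughly 3.3x faster than the 2 vCPU baseline.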

(Matthew Honnibal) #6

@Jeff Yes, I agree: multiple CPUs for training are currently not advised, and in fact with the pip version of spaCy v2.0.18, too many CPUs can actually lead to too many threads being launched, which can decrease performance. In v2.1 of spaCy, we switch to single threading to prevent this kind of problem, and make it easier to run spaCy alongside other applications in a cloud environment.
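
If you are on v2.0.x and suspect oversubscription, one common workaround is to cap the thread count through environment variables before numpy or spaCy is imported. This is a sketch under an assumption: it only helps if the extra threads come from an OpenMP, OpenBLAS or MKL backend that honours these variables, which depends on how numpy was built. The helper name cap_blas_threads is made up.

```python
import os


def cap_blas_threads(num_threads=1):
    """Set common BLAS/OpenMP thread-count variables.

    Must run before numpy or spaCy is imported, because the
    libraries read these variables when they are first loaded.
    """
    for name in ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS"):
        os.environ[name] = str(num_threads)


cap_blas_threads(1)
# import spacy  # import only after the variables are set
```

The same effect can be had by exporting the variables in the shell before starting Python.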

(Mitchell) #7

Hey Matthew,

I don’t know if this is the place to ask this, but I am trying to run ner.batch-train on a GCE instance with a GPU. I have followed your script pretty much to the letter. However, when I run batch-train I get the error below:

    Exception ignored in: <bound method Stream.__del__ of <cupy.cuda.stream.Stream object at 0x7ff6db28cc88>>
Traceback (most recent call last):
  File "cupy/cuda/stream.pyx", line 161, in cupy.cuda.stream.Stream.__del__
AttributeError: 'Stream' object has no attribute 'ptr'
Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/mitch/spacy-env/lib/python3.6/site-packages/prodigy/__main__.py", line 331, in <module>
    controller = recipe(*args, use_plac=True)
  File "cython_src/prodigy/core.pyx", line 211, in prodigy.core.recipe.recipe_decorator.recipe_proxy
  File "/home/mitch/spacy-env/lib/python3.6/site-packages/plac_core.py", line 328, in call
    cmd, result = parser.consume(arglist)
  File "/home/mitch/spacy-env/lib/python3.6/site-packages/plac_core.py", line 207, in consume
    return cmd, self.func(*(args + varargs + extraopts), **kwargs)
  File "/home/mitch/spacy-env/lib/python3.6/site-packages/prodigy/recipes/ner.py", line 526, in batch_train
    baseline = model.evaluate(evals)
  File "cython_src/prodigy/models/ner.pyx", line 458, in prodigy.models.ner.EntityRecognizer.evaluate
  File "cython_src/prodigy/models/ner.pyx", line 460, in prodigy.models.ner.EntityRecognizer.evaluate
  File "/home/mitch/spacy-env/lib/python3.6/site-packages/spacy/language.py", line 548, in pipe
    for doc, context in izip(docs, contexts):
  File "/home/mitch/spacy-env/lib/python3.6/site-packages/spacy/language.py", line 572, in pipe
    for doc in docs:
  File "nn_parser.pyx", line 374, in pipe
  File "nn_parser.pyx", line 400, in spacy.syntax.nn_parser.Parser.parse_batch
  File "/home/mitch/spacy-env/lib/python3.6/site-packages/spacy/util.py", line 238, in get_cuda_stream
    return CudaStream() if CudaStream is not None else None
  File "cupy/cuda/stream.pyx", line 158, in cupy.cuda.stream.Stream.__init__
  File "cupy/cuda/runtime.pyx", line 331, in cupy.cuda.runtime.streamCreate
  File "cupy/cuda/runtime.pyx", line 334, in cupy.cuda.runtime.streamCreate
  File "cupy/cuda/runtime.pyx", line 144, in cupy.cuda.runtime.check_status
cupy.cuda.runtime.CUDARuntimeError: cudaErrorUnknown: unknown error

Using:
Ubuntu 18.06
Prodigy 1.7.1
Spacy 2.0.18
cupy 5.4.0
cuda 9.2
cudnn 9.2

I can’t really work out where I have gone wrong, other than perhaps using a different spaCy version from the one you used?

Cheers,
Mitch

(Matthew Honnibal) #8

Hmm. That’s not a very helpful error :(.

Does it work if you just do something like:


import cupy

arr1 = cupy.ones((4,4), dtype='float32')
arr2 = cupy.ones((4,4), dtype='float32')
arr1 @ arr2

I just want to test whether cupy can get the GPU working at all, basically.
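
The same check can be wrapped so that it reports what went wrong instead of stopping at the first exception. A rough sketch, assuming only cupy's public array API and cupy.cuda.runtime.getDeviceCount(); the function name gpu_smoke_test is hypothetical:

```python
def gpu_smoke_test():
    """Return (ok, message) describing whether cupy can run a tiny matmul."""
    try:
        import cupy
    except Exception as error:  # cupy missing, or its CUDA libraries not found
        return False, f"could not import cupy: {error}"
    try:
        device_count = cupy.cuda.runtime.getDeviceCount()
        arr1 = cupy.ones((4, 4), dtype="float32")
        arr2 = cupy.ones((4, 4), dtype="float32")
        # Every entry of the product is 4.0, so the 16 entries sum to 64.0.
        total = float((arr1 @ arr2).sum())
    except Exception as error:  # e.g. cupy.cuda.runtime.CUDARuntimeError
        return False, f"{type(error).__name__}: {error}"
    return total == 64.0, f"{device_count} device(s) visible, matmul sum = {total}"


if __name__ == "__main__":
    ok, message = gpu_smoke_test()
    print("GPU OK" if ok else "GPU check failed", "-", message)
```

If this fails, the problem is in the driver/CUDA/cupy setup rather than in spaCy or Prodigy.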

(Mitchell) #9

Hi Matthew,

Thanks for a quick reply.

So, the GPU is not working at all… I get the exact same unhelpful error when I run:

import cupy
arr1= cupy.ones((4,4), dtype='float32')

Is it possible I have done something wrong when setting up the machine? The only thing that I can see is different is perhaps the cudnn version that I used. I used version 7.4.1.5

(Matthew Honnibal) #10

Maybe check the compatibility with cudnn and cuda 9.2? I think cupy and Chainer have pretty decent troubleshooting docs, so try those.

(Mitchell) #11

Thanks Matthew, I managed to get to a point where your example cupy code works. I played about with versions until I got it to work - fun :).

When running ner.batch-train, is there anything specific I have to do to force spaCy to use the GPU? I am seeing identical epoch times to when I was running on a CPU. My assumption (likely wrong) from reading this thread was that spaCy would use the GPU if one is available. Or do I need to pass use_device=0 into begin_training, like I have seen you mention in another thread on this forum? Cheers

(Matthew Honnibal) #12

You should be able to call spacy.prefer_gpu() in the recipe, somewhere before you call spacy.load(). I think that should be enough to use the GPU. I normally check either perf top or nvidia-smi to check that the GPU is being used. perf top is a bit less direct, but it shows you which C functions the time is being spent in.
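
In code, the ordering looks roughly like this. It is a sketch rather than the actual recipe code: the wrapper name load_model_preferring_gpu and its spacy_module parameter are made up for illustration (the parameter only exists so the ordering can be exercised without a GPU), while spacy.prefer_gpu() and spacy.load() are the real calls.

```python
import importlib


def load_model_preferring_gpu(model_name, spacy_module=None):
    """Call prefer_gpu() before load() and report whether a GPU was activated."""
    if spacy_module is None:
        spacy_module = importlib.import_module("spacy")
    # prefer_gpu() returns True if a GPU was activated, False otherwise.
    on_gpu = spacy_module.prefer_gpu()
    # Load the model only afterwards, so it is set up for the chosen device.
    nlp = spacy_module.load(model_name)
    return nlp, on_gpu
```

For example, nlp, on_gpu = load_model_preferring_gpu("en_core_web_lg") near the top of a custom recipe; if on_gpu is False, spaCy has fallen back to the CPU.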

(Mitchell) #13

Thanks. nvidia-smi returns output that I would expect. Although, just looking at it again, I notice the CUDA version in the output is 10.0, where I am using 9.2 for everything else. I will get that to the right version and try again.
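
One thing to keep in mind here: the CUDA version in the nvidia-smi header is the highest version the installed driver supports, not the toolkit version that is installed, so 10.0 there alongside a 9.2 toolkit is not necessarily a mismatch. A sketch for reading both numbers from Python, assuming the installed cupy exposes driverGetVersion() and runtimeGetVersion() in cupy.cuda.runtime; the helper names are made up:

```python
def decode_cuda_version(version):
    """Convert CUDA's integer encoding (1000 * major + 10 * minor) to 'major.minor'."""
    major, rest = divmod(version, 1000)
    return f"{major}.{rest // 10}"


def report_cuda_versions():
    """Return (driver, runtime) version strings as seen by cupy, or None if unavailable."""
    try:
        from cupy.cuda import runtime
        return (
            decode_cuda_version(runtime.driverGetVersion()),
            decode_cuda_version(runtime.runtimeGetVersion()),
        )
    except Exception:  # cupy missing or CUDA not usable
        return None


if __name__ == "__main__":
    print("(driver, runtime):", report_cuda_versions())
```

A driver version equal to or higher than the runtime version is the normal situation.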

Otherwise, I have a custom GPU wrapper recipe in which I call spacy.prefer_gpu(), which returns True. I also tried spacy.util.use_gpu(0) and spacy.require_gpu(), with no luck. I get to a point where numpy raises ValueError: object __array__ method not producing an array. So my thinking is that there is some incompatibility between thinc_gpu_ops/thinc/chainer and spaCy…

(Matthew Honnibal) #14

That seems strange. You’re still running the ner.batch-train recipe right? I know that the current version has some problems using the textcat on GPU, which are fixed in spaCy v2.1. But the NER recipes should work.

Could you provide the full traceback?

(Mitchell) #15

Yeah, I am still running the ner.batch-train recipe.

Below is the full traceback. Let me know if I can provide any more information to help.

spacy 2.0.18
cupy 5.4.0
thinc 6.12.1
thinc_gpu_ops 0.0.4

Loaded model en_core_web_lg
Loaded 101839 evaluation examples from
'attribute_dataset_home_garden_4676_cat_freq_thresh_1_data_read_mod_2642_tagging_cat_freq_thresh_combined_filtered_evaluation'
Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/mitch/.local/lib/python3.6/site-packages/prodigy/__main__.py", line 331, in <module>
    controller = recipe(*args, use_plac=True)
  File "cython_src/prodigy/core.pyx", line 211, in prodigy.core.recipe.recipe_decorator.recipe_proxy
  File "/home/mitch/.local/lib/python3.6/site-packages/plac_core.py", line 328, in call
    cmd, result = parser.consume(arglist)
  File "/home/mitch/.local/lib/python3.6/site-packages/plac_core.py", line 207, in consume
    return cmd, self.func(*(args + varargs + extraopts), **kwargs)
  File "/home/mitch/.local/lib/python3.6/site-packages/prodigy/recipes/ner.py", line 529, in batch_train
    baseline = model.evaluate(evals)
  File "cython_src/prodigy/models/ner.pyx", line 458, in prodigy.models.ner.EntityRecognizer.evaluate
  File "cython_src/prodigy/models/ner.pyx", line 460, in prodigy.models.ner.EntityRecognizer.evaluate
  File "/home/mitch/.local/lib/python3.6/site-packages/spacy/language.py", line 548, in pipe
    for doc, context in izip(docs, contexts):
  File "/home/mitch/.local/lib/python3.6/site-packages/spacy/language.py", line 572, in pipe
    for doc in docs:
  File "nn_parser.pyx", line 374, in pipe
  File "nn_parser.pyx", line 416, in spacy.syntax.nn_parser.Parser.parse_batch
  File "/home/mitch/.local/lib/python3.6/site-packages/numpy/core/numeric.py", line 632, in ascontiguousarray
    return array(a, dtype, copy=False, order='C', ndmin=1)
ValueError: object __array__ method not producing an array

(Matthew Honnibal) #16

Thanks, I’ll keep looking into this. It definitely does seem like an error in Prodigy. I hope it’s not too inconvenient to use the CPU in the meantime.