Prodigy ner.batch-train no longer multi-threaded?

Since updating Prodigy to the latest version and using spaCy 2.1, ner.batch-train is no longer using multiple cores on the SageMaker Notebook Instance I have set up for hyperparameter tuning. This is an annoying problem, as models are taking 7-8x longer to train, slowing down my experimentation.

Is there some flag for the number of cores/threads that I am missing, or something I should be doing differently when installing? My current dataset size doesn’t really warrant GPU acceleration, so I have been using the CPU installation of spaCy.

This may be related to https://github.com/explosion/spaCy/issues/3820; however, updating numpy and installing spaCy with conda instead of pip does not change the behavior.
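
In case it helps, here’s how I’ve been checking which BLAS backend numpy is linked against and capping its threads from the shell. These are standard numpy/BLAS environment variables, not Prodigy flags, and which variable applies depends on the backend:

```python
import numpy as np

# Print the BLAS/LAPACK libraries numpy was built against (OpenBLAS, MKL, ...).
# Which one is linked determines which *_NUM_THREADS variable actually applies.
np.show_config()

# The thread caps have to be exported before numpy/spaCy are imported,
# i.e. in the shell that launches the recipe (dataset/model names below
# are placeholders):
#   OMP_NUM_THREADS=8 OPENBLAS_NUM_THREADS=8 MKL_NUM_THREADS=8 \
#       prodigy ner.batch-train my_dataset en_core_web_sm
```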

spaCy v2.1 should be quite a lot more efficient than v2.0, which didn’t use multiple cores very effectively. Launching threads for the matrix multiplications is generally inefficient, and in v2.0 it often resulted in slower processing than single-threaded execution.

I’m surprised that you’re finding the training slower, as I would expect it to be faster. Have you checked to make sure the batch sizes are the same? Also, what sort of CPU does the SageMaker Notebook server have? Is it a normal AWS VM, or something more exotic?
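
If it helps to sanity-check whether threaded BLAS matmuls are actually a win on your instance, here’s a rough benchmark sketch using the third-party threadpoolctl package (not something spaCy or Prodigy ship; the matrix size is arbitrary):

```python
import time

import numpy as np
from threadpoolctl import threadpool_limits  # pip install threadpoolctl

def time_matmul(n=2000, repeats=5):
    """Average wall-clock time of an n x n float32 matrix multiplication."""
    a = np.random.rand(n, n).astype("float32")
    b = np.random.rand(n, n).astype("float32")
    start = time.perf_counter()
    for _ in range(repeats):
        a @ b
    return (time.perf_counter() - start) / repeats

# Default: the BLAS backend uses as many threads as it likes.
print("default thread pool:", time_matmul())

# Capped to one thread, roughly what spaCy v2.1 assumes for its matmuls.
with threadpool_limits(limits=1):
    print("single-threaded:    ", time_matmul())
```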

My SageMaker instance is an ml.m5.4xlarge. With Prodigy 1.8.1, ner.batch-train uses 99-100 %CPU; with 1.7.1, usage is around 1400 %CPU.
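
For reference, a %CPU reading like this can also be taken programmatically with the third-party psutil package (the PID below is a placeholder for the running training process):

```python
import psutil  # pip install psutil

# Attach to the running ner.batch-train process by its PID (placeholder).
proc = psutil.Process(12345)

# On multi-core machines cpu_percent() can exceed 100%: ~1400% means roughly
# 14 busy cores, while ~100% means effectively single-threaded.
print(proc.cpu_percent(interval=5.0), "%CPU")
```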

@jprusa Yes, but is it actually faster with v1.7.1? This section of the v2.1 announcement explains the context: https://explosion.ai/blog/spacy-v2-1#matrix-multiplication

Briefly, using more machine resources is not a goal in itself. If we can use 4x more resources to train 3x faster, that’s often worth it. But using 10x as many resources to train 1.5x faster isn’t so appealing. In fact, what would often happen in v2.0 is that numpy would launch far too many threads, so on a large instance you’d be using 14x the resources to train only 0.8x as quickly. That’s obviously a bad deal.
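
Put roughly as a formula, the trade-off is speedup divided by the resource multiplier, where 1.0 would be perfect scaling; plugging in the numbers above:

```python
def efficiency(speedup, resource_multiplier):
    """Speedup gained per unit of extra machine resources (1.0 = perfect scaling)."""
    return speedup / resource_multiplier

print(efficiency(3.0, 4.0))   # 0.75 -> often worth it
print(efficiency(1.5, 10.0))  # 0.15 -> not so appealing
print(efficiency(0.8, 14.0))  # ~0.06 -> the v2.0 worst case described above
```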

If you’re achieving 7x faster training with 14x the machine usage, that looks like a much better deal, so I’d like to understand why v2.1 is so much slower. Could you tell me the average number of words per text in your data, and the command you’re using to trigger the training?
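
For the average words per text, a rough estimate over a JSONL export of your examples would be enough; something like this sketch (the file name is a placeholder, and whitespace splitting only approximates spaCy’s tokenization):

```python
import json

word_counts = []
# Placeholder path to a JSONL export of the annotated examples.
with open("annotations.jsonl", encoding="utf8") as f:
    for line in f:
        example = json.loads(line)
        word_counts.append(len(example["text"].split()))

print(f"{len(word_counts)} texts, "
      f"{sum(word_counts) / len(word_counts):.1f} words per text on average")
```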

I reinstalled 1.7.1 and found it to be slower, as you expected. On further experimentation, I am seeing a lot of variance in training times on the instance and some extremely slow file I/O: sometimes loading a model takes several minutes, other times only seconds. It looks like I’m having AWS issues rather than a problem with spaCy or Prodigy.
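
For anyone else debugging this, timing the model load on its own is an easy way to separate slow disk I/O from slow training (the model name below is a placeholder for whatever base model you’re training from):

```python
import time

import spacy

start = time.perf_counter()
nlp = spacy.load("en_core_web_lg")  # placeholder base model
print(f"Model loaded in {time.perf_counter() - start:.1f}s")
```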