I'm not sure if I'm simply missing something or if this is an actual bug.
When I run prodigy train-curve -g 0 --spancat Dataset -c .\config.cfg, it still trains on the CPU. prodigy train -g 0 --spancat Dataset -c .\config.cfg runs on the GPU as intended.
Thanks for your message and welcome to the Prodigy community!
I just checked the code: train-curve simply loops over the same train recipe, so no obvious problems stand out.
Could you share your config.cfg file? The GPU setting used while training isn't stored in config.cfg, so I don't think it's the cause, but it will help us try to replicate the problem.
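In the meantime, you can double-check that spaCy can actually see the GPU from Python. A quick sanity check (this assumes CuPy and a matching CUDA toolkit are installed in the same environment Prodigy runs in):

import spacy
from thinc.util import gpu_is_available

# True if Thinc can find a usable CUDA device.
print("GPU available:", gpu_is_available())

# require_gpu(0) raises if GPU 0 cannot be allocated; afterwards,
# newly created pipelines put their ops on that device.
spacy.require_gpu(0)
print("Ops allocated on GPU 0")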
========================= Generating Prodigy config =========================
✔ Generated training config
=========================== Train curve diagnostic ===========================
Training 4 times with 25%, 50%, 75%, 100% of the data
% Score spancat
---- ------ ------
Some weights of the model checkpoint at bert-base-german-cased were not used when initializing BertModel: ['cls.seq_relationship.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.bias', 'cls.predictions.decoder.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.dense.weight']
- This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
ℹ Using GPU: 0
========================= Generating Prodigy config =========================
✔ Generated training config
=========================== Initializing pipeline ===========================
[2022-11-04 23:38:04,069] [INFO] Set up nlp object from config
Components: spancat
Merging training and evaluation data for 1 components
- [spancat] Training: 873 | Evaluation: 220 (20% split)
Training: 863 | Evaluation: 216
Labels: spancat (4)
[2022-11-04 23:38:04,960] [INFO] Pipeline: ['transformer', 'spancat']
[2022-11-04 23:38:04,963] [INFO] Created vocabulary
[2022-11-04 23:38:04,964] [INFO] Finished initializing nlp object
Some weights of the model checkpoint at bert-base-german-cased were not used when initializing BertModel: ['cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.bias', 'cls.seq_relationship.weight', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.bias']
- This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Apologies for the delayed response. We've just added a fix for the next Prodigy release that adds a setup_gpu(gpu_id) call to train-curve. I've added a "to-be-released" tag and will post back when that update is released.
This call was already in train but was missing from train-curve. setup_gpu() comes from spaCy; here's what it does:
from wasabi import Printer
from thinc.api import require_gpu
from thinc.util import gpu_is_available

def setup_gpu(use_gpu: int, silent=None) -> None:
    """Configure the GPU and log info."""
    if silent is None:
        local_msg = Printer()
    else:
        local_msg = Printer(no_print=silent, pretty=not silent)
    if use_gpu >= 0:
        local_msg.info(f"Using GPU: {use_gpu}")
        require_gpu(use_gpu)
    else:
        local_msg.info("Using CPU")
        if gpu_is_available():
            local_msg.info("To switch to GPU 0, use the option: --gpu-id 0")
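In the upcoming release, train-curve calls this helper with the recipe's gpu_id before looping over the training runs, the same way train already does. A rough sketch of the idea (simplified and not the actual Prodigy recipe code; the function and argument names below are just illustrative):

from spacy.cli._util import setup_gpu  # where the helper lives in spaCy's CLI internals

def train_curve(dataset: str, gpu_id: int = -1, n_samples: int = 4):
    # Activate the requested device once, up front, so every training run
    # in the loop allocates its ops on the GPU (via require_gpu).
    setup_gpu(gpu_id)
    for i in range(1, n_samples + 1):
        ...  # run the regular train recipe on (i / n_samples) of the data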
Without this, train-curve wasn't printing the usual Using GPU: {use_gpu} message to the console, and it never called require_gpu(). Thanks again for reporting this problem, and I hope this resolves the issue!