Before diving deeper into this question, I just want to make sure I understand what your goal is. If you're trying to train a BERT model, you can also use spaCy without having to resort to this custom recipe. To quote the docs:
> New in Prodigy v1.11 and spaCy v3
>
> spaCy v3 lets you train a transformer-based pipeline and will take care of all tokenization alignment under the hood, to ensure that the subword tokens match to the linguistic tokenization. You can use `data-to-spacy` to export your annotations and train with spaCy v3 and a transformer-based config directly, or run `train` and provide the config via the `--config` argument.
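For reference, here's a minimal sketch of that workflow. The dataset name `my_ner_dataset` and the file paths are hypothetical placeholders, so adjust them to your setup:

```bash
# Export the Prodigy annotations to spaCy's binary training format.
# This writes train/dev .spacy files plus a generated config into ./corpus.
prodigy data-to-spacy ./corpus --ner my_ner_dataset --lang en

# Train directly with spaCy v3, pointing it at a transformer-based config
# (e.g. one generated with `python -m spacy init config --gpu`).
python -m spacy train ./corpus/config.cfg \
  --output ./output \
  --paths.train ./corpus/train.spacy \
  --paths.dev ./corpus/dev.spacy

# Or train via Prodigy's wrapper and pass the custom config with --config.
prodigy train ./output --ner my_ner_dataset --config ./transformer_config.cfg
```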
So just to check: are you trying to train a BERT model using spaCy? If so, you might just want to follow the steps that I describe here. If you're trying to generate data for another library, like Hugging Face, which depends on the sentencepiece tokeniser ... then I can dive a bit deeper.