config.cfg for bert.ner.manual

To help me understand: what is your goal?

Do you wish to train and update a Hugging Face BERT model without spaCy? If so, you'll need to use that library to train a component, and you can use the data generated from this recipe. The extra effort is needed here because Hugging Face models may use a different tokeniser.
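To make the tokeniser mismatch concrete, here's a minimal sketch of the kind of alignment step you'd do before training: mapping Prodigy's character-offset spans onto subword tokens to get BIO labels. The tokens and offsets here are hard-coded for illustration; in practice they would come from your Hugging Face tokeniser (e.g. via its offset mapping), and the helper name `spans_to_bio` is just an assumption, not part of either library.

```python
# Illustrative sketch: align Prodigy character-offset spans to subword tokens.
# In a real pipeline, `tokens` would come from a Hugging Face tokeniser
# (which can return character offsets per token), not be hard-coded.

def spans_to_bio(tokens, spans):
    """tokens: list of (text, start_char, end_char) tuples.
    spans: list of dicts with 'start', 'end', 'label' keys, as found
    in the 'spans' field of Prodigy's ner.manual output."""
    labels = ["O"] * len(tokens)
    for span in spans:
        inside = False  # tracks whether we're past the first token of the span
        for i, (_, start, end) in enumerate(tokens):
            # A token belongs to the span if it falls inside its char range
            if start >= span["start"] and end <= span["end"]:
                labels[i] = ("I-" if inside else "B-") + span["label"]
                inside = True
    return labels

# Toy example: "Ada Lovelace" split into subwords by a BERT-style tokeniser
tokens = [("Ada", 0, 3), ("Love", 4, 8), ("##lace", 8, 12)]
spans = [{"start": 0, "end": 12, "label": "PERSON"}]
print(spans_to_bio(tokens, spans))
# → ['B-PERSON', 'I-PERSON', 'I-PERSON']
```

This is exactly why the bert.ner.manual recipe exists: it pre-tokenises the text with the BERT tokeniser so the annotated spans already line up with subword boundaries.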

If you wish to use BERT as part of a spaCy pipeline, then you can use the normal ner.manual recipe for annotation and you don't need to worry about the tokens. You can just use en_core_web_trf as the base model when running the train command from Prodigy. Assuming that you've annotated a dataset called annotated_ner, your train command would look something like:

python -m prodigy train --ner annotated_ner --base-model en_core_web_trf