config.cfg for bert.ner.manual

To help me understand: what is your goal?

Do you wish to train and update a Hugging Face BERT model without spaCy? If so, you'll need to use that library to train a component, and you can use the data generated from this recipe. The extra effort is needed here because Hugging Face models may use a different tokeniser.
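To make the tokeniser mismatch concrete, here's a minimal sketch of the kind of alignment step you'd do before training: mapping Prodigy's character-offset spans onto subword tokens to get BIO labels. The tokens and offsets here are hard-coded for illustration; in practice they would come from your Hugging Face tokeniser (e.g. via its offset mapping), and the helper name `spans_to_bio` is just an assumption, not part of either library.

```python
# Illustrative sketch: align Prodigy character-offset spans to subword tokens.
# In a real pipeline, `tokens` would come from a Hugging Face tokeniser
# (which can return character offsets per token), not be hard-coded.

def spans_to_bio(tokens, spans):
    """tokens: list of (text, start_char, end_char) tuples.
    spans: list of dicts with 'start', 'end', 'label' keys, as found
    in the 'spans' field of Prodigy's ner.manual output."""
    labels = ["O"] * len(tokens)
    for span in spans:
        inside = False  # tracks whether we're past the first token of the span
        for i, (_, start, end) in enumerate(tokens):
            # A token belongs to the span if it falls inside its char range
            if start >= span["start"] and end <= span["end"]:
                labels[i] = ("I-" if inside else "B-") + span["label"]
                inside = True
    return labels

# Toy example: "Ada Lovelace" split into subwords by a BERT-style tokeniser
tokens = [("Ada", 0, 3), ("Love", 4, 8), ("##lace", 8, 12)]
spans = [{"start": 0, "end": 12, "label": "PERSON"}]
print(spans_to_bio(tokens, spans))
# → ['B-PERSON', 'I-PERSON', 'I-PERSON']
```

This is exactly why the bert.ner.manual recipe exists: it pre-tokenises the text with the BERT tokeniser so the annotated spans already line up with subword boundaries.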

If you wish to use BERT as part of a spaCy pipeline, then you can use the normal ner.manual recipe for annotation and you don't need to worry about the tokens. You can just use en_core_web_trf as the base model when running the train command from Prodigy. Assuming that you've annotated a dataset called annotated_ner, your train command would look something like:

python -m prodigy train --ner annotated_ner --base-model en_core_web_trf