I started by creating a dataset with prodigy dataset ... and then annotated a set of examples with prodigy ner.manual ... using my own labels.
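Roughly, the commands I ran looked like this (the dataset name, source file and labels here are placeholders, not my actual values):

```
# create a new dataset to store the annotations
prodigy dataset my_ner_data "Manual NER annotations with custom labels"

# annotate examples manually with custom labels
prodigy ner.manual my_ner_data en_core_web_sm ./texts.jsonl --label LABEL_A,LABEL_B
```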
I was initially planning to use the BERT model en_trf_bertbaseuncased_lg, but ran into problems trying to run batch-train with:
Hi! We currently do not have an NER model implementation using the transformer weights. See here for details:
So running a transformer model with ner.batch-train doesn't really make sense – you'd always be training a regular spaCy NER model (so you might as well use a blank en model). Using the transformer models with Prodigy would likely also require slightly modified training recipes, since updating works a bit differently for those models (and comes with additional configuration options).
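For example, something along these lines would train a regular spaCy NER model from your annotations – just a rough sketch, with placeholder paths and settings:

```
# save a blank English pipeline to disk to use as the base model
python -c "import spacy; spacy.blank('en').to_disk('./blank_en_model')"

# train a regular spaCy NER model on the annotated dataset
prodigy ner.batch-train my_ner_data ./blank_en_model --output ./trained_ner --n-iter 10
```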