New features idea

davebulaval · August 18, 2021, 1:10pm

After using your solution for about 4 months, here are some features that would be nice improvements.

Docs for previous versions
Training for smaller (non-transformer) NER models (transformers are nice, but they are too big or overkill for some use cases)
Cross-validation training for smaller models
More stats (e.g. number of elements per tag)
Confusion matrix for NER
Visualization of NER (or other) errors
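
To make the stats and confusion matrix requests a bit more concrete, here is a minimal sketch in plain Python. It assumes annotations are dictionaries with "spans" and "answer" keys, as in a Prodigy-style JSONL export, and that gold and predicted tags are already aligned token by token. The helper names `label_counts` and `confusion_counts` are hypothetical and not part of Prodigy or spaCy.

```python
from collections import Counter


def label_counts(examples):
    """Count span annotations per label, skipping non-accepted examples."""
    counts = Counter()
    for eg in examples:
        # Assumption: examples without an "answer" key count as accepted.
        if eg.get("answer", "accept") != "accept":
            continue
        for span in eg.get("spans", []):
            counts[span["label"]] += 1
    return counts


def confusion_counts(gold_tags, pred_tags):
    """Count (gold, predicted) pairs over two aligned token-level tag lists."""
    if len(gold_tags) != len(pred_tags):
        raise ValueError("tag sequences must have the same length")
    return Counter(zip(gold_tags, pred_tags))


# Toy data, invented purely for illustration.
examples = [
    {
        "text": "Apple hired Tim Cook.",
        "spans": [
            {"start": 0, "end": 5, "label": "ORG"},
            {"start": 12, "end": 20, "label": "PERSON"},
        ],
        "answer": "accept",
    },
    {
        "text": "Google is in Paris.",
        "spans": [{"start": 0, "end": 6, "label": "ORG"}],
        "answer": "reject",
    },
    {
        "text": "Amazon opened an office.",
        "spans": [{"start": 0, "end": 6, "label": "ORG"}],
        "answer": "accept",
    },
]

counts = label_counts(examples)

# One tag per token; "O" marks tokens outside any entity.
gold = ["ORG", "O", "PERSON", "PERSON", "O"]
pred = ["ORG", "O", "PERSON", "O", "O"]
matrix = confusion_counts(gold, pred)

for (gold_tag, pred_tag), n in sorted(matrix.items()):
    print(f"gold={gold_tag:<7} pred={pred_tag:<7} count={n}")
```

Counting at the token level keeps the sketch simple; a span-level matrix would need a rule for matching partially overlapping entities, which is left out here.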

ines · August 19, 2021, 10:10pm

Hi! Thanks for these suggestions. Some quick comments:

I'm not 100% sure what you mean by this? We typically recommend using spaCy's CNN-based pipelines for training with Prodigy, and that's also the default configuration you get out of the box. Transformer-based pipelines are a nice add-on if you want to squeeze out the final percent of accuracy, but especially during development, what you typically care about most is whether your model is learning. So I agree that transformer embeddings are often overkill here, and spaCy provides good alternatives optimised for CPU.
