spaCy NER models: architecture details

I'm working on a custom NER project. I managed to train a blank spaCy model and a CamemBERT transformer model and compare them, and now I have to write some documentation about both models. I did some research and found out that the blank model is based on a CNN and an LSTM, but there are no details about the layers and parameters used in that architecture, and the same goes for the transformer.
So can anyone help me with some resources?

If you're looking for details on the spaCy v2 NER model, this video by @honnibal explains how it works:

For spaCy v3, the built-in architectures are all documented in the API reference. See here for details:
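If you need the concrete architecture names and hyperparameters for your write-up, one option is to inspect the resolved config of a pipeline directly. Here's a minimal sketch, assuming spaCy v3 is installed (the exact defaults printed depend on your installed version):

```python
import spacy

# Build a blank French pipeline with a default NER component
nlp = spacy.blank("fr")
nlp.add_pipe("ner")

# The resolved config spells out the architecture (e.g.
# "spacy.TransitionBasedParser.v2") and its hyperparameters,
# including the embedded tok2vec (CNN) layer
print(nlp.config["components"]["ner"]["model"])
```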

Not sure how relevant this is for your project, but training a blank spaCy model without initialising it with any pretrained embeddings and then comparing that to a model with pretrained embeddings doesn't sound that useful – unless that comparison is explicitly the point? The only real takeaway you'll get from it is "initialising with embeddings is usually better than initialising without embeddings", which was fairly obvious to begin with. So if you're comparing different architectures (e.g. spaCy's transition-based approach vs. something else), you probably want to train both models using the same embeddings, e.g. CamemBERT: Embeddings, Transformers and Transfer Learning · spaCy Usage Documentation
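For reference, here's a rough sketch of wiring CamemBERT embeddings into a French NER pipeline in code. This assumes spacy-transformers is installed and that "camembert-base" is fetched from the Hugging Face hub; in practice you'd usually generate the equivalent training config with `spacy init config` instead:

```python
import spacy

nlp = spacy.blank("fr")

# The transformer component runs CamemBERT over each batch of docs
nlp.add_pipe("transformer", config={"model": {"name": "camembert-base"}})

# The NER component listens to the transformer's output instead of
# training its own CNN tok2vec, so both share the same embeddings
nlp.add_pipe(
    "ner",
    config={
        "model": {
            "tok2vec": {
                "@architectures": "spacy-transformers.TransformerListener.v1",
                "grad_factor": 1.0,
                "pooling": {"@layers": "reduce_mean.v1"},
                "upstream": "*",
            }
        }
    },
)
```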


Thank you, @ines, for your response!

Yeah, it's just a comparison to show that using a pretrained model gives better results than a non-pretrained one. But that's not the goal of the project, just an observation; the goal is to create an NER model with good accuracy so I can use it in an application.

Ah okay, in that case, you definitely want to be using spaCy v3, because it lets you train two models with the same architecture and settings – one with pretrained embeddings (e.g. CamemBERT) and one without, and maybe another one with just word vectors as features. This way you get a meaningful comparison, because the only variable between the experiments is the pretrained embeddings.
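For the comparison itself, a hedged sketch: score both trained pipelines on the same dev set, so the embeddings really are the only thing that differs. The model and corpus paths below are hypothetical placeholders:

```python
import spacy
from spacy.tokens import DocBin
from spacy.training import Example

for model_dir in ("output_cnn/model-best", "output_trf/model-best"):
    nlp = spacy.load(model_dir)
    # Load the shared dev set and pair each gold doc with a fresh prediction
    docs = list(DocBin().from_disk("corpus/dev.spacy").get_docs(nlp.vocab))
    examples = [Example(nlp.make_doc(doc.text), doc) for doc in docs]
    scores = nlp.evaluate(examples)
    print(model_dir, scores["ents_p"], scores["ents_r"], scores["ents_f"])
```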
