Similar models to en_core_web_lg/en_vectors_web_lg

ines · February 25, 2021, 10:53am

The en_core_web_lg and en_vectors_web_lg packages include word vectors, which are used as features in the model. You can easily create your own using any available pretrained vectors, e.g. from FastText (see https://v2.spacy.io/usage/vectors-similarity#converting). This isn't really "state-of-the-art", but it's a nice efficiency trade-off, because you can easily train these models on your local machine using only a CPU.
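To make the vectors step a bit more concrete, here's a minimal sketch of reading vectors in the FastText `.vec` text format (a header line with the row count and dimensions, then one word and its values per line) and adding them to a blank pipeline. The function names `read_vec` and `add_to_spacy` are made up for illustration, and the sketch assumes spaCy and numpy are installed in the environment where you call `add_to_spacy`.

```python
import io


def read_vec(stream):
    """Parse the .vec text format: 'n_rows n_dims' header, then 'word v1 ... vn'."""
    header = stream.readline().split()
    n_dims = int(header[1])
    vectors = {}
    for line in stream:
        # Split from the right so the word itself stays in one piece
        parts = line.rstrip().rsplit(" ", n_dims)
        if len(parts) != n_dims + 1:
            continue  # skip empty or malformed rows
        vectors[parts[0]] = [float(value) for value in parts[1:]]
    return n_dims, vectors


def add_to_spacy(vectors, lang="en"):
    """Hypothetical helper: copy the parsed vectors into a blank pipeline's vocab."""
    # Imported lazily so the parser above works without spaCy installed
    import numpy
    import spacy

    nlp = spacy.blank(lang)
    for word, values in vectors.items():
        nlp.vocab.set_vector(word, numpy.asarray(values, dtype="float32"))
    return nlp
```

You could then call `add_to_spacy(vectors).to_disk("./my_vectors")` and use that directory as the base model. For full-size vector files, the conversion described on the linked page will be a lot faster than a Python loop like this.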

In spaCy v3, you can initialize your pipelines with transformer weights (including any embeddings available via the Hugging Face transformers library). This gives you results right up at the current state of the art for these tasks, so if you've been seeing good results training with vectors only, you'll likely get a boost in accuracy from initializing with transformer embeddings.

spaCy also lets you share a single transformer across multiple components (e.g. ner and textcat), which makes your pipelines more efficient. You can try it out by exporting your annotations with data-to-spacy and converting them to the new v3 format with spacy convert. You can then generate a training config for your specific requirements (language, components etc.) and train your pipeline with spacy train.
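As a rough sketch of that workflow, the helper below only assembles the commands in order and prints them, so you can review them before running anything. The dataset names, file paths and the `build_workflow` function are placeholders, and the `data-to-spacy` arguments are an assumption based on Prodigy v1.10, so check `prodigy data-to-spacy --help` for your version. The first command belongs in your Prodigy (spaCy v2) environment and the rest in the spaCy v3 one.

```python
import shlex


def build_workflow(ner_dataset, textcat_dataset, lang="en", workdir="./corpus"):
    """Return the export -> convert -> init config -> train commands as argv lists."""
    train_json = f"{workdir}/train.json"
    dev_json = f"{workdir}/dev.json"
    return [
        # 1. Prodigy environment: export merged annotations, holding back 20% for evaluation
        ["prodigy", "data-to-spacy", train_json, dev_json, "--lang", lang,
         "--ner", ner_dataset, "--textcat", textcat_dataset, "--eval-split", "0.2"],
        # 2. spaCy v3 environment: convert the JSON files to the binary .spacy format
        ["python", "-m", "spacy", "convert", train_json, workdir],
        ["python", "-m", "spacy", "convert", dev_json, workdir],
        # 3. Generate a config; --gpu selects the transformer-based setup
        ["python", "-m", "spacy", "init", "config", "config.cfg",
         "--lang", lang, "--pipeline", "ner,textcat", "--gpu"],
        # 4. Train, overriding the corpus paths on the command line
        ["python", "-m", "spacy", "train", "config.cfg", "--output", "./output",
         "--paths.train", f"{workdir}/train.spacy",
         "--paths.dev", f"{workdir}/dev.spacy", "--gpu-id", "0"],
    ]


if __name__ == "__main__":
    for command in build_workflow("my_ner_dataset", "my_textcat_dataset"):
        print(shlex.join(command))
```

With `--pipeline ner,textcat` and `--gpu`, the generated config should contain a single transformer component that both ner and textcat listen to, which is the sharing described above.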

Make sure to use a separate virtual environment, since the latest stable Prodigy requires spaCy v2. We have a pre-release out that updates Prodigy to spaCy v3 – you can read more about it here: ✨ Prodigy nightly: spaCy v3 support, UI for overlapping spans, improved feeds & more
