Transfer Learning for NER

Hello,
I am fairly new to the NLP world, and I am doing a university project about Transfer Learning with Named Entity Recognition. I've been doing some research on this and trying to understand how Prodigy and the spaCy pretrain command could help me with my task.
I want to add a couple of new entities to the es_core_news_md model (Spanish) and train it on a dataset I've been given about Digital Marketing (about 300 .pdf files with raw text on that topic), so that the improved model can recognize these new entities.
My idea at the moment is to do something similar to what Matthew Honnibal did in the "TRAINING A NEW ENTITY TYPE with Prodigy" video, first building a terminology list using spaCy's word vectors. So my first question is: is terms.to-patterns some kind of "pretraining" of the model?
Assuming I do that and then start teaching the model my new entities in context, using predictions generated from the patterns file, here is my final question.
Would it be right to say that I am doing Transfer Learning because I used the source domain, which is es_core_news_md with word vectors, to create rules and then transferred that knowledge to my target model, which could, hopefully, recognize some new entities in the Digital Marketing domain?
I hope I was clear enough. Thank you!

Hi @lucascheistwer,

Glad we can help you get started with NLP, and I hope your project will be successful.

While the approach you've described makes sense, I think there's also a lot of value in keeping things simple, especially when you're first starting out. I would therefore advise you to start off by annotating some data with the ner.manual recipe. This will let you get a feel for your task, and make sure you're able to annotate it consistently. It will also help you understand how much a terminology list can help. You might find that the terms are too ambiguous for your task, or you might find that there's a small number of terms that are very useful.
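
Once you have a few manual annotations, one rough way to check how ambiguous a term list is would be to count how often a term occurrence coincides with an annotated span. The sketch below assumes each annotation is a dict with a `text` and a list of `spans` carrying character offsets (`start`, `end`) and a `label`, which is similar in shape to what Prodigy stores. The sample sentences, the `SERVICIO` label and the `term_precision` helper are all made up for illustration.

```python
import re


def term_precision(examples, terms, label):
    """Count term occurrences (total) and how many of them exactly
    coincide with an annotated span carrying the given label (hits)."""
    hits = 0
    total = 0
    for eg in examples:
        # Character offsets of the manually annotated spans for this label.
        gold = {
            (span["start"], span["end"])
            for span in eg.get("spans", [])
            if span["label"] == label
        }
        for term in terms:
            # Whole-word, case-insensitive search for the term in the raw text.
            pattern = r"\b" + re.escape(term) + r"\b"
            for match in re.finditer(pattern, eg["text"], flags=re.IGNORECASE):
                total += 1
                if (match.start(), match.end()) in gold:
                    hits += 1
    return hits, total


# Hypothetical annotations: the same term is an entity in one sentence only.
examples = [
    {
        "text": "Contratamos SEO para la web.",
        "spans": [{"start": 12, "end": 15, "label": "SERVICIO"}],
    },
    {"text": "El curso de SEO empieza hoy.", "spans": []},
]

hits, total = term_precision(examples, ["seo"], "SERVICIO")
print(f"{hits} of {total} term matches were annotated as SERVICIO")
```

A low ratio suggests the terms are too ambiguous to rely on as patterns; a high ratio suggests the list is worth expanding.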

If you find the annotation is quick with ner.manual, you can also go on to train an initial model with the manual annotations. I do think creating a terminology list with terms.to-patterns, or some other process to produce initial patterns, is likely to be useful -- but it's best to start out a bit more directly, so that you know you're making progress on your main goal.
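
On the earlier question of whether terms.to-patterns is a kind of pretraining: a patterns file holds match rules, not model weights, so nothing is trained at that step. As a minimal sketch of the idea, the code below turns a term list into JSONL entries with a `label` and a list of per-token dicts, which is the token-pattern shape spaCy's matcher accepts. The terms, the `SERVICIO` label and the `terms_to_patterns` helper are assumptions for illustration, and the exact output of the real recipe may differ.

```python
import json


def terms_to_patterns(terms, label):
    """Build one case-insensitive token pattern per term."""
    patterns = []
    for term in terms:
        # Splitting on whitespace is a simplification: spaCy's tokenizer
        # may split some terms differently (e.g. around punctuation).
        tokens = [{"lower": token.lower()} for token in term.split()]
        patterns.append({"label": label, "pattern": tokens})
    return patterns


# Hypothetical terminology list for the digital marketing domain.
patterns = terms_to_patterns(["SEO", "email marketing"], "SERVICIO")

# One JSON object per line, as in a .jsonl patterns file.
jsonl = "\n".join(json.dumps(p, ensure_ascii=False) for p in patterns)
print(jsonl)
```

The patterns only propose candidate spans during annotation; the model learns from the accept/reject decisions you make on them.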

Ok, I'll try that. Thank you very much!

Hi Matthew,

I want to train a custom NER model using the spacy train command, and I intend to apply transfer learning (initialising the weights from a pre-trained language model). Could you point me to a fully working code example for this kind of model training?
Please advise
Thanks

You can find the documentation and examples here:

(Btw, also note that this is the forum for our annotation tool Prodigy. While the questions sometimes touch on spaCy, as spaCy integrates with Prodigy, we won't be able to answer general-purpose spaCy usage questions on here.)

Hello, I'm interested in your question, as I'm also new to the NLP field. Regarding whether using spaCy's pre-trained Spanish model counts as transfer learning or not: I can't find the answer in the comments, so I wonder, did you manage to find an answer?

Hi! If you're using a pre-trained entity recognizer or other component and are then updating it with more examples, that's not what's typically referred to as "transfer learning" – it's more like "domain adaptation". The idea of transfer learning is that you initialise your model with weights trained using a different objective – for instance, a language model trained to predict the next word. These representations encode a lot of knowledge about the language and the world that can be transferred between tasks. So if you use those weights to initialise your named entity recognizer, you can often achieve better results using a smaller set of labelled examples.

To give you a practical example: if you're using transformer weights to initialise a Spanish model and train a tagger, parser and/or entity recognizer (e.g. like spaCy's Spanish transformer-based pipeline), that's what you'd refer to as transfer learning. The same applies if you use spacy pretrain to pretrain weights before training your components.
