Transfer Learning for NER

Hi! If you're using a pre-trained entity recognizer (or another component) and then updating it with more examples, that's not what's typically referred to as "transfer learning" – it's closer to "domain adaptation". The idea of transfer learning is that you initialise your model with weights trained on a different objective – for instance, a language model trained to predict the next word. Those representations encode a lot of knowledge about the language and the world that can be transferred between tasks. So if you use those weights to initialise your named entity recognizer, you can often achieve better results with a smaller set of labelled examples.
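To make the distinction concrete, here's a minimal sketch of the first case – updating an already trained entity recognizer with a few more examples. It assumes spaCy v3, and the pipeline name and the labelled example are just placeholders:

```python
import spacy
from spacy.training import Example

# "Domain adaptation": keep training an NER that's already trained,
# using a handful of new labelled examples (made up here)
nlp = spacy.load("en_core_web_sm")

train_data = [
    ("Acme Corp opened an office in Madrid.",
     {"entities": [(0, 9, "ORG"), (30, 36, "GPE")]}),
]

# resume_training keeps the existing weights instead of reinitialising them
optimizer = nlp.resume_training()
with nlp.select_pipes(enable="ner"):  # only update the entity recognizer
    for _ in range(10):
        for text, annotations in train_data:
            example = Example.from_dict(nlp.make_doc(text), annotations)
            nlp.update([example], sgd=optimizer)
```

The pre-trained weights are the starting point here, but the objective (NER) stays the same – which is why this is adaptation rather than transfer.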

To give you a practical example: if you use transformer weights to initialise a Spanish model and train a tagger, parser and/or entity recognizer on top (e.g. like spaCy's Spanish transformer-based pipeline), that's what you'd refer to as transfer learning. The same goes for using spacy pretrain to pretrain weights before training your components.
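In spaCy v3, both variants are set up via the training config. Very roughly, and with all paths, filenames and data files as placeholders, the CLI flow could look like this:

```
# Initialise a new Spanish pipeline from transformer weights and train an NER on top
python -m spacy init config config.cfg --lang es --pipeline ner --gpu
python -m spacy train config.cfg --output ./model --paths.train ./train.spacy --paths.dev ./dev.spacy

# Or: pretrain a tok2vec layer on your own raw text first ...
python -m spacy init config config.cfg --lang es --pipeline ner --pretraining
python -m spacy pretrain config.cfg ./pretrain_output --paths.raw_text ./raw_text.jsonl
# ... then initialise training from one of the saved epoch weights (e.g. model999.bin)
python -m spacy train config.cfg --output ./model --paths.init_tok2vec ./pretrain_output/model999.bin --paths.train ./train.spacy --paths.dev ./dev.spacy
```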
