Hi and welcome! (And sorry, I don't know French, so I can only reply in English!)
By default, the lemmas are not used as features in the models for NER or POS, so their accuracy won't make a difference. That said, if you're using rule-based lemmatization that takes the POS tags into account, the quality of the POS tags can impact the quality of the lemmas. And of course, lemmas might be useful for extracting information (e.g. to write more generic match patterns) – but that depends on your use case.
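For instance, if you do end up using lemmas for more generic match patterns, a rule-based `Matcher` pattern can key on the `LEMMA` attribute. Here's a minimal sketch, assuming an installed English pipeline like `en_core_web_sm` – the pattern and example text are just placeholders:

```python
import spacy
from spacy.matcher import Matcher

# Any pipeline with a lemmatizer works here – en_core_web_sm is just an example.
nlp = spacy.load("en_core_web_sm")
matcher = Matcher(nlp.vocab)

# Match any inflected form of "buy" followed by a noun, e.g. "bought shares".
matcher.add("BUY_NOUN", [[{"LEMMA": "buy"}, {"POS": "NOUN"}]])

doc = nlp("The company bought shares and is buying more stock.")
for match_id, start, end in matcher(doc):
    print(doc[start:end].text)
```

If the lemmatizer gets "bought" wrong, the pattern silently stops matching – which is why lemma quality matters more for this kind of rule-based extraction than for the NER or POS models themselves.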
This sounds like a good approach – after doing your initial annotations, you typically want to batch train a model (e.g. using `prodigy train` or spaCy directly) so it can learn as accurately as possible from the initial annotations. You can then use that pretrained model as the base model in `ner.correct` and correct its predictions. Prodigy lets you specify the name of an installed spaCy model or a local path, so you can run `prodigy ner.correct your_dataset /path/to/model` etc.
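The model you pass in is just a regular spaCy pipeline, so before pointing `ner.correct` at it you can quickly sanity-check what it predicts. A minimal sketch – `/path/to/model` and the example sentence are placeholders for your own trained pipeline and data:

```python
import spacy

# Load the pipeline you trained on your initial annotations.
# This can be an installed package name or a local path (placeholder here).
nlp = spacy.load("/path/to/model")

doc = nlp("Replace this with a sentence from your own data.")
for ent in doc.ents:
    # These are the spans ner.correct will pre-highlight for correction.
    print(ent.text, ent.label_)
```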
I definitely think that using a model to suggest annotations can save you a lot of time! If you have several labels, make sure you collect enough initial examples to pretrain your model (and enough examples of every label). Maybe start with 200-500 examples (sentences or paragraphs) and then run your first training experiment.
When using NER, make sure that your entity types still follow the same conceptual idea of "named entities", otherwise your model might struggle to learn them efficiently. They don't have to be `PERSON` or `ORG`, but they should work in a similar way and describe distinct expressions like proper nouns with clear boundaries that can be determined from the local context. If that's not the case, a named entity recognition model might not be the right fit for what you're trying to do. Instead, you might want to experiment with a hybrid pipeline of more generic and classic NER labels + a text classification model.
Yes, I think it ultimately depends on how accurate the POS tags predicted by the existing model are – assuming that you're working with a language that spaCy provides a pretrained pipeline for (or where you have an existing corpus like Universal Dependencies).
A good first experiment could be to stream in some random examples with their predicted POS tags and annotate whether they contain errors or not. You could go through the tags one by one, or remove all labels that are incorrect. At the end, you can calculate the error rate – if it's very low, you might not need to do much custom work. If it's higher, you can look at the particular cases the model gets wrong and collect some manual annotations for those examples.
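A minimal sketch of that error-rate check, assuming you rejected every example that contained at least one incorrect tag and saved the decisions to a Prodigy dataset – `"pos_check"` is a placeholder name, and `connect`/`get_dataset` are Prodigy's database helpers:

```python
from prodigy.components.db import connect

# Connect to Prodigy's database and load the annotations you collected.
db = connect()
examples = db.get_dataset("pos_check")

accepted = sum(1 for eg in examples if eg["answer"] == "accept")
rejected = sum(1 for eg in examples if eg["answer"] == "reject")
total = accepted + rejected

if total:
    error_rate = rejected / total
    print(f"{rejected}/{total} examples contained errors ({error_rate:.1%})")
```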
Yes, that's perfect!