This is just a very rough rule of thumb, but the idea is this: if you're starting off with an existing model, its weights are based on the entity labels it was trained on. If you now introduce many new labels, you're constantly "fighting" the existing weights. There's a high chance you'll end up with conflicts, because you're trying to teach the model to suddenly predict something very different from what it learned before, and that can even mess up other entity types. You may end up with confusing results and weights that are difficult to reason about. So if you're looking to introduce many new labels, it's often cleaner and more efficient to train new weights from scratch.
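To make the rule of thumb a bit more concrete, here's a minimal sketch in plain Python. It isn't part of Prodigy or spaCy: the function name `choose_starting_point` and the `max_new_labels` cutoff of 2 are assumptions made up for illustration, since the advice above only says "many" new labels and doesn't give a number.

```python
def choose_starting_point(existing_labels, wanted_labels, max_new_labels=2):
    """Suggest whether to update an existing model or train from scratch.

    Hypothetical helper: the cutoff `max_new_labels` is an assumption,
    not an official recommendation. Tune it to your own data.
    """
    # Labels the existing model has never seen during training.
    new_labels = sorted(set(wanted_labels) - set(existing_labels))
    if len(new_labels) > max_new_labels:
        # Many new labels: likely to "fight" the existing weights.
        return "train-from-scratch", new_labels
    # Few or no new labels: updating the existing weights is reasonable.
    return "update-existing", new_labels


if __name__ == "__main__":
    pretrained = ["PERSON", "ORG", "GPE"]
    print(choose_starting_point(pretrained, ["PERSON", "ORG", "PRODUCT"]))
    print(choose_starting_point(pretrained, ["DRUG", "DOSE", "SYMPTOM", "PERSON"]))
```

The first call adds only one new label, so it suggests updating the existing model. The second adds three, which is above the assumed cutoff, so it suggests starting with new weights.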
Hi there, any chance the flow chart can be updated to replace the deprecated items? It's been great having a flow chart to follow, but being new to the software, it's been a challenge trying to understand what I'm doing wrong when it comes to using ner.teach vs. ner.correct etc.
Also, some good news: be on the lookout for an updated NER workflow very soon. We'll announce it on social media or post it back here.
The recommendations will be the same, but the syntax will be updated. Also, we're adding several links to the documentation and Prodigy support issues to provide context for many of the decision nodes.
We also plan to move on to other models like spancat and textcat soon!