Annotating multi-word phrases using ner.make-gold

Hi, I'm using ner.make-gold to label multi-word entities. Are entities limited to a certain number of words? Does make-gold work like a dictionary, i.e. will Prodigy look for the exact strings I labelled?


The ner.make-gold recipe will stream in the model's predictions and ask you to manually correct them. The result of this is an annotated dataset, and what you do with that later on is up to you. So it doesn't work like a dictionary, and Prodigy won't go looking for the exact strings you labelled. You could use the dataset as training data for a statistical model, in which case the model will learn to generalise from the examples and predict similar entities in context.
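To make the "annotated dataset" part a bit more concrete, here's a minimal sketch of what one exported example might look like and how you could turn it into a (text, entities) pair for training. The field names ("text", "spans", "start", "end", "label", "answer") follow Prodigy's JSON task format as I understand it; the example sentence and the helper `to_training_example` are made up for illustration.

```python
import json

# Hypothetical line from an exported dataset. Each span is stored as
# character offsets into this specific text, not as a list of strings
# to look up elsewhere.
line = (
    '{"text": "Apple opened a store in New York City.", '
    '"spans": [{"start": 0, "end": 5, "label": "ORG"}, '
    '{"start": 24, "end": 37, "label": "GPE"}], '
    '"answer": "accept"}'
)


def to_training_example(task):
    """Convert one task dict into a (text, {"entities": [...]}) pair.

    Hypothetical helper: entities are (start, end, label) tuples.
    """
    entities = [(s["start"], s["end"], s["label"]) for s in task.get("spans", [])]
    return task["text"], {"entities": entities}


task = json.loads(line)
# Only keep examples that were accepted during annotation.
if task["answer"] == "accept":
    text, annotations = to_training_example(task)
    for start, end, label in annotations["entities"]:
        print(label, repr(text[start:end]))
```

The offsets only mean something together with the text they belong to, which is why the data teaches a model about entities in context rather than acting as a lookup list.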

Entities aren't limited to a certain number of words, at least not in terms of labelling. You can highlight single tokens, multiple tokens or even very long spans, that's up to you. Just keep in mind that if your goal is to train a named entity recognition model, the highlighted words should also be named entities. This usually means proper nouns and "real world objects" or "categories of things". (Nobody's going to stop you from labelling half sentences and trying to train a model, but a NER model likely won't be able to learn anything meaningful from that.)
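As a rough illustration of single-token versus multi-token spans, the sketch below builds two spans over the same sentence and counts how many words each one covers. The sentence, the labels and the helpers `make_span` and `count_words` are hypothetical, and splitting on whitespace is only a stand-in for real tokenization.

```python
def make_span(text, phrase, label):
    """Build a span dict for the first occurrence of `phrase` in `text`.

    Hypothetical helper: raises ValueError if the phrase is not found.
    """
    start = text.index(phrase)
    return {"start": start, "end": start + len(phrase), "label": label}


def count_words(text, span):
    """Count whitespace-separated words covered by a span (a simplification)."""
    return len(text[span["start"]:span["end"]].split())


text = "She works at the Bill and Melinda Gates Foundation in Seattle."

spans = [
    make_span(text, "Bill and Melinda Gates Foundation", "ORG"),  # multi-word span
    make_span(text, "Seattle", "GPE"),                            # single-word span
]

for span in spans:
    covered = text[span["start"]:span["end"]]
    print(span["label"], repr(covered), count_words(text, span))
```

Both spans are equally valid from the annotation side. Whether a model can learn from them depends on them being consistent, entity-like phrases.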