NER annotation scheme: GPE vs. LOC

Hello, I'm doing some NER annotation using Prodigy and having difficulty choosing between GPE and GEO.
Are there any good guidelines or clarifications on when to choose one of these?


What's the GEO label? I've never seen that one before. Ultimately, that's not really a question anyone else can answer for you – it depends on the labels you want in your model and how you define the distinctions. If you're working with a pretrained model that was trained on an existing corpus, you can usually find the annotation guidelines used to create that corpus, which explain how the labels are defined. In that case, it makes sense to follow those guidelines so your data stays consistent with what the model was trained on.


Oops, after your comment I checked again and saw that I had misunderstood something. GEO doesn't exist; it was LOC.
Thanks a lot.

Ah, glad it got solved! If you're using the OntoNotes 5 annotation scheme, i.e. the one used by the pretrained English models, the distinction in that corpus is something like this:

  • GPE: geopolitical entities, i.e. anything with a governing body, like cities and countries. Examples: "Germany", "Buenos Aires".
  • LOC: everything else that's a physical location or area. Examples: "Kalahari Desert", "Silicon Valley".
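
To make the distinction concrete, here's a rough sketch of how those four examples could look as annotated records with character-offset spans (`text` plus `spans` with `start`, `end`, `label`). The sentences, the `LABEL_GUIDE` dict and the `make_example` helper are made up for illustration and aren't part of Prodigy or spaCy; treat it as one possible way to sanity-check your labels, not as the official guidelines.

```python
import json

# Working definitions, paraphrased from the list above (assumption: OntoNotes 5 style labels).
LABEL_GUIDE = {
    "GPE": "geopolitical entity with a governing body (countries, cities)",
    "LOC": "other physical location or area without its own government",
}

def make_example(text, phrase, label):
    """Hypothetical helper: build one record with a single labelled span."""
    if label not in LABEL_GUIDE:
        raise ValueError(f"unknown label: {label}")
    start = text.index(phrase)  # raises ValueError if the phrase is not in the text
    span = {"start": start, "end": start + len(phrase), "label": label}
    return {"text": text, "spans": [span]}

# Illustrative sentences only; the entity phrases come from the examples above.
examples = [
    make_example("She moved to Buenos Aires last year.", "Buenos Aires", "GPE"),
    make_example("Germany is hosting the conference.", "Germany", "GPE"),
    make_example("They crossed the Kalahari Desert.", "Kalahari Desert", "LOC"),
    make_example("The startup is based in Silicon Valley.", "Silicon Valley", "LOC"),
]

# One JSON object per line, i.e. JSONL.
for record in examples:
    print(json.dumps(record))
```

A quick rule of thumb that falls out of this: if the thing has its own government, label it GPE; if it's just a place on the map, label it LOC.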

Yep, now it's crystal clear. Thanks for your patience.