I’m tagging a somewhat complex entity type and have been trying to get a model to make reasonable predictions with manual tagging. However, I’m a bit confused about the proper response scheme.
If a proposed chunk of text has no entities, should it be Accept, Reject, or Ignore?
My initial thought was Reject, because it’s not an entity. However, I could also see Accept, because the text correctly has no entities (which seems even more relevant to ner.make-gold). Ignore would also make sense, because there’s no relevant info.
What is the intended usage?
In the manual mode, “Accept” means “the entity annotations in this text are correct”. So if there are no entities marked and no entities missing, you should click Accept.
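To make that concrete, here’s a sketch of what the resulting records look like, using a simplified version of Prodigy’s JSON task format (field names follow the usual `text`/`spans`/`answer` convention; the example texts are made up):

```python
# Simplified Prodigy-style annotation records (illustrative only).
# A text with no entities and none missing is still a valid "accept":
# the answer means "the (possibly empty) set of entity spans is correct".
no_entity_example = {
    "text": "The meeting went well.",
    "spans": [],           # no entities marked, and none are missing
    "answer": "accept",
}

with_entity_example = {
    "text": "Apple hired a new CEO.",
    "spans": [{"start": 0, "end": 5, "label": "ORG"}],
    "answer": "accept",    # the marked span is correct and complete
}
```

So Accept is a statement about the whole set of annotations on the text, not about any single span.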
In a sense, though, with the manual mode the semantics of your annotations are up to you: for instance, nothing stops you from reversing the meaning of “accept” and “reject”. The annotations will be attached to the data either way, for you to work with.
I would suggest using “reject” to mean that the tokenization or input text was incorrect, so that you know the annotation on that example isn’t fully correct.
Thanks for clearing it up for me!
A thought: are Reject and Ignore necessary for make-gold?
As I’m making an evaluation set under these semantics, I realized that in make-gold I’m always accepting, because part of my task is making the annotations correct.
I usually use reject to signal that the tokenisation is wrong for the manual and make-gold recipes. Ignore can always be useful too.
How does the NER model use rejected samples to aid in training?
If the sample was rejected during manual annotation because of bad tokenization, the model wouldn’t be able to learn from that. But here’s what happens during training otherwise.
Let’s say you’ve got some example like “Her daughter is named Apple.”, and the model has tagged “Apple” as ORG. When you hit “reject”, we don’t know which analysis is correct, but we do know that analyses which include “Apple|U-ORG” aren’t right. We use these bits of knowledge to find the most satisfactory set of parses under the current model, and then update the weights so that the score of this set of parses is increased.
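A toy sketch of that idea (not Prodigy’s actual code; the tag set, scoring function, and random scores are all made up for illustration): enumerate candidate analyses, rule out the ones consistent with the rejected annotation, and pick the highest-scoring survivor as the target of the weight update.

```python
import random
from itertools import product

# The rejected analysis tagged "Apple" as U-ORG in this sentence.
tokens = ["Her", "daughter", "is", "named", "Apple", "."]
labels = ["O", "U-ORG", "U-PERSON"]  # tiny illustrative tag set

# Hypothetical per-token scores from the current model.
random.seed(0)
score = {t: {l: random.random() for l in labels} for t in tokens}

def analysis_score(tagging):
    return sum(score[t][l] for t, l in zip(tokens, tagging))

# All candidate analyses (3^6 taggings in this toy setting)...
candidates = list(product(labels, repeat=len(tokens)))

# ...minus the ones that include the rejected "Apple|U-ORG" assignment.
rejected_idx = tokens.index("Apple")
consistent = [c for c in candidates if c[rejected_idx] != "U-ORG"]

# The "most satisfactory" remaining analysis under the current model.
best = max(consistent, key=analysis_score)
# A real trainer would now update the weights so that `best` scores higher.
```

In practice the candidate set isn’t enumerated exhaustively like this, but the filtering logic is the point: a rejection removes a region of the analysis space rather than pinning down the right answer.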
In a paper I would term this a global model with partial supervision. The training objective is similar to the one I used in this paper: https://transacl.org/ojs/index.php/tacl/article/view/234/39 . The citation I give is:
Xu Sun, Takuya Matsuzaki, Daisuke Okanohara, and Jun’ichi Tsujii. 2009. Latent variable perceptron algorithm for structured classification. In IJCAI, pages 1236–1242.
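The core update in that family of models can be sketched as a perceptron step: promote the best analysis consistent with the (partial) supervision and demote the model’s current best unconstrained analysis. Everything below (feature template, learning rate, tag tuples) is illustrative, not the paper’s exact formulation:

```python
from collections import defaultdict

weights = defaultdict(float)

def features(tokens, tagging):
    # Toy feature template: one token|tag feature per position.
    return [f"{tok}|{tag}" for tok, tag in zip(tokens, tagging)]

def update(tokens, predicted, best_consistent, lr=1.0):
    """Perceptron-style update toward the best supervision-consistent analysis."""
    if predicted == best_consistent:
        return
    for f in features(tokens, best_consistent):
        weights[f] += lr   # promote the consistent analysis
    for f in features(tokens, predicted):
        weights[f] -= lr   # demote the model's current prediction

tokens = ["Her", "daughter", "is", "named", "Apple", "."]
update(tokens,
       predicted=("O", "O", "O", "O", "U-ORG", "O"),
       best_consistent=("O", "O", "O", "O", "U-PERSON", "O"))
```

Features shared by both analyses cancel out, so only the disagreement (here, the tag on “Apple”) moves the weights.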
There are several papers that describe how to do this with a structured neural network model instead of a structured perceptron. One of the SyntaxNet papers discusses this at length as though it were new, but there was already a paper by Yue Zhang’s group that did the same thing, and IIRC another one or two before that.