Coreference resolution

ines · February 15, 2021, 4:02am

Yes, the idea behind this workflow is that it assumes you're using the part-of-speech tagger's and/or named entity recognizer's output as features or for candidate selection in a coreference model. So being able to edit those pre-defined annotations would be very misleading: you'd be annotating data for a state that you would never actually get to at runtime.

The rel.manual recipe lets you feed in data that contains pre-annotated "spans" – either set by your own process, or created in a separate annotation process (e.g. ner.manual).
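To make the pre-annotated input more concrete, here is a minimal sketch of how you could build such a record yourself. It assumes the usual task layout of a "text" plus a list of "spans" with character offsets ("start", "end", "label"); the helper name `make_task`, the example sentence and the labels are made up for illustration, so double-check the exact fields against the docs for your Prodigy version.

```python
import json


def make_task(text, mentions):
    """Build one task dict with character-offset spans.

    mentions is a list of (surface string, label) pairs. Each surface
    string is searched for left to right, starting after the previous match.
    """
    spans = []
    cursor = 0
    for surface, label in mentions:
        start = text.find(surface, cursor)
        if start == -1:
            raise ValueError(f"{surface!r} not found after offset {cursor}")
        end = start + len(surface)
        spans.append({"start": start, "end": end, "label": label})
        cursor = end
    return {"text": text, "spans": spans}


# Hypothetical example sentence and labels
task = make_task(
    "Anna said she would call Bob.",
    [("Anna", "PERSON"), ("she", "PRON"), ("Bob", "PERSON")],
)

# One JSON object per line is the shape a JSONL source file would have
line = json.dumps(task)
```

You'd write one such line per example to a JSONL file and use that file as the input source. The offsets need to line up with the token boundaries of whatever tokenizer the recipe uses, so spans that start or end in the middle of a token will likely cause problems.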

Alternatively, you can also provide --patterns that define the tokens and spans to label. For example, one or more tokens tagged as proper noun:

{"label": "PROPN", "pattern": [{"POS": "PROPN", "OP": "+"}]}
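If you have more than one pattern, it can be easier to generate the patterns file from a script. The following is only a sketch: the first entry is the pattern from above, the second one (a single pronoun token) is a hypothetical addition, and `to_jsonl` is just a helper name I picked here.

```python
import json

patterns = [
    # The pattern from above: one or more consecutive proper nouns
    {"label": "PROPN", "pattern": [{"POS": "PROPN", "OP": "+"}]},
    # Hypothetical second pattern: a single pronoun token
    {"label": "PRON", "pattern": [{"POS": "PRON"}]},
]


def to_jsonl(records):
    """Serialize a list of dicts as newline-delimited JSON (one per line)."""
    return "".join(json.dumps(record) + "\n" for record in records)


if __name__ == "__main__":
    # Writes patterns.jsonl into the current directory
    with open("patterns.jsonl", "w", encoding="utf8") as f:
        f.write(to_jsonl(patterns))
```

The resulting file is what you'd then pass in via --patterns.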

In fact, this is also how the coref.manual recipe does it under the hood: it calls into rel.manual and provides some custom patterns to select the candidate spans. You can check out the implementation by looking at recipes/coref.py in your Prodigy installation (run prodigy stats to find out the local path).
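If you'd rather look up the file from Python, something like this should work as well. It's a rough sketch that only assumes the package is importable as `prodigy` in the current environment and that the recipe lives at recipes/coref.py inside it, as mentioned above; `find_recipe_file` is a made-up helper name.

```python
import importlib.util
from pathlib import Path


def find_recipe_file(package="prodigy", relative="recipes/coref.py"):
    """Return the path of a file inside an installed package, or None.

    Returns None if the package is not installed or the file isn't there.
    """
    try:
        spec = importlib.util.find_spec(package)
    except (ImportError, ValueError):
        return None
    if spec is None or spec.origin is None:
        return None
    candidate = Path(spec.origin).parent / relative
    return candidate if candidate.exists() else None


if __name__ == "__main__":
    print(find_recipe_file())
```

If this prints None, the package either isn't installed in the active environment or the file layout differs in your version, in which case prodigy stats is the more reliable way to find the path.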

Note that this wouldn't allow you to annotate overlapping spans in the text: the relations UI currently only shows one "layer" of spans, because everything else would get really messy and difficult to visualise in a way that's actually helpful. So if you have nested spans and complex relations, it probably makes sense to deal with this separately, or use the relation labels to encode the entity types. I recently posted an example of how this could look for non-contiguous entity spans.
