Coreference resolution

Yes, this workflow assumes you're using the part-of-speech tagger's and/or named entity recognizer's output as features or for candidate selection in the coreference model. So being able to edit those pre-defined annotations would be very misleading – you'd be annotating data for a state that you would never actually get to at runtime.

The rel.manual recipe lets you feed in data that contains pre-annotated "spans" – either set by your own process, or created in a separate annotation process (e.g. ner.manual).
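
To make the input format concrete, here's a minimal sketch of building one pre-annotated example in Python. It assumes the usual JSONL task format with a "text" and a list of "spans" given as character offsets plus a label. The sentence, the labels and the make_task helper are made up for illustration and aren't part of Prodigy.

```python
import json


def make_task(text, entities):
    """Build one task dict with character-offset spans.

    entities is a list of (substring, label) pairs. Each substring is
    located by its first occurrence in the text, which is a simplification
    for this sketch and won't handle repeated mentions.
    """
    spans = []
    for substring, label in entities:
        start = text.find(substring)
        if start == -1:
            raise ValueError(f"{substring!r} not found in text")
        spans.append({"start": start, "end": start + len(substring), "label": label})
    return {"text": text, "spans": spans}


# Hypothetical example data
task = make_task(
    "Anna said she would call Ben.",
    [("Anna", "PERSON"), ("Ben", "PERSON")],
)

# One JSON object per line is what a .jsonl source file would contain
line = json.dumps(task)
print(line)
```

Each line produced this way can go into a .jsonl file that you then load as the source for the recipe.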

Alternatively, you can also provide --patterns that define the tokens and spans to label. For example, one or more tokens tagged as a proper noun:

{"label": "PROPN", "pattern": [{"POS": "PROPN", "OP": "+"}]}
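
If you want to generate the patterns file from a script, something like the following sketch would work. It writes one JSON object per line, which is the layout a patterns file passed via --patterns is expected to have. The second pattern (single pronouns) and the file name coref_patterns.jsonl are assumptions added for illustration.

```python
import json

patterns = [
    # One or more consecutive proper-noun tokens (the pattern from above)
    {"label": "PROPN", "pattern": [{"POS": "PROPN", "OP": "+"}]},
    # Hypothetical extra pattern: a single pronoun token
    {"label": "PRON", "pattern": [{"POS": "PRON"}]},
]


def to_jsonl(records):
    """Serialize a list of dicts as newline-delimited JSON."""
    return "\n".join(json.dumps(record) for record in records) + "\n"


jsonl = to_jsonl(patterns)

# Hypothetical output path; adjust to wherever you keep your patterns
with open("coref_patterns.jsonl", "w", encoding="utf8") as f:
    f.write(jsonl)
```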

In fact, this is also how the coref.manual recipe does it under the hood: it calls into rel.manual and provides some custom patterns to select the candidate spans. You can check out the implementation by looking at recipes/coref.py in your Prodigy installation (run prodigy stats to find out the local path).

This wouldn't allow you to annotate overlapping spans in the text, though. The relations UI currently only shows one "layer" of spans, because anything more would get really messy and difficult to visualise in a way that's actually helpful. So if you have nested spans and complex relations, it probably makes sense to deal with this separately, or use the relation labels to encode the entity types. I recently posted an example of how this could look for non-contiguous entity spans:
