Please confirm if this is possible; I think it is not possible, in which case pls consider this as a feature idea (not so much a request).
I would like to annotate texts to create entity spans in an example and then annotate those entities with relations.
rel.manual allows this, but the interface is tough since each word in the example is rendered as an image. It would be nice if the example text could be rendered as it is in the ner interface (i.e. as text). Words would only be able to participate in a relation if they have been annotated as an entity span.
Hi! What exactly are you trying to do that would need the tokens to be rendered as actual text? To allow the more complex interactions for drawing relationship arcs, including the flexible arrows, token boundaries, token merging etc, the relations UI uses a <canvas> and that's also what the text is drawn on.
If you know that only some tokens are relevant for what you're doing, it's definitely a good idea to reduce visual complexity by only allowing certain tokens to be selected, defined by match patterns. That's also how some of the more specific built-in workflows like the coref recipes do it.
The easiest way to do this would be to use the ner.manual recipe to annotate the entities, then feed that data forward into a workflow like rel.manual with disable patterns that disable all tokens that are not entities (e.g. tokens with "ENT_TYPE": ""). You'll then only be able to connect entities, not all tokens.
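To make the disable-pattern idea concrete, here is a minimal sketch that writes a one-line patterns file matching any token with an empty entity type. The file name is hypothetical. The assumption that a pattern entry needs only a "pattern" key (and no "label") should be checked against the Prodigy docs, as should how the recipe determines each token's entity type.

```python
import json
from pathlib import Path

# Hypothetical output path; use whatever location suits your project.
PATTERNS_PATH = Path("disable_non_entities.jsonl")

# One token-level match pattern: a single token whose entity type is empty,
# i.e. a token that is not part of any entity span.
# Assumption: a disable pattern needs only a "pattern" key.
disable_patterns = [
    {"pattern": [{"ENT_TYPE": ""}]},
]


def patterns_to_jsonl(patterns):
    """Serialize patterns as JSONL: one JSON object per line."""
    return "".join(json.dumps(p) + "\n" for p in patterns)


jsonl_text = patterns_to_jsonl(disable_patterns)
PATTERNS_PATH.write_text(jsonl_text, encoding="utf-8")
print(jsonl_text, end="")
```

The resulting file is what you would hand to rel.manual as its disable patterns, so that only tokens inside entity spans remain selectable.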
Thanks much, Ines. I think disable patterns will get us there.
That said what about the following:
"John supported France instead of Spain because his older brother supported France too."
I might want to annotate that "John" and "France" participate in something like a Support relation, and connect an arc to an explanatory span. Easy enough to enable Person and Country ents for example, but it would be hard to enable (not disable) text like the "because his older brother supported France".
The general idea is to annotate a relation along with explanatory text. This is unusual, but I was considering this approach to produce a dataset of ents, relations and explanations to try some experiments in extractive summarisation.
Any thoughts welcome, but thanks for pointing out the disable pattern feature.
That's true, and if your annotation objective includes "every token can in theory be connected to any other token", it makes it more difficult to simplify the task because you'll have to allow the possibility of any token being connected.
Another potential problem you'll very easily have with the explanatory texts is that it'll be hard to enforce consistency because the boundaries are often vague and ambiguous. One annotator may highlight "because [...] too", someone else might not include the "because", someone else might only highlight "his older brother". You can solve this with more detailed annotation instructions or by introducing more constraints around what can be selected.
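One rough way to spot this kind of boundary disagreement is to compare the token ranges that different annotators highlighted for the same example. The sketch below is not a Prodigy feature. It uses naive whitespace tokens, made-up annotator names and token-level Jaccard overlap as a stand-in agreement measure, with the three hypothetical highlights described above.

```python
from itertools import combinations

TEXT = ("John supported France instead of Spain because his older "
        "brother supported France too.")
TOKENS = TEXT.split()  # naive whitespace tokens, for illustration only

# Hypothetical annotations as (start, end) token offsets, end exclusive.
annotations = {
    "annotator_a": (6, 13),  # "because his older brother supported France too."
    "annotator_b": (7, 13),  # "his older brother supported France too."
    "annotator_c": (7, 10),  # "his older brother"
}


def token_jaccard(span_a, span_b):
    """Shared tokens divided by all tokens covered by either span."""
    a = set(range(*span_a))
    b = set(range(*span_b))
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def pairwise_agreement(spans):
    """Jaccard overlap for every pair of annotators."""
    return {
        (x, y): token_jaccard(spans[x], spans[y])
        for x, y in combinations(sorted(spans), 2)
    }


for pair, score in pairwise_agreement(annotations).items():
    print(pair, round(score, 2))
```

Low scores on many examples would suggest the instructions need tightening before annotating at scale.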
One thing to experiment with in your case could be to not mix the very straightforward entity relation annotation and the more vague explanatory span highlighting in the same interface, because those are pretty different types of annotations. For example, you could focus on the actual relation annotations first, and then make another pass over the data where you display the relation statically and then show the raw text again and let the annotator highlight the explanation. (This could especially be helpful in the beginning if you have multiple annotators because you can annotate the same example multiple times and it'll make it pretty easy to spot significant conflicts in boundaries etc.)
then feed that data forward into a workflow like rel.manual
Hello Ines, thanks for all of your awesome work.
I wondered what you meant here: how can I feed the results of a NER task into a subsequent REL task? Is there a way to have both NER and REL in the same 'blocks'?
(If you did it both in the same interface, you'd kinda end up with the same problem as described earlier in the thread where your possible spans have very ambiguous boundaries and you only know the constraints for the relations once you've annotated the spans. So it can make more sense to do it in two steps.)
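As a rough sketch of the two-step hand-off, the snippet below filters exported NER annotations down to accepted examples that contain at least one span, keeping only the text and spans for the relation pass. It assumes each exported line is a JSON object with "text", "spans" and "answer" keys. The sample data is made up, and how the filtered examples are then loaded into rel.manual should be checked against the Prodigy docs.

```python
import json


def accepted_with_spans(jsonl_lines):
    """Yield accepted examples that have at least one entity span."""
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        eg = json.loads(line)
        if eg.get("answer") == "accept" and eg.get("spans"):
            # Keep only what the second pass needs.
            yield {"text": eg["text"], "spans": eg["spans"]}


# Made-up stand-in for an exported NER dataset (one JSON object per line).
ner_export = [
    json.dumps({
        "text": "John supported France.",
        "spans": [
            {"start": 0, "end": 4, "label": "PERSON"},
            {"start": 15, "end": 21, "label": "COUNTRY"},
        ],
        "answer": "accept",
    }),
    json.dumps({"text": "No entities here.", "spans": [], "answer": "accept"}),
    json.dumps({
        "text": "Skipped example.",
        "spans": [{"start": 0, "end": 7, "label": "PERSON"}],
        "answer": "reject",
    }),
]

rel_input = list(accepted_with_spans(ner_export))
for eg in rel_input:
    print(json.dumps(eg))
```

Writing these lines to a JSONL file gives you a source for the relation step in which the entity spans are already present, so the second pass only has to connect them.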