Manual dependency annotation in text (not binary)


(Boris) #1

Hello team
Trying to figure out if there is a way to manually annotate dependencies in text.
I’ve checked but it doesn’t suit my needs, as I don’t need a binary interface but one where a person manually creates two things together:

  1. Assigning Named Entity tags
  2. Creating predefined relationships between them.

The main issue seems to be UI support: an interface where it’s possible not only to display those arcs but also to draw them between tokens.

I was also thinking about a workaround where I just have extra NE tokens like “start_arc_type1”/“end_arc_type1” and so on, but then I would need to be able to assign multiple tags to one token, which also doesn’t seem to be possible.

Thank you.

(Ines Montani) #2

Hi and sorry, I think I somehow missed this thread!

You’re right that we currently don’t have a solution for fully manual dependency annotation – we haven’t yet developed a method that’s more efficient than what’s already available in tools like Brat. So if you want to do 100% manual dependency annotation from scratch, a tool like Brat is probably a much better solution than Prodigy.

Are you able to share what types of relationships / labels you’re annotating?

Annotating two tokens with multiple labels each and a dependency between them in one go definitely introduces a high cognitive load (and higher error potential). So we usually recommend trying to break these things down into smaller tasks that can be completed independently, and looking for things you can automate. (This is especially relevant if you’re still in the development phase – if you need to adjust your label scheme at some stage, you’ll be able to iterate quicker and hopefully need to throw away less data).

Some examples: If your multi-class label scheme is hierarchical, it’s often more efficient to focus on top-level categories and then do the fine-grained classification in a separate step. If you have a pre-trained model, you could use that to pre-select suggestions, or to define the constraints and limit the options (e.g. if a custom relationship only exists between nouns and your part-of-speech tagger is good, this lets you automate a lot of decisions a human would otherwise have to go through manually).
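
To illustrate the part-of-speech constraint, here is a minimal sketch in plain Python. It is not Prodigy's API: the sentence, the tags, the `EMPLOYS` label, the `candidate_pairs` helper and the dictionary keys are all made up for the example, and the tags are written by hand where a real tagger's output would go. The idea is only that filtering on nouns turns one open-ended drawing task into a short list of yes/no questions.

```python
from itertools import permutations

# Hypothetical pre-tagged sentence: (token, coarse POS tag) pairs,
# standing in for the output of a part-of-speech tagger.
tagged = [
    ("Acme", "PROPN"),
    ("hired", "VERB"),
    ("the", "DET"),
    ("engineer", "NOUN"),
    ("in", "ADP"),
    ("Berlin", "PROPN"),
]

# Assumption for this example: the custom relation only holds between nouns.
NOUN_TAGS = {"NOUN", "PROPN"}


def candidate_pairs(tagged_tokens, label):
    """Yield one yes/no question per ordered pair of noun tokens."""
    noun_ids = [i for i, (_, pos) in enumerate(tagged_tokens) if pos in NOUN_TAGS]
    for head, child in permutations(noun_ids, 2):
        yield {
            "head": head,    # token index of the proposed head
            "child": child,  # token index of the proposed child
            "label": label,
            "question": f"{tagged_tokens[head][0]} --{label}--> {tagged_tokens[child][0]}?",
        }


tasks = list(candidate_pairs(tagged, "EMPLOYS"))
for task in tasks:
    print(task["question"])
```

With three nouns in a six-token sentence, the annotator sees 6 candidate arcs instead of the 30 ordered token pairs that would otherwise be possible, and each one can be accepted or rejected on its own.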

That said, there are definitely cases where you just can’t get around doing everything manually and we’re still working on a solution to make this possible and more efficient using Prodigy :slightly_smiling_face: