I've been going through some of the pre-existing recipes and I absolutely love the tool's ease of use, the visuals and the customization options. In my research, I am interested in error/suggestion annotation of (machine) translation. You typically annotate words/spans of the translation and label them with a category from an existing hierarchy (similar to the span categorization recipe). A potential addition is a free-text comment on top of the category label, so that the annotator can provide some more information about their choice. I think I can figure out those two things myself.
The potentially harder issue that I am facing is incorporating the source text in the annotation scheme. Some errors are not mistakes in the target language itself (like grammatical errors) but are wrong because the span is not a correct translation of the source. It would therefore be incredibly useful to be able to link the labeled translation span to a span in the source text. This would require a couple of things, and I am asking in this topic whether it is feasible to build that myself within Prodigy (or whether there are plans to add such functionality in the future). I can think of the following:
- access to two sentences in the interface, the source sentence and the target sentence;
- ideally the option to have (read) access to other sentences in the same document;
- the ability to mark spans in both source/target (already possible with span categorization);
- the ability to link two spans to each other across source/target sentences;
- have a useful way to export this information.
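To make the list above a bit more concrete, here is a rough sketch in plain Python of the kind of record I would hope to export. This is only an illustration under my own assumptions: the field names (`source`, `target`, `context`, `links`, `comment`), the `make_span` helper, the example sentences and the category label are all made up by me and are not an existing Prodigy format or API.

```python
import json

# Hypothetical example sentences (German source, English MT output).
source = "Der Hund schläft."
target = "The cat sleeps."


def make_span(text, start, end, label=None, comment=None):
    """Build a character-offset span; label and comment are optional."""
    span = {"start": start, "end": end, "text": text[start:end]}
    if label is not None:
        span["label"] = label
    if comment is not None:
        span["comment"] = comment
    return span


# Hypothetical record layout: one annotation instance holds both sentences,
# optional read-only document context, the spans on each side, and links
# that pair a source span with a target span by list index.
record = {
    "source": {"text": source, "spans": [make_span(source, 4, 8)]},
    "target": {
        "text": target,
        "spans": [
            make_span(
                target, 4, 7,
                label="Accuracy/Mistranslation",  # made-up hierarchical label
                comment="'Hund' means dog, not cat",
            )
        ],
    },
    "context": ["Previous sentence of the same document."],
    "links": [{"source_span": 0, "target_span": 0}],
}

# One JSON object per line, so a whole corpus could be stored as JSONL.
line = json.dumps(record, ensure_ascii=False)
print(line)
```

Storing links as index pairs keeps the spans themselves unchanged, so the same record would still work for span-only annotation when no source span is linked.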
Linking spans to each other also seems useful for entity linking and other coreference use-cases. I guess that the hardest part would be to incorporate both a source and target sentence in a single annotation instance.
If this is not feasible within Prodigy because that's simply outside its scope, I completely understand!