Audio UI enhancement: keyboard shortcuts and clickthrough

TSSlade · September 19, 2020, 7:44pm

Is there a mechanism to initiate (and then conclude) an audio_span via keyboard shortcut?

In an ideal world:

1. I could map e.g. spacebar to an action like toggle_span so that I could do a rough cut of audio annotation without needing to remove my hands from the keyboard
2. it would apply whatever option is currently selected
3. the method for selecting an existing annotation span and thus shifting it/removing it would be distinct
4. the second press of the spacebar would close the span that had been opened by the initial press.

...point 3. being very important because my use case is that of overlapping voices. So I'd like to be able to do a pass through my looping audio while applying label-1, then toggle over to label-2 by pressing '2' on my keyboard, and be able to mark the onset of a label-2 span even in the middle of an existing label-1 span by hitting spacebar (or whatever).

If the 'close active open span' action had to be mapped to a different keystroke, that'd be fine too (although marginally less smooth).

If there were a distinct action for toggling an audio_span that was basically "remove if span exists" or "truncate span to this cursor position" or something, I'd want to be able to assign that to a different keyboard shortcut.
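
As a rough illustration of the toggle behaviour requested above, here is a minimal TypeScript sketch of the state logic only. It assumes nothing about Prodigy or WaveSurfer: `SpanToggler`, `selectLabel` and `toggle` are hypothetical names, and wiring them to real key events and to the audio UI is left out.

```typescript
// Hypothetical sketch: none of these names come from Prodigy or WaveSurfer.
interface AudioSpan {
  start: number; // seconds
  end: number; // seconds
  label: string;
}

class SpanToggler {
  readonly spans: AudioSpan[] = [];
  private activeLabel: string;
  // The span opened by the first key press, if any.
  private open: { start: number; label: string } | null = null;

  constructor(initialLabel: string) {
    this.activeLabel = initialLabel;
  }

  // Stands in for pressing a label key such as "2".
  selectLabel(label: string): void {
    this.activeLabel = label;
  }

  // Stands in for pressing the spacebar at playback position `time`.
  // The first press opens a span with the active label; the second closes it.
  // Existing spans are never consulted, so overlapping spans are allowed.
  toggle(time: number): void {
    const open = this.open;
    if (open === null) {
      this.open = { start: time, label: this.activeLabel };
      return;
    }
    this.spans.push({
      start: Math.min(open.start, time),
      end: Math.max(open.start, time),
      label: open.label,
    });
    this.open = null;
  }
}

// Usage: a label-2 span marked in the middle of an existing label-1 span.
const toggler = new SpanToggler("label-1");
toggler.toggle(1.0);
toggler.toggle(5.0);
toggler.selectLabel("label-2");
toggler.toggle(2.0);
toggler.toggle(3.5);
console.log(toggler.spans);
```

Because opening a span never inspects the spans already recorded, selecting or removing an existing span would have to be a separate action, which is what point 3 asks for.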

TSSlade · September 20, 2020, 2:56am

When annotating audio via audio.manual or a similar custom recipe, it is common for audio_spans requiring different labels to be partially or fully overlapping.

In those instances, it appears that an out-of-the-box audio.manual approach does not support initiating a new audio_span at a point that is already encompassed by an existing audio_span; you have to either

1. start at the end (assuming it ends after the existing span ends) and trace it backward to its origin, or
2. temporarily displace the existing span out of the way and replace it once the current span's boundaries have been defined.

For spans which are fully encapsulated by other spans, the approach (2.) is the only option.

Is there a mechanism to allow, for instance, an initial click to select the top-level span, and a second click to engage with the underlying audio waveform? Assuming there is not, would it be possible to enable such an interaction sequence via custom JavaScript or come up with an alternative that would accomplish the same goal of being able to initiate an overlapping audio_span without needing to displace the original?
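
To make the requested interaction concrete, here is a minimal TypeScript sketch of how a click could be routed depending on a clickthrough toggle. It is only the decision logic under stated assumptions: `Region`, `ClickAction` and `resolveClick` are hypothetical names, not part of Prodigy or WaveSurfer, and the sketch assumes regions are stored in drawing order with the last one on top.

```typescript
// Hypothetical sketch: these types and names are not Prodigy or WaveSurfer APIs.
interface Region {
  id: string;
  start: number; // seconds
  end: number; // seconds
}

type ClickAction =
  | { kind: "select"; regionId: string }
  | { kind: "create"; start: number };

// Decide what a click at `time` should do.
// With clickthrough off, the topmost region under the cursor is selected.
// With clickthrough on, the click reaches the waveform and starts a new span,
// even when the position is already covered by an existing region.
function resolveClick(
  regions: Region[],
  time: number,
  clickthrough: boolean
): ClickAction {
  if (!clickthrough) {
    // Assumption: later regions are drawn on top, so search from the end.
    for (const region of regions.slice().reverse()) {
      if (time >= region.start && time <= region.end) {
        return { kind: "select", regionId: region.id };
      }
    }
  }
  return { kind: "create", start: time };
}

// Usage: the same click position with the toggle off and on.
const existing: Region[] = [{ id: "outer", start: 1, end: 5 }];
console.log(resolveClick(existing, 2, false));
console.log(resolveClick(existing, 2, true));
```

The same function would cover a modifier-key variant: passing the inverse of the modifier state as `clickthrough` makes creating a span the default and selecting the exception.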

ines · September 22, 2020, 9:30am

Thanks for the detailed enhancement suggestions! I merged both topics into one thread because they're both related to the same interface and your specific task.

For the clickthrough mechanism, we could consider something similar to what the image_manual UI has: a keyboard shortcut that lets you toggle clickthrough/not clickthrough. The main challenge for the audio UI is mostly the integration with WaveSurfer, so I'd have to see what's possible.

TSSlade · September 24, 2020, 3:04am

Thanks for considering them! In the meantime one of the workarounds we're considering is the business of having multiple rounds of annotation on the same input stream (à la Multi-stage speaker audio classification with pyannote.sad.manual and audio manual). It's not as ideal, as much because of the context-switching as because of the additional overhead of getting the audio input chunking to work...but if others stumble across this thread before that enhancement makes it onto the roadmap and into reality, it's one possible resolution.

tcwalther · February 3, 2021, 11:32am

The clickthrough enhancement for fully overlapping labels is something I'd also be very interested in. I think I'd prefer a keyboard modifier for dragging a region instead of having the click-through, so that adding a region becomes the default behaviour. Or, if that's not possible, simply disable dragging of regions (resizing at the edges should still be allowed). If I understand this PR correctly, that should fix it:

ines · February 4, 2021, 9:51am

Ah cool, thanks for the pointer! I will try this out. And now that I think about it, a shortcut for resizing/selecting would probably also make it consistent with the image annotation UI, which defaults to clickthrough and allows clicking on the whole shape by pressing shift.

tcwalther · February 18, 2021, 10:30am

Hey Ines,

Do you have an update or ETA on when this would land in the nightly? It currently prevents me from using Prodigy to label our multi-label dataset.

Best,
Thomas

ines · February 19, 2021, 12:45pm

We don't have an ETA for this currently, sorry! I do think it's a good feature request, but I don't want to disrupt your project plans by asking you to wait for it. You can keep an eye on the changelog here to see when it lands, and I'll also be updating this thread.

TSSlade · May 14, 2021, 1:57am

Hey, Ines! Thanks for all the work you and the team have done on Prodigy. Any update on whether this feature is on a near- or medium-term roadmap?
