variable audio_rate for audio annotation

I have an audio annotation task that is focused on applying multiple (occasionally overlapping) classes to speech regions. It would be extremely helpful to be able to accelerate or decelerate the playback rate to make it easier for annotators to carefully specify onset/offset boundaries, especially in areas of overlapping speakers.

Is this possible, whether out of the box or through some workaround? I've only been able to find this mention of an audio_rate configuration parameter, but no discussion of whether there's a way to update that on the fly as opposed to setting it once at the outset of the session and then just living with it.

Hi! At the moment, the audio_rate setting can only be defined once when you start up the server, and it's then configured on the interactive audio player when it's created.

I'll put this on my list of enhancement issues and see if we can at least make it adjust the rate if the current annotation task is updated. Then you'd be able to call something like window.prodigy.update from JavaScript and even implement your own custom controls using a custom HTML block with a button, slider, dropdown or whatever else you need.


Update: Just released Prodigy v1.10.4, which exposes the underlying WaveSurfer instance via window.wavesurfer, so you can access it and implement custom controls (like audio rate, but also various other settings). wavesurfer.js is the library Prodigy uses under the hood, and it exposes different methods on its player.
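A few of the player methods that become reachable this way, as a quick sketch (check the wavesurfer.js docs for the exact API of the version Prodigy bundles):

// All of these are standard wavesurfer.js player methods, called on the
// instance Prodigy exposes as window.wavesurfer (v1.10.4+).
window.wavesurfer.setPlaybackRate(0.5); // slow playback down to half speed
window.wavesurfer.skip(2);              // jump forward by two seconds
window.wavesurfer.setVolume(0.8);       // adjust the playback volume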

For example, you could use a custom interface with two blocks: audio_manual for the audio UI and html for some custom controls. The "html_template" could look like this:

<button onclick="window.wavesurfer.setPlaybackRate(2)">2x speed</button>
<button onclick="window.wavesurfer.setPlaybackRate(1)">1x speed</button>
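You could also wire the controls up from custom JavaScript instead of inline onclick handlers, for example with a slider. As a hedged sketch (the #rate-slider id and range values are illustrative and assume the html_template contains a matching <input type="range"> element):

// Event delegation on the document, so the handler still works even though
// Prodigy renders the html_template after this script has run.
// Assumes: <input type="range" id="rate-slider" min="0.25" max="2" step="0.25" value="1">
document.addEventListener('input', function (event) {
  if (event.target && event.target.id === 'rate-slider' && window.wavesurfer) {
    window.wavesurfer.setPlaybackRate(parseFloat(event.target.value));
  }
});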

This is fantastic news, @ines! Thank you for adding this, and doing so so quickly!

@ines - thank you for the new feature, and the guidance! I've been able to get something running, which is incredibly exciting.

It occurs to me, now that I've seen it in action: wavesurfer.js doesn't have a time-stretching feature, does it? It'll be pretty difficult to do the annotation I want when the pitch is transformed along with the tempo at an audio_rate of 0.5 or 0.1, especially since part of the tagging we need to do depends on the pitch of the audio we're listening to.

I am fairly certain that swapping out wavesurfer.js for some other library would be do-able, given how modularly you all built this thing... and it might even be worth my attempting to figure out (although I'd be painfully slow!). I just don't know the JS library ecosystem well enough to know what's already out there that would do everything you've called upon wavesurfer to do AND ALSO handle time-stretching for me...

Looks like it's do-able within WaveSurfer itself, more or less! (Paired with the soundtouch.js plugin, I guess?)

Here's their demo: http://wavesurfer-js.org/example/stretcher/

And the PR that added the feature, which helpfully links back to various associated issues and discussions: https://github.com/katspaugh/wavesurfer.js/pull/1214

So I guess the question is how to access whatever wavesurfer.js is doing within Prodi.gy to enable the appropriate soundtouch.js filter.

Cool to see that it's possible with Wavesurfer, that definitely makes things easier! (While it's theoretically possible to integrate another player as well, it would have been a pretty involved task because there are many moving parts to consider to really make the interactive audio annotation work.)

I just had a quick look at the integration of soundtouch.js here: http://wavesurfer-js.org/example/stretcher/app.js

It does require a bit of code to be added in different places, but afaik, Wavesurfer lets you add multiple handlers (e.g. wavesurfer.on('play', () => {})), so you could try just adding the time stretcher code to the existing window.wavesurfer object exposed by Prodigy via custom JavaScript and see what happens. You also need to make sure soundtouch is available, but if you're lazy and just want to test things, you could literally just copy-paste the code into your custom JavaScript string.
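For instance, attaching one extra handler to the exposed instance is enough to verify the idea (just a sketch):

// Runs alongside Prodigy's own listeners rather than replacing them.
window.wavesurfer.on('play', function () {
  console.log('playback started at rate', window.wavesurfer.getPlaybackRate());
});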

Forgive me if these are dumb questions, but I am pretty far outside of my depth when it comes to JS.

As an initial cut, I basically loaded in the contents of both soundtouch.js and stretcher.js into separate <script type='text/javascript'>the file contents here</script> elements in the same custom html_template file that is successfully serving up my acceleration and deceleration buttons.

I basically made the following changes:

'use strict';

// Create an instance
// var wavesurfer = {};  // TS: Commented out b/c I want to link to the window.wavesurfer instance

// Init & load
document.addEventListener('DOMContentLoaded', function() {
    // Init wavesurfer
//    wavesurfer = WaveSurfer.create({  // TS: As above - this would create a new one, and I want to plug into an existing one...
//        container: '#waveform',
//        waveColor: 'violet',
//        progressColor: 'purple',
//        loaderColor: 'purple',
//        cursorColor: 'navy'
//    });
//    wavesurfer.load('../../example/media/demo.wav');

    var wavesurfer2 = window.wavesurfer;  // TS: Linking my new var to the existing object...updated all subsequent instances of `wavesurfer` to `wavesurfer2`

    // Time stretcher
    wavesurfer2.on('ready', function() {
        var st = new window.soundtouch.SoundTouch(
            wavesurfer2.backend.ac.sampleRate
        );
        var buffer = wavesurfer2.backend.buffer;
        var channels = buffer.numberOfChannels;
        var l = buffer.getChannelData(0);
        var r = channels > 1 ? buffer.getChannelData(1) : l;
        var length = buffer.length;
        var seekingPos = null;
        var seekingDiff = 0;

        var source = {
            extract: function(target, numFrames, position) {
                if (seekingPos != null) {
                    seekingDiff = seekingPos - position;
                    seekingPos = null;
                }

                position += seekingDiff;

                for (var i = 0; i < numFrames; i++) {
                    target[i * 2] = l[i + position];
                    target[i * 2 + 1] = r[i + position];
                }

                return Math.min(numFrames, length - position);
            }
        };

        var soundtouchNode;

        wavesurfer2.on('play', function() {
            seekingPos = ~~(wavesurfer2.backend.getPlayedPercents() * length);
            st.tempo = wavesurfer2.getPlaybackRate();

            if (st.tempo === 1) {
                wavesurfer2.backend.disconnectFilters();
            } else {
                if (!soundtouchNode) {
                    var filter = new window.soundtouch.SimpleFilter(source, st);
                    soundtouchNode = window.soundtouch.getWebAudioNode(
                        wavesurfer2.backend.ac,
                        filter
                    );
                }
                wavesurfer2.backend.setFilter(soundtouchNode);
            }
        });

        wavesurfer2.on('pause', function() {
            soundtouchNode && soundtouchNode.disconnect();
        });

        wavesurfer2.on('seek', function() {
            seekingPos = ~~(wavesurfer2.backend.getPlayedPercents() * length);
        });
    });
});

...and that's pretty much it. I don't really understand quite enough about how JS works to make other edits, but my assumption was that if

  1. rather than instantiating a NEW wavesurfer instance, I pointed the code above to the existing instance created by Prodigy,
  2. I removed any other instantiation code that would be creating a player view (b/c Prodigy obviously handles that already), and
  3. I pointed all the other invocations of the wavesurfer object to my new (renamed) alias of the window.wavesurfer object,

then perhaps I'd have some luck.

So far, no joy...but I'm not sure what other threads to start pulling at.

In theory, this should work, yes! What's going wrong in your case really depends on the errors you're seeing, etc. The first thing I would check is whether your other .js files are actually loaded correctly, and loaded before your custom JS. For a quick hacky test, you can literally just copy-paste the contents of the files and put them above your other code – this way, you can be sure it runs in the right order.

Maybe I'm missing something fundamental about which files are involved in this process and how they get processed by Prodigy. So here's my attempt at quick-and-hacky:

  1. Soundtouch.js and Stretcher.js each in a script tag at the top of the same custom html_template that is being loaded by my recipe.
  2. :white_check_mark: Confirmed: the HTML template IS being loaded, as evidenced by the playback-speed-controlling buttons showing up at the top of my interface.
  3. :x: Dumb little "Hello world!"-style console.log("Custom HTML entered...") JS script in <script></script> tags at the top of the HTML is not writing to console.
  4. :x: Dumb little "Hello world!"-style console.log("Soundtouch.js script entered..."); JS statement within the soundtouch script (line 22) is not writing to console.
  5. :x: Dumb little "Hello world!"-style console.log("Stretcher.js script entered..."); JS statement within the stretcher script (line 1021) is not writing to console.
  6. :white_check_mark: Dumb little "Hello world!"-style console.log("Slowing to 0.1"); JS statement within the buttons' on-click attributes (line 1105) IS writing to console.

From which I conclude...what? Not entirely sure. It seems like anything that's directly in a <script> tag within my custom.html is not getting processed, but the JS within the buttons' on-click attribute IS getting processed?

multiclass-audio-template.html (39.9 KB)

I guess that makes sense and I wonder if the templating logic actually strips out <script> tags because they're typically considered unsafe. But that's why I was suggesting to just dump the whole code into your custom JavaScript instead of using multiple script tags.

Ah! Okay, that's helpful. Sorry, I was misunderstanding your guidance with respect to the path of least resistance. I thought that by dumping these pieces into script tags in a file that was already being imported successfully, I was following your recommended 'quick and hacky' approach: adding them to a file the system was already consuming as expected seemed to minimize new points of failure.

What you're ACTUALLY saying is that I dump all of that stuff into a 'my_custom_js.js' file (just as I did with the custom HTML), read that in as e.g. my_javascript, and then have an entry in the config statement that says 'javascript': my_javascript.

Is that correct?

Okay, so, closer: with all of the above dumped into a custom timestretcher.js file that contains both the stretcher.js and the soundtouch.js code, the stupid little console.log('some text here') tests all printed to the console as expected.

Finally got it to work! Turns out the issue was basically two-fold:

  1. The custom JavaScript gets appended to the HTML after the bundle.js script, rather than before... and since all of the stretcher.js magic is wrapped up in the document.addEventListener('DOMContentLoaded', ...) call, it never gets executed. I guess that's because the audio_manual recipe already renders the wavesurfer player/viewer once the DOM content is loaded, so that whole trigger has come and gone before my custom JS ever gets read in?
  2. I wound up replacing the document.addEventListener wrapper (which I think might have needed to be window.addEventListener anyway?) with a sleep of 1000ms... and that did the trick! A rough sketch of the idea follows below; I'll post the final script here for others' sake when I'm back at that machine.
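Here's the rough shape of that fix (the actual soundtouch wiring is the code posted above):

// Instead of waiting for DOMContentLoaded (which has already fired by the
// time Prodigy appends the custom JavaScript), wait a fixed delay and then
// attach the soundtouch wiring to the player Prodigy already created.
setTimeout(function () {
  var ws = window.wavesurfer;
  if (!ws) return; // player not ready yet; a longer delay (or a retry) may be needed
  // ...the soundtouch.js / stretcher.js code from the earlier post goes here,
  // attached to `ws` instead of a freshly created instance.
}, 1000);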

Ah cool, glad to hear you got your experiment working! Also makes me happy to see that it's possible to extend the interface in such a complex way via Prodigy's scripting API :star_struck:

I don't know off the top of my head how a DOMContentLoaded listener would behave inside another DOMContentLoaded listener but your interpretation definitely sounds reasonable. Btw, Prodigy also fires custom events you can listen to, including prodigymount. When that's fired, you know that the app has mounted and window.prodigy etc. are available. But not sure if this would make a difference here.
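For reference, listening for it would look something like this (just a sketch):

// Fires once the Prodigy app has mounted and window.prodigy etc. exist.
document.addEventListener('prodigymount', function () {
  console.log('Prodigy mounted; wavesurfer available:', !!window.wavesurfer);
});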

Final timestretcher.js file attached for others' future reference...just given a .html extension b/c upload policies don't accept JS. timestretcher.js.html (39.7 KB)