Error with annotation for speaker diarization

VShakib · June 22, 2021, 2:01am

Hi,

I'm using Prodigy to annotate some audio files for speaker diarization. I'm also using the pyannote open-source model which provides some Prodigy recipes as shown here. I'm getting a weird issue where the audio file won't load and I can't annotate it. I'm getting this error with both the pyannote.dia.manual recipe and the standard audio.manual recipe. It's worth noting that I thought the error was because there were spaces in the file names, but that turned out not to be the issue.

These are the commands I've tried running:
prodigy pyannote.dia.manual test_dataset test/
prodigy audio.manual test_dataset test/ --label SPEAKER1,SPEAKER2

How should I go about troubleshooting this?

VShakib · June 22, 2021, 3:17am

As a follow up, it seems like a sample wav file that I downloaded on the internet works, but the wav files I currently have don't. Are there any requirements/restrictions on types of wav files I can use, like sample rate or something else?
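One way to compare the working sample against the failing files is to read their WAV headers and look at the channel count and sample rate. A minimal sketch using Python's standard `wave` module; the helper name `describe_wav` is made up for illustration, and it assumes uncompressed PCM WAV files (other encodings raise `wave.Error`):

```python
import wave


def describe_wav(path):
    """Return basic header fields of a PCM WAV file (hypothetical helper)."""
    with wave.open(str(path), "rb") as wav:
        frame_rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),  # 1 = mono, 2 = stereo
            "sample_rate": frame_rate,  # in Hz
            "sample_width_bytes": wav.getsampwidth(),  # 2 = 16-bit audio
            "duration_seconds": wav.getnframes() / frame_rate,
        }
```

Running this on one working and one non-working file should show which property actually differs between them.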

VShakib · June 22, 2021, 3:30am

Apparently the audio annotation supports stereo audio but not mono. I could convert all my files to stereo, but is there any better/easier way of doing it / am I missing something?
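For reference, the conversion itself can be scripted. This is only a sketch of the workaround, assuming uncompressed PCM WAV input; `mono_to_stereo` is a hypothetical helper name, and it simply copies each mono sample into both the left and right channels:

```python
import wave


def mono_to_stereo(src_path, dst_path):
    """Write a stereo copy of a mono PCM WAV file (hypothetical helper)."""
    with wave.open(str(src_path), "rb") as src:
        if src.getnchannels() != 1:
            raise ValueError("expected a mono file")
        width = src.getsampwidth()
        rate = src.getframerate()
        frames = src.readframes(src.getnframes())

    # Interleave the channels: every mono sample is written twice (L, R).
    stereo = bytearray()
    for i in range(0, len(frames), width):
        sample = frames[i:i + width]
        stereo += sample + sample

    with wave.open(str(dst_path), "wb") as dst:
        dst.setnchannels(2)
        dst.setsampwidth(width)
        dst.setframerate(rate)
        dst.writeframes(bytes(stereo))
```

The output has the same sample rate and duration as the input, so any timestamps annotated on the stereo copy still line up with the original mono file.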

ines · June 23, 2021, 2:10am

Hi! Did you test it with the same files converted to stereo, and did that solve the issue you were having? If so, this might be related, although I'm confused that it'd fail like this and give you a blank UI. That's definitely strange and not ideal.

If it's not related to stereo vs. mono, how large are your audio files and how are you loading them in? By default, Prodigy will encode the file as a base64 string, which is a fine solution for small snippets and means that the original data will be stored with the example (and it makes it easy to create short snippets and stream them in programmatically without having to store them on disk). However, if the file is very large, this can potentially lead to loading issues if it's all sent over REST as a string. In that case, you could try the audio server loader via --loader audio-server, which will serve the files via a local web server. Alternatively, you can also provide them as URLs (e.g. via an S3 bucket) with a JSONL file and --loader jsonl.
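To illustrate the JSONL option, here is a rough sketch of generating such a file. It assumes each task is a JSON object with the URL under an `"audio"` key (worth double-checking against the Prodigy docs for your version); the function name `write_audio_tasks` and the example URLs are made up:

```python
import json


def write_audio_tasks(urls, out_path):
    """Write one JSON object per line, each pointing at an audio URL.

    Assumption: the loader reads the file location from an "audio" key.
    """
    with open(out_path, "w", encoding="utf-8") as f:
        for url in urls:
            f.write(json.dumps({"audio": url}) + "\n")


# Placeholder URLs for illustration only.
example_urls = [
    "https://example.com/audio/call_001.wav",
    "https://example.com/audio/call_002.wav",
]
```

Calling `write_audio_tasks(example_urls, "tasks.jsonl")` produces a file that could then be passed as the source in place of the `test/` directory, together with `--loader jsonl`.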

VShakib · June 23, 2021, 6:54pm

Yes, after I converted my files to stereo, instead of mono, it worked fine.

ines · June 24, 2021, 3:19am

Thanks for checking, that's helpful! Glad to hear that there's at least a temporary workaround then.

I'll try and reproduce this to figure out what the underlying problem could be. Also, if you have a short sample of a working stereo and non-working mono snippet, that'd be helpful as well (also to double-check that there's nothing else that could be relevant here).
