With a freshly created prodigy environment, supplemented by the pyannote GitHub repo and its develop branch, the error trace below is the consistent result across multiple .wav files. Each file was created from another source file via ffmpeg -i source-file.mp3 -f s16le -ar 16k -ac 1 destination-file.wav, and we were able to pass all of them through a pyannote-driven SAD inference process when working outside of Prodigy. Any guidance, please?
(prodigy) → ls -lah
total 110288
drwxr-xr-x@ 5 tsslade staff 160B Aug 25 21:00 .
drwxr-xr-x@ 26 tsslade staff 832B Aug 25 20:59 ..
-rw-r--r--@ 1 tsslade staff 47M Jun 25 19:14 1593136205820.wav
(prodigy) → prodigy pyannote.sad.manual speech_activity .
Using cache found in /Users/tsslade/.cache/torch/hub/pyannote_pyannote-audio_master
Using cache found in /Users/tsslade/.cache/torch/hub/pyannote_pyannote-audio_master
/Users/tsslade/miniconda3/envs/prodigy/lib/python3.8/site-packages/pyannote/audio/embedding/approaches/arcface_loss.py:170: FutureWarning: The 's' parameter is deprecated in favor of 'scale', and will be removed in a future release
warnings.warn(msg, FutureWarning)
Traceback (most recent call last):
  File "/Users/tsslade/miniconda3/envs/prodigy/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/Users/tsslade/miniconda3/envs/prodigy/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/Users/tsslade/miniconda3/envs/prodigy/lib/python3.8/site-packages/prodigy/__main__.py", line 60, in <module>
    controller = recipe(*args, use_plac=True)
  File "cython_src/prodigy/core.pyx", line 318, in prodigy.core.recipe.recipe_decorator.recipe_proxy
  File "cython_src/prodigy/core.pyx", line 138, in prodigy.core.Controller.__init__
  File "cython_src/prodigy/components/feeds.pyx", line 56, in prodigy.components.feeds.SharedFeed.__init__
  File "cython_src/prodigy/components/feeds.pyx", line 155, in prodigy.components.feeds.SharedFeed.validate_stream
  File "/Users/tsslade/miniconda3/envs/prodigy/lib/python3.8/site-packages/toolz/itertoolz.py", line 376, in first
    return next(iter(seq))
  File "/Users/tsslade/miniconda3/envs/prodigy/lib/python3.8/site-packages/pyannote/audio/interactive/recipes/sad.py", line 99, in sad_manual_stream
    speech: Annotation = pipeline.compute_speech(file).to_annotation(
  File "/Users/tsslade/miniconda3/envs/prodigy/lib/python3.8/site-packages/pyannote/audio/interactive/pipeline.py", line 170, in compute_speech
    sad_scores = self.sad(current_file)
  File "/Users/tsslade/miniconda3/envs/prodigy/lib/python3.8/site-packages/pyannote/audio/features/wrapper.py", line 280, in __call__
    return self.scorer_(current_file)
  File "/Users/tsslade/miniconda3/envs/prodigy/lib/python3.8/site-packages/pyannote/audio/features/base.py", line 149, in __call__
    y, sample_rate = self.raw_audio_(current_file, return_sr=True)
  File "/Users/tsslade/miniconda3/envs/prodigy/lib/python3.8/site-packages/pyannote/audio/features/utils.py", line 237, in __call__
    y = self.get_features(y, sample_rate)
  File "/Users/tsslade/miniconda3/envs/prodigy/lib/python3.8/site-packages/pyannote/audio/features/utils.py", line 173, in get_features
    y = librosa.core.resample(y.T, sample_rate, self.sample_rate).T
  File "/Users/tsslade/miniconda3/envs/prodigy/lib/python3.8/site-packages/librosa/core/audio.py", line 548, in resample
    util.valid_audio(y, mono=False)
  File "/Users/tsslade/miniconda3/envs/prodigy/lib/python3.8/site-packages/librosa/util/utils.py", line 305, in valid_audio
    raise ParameterError(
librosa.util.exceptions.ParameterError: Mono data must have shape (samples,). Received shape=(1, 24682736)
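
For reference, the shape mismatch in the last line of the trace can be reproduced in isolation. The sketch below uses only NumPy and does not call pyannote or librosa; the helper name to_mono_1d is hypothetical and only illustrates the difference between a (1, samples) array, which is what the y.T in the resample call appears to produce for a single-channel file, and the (samples,) layout the error message asks for. It is not a proposed patch to either library.

```python
import numpy as np


def to_mono_1d(y: np.ndarray) -> np.ndarray:
    """Hypothetical helper: collapse single-channel audio to shape (samples,)."""
    if y.ndim == 1:
        # Already in the (samples,) layout.
        return y
    if y.ndim == 2 and 1 in y.shape:
        # (1, samples) or (samples, 1): drop the singleton channel axis.
        return y.reshape(-1)
    raise ValueError(f"expected single-channel audio, got shape {y.shape}")


# Stand-in for a decoded mono file laid out as (samples, channels).
y = np.zeros((8, 1), dtype=np.float32)

transposed = y.T                    # (1, 8): same layout as in the error message
flattened = to_mono_1d(transposed)  # (8,): the layout the error message asks for

print(transposed.shape, flattened.shape)
```

With the stand-in array above, this prints (1, 8) (8,).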