I am trying to load model-generated diarisation output into audio.manual as pre-annotations, but I keep getting the following error:
(prodigy_clean_test) aaron@aaron ~/Desktop/PDP/test PRODIGY_LOGGING=verbose prodigy audio.manual clean_test_ds ./pre_annotated_audio.jsonl --label SERVER,CUSTOMER,PASSENGER1,PASSENGER0,UNKNOWN
Using 5 label(s): SERVER, CUSTOMER, PASSENGER1, PASSENGER0, UNKNOWN
15:51:03: RECIPE: Calling recipe 'audio.manual'
15:51:03: RECIPE: Starting recipe audio.manual
15:51:03: {'dataset': 'clean_test_ds', 'source': './pre_annotated_audio.jsonl', 'label': ['SERVER', 'CUSTOMER', 'PASSENGER1', 'PASSENGER0', 'UNKNOWN'], 'loader': 'audio', 'autoplay': False, 'keep_base64': False, 'fetch_media': False, 'exclude': []}
15:51:03: get_stream: Loading audio files
15:51:03: get_stream: Rehashing stream
15:51:03: get_stream: Removing duplicates
15:51:03: /home/aaron/.prodigy/prodigy.json
15:51:03: VALIDATE: Validating components returned by recipe
15:51:03: CONTROLLER: Initialising from recipe
15:51:03: CONTROLLER: Recipe Config
15:51:03: {'labels': ['SERVER', 'CUSTOMER', 'PASSENGER1', 'PASSENGER0', 'UNKNOWN'], 'audio_autoplay': False, 'auto_count_stream': True, 'dataset': 'clean_test_ds', 'recipe_name': 'audio.manual'}
15:51:03: VALIDATE: Creating validator for view ID 'audio_manual'
15:51:03: CONTROLLER: Using `no_overlap` router.
15:51:03: VALIDATE: Validating Prodigy and recipe config
15:51:03: FILTER: Filtering duplicates from stream
15:51:03: {'by_input': True, 'by_task': True, 'stream': <_cython_3_0_11.generator object at 0x7c3b70dcf880>, 'warn_fn': <bound method Printer.warn of <wasabi.printer.Printer object at 0x7c3b74ed10d0>>, 'warn_threshold': 0.4}
================================= Traceback =================================
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "/home/aaron/anaconda3/envs/prodigy_clean_test/lib/python3.11/site-packages/prodigy/__main__.py", line 50, in <module>
main()
============================== Warning message ==============================
✘ Error while validating stream: no first example.
This likely means that your 'loader' could not find any examples in the
'source'. Ensure you're using a source with some examples and that not all
examples are being filtered out by preprocessing functions in your recipe. This
can also mean all the examples in your stream have been annotated in datasets
included in your --exclude recipe parameter.
I've tested this in new environments, and all of the files are in the correct place!
This is pre_annotated_audio.jsonl (truncated to a single record):
{"audio": "/home/aaron/Desktop/PDP/test/MP3s/0bf9288a.mp3", "audio_spans": [{"start": 0.03096875, "end": 5.11034375, "label": "SPEAKER_00"}, {"start": 5.228468750000001, "end": 9.34596875, "label": "SPEAKER_01"}, {"start": 9.83534375, "end": 13.969718750000002, "label": "SPEAKER_00"}, {"start": 11.57346875, "end": 12.180968750000002, "label": "SPEAKER_01"}, {"start": 13.63221875, "end": 16.83846875, "label": "SPEAKER_01"}, {"start": 17.36159375, "end": 19.06596875, "label": "SPEAKER_00"}, {"start": 18.711593750000002, "end": 20.80409375, "label": "SPEAKER_01"}, {"start": 19.21784375, "end": 19.30221875, "label": "SPEAKER_00"}, {"start": 19.62284375, "end": 19.75784375, "label": "SPEAKER_00"}, {"start": 19.80846875, "end": 20.83784375, "label": "SPEAKER_00"}, {"start": 20.83784375, "end": 26.32221875, "label": "SPEAKER_01"}, {"start": 26.94659375, "end": 30.23721875, "label": "SPEAKER_00"}, {"start": 30.203468750000003, "end": 34.67534375, "label": "SPEAKER_01"}, {"start": 35.11409375, "end": 36.04221875, "label": "SPEAKER_00"}, {"start": 36.90284375, "end": 41.391593750000006, "label": "SPEAKER_01"}, {"start": 37.99971875, "end": 41.35784375, "label": "SPEAKER_00"}, {"start": 41.56034375, "end": 43.43346875, "label": "SPEAKER_00"}, {"start": 43.619093750000005, "end": 44.985968750000005, "label": "SPEAKER_01"}, {"start": 45.188468750000006, "end": 49.76159375, "label": "SPEAKER_00"}, {"start": 49.01909375, "end": 50.689718750000004, "label": "SPEAKER_01"}, {"start": 50.892218750000005, "end": 50.976593750000006, "label": "SPEAKER_01"}]}
I can't see anything wrong with what I am doing. I am using Prodigy version 1.15.8.