Providing JSONL files for audio.manual - No first example

I am trying to provide some model-created diarisation outputs to audio.manual, but I keep getting this error:

(prodigy_clean_test) aaron@aaron ~/Desktop/PDP/test $ PRODIGY_LOGGING=verbose prodigy audio.manual clean_test_ds ./pre_annotated_audio.jsonl --label SERVER,CUSTOMER,PASSENGER1,PASSENGER0,UNKNOWN
Using 5 label(s): SERVER, CUSTOMER, PASSENGER1, PASSENGER0, UNKNOWN
15:51:03: RECIPE: Calling recipe 'audio.manual'
15:51:03: RECIPE: Starting recipe audio.manual
15:51:03: {'dataset': 'clean_test_ds', 'source': './pre_annotated_audio.jsonl', 'label': ['SERVER', 'CUSTOMER', 'PASSENGER1', 'PASSENGER0', 'UNKNOWN'], 'loader': 'audio', 'autoplay': False, 'keep_base64': False, 'fetch_media': False, 'exclude': []}
15:51:03: get_stream: Loading audio files
15:51:03: get_stream: Rehashing stream
15:51:03: get_stream: Removing duplicates
15:51:03: /home/aaron/.prodigy/prodigy.json
15:51:03: VALIDATE: Validating components returned by recipe
15:51:03: CONTROLLER: Initialising from recipe
15:51:03: CONTROLLER: Recipe Config
15:51:03: {'labels': ['SERVER', 'CUSTOMER', 'PASSENGER1', 'PASSENGER0', 'UNKNOWN'], 'audio_autoplay': False, 'auto_count_stream': True, 'dataset': 'clean_test_ds', 'recipe_name': 'audio.manual'}
15:51:03: VALIDATE: Creating validator for view ID 'audio_manual'
15:51:03: CONTROLLER: Using `no_overlap` router.
15:51:03: VALIDATE: Validating Prodigy and recipe config
15:51:03: FILTER: Filtering duplicates from stream
15:51:03: {'by_input': True, 'by_task': True, 'stream': <_cython_3_0_11.generator object at 0x7c3b70dcf880>, 'warn_fn': <bound method Printer.warn of <wasabi.printer.Printer object at 0x7c3b74ed10d0>>, 'warn_threshold': 0.4}

================================= Traceback =================================

File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "/home/aaron/anaconda3/envs/prodigy_clean_test/lib/python3.11/site-packages/prodigy/__main__.py", line 50, in <module>
    main()

============================== Warning message ==============================

✘ Error while validating stream: no first example.
This likely means that your 'loader' could not find any examples in the
'source'. Ensure you're using a source with some examples and that not all
examples are being filtered out by preprocessing functions in your recipe. This
can also mean all the examples in your stream have been annotated in datasets
included in your --exclude recipe parameter.

I've tested this in new environments, and all of the files are in the correct place!

This is the pre_annotated_audio.jsonl (limited to 1 output):


{"audio": "/home/aaron/Desktop/PDP/test/MP3s/0bf9288a.mp3", "audio_spans": [{"start": 0.03096875, "end": 5.11034375, "label": "SPEAKER_00"}, {"start": 5.228468750000001, "end": 9.34596875, "label": "SPEAKER_01"}, {"start": 9.83534375, "end": 13.969718750000002, "label": "SPEAKER_00"}, {"start": 11.57346875, "end": 12.180968750000002, "label": "SPEAKER_01"}, {"start": 13.63221875, "end": 16.83846875, "label": "SPEAKER_01"}, {"start": 17.36159375, "end": 19.06596875, "label": "SPEAKER_00"}, {"start": 18.711593750000002, "end": 20.80409375, "label": "SPEAKER_01"}, {"start": 19.21784375, "end": 19.30221875, "label": "SPEAKER_00"}, {"start": 19.62284375, "end": 19.75784375, "label": "SPEAKER_00"}, {"start": 19.80846875, "end": 20.83784375, "label": "SPEAKER_00"}, {"start": 20.83784375, "end": 26.32221875, "label": "SPEAKER_01"}, {"start": 26.94659375, "end": 30.23721875, "label": "SPEAKER_00"}, {"start": 30.203468750000003, "end": 34.67534375, "label": "SPEAKER_01"}, {"start": 35.11409375, "end": 36.04221875, "label": "SPEAKER_00"}, {"start": 36.90284375, "end": 41.391593750000006, "label": "SPEAKER_01"}, {"start": 37.99971875, "end": 41.35784375, "label": "SPEAKER_00"}, {"start": 41.56034375, "end": 43.43346875, "label": "SPEAKER_00"}, {"start": 43.619093750000005, "end": 44.985968750000005, "label": "SPEAKER_01"}, {"start": 45.188468750000006, "end": 49.76159375, "label": "SPEAKER_00"}, {"start": 49.01909375, "end": 50.689718750000004, "label": "SPEAKER_01"}, {"start": 50.892218750000005, "end": 50.976593750000006, "label": "SPEAKER_01"}]}

I can't see anything wrong with what I am doing! I am using Prodigy version 1.15.8.

Welcome to the forum @afletcher53! :waving_hand:

By default, the audio loader expects to load files from a directory. Pointed at a JSONL file, it most likely finds no audio to load, which would explain the "no first example" error. If you are passing a JSONL file with references to local paths, you need to use the JSONL loader instead by specifying --loader jsonl on the command line.
Secondly, the browser won't render a local file due to security constraints. For this reason, you should convert your local paths to base64-encoded data URIs. You can do that by specifying the -FM (fetch media) option on the command line.
To sum up, your recipe command should look like this:

 PRODIGY_LOGGING=verbose prodigy audio.manual clean_test_ds ./pre_annotated_audio.jsonl --label SERVER,CUSTOMER,PASSENGER1,PASSENGER0,UNKNOWN --loader jsonl -FM

You can learn more about these (and other options) in the audio.manual docs here and here.
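
If it helps to see what "converting local paths to base64-encoded data URIs" means in practice, here is a minimal Python sketch. It is not Prodigy's own implementation, just an illustration of the idea: the helper names `to_data_uri` and `embed_audio` are made up for this example, and it assumes each JSONL line has an "audio" key holding a local file path, as in your file.

```python
import base64
import json
import mimetypes
from pathlib import Path


def to_data_uri(path):
    """Return the file at `path` as a base64-encoded data URI (illustrative helper)."""
    path = Path(path)
    # Guess the MIME type from the file extension; fall back to a generic type.
    mime, _ = mimetypes.guess_type(path.name)
    if mime is None:
        mime = "application/octet-stream"
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def embed_audio(src_jsonl, dst_jsonl):
    """Copy a JSONL file, replacing each local "audio" path with a data URI.

    Returns the number of tasks written. All other keys (e.g. "audio_spans")
    are passed through unchanged.
    """
    count = 0
    with open(src_jsonl, encoding="utf8") as src, \
            open(dst_jsonl, "w", encoding="utf8") as dst:
        for line in src:
            if not line.strip():
                continue  # skip blank lines
            task = json.loads(line)
            task["audio"] = to_data_uri(task["audio"])
            dst.write(json.dumps(task) + "\n")
            count += 1
    return count
```

Calling `embed_audio("pre_annotated_audio.jsonl", "embedded.jsonl")` would write a copy of your file with the audio embedded inline. In normal use you shouldn't need this, since -FM handles the conversion for you when the recipe starts; the sketch is only meant to show what ends up in the "audio" field.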