Let's assume this is your data:
# wow-video.jsonl
{"video":"https://videos.ctfassets.net/bs8ntwkklfua/37JfmMxgY5URkfzGHVFzz6/7b3a2c28337b907156b9d188d915f808/Midnight_in_Paris_Wow_10_480p.mp4","text":"Midnight in Paris","_input_hash":-2034503873,"_task_hash":-591136661,"_view_id":"audio_manual","audio_spans":[{"start":0.2603688983,"end":0.5032927285,"label":"WOW","id":"07e199a6-a4b1-403e-9415-ae6a2043bddf","color":"rgba(255,215,0,0.2)"}],"answer":"accept","_timestamp":1669147206,"_is_binary":false}
{"video":"https://videos.ctfassets.net/bs8ntwkklfua/2DfiudsiNA5n0CR2ibzu38/a405bd61b9c11240aa4dd7e163cc0903/Cars_3_Wow_9_480p.mp4","text":"Cars 3","_input_hash":1465204248,"_task_hash":-460213143,"_view_id":"audio_manual","audio_spans":[{"start":0.0898107358,"end":0.8975401325,"label":"WOW","id":"bfbba0b5-2583-4bd7-9931-1e46e35f25df","color":"rgba(255,215,0,0.2)"}],"answer":"accept","_timestamp":1669147212,"_is_binary":false}
{"video":"https://videos.ctfassets.net/bs8ntwkklfua/7tA15W0lgJr7CpJNCf1m4Z/86a953df81d88f5fa79e35aa1e4e4dc1/The_Big_Bounce_Wow_2_480p.mp4","text":"The Big Bounce","_input_hash":334946211,"_task_hash":-1916326215,"_view_id":"audio_manual","audio_spans":[{"start":0.19570347,"end":0.5278422713,"label":"WOW","id":"eeb6f211-f55e-478c-8c62-8ad9bdd1ff33","color":"rgba(255,215,0,0.2)"}],"answer":"accept","_timestamp":1669147215,"_is_binary":false}
First, load that data into Prodigy as a dataset:
python -m prodigy db-in wow-video-annotations wow-video.jsonl
The key is having the audio_spans field populated in every record, as mentioned here.
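If you want to double-check the file before loading it, here is a minimal sketch in plain Python. It is not part of Prodigy: find_records_missing_spans is a hypothetical helper name, and it assumes each span needs the start, end and label keys shown in the sample data above.

```python
import json

# Hypothetical helper (not a Prodigy API): report records whose
# "audio_spans" are missing, empty, or lack the keys seen in the sample data.
REQUIRED_SPAN_KEYS = ("start", "end", "label")  # assumption based on the sample records


def find_records_missing_spans(lines):
    """Return 1-based line numbers of records without usable audio_spans."""
    bad = []
    for lineno, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        record = json.loads(line)
        spans = record.get("audio_spans") or []
        spans_ok = bool(spans) and all(
            all(key in span for key in REQUIRED_SPAN_KEYS) for span in spans
        )
        if not spans_ok:
            bad.append(lineno)
    return bad
```

You could call it with an open file handle, for example find_records_missing_spans(open("wow-video.jsonl", encoding="utf8")). An empty list means every record has at least one span with those keys.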
Now run this:
python -m prodigy audio.manual review_wow_data dataset:wow-video-annotations --label WOW --loader video
The key is using the dataset: prefix so that your loaded dataset (with the audio_spans) is used as the source. Be sure the value you pass to --label corresponds to the labels you have in your audio_spans.
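To see which labels your file actually contains, something like the sketch below may help. Again, this is plain Python rather than anything Prodigy provides; collect_span_labels is a hypothetical name, and it assumes every span has a label key as in the sample data.

```python
import json


# Hypothetical helper (not a Prodigy API): gather the distinct labels
# used across all audio_spans so they can be compared with --label.
def collect_span_labels(lines):
    """Return a sorted list of the distinct span labels in JSONL lines."""
    labels = set()
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        record = json.loads(line)
        for span in record.get("audio_spans") or []:
            labels.add(span["label"])  # assumes each span carries a "label"
    return sorted(labels)
```

For the three records above this should give just WOW. If your file has several labels, you can pass them to --label as a comma-separated list, e.g. ",".join(collect_span_labels(open("wow-video.jsonl", encoding="utf8"))).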
Let me know if this works!