I'm excited about the updates, especially as I work with video and captions.
So my question is: can I use this for captioning videos, and for segmenting videos into captions? I see the audio example, but could someone point me to how I could go about using this for video captioning?
Hi! The new audio_manual UIs support both audio and video files, so whether you use audio or video just depends on the loader you use for the data. If you're using the audio.transcribe recipe with --loader video, you'll be able to load in a directory of video files: https://prodi.gy/docs/recipes#audio-transcribe
If you're working with longer video files, you might want to split them into smaller chunks, or even pre-select the segments that contain content you want to transcribe. You can do that in Python and then stream in tasks with a key "video", mapped to the base64-encoded data (which you can easily convert from bytes).
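
As a rough illustration of that last step, here's a minimal sketch that turns a directory of already-split video chunks into a stream of tasks. It only uses the standard library. The function names (encode_video, stream_video_tasks) are made up for this example, and it assumes the "video" value should be a data URI of the form data:video/mp4;base64,... and that a "meta" dict is an acceptable place for the file name, so double-check both against the docs for your Prodigy version.

```python
import base64
import mimetypes
from pathlib import Path
from typing import Dict, Iterator


def encode_video(data: bytes, mime_type: str = "video/mp4") -> str:
    """Convert raw video bytes to a base64-encoded data URI string.

    Assumption: the UI accepts a data URI with this prefix format.
    """
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def stream_video_tasks(directory: str, pattern: str = "*.mp4") -> Iterator[Dict[str, object]]:
    """Yield one task dict per video chunk found in the directory.

    Hypothetical helper: it expects the chunks to have been split beforehand.
    """
    for path in sorted(Path(directory).glob(pattern)):
        # Guess the MIME type from the file extension, falling back to MP4.
        mime_type = mimetypes.guess_type(path.name)[0] or "video/mp4"
        yield {
            "video": encode_video(path.read_bytes(), mime_type),
            # Assumption: "meta" is only used here to keep track of the source file.
            "meta": {"file": path.name},
        }
```

You could then return stream_video_tasks("./chunks") as the stream of a custom recipe, or write the tasks out to a JSONL file first. Keep in mind that base64 makes the data about a third larger, so shorter chunks are easier to work with.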