webm

Is there a way to play webm files using the html tags? Is there a sample I can work from?

Hi! I haven't tried this yet, but since webm is natively supported by most modern browsers (I think?), you might just have to tell the Video loader to also consider the file extension .webm:

stream = Video("/directory", file_ext=(".mpeg", ".mpg", ".mp4", ".webm"))

This will create one task per video file in the directory and include the base64-encoded data as the key "video", and you'll be able to render that using the audio and audio_manual interfaces.
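To make the "one task per file, with base64-encoded data under "video"" idea more concrete, here is a minimal standard-library sketch of roughly what such a loader does. It is not Prodigy's actual implementation: the helper names video_to_task and iter_video_tasks, the MIME_TYPES table and the extra "path" key are all assumptions made for illustration.

```python
import base64
from pathlib import Path

# Hypothetical lookup table for this sketch; not part of Prodigy's API.
MIME_TYPES = {
    ".mpeg": "video/mpeg",
    ".mpg": "video/mpeg",
    ".mp4": "video/mp4",
    ".webm": "video/webm",
}


def video_to_task(path):
    """Build one task dict holding the file's bytes as a base64 data URI."""
    path = Path(path)
    mime = MIME_TYPES[path.suffix.lower()]
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    # The "path" key is an assumption; only "video" is mentioned in the thread.
    return {"video": f"data:{mime};base64,{encoded}", "path": str(path)}


def iter_video_tasks(directory, file_ext=(".mpeg", ".mpg", ".mp4", ".webm")):
    """Yield one task per file in the directory with a matching extension."""
    for path in sorted(Path(directory).iterdir()):
        if path.is_file() and path.suffix.lower() in file_ext:
            yield video_to_task(path)
```

Because the data URI carries the MIME type, a browser that supports webm should be able to play the value of "video" directly as the src of a media element.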

Alternatively, if you want to use a fully custom HTML interface and just show a video player, you could also use an html_template with a <video> element. The {{video}} variable in this example will be replaced with the content of the "video" key in your annotation task:

<video src="{{video}}" controls></video>
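If you go the template route, the template is typically passed through the recipe's "config" together with the "html" interface. The sketch below only builds the dictionary of components so it can run without Prodigy installed. The function name build_html_components is hypothetical, and in a real recipe you would decorate the function with @prodigy.recipe and pass in a stream of tasks that contain a "video" key.

```python
# Template from the answer above; {{video}} is filled in from each task.
HTML_TEMPLATE = '<video src="{{video}}" controls></video>'


def build_html_components(dataset, stream):
    """Return recipe components that render each task with the HTML template.

    Hypothetical helper for illustration: "stream" is assumed to be an
    iterable of task dicts that each contain a "video" key.
    """
    return {
        "dataset": dataset,   # dataset to save annotations to
        "stream": stream,     # iterable of annotation tasks
        "view_id": "html",    # render tasks with the custom HTML interface
        "config": {"html_template": HTML_TEMPLATE},
    }
```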

Thanks, Ines. Where do I specify the stream variable with the audio.manual recipe? I'm still new to Prodigy, so I'm still figuring things out.

You can do this by writing a custom recipe based on the audio.manual recipe, either by writing a new one or by wrapping the existing one and passing in a different stream. If you just want to hack at things and see how the built-in recipes are implemented, you can also run prodigy stats to find the location of your Prodigy installation and then open the file recipes/audio.py.

Ultimately, Prodigy recipes are just Python functions that return a dictionary of components. This means you can also call them as a function in your custom recipe. If you just want to overwrite the stream, you can pass an already loaded stream into the function, like this:

import prodigy
from prodigy.components.loaders import Video
from prodigy.recipes.audio import manual as audio_manual
from prodigy.util import get_labels

@prodigy.recipe(
    "custom.audio.manual",
    dataset=("Dataset to save annotations to", "positional", None, str),
    source=("Directory of video files to annotate", "positional", None, str),
    label=("Comma-separated label(s) to annotate or text file with one label per line", "option", "l", get_labels),
)
def custom_audio_manual(dataset, source, label=None):
    # Load the videos ourselves so that .webm files are picked up as well
    stream = Video(source, file_ext=(".mpeg", ".mpg", ".mp4", ".webm"))
    # Pass the loaded stream in place of the source path. The label is passed
    # by keyword so it can't be mistaken for another positional argument.
    components = audio_manual(dataset, stream, label=label)
    return components