Loading multiple streams to a recipe

Hi,
I was trying to load an audio file and a JSONL file, using their file paths, as two different streams, and wanted to map them to two separate choice interfaces. Is that something I can do in Prodigy? If so, is there a custom recipe or an example recipe I can refer to?

Thank you.

Hi! You can definitely do that with a custom stream that loads both sources using the respective loader (e.g. Audio or JSONL) and then puts them together (e.g. using zip). How you do that kinda depends on how the data is structured and what the final result should look like. Do you want to annotate one audio with different multiple choice options, or multiple audios as different options?

One limitation is that Prodigy currently only supports having one audio UI per interface. But if you just want to include a simple audio player, you can always do this with "html" and a native <audio> element. The code shared in this thread shows a similar example:

In the case of having choice options with audio players, you would then include a "html" key with the audio player for each option defined in "options". You can see an example of the JSON format expected by the choice interface here: Annotation interfaces · Prodigy · An annotation tool for AI, Machine Learning & NLP
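
As a rough illustration of the option format described above, here is a minimal sketch in plain Python that builds one task dict with an audio player in each option. The helper names (audio_option, make_choice_task), the prompt text and the example.com URLs are made up for this example, and the exact keys the choice interface accepts should be checked against the documentation.

```python
from html import escape


def audio_option(option_id, audio_src):
    """Build one choice option whose content is a native audio player."""
    # escape() keeps quotes or ampersands in the source from breaking the attribute
    player = f'<audio controls src="{escape(audio_src, quote=True)}"></audio>'
    return {"id": option_id, "html": player}


def make_choice_task(prompt, audio_sources):
    """Build a task dict with one audio option per (id, source) pair."""
    options = [audio_option(opt_id, src) for opt_id, src in audio_sources]
    return {"text": prompt, "options": options}


# Hypothetical prompt and URLs, used only to show the resulting structure
task = make_choice_task(
    "Which recording sounds clearer?",
    [("a", "https://example.com/a.wav"), ("b", "https://example.com/b.wav")],
)
```

Each option here carries its own "html" player, so this shape fits the "multiple audios as different options" case rather than the "one audio with several options" case.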

Hi,
Thanks, that's helpful. So after I put both my audio file and my JSONL file in a zip folder, how do I return both of them in the recipe? This is what my code looks like right now.

return {
        "view_id": "blocks",     # set the view_id to "blocks"
        "dataset": dataset,
        "stream":  {Audio(audio_path), get_stream(stream)},

I would like to annotate one audio with multiple choice options.

Ah, sorry if I phrased this in a confusing way. What I meant was the zip() function: Using the Python zip() Function for Parallel Iteration – Real Python

So you can do something like this to get the first example of stream 1 and stream 2, then the second of both streams, and so on.

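def combine_streams(stream1, stream2):
    # zip() pairs the first example of each stream, then the second, and so on
    for example1, example2 in zip(stream1, stream2):
        # do something with the examples here, e.g. merge them into one task
        yield {**example1, **example2}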

You can then combine the data that's available in the two streams and create the JSON format you need for your interface.
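
For the "one audio with multiple choice options" case, a combined stream could look roughly like the sketch below. It assumes that each audio example is a dict with an "audio" key holding something a browser can play (a URL or data URI) and that each JSONL record has an "options" list; both are assumptions about the data, not something confirmed in this thread. Plain lists stand in for the two loaders so the snippet runs on its own.

```python
from html import escape


def combine_streams(audio_stream, jsonl_stream):
    """Pair each audio example with one JSONL record and yield a choice task."""
    # zip() stops as soon as the shorter stream runs out
    for audio_eg, record in zip(audio_stream, jsonl_stream):
        src = escape(audio_eg["audio"], quote=True)
        yield {
            # simple native player shown above the options
            "html": f'<audio controls src="{src}"></audio>',
            # assumed key: the answer options for this clip
            "options": record["options"],
        }


# Stand-in data; in a recipe these would come from the audio and JSONL loaders
audio_stream = [{"audio": "clip1.wav"}, {"audio": "clip2.wav"}]
jsonl_stream = [
    {"options": [{"id": "speech", "text": "Speech"}, {"id": "noise", "text": "Noise"}]},
    {"options": [{"id": "speech", "text": "Speech"}, {"id": "music", "text": "Music"}]},
]
tasks = list(combine_streams(audio_stream, jsonl_stream))
```

The generator returned by combine_streams would then be what the recipe passes as "stream", instead of a set containing the two separate streams. Because zip() pairs items by position, this only makes sense if the n-th JSONL record really belongs to the n-th audio file.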

Sorry, I must've framed my question wrong. I basically want to annotate an audio file and a JSONL file simultaneously. I'm not sure how zip() would combine an audio file's data with the JSONL data and let me load a "combined" stream in Prodigy.

Ah sorry, just to make sure I understand your question correctly: Do you have an example of the input data in the JSONL (aside from the audio file) and the result that you're looking to achieve in the annotation UI? I think this will make it a bit easier for me to understand what you're trying to do and how to solve it in the recipe.
