Editing Text and Linking Audio via Annotation Instructions

Hi @jai!

Thanks for your question and welcome to the Prodigy community :wave:

For audio transcription, you could use the audio.transcribe recipe. If all of your audio files are unique, you could load them along with their transcriptions from a file such as a .jsonl. You may need to do a small amount of Python pre-processing, but check out the file loader docs for audio. We just had a similar request earlier this week on how to handle a raw .csv file.
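As a rough idea of what that pre-processing could look like, here's a minimal sketch that turns a CSV of audio paths and transcriptions into JSONL lines. The column names ("path", "transcription") and the function name are made up for illustration, and I'm assuming the task keys "audio" and "transcript", so please double-check the loader docs for the exact keys your setup expects.

```python
import csv
import io
import json


def rows_to_jsonl(csv_text, audio_col="path", text_col="transcription"):
    """Convert CSV text into JSONL, one task per audio file.

    Assumption: each task uses "audio" for the file reference and
    "transcript" for the existing transcription. Adjust to your setup.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    lines = []
    for row in reader:
        task = {"audio": row[audio_col], "transcript": row[text_col]}
        lines.append(json.dumps(task))
    return "\n".join(lines)


# Hypothetical input: two clips with their draft transcriptions
example_csv = (
    "path,transcription\n"
    "clips/a.wav,hello world\n"
    "clips/b.wav,good morning\n"
)
jsonl_output = rows_to_jsonl(example_csv)
print(jsonl_output)
```

You'd then write that output to a .jsonl file and point the recipe at it.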

You can combine different recipes to create custom recipes/interfaces using blocks. So if you wanted, you could combine interfaces such as the one from audio.transcribe with the one from rel.manual, which would enable labeling spans/relations.
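To give a feel for it, here's a minimal sketch of the UI part of such a custom recipe, written as a plain function so it runs without Prodigy installed. The function name and labels are hypothetical, and I haven't run this exact combination, so treat the block settings as assumptions to check against the custom interfaces docs. In a real recipe, you'd return these keys together with your "dataset" and "stream".

```python
def build_blocks_config(relation_labels):
    """Return the UI-related parts of a recipe's components dict.

    Assumptions: the "audio", "text_input" and "relations" view IDs are
    stacked as blocks, and the relation labels go in "labels".
    """
    return {
        "view_id": "blocks",
        "config": {
            "labels": relation_labels,
            "blocks": [
                # Audio player for the clip
                {"view_id": "audio"},
                # Text box for the transcription, saved under "transcript"
                {"view_id": "text_input", "field_id": "transcript", "field_rows": 3},
                # Span/relation annotation over the task's "text" and "tokens"
                {"view_id": "relations"},
            ],
        },
    }


# Hypothetical relation labels
components = build_blocks_config(["SUBJECT", "OBJECT"])
block_ids = [block["view_id"] for block in components["config"]["blocks"]]
print(block_ids)
```

Note that the relations block works on the "text" and "tokens" already in the task, not on whatever is typed into the text box.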

The one tricky part is that if the user had to correct a transcription, you'd need to update (refresh) the text passed to rel.manual after the user has used the text box to make the correction. Does this sound right?

For this, you'd likely need an update callback written in JavaScript. There is an example of something similar where we show how you can use a button to change existing text to a different case. In theory, I suspect you could do the same with a text box that first shows the original transcription, which the user can then edit/correct. They could then click the button to trigger the callback, which updates the task with the corrected transcript and resends it to rel.manual. I haven't tried this but would be interested to see if it's possible.

Alternatively, perhaps the simplest solution would be to run this in two rounds. In round 1, you simply fix/correct transcriptions with audio.transcribe. In round 2, you use only the corrected transcriptions in rel.manual and treat it like a typical span/relations annotation task. I tend to prefer simpler tasks over trying to do everything at the same time, so I would likely choose this route.
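The hand-off between the two rounds could look something like the sketch below: take the exported round 1 annotations as JSONL, keep the accepted ones, and turn the corrected transcript into the "text" for round 2. The function name and example data are made up, and I'm assuming the correction is stored under "transcript", so adjust the field name if yours differs.

```python
import json


def to_round_two(exported_jsonl, field="transcript"):
    """Build round 2 text tasks from exported round 1 annotations.

    Assumption: the corrected transcription is stored under `field`.
    """
    tasks = []
    for line in exported_jsonl.splitlines():
        if not line.strip():
            continue
        eg = json.loads(line)
        # Skip rejected or ignored examples
        if eg.get("answer") != "accept":
            continue
        text = (eg.get(field) or "").strip()
        if text:
            # Carry the meta over so each text can be traced back to its audio
            tasks.append({"text": text, "meta": eg.get("meta", {})})
    return tasks


# Hypothetical round 1 export: one accepted and one rejected example
round_one = "\n".join([
    json.dumps({"audio": "clips/a.wav", "transcript": "hello world",
                "answer": "accept", "meta": {"file": "a.wav"}}),
    json.dumps({"audio": "clips/b.wav", "transcript": "gud morning",
                "answer": "reject"}),
])
round_two_tasks = to_round_two(round_one)
print(round_two_tasks)
```

You'd save those tasks as a new .jsonl file and use it as the source for rel.manual.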

Thanks again for your question and let us know if you have further questions!