Thanks for considering them! In the meantime, one workaround we're considering is running multiple rounds of annotation on the same input stream (à la "Multi-stage speaker audio classification with pyannote.sad.manual and audio manual"). It's not ideal, as much because of the context-switching as because of the additional overhead of getting the audio input chunking to work, but if others stumble across this thread before that enhancement makes it onto the roadmap and into reality, it's one possible resolution.