Podcast speaker prediction with Prodigy and scikit-learn

I just came across a really cool project that uses Prodigy to label audio snippets from the Syntax.fm podcast and then trains a scikit-learn model to predict who is speaking. The repo contains the whole pipeline: Prodigy recipes, config, and notebooks :sparkles:

Twitter thread with more details: