I'm working on a class imbalance problem with a multi-class classifier and want to build synthetic training data using Snorkel (https://hazyresearch.github.io/snorkel/). My plan is to use Prodigy with seed terms to build patterns, then feed the patterns generated by Prodigy into Snorkel to build labeling functions, and iterate through this loop until I get the accuracy I need. What do you guys think? Or does Prodigy's teach recipe already cover this concept, so that I don't need another tool like Snorkel to achieve it?
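To make the idea concrete, here is a rough sketch of the patterns-to-labeling-functions step in plain Python. It deliberately does not call the Snorkel API. It assumes the patterns are in the JSONL shape Prodigy writes for seed terms (a `label` plus a list of `{"lower": ...}` token dicts). The label names, sample patterns and helper names (`load_patterns`, `make_labeling_function`, `majority_vote`) are all made up for illustration, and the simple majority vote is only a stand-in for whatever Snorkel's label model would do with the votes.

```python
import json
from collections import Counter

ABSTAIN = -1  # value a labeling function returns when it has no opinion

# Hypothetical patterns, in the JSONL shape assumed above.
PATTERNS_JSONL = """\
{"label": "BILLING", "pattern": [{"lower": "refund"}]}
{"label": "BILLING", "pattern": [{"lower": "invoice"}]}
{"label": "SHIPPING", "pattern": [{"lower": "late"}, {"lower": "delivery"}]}
"""

# Hypothetical mapping from label names to integer class ids.
LABELS = {"BILLING": 0, "SHIPPING": 1}


def load_patterns(jsonl_text):
    """Parse JSONL text into (label, list of lowercase tokens) pairs."""
    patterns = []
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        entry = json.loads(line)
        tokens = [tok["lower"] for tok in entry["pattern"]]
        patterns.append((entry["label"], tokens))
    return patterns


def make_labeling_function(label, tokens):
    """Build a function that votes for `label` when the token sequence occurs."""
    def lf(text):
        # Naive whitespace tokenization; punctuation attached to a word
        # (e.g. "refund.") will not match. A real tokenizer would do better.
        words = text.lower().split()
        n = len(tokens)
        for i in range(len(words) - n + 1):
            if words[i:i + n] == tokens:
                return LABELS[label]
        return ABSTAIN
    return lf


def majority_vote(text, lfs):
    """Combine votes by simple majority, ignoring abstentions."""
    votes = [v for v in (lf(text) for lf in lfs) if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    return Counter(votes).most_common(1)[0][0]


LFS = [make_labeling_function(label, tokens)
       for label, tokens in load_patterns(PATTERNS_JSONL)]

if __name__ == "__main__":
    for text in ["I want a refund for this invoice",
                 "Another late delivery this week",
                 "Hello there"]:
        print(text, "->", majority_vote(text, LFS))
```

Each pattern becomes one labeling function, so every new round of seed terms in Prodigy would simply add more functions to `LFS` on the next pass through the loop.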
I haven’t actually played with Snorkel yet, but I know a lot of work has gone into it, so I’m sure it does some things Prodigy doesn’t. If you have success with it I hope you’ll keep us updated — it would be good to develop a recipe that uses the two together.