Hi! Thanks for these suggestions. Some quick comments:
I'm not 100% sure what you mean by this. We typically recommend using spaCy's CNN-based pipelines for training with Prodigy, and they're also the default configuration you get out-of-the-box. Transformer-based pipelines are a nice add-on if you want to squeeze out the final percent of accuracy, but especially during development, what you typically care about most is whether your model is learning. So I agree that transformer embeddings are often overkill here, and spaCy provides good alternatives optimised for CPU.