batch-train from the UI

Not really. The Prodigy UI is just the annotation interface for labelling data. Batch training a model is a Python process that’s completely separate from the data collection, and it can take a while, so there’s no logical path to integrating the two.

I guess if you wanted to, you could implement this in the stream generator – after X annotations, you fetch the current dataset, call the batch_train function on it and once you have an updated artifact, load it and swap out the existing nlp object. But there are a lot of open questions around this – for example, what do you do with annotations that are collected while your model is training? Are you going to keep track of those and then perform another update afterwards?
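To make the idea a bit more concrete, here's a rough sketch of the bookkeeping, not a tested Prodigy recipe. Everything in it is hypothetical: `ModelSwapper`, `fetch_examples` (standing in for "fetch the current dataset") and `train` (standing in for the call to batch_train plus loading the resulting artifact) are placeholders you'd have to wire up yourself. It assumes you count incoming answers somewhere in your recipe and run training in a background thread, so annotation isn't blocked. Answers that arrive while a run is in progress just keep counting towards the next run, which is only one possible answer to the question above.

```python
import threading


class ModelSwapper:
    """Holds the current model and retrains it after every `retrain_every` answers."""

    def __init__(self, model, fetch_examples, train, retrain_every=100):
        self.model = model                    # whatever the stream uses to score examples
        self.fetch_examples = fetch_examples  # hypothetical: returns all saved annotations
        self.train = train                    # hypothetical: examples -> newly trained model
        self.retrain_every = retrain_every
        self.pending = 0                      # answers not yet covered by a training run
        self._lock = threading.Lock()
        self._worker = None

    def update(self, answers):
        """Call this with each batch of answers that comes back from the UI."""
        with self._lock:
            self.pending += len(answers)
            busy = self._worker is not None and self._worker.is_alive()
            if busy or self.pending < self.retrain_every:
                # Either a run is already in progress or there's not enough new data.
                # Answers collected in the meantime stay in `pending` for the next run.
                return
            self.pending = 0
            self._worker = threading.Thread(target=self._retrain, daemon=True)
            self._worker.start()

    def _retrain(self):
        # The slow part runs outside the lock, so update() never blocks on training.
        new_model = self.train(self.fetch_examples())
        with self._lock:
            self.model = new_model  # swap out the existing model

    def wait(self):
        """Block until the current training run (if any) has finished."""
        worker = self._worker
        if worker is not None:
            worker.join()


def stream_with_current_model(swapper, texts):
    """Look up swapper.model for every item, so a swapped-in model is picked up."""
    for text in texts:
        yield swapper.model(text)
```

The stream reads `swapper.model` on each item instead of keeping its own reference, which is what makes the swap visible. Whether the freshly trained model should replace the old one unconditionally, or only if it scores better on some held-out data, is another decision this sketch doesn't make for you.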

Btw, on a related note: Prodigy Scale comes with built-in training runs that you can set up in the app and run with the latest state of the dataset to keep track of the results. But this feature is also mostly designed for running continuous training experiments, so you can see how the annotations you collect improve the model.