batch-train from the UI

We’d like to be able to batch train from the UI and then swap out the model during a session.

Is there any good way to do this in a custom recipe?

Not really – the Prodigy UI is really just the annotation interface for labelling data. Batch training a model is a Python process that’s completely separate from the data collection, and it potentially takes a while, so there’s not really a logical path to integrating the two.

I guess if you wanted to, you could implement this in the stream generator – after X annotations, you fetch the current dataset, call the batch_train function on it and once you have an updated artifact, load it and swap out the existing nlp object. But there are a lot of open questions around this – for example, what do you do with annotations that are collected while your model is training? Are you going to keep track of those and then perform another update afterwards?
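To make the idea a bit more concrete, here is a rough sketch of the counting-and-swapping logic on its own. It deliberately doesn't import Prodigy: `RetrainingStream`, `train_fn`, `fetch_annotations` and `retrain_every` are all made-up names for illustration, not part of Prodigy's API. In a real recipe, `train_fn` would be whatever wraps your batch training and reloads the resulting model, `fetch_annotations` would read the current dataset, and `update` would be hooked up as the recipe's update callback. Note that this version trains synchronously, so the stream simply blocks while training runs. That sidesteps the question of annotations arriving mid-training, but it also means the annotator waits.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List

Example = Dict[str, Any]
Model = Callable[[Example], Example]


class RetrainingStream:
    """Hypothetical helper: retrain after N new answers, then swap the model."""

    def __init__(
        self,
        train_fn: Callable[[List[Example]], Model],   # stand-in for batch training
        fetch_annotations: Callable[[], List[Example]],  # stand-in for reading the dataset
        initial_model: Model,
        retrain_every: int = 100,
    ) -> None:
        self.train_fn = train_fn
        self.fetch_annotations = fetch_annotations
        self.model = initial_model
        self.retrain_every = retrain_every
        self.seen_since_training = 0

    def update(self, answers: List[Example]) -> None:
        # Called with each batch of answers coming back from the annotator.
        self.seen_since_training += len(answers)

    def stream(self, examples: Iterable[Example]) -> Iterator[Example]:
        for eg in examples:
            if self.seen_since_training >= self.retrain_every:
                # Blocks until training finishes, then swaps in the new model.
                self.model = self.train_fn(self.fetch_annotations())
                self.seen_since_training = 0
            # Always score with whatever model is current at this point.
            yield self.model(eg)
```

If you wanted training to happen in the background instead, you'd need a thread or subprocess plus some bookkeeping for the answers that come in while it runs, which is exactly where the open questions above start to bite.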

Btw, on a related note: Prodigy Scale comes with built-in training runs that you can set up in the app and run with the latest state of the dataset to keep track of the results. But this feature is also mostly designed to have continuous training experiments running to see how the annotations you collect improve the model.

Thanks for the helpful reply, Ines. I didn’t think of hacking the sorter to handle this kind of thing, but I see some of the complications it could cause. Prodigy Scale does look like it would better fit our use case – I’ve signed up for the beta and we’d love to try it.