Integration with Guild AI for storing runs


I have been using prodigy for annotating and building models.

However, as the need to train and compare various models grows, I would like to be able to store these as Guild runs.

(Guild is a framework for storing ML experiments, similar to MLflow or Sacred.)

Do you know how this could be done?

Hi! I haven't used it myself, but it does indeed look pretty similar to other experiment management tools, so it shouldn't be difficult to integrate :slightly_smiling_face: It just depends on what you want to log, and where.

I just had a brief look at the docs and I didn't immediately find details on the "track without changing your code" part. But if it lets you "wrap" commands and capture the output, you could probably just run your Prodigy commands with it.
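If Guild supports wrapping arbitrary commands, the setup might look something like the `guild.yml` below. This is only a sketch under assumptions: I'm assuming Guild's operation-only config format with an `exec` attribute, and the `prodigy train` arguments (`my_dataset`, `./output`) are placeholders you'd replace with your own dataset and output path.

```yaml
# Hypothetical guild.yml — operation wrapping a Prodigy CLI command.
# Check the Guild docs for the exact operation attributes your version supports.
train-ner:
  description: Train a model from Prodigy annotations and store it as a Guild run
  exec: python -m prodigy train ./output --ner my_dataset
```

You'd then start it with `guild run train-ner` and let Guild capture whatever the command prints.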

The train recipe is really just a Python function, and it returns a (best_scores, baseline) tuple after each training run. You can see how it looks in recipes/ in your Prodigy installation. So if you want to log the final best accuracy, you could write a script that calls into train() with the respective arguments and then logs the result via Guild. You could also send other info, like the name of the Prodigy dataset used, the Prodigy version, and any other settings.
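As a rough sketch of such a script — the import path and the train() arguments are assumptions (check recipes/train.py in your installation for the actual signature), and the `key: value` print format relies on Guild capturing output scalars from stdout:

```python
"""Sketch: call Prodigy's train recipe and emit Guild-friendly metric lines.

Assumptions (not verified against your Prodigy version):
- the train recipe is importable as prodigy.recipes.train.train
- Guild AI's output-scalar capture picks up `key: value` lines
"""

def format_scalars(scores):
    """Render a dict of metrics as `key: value` lines for Guild to capture."""
    return [f"{name}: {value}" for name, value in scores.items()]

def run_training():
    # Imported inside the function so the sketch loads without Prodigy installed.
    from prodigy.recipes.train import train  # import path is an assumption

    # Placeholder arguments — check the recipe signature in your installation.
    best_scores, baseline = train("ner", ["my_dataset"], "en_core_web_sm")
    for line in format_scalars(best_scores):
        print(line)  # Guild can log these lines as run scalars
```

Running that script under `guild run` should then give you one stored run per training, with the printed metrics attached.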

Not sure if it makes sense to log "annotation runs" in the same way, but it's definitely possible with a similar approach, using a custom recipe.

Ah cool, thanks! That makes perfect sense. Didn't think of train as a recipe that could be customized :laughing: