Problems starting recipes programmatically

Hi,

I'm trying to integrate Prodigy into a Python-based automated workflow. I want to start a task like this:

from pathlib import Path

from prodigy import recipes

output = recipes.ner.make_gold(
    dataset='ner-dataset',
    spacy_model=Path(r'/mnt/c/model'),
    label=['MY_ENT']
)

This works, and output is then a dictionary like this: {'view_id': 'ner_manual', 'dataset': 'ner-dataset'...}.

I assume this is a config for the web server, but I can't see where to pass it to start serving. I tried server(output, {}), but that raises AttributeError: 'dict' object has no attribute 'view_id'. So I assume I need to wrap output in an object first?

Hi! Recipes return a dictionary of components that are then used to create a controller and start the server when the recipe is executed on the command line. I think what you're looking for is the prodigy.serve function, which does all of this for you. You can find the details in your PRODIGY_README.html.

A few things to note: these are currently not ideal, but they're a side effect of running the recipes that way. (I'd love to come up with a slightly smarter recipe decorator and serve helper, but it'd probably be a breaking change.)

  • prodigy.serve takes the recipe arguments as positional arguments in order. So even if you're not using one of the arguments (and wouldn't set it on the command line), you'll have to pass in None.
  • Some recipes use handy functionality provided by our command line parsing library plac and convert the incoming arguments. For instance, if you pass in --label PERSON,ORG, it will call a helper function to split the string and turn it into a proper list. When you run the recipes from Python, we typically don't need these helpers – so when you call the recipe function and it needs a list of labels, you'll need to pass in ['PERSON', 'ORG'] instead of 'PERSON,ORG'.
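
To make the two points above concrete, here is a minimal sketch of how the positional argument list could be built. Both helpers (positional_args and split_labels) and the fake_recipe stand-in with its signature are hypothetical and only for illustration. They are not part of Prodigy, and the real recipe's parameters and their order should be checked in PRODIGY_README.html or the recipe source.

```python
import inspect


def split_labels(text):
    """Turn 'PERSON,ORG' into ['PERSON', 'ORG'], as the command line helper would."""
    return [label.strip() for label in text.split(",") if label.strip()]


def positional_args(recipe_func, **given):
    """Return one value per parameter of recipe_func, in declaration order,
    using None for every parameter that was not supplied."""
    params = inspect.signature(recipe_func).parameters
    unknown = set(given) - set(params)
    if unknown:
        raise TypeError(f"unknown recipe arguments: {sorted(unknown)}")
    return [given.get(name) for name in params]


# Stand-in recipe with a made-up signature (an assumption, not Prodigy's real one).
def fake_recipe(dataset, spacy_model, source=None, api=None, loader=None,
                label=None, exclude=None):
    return {"view_id": "ner_manual", "dataset": dataset}


args = positional_args(
    fake_recipe,
    dataset="ner-dataset",
    spacy_model="/mnt/c/model",
    label=split_labels("MY_ENT"),
)
print(args)
# ['ner-dataset', '/mnt/c/model', None, None, None, ['MY_ENT'], None]

# With Prodigy installed, and assuming the positional form described above,
# the list could then be passed along like this:
# import prodigy
# prodigy.serve("ner.make-gold", *args)
```

Note that the sketch fills every unused slot with None, as described above, rather than with the parameter's declared default.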

Finally, you could also execute Prodigy in a subprocess or using something like Fabric and compose the commands that way.
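
A rough sketch of the subprocess route, assuming Prodigy can be invoked as "python -m prodigy" in your environment. The build_command and start_prodigy helpers are hypothetical, and the argument list simply mirrors the call in the question. Check the recipe's command line help for the arguments it actually requires.

```python
import subprocess
import sys


def build_command(recipe, *args, **options):
    """Compose an argv list: [python, '-m', 'prodigy', recipe, args..., --option value...]."""
    cmd = [sys.executable, "-m", "prodigy", recipe]
    cmd.extend(str(arg) for arg in args)
    for name, value in options.items():
        flag = "--" + name.replace("_", "-")
        if value is True:
            cmd.append(flag)  # boolean switch, no value
        elif value is not None and value is not False:
            if isinstance(value, (list, tuple)):
                value = ",".join(value)  # the command line expects 'PERSON,ORG'
            cmd.extend([flag, str(value)])
    return cmd


def start_prodigy(recipe, *args, **options):
    """Launch the recipe in a child process and return the Popen handle."""
    return subprocess.Popen(build_command(recipe, *args, **options))


cmd = build_command("ner.make-gold", "ner-dataset", "/mnt/c/model", label=["MY_ENT"])
print(cmd[3:])
# ['ner.make-gold', 'ner-dataset', '/mnt/c/model', '--label', 'MY_ENT']

# To actually start the server (requires Prodigy to be installed):
# process = start_prodigy("ner.make-gold", "ner-dataset", "/mnt/c/model", label=["MY_ENT"])
# ... later: process.terminate()
```

Passing the command as a list rather than a single string avoids shell quoting problems with paths and labels.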