Using set_recipe and/or Controller to compose dynamic recipes

Hello,

We are beginning to investigate composing dynamic recipes on demand, where the view id can be set dynamically, without composing a static recipe class for each view we would like to use.

In order to accomplish this, it seems like the prodigy.set_recipe method would be useful. I was wondering if there is an example of this method in use, as I'm uncertain what I would need to set as the 'function' param.

The other thought I had would be to init the Controller on the fly. So, if set_recipe can't serve this purpose, I was wondering if there's an example of serving a recipe from the Controller class.

For the most part we're not doing anything fancy or super custom, just pulling tasks from a database table based on project name.

Looking forward to your response

Hi @kwaddle ,

Each Prodigy recipe needs to run in its own isolated environment that includes a dedicated server, REST API, and web app. This isolation is essential for maintaining consistency in how data is modified and served to annotators. Because of this, there isn't a way to serve multiple recipes on the same server.

Regarding the set_recipe function: its purpose is specifically to register custom recipe functions with Prodigy so that they're available to the Controller just like the built-in ones.

Our recommended solution would be to spin up one Prodigy server per annotation task. However, if you're working on annotations without a model in the loop, there is an approach you could try. The entire logic for defining annotation tasks lives in Python at the recipe level, so you could implement a stream-generating function that modifies tasks based on specific conditions:

def get_stream(examples, other_examples):
    # SOME_CONDITION etc. are placeholders for your own logic
    for eg in examples:
        if SOME_CONDITION:
            yield eg
        elif SOME_OTHER_CONDITION:
            yield something_else
    for eg in other_examples:
        # and so on: modify or filter each task before yielding it
        yield eg

To handle different user interfaces within your tasks, you can use view_id as a configuration parameter at the task level. This allows you to design streams where tasks are rendered with different UIs. When it comes to ensuring tasks are saved in the correct dataset, you'll want to implement the logic in the before_db callback using the DB API.
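To make the dataset-routing step concrete, here's a minimal sketch. The grouping helper and the "project_<view_id>" naming convention are assumptions for illustration; inside an actual before_db callback you would then save each group via the DB API:

```python
# Sketch: group answered tasks by their per-task "view_id" so each
# group can be saved to its own dataset. The "project_<view_id>"
# naming scheme is a made-up convention for illustration.

def route_by_view(examples):
    routed = {}
    for eg in examples:
        dataset = "project_" + eg.get("view_id", "default")
        routed.setdefault(dataset, []).append(eg)
    return routed

# Inside a before_db callback you could then save each group
# explicitly via the DB API, e.g.:
#   from prodigy.components.db import connect
#   db = connect()
#   for dataset, egs in route_by_view(examples).items():
#       db.add_examples(egs, datasets=[dataset])
```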

There are some important workflow considerations to keep in mind when implementing this kind of mixed approach. First, you'll need to decide whether you want to send any question to any annotator - if not, you'll need to implement a custom task router. Additionally, it's worth carefully considering the impact of task switching on your annotators. Experience shows that frequent task switching usually isn't beneficial for annotation quality and efficiency.
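As a rough sketch of what such a task router could look like (the reviewer session names and the "review"-only rule are invented for illustration; the (ctrl, session_id, item) signature follows Prodigy's task router callback):

```python
# Sketch of a custom task router: given the controller, the session
# asking for work, and the next task, return the session(s) that
# should receive it. REVIEWERS and the routing rule are assumptions.

REVIEWERS = {"alice", "bob"}

def task_router(ctrl, session_id, item):
    if item.get("view_id") == "review":
        # Route review tasks to known reviewer sessions only.
        known = getattr(ctrl, "session_ids", [])
        return [s for s in known if s.rsplit("-", 1)[-1] in REVIEWERS]
    # Otherwise, default to the session that asked for the task.
    return [session_id]
```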

Hi @magdaaniol,

Thanks so much for the reply! This was very informative :slight_smile:

I think I might have miscommunicated what my goal is.

Right now, we are running individual projects, isolated in containers using kubernetes. Each project has distinct tasks and annotators. The issue for the time being is that we're juggling three recipes with six configurations based on our annotation lifecycle. This has led to bloated containers that take too long to start, and cost too much to run. I also feel our current solution is rather brittle and difficult to maintain, while only supporting ner.manual and review.

I'm wanting to streamline this by eliminating the static recipe files, and instead compose each recipe as the container is starting. Once the container is running, the intention would be for the view to remain unchanged, but I would like to be able to select any of the text-based interfaces on offer before or during container start.

I don't know that it's strictly necessary to name custom recipes, as our dataset names are dictated by our unique project codes. My thinking was that I could use the Controller to compose and launch the recipe on the fly. I've found how to create the Controller with all the necessary components, but haven't been able to figure out how to actually launch the recipe using it.

I have been able to dynamically compose our configurations based on the command options for each recipe. I would like to expand this to only needing one or two files to build a recipe on demand. In this way I hope I can speed up our images, as well as add a more robust and maintainable way to launch more than just the two recipes we're able to support now.

Hi @kwaddle,

You should be able to launch your custom Controller by directly calling the prodigy.app server with an instance of the Controller as an argument:

from prodigy.app import server
from prodigy.components.stream import get_stream
from prodigy.core import Controller

stream = get_stream("../news_headlines.jsonl")
recipe_components_dict = {
    "view_id": "ner_manual",
    "dataset": "test",
    "stream": stream,
    "exclude": [],
    "config": {
        "lang": "en",
        "labels": ["ORG", "PER"],
        "exclude_by": "input",
        "ner_manual_highlight_chars": False,
    },
}

controller = Controller.from_components("ner.manual", recipe_components_dict)
server(controller, controller.config)

Here I'm using the from_components method, but you can, of course, initialize the Controller class using the default constructor.
The first argument to from_components is the name of the recipe; this only works if the name was registered using the @recipe decorator.

You can also delegate the Controller creation logic to Prodigy and use the prodigy.serve top-level function (not sure if you've seen it). This is usually what we recommend for starting Prodigy programmatically.
Does this help you move forward?
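For example, here's a sketch of how prodigy.serve could fit your goal of picking the interface at container start. prodigy.serve takes the recipe command as a string, and the built-in mark recipe accepts a --view-id option; the helper function and the dataset naming convention below are assumptions for illustration:

```python
# Sketch: build the recipe command from per-project settings so the
# view_id can be chosen at container start. The "<project>_annotations"
# dataset naming convention is made up for illustration.

def build_command(project, view_id, source):
    # "mark" renders a stream with whatever interface --view-id names.
    return f"mark {project}_annotations {source} --view-id {view_id}"

# At container start (requires Prodigy installed):
# import prodigy
# prodigy.serve(build_command("acme01", "text", "./tasks.jsonl"), port=8080)
```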


@magdaaniol

Thank you so much!!

Yes this definitely helps :slight_smile: