Custom Recipe - Stream Parameters

I wrote a custom recipe that implements both stream() and update(), which interface with an external API to GET messages and POST annotations, respectively.

Naively, I would implement my GET endpoint to return the oldest “un-annotated” message, and the POST would then mark that message as annotated. It would be great if I could further filter the stream – select by message_id, select by message type, etc. Is this possible?

This is my stream function:

import requests

def stream():
    while True:
        # i want to add parameters here, from the GET parameters in the browser
        url = 'http://localhost:8000/message'
        yield requests.get(url).json()
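The update() counterpart – the one that POSTs annotations back – could be sketched like this (the endpoint and field names here are assumptions, not my exact code):

```python
import requests

API_URL = 'http://localhost:8000/message'  # same endpoint as in stream()

def make_annotation(task):
    # reduce an answered Prodigy task to what the API needs
    # ('message_id' is an assumed field set by the API when serving tasks)
    return {'message_id': task.get('message_id'), 'answer': task.get('answer')}

def update(answers):
    # Prodigy calls this with batches of answered tasks
    for task in answers:
        requests.post(API_URL, json=make_annotation(task))
```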

Ideally, I was thinking that I could put GET parameters in my browser when using prodigy, and then pass those parameters to my API in my stream() function.

Hi! I hope I understand your question and use case correctly – you want to customise what’s fetched from your stream, right?

In that case, using custom arguments in your custom recipe might be the most convenient solution? All arguments to the recipe function automatically become arguments on the command line. So you could do something like this:

def custom_recipe(dataset, msg_type, order_by):  # <-- whatever you want

    def stream():
        while True:
            # add the parameters to your request
            params = {'type': msg_type, 'order_by': order_by}
            url = 'http://localhost:8000/message'
            yield requests.get(url, params=params).json()

    # other stuff here

    return {
        'dataset': dataset,
        'stream': stream,
        # etc.
    }

You can then execute the recipe from the command line like this (where recipe.py is the file containing your recipe):

prodigy custom-recipe dataset_name comment date -F recipe.py

The @recipe decorator also lets you describe your arguments using Plac’s annotation format. The descriptions will be shown when you run the recipe with --help on the command line. You can also define the argument type (so values will be converted accordingly), and whether it’s positional, an option or a flag. For example, here are some made-up parameters:

@prodigy.recipe('custom-recipe',
    dataset=("The dataset to use", "positional", None, str),
    msg_type=("The message type", "option", "t", str),
    order_by=("Order stream by", "option", "o", str),
    per_page=("Items per response", "option", "p", int),
    include_title=("Include message titles", "flag", "i", bool))
def custom_recipe(dataset, msg_type=None, order_by='created',
                  per_page=10, include_title=False):
    """Custom recipe that integrates API."""
    def stream():
        while True:
            params = { ... } # and so on

On the command line, you can then see the recipe description and arguments by typing --help:

prodigy custom-recipe --help -F recipe.py

And usage could look like this:

prodigy custom-recipe your_dataset --msg-type comment --order-by date --per-page 20 --include-title -F recipe.py

Or, shortcuts:

prodigy custom-recipe your_dataset -t comment -o date -p 20 -i -F recipe.py

Thank you @ines, but unfortunately that’s not what I’m looking for.

I’m looking for controls/parameters from the browser. Your approach would seemingly require restarting the Prodigy process every time a parameter changes. Put another way: what if I have two annotators, each working at the same time? I want to send one annotator to the app with one set of GET params in the URL, and the other annotator with a different set.

I’d like to pass these GET params dynamically into the stream() function and on to my internal API requests, so that they can provide different tasks accordingly.

Does that make sense? Is that possible?

Ah okay!

And no, the app is explicitly connected to one standalone Prodigy process. Even if it were possible to make the stream generator yield different things via the browser, there’s only one stream generator per process – and if it changed, it would change for everyone connected to that process.
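To illustrate why this matters: a Python generator is a single stateful object, so two consumers pulling from the same one share its state rather than each getting their own view of the stream.

```python
def stream():
    # stand-in for a Prodigy stream: an infinite, stateful generator
    i = 0
    while True:
        i += 1
        yield i

s = stream()
annotator_a = [next(s) for _ in range(3)]  # gets items 1, 2, 3
annotator_b = [next(s) for _ in range(3)]  # gets items 4, 5, 6 – not 1, 2, 3
```

Changing what the generator yields mid-flight would affect both consumers equally.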

If you’re working with multiple annotators, you usually want one process per user. The reason is that most Prodigy sessions are inherently stateful – for example, as soon as you’re putting a model in the loop or updating anything, you need a clear separation of the models. You also usually want to store the annotations in different datasets, possibly with different metadata assigned, so you can evaluate and compare them more easily.
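A rough sketch of what that could look like – one process per annotator, each with its own dataset, port and stream parameters (this assumes the port can be overridden via a PRODIGY_PORT environment variable; the dataset names and flags are made up):

```shell
# one Prodigy process per annotator
PRODIGY_PORT=8080 prodigy custom-recipe dataset_alice -t comment  -F recipe.py &
PRODIGY_PORT=8081 prodigy custom-recipe dataset_bob   -t question -F recipe.py &
```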

In your case, it looks like you already have the “one provider, multiple consumer” stuff figured out via your API (which is usually the “hard part”) – so it’d make much more sense to start the recipe multiple times on different ports with different parameters. @andy’s multiuser Prodigy extension has a nice example of this.

If you want to do this more elegantly via the browser, you could have a simple app that takes care of launching those processes automatically based on a URL. For example:

  1. user accesses a URL that encodes the desired parameters
  2. your service starts your recipe script on an available port, creates a new dataset if necessary and passes in the parameters
  3. once Prodigy is running, user is redirected to the web app on the given host/port and can start annotating
  4. optional: kill process after certain period of inactivity and free up the port
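The steps above could be sketched like this – a helper for step 2 that turns the requested parameters into a Prodigy command on a free port (everything here is illustrative: the recipe name, file and flags are assumptions):

```python
import os
import socket
import subprocess

def find_free_port():
    # ask the OS for an unused port
    with socket.socket() as s:
        s.bind(('', 0))
        return s.getsockname()[1]

def build_command(dataset, msg_type, order_by):
    # assemble the Prodigy invocation for one annotator session
    return ['prodigy', 'custom-recipe', dataset,
            '--msg-type', msg_type, '--order-by', order_by,
            '-F', 'recipe.py']

def launch(dataset, msg_type, order_by):
    port = find_free_port()
    cmd = build_command(dataset, msg_type, order_by)
    # assumes the port can be set via a PRODIGY_PORT env var
    proc = subprocess.Popen(cmd, env={**os.environ, 'PRODIGY_PORT': str(port)})
    return port, proc  # redirect the user to this port; track proc for cleanup
```

The service would then redirect the user to the returned port (step 3) and keep the process handle around so it can be killed after inactivity (step 4).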