Reusing existing recipe_args

I’m trying to write a custom recipe. I’m copying the Plac annotations from core.py:recipe_args. The type value for the argument is often the string <value is a self-reference, replaced by this string>. Presumably this is some internal placeholder used by Prodigy. How does it work?

In particular I’m asking because I’m trying to take a path to a match patterns file as a positional argument, so I’m copying recipe_args['patterns'] and changing it as necessary. If I leave that string in place I get the error

ValueError: '<value is a self-reference, replaced by this string>' is not callable

Presumably the Prodigy internals replace this with the appropriate factory function.

Things seem to work if I omit this element from the PLAC tuple. Is there any reason I should keep it in?

Could you post an example of your code or just the recipe decorator and function arguments you’re using?

The recipe_args are Plac argument annotation tuples, to make it easier for Prodigy to reuse them across the individual recipes. So in your recipe, it should look something like this:

import prodigy
from prodigy import recipe_args

@prodigy.recipe('my_recipe',
    patterns=recipe_args['patterns'])
def my_recipe(patterns=None):
    print(patterns)

Under the hood, recipe_args is a dictionary that contains entries like this:

'patterns': ("Path to match patterns file", "option", "pt", Path)

This makes the argument an option that can be used as --patterns (if your argument is called patterns) or -pt, and the command-line input will be converted from a string to a pathlib.Path. You can also define your own argument annotations or leave them out completely (which means that all command-line arguments are going to be positional and passed in as strings).
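To make the four tuple fields concrete, here is a minimal sketch that maps one annotation onto the standard library's argparse. It is not Plac's or Prodigy's actual implementation: add_annotated is a hypothetical helper, and PATTERNS is a stand-in for the recipe_args entry shown above.

```python
import argparse
from pathlib import Path

# Stand-in for the entry shown above; the real one lives in Prodigy.
PATTERNS = ("Path to match patterns file", "option", "pt", Path)


def add_annotated(parser, name, annotation):
    """Hypothetical helper: register one (help, kind, abbrev, type) tuple."""
    help_text, kind, abbrev, converter = annotation
    if kind == "positional":
        parser.add_argument(name, help=help_text, type=converter)
    elif kind == "option":
        # Long form from the argument name, short form from the abbreviation.
        flags = ["--" + name] + (["-" + abbrev] if abbrev else [])
        parser.add_argument(*flags, dest=name, help=help_text, type=converter)
    else:
        raise ValueError("unsupported kind: %r" % kind)


parser = argparse.ArgumentParser()
add_annotated(parser, "patterns", PATTERNS)

# The converter (Path) is called on the raw command-line string.
args = parser.parse_args(["-pt", "patterns.jsonl"])
print(repr(args.patterns))
```

Passing --patterns patterns.jsonl instead of -pt gives the same result, since both flags write to the same destination.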

@recipe('ner.print-pattern-stream',
        spacy_model=recipe_args['spacy_model'],
        patterns=(
                'Path to match patterns file',
                'positional',
                None,
                '<value is a self-reference, replaced by this string>'),
        source=recipe_args['source'],
        api=recipe_args['api'],
        loader=recipe_args['loader'])
def print_pattern_stream(spacy_model, patterns, source=None, api=None, loader=None):
    ...etc...

I copied the patterns tuple from recipe_args['patterns'], changing 'option' to 'positional' and removing the short argument name 'pt'.

If I change this to just patterns=('Path to match patterns file', 'positional') everything works fine.

Ahhh, I think I know what’s going on here.

The idea for the recipe_args is that they can be used to replace the tuples (see my example above) – not copied. So instead of the tuple, you can just use recipe_args['patterns'] in your recipes. This is the original tuple:

("Path to match patterns file", "option", "pt", Path)

In the compiled Python source, the Path reference (to pathlib.Path) seems to get replaced with that '<value is a self-reference, replaced by this string>' string. So if you just copy that output instead of using recipe_args['patterns'] in your script, you’ll end up with this string instead of the actual Path class.

The fourth element of the tuple is the argument type or a converter function. When the argument is passed in from the command line, this callable is applied to the argument value. This also explains why you see this error: you’re passing in a string, not a callable. Argument types can be built-ins like str, int or bool, but also other callables and functions. So in this case, the command-line value patterns.jsonl is converted to Path('patterns.jsonl').
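As a rough illustration, the sketch below reuses an annotation but makes it positional while keeping the converter, then reproduces the "is not callable" failure with the placeholder string. It uses plain argparse (which, as far as I know, Plac builds on) rather than Prodigy itself, and original is a stand-in for recipe_args['patterns'].

```python
import argparse
from pathlib import Path

# Stand-in for recipe_args['patterns']; in a real recipe, import it from Prodigy.
original = ("Path to match patterns file", "option", "pt", Path)

# Keep the help text and the converter, but make the argument positional.
help_text, _kind, _abbrev, converter = original
positional = (help_text, "positional", None, converter)

parser = argparse.ArgumentParser()
parser.add_argument("patterns", help=positional[0], type=positional[3])
args = parser.parse_args(["patterns.jsonl"])
print(type(args.patterns).__name__, args.patterns)

# The copied placeholder is a plain string, so it cannot be called as a converter.
placeholder = "<value is a self-reference, replaced by this string>"
message = ""
broken = argparse.ArgumentParser()
try:
    broken.add_argument("patterns", type=placeholder)
except (TypeError, ValueError) as err:
    # The exception class may differ between Python versions.
    message = str(err)
print(message)
```

Unpacking the existing tuple instead of copying its printed form means the real Path class is carried over, whatever the documentation or compiled source happens to display.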

I copied and modified recipe_args instead of just dereferencing it because in my standalone recipe the patterns file should be a positional rather than an optional argument. This probably wouldn’t be the case if this functionality was incorporated into the ner.print-stream recipe.

I didn’t know about the Path converter function. Here I think I don’t need it because I end up writing

model = PatternMatcher(spacy.load(spacy_model)).from_disk(patterns)

Yes, this makes sense. I think in the long run, you’ll probably build up your own set of argument annotations for your custom recipes that cover your needs and preferences. There’s actually quite a lot of cool stuff you can do with those (the Plac documentation has some more examples as well).

The PRODIGY_README.html also includes a list of the most important, reusable recipe_args and their annotations, including the type or converter function.