The docs state that the source argument (e.g. for textcat.teach) is optional and defaults to sys.stdin. However, this does not work in practice:
> cat my_data.jsonl | python -m prodigy textcat.html.teach -l ACCEPT -F textcat_html.py --lo jsonl some_dataset en_core_web_md
usage: prodigy textcat.html.teach [-h] [-a None] [-lo None] [-l] dataset spacy_model source
prodigy textcat.html.teach: error: the following arguments are required: source
How does your custom textcat.html.teach recipe look? From the error message, it seems like the source argument isn’t optional there.
In Prodigy’s built-in recipes, the source argument should be optional and default to None. The get_stream helper then handles that and loads the source, either from a file with a given loader or from stdin if it’s None. In a custom recipe, you can choose to do it the same way or require the argument. That’s up to you.
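To illustrate the "optional source, fall back to stdin" pattern, here is a minimal standard-library sketch. It is not Prodigy's actual implementation and does not use Prodigy's recipe decorator or get_stream. The names load_jsonl and build_parser are hypothetical, and the argument names simply mirror the usage message above. The idea is only that the positional defaults to None when omitted, and the loader reads from stdin in that case.

```python
import argparse
import io
import json
import sys


def load_jsonl(source=None, stdin=None):
    """Yield one dict per non-empty JSONL line.

    Reads from the file at `source`, or from stdin when `source` is None.
    The `stdin` parameter exists only so the example can be run without a pipe.
    """
    if source is None:
        handle = stdin if stdin is not None else sys.stdin
        for line in handle:
            if line.strip():
                yield json.loads(line)
    else:
        with open(source, encoding="utf8") as handle:
            for line in handle:
                if line.strip():
                    yield json.loads(line)


def build_parser():
    # Hypothetical stand-in for a recipe's command-line interface.
    parser = argparse.ArgumentParser(prog="my_recipe")
    parser.add_argument("dataset")
    parser.add_argument("spacy_model")
    # nargs="?" makes the positional optional; it is None when omitted.
    parser.add_argument("source", nargs="?", default=None)
    return parser


# Simulate `cat my_data.jsonl | my_recipe some_dataset en_core_web_md`.
args = build_parser().parse_args(["some_dataset", "en_core_web_md"])
fake_stdin = io.StringIO('{"text": "hello"}\n{"text": "world"}\n')
tasks = list(load_jsonl(args.source, stdin=fake_stdin))
print(tasks)
```

In a real custom recipe, the equivalent change would be giving the source argument a None default in the recipe's own argument annotations, then passing it on to whichever loader the recipe already uses.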
Ah yes, looks like I copied the recipe from somewhere where source was indeed not optional by default. My bad. To be honest I forgot that I wasn’t using a built-in recipe - thanks for the quick help!