I'm trying to set up multi-label text classification but am failing at the basics of replicating and modifying recipes from this repo: https://github.com/explosion/prodigy-recipes
Firstly: Problems with basic examples
I can't get a provided recipe running as described in the repo. There it says to run (using NER as an example):
python -m prodigy -F prodigy-recipes/ner/ner_teach.py
which results in this error:
✘ Can't find recipe or command '-F'.
Run prodigy --help to see available options
Could this be outdated? All the other run commands I've seen in the Prodigy documentation pass further arguments such as datasets, models, sources, etc. Additionally, the method signature (and decorator arguments) in ner_teach.py also indicates the need to pass arguments on the command line.
Secondly: Problems with adapting examples
Now, since I need text classification, I've run the textcat_* scripts in the repo, first without arguments (to no avail) and then with arguments. To zero in on the cause of the error, I've cut away a lot of code from the original function. With that reduced recipe and the respective arguments on the command line, a Prodigy instance starts up successfully and the sample text is loaded correctly; however, the label choice form is missing.
The code is this:
import prodigy
from prodigy.components.loaders import JSONL
from prodigy.models.textcat import TextClassifier
from prodigy.util import split_string
import spacy
from typing import List, Optional


@prodigy.recipe(
    "foo_cat",
    dataset=("The dataset to use", "positional", None, str),
    spacy_model=("The base model", "positional", None, str),
    source=("The source data as a JSONL file", "positional", None, str),
    label=("One or more comma-separated labels", "option", "l", split_string),
)
def foo_cat(
    dataset: str,
    spacy_model: str,
    source: str,
    label: Optional[List[str]] = None,
):
    nlp = spacy.load(spacy_model)
    model = TextClassifier(nlp, label)
    update = model.update
    stream = JSONL(source)
    return {
        "view_id": "classification",
        "dataset": dataset,
        "stream": stream,
        "update": update,
    }
and the command is this:
prodigy foo_cat some_dataset "de_core_news_sm" news_headlines.jsonl -F prodigy-recipes/textcat/foo_cat.py --label bla,ble,blo
This runs, but in the resulting web app the label choice form is missing. I'd like to have a non-mutually-exclusive form like the one shown here: https://prodi.gy/docs/text-classification#manual
How can I do this?
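To make it concrete, here is a rough sketch of what I think I'm after. From what I can gather from the docs, the multiple-choice form seems to need the "choice" interface plus an "options" list on every task, rather than the "classification" interface I'm using now. The helper names add_options and build_components are my own (hypothetical, not from the repo), and the "choice_style" setting is my reading of the docs, so please correct me if this is the wrong direction. The sketch deliberately avoids importing prodigy so it only shows the shape of the data:

```python
from typing import Dict, Iterable, Iterator, List


def add_options(stream: Iterable[Dict], labels: List[str]) -> Iterator[Dict]:
    """Attach one selectable option per label to every incoming task.

    Hypothetical helper: assumes the choice interface reads task["options"].
    """
    for task in stream:
        task = dict(task)  # shallow copy so the original example is left untouched
        task["options"] = [{"id": label, "text": label} for label in labels]
        yield task


def build_components(dataset: str, stream: Iterable[Dict], labels: List[str]) -> Dict:
    """Assemble the dict a recipe function would return (hypothetical helper)."""
    return {
        "view_id": "choice",  # assumption: replaces "classification"
        "dataset": dataset,
        "stream": add_options(stream, labels),
        # assumption: "multiple" makes the options non-mutually-exclusive
        "config": {"choice_style": "multiple"},
    }


if __name__ == "__main__":
    examples = [{"text": "Some headline"}]
    components = build_components("some_dataset", examples, ["bla", "ble", "blo"])
    print(next(components["stream"]))
```

In the actual recipe I would presumably return this dict from foo_cat and pass JSONL(source) as the stream, but I haven't gotten that far yet.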
prodigy stats:
============================== ✨ Prodigy Stats ==============================
Version          1.9.9
Location         /home/steff-vm/mara/acdh-prodigy-utils/venv/lib/python3.6/site-packages/prodigy
Prodigy Home     /home/steff-vm/.prodigy
Platform         Linux-5.3.0-51-generic-x86_64-with-Ubuntu-18.04-bionic
Python Version   3.6.9
Database Name    SQLite
Database Id      sqlite
Total Datasets   9
Total Sessions   105
Cheers,
Stefan