Hi,
I'm trying to adjust the textcat_teach recipe to use prefer_high_scores.
For this I copied the textcat_teach recipe from GitHub and pasted it into a new *.py file, "custom_teach.py".
The default textcat_teach recipe works fine, and the server starts like so:
$ prodigy textcat.teach nar_5 blank:en all_fin.txt --loader txt --label "nar_5_proration" --patterns anno_patterns.jsonl
Without changing anything in the code yet, when calling the recipe from the file I created with
$ prodigy textcat.teach nar_5 blank:en all_fin.txt --loader txt --label "nar_5_proration" --patterns anno_patterns.jsonl -F custom_teach.py
I get the error message
usage: prodigy textcat.teach [-h] [-l None] [-p None] [-e None] dataset spacy_model source
prodigy textcat.teach: error: unrecognized arguments: --loader txt
which seems odd, because it should be the exact same code as in the default recipe. At this point, I had not even switched to prefer_high_scores.
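To illustrate what the error seems to say, here is a minimal sketch using Python's standard argparse module. It is not Prodigy's own code: the parser layout is only modelled on the usage line above, and the long option name `--exclude` for `-e` is an assumption. A parser that never declares `--loader` leaves it as an unrecognized argument, while one that declares it accepts the same command line.

```python
import argparse


def build_parser(with_loader: bool) -> argparse.ArgumentParser:
    """Build a stand-in parser shaped like the usage line in the error."""
    parser = argparse.ArgumentParser(prog="prodigy textcat.teach")
    parser.add_argument("dataset")
    parser.add_argument("spacy_model")
    parser.add_argument("source")
    parser.add_argument("-l", "--label")
    parser.add_argument("-p", "--patterns")
    parser.add_argument("-e", "--exclude")  # long name is an assumption
    if with_loader:
        # Only a recipe that declares this option can accept "--loader txt".
        parser.add_argument("--loader")
    return parser


ARGV = [
    "nar_5", "blank:en", "all_fin.txt",
    "--loader", "txt",
    "--label", "nar_5_proration",
    "--patterns", "anno_patterns.jsonl",
]

# Without a declared --loader, the option and its value are left over.
_, unknown = build_parser(with_loader=False).parse_known_args(ARGV)
print(unknown)

# With --loader declared, the same arguments parse cleanly.
args = build_parser(with_loader=True).parse_args(ARGV)
print(args.loader)
```

If this reading is right, the copy of the recipe in my file declares fewer arguments than the built-in one, which would suggest the two are not the same version.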
I then added a loader for TXT to the recipe in my file.
Now there seems to be an issue with the db-out JSONL file that I passed as patterns (even though this did not raise any issues with the default recipe). It fails on the first line already:
ValueError: Invalid pattern: {'text': 'It is anticipated that the Final Settlement Date will be July 7, 2020, the first business day after the Expiration Time.', '_input_hash': 964872305, '_task_hash': 137696480, 'options': [{'id': 'nar_sum_ins_per_bo', 'text': 'nar_sum_ins_per_bo'}, {'id': 'nar_sum_bo_discl', 'text': 'nar_sum_bo_discl'}, {'id': 'nar_sum_paper', 'text': 'nar_sum_paper'}, {'id': 'nar_sum_withdraw', 'text': 'nar_sum_withdraw'}, {'id': 'nar_1_details', 'text': 'nar_1_details'}, {'id': 'nar_2_consent', 'text': 'nar_2_consent'}, {'id': 'nar_3_instruct', 'text': 'nar_3_instruct'}, {'id': 'nar_4_proceed', 'text': 'nar_4_proceed'}, {'id': 'nar_5_proration', 'text': 'nar_5_proration'}, {'id': 'nar_6_doc', 'text': 'nar_6_doc'}, {'id': 'nar_7_restrictions', 'text': 'nar_7_restrictions'}, {'id': 'settle_date', 'text': 'settle_date'}, {'id': 'miex', 'text': 'miex'}, {'id': 'milt', 'text': 'milt'}, {'id': 'bois', 'text': 'bois'}, {'id': 'accrued', 'text': 'accrued'}, {'id': 'consent_fees', 'text': 'consent_fees'}, {'id': 'early_fees', 'text': 'early_fees'}, {'id': 'offer_price', 'text': 'offer_price'}], '_session_id': None, '_view_id': 'choice', 'config': {'choice_style': 'multiple'}, 'accept': ['settle_date'], 'answer': 'accept'}
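The record in the error is a full annotation task from db-out rather than a {"label": ..., "pattern": ...} entry. As a rough sketch of the difference, the snippet below turns accepted db-out records into pattern lines. The helper name `export_to_patterns` is hypothetical and not part of Prodigy, and it assumes each record has the "text", "accept" and "answer" keys seen in the error message.

```python
import json
from typing import Iterable, Iterator


def export_to_patterns(lines: Iterable[str], label: str) -> Iterator[str]:
    """Yield one pattern line per accepted record that selected `label`."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        task = json.loads(line)
        accepted = task.get("answer") == "accept"
        if accepted and label in task.get("accept", []):
            # Use the whole annotated text as an exact string pattern.
            yield json.dumps({"label": label, "pattern": task["text"]})


# A trimmed-down version of the record shown in the error message.
sample = json.dumps({
    "text": "It is anticipated that the Final Settlement Date will be "
            "July 7, 2020, the first business day after the Expiration Time.",
    "accept": ["settle_date"],
    "answer": "accept",
})

patterns = list(export_to_patterns([sample], "settle_date"))
print(patterns[0])
```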
So I decided to try a new file, "nar_5_patterns.jsonl", that should match the expected patterns syntax, which looks like this:
{"label":"nar_5_proration","pattern":"As a result, no series of Notes accepted for 3 purchase will be prorated."} ...
with
$ prodigy textcat.teach nar_custom blank:en all_fin.txt --loader txt --label "nar_5_proration" --patterns nar_5_patterns.jsonl -F custom_teach.py
and I now get the error message
✘ Failed to load task (invalid JSON on line 1)
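To narrow down which file the message refers to, each input can be checked line by line with the standard json module. This is only a sketch: `find_invalid_jsonl_lines` is a hypothetical helper of my own, and the two sample lines stand in for real file contents. A plain-text line fails to parse as JSON, so if the recipe ends up reading all_fin.txt as JSONL, an error on line 1 would be expected.

```python
import json
from typing import Iterable, List, Tuple


def find_invalid_jsonl_lines(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line number, parser message) for every line that is not JSON."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            json.loads(line)
        except json.JSONDecodeError as err:
            problems.append((number, err.msg))
    return problems


sample_lines = [
    # A pattern entry in the format shown above: valid JSON.
    '{"label":"nar_5_proration","pattern":"As a result, no series of Notes '
    'accepted for 3 purchase will be prorated."}',
    # A line of plain text, as a .txt source would contain: not JSON.
    "It is anticipated that the Final Settlement Date will be July 7, 2020.",
]

problems = find_invalid_jsonl_lines(sample_lines)
print(problems)
```

With real files, the same function can be called on an open file handle, for example `find_invalid_jsonl_lines(open("nar_5_patterns.jsonl", encoding="utf8"))`.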
What am I missing?
Why would it work when loaded as default but not from a file? Is the recipe I got from GitHub outdated?
Is there another way to access and copy the textcat_teach recipe? I figure it should be stored somewhere on my system, since it is accessed when launching the server, but I haven't been able to find it yet.
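One way to look for it would be to ask Python where a package is installed. The sketch below uses only the standard library; `package_dir` is a hypothetical helper, and whether the recipe sources inside the Prodigy install are readable .py files is an assumption I have not verified.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def package_dir(name: str) -> Optional[Path]:
    """Return the install directory of a top-level package, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        return None  # not installed, or no single file location
    return Path(spec.origin).parent


# Shown with a standard-library package so the sketch runs anywhere.
print(package_dir("json"))

# For Prodigy this would be package_dir("prodigy"); it returns None
# if the package is not installed in the current environment.
print(package_dir("prodigy"))
```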
Thanks in advance!