not able to get the example running. https://github.com/explosion/projects/tree/master/ner-food-ingredients

deewuok · December 18, 2023, 12:28am

py -m prodigy train --ner food_data en_vectors_web_lg ---paths.init_tok2vec .\tok2vec_cd8_model289.bin --eval-split 0.2 --output tmp_model

is not working. (I followed one of the discussions where --init_tok2vec was replaced.)

--output is not working
C:\Users\dwu\Documents\ner_Prodigy\ner-food-ingredients>py -m prodigy train --ner food_data en_vectors_web_lg ---paths.init_tok2vec .\tok2vec_cd8_model289.bin --eval-split 0.2
Using CPU

========================= Generating Prodigy config =========================
Auto-generating config with spaCy
Generated training config

=========================== Initializing pipeline ===========================
✘ Error parsing config overrides
-paths -> init_tok2vec not a section value that can be overridden

I downloaded tok2vec_cd8_model289.bin; it is in the folder.
The first three steps seem to work, just not the training step.

(These are listed below just for reference.)

Create a phrase list using seed terms. Requires sense2vec and a vectors package.

py -m prodigy sense2vec.teach food_terms ./s2v_reddit_2015_md --seeds "garlic, avocado, cottage cheese, olive oil, cumin, chicken breast, beef, iceberg lettuce"

Convert the phrase list to a match patterns file.

py -m prodigy  terms.to-patterns food_terms --label INGRED --spacy-model blank:en > ./food_patterns.jsonl

Label data manually with help from the patterns.

py -m prodigy  ner.manual food_data blank:en ./reddit_r_cooking_sample.jsonl --label INGRED --patterns food_patterns.jsonl

magdaaniol · December 20, 2023, 4:37pm

Hey @deewuok ,

You might have already seen my answer to the very same question here

When you override a config setting, you should use double (not triple) dashes (I think there is a typo in the original answer). So could you try:

 py -m prodigy train --ner food_data en_vectors_web_lg --paths.init_tok2vec .\tok2vec_cd8_model289.bin --eval-split 0.2 --output tmp_model
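For what it's worth, here is a rough Python sketch of why the third dash matters. It assumes the override parser simply strips the leading `--` from the option name and keeps the rest, which is consistent with the `-paths -> init_tok2vec` error you saw but is not copied from the actual spaCy source. `override_key` is a hypothetical helper name used only for illustration.

```python
def override_key(option: str) -> str:
    """Return the config key a dotted CLI override would map to.

    Assumption: the parser removes one leading "--" and keeps everything else.
    """
    if not option.startswith("--"):
        raise ValueError(f"override should start with --: {option!r}")
    return option[2:]


# Two dashes give the intended section and key.
print(override_key("--paths.init_tok2vec"))   # paths.init_tok2vec
# Three dashes leave a stray "-" in front of the section name.
print(override_key("---paths.init_tok2vec"))  # -paths.init_tok2vec
```

With the stray dash, the section would be read as `-paths` rather than `paths`, and no such section exists in the config.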

Thanks!

deewuok · December 22, 2023, 7:05pm

Two new problems. I don't know if this is the best place to post or whether to start a new thread. (I guess I'll start a new thread for the issue with s2v_reddit_2015_md.tar.gz and 01_Preprocess_Reddit.ipynb, and just state the other issue, with --output, here.)

The command line you gave ran, except for --output: it says it can't find that option.
It's not a big deal, just letting you know, but it would be nice to know if there is a way to specify it in the future.

py -m prodigy train --ner food_data en_vectors_web_lg --paths.init_tok2vec .\tok2vec_cd8_model289.bin --eval-split 0.2 --output tmp_model
Using CPU
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\dwu\AppData\Local\Programs\Python\Python311\Lib\site-packages\prodigy\__main__.py", line 50, in <module>
    main()
  File "C:\Users\dwu\AppData\Local\Programs\Python\Python311\Lib\site-packages\prodigy\__main__.py", line 44, in main
    controller = run_recipe(run_args)
                 ^^^^^^^^^^^^^^^^^^^^
  File "cython_src\prodigy\cli.pyx", line 98, in prodigy.cli.run_recipe
  File "cython_src\prodigy\cli.pyx", line 99, in prodigy.cli.run_recipe
  File "C:\Users\dwu\AppData\Local\Programs\Python\Python311\Lib\site-packages\prodigy\recipes\train.py", line 288, in train
    overrides = parse_config_overrides(list(_extra))
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\dwu\AppData\Local\Programs\Python\Python311\Lib\site-packages\spacy\cli\_util.py", line 108, in parse_config_overrides
    cli_overrides = _parse_overrides(args, is_cli=True)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\dwu\AppData\Local\Programs\Python\Python311\Lib\site-packages\spacy\cli\_util.py", line 127, in _parse_overrides
    raise NoSuchOption(orig_opt)
click.exceptions.NoSuchOption: No such option: --output

(I'm starting the issue about s2v_reddit_2015_md.tar.gz and 01_Preprocess_Reddit.ipynb in a new thread.) Thank you for your help.

magdaaniol · January 3, 2024, 11:22am

Hi @deewuok,

I responded to your question about the input file in the dedicated thread here. Hope it helps!

About the train command:
To start with a tip: you can run the command with the -h flag (for help) to quickly see all the available options. If you do that, you'll see that, indeed, there's no --output option. The location of the output model is an optional positional argument that should be listed first. So your command should be:

python -m prodigy train ./tmp_model --ner food_data en_vectors_web_lg --eval-split 0.2 --paths.init_tok2vec=.\tok2vec_cd8_model289.bin

Also note that the config overrides (paths.init_tok2vec, in your case) should appear at the end.
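If it helps, the ordering can be sketched in a few lines of Python. This only assembles the argument list and does not run Prodigy. `build_train_args` is a hypothetical helper written for this post, not part of Prodigy or spaCy, and the file names are the ones from this thread.

```python
from typing import Dict, List


def build_train_args(output_dir: str, recipe_args: List[str],
                     overrides: Dict[str, str]) -> List[str]:
    """Assemble `prodigy train` arguments: output path, options, overrides."""
    # The output location is a positional argument, so it comes right after "train".
    args = ["-m", "prodigy", "train", output_dir]
    # Regular recipe options follow.
    args += recipe_args
    # Dotted config overrides go last, written as --section.key=value.
    args += [f"--{key}={value}" for key, value in overrides.items()]
    return args


args = build_train_args(
    "./tmp_model",
    ["--ner", "food_data", "en_vectors_web_lg", "--eval-split", "0.2"],
    {"paths.init_tok2vec": r".\tok2vec_cd8_model289.bin"},
)
print("python " + " ".join(args))
```

Printing the joined list gives the same command as the one above, with ./tmp_model first and the paths.init_tok2vec override last.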
