Since Prodigy exports the patterns files, I am assuming it is a fair question to ask here.
I asked the same question on the spaCy GitHub support page after I received the same error message for another patterns file. The two people who responded were a bit mystified. For the smaller patterns files, re-exporting via Prodigy is not a big deal, and doing so resolved the problem the first time.
In this case, there are about 6,000 ingredients in ing_patterns.jsonl.
Can you please explain what this error message means? And how I might fix it? It keeps popping up - though not consistently - for reasons unknown. It surprises me, but I found no reference to a resolution anywhere.
I tried changing the file names and the file locations, but it doesn't seem to make a difference. I don't know why the system expects to find a config file. I have a config and a base_config file in the main folder, both set to defaults, and I could find no apparent, direct relationship between those config files and the patterns files.
Here is all the spaCy-related code from that page:
import spacy
from spacy.language import Language

nlp = spacy.load("en_core_web_lg")


# Start a new sentence after every newline token.
@Language.component("set_custom_boundaries")
def set_custom_boundaries(doc):
    for token in doc[:-1]:
        if token.text == '\n':
            doc[token.i + 1].is_sent_start = True
    return doc


nlp.add_pipe("set_custom_boundaries", before="parser")

# One entity ruler per patterns file, each placed before the statistical NER.
rulerIngs = nlp.add_pipe("entity_ruler", name="rulerIngs", before="ner")
rulerUms = nlp.add_pipe("entity_ruler", name="rulerUms", before="ner")
rulerAmts = nlp.add_pipe("entity_ruler", name="rulerAmt", before="ner")
rulerMods = nlp.add_pipe("entity_ruler", name="rulerMods", before="ner")

rulerIngs.from_disk("patterns/ing_patterns2021-10-11.jsonl")
rulerUms.from_disk("patterns/ums_patterns2021-10-11.jsonl")
rulerAmts.from_disk("patterns/amts_patterns2021-10-16.jsonl")
rulerMods.from_disk("patterns/mods_patterns2021-10-11.jsonl")
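To rule out a simple path problem (the script may not be running from the folder I think it is), a check along these lines could run before each `from_disk` call. This is only a sketch: `resolve_patterns` is a hypothetical helper of my own, not a spaCy or Prodigy function, and it assumes the patterns files live in a `patterns` folder under a given base directory.

```python
from pathlib import Path


# Hypothetical helper (not part of spaCy or Prodigy): build the path to a
# patterns file under <base_dir>/patterns and fail early, with the absolute
# path in the message, if it is not a regular file.
def resolve_patterns(name, base_dir):
    path = Path(base_dir) / "patterns" / name
    if not path.is_file():
        raise FileNotFoundError(f"Patterns file not found: {path.resolve()}")
    return path


# Example usage, assuming the script sits next to the "patterns" folder:
# base = Path(__file__).resolve().parent
# rulerIngs.from_disk(resolve_patterns("ing_patterns2021-10-11.jsonl", base))
```

If the helper raises, the message shows the absolute path that was actually checked, which should make a wrong working directory or a misspelled file name easy to spot.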
Here is the entire error message:
(venv) C:\Users\rober\food_ner>python test.py
Traceback (most recent call last):
  File "C:\Users\rober\food_ner\test.py", line 45, in <module>
    rulerIngs.from_disk("ing_patterns2021-10-11.jsonl")
  File "C:\Users\rober\AppData\Local\Programs\Python\Python39\lib\site-packages\spacy\pipeline\entityruler.py", line 429, in from_disk
    from_disk(path, deserializers_cfg, {})
  File "C:\Users\rober\AppData\Local\Programs\Python\Python39\lib\site-packages\spacy\util.py", line 1225, in from_disk
    reader(path / key)
  File "C:\Users\rober\AppData\Local\Programs\Python\Python39\lib\site-packages\spacy\pipeline\entityruler.py", line 428, in <lambda>
    deserializers_cfg = {"cfg": lambda p: cfg.update(srsly.read_json(p))}
  File "C:\Users\rober\AppData\Local\Programs\Python\Python39\lib\site-packages\srsly\_json_api.py", line 51, in read_json
    file_path = force_path(path)
  File "C:\Users\rober\AppData\Local\Programs\Python\Python39\lib\site-packages\srsly\util.py", line 24, in force_path
    raise ValueError(f"Can't read file: {location}")
ValueError: Can't read file: ing_patterns2021-10-11.jsonl\cfg
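The path in the last line looks like the patterns file name with `cfg` joined onto it, as if the file were being treated as a folder. The sketch below only reproduces that join with `pathlib`. It assumes `key` is `"cfg"` (which the `deserializers_cfg` line in the traceback suggests) and is not spaCy's actual code.

```python
from pathlib import PureWindowsPath

# The patterns path exactly as it was passed in the traceback.
patterns_path = PureWindowsPath("ing_patterns2021-10-11.jsonl")

# Joining a key named "cfg" onto it, as `reader(path / key)` appears to do.
joined = patterns_path / "cfg"

print(joined)  # ing_patterns2021-10-11.jsonl\cfg
```

That matches the path in the error, so my guess is that the loader went looking for a `cfg` file inside a folder named after the patterns file, but I don't know why it would take that route.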
Thank you in advance for your feedback.
Robert
Here is also a link to the discussion on the spaCy support page, which I opened after receiving the same message for a different patterns file.