EDIT: I still don't know what I did wrong or why the patterns file was not working, but I did eventually find a way around the problem. I ran the dataset through Prodigy again and accepted all of the entries. Then I used the terms.to-patterns recipe to export it to another patterns file. It is working great so far.
Still, two or three questions remain, if you have a moment.
I could find only a few references to that error message online and nothing that explained what it meant. What does it mean? What would cause it?
Is it possible to include or "embed" the seven-digit USDA ID number for each ingredient in the patterns file?
I'm thinking the answer is "no." As a follow-up, do you have any thoughts on how I could link that ID, which I use to look up the nutritional breakdown of each ingredient, to its counterpart in the patterns file?
(I currently have it all linked in a large CSV file.)
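To make the follow-up question concrete, here is a minimal sketch of the kind of workaround I have in mind: leave the patterns file untouched and look the ID up from the CSV after a match. The column names (`ingredient`, `usda_id`) and the ID values are placeholders I made up for illustration, not real USDA data, and the sketch assumes the matched text corresponds to the ingredient name in the CSV once case and spacing are normalized.

```python
import csv
import io

# Placeholder data: column names and ID values are made up for illustration.
SAMPLE_CSV = """ingredient,usda_id
olive oil,0000001
brown sugar,0000002
"""


def normalize(text):
    """Lowercase and collapse whitespace so 'Olive  Oil' matches 'olive oil'."""
    return " ".join(text.lower().split())


def load_usda_lookup(csv_file):
    """Build a dict mapping a normalized ingredient name to its USDA ID."""
    reader = csv.DictReader(csv_file)
    return {normalize(row["ingredient"]): row["usda_id"] for row in reader}


def usda_id_for(matched_text, lookup):
    """Return the USDA ID for a matched ingredient, or None if it is unknown."""
    return lookup.get(normalize(matched_text))


# In the real app this would be open("my_ingredients.csv", newline="") instead.
lookup = load_usda_lookup(io.StringIO(SAMPLE_CSV))
print(usda_id_for("Olive  Oil", lookup))
```

The text of each entity the model finds in a recipe would be passed to `usda_id_for`, so the ID never has to live in the patterns file itself.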
I received this error message. Unfortunately, Google shows no matches for this phrase: "Cannot add pattern for zero tokens to matcher". When I searched for help here, I could not find any reference to it either.
Could you help explain it to me? And how can I fix it?
If I run my code using your food_patterns.jsonl file, it works fine. If I run it with my food_patterns_rp file (see below), I receive the error message as shared in the screenshot below, despite no visual differences that I can detect.
I worked on the problem all weekend, but no luck so far.
As I mentioned in a previous post, the purpose of my app is to read a culinary recipe as a block of text and return a nutritional profile of that recipe. I studied your related video for initial guidance (Training a NAMED ENTITY RECOGNITION MODEL with Prodigy and Transfer Learning - YouTube) and modeled my patterns file after the food_patterns.jsonl file at this link:
There was one exception, but I removed it.
Because I need to retrieve the USDA nutritional breakdown for each ingredient, I had amended the food_patterns.jsonl format to include the corresponding USDA ID for that ingredient in my patterns file. I need SOME way to reference it, after it matches an ingredient on file to one in the recipe.
I hope that makes sense. The app reads the recipe. It finds an ingredient that matches one in the patterns file. Once that match is made, I can find the nutritional values for that ingredient by referencing the USDA ID. But I cannot do that without pinning that ID to that ingredient in the patterns file (or finding some workaround).
But I am getting ahead of myself a little...
The BIG problem right now is that I cannot create a patterns file that works.
And I am stuck, lost at sea.
This first screenshot shows the entire Traceback message.
This second screenshot shows the relevant code I am working with:
Here is also a copy of my current patterns file (there have been many).
I have pored over this for hours and I see no visual difference between this dysfunctional file (which throws the error message above) and the original food_patterns.jsonl that works just fine. Would you please take a look and let me know?
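Since nothing stands out by eye, a programmatic check may be more reliable. The sketch below is only a guess based on the wording of the error: it assumes that some entry ends up with a pattern of zero tokens, for example an empty list or an empty string, and it also flags blank lines and lines that are not valid JSON. The function name `find_empty_patterns` is my own and is not part of Prodigy or spaCy.

```python
import json


def find_empty_patterns(lines):
    """Return (line_number, reason) pairs for entries that look suspicious."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            problems.append((number, "blank line"))
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError as err:
            problems.append((number, f"invalid JSON: {err.msg}"))
            continue
        pattern = entry.get("pattern") if isinstance(entry, dict) else None
        if isinstance(pattern, str):
            # String patterns: flag ones with no visible characters.
            if not pattern.strip():
                problems.append((number, "empty string pattern"))
        elif isinstance(pattern, list):
            # Token patterns: flag lists that contain no token descriptions.
            if len(pattern) == 0:
                problems.append((number, "empty token list"))
        else:
            problems.append((number, "missing or unexpected 'pattern' value"))
    return problems


# Example use against the file on disk:
# with open("food_patterns_rp.jsonl", encoding="utf-8") as f:
#     for number, reason in find_empty_patterns(f):
#         print(number, reason)
```

If this reports nothing for the failing file, then the assumption about empty patterns is probably wrong and the cause lies elsewhere.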
I greatly appreciate your assistance.
food_patterns_rp.jsonl (464.7 KB)