ValueError: [E012] Cannot add pattern for zero tokens to matcher.

EDIT: I still don't know what I did wrong, or why the patterns file was not working. But I did eventually figure out how to solve the problem. I ran the dataset through Prodigy again and accepted all of the entries. Then, I used the terms.to-patterns recipe to export it to another patterns file. It is working great so far.

Still, two or three questions remain, if you have a moment.

  1. I could find only a few references to that error message online and nothing that explained what it meant. What does it mean? What would cause it?

  2. Is it possible to include or "embed" the USDA seven-digit ID number for each ingredient in the patterns file?

  3. I'm thinking the answer is a "no." As a follow-up, do you have any thoughts on how I could link that ID, which points to the nutritional breakdown of each ingredient, to its counterpart in the patterns file?

(I currently have it all linked in a large CSV file.)

Thanks. :slight_smile:

Original Post:

Hello.

I received this error message. Unfortunately, Google shows no matches for this phrase: "Cannot add pattern for zero tokens to matcher". When I searched for help here, I could not find any reference to it either.

Could you help explain it to me? And how can I fix it?

If I run my code using your food_patterns.jsonl file, it works fine. If I run it with my food_patterns_rp file (see below), I receive the error message as shared in the screenshot below, despite no visual differences that I can detect.

I worked on the problem all weekend, but no luck so far.

As I mentioned in a previous post, the purpose of my app is to read a culinary recipe as a block of text and return a nutritional profile of that recipe. I studied your related video for initial guidance (Training a NAMED ENTITY RECOGNITION MODEL with Prodigy and Transfer Learning - YouTube) and modeled my patterns file after the food_patterns.jsonl file at this link:

There was one exception, but I removed it.

Because I need to retrieve the USDA nutritional breakdown for each ingredient, I had amended the food_patterns.jsonl format to include the corresponding USDA ID for that ingredient in my patterns file. I need SOME way to reference it, after it matches an ingredient on file to one in the recipe.

I hope that makes sense. The app reads the recipe. It finds an ingredient that matches one in the patterns file. Once that match is made, I can find the nutritional values for that ingredient by referencing the USDA ID. But I cannot do that without pinning that ID to that ingredient in the patterns file (or finding some workaround).

But I am getting ahead of myself a little...

The BIG problem right now is that I cannot create a patterns file that works.

And I am stuck, lost at sea.

This first screenshot shows the entire traceback message.

This second screenshot shows the relevant code I am working with:

Here is also a copy of my current patterns file (there have been many).

I have pored over this for hours and I see no visual difference between this dysfunctional file (which throws the error message above) and the original food_patterns.jsonl, which works just fine. Would you please take a look and let me know?

I greatly appreciate your assistance.

Thanks again.

food_patterns_rp.jsonl (464.7 KB)


Sorry you had so much trouble with this. Looking at your patterns file, you have several lines like this:

{"label": "INGRED", "pattern": []}

The pattern is empty, so there is nothing for the matcher to match. That's what a "pattern for zero tokens" is. I guess the script you use to generate your patterns has an issue somewhere. If you remove those lines, your file should work.
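If it helps, here is a minimal sketch of how those lines could be filtered out. It only uses the standard library and assumes each line of the file is a JSON object with a "pattern" key, as in the example above. The function name `drop_empty_patterns` and the sample entries are made up for illustration; in practice you would read the lines from your own patterns file.

```python
import json


def drop_empty_patterns(lines):
    """Return the JSONL lines whose "pattern" value is non-empty."""
    kept = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        entry = json.loads(line)
        # [] and "" are both falsy, so entries with an empty
        # (or missing) pattern are dropped here.
        if entry.get("pattern"):
            kept.append(line)
    return kept


# Hypothetical sample standing in for the contents of a patterns file.
sample = [
    '{"label": "INGRED", "pattern": [{"lower": "olive"}, {"lower": "oil"}]}',
    '{"label": "INGRED", "pattern": []}',
    '{"label": "INGRED", "pattern": [{"lower": "basil"}]}',
]

cleaned = drop_empty_patterns(sample)
print(len(cleaned), "of", len(sample), "patterns kept")
```

Writing the kept lines back out, one per line, should give a file the matcher accepts, as long as empty patterns were the only problem.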

As to embedding the IDs for things, there's a feature in the EntityRuler for that. You can check the docs for details, but basically you just add an ID field:

{"label":"INGRED", "pattern":[...], "id": "1234567"}

and that will be available on any matches of that specific pattern.
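As a rough sketch of how that looks in code, assuming spaCy v3 (in v2 the ruler is created and added to the pipeline differently) and a blank English pipeline: the "olive oil" pattern and the ID "1234567" are placeholders, not values from your file.

```python
import spacy

# A blank pipeline only tokenizes, so no trained model is needed.
nlp = spacy.blank("en")

# spaCy v3 style: add the entity ruler by its registered name.
ruler = nlp.add_pipe("entity_ruler")
ruler.add_patterns([
    {
        "label": "INGRED",
        "pattern": [{"lower": "olive"}, {"lower": "oil"}],
        "id": "1234567",  # placeholder ID for illustration
    },
])

doc = nlp("Add two tablespoons of olive oil.")

# ent_id_ carries the "id" of the pattern that produced the match.
matches = [(ent.text, ent.label_, ent.ent_id_) for ent in doc.ents]
print(matches)
```

The value in `ent.ent_id_` can then be used as the lookup key into whatever holds the nutritional data, such as the rows of a CSV file.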


Oh, please don't worry about it. I am having a good time. I know I am flying blind and that means I will make mistakes. And it will take me more time in the beginning.

It's a great tool, making it possible for me to take on projects otherwise beyond my grasp.

Thanks for your response.
