Can NER or Text Classification help with basic numerical conversions?

For example, I used Prodigy NER to tag units of measurement used in American cooking, such as cups, teaspoons, and tablespoons. As shown in the patterns file lines below, I included the weight in grams for each unit. I need to convert each unit to grams to ultimately produce a nutritional profile of the entire recipe.

How can I, for example, use Prodigy to replace one cup with 128 grams? I mean, can the patterns file be used to replace words with their corresponding amounts? If not, is there another way Prodigy can assist with developing the Python dictionaries needed to perform these conversions?

Example lines from the patterns file I refer to in the question:

`{"label": "UMS", "pattern": [{"lower": "cup"}], "grams": "128"}`

`{"label": "UMS", "pattern": [{"lower": "tablespoon"}], "grams": "14.3"}`

`{"label": "UMS", "pattern": [{"lower": "quart"}], "grams": "946"}`

I was able to answer this question myself, and it's all very easy to do.

Just replace the key "grams" in the patterns file above with "id", and you can access these amounts with a simple list comprehension:

`grams = [ent.ent_id_ for ent in doc.ents if ent.label_ == 'UMS' and ent.ent_id_]`

UMS is the NER tag I use for units of measurement used in cooking.
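For anyone who wants to try this end to end, here is a minimal sketch. It assumes the patterns are loaded into spaCy's entity ruler on a blank English pipeline (my assumption for a self-contained example, not necessarily how your own pipeline is set up), and the example sentence is made up. Note that the "id" values come back as strings, so they need converting before you do any arithmetic with them.

```python
import spacy

# Blank English pipeline: tokenizer only, so no trained model is needed.
nlp = spacy.blank("en")
ruler = nlp.add_pipe("entity_ruler")

# Same patterns as in the question, with "grams" renamed to "id".
ruler.add_patterns([
    {"label": "UMS", "pattern": [{"lower": "cup"}], "id": "128"},
    {"label": "UMS", "pattern": [{"lower": "tablespoon"}], "id": "14.3"},
    {"label": "UMS", "pattern": [{"lower": "quart"}], "id": "946"},
])

# Made-up example sentence for illustration.
doc = nlp("Whisk 1 cup of milk with 1 tablespoon of sugar.")

# ent_id_ holds the pattern's "id" as a string, so convert it to a float.
grams = [float(ent.ent_id_) for ent in doc.ents if ent.label_ == "UMS" and ent.ent_id_]
print(grams)
```

These patterns only match the exact lowercase token, so "cups" would need its own pattern (or a "lemma" pattern in a pipeline that has a lemmatizer).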

If an entity's tag is UMS, this list contains its value in grams (for water). Fortunately, the USDA publishes a spreadsheet of density ratios, so I can adjust the weight depending on the ingredient. For example, olive oil has a density ratio of 0.9 because it is 10% lighter than water, so I can multiply the gram value by 0.9 if the ingredient is olive oil.
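The density adjustment could be sketched roughly like this. The `DENSITY_RATIOS` dictionary, the `adjusted_grams` helper and its `quantity` parameter are hypothetical names for illustration. Only the olive oil ratio comes from this post, and the rest would be filled in from the USDA data.

```python
# Hypothetical lookup table of density ratios relative to water.
# Only the olive oil value is taken from the post above.
DENSITY_RATIOS = {"olive oil": 0.9}


def adjusted_grams(unit_grams, quantity, ingredient):
    """Scale a unit's gram weight by the quantity and the ingredient's density ratio."""
    # Ingredients missing from the table are treated like water (ratio 1.0).
    ratio = DENSITY_RATIOS.get(ingredient, 1.0)
    return unit_grams * quantity * ratio


# One cup (128 grams in the patterns file) of olive oil.
olive_oil_cup = adjusted_grams(128.0, 1, "olive oil")
print(round(olive_oil_cup, 1))  # 115.2
```

Falling back to a ratio of 1.0 keeps the conversion working for ingredients that are not in the table yet, at the cost of a less accurate weight.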

Obscure question. But one I could answer.
