Hi, I have a data set of recipes. I'm trying to add mutually exclusive categories to each recipe. To start, I tried to identify just desserts.
When I ran that experiment and then tested the model in spaCy, I got really great results.
Now I'm trying the experiment again, this time categorizing recipes as either Desserts or Cocktails. When I run the model against my test data, I see that each entry has both Desserts and Cocktails.
I see some references to an --exclusive flag in textcat.batch-train, but I'm using Prodigy 1.9 and was directed by the application to use "train textcat" instead. That command doesn't have an -E flag.
My workflow looks like this:
- Build a seed list for desserts, following your example in the Insults video.
- Export that seed list as a patterns file.
- Build a seed list for cocktails.
- Export that seed list as a patterns file.
- Merge the two patterns files.
- Run textcat.teach against a sample dataset using the merged patterns file.
- Train the model.
- Run spaCy test code against a second dataset.
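The merge step above can be sketched roughly as follows. This is a minimal sketch, assuming each patterns file is JSONL with one JSON object per line (for example an object with "label" and "pattern" keys); the function name merge_patterns and any file names passed to it are hypothetical, not part of Prodigy.

```python
import json
from pathlib import Path


def merge_patterns(paths, out_path):
    """Concatenate JSONL patterns files, skipping blank lines and exact duplicates.

    Hypothetical helper: assumes one JSON object per line in every input file.
    """
    seen = set()
    merged = []
    for path in paths:
        for line in Path(path).read_text(encoding="utf-8").splitlines():
            line = line.strip()
            if not line:
                continue
            entry = json.loads(line)  # raises on a malformed line
            key = json.dumps(entry, sort_keys=True)  # canonical form for de-duplication
            if key not in seen:
                seen.add(key)
                merged.append(entry)
    # Write the merged entries back out, one JSON object per line.
    Path(out_path).write_text(
        "".join(json.dumps(entry) + "\n" for entry in merged),
        encoding="utf-8",
    )
    return merged
```

Each entry keeps its own label, so the merged file still distinguishes dessert patterns from cocktail patterns.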
An example output from my test code:
slow cooker ground beef stroganoff slow cooker gro,{'DESSERTS': 0.0001183362037409097, 'COCKTAILS': 4.539786823443137e-05}
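For reading output like this, here is a minimal sketch of turning the per-label scores into at most one category. The helper name top_category and the 0.5 threshold are assumptions for illustration only; it takes the highest-scoring label and returns None when even that score is below the threshold.

```python
def top_category(cats, threshold=0.5):
    """Return the highest-scoring label, or None if no score reaches the threshold.

    Hypothetical helper: `cats` is a dict of label -> score, like the one printed above.
    """
    if not cats:
        return None
    label, score = max(cats.items(), key=lambda item: item[1])
    return label if score >= threshold else None


# Scores copied from the example output above.
scores = {"DESSERTS": 0.0001183362037409097, "COCKTAILS": 4.539786823443137e-05}
print(top_category(scores))  # None: both scores are far below 0.5
```

Read this way, the stroganoff entry carries a score for both labels but would be assigned neither category.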
So, a couple of questions.
- What should I be doing differently?
- Is there a more efficient workflow?