(Where programming_langs is a text file containing the names of programming languages, one per line.)
I actually haven’t been able to figure out how to convert a list of words to a list of patterns using Prodigy recipes. What’s an example of how to do that?
You can still omit the argument, though, which will print the individual patterns on the command line, so you can pipe them forward or use less to view them:
prodigy terms.to-patterns programming_langs --label PROG_LANG | less
The first argument (programming_langs in this case) should be the name of a dataset containing the terms, not a file. This is because the recipe was originally intended to be used with terminology lists created by terms.teach. If you already have a text file, you'll need to add it to a dataset first with db-in, which is easy because db-in supports the same loaders as the other streaming recipes.
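If you'd rather skip the dataset step, here is a rough Python sketch that builds a patterns file straight from a text file with one term per line. It assumes the patterns file is JSONL, with each line shaped like {"label": ..., "pattern": [{"lower": ...}]}, and it splits multi-word terms on whitespace, which will not always agree with spaCy's tokenization (for example for names containing punctuation). The function names and file paths are made up for illustration and are not part of Prodigy.

```python
import json
from pathlib import Path


def terms_to_patterns(terms, label):
    """Build one token-based match pattern per non-empty term.

    Assumption: each pattern is a dict with a "label" and a "pattern" list
    holding one {"lower": ...} entry per whitespace-separated word.
    """
    patterns = []
    for term in terms:
        term = term.strip()
        if not term:
            continue  # skip blank lines
        tokens = [{"lower": word.lower()} for word in term.split()]
        patterns.append({"label": label, "pattern": tokens})
    return patterns


def write_patterns(in_path, out_path, label):
    """Read one term per line from in_path and write JSONL patterns to out_path."""
    terms = Path(in_path).read_text(encoding="utf-8").splitlines()
    patterns = terms_to_patterns(terms, label)
    with open(out_path, "w", encoding="utf-8") as f:
        for pattern in patterns:
            f.write(json.dumps(pattern) + "\n")
    return len(patterns)


# Hypothetical usage (file names are placeholders):
# write_patterns("programming_langs.txt", "prog_lang_patterns.jsonl", "PROG_LANG")
```

It's worth comparing a few lines of the result against what terms.to-patterns prints before relying on it.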
This usually indicates that the file isn't encoded as valid UTF-8 (Unicode). Could you try explicitly changing the encoding to UTF-8? For example, from the command line using a tool like iconv, or in your text editor (e.g. in Visual Studio Code: Change encoding > UTF-8).
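If you don't have iconv handy, the same check and conversion can be sketched in Python. The source encoding below (latin-1) is only a guess and the helper names are made up for illustration; you'd need to substitute whatever encoding the file was actually saved in.

```python
from pathlib import Path


def is_valid_utf8(path):
    """Return True if the file's bytes decode cleanly as UTF-8."""
    try:
        Path(path).read_bytes().decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def reencode_to_utf8(in_path, out_path, source_encoding="latin-1"):
    """Decode in_path using source_encoding (an assumption) and write it back as UTF-8."""
    text = Path(in_path).read_bytes().decode(source_encoding)
    Path(out_path).write_bytes(text.encode("utf-8"))


# Hypothetical usage (file names are placeholders):
# if not is_valid_utf8("programming_langs.txt"):
#     reencode_to_utf8("programming_langs.txt", "programming_langs_utf8.txt")
```

Note that decoding with the wrong source encoding can succeed silently and produce garbled characters, so it's worth opening the output file and checking any non-ASCII names.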