Yes, the pattern files support the same syntax as spaCy’s rule-based Matcher, so you can definitely write “smarter” token patterns. For example, here’s a pattern that matches the case-insensitive tokens “apple” and “iphone”, and an optional number token like “11”:
There’s a lot more you can do with token attributes here. If you’re starting off with a pre-trained model, you could also use part-of-speech tags. For example, only match “love” if it’s used as a verb and not as a noun.
Important note: The rule-based matching docs also describe some new features like the extended pattern syntax that are only available in spaCy v2.1. Those are also marked with a little “2.1” tag. You’ll be able to use those once the new version of Prodigy for spaCy v2.1 is available – see here for details.
Just keep in mind that the current version of Prodigy isn’t compatible with spaCy v2.1 yet – we want to make sure the new version is 100% stable and tested by everyone before we ask Prodigy users to retrain their models. See here for details: