Apply rule pattern only on first sentence in document

Hi there!

A very short question this time: I know the list of available token attributes for rule-based matching (I'm using the entity ruler). I have a rule that should only be applied to the first sentence of the doc.

Looking at the token attributes, I could use the index i or sent.start for this, but these don't seem to be available in rule patterns. For example, I tried

rule = {"label": "TEST", "pattern": [{"I": {"<": 10}}]}

but then I get the error

     83         DOCS:
     84         """
---> 85         matches = list(self.matcher(doc)) + list(self.phrase_matcher(doc))
     86         matches = set(
     87             [(m_id, start, end) for m_id, start, end in matches if start != end]

matcher.pyx in spacy.matcher.matcher.Matcher.__call__()

matcher.pyx in spacy.matcher.matcher.find_matches()

matcher.pyx in spacy.matcher.matcher.transition_states()

matcher.pyx in spacy.matcher.matcher.update_predicate_cache()

matcher.pyx in spacy.matcher.matcher._ComparisonPredicate.__call__()

TypeError: an integer is required

I would like to avoid defining a custom extension attribute just for this if there is an easier way I'm missing.

Thank you for your help!

Unfortunately, this is a gap in spaCy at the moment: there is indeed no attribute for this. We could add one, but in the meantime you'll need to use an extension attribute. I'm surprised this hasn't come up before, actually.
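
For anyone landing here later, here is a rough sketch of what the extension attribute workaround could look like. It assumes spaCy v3 (string names in `nlp.add_pipe`) and a blank English pipeline with the sentencizer for sentence boundaries. The attribute name `in_first_sent` and the `LOWER` pattern are made up for illustration, so adapt them to your own rule.

```python
import spacy
from spacy.tokens import Token

# Hypothetical extension attribute (the name is an assumption, not a built-in):
# True when the token's sentence starts at token index 0, i.e. the token
# belongs to the first sentence of the doc.
Token.set_extension(
    "in_first_sent",
    getter=lambda token: token.sent.start == 0,
    force=True,  # allows re-running this snippet without a "already exists" error
)

nlp = spacy.blank("en")
# Sentence boundaries must be set before the entity ruler runs,
# because the getter relies on token.sent.
nlp.add_pipe("sentencizer")
ruler = nlp.add_pipe("entity_ruler")

# Illustrative pattern: match "hello" only inside the first sentence.
ruler.add_patterns([
    {
        "label": "TEST",
        "pattern": [{"LOWER": "hello", "_": {"in_first_sent": True}}],
    }
])

doc = nlp("Hello world. Hello again.")
first_sentence_hits = [(ent.text, ent.start) for ent in doc.ents]
print(first_sentence_hits)
```

Using a getter means the value is computed when the matcher runs, so no extra pipeline component is needed to set it. If you need a token index threshold instead (like the `i < 10` idea above), the same approach should work with a getter such as `lambda token: token.i < 10`.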

It would be nice if you could include this in the future, but for now I will stick to extension attributes.

Thank you!