For some values of n and m, document.char_span(n, m) returns None. Will it only return a Span object if the character offsets coincide with token boundaries? That seems to be what is happening, but I don’t see this mentioned in the documentation.
Yes, that’s correct. Doc.char_span returns None if the character indices don’t map to a valid span, which is what happens when an offset falls inside a token rather than on a token boundary. I’ve just updated the documentation to make this clearer. Thanks!
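To make the alignment rule concrete, here is a minimal pure-Python sketch of the check. It is not spaCy's implementation: the helpers token_offsets and char_span are hypothetical names written for illustration, and splitting on whitespace is an assumption standing in for a real tokenizer. The function returns a pair of token indices when both character offsets sit on token boundaries and None otherwise.

```python
import re
from typing import List, Optional, Tuple


def token_offsets(text: str) -> List[Tuple[int, int]]:
    """Return (start, end) character offsets for each token.

    Assumption: tokens are simply whitespace-separated chunks.
    """
    return [(m.start(), m.end()) for m in re.finditer(r"\S+", text)]


def char_span(text: str, start: int, end: int) -> Optional[Tuple[int, int]]:
    """Map character offsets to a (first_token, end_token_exclusive) pair.

    Hypothetical helper: returns None when either offset falls inside
    a token instead of on a token boundary.
    """
    offsets = token_offsets(text)
    starts = {s: i for i, (s, _) in enumerate(offsets)}
    ends = {e: i + 1 for i, (_, e) in enumerate(offsets)}
    if start not in starts or end not in ends:
        return None
    if starts[start] >= ends[end]:
        # The end offset lies before the start offset: not a valid span.
        return None
    return starts[start], ends[end]


text = "Hello brave new world"
aligned = char_span(text, 6, 15)     # covers "brave new": both offsets on boundaries
misaligned = char_span(text, 6, 14)  # end offset falls inside "new"
print(aligned, misaligned)
```

With spaCy itself, the practical consequence is the same: check the result of doc.char_span(n, m) for None before using it as a Span.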
In the future, I think a better place to report spaCy-only problems like this one is the spaCy issue tracker or, for usage questions, StackOverflow. That way, more people will see it, and if something turns out to be a bug, it’ll be easier for us to track the changes on GitHub.
Understood, I’ll post spaCy-only questions on StackOverflow with the tag spacy. I already have a new one up there.