"As of v1.9, tokens containing only newlines (or only newlines and whitespace) are unselectable by default, so you can’t include them in spans you highlight. To disable this behavior, you can set "allow_newline_highlight": true in your prodigy.json."
The actual behaviour is not what I'd expect. I noticed in one of my projects (labelled by someone else) that a number of labelled entities had newlines and whitespace at the end. I did some testing with a very simple example: a text with two words separated by a combination of spaces and a newline. In the ner.manual view, the newline is correctly greyed out, and I can't select it on its own. I can, however, select an adjacent word along with the whitespace/newline, resulting in an entity span that ends with whitespace.
My expectation for default behaviour is that selecting only whitespace, or an entity span beginning or ending with a whitespace token, should not be possible. Selecting an entity that spans a newline token should probably be possible, although it suggests poor formatting of the input text.
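In case it helps anyone who already has annotations like this in a dataset, here is a minimal sketch of how the affected spans could be cleaned up after the fact. It assumes the usual Prodigy task layout ("text", "spans" with character offsets "start"/"end", and optionally "tokens" with "id", "start" and "end"). The function name trim_span_whitespace is made up for illustration and is not part of Prodigy.

```python
def trim_span_whitespace(example):
    """Return a copy of a task dict whose spans no longer begin or end with whitespace.

    Hypothetical helper, not a Prodigy API. Spans that consist only of
    whitespace are dropped entirely.
    """
    text = example["text"]
    tokens = example.get("tokens", [])
    cleaned = []
    for span in example.get("spans", []):
        start, end = span["start"], span["end"]
        # Move the start forward past leading whitespace (including newlines).
        while start < end and text[start].isspace():
            start += 1
        # Move the end backward past trailing whitespace.
        while end > start and text[end - 1].isspace():
            end -= 1
        if start == end:
            # Nothing but whitespace was selected.
            continue
        new_span = dict(span, start=start, end=end)
        if tokens and "token_start" in span:
            # Recompute the token indices from the tokens fully inside the new offsets.
            inside = [t["id"] for t in tokens if t["start"] >= start and t["end"] <= end]
            if inside:
                new_span["token_start"] = inside[0]
                new_span["token_end"] = inside[-1]
        cleaned.append(new_span)
    return dict(example, spans=cleaned)
```

Applied to every example in an exported dataset, this would leave correctly selected spans untouched and only shrink or drop the ones that picked up surrounding whitespace. It is a workaround for existing data and does not change what the interface allows you to select.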
Could you share the text example that you've used? I'd like to make sure I can reproduce what you're experiencing locally, but I wasn't able to with some texts that I generated.
and the command prodigy ner.manual test blank:en example.jsonl --label ENT with prodigy==1.11.10 and spacy==3.5.0. I am able to select "has \n " or " \n lots" as an entity.
In my initial setup, I've been using Prodigy v1.11.7 and spaCy v3.4.1.
When I run your example, this is what the interface looks like:
I'm unable to select the newline and it seems like everything is working as expected.
I then tried again with Prodigy v1.11.10 and spaCy v3.5 and got the same results.
This is making me wonder if there's perhaps a global ~/.prodigy/prodigy.json file around that might have the "allow_newline_highlight": true setting. Could you verify that?
If not, is there something "special" about how you're making your selection? What browser/operating system are you using?
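For the config check mentioned above, a quick way to see what is set is a small script along these lines. This is only a sketch: it assumes the settings live in ~/.prodigy/prodigy.json and/or a prodigy.json in the current working directory, and the function name newline_highlight_settings is made up for illustration. If your Prodigy home directory is somewhere else, pass the paths explicitly.

```python
import json
from pathlib import Path


def newline_highlight_settings(paths=None):
    """Report the "allow_newline_highlight" value found in each existing config file.

    Hypothetical helper, not a Prodigy API. Returns a dict mapping the file
    path to the value of the setting, or None if the file does not set it.
    """
    if paths is None:
        # Assumed default locations: the global config and a local one.
        paths = [
            Path.home() / ".prodigy" / "prodigy.json",
            Path.cwd() / "prodigy.json",
        ]
    found = {}
    for path in paths:
        path = Path(path)
        if not path.is_file():
            # Skip locations where no config file exists.
            continue
        with path.open(encoding="utf8") as f:
            config = json.load(f)
        found[str(path)] = config.get("allow_newline_highlight")
    return found
```

Calling newline_highlight_settings() and printing the result shows, per file, whether the setting is true, false, or not present at all (None). If any file reports true, that would explain why whitespace tokens are selectable.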
I forgot to ping back, but this issue should have been solved in v1.12.0 as one of the many bugfixes. I figured I'd check: does this issue still persist?