Thanks @ines - I assumed that was the reason for displaying whitespace characters. In my case, I will have a couple of subject matter experts labeling documents in a format that is familiar to them. The important thing I’m missing with the current rendering is the visual cues from paragraph breaks, bulleted lists and similar formatting. If I either just remove the whitespace or break my training examples down into smaller chunks (e.g. just showing the text between ‘\n\n’ tokens), it will take them much longer to go through the documents we want to label.
For later modeling efforts on this task, there is no semantic difference between ‘\n’ and ‘ ’. It also doesn’t really matter to me if leading or trailing ‘\n\n’ tokens are captured in a span: I can just remove them from the training data or model outputs, since they have no importance to the task at hand.
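To make that concrete, here is a rough sketch of the cleanup I have in mind, assuming annotations come back as character offsets into the original text. The helper names (`trim_span`, `newlines_to_spaces`) and the example document are made up for illustration and are not part of any library.

```python
def trim_span(text, start, end):
    """Shrink the half-open span [start, end) so it carries no
    leading or trailing whitespace (including '\n\n' tokens)."""
    while start < end and text[start].isspace():
        start += 1
    while end > start and text[end - 1].isspace():
        end -= 1
    return start, end


def newlines_to_spaces(text):
    """Replace each newline with a single space. This is a one-for-one
    character swap, so existing character offsets stay valid."""
    return text.replace("\n", " ")


# Hypothetical document and annotation, for illustration only.
text = "Summary\n\n- first item\n- second item\n\nNext paragraph"
start, end = trim_span(text, 7, 37)  # annotator also grabbed the blank lines
print((start, end))                           # (9, 35)
print(newlines_to_spaces(text[start:end]))    # - first item - second item
```

Because the newline replacement keeps the text length unchanged, it could be applied either before or after trimming without shifting any span offsets.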