In order to allow faster annotation, the manual interface pre-tokenizes the text (so your selection can snap to the token boundaries). This means that single whitespace is used for splitting the words, e.g. "Hello\nworld" will become the tokens "Hello", "\n" and "world".
Additional whitespace will be preserved, though. The manual NER interface should then replace \n with a ↵ character, to give you a visual indicator of the line break. The reason it works like that in manual mode (as opposed to just rendering a line break like in the other interfaces) is that you need a way of annotating the whitespace. Whitespace is important, because it can have an effect on the model – and the UI also needs to allow highlighting line break characters (which is very difficult if there’s no visual indicator).
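To make the idea concrete, here's a rough sketch of what "a line break as its own token, shown as ↵" can look like. This is not the tokenizer the interface actually uses: the regex and the function names `rough_tokens` and `display_tokens` are just assumptions for illustration.

```python
import re

def rough_tokens(text):
    # Simplified stand-in for a real tokenizer (an assumption, not the
    # actual implementation): every newline is its own token, and any
    # other run of non-whitespace characters is a token.
    return re.findall(r"\n|\S+", text)

def display_tokens(tokens):
    # Swap newline tokens for a visible marker so they can be seen
    # (and highlighted) in a UI.
    return ["↵" if token == "\n" else token for token in tokens]

tokens = rough_tokens("Hello\nworld")
shown = display_tokens(tokens)
print(tokens)
print(shown)
```

The point is only that the newline survives as a separate unit you can select, rather than disappearing into the layout.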
Another thing to consider is that the manual interfaces (and pretty much all others) are really designed for shorter texts that you can focus on and work through quickly. So you might want to try adding more pre-processing to your contracts, and split them into paragraphs or even shorter units like sentences. This will also give you more “checkpoints” and save intermediate progress faster.
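As a minimal sketch of that pre-processing step, something like the following could split one long contract into paragraph-sized tasks and write them out as JSONL. It assumes paragraphs are separated by blank lines, which may not hold for your data. The helper names, the `doc_id` and `paragraph` meta fields and the example text are all made up for illustration.

```python
import json
import re

def split_into_paragraphs(text):
    # Assumption: a blank line (optionally containing spaces) separates paragraphs.
    parts = re.split(r"\n\s*\n", text)
    return [part.strip() for part in parts if part.strip()]

def contract_to_tasks(doc_id, text):
    # One task per paragraph; the meta fields are hypothetical and only
    # there so you can trace a paragraph back to its source document.
    for index, paragraph in enumerate(split_into_paragraphs(text)):
        yield {"text": paragraph, "meta": {"doc_id": doc_id, "paragraph": index}}

contract = (
    "1. Parties\nThis agreement is between A and B.\n\n"
    "2. Term\nThe term is two years."
)
tasks = list(contract_to_tasks("contract-001", contract))

# One JSON object per line, ready to be saved as a .jsonl file
jsonl = "\n".join(json.dumps(task) for task in tasks)
print(jsonl)
```

If paragraphs are still too long, the same pattern works with a sentence splitter in place of `split_into_paragraphs`.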
If you feel like you need the entire context of the contract to annotate the entities, it will actually be very difficult for your model to learn anything meaningful later on. The model is able to pick up on local context very well – but if it’s difficult for you, the human, to make the annotation decision based on the local context, it will be near impossible for the model to generalise any of those decisions.