Does prodigy treat new line chars as escape sequences when displayed in the annotation tool?

shruthik · December 28, 2023, 9:38pm

Hi. I am testing out different annotation tools for my company's specific use case, and Prodigy caught my eye because it integrates easily with spaCy, a library I use for a lot of different NLP tasks. Before I move ahead with this tool, I have a few questions:

The text that needs to be annotated has a lot of '\n' chars (newlines). Some of the annotation tools I tested did not render them as line breaks, but displayed them as literal chars.
For example:
Raw data:
I went to the grocery store.\nI bought 4 apples.
Text I want displayed on the tool for annotation:
I went to the grocery store.
I bought 4 apples.
Does Prodigy preserve the original format of the raw data, and treat these chars as escape sequences when the text is displayed in the annotation tool?
The raw data needs to come from, and be stored in, a Postgres DB. Is that possible with this tool?
How is the tool installed so that more than one person can use it?
Is there a trial version of the tool?

Thanks!

ryanwesslen · December 29, 2023, 5:48pm

Hi @shruthik,

Thanks for your question and welcome to the Prodigy community!

Check out the docs for details on how Prodigy handles new line characters:

Why does Prodigy add ↵ characters for newlines?

A newline only produces a line break and is otherwise invisible. So if your input data contains many newlines, they can be easy to miss during annotation. To your model however, \n is just a unicode character. If you’re labelling entities and accidentally include newlines in some of them and not others, you’ll end up with inconsistent data and potentially very confusing and bad results. To prevent this, Prodigy will show you a ↵ for each newline in the data (in addition to rendering the newline itself). The tab character \t will be replaced by ⇥, by the way, for the same reason.

As of v1.9, tokens containing only newlines (or only newlines and whitespace) are unselectable by default, so you can’t include them in spans you highlight. To disable this behavior, you can set "allow_newline_highlight": true in your prodigy.json.

As that section also mentions, there is a config setting you can modify to change the default newline highlighting behavior.
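
To make the escape-sequence part of the question concrete, here is a minimal Python sketch. It assumes the input is JSONL with a "text" field; `unescape_literal_newlines` and `preview_with_markers` are hypothetical helpers written for illustration and are not part of Prodigy. The preview function only approximates the behavior described in the docs quote above (a ↵ shown alongside each rendered newline, and ⇥ in place of each tab); it is not how Prodigy implements it.

```python
import json

# One JSONL record as it would appear in a file. Inside the JSON string,
# backslash + n is an escape sequence for a newline.
jsonl_line = r'{"text": "I went to the grocery store.\nI bought 4 apples."}'

# After parsing, the text contains a real newline character (U+000A),
# not the two characters backslash and n.
task = json.loads(jsonl_line)


def unescape_literal_newlines(text: str) -> str:
    """Hypothetical helper: turn a literal backslash + n into a real newline.

    Only needed if the stored text was double-escaped, so that the two
    characters backslash and n survive parsing.
    """
    return text.replace("\\n", "\n")


def preview_with_markers(text: str) -> str:
    """Hypothetical helper: rough approximation of the display described above.

    Adds a visible marker before each newline and replaces each tab
    with a visible marker.
    """
    return text.replace("\n", "↵\n").replace("\t", "⇥")


print(preview_with_markers(task["text"]))
```

If the text in your database holds real newline characters, nothing extra should be needed; the unescape step is only relevant when the backslash and the n were stored as two separate characters.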

Yes. Prodigy can be configured to use a Postgres DB out of the box; see the docs for how to set it up. You may also find the deployment docs useful: they show how to use a Postgres DB from a cloud provider (Digital Ocean in that example, but you can adapt it as you see fit) via Docker.
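
As a rough illustration of the Postgres setup, the sketch below builds the kind of settings that would go into prodigy.json. The "db" and "db_settings" keys follow the layout I recall from the Prodigy database docs, so please check them against the docs for your version. `build_prodigy_config` is a hypothetical helper, and the database name, user, password, host, and port are placeholder assumptions, not real values.

```python
import json


def build_prodigy_config(
    dbname: str,
    user: str,
    password: str,
    host: str = "localhost",
    port: int = 5432,
) -> dict:
    """Hypothetical helper: assemble Postgres settings for prodigy.json.

    The key names are an assumption based on the Prodigy docs; the
    connection values are placeholders to be replaced with your own.
    """
    return {
        "db": "postgresql",
        "db_settings": {
            "postgresql": {
                "dbname": dbname,
                "user": user,
                "password": password,
                "host": host,
                "port": port,
            }
        },
    }


# Placeholder credentials for illustration only.
config = build_prodigy_config("prodigy", "username", "xxx")

# Print the JSON so it can be copied into prodigy.json.
print(json.dumps(config, indent=2))
```

Note that these settings control where Prodigy stores annotations. Reading the raw text out of your own Postgres tables is a separate step that you would script yourself.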

Prodigy is a developer annotation tool, so it's priced by developer seats. Company licenses are sold in packs of five developer seats, which means any five developers may have access to Prodigy's library and/or CLI. You may have unlimited annotators (i.e., people who only have access to the served annotation tool). You may also install it on unlimited machines. Please see the terms for more details.

Prodigy runs entirely on your own hardware and never phones home or connects to our servers. So we typically do trials by hosting a VM that you can log in to. This gives you the full experience of the tool, including the scriptable back-end, and also makes it easy for us to log in and help if you get stuck. If you’re interested, email us at contact@explosion.ai. Please note that we’re only able to offer VM trials to companies and organizations, not individuals.

Hope this helps!

shruthik · January 2, 2024, 1:52pm

This helps a lot. Thanks!
