newlines in relations annotation

AlejandroJCR · October 14, 2020, 3:26pm

Hello,
First of all, thank you very much for all the new features.

In our case we try to extract custom relations between entities that can be found in long sections of text.

The text may contain several paragraphs separated by one or more newlines. Currently, when we use rel.manual with the --wrap parameter over the entities that we had already annotated in ner.manual, the information about the layout is lost. This makes it a bit more complicated to annotate the relationships.

Would it be possible to visualize the line breaks in rel.manual, similar to how it is done in ner.manual?

Would it also be possible to customize the size of the space between lines when we use this interface?

Thank you very much for the great work, as always! I'm very excited to experiment with all the new recipes.

ines · October 15, 2020, 9:36am

Thanks, glad to hear the new features are useful

Oh, I totally thought it already did that. Let me look into this; it's definitely something we want and it should be easy to add. So I'll put this on my list for the next release.

Yes, there are two theme settings you can adjust: relationHeight (maximum height if line wrapping is disabled) and relationHeightWrap (maximum height with line wrapping). Also see here: Web Application · Prodigy · An annotation tool for AI, Machine Learning & NLP
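
For reference, a minimal sketch of how these two settings might be supplied, assuming they go under the custom_theme key of prodigy.json. The numeric values are placeholders chosen for illustration, not Prodigy defaults, and the snippet only prints the JSON to merge into your own config.

```python
import json

# Placeholder values for illustration only (not Prodigy defaults).
custom_theme = {
    "relationHeight": 150,     # maximum height when line wrapping is disabled
    "relationHeightWrap": 60,  # maximum height when line wrapping is enabled
}

# Assumption: theme overrides live under "custom_theme" in prodigy.json.
config = {"custom_theme": custom_theme}

# Print the fragment so it can be merged into an existing prodigy.json.
print(json.dumps(config, indent=2))
```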

AlejandroJCR · October 16, 2020, 8:02am

Thank you for the quick response! I just tried the relationHeightWrap and relationHeight settings, and both worked well for customizing the line spacing.

hbredin · October 22, 2020, 12:10pm

This is also a feature I'd like to use.

Here is what a simple "hello\nworld" looks like in "ner.manual" view:

and what the same "hello\nworld" looks like in "relations" view:

Cheers,

Hervé.

ines · November 11, 2020, 9:44am

Update: Just released v1.10.5, which now correctly uses the symbols for newlines and tabs in the relations UI

hbredin · November 11, 2020, 10:49am

Awesome, thanks!

AlejandroJCR · November 11, 2020, 11:29am

Hi @ines, I just updated to v1.10.5 and newlines are now displayed correctly with the ↵ character, thanks a lot! However, it would be very useful if the associated line breaks were also rendered in the interface layout. Would it be possible to reproduce behaviour similar to the hide_newlines NER config option in this interface?

hbredin · November 13, 2020, 1:23pm

I just checked and, indeed, new lines now appear as ↵ in relations. Thanks.

However, I confirm what @AlejandroJCR describes:
no actual line break is added.

It just shows

hello  ↵  world

instead of

hello  ↵  
world

even when hide_newlines is set to False.

david-waterworth · November 14, 2020, 11:21pm

This is similar to what I experienced with the ner_manual view. If I tokenised \n\t as a single token, it displayed the control character symbols for both and added the newline (i.e. the text was split over 2 lines), but it didn't indent the next line. If I tokenised it as two tokens, \n and \t, it seemed to work.
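
A rough sketch of the second approach, splitting a combined whitespace token into one token per character. It assumes tokens are dicts with "text", "start", "end", "id" and "ws" keys; the helper name split_whitespace_tokens is hypothetical and not part of Prodigy.

```python
def split_whitespace_tokens(tokens):
    """Split tokens consisting only of newlines/tabs into one token per character."""
    result = []
    for token in tokens:
        text = token["text"]
        if len(text) > 1 and all(ch in "\n\t" for ch in text):
            for offset, ch in enumerate(text):
                start = token["start"] + offset
                is_last = offset == len(text) - 1
                result.append({
                    "text": ch,
                    "start": start,
                    "end": start + 1,
                    # Only the last piece keeps the original trailing-whitespace flag.
                    "ws": token.get("ws", False) if is_last else False,
                })
        else:
            result.append(dict(token))
    # Renumber token ids so they stay consecutive after the split.
    for i, token in enumerate(result):
        token["id"] = i
    return result


tokens = [
    {"text": "hello", "start": 0, "end": 5, "id": 0, "ws": False},
    {"text": "\n\t", "start": 5, "end": 7, "id": 1, "ws": False},
    {"text": "world", "start": 7, "end": 12, "id": 2, "ws": False},
]
new_tokens = split_whitespace_tokens(tokens)
```

Any spans that refer to token indices would need to be remapped after the ids change; the sketch does not do that.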

ines · November 16, 2020, 1:09am

Yes, that's currently expected – the relations UI just adds the symbols at the moment, it doesn't add actual line breaks. It should be possible to add line breaks for newline-only tokens and re-render the tokens accordingly, but I haven't looked into that yet. (Not sure how to solve newlines within tokens/spans in the relations UI, though, that's going to be quite difficult to visualise.)

The idea of the tab symbol is to have it replace the actual \t (like it's done in Word etc.)

niek · January 3, 2021, 5:02pm

Hi Ines,

I'm facing the same challenge with my relation extraction annotation step: it turns out to be very hard to interpret my texts without the proper line breaks that I was used to during the NER tagging step.

You said that "It should be possible to add line breaks for newline-only tokens and re-render the tokens accordingly", but that's not something we as end users can do by tweaking JavaScript or CSS in the custom_theme configs, right?

Is this something that you may consider adding in a future version of Prodigy?

If you have any suggestions for alternative solutions using custom recipes, UIs etc., I would be happy to hear them too.

Many thanks

niek · January 12, 2021, 2:08pm

The closest I came to a workable solution is to pad all newlines with enough spaces so the token takes up the whole width:

I would also need to recalculate the indexes of the pre-annotated entities once before and once after the annotation step.
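
A minimal sketch of what that index bookkeeping could look like, assuming character-offset spans with "start" and "end" keys. The helper names pad_newlines and unpad_spans and the width parameter are hypothetical; the exact padding used in the workaround above is not shown in this thread, and tokens would have to be regenerated for the padded text.

```python
def pad_newlines(text, spans, width=80):
    """Follow every newline with `width` spaces and shift span offsets to match."""
    newline_positions = [i for i, ch in enumerate(text) if ch == "\n"]
    padded_text = text.replace("\n", "\n" + " " * width)

    def shift(offset):
        # Each newline strictly before the offset adds `width` characters.
        n = sum(1 for pos in newline_positions if pos < offset)
        return offset + n * width

    padded_spans = [
        {**span, "start": shift(span["start"]), "end": shift(span["end"])}
        for span in spans
    ]
    return padded_text, padded_spans


def unpad_spans(original_text, padded_spans, width=80):
    """Map span offsets from the padded text back onto the original text."""
    newline_positions = [i for i, ch in enumerate(original_text) if ch == "\n"]

    def unshift(offset):
        n = 0
        for k, pos in enumerate(newline_positions):
            # The k-th newline sits at pos + k * width in the padded text.
            if pos + k * width < offset:
                n = k + 1
        return offset - n * width

    return [
        {**span, "start": unshift(span["start"]), "end": unshift(span["end"])}
        for span in padded_spans
    ]


text = "hello\nworld"
spans = [{"start": 6, "end": 11, "label": "EXAMPLE"}]
padded_text, padded_spans = pad_newlines(text, spans, width=10)
restored_spans = unpad_spans(text, padded_spans, width=10)
```

The sketch assumes no span starts or ends inside the padding itself; spans created there during annotation would need separate handling.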

@ines could you give some insight into whether this issue could be solved in Prodigy in the short term? Maybe I can wait (and avoid the above workaround) if the timing permits.

Thx

ines · January 13, 2021, 5:32am

I can definitely implement the "true" newlines for the future, that should be no problem. It just needs some experimentation, because this UI is a bit more complex than the others and canvas-based. (The relations UI was originally designed without newlines in mind, and I later added the icons for invisible characters, so you're not just shown empty tokens.)

niek · January 13, 2021, 8:33am

Cool, thanks in advance! I will keep an eye on this topic then.

julietteBergoend · January 18, 2021, 9:01am

Hello Ines,
I'm facing the same problem as @AlejandroJCR and @hbredin in the "relations" view. I am currently trying to annotate dialogues, and it would be easier to have a separate line per speaker.
I'll keep an eye on this topic too, thank you.

1danjordan · January 28, 2021, 5:29pm

I'm also interested in having newlines in the relationship interface!

ines · February 14, 2021, 10:56pm

Ah, almost forgot to update this thread: We released v1.10.6 yesterday, which includes support for "real" newlines in the relations UI. The newlines are added if wrapping is enabled and they currently collapse if you disable line wrapping.

AlejandroJCR · March 3, 2021, 12:53pm

Hey everyone, thank you very much for adding this new feature. It correctly displays the true newline tokens for each annotation. However, I have noticed that after extending the token limit, I can no longer render long documents that span several paragraphs: the browser throws an error or freezes. This used to work in previous versions.

ines · March 4, 2021, 12:20am

Hmm, that's interesting. I don't see how this could be related, but then again, I don't think anything else changed in the interface. How do you have it configured? Are you using line wrapping or not? And how long are your documents (number of tokens)?

AlejandroJCR · March 4, 2021, 9:32am

Yes, I'm using the wrapping option. My documents are around 500-1500 tokens, with lengths roughly normally distributed, so token_limits > 750 are not working for now.

In case it's useful: I already have a model in the loop that makes predictions for each task, so the tasks come pre-populated. My use case has 21 relatable entities with 9 relations, so these are heavily populated tasks. Could this cause performance issues in the browser?
