Marking sentences for classification

spatiebalk · April 21, 2020, 2:38pm

I am currently using Prodigy to classify short texts. However, I would like to classify these short texts sentence by sentence. Instead of viewing and classifying each sentence separately, I was wondering whether it is possible to view text snippet by text snippet and then highlight the positive sentences in that current text snippet (in the same way that you can highlight tokens when you are training a NER model)?

ines · April 21, 2020, 3:51pm

Hi! There are several ways you could solve this and I think it's definitely a good idea to focus on making each sentence a selectable unit (and not highlighting each sentence token-by-token, which would be pretty inefficient).

One simple approach would be to use the choice interface with "choice_style": "multiple" and create one option per sentence, maybe grouped by paragraph or some other sensible unit. Then you can go through them and select all positive sentences, either by clicking on them, or using keyboard shortcuts.
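As a rough sketch of that first approach, the snippet below turns one text into a choice task with one option per sentence. The regex-based `split_sentences` helper and the `make_choice_task` name are made up for illustration (a proper sentence segmenter would likely do better), and keeping the full text in the task as context is just one possible choice:

```python
import re

def split_sentences(text):
    """Naive splitter (an assumption for this sketch): break on whitespace
    that follows '.', '!' or '?'."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def make_choice_task(text):
    """Build one task dict with one selectable option per sentence."""
    options = [{"id": i, "text": sent} for i, sent in enumerate(split_sentences(text))]
    # "text" keeps the full snippet visible as context; drop it if redundant.
    return {"text": text, "options": options}

# Setting mentioned above, so that several sentences can be selected per task.
config = {"choice_style": "multiple"}

task = make_choice_task("The food was great. The service was slow. I would go back!")
```

The selected option IDs then tell you which sentences were marked positive.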

Alternatively, a similar use case was posted on the forum a while ago, and they ended up using the manual span interface with data that has a "tokens" property in which each "token" is a whole sentence. This would let you view the sentences in their natural flow, and you could double-click on them to select them, and even assign them different labels. See here for details:
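This is not the code from that thread, just a minimal sketch of what "one sentence as a token" could look like. The `sentence_tokens` helper and its regex are assumptions for illustration; the token keys (`text`, `start`, `end`, `id`, `ws`) follow the usual pre-tokenized task format:

```python
import re

# Hypothetical pattern: start at a non-space character and run lazily up to
# sentence-final punctuation followed by whitespace, or to the end of the text.
SENTENCE_RE = re.compile(r"\S.*?(?:[.!?]+(?=\s|$)|$)", flags=re.S)

def sentence_tokens(text):
    """Return one 'token' per sentence, with character offsets into text."""
    tokens = []
    for i, match in enumerate(SENTENCE_RE.finditer(text)):
        end = match.end()
        tokens.append({
            "text": match.group(),
            "start": match.start(),
            "end": end,
            "id": i,
            # True if the sentence is followed by whitespace in the original text
            "ws": end < len(text) and text[end].isspace(),
        })
    return tokens

text = "The food was great. The service was slow. I would go back!"
task = {"text": text, "tokens": sentence_tokens(text)}
```

Because the offsets point back into the original text, any span you create covers exactly one or more whole sentences.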

spatiebalk · April 28, 2020, 12:55pm

Hi Ines!

Thank you for your quick response. Formatting the data so that each sentence was a token (the second option) gave the desired behavior!

However, the layout of the marked sentences is now quite odd. I read the other post you referred to, but couldn't quite figure out how to adjust the shape of the highlighted areas. Could you possibly help me with this as well?

Thanks in advance!

ines · April 28, 2020, 8:11pm

Glad it worked! And try setting display: block on the sentence "tokens", as described here to make the selection a block element:
