Annotating multiple items in a view for clustering task

Hi everyone,

I would like to annotate the results of a clustering task. The idea is to randomly select 10 items from a cluster (each one a short text of 1-2 sentences) and ask the annotators to check the ones that belong in that cluster. To help, a title for the cluster (its discriminative key terms) will be shown. I have used Prodigy for a couple of small tasks, but I couldn't find a way to do this one, since I would like to label 10 entries together in a single view.

Thank you very much for your help in advance!

Hi! I think the most straightforward solution would be to use the classification UI (which shows the "label" at the top) and add a "text" or "html" key to your source data that includes the combined texts, for example with line breaks:

{"html": "text 1<br /><br />text 2<br />...", "label": "CLUSTER TITLE"}

If you want, you could even add some more formatting, like <hr /> for a horizontal line to separate the examples.
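
If you're generating the combined "html" value from a script, a small helper along these lines could do the joining. This is only a sketch: combine_texts is a made-up name, not part of Prodigy, and escaping each text with html.escape is my assumption, so that characters like < in your sentences are shown literally instead of being treated as markup.

```python
import html


def combine_texts(texts, separator="<hr />"):
    """Join several short texts into one HTML string for the "html" key.

    Hypothetical helper, not a Prodigy function. Each text is escaped so
    that characters such as "<" or "&" are displayed literally.
    """
    return separator.join(html.escape(text) for text in texts)


task = {
    "html": combine_texts(["text 1", "text 2", "a < b"]),
    "label": "CLUSTER TITLE",
}
print(task["html"])
```

Passing separator="<br /><br />" instead gives you the plain line-break version from the example above.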

All other properties of the JSON will be passed through and saved with the example in the database. So you could store the original examples / IDs / sentences with each combined example, which lets you relate the annotations back to the original items later on. For instance:

    {
        "html": "text 1<br /><br />text 2<br />...",
        "label": "CLUSTER TITLE",
        "orig_texts": ["text 1", "text 2", "..."]
    }