textcat by sentence given context of larger document

My dataset is a series of conversations between two people. I would like to categorize each sentence of this conversation but I would like to see the context of the conversation when I perform the annotation.
Do you have any suggestions?

Just to make sure I understand the question correctly: You want the classifier to only see one sentence at a time and label that, but in the annotation interface, you also want to see the surrounding context?

One idea could be to use the "html" interface and add a "html" property to your task that contains the full text and highlights the sentence you’re currently labelling. You can then store the original sentence text in the "text". For example:

{
    "html": "Sentence one. <strong>Sentence two.</strong> Sentence three.",
    "text": "Sentence two.",
    "label": "Some label"
}

This should be pretty easy to do programmatically. When you annotate the examples with the "html" interface, Prodigy will render the HTML, but when you train the classifier later on, it will use only the "text" and the "label", both of which are preserved in the dataset.
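To give an idea of what "programmatically" could look like, here is a minimal Python sketch. It assumes the conversation is already split into a list of sentence strings; the helper name `make_html_tasks` and the example label are made up for illustration and are not part of Prodigy.

```python
import html
import json


def make_html_tasks(sentences, label):
    """Build one task per sentence, with the full text as highlighted HTML.

    `sentences` is a list of strings (assumed to be pre-split), and
    `label` is the category to annotate. Both names are hypothetical.
    """
    tasks = []
    for i, sentence in enumerate(sentences):
        # Escape every sentence so stray "<" or "&" can't break the markup
        parts = [html.escape(s) for s in sentences]
        # Highlight only the sentence that is currently being labelled
        parts[i] = f"<strong>{parts[i]}</strong>"
        tasks.append({
            "html": " ".join(parts),
            "text": sentence,  # original, unescaped sentence for training
            "label": label,
        })
    return tasks


conversation = ["Sentence one.", "Sentence two.", "Sentence three."]
for task in make_html_tasks(conversation, "Some label"):
    # One JSON object per line, so the output can be saved as a JSONL file
    print(json.dumps(task))
```

For long conversations you'd probably want to include only a few sentences on either side rather than the whole text, but the idea stays the same.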

Alternatively, you could also use a custom HTML template and add the context as separate keys to your task (e.g. one for the prefix and one for the suffix). All task properties will become available as template variables. So a template like this…

<h2>{{label}}</h2>
{{before_text}} <strong>{{text}}</strong> {{after_text}}

… can be populated with data like this:

{
    "text": "Sentence two.",
    "before_text": "Sentence one.",
    "after_text": "Sentence three.",
    "label": "Some label"
}
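Data in this shape could be generated along the following lines. This is only a sketch under the same assumption as before (the conversation is already a list of sentence strings); `make_context_tasks` and its `window` argument are hypothetical names, while the `before_text` and `after_text` keys simply match the template above.

```python
import json


def make_context_tasks(sentences, label, window=1):
    """Build one task per sentence with its neighbours as separate keys.

    `window` is the number of sentences of context to keep on each side.
    """
    tasks = []
    for i, sentence in enumerate(sentences):
        # Up to `window` sentences before and after the current one
        before = " ".join(sentences[max(0, i - window):i])
        after = " ".join(sentences[i + 1:i + 1 + window])
        tasks.append({
            "text": sentence,
            "before_text": before,  # empty string for the first sentence
            "after_text": after,    # empty string for the last sentence
            "label": label,
        })
    return tasks


conversation = ["Sentence one.", "Sentence two.", "Sentence three."]
for task in make_context_tasks(conversation, "Some label"):
    # One JSON object per line, so the output can be saved as a JSONL file
    print(json.dumps(task))
```

Keeping the context in separate keys means the "text" stays exactly the sentence you want to train on, and you can change how much context is shown by adjusting `window` without touching the template.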

If you want this to be fancier – and if you can be bothered – you could even add some styling to your template to format it more like a conversation. Even chat bubbles or something! :wink: (I’ve always wanted to build an interface like this for Prodigy actually, haha.)