Highlighting the matching words for text classification

justindujardin · February 7, 2018, 2:20am

I created variants on textcat.teach and textcat.eval that render examples using a custom template and attention weight data from @honnibal’s example.

The code is on GitHub. It renders text so that tokens with more attention get larger fonts, and tokens whose weight exceeds a threshold (0.025 in the code below) also get a highlight color.

[Image: attention_weights]

I transform the attention weights into metadata for use in the HTML template:

def attach_attention_data(input_stream, nlp, attn_weights):
    """Attach attention weights to token data with each example"""
    for item in input_stream:
        tokens_data = []
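        # Drop the weights recorded for the previous example
        attn_weights.clear()
        # Running the pipeline repopulates attn_weights for this text
        doc = nlp(item['text'])
        for index, token in enumerate(doc):
            # Weight for this token, read from the first recorded entry
            weight = float(attn_weights[0][index][0])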
            # If the change is over some threshold, add color to draw the eye
            color = 'rgba(255,0,0,0.54)' if weight > 0.025 else 'inherit'
            tokens_data.append({
                't': token.text_with_ws,
                'c': color,
                's': min(2.5, 1 + weight * 2),
                'w': weight
            })
        item['tokens'] = tokens_data
        yield item
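
To make the weight-to-style mapping easier to tweak, here is a minimal sketch that pulls it out into a standalone helper. The name `style_for_weight` and its `threshold` and `max_scale` parameters are my own for illustration and are not part of the recipe; the defaults mirror the constants used in the function above.

```python
def style_for_weight(weight, threshold=0.025, max_scale=2.5):
    """Map one attention weight to the color and font scale for a token.

    Hypothetical helper: the defaults copy the constants from
    attach_attention_data, but the function itself is only a sketch.
    """
    # Tokens strictly above the threshold get the highlight color
    color = 'rgba(255,0,0,0.54)' if weight > threshold else 'inherit'
    # Font size grows linearly with the weight and is capped at max_scale
    scale = min(max_scale, 1 + weight * 2)
    return {'c': color, 's': scale, 'w': weight}


if __name__ == '__main__':
    # Made-up weights, only to show how the mapping behaves
    for w in (0.0, 0.025, 0.5, 1.0):
        print(w, style_for_weight(w))
```

With these defaults the cap only kicks in for weights above 0.75, so for typical small attention weights the font size stays close to 1em and the color does most of the work.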

The template loops over tokens and renders each one with a span and custom styling:

<div>{{#tokens}}<span style="font-size:{{s}}em; color:{{c}};">{{t}}</span>{{/tokens}}</div>
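
If you want to check the markup without starting the web app, the following rough sketch approximates in plain Python what the template above produces for a list of token dicts. It does not use Prodigy's actual template engine; `render_tokens` is a hypothetical helper, and the escaping via `html.escape` only approximates what `{{t}}` does.

```python
from html import escape


def render_tokens(tokens):
    """Approximate the template output for a list of token dicts.

    Hypothetical helper for previewing the markup outside the app.
    """
    spans = ''.join(
        '<span style="font-size:{s}em; color:{c};">{t}</span>'.format(
            s=tok['s'], c=tok['c'], t=escape(tok['t']))
        for tok in tokens
    )
    return '<div>' + spans + '</div>'


if __name__ == '__main__':
    # Made-up token data in the same shape as item['tokens']
    sample = [
        {'t': 'good ', 'c': 'inherit', 's': 1.0, 'w': 0.0},
        {'t': 'movie', 'c': 'rgba(255,0,0,0.54)', 's': 2.0, 'w': 0.5},
    ]
    print(render_tokens(sample))
```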

It’s all wired up by attaching it to the stream inside the recipe:

@recipe('attncat.eval',...)
def evaluate(...):
    ...
    nlp = spacy.load(...)
    textcat = nlp.get_pipe('textcat')
    assert textcat is not None
    with get_attention_weights(textcat) as attn_weights:
        stream = ...
        stream = attach_attention_data(stream, nlp, attn_weights)
    return {
        'view_id': 'html',
        'stream': stream,
        ...
        'config': {..., 'html_template': template_text}
    }

    ...

Hope it helps. If the custom template does not appear to be working, make sure you do not have an html_template entry in your prodigy.json file, because it will override the template set in the recipe.
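
A quick way to check for a conflicting entry is a small script like the sketch below. The helper name `has_html_template` is mine, and the path in the usage part is an assumption (I believe the global file normally lives in `~/.prodigy/`, with an optional local `prodigy.json` in the working directory), so adjust it to wherever your config actually is.

```python
import json
from pathlib import Path


def has_html_template(config_path):
    """Return True if the JSON config at config_path sets html_template.

    Hypothetical helper; a missing file counts as "not set".
    """
    path = Path(config_path)
    if not path.exists():
        return False
    with path.open(encoding='utf8') as f:
        config = json.load(f)
    return 'html_template' in config


if __name__ == '__main__':
    # Assumed locations of the config files; change these if yours differ
    for candidate in (Path.home() / '.prodigy' / 'prodigy.json',
                      Path.cwd() / 'prodigy.json'):
        print(candidate, has_html_template(candidate))
```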
