Does Prodigy support hierarchical annotation?

ines · February 28, 2019, 8:27am

Prodigy itself is pretty agnostic to what your annotations “mean” so you can definitely build a workflow like this. The only thing that’s kinda built-in is a strong focus on single decisions at a time and automating as much as possible.

One approach could be to use the choice interface and start by annotating the top-level buckets like A, B and C, without worrying about the lower-level categories. See here for example recipe code. In the next step, you can then stream in the examples again and add different options based on the top-level category that was selected: for example, A1 and A2 for A, and so on. Prodigy streams are regular Python generators, so you can automate all of this logic by putting it in a function that yields annotation examples. For example, something like this:

hierarchy = {'A': ['A1', 'A2'], 'B': ['B1', 'B2'], 'C': ['C1', 'C2']}

def get_stream(examples):
    for eg in examples:   # the examples with top-level categories
        top_labels = eg['accept']  # ['A'] or ['B', 'C'] if multiple choice
        for label in top_labels:
            sub_labels = hierarchy[label]
            options = [{'id': opt, 'text': opt} for opt in sub_labels]
            # create new example with text and sub labels as options
            new_eg = {'text': eg['text'], 'options': options}
            yield new_eg
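
To make the second pass a bit more concrete, here is a minimal, self-contained sketch that runs a generator like the one above on a few made-up first-pass annotations, without needing Prodigy installed. It assumes the choice interface stored the selected option IDs under "accept" and the overall decision under "answer". The sample data and the "top_label" entry in "meta" are hypothetical and only there for illustration, and the generator yields the newly built task with the sub-label options rather than the original example.

```python
hierarchy = {"A": ["A1", "A2"], "B": ["B1", "B2"], "C": ["C1", "C2"]}

def get_stream(examples):
    """Yield one second-pass choice task per selected top-level label."""
    for eg in examples:
        # Assumption: only examples that were accepted in the first pass
        # should be sent on to the second pass.
        if eg.get("answer") != "accept":
            continue
        for label in eg.get("accept", []):
            options = [{"id": opt, "text": opt} for opt in hierarchy[label]]
            # Keep track of the top-level label (hypothetical "meta" field).
            yield {
                "text": eg["text"],
                "options": options,
                "meta": {"top_label": label},
            }

# Hypothetical first-pass annotations, shaped like choice annotations.
first_pass = [
    {"text": "first text", "accept": ["A"], "answer": "accept"},
    {"text": "second text", "accept": ["B", "C"], "answer": "accept"},
    {"text": "third text", "accept": [], "answer": "ignore"},
]

tasks = list(get_stream(first_pass))
for task in tasks:
    print(task["meta"]["top_label"], [opt["id"] for opt in task["options"]])
```

An example with two top-level labels produces two separate tasks, which keeps each second-pass question down to a single decision.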

Doing the levels in separate steps also allows you to iterate faster if you end up having to adjust the annotation scheme. Not all schemes are set in stone and if your annotators struggle with a top-level decision like B vs. C, they’ll likely also struggle with the lower-level decisions. So ideally, you want to find out about this as early as possible and before you commission the full fine-grained annotations on your entire corpus.
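
One rough way to spot a shaky top-level decision early is to have two annotators label the same small batch and count which label pairs they disagree on. The sketch below is only an illustration under that assumption: the function name, the dict-of-labels input format and the sample data are all hypothetical, and it assumes a single top-level label per text.

```python
from collections import Counter

def top_level_confusions(annotator_a, annotator_b):
    """Count disagreeing top-level label pairs on texts both annotators saw."""
    confusions = Counter()
    for text, label_a in annotator_a.items():
        label_b = annotator_b.get(text)
        if label_b is not None and label_b != label_a:
            # Sort the pair so that (B, C) and (C, B) are counted together.
            confusions[tuple(sorted((label_a, label_b)))] += 1
    return confusions

# Hypothetical top-level decisions from two annotators, keyed by text.
ann_a = {"t1": "A", "t2": "B", "t3": "C", "t4": "B"}
ann_b = {"t1": "A", "t2": "C", "t3": "B", "t4": "B"}

confusions = top_level_confusions(ann_a, ann_b)
print(confusions.most_common())
```

If one pair like B vs. C dominates the counts, that's a hint to revisit those definitions before moving on to the sub-labels.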
