Is it possible to use model-in-the-loop with multi text classification using the "choice" view_id?

I tried adapting the textcat_teach example from https://github.com/explosion/prodigy-recipes/blob/master/textcat/textcat_teach.py so that view_id is set to "choice" instead of "classification".

Starting it with the command PRODIGY_LOGGING=verbose python -m prodigy textcat.teach your_datas en_core_web_sm ./news_headlines.jsonl --label PERSON,ANIMAL,ROBOT -F prodigy-recipes/textcat/textcat_teach.py works, but the app crashes as soon as the page loads in the browser. The console shows no error messages.

Several other modifications did not work either, so I'm not sure: is the "choice" view possible at all with model-in-the-loop?

Hi! It's definitely possible – we just haven't tested how well it works in terms of the active learning, and you'd have to adjust the recipe so that it actually adds "options" with all labels you provide to each outgoing task. Otherwise, the choice interface can't render it (also see here for the expected format).
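To illustrate the point about "options", here is a minimal sketch of a stream wrapper that attaches the same list of options to every outgoing task. The helper name add_options and the choice of using the label string as both "id" and "text" are assumptions made for this example, not part of the original recipe.

```python
def add_options(stream, labels):
    """Attach a choice-interface "options" list to every task in the stream."""
    # Hypothetical convention: the label doubles as the option id and its text.
    options = [{"id": label, "text": label} for label in labels]
    for task in stream:
        task = dict(task)  # copy so the incoming task is not mutated
        task["options"] = [dict(o) for o in options]  # fresh list per task
        yield task


# Example: wrap a tiny in-memory stream with the labels from the command above.
tasks = [{"text": "Robots take over local bakery"}]
for task in add_options(tasks, ["PERSON", "ANIMAL", "ROBOT"]):
    print(task)
```

In an adapted recipe, the scored stream would presumably be passed through a wrapper like this before it is returned, so every task reaching the "choice" interface carries its options.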

Some more threads on the topic with examples:

Thank you Ines, I got it working. The following code shows the approach:

import prodigy
import spacy


def get_basic_text_stream():

    yield "The story so far:"
    yield "In the beginning the Universe was created."
    yield "This has made a lot of people very angry "
    yield "and been widely regarded as a bad move."


choice_options = [
    {"id": 0, "text": "category A"},
    {"id": 1, "text": "category B"},
    {"id": 2, "text": "category C"},
]


def stream_pre_annotated():

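    # spaCy v2-style API: add an empty textcat pipe, then load the trained
    # weights and labels from disk.
    nlp = spacy.blank("en")
    nlp.add_pipe(nlp.create_pipe("textcat"))
    nlp.from_disk("./dummy_model")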

    options = choice_options

    for text in get_basic_text_stream():

        cat_scores = nlp(text).cats
        options_accepted = []

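        # Pre-select every option whose label the model scores at 0.5 or higher.
        # .get() avoids a KeyError if the model lacks one of the labels.
        for o in options:
            if cat_scores.get(o["text"], 0.0) >= 0.5:
                options_accepted.append(o["id"])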

        yield {
            "text": text,
            "options": options,
            "accept": options_accepted
        }


@prodigy.recipe("prodigy_textcat_pre_annotated_id")
def custom_recipe():

    return {
        "view_id": "choice",
        "dataset": "prodigy_standalone_dataset",
        "stream": stream_pre_annotated(),
        "config": {"choice_style": "multiple"}
    }


prodigy.serve("prodigy_textcat_pre_annotated_id")
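The recipe above pre-selects options from the model's scores but never feeds the answers back to the model. As a rough sketch of the missing half, the helper below turns one answered choice task into a label-to-score dict of the kind a text classifier could be updated with. The name choice_to_cats is hypothetical, and wiring it into an "update" callback returned by the recipe is an assumption that has not been tested here.

```python
def choice_to_cats(answer):
    """Map an answered choice task to a {label: 1.0 or 0.0} dict."""
    accepted = set(answer.get("accept", []))
    # Every option becomes a label; selected ones are 1.0, the rest 0.0.
    return {o["text"]: float(o["id"] in accepted) for o in answer["options"]}


# Example with the same task shape the stream above yields.
answer = {
    "text": "The story so far:",
    "options": [
        {"id": 0, "text": "category A"},
        {"id": 1, "text": "category B"},
        {"id": 2, "text": "category C"},
    ],
    "accept": [0, 2],
    "answer": "accept",
}
print(choice_to_cats(answer))
```

Treating unselected options as explicit negatives is a design choice that suits multi-label annotation with "choice_style": "multiple"; skipped or rejected tasks would likely need to be filtered out before conversion.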