Is there a way to hide the value of the "text" key on the interface?

kaorisugi · June 19, 2020, 2:57am

Hi!

I'm making a custom recipe that uses textcat.teach. It is based on the source code of textcat_teach.py from GitHub. It uses blocks to combine the diff view and the classification view in one interface.
When annotating, I only want to see the contents of the diff-view.
So is there a way to hide the value of the "text" key on the interface?

I tried removing the "text" key from the JSON task format, leaving only the "added" and "removed" keys that the diff view needs, but that caused a KeyError: 'text'.

Here is my custom recipe.

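import prodigy
import spacy
from typing import List, Optional
from prodigy.components.loaders import JSONL
from prodigy.components.sorters import prefer_uncertain
from prodigy.models.matcher import PatternMatcher
from prodigy.models.textcat import TextClassifier
from prodigy.util import combine_models, split_string

@prodigy.recipe(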
    "custom_tt_diff",
    dataset=("The dataset to use", "positional", None, str),
    spacy_model=("The base model", "positional", None, str),
    source=("The source data as a JSONL file", "positional", None, str),
    label=("One or more comma-separated labels", "option", "l", split_string),
    patterns=("Optional match patterns", "option", "p", str),
    exclude=("Names of datasets to exclude", "option", "e", split_string)
)
def textcat_teach(
    dataset: str,
    spacy_model: str,
    source: str,
    label: Optional[List[str]] = None,
    patterns: Optional[str] = None,
    exclude: Optional[List[str]] = None
):

    blocks = [{"view_id": "classification"},{"view_id": "diff"}]
    stream = JSONL(source)
    
    # Load the spaCy model
    nlp = spacy.load(spacy_model)
    model = TextClassifier(nlp, label)

    if patterns is None:
        predict = model
        update = model.update
    else:
        matcher = PatternMatcher(
            nlp,
            prior_correct=5.0,
            prior_incorrect=5.0,
            label_span=False,
            label_task=True,
        )
        matcher = matcher.from_disk(patterns)
        predict, update = combine_models(model, matcher)

    stream = prefer_uncertain(predict(stream))

    return {
        "dataset": dataset,  # Name of dataset to save annotations
        "stream": stream,  # Incoming stream of examples
        "update": update,  # Update callback, called with batch of answers
        "exclude": exclude,  # List of dataset names to exclude
        "view_id": "blocks",  # Combine several interfaces in one view
        "config": {"lang": nlp.lang, "blocks": blocks},
    }

Here is the JSON task format.

{
    "text": "This is a sample text. \n\n This is an sample text.",
    "added": "This is a sample text.",
    "removed": "This is an sample text."
}

ines · June 19, 2020, 10:03am

Hi! The "text" key should definitely be present in the underlying JSON data, because it's what the text classifier will use under the hood and what it will be updated with.

But the blocks let you override existing task properties for display in the UI, so you could do:

blocks = [{"view_id": "classification", "text": None},{"view_id": "diff"}]

This will prevent the classification UI from showing the "text" (which it otherwise would, by default).

kaorisugi · June 22, 2020, 4:48am

Thanks for the quick answer.
I tried adding "text": None to the blocks in my custom recipe.
However, the following message was returned.

What could be the cause of this?

ines · June 22, 2020, 9:11am

Ah, sorry, could you try an empty string "" instead?

kaorisugi · June 23, 2020, 3:24am

Yes, I rewrote "blocks" like this.

blocks = [{"view_id":"classification","text": ""},{"view_id":"diff"}]

However, the content of the "text" remains visible, just with the addition of an empty line.

ines · June 23, 2020, 11:12am

Ahh, I think what might be happening here is that the diff UI can also render the text, if you provide it. That lets you stream in input text plus two diffed versions (e.g. for evaluating machine translation). What happens if you do {"view_id":"diff", "text": ""}?

Alternatively, if you just want the classification block for the label at the top and you don't care so much about the exact styling, you could also use a regular html block with an html_template and put the label in there.
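A minimal sketch of that alternative, assuming each task coming out of the recipe carries a "label" key that the template can reference (html_template takes Mustache-style variables). The markup is only a placeholder, not Prodigy's own styling:

```python
# Sketch: swap the classification block for a plain HTML block that only
# prints the label. The <h2> markup is an arbitrary placeholder.
blocks = [
    {"view_id": "html", "html_template": "<h2>{{label}}</h2>"},
    # Blank out "text" here too, since the diff UI can also render it.
    {"view_id": "diff", "text": ""},
]
```

The list would be passed to the recipe the same way as before, via "config": {"blocks": blocks}.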

kaorisugi · June 24, 2020, 3:01am

Wow! I solved it with your suggestion.
I should have added "text": "" to both classification and diff, like this:

blocks = [{"view_id":"classification","text": ""},{"view_id":"diff","text": ""}]

As you pointed out, I noticed that not only classification but also diff is rendering "text".
This will speed up the annotation. Thank you for your support!
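For anyone landing here later, a short sketch of how the working pieces fit together. Only "view_id" and the "blocks" config come from this thread; build_blocks is a hypothetical helper name, and the other recipe components (dataset, stream, update, exclude) are assumed to stay exactly as in the recipe from the first post:

```python
from typing import Dict, List


def build_blocks() -> List[Dict[str, str]]:
    """Blocks for the combined interface, with "text" blanked in both views."""
    return [
        {"view_id": "classification", "text": ""},
        {"view_id": "diff", "text": ""},
    ]


# The task itself keeps its "text" key, because the classifier still needs it.
task = {
    "text": "This is a sample text. \n\n This is an sample text.",
    "added": "This is a sample text.",
    "removed": "This is an sample text.",
}

# Only these two recipe components change; the rest of the returned
# dictionary is assumed to be unchanged.
components = {
    "view_id": "blocks",
    "config": {"blocks": build_blocks()},
}
```

The empty string only overrides what each block displays. The "text" value in the underlying data is untouched.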

ines · June 24, 2020, 12:59pm

Yay, glad it worked!

I'll add a fix for the error that came up with "text": None – there's no reason this shouldn't work and Prodigy should be able to handle it out-of-the-box.

kaorisugi · June 25, 2020, 9:59am

It would be nice to have a fix for "text": None. Thank you!

ines · July 2, 2020, 11:31am

Just released v1.10.1, which includes a fallback for None text values in the classification UI.
