Blocks not hiding text (using spans_manual and choice view ids)

Hi, I have the following code but can't seem to get text to hide:

@prodigy.recipe(  # type: ignore
    "asset.txtcat",
    label=Arg("--label", "-L", help="What label to use on classification", converter=lambda v: v.lower()),
    key=Arg("--key", "-K", help="See enum member of the Concept asset type."),
)
def asset_txtcat(label: str, key: str):
    blocks = [
        {"view_id": "spans_manual"},
        {"view_id": "choice", "text": None},
    ]
    dataset = f"is-{label}"
    span_labels = ["concept"]

    def add_option(stream):
        for ex in stream:
            ex["options"] = [{"id": label, "text": f"Is this classified as {label}?"}]
            yield ex

    asset = Selection(key=key).asset
    stream = get_stream(asset.stream())
    stream = add_tokens(asset.nlp, stream)
    stream = add_option(stream)

    return {
        "view_id": "blocks",
        "dataset": dataset,
        "stream": stream,
        "config": {
            "lang": asset.nlp.lang,
            "labels": span_labels,
            "blocks": blocks,
            "choice_auto_accept": True,
        },
    }

This roughly resembles the tutorial code, and it follows the answer given in a previous question. The text still appears even if I set both text keys to None:

blocks = [
        {"view_id": "spans_manual", "text": None},
        {"view_id": "choice", "text": None},
    ]

Also happens with the classification view_id, e.g.:

# remove `add_option()`, replace `blocks = ...` with
blocks = [{"view_id": "spans_manual"}, {"view_id": "classification"}]

I tried manipulating the text to determine whether text length or unusual characters affected the display, but I think I've ruled those factors out.

Hoping for some direction, thanks!

Hi @mv3,

Which text are you trying to hide? The text field in the blocks definition is for defining optional text above the given block. So (to use the example from the tutorial you mention), the following definition:

 blocks = [
        {"view_id": "spans_manual"},
        {"view_id": "choice", "text": None},
    ]

results in this UI:

while this definition:

 blocks = [
        {"view_id": "spans_manual"},
        {"view_id": "choice", "text": "The text in blocks config"},
    ]

will generate the following UI:

Hi @magdaaniol, the first example is what I'm trying to achieve, either with:

 blocks = [
        {"view_id": "spans_manual"},
        {"view_id": "choice", "text": None},
    ]

or with:

 blocks = [
        {"view_id": "spans_manual"},
        {"view_id": "classification", "text": None},
    ] 

To add a small graphic:

Using `text: None` doesn't seem to work for me, and the "text i want to hide" still remains visible.

Ideally, I'd only want to see the "textcat label", since "text i want to hide" is the same text as the one in the spans_manual block.

Thanks @mv3, looking at your code I still do not see where the duplication would come from.
Could you share the structure of a single task after this step in your recipe code:

stream = get_stream(asset.stream())

Thanks!

Certainly @magdaaniol. It looks like this:

{'text': 'Philippine Navy cannot validly invoke the doctrine of state immunity from suit.',
 'label': 'political.powers',
 'meta': {'segment_id': 'sample_id'},
 'spans': [{'start': 60,
   'end': 78,
   'token_start': 10,
   'token_end': 12,
   'label': 'concept'}],
 '_input_hash': 1122400349,
 '_task_hash': -1567215762}
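(For anyone reading along: a quick way to grab a sample task like this without exhausting the stream is to peek at it and then re-attach it. The `peek` helper below is a hypothetical sketch, not part of the Prodigy API, and the dummy tasks stand in for whatever `get_stream` yields.)

```python
import itertools

# Hypothetical helper: look at the first n tasks of a generator stream,
# then chain them back on so the recipe still sees every task.
def peek(stream, n=1):
    stream = iter(stream)
    head = list(itertools.islice(stream, n))
    return head, itertools.chain(head, stream)

tasks = iter([{"text": "...", "label": "political.powers"}, {"text": "..."}])
head, tasks = peek(tasks)
print(head[0]["label"])  # → political.powers
```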

Thanks @mv3! Now I see the duplicated info :sweat_smile:
It comes from the fact that you have a label attribute at the top level of the task. The UI interprets this top-level label as a trigger for the classification view, which always requires text (there is no option to disable the text here, as classification is meant for binary decision workflows).
When you think about it, it would also be hard to curate the dataset if you could modify the span block but only accept or reject the textcat block.
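To make the cause concrete, here is a minimal, hypothetical sanity check (the function and constant names are illustrative, not Prodigy API): a top-level "label" key on a task is what activates the extra classification view, so it's worth flagging before the task reaches the UI.

```python
# Keys that trigger additional UI blocks when left at the top level of a task.
# Here only "label" (which activates the classification view) is checked.
RESERVED_KEYS = {"label"}

def stray_keys(task: dict) -> list:
    """Return any reserved keys present at the top level of a task."""
    return sorted(k for k in RESERVED_KEYS if k in task)

task = {"text": "...", "label": "political.powers", "spans": []}
print(stray_keys(task))  # → ['label']
```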

If you know the textcat labels upfront, it may make sense to use them as options with the pre-annotated one already selected, i.e. stored under the accept attribute of the task.

Something like:

def asset_txtcat(source: str, label: str):
    
    dataset = f"is-{label}"
    span_labels = ["concept"]
    textcat_labels = ["political.power", "another_label"]

    blocks = [
        {"view_id": "spans_manual", "labels": span_labels},
        {"view_id": "choice", "text": None},
    ]

    def add_option(stream):
        for ex in stream:
            task_label = ex.pop("label")  # drop the top-level label so it doesn't trigger the classification view
            ex["options"] = [{"id": lbl, "text": lbl} for lbl in textcat_labels]
            ex["accept"] = [task_label]  # pre-select the label the task came with
            yield ex

    nlp = spacy.blank("en")
    stream = get_stream(source)
    stream = add_tokens(nlp, stream)
    stream = add_option(stream)

    return {
        "view_id": "blocks",
        "dataset": dataset,
        "stream": stream,
        "config": {
            "blocks": blocks,
            "choice_auto_accept": True,
        },
    }
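As a hedged check of the transform above, running the add_option step on the sample task shared earlier shows what the choice block actually receives. (I'm using "political.powers" in the label list here, matching the sample task, so the pre-selected option exists; the accept ids should always match option ids.)

```python
# Apply the add_option transform from the recipe sketch to the sample task.
textcat_labels = ["political.powers", "another_label"]

def add_option(stream):
    for ex in stream:
        task_label = ex.pop("label")  # remove the top-level label
        ex["options"] = [{"id": lbl, "text": lbl} for lbl in textcat_labels]
        ex["accept"] = [task_label]   # pre-select the original label
        yield ex

sample = {"text": "Philippine Navy cannot validly invoke ...",
          "label": "political.powers"}
out = next(add_option(iter([sample])))
print("label" in out)  # → False (no stray classification block)
print(out["accept"])   # → ['political.powers']
```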

which would result in:

Hi @magdaaniol, thank you so much for the clarification!

This is exactly what I was going for:

... you can modify the span block, but can only accept or reject the textcat block

I wanted to see if it was possible to use pre-labelled tasks where the labels number in the hundreds. With your answer, I think I'll proceed with a tiered approach using the choice interface for text classification, e.g. using political as the label instead of political.power.
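In case it helps anyone with a similar setup, the tiered idea can be sketched as collapsing each fine-grained label to its top-level segment. The dot separator is an assumption about the label scheme here, not anything Prodigy prescribes.

```python
def tier(label: str) -> str:
    # Keep only the top-level segment of a dotted label,
    # e.g. "political.power" becomes "political".
    return label.split(".", 1)[0]

print(tier("political.power"))  # → political
print(tier("standalone"))       # → standalone
```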

Appreciate your taking the time to answer with replicable code! :+1:


Hi @mv3 ,
Glad I could help :slight_smile: Just to briefly follow up: so the need for binary text classification comes from the high number of textcat labels. And you were planning to have one copy of the question per textcat label, so that annotators can answer yes or no to each label, is that right?

In that case, all the more reason (imo) to split this annotation project into a span annotation phase and a textcat annotation phase. Imagine the span needs curation: the annotators might end up modifying the same span in each copy of the question.

The tiered annotation alternative is another good solution for sure!
