Blocks not hiding text (using spans_manual and choice view ids)

Hi, I have the following code but can't seem to get text to hide:

@prodigy.recipe(  # type: ignore
    "asset.txtcat",
    label=Arg("--label", "-L", help="What label to use on classification", converter=lambda v: v.lower()),
    key=Arg("--key", "-K", help="See enum member of the Concept asset type."),
)
def asset_txtcat(label: str, key: str):
    blocks = [
        {"view_id": "spans_manual"},
        {"view_id": "choice", "text": None},
    ]
    dataset = f"is-{label}"
    span_labels = ["concept"]

    def add_option(stream):
        for ex in stream:
            ex["options"] = [{"id": label, "text": f"Is this classified as {label}?"}]
            yield ex

    asset = Selection(key=key).asset
    stream = get_stream(asset.stream())
    stream = add_tokens(asset.nlp, stream)
    stream = add_option(stream)

    return {
        "view_id": "blocks",
        "dataset": dataset,
        "stream": stream,
        "config": {
            "lang": asset.nlp.lang,
            "labels": span_labels,
            "blocks": blocks,
            "choice_auto_accept": True,
        },
    }

This roughly resembles the tutorial code, and it follows the answer given in a previous question. The text still appears even if I set both text keys to None:

blocks = [
        {"view_id": "spans_manual", "text": None},
        {"view_id": "choice", "text": None},
    ]

Also happens with the classification view_id, e.g.:

# remove `add_option()`, replace `blocks = ...` with
blocks = [{"view_id": "spans_manual"}, {"view_id": "classification"}]

I tried manipulating the text to determine whether text length or unusual characters affected the display, but I think I've ruled those factors out.

Hoping for some direction, thanks!

Hi @mv3,

Which text are you trying to hide? The text field in the blocks definition is for defining optional text above the given block. So (to use the example from the tutorial you mention), the following definition:

 blocks = [
        {"view_id": "spans_manual"},
        {"view_id": "choice", "text": None},
    ]

results in this UI:

while this definition:

 blocks = [
        {"view_id": "spans_manual"},
        {"view_id": "choice", "text": "The text in blocks config"},
    ]

will generate the following UI:

Hi @magdaaniol, the first example is what I'm trying to achieve, either with:

 blocks = [
        {"view_id": "spans_manual"},
        {"view_id": "choice", "text": None},
    ]

or with:

 blocks = [
        {"view_id": "spans_manual"},
        {"view_id": "classification", "text": None},
    ] 

To add a small graphic:

Using `text: None` doesn't seem to work for me, and the "text i want to hide" still remains visible.

Ideally, I'd only want to see the "textcat label", since "text i want to hide" is the same text as the one in the spans_manual block.

Thanks @mv3, looking at your code I still do not see where the duplication would come from.
Could you share the structure of a single task after this step in your recipe code:

stream = get_stream(asset.stream())

Thanks!

Certainly @magdaaniol. It looks like this:

{'text': 'Philippine Navy cannot validly invoke the doctrine of state immunity from suit.',
 'label': 'political.powers',
 'meta': {'segment_id': 'sample_id'},
 'spans': [{'start': 60,
   'end': 78,
   'token_start': 10,
   'token_end': 12,
   'label': 'concept'}],
 '_input_hash': 1122400349,
 '_task_hash': -1567215762}
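(For anyone reading along: a quick way to grab a sample task like this without exhausting the stream is to peek at it and then re-attach it. The `peek` helper below is a hypothetical sketch, not part of the Prodigy API, and the dummy tasks stand in for whatever `get_stream` yields.)

```python
import itertools

# Hypothetical helper: look at the first n tasks of a generator stream,
# then chain them back on so the recipe still sees every task.
def peek(stream, n=1):
    stream = iter(stream)
    head = list(itertools.islice(stream, n))
    return head, itertools.chain(head, stream)

tasks = iter([{"text": "...", "label": "political.powers"}, {"text": "..."}])
head, tasks = peek(tasks)
print(head[0]["label"])  # → political.powers
```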

Thanks @mv3! Now I see the duplicated info :sweat_smile:
It comes from the fact that you have a label attribute at the top level of the task. The UI interprets this top-level label as a trigger for the classification view, which always requires text (there is no option to disable the text here, as classification is meant for binary decision workflows).
When you think about it, it would also be hard to curate the dataset if you could modify the span block but only accept or reject the textcat block.
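To make the cause concrete, here is a minimal, hypothetical sanity check (the function and constant names are illustrative, not Prodigy API): a top-level "label" key on a task is what activates the extra classification view, so it's worth flagging before the task reaches the UI.

```python
# Keys that trigger additional UI blocks when left at the top level of a task.
# Here only "label" (which activates the classification view) is checked.
RESERVED_KEYS = {"label"}

def stray_keys(task: dict) -> list:
    """Return any reserved keys present at the top level of a task."""
    return sorted(k for k in RESERVED_KEYS if k in task)

task = {"text": "...", "label": "political.powers", "spans": []}
print(stray_keys(task))  # → ['label']
```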

If you know the textcat labels upfront, it may make sense to use them as options with the pre-annotated one already selected, i.e. stored under the accept attribute of the task.

Something like:

def asset_txtcat(source: str, label: str):
    
    dataset = f"is-{label}"
    span_labels = ["concept"]
    textcat_labels = ["political.power", "another_label"]

    blocks = [
        {"view_id": "spans_manual", "labels": span_labels},
        {"view_id": "choice", "text": None},
    ]

    def add_option(stream):
        for ex in stream:
            task_label = ex.pop("label")  # drop the top-level label so it doesn't trigger the classification view
            ex["options"] = [{"id": lbl, "text": lbl} for lbl in textcat_labels]
            ex["accept"] = [task_label]  # pre-select the label the task came with
            yield ex

    nlp = spacy.blank("en")
    stream = get_stream(source)
    stream = add_tokens(nlp, stream)
    stream = add_option(stream)

    return {
        "view_id": "blocks",
        "dataset": dataset,
        "stream": stream,
        "config": {
            "blocks": blocks,
            "choice_auto_accept": True,
        },
    }
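As a hedged check of the transform above, running the add_option step on the sample task shared earlier shows what the choice block actually receives. (I'm using "political.powers" in the label list here, matching the sample task, so the pre-selected option exists; the accept ids should always match option ids.)

```python
# Apply the add_option transform from the recipe sketch to the sample task.
textcat_labels = ["political.powers", "another_label"]

def add_option(stream):
    for ex in stream:
        task_label = ex.pop("label")  # remove the top-level label
        ex["options"] = [{"id": lbl, "text": lbl} for lbl in textcat_labels]
        ex["accept"] = [task_label]   # pre-select the original label
        yield ex

sample = {"text": "Philippine Navy cannot validly invoke ...",
          "label": "political.powers"}
out = next(add_option(iter([sample])))
print("label" in out)  # → False (no stray classification block)
print(out["accept"])   # → ['political.powers']
```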

which would result in:

Hi @magdaaniol, thank you so much for the clarification!

This is exactly what I was going for:

... you can modify the span block, but can only accept or reject the textcat block

I wanted to see if it was possible to use pre-labelled tasks where the labels number in the hundreds. With your answer, I think I'll proceed with a tiered approach using the choice interface for text classification, e.g. using political as the label instead of political.power.
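In case it helps anyone with a similar setup, the tiered idea can be sketched as collapsing each fine-grained label to its top-level segment. The dot separator is an assumption about the label scheme here, not anything Prodigy prescribes.

```python
def tier(label: str) -> str:
    # Keep only the top-level segment of a dotted label,
    # e.g. "political.power" becomes "political".
    return label.split(".", 1)[0]

print(tier("political.power"))  # → political
print(tier("standalone"))       # → standalone
```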

Appreciate your taking the time to answer with replicable code! :+1:


Hi @mv3 ,
Glad I could help :slight_smile: Just to briefly follow up: so the need for binary text classification comes from the high number of textcat labels. And you were planning to have one copy of the question per textcat label, so that annotators can answer yes or no to each label, is that right?

In that case, all the more reason (imo) to split this annotation project into a span annotation phase and a textcat annotation phase. Imagine the span needs curation: the annotators might end up modifying the same span in each copy of the question.

The tiered annotation alternative is another good solution for sure!
