New User: My text won't show up as part of the task.

Hello,

So I'm trying to set up a task with a custom recipe, and I can't figure out why the text isn't getting displayed at all. It shows up when I use a built-in recipe or a random recipe I got from the repo, so it's definitely my code and not my data that's the problem. I'm just starting to use this, so it may be something really stupid, but hey, that's what this forum is for.

This is the custom recipe:

    import prodigy
    from prodigy.components.loaders import TXT
    from prodigy.util import split_string
    from collections import Counter
    from typing import List, Optional


    @prodigy.recipe(
        "time",
        dataset=("The dataset to use", "positional", None, str),
        source=("The source data as a TXT file", "positional", None, str),
    )
    def time(dataset: str, source: str):
        stream = TXT(source)

        return {
            "view_id": "blocks",  # Annotation interface to use
            "config": {
                "blocks": [
                    {
                        "view_id": "ner_manual",
                        "labels": ["Time Expression"],
                    },
                    {
                        "view_id": "text_input",
                        "field_rows": 1,
                        "field_autofocus": True,
                        "field_label": "What is the start date of this sentence",
                        "field_placeholder": "01.01.2016",
                        "text": None,
                    },
                    {
                        "view_id": "text_input",
                        "field_rows": 1,
                        "field_autofocus": True,
                        "field_label": "What is the end date of this sentence",
                        "field_placeholder": "01.01.2016",
                        "text": None,
                    },
                    {
                        "view_id": "text_input",
                        "field_rows": 1,
                        "field_autofocus": True,
                        "field_label": "What is the span in days of this sentence",
                        "field_placeholder": "1",
                        "text": None,
                    },
                ]
            },
            "dataset": dataset,  # Name of dataset to save annotations
            "stream": stream,  # Incoming stream of examples
        }

This is a sample of my data.

We will introduce a standalone award that recognises and rewards Irish-based artists
establish a dedicated philanthropic fund to help drive philanthropic giving to our cultural institutions
We will extend the Section 1003 tax relief to important heritage items that are donated to regional museums, as well as cultural institutions
Fine Gael will work to further improve the Section 481 Tax Relief
introduce a new scheme aimed at increasing the number of women working in the film industry
set up an apprenticeship programme to address the skills needed in the audio-visual sector
establish a new online portal which will act as a skills database for the audiovisual sector
establish a Creative Sector Taskforce, which will draw up an Action Plan for Growth across the entire sector, including audiovisual, gaming, animation and music.

How do I get the text to show up in the task?

You've added the "text": None option to your view. That suppresses the text in the task. Does removing it help?

Edit: Ah, wait, I think I understand what you are trying to do. You have a ner_manual view plus three text_input views. Of course, the text_input views aren't supposed to display the text themselves.

Does the TXT loader give you text tasks or NER tasks? I think those are different. NER tasks (I think) need text plus tokens so the interface knows what the "minimum element" is that you can manipulate.

Edit #2: I have created a minimal test example and tried this (see text suppression test 1 - Pastebin.com). The NER interface is fine with getting just text and no tokens (I haven't checked what the TXT loader produces, but I strongly suspect that it produces NER-suitable tasks), so this isn't the problem here. However, if I add "text": None to the text view, that also suppresses the text for the NER view.

Hi! There are two potential problems here:

  1. Loading your plain text with the TXT loader will convert it to Prodigy's JSON format, but if you want to annotate named entities manually, you'll also need to tokenize the text to add a key "tokens". The easiest way to do this is using the add_tokens preprocessor and a spaCy tokenizer of your choice. See here for an example: prodigy-recipes/ner_manual.py at master · explosion/prodigy-recipes · GitHub
  2. What exactly are you trying to achieve by setting "text": None on the text_input blocks? The text input interface doesn't render the "text" value in your task, so resetting it shouldn't be necessary. But it also shouldn't make a difference here.
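
To make point 1 more concrete, here is a rough sketch of what a task with a "tokens" key looks like. It deliberately avoids Prodigy and spaCy and uses a plain whitespace split, so the helper names `whitespace_tokens` and `add_simple_tokens` are made up for illustration only. In a real recipe you would more likely call `add_tokens(nlp, stream)` from `prodigy.components.preprocess` with a spaCy pipeline, as the linked ner_manual recipe does.

```python
import re
from typing import Iterable, Iterator, List


def whitespace_tokens(text: str) -> List[dict]:
    """Split text on whitespace and record character offsets per token."""
    tokens = []
    for i, match in enumerate(re.finditer(r"\S+", text)):
        end = match.end()
        tokens.append({
            "text": match.group(),
            "start": match.start(),
            "end": end,
            "id": i,
            # True if the token is followed by whitespace in the original text
            "ws": end < len(text) and text[end].isspace(),
        })
    return tokens


def add_simple_tokens(stream: Iterable[dict]) -> Iterator[dict]:
    """Hypothetical stand-in for a tokenizing preprocessor."""
    for task in stream:
        task = dict(task)  # copy so the incoming task is not mutated
        task["tokens"] = whitespace_tokens(task["text"])
        yield task


# One line of the sample data, in the {"text": ...} shape a text loader yields
stream = [{"text": "Fine Gael will work"}]
tasks = list(add_simple_tokens(stream))
```

A whitespace split keeps punctuation attached to words, which is one reason a proper spaCy tokenizer is usually the better choice for highlighting spans.
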

I did manage to get it working by removing the NER task entirely and simply displaying the text, but when I try to add the NER task it just disappears, which I guess means it's a token problem.

I guess I was just trying to make sure the text didn't show up in the second set of blocks, but I realised I don't need it and got rid of it. You're right it didn't make a difference.

Is there any way to make the blocks into separate fields? Right now they all count as the same field, and whatever I type into one block automatically gets typed into the other two.

Yes, that's what the field_id setting is for – this defines the key used to store the data you type in. You can set it on the task or overwrite it on the block. See here for details: Annotation interfaces · Prodigy · An annotation tool for AI, Machine Learning & NLP
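
As a minimal sketch of the block-level override, the "blocks" list from the recipe above could give each text_input its own field_id. The key names `start_date`, `end_date` and `span_days` are just assumptions for illustration, and `build_blocks` is a hypothetical helper. Any unique strings should work. This sketch also sets field_autofocus on the first input only, since only one field can hold focus at a time.

```python
def build_blocks() -> list:
    """Return a blocks config with a distinct field_id per text_input block."""
    return [
        {"view_id": "ner_manual", "labels": ["Time Expression"]},
        {
            "view_id": "text_input",
            "field_id": "start_date",  # assumed key name, stored on the task
            "field_rows": 1,
            "field_autofocus": True,
            "field_label": "What is the start date of this sentence",
            "field_placeholder": "01.01.2016",
        },
        {
            "view_id": "text_input",
            "field_id": "end_date",  # assumed key name
            "field_rows": 1,
            "field_label": "What is the end date of this sentence",
            "field_placeholder": "01.01.2016",
        },
        {
            "view_id": "text_input",
            "field_id": "span_days",  # assumed key name
            "field_rows": 1,
            "field_label": "What is the span in days of this sentence",
            "field_placeholder": "1",
        },
    ]


blocks = build_blocks()
# Collect the keys under which each input's value would be saved
field_ids = [b["field_id"] for b in blocks if b["view_id"] == "text_input"]
```

The list would then go into the recipe's return value as `"config": {"blocks": blocks}`, and each typed value should end up under its own key in the saved annotation instead of all three inputs sharing one.
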

Thank you!