text_input: field_suggestions based on function output?

I have a custom recipe containing

blocks = [
        {"view_id": "image", "spans": []},
        {"view_id": "text_input",
            "field_id": "caption",
            "field_rows": 4},
        {"view_id": "ner_manual"}
    ]

and an input stream containing base64-encoded images and text.

  1. Is it possible to provide text_input with field_suggestions taken from a previous annotation of the same image, or from a variable holding the return value of a function?
    e.g.:

blocks = [
        {"view_id": "image", "spans": []},
        {"view_id": "text_input",
            "field_id": "caption",
            "field_rows": 4,
            "field_suggestions": [lastCaptionForThisImage]},
        {"view_id": "ner_manual"}
    ]

or

blocks = [
        {"view_id": "image", "spans": []},
        {"view_id": "text_input",
            "field_id": "caption",
            "field_rows": 4,
            "field_suggestions": [getCaptionFor(image)]},
        {"view_id": "ner_manual"}
    ]

Hi! Doing this entirely in Python would be tricky, because the code used to define the blocks in your recipe runs once on startup, and it doesn't have access to the annotations as you collect them.

One option would be to use custom JavaScript to store all captions the user has previously entered, and then use that to populate the <datalist> of the text input field, which is how the field suggestions are implemented under the hood.

I haven't tried this yet, but something along those lines could work:

let allCaptions = []
let prevTaskHash = null

// Escape characters that would otherwise break the value="..." attribute
const escapeAttr = text => text.replace(/&/g, '&amp;').replace(/"/g, '&quot;').replace(/</g, '&lt;')

document.addEventListener('prodigyanswer', event => {
    // This runs every time the user submits an annotation
    const { task } = event.detail
    if (!task.caption) return  // nothing was entered for this example
    // Update the captions with a unique list of previous + current
    // (sorry, this is really ugly in JavaScript)
    allCaptions = [...new Set([...allCaptions, task.caption])]
})

document.addEventListener('prodigyupdate', event => {
    // This runs every time the task is updated
    const { task } = event.detail
    if (prevTaskHash !== task._task_hash) {  // we have a new example
        const datalist = document.querySelector('.prodigy-content datalist')
        if (!datalist) return  // no suggestions list rendered, try again on the next update
        const options = allCaptions.map(caption => `<option value="${escapeAttr(caption)}" />`)
        datalist.innerHTML = options.join('')
        prevTaskHash = task._task_hash
    }
})

Hi Ines, thank you for your detailed reply.

Your code suggestion does indeed work for the intended use. It just lags one example behind, which I guess is impossible to change. That is: Example 1: suggestions [], enter Caption1; Example 2: suggestions [], enter Caption2; Example 3: suggestions [Caption1]; ...

There is, however, the unfortunate side effect that adding "field_suggestions" as follows:

    blocks = [
        {"view_id": "image", "spans": []},
        {"view_id": "text_input",
            "field_id": "caption",
            "field_rows": 4,
            "field_suggestions": [""]},
        {"view_id": "ner_manual"}
    ]

breaks the "field_rows" config: the input field can no longer be resized, and multi-line input seems to be prevented (though that may just be visual, as only one row is displayed).

On a side note: do you have any experience with performance if the list grows (very) large?
Also, regarding finding the correct annotation:
I had the idea of turning allCaptions into a dictionary of sets (each of which ideally only ever has one entry).
Is it possible to get the input image of the current example and use it as the key in that dictionary?
Basically getting:

"field_suggestions": [allCaptions[currentImage]]

(My experience with HTML and JS is very limited.)
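
For reference, the dictionary-of-sets idea could be sketched roughly as below. This is untested against Prodigy itself and rests on assumptions: that each task carries the base64 image under the "image" key and the entered text under "caption" (the field_id from the blocks), and the names captionsByImage, rememberCaption and captionsFor are made up for illustration.

```javascript
// Hypothetical sketch: captions grouped per image instead of one global list.
// Assumes task.image holds the base64 image and task.caption the entered text.
const captionsByImage = new Map()

// Store a submitted caption under the image it belongs to.
function rememberCaption(task) {
    if (!task.caption) return  // nothing entered for this example
    if (!captionsByImage.has(task.image)) {
        captionsByImage.set(task.image, new Set())
    }
    captionsByImage.get(task.image).add(task.caption)
}

// Look up all captions previously entered for the image of this task.
function captionsFor(task) {
    return [...(captionsByImage.get(task.image) || [])]
}

// Only hook into the page when running in the browser.
if (typeof document !== 'undefined') {
    document.addEventListener('prodigyanswer', event => {
        rememberCaption(event.detail.task)
    })
}
```

The result of captionsFor(task) would then take the place of allCaptions when the <datalist> options are built. Base64 strings make long keys; if that turns out to be a problem, a shorter identifier from the task could be used as the key instead.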

I think the problem here is that the next task is already rendered before allCaptions is updated. One thing you could try is checking, in addition to the task hash, whether the length of allCaptions has changed since the last time you added the options. So you could keep a variable similar to prevTaskHash and store the previous caption count. Just be careful when adding more conditions in the prodigyupdate callback, because that runs every time the task updates, including on every keystroke.
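
A rough, untested sketch of that extra check, assuming the same events and selector as in the snippet above. The names captions, lastRender and needsRefresh are made up for illustration, and whether prodigyupdate actually fires again after the captions have changed is something you'd have to verify.

```javascript
const captions = []  // unique captions entered so far
const lastRender = { taskHash: null, captionCount: 0 }

// True if the suggestions should be rebuilt: new example OR new captions.
function needsRefresh(last, taskHash, captionCount) {
    return taskHash !== last.taskHash || captionCount !== last.captionCount
}

// Only hook into the page when running in the browser.
if (typeof document !== 'undefined') {
    document.addEventListener('prodigyanswer', event => {
        const { caption } = event.detail.task
        if (caption && !captions.includes(caption)) captions.push(caption)
    })

    document.addEventListener('prodigyupdate', event => {
        const { task } = event.detail
        // Cheap comparison first, since this also runs on every keystroke
        if (!needsRefresh(lastRender, task._task_hash, captions.length)) return
        const datalist = document.querySelector('.prodigy-content datalist')
        if (!datalist) return  // try again on the next update
        datalist.replaceChildren(...captions.map(text => {
            const option = document.createElement('option')
            option.value = text
            return option
        }))
        // Remember what was rendered so unchanged updates are skipped
        lastRender.taskHash = task._task_hash
        lastRender.captionCount = captions.length
    })
}
```

Both comparisons are plain value checks, so the early return keeps the per-keystroke cost low.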

Ah yes, sorry about that, that's currently one of the limitations of using the field_suggestions!

This kinda depends on how large the list gets. The suggestions are added as a plain <datalist>, which is natively supported in the browser, so one of the main bottlenecks here is the browser. I'm not sure when that typically starts getting laggy. If your list gets large, I think a bigger performance issue is the way the <option>s are added. Writing to innerHTML can get slow, and it can be faster to create the elements in JavaScript and append them instead.
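
As a minimal sketch of creating the elements instead of writing to innerHTML (not benchmarked, and fillDatalist is a made-up helper name):

```javascript
// Build the <option> elements off-DOM and swap them in with a single call.
// `doc` defaults to the page's document and is only a parameter so the
// function can be tried outside the browser.
function fillDatalist(datalist, captions, doc = document) {
    const fragment = doc.createDocumentFragment()
    for (const caption of captions) {
        const option = doc.createElement('option')
        option.value = caption  // set as a property, so no HTML escaping is needed
        fragment.appendChild(option)
    }
    // Replaces all existing options with the new ones
    datalist.replaceChildren(fragment)
}
```

In the prodigyupdate callback, this would be called as fillDatalist(datalist, allCaptions) in place of the innerHTML assignment. A side benefit is that captions containing quotes or angle brackets can't break the markup.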