Blocks and Progress bar

Hello,

I have a couple of questions that I hope you might be able to help me with:

  1. How can I have a number/percentage in the progress bar instead of the infinity symbol?
    I tried setting "show_stats": true in the prodigy.json file and in the recipe config, but nothing changes.
  2. Is it possible to customize the output of the annotation? e.g. extracting the information in a different way or outputting different information to the .jsonl file?
  3. Is it possible to create a blocks recipe where one of the blocks only appears depending on the answer to a previous block (e.g. the 1st block is of type choice and the 2nd block contains an HTML radio button that only appears depending on the answer to the choice block)?

Many thanks in advance!

Sofia

Hi! Prodigy will calculate the progress automatically if the stream returned by the recipe has a length (e.g. if it's a list). If the stream is a generator, it could potentially be infinite and it will only be read one batch at a time, so Prodigy can't know how many examples are left.
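
A minimal sketch of that difference in plain Python, with no Prodigy import. The `load_examples` generator and its texts are hypothetical stand-ins for whatever loader a recipe uses:

```python
def load_examples():
    """Hypothetical loader: yields tasks one at a time, like a typical stream."""
    for text in ["first text", "second text", "third text"]:
        yield {"text": text}


def has_length(stream):
    """Return True if len() works on the stream, False otherwise."""
    try:
        len(stream)
        return True
    except TypeError:
        return False


generator_stream = load_examples()
list_stream = list(load_examples())  # reads every example into memory

print(has_length(generator_stream))  # generators do not support len()
print(has_length(list_stream), len(list_stream))
```

Converting to a list reads the whole stream upfront, so it is only a good fit when the data comfortably fits in memory.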

If you know how many examples you have, or want to implement some other custom logic to calculate the progress, you can also add a "progress" callback to the components returned by your recipe: https://prodi.gy/docs/custom-recipes#progress

(Just keep in mind that the progress is calculated on the server, so it's updated every time answers are sent back and not fully in real time. Calculating it on the server means that you can easily write custom logic and even take other things like the model into account – for example, in the active learning recipes, the progress is an estimate of when the loss might hit zero and there's nothing left to learn).
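
Here's a rough sketch of what such a callback could look like when the total is known. The two-argument signature `(ctrl, update_return_value)` is an assumption based on newer Prodigy versions (older versions pass different arguments, so check the docs linked above), and `make_progress_components` is a hypothetical helper. The counter is kept by an "update" callback, which receives each batch of answers:

```python
def make_progress_components(total):
    """Build hypothetical "update" and "progress" callbacks sharing one counter."""
    state = {"answered": 0}

    def update(answers):
        # Called with each batch of answers sent back to the server
        state["answered"] += len(answers)

    def progress(ctrl, update_return_value):
        # ASSUMPTION: signature of newer Prodigy versions; verify against the docs
        if total <= 0:
            return 0.0
        # Fraction of the known total, capped at 1.0
        return min(state["answered"] / total, 1.0)

    return update, progress


update, progress = make_progress_components(total=4)
update([{"text": "a", "answer": "accept"}, {"text": "b", "answer": "reject"}])
print(progress(None, None))
```

The two functions would then be returned as the "update" and "progress" components of the recipe.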

The annotations stored in Prodigy's database will have Prodigy's JSON format, but you can always access it and export it in a custom format, rearrange the information etc. For example, you can connect to the database in Python and load your annotations. This gives you a list of dictionaries with the data, that you can then modify and export however you like:

from prodigy.components.db import connect

db = connect()
# This is a list of dictionaries that you can modify and export
examples = db.get_examples("your_dataset")
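
Once the annotations are loaded as a list of dictionaries, reshaping them is plain Python. A sketch, assuming choice-style annotations with "text", "accept" and "answer" keys; the sample records, the output field names and `to_custom_jsonl` are made up for illustration:

```python
import json


def to_custom_jsonl(examples):
    """Keep only accepted examples and emit one compact JSON object per line."""
    lines = []
    for eg in examples:
        if eg.get("answer") != "accept":
            continue  # skip rejected and ignored examples
        record = {"text": eg["text"], "labels": eg.get("accept", [])}
        lines.append(json.dumps(record))
    return "\n".join(lines)


# Made-up records standing in for annotations loaded from the database
examples = [
    {"text": "A", "accept": ["LABEL_ONE"], "answer": "accept"},
    {"text": "B", "accept": [], "answer": "reject"},
]
print(to_custom_jsonl(examples))
```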

You can also attach custom metadata to the examples you stream in and it will be preserved and saved with the annotations. This lets you include things like custom internal IDs, document meta information, and so on.
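
As a sketch, the metadata can be attached in a small wrapper around the stream. Putting it under a "meta" key is an assumption about the convention you want to follow, and `add_meta`, `doc_id` and `source` are hypothetical names:

```python
def add_meta(stream, source):
    """Attach hypothetical internal IDs to each task under a "meta" key."""
    for i, task in enumerate(stream):
        task = dict(task)  # shallow copy so the incoming task is not modified
        # Merge with any metadata the task already carries
        task["meta"] = {**task.get("meta", {}), "doc_id": f"{source}-{i}", "source": source}
        yield task


stream = [{"text": "a"}, {"text": "b", "meta": {"page": 2}}]
for task in add_meta(stream, "batch1"):
    print(task)
```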

For a block that only appears conditionally, you could add custom JavaScript and listen to the prodigyupdate event, which is fired every time an update is made to the current task, for example, if an option is selected or unselected. You can then show/hide the radio button or any other content based on the contents of the "accept" key (the list of selected choice options).

In general, we do recommend keeping the interfaces straightforward and avoiding too many conditional changes of the UI. If the annotator can see everything they need to do upfront, it reduces the potential for errors, lets them move faster, and makes it easier later on to reproduce exactly what an annotator saw at any given point. So sometimes it can be more efficient to make several passes over the data and ask for different pieces of information each time.

Hello @ines,

Many thanks for your detailed explanation. It was very helpful!

Cool, I only had to add stream = list(stream) before the return and that solved the problem.

By running this code the examples list was empty - I guess you meant to use function get_database() instead of get_examples(). By using the get_database() function I was able to access the output data, as you said. Many thanks!

I'm not sure I understand how I would do this (sorry, I'm not used to working with JavaScript). Let us assume I have two blocks: a choice block and an HTML block defining a radio button that should ideally only appear if the answer to the choice block is different from 0. Where do I define the custom JavaScript? I am defining the JavaScript for the radio button in the return. But for this I need to define the JavaScript before, right? To be able to return 2 blocks or only 1.

Many thanks again!

Sofia

Sorry, I meant get_dataset, yes! This was a typo.

Yeah, so you would define the JavaScript as the "javascript" key returned by your recipe's "config". It would then apply to every task. Under the hood, the HTML block with the radio button/checkbox would always be there – but it would be visually hidden unless something specific happens – for example, a certain option gets selected. So conceptually, the logic goes like this:

  • Trigger: the current example changes (e.g. because the annotator made a change).
    • Is choice option X selected?
      • Select the checkbox and mark it as visible / invisible.
  • Trigger: the checked status of the checkbox changes (e.g. because annotator ticked it).
    • Update the current example with that information.

Here's how this could look in code:

// This is called when Prodigy loads
document.addEventListener('prodigymount', event => {
    const checkbox = document.querySelector("#checkbox")
    // Hide the checkbox by default
    checkbox.style.display = "none"
    // If the checkbox is checked, update "custom_value" of the current task
    checkbox.addEventListener('change', event => {
        const checked = event.target.checked
        window.prodigy.update({ custom_value: checked })
    })
})

// This is called when a task is updated
document.addEventListener('prodigyupdate', event => {
    const { task } = event.detail
    const selected = task.accept || []  // the selected options
    const checkbox = document.querySelector("#checkbox")
    // Show the checkbox if LABEL_ONE is selected, hide it if it's not
    if (selected.includes('LABEL_ONE')) {
        checkbox.style.display = "block"
    }
    else {
        checkbox.style.display = "none"
    }
})

And here's what the recipe could return:

return {
    "dataset": dataset,
    "view_id": "blocks",
    "stream": stream,
    "config": {
        "javascript": JAVASCRIPT,
        "blocks": [
            {"view_id": "choice"},
            {"view_id": "html", "html": '<input id="checkbox" type="checkbox" />'},
        ],
    },
}
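
The return value above leaves `JAVASCRIPT` and the stream undefined. Here's a sketch of how they might be filled in, assuming the script is kept as a Python string and that each task shown in the choice block carries an "options" list whose ids match the label checked in the script. `add_options` and the option texts are hypothetical:

```python
# Placeholder: in practice this string would hold the two event listeners above
JAVASCRIPT = "/* prodigymount and prodigyupdate listeners go here */"

# The id "LABEL_ONE" must match the value checked in the prodigyupdate listener
OPTIONS = [
    {"id": "LABEL_ONE", "text": "Label one"},
    {"id": "LABEL_TWO", "text": "Label two"},
]


def add_options(stream):
    """Yield copies of the incoming tasks with the choice options attached."""
    for task in stream:
        task = dict(task)  # shallow copy so the incoming task is not modified
        task["options"] = OPTIONS
        yield task


# list(...) gives the stream a length, so the progress can be calculated
stream = list(add_options([{"text": "first text"}, {"text": "second text"}]))
print(stream[0]["options"][0]["id"])
```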

Many many thanks for your help @ines!