Hi @nlp-guy,
Please see this post from @ines where she addresses a very similar question: Add `review` mode to `image_manual` - #2 by ines
Here's the gist of it for your convenience:

> One reason is that we haven't really found a satisfying way yet to display conflicting image annotations. Displaying all variations together can get pretty messy – and it's kind of unclear how to handle subtle differences. If you're annotating text and tokens, there are only so many possible variations, but if you're actually drawing bounding boxes, they're pretty much always going to be different (at least, it's super unlikely that two people will draw a box with identical pixel coordinates). So there probably need to be additional settings to define those things.
The post also suggests a workflow for correcting image annotations: revising them in a separate annotation pass. If you use multiple annotators with some overlap, you could additionally have a preprocessing script/function that selects for re-annotation only the examples that received different annotations, i.e. different spans, from your annotators.
A version of the `image.manual` recipe with such example filtering could look like this:
```python
from typing import Dict, List

from prodigy import recipe
from prodigy.components.preprocess import fetch_media
from prodigy.components.stream import Stream, get_stream
from prodigy.core import Arg
from prodigy.protocols import ControllerComponentsDict
from prodigy.types import LabelsType, SourceType, StreamType
from prodigy.util import INPUT_HASH_ATTR, log


def spans_not_identical(dict_list: List[Dict]) -> bool:
    """Check whether the grouped examples contain differing spans."""
    spans = [d.get("spans") for d in dict_list]
    # If all spans are empty, there's no conflict to review
    if all(span == [] for span in spans):
        return False
    # True if at least one annotator's spans differ from the first
    return not all(span == spans[0] for span in spans[1:])


def group_by_input_hash(stream: StreamType) -> Dict[int, List[Dict]]:
    """Group examples by their input hash.

    Args:
        stream (StreamType): Input stream of examples.

    Returns:
        Dict: Dict of examples grouped by input hash.
    """
    grouped: Dict[int, List[Dict]] = {}
    for eg in stream:
        input_hash = eg.get(INPUT_HASH_ATTR)
        if input_hash not in grouped:
            grouped[input_hash] = []
        grouped[input_hash].append(eg)
    return grouped


def get_conflicting(grouped: Dict[int, List[Dict]]) -> List[Dict]:
    """Select one representative example per group of conflicting annotations."""
    conflicting = []
    for annotations in grouped.values():
        if spans_not_identical(annotations):
            conflicting.append(annotations[0])
    return conflicting


@recipe(
    "image.manual.review",
    # fmt: off
    dataset=Arg(help="Dataset to save annotations to"),
    source=Arg(help="Data to annotate (directory of images, file path or '-' to read from standard input)"),
    label=Arg("--label", "-l", help="Comma-separated label(s) to annotate or text file with one label per line"),
    loader=Arg("--loader", "-lo", help="Loader if source is not directory of images"),
    exclude=Arg("--exclude", "-e", help="Comma-separated list of dataset IDs whose annotations to exclude"),
    darken=Arg("--darken", "-D", help="Darken image to make boxes stand out more"),
    width=Arg("--width", "-w", help="Default width of the annotation card and space for the image (in px)"),
    no_fetch=Arg("--no-fetch", "-NF", help="Don't fetch images as base64"),
    remove_base64=Arg("--remove-base64", "-R", help="Remove base64-encoded image data before storing example in the DB. (Caution: if enabled, make sure to keep original files!)")
    # fmt: on
)
def image_manual(
    dataset: str,
    source: SourceType,
    label: LabelsType,
    loader: str = "images",
    exclude: List[str] = [],
    darken: bool = False,
    width: int = 675,
    no_fetch: bool = False,
    remove_base64: bool = False,
) -> ControllerComponentsDict:
    """
    Manually re-annotate conflicting examples by drawing rectangular bounding
    boxes or polygon shapes on the image.
    """
    log("RECIPE: Starting recipe image.manual.review", locals())
    stream = get_stream(
        source,
        loader=loader,
        dedup=False,
        rehash=True,
        input_key="image",
        is_binary=False,
    )
    # Keep only one copy of each example whose annotations disagree
    grouped = group_by_input_hash(stream)
    conflicting_examples: List[Dict] = get_conflicting(grouped)
    stream = Stream.from_iterable(conflicting_examples)
    if not no_fetch and loader != "image-server":
        stream.apply(fetch_media, stream=stream, input_keys=["image"])

    def before_db(examples: List[Dict]) -> List[Dict]:
        # Remove all data URIs before storing example in the database
        for eg in examples:
            if eg["image"].startswith("data:"):
                eg["image"] = eg.get("path")
        return examples

    return {
        "view_id": "image_manual",
        "dataset": dataset,
        "stream": stream,
        "before_db": before_db if remove_base64 else None,
        "exclude": exclude,
        "config": {
            "labels": label,
            "darken_image": 0.3 if darken else 0,
            "custom_theme": {"cardMaxWidth": width},
            "exclude_by": "input",
            "auto_count_stream": True,
        },
    }
```
It's just a starter script: you might want to show the annotation that most annotators agreed on instead of the first available one, or incorporate other logic.
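For instance, a majority-vote variant of `get_conflicting` could pick the annotation that most annotators agreed on per image. This is a minimal sketch; `spans_key` and `get_majority` are illustrative helpers, not Prodigy functions:

```python
import json
from collections import Counter
from typing import Dict, List


def spans_key(eg: Dict) -> str:
    # Serialize the spans so that identical annotations compare equal
    return json.dumps(eg.get("spans", []), sort_keys=True)


def get_majority(grouped: Dict[int, List[Dict]]) -> List[Dict]:
    """Pick the example whose spans most annotators agreed on per group."""
    selected = []
    for annotations in grouped.values():
        counts = Counter(spans_key(eg) for eg in annotations)
        majority, _ = counts.most_common(1)[0]
        # Keep the first example that matches the majority annotation
        selected.append(next(eg for eg in annotations if spans_key(eg) == majority))
    return selected
```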
You would call this recipe with your annotated dataset as input and save the results to a separate dataset, so that you can create the final dataset out of the two: the original and the reviewed one.
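Assuming the recipe above is saved as `recipe.py` and your annotations live in a dataset called `image_annotations` (the dataset names, labels and file name here are all placeholders), the calls could look roughly like this:

```bash
# Review pass: read the multi-annotator dataset and save the corrections
# to a new dataset
prodigy image.manual.review image_review dataset:image_annotations --label CAR,PERSON -F recipe.py

# Combine the original and the reviewed annotations, e.g. with db-merge
prodigy db-merge image_annotations,image_review image_final
```

Note that `db-merge` simply concatenates the datasets, so you'd likely still want a small script to drop the conflicting originals that the review pass superseded.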