Interface for selecting multiple bounding boxes

magdaaniol · May 6, 2025, 1:54pm

I imagine that clickable bounding boxes would be ideal for this use case. This is actually a feature that we're currently working on.

In the meantime, adding a "GROUP" label that annotators use to draw a bounding box around the items they want to group simplifies the immediate UI challenge a lot:
[Image: cats_groups]

You could then use the before_db callback to process the spans and find the ones contained in the GROUP box. Since polygon and freehand shapes do not have their centers precomputed, I used the shapely package to quickly check the containment. If all your boxes are rectangular, you don't really need it:

# your custom recipe with image_manual UI
import uuid
from shapely.geometry import Polygon, Point

def process_span_groups(annotation):
        """
        Adds a unique ID to each span and resolves "GROUP" labeled spans
        into a "span_groups" field, handling rectangles, polygons, and freehand.

        Args:
            annotation (dict): The annotated example.

        Returns:
            dict: The updated annotation dictionary with unique span IDs and the "span_groups" field.
        """
        if "spans" not in annotation:
            return annotation

        # Add a unique ID to each span
        for span in annotation["spans"]:
            span["id"] = uuid.uuid4().hex

        group_spans = [span for span in annotation["spans"] if span.get("label") == "GROUP"]
        individual_spans = [span for span in annotation["spans"] if span.get("label") != "GROUP"]

        span_groups = []
        for group_span in group_spans:
            contained_span_ids = []
            group_points = group_span.get("points")

            if not group_points:
                continue  # Skip if the group span has no points

            # Polygon built from the GROUP outline, used for the containment checks
            group_geom = Polygon(group_points)

            for individual_span in individual_spans:
                individual_points = individual_span.get("points")
                individual_type = individual_span.get("type")

                if not individual_points:
                    continue

                try:
                    if individual_type in ["rect", "polygon", "freehand"]:
                        # For simplicity, we check whether the *center* of the individual
                        # span's bounding box falls within the group span.
                        # For polygon and freehand spans, the bounding box is computed
                        # from the points.
                        if individual_type == "rect":
                            center_x = individual_span["x"] + individual_span["width"] / 2
                            center_y = individual_span["y"] + individual_span["height"] / 2
                        else:
                            min_x = min(p[0] for p in individual_points)
                            max_x = max(p[0] for p in individual_points)
                            min_y = min(p[1] for p in individual_points)
                            max_y = max(p[1] for p in individual_points)
                            center_x = (min_x + max_x) / 2
                            center_y = (min_y + max_y) / 2

                        individual_center = Point(center_x, center_y)
                        if group_geom.contains(individual_center):
                            contained_span_ids.append(individual_span["id"])

                    else:
                        print(f"Warning: Unknown individual span type '{individual_type}'. Skipping.")
                        continue
                except Exception as e:
                    print(f"Error processing individual span with Shapely: {e}. Skipping.")
                    continue

            if contained_span_ids:
                span_groups.append({
                    "label": "GROUP",
                    "color": group_span.get("color"),
                    "spans": contained_span_ids
                })
        if span_groups:
            annotation["span_groups"] = span_groups

        return annotation
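
If every shape, including the GROUP box, is a rectangle, the center test reduces to a few comparisons and shapely isn't needed. A minimal sketch, assuming each rect span carries the x, y, width and height keys used above (the helper names rect_center and center_in_rect and the coordinates are made up for illustration):

```python
def rect_center(span):
    """Return the (x, y) center of a rect span with x, y, width, height keys."""
    return (span["x"] + span["width"] / 2, span["y"] + span["height"] / 2)


def center_in_rect(span, group):
    """True if the center of `span` lies strictly inside the rect `group`."""
    cx, cy = rect_center(span)
    return (
        group["x"] < cx < group["x"] + group["width"]
        and group["y"] < cy < group["y"] + group["height"]
    )


# Hypothetical coordinates, only to show the call
group = {"x": 0, "y": 0, "width": 100, "height": 100}
inside = {"x": 10, "y": 10, "width": 20, "height": 20}
outside = {"x": 95, "y": 40, "width": 30, "height": 20}
print(center_in_rect(inside, group), center_in_rect(outside, group))
```

The strict comparisons mean a center lying exactly on the GROUP border does not count as contained, which is also how shapely's contains treats boundary points.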

Then call process_span_groups in the before_db callback:

from typing import List
from prodigy.types import TaskType  # type alias used by the built-in recipes

def before_db(examples: List[TaskType]) -> List[TaskType]:
    for eg in examples:
        # remove_base64 is the argument of the enclosing recipe
        if remove_base64 and eg["image"].startswith("data:"):
            eg["image"] = eg.get("path")
        # Process the groups if they exist in the submitted data
        process_span_groups(eg)
        if "span_groups" in eg:
            # You could add validation or transformation logic here
            pass
    return examples

This should then result in the following DB record:

"span_groups": [
    {
      "label": "GROUP",
      "color": "springgreen",
      "spans": [
        "bec0e831b1a64974a5521c099ae26b89",
        "d1712ee4e05b4ba4b7e6bdc17e0b66cb",
        "34ac6ec3322e4cec92d7ca4e099b09f5"
      ]
    }
]

Here, "spans" holds the IDs you've assigned to each annotated span.
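
To work with the saved record later, the IDs can be mapped back to the spans they refer to. A minimal sketch, assuming the record carries the "spans" and "span_groups" fields produced above; resolve_span_groups is a hypothetical helper and the example record is made up:

```python
def resolve_span_groups(record):
    """Return, per group, the full span dicts referenced by the stored IDs."""
    spans_by_id = {
        span["id"]: span for span in record.get("spans", []) if "id" in span
    }
    resolved = []
    for group in record.get("span_groups", []):
        # Skip IDs that no longer match a span (e.g. after manual edits)
        members = [spans_by_id[i] for i in group["spans"] if i in spans_by_id]
        resolved.append({"label": group["label"], "spans": members})
    return resolved


# Made-up record with shortened IDs, for illustration only
record = {
    "spans": [
        {"id": "a1", "label": "CAT"},
        {"id": "b2", "label": "CAT"},
        {"id": "c3", "label": "GROUP"},
    ],
    "span_groups": [
        {"label": "GROUP", "color": "springgreen", "spans": ["a1", "b2"]}
    ],
}
for group in resolve_span_groups(record):
    print(group["label"], [span["label"] for span in group["spans"]])
```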

This of course assumes that it's practical to draw such GROUP boxes and that the centers are a good indicator of containment. If so, it would simplify the challenge a lot.
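
If centers turn out to be too loose a criterion (for example, a large box whose center sits inside the GROUP outline while much of it lies outside), one alternative is to require that a minimum share of the box's area falls inside the group. A rough sketch for rectangular shapes only, again assuming x, y, width and height keys; overlap_fraction and the 0.5 threshold are illustrative choices, not part of Prodigy:

```python
def overlap_fraction(span, group):
    """Share of the span's area that lies inside the group rectangle (0.0 to 1.0)."""
    left = max(span["x"], group["x"])
    top = max(span["y"], group["y"])
    right = min(span["x"] + span["width"], group["x"] + group["width"])
    bottom = min(span["y"] + span["height"], group["y"] + group["height"])
    # No intersection if the overlap rectangle has no positive extent
    if right <= left or bottom <= top:
        return 0.0
    return (right - left) * (bottom - top) / (span["width"] * span["height"])


# Hypothetical coordinates: the right half of this box sticks out of the group
group = {"x": 0, "y": 0, "width": 100, "height": 100}
half_inside = {"x": 80, "y": 0, "width": 40, "height": 50}

# The threshold is an arbitrary example value
is_member = overlap_fraction(half_inside, group) >= 0.5
print(overlap_fraction(half_inside, group), is_member)
```

For polygon or freehand shapes, the area of the shapely intersection between the two geometries could play the same role.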
