Precision image annotation

Hello Prodigy!

I’ve recently started to move over our image annotation system to prodigy with good results. I’m currently using a personal license to get started, with the intent to update to a commercial license once we verify that everything is workable. :slight_smile:
However! I’ve run into some limitations that I’m not sure how to work around, and I’m hoping I can get some assistance.

Background:
We’re annotating industry-specific images in natural environments. As part of this, we’re extracting the polygon boundaries of target objects with high precision requirements. The workflow is a set of tasks: we first determine whether an image needs annotation, then draw the polygons, then label them in a separate task, and finally verify in a third task that the annotated images are of acceptable quality.

Issue 1:
We can’t efficiently zoom and reduce line thickness and/or opacity so that we can precisely select an object boundary. This is blocking our transition right now and is the issue I’m most keen on getting fixed.

Issue 2:
Attempting to zoom (e.g. in Chrome) increases the size of the accept/deny buttons to quickly cover the whole screen. Is there a way to ensure those buttons are always below the task card even if it forces scrolling for large tasks, or keep them at a consistent size using max-width or some other property?

Issue 3:
As you see from the background, we have a few different task types, primarily image_manual and choice. For choice with auto accept on, the accept/deny/skip-UI is causing problems.
We want to be able to hide the footer with CSS for only this task type, e.g. by specifying a CSS file in the task configuration. It does not seem to be possible currently, but would make the editor easier to work with, and possibly allow us to solve some of the other problems on our own. Is there a way forward on this you could point me at?
I’ve read about some solutions using greasemonkey, but we’re not necessarily in control of the end-user browser.

Issue 4:
We have an interest in a masking tool, where we can paint pixels that belong to specific classes, as some of our annotations are unsuitable to mark with polygons due to the complexity of the object. This is less of a priority, but would be useful.

Looking forward to hearing back from you. Thanks!


Hi! Thanks so much for the detailed feedback :slightly_smiling_face:

Some aspects of the fully manual image mode are still experimental and the current feature set mostly focuses on fast annotation for common computer vision tasks (where high-res images and high-precision annotation are also a lot less important).

The auto-adjusting line thickness and font size also came up recently in this thread – I definitely want to update the algorithm here. We could then easily have a "base size" setting that lets you adjust the general line width (the visual "target size" of the lines etc. after the image was auto-resized). Exposing other config settings shouldn't be a problem either – we didn't expose everything yet because I wasn't sure what people would actually need. But in general, pretty much any semi-hardcoded size can be exposed as a config setting, if needed.

The image width adjusts to the width of the card, so you can overwrite the "cardMaxWidth" value in the "customTheme" settings in your prodigy.json or recipe config to fit more of the image on the screen.

When you're talking about zooming in your use case, I assume you mean really zooming in and enlarging the image, even beyond the default screen width, right?

For cases like this, it might make sense to add some kind of image-only zoom functionality to the image container instead of "hijacking" the native zoom. Maybe something similar to Google Maps zoom? (I'd have to read up on this and see what the implications are and what makes the most sense for Prodigy.)

Native browser zoom is supposed to increase the size of everything proportionally – including text, buttons and other elements. Changing that behaviour would produce very counterintuitive results and potentially be very problematic for accessibility.

For full flexibility, you can always edit the static/index.html and put in any CSS or JS. The current view_id isn't easily exposed at the moment – it's a good point though, and I'll definitely add this to the JS support (currently experimental and available for testing in the HTML view).

As a quick and dirty workaround, you could add a query variable to the web url, e.g. localhost:8080?choice and then check for that via the window.location.search. If it's "choice", your script could then append a <style> tag to the <head> which hides the buttons via their CSS class (I don't think it currently has a human-readable one, but the class will be consistent throughout the app).
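A minimal sketch of that workaround, assuming the script is added to static/index.html and the app is opened as localhost:8080?choice. The class name below is a hypothetical placeholder, since the real class of the button container isn't given here; inspect the page and substitute it. The helper names are made up for illustration.

```javascript
// Hypothetical placeholder: the real class name of the button container is
// not given in this thread. Inspect the page and substitute it.
const BUTTON_CLASS = "REPLACE-WITH-BUTTON-CLASS";

// Returns true if the query string (e.g. "?choice") contains a "choice" key.
function isChoiceTask(search) {
  return new URLSearchParams(search).has("choice");
}

// Builds the CSS rule that hides elements with the given class.
function hideButtonsCss(className) {
  return "." + className + " { display: none !important; }";
}

// Only touch the DOM when running in a browser.
if (typeof window !== "undefined" && isChoiceTask(window.location.search)) {
  const style = document.createElement("style");
  style.textContent = hideButtonsCss(BUTTON_CLASS);
  document.head.appendChild(style);
}
```

Because the check only looks at the URL, the same index.html can serve both the image_manual and the choice tasks, and the buttons are hidden only when the annotator opens the ?choice link.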

Features like masking are probably a bit out of scope at this point. For the objects that are this complex, maybe you'd actually be better off using a real graphics program or illustration tool instead? Those tools really focus on what you need, and you'd still be able to automate a lot of the process before and after.

Hey Ines,

Thanks for the response! I saw the previous discussion on some of these topics, and I’m checking them again.

I agree completely with your comment on native browser zoom. Forcing the footer below the card would be more of a way to fix potential issues where cards are big enough to be covered by the footer, which can happen with bigger images - but I see you’ve handled that by adding scroll space below the card instead, which works just as well if not better for the common case. :smiley:

I got the min/max width and theme customization working, but I get the impression I could solve more of these issues on my own if I could do something like "custom_style":"mystyle.css" instead, and I think it’d be less of a development impact on your end in the long run compared to adding all my special case theme settings for line widths and opacity. :stuck_out_tongue:

For now, I’m editing static/index.html to add a style tag, and targeting .prodigy-content - which helped resolve my problem with the border width and opacity, and even allows me to work around the zoom issue with a maximized image:
<style>
  .prodigy-content circle {
    opacity: 0.4; /* Increase visibility under circles */
  }
  .prodigy-content path {
    stroke-width: 1 !important; /* Thinner lines for increased visibility */
    opacity: 0.5; /* Lower opacity for visibility */
  }
  .prodigy-content svg {
    width: 100% !important; /* Maximize the image in the card, alternative to zooming */
  }
</style>
This works perfectly fine for now, so I’m satisfied with that. :smiley:

If you want to expose custom CSS for different annotation tasks, how about adding the view_id (or a style_class:"classname" in the recipe) to the root div or “main” tag as a class? It wouldn’t affect anything, but you’d be able to specify task-specific details in custom CSS if necessary. With an added custom style file on top, I’d be able to set up everything I need (and more!) without any library hacking. :slight_smile:

As a sidenote, I can’t get the example you provided on configuring JavaScript to work. Is this on a newer version than the one I have (prodigy-1.6.1-cp35.cp36.cp37-cp35m.cp36m.cp37m), or limited to only the custom HTML view_id? I also can’t tell from your linked example whether I’m supposed to put the JavaScript code itself in the JSON file, or simply provide a file path like above: "customJS":"myscript.js" - which might be easier to work with unless I’m in the custom recipe code.

Finally, it would be great to have some validation on the config file (as with the JSON validation for annotation tasks) to avoid issues with the current mixture of under_score_notation and camelCaseNames (is that intentional?). I’m guessing you already have the necessary code to validate it in your platform, so all the advantages are readily available. :sunny:

That's a good idea! We could do something like .prodigy-view-ner, .prodigy-view-choice etc. I do see the point for custom identifiers for custom recipes, though – but maybe we could expose the recipe name separately, like data-prodigy-recipe="ner.teach"?

In addition, some more containers can get human-readable names, like .prodigy-buttons. And just like the "card_css" setting, we could then have a "global_css" setting that would let you add any other customisations you need.

Okay, so what I take away from this is that we could use the following settings (aside from fixing the auto-sizing for lines and labels, as discussed in this thread). Names not final, I'm not always good at naming stuff :stuck_out_tongue:

  • bounding_box_opacity: Opacity of bounding box lines (default 1.0)
  • auto_size_bounding_boxes: Automatically adjust line width and label font size if image is resized to fit. If disabled, lines and labels on large images may appear smaller (default true)
  • bounding_box_font_base: Base font size for bounding box labels, in px (default ??)
  • bounding_box_line_base: Base stroke width for bounding box lines, in px (default 2)
  • force_image_full_width: Force images to always take up the full width of the card, even if they're smaller. May result in lower image quality due to stretching (default false)

Ah yes, at the moment, that feature is only enabled for the html view, because it's the most isolated one. Adding it to the other interfaces is definitely on the list for the next release.

window.prodigy should then also expose more details about the current task and configuration – for example, the current view_id and the name of the current recipe. The custom prodigyanswer event that's fired should probably include both the current task, as well as the answer – although I wouldn't recommend it, this would let you trigger an action immediately after the annotator has clicked on an answer.
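A rough sketch of how listening for that event might look. This assumes the event is dispatched on document and that its detail carries task and answer properties; the description above only says it "should probably" include them, so treat the payload shape as an assumption. summarizeAnswer is a hypothetical helper name.

```javascript
// Assumption: event.detail has `task` and `answer` properties. The shape is
// not confirmed in this thread, so missing values fall back to null.
function summarizeAnswer(event) {
  const { task = null, answer = null } = event.detail || {};
  return { answer: answer, hasTask: task !== null };
}

// Only register the listener when running in a browser.
// Assumption: the event is dispatched on `document`.
if (typeof document !== "undefined") {
  document.addEventListener("prodigyanswer", (event) => {
    console.log("Annotator answered:", summarizeAnswer(event));
  });
}
```

Logging is about as far as this should go; as noted above, triggering actions right after the annotator clicks isn't something to rely on.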

The value of "javascript" should be a string. Initially, the idea here was that users might want to be compiling their markup in Python within recipes – for example, compose the html_template or javascript depending on some command line setting. This would be difficult if the settings only took a file path. But I'm not 100% sure if it's worth it, considering that a file path is just much more convenient and easier to work with in JSON.

Yessss, absolutely! As of v1.5.x, Prodigy uses JSON schemas to validate the stream, and I actually have an open PR internally that adds the same validation for the prodigy.json. I'm not 100% happy with some of the inconsistencies here, but we ended up accepting some of them for easy backwards compatibility. In theory, the only camel-case properties should be the CSS-in-JS-style values in "card_css" (and, in the future, "global_css").

(OT, but maybe the new GitHub actions will actually make Hashicorp's HCL config language more widespread and popular. There's some cool stuff in there that I like, but compared to regular JSON, the learning curve is just too much to ask from users at this point.)

Quick update: the improved auto-sizing and global CSS option is now available in v1.7. The root container also exposes its view ID and recipe name, so you can write custom rules targeting specific combinations of those.