Hiding bounding boxes in image classification task

Dear all,

I am currently working on a project where annotators have to classify tables from scanned documents.

It turned out that annotation is much faster when annotators see the original table rather than the OCR output, which is why I decided to set up a custom recipe based on image classification.

Under the hood I also pass the OCR output to Prodigy, along with word bounding boxes, to feed my model. However, I do not want the bounding boxes to be displayed during the annotation process.

The problem is that if I use the 'choice' interface, all boxes are displayed, making it impossible to do the annotation job, and I have no idea how to hide the labels, let alone the boxes. I was expecting boxes to be displayed only in the 'image_manual' interface. So my question is: do you have any suggestions for how to fix this? Is there any customization available?

Thank you very much in advance.

I'm wondering if your stream uses a key that's also used by the image_manual view. Could you perhaps share (part of) the custom recipe, as well as an example from your stream? That way it will be easier for me to reproduce the issue and think along.

Yes, it does, namely the spans for the word bounding boxes! Even though they are not necessary for the annotation task itself, I need the bounding boxes as inputs for the multimodal model, which receives not only the words but also their positions.

I think I could use a different key, say 'word_spans'. However, this would require changing several mappings in the downstream task after loading the dataset from the Prodigy database.

Does this make sense?

Please find attached the recipe. You can see the input keys in the map_to_prodigy_input function.

OPTIONS = [{"id": 0, "text": "ASSETS/LIABILITIES"},...]

def classify_tables_recipe():

    def get_stream():
        analyzer = get_analyzer()
        df = analyzer.analyze()

        def map_to_prodigy_input(source):
            dp = {}
            dp["image"] = source.image
            dp["meta"] = {"filename": source.file_name}
            dp["spans"] = source.word_annotations  # a list of dicts with
            # 'labels', 'type', 'points', as expected by the image_manual recipe
            dp["options"] = OPTIONS
            return dp

        df = MapData(df, map_to_prodigy_input)
        return df

    return {
        "dataset": "tables",
        "stream": get_stream(),
        "view_id": "choice",
        "progress": _no_progress_bar,
        "config": {
            "choice_style": "single",
            "choice_auto_accept": True,
            "feed_overlap": False,
        },
    }

The simplest way to omit the bounding boxes is indeed to rename the dictionary key, so that you still have the data around but it doesn't get rendered. So word_spans sounds feasible to me.
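If you want to avoid touching the downstream mappings, a minimal sketch of the restore step could look like this. It assumes your stream stores the boxes under "word_spans" and you want the original "spans" key back after loading the examples from the database; the function name is illustrative:

```python
def restore_spans(example):
    """Move the renamed 'word_spans' key back to 'spans' after export."""
    example = dict(example)  # copy so the stored record stays untouched
    if "word_spans" in example:
        example["spans"] = example.pop("word_spans")
    return example

# e.g. applied to everything loaded from the Prodigy database:
# examples = [restore_spans(eg) for eg in db.get_dataset("tables")]
```

That way the rename only exists at the Prodigy boundary and the rest of your pipeline keeps seeing "spans".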

Another option is to omit the spans from this annotation task entirely and keep two separate datasets that you join later. Whether or not that's a practical decision depends a lot on your application/situation.
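If you go the two-dataset route, the join could be a simple lookup on the filename you already store in "meta". A sketch, assuming both datasets carry the same "meta"/"filename" values (the function name is illustrative):

```python
def join_on_filename(choice_examples, span_examples):
    """Attach word spans from a second dataset to the classified examples."""
    spans_by_file = {
        eg["meta"]["filename"]: eg.get("spans", []) for eg in span_examples
    }
    merged = []
    for eg in choice_examples:
        eg = dict(eg)
        eg["spans"] = spans_by_file.get(eg["meta"]["filename"], [])
        merged.append(eg)
    return merged
```

This keeps the annotation UI clean while the model-facing dataset still ends up with both the choice and the boxes.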