Custom model prediction in choice view is not working properly

Hi,

I am using this custom recipe to make predictions on streaming images and review the predictions in the choice view:


import prodigy
from prodigy.components.loaders import Images
from PIL import Image
import numpy as np
from MRCNN.scripts.screen_classifier import model
from MRCNN.scripts.computer_vision_tools import b64_to_image

OPTIONS = [
    {"id": 0, "text": "type1"},
    {"id": 1, "text": "type2"},
    {"id": 2, "text": "type3"},
    {"id": 3, "text": "type4"},
]

@prodigy.recipe("classify-images")
def classify_images(dataset, source):
    
    def get_stream():
        # Load the directory of images and add options to each task
        stream = Images(source)
        for eg in stream:
            # The Images loader encodes each image as a base64 data URI
            image_b64 = eg["image"]
            image_pil = b64_to_image(image_b64)
            # Pre-select the predicted option so it's highlighted in the UI
            prediction = classify_screen(image_pil)
            eg["accept"] = [prediction]
            eg["options"] = OPTIONS
            yield eg

    return {
        "dataset": dataset,
        "stream": get_stream(),
        "view_id": "choice",
        "config": {
            "choice_style": "single", 
            "choice_auto_accept": False
        }
    }

def classify_screen(pil_image, model=model):
    # Resize to the model's expected input size and add a batch dimension
    img = np.array(pil_image.resize((300, 300)))
    input_arr = np.stack([img], 0)
    # The model outputs a one-hot vector per image; the index of the
    # highest value is the predicted class (matches the option ids above)
    return int(np.argmax(model.predict(input_arr)[0]))

But when I run the script, it just runs through all the images in one go and never shows any of them on the screen. I can't find where the problem is, given that I've set choice_auto_accept to False.

I'd really appreciate it if someone could help troubleshoot.

Thanks a lot.

Hi! What exactly do you mean by "it just runs through all the images at one go"? Do you see "no tasks available" when you start the server? Is it possible that examples are being skipped because they're already in the dataset? Is there anything in the logs (with PRODIGY_LOGGING=basic) that looks suspicious?

If that's not the case, how is your classifier implemented? If you're using PyTorch, double-check that it's not launching multiple threads under the hood. One way to check and work around this is to move your stream loader into a separate process and pipe the output forward, to make sure everything is happening in the main thread. See here for an example of how to set up the loading: https://prodi.gy/docs/api-loaders#loaders-stdin
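As a rough sketch of that pattern: the loader runs as its own process, prints one JSON task per line to stdout, and Prodigy reads the stream from stdin when you pass `-` as the source. The script name `load_images.py` and the data-URI format below are illustrative assumptions, not part of the original recipe:

```python
import base64
import json
import sys
from pathlib import Path

def image_to_task(path):
    # Encode one image file as a Prodigy-style task with a base64 data URI,
    # mirroring what the Images loader produces
    data = base64.b64encode(Path(path).read_bytes()).decode("utf-8")
    return {
        "image": f"data:image/jpeg;base64,{data}",
        "meta": {"file": Path(path).name},
    }

if __name__ == "__main__":
    # Print one JSON task per line; pipe this into Prodigy with "-" as source:
    #   python load_images.py ./images | prodigy classify-images my_dataset -
    for path in sorted(Path(sys.argv[1]).glob("*.jpg")):
        print(json.dumps(image_to_task(path)))
```

Because the heavy lifting happens in the loader process, the Prodigy server (and your model, if you move prediction into the loader too) stays in a single main thread.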

The examples were being skipped because they were already in the dataset. Mystery solved! :smile: Thanks a lot for your help.
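For anyone hitting the same thing: Prodigy hashes each incoming task and skips examples whose hash is already in the dataset, so re-running a recipe on the same source can yield an empty stream. Here's a minimal pure-Python sketch of that filtering logic (the hash function is illustrative, not Prodigy's actual implementation):

```python
import hashlib
import json

def task_hash(task):
    # Illustrative stand-in for Prodigy's task hashing: hash the
    # JSON-serialized content that identifies the example
    payload = json.dumps(task, sort_keys=True).encode("utf-8")
    return hashlib.md5(payload).hexdigest()

def filter_seen(stream, seen_hashes):
    # Yield only tasks whose hash isn't already in the dataset;
    # if every task has been seen, the stream is empty and the UI
    # shows "no tasks available"
    for task in stream:
        h = task_hash(task)
        if h not in seen_hashes:
            seen_hashes.add(h)
            yield task
```

Starting the server with a fresh dataset name (or dropping the old one) sidesteps this during development.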
