I'm presented tasks that were never yielded

I have a custom recipe where I have the following

     task["meta"]["has_outlook"] = round(span._.outlook_prob, 3)
     if task["meta"]["has_outlook"] > 0.5:
         yield task

For some reason I get tasks where task["meta"]["has_outlook"] <= 0.5 as well!? How is that possible?

My recipe is like this

    stream = ner_stream()

    return {
        "view_id": "ner_manual",
        "dataset": dataset,
        "stream": stream,
        "config": {
            "lang": outlook_nlp.lang,
            "labels": ["OUTLOOK"],
            "exclude_by": "input",

where the first snippet is a sample from the ner_stream() iterator.

It is really creepy. The more I look at it, the more I'm convinced it's actually showing the wrong value in meta.

What you pass to Prodigy as the "stream" is a regular Python generator, and if your generator never yields a given dictionary, there's really no logical way for Prodigy to know about it. So there must be some piece of logic in your code that doesn't behave the way you think it does.

The "meta" you send out in your stream is the meta that gets send to the web app. If you think there's a problem somewhere, you can print the meta on the way out, enable JavaScript in you custom recipe and log the window.prodigy.content in your browser to see what the app received.

Not sure how the rest of your stream is set up, but also double-check that you're always deepcopying the dictionary if you're sending it out multiple times. Otherwise, you're modifying a task in place, which can have unintended side-effects. You could also convert the float to a string on the way out, to verify that nothing is messing with the actual value on the way out.
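To illustrate the in-place pitfall (the sentence-splitting setup below is hypothetical, just a minimal sketch): if you build several tasks from one parent dict without deepcopying, every yielded task shares the same nested `"meta"` object, so a later mutation silently changes tasks you already sent out.

```python
import copy

def split_by_sentence(doc_task, sentences):
    # Without copy.deepcopy, every yielded dict would share the parent's
    # nested "meta" dict, so setting a value for one sentence would also
    # overwrite it on all previously yielded tasks.
    for sent in sentences:
        task = copy.deepcopy(doc_task)      # independent copy per sentence
        task["text"] = sent
        task["meta"]["has_outlook"] = 0.7   # per-sentence value (illustrative)
        yield task
```

Note that a shallow `dict(doc_task)` or `doc_task.copy()` is not enough here: it copies the outer dict but still shares the nested `"meta"` dict.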

Ah, of course, deep copy. You are a genius. I was splitting each task into one task per sentence and changing the dictionary without making a deep copy of the original document task. Thank you so much!

I have another issue though. My dockerised Prodigy works fine locally, but when I deploy it on the web, after a while I get

Oops, something went wrong :frowning:

I think I'm hitting a timeout before getting a task (there's a lot of filtering going on), but I can't tell. For some reason, none of my `logging` entries (standard logging library) from my custom stream generators are recorded. How can I enable them? The `logging` entries from my custom recipe work just fine, but not the ones inside my custom stream generators.

Haha, glad it worked :smiley:

And there's a lot that could be going on here, but maybe it's because your streams are executed in a different thread? Maybe using a named logger helps, if you're not doing that already? And are you seeing anything Prodigy logs within its streams? If so, you can use Prodigy's log helper.

This is how I start it

    logger.info(f"Starting prodigy with: {settings}")
    prodigy.serve(f"{settings.RECIPE} {settings.DATASET}")

The logger.info works just fine here, but nothing after that (none of the calls within the recipe). I do propagate uvicorn logs, but that shouldn't affect my named loggers (named recipes.outlook, for example). Using log from Prodigy doesn't help either.