Custom choice recipe ignoring examples after refreshing the browser

Hi, I am using the exact custom choice recipe, but I am having a weird issue: when I refresh the page, the example being shown is lost. If I keep doing this, Prodigy eventually says "No task available", but those examples are not in the database. In other words, once an example is shown, it's lost whether you annotate it or not.
I am lost here.
Any help, please!

How does your custom recipe look? Are you setting "force_stream_order": True? By default, Prodigy will always send the next batch whenever a new batch of questions is requested, so you can have multiple people requesting batches, and everyone will get different data. At the end of the annotation process, when you exit the server, Prodigy knows what has been annotated and what needs to be queued up again when you restart the server.

If you force the stream to always send out examples in the same order, and re-send batches, you should keep seeing the same example when you refresh, and Prodigy will keep resending the same questions until they're answered. (In that case, you wouldn't want to have multiple people requesting batches, though, since you'd end up with duplicates that way.)

Thanks for your response. I am using this exact recipe: https://github.com/explosion/prodigy-recipes/blob/master/other/choice.py and didn't make any modifications, except that my options are already in the JSONL (i.e. every example has its own set of choices).

Every user has their own dedicated batch. The problem is that every time an example is shown, whether you annotate it or not (e.g. after refreshing the page), that example is just lost, but it's not in the database.

If you don't annotate an example, it's not going to be in the database – that's expected behaviour. Only examples you annotate and save will be in the database. When you restart the server with the same data, Prodigy will find all examples that aren't annotated yet and queue them up for annotation again, so you're never going to lose unannotated questions.
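To make the re-queueing idea concrete, here is a rough conceptual sketch in plain Python. It is not Prodigy's actual implementation: `input_hash` and `queue_unannotated` are hypothetical names, and hashing only the "text" field is an assumption made for illustration.

```python
import hashlib


def input_hash(example):
    """Hypothetical stand-in for an input hash: hashes only the text."""
    return hashlib.md5(example["text"].encode("utf-8")).hexdigest()


def queue_unannotated(stream, annotated):
    """Yield only the examples whose hash is not among the saved annotations."""
    seen = {input_hash(eg) for eg in annotated}
    for eg in stream:
        if input_hash(eg) not in seen:
            yield eg


# Source data and what has already been saved to the dataset.
source = [{"text": "first"}, {"text": "second"}, {"text": "third"}]
saved = [{"text": "first", "answer": "accept"}]

# Only the examples without a saved annotation are queued again.
remaining = list(queue_unannotated(source, saved))
print([eg["text"] for eg in remaining])
```

Examples that were sent out but never answered have no saved annotation, so they stay in the "remaining" set on the next start.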

If you want batches to be re-sent in the same order until they're answered, add "force_stream_order": True to the "config" returned by your recipe.
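As a minimal sketch, this is roughly what the components returned by the choice recipe could look like with that setting added. `build_components` is a hypothetical helper used here so the snippet runs without Prodigy installed; in a real recipe, the same dictionary would be returned from the function decorated with `@prodigy.recipe`.

```python
def build_components(dataset, stream, multiple=False):
    """Build the dict a choice recipe would return (hypothetical helper)."""
    return {
        "dataset": dataset,   # save annotations in this dataset
        "view_id": "choice",  # use the choice interface
        "stream": stream,
        "config": {
            "choice_style": "multiple" if multiple else "single",
            "choice_auto_accept": not multiple,
            # Re-send batches in the same order until they are answered.
            "force_stream_order": True,
        },
    }


# "my_dataset" and the empty stream are placeholders for illustration.
components = build_components("my_dataset", iter([]))
print(components["config"])
```

With this setting, refreshing the page should show the same examples again, so it is best suited to a single annotator per session.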

Yes, I know that's the normal behaviour. The problem is that the examples that are not annotated are gone, which is weird (Prodigy says "No task available"). I think I am doing something incorrectly.
Let me show an example of the JSONL file:
{"text": "texterrhgrefsdzq", "options": [{"id": "choice1", "text": "choice1"}, {"id": "choice2", "text": "choice2"}], "meta": {"filename": "choice/choice1.txt"}}
{"text": "textvfbghggfds", "options": [{"id": "choice1", "text": "choice1"}, {"id": "choice2", "text": "choice2"}, {"id": "choice3", "text": "choice3"}, {"id": "choice4", "text": "choice4"}], "meta": {"filename": "choice/choice1.txt"}}
{"text": "textsfdghjfds", "options": [{"id": "choice1", "text": "choice1"}, {"id": "choice2", "text": "choice2"}], "meta": {"filename": "choice/choice1.txt"}}
{"text": "textsdfghjkgfd", "options": [{"id": "choice1", "text": "choice1"}, {"id": "choice2", "text": "choice2"}, {"id": "choice3", "text": "choice3"}, {"id": "choice4", "text": "choice4"}], "meta": {"filename": "choice/choice1.txt"}}
{"text": "textsdfghjgfd", "options": [{"id": "choice1", "text": "choice1"}, {"id": "choice2", "text": "choice2"}, {"id": "choice3", "text": "choice3"}, {"id": "choice4", "text": "choice4"}], "meta": {"filename": "choice/choice1.txt"}}
{"text": "textsdf sdfdghj", "options": [{"id": "choice1", "text": "choice1"}, {"id": "choice2", "text": "choice2"}, {"id": "choice3", "text": "choice3"}, {"id": "choice4", "text": "choice4"}], "meta": {"filename": "choice/choice1.txt"}}
{"text": "textqdertfhgjh ertytrfes", "options": [{"id": "choice1", "text": "choice1"}, {"id": "choice2", "text": "choice2"}, {"id": "choice3", "text": "choice3"}, {"id": "choice4", "text": "choice4"}], "meta": {"filename": "choice/choice1.txt"}}
{"text": "textcsdfgh sdfghfds", "options": [{"id": "choice1", "text": "choice1"}, {"id": "choice2", "text": "choice2"}, {"id": "choice3", "text": "choice3"}, {"id": "choice4", "text": "choice4"}], "meta": {"filename": "choice/choice1.txt"}}
{"text": "textsdfghj dsdfhgj", "options": [{"id": "choice1", "text": "choice1"}, {"id": "choice2", "text": "choice2"}, {"id": "choice3", "text": "choice3"}, {"id": "choice4", "text": "choice4"}], "meta": {"filename": "choice/choice1.txt"}}
{"text": "text sfhjgfd efrthyjjrge", "options": [{"id": "choice1", "text": "choice1"}, {"id": "choice2", "text": "choice2"}, {"id": "choice3", "text": "choice3"}, {"id": "choice4", "text": "choice4"}], "meta": {"filename": "choice/choice1.txt"}}
{"text": "textdfgh sdfghg", "options": [{"id": "choice1", "text": "choice1"}, {"id": "choice2", "text": "choice2"}, {"id": "choice3", "text": "choice3"}, {"id": "choice4", "text": "choice4"}], "meta": {"filename": "choice/choice1.txt"}}
{"text": "text dsfs dfvsf", "options": [{"id": "choice1", "text": "choice1"}, {"id": "choice2", "text": "choice2"}, {"id": "choice3", "text": "choice3"}, {"id": "choice4", "text": "choice4"}], "meta": {"filename": "choice/choice1.txt"}}
{"text": "textsdmfs sdfsdf33", "options": [{"id": "choice1", "text": "choice1"}, {"id": "choice2", "text": "choice2"}, {"id": "choice3", "text": "choice3"}, {"id": "choice4", "text": "choice4"}], "meta": {"filename": "choice/choice1.txt"}}
{"text": "textm sdf 543", "options": [{"id": "choice1", "text": "choice1"}, {"id": "choice2", "text": "choice2"}, {"id": "choice3", "text": "choice3"}, {"id": "choice4", "text": "choice4"}], "meta": {"filename": "choice/choice1.txt"}}
{"text": "textdfsd ZEF534", "options": [{"id": "choice1", "text": "choice1"}, {"id": "choice2", "text": "choice2"}, {"id": "choice3", "text": "choice3"}, {"id": "choice4", "text": "choice4"}], "meta": {"filename": "choice/choice1.txt"}}
{"text": "text sdfsdf 4DF", "options": [{"id": "choice1", "text": "choice1"}, {"id": "choice2", "text": "choice2"}, {"id": "choice3", "text": "choice3"}, {"id": "choice4", "text": "choice4"}], "meta": {"filename": "choice/choice1.txt"}}
{"text": "textxvs dsfds 56", "options": [{"id": "choice1", "text": "choice1"}, {"id": "choice2", "text": "choice2"}, {"id": "choice3", "text": "choice3"}, {"id": "choice4", "text": "choice4"}], "meta": {"filename": "choice/choice1.txt"}}
{"text": "text sdfs 563", "options": [{"id": "choice1", "text": "choice1"}, {"id": "choice2", "text": "choice2"}, {"id": "choice3", "text": "choice3"}, {"id": "choice4", "text": "choice4"}], "meta": {"filename": "choice/choice1.txt"}}
{"text": "text dfgdf 456534FD", "options": [{"id": "choice1", "text": "choice1"}, {"id": "choice2", "text": "choice2"}, {"id": "choice3", "text": "choice3"}, {"id": "choice4", "text": "choice4"}], "meta": {"filename": "choice/choice1.txt"}}
{"text": "text sdfs 56", "options": [{"id": "choice1", "text": "choice1"}, {"id": "choice2", "text": "choice2"}, {"id": "choice3", "text": "choice3"}, {"id": "choice4", "text": "choice4"}], "meta": {"filename": "choice/choice1.txt"}}
{"text": "textsdf 53f", "options": [{"id": "choice1", "text": "choice1"}, {"id": "choice2", "text": "choice2"}, {"id": "choice3", "text": "choice3"}, {"id": "choice4", "text": "choice4"}], "meta": {"filename": "choice/choice1.txt"}}

And here is the recipe:
import prodigy
from prodigy.components.loaders import JSONL

@prodigy.recipe(
    "choice",
    dataset=("The dataset to save to", "positional", None, str),
    file_path=("Path to texts", "positional", None, str),
    multiple=("Allow multiple choice", "flag", "M", bool)
)
def choice(dataset, file_path, multiple=False):
    """Annotate the sentiment of texts using different mood options."""
    stream = JSONL(file_path)     # load in the JSONL file

    return {
        "dataset": dataset,   # save annotations in this dataset
        "view_id": "choice",  # use the choice interface
        "stream": stream,
        "config": {
            "choice_style": "multiple" if multiple else "single",
            "choice_auto_accept": False if multiple else True
        }
    }

Also, even after annotating, saving the annotations and having the examples in the database, if I restart the same annotation session with the same dataset ID, I am still asked to annotate the same documents, so I end up with repeated examples in the example table with the same input hash.

It looks like Prodigy loads multiple examples into memory, and when I refresh the page, all those examples are gone and new examples are loaded from the JSONL file.
This problem is different from the duplicated examples problem described above.

So it looks like Prodigy works like this: once a batch of examples is loaded, if you refresh the page, those examples are gone and you need to restart the server to see them again. (Correct me if I am wrong; I also tested with the textcat.manual recipe.)

As for the duplicated examples in the database, it's a bug in the 1.10 release, since I don't have the problem with the 1.9.4 release.
Thanks.

Yes, that's the default – but you can change this by setting "force_stream_order": True.

Are you using the latest version, v1.10.2?

Yes!