After some annotations prodigy says "No tasks available."

Hello everyone! :smile: I have built a custom recipe for Prodigy. It takes a list of sentences and a list of questions, and Prodigy renders them combined. However, something strange happens from time to time. After a couple of annotations the system says "No tasks available", but if I reload the page it lets me annotate again. Furthermore, the sentences shown after the reload are different from the previous ones. I can't figure out the problem. What am I misunderstanding?

This is the code of my recipe:

import prodigy
from prodigy.components.preprocess import add_tokens
from prodigy.components.db import connect
from prodigy import set_hashes
from utils import constant
import spacy
import json

def fak_facts_ner(dataset, lang="de", input_file=("File to input", "positional", None, str)):

    input_file = "".join(input_file)

    def get_stream():
        db = connect()
        with open(input_file, 'r') as json_file:
            while True:
                json_list = list(json_file)
                for fact in json_list:
                    result = json.loads(fact)
                    for question, options in constant.options.items():
                        hashes_in_dataset = db.get_task_hashes(dataset)
                        next_value = set_hashes({
                                "text": result["text"],
                                "meta": result["meta"],
                                "label": question,
                                "options": options
                        })
                        if next_value["_task_hash"] not in hashes_in_dataset:
                            yield next_value

    nlp = spacy.blank(lang)           # blank spaCy pipeline for tokenization
    stream = get_stream()             # set up the stream
    stream = add_tokens(nlp, stream)  # tokenize the stream for ner_manual
    #stream = list(stream)

    return {
        "dataset": dataset,          # the dataset to save annotations to
        "view_id": "blocks",         # set the view_id to "blocks"
        "stream": stream,            # the stream of incoming examples,
        "config": {
            "labels": [],
            "blocks": constant.blocks,         # add the blocks to the config
            "choice_style": "multiple"
        }
    }

Hi Federico.

It's hard for me to pinpoint exactly what's happening, but after reading your snippet, a few things seem worth checking.

1. Setting the Hash Manually

It seems like you're using set_hashes to set the hashes on the object manually. However, you omitted the input_keys and task_keys arguments. With those, you can determine how the hashes get created. For example:

# In this example the `text` and `custom_text` properties would be used to 
# create the input hash
stream = (set_hashes(eg, input_keys=("text", "custom_text")) for eg in stream)

# You can do the same for the task_keys
stream = (set_hashes(eg, task_keys=("label", "meta")) for eg in stream)

The line if next_value["_task_hash"] not in hashes_in_dataset: may not be detecting duplicates the way you expect, because you didn't set these keys. The default task keys are ("spans", "label", "options"), so if the options are the same in many examples, this might explain the behavior that you're seeing.
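To make the effect of the key selection concrete, here is a minimal sketch in plain Python. The toy_hash helper and the two example tasks are made up for illustration, and this is not how Prodigy computes its hashes internally. It only shows the general idea that the keys you hash on decide which examples count as duplicates.

```python
import hashlib
import json


def toy_hash(task, keys):
    """Hash only the selected keys of a task.

    Hypothetical stand-in for illustration, not Prodigy's algorithm.
    """
    selected = {key: task[key] for key in keys if key in task}
    payload = json.dumps(selected, sort_keys=True)
    return hashlib.md5(payload.encode("utf8")).hexdigest()


# Two made-up tasks: different text, same question and options.
task_a = {"text": "first sentence", "label": "Q1", "options": ["yes", "no"]}
task_b = {"text": "second sentence", "label": "Q1", "options": ["yes", "no"]}

# Hashing on label and options only: the two tasks look identical.
same_without_text = (
    toy_hash(task_a, ("label", "options")) == toy_hash(task_b, ("label", "options"))
)

# Hashing on text as well: the two tasks are told apart.
same_with_text = (
    toy_hash(task_a, ("text", "label", "options"))
    == toy_hash(task_b, ("text", "label", "options"))
)
```

If a filter like the one in the recipe compared hashes built from too few keys, distinct examples could be skipped as already annotated, which is why it can help to pass the keys explicitly.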

2. Reading in the file

Is there a reason why you're not using the built-in JSON file loader?

I think something like the code below should work in your situation.

from prodigy.components.loaders import JSON

stream = JSON("path/to/file.json")

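The reason I bring it up is the while True: json_list = list(json_file) loop. After the first pass the file handle is exhausted, so list(json_file) returns an empty list on every later iteration and the loop never yields anything new. It might also be causing issues.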
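As a rough illustration of that point, the sketch below uses an in-memory file instead of a real one. FAKE_FILE, QUESTIONS and iter_tasks are hypothetical stand-ins for your input file, constant.options and get_stream. It first shows that a second list(...) call on the same handle comes back empty, and then a generator that reads the lines exactly once.

```python
import io
import json

# Hypothetical stand-ins for the input file and for constant.options.
FAKE_FILE = (
    '{"text": "first fact", "meta": {"id": 1}}\n'
    '{"text": "second fact", "meta": {"id": 2}}\n'
)
QUESTIONS = {"Is this correct?": ["yes", "no"]}

# A file handle is exhausted after one full read.
handle = io.StringIO(FAKE_FILE)
first_pass = list(handle)   # both lines
second_pass = list(handle)  # nothing left to read


def iter_tasks(lines, questions):
    """Yield one task per (fact, question) pair, reading the lines once."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        fact = json.loads(line)
        for question, options in questions.items():
            yield {
                "text": fact["text"],
                "meta": fact["meta"],
                "label": question,
                "options": options,
            }


tasks = list(iter_tasks(io.StringIO(FAKE_FILE), QUESTIONS))
```

Because the generator ends when the lines run out, there is no outer while True loop left to spin on an empty list.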

Feel free to let me know if these directions help. I'll gladly check in with you again.