Hi,
I wrote a custom recipe that wraps the ner.make-gold recipe and changes only the on_load method, which I rewrote to filter out previously seen examples. I based it on this forum post.
Here is my custom recipe:
import prodigy
from prodigy.recipes.ner import make_gold
from prodigy.util import INPUT_HASH_ATTR  # the input hash attr constant


@prodigy.recipe("make-gold-filtered",
    dataset=prodigy.recipe_args["dataset"],
    model=prodigy.recipe_args["spacy_model"],
    source=prodigy.recipe_args["source"],
    label=prodigy.recipe_args["label"],
    unsegmented=prodigy.recipe_args["unsegmented"]
)
def make_gold_filtered_recipe(dataset, model, source, label, unsegmented):
    components = make_gold(dataset, model, source=source, label=label, unsegmented=unsegmented)

    def on_load(controller):
        """Filter out previously seen examples."""
        # Since building gold dataset, each text should be seen once,
        # so filter by the input hash rather than task hash
        input_hashes = controller.db.get_input_hashes(dataset)
        components["stream"] = (eg for eg in components["stream"] if eg[INPUT_HASH_ATTR] not in input_hashes)

    components["on_load"] = on_load
    components["view"] = "ner"
    return components
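To make the intent of on_load clearer, here is a minimal standalone sketch of the filtering step on its own, without Prodigy. The example dicts, the filter_seen helper and the seen set are made up for illustration, and I'm assuming the input hash is stored on each example under the "_input_hash" key:

```python
# Assumed key name for the input hash on each example (illustration only).
INPUT_HASH_KEY = "_input_hash"


def filter_seen(stream, seen_hashes):
    """Lazily yield only the examples whose input hash has not been seen."""
    for eg in stream:
        if eg[INPUT_HASH_KEY] not in seen_hashes:
            yield eg


# Hypothetical examples standing in for the recipe's stream.
examples = [
    {"text": "first", "_input_hash": 1},
    {"text": "second", "_input_hash": 2},
    {"text": "third", "_input_hash": 3},
]

# Hypothetical hashes standing in for what is already in the dataset.
seen = {1, 3}

remaining = list(filter_seen(examples, seen))
print([eg["text"] for eg in remaining])
```

Only the example whose hash is not in seen should come through, which is the behaviour I expected from the recipe.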
When I ran the recipe from the command line, it started without any errors, but when I opened the web app I got an error message saying that the web app was broken. I could still click accept or reject, and the examples would show up on the left side of the web app, but there was an error message where each task should be displayed:
[Screenshot: Screen Shot 2019-08-15 at 11.36.27 AM]
Do you have any ideas as to what went wrong? Is there a bug in my recipe or in the web app or something else?