Custom Recipe load different HTML for each annotation segment

Hi!
I wrote a custom recipe to annotate sentences from a JSON file, and for each sentence I want to load an HTML snippet showing its spaCy dependency analysis. In the code below, l is a list holding the HTML for each sentence. My idea was to use the validate_answer function to add 1 to a counter after each annotation, so that the next HTML snippet gets loaded for the next sentence. (I use assert True just to prevent any error message.) However, l[0] gets loaded for every sentence and I don't understand why. Can I not use the validate_answer function to change the counter i? Maybe because this function cannot return anything? If it can't be done this way, is there a workaround to load a different HTML for each annotation? (I don't think it is possible to load two streams, and I already need the stream for the sentences from the JSON file, because those are what I want to annotate.)

def foo(i):
    i += 1
    assert True

@prodigy.recipe("custom_recipe")
def custom_recipe(dataset, jsonl_file):

    n = 0

    stream = JSONL(jsonl_file)
    stream = add_tokens(nlp, stream)

    blocks = [
        {"view_id": "html", "html_template": l[n]},
        {"view_id": "ner_manual"},
    ]
    return {
        "dataset": dataset,
        "stream": stream,
        "view_id": "blocks",
        "validate_answer": foo(n),
        "config": {
            "labels": ["LABEL1", "LABEL2", "LABEL3"],
            "blocks": blocks,
        },
    }

Thanks!

Hi! The validate_answer method isn't really intended to be used this way but the actual problems are a bit deeper:

  1. validate_answer should be a callback that can be executed with the answer and will raise an error if the answer is not valid. In your case, you're passing None (the return value of calling foo(n)) instead of a function, so nothing will be executed.
  2. Even if you did pass a callback, the blocks are returned by the recipe config and that code is only executed once, on startup. So even if foo keeps being executed, the recipe function is not run again and l[n] is never re-evaluated. To illustrate this outside of Prodigy, consider the following, where "my_function" is printed only once no matter how often foo is called:
def my_function():
    i = 0
    
    def foo():
        nonlocal i
        i += 1
        print("foo", i)

    print("my_function", i)
    return foo

x = my_function()
x()
x()
x()
  3. Even if you did have a function that incremented a counter in the global scope, keep in mind that the stream is sent to the app in batches. So Prodigy will send 10 examples, then your callback is executed, and the updated global would only be reflected in the next batch of examples.
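
To make the first point concrete, here is a minimal sketch in plain Python that runs without Prodigy. The function check_answer and its rule about "spans" are hypothetical and only illustrate the shape of a callback; the point is that a recipe would be given the function itself, not the result of calling it.

```python
def foo(i):
    # Rebinds the local name only; the caller's variable is untouched.
    i += 1
    assert True


n = 0
result = foo(n)  # this is what "validate_answer": foo(n) evaluates to
print(result, n)  # prints: None 0


def check_answer(eg):
    # Hypothetical validation rule: require at least one highlighted span.
    assert eg.get("spans"), "Please highlight at least one span."


# A recipe would reference the function without calling it,
# e.g. "validate_answer": check_answer
check_answer({"text": "First sentence.", "spans": [{"start": 0, "end": 5}]})
```

Because foo has no return statement, foo(n) evaluates to None, and because integers are immutable, n is still 0 afterwards.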

If you want to use a different template for every example, couldn't you just do something like this?

def get_stream(stream):
    i = 0
    for eg in stream:
        eg["config"] = {"html_template": l[i]}
        yield eg
        i += 1
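
A slightly more defensive variant of the same idea, as a sketch: it pairs examples and templates with zip instead of a manual counter. The templates parameter is an assumption standing in for the global list l, and the sample data is made up for illustration.

```python
def get_stream(stream, templates):
    # zip stops at the shorter input, so a missing template cannot
    # raise an IndexError (the extra examples are silently dropped).
    for eg, template in zip(stream, templates):
        eg["config"] = {"html_template": template}
        yield eg


# Made-up sample data standing in for the JSONL stream and the list l.
examples = [{"text": "First sentence."}, {"text": "Second sentence."}]
templates = ["<p>first</p>", "<p>second</p>"]
tasks = list(get_stream(examples, templates))
```

If every example must have a template, it may be safer to compare the two lengths up front rather than rely on zip truncating.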

Right, thanks for the explanation of what happens behind the curtains of validate_answer!
Also thanks for the input of using get_stream to add the correct html template to a new key "html_template", but I am not quite sure how to access it then?
I mean I still have to use the argument "html_template" in here:

@prodigy.recipe("custom_recipe")
def custom_recipe(dataset, jsonl_file):

    stream = JSONL(jsonl_file)
    stream = get_stream(stream)
    blocks = [
        {"view_id": "html", "html_template": ?? },
        {"view_id": "ner_manual"}
    ]
    return {
        "dataset": dataset,
        "stream": stream,
        "view_id": "blocks",
        "config": {
            "labels": ["LABEL1", "LABEL2", "LABEL3"],
            "blocks": blocks
        }
    }

and I am very confused what to use as the value now?

I had a small typo in my last example and just edited the code. Instead of setting "html_template" on the block, where it applies to every example, you can also set it on the individual examples via their "config". Alternatively, you can overwrite the "html" value of an individual example, if that's more intuitive.

Ok so now, the stream looks like this:
stream = {"text": "First sentence.", "config": "<First html template>", ...}
but still - how do I access the "config" key in
{"view_id": "html", "html_template": ?????}
?

The html_template on the block is just an option to override the value globally for all examples in this block. If you're setting it on the example itself, you can just leave it out on the block and only put {"view_id": "html"}. Your examples could then look like this:

{"text": "Text 1", "config": {"html_template": "<first template>"}}
{"text": "Text 2", "config": {"html_template": "<second template>"}}

They look exactly like this, but nothing (except for the labels) is loaded: neither the HTML template nor the data in the stream. There is no error message either (not even after closing Prodigy).

Just to make sure, I post my reproducible code again:
(I made a simple example using l = ["<p>test sentence</p>", "<p>second test</p>"] and only two sentences in the jsonl file)

import prodigy
from prodigy.components.loaders import JSONL

def get_stream(file):
    i = 0
    for eg in file:
        eg["config"] = {"html_template": l[i]}
        print(eg)
        yield eg
        i += 1

# stream looks like this: {"text": "first sentence", "config": {"html_template": "first template"}} ...

@prodigy.recipe("custom_recipe")
def custom_recipe(dataset, jsonl_file):

    stream = JSONL(jsonl_file)
    stream = get_stream(stream)
    blocks = [
        {"view_id": "html"},
        {"view_id": "ner_manual"},
    ]
    return {
        "dataset": dataset,
        "stream": stream,
        "view_id": "blocks",
        "config": {
            "labels": ["LABEL1", "LABEL2", "LABEL3"],
            "blocks": blocks,
        },
    }

prodigy.serve("custom_recipe some_data final_transcription.jsonl")

Thanks for sharing your code! I just had a deeper look at this and it turns out there was a small problem that caused the HTML template to not be used as a fallback in the blocks UI. I already fixed this and will include the fix in the next release.

In the meantime, can you just overwrite the "html" of the task instead and pass in the rendered template? If you're using Mustache variables, you can just substitute them in Python, for example with chevron (noahmorrison/chevron on GitHub), a Python implementation of Mustache.
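
A rough sketch of that workaround, assuming one template per example: the helper names add_html and render are hypothetical, and render is a deliberately naive stand-in for a real Mustache renderer such as chevron that only handles a single {{text}} variable.

```python
import html


def render(template, eg):
    # Naive stand-in for a Mustache renderer: only replaces {{text}},
    # escaping the value the way Mustache does for {{ }} tags.
    return template.replace("{{text}}", html.escape(eg["text"]))


def add_html(stream, templates):
    # Write the rendered markup to the "html" key of each task.
    for eg, template in zip(stream, templates):
        eg["html"] = render(template, eg)
        yield eg


# Made-up sample data for illustration.
examples = [{"text": "First sentence."}, {"text": "a < b"}]
templates = ["<p>{{text}}</p>", "<em>{{text}}</em>"]
tasks = list(add_html(examples, templates))
```

If the list already holds finished HTML (for example pre-rendered dependency visualizations), the render step can be skipped and each string assigned to eg["html"] directly.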

Edit: Fixed in v1.11!