Custom recipe stops after one annotation

Hey there! I’m having some problems with loading multi-column data. I need to display a subset of the columns to the annotator, with a 3-way classification. Right now I’m using a custom mark recipe with a choice interface, and I’m probably missing something very obvious. With a small set of ~100 rows, the annotation interface shows the data properly (using a template), but it stops after a single task.
Note: I’m using a trial of Prodigy, and don’t have access to the PRODIGY_README.html. Maybe this is explained in there?

Thanks!

import pandas as pd
import json
from pprint import pprint
import sys
import prodigy  # needed for the @prodigy.recipe decorator below
from prodigy.components.loaders import JSONL
from prodigy.recipes.generic import mark

def add_options(stream, template):
    for task in stream:
        options = [{'id': 'R', 'text': 'Relevant'},
                   {'id': 'N', 'text': 'Not Relevant'},
                   {'id': 'U', 'text': 'Unknown - Insufficient Information'}]
        task['options'] = options
        task['html'] = template
        yield task

@prodigy.recipe('custom-mark',
    dataset=('Dataset ID', 'positional', None, str),
    view_id=('Annotation interface', 'option', 'v', str))
def my_custom_recipe(dataset, view_id, source=None):
    # load your own streams from anywhere you want
    with open('template.html') as tmp:
        html_template = tmp.read()
    stream = add_options(JSONL('./data/test.jsonl'), html_template)
    def update(examples):
        # this function is triggered when Prodigy receives annotations
        print("Received {} annotations!".format(len(examples)))
    print("Dataset ID:", dataset)
    config = {'choice_auto_accept': True,
            'html_template': html_template,
            'instructions': './instructions.html'}
    components = mark(dataset=dataset, source=stream)
    components['view_id'] = 'choice'
    components['config'] = config
    return components

Your recipe looks good!

I think I can see what's happening here: I assume your JSONL file includes some custom fields that you then add to your HTML template as variables, right? When you stream in annotation tasks, Prodigy assigns them hashes, which are used internally to tell if two examples are identical, referring to the same input etc. By default, those hashes are only generated based on the default keys like "text", "spans", "html" etc., because Prodigy doesn't want to make assumptions about what any of your custom properties mean.
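To make the effect concrete, here is a simplified sketch of hashing over a fixed set of keys. It is not Prodigy's actual implementation: the function name `simple_hash`, the key list `HASHED_KEYS`, and the sample fields `doc_id` and `title` are all assumptions made up for illustration.

```python
import hashlib
import json

# Assumed subset of "default" keys, for illustration only
HASHED_KEYS = ("text", "spans", "html", "options")


def simple_hash(task, keys=HASHED_KEYS):
    """Hash only the selected keys of a task, ignoring any custom fields."""
    subset = {key: task[key] for key in keys if key in task}
    payload = json.dumps(subset, sort_keys=True).encode("utf8")
    return hashlib.md5(payload).hexdigest()


template = "<strong>{{title}}</strong>"
options = [{"id": "R", "text": "Relevant"}]

# Hypothetical tasks: the custom fields differ, the hashed keys do not
task_a = {"doc_id": 1, "title": "First", "html": template, "options": options}
task_b = {"doc_id": 2, "title": "Second", "html": template, "options": options}

print(simple_hash(task_a) == simple_hash(task_b))  # prints True
```

With a scheme like this, the two tasks are indistinguishable until one of the hashed keys differs between them.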

So in your case, the hash was probably only based on the "html" and "options", which were always identical. So Prodigy thought it was the exact same example! One option would be to customise the hashing, but I think that's not even necessary here. A simpler solution would be to just set the task['text'] to something unique – for example, one of your custom properties or a unique ID or something like that.
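A minimal sketch of that change to `add_options`, assuming each row carries a unique field. The field name `row_id` and the sample rows are hypothetical; the `html` assignment is kept exactly as in the recipe above.

```python
def add_options(stream, template, key_field="row_id"):
    """Attach the choice options and a unique "text" value to each task.

    key_field is a hypothetical name for whatever unique column the data has.
    """
    for task in stream:
        task["options"] = [
            {"id": "R", "text": "Relevant"},
            {"id": "N", "text": "Not Relevant"},
            {"id": "U", "text": "Unknown - Insufficient Information"},
        ]
        task["html"] = template  # unchanged from the original recipe
        # A value that differs per task, so no two tasks look identical
        task["text"] = str(task[key_field])
        yield task


# Hypothetical rows standing in for the JSONL stream
rows = [{"row_id": 101, "title": "First"}, {"row_id": 102, "title": "Second"}]
tasks = list(add_options(rows, "<strong>{{title}}</strong>"))
print([task["text"] for task in tasks])  # prints ['101', '102']
```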

Btw, in your add_options function, you're passing in the html_template and setting that as the "html" value of the task. The "html" key is only required if you actually want to pass in different HTML markup for each task – it's not necessary if you're already using an html_template. So you should be able to leave this out 🙂

Oh, you shouldn't have to work without the Readme! We normally always include the PRODIGY_README.html when we issue a trial – could you send us an email at contact@explosion.ai and let us know who you're working for, so we can make sure you get access to the Readme?

Thanks for the prompt reply! My data does include a unique key field, so I used that as the "text" attribute, and it works! I did try using only the html_template in the config (without a separate "html" field for each task), but that just displayed the choice form with the options, and not the data itself. I’ll try to see if there’s any other issue that’s causing this, but the rest looks good for now. Will drop a mail about the README.

Thanks again!

Edit: Yup, they’d forgotten the README, sent it with a later email, no worries 🙂

Sorry, my bad – I forgot that you were using the choice interface (not html directly). So yes, you're right: the only way the components of the choice interface can signal that they should be rendered as HTML is if they have a "html" key.
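One possible way to give each task its own "html" value is to fill in the template in Python before the task is sent out. This is only a sketch under assumptions: `render_html` and `add_html` are hypothetical helpers, the regex handles plain `{{field}}` placeholders only (it is a stand-in, not a full Mustache renderer), and the fields `title` and `summary` are made up.

```python
import re

# Matches simple {{field}} placeholders; sections, partials etc. are not handled
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_html(template, task):
    """Replace {{field}} placeholders with values from the task.

    Missing fields become empty strings. Values are not HTML-escaped here.
    """
    return PLACEHOLDER.sub(lambda m: str(task.get(m.group(1), "")), template)


def add_html(stream, template):
    """Set a per-task "html" key rendered from the task's own fields."""
    for task in stream:
        task["html"] = render_html(template, task)
        yield task


template = "<h3>{{title}}</h3><p>{{summary}}</p>"
rows = [
    {"title": "First", "summary": "One"},
    {"title": "Second", "summary": "Two"},
]
tasks = list(add_html(rows, template))
print(tasks[0]["html"])  # prints <h3>First</h3><p>One</p>
```

Because the rendered markup differs from row to row, the "html" value itself would also no longer be identical across tasks.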