"No tasks available" for ner_manual but not ner

I want to do NER teaching with active learning and seed patterns, but I also want the option to correct the annotation spans suggested by Prodigy. I wrote a recipe that changes the view_id returned by teach to ner_manual but otherwise leaves the recipe the same.

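# Import paths assumed from Prodigy v1.x; adjust for your version
from prodigy.core import recipe, recipe_args
from prodigy.recipes.ner import teach

@recipe('ner.gold-with-model',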
        dataset=recipe_args['dataset'],
        spacy_model=recipe_args['spacy_model'],
        source=recipe_args['source'],
        api=recipe_args['api'],
        loader=recipe_args['loader'],
        label=recipe_args['label_set'],
        patterns=recipe_args['patterns'],
        exclude=recipe_args['exclude'],
        unsegmented=recipe_args['unsegmented'])
def gold_with_model(dataset, spacy_model, source=None, api=None, loader=None,
                    label=None, patterns=None, exclude=None, unsegmented=False):
    """
    Collect the best possible training data for a named entity recognition
    model with the model in the loop. Based on your annotations, Prodigy will
    decide which questions to ask next.
    """
    s = teach(dataset, spacy_model, source, api, loader, label, patterns, exclude, unsegmented)
    s["view_id"] = "ner_manual"
    return s

If I run under the debugger it looks like Prodigy is returning tasks, but the web UI says “No tasks available”.

PRODIGY_LOGGING=verbose pgy ner.gold-with-model -F ner_gold_with_model.py  disposition.gold.001 en drilling-reports.jsonl --label DISPOSITION --patterns seed-phrases.jsonl

14:31:59 - GET: /get_questions
14:31:59 - CONTROLLER: Returning a batch of tasks from the queue
14:31:59 - RESPONSE: /get_questions (10 examples)
{'tasks': ({'_id': 'Exploration-drilling-results25.txt', 'text': 'The well is now permanently plugged and abandoned.', 'document': {'segment': 1, 'start_char': 0}, '_input_hash': -253504316, '_task_hash': 1486205210, 'spans': [{'text': 'permanently plugged', 'start': 16, 'end': 35, 'label': 'DISPOSITION', 'priority': 0.7142857313156128, 'score': 0.7142857313156128, 'pattern': 1}], 'meta': {'score': 0.7142857313156128, 'pattern': 1}}, {'_id': 'Exploration-drilling-results25.txt', 'text': 'The well is now permanently plugged and abandoned.', 'document': {'segment': 1, 'start_char': 0}, '_input_hash': -253504316, '_task_hash': -484633833, 'spans': [{'text': 'permanently plugged and abandoned', 'start': 16, 'end': 49, 'label': 'DISPOSITION', 'priority': 0.7142857313156128, 'score': 0.7142857313156128, 'pattern': 0}], 'meta': {'score': 0.7142857313156128, 'pattern': 0}}, {'_id': 'Exploration-drilling-results31.txt', 'text': 'The well will now be permanently plugged and abandoned.', 'document': {'segment': 1, 'start_char': 0}, '_input_hash': 682426378, '_task_hash': -209955494, 'spans': [{'text': 'permanently plugged', 'start': 21, 'end': 40, 'label': 'DISPOSITION', 'priority': 0.7142857313156128, 'score': 0.7142857313156128, 'pattern': 1}], 'meta': {'score': 0.7142857313156128, 'pattern': 1}}, {'_id': 'Exploration-drilling-results31.txt', 'text': 'The well will now be permanently plugged and abandoned.', 'document': {'segment': 1, 'start_char': 0}, '_input_hash': 682426378, '_task_hash': -939648210, 'spans': [{'text': 'permanently plugged and abandoned', 'start': 21, 'end': 54, 'label': 'DISPOSITION', 'priority': 0.7142857313156128, 'score': 0.7142857313156128, 'pattern': 0}], 'meta': {'score': 0.7142857313156128, 'pattern': 0}}, {'_id': 'Exploration-drilling-results248.txt', 'text': 'The well will be permanently plugged and abandoned.', 'document': {'segment': 1, 'start_char': 0}, '_input_hash': -2107397767, '_task_hash': -1151566459, 'spans': [{'text': 'permanently 
plugged', 'start': 17, 'end': 36, 'label': 'DISPOSITION', 'priority': 0.7142857313156128, 'score': 0.7142857313156128, 'pattern': 1}], 'meta': {'score': 0.7142857313156128, 'pattern': 1}}, {'_id': 'Exploration-drilling-results248.txt', 'text': 'The well will be permanently plugged and abandoned.', 'document': {'segment': 1, 'start_char': 0}, '_input_hash': -2107397767, '_task_hash': -1890890808, 'spans': [{'text': 'permanently plugged and abandoned', 'start': 17, 'end': 50, 'label': 'DISPOSITION', 'priority': 0.7142857313156128, 'score': 0.7142857313156128, 'pattern': 0}], 'meta': {'score': 0.7142857313156128, 'pattern': 0}}, {'_id': 'Exploration-drilling-results274.txt', 'text': 'The well will now be permanently plugged and abandoned.', 'document': {'segment': 1, 'start_char': 0}, '_input_hash': 682426378, '_task_hash': -209955494, 'spans': [{'text': 'permanently plugged', 'start': 21, 'end': 40, 'label': 'DISPOSITION', 'priority': 0.7142857313156128, 'score': 0.7142857313156128, 'pattern': 1}], 'meta': {'score': 0.7142857313156128, 'pattern': 1}}, {'_id': 'Exploration-drilling-results274.txt', 'text': 'The well will now be permanently plugged and abandoned.', 'document': {'segment': 1, 'start_char': 0}, '_input_hash': 682426378, '_task_hash': -939648210, 'spans': [{'text': 'permanently plugged and abandoned', 'start': 21, 'end': 54, 'label': 'DISPOSITION', 'priority': 0.7142857313156128, 'score': 0.7142857313156128, 'pattern': 0}], 'meta': {'score': 0.7142857313156128, 'pattern': 0}}, {'_id': 'Exploration-drilling-results260.txt', 'text': 'The well will now be permanently plugged and abandoned.', 'document': {'segment': 1, 'start_char': 0}, '_input_hash': 682426378, '_task_hash': -209955494, 'spans': [{'text': 'permanently plugged', 'start': 21, 'end': 40, 'label': 'DISPOSITION', 'priority': 0.7142857313156128, 'score': 0.7142857313156128, 'pattern': 1}], 'meta': {'score': 0.7142857313156128, 'pattern': 1}}, {'_id': 'Exploration-drilling-results260.txt', 'text': 
'The well will now be permanently plugged and abandoned.', 'document': {'segment': 1, 'start_char': 0}, '_input_hash': 682426378, '_task_hash': -939648210, 'spans': [{'text': 'permanently plugged and abandoned', 'start': 21, 'end': 54, 'label': 'DISPOSITION', 'priority': 0.7142857313156128, 'score': 0.7142857313156128, 'pattern': 0}], 'meta': {'score': 0.7142857313156128, 'pattern': 0}}), 'total': 0, 'progress': None}

(The suggested tasks in the debug output above look correct to me.)

If I run ner.teach with the same arguments, the web UI does provide me with tasks. If I run ner.make-gold with the same arguments, minus --patterns seed-phrases.jsonl, the web UI also provides me with tasks.

Why am I seeing “No tasks available” with my custom script, and how can I get active learning to run in ner_manual mode?

(Sorry about the lack of useful error messages here – we’re still working on that!)

I think your custom recipe is missing the "tokens" property on the individual examples. The text needs to be tokenized to make the token-based entity selection work. The add_tokens preprocessor can take care of this for you – here’s a standalone example:

import spacy
from prodigy.components.loaders import JSONL
from prodigy.components.preprocess import add_tokens

nlp = spacy.load('en')
stream = JSONL('your_data.jsonl')
stream = add_tokens(nlp, stream)

You should be able to just overwrite s['stream'] in your custom recipe. The add_tokens pre-processor will also add the respective token indices to already existing entities in the stream so they can be rendered in manual mode.
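
To make it more concrete what the manual interface expects, here is a rough, dependency-free sketch of the task shape after tokenization. It is not Prodigy's implementation: the regex tokenizer only stands in for spaCy's, and the helper names tokenize and align_spans are made up for illustration. The "tokens" entries and the "token_start" / "token_end" keys follow the task format that ner_manual reads.

```python
import re


def tokenize(text):
    """Split text into word and punctuation tokens with character offsets.

    A stand-in for spaCy's tokenizer (assumption for illustration only).
    """
    return [
        {"text": m.group(), "start": m.start(), "end": m.end(), "id": i}
        for i, m in enumerate(re.finditer(r"\w+|[^\w\s]", text))
    ]


def align_spans(task):
    """Return a copy of the task with tokens added and token indices on spans."""
    tokens = tokenize(task["text"])
    starts = {t["start"]: t["id"] for t in tokens}
    ends = {t["end"]: t["id"] for t in tokens}
    spans = []
    for span in task.get("spans", []):
        # Only spans that line up with token boundaries get token indices;
        # anything else is passed through unchanged.
        if span["start"] in starts and span["end"] in ends:
            span = dict(
                span,
                token_start=starts[span["start"]],
                token_end=ends[span["end"]],  # index of the last token, inclusive
            )
        spans.append(span)
    return dict(task, tokens=tokens, spans=spans)


task = {
    "text": "The well is now permanently plugged and abandoned.",
    "spans": [{"start": 16, "end": 35, "label": "DISPOSITION"}],
}
aligned = align_spans(task)
print(aligned["spans"][0])
```

A task that lacks "tokens" has nothing for the token-based selection to work with, which would fit the empty UI you are seeing.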

That gets me farther, but now I’ve got another problem.

Following your advice, I modified the code as shown below. It now gives me tasks, suggests spans from both the patterns and the model, and lets me correct the markup manually.

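import spacy
from prodigy.components.preprocess import add_tokens

def gold_with_model(dataset, spacy_model, source=None, api=None, loader=None,
                    label=None, patterns=None, exclude=None, unsegmented=False):
    # Load the model so add_tokens can use its tokenizer
    nlp = spacy.load(spacy_model)
    s = teach(dataset, spacy_model, source, api, loader, label, patterns, exclude, unsegmented)
    # Switch to the manual interface, which needs "tokens" on every task
    s["view_id"] = "ner_manual"
    s["stream"] = add_tokens(nlp, s["stream"])
    return s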

I am trying to train a new label, DISPOSITION, that is not in the default model. I have it specified in the --label command line option and in my seed-phrases.jsonl file. However, Prodigy suggests NO_LABEL as the only possible tag.

How do I get Prodigy to offer DISPOSITION as a label type in a dropdown menu?

Are you using the same argument annotations for your custom recipe (i.e. recipe_args['label_set'])? Since you’re calling teach directly, the value you pass in there for label needs to be a list of labels. So this needs to happen either via the Plac annotations (recipe_args['label_set'] includes a converter that takes care of that), or you need to do this manually, for example:

label = [l.strip() for l in label.split(',')]
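
If the recipe might be called both ways (from the command line, where the converter has already produced a list, and directly from Python with a raw string), a slightly more defensive version of that one-liner could look like the sketch below. parse_labels is a hypothetical helper, not part of Prodigy.

```python
def parse_labels(label):
    """Normalize a label argument to a list of label strings.

    Accepts None, a comma-separated string, or an iterable of strings.
    Hypothetical helper for illustration; not a Prodigy function.
    """
    if label is None:
        return None
    if isinstance(label, str):
        label = label.split(",")
    # Strip whitespace and drop empty entries such as a trailing comma
    return [l.strip() for l in label if l.strip()]


print(parse_labels("DISPOSITION"))
print(parse_labels("PERSON, ORG,"))
```

Passing an already converted list through it is harmless, so it can sit at the top of the recipe function regardless of how the argument arrives.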

The label argument is specified as the following argument to my custom recipe.

...
label=recipe_args['label_set'],
...

Is the issue that the original ner.teach handles its label argument differently than the original ner.make_gold?

Yes, that’s all correct – and we’ve also unified the way the labels are handled across recipes in a recent release.

I think I know the answer, though. The label set also needs to be added to the recipe config:

s['config']['labels'] = label

This is done automatically in the built-in recipes that use the manual interface – but since there’s no “full label set” in ner.teach that the web app needs to know about upfront, the recipe also doesn’t have this setting.
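
As a minimal sketch of the two settings together, using a plain dict in place of what teach returns (the keys other than "view_id" and "config" are only illustrative, and use_manual_interface is a made-up helper name):

```python
def use_manual_interface(components, labels):
    """Return a copy of a recipe's components dict set up for ner_manual.

    Sets the view_id and registers the full label set in the config so the
    web app knows which labels to offer. Hypothetical helper for illustration.
    """
    components = dict(components)
    # teach may or may not return a "config" entry, so fall back to an empty one
    config = dict(components.get("config") or {})
    config["labels"] = list(labels)
    components["view_id"] = "ner_manual"
    components["config"] = config
    return components


# Stand-in for the dict returned by teach (illustrative values only)
teach_like = {"view_id": "ner", "dataset": "disposition.gold.001", "config": {"lang": "en"}}
manual = use_manual_interface(teach_like, ["DISPOSITION"])
print(manual["view_id"], manual["config"])
```

In the actual recipe the stream would still need to go through add_tokens as described earlier; this only covers the view_id and label configuration.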

That last change got me working. Thanks.