I want to do NER teaching with active learning and seed patterns, but I also want the option to correct the annotation spans suggested by Prodigy. I wrote a recipe that changes the view_id returned by teach to ner_manual but otherwise leaves the recipe the same.
# Import paths below are assumed for Prodigy 1.x; adjust them to your install.
from prodigy.core import recipe, recipe_args
from prodigy.recipes.ner import teach


@recipe('ner.gold-with-model',
        dataset=recipe_args['dataset'],
        spacy_model=recipe_args['spacy_model'],
        source=recipe_args['source'],
        api=recipe_args['api'],
        loader=recipe_args['loader'],
        label=recipe_args['label_set'],
        patterns=recipe_args['patterns'],
        exclude=recipe_args['exclude'],
        unsegmented=recipe_args['unsegmented'])
def gold_with_model(dataset, spacy_model, source=None, api=None, loader=None,
                    label=None, patterns=None, exclude=None, unsegmented=False):
    """
    Collect the best possible training data for a named entity recognition
    model with the model in the loop. Based on your annotations, Prodigy will
    decide which questions to ask next.
    """
    # Reuse the built-in ner.teach recipe unchanged...
    s = teach(dataset, spacy_model, source, api, loader, label, patterns, exclude, unsegmented)
    # ...and only swap the annotation interface to the manual span editor.
    s["view_id"] = "ner_manual"
    return s
If I run under the debugger it looks like Prodigy is returning tasks, but the web UI says “No tasks available”.
PRODIGY_LOGGING=verbose pgy ner.gold-with-model -F ner_gold_with_model.py disposition.gold.001 en drilling-reports.jsonl --label DISPOSITION --patterns seed-phrases.jsonl
14:31:59 - GET: /get_questions
14:31:59 - CONTROLLER: Returning a batch of tasks from the queue
14:31:59 - RESPONSE: /get_questions (10 examples)
{'tasks': (
{'_id': 'Exploration-drilling-results25.txt', 'text': 'The well is now permanently plugged and abandoned.', 'document': {'segment': 1, 'start_char': 0}, '_input_hash': -253504316, '_task_hash': 1486205210, 'spans': [{'text': 'permanently plugged', 'start': 16, 'end': 35, 'label': 'DISPOSITION', 'priority': 0.7142857313156128, 'score': 0.7142857313156128, 'pattern': 1}], 'meta': {'score': 0.7142857313156128, 'pattern': 1}},
{'_id': 'Exploration-drilling-results25.txt', 'text': 'The well is now permanently plugged and abandoned.', 'document': {'segment': 1, 'start_char': 0}, '_input_hash': -253504316, '_task_hash': -484633833, 'spans': [{'text': 'permanently plugged and abandoned', 'start': 16, 'end': 49, 'label': 'DISPOSITION', 'priority': 0.7142857313156128, 'score': 0.7142857313156128, 'pattern': 0}], 'meta': {'score': 0.7142857313156128, 'pattern': 0}},
{'_id': 'Exploration-drilling-results31.txt', 'text': 'The well will now be permanently plugged and abandoned.', 'document': {'segment': 1, 'start_char': 0}, '_input_hash': 682426378, '_task_hash': -209955494, 'spans': [{'text': 'permanently plugged', 'start': 21, 'end': 40, 'label': 'DISPOSITION', 'priority': 0.7142857313156128, 'score': 0.7142857313156128, 'pattern': 1}], 'meta': {'score': 0.7142857313156128, 'pattern': 1}},
{'_id': 'Exploration-drilling-results31.txt', 'text': 'The well will now be permanently plugged and abandoned.', 'document': {'segment': 1, 'start_char': 0}, '_input_hash': 682426378, '_task_hash': -939648210, 'spans': [{'text': 'permanently plugged and abandoned', 'start': 21, 'end': 54, 'label': 'DISPOSITION', 'priority': 0.7142857313156128, 'score': 0.7142857313156128, 'pattern': 0}], 'meta': {'score': 0.7142857313156128, 'pattern': 0}},
{'_id': 'Exploration-drilling-results248.txt', 'text': 'The well will be permanently plugged and abandoned.', 'document': {'segment': 1, 'start_char': 0}, '_input_hash': -2107397767, '_task_hash': -1151566459, 'spans': [{'text': 'permanently plugged', 'start': 17, 'end': 36, 'label': 'DISPOSITION', 'priority': 0.7142857313156128, 'score': 0.7142857313156128, 'pattern': 1}], 'meta': {'score': 0.7142857313156128, 'pattern': 1}},
{'_id': 'Exploration-drilling-results248.txt', 'text': 'The well will be permanently plugged and abandoned.', 'document': {'segment': 1, 'start_char': 0}, '_input_hash': -2107397767, '_task_hash': -1890890808, 'spans': [{'text': 'permanently plugged and abandoned', 'start': 17, 'end': 50, 'label': 'DISPOSITION', 'priority': 0.7142857313156128, 'score': 0.7142857313156128, 'pattern': 0}], 'meta': {'score': 0.7142857313156128, 'pattern': 0}},
{'_id': 'Exploration-drilling-results274.txt', 'text': 'The well will now be permanently plugged and abandoned.', 'document': {'segment': 1, 'start_char': 0}, '_input_hash': 682426378, '_task_hash': -209955494, 'spans': [{'text': 'permanently plugged', 'start': 21, 'end': 40, 'label': 'DISPOSITION', 'priority': 0.7142857313156128, 'score': 0.7142857313156128, 'pattern': 1}], 'meta': {'score': 0.7142857313156128, 'pattern': 1}},
{'_id': 'Exploration-drilling-results274.txt', 'text': 'The well will now be permanently plugged and abandoned.', 'document': {'segment': 1, 'start_char': 0}, '_input_hash': 682426378, '_task_hash': -939648210, 'spans': [{'text': 'permanently plugged and abandoned', 'start': 21, 'end': 54, 'label': 'DISPOSITION', 'priority': 0.7142857313156128, 'score': 0.7142857313156128, 'pattern': 0}], 'meta': {'score': 0.7142857313156128, 'pattern': 0}},
{'_id': 'Exploration-drilling-results260.txt', 'text': 'The well will now be permanently plugged and abandoned.', 'document': {'segment': 1, 'start_char': 0}, '_input_hash': 682426378, '_task_hash': -209955494, 'spans': [{'text': 'permanently plugged', 'start': 21, 'end': 40, 'label': 'DISPOSITION', 'priority': 0.7142857313156128, 'score': 0.7142857313156128, 'pattern': 1}], 'meta': {'score': 0.7142857313156128, 'pattern': 1}},
{'_id': 'Exploration-drilling-results260.txt', 'text': 'The well will now be permanently plugged and abandoned.', 'document': {'segment': 1, 'start_char': 0}, '_input_hash': 682426378, '_task_hash': -939648210, 'spans': [{'text': 'permanently plugged and abandoned', 'start': 21, 'end': 54, 'label': 'DISPOSITION', 'priority': 0.7142857313156128, 'score': 0.7142857313156128, 'pattern': 0}], 'meta': {'score': 0.7142857313156128, 'pattern': 0}}
), 'total': 0, 'progress': None}
(The suggested tasks in the debug output above look correct to me.)
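For what it's worth, this is roughly how I checked that the spans line up with the text. It is only a minimal sketch in plain Python with no Prodigy calls; the helper name mismatched_spans is my own (hypothetical), and the task is the first one from the log above, trimmed to the fields the check needs.

```python
def mismatched_spans(task):
    """Return the spans whose 'text' differs from task['text'][start:end]."""
    text = task["text"]
    return [
        span for span in task.get("spans", [])
        if text[span["start"]:span["end"]] != span["text"]
    ]


# First task from the verbose log, reduced to the relevant fields.
task = {
    "text": "The well is now permanently plugged and abandoned.",
    "spans": [
        {"text": "permanently plugged", "start": 16, "end": 35, "label": "DISPOSITION"},
    ],
}

# An empty list means every span's offsets match its text.
print(mismatched_spans(task))
```

This only confirms that the character offsets are consistent; it says nothing about whether the task contains everything the ner_manual interface expects.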
If I run ner.teach with the same arguments, the web UI does provide me with tasks. If I run ner.make-gold with the same arguments, except --patterns seed-phrases.jsonl, the web UI also provides me with tasks.
Why am I seeing “No tasks available” with my custom script, and how can I get active learning to run in ner_manual mode?