"No tasks available" for any text source I give for ner.teach recipe

I'm trying an extremely simple ner.teach recipe, but the UI shows “No tasks available” regardless of which text source I give it. Am I doing something wrong?

Command:
prodigy ner.teach ner_goal_type en_core_web_sm train_data.txt --loader txt --label GOAL_TYPE

I tried both csv and txt files. Both show “No tasks available”.

Contents of train_data.txt:
This is a sentence
This is another sentence

The data is not in the proper format.

It should look like this:

{"text": "Angela Merkel zeigt sich optimistisch", "spans": [{"start": 0, "end": 11, "label": "PERSON"}]}

You have to save the file with a .jsonl extension, e.g. name.jsonl

I think I know what the problem is: In your example, you’re trying to annotate a label GOAL_TYPE, which is not actually present in your model – at least not if you’re using the default en_core_web_sm model. ner.teach expects the model you’re loading in to recognise the entity you want to annotate already and have at least a vague concept of it – otherwise, it can’t suggest you potential candidates. So in your case, Prodigy is looking for entities that are recognised as GOAL_TYPE – which doesn’t exist, so it doesn’t have any tasks available for you to annotate.

As a test, try annotating a label that’s built-in – for example PERSON or ORG, or simply leave out the --label argument (which tells Prodigy to show all entities it finds).
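To check this up front, you can compare the labels you pass via --label against the labels the model's entity recognizer already knows. Here's a minimal sketch: missing_labels is a hypothetical helper (not part of Prodigy), and the label tuple is hard-coded with a small subset of the en_core_web_sm labels so the snippet runs without spaCy installed.

```python
def missing_labels(model_labels, requested):
    """Return the requested labels that the model does not know."""
    known = set(model_labels)
    return [label for label in requested if label not in known]


# With spaCy installed, the labels would come from the loaded pipeline:
#   import spacy
#   nlp = spacy.load("en_core_web_sm")
#   model_labels = nlp.get_pipe("ner").labels
# Assumption: a hard-coded subset of the built-in labels, for illustration.
model_labels = ("PERSON", "ORG", "GPE", "DATE")

print(missing_labels(model_labels, ["GOAL_TYPE", "PERSON"]))
```

If the result is not empty, ner.teach has nothing to suggest for those labels.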

We don’t yet have a perfect recipe for adding and training a new entity type. Especially the “cold start” is pretty difficult, because you first need to teach your model something about the new entity, before you can use it and correct its predictions.

As a solution for now, try pre-training your model with spaCy, add the new entity and train on a few examples that contain your GOAL_TYPE label – see this usage guide for a code example. After training, save your model using nlp.to_disk() and then load it into Prodigy – for example:

prodigy ner.teach ner_goal_type /path/to/model train_data.txt --loader txt --label GOAL_TYPE
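The handful of training examples for the pre-training step need character offsets for each GOAL_TYPE span. Here's a rough sketch of how you could build them, assuming spaCy's simple (text, {"entities": [(start, end, label)]}) training format. make_example is a hypothetical helper, and the sentences are made up purely for illustration.

```python
def make_example(text, phrase, label):
    """Build one (text, annotations) tuple with character offsets."""
    start = text.find(phrase)
    if start == -1:
        raise ValueError(f"{phrase!r} not found in {text!r}")
    end = start + len(phrase)  # end offset is exclusive
    return (text, {"entities": [(start, end, label)]})


# Hypothetical sentences and phrases, invented for illustration only.
TRAIN_DATA = [
    make_example("I want to lose weight this year", "lose weight", "GOAL_TYPE"),
    make_example("My plan is to save money for a car", "save money", "GOAL_TYPE"),
]

for text, annotations in TRAIN_DATA:
    print(text, annotations)
```

Note that text.find only picks up the first occurrence of the phrase, so this only works for simple examples like these.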

.csv, .txt and .json are fine, too! Prodigy comes with readers for all file types; they just need to be formatted consistently. You can find examples of all supported file types and the expected input in the PRODIGY_README.html.
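If you'd still rather work with JSONL, converting a plain text file is straightforward. This is a small sketch using only the standard library, assuming one example per line and the {"text": ...} record shape shown earlier in this thread. lines_to_jsonl is a hypothetical helper name.

```python
import json


def lines_to_jsonl(lines):
    """Turn an iterable of text lines into a JSONL string, skipping blanks."""
    records = []
    for line in lines:
        text = line.strip()
        if text:
            records.append(json.dumps({"text": text}))
    return "\n".join(records)


# The two sentences from train_data.txt above, plus a blank line.
sample = ["This is a sentence\n", "\n", "This is another sentence\n"]
print(lines_to_jsonl(sample))
```

To convert a real file, pass an open file handle instead of the sample list and write the result to a file ending in .jsonl.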

Thanks @ines, I just read the documentation on file formats.
