JSONL error handling

Ah, so sorry – you edited your post, right? I think I missed that.

Your tasks definitely look correct, so here are the possible explanations I can think of:

  1. In your example above, you’re annotating the label PERNEEL – can you check that the label is definitely present in the model? When you use the --label argument, Prodigy will get spaCy’s predictions and filter the stream so you only see examples containing entities with that label. But if your model doesn’t predict that label, there won’t be any tasks for you to annotate (see my comment here for a more detailed explanation of this). You can verify the labels with the first sketch after this list. This is something we’re fixing btw – in the next version, Prodigy will raise an error if a label is not in the model.

  2. How many examples does your personeel_headlines.jsonl file include? By default, ner.teach will re-sort the stream and only ask you about the examples it’s most unsure about, e.g. the ones with predictions close to 50/50 (and potentially skip examples with very high or very low scores). This adjusts as you annotate – but if there are only very few examples and they all end up with very low or very high scores, there might not be enough left over for a full batch. (The second sketch after this list shows a quick way to count what’s in the file.)

  3. The fact that the error occurs in itertoolz’s next(iter(seq)) suggests there’s an empty batch somewhere. What happens if you set "batch_size": 1 in your prodigy.json?
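For point 1, a quick way to check is to load the same model you’re passing to ner.teach and print the NER component’s labels. This is just a minimal sketch, assuming spaCy v2 or later – and nl_core_news_sm is only an example model name, so substitute whatever model you’re actually using:

```python
import spacy

# Load the same model you pass to ner.teach (adjust the name as needed)
nlp = spacy.load("nl_core_news_sm")

# Print the labels the NER component can actually predict.
# If the label you're filtering on isn't in here, --label will never match anything.
ner = nlp.get_pipe("ner")
print(ner.labels)
```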
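And for point 2, you could count how many examples are actually in the file and spot-check their format – a minimal sketch, assuming newline-delimited JSON with a "text" key per line:

```python
import json

# Count the examples in the source file (adjust the path to your setup)
with open("personeel_headlines.jsonl", encoding="utf8") as f:
    examples = [json.loads(line) for line in f if line.strip()]

print(len(examples), "examples")
print(examples[0])  # spot-check that each line has a "text" key
```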

As I said, this will be easier to debug in the future once we’ve added proper logging – but in the meantime, you could edit the teach() recipe in prodigy.recipes.ner and add a few print statements to see where it’s failing.
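For example, one low-effort way to do that is to wrap the recipe’s stream in a small generator that prints each task as it comes through – this is only a sketch of the idea, not the exact recipe code, and debug_stream is a hypothetical helper name:

```python
def debug_stream(stream):
    # Wrap the stream and print each task as it comes through, so you can
    # see whether it dries up before a full batch is filled.
    count = 0
    for eg in stream:
        count += 1
        print("Task", count, repr(eg.get("text", ""))[:60])
        yield eg
    print("Stream exhausted after", count, "tasks")

# Inside teach(), you'd wrap the existing stream before returning it, e.g.:
# stream = debug_stream(stream)
```

If the "Stream exhausted" line prints before you’ve seen a full batch of tasks, that would confirm the empty-batch theory from point 3.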