Bug Report: Error in using ner.model-annotate recipe related to `labels` argument

:bug: Bug description:
Hi :slight_smile: I'm following the "models as annotators" tutorial, trying to use spaCy's medium model (en_core_web_md) to label 4 entities. However, the --labels argument seems to be wrong, and I cannot get --label to work as expected either.

:man_walking:t4:Reproduction steps:
How can we recreate the bug?
First, I ran the command exactly as the tutorial states, passing --labels ORG,PERSON,LOC,PRODUCT. This fails.

Second, I ran it without the --labels argument, so it defaulted to all the pre-defined labels. It runs OK.

I then tried passing --label ORG,PERSON,LOC,PRODUCT. Note that it's --label rather than --labels. It runs, but does not annotate anything: the progress stays at 0%, rather than reaching 100% as in the previous step.

:desktop_computer: Environment variables:
Version 1.15.6
License Type Prodigy Personal
Platform macOS-10.16-x86_64-i386-64bit
Python Version 3.9.13
spaCy Version 3.7.5

Welcome to the forum @foreveryang0208 :wave:

There's definitely a mistake in the video tutorial around minute 9. I'm guessing it's because it was recorded with a version of the recipe that wasn't 100% final.
Sorry about that!
The right argument name is --label. Btw. you can always run the recipe with just the -h flag to get CLI help on the arguments.

The reason none of the examples get annotated on your second pass is that the ner.model-annotate recipe filters out examples that were already annotated by the model identified by the model-alias argument.
In other words, since your second command (with the subset of labels) uses the same target dataset and the same model-alias, all examples are filtered out because they are considered to be already annotated by en_core_web_md. So for your experiment to work, you should change the target dataset name (or the model alias name) so that the recipe annotates all examples again.
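To make the filtering behaviour a bit more concrete, here is a minimal Python sketch of the idea. It is not Prodigy's actual implementation: the function `pending_examples`, the `"model_alias"` key and the alias names are all made up for illustration, and the sketch only assumes that an example counts as done once the same alias has annotated the same text in the same dataset.

```python
from typing import Dict, List


def pending_examples(
    stream: List[Dict[str, str]],
    dataset: List[Dict[str, str]],
    model_alias: str,
) -> List[Dict[str, str]]:
    """Return the examples this alias has not yet annotated in this dataset.

    Hypothetical helper: the "model_alias" key is an assumption for this
    sketch, not a documented Prodigy field.
    """
    seen = {ex["text"] for ex in dataset if ex.get("model_alias") == model_alias}
    return [ex for ex in stream if ex["text"] not in seen]


stream = [
    {"text": "Apple hired Tim Cook."},
    {"text": "Berlin is in Germany."},
]
dataset: List[Dict[str, str]] = []  # stands in for the target dataset

# First run: nothing has been annotated yet, so every example is processed.
first = pending_examples(stream, dataset, "spacy_md")
dataset.extend({"text": ex["text"], "model_alias": "spacy_md"} for ex in first)

# Second run, same dataset and same alias: everything is filtered out.
second = pending_examples(stream, dataset, "spacy_md")

# Changing the alias (or using an empty dataset) makes the examples eligible again.
renamed = pending_examples(stream, dataset, "spacy_md_subset")
fresh = pending_examples(stream, [], "spacy_md")

print(len(first), len(second), len(renamed), len(fresh))
```

Note that the label subset plays no part in the filter in this sketch, which matches the behaviour described above: changing only --label while keeping the dataset and alias leaves nothing to process.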


Thank you @magdaaniol ! That's very helpful!