Error: Can't find label with ner.teach

(I saw there is a similar topic about this error on the forum, but it doesn't resolve my issue.)

  • I trained 3 NER categories (VN-L, VN-N, VN-H) and wrote them successfully to a new model (VN1). The output shows that the three categories are trained (not all with a very high F-score...)
  • As a next step I wanted to use ner.teach with this VN1 model on a new source file.
  • I then receive the error "Can't find label 'VN-L' in model .../VN1" (if I remove VN-L from --label, it can't find VN-N)
  • This seems strange since I just wrote this label in the previous step.

Any suggestions how to resolve this?

As an aside -- is there a spaCy command to list the entities available in a spaCy model?

Best,

Roel

Hi! I just had a quick look and I think the issue is caused by the fact that your labels include a hyphen (-), which is also used internally in spaCy to represent the BILUO scheme, e.g. B-VN-L. It looks like Prodigy still checks for label.split("-")[-1], which isn't ideal, because for a label like VN-L it only looks at the part after the last hyphen. Instead, it should just call into spaCy directly, get the labels of the NER component and check against those. I'll fix this for the next version!
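To illustrate why the hyphen matters, here is a minimal sketch. It is not Prodigy's or spaCy's actual code; the two helper functions are hypothetical and only compare splitting a BILUO tag on the last hyphen with splitting it on the first one.

```python
def label_from_tag_naive(tag: str) -> str:
    """Take everything after the LAST hyphen, like label.split("-")[-1]."""
    return tag.split("-")[-1]


def label_from_tag_safe(tag: str) -> str:
    """Split only on the FIRST hyphen, so the label keeps its own hyphens."""
    prefix, _, label = tag.partition("-")
    # Tags without a hyphen (e.g. "O") have no label part; return them as-is.
    return label if label else prefix


for label in ("VN-L", "VN-N", "VN-H"):
    tag = f"B-{label}"  # BILUO tag for the beginning of an entity
    print(tag, "->", label_from_tag_naive(tag), "vs", label_from_tag_safe(tag))
```

With the naive split, B-VN-L yields just L, so a lookup for VN-L can never match. Labels without hyphens are unaffected by either approach.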

In the meantime, the quickest workaround is this: find the recipes/ner.py of your Prodigy installation (you can run prodigy stats to find the path) and find the following lines:

for l in label:
    if not model.has_label(l):

... and then change those to:

ner = nlp.get_pipe("ner")
for l in label:
    if l not in ner.labels:

Of course, the alternative would be to export your datasets, rename all labels so they don't contain hyphens, and re-import them.

As for listing the entities available in a model: yes, that's the nlp.pipe_labels property, which returns a dict keyed by component name and mapped to that component's labels. You can also look at the .labels property of a given component, e.g.:

ner = nlp.get_pipe("ner")
print(ner.labels)

Hi Ines,

Thank you for your response.

I think the hyphen is already an issue when the model is saved, which means your code fix unfortunately doesn't work.

When I load the trained model in spaCy, ner.labels doesn't show the named entities 'VN-L', 'VN-N', 'VN-H'. Instead, there is a new entity 'VN'.

best,

Roel

Ah, in that case, the easiest workaround would probably be to export the dataset, update the labels so they don't use hyphens and then retrain. Sorry about the confusion here!
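The relabelling step could look roughly like the sketch below. It assumes the dataset was exported to JSONL with prodigy db-out and that each task stores its entity labels under "spans" (and optionally a top-level "label"). The helper names and the choice of underscores (VN-L becomes VN_L) are assumptions for illustration only.

```python
import json


def rename_label(label: str) -> str:
    """Replace hyphens with underscores, e.g. VN-L -> VN_L (assumed naming)."""
    return label.replace("-", "_")


def relabel_task(task: dict) -> dict:
    """Return a copy of one annotation task with hyphen-free labels."""
    fixed = dict(task)
    # Some tasks carry a top-level label; rename it if present.
    if "label" in fixed:
        fixed["label"] = rename_label(fixed["label"])
    # Rename the label on every span without mutating the original spans.
    if "spans" in fixed:
        fixed["spans"] = [
            {**span, "label": rename_label(span["label"])} if "label" in span else dict(span)
            for span in fixed["spans"]
        ]
    return fixed


def relabel_jsonl_lines(lines):
    """Yield relabelled JSONL lines, skipping blank ones."""
    for line in lines:
        if line.strip():
            yield json.dumps(relabel_task(json.loads(line)))


# Made-up example record, in place of a real exported file.
example = '{"text": "foo bar", "spans": [{"start": 0, "end": 3, "label": "VN-L"}], "answer": "accept"}'
print(next(relabel_jsonl_lines([example])))
```

To apply it to an export, pass an open file handle as lines and write each result to a new file, then load that file into a fresh dataset with prodigy db-in and retrain.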

(Btw, does the problem only happen when you train with Prodigy, or do you see the same issue when you export the data with data-to-spacy and run spacy train?)

Ah, in that case, the easiest workaround would probably be to export the dataset, update the labels so they don't use hyphens and then retrain. Sorry about the confusion here!

Yes; that worked fine.

(Btw, does the problem only happen when you train with Prodigy, or do you see the same issue when you export the data with data-to-spacy and run spacy train?)

I'll give that a go in the future!
