AttributeError: 'NoneType' object has no attribute 'strings'

After training a model on 200 examples, I can't run binary teach: running the command below gives me an error. Any ideas what's going wrong? (Search didn't help, and I'm not sure what exactly is wrong in my dataset.)

prodigy ner.teach ner_st2_skills ./model/model-best ./data.jsonl --label SKILL

Using 1 label(s): SKILL
Traceback (most recent call last):
  File "/Users/fed/.pyenv/versions/3.9.2/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/Users/fed/.pyenv/versions/3.9.2/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/Users/fed/Library/Caches/pypoetry/virtualenvs/nel-riFBMyAx-py3.9/lib/python3.9/site-packages/prodigy/__main__.py", line 61, in <module>
    controller = recipe(*args, use_plac=True)
  File "cython_src/prodigy/core.pyx", line 329, in prodigy.core.recipe.recipe_decorator.recipe_proxy
  File "/Users/fed/Library/Caches/pypoetry/virtualenvs/nel-riFBMyAx-py3.9/lib/python3.9/site-packages/plac_core.py", line 367, in call
    cmd, result = parser.consume(arglist)
  File "/Users/fed/Library/Caches/pypoetry/virtualenvs/nel-riFBMyAx-py3.9/lib/python3.9/site-packages/plac_core.py", line 232, in consume
    return cmd, self.func(*(args + varargs + extraopts), **kwargs)
  File "/Users/fed/Library/Caches/pypoetry/virtualenvs/nel-riFBMyAx-py3.9/lib/python3.9/site-packages/prodigy/recipes/ner.py", line 71, in teach
    model = EntityRecognizer(nlp, label=label)
  File "cython_src/prodigy/models/ner.pyx", line 340, in prodigy.models.ner.EntityRecognizer.__init__
  File "cython_src/prodigy/util.pyx", line 621, in prodigy.util.copy_nlp
  File "spacy/vocab.pyx", line 90, in spacy.vocab.Vocab.vectors.__set__
AttributeError: 'NoneType' object has no attribute 'strings'

My data.jsonl was built with PhraseMatcher like this (from previously labeled data):

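import spacy
from spacy.matcher import PhraseMatcher
from spacy.tokens import Span

# nlp, data, skills and filterSkillsByConfidence are defined elsewhere
for obj in data:
    doc = nlp(obj['text'])

    # The matcher is rebuilt per example because the skill list depends on the listing
    matcher = PhraseMatcher(nlp.vocab, attr="LOWER")
    matcher.add("SKILL", [nlp.make_doc(cls['value']) for cls in filterSkillsByConfidence(skills[obj['meta']['listingId']])])
    matches = matcher(doc)

    entities = list()
    for match_id, start, end in matches:
        entities.append(Span(doc, start, end, label='SKILL'))

    # Drop overlapping matches, keeping the longest span
    doc.ents = spacy.util.filter_spans(entities)

    # token_end is inclusive, hence ent.end - 1
    obj["spans"] = [{"token_start": ent.start,
                    "token_end": ent.end - 1,
                    "start": ent.start_char,
                    "end": ent.end_char,
                    "text": ent.text,
                    "label": ent.label_} for ent in doc.ents]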

And whenever I try ner.correct instead, I get this warning:
poetry run python -m prodigy ner.correct ner_st2_skills ./model/model-best ./data.jsonl --label SKILL

The model you're using isn't setting sentence boundaries (e.g. via the parser or sentencizer). This means that incoming examples won't be split into sentences.

Hi! Sorry to hear you're running into this. I'm pretty sure I know what's up with the ner.teach error you're getting, but could you share your Prodigy and spaCy versions so I can verify?

============================== Info about spaCy ==============================

spaCy version    3.2.1                         
Location         /Users/fed/Library/Caches/pypoetry/virtualenvs/nel-riFBMyAx-py3.9/lib/python3.9/site-packages/spacy
Platform         macOS-11.6.1-x86_64-i386-64bit
Python version   3.9.2                         
Pipelines        en_core_web_md (3.2.0), en_core_web_sm (3.2.0)

and

I have no idea how to get the Prodigy version, but rerunning pip install prodigy -f https://*@download.prodi.gy actually gives me this error (btw, I installed Prodigy using this command ~3 days ago, so it should be the latest):

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
en-core-web-sm 3.2.0 requires spacy<3.3.0,>=3.2.0, but you have spacy 3.1.4 which is incompatible.
en-core-web-md 3.2.0 requires spacy<3.3.0,>=3.2.0, but you have spacy 3.1.4 which is incompatible.

I installed Prodigy first and then spaCy :frowning: which is why I didn't see this error at first.

@SofieVL thanks for pointing this out. I have rebuilt the env from scratch by installing spaCy and Prodigy, validating that there are no errors, and downloading the specific models that spaCy + Prodigy can support right now. Looks like the error disappeared and binary teach works great now!

Thanks again and happy holidays!

Hi,

Thanks for confirming!

For future reference, you can run

prodigy stats

to get the version number.

Anyway, Prodigy up until 1.11.6 was pinning spaCy to <3.2. That explains why rerunning the installation in your old environment throws the error: Prodigy tries to install spaCy 3.1.4. But because you had installed 3.2.1, you had also downloaded spaCy models for 3.2, so pip couldn't easily downgrade spaCy. As you've found, starting from scratch and letting Prodigy install spaCy from the start should fix your problems for now.
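If it helps to spot this kind of mismatch early, here is a minimal sketch that lists the installed versions of Prodigy, spaCy and the two pipeline packages mentioned in this thread. It only uses importlib.metadata from the standard library (Python 3.8+); the helper name installed_versions is made up for illustration and is not part of Prodigy or spaCy.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Iterable, Optional


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each package name to its installed version, or None if it is missing.

    Hypothetical helper for illustration only.
    """
    found: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # Package names taken from the pip error message earlier in the thread
    names = ["prodigy", "spacy", "en-core-web-sm", "en-core-web-md"]
    for name, ver in installed_versions(names).items():
        print(f"{name:<16} {ver or 'not installed'}")
```

Comparing the spaCy line against the pipeline package versions should make a 3.1 vs. 3.2 mismatch like the one above easy to see.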

Of course we want you to be able to benefit from the newest spaCy releases! In fact, we recently encountered the issue you originally described ourselves, and we've already fixed it to make sure Prodigy works well with the latest spaCy version. We're now working on a new release of Prodigy that will be compatible with spaCy 3.2. It will be available soonish :wink:


I'd be happy to be a beta tester for the new Prodigy release and would love to join that list. Thanks again for the help!

Since Prodigy is so tightly coupled with spaCy, would it be good to show the supported spaCy versions (or the spacy validate output) in the prodigy stats output here?

Version          1.11.6
Location         /Users/fed/Library/Caches/pypoetry/virtualenvs/nel-riFBMyAx-py3.9/lib/python3.9/site-packages/prodigy
Prodigy Home     /Users/fed/.prodigy
Platform         macOS-11.6.1-x86_64-i386-64bit
Python Version   3.9.2
Database Name    SQLite
Database Id      sqlite
Total Datasets   6
Total Sessions   28

Just released Prodigy v1.11.7, which should resolve the underlying problem :slight_smile:

This is a nice idea! We'd just have to think about how to best implement this, since the spaCy version range is typically only defined in Prodigy's package requirements (and we wouldn't want to duplicate this configuration, so it doesn't go out of sync). It's definitely possible to retrieve this info via importlib.metadata, though, so we can try that!
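As a rough sketch of what that lookup could look like (not how Prodigy actually implements it): importlib.metadata.requires returns the requirement strings a package declares, and the spaCy entry can be picked out of that list. The helper names find_requirement and supported_range are hypothetical, and the regex parsing is a simplification that assumes plain requirement strings without extras.

```python
import re
from importlib.metadata import PackageNotFoundError, requires
from typing import List, Optional


def find_requirement(requirements: List[str], dependency: str) -> Optional[str]:
    """Return the version specifier declared for `dependency`, or None if not listed.

    Simplified parsing: assumes entries like 'spacy<3.2.0,>=3.1.1' without extras.
    """
    for req in requirements:
        # Drop environment markers such as '; python_version < "3.8"'
        spec = req.split(";", 1)[0].strip()
        match = re.match(r"^([A-Za-z0-9][A-Za-z0-9._-]*)\s*(.*)$", spec)
        if match and match.group(1).lower() == dependency.lower():
            # Older metadata may wrap the specifier in parentheses
            return match.group(2).strip(" ()")
    return None


def supported_range(package: str, dependency: str) -> Optional[str]:
    """Look up the range of `dependency` that the installed `package` declares."""
    try:
        declared = requires(package) or []
    except PackageNotFoundError:
        return None
    return find_requirement(declared, dependency)


if __name__ == "__main__":
    # Prints None if Prodigy is not installed in the current environment
    print(supported_range("prodigy", "spacy"))
```

Reading the range from the installed package metadata this way keeps the requirements as the single source of truth, so nothing has to be duplicated in the code.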
