ImportError: cannot import name 'Schema' from 'pydantic'

Hi all,

I tried pip installing prodigy==1.10.6 on my local machine. The package installs properly; however, I am getting the following traceback:

>>> import prodigy

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/archie/prodigy/venv/lib/python3.8/site-packages/prodigy/__init__.py", line 7, in <module>
    from . import recipes
  File "/home/archie/prodigy/venv/lib/python3.8/site-packages/prodigy/recipes/__init__.py", line 4, in <module>
    from ..deprecated import recipes  # noqa
  File "/home/archie/prodigy/venv/lib/python3.8/site-packages/prodigy/deprecated/recipes.py", line 9, in <module>
    from ..core import recipe
  File "cython_src/prodigy/core.pyx", line 12, in init prodigy.core
  File "cython_src/prodigy/components/feeds.pyx", line 8, in init prodigy.components.feeds
  File "/home/archie/prodigy/venv/lib/python3.8/site-packages/prodigy/components/validate.py", line 5, in <module>
    from ..types import TextTask, ClassificationTask, ImageTask, SpansTask, SpansManualTask
  File "/home/archie/prodigy/venv/lib/python3.8/site-packages/prodigy/types.py", line 3, in <module>
    from pydantic import validator, BaseModel, Field, Schema
ImportError: cannot import name 'Schema' from 'pydantic' (/home/archie/prodigy/venv/lib/python3.8/site-packages/pydantic/__init__.cpython-38-x86_64-linux-gnu.so)

It appears that prodigy breaks when pydantic>=1.8. This is because pydantic.fields.Schema, which had been deprecated in favor of pydantic.fields.Field, was removed in version 1.8.

My short-term solution was to pip install pydantic==1.7. But I thought it would be good to give you a heads-up before the next release.
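
For anyone checking whether their own environment is affected, here is a minimal sketch that reports whether a module still exposes a given name. The helper `exposes_name` is a hypothetical name introduced for illustration; it is not part of Prodigy or pydantic, and it only inspects whatever is installed for the interpreter that runs it.

```python
import importlib


def exposes_name(module_name, attr):
    """Return True/False if the module imports, or None if it is not installed."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None  # the module itself is missing from this environment
    try:
        getattr(module, attr)
    except (AttributeError, ImportError):
        return False  # the module imports, but the name is no longer there
    return True


if __name__ == "__main__":
    # False here means `from pydantic import Schema` would fail as in the traceback.
    print("pydantic exposes Schema:", exposes_name("pydantic", "Schema"))
```

If this prints False together with an old Prodigy version, pinning pydantic below 1.8 or upgrading Prodigy are the two ways out discussed in this thread.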

This is the very first time I have ever dealt with a bug in any of your projects. Keep up the awesome work! :slight_smile:


Thanks for the heads-up! I guess we'll have to push another update then to prevent the new Pydantic from breaking Prodigy.

(This is a frustrating part of pinning dependencies in Python: we used to pin it more narrowly, to the next minor version, but this can cause users to get version-locked. If we pin to a library loosely, though, a breaking update can break our stuff.)

Edit: Fixed in Prodigy v1.10.7!

Wow that was fast! Thanks Ines for the quick feedback and fix!

Yeah, it is definitely an interesting dilemma. Out of curiosity, would you consider pinning the dependencies again? As a consumer of the software, I lean towards pinning the dependencies (or bundling it as an executable) because I want it to always work even once the updates have expired.

Yes, absolutely! In fact, I just added an upper pin for <1.9.0. IMO it's very important that a user is able to install older versions of a software package, especially in data science. You should be able to pick up a project a year later and it should still work. At the same time, packages should be able to move on and introduce breaking changes (of course while following proper versioning standards).
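
To make the effect of an upper pin such as <1.9.0 concrete, here is a rough sketch of the comparison an installer performs. It is a simplification under stated assumptions: the helpers `parse_release` and `below_upper_pin` are hypothetical names, and they only handle plain numeric versions like 1.8.2 (no pre-releases or other suffixes, which real tools do handle).

```python
def parse_release(version, width=3):
    """Turn '1.8.2' into (1, 8, 2), padding short versions like '1.9' with zeros."""
    parts = [int(part) for part in version.split(".")]
    return tuple(parts + [0] * (width - len(parts)))


def below_upper_pin(version, upper):
    """True if `version` is strictly below the exclusive upper bound `upper`."""
    return parse_release(version) < parse_release(upper)


if __name__ == "__main__":
    for candidate in ("1.7.4", "1.8.2", "1.9.0"):
        print(candidate, "allowed by <1.9.0:", below_upper_pin(candidate, "1.9.0"))
```

The point of the pin is that a future release with further breaking changes is excluded up front, while existing compatible releases stay installable.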

What makes this so difficult in the Python ecosystem is that you can only have one version of a package installed in the environment (unlike JS/Node/npm – not saying this is better, though) and that packages ship with their requirements included. If dependencies were specified outside of the package, this would allow them to be adjusted as other packages change. Anyway, if you're interested in this topic, here's a slightly longer (and surprisingly, quite controversial) recent explanation by @honnibal: This will be a somewhat intemperate response, because as a developer of a signif... | Hacker News

Yeah, to be honest, building up your own .pex files for your dev environments is probably the best future-proof way to go. (I think we actually briefly considered shipping Prodigy as an executable but it'd make the downloaded file significantly larger. Plus, it's not a concept every developer is familiar with.)

Ines, thank you for the in-depth response. There is a lot here to digest and look into. The HN thread was an interesting read. What stood out to me was the distinction between an "application" and "library".

In the case of prodigy, I find that it fits neatly in the application grouping: a CLI app that helps you annotate and manage data with an intuitive GUI and fits nicely into a data preparation workflow. Whereas something like spaCy is clearly a library that you incorporate into your app, so shared dependency flexibility is super important.

In a previous project, we opted to use pyinstaller to deliver an app. The size of it was not an issue, but it quickly became very cumbersome when we needed to specify hidden imports and static data. This often occurred with data science packages that relied on underlying C libraries.

Also, thank you for the pex recommendation. I have yet to use this and am looking forward to giving it a good look :slight_smile:

Hi Ines,

I am about to train a new NER model to recognize the algorithms in scientific papers. I prefer the recipe ner.manual with patterns to help me annotate raw text extracted from papers.

Firstly, I collected about 7,000 algorithm entities from Wikipedia and stored them in a text file.
Secondly, I wrote a Python script to transform the list of algorithms (text file) into a JSONL patterns file, following the information in https://support.prodi.gy/t/two-word-ner/995.
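
For readers who want to do the same, here is a minimal sketch of such a conversion. It is not the poster's script: `make_pattern` and `terms_to_jsonl` are hypothetical helper names, and splitting on whitespace only approximates spaCy's tokenization, so names containing hyphens or punctuation may need hand-written token patterns.

```python
import json


def make_pattern(term, label):
    """Build one token-based pattern: a {"lower": ...} dict per whitespace-separated word."""
    tokens = [{"lower": word.lower()} for word in term.split()]
    return {"label": label, "pattern": tokens}


def terms_to_jsonl(terms, label):
    """Return JSONL text with one pattern per non-empty term."""
    lines = [json.dumps(make_pattern(term.strip(), label)) for term in terms if term.strip()]
    return "\n".join(lines)


if __name__ == "__main__":
    # Example terms are illustrative; in practice they would be read from the text file.
    print(terms_to_jsonl(["Binary search", "Quicksort"], "ALGORITHM"))
```

Matching on the lowercase form means "Binary Search" and "binary search" in the raw text are both pre-highlighted.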

However, when I executed the code below I faced the same issue as https://support.prodi.gy/t/importerror-cannot-import-name-schema-from-pydantic/3948. I downgraded the version of pydantic to 1.7 and 1.7.4, but that didn't work. The version of Prodigy is 1.10.8.

Could you help address this issue? Thanks in advance.

$ python -m prodigy ner.manual ner_algorithm_demo en_core_web_trf ./news_headlines.jsonl --label ALGORITHM --patterns ./algorithm_pattern.jsonl
Traceback (most recent call last):
  File "C:\Python37\lib\runpy.py", line 183, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "C:\Python37\lib\runpy.py", line 142, in _get_module_details
    return _get_module_details(pkg_main_name, error)
  File "C:\Python37\lib\runpy.py", line 109, in _get_module_details
    __import__(pkg_name)
  File "C:\Python37\lib\site-packages\prodigy\__init__.py", line 7, in <module>
    from . import recipes
  File "C:\Python37\lib\site-packages\prodigy\recipes\__init__.py", line 4, in <module>
    from ..deprecated import recipes  # noqa
  File "C:\Python37\lib\site-packages\prodigy\deprecated\recipes.py", line 9, in <module>
    from ..core import recipe
  File "cython_src\prodigy\core.pyx", line 12, in init prodigy.core
  File "cython_src\prodigy\components\feeds.pyx", line 8, in init prodigy.components.feeds
  File "C:\Python37\lib\site-packages\prodigy\components\validate.py", line 5, in <module>
    from ..types import TextTask, ClassificationTask, ImageTask, SpansTask, SpansManualTask
  File "C:\Python37\lib\site-packages\prodigy\types.py", line 3, in <module>
    from pydantic import validator, BaseModel, Field, Schema
ImportError: cannot import name 'Schema' from 'pydantic' (C:\Python37\lib\site-packages\pydantic\__init__.cp37-win_amd64.pyd)


Are you sure that this is the version you have installed in the environment? Prodigy v1.10.8 shouldn't be importing Schema from pydantic, we removed this explicitly in Prodigy v1.10.7. So maybe you ended up with an older Prodigy version somehow? Or maybe the python here refers to a different Python environment than the one you expect or the one pip installs to? This would also explain why you didn't see any effect from downgrading pydantic.
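
One way to check both possibilities is to ask the interpreter itself which executable is running and which package versions it can see. This is a minimal sketch, assuming Python 3.8 or newer (where `importlib.metadata` is in the standard library); `installed_version` is a hypothetical helper name. On older Pythons, running `python -m pip show prodigy` with the same `python` gives similar information.

```python
import sys
from importlib import metadata  # in the standard library from Python 3.8


def installed_version(package):
    """Return the installed version string, or None if this interpreter cannot see the package."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Run this with the same `python` used for `python -m prodigy ...`.
    print("interpreter:", sys.executable)
    for name in ("prodigy", "pydantic", "spacy"):
        print(name, installed_version(name))
```

If the interpreter path is not the one inside the expected virtual environment, pip and python are pointing at different environments.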


Hi Ines, thanks for your reply. I am pretty sure the version of Prodigy is v1.10.8, see the attached pic above.

This sounds probable. I switched to the right Python environment, and now I get a new issue. Is the problem that I have spaCy v3.1.1 installed? Thanks!

(venv) C:\Users\Jayshow\PycharmProjects\prodigy> python -m prodigy ner.manual ner_algorithm_demo en_core_web_trf ./news_headlines.jsonl --label ALGORITHM --patterns ./algorithm_pattern.jsonl
Traceback (most recent call last):
  File "C:\Program Files\Python36\lib\runpy.py", line 183, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "C:\Program Files\Python36\lib\runpy.py", line 142, in _get_module_details
    return _get_module_details(pkg_main_name, error)
  File "C:\Program Files\Python36\lib\runpy.py", line 109, in _get_module_details
    __import__(pkg_name)
  File "C:\Users\Jayshow\PycharmProjects\prodigy\venv\lib\site-packages\prodigy\__init__.py", line 7, in <module>
    from . import recipes
  File "C:\Users\Jayshow\PycharmProjects\prodigy\venv\lib\site-packages\prodigy\recipes\__init__.py", line 4, in <module>
    from ..deprecated import recipes  # noqa
  File "C:\Users\Jayshow\PycharmProjects\prodigy\venv\lib\site-packages\prodigy\deprecated\recipes.py", line 15, in <module>
    from ..models.matcher import PatternMatcher
  File "C:\Users\Jayshow\PycharmProjects\prodigy\venv\lib\site-packages\prodigy\models\__init__.py", line 1, in <module>
    from .ner import EntityRecognizer, merge_spans  # noqa: F401
  File "cython_src\prodigy\models\ner.pyx", line 7, in init prodigy.models.ner
ModuleNotFoundError: No module named 'spacy.gold'

Yes, Prodigy v1.10 requires spaCy v2. We have a new release coming up (currently available as a nightly) that uses spaCy v3 :slightly_smiling_face:

Whoops, understood. Many thanks, Ines. I just solved this issue successfully. Here is the version information of my packages or models, and I hope this may help others.

en_core_web_sm v2.2.0
spaCy v2.3.7
Prodigy v1.10.8
