Hello. I am new to Prodigy and have created a project for span categorization using:
prodigy spans.manual my_project en_core_sci_sm C:\Prodigy\Data\my_project.csv --loader csv --label RESPIRATORY,NEGATIVE
When running the train recipe:
prodigy train ./Models --spancat my_project --base-model en_core_sci_sm
I'm getting this error:
ℹ Using CPU
========================= Generating Prodigy config =========================
ℹ Auto-generating config with spaCy
Using 'spacy.ngram_range_suggester.v1' for 'spancat' with sizes 1 to 11 (inferred from data)
ℹ Using config from base model
✔ Generated training config
=========================== Initializing pipeline ===========================
✘ Config validation error
Bad value substitution: option 'width' in section 'components.spancat.model.tok2vec' contains an interpolation key 'components.tok2vec.model.encode.width' which is not a valid option name. Raw value: '${components.tok2vec.model.encode.width}'
I am also seeing a similar error when trying the data-to-spacy command:
python -m prodigy data-to-spacy .\output --spancat my_project --base-model en_core_sci_sm
> ======================== Generating cached label data ========================
> Traceback (most recent call last):
> File "C:\Users\bmosher\Anaconda3\envs\prodigy\lib\runpy.py", line 194, in _run_module_as_main
> return _run_code(code, main_globals, None,
> File "C:\Users\bmosher\Anaconda3\envs\prodigy\lib\runpy.py", line 87, in _run_code
> exec(code, run_globals)
> File "C:\Users\bmosher\Anaconda3\envs\prodigy\lib\site-packages\prodigy\__main__.py", line 62, in <module>
> controller = recipe(*args, use_plac=True)
> File "cython_src\prodigy\core.pyx", line 379, in prodigy.core.recipe.recipe_decorator.recipe_proxy
> File "C:\Users\bmosher\Anaconda3\envs\prodigy\lib\site-packages\plac_core.py", line 367, in call
> cmd, result = parser.consume(arglist)
> File "C:\Users\bmosher\Anaconda3\envs\prodigy\lib\site-packages\plac_core.py", line 232, in consume
> return cmd, self.func(*(args + varargs + extraopts), **kwargs)
> File "C:\Users\bmosher\Anaconda3\envs\prodigy\lib\site-packages\prodigy\recipes\train.py", line 514, in data_to_spacy
> nlp = spacy_init_nlp(config)
> File "C:\Users\bmosher\Anaconda3\envs\prodigy\lib\site-packages\spacy\training\initialize.py", line 29, in init_nlp
> config = raw_config.interpolate()
> File "C:\Users\bmosher\Anaconda3\envs\prodigy\lib\site-packages\confection\__init__.py", line 196, in interpolate
> return Config().from_str(self.to_str())
> File "C:\Users\bmosher\Anaconda3\envs\prodigy\lib\site-packages\confection\__init__.py", line 387, in from_str
> self.interpret_config(config)
> File "C:\Users\bmosher\Anaconda3\envs\prodigy\lib\site-packages\confection\__init__.py", line 238, in interpret_config
> raise ConfigValidationError(desc=f"{e}") from None
> confection.ConfigValidationError:
>
> Config validation error
> Bad value substitution: option 'width' in section 'components.spancat.model.tok2vec' contains an interpolation key 'components.tok2vec.model.encode.width' which is not a valid option name. Raw value: '${components.tok2vec.model.encode.width}'
I verified that my text examples have a maximum length of 500. I'm really at a loss for how to move forward.
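Reading the message literally, the interpolation fails because the dotted path components.tok2vec.model.encode.width cannot be followed to the end in the config that gets generated. A minimal sketch of that kind of lookup, in plain Python with no spaCy imports, is below. The two dictionaries are made-up shapes for illustration only and are not the actual en_core_sci_sm config, and `resolve_dotted` is a hypothetical helper, not a spaCy or confection function.

```python
from typing import Any, Mapping


def resolve_dotted(config: Mapping[str, Any], dotted_key: str) -> Any:
    """Follow a dotted key through nested mappings.

    Raises KeyError naming the first part of the path that is missing.
    """
    node: Any = config
    walked = []
    for part in dotted_key.split("."):
        if not isinstance(node, Mapping) or part not in node:
            prefix = ".".join(walked) or "<root>"
            raise KeyError(f"'{part}' not found under '{prefix}'")
        node = node[part]
        walked.append(part)
    return node


# Hypothetical config shapes (assumptions, NOT taken from en_core_sci_sm):
# one where the tok2vec model has a nested "encode" block,
# and one where "width" sits directly on the model.
with_encode = {"components": {"tok2vec": {"model": {"encode": {"width": 96}}}}}
without_encode = {"components": {"tok2vec": {"model": {"width": 96}}}}

# The key named in the error message.
KEY = "components.tok2vec.model.encode.width"

if __name__ == "__main__":
    print(resolve_dotted(with_encode, KEY))
    try:
        resolve_dotted(without_encode, KEY)
    except KeyError as err:
        print("lookup failed:", err)
```

If the base model's tok2vec section happens to look like the second shape, the reference written into the spancat section would have nothing to point at, which would match the error above. That is only a guess until the real config is inspected.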
Here is the output of conda list:
# packages in environment at C:\Users\bmosher\Anaconda3\envs\prodigy:
#
# Name Version Build Channel
aiofiles 23.1.0 pypi_0 pypi
anyio 3.5.0 py38haa95532_0
appdirs 1.4.4 pyhd3eb1b0_0
argon2-cffi 21.3.0 pyhd3eb1b0_0
argon2-cffi-bindings 21.2.0 py38h2bbff1b_0
asttokens 2.0.5 pyhd3eb1b0_0
attrs 22.1.0 py38haa95532_0
backcall 0.2.0 pyhd3eb1b0_0
beautifulsoup4 4.11.1 py38haa95532_0
blas 1.0 mkl
bleach 4.1.0 pyhd3eb1b0_0
blis 0.7.9 pypi_0 pypi
brotlipy 0.7.0 py38h2bbff1b_1003
ca-certificates 2023.01.10 haa95532_0
cachetools 5.3.0 pypi_0 pypi
catalogue 2.0.8 pypi_0 pypi
certifi 2022.12.7 py38haa95532_0
cffi 1.15.1 py38h2bbff1b_3
charset-normalizer 2.0.4 pyhd3eb1b0_0
click 8.1.3 pypi_0 pypi
colorama 0.4.6 py38haa95532_0
comm 0.1.2 py38haa95532_0
confection 0.0.4 pypi_0 pypi
conllu 4.5.2 pypi_0 pypi
cryptography 38.0.4 py38h21b164f_0
cymem 2.0.7 pypi_0 pypi
cython 0.29.28 py38hd77b12b_0
debugpy 1.5.1 py38hd77b12b_0
decorator 5.1.1 pyhd3eb1b0_0
defusedxml 0.7.1 pyhd3eb1b0_0
en-core-sci-sm 0.5.1 pypi_0 pypi
entrypoints 0.4 py38haa95532_0
executing 0.8.3 pyhd3eb1b0_0
fastapi 0.89.1 pypi_0 pypi
fftw 3.3.9 h2bbff1b_1
flit-core 3.6.0 pyhd3eb1b0_0
gensim 4.2.0 py38hd77b12b_0
h11 0.14.0 pypi_0 pypi
icc_rt 2022.1.0 h6049295_2
idna 3.4 py38haa95532_0
importlib_resources 5.2.0 pyhd3eb1b0_1
intel-openmp 2021.4.0 haa95532_3556
ipykernel 6.19.2 py38hd4e2768_0
ipython 8.8.0 py38haa95532_0
ipython_genutils 0.2.0 pyhd3eb1b0_1
jedi 0.18.1 py38haa95532_1
jinja2 3.1.2 py38haa95532_0
joblib 1.2.0 pypi_0 pypi
jsonschema 4.16.0 py38haa95532_0
jupyter_client 7.4.8 py38haa95532_0
jupyter_core 5.1.1 py38haa95532_0
jupyter_server 1.23.4 py38haa95532_0
jupyterlab_pygments 0.1.2 py_0
langcodes 3.3.0 pypi_0 pypi
libffi 3.4.2 hd77b12b_6
libiconv 1.16 h2bbff1b_2
libsodium 1.0.18 h62dcd97_0
libxml2 2.9.14 h0ad7f3c_0
libxslt 1.1.35 h2bbff1b_0
lxml 4.9.1 py38h1985fb9_0
markupsafe 2.1.1 py38h2bbff1b_0
matplotlib-inline 0.1.6 py38haa95532_0
mistune 0.8.4 py38he774522_1000
mkl 2021.4.0 haa95532_640
mkl-service 2.4.0 py38h2bbff1b_0
mkl_fft 1.3.1 py38h277e83a_0
mkl_random 1.2.2 py38hf11a4ad_0
murmurhash 1.0.9 pypi_0 pypi
nbclassic 0.4.8 py38haa95532_0
nbclient 0.5.13 py38haa95532_0
nbconvert 6.5.4 py38haa95532_0
nbformat 5.7.0 py38haa95532_0
nest-asyncio 1.5.6 py38haa95532_0
nmslib 2.1.1 pypi_0 pypi
notebook 6.5.2 py38haa95532_0
notebook-shim 0.2.2 py38haa95532_0
numpy 1.23.5 py38h3b20f71_0
numpy-base 1.23.5 py38h4da318b_0
openssl 1.1.1s h2bbff1b_0
packaging 22.0 py38haa95532_0
pandocfilters 1.5.0 pyhd3eb1b0_0
parso 0.8.3 pyhd3eb1b0_0
pathy 0.10.1 pypi_0 pypi
peewee 3.15.4 pypi_0 pypi
pickleshare 0.7.5 pyhd3eb1b0_1003
pip 22.3.1 py38haa95532_0
pkgutil-resolve-name 1.3.10 py38haa95532_0
plac 1.1.3 pypi_0 pypi
platformdirs 2.5.2 py38haa95532_0
pooch 1.4.0 pyhd3eb1b0_0
preshed 3.0.8 pypi_0 pypi
prodigy 1.11.10 pypi_0 pypi
prometheus_client 0.14.1 py38haa95532_0
prompt-toolkit 3.0.36 py38haa95532_0
psutil 5.9.0 py38h2bbff1b_0
pure_eval 0.2.2 pyhd3eb1b0_0
pybind11 2.6.1 pypi_0 pypi
pycparser 2.21 pyhd3eb1b0_0
pydantic 1.10.4 pypi_0 pypi
pygments 2.11.2 pyhd3eb1b0_0
pyjwt 2.6.0 pypi_0 pypi
pyopenssl 22.0.0 pyhd3eb1b0_0
pyrsistent 0.18.0 py38h196d8e1_0
pysbd 0.3.4 pypi_0 pypi
pysocks 1.7.1 py38haa95532_0
python 3.8.16 h6244533_2
python-dateutil 2.8.2 pyhd3eb1b0_0
python-fastjsonschema 2.16.2 py38haa95532_0
pywin32 305 py38h2bbff1b_0
pywinpty 2.0.2 py38h5da7b33_0
pyzmq 23.2.0 py38hd77b12b_0
requests 2.28.1 py38haa95532_0
scikit-learn 1.2.1 pypi_0 pypi
scipy 1.10.0 py38h321e85e_0
scispacy 0.5.1 pypi_0 pypi
send2trash 1.8.0 pyhd3eb1b0_1
setuptools 65.6.3 py38haa95532_0
six 1.16.0 pyhd3eb1b0_1
smart_open 5.2.1 py38haa95532_0
sniffio 1.2.0 py38haa95532_1
soupsieve 2.3.2.post1 py38haa95532_0
spacy 3.4.4 pypi_0 pypi
spacy-legacy 3.0.12 pypi_0 pypi
spacy-loggers 1.0.4 pypi_0 pypi
sqlite 3.40.1 h2bbff1b_0
srsly 2.4.5 pypi_0 pypi
stack_data 0.2.0 pyhd3eb1b0_0
starlette 0.22.0 pypi_0 pypi
terminado 0.17.1 py38haa95532_0
thinc 8.1.7 pypi_0 pypi
threadpoolctl 3.1.0 pypi_0 pypi
tinycss2 1.2.1 py38haa95532_0
toolz 0.12.0 pypi_0 pypi
tornado 6.2 py38h2bbff1b_0
tqdm 4.64.1 pypi_0 pypi
traitlets 5.7.1 py38haa95532_0
typer 0.7.0 pypi_0 pypi
typing-extensions 4.4.0 py38haa95532_0
typing_extensions 4.4.0 py38haa95532_0
urllib3 1.26.14 py38haa95532_0
uvicorn 0.18.3 pypi_0 pypi
vc 14.2 h21ff451_1
vs2015_runtime 14.27.29016 h5e58377_2
wasabi 0.10.1 pypi_0 pypi
wcwidth 0.2.5 pyhd3eb1b0_0
webencodings 0.5.1 py38_1
websocket-client 0.58.0 py38haa95532_4
wheel 0.37.1 pyhd3eb1b0_0
win_inet_pton 1.1.0 py38haa95532_0
wincertstore 0.2 py38haa95532_2
winpty 0.4.3 4
zeromq 4.3.4 hd77b12b_0
zipp 3.11.0 py38haa95532_0
zlib 1.2.13 h8cc25b3_0
I am grateful for any hints or suggestions.
Cheers,
Bryan Mosher