Hi all,
I started the text classification example, but ran into this error:

File "msgpack/_unpacker.pyx", line 187, in msgpack._cmsgpack.unpackb
ValueError: 1792000 exceeds max_bin_len(1048576)

At first I assumed I was loading too much data at once (or at least, more than my 16GB), but 1792000 bytes is only about 1.7 MB, nowhere near my RAM limit. It looks more like msgpack's default max_bin_len of 1048576 bytes (1 MB) being exceeded while the model is deserialized, so buying another 16GB of RAM presumably wouldn't help. Workarounds, or things I'm doing wrong?
thanks,
Andreas
The full dump is below:
python3 -m prodigy textcat.teach gh_issues en_core_web_sm "docs" --api github --label DOCUMENTATION
Using 1 labels: DOCUMENTATION
Traceback (most recent call last):
File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/ahe/.local/lib/python3.6/site-packages/prodigy/__main__.py", line 259, in <module>
controller = recipe(*args, use_plac=True)
File "cython_src/prodigy/core.pyx", line 253, in prodigy.core.recipe.recipe_decorator.recipe_proxy
File "/home/usr/.local/lib/python3.6/site-packages/plac_core.py", line 328, in call
cmd, result = parser.consume(arglist)
File "/home/usr/.local/lib/python3.6/site-packages/plac_core.py", line 207, in consume
return cmd, self.func(*(args + varargs + extraopts), **kwargs)
File "/home/ahe/.local/lib/python3.6/site-packages/prodigy/recipes/textcat.py", line 45, in teach
nlp = spacy.load(spacy_model, disable=['ner', 'parser'])
File "/home/ahe/.local/lib/python3.6/site-packages/spacy/__init__.py", line 21, in load
return util.load_model(name, **overrides)
File "/home/ahe/.local/lib/python3.6/site-packages/spacy/util.py", line 112, in load_model
return load_model_from_link(name, **overrides)
File "/home/ahe/.local/lib/python3.6/site-packages/spacy/util.py", line 129, in load_model_from_link
return cls.load(**overrides)
File "/home/ahe/.local/lib/python3.6/site-packages/spacy/data/en_core_web_sm/__init__.py", line 12, in load
return load_model_from_init_py(__file__, **overrides)
File "/home/ahe/.local/lib/python3.6/site-packages/spacy/util.py", line 173, in load_model_from_init_py
return load_model_from_path(data_path, meta, **overrides)
File "/home/ahe/.local/lib/python3.6/site-packages/spacy/util.py", line 156, in load_model_from_path
return nlp.from_disk(model_path)
File "/home/ahe/.local/lib/python3.6/site-packages/spacy/language.py", line 647, in from_disk
util.from_disk(path, deserializers, exclude)
File "/home/ahe/.local/lib/python3.6/site-packages/spacy/util.py", line 511, in from_disk
reader(path / key)
File "/home/ahe/.local/lib/python3.6/site-packages/spacy/language.py", line 643, in <lambda>
deserializers[name] = lambda p, proc=proc: proc.from_disk(p, vocab=False)
File "pipeline.pyx", line 643, in spacy.pipeline.Tagger.from_disk
File "/home/ahe/.local/lib/python3.6/site-packages/spacy/util.py", line 511, in from_disk
reader(path / key)
File "pipeline.pyx", line 626, in spacy.pipeline.Tagger.from_disk.load_model
File "pipeline.pyx", line 627, in spacy.pipeline.Tagger.from_disk.load_model
File "/home/ahe/.local/lib/python3.6/site-packages/thinc/neural/_classes/model.py", line 335, in from_bytes
data = msgpack.loads(bytes_data, encoding='utf8')
File "/home/ahe/.local/lib/python3.6/site-packages/msgpack_numpy.py", line 184, in unpackb
return _unpackb(packed, **kwargs)
File "msgpack/_unpacker.pyx", line 187, in msgpack._cmsgpack.unpackb
ValueError: 1792000 exceeds max_bin_len(1048576)
...
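For reference, the limit itself can be reproduced in isolation, independent of spaCy. This is a minimal sketch, assuming only that the msgpack Python package is installed; the 1792000-byte size is taken from the error message above, and the 1048576-byte limit matches the default max_bin_len reported there:

```python
import msgpack

# A binary blob the same size as the one in the traceback (1792000 bytes).
blob = msgpack.packb(b"\x00" * 1792000, use_bin_type=True)

# With a 1 MB max_bin_len (the default reported in the error), unpacking fails:
try:
    msgpack.unpackb(blob, raw=False, max_bin_len=1048576)
except ValueError as err:
    print(err)  # e.g. "1792000 exceeds max_bin_len(1048576)"

# Raising the limit lets the exact same payload through:
data = msgpack.unpackb(blob, raw=False, max_bin_len=2**31 - 1)
print(len(data))  # 1792000
```

So the payload is well-formed; msgpack just refuses to unpack binary fields above its configured limit. The call site here is inside thinc (model.py line 335 in the traceback), though, so I can't pass max_bin_len myself without patching the library.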