Trouble training for Portuguese

I'm getting exactly the same error as in the post below. Initially I thought it was caused by a limitation of my machine, but it now looks like a limitation in msgpack.

In that post, you suggested passing vocab=False to model.to_disk().

I looked at the ner.batch-train recipe and at Prodigy's Python source code, but couldn't figure out where I would need to make that change.

Any help would be much appreciated!

Thanks!

The complete error log I get is the following:

python -m prodigy ner.batch-train ner_nome pt_vectors_web_lg --output /model --eval-split 0.5 --label PER --batch-size 1
Using 1 labels: PER

Loaded model pt_vectors_web_lg
Using 50% of accept/reject examples (1374) for evaluation
Traceback (most recent call last):
  File "C:\Users\Rogerio\Anaconda3\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "C:\Users\Rogerio\Anaconda3\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\Rogerio\Python VENV\lib\site-packages\prodigy\__main__.py", line 259, in <module>
    controller = recipe(*args, use_plac=True)
  File "cython_src\prodigy\core.pyx", line 167, in prodigy.core.recipe.recipe_decorator.recipe_proxy
  File "C:\Users\Rogerio\Python VENV\lib\site-packages\plac_core.py", line 328, in call
    cmd, result = parser.consume(arglist)
  File "C:\Users\Rogerio\Python VENV\lib\site-packages\plac_core.py", line 207, in consume
    return cmd, self.func(*(args + varargs + extraopts), **kwargs)
  File "C:\Users\Rogerio\Python VENV\lib\site-packages\prodigy\recipes\ner.py", line 411, in batch_train
    model = EntityRecognizer(nlp, label=label, no_missing=no_missing)
  File "cython_src\prodigy\models\ner.pyx", line 165, in prodigy.models.ner.EntityRecognizer.__init__
  File "C:\Users\Rogerio\Anaconda3\lib\copy.py", line 180, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "C:\Users\Rogerio\Anaconda3\lib\copy.py", line 280, in _reconstruct
    state = deepcopy(state, memo)
  File "C:\Users\Rogerio\Anaconda3\lib\copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "C:\Users\Rogerio\Anaconda3\lib\copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "C:\Users\Rogerio\Anaconda3\lib\copy.py", line 180, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "C:\Users\Rogerio\Anaconda3\lib\copy.py", line 274, in _reconstruct
    y = func(*args)
  File "C:\Users\Rogerio\Anaconda3\lib\copy.py", line 273, in <genexpr>
    args = (deepcopy(arg, memo) for arg in args)
  File "C:\Users\Rogerio\Anaconda3\lib\copy.py", line 180, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "C:\Users\Rogerio\Anaconda3\lib\copy.py", line 274, in _reconstruct
    y = func(*args)
  File "vectors.pyx", line 24, in spacy.vectors.unpickle_vectors
  File "vectors.pyx", line 428, in spacy.vectors.Vectors.from_bytes
  File "C:\Users\Rogerio\Python VENV\lib\site-packages\spacy\util.py", line 490, in from_bytes
    msg = msgpack.loads(bytes_data, raw=False)
  File "C:\Users\Rogerio\Python VENV\lib\site-packages\msgpack_numpy.py", line 187, in unpackb
    return _unpacker.unpackb(packed, encoding=encoding, **kwargs)
  File "msgpack\_unpacker.pyx", line 200, in msgpack._unpacker.unpackb
ValueError: 2400000051 exceeds max_bin_len(2147483647)
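
A quick sanity check on the numbers in the last line of the traceback. The two byte counts are copied from the error message; the vector width (300 dimensions) and dtype (float32, 4 bytes per value) are assumptions on my part and are not confirmed anywhere in the log, so treat the row estimates as a rough sketch only.

```python
# Byte counts copied verbatim from the ValueError in the traceback.
payload_bytes = 2_400_000_051   # size of the serialized vectors blob
max_bin_len = 2_147_483_647     # limit reported by msgpack

# The limit is the largest signed 32-bit integer.
assert max_bin_len == 2**31 - 1

# How far over the limit the blob is.
excess_bytes = payload_bytes - max_bin_len

# ASSUMPTION (not from the log): 300-dimensional float32 vectors.
assumed_dims = 300
assumed_bytes_per_value = 4
bytes_per_row = assumed_dims * assumed_bytes_per_value

# Rough row counts under that assumption.
approx_rows_in_model = payload_bytes // bytes_per_row
max_rows_under_limit = max_bin_len // bytes_per_row

print(f"over the limit by {excess_bytes:,} bytes")
print(f"approx. rows in the vectors table: {approx_rows_in_model:,}")
print(f"rows that would fit under the limit: {max_rows_under_limit:,}")
```

If those assumptions hold, the vectors table has about 2 million rows, and it would need to shrink to under roughly 1.79 million rows to fit into a single msgpack binary field with this limit.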