Hey
Is there any way to train with the entire dataset? Isn't this the recommended way to make sure you include all annotations and don't lose anything to the eval split? (This, of course, after you're done evaluating architectures/params.)
The best workaround I've found is to use a low split (like 0.1), but I can't set it to 0:
```
Using 186 train / 0 eval (split 0%)
Component: textcat | Batch size: compounding | Dropout: 0.2 | Iterations: 10
Traceback (most recent call last):
  File "/home/cristian/anaconda3/envs/prodigy/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/cristian/anaconda3/envs/prodigy/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/cristian/anaconda3/envs/prodigy/lib/python3.6/site-packages/prodigy/__main__.py", line 60, in <module>
    controller = recipe(*args, use_plac=True)
  File "cython_src/prodigy/core.pyx", line 213, in prodigy.core.recipe.recipe_decorator.recipe_proxy
  File "/home/cristian/anaconda3/envs/prodigy/lib/python3.6/site-packages/plac_core.py", line 367, in call
    cmd, result = parser.consume(arglist)
  File "/home/cristian/anaconda3/envs/prodigy/lib/python3.6/site-packages/plac_core.py", line 232, in consume
    return cmd, self.func(*(args + varargs + extraopts), **kwargs)
  File "train.py", line 138, in train
    baseline = nlp.evaluate(eval_data)
  File "/home/cristian/anaconda3/envs/prodigy/lib/python3.6/site-packages/spacy/language.py", line 677, in evaluate
    docs, golds = zip(*docs_golds)
ValueError: not enough values to unpack (expected 2, got 0)
```
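For what it's worth, the crash happens because `nlp.evaluate` is called on an empty eval set (`zip(*[])` yields nothing to unpack). A minimal sketch of a local workaround in the recipe, assuming the `nlp` and `eval_data` names from the traceback (the helper itself is hypothetical, not part of Prodigy or spaCy):

```python
# Hypothetical guard for train.py: skip the baseline evaluation when
# the eval split is empty (split=0), instead of crashing inside
# spacy's Language.evaluate.

def safe_evaluate(nlp, eval_data):
    """Return None instead of crashing when there is no eval data."""
    if not eval_data:
        # With an empty list, spacy's evaluate() hits
        # "ValueError: not enough values to unpack" -- avoid calling it
        return None
    return nlp.evaluate(eval_data)
```

Then line 138 would become `baseline = safe_evaluate(nlp, eval_data)`, and the later scoring code would also need to tolerate a `None` baseline.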
Thanks