Long text mode textcat.teach raises ValueError

I tried the experimental long-text mode to highlight a sentence, and it raises the error below with the default news_headlines.jsonl as well as with other corpora.

spacy 2.0.2
prodigy 0.4.0

(annot_env) [lev@lev-t250s:/data/]$ prodigy textcat.teach test en_core_web_sm news_headlines.jsonl --long-text

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/local/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/data/ff/annotation/annot_env/lib/python3.6/site-packages/prodigy/__main__.py", line 235, in <module>
    controller = recipe(*args, use_plac=True)
  File "cython_src/prodigy/core.pyx", line 143, in prodigy.core.recipe.recipe_decorator.recipe_proxy
  File "cython_src/prodigy/util.pyx", line 173, in prodigy.util.suggest_view_id
  File "/data/ff/annotation/annot_env/lib/python3.6/site-packages/toolz/itertoolz.py", line 368, in first
    return next(iter(seq))
  File "cython_src/prodigy/components/sorters.pyx", line 127, in __iter__
  File "cython_src/prodigy/components/sorters.pyx", line 53, in genexpr
  File "cython_src/prodigy/models/textcat.pyx", line 143, in __call__
  File "/data/ff/annotation/annot_env/lib/python3.6/site-packages/spacy/language.py", line 531, in pipe
    for doc, context in izip(docs, contexts):
  File "/data/ff/annotation/annot_env/lib/python3.6/site-packages/spacy/language.py", line 554, in pipe
    for doc in docs:
  File "pipeline.pyx", line 801, in pipe
ValueError: too many values to unpack (expected 2)

We’re still preparing the new Prodigy release that will be fully compatible with v2.0 – I think this might be the problem here. Prodigy v0.4.0 still depends on spaCy v2.0.0a17.

In the meantime, could you try downgrading your Prodigy environment to spaCy v2.0.0a17 and try again?

Thanks. I can confirm that long-text mode works correctly with spaCy v2.0.0a17.


How could I enable a similar kind of highlighting in a custom model?

Do you mean the visual highlighting? This is simply an entry in the task spans with no label attached, e.g.:

    {
        "text": "This is a very long text with one relevant sentence",
        "spans": [{"start": 20, "end": 50}]
    }
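If your custom model already knows which sentence it wants to highlight, the offsets can be computed from the text itself. Here is a minimal sketch in plain Python. The helper name `make_highlight_task` is made up for illustration and is not part of Prodigy, and it assumes that `start` and `end` are character offsets with an exclusive `end`, as in a Python slice:

```python
import json


def make_highlight_task(text, sentence):
    """Return a task dict with one unlabeled span covering `sentence`.

    Hypothetical helper, not a Prodigy API. Assumes character offsets
    with an exclusive end, so text[start:end] == sentence.
    """
    start = text.find(sentence)
    if start == -1:
        raise ValueError("sentence not found in text")
    end = start + len(sentence)
    # No "label" key on the span: it is only there for highlighting.
    return {"text": text, "spans": [{"start": start, "end": end}]}


task = make_highlight_task(
    "This is a very long text with one relevant sentence",
    "one relevant sentence",
)
# One JSON object per line, i.e. the same shape as a .jsonl input file.
print(json.dumps(task))
```

Your model or recipe would yield dicts like this in its stream; how the sentence gets selected in the first place is up to your model.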