textcat.teach with custom model from spaCy

Hi,

I trained a custom textcat model in spaCy and wanted to use it as the base model to label more data in Prodigy, but I got this error:

Task queue depth is 1
Exception when serving /get_questions
Traceback (most recent call last):
  File "/opt/anaconda3/lib/python3.7/site-packages/waitress/channel.py", line 336, in service
    task.service()
  File "/opt/anaconda3/lib/python3.7/site-packages/waitress/task.py", line 175, in service
    self.execute()
  File "/opt/anaconda3/lib/python3.7/site-packages/waitress/task.py", line 452, in execute
    app_iter = self.channel.server.application(env, start_response)
  File "/opt/anaconda3/lib/python3.7/site-packages/hug/api.py", line 451, in api_auto_instantiate
    return module.__hug_wsgi__(*args, **kwargs)
  File "/opt/anaconda3/lib/python3.7/site-packages/falcon/api.py", line 244, in __call__
    responder(req, resp, **params)
  File "/opt/anaconda3/lib/python3.7/site-packages/hug/interface.py", line 789, in __call__
    raise exception
  File "/opt/anaconda3/lib/python3.7/site-packages/hug/interface.py", line 762, in __call__
    self.render_content(self.call_function(input_parameters), context, request, response, **kwargs)
  File "/opt/anaconda3/lib/python3.7/site-packages/hug/interface.py", line 698, in call_function
    return self.interface(**parameters)
  File "/opt/anaconda3/lib/python3.7/site-packages/hug/interface.py", line 100, in __call__
    return __hug_internal_self._function(*args, **kwargs)
  File "/opt/anaconda3/lib/python3.7/site-packages/prodigy/_api/hug_app.py", line 206, in get_questions
    tasks = controller.get_questions()
  File "cython_src/prodigy/core.pyx", line 130, in prodigy.core.Controller.get_questions
  File "cython_src/prodigy/components/feeds.pyx", line 58, in prodigy.components.feeds.SharedFeed.get_questions
  File "cython_src/prodigy/components/feeds.pyx", line 63, in prodigy.components.feeds.SharedFeed.get_next_batch
  File "cython_src/prodigy/components/feeds.pyx", line 140, in prodigy.components.feeds.SessionFeed.get_session_stream
  File "/opt/anaconda3/lib/python3.7/site-packages/toolz/itertoolz.py", line 376, in first
    return next(iter(seq))
  File "cython_src/prodigy/components/sorters.pyx", line 151, in __iter__
  File "cython_src/prodigy/components/sorters.pyx", line 61, in genexpr
  File "cython_src/prodigy/util.pyx", line 381, in predict
  File "/opt/anaconda3/lib/python3.7/site-packages/toolz/itertoolz.py", line 242, in interleave
    yield next(itr)
  File "cython_src/prodigy/models/textcat.pyx", line 168, in __call__
  File "/opt/anaconda3/lib/python3.7/site-packages/spacy/language.py", line 688, in pipe
    for doc, context in izip(docs, contexts):
  File "/opt/anaconda3/lib/python3.7/site-packages/spacy/language.py", line 716, in pipe
    for doc in docs:
  File "/opt/anaconda3/lib/python3.7/site-packages/spacy/language.py", line 903, in _pipe
    for doc in docs:
  File "pipes.pyx", line 914, in pipe
  File "pipes.pyx", line 920, in spacy.pipeline.pipes.TextCategorizer.predict
  File "/opt/anaconda3/lib/python3.7/site-packages/thinc/neural/_classes/model.py", line 169, in __call__
    return self.predict(x)
  File "/opt/anaconda3/lib/python3.7/site-packages/thinc/neural/_classes/feed_forward.py", line 40, in predict
    X = layer(X)
  File "/opt/anaconda3/lib/python3.7/site-packages/thinc/neural/_classes/model.py", line 169, in __call__
    return self.predict(x)
  File "/opt/anaconda3/lib/python3.7/site-packages/thinc/neural/_classes/model.py", line 133, in predict
    y, _ = self.begin_update(X, drop=None)
  File "/opt/anaconda3/lib/python3.7/site-packages/thinc/api.py", line 163, in begin_update
    values = [fwd(X, *a, **k) for fwd in forward]
  File "/opt/anaconda3/lib/python3.7/site-packages/thinc/api.py", line 163, in <listcomp>
    values = [fwd(X, *a, **k) for fwd in forward]
  File "/opt/anaconda3/lib/python3.7/site-packages/thinc/api.py", line 256, in wrap
    output = func(*args, **kwargs)
  File "/opt/anaconda3/lib/python3.7/site-packages/thinc/neural/_classes/feed_forward.py", line 46, in begin_update
    X, inc_layer_grad = layer.begin_update(X, drop=drop)
  File "/opt/anaconda3/lib/python3.7/site-packages/spacy/_ml.py", line 679, in concatenate_lists_fwd
    drop *= drop_factor
TypeError: unsupported operand type(s) for *=: 'NoneType' and 'float'

spaCy version 2.2.4
Prodigy Version 1.8.5

Command I used:

prodigy textcat.teach my_dataset ./models/custom ./prodigy_input.jsonl --label A,B,C --patterns ./patterns.jsonl

Hi! What's in your prodigy_input.jsonl and where does the data come from? From the error, it sounds like something that's None is being passed through when it shouldn't be. Does your data have anything pre-defined, like "score"? Or is there anything else that's None/null or looks suspicious?
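
One way to check the input for that is a small script along these lines. It is only a sketch: the function name find_suspicious is made up for illustration, and it assumes each non-empty line of the file is one JSON object.

```python
import json


def find_suspicious(lines):
    """Scan JSONL lines and report null values and pre-defined "score" keys.

    Returns a list of (line_number, key, reason) tuples.
    """
    problems = []
    for line_no, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        if not isinstance(record, dict):
            problems.append((line_no, None, "not a JSON object"))
            continue
        for key, value in record.items():
            if value is None:
                problems.append((line_no, key, "null value"))
        if "score" in record:
            problems.append((line_no, "score", "pre-defined score"))
    return problems
```

It can be run on the file itself by passing an open file handle, for example find_suspicious(open("./prodigy_input.jsonl", encoding="utf8")). An empty list means none of these patterns were found.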

I don't get any error when passing en_vectors_web_lg or en_core_web_lg as the base model.
I tried with a simple txt file and got the same error. The JSONL file contains nothing but dictionaries with "text" as the key.
Also, my Prodigy uses spaCy 2.1, while I trained my custom model with version 2.2. Could that be the problem?

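Ah, yes, that's most likely the problem then. Models are compatible between patch updates (for example, 2.2.3 to 2.2.4), but a minor version update (for example, 2.1 to 2.2) requires new models, so you'd need to retrain your model with the spaCy version that Prodigy uses.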
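
To check this before starting the server, something like the following sketch could help. It assumes the model directory contains a meta.json with a "spacy_version" entry such as ">=2.2.0" (as spaCy v2 models do); the helper names are made up for illustration and are not part of spaCy or Prodigy.

```python
import json
import re
from pathlib import Path


def major_minor(version):
    """Extract (major, minor) from a string like '2.2.4' or '>=2.2.0'."""
    match = re.search(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"No version number found in {version!r}")
    return int(match.group(1)), int(match.group(2))


def same_minor(installed, required):
    """True if both version strings share the same major.minor version."""
    return major_minor(installed) == major_minor(required)


def model_matches_runtime(model_dir, installed_version):
    """Compare the installed spaCy version with the one recorded in meta.json."""
    meta_path = Path(model_dir) / "meta.json"
    meta = json.loads(meta_path.read_text(encoding="utf8"))
    return same_minor(installed_version, meta["spacy_version"])
```

Run in the same environment as Prodigy, model_matches_runtime("./models/custom", spacy.__version__) would return False for a model trained with 2.2 when the installed spaCy is 2.1.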