AttributeError when training textcat with pretrained weights

I've been running pretraining experiments with different pretrained vectors (word2vec, fastText, etc.) and text categorization models. Until today, I hadn't had any issues using prodigy train textcat -t2v ... with weights produced by spacy pretrain.

However, when attempting to run my latest training experiment with the following command:

python -m prodigy train textcat pslic_textcat_dedup,pflic_textcat_dedup en_fasttext_1m -o d:/fasttext_1m_ps -n 20 -t2v ./fasttext_pretrain_model4.bin -es 0.4 -d 0.5

I get the following error:

Loaded model 'en_fasttext_1m'
Traceback (most recent call last):
  File "d:\Anaconda3\envs\python37\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "d:\Anaconda3\envs\python37\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\ChristopherRogers\AppData\Local\pypoetry\Cache\virtualenvs\jobtitles-Naq776M9-py3.8\lib\site-packages\prodigy\__main__.py", line 60, in <module>
    controller = recipe(*args, use_plac=True)
  File "cython_src\prodigy\core.pyx", line 300, in prodigy.core.recipe.recipe_decorator.recipe_proxy
  File "C:\Users\ChristopherRogers\AppData\Local\pypoetry\Cache\virtualenvs\jobtitles-Naq776M9-py3.8\lib\site-packages\plac_core.py", line 367, in call
    cmd, result = parser.consume(arglist)
  File "C:\Users\ChristopherRogers\AppData\Local\pypoetry\Cache\virtualenvs\jobtitles-Naq776M9-py3.8\lib\site-packages\plac_core.py", line 232, in consume
    return cmd, self.func(*(args + varargs + extraopts), **kwargs)
  File "C:\Users\ChristopherRogers\AppData\Local\pypoetry\Cache\virtualenvs\jobtitles-Naq776M9-py3.8\lib\site-packages\prodigy\recipes\train.py", line 92, in train
    read_pretrain_hyper_params(init_tok2vec, component, require=True)
  File "cython_src\prodigy\util.pyx", line 573, in prodigy.util.read_pretrain_hyper_params
AttributeError: 'bytes' object has no attribute 'get'

I'm using Poetry to create the following environment:

python = "^3.7"
pandas = "^1.1.0"
scipy = "^1.5.2"
numpy = "^1.19.1"
sagemaker = "^1.72.0"
s3fs = "^0.4.2"
cupy-cuda102 = "^7.7.0"
spacy = {version = "2.3.1", extras = ["cuda102", "lookups"]}
prodigy = {path = "prodigy-1.10.3-cp36.cp37.cp38-cp36m.cp37m.cp38-win_amd64.whl"}
en_fasttext_1m = {path = "en_fasttext_1m-2.3.0.tar.gz"}

Pretraining occurred on an AWS p3.2xlarge instance with the following settings:

from spacy.cli import pretrain

pretrain(texts_loc, vectors_model, output_dir, n_iter=10, min_length=1, max_length=50, seed=1337,
         n_save_every=1, sa_depth=2, bilstm_depth=2, width=300, dropout=0.3, batch_size=5000,
         conv_depth=6)

What am I missing?

I think you're missing use_vectors=True in your call to pretrain. This is a point of persistent confusion, and the error message for it is really bad. We're looking forward to finally laying this to rest in v3, which fixes the underlying systems that result in this type of problem.

The use_vectors setting controls whether the static vectors are incorporated into the model as a feature at runtime. Without it, the vectors are used as the pretraining objective, but they won't actually be part of the model architecture. Your prodigy train call does expect the vectors to be present, so the pretrained weights don't match the expected architecture shape, which is the error you're seeing.
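Concretely, that should just be your original call with use_vectors=True added (keeping your own texts_loc, vectors_model and output_dir, and the same settings as above):

from spacy.cli import pretrain

# use_vectors=True wires the static vectors into the tok2vec
# architecture itself, instead of only using them as the
# pretraining objective.
pretrain(texts_loc, vectors_model, output_dir, n_iter=10, min_length=1, max_length=50, seed=1337,
         n_save_every=1, sa_depth=2, bilstm_depth=2, width=300, dropout=0.3, batch_size=5000,
         conv_depth=6, use_vectors=True)

The weights produced by that run should then load cleanly via -t2v.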

By the way, does the sa_depth=2 setting result in better accuracy for you? I never really got good results from that. If you haven't experimented yet, try also setting bilstm_depth=0. You might find it's no less accurate, but it'll be much faster.
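Just to illustrate (same call as above, only the two depth settings changed):

pretrain(texts_loc, vectors_model, output_dir, n_iter=10, min_length=1, max_length=50, seed=1337,
         n_save_every=1, sa_depth=0, bilstm_depth=0, width=300, dropout=0.3, batch_size=5000,
         conv_depth=6, use_vectors=True)

# bilstm_depth=0 is the big speedup; sa_depth=0 is optional if the
# self-attention layers aren't buying you accuracy.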