sense2vec own model

I want to create my own sense2vec model, and I followed the instructions in: GitHub - explosion/sense2vec: 🦆 Contextually-keyed word vectors

Everything went fine except for an error in the final step: "export.py Load the vectors and frequencies and output a sense2vec component that can be loaded via Sense2Vec.from_disk"

The error:

TypeError: {'PRON', 'ADP', 'ADJ', 'AUX', 'ADV', 'NOUN', 'PART', 'ORG', 'CARDINAL', 'PERSON', 'NUM', 'SCONJ', 'SYM', 'DATE', 'DET', 'CCONJ', 'ORDINAL', 'VERB', 'X', 'PROPN', 'PUNCT'} is not JSON serializable

Here is the stack trace:

✔ Created the sense2vec model
ℹ 365 vectors, 21 total senses
Traceback (most recent call last):
  File "export.py", line 148, in <module>
    typer.run(main)
  File "/Prodigy/prodigy-env/lib/python3.7/site-packages/typer/main.py", line 859, in run
    app()
  File "/Prodigy/prodigy-env/lib/python3.7/site-packages/typer/main.py", line 214, in __call__
    return get_command(self)(*args, **kwargs)
  File "/Prodigy/prodigy-env/lib/python3.7/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/Prodigy/prodigy-env/lib/python3.7/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/Prodigy/prodigy-env/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Prodigy/prodigy-env/lib/python3.7/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/Prodigy/prodigy-env/lib/python3.7/site-packages/typer/main.py", line 497, in wrapper
    return callback(**use_params)  # type: ignore
  File "export.py", line 67, in main
    s2v.to_disk(output_path)
  File "/Prodigy/prodigy-env/lib/python3.7/site-packages/sense2vec/sense2vec.py", line 323, in to_disk
    srsly.write_json(path / "cfg", self.cfg)
  File "/Prodigy/prodigy-env/lib/python3.7/site-packages/srsly/_json_api.py", line 74, in write_json
    json_data = json_dumps(data, indent=indent)
  File "/Prodigy/prodigy-env/lib/python3.7/site-packages/srsly/_json_api.py", line 26, in json_dumps
    result = ujson.dumps(data, indent=indent, escape_forward_slashes=False)
TypeError: {'PRON', 'ADP', 'ADJ', 'AUX', 'ADV', 'NOUN', 'PART', 'ORG', 'CARDINAL', 'PERSON', 'NUM', 'SCONJ', 'SYM', 'DATE', 'DET', 'CCONJ', 'ORDINAL', 'VERB', 'X', 'PROPN', 'PUNCT'} is not JSON serializable

I am using sense2vec version 1.03.
Thanks

The solution can be found here:

Thanks for reporting back! I created a PR to ensure the issue gets fixed 🙂
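
For anyone who hits the same TypeError: the traceback shows that the config passed to srsly.write_json contains a Python set, and JSON has no set type. Below is a minimal sketch of a workaround using only the standard library. The helper name make_json_safe and the "senses" key are assumptions for illustration; the traceback does not show which key holds the set, and this is not necessarily how the PR fixes it.

```python
import json


def make_json_safe(value):
    """Recursively convert sets and tuples into lists so that json can serialize them.

    Hypothetical helper for illustration, not part of sense2vec or srsly.
    Sets are sorted to get a stable order, which assumes their items are
    mutually comparable (for example, all strings).
    """
    if isinstance(value, (set, frozenset)):
        return sorted(make_json_safe(item) for item in value)
    if isinstance(value, dict):
        return {key: make_json_safe(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [make_json_safe(item) for item in value]
    return value


# Assumed shape of the config: a dict with a set of sense labels.
cfg = {"senses": {"NOUN", "VERB", "ADJ"}}

try:
    json.dumps(cfg)
except TypeError as err:
    # The standard json module also refuses to serialize a set.
    print(f"before: {err}")

safe_cfg = make_json_safe(cfg)
print(json.dumps(safe_cfg))  # {"senses": ["ADJ", "NOUN", "VERB"]}
```

In the export script, the same idea would presumably mean replacing the set in s2v.cfg with a list before s2v.to_disk(output_path) is called.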
