ValueError: Trailing data

Ahh okay – that explains a lot! JSONL is newline-delimited JSON, so each record needs to be on its own line. (The advantage is that you can read the file in line by line, but it also produces very long lines.)

It looks like you’ve created a regular JSON file? If so, name it .json or set --loader json on the command line, and it should be read in as JSON and work as expected.
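To make the difference concrete, here's a small sketch using only Python's built-in json module. The parse_line_by_line helper is made up for illustration and only mimics what a JSONL reader does; the exact error message you see depends on the parser and on how your file is laid out.

```python
import json

your_data = [{"text": "foo"}, {"text": "bar"}]

# A regular JSON file: one document, here pretty-printed over several lines.
as_json = json.dumps(your_data, indent=2)

# JSONL: one complete JSON object per line.
as_jsonl = "\n".join(json.dumps(record) for record in your_data)

def parse_line_by_line(text):
    # Hypothetical helper that mimics a JSONL reader:
    # every non-empty line has to parse on its own.
    return [json.loads(line) for line in text.splitlines() if line.strip()]

jsonl_records = parse_line_by_line(as_jsonl)  # works, one record per line

try:
    parse_line_by_line(as_json)
    json_failed = False
except ValueError:  # json.JSONDecodeError is a subclass of ValueError
    json_failed = True  # the first line is just "[", which isn't valid JSON
```

The same content parses fine as a whole with json.loads(as_json), which is why loading the file as JSON rather than JSONL should work.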

If you do want JSONL, you’d have to write '\n'.join(json.dumps(record) for record in your_data) to your file. Or you can use the helper function we provide in our library srsly:

import srsly  # that's our little serialization library

your_data = [{"text": "foo"}, {"text": "bar"}]  # whatever your data is
srsly.write_jsonl("/path/to/file.jsonl", your_data)
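If you'd rather not add a dependency, a plain-Python version could look roughly like the sketch below. The function names write_jsonl_plain and read_jsonl_plain and the temporary file path are made up for this example and aren't part of srsly or Prodigy.

```python
import json
import os
import tempfile

def write_jsonl_plain(path, records):
    # Write one JSON object per line. json.dumps escapes newlines inside
    # strings, so each record stays on a single line.
    with open(path, "w", encoding="utf8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")

def read_jsonl_plain(path):
    # Read the file back line by line, skipping blank lines.
    with open(path, encoding="utf8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)

your_data = [{"text": "foo"}, {"text": "bar\nbaz"}]  # whatever your data is
path = os.path.join(tempfile.mkdtemp(), "example.jsonl")  # example path only
write_jsonl_plain(path, your_data)
loaded = list(read_jsonl_plain(path))
```

Reading the file back like this is a quick way to check that every line is a complete record before passing the file on.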