Loading my own data (JSONL)

I am trying to load my own data.

I created a JSONL file according to the spec here:

and I got the following exception:

File "cython_src/prodigy/components/loaders.pyx", line 152, in JSONL
ValueError: Failed to load task (invalid JSON).

So I tried using your example data:

{"text": "Pinterest Hires Its First Head of Diversity"}
{"text": "Airbnb and Others Set Terms for Employees to Cash Out"}

and I got the following exception:

File "cython_src/prodigy/components/loaders.pyx", line 152, in JSONL
ValueError: Failed to load task (invalid JSON).

So I tried adding a ',' after the first line (to make it look more like regular JSON),
and I got the following exception:

File "cython_src/prodigy/components/loaders.pyx", line 152, in JSONL
ValueError: Failed to load task (invalid JSON).
{"text": "Pinterest Hires Its First Head of Divers ... : "Pinterest Hires Its First Head of Diversity"},

Please assist.
First I would like to get your own example running, and second to be able to load my own data.
Thanks

It loads if you leave out the comma and give the file the .jsonl extension:

{"text":"value1"}
{"text":"value2"}
...
{"text":"valueN"}

Save the file as data.jsonl.
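
One way to avoid hand-editing mistakes is to generate the file with Python's standard json module, which always emits straight double quotes and escapes any quotes inside the text. This is only a minimal sketch: the helper name write_jsonl and the sample texts are made up for illustration, and it uses nothing from Prodigy itself.

```python
import json


def write_jsonl(path, texts):
    """Write one JSON object per line, with no commas between lines."""
    with open(path, "w", encoding="utf-8") as f:
        for text in texts:
            # json.dumps produces straight double quotes and escapes
            # any double quotes that appear inside the text itself.
            f.write(json.dumps({"text": text}, ensure_ascii=False) + "\n")


if __name__ == "__main__":
    # Hypothetical sample data; replace with your own texts.
    write_jsonl("data.jsonl", ["value1", 'a text with "quotes" inside'])
```

Each line of the resulting data.jsonl is a complete JSON object on its own, which is the layout shown above.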


I was trying to use a file named data.jsonl with this single line:
{“text”: " fattiest meal of your day, as vitamin E is a fat-soluble vitamin. Fat-soluble basically means it needs some fat (part of your daily diet) to be absorbed by the body. This is as opposed to water-soluble vitamins like vitamin C. personnally, i take my squibb after lunch. if you take iron and calcium supplements, don’t take them at the same time as they sort of contra-indicate each other. try taking them at separate mealtimes. so far, i haven’t encountered any research that proves that some vitamins actually make you sleepy. maybe the steps above will help you."}

Two questions:

  1. Why doesn’t it work?
  2. How do I set the debug level to get a more informative stack trace?

Thanks

Make sure that the quotes you're using are actually regular double-quotes like " and not pretty quotes like “. Also make sure your text doesn't contain unescaped quotes.

{"text": "fattiest meal of your day, as vitamin E is a fat-soluble vitamin. Fat-soluble basically means it needs some fat (part of your daily diet) to be absorbed by the body. This is as opposed to water-soluble vitamins like vitamin C. personnally, i take my squibb after lunch. if you take iron and calcium supplements, don’t take them at the same time as they sort of contra-indicate each other. try taking them at separate mealtimes. so far, i haven’t encountered any research that proves that some vitamins actually make you sleepy. maybe the steps above will help you."}

To check the individual lines, you can use a JSON linter like this one: https://jsonlint.com/
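
If you'd rather check the whole file locally, a few lines of Python can do the same per-line check and report which line fails and why. This is a rough sketch using only the standard json module, not Prodigy's own loader; the function name find_invalid_lines and the sample lines are just for illustration, and skipping blank lines is an assumption on my part.

```python
import json


def find_invalid_lines(lines):
    """Return (line_number, error_message) for each line that is not valid JSON."""
    problems = []
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # assumption: blank lines are skipped, not reported
        try:
            json.loads(line)
        except json.JSONDecodeError as err:
            problems.append((number, f"{err.msg} (column {err.colno})"))
    return problems


if __name__ == "__main__":
    samples = [
        '{"text": "Pinterest Hires Its First Head of Diversity"}',
        '{"text": "Pinterest Hires Its First Head of Diversity"},',  # trailing comma
        '{“text”: "curly quotes around the key"}',  # pretty quotes
    ]
    for number, message in find_invalid_lines(samples):
        print(f"line {number}: {message}")
```

To check a real file, pass an open file object instead of the sample list, for example find_invalid_lines(open("data.jsonl", encoding="utf-8")).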