CSV not working

The following raises a StopIteration exception (i.e. the stream yields no records):

stream = CSV(source)
next(stream)

Yet if I write my own CSV reader the way I assume the Prodigy one works, i.e.

import csv
with open(source, 'r') as fp:
    for line in csv.DictReader(fp):
        print(line)

it works fine - i.e. there's no issue with the source file.

Is it looking for specific column names? The documentation implies not.

Hi! If you're loading from CSV, Prodigy needs to know where to find the text, so it expects the text to be in a column named Text or text. Also see here for the expected format and examples: https://prodi.gy/docs/api-loaders
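A quick way to check this before loading is to look at the header row. The sketch below is only an illustration: has_text_column is a hypothetical helper, not part of Prodigy, and it assumes the first row of the file is the header.

```python
import csv
import io

def has_text_column(fp):
    """Return True if the CSV header contains a 'text' or 'Text' column.

    Hypothetical helper for illustration; not part of Prodigy.
    """
    reader = csv.DictReader(fp)
    # fieldnames reads the header row; it is None for an empty file
    fieldnames = reader.fieldnames or []
    return "text" in fieldnames or "Text" in fieldnames

# A database-style export without a text column
export = io.StringIO("device_name,object_name\nrouter,eth0\n")
print(has_text_column(export))

# The same kind of data with a text column added
ready = io.StringIO("text,device_name\nrouter eth0,router\n")
print(has_text_column(ready))
```

If this reports no text column, the file probably needs either a renamed column or a custom loader that builds the text itself.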

Thanks, my file is from a database export and I have to concatenate two fields to create my text column. I also need to retain the original fields as metadata. I was originally doing this in a preprocessing step but I prefer to do it in the recipe.

For now I've written my own CSV() implementation, which works.

Hi David,

could you please share your implementation? I have a similar problem. Prodigy should show me the metadata on the bottom right, but there is no metadata.

Hopefully this works; I simplified it from my full version:

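import csv

def CSV(source):
    # Read each CSV row and yield a task dict with "text" and "meta" keys
    with open(source, 'r') as fp:
        for record in csv.DictReader(fp):
            device_name, object_name = record['device_name'], record['object_name']
            ex = dict(
                # Concatenate the two fields to build the text
                text=f"{device_name},{object_name}",
                # Keep the original fields as metadata
                meta=dict(device_name=device_name, object_name=object_name)
            )
            yield ex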

I originally had the same issue as you: I used the key metadata, but it must be meta.
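If your column names differ, the same idea can be made a bit more general. This is only a rough sketch: csv_tasks and its parameters fp, text_columns and sep are names I made up for illustration, not anything from Prodigy, and it reads from an in-memory string so it can be tried without a file.

```python
import csv
import io

def csv_tasks(fp, text_columns, sep=","):
    """Yield one task dict per CSV row.

    Hypothetical helper; fp, text_columns and sep are illustrative names.
    """
    for record in csv.DictReader(fp):
        yield {
            # Join the chosen columns into the text to annotate
            "text": sep.join(record[col] for col in text_columns),
            # Keep every original column under "meta" (not "metadata")
            "meta": dict(record),
        }

sample = io.StringIO("device_name,object_name\nrouter,eth0\n")
stream = csv_tasks(sample, ["device_name", "object_name"])
print(next(stream))
# {'text': 'router,eth0', 'meta': {'device_name': 'router', 'object_name': 'eth0'}}
```

Printing the first task like this is also a simple way to confirm that the meta key is present before starting the annotation session.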