Not able to upload CSV file as input file for NER training

Hi Team,

I am trying to use ner.manual to train a model for custom NER, with a CSV file as the input file. Unfortunately, it does not accept the CSV file and throws the following error:

Error while validating stream: no first example
This likely means that your stream is empty.

I am using the following command:

python -m prodigy ner.manual Ner_Dataset en_core_web_sm NER.csv --label Aspect

I have tried changing the text column heading to both Text and TEXT, but I still get the same error.
Can you please help me figure out where I am going wrong?

Could you show an example of the CSV file and your columns?

Sure.

There are 1031 rows in the file. This is just a sample of the file.

Thanks! The column name definitely needs to be text or Text, but aside from that, it looks good. What does your file look like in plain text, and what's the delimiter?

Under the hood, the loader mostly just calls into csv.DictReader and does the following:

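from pathlib import Path
import csv

# Open the file in a with block so it is closed again afterwards
with Path("path/to/csv_file.csv").open("r", encoding="utf8") as f:
    reader = csv.DictReader(f, delimiter=",")
    for row in reader:
        # Use the "text" column, fall back to "Text", then to an empty string
        text = row.get("text", row.get("Text", ""))
        print(text)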

So if that works with your file, Prodigy should be able to load it and get the text for each row.
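
If the text comes back empty, it can help to look at what csv.DictReader actually parsed as the header row. Here is a minimal sketch of such a check. It only uses Python's standard csv module and is not part of Prodigy; the helper name inspect_csv is made up for this example.

```python
import csv
from pathlib import Path


def inspect_csv(path, encoding="utf8"):
    """Hypothetical helper: report the parsed header and count rows with text."""
    with Path(path).open("r", encoding=encoding, newline="") as f:
        reader = csv.DictReader(f, delimiter=",")
        # fieldnames is None for an empty file, so fall back to an empty list
        fieldnames = list(reader.fieldnames or [])
        n_rows = 0
        n_with_text = 0
        for row in reader:
            n_rows += 1
            # Same lookup as above: "text" first, then "Text", then ""
            text = row.get("text", row.get("Text", ""))
            if text:
                n_with_text += 1
    return {"fieldnames": fieldnames, "rows": n_rows, "rows_with_text": n_with_text}
```

Calling print(inspect_csv("NER.csv")) shows the column names as Python sees them. If "text" or "Text" is not among them exactly (for example because the header is TEXT, the file uses a different delimiter, or the first name carries an invisible byte order mark), rows_with_text will be 0, which would be consistent with the empty stream error.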

Hi,
I ran the code you provided, and it does indeed return an empty string for my file. I guess the issue is that my file has no delimiter at the moment. I will try adding a delimiter (,) and concatenating all the row values into a single value per row. I think that should solve the issue. I will reach out again if it does not work.

thanks
