Problems with commas in CSV file

I am new to Prodigy. My aim is to load a CSV file that has multiple sentences (an abstract) in each row, and then annotate the terms in these rows. When I run python -m prodigy ner.manual test blank:en abstracts.csv --label TERM1, TERM2, TERM3, I can reach the localhost and annotate the text. But I noticed that part of the text is missing: only the text before the comma (,) in each row is displayed, and the part after the comma is not.

I understand that the reason is the delimiter. Because the default delimiter is a comma, Prodigy splits the text at the comma and displays only the first part.

So I tried to use a loader. I defined load_data.py with this code:
from pathlib import Path
import csv
f = Path("abstracts.csv").open("r", encoding="utf8")
reader = csv.DictReader(f, delimiter=";")
for row in reader:
    text = row.get("text", row.get("Text", ""))

Then I tried to run Prodigy with the loader:
python -m prodigy load_data.py | python -m prodigy ner.manual test blank:en --loader load_data --label TERM1, TERM2, TERM3 -

But I get an error like this: No loader found for 'load_data'

How can I solve this problem and load the CSV file without losing any text?

Kind regards

Hi! The problem here is that you're setting the --loader load_data argument on the command line. This will make Prodigy look for a loader – but it can't know that what you mean is the load_data.py file that you're executing in a separate process that Prodigy isn't even aware of.

You're already piping data forward, so you just need to tell Prodigy to read from standard input by setting the source argument to -. Then make sure that your loader is writing to standard output, e.g. print(json.dumps({"text": text})) for each line. You can read more about this in the Prodigy documentation: Loaders and Input Data
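To make that concrete, here is a minimal sketch of what such a loader script could look like. It assumes the file is semicolon-delimited and has a column named "text" or "Text", as in the snippet from the question; the function names stream_rows and main are made up for illustration and are not part of Prodigy.

```python
import csv
import json
import sys


def stream_rows(csv_file, delimiter=";"):
    """Yield one {"text": ...} task per CSV row, skipping rows without text."""
    reader = csv.DictReader(csv_file, delimiter=delimiter)
    for row in reader:
        # Assumption: the column is called "text" or "Text".
        text = row.get("text") or row.get("Text") or ""
        if text.strip():
            yield {"text": text}


def main(path):
    # newline="" lets the csv module handle line breaks inside quoted fields.
    with open(path, "r", encoding="utf8", newline="") as f:
        for task in stream_rows(f):
            # One JSON object per line on standard output.
            print(json.dumps(task))


if __name__ == "__main__":
    # Example call: python load_data.py abstracts.csv
    if len(sys.argv) > 1:
        main(sys.argv[1])
```

With a script like this, the command would probably look something like python load_data.py abstracts.csv | python -m prodigy ner.manual test blank:en - --label TERM1,TERM2,TERM3. Note that the script is run with plain python rather than through prodigy, there is no --loader argument, and the labels are written without spaces after the commas.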

Thanks for the well-explained answer, but I still can't solve my problem. I also read the documentation, but I still can't load the data from the CSV file without losing text. :disappointed_relieved:

Could you explain in more detail how to load a CSV file that includes commas in the Text column?

Thanks for your interest :pray:

So what's the current problem? Is the loader not working at all and showing you an error, or is it just not parsing your CSV correctly and splitting the commas/semicolons?

If it's just about the CSV parsing itself, it's at least independent of Prodigy and you just need to convince Python to load and parse your file the way you want it to. In that case, maybe you can try a different reader/library? Maybe pandas, which has a bunch of CSV utilities?
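As a rough illustration of why the parsing step matters, here is a small sketch using Python's built-in csv module (pandas.read_csv takes a similar sep argument). The sample strings and the parse helper are invented for this example; the point is only that a comma inside the text survives when the field is quoted, or when the file uses a different delimiter and the reader is told about it.

```python
import csv
import io


def parse(raw, delimiter):
    """Parse a CSV string and return its rows as lists of fields."""
    return list(csv.reader(io.StringIO(raw), delimiter=delimiter))


# An unquoted comma is treated as a field separator, so the text is split.
rows_unquoted = parse("A short abstract, with a comma\n", ",")

# The same text wrapped in double quotes stays in one field.
rows_quoted = parse('"A short abstract, with a comma"\n', ",")

# With a semicolon-delimited file, commas in the text are left alone.
rows_semicolon = parse("A short abstract, with a comma;42\n", ";")

for rows in (rows_unquoted, rows_quoted, rows_semicolon):
    print(rows)
```

If the first case matches what you see in the app, the file probably needs either quoted text fields or a reader configured with the delimiter the file actually uses.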