Issues with db-in and CSV


I am importing annotations to train a NER model. The annotations are saved as a csv file with columns

_session_id, _view_id, answer, spans, text

However, when I import them with db-in, it looks like it is changing all the answers to accept. After calling the db-in command, I get the following message:

Found and keeping existing "answer" in 0 examples

Then when I look at the stats for the new dataset that I imported my data to, it has all accept answers, and 0 for reject & ignore, which is incorrect.

Also, I do not see any of the spans when I run

prodigy db-out my_dataset | less.

All I see is "text", "_input_hash", "_task_hash", "answer":"accept".

I am using prodigy version 1.10.0.

Why isn't it able to read in the spans or answer while the "text" section looks correct?

Thank you!

Hi! The CSV loader currently only reads the Text and Label columns. It's not really a good fit for loading complex data, and it can't represent nested information like "spans" or "tokens". So if you want to import complete annotations, convert the data to JSON first, and then you should be able to import it.
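
For the conversion step, here is a minimal sketch using only the Python standard library. It assumes the CSV has exactly the columns listed in the question and that each "spans" cell holds the span list as a JSON string (or as a Python-style repr with single quotes, which some exports produce). The function names and file names are made up for illustration, so adjust them to your data.

```python
import ast
import csv
import json


def parse_spans(raw):
    """Turn the CSV cell for "spans" back into a list of dicts."""
    if not raw or not raw.strip():
        return []
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        # Assumption: the cell was written as a Python repr (single quotes).
        return ast.literal_eval(raw)


def csv_to_jsonl(csv_path, jsonl_path):
    """Write one JSON object per CSV row and return the number of rows."""
    count = 0
    with open(csv_path, newline="", encoding="utf-8") as src, \
            open(jsonl_path, "w", encoding="utf-8") as dst:
        for row in csv.DictReader(src):
            example = {
                "text": row["text"],
                "spans": parse_spans(row["spans"]),
                "answer": row["answer"],  # keeps accept / reject / ignore
                "_session_id": row["_session_id"],
                "_view_id": row["_view_id"],
            }
            dst.write(json.dumps(example, ensure_ascii=False) + "\n")
            count += 1
    return count


if __name__ == "__main__":
    # Hypothetical file names.
    print(csv_to_jsonl("annotations.csv", "annotations.jsonl"))
```

After that, something like `prodigy db-in my_dataset annotations.jsonl` should pick up the existing "answer" and "spans" values, and `prodigy db-out my_dataset | less` is a quick way to confirm they made it in.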