I am a beginner here.
I am trying to annotate categories of text, as CIPHER or not.
I created a dataset named “cipher”, and I have my text in a CSV file, test.csv.
I ran this command:
prodigy textcat.teach cipher en_core_web_lg reviews.txt --loader csv --label CIPHER
Then I opened the annotation server and got this error:
ValueError: Error while validating stream: no first example. This likely means that your stream is empty.
That error usually means that there’s nothing to load from the file – either because there’s nothing in there, or because no example of the correct format was found (for instance, if none of the records have a text).
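A quick way to check this before starting the server is to count how many rows actually carry a text. The snippet below is only a rough sketch using Python's standard csv module, not Prodigy's own loader: the function name count_usable_rows is made up for illustration, and it assumes the file has a header row with a column called "text" or "Text".

```python
import csv


def count_usable_rows(path, text_columns=("text", "Text")):
    """Count CSV rows with a non-empty value in one of the text columns.

    Assumption: the first line of the file is a header row. This is a
    diagnostic sketch, not a reimplementation of Prodigy's CSV loader.
    """
    usable = 0
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            # row.get() returns None for missing columns or short rows
            if any((row.get(col) or "").strip() for col in text_columns):
                usable += 1
    return usable
```

If count_usable_rows("test.csv") comes back as 0, the file is either empty or has no header naming the text column, which would match the "no first example" error.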
In your example, you’re loading in a file reviews.txt with the CSV loader – are you sure that’s correct? And did you have a look at the README and check whether your data has the correct format? For CSV, the text should be available in a column “text” or “Text”. For TXT, each text should be on a new line. And for JSON or JSONL, each entry should have a key "text". You can find examples of this in your
So I have text data that I want to annotate and then classify as text or cipher.
The current format I have is a CSV file, where each row is a separate text entry to be annotated or classified.
What format should I put my CSV file in? It looks like I need to have (text, label, meta), but right now I only have the text, and I am trying to build the model to predict the labels.
I'm not sure how to format the CSV file.
If you’re using textcat.teach, you’re already passing in the label via the command line: --label CIPHER. So your CSV really only needs to contain one column, text. Alternatively, you could also convert your data to plain text (one text per line) or JSON – just make sure to adjust the --loader argument in that case.
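If you'd rather convert the file than add a header, something like the following might work. It's a minimal sketch using only the Python standard library, and it assumes the text sits in the first column of each row; the function name csv_to_jsonl and the has_header flag are hypothetical, chosen just for this example.

```python
import csv
import json


def csv_to_jsonl(csv_path, jsonl_path, has_header=False):
    """Write the first column of each CSV row as a {"text": ...} line.

    Assumption: the text to annotate is in the first column.
    Returns the number of records written.
    """
    written = 0
    with open(csv_path, newline="", encoding="utf-8") as src, \
            open(jsonl_path, "w", encoding="utf-8") as dst:
        reader = csv.reader(src)
        if has_header:
            next(reader, None)  # skip the header row if there is one
        for row in reader:
            # skip blank lines and rows whose first cell is empty
            if not row or not row[0].strip():
                continue
            record = {"text": row[0].strip()}
            dst.write(json.dumps(record, ensure_ascii=False) + "\n")
            written += 1
    return written
```

For example, csv_to_jsonl("test.csv", "test.jsonl") would produce one JSON object per line, each with a "text" key. You'd then pass test.jsonl to the command and change --loader to match (presumably jsonl, but check the README for the exact loader name).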