Hello, I am new to Prodigy. I am labeling Turkish text classification data. I exported the data and it looks like this:
{"text":"4n1k ilk a\u015fk neden neden \u00e7\u0131km\u0131yor","_input_hash":1894167324,"_task_hash":-401355380,"options":[{"id":"Pozitif","text":"Pozitif"},{"id":"Negatif","text":"Negatif"},{"id":"Notr","text":"Notr"}],"_session_id":null,"_view_id":"choice","accept":["Notr"],"config":{"choice_style":"single"},"answer":"accept"}
But my original text is: 4n1k ilk aşk neden neden çıkmıyor
Hi! This is just the default behaviour of json.dumps, which is called under the hood to export your data. By default it escapes non-ASCII characters as \uXXXX sequences, which is the safest way to represent UTF-8 text and prevent encoding issues. When you load the text back in Python etc., the characters will look as expected again. You can also re-export the data with the original characters left unescaped; you just need to be careful that you don't end up with encoding issues. Also see here for details:
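To make this concrete, here is a minimal sketch using only Python's standard json module. The exported line is shortened to its "text" field for illustration, and the variable names are just placeholders, not anything Prodigy itself defines. It shows that loading the line restores the Turkish characters, and that passing ensure_ascii=False to json.dumps writes them out unescaped:

```python
import json

# One exported line, shortened to the "text" field for illustration.
# The raw string keeps the \uXXXX escapes exactly as they appear in the file.
line = r'{"text":"4n1k ilk a\u015fk neden neden \u00e7\u0131km\u0131yor"}'

# Parsing the JSON decodes the escapes back into the original characters.
record = json.loads(line)
print(record["text"])

# ensure_ascii=False keeps non-ASCII characters as-is when serializing.
readable = json.dumps(record, ensure_ascii=False)
print(readable)
```

If you write the unescaped version to a file, open the file with an explicit encoding (for example `open(path, "w", encoding="utf8")`) so the characters are not mangled by a platform default.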