I am wondering if there is a way to alter the db-out output for annotations made with spans.manual so that it shows the string associated with each span/label ("I live in idaho" = SUPERFLUOUS_INFO, "openh" = TYPO, "chicken" = STT_ERROR) as well as the tokenized output.
Example:
PRODIGY_ALLOWED_SESSIONS=cheyanne PRODIGY_LOGGING=verbose prodigy spans.manual dataset_name blank:en /path/spans_test2.jsonl --label SUPERFLUOUS_INFO,TYPO,STT_ERROR
UI screenshot:
db-out command:
prodigy db-out dataset_name-cheyanne > my_output.jsonl
Tokenized span output:
{"text":"I live in idaho and I want to openh a chicken account","_input_hash":-1763618278,"_task_hash":-1693263404,"tokens":[{"text":"I","start":0,"end":1,"id":0,"ws":true},{"text":"live","start":2,"end":6,"id":1,"ws":true},{"text":"in","start":7,"end":9,"id":2,"ws":true},{"text":"idaho","start":10,"end":15,"id":3,"ws":true},{"text":"and","start":16,"end":19,"id":4,"ws":true},{"text":"I","start":20,"end":21,"id":5,"ws":true},{"text":"want","start":22,"end":26,"id":6,"ws":true},{"text":"to","start":27,"end":29,"id":7,"ws":true},{"text":"openh","start":30,"end":35,"id":8,"ws":true},{"text":"a","start":36,"end":37,"id":9,"ws":true},{"text":"chicken","start":38,"end":45,"id":10,"ws":true},{"text":"account","start":46,"end":53,"id":11,"ws":false}],"_view_id":"spans_manual","spans":[{"start":0,"end":15,"token_start":0,"token_end":3,"label":"SUPERFLUOUS_INFO"},{"start":30,"end":35,"token_start":8,"token_end":8,"label":"TYPO"},{"start":38,"end":45,"token_start":10,"token_end":10,"label":"STT_ERROR"}],"answer":"accept","_annotator_id":"dataset_name-cheyanne","_session_id":"dataset_name-cheyanne"}
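To make the request concrete, here is a rough sketch of the result I am after, done as a post-processing step on the exported file rather than inside db-out itself. It assumes the span "start"/"end" values are character offsets into "text", which is what the output above looks like. The function names and the output filename are made up for illustration and are not part of Prodigy.

```python
import json


def add_span_text(example):
    """Return a copy of one exported example with a "text" key on each span.

    Assumes span "start"/"end" are character offsets into example["text"].
    """
    text = example["text"]
    spans = [
        {**span, "text": text[span["start"]:span["end"]]}
        for span in example.get("spans", [])
    ]
    return {**example, "spans": spans}


def convert(in_path, out_path):
    """Read a db-out JSONL file and write a copy with span strings added."""
    with open(in_path, encoding="utf8") as f_in, \
            open(out_path, "w", encoding="utf8") as f_out:
        for line in f_in:
            if line.strip():  # skip blank lines
                f_out.write(json.dumps(add_span_text(json.loads(line))) + "\n")


# Trimmed-down version of the record above (tokens and hashes omitted).
example = {
    "text": "I live in idaho and I want to openh a chicken account",
    "spans": [
        {"start": 0, "end": 15, "label": "SUPERFLUOUS_INFO"},
        {"start": 30, "end": 35, "label": "TYPO"},
        {"start": 38, "end": 45, "label": "STT_ERROR"},
    ],
}

for span in add_span_text(example)["spans"]:
    print(span["label"], "=", repr(span["text"]))
```

On the real export this would be called as something like convert("my_output.jsonl", "my_output_with_text.jsonl"), where the second filename is just a placeholder. Ideally I would not need this extra step, hence the question.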
Thank you,
Cheyanne