Hi, so the transformer tokenizer recipe expects the following format:
{
"text": "Justin Bieber - agree 100000%",
"tokens": [
{"text": "[CLS]", "id": 0, "start": 0, "end": 0, "tokenizer_id": 101, "disabled": true, "ws": true},
{"text": "justin", "id": 1, "start": 0, "end": 6, "tokenizer_id": 6796, "disabled": false, "ws": true},
{"text": "bi", "id": 2, "start": 7, "end": 9, "tokenizer_id": 12170, "disabled": false, "ws": false},
{"text": "eber", "id": 3, "start": 9, "end": 13, "tokenizer_id": 22669, "disabled": false, "ws": true},
{"text": "-", "id": 4, "start": 14, "end": 15, "tokenizer_id": 1011, "disabled": false, "ws": true},
{"text": "agree", "id": 5, "start": 16, "end": 21, "tokenizer_id": 5993, "disabled": false, "ws": true},
{"text": "1000", "id": 6, "start": 22, "end": 26, "tokenizer_id": 6694, "disabled": false, "ws": false},
{"text": "00", "id": 7, "start": 26, "end": 28, "tokenizer_id": 8889, "disabled": false, "ws": false},
{"text": "%", "id": 8, "start": 28, "end": 29, "tokenizer_id": 1003, "disabled": false, "ws": true},
{"text": "[SEP]", "id": 9, "start": 0, "end": 0, "tokenizer_id": 102, "disabled": true, "ws": true}
],
"spans": [
{"start": 0, "end": 13, "token_start": 1, "token_end": 3, "label": "PERSON"}
]
}
However, the tokenizer I am using is different, it does not offer the tokenizer id, are the keys within the tokens list of dictionary are necessary for the UI to function?
Thanks,