Hi! If your data contains a "spans" property, those spans will be highlighted in the text, just like in the NER interfaces. They're only for display purposes, though: during training, Prodigy will only use the "text" and the "label".
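To make that concrete, here is a minimal sketch of building such a task in Python. The `make_span_task` helper and the "HIGHLIGHT" span label are made up for illustration, and the span format (character offsets as "start" and "end", plus a "label") is assumed to be the same one the NER interfaces use.

```python
import json


def make_span_task(text, label, phrase, span_label):
    """Build a task whose "spans" highlight the first occurrence of `phrase`.

    Hypothetical helper, not part of Prodigy. Assumes spans are given as
    character offsets into "text".
    """
    task = {"text": text, "label": label}
    start = text.find(phrase)
    if start != -1:
        # The end offset is exclusive, so text[start:end] == phrase.
        task["spans"] = [
            {"start": start, "end": start + len(phrase), "label": span_label}
        ]
    return task


task = make_span_task("This is some text.", "SOME_LABEL", "text", "HIGHLIGHT")
print(json.dumps(task))
```

The top-level "label" is still the category being annotated; the span label only controls what is shown on the highlight.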
Alternatively, you could also add an "html" key to your tasks and use that for the formatted version. Just make sure to also include the plain text version as "text", so you don't lose the raw data to train from. A task in your input source could then look something like this:
{
    "label": "SOME_LABEL",
    "text": "This is some text.",
    "html": "This is some <strong>text</strong>."
}
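If you'd rather generate these tasks than write them by hand, something along these lines could work. It's only a sketch: `make_html_task` is a hypothetical helper, and it assumes your input source is read as newline-delimited JSON with one task per line.

```python
import html
import json


def make_html_task(text, label, phrase):
    """Return a task with the raw "text" plus an "html" version that bolds `phrase`.

    Hypothetical helper, not part of Prodigy. Everything outside the <strong>
    tag is escaped so stray "<" or "&" in the text can't break the markup.
    """
    start = text.find(phrase)
    if start == -1:
        formatted = html.escape(text)
    else:
        end = start + len(phrase)
        formatted = (
            html.escape(text[:start])
            + "<strong>" + html.escape(phrase) + "</strong>"
            + html.escape(text[end:])
        )
    return {"label": label, "text": text, "html": formatted}


tasks = [make_html_task("This is some text.", "SOME_LABEL", "text")]

# One JSON object per line (assumed input format).
jsonl = "\n".join(json.dumps(task) for task in tasks)
print(jsonl)
```

Keeping "text" untouched and building "html" from it means the raw data and the formatted version can't drift apart.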