Hi! By sub-label, do you mean hierarchical categories? For example, if you have the label LOCATION, you'd annotate whether the entity is LOCATION_CITY or LOCATION_COUNTRY, etc.? If so, one workflow could be to stream in your examples again, one entity at a time, and add multiple-choice options for the sub-labels. Then all the annotator has to focus on is a single mention and a subset of sub-labels, so it should be really quick to annotate (and easy to evaluate, in case there are conflicts and disagreements).
To implement this, you could use a custom interface with two blocks: ner (to render the entity) and choice (for the options). The stream could look something like this:
    options = [{"id": "LOCATION_CITY", "text": "LOCATION > CITY"}]  # etc.

    def get_stream(stream):
        for eg in stream:
            for span in eg.get("spans", []):  # one example per annotated span
                yield {"text": eg["text"], "spans": [span], "options": options}
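To see what that generator produces, here's a small standalone sketch you can run without Prodigy. The example text, the span offsets and the second option are made up for illustration; only the shape of the tasks matters.

```python
# Hypothetical sub-label options; extend with whatever sub-labels you need.
options = [
    {"id": "LOCATION_CITY", "text": "LOCATION > CITY"},
    {"id": "LOCATION_COUNTRY", "text": "LOCATION > COUNTRY"},
]


def get_stream(stream):
    for eg in stream:
        for span in eg.get("spans", []):  # one task per annotated span
            yield {"text": eg["text"], "spans": [span], "options": options}


# Made-up input: one text with two previously annotated LOCATION spans.
examples = [
    {
        "text": "Berlin is the capital of Germany.",
        "spans": [
            {"start": 0, "end": 6, "label": "LOCATION"},
            {"start": 25, "end": 32, "label": "LOCATION"},
        ],
    }
]

tasks = list(get_stream(examples))
for task in tasks:
    span = task["spans"][0]
    # Show the single highlighted mention and the option IDs attached to it.
    print(task["text"][span["start"]:span["end"]], [o["id"] for o in task["options"]])
```

Each input example with N spans turns into N tasks, each showing the full text but highlighting only one mention.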
And then your blocks could look like this:
    blocks = [
        {"view_id": "ner_manual"},
        {"view_id": "choice", "text": None},  # prevent text from being shown in both UIs
    ]
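Putting the two together, a rough sketch of what a custom recipe could return might look like the following. It assumes the usual custom recipe convention of returning a dictionary of components with "view_id" set to "blocks" and the blocks passed in via "config". The function name make_components, the dataset name and the "labels" setting for the ner_manual block are assumptions for illustration, and the recipe decorator and the data loading are left out so the snippet runs on its own.

```python
options = [{"id": "LOCATION_CITY", "text": "LOCATION > CITY"}]  # etc.

blocks = [
    {"view_id": "ner_manual"},
    {"view_id": "choice", "text": None},  # prevent text from being shown in both UIs
]


def get_stream(stream):
    for eg in stream:
        for span in eg.get("spans", []):  # one task per annotated span
            yield {"text": eg["text"], "spans": [span], "options": options}


def make_components(dataset, examples, labels):
    # Hypothetical helper: in a real recipe, this body would sit inside a
    # function registered as a Prodigy recipe, and `examples` would come
    # from a loader or an existing dataset.
    return {
        "dataset": dataset,               # where the annotations are saved
        "stream": get_stream(examples),   # one task per span
        "view_id": "blocks",              # combine several interfaces
        "config": {
            "blocks": blocks,
            "labels": labels,             # assumption: labels for the ner_manual block
        },
    }


components = make_components(
    "sublabels_demo",  # made-up dataset name
    [{"text": "Paris", "spans": [{"start": 0, "end": 5, "label": "LOCATION"}]}],
    ["LOCATION"],
)
```

The selected option IDs should then end up alongside the span on each saved task, so you can map them back to the original entities afterwards.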