Dear Prodigy team,
I annotated data for NER and want to follow the training example from the spaCy website, which can be found here:
Guides -> Training models -> NER -> Updating the Named Entity Recognizer
The required input format for the training set is:
TRAIN_DATA = [
("Who is Shaka Khan?", {"entities": [(7, 17, "PERSON")]}),
("I like London and Berlin.", {"entities": [(7, 13, "LOC"), (18, 24, "LOC")]}),
]
which is used later in nlp.update.
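As far as I understand it, each entity is a (start, end, label) tuple where start and end are character offsets into the text and end is exclusive. A minimal sketch that checks this by slicing the text (entity_texts is a hypothetical helper name I made up for illustration, not part of spaCy or Prodigy):

```python
# Same two examples as above, in the "offset format".
TRAIN_DATA = [
    ("Who is Shaka Khan?", {"entities": [(7, 17, "PERSON")]}),
    ("I like London and Berlin.", {"entities": [(7, 13, "LOC"), (18, 24, "LOC")]}),
]


def entity_texts(example):
    """Return (substring, label) pairs for one (text, annotations) tuple.

    Hypothetical helper: slices the text with each (start, end) pair,
    treating them as character offsets with an exclusive end.
    """
    text, annotations = example
    return [(text[start:end], label) for start, end, label in annotations["entities"]]


for example in TRAIN_DATA:
    print(entity_texts(example))
```

If the slices come back as the entity strings I expect, the offsets are in the right shape for the example.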
My question is: how can I get my annotations into the format described above ("offset format"), which the example requires? I used the data-to-spacy recipe, but it seems to produce a different format, one that looks like it is meant for command-line training.
Thanks for your help!
When you export your annotations with