Dear Prodigy team,
I annotated data for NER and want to follow the training example from the spaCy website, which can be found here:
Guides -> Training models -> NER -> Updating the Named Entity Recognizer
The required input format for the training set is:
TRAIN_DATA = [
    ("Who is Shaka Khan?", {"entities": [(7, 17, "PERSON")]}),
    ("I like London and Berlin.", {"entities": [(7, 13, "LOC"), (18, 24, "LOC")]}),
]
which is later passed to nlp.update.
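As far as I understand it, each tuple under "entities" is (start character, end character, label), with the end offset exclusive, so slicing the text with those offsets should give back the entity string. A quick check of that reading (entity_texts is just a helper name I made up for this post):

```python
TRAIN_DATA = [
    ("Who is Shaka Khan?", {"entities": [(7, 17, "PERSON")]}),
    ("I like London and Berlin.", {"entities": [(7, 13, "LOC"), (18, 24, "LOC")]}),
]


def entity_texts(example):
    """Return (entity string, label) pairs by slicing the text with the offsets."""
    text, annotations = example
    return [(text[start:end], label) for start, end, label in annotations["entities"]]


for example in TRAIN_DATA:
    print(entity_texts(example))
```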
My question is: how can I get my annotations into the format described above (the "offset format"), which the example requires? I used the data-to-spacy recipe, but it seems to produce a different format, one that looks like it is meant for command-line training.
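To make concrete what I am after, here is a rough sketch of the conversion I have in mind. It assumes the annotations are exported as JSONL (for example with db-out) and that each record has a "text" field, an "answer" field, and a "spans" list whose items carry "start", "end" and "label". Those field names are my assumption about the export, and prodigy_to_offsets and load_jsonl are hypothetical helper names, not part of Prodigy or spaCy:

```python
import json


def load_jsonl(path):
    """Read one JSON record per non-empty line (assumed export layout)."""
    with open(path, encoding="utf8") as f:
        return [json.loads(line) for line in f if line.strip()]


def prodigy_to_offsets(records):
    """Convert annotation records into (text, {"entities": [...]}) tuples.

    Assumes each record has "text", "answer" and optionally "spans",
    where every span has "start", "end" and "label".
    """
    train_data = []
    for record in records:
        # Keep only accepted annotations; skip rejected or ignored ones.
        if record.get("answer") != "accept":
            continue
        entities = [
            (span["start"], span["end"], span["label"])
            for span in record.get("spans", [])
        ]
        train_data.append((record["text"], {"entities": entities}))
    return train_data


# In-memory records standing in for an exported file.
records = [
    {
        "text": "Who is Shaka Khan?",
        "spans": [{"start": 7, "end": 17, "label": "PERSON"}],
        "answer": "accept",
    },
    {"text": "Skipped example.", "spans": [], "answer": "reject"},
]

TRAIN_DATA = prodigy_to_offsets(records)
print(TRAIN_DATA)
```

With a real export, I would expect to replace the in-memory list with load_jsonl on the exported file, but I am not sure this is the recommended route.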
Thanks for your help!