Hi @SofieVL,
Thanks a lot for your response! I'm now at the point where the entity offsets and links look correct in the Example object; the only remaining problem is the sentences. Unfortunately, nlp_ru.get_pipe("sentencizer")(example.reference)
doesn't work as it's supposed to. Instead of getting "sent_starts": [1, -1, -1, -1, -1, -1, -1, -1],
I just get the original text back. As a result, during training I get a ValueError that points to the Parser in the pipeline:
```
/usr/local/lib/python3.7/dist-packages/spacy/language.py in update(self, examples, _, drop, sgd, losses, component_cfg, exclude)
   1110             if name in exclude or not hasattr(proc, "update"):
   1111                 continue
-> 1112             proc.update(examples, sgd=None, losses=losses, **component_cfg[name])
   1113         if sgd not in (None, False):
   1114             for name, proc in self.pipeline:
/usr/local/lib/python3.7/dist-packages/spacy/pipeline/transition_parser.pyx in spacy.pipeline.transition_parser.Parser.update()
/usr/local/lib/python3.7/dist-packages/spacy/pipeline/transition_parser.pyx in spacy.pipeline.transition_parser.Parser._init_gold_batch()
/usr/local/lib/python3.7/dist-packages/spacy/pipeline/_parser_internals/transition_system.pyx in spacy.pipeline._parser_internals.transition_system.TransitionSystem.get_oracle_sequence_from_state()
/usr/local/lib/python3.7/dist-packages/spacy/pipeline/_parser_internals/ner.pyx in spacy.pipeline._parser_internals.ner.BiluoPushDown.set_costs()
```
This happens even though I disabled all pipes in the pipeline other than "entity_linker" and "sentencizer". Do I get this error because sentence boundary detection isn't working properly? Is there a way to avoid it?
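For reference, this is roughly how I disable the other components during training (a minimal sketch; `nlp_ru` here is a blank pipeline standing in for my actual one, and the training loop body is omitted):

```python
import spacy

# A blank Russian pipeline standing in for my actual nlp_ru;
# the real one has "entity_linker" among its components as well.
nlp_ru = spacy.blank("ru")
nlp_ru.add_pipe("sentencizer")

# Keep only the components I want active; disable everything else.
keep = ["entity_linker", "sentencizer"]
disabled = [name for name in nlp_ru.pipe_names if name not in keep]
with nlp_ru.select_pipes(disable=disabled):
    ...  # training loop: nlp_ru.update(batch, sgd=optimizer, losses=losses)
```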
P.S. There are several sentences in my doc object, but they are all recognized as one sentence. I can see this when I print my Example object: 'SENT_START': [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
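In case it helps to reproduce, here is a self-contained version of what I'm trying (with made-up text; my real data also carries entity offsets and links):

```python
import spacy
from spacy.training import Example

nlp_ru = spacy.blank("ru")
sentencizer = nlp_ru.add_pipe("sentencizer")

text = "Привет. Как дела."
doc = nlp_ru.make_doc(text)
example = Example.from_dict(doc, {"entities": []})

# Run the sentencizer over the reference doc so that SENT_START
# is set per sentence instead of marking the whole doc as one sentence.
sentencizer(example.reference)
print([t.is_sent_start for t in example.reference])
```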
Thank you for your help!
Best,
Aleksandra