I've been tasked with evaluating different solutions for an upcoming project.
We're looking at named entity linking to Wikidata entities. I've watched the videos on annotating and linking a corpus with Prodigy and spaCy, but they only deal with single instances.
Is it possible to annotate multiple names in a text, and will the spaCy model be able to return multiple Wikidata QIDs for a given text? If so, is there any documentation?
Or am I wrong, and would the loop at the end of the video
for ent in doc.ents:
    print(ent.text, ent.label_, ent.kb_id_)
produce multiple Wikidata QIDs given a large string of text?
With Prodigy, would I develop a custom recipe, or would having a custom workflow to tag multiple entities be a better solution for our needs?
I've had a quick look through the questions on the forum and I can't see that this question has been asked previously.
I've watched the videos on annotating and linking a corpus with Prodigy and spaCy, but they only deal with single instances.
Is it possible to annotate multiple names in a text, and will the spaCy model be able to return multiple Wikidata QIDs for a given text?
I think there may be some confusion here about what the entity linking (EL) component does. It depends on the named entity recognition (NER) step, which labels entities in a text, such as persons, organisations, etc. The EL step then assigns one unique identifier to each such entity. There can be multiple entities in one sentence, and each may receive a distinct ID after the EL step. So yes, the loop over doc.ents prints one line per entity, each with its own ID.
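To make that concrete, here is a minimal sketch of what the loop sees when a text contains more than one linked entity. It assumes spaCy v3 and uses a blank English pipeline; the entity spans, labels and Wikidata IDs are set by hand purely for illustration, standing in for what a trained NER component and entity linker would predict. The example sentence is made up and not taken from the video.

```python
import spacy
from spacy.tokens import Span

# Blank pipeline: tokenizer only, no trained NER or entity linker.
nlp = spacy.blank("en")
doc = nlp("Douglas Adams lived in London.")

# Assumption: these spans and IDs are assigned manually here.
# In a real pipeline, NER would find the spans and the entity
# linker would fill in kb_id for each of them.
doc.ents = [
    Span(doc, 0, 2, label="PERSON", kb_id="Q42"),  # "Douglas Adams"
    Span(doc, 4, 5, label="GPE", kb_id="Q84"),     # "London"
]

# Same loop as in the video: one line per entity, each with its own ID.
linked = [(ent.text, ent.label_, ent.kb_id_) for ent in doc.ents]
for text, label, kb_id in linked:
    print(text, label, kb_id)
```

With a trained pipeline you would drop the manual doc.ents assignment and just call nlp on your text; the loop itself stays the same, and each entity in doc.ents carries its own kb_id_.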
With Prodigy, would I develop a custom recipe, or would having a custom workflow to tag multiple entities be a better solution for our needs?
I'm not sure I fully understand your question. Can you expand a bit on your use case, the type of data you currently have, and what you ideally want to achieve with Prodigy? Then I can try to give you some more specific advice.