What relationship model have you trained? At the moment spaCy offers no built-in support for coreference models, and we don't even have a data structure for it on our Doc objects.
The coref.manual command allows you to specify a spaCy model that will be used to detect parts of speech to help you label. But this model doesn't contain a coreference model to help pre-fill the links.
Looking at the coreference documentation in Prodigy, there might be an alternative for your workflow though. You might be able to pre-fill the relationships of interest in your .jsonl file.
Here's an example of such a .jsonl file.
{"text":"The Lemon Drop Kid , a New York City swindler, is illegally touting horses at a Florida racetrack. After several successful hustles, the Kid comes across a beautiful, but gullible, woman intending to bet a lot of money. The Kid convinces her to switch her bet, employing a prefabricated con. Unfortunately for the Kid, the woman \"belongs\" to notorious gangster Moose Moran , as does the money. ","meta":{"source":"CMU Movie Summary Corpus"},"relations":[{"head":3,"child":9,"head_span":{"start":0,"end":18,"token_start":0,"token_end":3,"label":"NP"},"child_span":{"start":21,"end":45,"token_start":5,"token_end":9,"label":"NP"},"color":"#c5bdf4","label":"COREF"},{"head":3,"child":26,"head_span":{"start":0,"end":18,"token_start":0,"token_end":3,"label":"NP"},"child_span":{"start":137,"end":140,"token_start":26,"token_end":26,"label":"ORG"},"color":"#c5bdf4","label":"COREF"}]}
When I place these contents in a pre-filled.jsonl file, I can call the coref.manual recipe via:
prodigy coref.manual coref_movies en_core_web_sm pre-filled.jsonl --label COREF
This opens the annotation interface with the relations already pre-filled.
You might be able to do something similar on your own data as a pre-processing step using your own coref model. That way you can set the thresholds manually too.
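As a rough sketch of what that pre-processing step could look like: the snippet below assumes your coref model gives you scored mention pairs as character offsets (that input format, the function names and the regex tokenizer are all made up for illustration). It drops pairs below a threshold and writes the rest in the same shape as the example above, where "head" and "child" are set to the last token of each span. The regex tokenizer is only a stand-in so the snippet runs without spaCy; in practice the token indices need to match the tokenization of the spaCy model you pass to the recipe, so you'd want to compute them from that model instead.

```python
import json
import re


def tokenize(text):
    """Stand-in tokenizer (assumption): (start, end) character offsets per token."""
    return [(m.start(), m.end()) for m in re.finditer(r"\w+|[^\w\s]", text)]


def make_span(tokens, start, end, label="NP"):
    """Map a character span to inclusive token indices, as in the example file."""
    inside = [i for i, (s, e) in enumerate(tokens) if s >= start and e <= end]
    if not inside:
        raise ValueError(f"No tokens inside character span {start}-{end}")
    return {
        "start": start,
        "end": end,
        "token_start": inside[0],
        "token_end": inside[-1],
        "label": label,
    }


def build_task(text, scored_pairs, threshold=0.5, label="COREF"):
    """Turn hypothetical (head_chars, child_chars, score) triples into a task."""
    tokens = tokenize(text)
    relations = []
    for head_chars, child_chars, score in scored_pairs:
        if score < threshold:
            continue  # skip links the model is not confident about
        head_span = make_span(tokens, *head_chars)
        child_span = make_span(tokens, *child_chars)
        relations.append({
            "head": head_span["token_end"],
            "child": child_span["token_end"],
            "head_span": head_span,
            "child_span": child_span,
            "label": label,
        })
    return {"text": text, "relations": relations}


text = "The Lemon Drop Kid , a New York City swindler, is illegally touting horses."
# Hypothetical model output: ((head start, end), (child start, end), score)
pairs = [((0, 18), (21, 45), 0.92), ((0, 18), (60, 67), 0.10)]
task = build_task(text, pairs, threshold=0.5)

# One JSON object per line is what goes into the .jsonl file.
print(json.dumps(task))
```

Each printed line can be appended to the .jsonl file you pass to coref.manual. Raising or lowering the threshold controls how many links are suggested up front.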