Annotation scheme / workflow for entity relations

nix411 · October 20, 2020, 7:30am

How would you recommend using Prodigy for these two examples?

I haven't worked out a good flow for a task like that yet. My task is really entity relations, but I see option one as another way to solve it as well. Correct me if I'm wrong.

nix411 · October 20, 2020, 8:37am

I just noticed this comment from @ines

I also recall @honnibal mentioning that he was looking into making a tutorial on the subject. I see that quite a few people are asking about it, so my question is: do you have something planned, or should we not count on it anytime soon?

SofieVL · October 21, 2020, 9:36am

I'm a bit confused as to what you're asking exactly.

You can use Prodigy to label entity relations with the rel-manual recipe. More information in the docs: https://prodi.gy/docs/dependencies-relations#ner and https://prodi.gy/docs/recipes#relations

It's true that spaCy doesn't have a built-in implementation for predicting relations, as this is quite a challenging task to solve in a generic way. You can implement or bring your own model though, which should be much easier to do with the new spaCy v3: https://nightly.spacy.io/usage/layers-architectures. Note that the example at the end of that page outlines how to implement such a REL component, but you'll still have to implement your own ML model...

nix411 · October 22, 2020, 11:37am

Impressive answer if you didn't know what I was asking for.

This example was exactly what I was looking for. I.e. an example of how you'd approach a REL task. Thank you.

It would however be amazing with a full example - e.g. as a spaCy project in a repo.

SofieVL · October 22, 2020, 12:15pm

Impressive answer if you didn't know what I was asking for.

Haha, happy to hear it

It would however be amazing with a full example

For sure - this is work in progress

nix411 · October 22, 2020, 12:56pm

You guys

Can I subscribe to it in a GitHub issue, or would you mind posting an update when it's ready? And is it expected within the next few months (understandable if not!)?

SofieVL · October 22, 2020, 1:22pm

I'll ping you here, it should be ready within the next month

SofieVL · November 20, 2020, 9:43pm

Just barely made the "within the next month" promise - but here's the fully implemented example!

You can see the full code for the ML model and the corresponding pipeline component implementation in the project "scripts" folder.

Who knows, we might even plan a video tutorial around this example... (stay tuned)!

nix411 · November 28, 2020, 11:10am

So cool, thank you!!

sdspieg · December 7, 2020, 5:28am

Video tutorial: YES please!!!

hassnabou · February 17, 2021, 8:45am

Hello @SofieVL,

I'm testing this project: https://nightly.spacy.io/usage/layers-architectures#component-rel. It worked for the "data" and "train_cpu" commands, but when I tried to run the command "spacy project run evaluate", I got the following error:

catalogue.RegistryError: [E893] Could not find function 'rel_model.v1' in function registry 'architectures'. If you're using a custom function, make sure the code is available. If the function is provided by a third-party package, e.g. spacy-transformers, make sure the package is installed in your environment.
Available names: spacy-legacy.MaxoutWindowEncoder.v1, spacy-legacy.MishWindowEncoder.v1, spacy-legacy.TextCatEnsemble.v1, spacy-legacy.Tok2Vec.v1, spacy.CharacterEmbed.v1, spacy.EntityLinker.v1, spacy.HashEmbedCNN.v1, spacy.MaxoutWindowEncoder.v2, spacy.MishWindowEncoder.v2, spacy.MultiHashEmbed.v1, spacy.PretrainCharacters.v1, spacy.PretrainVectors.v1, spacy.Tagger.v1, spacy.TextCatBOW.v1, spacy.TextCatCNN.v1, spacy.TextCatEnsemble.v2, spacy.TextCatLowData.v1, spacy.Tok2Vec.v2, spacy.Tok2VecListener.v1, spacy.TorchBiLSTMEncoder.v1, spacy.TransitionBasedParser.v1, spacy.TransitionBasedParser.v2

Can you please tell me what I can do to solve the problem?

Thank you

SofieVL · February 17, 2021, 4:58pm

@sdspieg: in case you hadn't found it yet, the video is up on YouTube: "SPACY v3: Custom trainable relation extraction component".

SofieVL · February 17, 2021, 5:12pm

@hassnabou: sorry to hear you're running into trouble with the REL component.

I'm not sure why this happens on your system; it runs just fine on my end (as these things tend to go). The rel_model.v1 architecture is defined by the function create_relation_model in rel_model.py, and at that point it should be added to spaCy's architectures registry. For this to happen, that function declaration needs to run, which is why we have the -c flag for spacy train. For the evaluation, this is resolved by importing these functions explicitly, cf. evaluate.py in the explosion/projects repo on GitHub (v3 branch). It feels like that part fails on your system, but I'm not sure yet why.
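To make the mechanism concrete, here is a toy sketch in plain Python of why the lookup only succeeds after the file defining the function has been executed. It is not spaCy's registry: the names ARCHITECTURES, register, resolve and load_custom_code are hypothetical stand-ins. In spaCy itself the corresponding decorator is @spacy.registry.architectures("rel_model.v1").

```python
# Toy stand-in for a function registry; this is NOT spaCy's implementation.
ARCHITECTURES = {}


def register(name):
    """Decorator that stores the decorated function under `name`."""
    def decorator(func):
        ARCHITECTURES[name] = func
        return func
    return decorator


def resolve(name):
    """Look up a registered function; fail if nothing was registered under `name`."""
    if name not in ARCHITECTURES:
        raise LookupError(
            f"Could not find function '{name}'. Available names: {sorted(ARCHITECTURES)}"
        )
    return ARCHITECTURES[name]


def load_custom_code():
    """Plays the role of importing rel_model.py: the decorator runs only when this executes."""
    @register("rel_model.v1")
    def create_relation_model():
        return "relation model"  # placeholder for the real model object

    return create_relation_model


# Before the custom code has been executed, the name is unknown to the registry.
try:
    resolve("rel_model.v1")
    found_before = True
except LookupError:
    found_before = False

# After "importing" the custom code, the same lookup succeeds.
load_custom_code()
found_after = resolve("rel_model.v1")() == "relation model"

print(found_before, found_after)
```

The same reasoning applies to any script that loads the trained pipeline: the module containing the registered functions has to be imported before the pipeline is loaded, otherwise the config refers to a name the registry has never seen.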

Do you get the same error when you run the "all" workflow?

hassnabou · February 18, 2021, 8:45am

Thank you for your quick reply. Yes, I ran the "all" workflow and I get the same error.

hassnabou · February 19, 2021, 10:11am

Thank you @SofieVL,
I fixed the issue by uninstalling and reinstalling Anaconda, and it worked.
I have another question, please; pardon me if it's a very stupid question: how can I use the final model on a new text to get the relations between entities? I tried nlp = spacy.load(model), then doc = nlp(text), and then I called doc.ents, but it returns an empty list.

SofieVL · February 19, 2021, 2:11pm

Happy to hear it!

That's not a stupid question

For the REL project/video (https://github.com/explosion/projects/tree/v3/tutorials/rel_component) we really wanted to focus on showing how to implement a custom component from scratch, but the code itself will need some further adaptation to be useful in your custom use-case. Very specifically, we assume gold-standard entities in the data, which means that we expect that the "GGP's" are already pre-annotated. If you just run the model on a text, it won't identify the entities by itself. This is the reason why we wrote a custom data reader that parses the GGP entities from file and sets those as doc.ents: https://github.com/explosion/projects/blob/v3/tutorials/rel_component/scripts/custom_functions.py#L20

But in a realistic scenario, you'd have a trained NER module that recognizes the relevant entities like genes and proteins (for instance an NER model from scispacy, the allenai/scispacy project on GitHub), and then run the REL model on top of those predictions.
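As a rough sketch of what "entities first, relations second" looks like in code: the snippet below sets the entities on a Doc by hand, without running any NER, and then passes that Doc through the pipeline's components. The helper names with_preset_entities and run_components are hypothetical, the example sentence is made up, and spacy.blank("en") is only a stand-in so the snippet runs without a trained model. With the trained pipeline you would use spacy.load(...) on your model directory instead, after importing the custom code.

```python
import spacy


def with_preset_entities(nlp, text, entity_strings, label="GGP"):
    """Tokenize `text` and mark each given string as an entity, without running NER."""
    doc = nlp.make_doc(text)  # tokenization only, no pipeline components are run
    spans = []
    for ent_text in entity_strings:
        start = text.index(ent_text)
        span = doc.char_span(start, start + len(ent_text), label=label)
        if span is not None:  # None means the offsets don't align with token boundaries
            spans.append(span)
    doc.ents = spans
    return doc


def run_components(nlp, doc):
    """Run every pipeline component on a Doc that already carries entities."""
    for _name, proc in nlp.pipeline:
        doc = proc(doc)
    return doc


nlp = spacy.blank("en")  # stand-in; a blank pipeline has no components at all
doc = with_preset_entities(nlp, "BRCA1 interacts with RAD51.", ["BRCA1", "RAD51"])
doc = run_components(nlp, doc)
print([(ent.text, ent.label_) for ent in doc.ents])
```

With a blank pipeline the second step is a no-op; with the trained REL pipeline it is where the relation component would see the preset entities. Where the predictions are stored depends on the component implementation, so check the project's "scripts" folder for the exact attribute.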

hassnabou · February 23, 2021, 9:33am

Thank you @SofieVL for your guidance!
