Annotation scheme / workflow for entity relations

How would you recommend using Prodigy for these two examples?

  1. Training a custom parser for chat intent semantics
  2. Extracting entity relations

I haven't worked out a good flow for a task like that yet. My task is really entity relations, but I see option one as another way to solve it as well. Correct me if I'm wrong.

I just noticed this comment from @ines

I also recall @honnibal mentioning that he was looking into making a tutorial on the subject. I see that quite a few people are asking about it, so my question is: do you have something planned, or should we not count on it anytime soon?

I'm a bit confused as to what you're asking exactly.

You can use Prodigy to label entity relations with the rel.manual recipe. More information in the docs: https://prodi.gy/docs/dependencies-relations#ner and https://prodi.gy/docs/recipes#relations
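As a rough illustration of what you can do with such annotations afterwards, here is a minimal sketch that turns one exported example into (head, label, child) triples. The field names ("relations", "head_span", "child_span", "label") reflect my understanding of the relations format described in the docs linked above, and the example text and the "BINDS" label are made up, so please check an actual export of your dataset before relying on this.

```python
import json

# One exported annotation as a JSON line. The keys used here are an
# assumption about the export format; the text and label are invented.
line = json.dumps({
    "text": "BRCA1 binds BARD1",
    "relations": [
        {
            "head": 0,
            "child": 2,
            "label": "BINDS",
            "head_span": {"start": 0, "end": 5, "label": "GGP"},
            "child_span": {"start": 12, "end": 17, "label": "GGP"},
        }
    ],
})


def relation_triples(example):
    """Return (head text, relation label, child text) for each relation."""
    text = example["text"]
    triples = []
    for rel in example.get("relations", []):
        head = rel["head_span"]
        child = rel["child_span"]
        triples.append(
            (text[head["start"]:head["end"]], rel["label"], text[child["start"]:child["end"]])
        )
    return triples


triples = relation_triples(json.loads(line))
print(triples)
```

Triples like these are a convenient intermediate form for building training data for whichever relation model you end up implementing.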

It's true that spaCy doesn't have a built-in implementation for predicting relations, as this is quite a challenging task to solve in a generic way. You can implement or bring your own model though, which should be much easier to do with the new spaCy v3: https://nightly.spacy.io/usage/layers-architectures. Note that the example at the end of that page outlines how to implement such a REL component, but you'll still have to implement your own ML model...

Impressive answer if you didn't know what I was asking for.

This example was exactly what I was looking for. I.e. an example of how you'd approach a REL task. Thank you.

It would however be amazing with a full example - e.g. as a spacy project in a repo.

Impressive answer if you didn't know what I was asking for.

Haha, happy to hear it :wink:

It would however be amazing with a full example

For sure - this is work in progress :slight_smile:

You guys :smiling_face_with_three_hearts:

Can I subscribe to it in a GitHub issue, or would you mind posting an update when it's ready? And is it expected within the next few months (understandable if not!)?

I'll ping you here, it should be ready within the next month :slight_smile:

Just barely made the "within the next month" promise - but here's the fully implemented example!

You can see the full code for the ML model and the corresponding pipeline component implementation in the project "scripts" folder.

Who knows, we might even plan a video tutorial around this example .... (stay tuned) !

So cool thank you !!

Video tutorial: YES please!!!

Hello @SofieVL,

I'm testing this project: https://nightly.spacy.io/usage/layers-architectures#component-rel and it worked for the "data" and "train_cpu" but when I tried to run this command: "spacy project run evaluate", I get the following error:

catalogue.RegistryError: [E893] Could not find function 'rel_model.v1' in function registry 'architectures'. If you're using a custom function, make sure the code is available. If the function is provided by a third-party package, e.g. spacy-transformers, make sure the package is installed in your environment.
Available names: spacy-legacy.MaxoutWindowEncoder.v1, spacy-legacy.MishWindowEncoder.v1, spacy-legacy.TextCatEnsemble.v1, spacy-legacy.Tok2Vec.v1, spacy.CharacterEmbed.v1, spacy.EntityLinker.v1, spacy.HashEmbedCNN.v1, spacy.MaxoutWindowEncoder.v2, spacy.MishWindowEncoder.v2, spacy.MultiHashEmbed.v1, spacy.PretrainCharacters.v1, spacy.PretrainVectors.v1, spacy.Tagger.v1, spacy.TextCatBOW.v1, spacy.TextCatCNN.v1, spacy.TextCatEnsemble.v2, spacy.TextCatLowData.v1, spacy.Tok2Vec.v2, spacy.Tok2VecListener.v1, spacy.TorchBiLSTMEncoder.v1, spacy.TransitionBasedParser.v1, spacy.TransitionBasedParser.v2

Can you please tell me what I can do to solve the problem?

Thank you

@sdspieg : in case you hadn't found it yet :wink: SPACY v3: Custom trainable relation extraction component - YouTube

@hassnabou: sorry to hear you're running into trouble with the REL component.

I'm not sure why this happens on your system, it runs just fine on my end (as these things tend to go). The rel_model.v1 is defined by the function create_relation_model in rel_model.py, and at this point should be added to spaCy's architectures registry. For this to happen, that function declaration needs to run, which is why we have the -c flag for spacy train. For the evaluation, this is resolved by importing these functions specifically, cf projects/evaluate.py at v3 · explosion/projects · GitHub. It feels like that part fails on your system, but I'm not sure yet why.
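To make the registration point concrete, here is a minimal pure-Python sketch of how a decorator-based registry behaves. It is a toy stand-in, not spaCy's or catalogue's actual implementation: the Registry class and the placeholder return value are made up for illustration, and only the names rel_model.v1 and create_relation_model come from the project. The point is that a name can only be looked up after the code containing the decorated function has actually run.

```python
class RegistryError(KeyError):
    """Raised when a name was never registered (toy stand-in)."""


class Registry:
    """Toy registry: maps string names to functions via a decorator."""

    def __init__(self, namespace):
        self.namespace = namespace
        self._functions = {}

    def register(self, name):
        def decorator(func):
            # Registration happens when the decorated definition is executed,
            # i.e. when the module containing it is imported or run.
            self._functions[name] = func
            return func

        return decorator

    def get(self, name):
        if name not in self._functions:
            available = ", ".join(sorted(self._functions)) or "(none)"
            raise RegistryError(
                f"Could not find function '{name}' in registry "
                f"'{self.namespace}'. Available names: {available}"
            )
        return self._functions[name]


architectures = Registry("architectures")

# Before the defining code has run, the lookup fails.
try:
    architectures.get("rel_model.v1")
    found_before = True
except RegistryError:
    found_before = False


@architectures.register("rel_model.v1")
def create_relation_model():
    # Placeholder: the real function in rel_model.py builds the ML model.
    return "placeholder model"


# After the definition has run, the same lookup succeeds.
found_after = architectures.get("rel_model.v1") is create_relation_model
print(found_before, found_after)
```

In the real project, the equivalent of "running the definition" is importing the module that contains the registered function, either through the -c flag during training or through an explicit import in the evaluation script.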

Do you get the same error when you run the workflow all?

Thank you for your quick reply. Yes, I ran the workflow all and I get the same error.

Thank you @SofieVL,
I fixed the issue by uninstalling and reinstalling Anaconda, and it worked.
I have another question, please; pardon me if it's a very stupid question: how can I use the final model on a new text to get the relations between entities? I tried nlp = spacy.load(model), then doc = nlp(text), and then I called doc.ents, but it returns an empty list.

Happy to hear it!

That's not a stupid question :wink:

For the REL project/video (projects/tutorials/rel_component at v3 · explosion/projects · GitHub) we really wanted to focus on showing how to implement a custom component from scratch, but the code itself will need some further adaptation to be useful in your custom use-case. Very specifically, we assume gold-standard entities in the data, which means that we expect that the "GGP's" are already pre-annotated. If you just run the model on a text, it won't identify the entities by itself. This is the reason why we wrote a custom data reader that parses the GGP entities from file and sets those as doc.ents: projects/custom_functions.py at v3 · explosion/projects · GitHub

But in a realistic scenario, you'd have a trained NER module that recognizes the relevant entities like genes and proteins (for instance an NER model from scispacy: GitHub - allenai/scispacy: A full spaCy pipeline and models for scientific/biomedical documents.), and then run the REL model on top of those predictions.
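To illustrate what "setting the entities on the doc" means in practice, here is a minimal sketch that uses a blank English pipeline (tokenizer only) and marks two mentions as GGP entities by character offsets. The sentence and the list of mentions are made up for illustration; in a real setup the spans would come from your NER model or from existing annotations rather than from a string search.

```python
import spacy

# A blank pipeline has a tokenizer but no trained components,
# so doc.ents is empty until we set it ourselves.
nlp = spacy.blank("en")
text = "BRCA1 interacts with BARD1."
doc = nlp.make_doc(text)

# Hypothetical entity mentions, standing in for NER output or gold annotations.
mentions = ["BRCA1", "BARD1"]

spans = []
for mention in mentions:
    start = text.find(mention)
    # char_span returns None if the offsets don't align with token boundaries.
    span = doc.char_span(start, start + len(mention), label="GGP")
    if span is not None:
        spans.append(span)

doc.ents = spans
entities = [(ent.text, ent.label_) for ent in doc.ents]
print(entities)
```

A doc prepared like this can then be passed through the components of the trained pipeline, for example with a loop such as `for name, proc in nlp.pipeline: doc = proc(doc)`, so that the REL component sees the entities it expects. Where the predicted relations end up on the doc depends on how the component in the project's scripts stores them.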

Thank you @SofieVL for your guidance!
