prodigy data-to-spacy for relation extraction

nemomar · May 7, 2022, 7:08pm

First of all thanks for the excellent tool.

I want to convert my .jsonl file of annotated data into .spacy binary files for training a relation extraction (RE) model. I succeeded in doing this for the NER part with:

prodigy data-to-spacy ./corpus_ner --ner bla --eval-split 0.3 -V

but I cannot find a similar parameter (like "--ner") for RE.

Furthermore, I noticed that a config file was generated when I used the "--ner" parameter. Is the config file customized based on the input data (the .jsonl file), or is it a default one?

koaning · May 9, 2022, 7:16am

At the time of writing, spaCy doesn't natively support relation extraction models. The example that we list on our docs here is meant to be a tutorial on how to set up a custom component, not a guide on a feature in spaCy.

The crux of the issue is that the Doc object in spaCy currently has no support for relationships. That is also why, in turn, the .spacy object does not support them.

The config file that you see can be changed via the --config flag (docs). If this flag is not set, as in your case, it will auto-generate the default settings as found here.

nemomar · May 11, 2022, 8:30am

Thank you for your swift answer. Now I have a better picture of what needs to be done. I have another question for you regarding relation extraction models in spaCy.

Are there any limitations or recommendations regarding the training set (text length, relation length, relations between entities in different sentences)? We obtained better results when each text contains a single sentence (this also holds for the example from spaCy). I am interested in extracting relations between entities that are in different sentences.

nlp_reseracher478 · January 3, 2023, 2:48pm

Hi @nemomar !

I have a few questions I hope you can answer.

Can you explain how you were able to address the issue?
Relation extraction also requires named entities, so do we need to run data-to-spacy twice, once for NER and once for RE, or can it be done in a single pass?

ryanwesslen · February 23, 2023, 9:07pm

Thanks for your question!

Sorry for the delay. Have you seen this post? You can modify parse_data.py from the rel_components template project.
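
As a rough illustration of the kind of step such a parsing script has to perform, here is a minimal sketch in plain Python that reads one Prodigy-style JSONL record and pulls out the annotated relations. It is not the actual parse_data.py. The function name relations_from_record and the sample record are hypothetical, and the field names ("relations", "head_span", "child_span", "answer") are assumed to match what your own export contains, so check them against your data first.

```python
import json


def relations_from_record(record):
    """Return (head_text, label, child_text) triples for one annotated record.

    Assumes each relation carries "head_span" and "child_span" dicts with
    character offsets "start" and "end" into record["text"].
    """
    # Only use annotations that were accepted by the annotator.
    if record.get("answer") != "accept":
        return []
    text = record["text"]
    triples = []
    for rel in record.get("relations", []):
        head = rel["head_span"]
        child = rel["child_span"]
        head_text = text[head["start"]:head["end"]]
        child_text = text[child["start"]:child["end"]]
        triples.append((head_text, rel["label"], child_text))
    return triples


# Hypothetical single line of a .jsonl export, for illustration only.
line = json.dumps({
    "text": "Anna works at Acme.",
    "answer": "accept",
    "relations": [
        {
            "label": "WORKS_AT",
            "head_span": {"start": 0, "end": 4, "label": "PERSON"},
            "child_span": {"start": 14, "end": 18, "label": "ORG"},
        }
    ],
})

record = json.loads(line)
triples = relations_from_record(record)
print(triples)
```

In a real script you would loop over every line of the .jsonl file and attach the extracted relations to your own custom data structure, since the Doc object has no built-in slot for them.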
