Can Dependencies & Relations work after Span Categorization?

SofieVL · October 21, 2021, 4:45pm

Hi!

Interesting use case!

I think your suggested approach of using a spancat to extract the "rights" makes sense, as this is indeed not a typical NER task. What isn't clear to me is how you're defining the actions like "contact us" or "access your account settings". Are these extracted with a spancat model as well?

With respect to relation extraction: Prodigy has built-in support for annotating entities and relations at the same time with rel.manual. This does assume the entities are input for an NER algorithm, so it won't allow overlapping ones. For annotating your use case, though, I think that's probably fine? You could parse the resulting NER annotations from Prodigy, create your own custom Doc objects that store the data in doc.spans, and then train a spancat. Let me know if you need further help creating the Doc objects and serializing them to .spacy files with a DocBin.

Alternatively, you can annotate the spans first with spans.manual and then include those annotations in the input JSONL files you feed into rel.manual. In that scenario, you only need to focus on the actual relations in the second step.
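For the DocBin route, here is a minimal sketch of what the conversion could look like. It assumes each annotation record is a dict with "text", an "answer" field, and a "spans" list holding character offsets ("start", "end") plus a "label", which is how Prodigy exports typically look. The function names and the file path in the usage note are made up for illustration, and "sc" is used as the key because it is spancat's default spans_key.

```python
from typing import Iterable, List, Tuple


def span_offsets(example: dict) -> List[Tuple[int, int, str]]:
    """Pull (start_char, end_char, label) triples out of one annotation record.

    Assumes Prodigy-style records: character offsets under "spans" and an
    "answer" field. Records that were not accepted yield nothing.
    """
    if example.get("answer", "accept") != "accept":
        return []
    return [(s["start"], s["end"], s["label"]) for s in example.get("spans", [])]


def to_docbin(examples: Iterable[dict], nlp, spans_key: str = "sc"):
    """Build a DocBin whose docs carry the annotations in doc.spans[spans_key]."""
    # Imported here so span_offsets() above stays usable without spaCy installed.
    from spacy.tokens import DocBin

    doc_bin = DocBin()
    for example in examples:
        if example.get("answer", "accept") != "accept":
            continue  # skip rejected or ignored records entirely
        doc = nlp.make_doc(example["text"])
        spans = []
        for start, end, label in span_offsets(example):
            span = doc.char_span(start, end, label=label)
            # char_span returns None when the offsets don't align with token boundaries
            if span is not None:
                spans.append(span)
        doc.spans[spans_key] = spans
        doc_bin.add(doc)
    return doc_bin
```

With the records loaded into a list, something like `to_docbin(records, spacy.blank("en")).to_disk("./train.spacy")` should give you a file you can point the training config at. Misaligned spans are silently dropped here, so you may want to log them instead.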

Training a relation extraction model on the REL annotations isn't currently built into spaCy or Prodigy, though you can have a look at this tutorial to get started with a custom implementation: projects/tutorials/rel_component at v3 · explosion/projects · GitHub. Note that this tutorial was written with actual named entities in mind, and not spans in doc.spans, but it should be relatively straightforward to make those adjustments in the code.
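To give an idea of the kind of adjustment meant here: assuming the custom component builds its candidate pairs by looping over entities, the same loop can read from a span group instead. The sketch below is not the tutorial's code. The names make_span_pair_getter and get_instances, the "sc" key, and the max_length cutoff (a token distance between span starts) are all illustrative assumptions.

```python
from itertools import permutations
from typing import Callable, List, Tuple


def make_span_pair_getter(spans_key: str = "sc", max_length: int = 100) -> Callable:
    """Return a function that lists candidate (head, child) span pairs for a doc."""

    def get_instances(doc) -> List[Tuple]:
        # Read candidates from doc.spans[spans_key] rather than doc.ents,
        # so overlapping spans are allowed.
        spans = list(doc.spans[spans_key]) if spans_key in doc.spans else []
        return [
            (head, child)
            for head, child in permutations(spans, 2)  # ordered pairs of distinct spans
            if abs(child.start - head.start) <= max_length
        ]

    return get_instances
```

Because relations are directed, each pair shows up in both orders, and the cutoff keeps the number of candidates manageable in long documents.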

Finally, thinking a bit outside the box, I wonder whether your challenge could be recast as a textcat challenge instead? Suppose that you have all the "user right" entities/spans marked up in your sentences. If you typically have only one such entity per sentence, you could try to extract the "action" as a textcat category, i.e. analysing the full sentence to determine the correct action. This would make the challenge somewhat simpler, as you wouldn't need to extract the exact offsets of the "action words" like "contact us". This might be more appropriate if those action phrases are long or not contiguous in the sentence. But which approach works better will depend on your dataset.
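As a rough illustration of that recasting, the sketch below turns sentence-level records into textcat-style examples with a "cats" dict mapping each label to 1.0 or 0.0, which is the shape spaCy uses for doc.cats. The record layout (one "action" per sentence) and the label names are hypothetical and would need to match however you end up storing your annotations.

```python
from typing import Dict, Iterable, List


def to_textcat_examples(records: Iterable[dict], labels: List[str]) -> List[Dict]:
    """Turn sentences with one annotated action into textcat-style examples."""
    examples = []
    for record in records:
        # Mutually exclusive categories: exactly one action label is 1.0.
        cats = {label: float(label == record["action"]) for label in labels}
        examples.append({"text": record["text"], "cats": cats})
    return examples
```

For instance, a record like {"text": "Contact us to delete your data.", "action": "CONTACT_US"} would get CONTACT_US set to 1.0 and every other label set to 0.0. If a sentence can involve several actions, you would switch to a multilabel setup and set more than one category to 1.0.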

Hope that at least gives you some ideas to get started!
