Hello, I am planning to use Prodigy to label a legal text dataset with the intention of doing semantic role labeling. My question is: can I use Prodigy to do all these kinds of labeling (constituency parsing, dependency parsing, and semantic role labeling) out of the box, or is it possible to create and use my own recipe?
Thanks in advance.
Hi @bayethiernodiop,
Unfortunately support for tree annotation in Prodigy is currently very limited. You can accept or reject labelled relations, but there's currently no way to create new trees. We have some ideas for how to provide this, but it would be a new front-end with different assumptions than the current one, as the task is pretty different. In the meantime, you might find it helpful to use Prodigy's sequence annotation to prepare bracketed spans, which could be a useful preprocessing step.
Thanks a lot @honnibal! What about semantic role labelling?
We don't really have an end-to-end solution for that either yet. We have plans that we're excited to try out, but for now our best advice is to think about custom workflows, or perhaps tools from academia.
Thanks for your feedback. I don't want an end-to-end solution here, just to do the labeling part for semantic role labeling, and then use some deep learning library on those annotations.
Also, do you know of any tool that can be used to annotate data for semantic role labeling?
What type of data do you need? If you just want to assign labels to spans of text, you could use the `ner.manual` workflow / `ner_manual` interface?
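For example, you could start a manual span annotation session with your own SRL-style labels. Everything in the command below is a placeholder (dataset name, input file and label set), and the spaCy model is only used for tokenization:

```
prodigy ner.manual srl_spans en_core_web_sm ./legal_texts.jsonl --label PRED,ARG0,ARG1,ARGM
```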
Very good idea, Ines! However, after giving the label I need to add BIO notation to the tokens in the span. Any idea on a good way to handle this, other than creating three labels for the same entity with suffixes B, I and O?
Thanks!
You should be able to do this automatically – no need to do it all by hand!
Prodigy already pre-tokenizes the text and stores all that information with the data. So for each annotated span, you have its position in the text and the IDs of its start and end tokens, which means you know which tokens need to be B, I and O.
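A minimal sketch of that conversion, assuming the usual Prodigy task format where each task dict has `"tokens"` (each with a `"text"`) and `"spans"` (each with inclusive `"token_start"`/`"token_end"` indices and a `"label"`) – double-check against your exported data:

```python
def task_to_bio(task):
    """Convert one Prodigy-style task dict to per-token BIO tags."""
    tags = ["O"] * len(task["tokens"])
    for span in task.get("spans", []):
        start, end = span["token_start"], span["token_end"]
        tags[start] = "B-" + span["label"]
        for i in range(start + 1, end + 1):  # token_end is inclusive
            tags[i] = "I-" + span["label"]
    # Pair each token text with its tag
    return list(zip([t["text"] for t in task["tokens"]], tags))
```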
You could also use spaCy to do this for you: load in the data, process each text, use `doc.char_span` to get a span object for each annotated span in the data, and then look at the `token.ent_iob_` tag for each token in the span.
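For instance, something along these lines – assuming your annotations are `(start_char, end_char, label)` offsets into the raw text, and using a blank English pipeline just for tokenization:

```python
import spacy

nlp = spacy.blank("en")  # tokenizer-only pipeline; swap in your own model if needed

def offsets_to_bio(text, annotations):
    """annotations: list of (start_char, end_char, label) tuples (assumed format)."""
    doc = nlp(text)
    spans = [doc.char_span(start, end, label=label) for start, end, label in annotations]
    # char_span returns None if the offsets don't align with token boundaries
    doc.ents = [span for span in spans if span is not None]
    return [(token.text, token.ent_iob_, token.ent_type_) for token in doc]

# Example:
# offsets_to_bio("The court granted the motion.", [(4, 9, "ARG0"), (10, 17, "PRED")])
```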
Thanks a lot!