Can we train abstractive summarization in prodigy?

mirfan923 · April 3, 2023, 1:08pm

This repository contains abstractive summary datasets for different languages.

Can we train an abstractive summarization model on these datasets using Prodigy?

ryanwesslen · April 3, 2023, 1:43pm

Thanks for your question. This looks like an interesting repo.

It's important to remember that Prodigy is an annotation tool to acquire more annotated data, not necessarily a tool for model training.

Prodigy does have the train recipe, but it's just a wrapper for spacy train. Since spaCy doesn't have a built-in summarization component, it's not possible to train abstractive summarization out-of-the-box with prodigy train.

Since you mentioned the "datasets" in the repo - are you only interested in training on or using the datasets and model in the repo, or do you want to create a "model-in-the-loop" workflow to acquire more annotated data?

If you're only interested in training with those data and not getting any additional annotated data, then I'm not sure Prodigy would help.

However, if you wanted a model-in-the-loop workflow, then yes, Prodigy could help if you wrote a custom recipe. Custom recipes are essentially Python functions (written as a Python script) that can be run through the command line. So it may be possible to write a custom recipe that does abstractive summarization with another model framework (e.g., the seq2seq training module used in the repo you posted).
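To make the model-in-the-loop idea a bit more concrete, here is a rough sketch of the stream-side piece of such a recipe: a generator that attaches a draft summary to every incoming task so the annotator can correct it instead of writing from scratch. Everything here is an assumption for illustration: `draft_summary` is a hypothetical stand-in (it just returns the leading sentences) for a real seq2seq model, and `"summary"` is an arbitrary key name. Whether an input field pre-fills from an existing value in the task is worth verifying against your Prodigy version.

```python
from typing import Callable, Dict, Iterable, Iterator


def draft_summary(text: str, max_sentences: int = 2) -> str:
    """Hypothetical placeholder for a real summarization model.

    Returns the first `max_sentences` sentences of the text. Swap this
    for a call into whatever seq2seq framework you actually use.
    """
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    if not sentences:
        return ""
    return ". ".join(sentences[:max_sentences]) + "."


def add_drafts(
    stream: Iterable[Dict],
    summarize: Callable[[str], str] = draft_summary,
) -> Iterator[Dict]:
    """Yield a copy of each task with a model-generated draft attached."""
    for task in stream:
        task = dict(task)  # copy so the incoming task is not mutated
        task["summary"] = summarize(task["text"])
        yield task
```

In a custom recipe you would wrap the loaded stream with `add_drafts(...)` before returning it, and later use the corrected summaries to retrain the external model.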

Alternatively, if you only wanted additional annotated data for summarization (no model in the loop), you could create a custom recipe like this:
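As a minimal sketch (not an official recipe), the following shows one way such a recipe could look. It assumes the input is a JSONL file where each line has a `"text"` key, and it combines the `text` and `text_input` interfaces via `blocks` so the annotator sees the document and types a summary. The recipe name `summarize.manual`, the `"summary"` field ID, and the helper names are all made up for this example. The `prodigy` import is guarded only so the helpers can be run without Prodigy installed.

```python
import json
from pathlib import Path
from typing import Dict, Iterator

try:
    import prodigy  # only needed to register the recipe on the command line
except ImportError:
    prodigy = None


def read_jsonl(path: str) -> Iterator[Dict]:
    """Yield one task dict per non-empty line of a JSONL file."""
    with Path(path).open(encoding="utf8") as f:
        for line in f:
            if line.strip():
                yield json.loads(line)


def build_recipe(dataset: str, source: str) -> Dict:
    """Return the components Prodigy expects from a recipe function."""
    blocks = [
        {"view_id": "text"},  # shows the task's "text"
        {
            "view_id": "text_input",  # free-text box for the summary
            "field_id": "summary",  # key the typed summary is saved under
            "field_label": "Summary",
            "field_rows": 5,
        },
    ]
    return {
        "dataset": dataset,
        "stream": read_jsonl(source),
        "view_id": "blocks",
        "config": {"blocks": blocks},
    }


if prodigy is not None:
    # Equivalent to decorating build_recipe with @prodigy.recipe(...)
    prodigy.recipe(
        "summarize.manual",
        dataset=("Dataset to save annotations to", "positional", None, str),
        source=("Path to a JSONL file with a 'text' key", "positional", None, str),
    )(build_recipe)
```

Saved as `recipe.py`, this could be started with something like `prodigy summarize.manual my_summaries ./articles.jsonl -F recipe.py`, where the dataset and file names are placeholders.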

You could also create a custom interface tailored to the annotation task you have in mind.

Hope this helps!
