I was able to figure out Q1, but I'm still unsure about Q2 and Q3.
Here is the recipe.py file:
import prodigy
from prodigy.components.loaders import JSONL

@prodigy.recipe(
    "extsumm",
    dataset=("The dataset to save to", "positional", None, str),
    file_path=("Path to texts", "positional", None, str),
)
def extsumm(dataset, file_path):
    """Annotate sentences of a document to be included in extractive summary or not."""
    stream = JSONL(file_path)  # load in the JSONL file
    return {
        "dataset": dataset,  # save annotations in this dataset
        "view_id": "choice",  # use the choice interface
        "stream": stream,
        "config": {"choice_style": "multiple"},  # allow ticking several options
    }
The annotation file, ext_summ_annotation.jsonl, looks like this:
{"text": "Tick imp sentences", "options": [{"id": 0, "text": "line 1"}, {"id": 1, "text": "summary line 2"}, {"id": 2, "text": "line 3"}]}
{"text": "Tick imp sentences", "options": [{"id": 0, "text": "summary x 1"}, {"id": 1, "text": "x 2"}, {"id": 2, "text": "summary x 3"}, {"id": 3, "text": "x 4"}]}
You can run it using: prodigy extsumm extsumm_dataset ext_summ_annotation.jsonl -F recipe.py
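Once some tasks are annotated, the choice interface stores the ids of the ticked options in an "accept" list on each saved task (you can dump the dataset with prodigy db-out extsumm_dataset to check). Below is a small sketch of pulling the selected sentences back out; the annotated dict is a made-up example of what a saved task might look like, not real output from my dataset.

```python
def selected_sentences(example):
    """Return the text of every option whose id appears in the task's "accept" list."""
    accepted = set(example.get("accept", []))
    return [opt["text"] for opt in example["options"] if opt["id"] in accepted]


# Made-up annotated task: the annotator ticked only the option with id 1.
annotated = {
    "text": "Tick imp sentences",
    "options": [
        {"id": 0, "text": "line 1"},
        {"id": 1, "text": "summary line 2"},
        {"id": 2, "text": "line 3"},
    ],
    "accept": [1],
    "answer": "accept",
}

summary = selected_sentences(annotated)
print(summary)
```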