Customize recipe for text generation tasks


I'm wondering if and how Prodigy can be used to annotate text generation tasks, e.g. translation.

I added an editable HTML text field to a custom recipe: <div style="text-align:left;font-size: 16px;" id="translation" contenteditable="true">{{translation}}</div>.

I can see and edit it in the UI after opening the app, but I don't know how to make it part of the output extracted with "python3 -m prodigy db-out <data_set_name> <path_to_output>".
Note: 'translation' is part of the input. I edit it in cases where it is inaccurate and want the edited text in the output.

Any help will be highly appreciated, thanks.

Hi @Leon!

Great question! Have you seen the documentation for custom interfaces?

You may only need to pass a key-value pair with translation as the key for each record in your .jsonl: {"translation": "xxx"}. Since you're using db-out, you may need to create (mutate) or rename that translation key-value pair if it's not in your original <data_set_name>. You may find the clumper package helpful if you're new to manipulating nested JSON structures.
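As a rough illustration of the "create or rename" step, here is a minimal sketch using only the standard library. The key name "mt_output" and the helper with_translation_key are hypothetical, made up for this example; substitute whatever key your own records use.

```python
import json


def with_translation_key(lines, source_key="mt_output"):
    """Yield records that all carry a "translation" key.

    source_key is a hypothetical name for wherever the draft
    translation currently lives in your data.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the .jsonl
        record = json.loads(line)
        if "translation" not in record:
            # Rename the existing key; fall back to an empty string.
            record["translation"] = record.pop(source_key, "")
        yield record


# Two example .jsonl lines; the first one needs its key renamed.
raw_lines = [
    '{"English text": "That is good", "mt_output": "Das is gut"}',
    '{"English text": "Thank you", "translation": "Danke"}',
]

records = list(with_translation_key(raw_lines))
for record in records:
    print(json.dumps(record, ensure_ascii=False))
```

In practice you would read the lines from your input file and write the result back out as .jsonl before loading it into the recipe.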

Also, are you aware of the compare recipe? It's used for A/B evaluation (comparison), but it could also be used for generative or translation tasks if you want to compare different generations or translations. There's an example of a translation task here.

Let me know if this helps (or doesn't). Thanks again for your question!

Thank you @ryanwesslen, I checked the documentation you attached and it really helped.
I am using the "text_input" view_id to gather textual input from the annotators but there is just one thing I'm lacking.

I'd like to have a dynamic "field_placeholder" for "text_input", i.e. I'm trying to populate the placeholder with a field from my input data file (the "translation" field).
This will make annotation easier in cases where only a little editing is required to make the translation correct.

Here is an element in my "blocks" variable, defined in the recipe:
blocks = [..., {"view_id": "text_input", "field_label": "German translation", "field_placeholder": "Type here..."}, ...]
Any idea on how to replace the static "Type here..." value with the "translation" field from my input file which looks like this:
{"English text":"That is good","translation":"Das is gut"}

Thanks again.


As per the documentation attached, to populate the "text_input" field with a value from your data, use "field_id":
blocks = [..., {"view_id": "text_input", "field_label": "German translation", "field_id": "translation"}, ...]
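To show how this could fit together, here is a minimal sketch of the components a custom recipe might return. It is written as a plain function so it runs without Prodigy installed; in a real recipe you would wrap it with the @prodigy.recipe decorator and load records from your file. The function name translation_components, the dataset name, and the choice to copy "English text" into a "text" key (so a plain "text" block can display it) are assumptions made for this example.

```python
def translation_components(dataset, records):
    """Build the dict of components a custom recipe would return."""

    def stream():
        for record in records:
            yield {
                # Shown to the annotator by the "text" block.
                "text": record["English text"],
                # Existing value under the same key as field_id below.
                "translation": record["translation"],
            }

    blocks = [
        {"view_id": "text"},
        {
            "view_id": "text_input",
            "field_label": "German translation",
            "field_id": "translation",
        },
    ]
    return {
        "dataset": dataset,
        "stream": stream(),
        "view_id": "blocks",
        "config": {"blocks": blocks},
    }


components = translation_components(
    "translations",  # hypothetical dataset name
    [{"English text": "That is good", "translation": "Das is gut"}],
)
first_task = next(components["stream"])
print(first_task)
```

Going by the reply above, the input field should open with the existing "translation" value, and whatever the annotator types should be stored under that same key, so it would then appear in the db-out export. It is worth checking this on a few examples before annotating at scale.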