Labels Translation in Prodigy UI

Hi,
I would like to know if it's possible to edit the Prodigy UI. What I want to do is translate the labels in the UI only, while keeping the labels as they are in the backend. For example, I have two labels, person and location. In the UI, I want them to appear translated into a specific language, but in the dataset they should still appear as person and location. Do you think that's possible?

Thanks

Hi! There's no built-in feature for this at the moment but I'll put this on my list of enhancements because it could be a cool feature to have :blush:

There are different workarounds I can think of, depending on your use case:

If your main goal is to provide more information about the labels, you could use the annotation instructions to provide a table of labels and translations, and maybe even additional information (like how you define the label): https://prodi.gy/docs/api-web-app#instructions

Alternatively, you could just use the translated labels and then have a post-process that replaces them with the unified label that you want to use for training later. It's one extra step, but it's a relatively simple search & replace that you can easily do programmatically.
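To illustrate the second workaround, here's a minimal sketch of such a post-process. It assumes the annotations are available as a list of dicts with an optional top-level "label" and an optional list of "spans". The translated labels in UNIFIED_LABELS and the function name unify_labels are hypothetical and only there for illustration:

```python
import copy

# Hypothetical mapping from the translated labels shown in the UI
# to the unified labels you want to use for training.
UNIFIED_LABELS = {
    "persona": "person",
    "ubicación": "location",
}

def unify_labels(examples, label_map=UNIFIED_LABELS):
    """Return copies of the examples with translated labels replaced."""
    unified = []
    for eg in examples:
        eg = copy.deepcopy(eg)  # leave the original annotations untouched
        # Top-level label, if the task has one
        if "label" in eg:
            eg["label"] = label_map.get(eg["label"], eg["label"])
        # Span labels, if the task has any
        for span in eg.get("spans", []):
            span["label"] = label_map.get(span["label"], span["label"])
        unified.append(eg)
    return unified

annotated = [
    {"text": "Ana vive en Lima", "spans": [
        {"start": 0, "end": 3, "label": "persona"},
        {"start": 12, "end": 16, "label": "ubicación"},
    ]},
]
print(unify_labels(annotated)[0]["spans"])
```

Labels that aren't in the map are left unchanged, so running this over data that already uses the unified labels does no harm.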

I did use the instructions and will probably just use the translated labels. It would be really cool to see this in the future. Thanks a lot for your time :smiley:

Hi, Ines.

This feature is very nice, but it seems that Japanese is not yet supported.
Can you please consider supporting Japanese?
Also, please let me know when it will be available,
and whether there is a way to customize it for unsupported languages.

By the translation feature, do you mean the ui_lang that translates the interface? If so, that's currently built in and we don't yet have a feature for changing it manually.

That said, if you're interested in helping out and contributing a Japanese translation, I've posted the relevant strings to be translated here:

All we need is the translated JSON and then I'd be very happy to include Japanese in the next release :slight_smile:

Hi Ines,

Thanks for the quick reply!
I may be misunderstanding, but is it correct that this can also translate the labels detected by spaCy?

For example, in the English View live demo, I can select "person" and "org" as labels, but I want to translate them into Japanese "人名" and "組織".

Thank you for your time.

Ah, in that case, you could just use one of the two solutions I posted above, depending on your use case:

If you want to actually show the translated labels, you could add a wrapper like this at the end of your recipe (e.g. ner.correct) that translates all span labels in the stream using a translation map:

TRANSLATIONS = {
    "PERSON": "人名", 
    "ORG": "組織",
    # ...
}

def translate_span_labels(stream):
    for eg in stream:
        for span in eg.get("spans", []):
            # Replace label with translation and default to original label
            span["label"] = TRANSLATIONS.get(span["label"], span["label"])
        yield eg

# At the end of your recipe
stream = translate_span_labels(stream)

Just keep in mind that if you want to update an existing model trained with labels like PERSON and ORG, you'll need a post-process that replaces the Japanese labels with the English labels again before you train. After all, the labels are just IDs, so as far as the model is concerned, 人名 and PERSON are two different labels.
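That reverse step could look roughly like this. It's only a sketch: it assumes you already have the annotated examples as dicts with a "spans" list, and the helper names invert_translations and restore_span_labels are made up for this example. How you load and save the examples depends on your setup.

```python
TRANSLATIONS = {
    "PERSON": "人名",
    "ORG": "組織",
}

def invert_translations(translations):
    """Build the translated -> original label map, rejecting ambiguous input."""
    inverse = {}
    for original, translated in translations.items():
        if translated in inverse:
            # Two original labels sharing one translation can't be restored
            raise ValueError(
                f"{translated!r} is used for both "
                f"{inverse[translated]!r} and {original!r}"
            )
        inverse[translated] = original
    return inverse

def restore_span_labels(examples, translations=TRANSLATIONS):
    """Replace translated span labels with the original labels before training."""
    inverse = invert_translations(translations)
    for eg in examples:
        for span in eg.get("spans", []):
            # Default to the existing label if it has no translation
            span["label"] = inverse.get(span["label"], span["label"])
        yield eg
```

The check for duplicate translations matters because the mapping is only reversible if every original label has its own distinct translation.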