Using Prodigy for sentence similarity labelling

Hello! I am trying to train a sentence similarity model to match historical addresses to modern ones. In practice, that means matching one historical string against all the modern strings that I know to be within the same geographical area. I was thinking of using SBERT for this, and was wondering whether Prodigy supports labelling for that kind of task and, if so, whether anyone has already written a custom recipe for it.
Thanks

Hey!

Thanks for reaching out. Someone asked something similar on the forum a while ago, and I think the answer there is still relevant.

I played around with @ines's suggestion to build your own interface using the "html" view to compare two sentences. It worked well. You could also add a "text_input" view if you want to capture how similar the two sentences are, e.g.:

# HTML is your own template string for showing the two sentences side by side
blocks = [
    {"view_id": "html", "html_template": HTML},
    {"view_id": "text_input", "field_id": "user_input", "field_placeholder": "Enter how similar the two sentences are from 0 to 1."},
]
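To show how those blocks might fit into a recipe, here is a minimal sketch rather than a tested implementation. It assumes Prodigy's `@prodigy.recipe` decorator and the "blocks" interface. The recipe name `address-similarity`, the task keys `historical` and `modern`, the input format (one JSON object per line with a `historical` string and a `candidates` list) and the helpers `make_tasks`, `load_tasks` and `parse_score` are all made up for this example.

```python
import json
from typing import Dict, Iterable, Iterator, Optional

# Hypothetical template: {{historical}} and {{modern}} are filled from each task.
HTML = (
    "<p><strong>Historical:</strong> {{historical}}</p>"
    "<p><strong>Modern:</strong> {{modern}}</p>"
)

BLOCKS = [
    {"view_id": "html", "html_template": HTML},
    {
        "view_id": "text_input",
        "field_id": "user_input",
        "field_placeholder": "Enter how similar the two sentences are from 0 to 1.",
    },
]


def make_tasks(historical: str, candidates: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one annotation task per (historical, modern) address pair."""
    for modern in candidates:
        yield {"historical": historical, "modern": modern}


def load_tasks(path: str) -> Iterator[Dict[str, str]]:
    """Read the assumed JSONL format: {"historical": str, "candidates": [str, ...]}."""
    with open(path, encoding="utf8") as f:
        for line in f:
            if line.strip():
                record = json.loads(line)
                yield from make_tasks(record["historical"], record["candidates"])


def parse_score(answer: dict) -> Optional[float]:
    """Turn the free-text "user_input" field into a float in [0, 1], else None."""
    try:
        score = float(answer.get("user_input", ""))
    except (TypeError, ValueError):
        return None
    return score if 0.0 <= score <= 1.0 else None


try:
    import prodigy  # only needed to register the recipe
except ImportError:
    prodigy = None

if prodigy is not None:

    @prodigy.recipe(
        "address-similarity",
        dataset=("Dataset to save annotations to", "positional", None, str),
        source=("Path to the JSONL file described above", "positional", None, str),
    )
    def address_similarity(dataset: str, source: str):
        return {
            "dataset": dataset,
            "stream": load_tasks(source),
            "view_id": "blocks",
            "config": {"blocks": BLOCKS},
        }
```

If this were saved as `recipe.py`, something like `prodigy address-similarity my_dataset pairs.jsonl -F recipe.py` should start the server. The typed score comes back as a string under `user_input`, so `parse_score` is one way to clean it up before training.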

Hopefully that's a good starting point for your custom recipe!
