Hello Prodigy community,
I have an annotation task that should be fairly simple, but doesn’t appear to be supported out of the box. I’m hoping somebody who has dealt with something similar can help out.
For this task, I want to compare two strings of text and assign one of five labels to the pair. Let’s say these are 1, 2, 3, 4, or 5. Only one label will ever be assigned to the pair and every pair will have a label. Ideally, I would like it if Prodigy could randomly draw two samples from a dataset and avoid pairs that have already been selected (similar to the --memorize flag that the mark recipe can take). Since I will just be labeling a relationship between the pair (and not tokens within them), I would like to be able to just hit the number button corresponding to the correct label and automatically move to the next pair.
If that’s the best-case scenario, here’s the workaround I’ve been able to come up with. I could randomly pair and concatenate strings outside of Prodigy and prepend that with some sort of flag token (like “label”). I could then use ner.manual with my five labels, and for each concatenated string select the appropriate label and highlight just the word “label”. I could then export my annotations and programmatically label each of them based on that consistent information. This is obviously not as good of a solution since it requires selecting a label, actually highlighting a token, and a considerable amount of work with text files outside of Prodigy.
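For reference, the pre- and post-processing for this workaround would look roughly like the sketch below. The helper names and the `" || "` separator are hypothetical, and I am assuming each exported record has a `text` field plus a `spans` list whose entries carry `start`, `end`, and `label`; I have not verified that against a real export.

```python
FLAG = "label"  # the flag token prepended to every concatenated pair


def make_flagged_text(text_a, text_b, flag=FLAG, sep=" || "):
    """Build the single string that would be shown for manual span annotation."""
    return f"{flag}{sep}{text_a}{sep}{text_b}"


def label_from_annotation(record, flag=FLAG):
    """Return the label attached to the leading flag token, or None if absent.

    Assumes `record` looks like:
    {"text": "...", "spans": [{"start": 0, "end": 5, "label": "3"}]}
    """
    text = record["text"]
    for span in record.get("spans", []):
        # Only accept a span that covers exactly the flag at the very start.
        if span["start"] == 0 and text[span["start"]:span["end"]] == flag:
            return span["label"]
    return None


if __name__ == "__main__":
    text = make_flagged_text("first string", "second string")
    record = {"text": text, "spans": [{"start": 0, "end": 5, "label": "3"}]}
    print(text)
    print(label_from_annotation(record))
```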
Please let me know if you’ve solved a similar problem or if you have ideas on how to approach this.
Thank you,
Tyler