We want to create an IR dataset: for a set of queries, we want to manually label the relevance of some search results and also classify the queries themselves.
Please see the image below for a graphical example of the intended annotation task for one query (the design details are not important; the mock-up was created with simple online editors to clarify our requirements). In this example:
- Fruit: the query to be annotated.
- Food, Automobiles, Animals, Nature: the possible classes for the query, one or more of which should be selected.
- Numbers: for each search result (an image with its title), exactly one number should be selected, indicating the result's relevance score for the query.
We checked the online demos for a similar use case but could not find one, although some come close. Can we use Prodigy for such an annotation task? If so, could you please point us to the relevant documentation entries we should check, or propose how this can be achieved?
Thank you for your support!
This exact annotation interface isn’t something that Prodigy’s design makes especially easy, though, because it’s not an approach we’d generally recommend. I think you should consider breaking up your interface much more. You’re asking many separate questions here, such as:
- Is fruit in the category food?
- Is fruit in the category automobile?
- How well does this picture (an onion) match this query (fruit)?
If you ask all of these questions at once, the UI has to be quite complicated, and so a lot of clicks through the interface are required for each bit of information. The annotator also has to take in all of the questions, figure out which one to answer, etc. If you instead ask one question at a time, you’ll be able to code the tasks very easily, the annotator will stay focused, and you’re likely to get more reliable annotations at a higher speed.
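For instance, a stream that asks one question at a time could emit one small accept/reject task per query–category pair. This is just a rough sketch in plain Python; the `"text"`/`"label"` field names follow Prodigy's task-dict conventions, and the queries and categories are placeholders:

```python
# Sketch: turn one complex annotation item into many simple yes/no tasks.
# Field names ("text", "label") follow Prodigy's task-dict conventions;
# the queries and categories here are placeholders.

QUERIES = ["fruit"]
CATEGORIES = ["FOOD", "AUTOMOBILES", "ANIMALS", "NATURE"]

def make_binary_tasks(queries, categories):
    """Yield one accept/reject task per (query, category) pair."""
    for query in queries:
        for category in categories:
            yield {"text": query, "label": category}

tasks = list(make_binary_tasks(QUERIES, CATEGORIES))
# Each task asks exactly one question, e.g. "Is 'fruit' in the category FOOD?"
```

Each of these could then be answered with a single accept/reject click, instead of several selections per card.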
Thank you Matthew for your reply. Since the main annotation task is a ranking task, we need to show a set of search results, because we believe they will influence the rating decision, which is desired behaviour. We could have used reordering alone, which would also be fine, but we introduced this complexity to deal with exceptional cases where some of the search results are irrelevant (0) or all of them represent the exact same item. The query classification task is minor and does not have to be part of the ranking task, but we hypothesize that doing the two together might improve both.
We will evaluate your arguments and look for alternatives to simplify the task.
For the rating task, you could try using the choice interface. See here for an example: https://prodi.gy/docs/workflow-custom-recipes#example-choice
If you add a "label" to the task dict, you'll be able to show the top-level category that has been annotated in a previous process. This way, you can give the annotator enough context to make it easier to answer the question, while still keeping focus on one decision at a time.
Thanks Ines for the hint. The choice interface looks good if we want to rate a single search result at a time. Can we use this interface multiple times in the same annotation step: five choices per search result, and four search results per query, where the query is the object being annotated?
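To make the question concrete, this is roughly the structure of one annotation step we have in mind. It's a plain-Python sketch only; the nested `"results"` field is our invention, and we don't know whether the choice interface supports anything like it:

```python
# Sketch of one intended annotation step: a query with four search results,
# each rated on a 0-4 scale (five options). The nested structure is our own
# invention, not a documented Prodigy task format.

def make_query_step(query, results, n_scores=5):
    """Bundle all search results for one query into a single step."""
    return {
        "text": query,
        "results": [
            {
                "title": title,
                "image": image_url,
                "options": [{"id": s, "text": str(s)} for s in range(n_scores)],
            }
            for title, image_url in results
        ],
    }

step = make_query_step(
    "fruit",
    [("onion", "img1.jpg"), ("apple", "img2.jpg"),
     ("carrot", "img3.jpg"), ("banana", "img4.jpg")],
)
```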