Hi all,
I have performed NER annotation using task routing with the `feed_overlap: true` configuration. The annotation process involved six annotators. How do I get the best dataset to use for training the model? Is it possible to merge the six annotators' results, or is there another way?
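For context, this is the relevant part of my configuration (a minimal sketch; the rest of my `prodigy.json` uses defaults):

```json
{
  "feed_overlap": true
}
```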
Hi @sigitpurnomo !
In order to get the best dataset, you should now adjudicate the potentially conflicting annotations: after all, if there's a difference, it takes a domain expert to decide what the "best" annotation is. You can do that by feeding your dataset to the `review` recipe. Please check the docs to see what options are available. For example, you could automatically accept answers that received the same annotation from all six annotators. For the conflicting ones, you'll be able to compare the answers, choose the best one, or even correct the annotation manually.
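For instance, a minimal sketch with hypothetical dataset names (`ner_annotations` holding the answers from all six annotators, `ner_reviewed` for the adjudicated output):

```bash
# --auto-accept automatically accepts examples where all annotators agree
# and saves them to the output dataset, so you only review the conflicts.
prodigy review ner_reviewed ner_annotations --auto-accept
```

Run `prodigy review --help` to see the full list of options available in your version.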
Having the data annotated by multiple annotators is also a great opportunity to measure inter-annotator agreement (IAA), which indicates how consistent your annotations are and what you can expect from a machine learning model trained on this dataset. Please see our docs on IAA for more details.
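As a quick pointer, recent Prodigy releases ship dedicated IAA recipes. A sketch (the dataset name is hypothetical, and the exact recipe names and arguments depend on your Prodigy version, so please verify against the IAA docs first):

```bash
# Pairwise span-level agreement for NER annotations; check
# `prodigy metric.iaa.span --help` for your version before running.
prodigy metric.iaa.span dataset:ner_annotations
```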
Hi @magdaaniol,
Thank you for your suggestions.
Hi @magdaaniol,
Does the `review` recipe still not support `blocks` annotations? Sometimes I use a custom recipe to annotate both POS tagging and NER with multiple annotators via task routing, and I want to review the results at the same time in a single user interface.
Hi @sigitpurnomo,
The `review` UI is actually meant to work with one `view_id` at a time; rendering the diff between two different annotation tasks and multiple annotators could otherwise get really hard to read.
Would doing one review pass per `view_id` be a solution in your case?
The `--view-id` parameter lets you specify which `view_id` from your blocks should be selected for review.
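For example (hypothetical dataset names and `view_id` values; substitute whatever view_ids your custom `blocks` recipe actually defines):

```bash
# One review pass per block: first the NER block...
prodigy review ner_reviewed pos_ner_annotations --view-id ner_manual
# ...then a separate pass for the POS block, writing to its own dataset.
prodigy review pos_reviewed pos_ner_annotations --view-id pos_manual
```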