NER, additional checking after highlighting spans

ctoto93 · June 30, 2021, 3:17pm

Hi Prodigy team. I'm wondering whether we can add extra checks when the annotator highlights words. We have an annotation task similar to NER, where the annotator highlights phrases in a review (made up of many sentences) and labels them. One requirement is that the annotator should not highlight phrases from two different sentences.

Thanks :'D

ines · July 1, 2021, 1:33am

Hi! It sounds like this is a good use case for the validate_answer recipe callback: https://prodi.gy/docs/custom-recipes#validate_answer

You'll be able to define a Python function that receives the annotated example when it's submitted and raises an error if the annotations are considered invalid. The error is then shown to the user as an alert so they can fix the annotations, and they'll only be able to submit if the checks pass.
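As a rough illustration of how such a callback is wired up, here is a minimal sketch. The check itself (every span needs a label) is only a placeholder, and `build_components` is a hypothetical helper name. In a real recipe, the returned dictionary would come from the function decorated with `@prodigy.recipe`, which is left out here so the snippet runs without Prodigy installed. `ValueError` is an assumed choice of exception type.

```python
def validate_answer(eg):
    """Called with each submitted example, which is a plain dict."""
    # Placeholder check, purely illustrative: every highlighted span
    # must carry a non-empty label.
    for span in eg.get("spans", []):
        if not span.get("label"):
            raise ValueError(
                f"Span at characters {span['start']}-{span['end']} has no label."
            )


def build_components(dataset, stream):
    # Hypothetical helper: in an actual custom recipe, this dict is what
    # the recipe function returns.
    return {
        "dataset": dataset,
        "stream": stream,
        "view_id": "ner_manual",
        "validate_answer": validate_answer,
    }
```

The message passed to the exception is the text the annotator should see in the alert, so it is worth wording it as an instruction on how to fix the annotation.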

For efficiency, you probably want to do the sentence segmentation when you generate the examples you send out for annotation, and store that information with the JSON, so you don't have to do it during validation (which can potentially take longer). One simple option would be to just store the character offsets of the sentence starts in ascending order, e.g. "sentence_starts": [50, 125, 290, 293]. For each span, you'll have the start and end character offsets, so you can easily calculate which sentence it belongs to based on its "start". You can then raise an error if the spans come from two different sentences.
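The lookup described above could be sketched roughly like this. It assumes each example already carries a sorted "sentence_starts" list that includes 0 for the first sentence. The helper name `sentence_index`, the error messages, and the sample review are made up for illustration. The extra check that a single span does not run across a sentence boundary goes beyond what is described above and can be dropped if it isn't needed.

```python
from bisect import bisect_right


def sentence_index(sentence_starts, offset):
    """Return the index of the sentence that contains a character offset."""
    # bisect_right counts how many sentence starts are <= offset.
    return bisect_right(sentence_starts, offset) - 1


def validate_answer(eg):
    starts = eg["sentence_starts"]  # assumed to be precomputed and sorted
    seen = set()
    for span in eg.get("spans", []):
        first = sentence_index(starts, span["start"])
        # "end" is exclusive, so the last character is at end - 1.
        last = sentence_index(starts, span["end"] - 1)
        if first != last:
            raise ValueError("A highlighted phrase crosses a sentence boundary.")
        seen.add(first)
    if len(seen) > 1:
        raise ValueError("Please highlight phrases from one sentence only.")


# Hypothetical example: three sentences starting at offsets 0, 20 and 42.
example = {
    "text": "The food was great. The service was slow. Would return.",
    "sentence_starts": [0, 20, 42],
    "spans": [
        {"start": 13, "end": 18, "label": "POSITIVE"},  # "great"
        {"start": 36, "end": 40, "label": "NEGATIVE"},  # "slow"
    ],
}
```

Calling `validate_answer(example)` raises here, because the two spans fall into the first and second sentence respectively. With only one of the two spans, it returns without an error.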

ctoto93 · July 2, 2021, 7:59am

Great! I will try this. Thank you for your support :"D
