I'm working on building a resume parser, and wanted to ask for some advice on my data annotation strategy. I have images of documents, and I'm predicting bounding boxes and doing OCR to get the text output. I want to group together the predictions in a few ways:
- Group together multi-line bullet points
- Group together all the bullet points under Roles and responsibilities
- Group together the blue box and the roles and responsibilities
It's a bit more complicated because there are multiple experience records in one image / document, so I can't just label the "class" of each bounding box: I also need to differentiate between experience record #1 at Align Technology and experience record #2 at AmerisourceBergen.
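To make this concrete, here is a rough sketch of the kind of annotation I think I'd need: each box gets a class label plus a record ID and an optional link to the line it continues. All of the field names (`record_id`, `parent_id`, etc.) and the sample text are hypothetical, just to illustrate the idea; this isn't tied to any particular annotation tool.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Box:
    box_id: int
    text: str                        # OCR output for this box
    label: str                       # class, e.g. "header" or "bullet"
    record_id: int                   # which experience record this box belongs to
    parent_id: Optional[int] = None  # box this line continues, if any


def group_by_record(boxes):
    """Group boxes by experience record, keeping the input order."""
    groups = defaultdict(list)
    for box in boxes:
        groups[box.record_id].append(box)
    return dict(groups)


def merge_bullets(boxes):
    """Join continuation lines onto the bullet they continue.

    Assumes `boxes` is already in reading order.
    """
    by_id = {box.box_id: box for box in boxes}
    merged = {}
    for box in boxes:
        if box.label != "bullet":
            continue
        root = box
        while root.parent_id is not None:  # follow links back to the first line
            root = by_id[root.parent_id]
        merged.setdefault(root.box_id, []).append(box.text)
    return {root_id: " ".join(lines) for root_id, lines in merged.items()}


# Made-up sample annotations for two experience records on one page.
boxes = [
    Box(0, "Header for record 1", "header", record_id=1),
    Box(1, "Led migration of billing", "bullet", record_id=1),
    Box(2, "system to the cloud", "bullet", record_id=1, parent_id=1),
    Box(3, "Header for record 2", "header", record_id=2),
    Box(4, "Managed vendor contracts", "bullet", record_id=2),
]

records = group_by_record(boxes)
bullets = merge_bullets(boxes)
```

With something like this, the class label alone doesn't have to carry the grouping: two boxes can both be "bullet" but belong to different records, and a record's boxes don't need to be spatially adjacent or even on the same page.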
I've also considered other strategies, such as using larger bounding boxes that cover whole document sections, but then I run into issues with multi-page documents and more diverse layouts, where a section is split up or other content sits between two related text fields.
Any recommendations / advice?