Hi Jetson, here are the answers to your questions:
- Yes, that's correct: Prodigy doesn't natively support PDFs. However, you can write your own custom recipe that loads them, for example by using a Python package that parses .pdf files and extracts their text. That's mainly worth considering if your PDFs follow a very strict structure; otherwise, converting the pages to images and going the OCR route seems like the more common approach.
- An annotated image will have bounding boxes with the data format described here.
- You may appreciate this answer if you want to auto-tag images.
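
If you do want to try the custom recipe route for PDFs, here's a rough sketch of what the loading part could look like. It's only an illustration under a few assumptions: I'm using the third-party `pypdf` package as the parser (any PDF library would do), and the function names `pdf_page_texts` and `pages_to_tasks` are made up for this example. The only Prodigy-specific part is the shape of the output: one dict per page with a `"text"` key and some `"meta"` information.

```python
from typing import Iterable, Iterator


def pdf_page_texts(path: str) -> Iterator[str]:
    """Yield the extracted text of each page (assumes pypdf is installed)."""
    from pypdf import PdfReader  # imported lazily so the rest runs without it

    for page in PdfReader(path).pages:
        yield page.extract_text() or ""


def pages_to_tasks(pages: Iterable[str], source: str) -> Iterator[dict]:
    """Turn page texts into task dicts with a "text" key and page metadata."""
    for page_number, text in enumerate(pages, start=1):
        text = " ".join(text.split())  # collapse whitespace and line breaks
        if text:  # skip pages without extractable text, e.g. scanned images
            yield {"text": text, "meta": {"source": source, "page": page_number}}
```

A custom recipe could then use something like `pages_to_tasks(pdf_page_texts("report.pdf"), "report.pdf")` as its stream. Note that pages with no extractable text are skipped here, which is exactly the case where the image OCR approach would be the better fit.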