Hi, we have a requirement to annotate emails in an Exchange Server mailbox. Are there any plugins for reading the data directly from mailboxes, or any ready-made code snippets or other known working solutions?
Initially, the requirement is to read all the emails; after that, we only need to read the unread emails as they hit the mailbox.
There's no existing plugin for that, but if you can read in your emails in Python, you can use them in Prodigy. I'm sure that if you look online, you'll find existing code for something like it.
That said, I'm not sure I'd recommend doing the reading at runtime. It sounds like you'd potentially be introducing more problems this way: your scripts have to be super reliable (because if they die, your annotation server dies), they have to be fast, they always have to return the correct data, and so on. If possible, you might want to try reading your data first, converting it to an easy-to-load format (e.g. JSONL) and then annotating the files in a reliable way.
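To make the "read first, convert to JSONL" idea concrete, here's a minimal sketch using only Python's standard library. It assumes you already have the raw messages as bytes; fetching them from Exchange (for example over IMAP with `imaplib`, or with a third-party library such as `exchangelib`) is left out. The function names `email_to_task` and `write_jsonl` and the choice of `meta` fields are just illustrative, not part of Prodigy.

```python
import json
from email import message_from_bytes, policy


def email_to_task(raw_bytes):
    """Convert one raw RFC 822 message into a dict with "text" and "meta" keys."""
    msg = message_from_bytes(raw_bytes, policy=policy.default)
    # Prefer the plain-text part; fall back to an empty string if there is none.
    body = msg.get_body(preferencelist=("plain",))
    text = body.get_content().strip() if body is not None else ""
    return {
        "text": text,
        # Hypothetical metadata fields; keep whatever helps your annotators.
        "meta": {
            "subject": str(msg["subject"] or ""),
            "from": str(msg["from"] or ""),
        },
    }


def write_jsonl(raw_messages, path):
    """Write one JSON object per line and return the number of records written."""
    count = 0
    with open(path, "w", encoding="utf-8") as f:
        for raw in raw_messages:
            f.write(json.dumps(email_to_task(raw), ensure_ascii=False) + "\n")
            count += 1
    return count
```

You'd run this once as a separate export step and then point your annotation recipe at the resulting file, so the annotation server never depends on the mailbox being reachable.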
Even if your system later on has to read unread emails live, there's no need to do this during the annotation phase. You just need to label representative data – not use the exact same process you'd run in your final application.