Interested in using Prodigy for tagging HIPAA data. Any advice on setup?

Hello, I heard about Prodigy and like it, but I want to make sure of some details.

I'm interested in using Prodigy to tag some health care data (text and audio/video). All data are anonymized. We want to use Prodigy as our tagging tool, but we also need to be HIPAA compliant.

Does anybody have any prior experience or suggestions on the setup?
We work with the AWS stack, if that makes a difference.

Thank you very much for the advice!!!

hi @uwyang!

Thanks for your question and welcome to the Prodigy community :wave:

Prodigy is an on-premise Python library. It runs entirely on your own hardware and never "phones home" or otherwise connects to our or other third-party servers. So if you're working with sensitive data, you can even run it on a completely air-gapped machine with no internet access to meet your compliance requirements (which is how many users working on health data do it, for instance).

Here's a related question where someone asked about HIPAA, along with Ines' thoughts:

Prodigy is installed in the cloud the same way as it would be on-premise. I would suggest looking through the AWS issues, as they can give you a lot of ideas on how to customize it and how others have worked with AWS.
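
As a rough illustration of one setup choice on a cloud instance, here is a minimal sketch that writes a `prodigy.json` binding the web app to the loopback interface only. The `host` and `port` settings are standard Prodigy config keys; the helper name `write_local_only_config` and the choice of `127.0.0.1` are assumptions made for this example, not an official recommendation.

```python
import json
from pathlib import Path


def write_local_only_config(project_dir):
    """Write a prodigy.json that binds the web app to the loopback interface.

    Hypothetical helper for illustration; adjust the values to your own setup.
    """
    config = {
        "host": "127.0.0.1",  # assumption: loopback only, not reachable from other machines
        "port": 8080,         # port the annotation web app listens on
    }
    path = Path(project_dir) / "prodigy.json"
    path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return path


if __name__ == "__main__":
    # Write the config into the directory you start Prodigy from.
    print(write_local_only_config("."))
```

With a config like this, annotators would reach the app through whatever access-controlled route your environment already provides (an SSH tunnel, for example) rather than through a publicly exposed port.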

Hope this helps!