We are trying to decide whether to use Prodigy for a specific annotation task. However, we are not sure whether certain features we'd be interested in are easy to implement. Given that you have a good overview of the tool, could you give us some insight into how difficult the following would be to code up?
Relation- and entity-attributes: we would like to, for example, add a "negated" attribute to entities. Is this possible in Prodigy, and how much of this information could be incorporated into custom interfaces?
Lookups: Suppose a text contains an abbreviation like “HTN”: Can one add a button, which automatically, after highlighting the respective abbreviation, googles “medical abbreviation ”?
Fragmented entities: Suppose there is the text “abdominal- and chest-pain”. Is there a way to mark two entities, “abdominal-pain” and “chest-pain” here? If not, how difficult would it be to implement this behavior?
Knowledge base integration: How difficult is it to save a URI for an entity? Say we find out that “headache” has a certain URI “C1”. I would like to
add this to the data outside of prodigy via a model, in a way such that this can be viewed in Prodigy and
automatically display information in the given knowledge base, by, say, appending “C1” to a base url.
Thank you for taking the time to respond. I wish you a pleasant day and am looking forward to your answers!
By attribute, do you mean an additional label? Whether the entity was negated? Depending on what you want to do here and what the end goal is, you could probably model this as a single flat label scheme.
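To illustrate what a single flat label scheme could look like, here is a minimal sketch in plain Python. It assumes the attribute is encoded as a suffix on the label (for example `SYMPTOM_NEGATED`); the helper names `flat_labels` and `split_label` and the example labels are made up for illustration and are not part of Prodigy.

```python
from itertools import product


def flat_labels(base_labels, attributes):
    """Return the base labels plus one combined label per (label, attribute) pair."""
    labels = list(base_labels)
    for base, attr in product(base_labels, attributes):
        labels.append(f"{base}_{attr}")
    return labels


def split_label(label, attributes):
    """Recover (base_label, attribute or None) from a flat label."""
    for attr in attributes:
        suffix = f"_{attr}"
        if label.endswith(suffix):
            return label[: -len(suffix)], attr
    return label, None


# Hypothetical label set for the "negated" example from the question.
labels = flat_labels(["SYMPTOM", "DRUG"], ["NEGATED"])
print(",".join(labels))
print(split_label("SYMPTOM_NEGATED", ["NEGATED"]))
```

The comma-separated string is the kind of value you would typically pass as the label list to a manual NER recipe. Whether a flat scheme is a good fit depends on how many attributes you need, since the label set grows with every combination.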
Non-contiguous entities are a tricky challenge in general, and solving this probably goes beyond just customising an interface: it comes down to the annotation scheme as well. I've posted some thoughts on this in the following thread:
You should be able to do this via a custom recipe, a Python script that lets you control how data is streamed in, what to pre-annotate (e.g. using a model) and how to present the examples for annotation. I think it'll mostly come down to deciding how you want to present the information, e.g. whether you want to use a general-purpose HTML interface or a combination of different interfaces using blocks: https://prodi.gy/docs/custom-interfaces#blocks
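As a rough sketch of the streaming part of such a recipe, the generator below takes pre-annotated tasks and adds an `html` field with a link per entity, built by appending the ID to a base URL. Several things here are assumptions for illustration: the `kb_id` key on each span, the `KB_BASE_URL` value and the function name `add_kb_links` are all hypothetical, and the tokenization that the manual NER interface needs is left out.

```python
import html
from urllib.parse import quote

# Hypothetical base URL of the knowledge base; replace with your own.
KB_BASE_URL = "https://example.org/kb/"


def add_kb_links(stream, base_url=KB_BASE_URL):
    """Add an "html" field with one knowledge base link per linked span."""
    for task in stream:
        links = []
        for span in task.get("spans", []):
            kb_id = span.get("kb_id")  # assumed custom key set by your model
            if kb_id is None:
                continue
            mention = task["text"][span["start"]:span["end"]]
            url = base_url + quote(kb_id, safe="")
            links.append(
                f'<a href="{html.escape(url)}" target="_blank">'
                f"{html.escape(mention)} ({html.escape(kb_id)})</a>"
            )
        task["html"] = "<br>".join(links)
        yield task


# Combination of interfaces, following the blocks docs linked above.
BLOCKS_CONFIG = {"blocks": [{"view_id": "ner_manual"}, {"view_id": "html"}]}

example = {
    "text": "Patient reports headache.",
    "spans": [{"start": 16, "end": 24, "label": "SYMPTOM", "kb_id": "C1"}],
}
for task in add_kb_links([example]):
    print(task["html"])
```

In a custom recipe, you would then return the wrapped stream together with `"view_id": "blocks"` and a config like `BLOCKS_CONFIG`, so the entity highlighting and the links show up in the same annotation card.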
Prodigy comes with a JupyterLab extension for annotation: explosion/jupyterlab-prodigy on GitHub. In general, you can also embed the Prodigy web app via an iframe if needed, but it's not necessarily that useful: Prodigy is really designed for annotation, and annotation sessions can also be stateful, so you typically want to spin up the annotation server, create a small dataset, stop the server and run training experiments.