Questions on difficulty of implementing the following features in a custom recipe

Hello dear developers of Prodigy!

We are trying to decide whether to use Prodigy for a specific annotation task, but we are not sure whether certain features we're interested in are easy to implement. Since you have a good overview of the tool, could you give us some insight into how difficult the following would be to code up?

  1. Relation and entity attributes: We would like to add, for example, a “negated” attribute to entities. Is that possible in Prodigy, and to what extent could such information be incorporated into custom interfaces?

  2. Lookups: Suppose a text contains an abbreviation like “HTN”: Can one add a button, which automatically, after highlighting the respective abbreviation, googles “medical abbreviation ”?

  3. Fragmented entities: Suppose there is the text “abdominal- and chest-pain”. Is there a way to mark two entities, “abdominal-pain” and “chest-pain” here? If not, how difficult would it be to implement this behavior?

  4. Knowledge base integration: How difficult is it to save a URI for an entity? Say we find out that “headache” has a certain URI “C1”. I would like to

  • add this to the data outside of Prodigy via a model, in such a way that it can be viewed in Prodigy, and

  • automatically display information from the given knowledge base by, say, appending “C1” to a base URL.

  5. JavaScript embedding: Can one generate outputs from a dataset that can then be embedded in other applications, say Jupyter or something we write ourselves?

Thank you for taking the time to respond. I wish you a pleasant day and am looking forward to your answers!

Hi! Most of what you describe definitely sounds possible, so it's a question of deciding how things should work and how comfortable you are implementing a bit of JavaScript.

By attribute, do you mean an additional label? Whether the entity was negated? Depending on what you want to do here and what the end goal is, you could probably model this as a single flat label scheme.
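As a minimal sketch of what a single flat label scheme could look like, the snippet below combines entity types with attribute values into one label list and splits them apart again after annotation. The entity types, attribute names, separator and helper functions are all assumptions made for illustration, not anything Prodigy defines.

```python
from itertools import product


def flat_labels(entity_types, attributes, sep="_"):
    """Combine every entity type with every attribute value into one flat label."""
    return [f"{etype}{sep}{attr}" for etype, attr in product(entity_types, attributes)]


def split_label(label, sep="_"):
    """Recover (entity_type, attribute) from a flat label after annotation."""
    etype, attr = label.rsplit(sep, 1)
    return etype, attr


# Hypothetical label inventory for a clinical annotation task.
labels = flat_labels(["SYMPTOM", "DRUG"], ["AFFIRMED", "NEGATED"])
print(",".join(labels))
# SYMPTOM_AFFIRMED,SYMPTOM_NEGATED,DRUG_AFFIRMED,DRUG_NEGATED
```

The comma-separated string could then be passed as the label set of a manual span interface. The trade-off is that the number of labels grows with every attribute you add, so this works best for one or two binary attributes.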

This is something you could definitely do with a bit of custom JavaScript. Assuming you're using an interface like ner_manual or spans_manual, you can listen to the prodigyspanselected event (fired when a span is selected) or the update event (fired when the current annotation changes), look at window.prodigy.content.spans for a list of all spans, and insert a button or link that points to the Google search results for the span.text.
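The event handling itself has to happen in the browser, but the link target is just a URL-encoded query string. The sketch below shows only that part, written in Python so it could also be used in a recipe to pre-compute links for spans that are already known; the search endpoint, the query prefix and the function name are assumptions. In the browser, the same string could be built with encodeURIComponent.

```python
from urllib.parse import urlencode

SEARCH_BASE = "https://www.google.com/search"  # assumed search endpoint


def lookup_url(span_text, prefix="medical abbreviation"):
    """Build a search URL for a highlighted span, e.g. an abbreviation."""
    query = f"{prefix} {span_text.strip()}"
    # urlencode escapes spaces and special characters in the query.
    return f"{SEARCH_BASE}?{urlencode({'q': query})}"


print(lookup_url("HTN"))
# https://www.google.com/search?q=medical+abbreviation+HTN
```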

Non-contiguous entities are a tricky challenge in general, and solving this probably goes beyond just customising an interface and comes down to the annotation scheme as well. I've posted some thoughts on this in the following thread:

You should be able to do this via a custom recipe, a Python script that lets you control how data is streamed in, what to pre-annotate (e.g. using a model) and how to present the examples for annotation. I think it'll mostly come down to deciding how you want to present the information, e.g. if you want to use a general-purpose html interface, or a combination of different interfaces using blocks:
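To make the streaming part more concrete, here is a rough sketch of a generator that could sit inside a custom recipe. It attaches a knowledge base ID to each pre-annotated span and builds an HTML snippet with links formed by appending the ID to a base URL. The base URL, the lookup table, the "kb_id" key and the function name are all hypothetical; the task format with "text" and "spans" (character offsets "start" and "end") follows the usual Prodigy JSON structure.

```python
from html import escape

KB_BASE_URL = "https://example.org/kb/"  # hypothetical knowledge base URL


def add_kb_links(stream, kb_lookup, base_url=KB_BASE_URL):
    """Add KB IDs to spans and an "html" field linking to the KB entries."""
    for task in stream:
        links = []
        for span in task.get("spans", []):
            text = task["text"][span["start"]:span["end"]]
            kb_id = kb_lookup.get(text.lower())
            if kb_id is not None:
                span["kb_id"] = kb_id  # custom key, an assumption for this sketch
                links.append(
                    f'<a href="{base_url}{kb_id}" target="_blank">'
                    f"{escape(text)}: {kb_id}</a>"
                )
        task["html"] = "<br>".join(links)
        yield task


examples = [
    {
        "text": "Patient reports headache.",
        "spans": [{"start": 16, "end": 24, "label": "SYMPTOM"}],
    }
]
tasks = list(add_kb_links(examples, {"headache": "C1"}))
print(tasks[0]["html"])
```

In a recipe that combines interfaces using blocks, the "html" field could be shown by an html block underneath the span annotation block. In practice the lookup table would be replaced by whatever model or entity linker assigns the URIs.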

If you're working with entity linking, you might also find the following example project and tutorial useful: projects/tutorials/nel_emerson at v3 · explosion/projects · GitHub. It also includes a Prodigy recipe for annotating entity links with a multiple-choice UI that shows what's possible:

Prodigy comes with a JupyterLab extension for annotation: GitHub - explosion/jupyterlab-prodigy: 🧬 A JupyterLab extension for annotating data with Prodigy. In general, you can also embed the Prodigy web app via an iframe if needed, but it's not necessarily that useful: Prodigy is really designed for annotation, and annotation sessions can also be stateful, so you typically want to spin up the annotation server, create a small dataset, stop the server and run training experiments.