Questions on difficulty of implementing the following features in a custom recipe

tmpr · February 25, 2022, 2:18pm

Hello dear developers of Prodigy!

We are trying to decide on whether we should use Prodigy for a specific annotation task, however, we are not sure if certain features we'd be interested in are easy to implement. Could you, given that you have a good overview of the tool, give us some insight as to how difficult the following would be to code up?

Relation and entity attributes: We would like to add, for example, a ‘negated’ attribute to entities. Is this possible in Prodigy, and how much of this information could be incorporated into custom interfaces?
Lookups: Suppose a text contains an abbreviation like “HTN”. Can one add a button that, once the abbreviation is highlighted, automatically googles “medical abbreviation HTN”?
Fragmented entities: Suppose there is the text “abdominal- and chest-pain”. Is there a way to mark two entities, “abdominal-pain” and “chest-pain” here? If not, how difficult would it be to implement this behavior?
Knowledge base integration: How difficult is it to save a URI for an entity? Say we find out that “headache” has a certain URI “C1”. I would like to

add this to the data outside of Prodigy via a model, in such a way that it can be viewed in Prodigy, and
automatically display information from the given knowledge base by, say, appending “C1” to a base URL.

JavaScript embedding: Can one generate outputs from a dataset that can then be embedded in other applications, say Jupyter or something we write ourselves?

Thank you for taking the time to respond. I wish you a pleasant day and am looking forward to your answers!

ines · February 28, 2022, 9:51am

Hi! Most of what you describe definitely sounds possible, so it's a question of deciding how things should work and how comfortable you are implementing a bit of JavaScript.

By attribute, do you mean an additional label? Whether the entity was negated? Depending on what you want to do here and what the end goal is, you could probably model this as a single flat label scheme.
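To make the flat label scheme idea concrete, here is a minimal sketch in plain Python. It assumes the attribute is boolean and is encoded as a label suffix; the `_NEGATED` suffix, the `SYMPTOM` label and the helper names are illustrative choices, not part of Prodigy.

```python
# Sketch: fold a boolean attribute into a flat label scheme and back again.
# The "_NEGATED" suffix and all function names are illustrative assumptions.
NEGATED_SUFFIX = "_NEGATED"


def flat_labels(base_labels):
    """Return every base label followed by its negated variant."""
    labels = []
    for label in base_labels:
        labels.append(label)
        labels.append(label + NEGATED_SUFFIX)
    return labels


def split_label(label):
    """Split a flat label into (base_label, negated)."""
    if label.endswith(NEGATED_SUFFIX):
        return label[: -len(NEGATED_SUFFIX)], True
    return label, False


def expand_spans(task):
    """Copy the task's spans, turning the suffix into an explicit attribute."""
    spans = []
    for span in task.get("spans", []):
        base, negated = split_label(span["label"])
        spans.append({**span, "label": base, "negated": negated})
    return spans
```

The list from `flat_labels` could be passed as the label set of a manual span interface, and `expand_spans` applied to the exported annotations afterwards. The trade-off is that the number of labels doubles with each boolean attribute, so this only stays manageable for one or two attributes.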

This is something you could definitely do with a bit of custom JavaScript (see Custom Interfaces in the Prodigy docs). Assuming you're using an interface like ner_manual or spans_manual, you can listen to the prodigyspanselected event (fired when a span is selected) or the update event (fired when the current annotation changes), look at window.prodigy.content.spans for a list of all spans, and insert a button or link that points to the Google search results for the span's text.
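The link itself is just a query string built from the span's character offsets. The sketch below shows that logic in Python rather than in the browser, which would only cover spans that are already known on the server side (for example pre-annotated ones); the function name and the default query prefix are assumptions, and the same string building would carry over to the JavaScript handler.

```python
from urllib.parse import urlencode


def lookup_url(text, span, prefix="medical abbreviation"):
    """Build a Google search URL for the text covered by a span.

    `span` is assumed to carry character offsets under "start" and "end".
    """
    term = text[span["start"]:span["end"]]
    # urlencode takes care of spaces and special characters in the query.
    query = urlencode({"q": f"{prefix} {term}"})
    return f"https://www.google.com/search?{query}"


text = "Patient has HTN."
span = {"start": 12, "end": 15, "label": "ABBREVIATION"}
print(lookup_url(text, span))
# https://www.google.com/search?q=medical+abbreviation+HTN
```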

Non-contiguous entities are a tricky challenge in general, and solving this probably goes beyond just customising an interface and comes down to the annotation scheme for it as well. I've posted some thoughts on this in the following thread:

You should be able to do this via a custom recipe, a Python script that lets you control how data is streamed in, what to pre-annotate (e.g. using a model) and how to present the examples for annotation. I think it'll mostly come down to deciding how you want to present the information, e.g. if you want to use a general-purpose html interface, or a combination of different interfaces using blocks (see Custom Interfaces in the Prodigy docs).
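As a rough sketch of the presentation step, the function below takes a task whose spans were pre-annotated outside of Prodigy and adds an "html" field with one knowledge base link per entity. The `kb_id` span field, the base URL and the function name are all hypothetical; it also assumes the task is then shown with an html interface or block that renders the "html" field.

```python
import html

# Hypothetical base URL; replace with the real knowledge base address.
KB_BASE_URL = "https://kb.example.org/concept/"


def add_kb_html(task, base_url=KB_BASE_URL):
    """Return a copy of the task with an "html" field listing KB links.

    Each span is assumed to carry its identifier under "kb_id" (e.g. "C1");
    spans without one are skipped.
    """
    text = task["text"]
    items = []
    for span in task.get("spans", []):
        kb_id = span.get("kb_id")
        if kb_id is None:
            continue
        # Escape everything that ends up in the markup.
        mention = html.escape(text[span["start"]:span["end"]])
        url = html.escape(base_url + kb_id, quote=True)
        label = html.escape(kb_id)
        items.append(
            f'<li>{mention}: <a href="{url}" target="_blank">{label}</a></li>'
        )
    return {**task, "html": "<ul>" + "".join(items) + "</ul>"}
```

In a custom recipe, this would be applied to each example in the stream before it is sent out, so the annotator sees the links next to the highlighted text.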

If you're working with entity linking, you might also find the following example project and tutorial useful: https://github.com/explosion/projects/tree/v3/tutorials/nel_emerson It also includes a Prodigy recipe for annotating entity links with a multiple-choice UI that shows what's possible:

github.com

explosion/projects/blob/v3/tutorials/nel_emerson/scripts/el_recipe.py

"""
Custom Prodigy recipe to perform manual annotation of entity links,
given an existing NER model and a knowledge base performing candidate generation.
You can run this project without having Prodigy or using this recipe:
sample results are stored in assets/emerson_annotated_text.jsonl
"""

import spacy
from spacy.kb import KnowledgeBase, get_candidates

import prodigy
from prodigy.models.ner import EntityRecognizer
from prodigy.components.loaders import TXT
from prodigy.util import set_hashes
from prodigy.components.filters import filter_duplicates

import csv
from pathlib import Path

This file has been truncated.

Prodigy comes with a JupyterLab extension for annotation: explosion/jupyterlab-prodigy on GitHub. In general, you can also embed the Prodigy web app via an iframe if needed, but it's not necessarily that useful: Prodigy is really designed for annotation, and annotation sessions can also be stateful, so you typically want to spin up the annotation server, create a small dataset, stop the server and run training experiments.
