Disable relations between specific entities

Dear Prodigy team,

I am currently using your rel.manual recipe for annotating biomedical literature, which I found super helpful!

I am currently dealing with 9 entities (pre-labelled) and 6 types of relations, and the task consists of annotating the relations between entities. Each relation can only happen between specific entity types, and I was wondering if there is a way to disable the annotation of certain relations if the entity types do not correspond. Given the number of relations and entities, the annotation task is fairly complex, and I have noticed that annotators often label relations between entity types that can never have that type of relation, so this feature would be super helpful in simplifying the annotation task.

Just for clarification and as an example: let's say I have 4 pre-annotated entities (E1, E2, E3, E4) and 3 relationship types (R1, R2, R3).

R1 only happens between E1-E2 and E3-E2
R2 only happens between E4-E3 and E1-E3
R3 only happens between E1/E2/E3/E4-E1

I was wondering if there is a way in Prodigy to prevent annotators from setting, for instance, an R1 relation between E2-E4, or any other relation that doesn't satisfy the rules above?

Thank you very much,

Ferran

Thanks, glad to hear the workflow has been useful :slightly_smiling_face:

Allowing a general-purpose setting for that is tricky because there are just so many possible combinations and it can change depending on the existing state. But what you describe sounds like a good use case for a custom recipe with a validate_answer callback. See here for details: https://prodi.gy/docs/custom-recipes#validate_answer

The validate_answer function takes an answer after it's submitted and lets you raise custom errors to give live feedback to the annotator. They will then see an alert and can only submit the annotation if it passes validation. The error message you raise in Python gets shown to the annotator, so you can use it for customised feedback.

You can see an example of the JSON format produced by the relations UI here: https://prodi.gy/docs/api-interfaces#relations So in your case, you could check the label of each assigned relation and the label of the head_span and/or child_span, and then determine if it's a valid combination.
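As a rough sketch of that check, here is what it could look like for the E1-E4 / R1-R3 rules from the question. It assumes that "E1-E2" means head E1 and child E2, and that each relation carries `label`, `head_span` and `child_span` keys as in the relations JSON format linked above. The names `ALLOWED` and `find_invalid_relations` are made up for this illustration, and the example task is hand-built with only the keys the check reads.

```python
# Allowed (head entity, child entity) pairs per relation label, taken from the
# E1..E4 / R1..R3 example in the question. Reading "E1-E2" as head-child is an
# assumption.
ALLOWED = {
    "R1": {("E1", "E2"), ("E3", "E2")},
    "R2": {("E4", "E3"), ("E1", "E3")},
    "R3": {("E1", "E1"), ("E2", "E1"), ("E3", "E1"), ("E4", "E1")},
}


def find_invalid_relations(eg):
    """Return (relation, head, child) triples that break the rules."""
    invalid = []
    for rel in eg.get("relations", []):
        head = rel["head_span"].get("label")
        child = rel["child_span"].get("label")
        # Relation labels without an entry in ALLOWED are treated as invalid
        if (head, child) not in ALLOWED.get(rel["label"], set()):
            invalid.append((rel["label"], head, child))
    return invalid


def validate_answer(eg):
    """Raise an error listing every invalid relation in the submitted answer."""
    invalid = find_invalid_relations(eg)
    if invalid:
        lines = [f"{r} is not allowed between {h} and {c}" for r, h, c in invalid]
        raise ValueError("\n".join(lines))


# Hand-built example: the first relation is allowed, the second is not
example = {
    "relations": [
        {"label": "R1", "head_span": {"label": "E1"}, "child_span": {"label": "E2"}},
        {"label": "R1", "head_span": {"label": "E2"}, "child_span": {"label": "E4"}},
    ]
}
print(find_invalid_relations(example))  # [('R1', 'E2', 'E4')]
```

Keeping the lookup separate from the function that raises makes it easy to try the rules on a few saved examples before wiring the callback into a recipe.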

Hi Ines,

Thanks a lot for your advice. I agree that this functionality might not be suitable as a general-purpose feature. Running a check through validate_answer seems to do exactly what I would like.

However, I tried to reproduce the functionality behind the rel.manual recipe so that I could plug validate_answer into my custom recipe, but I am struggling to get the same features in the UI (essentially ner+relations). I may be overcomplicating things, but I was trying to write a custom recipe that extends rel.manual with a validate_answer callback, and I could not find the code behind rel.manual as I did for the ner recipes: https://github.com/explosion/prodigy-recipes/tree/master/ner. Is there a way to inherit from a recipe, or any code that reproduces the rel.manual functionality?

I tried to reproduce the functionalities with:

import prodigy
from prodigy.components.loaders import JSONL
from prodigy.util import combine_models, split_string
from typing import List, Optional
from prodigy.components.preprocess import add_tokens
import spacy 

@prodigy.recipe(
    "myrel",
    dataset=("The dataset to save to", "positional", None, str),
    spacy_model=("The base model", "positional", None, str),
    source=("The source data as a JSONL file", "positional", None, str),
    label=("One or more comma-separated labels", "option", "l", split_string),
    wrap=("optional wrapping", "flag", "w", bool),
    span_label=("One or more comma-separated span-labels", "option", "sl", split_string),
    patterns=("Optional match patterns", "option", "p", str),
    disable_patterns=("The disable patterns as a JSONL file", "option", "dpt", str),
    add_ents=("Add entities", "flag", "R", bool),

)
def myrel(
    dataset: str,
    spacy_model: str,
    source: str,
    label: Optional[List[str]] = None,
    wrap: bool = False,
    span_label: Optional[List[str]] = None,
    patterns: Optional[str] = None,
    disable_patterns: Optional[str] = None,
    add_ents: bool = False,
  
    ):
    
    stream = JSONL(source)
    nlp = spacy.load(spacy_model)
    stream = add_tokens(nlp, stream)
    

    return {
        "dataset": dataset,   
        "view_id": "relations",  
        "stream": stream,
        "config": {  
            "lang": nlp.lang,
            "labels": label,  
        }
    }

When I launch the UI with this recipe I get the relationship interface, but I don't get the option to swap to the NER when including the --span-label argument. Also, the entities that should be pre-highlighted by the model are not available. Any ideas on why that might be happening?
Apologies in advance if this is a rather trivial question

Ah, sorry if my answer was confusing! The rel.manual recipe definitely includes some more complex logic and you can check it out in the recipes/rel.py in your Prodigy installation. (Run prodigy stats to find the exact path.) So if you just quickly want to try it out the hacky way, you can also just edit this file in your Prodigy installation.

If you want a standalone recipe, you don't necessarily have to implement or copy the whole functionality. A recipe is just a Python function that returns a dictionary of components – so you can also just call a built-in recipe as a function in your own custom recipe.

Here's an example – I just copy-pasted all arguments and argument annotations so you can see how it all fits together :slightly_smiling_face: Of course, you don't have to set or overwrite all of them. You can make your custom recipe only take the arguments that it needs and that you want to set on the CLI.

from typing import List, Optional, Union, Iterable, Dict, Any
import prodigy
from prodigy.recipes.rel import manual as rel_manual
from prodigy.util import get_labels, split_string

@prodigy.recipe(
    "custom.rel.manual",
    dataset=("Dataset to save annotations to", "positional", None, str),
    spacy_model=("Loadable spaCy model or blank:lang (e.g. blank:en)", "positional", None, str),
    source=("Data to annotate (file path or '-' to read from standard input)", "positional", None, str),
    loader=("Loader (guessed from file extension if not set)", "option", "lo", str),
    label=("Comma-separated relation label(s) to annotate or text file with one label per line", "option", "l", get_labels),
    span_label=("Comma-separated span label(s) to annotate or text file with one label per line", "option", "sl", get_labels),
    patterns=("Patterns file for defining custom spans to be added", "option", "pt", str),
    disable_patterns=("Patterns file for defining tokens to disable (make unselectable)", "option", "dpt", str),
    add_ents=("Add entities predicted by the model", "flag", "AE", bool),
    add_nps=("Add noun phrases (if noun chunks rules are available), based on tagger and parser", "flag", "AN"),
    wrap=("Wrap lines in the UI by default (instead of showing tokens in one row)", "flag", "W", bool),
    exclude=("Comma-separated list of dataset IDs whose annotations to exclude", "option", "e", split_string),
    hide_arrow_heads=("Hide the arrow heads visually", "flag", "HA", bool),
)
def custom_rel_manual(
    dataset: str,
    spacy_model: str,
    source: Union[str, Iterable[dict]] = "-",
    loader: Optional[str] = None,
    label: Optional[List[str]] = None,
    span_label: Optional[List[str]] = None,
    exclude: Optional[List[str]] = None,
    patterns: Optional[Union[str, List]] = None,
    disable_patterns: Optional[Union[str, List]] = None,
    add_ents: bool = False,
    add_nps: bool = False,
    wrap: bool = False,
    hide_arrow_heads: bool = False,
) -> Dict[str, Any]:
    components = rel_manual(
        dataset=dataset,
        spacy_model=spacy_model,
        source=source,
        loader=loader,
        label=label,
        span_label=span_label,
        exclude=exclude,
        patterns=patterns,
        disable_patterns=disable_patterns,
        add_ents=add_ents,
        add_nps=add_nps,
        wrap=wrap,
        hide_arrow_heads=hide_arrow_heads,
    )
    # Add your own validate_answer function (defined elsewhere in this file)
    # to the components returned by the recipe
    components["validate_answer"] = validate_answer
    return components

Thanks so much for this @ines, it worked perfectly :slight_smile: !

I'm leaving the validate_answer function I used here in case it helps anyone trying to set up similar functionality. The only thing that would need to be adapted is the valid_relations dictionary at the end:

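def validate_answer(eg):
    # Collect one message per invalid relation so the annotator sees them all at once
    error_messages = []
    for relation in eg.get("relations", []):
        # The label is None if the span has no entity label
        head_label = relation["head_span"].get("label")
        child_label = relation["child_span"].get("label")
        rel_type = relation["label"]
        # Relation types without an entry in valid_relations are treated as invalid
        is_valid = (head_label, child_label) in valid_relations.get(rel_type, [])
        if not is_valid:
            error_messages.append(
                f"Careful! You can't assign a relation of type {rel_type} between: {head_label}-{child_label}")
    # The message raised here is what the annotator sees in the alert
    if error_messages:
        raise ValueError("\n".join(error_messages))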

valid_relations = {
    "C_VAL": [("PK", "VALUE")],
    "C_MIN": [("PK", "VALUE")],
    "C_MAX": [("PK", "VALUE")],
    "D_VAL": [("VALUE", "VALUE")],
    "D_MIN": [("VALUE", "VALUE")],
    "D_MAX": [("VALUE", "VALUE")],
    "DOSAGE": [("VALUE", "VALUE")],
    "COMPLEMENT": [("COVARIATES", "COVARIATES")],
    "RELATED": [("TYPE_MEAS", "VALUE"), ("COMPARATIVE", "VALUE"), ("SPECIES", "VALUE"), ("CHEMICAL", "VALUE"),
                ("DISEASES", "VALUE"), ("UNITS", "VALUE")]
}
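One thing that may be worth adding when adapting the dictionary: a small start-up check that every relation label you pass with --label has an entry in valid_relations, so a typo in the rules is caught before annotation begins. This is only a sketch. The helper name `check_rule_coverage` is hypothetical, and the rules and labels below are shortened stand-ins for the real ones.

```python
def check_rule_coverage(relation_labels, valid_relations):
    """Return the relation labels that have no entry in the rules dictionary."""
    return sorted(set(relation_labels) - set(valid_relations))


# Hypothetical values: a shortened rules dict and the labels passed via --label
rules = {"C_VAL": [("PK", "VALUE")], "D_VAL": [("VALUE", "VALUE")]}
cli_labels = ["C_VAL", "D_VAL", "DOSAGE"]

missing = check_rule_coverage(cli_labels, rules)
print(missing)  # ['DOSAGE']
```

In the custom recipe, a check like this could run on the `label` argument before the components are returned, raising an error if the result is not empty.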