Prodigy Questions

Thanks for your feedback! This is an interesting case. I'm not familiar with tagtog, but I'm glad you mentioned it so I can learn more!

Yes! I found these posts that are relevant:

It's possible to add nested entities or synonyms if you convert the .tsv file to a dictionary that maps each entity to its nested entities (you could do the same for the synonyms), like this:

# dictionary of lowercase entities mapped to subtypes
DRUG_SUBTYPES = {
    'citalopram': ['ANTIDEPRESSANT', 'SOMETHING_ELSE'],
    'lexapro': ['ANTIDEPRESSANT'],
    # etc.
}
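
If it helps, here is a minimal sketch of how that dictionary could be built from the .tsv file. I don't know your file's actual layout, so this assumes a hypothetical two-column format (entity term, then one subtype per row); `SAMPLE_TSV` and `load_subtypes` are made-up names for illustration only.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data: column 1 is the entity term, column 2 is one subtype.
# Your real .tsv may be laid out differently.
SAMPLE_TSV = (
    "citalopram\tANTIDEPRESSANT\n"
    "citalopram\tSOMETHING_ELSE\n"
    "Lexapro\tANTIDEPRESSANT\n"
)


def load_subtypes(handle):
    """Map each lowercased entity term to the list of its subtypes."""
    subtypes = defaultdict(list)
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        term, subtype = row[0].strip().lower(), row[1].strip()
        if subtype not in subtypes[term]:
            subtypes[term].append(subtype)  # keep order, avoid duplicates
    return dict(subtypes)


# With a real file you would pass open("your_file.tsv", newline="") instead.
DRUG_SUBTYPES = load_subtypes(io.StringIO(SAMPLE_TSV))
```

Lowercasing the keys means the later lookup can be case-insensitive as long as you lowercase the entity text before looking it up.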

Then you would follow the instructions to use spaCy to create a custom component for your modeling pipeline. I've posted a quick example of what it may look like:
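
For reference, here is a rough sketch of that kind of component, assuming spaCy v3. To keep it self-contained it uses a blank English pipeline with an `entity_ruler` standing in for a trained NER model, and it only handles single-token terms. The `DRUG` label, the `drug_subtypes` component name, and the `subtypes` extension attribute are names I picked for illustration, not anything built into spaCy or Prodigy.

```python
import spacy
from spacy.language import Language
from spacy.tokens import Span

# dictionary of lowercase entities mapped to subtypes
DRUG_SUBTYPES = {
    "citalopram": ["ANTIDEPRESSANT", "SOMETHING_ELSE"],
    "lexapro": ["ANTIDEPRESSANT"],
}

# Custom attribute on entity spans, accessed as ent._.subtypes (hypothetical name).
Span.set_extension("subtypes", default=None, force=True)


@Language.component("drug_subtypes")
def drug_subtypes(doc):
    """Attach the subtype list to every DRUG entity found earlier in the pipeline."""
    for ent in doc.ents:
        if ent.label_ == "DRUG":
            ent._.subtypes = DRUG_SUBTYPES.get(ent.text.lower(), [])
    return doc


# Assumption: a blank pipeline plus rule-based matching, in place of your trained model.
nlp = spacy.blank("en")
ruler = nlp.add_pipe("entity_ruler")
ruler.add_patterns(
    [{"label": "DRUG", "pattern": [{"LOWER": term}]} for term in DRUG_SUBTYPES]
)
nlp.add_pipe("drug_subtypes", last=True)  # must run after the entities are set

doc = nlp("She was prescribed Citalopram last year.")
for ent in doc.ents:
    print(ent.text, ent.label_, ent._.subtypes)
```

With a trained pipeline you would drop the `entity_ruler` part and add the component after `ner` instead, so it sees the model's predicted entities.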

Two important points to think about. First, the model development is really done by spaCy, not Prodigy. Prodigy is the UI tool for collecting more annotations, while spaCy is the NLP engine underneath. Prodigy does offer helpful training recipes, but these are really running spaCy. To get the biggest and quickest gains with Prodigy, it helps to learn more about spaCy. So the question here is really "can spaCy do this?" rather than "can Prodigy do this?".

It is worth noting that you can use Prodigy with other NLP/python libraries like TensorFlow or PyTorch, but that will require even more customization on the developer's part.

Second, and related, what separates Prodigy from many other annotation tools is that it is a developer annotation tool. Prodigy is designed so that your developers can customize it by writing their own Python scripts to fit their unique needs (e.g., through custom recipes or custom interfaces). My favorite video that captures this design philosophy is this excellent talk titled "Let Them Write Code" by Ines:

Thanks again for your questions! Let us know if you have any more.