Hello,
I've had a look around for this use case, but it doesn't seem to be well documented. Is there a way to pipe the outputs of a SpanCat model (i.e. Docs with identified Spans) into Prodigy and then annotate them for a TextCat model, e.g. to record whether a particular entity appears in Context A or Context B?
Steps Taken So Far
- I have labelled some datasets with the spans.manual recipe
- I have trained a SpanCat model on these datasets
- I have corrected the model's predictions using spans.correct
What I Would Like to Happen Now
- I use the outputs of the SpanCat model to go through the annotated datasets again, this time providing annotations on the contexts of a particular entity
a) Ideally this would take the form of a question such as "What is the context of this extracted entity? A, B, or C?"
- Alternatively, I revise the annotation method and provide annotations for both the SpanCat and TextCat models in the first pass.
Comments
Classifying the whole source text or sentence won't work, as each text might contain multiple entities, each with a different context.
For a more concrete example, suppose I'm extracting Equipment along with the context in which the equipment is used. Given the text "Mr X needs a wheelchair but is currently using a zimmer frame", I would like to indicate that the extracted entities wheelchair and zimmer frame are classified as EQUIPMENT_NEEDED and EQUIPMENT_USED respectively. Ideally there would be a workflow within Prodigy for assigning these per-entity context labels.
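To make the idea concrete, here is a rough sketch of the kind of conversion I have in mind. It assumes the span annotations are available as dictionaries with "text" and "spans" keys (for instance, records exported with prodigy db-out) and that a multiple-choice interface can take tasks carrying "text", "spans" and "options". The function name spans_to_choice_tasks and the CONTEXT_LABELS list are my own and purely illustrative; how to wire this into a recipe is the part I'm unsure about.

```python
from typing import Any, Dict, Iterable, Iterator, List

# Hypothetical context labels, taken from the example above.
CONTEXT_LABELS = ["EQUIPMENT_NEEDED", "EQUIPMENT_USED"]


def spans_to_choice_tasks(
    examples: Iterable[Dict[str, Any]], labels: List[str]
) -> Iterator[Dict[str, Any]]:
    """Yield one multiple-choice task per span, so that each entity
    in a text can be given its own context label."""
    for eg in examples:
        text = eg["text"]
        for span in eg.get("spans", []):
            yield {
                "text": text,
                # Keep only the span being asked about, so it is the
                # only one highlighted in the task.
                "spans": [dict(span)],
                "options": [{"id": label, "text": label} for label in labels],
                "meta": {"span_text": text[span["start"]:span["end"]]},
            }


def make_span(text: str, phrase: str, label: str) -> Dict[str, Any]:
    """Helper for this demo only: build a span dict from a substring."""
    start = text.index(phrase)
    return {"start": start, "end": start + len(phrase), "label": label}


text = "Mr X needs a wheelchair but is currently using a zimmer frame"
example = {
    "text": text,
    "spans": [
        make_span(text, "wheelchair", "EQUIPMENT"),
        make_span(text, "zimmer frame", "EQUIPMENT"),
    ],
}

tasks = list(spans_to_choice_tasks([example], CONTEXT_LABELS))
for task in tasks:
    print(task["meta"]["span_text"], [opt["id"] for opt in task["options"]])
```

Each input text produces as many tasks as it has spans, which is what would let wheelchair and zimmer frame receive different context labels even though they come from the same sentence.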
Many thanks