Span Cat Annotations and Incorrect Predictions

I'm having trouble with a spancat model that does not predict, or even attempt to predict, all of the annotated labels in a text.

Backstory: I am currently training a spancat model to identify medical phrases in clinical narratives. We have annotated around 800 clinical narratives from real patient data, labelling them with Prodigy's spancat annotator to mark phrases as disorders, findings, procedures, etc. We are using spancat rather than NER because many of the medical phrases contain sub-phrases that carry separate labels of their own. For instance, "family history of heart disease" would be a situation, but the text "heart disease" within it is a disorder.

After training the model with Prodigy, we found that the results had very low sensitivity: very few terms were ever identified, even when predicting on the data used to train the model. We thought we needed more data, so we continued to add examples and retrain iteratively, but the sensitivity stayed low. I then looked into how the annotations are saved and found something peculiar: only the spans that contained a "text" attribute were being predicted by the model. In other words, the spans within each example in the JSONL file were saved in different ways, and only the spans with the "text" attribute were predicted, while the spans without it were not. Let me give an example:

A middle-aged female presents to the emergency room with the concern of active gastrointestinal (GI) bleeding. 99-Tc labeled RBC scan is performed, and imaged were acquired for 4 hours. No abnormal area of increased radiotracer uptake is identified. The patient had a bowel movement during the scan, and the stool was imaged and was positive for radioactivity.

The JSONL output for that annotation appeared as follows:

{"text":"A middle-aged female presents to the emergency room with the concern of active gastrointestinal (GI) bleeding. 99-Tc labeled RBC scan is performed, and imaged were acquired for 4 hours. No abnormal area of increased radiotracer uptake is identified. The patient had a bowel movement during the scan, and the stool was imaged and was positive for radioactivity. ","_input_hash":-1194054636,"_task_hash":-470800897,"tokens":[{"text":"A","start":0,"end":1,"id":0,"ws":true},{"text":"middle","start":2,"end":8,"id":1,"ws":false},{"text":"-","start":8,"end":9,"id":2,"ws":false},{"text":"aged","start":9,"end":13,"id":3,"ws":true},{"text":"female","start":14,"end":20,"id":4,"ws":true},{"text":"presents","start":21,"end":29,"id":5,"ws":true},{"text":"to","start":30,"end":32,"id":6,"ws":true},{"text":"the","start":33,"end":36,"id":7,"ws":true},{"text":"emergency","start":37,"end":46,"id":8,"ws":true},{"text":"room","start":47,"end":51,"id":9,"ws":true},{"text":"with","start":52,"end":56,"id":10,"ws":true},{"text":"the","start":57,"end":60,"id":11,"ws":true},{"text":"concern","start":61,"end":68,"id":12,"ws":true},{"text":"of","start":69,"end":71,"id":13,"ws":true},{"text":"active","start":72,"end":78,"id":14,"ws":true},{"text":"gastrointestinal","start":79,"end":95,"id":15,"ws":true},{"text":"(","start":96,"end":97,"id":16,"ws":false},{"text":"GI","start":97,"end":99,"id":17,"ws":false},{"text":")","start":99,"end":100,"id":18,"ws":true},{"text":"bleeding","start":101,"end":109,"id":19,"ws":false},{"text":".","start":109,"end":110,"id":20,"ws":true},{"text":"99","start":111,"end":113,"id":21,"ws":false},{"text":"-","start":113,"end":114,"id":22,"ws":false},{"text":"Tc","start":114,"end":116,"id":23,"ws":true},{"text":"labeled","start":117,"end":124,"id":24,"ws":true},{"text":"RBC","start":125,"end":128,"id":25,"ws":true},{"text":"scan","start":129,"end":133,"id":26,"ws":true},{"text":"is","start":134,"end":136,"id":27,"ws":true},{"text":"performed","start":137,"end":146,"
id":28,"ws":false},{"text":",","start":146,"end":147,"id":29,"ws":true},{"text":"and","start":148,"end":151,"id":30,"ws":true},{"text":"imaged","start":152,"end":158,"id":31,"ws":true},{"text":"were","start":159,"end":163,"id":32,"ws":true},{"text":"acquired","start":164,"end":172,"id":33,"ws":true},{"text":"for","start":173,"end":176,"id":34,"ws":true},{"text":"4","start":177,"end":178,"id":35,"ws":true},{"text":"hours","start":179,"end":184,"id":36,"ws":false},{"text":".","start":184,"end":185,"id":37,"ws":true},{"text":"No","start":186,"end":188,"id":38,"ws":true},{"text":"abnormal","start":189,"end":197,"id":39,"ws":true},{"text":"area","start":198,"end":202,"id":40,"ws":true},{"text":"of","start":203,"end":205,"id":41,"ws":true},{"text":"increased","start":206,"end":215,"id":42,"ws":true},{"text":"radiotracer","start":216,"end":227,"id":43,"ws":true},{"text":"uptake","start":228,"end":234,"id":44,"ws":true},{"text":"is","start":235,"end":237,"id":45,"ws":true},{"text":"identified","start":238,"end":248,"id":46,"ws":false},{"text":".","start":248,"end":249,"id":47,"ws":true},{"text":"The","start":250,"end":253,"id":48,"ws":true},{"text":"patient","start":254,"end":261,"id":49,"ws":true},{"text":"had","start":262,"end":265,"id":50,"ws":true},{"text":"a","start":266,"end":267,"id":51,"ws":true},{"text":"bowel","start":268,"end":273,"id":52,"ws":true},{"text":"movement","start":274,"end":282,"id":53,"ws":true},{"text":"during","start":283,"end":289,"id":54,"ws":true},{"text":"the","start":290,"end":293,"id":55,"ws":true},{"text":"scan","start":294,"end":298,"id":56,"ws":false},{"text":",","start":298,"end":299,"id":57,"ws":true},{"text":"and","start":300,"end":303,"id":58,"ws":true},{"text":"the","start":304,"end":307,"id":59,"ws":true},{"text":"stool","start":308,"end":313,"id":60,"ws":true},{"text":"was","start":314,"end":317,"id":61,"ws":true},{"text":"imaged","start":318,"end":324,"id":62,"ws":true},{"text":"and","start":325,"end":328,"id":63,"ws":true},{"text"
:"was","start":329,"end":332,"id":64,"ws":true},{"text":"positive","start":333,"end":341,"id":65,"ws":true},{"text":"for","start":342,"end":345,"id":66,"ws":true},{"text":"radioactivity","start":346,"end":359,"id":67,"ws":false},{"text":".","start":359,"end":360,"id":68,"ws":true}],
"spans":[
{"start":14,"end":20,"text":"female","source":"./output/model-best","input_hash":-1194054636,"token_start":4,"token_end":4,"label":"FINDING"},
{"start":79,"end":109,"token_start":15,"token_end":19,"label":"DISORDER"},
{"start":111,"end":133,"token_start":21,"token_end":26,"label":"TEST"}],"_view_id":"spans_manual","answer":"accept","_timestamp":1642257676}

The model only predicts one span: "female". However, two other spans were labelled: "gastrointestinal (GI) bleeding" (tokens 15 to 19, labelled DISORDER) and "99-Tc labeled RBC scan" (tokens 21 to 26, labelled TEST). In the "spans" list at the bottom of the output, you can see that only the "female" span has the "text" and "source" attributes. The two spans that were not predicted do not.
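To double-check that the spans without "text" still point at sensible text, the character offsets can be sliced out of the example directly. This is only a minimal sketch using the standard library; the `example` dict is an abbreviated copy of the record above (first two sentences, fewer keys), and `span_texts` is a helper name I made up for illustration.

```python
# Abbreviated copy of the annotated example above (first two sentences only).
example = {
    "text": "A middle-aged female presents to the emergency room with the "
            "concern of active gastrointestinal (GI) bleeding. "
            "99-Tc labeled RBC scan is performed.",
    "spans": [
        {"start": 14, "end": 20, "text": "female", "label": "FINDING"},
        {"start": 79, "end": 109, "label": "DISORDER"},
        {"start": 111, "end": 133, "label": "TEST"},
    ],
}


def span_texts(task):
    """Return (label, has_text_key, recovered_text) for every span in a task.

    Hypothetical helper: the text is recovered from the character offsets,
    so it works whether or not the span carries its own "text" attribute.
    """
    return [
        (s["label"], "text" in s, task["text"][s["start"]:s["end"]])
        for s in task.get("spans", [])
    ]


for label, has_text, recovered in span_texts(example):
    print(f"{label:10} has_text={has_text!s:5} {recovered!r}")
```

The offsets of all three spans resolve to the expected phrases, so the spans without "text" are at least not pointing at the wrong characters.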

This does not appear to be limited to this one example. I have tested multiple examples of varying complexity, and I found over and over again that labelled spans without "text" or "source" were never predicted by the model.
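For anyone who wants to check their own export for the same pattern, here is a rough sketch of how the spans with and without a "text" attribute can be tallied per label. It assumes one JSON task per line with a "spans" list, as in the record above; `count_span_text_keys` and the two sample lines are made up for illustration, and the file name in the comment is hypothetical.

```python
import json
from collections import Counter


def count_span_text_keys(lines):
    """Count spans per (label, with_text / without_text) over JSONL lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the export
        task = json.loads(line)
        for span in task.get("spans", []):
            key = "with_text" if "text" in span else "without_text"
            counts[(span["label"], key)] += 1
    return counts


# Two tiny made-up lines standing in for a real export. For a real file, pass
# an open file handle instead, e.g. open("annotations.jsonl", encoding="utf8")
# (hypothetical file name).
sample_lines = [
    '{"text": "heart disease", "spans": [{"start": 0, "end": 13, "label": "DISORDER"}]}',
    '{"text": "female", "spans": [{"start": 0, "end": 6, "text": "female", "label": "FINDING"}]}',
]

counts = count_span_text_keys(sample_lines)
for (label, key), n in sorted(counts.items()):
    print(label, key, n)
```

Comparing these counts with the labels the model actually predicts would show whether the missing "text" attribute really lines up with the missing predictions across the whole dataset.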

I am training the model with the default settings. Am I doing something wrong that leads to this behavior, or is this possibly a bug somewhere? Thank you for any help you can provide!

Edit: Possibly related issue: when I add a seed file for manual annotations, seed terms that contain special characters such as hyphens or parentheses are not identified.
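One thing I have not ruled out is tokenization. In the tokens above, "99-Tc" is stored as three tokens ("99", "-", "Tc") and "(GI)" as three as well, so a seed that is compared token by token would need one entry per token. The sketch below builds such an entry in the token-based style used by spaCy's Matcher. Whether my seed file is actually read this way is an assumption on my part, and `token_pattern` is a hypothetical helper.

```python
import json


def token_pattern(label, token_texts):
    """Build one token-based pattern entry, one dict per token.

    Assumption: seeds are matched against tokens case-insensitively via the
    "lower" attribute, as in spaCy Matcher-style patterns.
    """
    return {
        "label": label,
        "pattern": [{"lower": t.lower()} for t in token_texts],
    }


# Token texts copied from the "tokens" list in the example record (ids 21-26).
entry = token_pattern("TEST", ["99", "-", "Tc", "labeled", "RBC", "scan"])
print(json.dumps(entry))
```

If this is the cause, a seed written as the single string "99-Tc" could never equal any one token, which would match what I am seeing.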