I just had a look, and I think the reason this happens is that the split_spans helper defaults the score to 1.0 for every span that has no score available. The preprocessor is typically used in binary active learning recipes, where each span usually has a score attached. split_spans takes care of porting over those scores and, for consistency, assigns 1.0 to any span that comes through without one.
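To make the behaviour concrete, here is a minimal sketch of the defaulting logic described above. It is not Prodigy's actual implementation: the function name `split_spans_sketch`, the `default_score` parameter, and the choice to store the score on the task are all assumptions for illustration.

```python
import copy


def split_spans_sketch(stream, default_score=1.0):
    """Hypothetical stand-in for split_spans, for illustration only.

    Yields one task per span and carries the span's score over,
    falling back to default_score when the span has none.
    """
    for eg in stream:
        for span in eg.get("spans", []):
            task = copy.deepcopy(eg)
            # Keep exactly one span per outgoing task.
            task["spans"] = [dict(span)]
            # Assumption: the score ends up on the task; spans without
            # a score receive the default.
            task["score"] = span.get("score", default_score)
            yield task


examples = [
    {
        "text": "Apple hires Tim",
        "spans": [
            {"start": 0, "end": 5, "label": "ORG", "score": 0.42},
            {"start": 12, "end": 15, "label": "PERSON"},  # no score
        ],
    }
]

tasks = list(split_spans_sketch(examples))
scores = [task["score"] for task in tasks]
print(scores)
```

In this sketch the second span has no score, so it comes out with 1.0, which is why unscored spans can look like maximally confident predictions further down the pipeline.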