Scores added to meta for some reason!?

I have a recipe where I use the following functions:

stream = get_stream(stream, rehash=True, dedup=True, input_key="text")
stream = split_sentences(nlp, stream, min_length=300)
stream = split_spans(stream, ["AMOUNT"])

The first AMOUNT span comes without a score (as it should), but each following task on that same example comes with a score equal to 1.

I realize it doesn't really matter, but I am simply puzzled by the behaviour. Has it changed recently?

I just had a look, and I think this happens because the split_spans helper defaults the score to 1.0 for any span that has no score available. The preprocessor is mostly used in binary active learning recipes, where spans usually have a score attached. split_spans takes care of porting over the scores and, for consistency, assigns 1.0 to all spans that come through without one.
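
To make the defaulting behaviour concrete, here is a minimal sketch of the idea. It is not Prodigy's actual implementation: the function name split_spans_sketch, the default_score parameter, and the choice to copy the score into the task's meta are all assumptions made for illustration, based only on the description above.

```python
import copy


def split_spans_sketch(stream, labels=None, default_score=1.0):
    """Hypothetical stand-in for a span splitter: yields one task per span.

    Not Prodigy's code. It only illustrates "use the span's score if present,
    otherwise fall back to a default".
    """
    for eg in stream:
        for span in eg.get("spans", []):
            if labels and span.get("label") not in labels:
                continue
            task = copy.deepcopy(eg)
            new_span = dict(span)
            # Fall back to the default when the span carries no score.
            new_span.setdefault("score", default_score)
            task["spans"] = [new_span]
            # Assumption: the score is also surfaced in the task's meta.
            task.setdefault("meta", {})["score"] = new_span["score"]
            yield task


examples = [
    {
        "text": "Pay 5 EUR or 7 USD",
        "spans": [
            {"start": 4, "end": 9, "label": "AMOUNT"},  # no score
            {"start": 13, "end": 18, "label": "AMOUNT", "score": 0.4},
        ],
    }
]

tasks = list(split_spans_sketch(examples, ["AMOUNT"]))
for task in tasks:
    print(task["spans"][0]["start"], task["meta"]["score"])
```

In this sketch the span without a score ends up with the default, while the span that already had a score keeps it. The input examples are left unmodified because each task is built from a copy.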
