Spancat: surrounding text used as context?

Quick question on the SpanCategorizer that I haven’t been able to find an answer to.

Does the SpanCategorizer include context surrounding the suggested spans when classifying or is only the span being considered? If it does take context into account - how much’ish?

You're in luck, we just released a video that explains some of the details of spancat. You can watch it here.

If there's anything still unclear after watching the segment, feel free to say so!

Yeah, so I actually came up with the question after I watched the wonderful video :slight_smile:

Ah! That's good to know :slight_smile: To my knowledge, the spancat classifier depends on the suggested spans going in. But suppose that you're considering all 5-grams. Then, by definition, we will consider every 5-token window as it slides over the text. So even though the classifier only considers one span at a time, the surrounding tokens are still being considered, because they occur in the neighbouring, overlapping windows.
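To make the moving-window idea concrete, here is a minimal plain-Python sketch. It is not spaCy's actual suggester code: `ngram_spans` and `spans_covering` are hypothetical helpers made up for illustration, and the sentence is just an example. It only shows that with all 5-grams, a given token ends up inside several overlapping candidate spans.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) token offsets, end exclusive


def ngram_spans(n_tokens: int, size: int) -> List[Span]:
    """Hypothetical helper: every window of `size` consecutive tokens."""
    return [(start, start + size) for start in range(n_tokens - size + 1)]


def spans_covering(spans: List[Span], token_index: int) -> List[Span]:
    """Hypothetical helper: the candidate spans that contain a given token."""
    return [(s, e) for s, e in spans if s <= token_index < e]


tokens = "the quick brown fox jumps over the lazy dog".split()
spans = ngram_spans(len(tokens), size=5)

# Each candidate is scored on its own, but "jumps" shows up in several of
# them, each time with a different set of neighbouring tokens.
covering = spans_covering(spans, tokens.index("jumps"))
for start, end in covering:
    print(" ".join(tokens[start:end]))
```

Every printed window contains "jumps" with different neighbours, which is the sense in which surrounding tokens still get looked at even though each span is classified separately.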

If you'd like to go more into details, it would be better to ask this question on the spaCy discussion forum. The spaCy maintainers will be able to give more detailed answers to any further questions.
