spancat with really large spans? (Identify sections in text)

I found a thread on the forum that might serve as inspiration here.

It's a different problem, but it highlights another two-step approach to rethinking spans.

That said, reading your reply still makes me think that textcat might be the simplest way forward, albeit on paragraphs instead of sentences. While I like your idea of using NER to detect the start of a section, I wonder if you could leverage the fact that a section always starts on a newline, which suggests that a heuristic might work better than an ML model.
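
To make the heuristic idea a bit more concrete, here is a minimal sketch in plain Python. I don't know what your section headers actually look like, so the `SECTION_START` pattern (a short line, optionally numbered, starting with a capital letter and containing no sentence punctuation) and the `split_sections` helper are both assumptions made up for illustration. You would need to adapt the pattern to your own data.

```python
import re

# Hypothetical pattern, not taken from your data: an optional number prefix
# ("1." or "2.3"), then a capitalised line of at most 61 characters that
# contains no sentence punctuation.
SECTION_START = re.compile(r"^(?:\d+(?:\.\d+)*\.?\s+)?[A-Z][^.!?]{0,60}$")


def split_sections(text):
    """Group the non-empty lines of `text` into sections.

    A new section opens whenever a line matches SECTION_START. Lines that
    appear before the first header end up in a section whose title is None.
    """
    sections = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        if SECTION_START.match(line):
            sections.append({"title": line, "paragraphs": []})
        elif not sections:
            sections.append({"title": None, "paragraphs": [line]})
        else:
            sections[-1]["paragraphs"].append(line)
    return sections


example = (
    "1. Introduction\n"
    "This is the intro. It has sentences.\n"
    "More intro text here.\n"
    "2. Methods\n"
    "We did things.\n"
)

for section in split_sections(example):
    print(section["title"], len(section["paragraphs"]))
```

If a rule like this catches most of your section starts, the paragraphs inside each section could then be passed to textcat, and the model would only need to handle the cases where the heuristic is ambiguous.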