Hi @dad766!
Yes, very likely that's the problem. Have you tried training on CPU? Also, can you try training on just the X shortest docs and see if it still runs? That would at least confirm whether doc length is the issue.
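A minimal sketch of that experiment, assuming your training data is in a `.spacy` `DocBin` file (the paths and the helper name `keep_shortest` are just for illustration):

```python
import spacy
from spacy.tokens import DocBin

def keep_shortest(in_path: str, out_path: str, n: int = 100) -> int:
    """Write the n shortest docs from in_path to out_path; return how many were kept."""
    nlp = spacy.blank("en")  # vocab only; no pipeline components needed
    docs = sorted(DocBin().from_disk(in_path).get_docs(nlp.vocab), key=len)
    DocBin(docs=docs[:n]).to_disk(out_path)
    return min(n, len(docs))
```

Then point your training config at the new file and see whether the run completes.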
One possible option is to try a different suggester function than the n-gram suggester, since it blows up the number of candidate spans on long documents.
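For example, spaCy ships an `ngram_range_suggester` that lets you cap the span width. A sketch of the config change, assuming your span categorizer component is named `spancat`:

```
[components.spancat.suggester]
@misc = "spacy.ngram_range_suggester.v1"
min_size = 1
max_size = 3
```

Keeping `max_size` small limits how many candidates are generated per token, which helps memory usage on long docs.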
Is there any way you can break up your transcripts? I know transcripts often lack sentences/punctuation, but even a few simple rules might work. A clever way to segment your data may help your model more than a different suggester would, though.
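As one illustration of "simple rules", here's a sketch of a custom pipeline component that starts a new pseudo-sentence after common filler words. The component name and the filler-word set are just assumptions; tune them to your data:

```python
import spacy
from spacy.language import Language

@Language.component("transcript_segmenter")
def transcript_segmenter(doc):
    # Mark a token as a sentence start if the previous token is a filler word.
    fillers = {"okay", "so", "right"}
    for i, token in enumerate(doc):
        token.is_sent_start = i == 0 or doc[i - 1].lower_ in fillers
    return doc

nlp = spacy.blank("en")
nlp.add_pipe("transcript_segmenter")
doc = nlp("okay we start here so then we moved on right that was it")
print([sent.text for sent in doc.sents])
```

Shorter segments like these keep the number of candidate spans per doc manageable, whichever suggester you use.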
As you have more questions on training, you may also want to check out spaCy's discussion forum. There are more posts on optimizing spaCy training there.
Hope this helps!