Hi! It looks like you've done everything correctly in terms of setting up and packaging your sentencizer.
It looks like you've hit an interesting edge case here in data-to-spacy: the recipe currently uses a blank model with the default sentencizer to process the examples (mainly tokenization and sentence segmentation). ner.manual doesn't segment any sentences and just shows you whatever you stream in – so you'll want to use your custom sentencizer when you convert the data for spaCy.
The easiest workaround for now would be to find the location of your Prodigy installation (you can run prodigy stats to get the path), open prodigy/recipes/train.py and find the data_to_spacy recipe function. You can then modify the calls to spacy.blank(lang) and nlp.add_pipe (the first few lines) and either hard-code your model, or change it to spacy.load and remove the default sentencizer if you want to be able to pass in a model name instead of just a language code via --lang.
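For context, here's a sketch of what the recipe sets up by default and what the workaround changes (this is illustrative, not the actual recipe source; the add_pipe call differs between spaCy v2 and v3, and "your_custom_model" is a hypothetical package name):

```python
import spacy

# Roughly what data-to-spacy builds internally: a blank model
# plus the default rule-based sentencizer.
nlp = spacy.blank("en")
try:
    nlp.add_pipe("sentencizer")                   # spaCy v3 API
except ValueError:
    nlp.add_pipe(nlp.create_pipe("sentencizer"))  # spaCy v2 API

# The workaround replaces the lines above with something like:
# nlp = spacy.load("your_custom_model")  # hypothetical package name
# (and removes the default sentencizer so yours is used instead)

doc = nlp("First sentence. Second sentence.")
sentences = [sent.text for sent in doc.sents]
```

With the default sentencizer, the example text above is split on the periods into two sentences – which is exactly the segmentation your custom component should take over.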
I'll try to think of a good way to solve this in a general-purpose way. I think just allowing a custom base model for tokenization and sentence segmentation, instead of just a base language, should work fine.
Just released Prodigy v1.10, which adds a --base-model argument to the data-to-spacy recipe. Note that you'd still have to call your component "sentencizer" so it's not disabled when the Doc objects are created (or perform your segmentation in the tokenizer). I'd love to resolve this as well, but it would require some deeper refactoring.
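To illustrate the naming point, here's a minimal sketch of a custom segmentation component added to the pipeline under the name "sentencizer" (the splitting rule and all names here are made up for illustration; the registration API differs between spaCy v2 and v3):

```python
import spacy

def custom_sentencizer(doc):
    # Illustrative rule: start a new sentence after every semicolon.
    for i, token in enumerate(doc[:-1]):
        doc[i + 1].is_sent_start = token.text == ";"
    return doc

nlp = spacy.blank("en")
try:
    # spaCy v2: pass the function directly and override the name
    nlp.add_pipe(custom_sentencizer, name="sentencizer")
except ValueError:
    # spaCy v3: components must be registered by string name first
    from spacy.language import Language
    Language.component("custom_sentencizer", func=custom_sentencizer)
    nlp.add_pipe("custom_sentencizer", name="sentencizer")

doc = nlp("Hello world; this is a test")
sentences = [sent.text for sent in doc.sents]
```

Because the pipe is named "sentencizer", it matches the component name data-to-spacy expects and won't be disabled when the Doc objects are created.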