data-to-spacy for transformers

This command should already take care of that:

prodigy data-to-spacy my_directory --ner my_dataset

If you're training a spaCy pipeline with a transformer, then spaCy will take care of aligning your annotated tokens with the transformer's wordpiece tokens on your behalf.
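
To make that a bit more concrete, here's a toy sketch of what mapping word-level tokens onto subword pieces can look like. This is only an illustration under my own assumptions and not spaCy's actual implementation: the `char_spans` and `align` helpers and the wordpiece split are hypothetical, and in a real pipeline you never have to write this yourself.

```python
# Toy illustration only: NOT spaCy's or spacy-transformers' actual code.
# It matches tokens and subword pieces by their character offsets in the text.

def char_spans(text, pieces):
    """Locate each piece in text, left to right, as (start, end) offsets."""
    spans = []
    cursor = 0
    for piece in pieces:
        start = text.index(piece, cursor)  # raises ValueError if a piece is missing
        end = start + len(piece)
        spans.append((start, end))
        cursor = end
    return spans


def align(text, tokens, wordpieces):
    """Map each token index to the indices of the wordpieces overlapping it."""
    token_spans = char_spans(text, tokens)
    piece_spans = char_spans(text, wordpieces)
    alignment = []
    for t_start, t_end in token_spans:
        overlapping = [
            i
            for i, (p_start, p_end) in enumerate(piece_spans)
            if p_start < t_end and t_start < p_end
        ]
        alignment.append(overlapping)
    return alignment


text = "Prodigy annotations"
tokens = ["Prodigy", "annotations"]              # word-level tokens, as annotated
wordpieces = ["Prod", "igy", "annot", "ations"]  # hypothetical subword split
alignment = align(text, tokens, wordpieces)
print(alignment)
```

Each inner list holds the wordpiece positions that belong to one annotated token, which is the kind of bookkeeping you'd otherwise have to do by hand when the two tokenizations disagree.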

You can see me report on all of the required steps here. Since you mentioned you're running similar steps, but on Colab, I'm thinking this might be a spaCy issue that's specific to Colab.