You can try a transformer, but I recommend also training a lightweight CNN model. If the lightweight model performs nearly as well, it may be the better choice for production because it runs much faster.
As for training with transformers, the simplest way to think about it is that you only need to change the config.cfg file, which holds all the parameters for the spaCy model you'll train.
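To make the config difference concrete, here is a minimal sketch that builds the spaCy CLI commands for generating one CNN-style config and one transformer-style config, plus a matching training command. It only assembles and prints the commands. The helper names (`init_config_cmd`, `train_cmd`), the `ner` pipeline, and all file paths are assumptions for illustration, not something from your setup. As far as I know, passing `--gpu` to `spacy init config` is what makes the generated config use a transformer component, and training it requires the `spacy-transformers` package.

```python
# Sketch only: builds (but does not run) spaCy CLI commands.
# Assumptions: an English pipeline with a single "ner" component and the
# file names used below; swap in your own language, components, and paths.

def init_config_cmd(output_path, lang="en", pipeline=("ner",), use_transformer=False):
    """Return a `spacy init config` command as a list of arguments."""
    cmd = [
        "python", "-m", "spacy", "init", "config", output_path,
        "--lang", lang,
        "--pipeline", ",".join(pipeline),
    ]
    if use_transformer:
        # --gpu requests a GPU-oriented config, which uses a transformer
        # component when one is available for the language.
        cmd += ["--optimize", "accuracy", "--gpu"]
    else:
        # CPU-oriented config built around the lightweight CNN tok2vec.
        cmd += ["--optimize", "efficiency"]
    return cmd


def train_cmd(config_path, output_dir, train_path, dev_path, gpu_id=None):
    """Return a `spacy train` command as a list of arguments."""
    cmd = [
        "python", "-m", "spacy", "train", config_path,
        "--output", output_dir,
        "--paths.train", train_path,
        "--paths.dev", dev_path,
    ]
    if gpu_id is not None:
        # Transformers are impractically slow to train without a GPU.
        cmd += ["--gpu-id", str(gpu_id)]
    return cmd


commands = [
    init_config_cmd("cnn.cfg"),
    init_config_cmd("trf.cfg", use_transformer=True),
    train_cmd("cnn.cfg", "output_cnn", "train.spacy", "dev.spacy"),
    train_cmd("trf.cfg", "output_trf", "train.spacy", "dev.spacy", gpu_id=0),
]
for cmd in commands:
    print(" ".join(cmd))
```

Running the printed commands in a shell gives you two trained pipelines on the same data, so you can compare their accuracy and speed before deciding which one to ship. If you diff `cnn.cfg` against `trf.cfg`, you should see that the training sections are largely the same and the main change is the embedding component.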