Welcome to the forum @andspasol
The most important tool would be the `spacy package` command. This lets you package your trained pipeline as a Python package so that you can install and deploy it like any other Python package.
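As a rough sketch (the paths, pipeline name and version here are just placeholders - they'll depend on your own project layout and `meta.json`):

```shell
# Package the trained pipeline in ./training/model-best into an
# installable wheel under ./packages (both paths are examples)
python -m spacy package ./training/model-best ./packages --build wheel

# Install the resulting wheel like any other dependency; the actual
# file name comes from your pipeline's lang/name/version in the meta
pip install packages/en_my_pipeline-0.0.1/dist/en_my_pipeline-0.0.1-py3-none-any.whl
```

After that you can load it in code with `spacy.load("en_my_pipeline")` on any machine where the wheel is installed.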
As for the host, there are various options of course, but judging by the forum, our users tend to opt for AWS Lambda, SageMaker or Google Cloud.
Here you can find related posts with some pointers and available integrations:
AWS Lambda: Trained model. Where should I deploy it?
SageMaker: Deploy trained text classifier on Sagemaker - #2 by ines
Streamlit could be another option. There's spaCy integration with Streamlit documented here.
Yet another option is Hugging Face Inference Endpoint.
I'm not sure which of these options would be the best cost-wise - that probably depends on how much traffic you expect.
Additionally, here you can find some info on how to make the deployment lighter for a faster inference service: Lightweight version of spacy for inference ?
As for the right way to set up the API, perhaps this blog post that includes detailed instructions could be of help? Deploy Spacy Ner With Fast Api
Also, in case you haven't seen it, spaCy also has an integration with FastAPI that gives you the right scaffolding: Projects · spaCy Usage Documentation
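If you want to start from that template, cloning it looks roughly like this (the template lives under `integrations/fastapi` in the spaCy projects repo; the destination directory name is up to you):

```shell
# Clone the FastAPI integration template from the spaCy projects repo
python -m spacy project clone integrations/fastapi ./my-fastapi-app
cd my-fastapi-app

# Download the assets the project defines, then print an overview
# of the commands and workflows it ships with
python -m spacy project assets
python -m spacy project document
```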