Hi! Prodigy is primarily designed as a developer tool and the main way to interact with it and start the annotation server is via the CLI or from Python. You can also add a layer on top that provides a simple UI, e.g. for uploading a file or configuring some settings. One way to do this would be to put Prodigy behind a REST API, e.g. using a library like FastAPI, and then make a request to it that starts the server and redirects the user.
One thing to keep in mind is that you typically want the developer who is also working on the model to define the annotation task and settings. How you set things up will be very important for the final results, so you typically want to avoid any variance here and have the annotators label exactly what you need for the model.
Once the annotation server is running, you can have multiple people access it. You can also start multiple instances of Prodigy on separate ports, e.g. one instance per annotator.
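For instance, you could script the per-annotator setup like this. This is a minimal sketch: the recipe, dataset name, source file, and labels are placeholders, and the `PRODIGY_PORT` environment override is an assumption based on Prodigy's config conventions, so adjust it to your setup.

```python
import subprocess  # only needed if you actually launch the processes

def instance_command(dataset, annotator, port):
    """Build the CLI command and env override for one annotator's instance.

    Recipe, labels, and the PRODIGY_PORT override are illustrative assumptions.
    """
    cmd = ["prodigy", "ner.manual", f"{dataset}-{annotator}",
           "blank:en", "data.jsonl", "--label", "PERSON,ORG"]
    env = {"PRODIGY_PORT": str(port)}
    # To launch for real (requires `import os` and an installed prodigy):
    # subprocess.Popen(cmd, env={**os.environ, **env})
    return cmd, env

# One instance per annotator, each on its own port and its own dataset
commands = [instance_command("my_dataset", name, 8080 + i)
            for i, name in enumerate(["alice", "bob"])]
```

Using a separate dataset per annotator also makes it easy to compare their annotations later.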
Sure, you can load and filter your CSV data however you want. At the end of it, you just need to send out examples in Prodigy's data format, depending on the interface you want to use — see the "Annotation interfaces" section of the Prodigy docs.
Streams that provide the examples to annotate are simple Python generators, so you can implement any custom logic using custom recipes or your own loader functions.
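As a sketch of that idea, here's a generator that reads rows from a CSV file and yields them as Prodigy-style task dicts. The `text` column name and the `meta` layout are assumptions about your data; any extra columns are stashed in `meta` so they show up alongside the example in the UI.

```python
import csv
import io

def stream_from_csv(file_like, text_field="text"):
    """Yield Prodigy-style task dicts from CSV rows.

    `text_field` names the column holding the text to annotate (an
    assumption about your data); remaining columns go into "meta".
    """
    reader = csv.DictReader(file_like)
    for row in reader:
        text = (row.get(text_field) or "").strip()
        if not text:  # filter out rows with no text
            continue
        meta = {k: v for k, v in row.items() if k != text_field}
        yield {"text": text, "meta": meta}

# Quick demo with an in-memory CSV; the empty row is skipped
demo = io.StringIO("text,source\nHello world,web\n,web\nSecond example,api\n")
tasks = list(stream_from_csv(demo))
```

In a custom recipe you'd return this generator as the `stream`, so Prodigy pulls examples from it lazily.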
Prodigy is an annotation tool for creating training data for machine learning models. After creating that data, you can use it to train any model, implemented in any framework. Prodigy also provides various built-in workflows for working with spaCy and for training spaCy models from Prodigy annotations. However, spaCy is not a strict requirement. After training a model, you can then use it in your code.
You can run Prodigy on a local machine or on a server in the cloud etc. It's a regular Python app, so you can deploy it just like you would deploy any other Python application. You can also run training experiments locally, or on a remote machine. Once you're serious about training a final model, you typically want to do this on a separate machine with more resources (or a GPU), since that's usually more efficient.