Store results in another database

Can I store the validated results in SQL Server?

If you can connect to SQL Server from Python, then yes. Prodigy lets you pass in a custom Database class via the "db" setting returned by a custom recipe, or via an entry point of your own Python package installed in the same environment.

See this thread for more details and discussion:

You can find the expected database API in your PRODIGY_README.html.

Do I need to set up the prodigy.json file as follows if I want to store the results in SQL?

{
    "db": "mysql",
    "db_settings": {
        "mysql": {
            "host": "localhost",
            "user": "username",
            "passwd": "xxx",
            "db": "prodigy"
        }
    }
}

If you want to connect to a regular MySQL database, that's supported out-of-the-box. Just make sure the Python driver is installed and the connection parameters are in your prodigy.json.

However, from what I understand, SQL Server is different? (If you're referring to Microsoft SQL Server and not just a remote MySQL database.)

Yes, I mean Microsoft SQL Server.

In that case, you probably need to write your own adapter (as discussed in the thread I linked above). I just did a quick search and there seem to be several resources available for working with Microsoft SQL Server in Python: https://docs.microsoft.com/en-us/sql/connect/python/python-driver-for-sql-server?view=sql-server-2017

I still don't quite understand. Could you help me write the adapter? I want to load parsed data from MSSQL, use Prodigy to validate the annotation results, and store the validated results back in MSSQL.

The main thing here is that databases are different: if you want to store your annotations in a database, you need to tell Prodigy how. Prodigy already comes with all the internals and makes sure that the methods to save annotations or add a new dataset are called in the right places. But if you want to use a custom database, you need to write the code that's executed in those methods. For example, Database.add_examples will receive a list of examples and one or more dataset names, and you'd have to write the code that uses that information and adds the examples to your database.
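To make that a bit more concrete, here is a rough sketch of what such an adapter could look like. It is not Prodigy's actual implementation. The class name, table layout and the get_dataset and add_dataset signatures are assumptions for illustration, so check the expected methods against the Database API in your PRODIGY_README.html. It also uses Python's built-in sqlite3 as a stand-in so it runs anywhere. For Microsoft SQL Server you would open the connection with a SQL Server driver instead and adjust the SQL dialect.

```python
import json
import sqlite3


class SketchDatabase:
    """Hypothetical adapter sketch. sqlite3 stands in for SQL Server here."""

    def __init__(self, conn):
        self.conn = conn
        # Assumed schema: one table of dataset names, one table of examples.
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS dataset (name TEXT PRIMARY KEY)"
        )
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS example "
            "(id INTEGER PRIMARY KEY, dataset TEXT, content TEXT)"
        )
        self.conn.commit()

    def add_dataset(self, name):
        # INSERT OR IGNORE is SQLite syntax; SQL Server needs a different idiom.
        self.conn.execute(
            "INSERT OR IGNORE INTO dataset (name) VALUES (?)", (name,)
        )
        self.conn.commit()

    def add_examples(self, examples, datasets):
        # Receives a list of examples and one or more dataset names,
        # and stores each example as JSON under every dataset name.
        for name in datasets:
            self.add_dataset(name)
        rows = [(name, json.dumps(eg)) for name in datasets for eg in examples]
        self.conn.executemany(
            "INSERT INTO example (dataset, content) VALUES (?, ?)", rows
        )
        self.conn.commit()

    def get_dataset(self, name):
        # Returns the stored examples for one dataset, in insertion order.
        cur = self.conn.execute(
            "SELECT content FROM example WHERE dataset = ? ORDER BY id", (name,)
        )
        return [json.loads(row[0]) for row in cur]


db = SketchDatabase(sqlite3.connect(":memory:"))
db.add_examples([{"text": "hello", "answer": "accept"}], ["validated"])
stored = db.get_dataset("validated")
```

An instance of a class like this is what you would pass in via the "db" setting returned by your custom recipe. A real adapter will need the remaining methods from the README as well.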

And we can't really write custom adapters for you, sorry. I also don't really know SQL Server at all. But if that's what you're using, maybe there's someone you work with who can help? I'm sure you'll also find resources online if you search for SQL Server and Python. In terms of the Prodigy-specific methods, check out the documentation on custom loaders and the Database class.
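For the loading side, a custom loader can be as simple as a Python generator that yields one dictionary per task. The sketch below is only an illustration: the parsed_docs table, its columns and the function name are made up, and sqlite3 again stands in for a SQL Server connection. Which keys your tasks need beyond "text" depends on the recipe you're using.

```python
import sqlite3


def stream_from_parsed_docs(conn):
    """Yield one task dict per row of the (hypothetical) parsed_docs table."""
    cur = conn.execute("SELECT id, text FROM parsed_docs ORDER BY id")
    for row_id, text in cur:
        # Keep the source row ID in "meta" so results can be traced back.
        yield {"text": text, "meta": {"source_id": row_id}}


# Demo with an in-memory table standing in for the real parsed data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parsed_docs (id INTEGER PRIMARY KEY, text TEXT)")
conn.executemany(
    "INSERT INTO parsed_docs (id, text) VALUES (?, ?)",
    [(1, "First document"), (2, "Second document")],
)
conn.commit()
tasks = list(stream_from_parsed_docs(conn))
```

A generator like this could then be returned as the stream from a custom recipe, so rows are only read as they're needed.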