Connecting mysql db failed

Hello!
I am trying to switch from the default SQLite database to a remote MySQL database, but I keep getting a strange error.
The command:
python3 -m prodigy ner.teach my_set en_core_web_sm around-the-world.txt --label GPE
The error:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/runpy.py", line 183, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/runpy.py", line 142, in _get_module_details
    return _get_module_details(pkg_main_name, error)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/runpy.py", line 109, in _get_module_details
    __import__(pkg_name)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/prodigy/__init__.py", line 4, in <module>
    from . import recipes, about  # noqa
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/prodigy/recipes/__init__.py", line 4, in <module>
    from . import ner, textcat, compare, terms, generic  # noqa
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/prodigy/recipes/ner.py", line 25, in <module>
    DB = connect()
  File "cython_src/prodigy/components/db.pyx", line 31, in prodigy.components.db.connect
  File "cython_src/prodigy/components/db.pyx", line 89, in prodigy.components.db.Database.__init__
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/peewee.py", line 431, in initialize
    callback(obj)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/peewee.py", line 1197, in _set_constructor
    self._constructor = database.get_binary_type()
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/peewee.py", line 4377, in get_binary_type
    return mysql.Binary
AttributeError: 'NoneType' object has no attribute 'Binary'

prodigy.json file:

{
    "db": "mysql",
    "db_settings": {
        "mysql": {
            "host": "host",
            "user": "user",
            "passwd": "password",
            "db": "myawasomedb"
        }
    }
}

Appreciate any help!

Thanks for the report! This is strange – your config definitely looks okay. The good news is, under the hood, Prodigy uses peewee, so you can debug the database connection by calling into peewee directly. If peewee can connect to your DB, Prodigy should, too. The database setup also allows more advanced usage – see here for an example.

I had a look online and found this issue on the peewee tracker describing the same error message. The solution suggested was the following:

So this error indicates that the mysql Python driver is not installed or is not import-able. […] You’ve triggered the “mysql=None” bit, meaning peewee could not import either driver.

Maybe the MySQL driver is not available in the Python environment you’re using to run Prodigy?
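
A quick way to check this is to ask Python whether either driver module can be found in the environment that runs Prodigy. This is only a minimal sketch: it assumes the two modules peewee tries to import are MySQLdb (installed by the mysqlclient package) and pymysql (installed by PyMySQL), and the function name is made up for illustration.

```python
import importlib.util

# Assumption: these are the two driver modules peewee tries to import.
DRIVER_MODULES = ("MySQLdb", "pymysql")


def available_mysql_drivers(candidates=DRIVER_MODULES):
    """Return the candidate modules that can be located in this environment."""
    return [
        name for name in candidates
        if importlib.util.find_spec(name) is not None
    ]


# An empty list means no driver was found, which would explain "mysql=None".
print(available_mysql_drivers())
```

Run it with the same interpreter you use for Prodigy (here, python3), since a driver installed for a different Python will not be visible.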

Thanks for the quick reply!
Who is responsible for installing the MySQL driver? Is it peewee or Prodigy? I am trying to understand the installation flow so I can find where it failed in the first place.

As far as I know, this is all up to the user – sorry if this was unclear. There are different options and configurations, so using peewee with a MySQL database requires a driver to already be installed. The MySQL drivers supported by peewee are PyMySQL and MySQLdb, so installing either of those packages should work. You can also connect to a remote MySQL database using peewee’s Playhouse extension (example here).

Hmm, OK, so that was unclear to me.
I thought the Prodigy installation had it bundled.
Which modules do I need to install in order to use a remote MySQL db?

Maybe this should have been more explicit in the docs, sorry. The reason we’re not shipping any database drivers with Prodigy is that there are too many options and user preferences – we want to keep the dependencies lightweight, instead of making the user install all database drivers by default. In theory, you can even plug in your very own database solution by adding your own Database class, so Prodigy tries not to make too many assumptions about the user’s preference here.

The thread I linked above should include all relevant details on remote connections – including the code snippet contributed by a fellow Prodigy user. Essentially, Prodigy’s built-in Database class can be initialised with any valid peewee database, however you choose to construct it. Here’s a StackOverflow thread on remote MySQL databases with peewee. There’s also the Playhouse extension which adds even more advanced functionality – but I’m not sure if this is relevant in your case.

The easiest way for now is probably to try it out with one of the MySQL Python drivers and make sure it all works as expected.
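
To try the connection with peewee alone, something like the following sketch might help. It assumes peewee and one MySQL driver are installed; check_mysql_connection is a hypothetical helper name, and the credentials in the usage comment are the placeholders from the config earlier in this thread.

```python
def check_mysql_connection(name, **connect_kwargs):
    """Open a MySQL connection through peewee and return the table names."""
    # Imported here so the failure point is obvious if peewee is missing.
    from peewee import MySQLDatabase

    db = MySQLDatabase(name, **connect_kwargs)
    db.connect()
    try:
        return db.get_tables()
    finally:
        db.close()


# Example call (placeholder values, replace with your own):
# check_mysql_connection("myawasomedb", host="host", user="user", password="password")
```

If this raises an error, the problem is in the driver or the connection settings rather than in Prodigy.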

Cool thanks a lot, will update once everything is set up.

Hi again,
I updated my packages, yet another error came…

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/runpy.py", line 183, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/runpy.py", line 142, in _get_module_details
    return _get_module_details(pkg_main_name, error)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/runpy.py", line 109, in _get_module_details
    __import__(pkg_name)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/prodigy/__init__.py", line 4, in <module>
    from . import recipes, about  # noqa
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/prodigy/recipes/__init__.py", line 4, in <module>
    from . import ner, textcat, compare, terms, generic  # noqa
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/prodigy/recipes/ner.py", line 25, in <module>
    DB = connect()
  File "cython_src/prodigy/components/db.pyx", line 31, in prodigy.components.db.connect
  File "cython_src/prodigy/components/db.pyx", line 94, in prodigy.components.db.Database.__init__
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/peewee.py", line 3917, in create_tables
    create_model_tables(models, fail_silently=safe)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/peewee.py", line 5356, in create_model_tables
    m.create_table(**create_table_kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/peewee.py", line 5028, in create_table
    if fail_silently and cls.table_exists():
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/peewee.py", line 5024, in table_exists
    return cls._meta.db_table in cls._meta.database.get_tables(**kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/peewee.py", line 4321, in get_tables
    return [row for row, in self.execute_sql('SHOW TABLES')]
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/peewee.py", line 3828, in execute_sql
    cursor = self.get_cursor()
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/peewee.py", line 3774, in get_cursor
    return self.get_conn().cursor()
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/peewee.py", line 3763, in get_conn
    self.connect()
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/peewee.py", line 3738, in connect
    self._local.conn = self._create_connection()
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/peewee.py", line 3768, in _create_connection
    return self._connect(self.database, **self.connect_kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/peewee.py", line 4318, in _connect
    return mysql.connect(db=database, **conn_kwargs)
TypeError: Connect() got multiple values for keyword argument 'db'

Any ideas?

In your prodigy.json config, could you try changing "db" in "db": "myawasomedb" to either "name", "dbname" or "database"?

It looks like Prodigy’s connector currently only looks for these values in the config, to support different types of config parameters. But somehow, not "db", which is specific to the MySQLdb driver. Will also make sure to fix this for the next release.
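
To illustrate why the leftover "db" key triggers that TypeError, here is a small stand-alone sketch. It does not use Prodigy, peewee or a real driver: fake_connect only mimics the call shape from the last traceback line, and split_settings is a hypothetical helper showing the kind of key lookup described above, not Prodigy's actual code.

```python
# Keys mentioned above as the ones the connector looks for.
NAME_KEYS = ("name", "dbname", "database")


def fake_connect(db=None, **kwargs):
    """Stand-in for a driver's connect(); returns its arguments."""
    return {"db": db, **kwargs}


def open_connection(database, **conn_kwargs):
    # Same shape as the traceback: mysql.connect(db=database, **conn_kwargs)
    return fake_connect(db=database, **conn_kwargs)


def split_settings(settings):
    """Hypothetical: separate the database name from the other settings."""
    rest = dict(settings)
    for key in NAME_KEYS:
        if key in rest:
            return rest.pop(key), rest
    return None, rest


# With "db" in the settings, the name is not picked up and "db" is passed twice.
bad = {"host": "host", "user": "user", "passwd": "password", "db": "myawasomedb"}
name, rest = split_settings(bad)
try:
    open_connection(name, **rest)
except TypeError as err:
    print("failed:", err)

# With "name" instead, the key is removed before the remaining settings are passed on.
good = {"host": "host", "user": "user", "passwd": "password", "name": "myawasomedb"}
name, rest = split_settings(good)
print(open_connection(name, **rest))
```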

“name” is working!
Thanks again!
