peewee.OperationalError: MySQL server has gone away

I’m seeing the following error on a pretty regular basis. Based on what I’m reading in the peewee docs it looks like the connection is timing out and this is not being properly handled by prodigy. To reproduce all I need to do is start the app (in this case I’m using the textcat.teach example) and leave it alone for a while. Then when I come back and try to save some answers I get this error. The only way to fix the app once this happens is to restart it completely. I’m using the PyMySQL driver. Thanks!

 16:02:04 - CONTROLLER: Receiving 3 answers
 16:02:05 - PROGRESS: Estimating progress of 0.1667
 16:02:05 - Exception when serving /give_answers
 Traceback (most recent call last):
   File "/usr/local/lib/python3.6/site-packages/pymysql/connections.py", line 1043, in _write_bytes
     self._sock.sendall(data)
 BrokenPipeError: [Errno 32] Broken pipe

 During handling of the above exception, another exception occurred:

 Traceback (most recent call last):
   File "/usr/local/lib/python3.6/site-packages/peewee.py", line 3830, in execute_sql
     cursor.execute(sql, params or ())
   File "/usr/local/lib/python3.6/site-packages/pymysql/cursors.py", line 165, in execute
     result = self._query(query)
   File "/usr/local/lib/python3.6/site-packages/pymysql/cursors.py", line 321, in _query
     conn.query(q)
   File "/usr/local/lib/python3.6/site-packages/pymysql/connections.py", line 859, in query
     self._execute_command(COMMAND.COM_QUERY, sql)
   File "/usr/local/lib/python3.6/site-packages/pymysql/connections.py", line 1096, in _execute_command
     self._write_bytes(packet)
   File "/usr/local/lib/python3.6/site-packages/pymysql/connections.py", line 1048, in _write_bytes
     "MySQL server has gone away (%r)" % (e,))
 pymysql.err.OperationalError: (2006, "MySQL server has gone away (BrokenPipeError(32, 'Broken pipe'))")

 During handling of the above exception, another exception occurred:

 Traceback (most recent call last):
   File "cython_src/prodigy/components/db.pyx", line 191, in prodigy.components.db.Database.add_examples
   File "/usr/local/lib/python3.6/site-packages/peewee.py", line 4977, in create
     inst.save(force_insert=True)
   File "/usr/local/lib/python3.6/site-packages/peewee.py", line 5170, in save
     pk_from_cursor = self.insert(**field_dict).execute()
   File "/usr/local/lib/python3.6/site-packages/peewee.py", line 3584, in execute
     cursor = self._execute()
   File "/usr/local/lib/python3.6/site-packages/peewee.py", line 2939, in _execute
     return self.database.execute_sql(sql, params, self.require_commit)
   File "/usr/local/lib/python3.6/site-packages/peewee.py", line 3837, in execute_sql
     self.commit()
   File "/usr/local/lib/python3.6/site-packages/peewee.py", line 3656, in __exit__
     reraise(new_type, new_type(*exc_args), traceback)
   File "/usr/local/lib/python3.6/site-packages/peewee.py", line 135, in reraise
     raise value.with_traceback(tb)
   File "/usr/local/lib/python3.6/site-packages/peewee.py", line 3830, in execute_sql
     cursor.execute(sql, params or ())
   File "/usr/local/lib/python3.6/site-packages/pymysql/cursors.py", line 165, in execute
     result = self._query(query)
   File "/usr/local/lib/python3.6/site-packages/pymysql/cursors.py", line 321, in _query
     conn.query(q)
   File "/usr/local/lib/python3.6/site-packages/pymysql/connections.py", line 859, in query
     self._execute_command(COMMAND.COM_QUERY, sql)
   File "/usr/local/lib/python3.6/site-packages/pymysql/connections.py", line 1096, in _execute_command
     self._write_bytes(packet)
   File "/usr/local/lib/python3.6/site-packages/pymysql/connections.py", line 1048, in _write_bytes
     "MySQL server has gone away (%r)" % (e,))
 peewee.OperationalError: (2006, "MySQL server has gone away (BrokenPipeError(32, 'Broken pipe'))")

 During handling of the above exception, another exception occurred:

 Traceback (most recent call last):
   File "/usr/local/lib/python3.6/site-packages/peewee.py", line 3849, in rollback
     self.get_conn().rollback()
   File "/usr/local/lib/python3.6/site-packages/pymysql/connections.py", line 785, in rollback
     self._execute_command(COMMAND.COM_QUERY, "ROLLBACK")
   File "/usr/local/lib/python3.6/site-packages/pymysql/connections.py", line 1075, in _execute_command
     raise err.InterfaceError("(0, '')")
 pymysql.err.InterfaceError: (0, '')

 During handling of the above exception, another exception occurred:

 Traceback (most recent call last):
   File "/usr/local/lib/python3.6/site-packages/waitress/channel.py", line 338, in service
     task.service()
   File "/usr/local/lib/python3.6/site-packages/waitress/task.py", line 169, in service
     self.execute()
   File "/usr/local/lib/python3.6/site-packages/waitress/task.py", line 399, in execute
     app_iter = self.channel.server.application(env, start_response)
   File "/usr/local/lib/python3.6/site-packages/hug/api.py", line 424, in api_auto_instantiate
     return module.__hug_wsgi__(*args, **kwargs)
   File "/usr/local/lib/python3.6/site-packages/falcon/api.py", line 244, in __call__
     responder(req, resp, **params)
   File "/usr/local/lib/python3.6/site-packages/hug/interface.py", line 734, in __call__
     raise exception
   File "/usr/local/lib/python3.6/site-packages/hug/interface.py", line 709, in __call__
     self.render_content(self.call_function(input_parameters), request, response, **kwargs)
   File "/usr/local/lib/python3.6/site-packages/hug/interface.py", line 649, in call_function
     return self.interface(**parameters)
   File "/usr/local/lib/python3.6/site-packages/hug/interface.py", line 100, in __call__
     return __hug_internal_self._function(*args, **kwargs)
   File "/usr/local/lib/python3.6/site-packages/prodigy/app.py", line 71, in give_answers
     controller.receive_answers(answers)
   File "cython_src/prodigy/core.pyx", line 104, in prodigy.core.Controller.receive_answers
   File "cython_src/prodigy/components/db.pyx", line 188, in prodigy.components.db.Database.add_examples
   File "/usr/local/lib/python3.6/site-packages/peewee.py", line 4452, in __exit__
     return self.context_manager.__exit__(exc_type, exc_val, exc_tb)
   File "/usr/local/lib/python3.6/site-packages/peewee.py", line 4484, in __exit__
     self.rollback(False)
   File "/usr/local/lib/python3.6/site-packages/peewee.py", line 4471, in rollback
     self.db.rollback()
   File "/usr/local/lib/python3.6/site-packages/peewee.py", line 3849, in rollback
     self.get_conn().rollback()
   File "/usr/local/lib/python3.6/site-packages/peewee.py", line 3656, in __exit__
     reraise(new_type, new_type(*exc_args), traceback)
   File "/usr/local/lib/python3.6/site-packages/peewee.py", line 135, in reraise
     raise value.with_traceback(tb)
   File "/usr/local/lib/python3.6/site-packages/peewee.py", line 3849, in rollback
     self.get_conn().rollback()
   File "/usr/local/lib/python3.6/site-packages/pymysql/connections.py", line 785, in rollback
     self._execute_command(COMMAND.COM_QUERY, "ROLLBACK")
   File "/usr/local/lib/python3.6/site-packages/pymysql/connections.py", line 1075, in _execute_command
     raise err.InterfaceError("(0, '')")
 peewee.InterfaceError: (0, '')

Thanks a lot for the report and checking the peewee docs! This is pretty annoying, sorry about that.

We’ll think about the best way to handle this – since the database is integrated via the recipes and controller, and is used across the application, I don’t think we can just handle the connection via the REST API. But we should be able to fix this issue consistently using better connection management.

1 Like

Thanks! In the meantime I’m guessing using sqlite is recommended?

That's probably the safest option, yes (sorry again!). If you want, you can also experiment with initialising peewee database yourself – maybe something clever could be done with the PooledMySQLDatabase class of the Playhouse extensions? (See here for an example of plugging in a custom database.)

@beckerfuffle

Quick update: Just working on the fix for this problem! If you have a second, would you be able to test the following two little scripts with your Prodigy database?

I want to verify that explicitly closing and reconnecting does indeed solve the problem – but I’m not 100% sure if my test setup is fully representative. I’m using 30 seconds as the timeout, since 28800ms is apparently the default MySQL wait_timeout – but if you’re using a custom configuration, you might have to adjust this.

Example 1: should produce the error

import time
from peewee import MySQLDatabase

db = MySQLDatabase('your_prodigy_db', user='user', password='password')
db.connect()
db.execute_sql('SELECT * FROM Dataset')  # execute arbitrary query
time.sleep(30)                           # sleep for the wait_timeout
db.execute_sql('SELECT * FROM Dataset')  # try query again
# this should produce the "MySQL server has gone away" error

Example 2: should work

import time
from peewee import MySQLDatabase

db = MySQLDatabase('your_prodigy_db', user='user', password='password')
db.connect()
db.execute_sql('SELECT * FROM Dataset') 
time.sleep(30)          
db.close()    # explicitly close the connection
db.connect()  # explicitly reconnect
db.execute_sql('SELECT * FROM Dataset') 
# this should work now!

(Btw, as an alternative quick fix to keep working with your MySQL database until we release the next Prodigy update, you could also try increasing the wait_timeout temporarily.)

Sure I should have some time to verify this afternoon. I’ll let you know how it goes. Thanks for the support!

@ines I think this works! With 30 seconds it didn’t fail so I upped it to 60 seconds. My MySQL server is also configure with wait_timeout = 28800. Here were my final test scripts:

Example 1: Produces the error after first sleep

import time
from peewee import MySQLDatabase

import logging
logger = logging.getLogger('peewee')
logger.setLevel(logging.DEBUG)
logger.addHandler(logging.StreamHandler())

db = MySQLDatabase('your_prodigy_db', host='mysql.hostname', user='user', password='password')
db.connect()
db.execute_sql('SELECT * FROM dataset')  # execute arbitrary query
for i in range(30):
    print(i)
    time.sleep(60)                           # sleep for the wait_timeout
    db.execute_sql('SELECT * FROM dataset')  # try query again

Example 2: Seems to not get the error

import time
from peewee import MySQLDatabase

import logging
logger = logging.getLogger('peewee')
logger.setLevel(logging.DEBUG)
logger.addHandler(logging.StreamHandler())

db = MySQLDatabase('your_prodigy_db', host='mysql.hostname', user='user', password='password')
db.connect()
db.execute_sql('SELECT * FROM dataset')  # execute arbitrary query
for i in range(30):
    print(i)
    time.sleep(60)                           # sleep for the wait_timeout
    db.close()    # explicitly close the connection
    db.connect()  # explicitly reconnect
    db.execute_sql('SELECT * FROM dataset')  # try query again

Thanks so much for testing the scripts and updating with the results! :pray: So it looks like the relevant timeout here isn’t actually the wait_timeout, but likely the connect_timeout (see here). This is also very good to know.

As a solution, I’ve now added a reconnect method to Prodigy’s Database class that closes the connection (if it’s not already closed) and then explicitly reconnects. This method is called in the controller before adding annotations. It also allows the user to handle connections more explicitly – for example, if the update function of a custom recipe needs to access the database, too, and/or performs actions that potentially take longer.

1 Like

FWIW I tried tweaking the sleep and it fails inconsistently at around 35 seconds. I’m guessing it’s 30 seconds +/- some random amount of time it takes the server to check?

I’ve tried setting wait_timeout and connect_timeout with the current terms.teach workflow I’m working with and neither seem to be making a difference. However, unlike the previous textcat.teach workflow I was using it’s not throwing an exception when it crashes so I’m hoping it the same error. I’d be happy to test out an alpha or beta release with the fix if you have one available. Otherwise it looks like sqlite is my best bet. I’ll test that out now to make sure it addresses my issues.

Thanks for the update and sorry about the frustration. We're hoping to get Prodigy v1.4.0 fully tested and completed over the weekend, so we can release it next week.

Thanks – this would be nice! If you're on macOS / OSX or Linux, we could build you a quick alpha wheel on a local machine (Windows is more difficult, because we have a separate build process). Do you want to send me a quick email to ines@explosion.ai?

Thanks FWIW sqlite does seem to have adressed the issues.

Thanks, got your email! :+1:

And yeah, I’m pretty sure this error must be somehow related to the specifics of MySQL. I guess SQLite is a little simpler in this way, because it just writes to a flat file. And as far as I know, we have some users running pretty large scale stuff with Prodigy + PostgreSQL, so I feel like it would have come up if there was a similar issue. Anyway, very keen to investigate and fix this!

I think I am getting timeout errors with my Postgres instance. There was a long-running task that was giving me “invalidSave” errors, so I checked the logs and it gave me:

22:13:57 - Socket error
Traceback (most recent call last):
   File "/home/ubuntu/chegg/writelab-aes/venv/lib/python3.6/site-packages/waitress/channel.py", line 139, in handle_write
     flush()
   File "/home/ubuntu/chegg/writelab-aes/venv/lib/python3.6/site-packages/waitress/channel.py", line 258, in _flush_some
     num_sent = self.send(chunk)
   File "/usr/lib/python3.6/asyncore.py", line 361, in send
     result = self.socket.send(data)
 TimeoutError: [Errno 110] Connection timed out

This was for Prodigy 1.4.2. I’ll try out 1.5.1.

@plusepsilon Thanks for the report and yes, definitely update if it still occurs in the latest version! The fixes we added for the MySQL error weren’t MySQL-specific, so it’s possible that this is caused by something else (and maybe needs a fix specific to Postgres?)

I did notice Prodigy uses an old version of peewee (2.10.2 vs. 3.5.1). I’m not sure if that causes any issues.