MySQL Issue

Hi,

I am trying to use MySQL to store Prodigy annotations, but saving the annotations fails. I have included the error printed in the terminal at the bottom.

The steps I took:

  1. Install prodigy
  2. Install pymysql
  3. Fill in username, password in prodigy.json
  4. Create dataset
  5. Run prodigy ner.manual
  6. Open the Prodigy web UI, annotate some texts and accept them
  7. Click the save button

Then the error occurs. I checked the MySQL database and can see that the dataset is created, but the annotations are not saved.

I have also run the test_database.py from the documentation, and it works without any issue.

The full error log:

/home/username/anaconda3/lib/python3.7/site-packages/pymysql/cursors.py:170: Warning: (3090, "Changing sql mode 'NO_AUTO_CREATE_USER' is deprecated. It will be removed in a future release.")
result = self._query(query)
:sparkles: Starting the web server at http://: ...
Open the app in your browser and start annotating!
Task exception was never retrieved
future: <Task finished coro=<RequestResponseCycle.run_asgi() done, defined at /home/username/anaconda3/lib/python3.7/site-packages/uvicorn/protocols/http/httptools_impl.py:383> exception=Error('Already closed')>
Traceback (most recent call last):
  File "/home/username/anaconda3/lib/python3.7/site-packages/uvicorn/protocols/http/httptools_impl.py", line 388, in run_asgi
    self.logger.error(msg, exc_info=exc)
  File "/home/username/anaconda3/lib/python3.7/logging/__init__.py", line 1407, in error
    self._log(ERROR, msg, args, **kwargs)
  File "/home/username/anaconda3/lib/python3.7/logging/__init__.py", line 1514, in _log
    self.handle(record)
  File "/home/username/anaconda3/lib/python3.7/logging/__init__.py", line 1523, in handle
    if (not self.disabled) and self.filter(record):
  File "/home/username/anaconda3/lib/python3.7/logging/__init__.py", line 751, in filter
    result = f.filter(record)
  File "cython_src/prodigy/util.pyx", line 120, in prodigy.util.ServerErrorFilter.filter
  File "/home/username/anaconda3/lib/python3.7/site-packages/uvicorn/protocols/http/httptools_impl.py", line 385, in run_asgi
    result = await app(self.scope, self.receive, self.send)
  File "/home/username/anaconda3/lib/python3.7/site-packages/uvicorn/middleware/proxy_headers.py", line 45, in __call__
    return await self.app(scope, receive, send)
  File "/home/username/anaconda3/lib/python3.7/site-packages/fastapi/applications.py", line 140, in __call__
    await super().__call__(scope, receive, send)
  File "/home/username/anaconda3/lib/python3.7/site-packages/starlette/applications.py", line 134, in __call__
    await self.error_middleware(scope, receive, send)
  File "/home/username/anaconda3/lib/python3.7/site-packages/starlette/middleware/errors.py", line 178, in __call__
    raise exc from None
  File "/home/username/anaconda3/lib/python3.7/site-packages/starlette/middleware/errors.py", line 156, in __call__
    await self.app(scope, receive, _send)
  File "/home/username/anaconda3/lib/python3.7/site-packages/starlette/middleware/cors.py", line 84, in __call__
    await self.simple_response(scope, receive, send, request_headers=headers)
  File "/home/username/anaconda3/lib/python3.7/site-packages/starlette/middleware/cors.py", line 140, in simple_response
    await self.app(scope, receive, send)
  File "/home/username/anaconda3/lib/python3.7/site-packages/starlette/exceptions.py", line 73, in __call__
    raise exc from None
  File "/home/username/anaconda3/lib/python3.7/site-packages/starlette/exceptions.py", line 62, in __call__
    await self.app(scope, receive, sender)
  File "/home/username/anaconda3/lib/python3.7/site-packages/starlette/routing.py", line 590, in __call__
    await route(scope, receive, send)
  File "/home/username/anaconda3/lib/python3.7/site-packages/starlette/routing.py", line 208, in __call__
    await self.app(scope, receive, send)
  File "/home/username/anaconda3/lib/python3.7/site-packages/starlette/routing.py", line 41, in app
    response = await func(request)
  File "/home/username/anaconda3/lib/python3.7/site-packages/fastapi/routing.py", line 129, in app
    raw_response = await run_in_threadpool(dependant.call, **values)
  File "/home/username/anaconda3/lib/python3.7/site-packages/starlette/concurrency.py", line 25, in run_in_threadpool
    return await loop.run_in_executor(None, func, *args)
  File "/home/username/anaconda3/lib/python3.7/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/username/anaconda3/lib/python3.7/site-packages/prodigy/app.py", line 417, in give_answers
    controller.db.reconnect()
  File "/home/username/anaconda3/lib/python3.7/site-packages/prodigy/components/db.py", line 245, in reconnect
    self.db.close()
  File "/home/username/anaconda3/lib/python3.7/site-packages/peewee.py", line 3017, in close
    self._close(self._state.conn)
  File "/home/username/anaconda3/lib/python3.7/site-packages/peewee.py", line 3023, in _close
    conn.close()
  File "/home/username/anaconda3/lib/python3.7/site-packages/pymysql/connections.py", line 354, in close
    raise err.Error("Already closed")
pymysql.err.Error: Already closed

Thanks for the report! Which version of Prodigy are you using? You can check by running prodigy stats. If you're not on the latest version, can you test if upgrading solves the issue? It looks like it's related to the async API, and we might have already fixed the underlying problem.

I am using Prodigy version 1.9.4, peewee version 3.13.1 and pymysql version 0.9.3.
As I mentioned in my first post, commands such as db-in, db-out and stats work, but saving annotations with ner.manual fails.

I just tried Prodigy version 1.8.5, peewee version 2.10.2 and pymysql version 0.9.3. There is NO problem saving the annotations to MySQL.

Thanks for the report @fernandop!

We found the cause in a side effect from the new async/threading features.

In short, under certain conditions:

  • A thread could check and see that a DB connection is open, right before trying to close it.
  • Then, right after that, another thread could close that same DB connection.
  • And after that, the original thread that just found the DB connection "open" tries to close it, but it was already closed by the other thread.
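The sequence above is a classic check-then-act race. Here is a minimal sketch that forces that exact ordering with two threads. It is only an illustration: `FakeConnection` and `demonstrate_race` are hypothetical names, and the fake connection merely imitates a driver that raises when closed twice. It is not Prodigy's or pymysql's code.

```python
import threading


class FakeConnection:
    """Hypothetical stand-in for a driver connection that raises on a second close."""

    def __init__(self):
        self.closed = False

    def close(self):
        if self.closed:
            raise RuntimeError("Already closed")
        self.closed = True


def demonstrate_race():
    """Force the three-step ordering described above and collect the resulting errors."""
    conn = FakeConnection()
    a_checked = threading.Event()  # set once thread A has seen the connection as open
    b_closed = threading.Event()   # set once thread B has closed the connection
    errors = []

    def thread_a():
        if not conn.closed:        # step 1: A checks and sees "open"
            a_checked.set()
            b_closed.wait()        # step 2 happens in thread B while A waits
            try:
                conn.close()       # step 3: A closes a connection B already closed
            except RuntimeError as exc:
                errors.append(str(exc))

    def thread_b():
        a_checked.wait()
        conn.close()
        b_closed.set()

    threads = [threading.Thread(target=thread_a), threading.Thread(target=thread_b)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors
```

In a real server the interleaving is not forced by events, so the failure only shows up under certain timing, which matches the "under certain conditions" wording above.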

The fix is already implemented and will be available in a bug-fix release in the next few days.

In the meantime, if you want to temporarily patch it locally, we can guide you in the process. It's actually just a few lines of code.

Just released v1.9.5, which should resolve the issue! :slightly_smiling_face:

Hello Ines. Thx for the update. Unfortunately the update still gives the same error as above. Could you please provide the quick fix? :dizzy:

Hi,

I have updated Prodigy to version 1.9.5, but I still get the same error as above.
I also created a fresh Python environment and installed only Prodigy 1.9.5 and pymysql 0.9.3, and the problem is still the same.

Could you share the temporary patch, please?

The patch is what's included in v1.9.5: it adds a threading.Lock() and acquires it around the two blocks that try to reconnect to the database in db.py. If that's not making a difference, then there's possibly something else going on.
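For readers unfamiliar with the pattern, here is a rough sketch of what guarding a reconnect with a threading.Lock() can look like. It is not the actual code from db.py: `FakeConnection`, `Database` and `hammer` are hypothetical names made up for illustration, and the fake connection only imitates a driver that raises on a double close.

```python
import threading


class FakeConnection:
    """Hypothetical stand-in for a driver connection that raises on a second close."""

    def __init__(self):
        self.closed = False

    def close(self):
        if self.closed:
            raise RuntimeError("Already closed")
        self.closed = True


class Database:
    """Hypothetical wrapper; an assumption for illustration, not Prodigy's db.py."""

    def __init__(self):
        self._lock = threading.Lock()
        self.conn = FakeConnection()
        self.reconnects = 0

    def reconnect(self):
        # The check and the close happen under one lock, so no other thread
        # can close the connection between the two steps.
        with self._lock:
            if not self.conn.closed:
                self.conn.close()
            self.conn = FakeConnection()
            self.reconnects += 1


def hammer(db, n_threads=8, n_calls=200):
    """Call reconnect() from several threads and return any errors raised."""
    errors = []

    def worker():
        for _ in range(n_calls):
            try:
                db.reconnect()
            except RuntimeError as exc:
                errors.append(exc)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors
```

Note that a lock like this only helps when every thread shares the same connection object and goes through the same lock. Later replies in this thread report that the real cause was more involved than this.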

We ship the source of the components/db.py with Prodigy, so you could also try and comment out the calls to self.db.close() in the reconnect method and see if that resolves the problem.
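A less drastic variant of that experiment is to keep the close call but tolerate the "already closed" case. The sketch below shows the idea only. `FakeConnection`, `AlreadyClosedError` and `safe_close` are hypothetical names, and matching on the error message is an assumption based on the traceback in this thread, not something taken from Prodigy or pymysql.

```python
class AlreadyClosedError(Exception):
    """Hypothetical error type standing in for the driver's 'Already closed' error."""


class FakeConnection:
    """Hypothetical stand-in for a driver connection that raises on a second close."""

    def __init__(self):
        self.closed = False

    def close(self):
        if self.closed:
            raise AlreadyClosedError("Already closed")
        self.closed = True


def safe_close(conn):
    """Close conn; return True if this call closed it, False if it was already closed."""
    try:
        conn.close()
    except AlreadyClosedError as exc:
        # Only swallow the specific "already closed" case and re-raise anything else.
        if "Already closed" not in str(exc):
            raise
        return False
    return True
```

Swallowing the error hides the symptom rather than the cause, so this is only useful as a temporary diagnostic step.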

Hi,

I tried Python 3.6 with Prodigy 1.9.5 and there is no issue saving to MySQL, so I will stick with 3.6 for now.
The issue still persists on Python 3.7.

Thanks.

I have the same problem in my environment.
I am also using Prodigy 1.9.5 and pymysql 0.9.3.

Task exception was never retrieved
future: <Task finished coro=<RequestResponseCycle.run_asgi() done, defined at /home/user/.pyenv/versions/env-prodigy/lib/python3.7/site-packages/uvicorn/protocols/http/httptools_impl.py:383> exception=Error('Already closed')>
Traceback (most recent call last):
  File "/home/user/.pyenv/versions/env-prodigy/lib/python3.7/site-packages/uvicorn/protocols/http/httptools_impl.py", line 388, in run_asgi
    self.logger.error(msg, exc_info=exc)
  File "/home/user/.pyenv/versions/3.7.5/lib/python3.7/logging/__init__.py", line 1407, in error
    self._log(ERROR, msg, args, **kwargs)
  File "/home/user/.pyenv/versions/3.7.5/lib/python3.7/logging/__init__.py", line 1514, in _log
    self.handle(record)
  File "/home/user/.pyenv/versions/3.7.5/lib/python3.7/logging/__init__.py", line 1523, in handle
    if (not self.disabled) and self.filter(record):
  File "/home/user/.pyenv/versions/3.7.5/lib/python3.7/logging/__init__.py", line 751, in filter
    result = f.filter(record)
  File "cython_src/prodigy/util.pyx", line 120, in prodigy.util.ServerErrorFilter.filter
  File "/home/user/.pyenv/versions/env-prodigy/lib/python3.7/site-packages/uvicorn/protocols/http/httptools_impl.py", line 385, in run_asgi
    result = await app(self.scope, self.receive, self.send)
  File "/home/user/.pyenv/versions/env-prodigy/lib/python3.7/site-packages/uvicorn/middleware/proxy_headers.py", line 45, in __call__
    return await self.app(scope, receive, send)
  File "/home/user/.pyenv/versions/env-prodigy/lib/python3.7/site-packages/fastapi/applications.py", line 140, in __call__
    await super().__call__(scope, receive, send)
  File "/home/user/.pyenv/versions/env-prodigy/lib/python3.7/site-packages/starlette/applications.py", line 134, in __call__
    await self.error_middleware(scope, receive, send)
  File "/home/user/.pyenv/versions/env-prodigy/lib/python3.7/site-packages/starlette/middleware/errors.py", line 178, in __call__
    raise exc from None
  File "/home/user/.pyenv/versions/env-prodigy/lib/python3.7/site-packages/starlette/middleware/errors.py", line 156, in __call__
    await self.app(scope, receive, _send)
  File "/home/user/.pyenv/versions/env-prodigy/lib/python3.7/site-packages/starlette/middleware/cors.py", line 84, in __call__
    await self.simple_response(scope, receive, send, request_headers=headers)
  File "/home/user/.pyenv/versions/env-prodigy/lib/python3.7/site-packages/starlette/middleware/cors.py", line 140, in simple_response
    await self.app(scope, receive, send)
  File "/home/user/.pyenv/versions/env-prodigy/lib/python3.7/site-packages/starlette/exceptions.py", line 73, in __call__
    raise exc from None
  File "/home/user/.pyenv/versions/env-prodigy/lib/python3.7/site-packages/starlette/exceptions.py", line 62, in __call__
    await self.app(scope, receive, sender)
  File "/home/user/.pyenv/versions/env-prodigy/lib/python3.7/site-packages/starlette/routing.py", line 590, in __call__
    await route(scope, receive, send)
  File "/home/user/.pyenv/versions/env-prodigy/lib/python3.7/site-packages/starlette/routing.py", line 208, in __call__
    await self.app(scope, receive, send)
  File "/home/user/.pyenv/versions/env-prodigy/lib/python3.7/site-packages/starlette/routing.py", line 41, in app
    response = await func(request)
  File "/home/user/.pyenv/versions/env-prodigy/lib/python3.7/site-packages/fastapi/routing.py", line 129, in app
    raw_response = await run_in_threadpool(dependant.call, **values)
  File "/home/user/.pyenv/versions/env-prodigy/lib/python3.7/site-packages/starlette/concurrency.py", line 25, in run_in_threadpool
    return await loop.run_in_executor(None, func, *args)
  File "/home/user/.pyenv/versions/3.7.5/lib/python3.7/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/user/.pyenv/versions/env-prodigy/lib/python3.7/site-packages/prodigy/app.py", line 387, in get_session_questions
    return _shared_get_questions(req.session_id, excludes=req.excludes)
  File "/home/user/.pyenv/versions/env-prodigy/lib/python3.7/site-packages/prodigy/app.py", line 357, in _shared_get_questions
    controller.db.reconnect()
  File "/home/user/.pyenv/versions/env-prodigy/lib/python3.7/site-packages/prodigy/components/db.py", line 252, in reconnect
    self.db.close()
  File "/home/user/.pyenv/versions/env-prodigy/lib/python3.7/site-packages/peewee.py", line 3017, in close
    self._close(self._state.conn)
  File "/home/user/.pyenv/versions/env-prodigy/lib/python3.7/site-packages/peewee.py", line 3023, in _close
    conn.close()
  File "/home/user/.pyenv/versions/env-prodigy/lib/python3.7/site-packages/pymysql/connections.py", line 354, in close
    raise err.Error("Already closed")
pymysql.err.Error: Already closed

Thanks everyone for the reports!

We were able to fully replicate and fix the issue with MySQL, pymysql, and Python 3.7.

The issue was indeed related to how Peewee interacts with the async parts, but it was a bit more involved than initially thought.

The patch is already implemented and you'll receive a new release in the next few days.
