AttributeError when running Prodigy 1.9.7

After upgrading Prodigy from 1.9.0 to 1.9.7 I'm receiving the following error when trying to load the Prodigy page. It also shows a 500 Internal Server Error.

Task exception was never retrieved
future: <Task finished coro=<RequestResponseCycle.run_asgi() done, defined at /usr/local/miniconda3/lib/python3.7/site-packages/uvicorn/protocols/http/httptools_impl.py:383> exception=AttributeError("'_ConnectionLocal' object has no attribute '_state'")>
Traceback (most recent call last):
File "/usr/local/miniconda3/lib/python3.7/site-packages/uvicorn/protocols/http/httptools_impl.py", line 388, in run_asgi
    self.logger.error(msg, exc_info=exc)
File "/usr/local/miniconda3/lib/python3.7/logging/__init__.py", line 1412, in error
    self._log(ERROR, msg, args, **kwargs)
File "/usr/local/miniconda3/lib/python3.7/logging/__init__.py", line 1519, in _log
    self.handle(record)
File "/usr/local/miniconda3/lib/python3.7/logging/__init__.py", line 1528, in handle
    if (not self.disabled) and self.filter(record):
File "/usr/local/miniconda3/lib/python3.7/logging/__init__.py", line 762, in filter
    result = f.filter(record)
File "cython_src/prodigy/util.pyx", line 120, in prodigy.util.ServerErrorFilter.filter
File "/usr/local/miniconda3/lib/python3.7/site-packages/uvicorn/protocols/http/httptools_impl.py", line 385, in run_asgi
    result = await app(self.scope, self.receive, self.send)
File "/usr/local/miniconda3/lib/python3.7/site-packages/uvicorn/middleware/proxy_headers.py", line 45, in __call__
    return await self.app(scope, receive, send)
File "/usr/local/miniconda3/lib/python3.7/site-packages/fastapi/applications.py", line 140, in __call__
    await super().__call__(scope, receive, send)
File "/usr/local/miniconda3/lib/python3.7/site-packages/starlette/applications.py", line 134, in __call__
    await self.error_middleware(scope, receive, send)
File "/usr/local/miniconda3/lib/python3.7/site-packages/starlette/middleware/errors.py", line 178, in __call__
    raise exc from None
File "/usr/local/miniconda3/lib/python3.7/site-packages/starlette/middleware/errors.py", line 156, in __call__
    await self.app(scope, receive, _send)
File "/usr/local/miniconda3/lib/python3.7/site-packages/starlette/middleware/cors.py", line 76, in __call__
    await self.app(scope, receive, send)
File "/usr/local/miniconda3/lib/python3.7/site-packages/starlette/middleware/base.py", line 25, in __call__
    response = await self.dispatch_func(request, self.call_next)
File "/usr/local/miniconda3/lib/python3.7/site-packages/prodigy/app.py", line 181, in reset_db_middleware
    controller.db.db.obj._state._state.set(db_state_default.copy())
AttributeError: '_ConnectionLocal' object has no attribute '_state'

Could you run pip list and post the versions of fastapi, starlette, uvicorn and peewee that you're running? It looks like the error here is happening within the peewee database, so maybe you ended up with an incompatible version after upgrading?

pip list | grep -E "prodigy|spacy|uvicorn|fastapi|starlette|peewee"
fastapi                                       0.44.1             
peewee                                        3.13.1             
prodigy                                       1.9.7              
spacy                                         2.2.3              
starlette                                     0.12.9             
uvicorn                                       0.10.8

I also just played around with the dependencies and tried to upgrade all of them. But the error remains.

Hmm, this is quite strange. I don't see any problem with the versions, and I've been trying to replicate it but haven't been able to. :confused:

By any chance, are you implementing a custom database class based on Peewee? That's the only way I can imagine this happening.

If that's the case and you have a custom Peewee database class, you could try re-using the custom connection state class prodigy.components.db.PeeweeConnectionState.

It would be something like:

from prodigy.components.db import PeeweeConnectionState
from peewee import Database


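class YourCustomDB(Database):
    def __init__(self, *args, **kwargs):
        # initialize your custom database
        super().__init__(*args, **kwargs)
        # replace the default connection state with Prodigy's async-safe one
        self._state = PeeweeConnectionState()

    def get_dataset(self, name, default=None):
        # get examples for a given dataset name
        raise NotImplementedError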

The PeeweeConnectionState class monkeypatches Peewee to make it behave correctly in an async environment (anything using async / await, asyncio, etc.; in this case, FastAPI).


Otherwise, which database are you using? Do you have any specific settings for it?

I use a PostgreSQL database as described in the documentation:

from playhouse.postgres_ext import PostgresqlExtDatabase
from prodigy.components.db import Database

prodigy_db = Database(
    PostgresqlExtDatabase(
        config['db/prodigy/database'],
        user=config['db/prodigy/user'],
        password=config['db/prodigy/password'],
        host=config['db/prodigy/host']
    )
)

Is there anything I can try to narrow this down?

Thanks @simon.gurcke. Got it. That makes sense. I think I know where the problem is happening.

I'm going to investigate how to improve all this. In the meantime, here's a workaround you can try.

You can make sure that the DB uses that PeeweeConnectionState().

For that, you would make it a custom DB that inherits from PostgresqlExtDatabase and sets that DB state.

It would look something like:

from playhouse.postgres_ext import PostgresqlExtDatabase
from prodigy.components.db import Database, PeeweeConnectionState


class ModernPostgresqlExtDatabase(PostgresqlExtDatabase):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._state = PeeweeConnectionState()


prodigy_db = Database(
    ModernPostgresqlExtDatabase(
        config['db/prodigy/database'],
        user=config['db/prodigy/user'],
        password=config['db/prodigy/password'],
        host=config['db/prodigy/host']
    )
)

Now, a bit more info on all that:

Peewee internally relies on some heavy assumptions about threads, so it doesn't work well with the async features of modern Python, even if the developer handles blocking operations internally.

Prodigy's internal API app uses FastAPI, which is async underneath. To make Peewee compatible with it, we monkeypatch it to make it use the standard contextvars instead of thread locals.

If you want to read more (a lot more :sweat_smile: ) about it all, check this section in FastAPI's docs: https://fastapi.tiangolo.com/advanced/sql-databases-peewee/ It explains how to monkeypatch Peewee to make it compatible, the underlying problem, how it works, etc.
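
To illustrate the idea of keeping connection state in contextvars rather than thread locals, here is a minimal standard-library sketch. It is not Prodigy's actual PeeweeConnectionState and does not import Peewee; the names ContextVarConnectionState, DB_STATE_DEFAULT and handle_request are hypothetical, and the state fields are an assumption modelled on the FastAPI docs linked above.

```python
import asyncio
from contextvars import ContextVar

# Assumed per-connection fields; the real set is defined by Peewee.
DB_STATE_DEFAULT = {"closed": None, "conn": None, "ctx": None, "transactions": None}
db_state = ContextVar("db_state", default=DB_STATE_DEFAULT.copy())


class ContextVarConnectionState:
    """Attribute-style access to a dict that lives in a ContextVar."""

    def __init__(self):
        # Bypass our own __setattr__ so _state is a real instance attribute.
        object.__setattr__(self, "_state", db_state)

    def __setattr__(self, name, value):
        self._state.get()[name] = value

    def __getattr__(self, name):
        try:
            return self._state.get()[name]
        except KeyError:
            raise AttributeError(name) from None


async def handle_request(state, conn_name):
    # Give the current task its own fresh state dict before touching it.
    state._state.set(DB_STATE_DEFAULT.copy())
    state.conn = conn_name
    await asyncio.sleep(0)  # yield so the other task runs in between
    return state.conn


async def main():
    state = ContextVarConnectionState()
    # gather() runs each coroutine as a task with its own copy of the context.
    return await asyncio.gather(
        handle_request(state, "conn-a"),
        handle_request(state, "conn-b"),
    )


results = asyncio.run(main())
print(results)
```

Each task sees only the connection it stored, even though both share one state object and run on the same thread. The reset at the top of handle_request plays the same role as the `_state._state.set(db_state_default.copy())` call in the traceback, which presumably fails when the database still carries Peewee's stock `_ConnectionLocal` state, since that object has no inner `_state`.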

Thanks so much for figuring this out @tiangolo. I’ll test it tomorrow.

Sounds like peewee is not a good fit for Prodigy if it only works with monkeypatching. Especially for a library that other people write code on top of, there are likely to be unwanted side effects. It also seems to work only in Python 3.7, according to the FastAPI docs.

Any plans to move away from it?

Yeah, the current state of Peewee is a bit sad :disappointed:

Actually, we are having some internal conversations about it :sweat_smile: :nerd_face: , but nothing is defined yet...

Another reason against peewee might be that it really messes up the JSON (the Prodigy example data) in a PostgreSQL database by storing it in a weird bytea format, even though there is a native jsonb data type. That makes working with it incredibly painful.

Hmm, good point. Thanks for the feedback.

Using the PeeweeConnectionState as mentioned above indeed fixes the error and I'm now able to run Prodigy 1.9.7. Thanks for the help!

@simon.gurcke Thanks for the update and glad it works now! We'll include an official fix for this in the upcoming version :slightly_smiling_face: And as @tiangolo mentioned above, we're evaluating different ORM options that also play nicely with modern async Python. (peewee was alright in the beginning and let us move quickly and support different database options... but it's not very future-proof.)

@ines @tiangolo Just wanted to check in to see if any progress has been made on moving away from Peewee and/or storing example data in native JSONB format in PostgreSQL. We'd like to implement more analytics for our annotation database and it would be helpful if we could query it directly, rather than pulling all records into Python first, which is becoming increasingly unfeasible.

It's definitely on our list for Prodigy v2 and something we'd like to have available for Prodigy Teams as well. But I couldn't give you an ETA yet.

In the meantime, one thing you could do is implement your own Database class without peewee that exposes the same methods. You can then plug it in just like any other database wrapper in Prodigy and handle adding and querying yourself. For inspiration, here's an example of this in the mondigy package: