Losing samples on browser refresh

I just had a closer look and I think what's happening is this: when force_stream_order is enabled, Prodigy uses a different type of logic to orchestrate the stream of examples. (In this scenario, it needs to be a bit more complex, because Prodigy has to keep track of what's already been sent out to which session, what's coming back and what's already in the DB, so it can re-send the questions.) For some reason, the database object doesn't seem to get passed through correctly here, so Prodigy falls back to connecting to the default DB specified in the prodigy.json. It should be easy to fix and we'll include the fix in the next release.

In the meantime, a pretty simple (and actually quite elegant) workaround would be to just register your custom database with a string name, so you can refer to it in your prodigy.json. The following should work:

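from playhouse.postgres_ext import PostgresqlExtDatabase
from prodigy.util import registry
from prodigy.components.db import Database

# Connect via peewee, then wrap the connection in Prodigy's Database class
psql_db = PostgresqlExtDatabase(...)  # etc.
db = Database(psql_db, "custom_postgres", "Custom PostgreSQL Database")
# Register the database under a string name so prodigy.json can refer to it
registry.databases("custom_postgres", func=db)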

Edit: Forgot one line in my code example above (also see here).

The code can go before or inside your custom recipe, and you won't need to return "db" from the recipe anymore. Instead, you should now be able to write "db": "custom_postgres" in your prodigy.json, and it will be the default database used by the recipe and internally.
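
For reference, here's a minimal sketch of what that prodigy.json entry would look like, written as a small Python snippet that builds and prints the JSON. It assumes "db" is the only setting you need to change; the `config` and `config_text` names are just for illustration, and any other settings you already have in your prodigy.json would stay alongside it.

```python
import json

# Illustrative only: the single setting that points Prodigy at the
# database registered under the string name "custom_postgres".
# Other settings in your existing prodigy.json are omitted here.
config = {"db": "custom_postgres"}

# Serialize to the JSON text you would put in prodigy.json
config_text = json.dumps(config, indent=2)
print(config_text)
```

The name in the config has to match the string the database was registered under, otherwise Prodigy has no way to look it up.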