DB switches from Custom back to SQLite

I'm not sure how this could happen, but here are the symptoms (verbose logging):

  1. I start a custom recipe using a custom DB (updated version of the mondigy project, notably adding Database inheritance - see below), and logging indicates that it is being used properly
class AnnotationDatabase(Database):
  2. When CONTROLLER: Recipe Config prints, it does not include the DB
  3. If I wrap the DB in a subclass that prints out all attribute access attempts, it does use the DB initially:
attribute: dataset_collection
attribute: add_dataset
attribute: db_name
Added dataset dataset_out to database Custom Database.
attribute: get_meta
attribute: add_dataset
attribute: count_dataset
  4. By the time I reach /give_answers, it seems to be back to using SQLite somehow and has to create datasets, no longer printing out attribute access attempts:
15:35:55: CONTROLLER: Receiving 1 answers
15:35:55: Controller: received answers for session 2023-11-20_15-34-00: 1
15:35:59: DB: Creating unstructured dataset '2023-11-20_15-34-00'
15:35:59: DB: Creating unstructured dataset 'dataset_out'
15:35:59: DB: Added 1 examples to 2 datasets
15:35:59: CONTROLLER: Added 1 answers to dataset 'dataset_out' in database SQLite

Here is the wrapping code I'm talking about:

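default_db = None


class ProxyDatabase(AnnotationDatabase):
    def __init__(self, *args):
        # Keep the real database in a module-level global so the lookups in
        # __getattribute__ below do not recurse through the proxy itself.
        global default_db
        default_db = AnnotationDatabase(*args)

    def __getattribute__(self, item):
        # Log every attribute access, then delegate to the real database.
        print(f"attribute: {item}")
        return getattr(default_db, item)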

Prodigy v1.14.9. What am I doing wrong here?

Update:
5. Printing config["db"] from my make_update() and validate_answer() hooks confirms it has not changed, and yet Prodigy uses SQLite (#4) for some reason.
6. Removing the inheritance from Database also does not change things.
7. I removed my ProxyDatabase object and just added logging statements to every method of the AnnotationDatabase object to see what was being used, but they told the same story (#3) as the ProxyDatabase object.
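
Adding a logging statement to every method (point 7) can also be done in one place with a class decorator. The following is a rough sketch, not the code used above: `log_calls` and `StubDatabase` are hypothetical names, and the stub methods only imitate the method names seen in the log output.

```python
import functools
import inspect


def log_calls(cls):
    """Class decorator: wrap each non-dunder method so calls are printed and recorded."""
    cls.call_log = []

    def wrap(name, method):
        @functools.wraps(method)
        def wrapper(self, *args, **kwargs):
            cls.call_log.append(name)
            print(f"called: {name}")
            return method(self, *args, **kwargs)

        return wrapper

    for name, attr in list(vars(cls).items()):
        # Only wrap plain functions defined directly on the class.
        if inspect.isfunction(attr) and not name.startswith("__"):
            setattr(cls, name, wrap(name, attr))
    return cls


@log_calls
class StubDatabase:
    def add_dataset(self, name):
        return name

    def count_dataset(self, name):
        return 0


db = StubDatabase()
db.add_dataset("dataset_out")
db.count_dataset("dataset_out")
```

If the recorded call list stays empty while answers are still being saved, that points to a different database object being used, which matches the symptom in #4.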

I don't think that custom DBs work as documented across a couple of different dimensions, but for whoever comes after, I got it working by setting the default DB via prodigy.json and adding the DB to the databases registry using:

from prodigy.util import registry

# Register an instance of the custom database under the name used in prodigy.json.
registry.databases.register("mongodb", func=AnnotationDatabase())

This is a bit confusing because register is used as a decorator elsewhere, but here func accepts an instance of the class.
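
A register method that works both as a decorator and as a direct call with func is a common registry pattern. The toy registry below shows how the two forms can coexist. It is an assumption-laden sketch: `Registry`, `StubDatabase` and `make_db` are hypothetical, and nothing here is taken from Prodigy's actual (compiled) implementation.

```python
class Registry:
    """Toy name -> object registry (hypothetical, not Prodigy's implementation)."""

    def __init__(self):
        self._items = {}

    def register(self, name, func=None):
        def do_register(obj):
            self._items[name] = obj
            return obj

        # Direct call: register("x", func=obj) stores obj immediately.
        if func is not None:
            return do_register(func)
        # Decorator use: @register("x") returns the function that stores obj.
        return do_register

    def get(self, name):
        return self._items[name]


class StubDatabase:
    db_name = "Custom Database"


databases = Registry()

# Direct-call form, passing an already-built instance.
databases.register("mongodb", func=StubDatabase())


# Decorator form, registering a factory function instead.
@databases.register("mongodb_factory")
def make_db():
    return StubDatabase()
```

In this toy version the registry stores whatever it is given, so an instance and a factory function are both accepted. That would explain why passing a class instance to func works, if the real registry behaves similarly.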

prodigy.json now includes:

{
  "db": "mongodb"
}

It now seems to work as expected.

This post was critical information that I couldn't get to by looking at the prodigy source because prodigy.util is compiled now.
