Problems exposing a custom Database class via an entry point?

I have a custom class that exposes a MongoDB database with the same API as Database, and I'm trying to make it accessible through an entry point.

For reference, here is my setup.py file for the package I'm writing.

from setuptools import setup

setup(
    name='mondigy',
    version='0.1',
    entry_points={
        'console_scripts': ['mondigy = mondigy.__main__:main'],
        'prodigy_loaders': ['mondigy.loader = mondigy.loader:mongo_loader'],
        'prodigy_db': ['mondigy.db = mondigy.database:AnnotationDatabase']
    }
)

mondigy/database.py contains a class called AnnotationDatabase that implements the Database API, and mondigy/loader.py contains a recipe function as in the custom loader guide. However, when I start Prodigy with the MongoDB loader and the MongoDB database, a bunch of examples from my MongoDB are printed and then it throws the following error:

Traceback (most recent call last):
File "/Users/jdagdelen/opt/anaconda3/envs/prodigy/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"main", mod_spec)
File "/Users/jdagdelen/opt/anaconda3/envs/prodigy/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/Users/jdagdelen/opt/anaconda3/envs/prodigy/lib/python3.6/site-packages/prodigy/main.py", line 60, in
controller = recipe(*args, use_plac=True)
File "cython_src/prodigy/core.pyx", line 231, in prodigy.core.recipe.recipe_decorator.recipe_proxy
File "cython_src/prodigy/core.pyx", line 71, in prodigy.core.Controller.init
File "cython_src/prodigy/core.pyx", line 189, in prodigy.core.Controller.connect_db
TypeError: argument of type 'type' is not iterable

It seems like it expects the database class to be iterable?

The documentation on using entry points seems to be thinner on the new documentation site than it used to be in the README.html. Can someone share the relevant section from the old documentation?

Hi! This looks like a cool project :slightly_smiling_face:

Which version of Prodigy are you using? I've been trying to find where this problem could be happening and I don't see anything that could explain it :thinking: The Database class shouldn't have to be iterable.

One thing to note maybe: If a custom database is provided, Prodigy currently expects an already instantiated object (because it can't make any assumptions about how your database wants to be initialized – the built-in one for instance just holds a peewee object). So maybe that's the problem? If you need access to any user settings, you can call Prodigy's get_config() to access the prodigy.json.
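To illustrate the difference between pointing at the class and pointing at an instance, here's a minimal sketch. The AnnotationDatabase below is a hypothetical in-memory stand-in, not Prodigy's actual Database API, and the membership check is only my guess at the kind of operation that could produce the error in the traceback above.

```python
class AnnotationDatabase:
    """Hypothetical stand-in for a custom database class (not Prodigy's API)."""

    def __init__(self, uri="mongodb://localhost:27017"):
        # A real implementation would open a MongoDB connection here.
        self.uri = uri
        self._datasets = {}

    def __contains__(self, name):
        # Supports `name in db` on an *instance* of the class.
        return name in self._datasets

    def add_dataset(self, name):
        # Create the dataset if it doesn't exist yet and return its examples.
        return self._datasets.setdefault(name, [])


# Module-level instance: an entry point can reference this object directly,
# e.g. 'mondigy.db = mondigy.database:db' (the name `db` is an assumption).
db = AnnotationDatabase()
db.add_dataset("my_dataset")

print("my_dataset" in db)  # membership test works on the instance

try:
    "my_dataset" in AnnotationDatabase  # same test against the class itself
except TypeError as err:
    print(f"TypeError: {err}")
```

Since an entry point can reference any module-level attribute, creating the instance at import time and pointing the prodigy_db entry at it is one way to hand over an already instantiated object.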

Here's the section we previously had on entry points in the README – maybe we should add the example back! I think I ended up replacing that with a reference to the sense2vec package because that actually showed a full working example.

Plugging in your own loaders or database connectors usually required writing a custom recipe – even if you don't want to change anything about the built-in
recipes themselves. Prodigy v1.5.0 introduces a new way of making your own
functions available to built-in recipes and CLI commands, without having to
modify the source. All you need is a simple Python package that exposes your
components via entry_points, and is installed in the same environment as
Prodigy. For a quick introduction to entry points in Python, we recommend
this blog post.

Consider the following structure of your prodigy_utils package:

└── prodigy_utils       # package directory
    ├── recipes.py      # recipe functions
    ├── loaders.py      # loader functions
    ├── db.py           # database classes
    └── setup.py        # package setup

In the setup.py, you can then define a dictionary of entry_points for one
or more of the available categories. Each category is mapped to a list of
strings in the format [name] = module:function. For example,
'custom_json = loaders:CustomJSON' will make the function or class
CustomJSON in the file loaders.py available via the name custom_json.

from setuptools import setup

setup(
    name='prodigy_utils',
    entry_points={
        'prodigy_recipes': [
            'custom_recipe = recipes:custom_recipe'
        ],
        'prodigy_loaders': [
            'database = loaders:DatabaseLoader',
            'custom_json = loaders:CustomJSON'
        ],
        'prodigy_db': [
            'mongodb = db:MongoDBLoader'
        ]
    },
    install_requires=[
        'prodigy>=1.4.3,<1.5.0'
    ]
)
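If it helps to see how a name = module:attribute string gets resolved, here's a small sketch using the standard library's importlib.metadata (Python 3.8+). It isn't how Prodigy loads entry points internally, and it points at json:loads only so that it runs without any custom package installed.

```python
from importlib.metadata import EntryPoint

# Build an entry point by hand; normally setuptools records these at install time.
ep = EntryPoint(name="custom_json", value="json:loads", group="prodigy_loaders")

# load() imports the module `json` and returns its attribute `loads`.
loader = ep.load()

print(ep.name, loader('{"text": "hello"}'))
```

Whatever object the attribute refers to is returned as-is, so it can be a function, a class or an instance.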

Prodigy checks the following entry point categories:

| Name | Description |
| --- | --- |
| prodigy_recipes | Custom recipe functions, one entry per recipe. |
| prodigy_loaders | File or API loaders. |
| prodigy_db | Database connectors that follow the same API as Prodigy's Database class. |

To install your package and expose the entry points, navigate to the package
directory and run the setup – for example, in development mode:

cd prodigy_utils
python setup.py develop

If your package is installed in the same environment, you won't have to do
anything else. Prodigy will automatically find and load your entry points. To
verify that your entry points were read in correctly, you can set the
PRODIGY_LOGGING=basic environment variable. On startup, Prodigy will log how
many components were added via entry points.
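Independently of Prodigy's logging, you can also check what the environment has registered under those three categories. This is just a sketch with importlib.metadata (Python 3.8+); the helper name list_prodigy_entry_points is made up for this example.

```python
from importlib.metadata import entry_points

GROUPS = ("prodigy_recipes", "prodigy_loaders", "prodigy_db")


def list_prodigy_entry_points():
    """Return {group: [(name, value), ...]} for the current environment."""
    eps = entry_points()
    found = {}
    for group in GROUPS:
        if hasattr(eps, "select"):
            # Python 3.10+ exposes a select() method for filtering by group.
            selected = eps.select(group=group)
        else:
            # Python 3.8 and 3.9 return a plain dict keyed by group name.
            selected = eps.get(group, [])
        found[group] = sorted((ep.name, ep.value) for ep in selected)
    return found


for group, items in list_prodigy_entry_points().items():
    print(group, items)
```

An empty list for a group means nothing is registered under it in this environment, which usually points to the package not being installed there.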

Ah, yep! That was the issue. I was passing the class itself, not an instantiated object. Thanks for your help! If you're interested, the finished product is on GitHub at jdagdelen/mondigy: a small component for using a MongoDB database as a data loader for Prodigy annotation applications.

Oh wow, thanks for sharing and developing this, this is so cool and I'm sure others will find it useful as well :heart_eyes: (I think we finally need a "Prodigy Universe" type of page in the docs so we can showcase integrations like this!)