Custom model requirements

To integrate any model with Prodigy’s active learning workflow, you mainly need to expose two functions:

  • a predict function that takes an iterable stream of examples in Prodigy’s JSON format, scores them and yields (score, example) tuples
  • an update callback that takes a list of annotated examples and updates the model accordingly
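For example, the data flowing through those two hooks could look roughly like this – a minimal sketch where the "text", "label" and "answer" fields are the ones the classification interface relies on, and your own tasks may carry extra metadata:

# what a task coming out of the loader might look like
incoming = {'text': 'This bagel place is amazing'}

# what predict() yields for the sorter: (score, example) tuples
scored = (0.45, {'text': 'This bagel place is amazing', 'label': 'POSITIVE'})

# what update() receives back: the same example plus the annotation decision
annotated = {'text': 'This bagel place is amazing', 'label': 'POSITIVE',
             'answer': 'accept'}  # one of 'accept', 'reject' or 'ignore'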

Here’s a pseudocode example of how this could look in a custom text classification recipe. How you implement the individual components of course depends on the specifics of your model.

import copy

import prodigy
from prodigy.components.loaders import JSONL
from prodigy.components.sorters import prefer_uncertain

@prodigy.recipe('custom')
def custom_recipe(dataset, source):
    stream = JSONL(source)     # load incoming examples from a JSONL file
    model = load_your_model()  # placeholder: however you load your model

    def predict(stream):
        # Score each incoming example and yield one (score, example)
        # tuple per label, so the sorter can decide which tasks to send out.
        for eg in stream:
            predictions = get_predictions_from_model(eg)
            for label, score in predictions:
                example = copy.deepcopy(eg)
                example['label'] = label
                yield (score, example)

    def update(answers):
        # Called with batches of annotated examples, each carrying an
        # 'answer' key set to 'accept', 'reject' or 'ignore'.
        for eg in answers:
            if eg['answer'] == 'accept':
                update_model_with_accept(eg)
            elif eg['answer'] == 'reject':
                update_model_with_reject(eg)
        loss = get_loss()
        return loss

    return {
        'dataset': dataset,                           # dataset to save annotations to
        'view_id': 'classification',                  # annotation interface to use
        'stream': prefer_uncertain(predict(stream)),  # sorted stream of scored tasks
        'update': update                              # callback to update the model
    }
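
Assuming you save the recipe above to a file – recipe.py, my_dataset and examples.jsonl below are just placeholder names – you'd typically start the server by pointing the -F flag at that file:

prodigy custom my_dataset ./examples.jsonl -F recipe.py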

You can also find more details on the expected formats and component APIs in your PRODIGY_README.html or in the custom recipes workflow.
