Custom model Requirements

I would like to use my own custom model (not a spaCy model, e.g. PyTorch, TensorFlow or Keras) with the Prodigy interface for active learning. What are the requirements for the model, and how do I integrate it into a custom recipe?

To make any model integrate with Prodigy’s active learning workflow, you mainly need to expose two functions:

  • a predict function that takes an iterable stream of examples in Prodigy’s JSON format, scores them and yields (score, example) tuples
  • an update callback that takes a list of annotated examples and updates the model accordingly

Here’s a pseudocode example of how this could look in a custom text classification recipe. How you implement the individual components of course depends on the specifics of your model.

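import copy
import prodigy
from prodigy.components.loaders import JSONL
from prodigy.components.sorters import prefer_uncertain

@prodigy.recipe('custom-recipe')
def custom_recipe(dataset, source):
    stream = JSONL(source)
    model = load_your_model()  # placeholder: load your own model here

    def predict(stream):
        # Score each incoming example and yield one (score, example) tuple per label
        for eg in stream:
            predictions = get_predictions_from_model(eg)  # placeholder
            for label, score in predictions:
                example = copy.deepcopy(eg)
                example['label'] = label
                yield (score, example)

    def update(answers):
        # Called with each batch of annotated examples
        for eg in answers:
            if eg['answer'] == 'accept':
                pass  # placeholder: update the model with a correct prediction
            elif eg['answer'] == 'reject':
                pass  # placeholder: update the model with an incorrect prediction
        loss = get_loss()  # placeholder
        return loss

    return {
        'dataset': dataset,
        'view_id': 'classification',
        'stream': prefer_uncertain(predict(stream)),
        'update': update
    }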
You can also find more details on the expected formats and component APIs in your PRODIGY_README.html or in the custom recipes workflow.

Thanks for your instructions. I used this method to plug in my custom PyTorch model, whose loss converges when I do batch training. But when I use it to drive the annotation with active learning, the result is close to, or even worse than, training on samples chosen at random from the whole dataset. My experiment is to predict whether the sentiment of an IMDB review is positive or negative. The experiment group uses the annotations generated by Prodigy’s active learning process, while the baseline group uses a random ordering of the whole dataset. For both groups, I trained on each successive batch of data, starting from the model trained on the previous samples.

I’m very confused by the result. It seems that an algorithm that works well for supervised learning does not necessarily work well for active learning. Is there any requirement on the custom model’s output? Right now, my custom score is the probability that the review is positive. Could you explain the logic of the `prefer_uncertain` function? I think it’s important to understand it well in order to build a more suitable model.

Just so I’m understanding your experiment correctly: what does “step” mean in the graph above? Is it the number of data samples? If so, is this training for only one epoch? What happens if you train for multiple epochs?

My more general answer: it’s true that active learning isn’t a good fit for every problem. The IMDB sentiment corpus was designed to investigate particular text classification techniques, so the dataset has several characteristics that make models converge well on the data. Specifically:

  • The texts are quite long.
  • The texts are of similarish length.
  • Exactly two classes.
  • Perfect class balance.
  • Few boundary cases.
  • Low annotation noise.

These problem characteristics make the dataset a relatively bad example for active learning, I think. The class composition is especially relevant. In many datasets you’ll have a lot of labels, with one label making up a lot of the examples, some classes that are rare but easy to predict (because the examples are all very similar), and some other classes that are easily confused.

Another thing to consider is that when doing active learning, it can matter a lot how fast your model responds to new examples. The default text classification model for Prodigy actually has a few features designed to make it learn better under an active learning regime. The most important feature is that it’s an ensemble of a unigram bag-of-words model and a CNN. During the first 10-20 weight updates, the CNN is still performing at close to chance, but the unigram bag-of-words model can already have learned a lot, because it starts off with such a useful inductive bias about the problem.

So, it’s possible that your model architecture learns a bit too slowly, and that’s one reason why your active learning might not perform well. But it might not be the decisive reason — I do think IMDB is a very tough example for active learning, so I wouldn’t be surprised if Prodigy’s default configuration actually doesn’t beat the baseline on it either. I haven’t run that experiment; I’d be interested to find out the result.
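On the question of what `prefer_uncertain` expects: the general idea behind uncertainty sampling is that a score near 0.5 means the model is least sure, so those examples are the most useful to annotate. The snippet below is only a simplified illustration of that idea, not Prodigy’s actual implementation. The names `uncertainty` and `prefer_uncertain_sketch` and the fixed `batch_size` are assumptions made up for this sketch.

```python
def uncertainty(score):
    """Map a probability in [0, 1] to an uncertainty in [0, 1].

    A score of 0.5 is maximally uncertain (1.0); 0.0 and 1.0 are certain (0.0).
    """
    return 1.0 - abs(score - 0.5) * 2.0


def prefer_uncertain_sketch(scored_stream, batch_size=4):
    """Hypothetical stand-in for an uncertainty sorter.

    Buffers (score, example) tuples in small batches and emits the examples
    of each batch with the most uncertain ones first.
    """
    batch = []
    for score, example in scored_stream:
        batch.append((uncertainty(score), example))
        if len(batch) == batch_size:
            batch.sort(key=lambda item: item[0], reverse=True)
            for _, eg in batch:
                yield eg
            batch = []
    # Emit whatever is left over at the end of the stream
    batch.sort(key=lambda item: item[0], reverse=True)
    for _, eg in batch:
        yield eg
```

With a score that means “probability of positive”, this kind of sorter has something to work with only if the model produces a spread of scores. If almost every review is scored close to 0 or 1, or everything sits near 0.5, the ordering carries very little information.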

Hi @ines @honnibal
If there are multiple predictions for one example, will we get that example multiple times for annotation? If not, how is the score calculated for that particular example?

Also, in my use case, I am updating the model in the loop after every n sentences. I want the predict method to be called whenever my model is updated. Is there any way to trigger the predict method externally?
From what I understood from the documentation, the stream should be updated after every chunk of the given batch_size.


Hi @akshitasood63,

If you have a custom training loop, you can control the flow of questions that Prodigy asks you exactly how you want them. Specifically, your recipe just needs to return a dictionary, and one of the items will be the "stream". This can be a function implementing a generator, and you can yield out whatever examples you want from it. So, if you have a model that predicts scores for multiple classes, and you want to ask a different question for each class, you can definitely do that. You would just yield multiple tasks from your generator for each item in your input.
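As a rough sketch of what “yield multiple tasks for each item” could look like: the generator below emits one question per label for every incoming example. The label set and the `score_labels` function are hypothetical stand-ins for a real model, and the uniform scores are just placeholders.

```python
import copy

LABELS = ["POSITIVE", "NEGATIVE", "NEUTRAL"]  # hypothetical label set


def score_labels(text):
    """Hypothetical stand-in for a real model: returns one score per label."""
    return {label: 1.0 / len(LABELS) for label in LABELS}


def one_task_per_label(stream):
    """Yield a separate (score, task) tuple for every label of every input."""
    for eg in stream:
        for label, score in score_labels(eg["text"]).items():
            # Copy the example so each task carries its own label
            task = copy.deepcopy(eg)
            task["label"] = label
            yield (score, task)
```

Each input text then shows up once per label in the annotation queue, and each of those tasks has its own score, so there is no need to combine the per-label scores into a single number.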

You can also control the logic that gets executed on update inside the "update" callback. If you wish to make predictions, you can run the model then. Note that if you just want the updates to be reflected in the model that’s running in the questions loop, you don’t normally need to do anything special. The update callback can just update the model in place, and then as the data is streaming through your model, your model will be using the updated weights.

Thanks @honnibal
I now have a pretty clear idea of how to handle the multi-class problem.
Now I want the predictions from the updated model to be reflected in the questions that are fed to the Prodigy UI, so I was editing the stream in the predict function and then yielding it. It seems that the predict function is not called for every batch of questions. Can you give me some insight into how this works?

So, should I yield the updated stream in update instead of in the predict function?

No, the data should always be yielded out in the generator you pass in as the "stream". However, since it’s a generator, it can respond to state changes – for example, if you update your model with each batch of answers you receive, that model will score the stream differently when you predict the incoming new examples. So what’s sent out for annotation will change as the model changes.

Here’s an example that shows this idea with a “dummy model”:
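A minimal sketch of such a dummy model, using only the standard library, could look like the following. `DummyModel`, its word-counting “scoring” and the two example texts are all made up for illustration. The point is only that `predict` is a generator: it is called once, and each example is scored lazily, at the moment it is pulled from the stream, with whatever state the model has by then.

```python
class DummyModel:
    """Hypothetical toy model: scores a text by its share of known positive words."""

    def __init__(self):
        self.positive_words = set()

    def predict(self, text):
        words = text.lower().split()
        if not words:
            return 0.5
        hits = sum(1 for word in words if word in self.positive_words)
        return 0.5 + 0.5 * hits / len(words)

    def update(self, answers):
        # Update the model in place from the accepted examples
        for eg in answers:
            if eg["answer"] == "accept":
                self.positive_words.update(eg["text"].lower().split())


model = DummyModel()


def predict(stream):
    # Each example is scored only when it is requested from the generator
    for eg in stream:
        yield (model.predict(eg["text"]), eg)


stream = predict(iter([{"text": "great film"}, {"text": "great acting"}]))

first_score, first = next(stream)                # scored before any update
model.update([{**first, "answer": "accept"}])    # simulates the update callback
second_score, second = next(stream)              # scored by the updated model

print(first_score, second_score)
```

The second example gets a higher score than the first only because the model was updated in between. Nothing had to re-trigger `predict`: the same generator simply picked up the new state on its next iteration.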
