Only 25 lines loading from my .jsonl stream

Hi! One thing to keep in mind when using the active learning-powered recipes like textcat.teach is that they don’t necessarily show you all examples you’re loading in. The main concept behind the active learning approach is to show you the most relevant examples for annotation using the model’s predictions. Under the hood, Prodigy uses an exponential moving average of the scores to decide whether to send an example out for annotation or not.
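
To make the idea a bit more concrete, here's a rough sketch of how filtering on an exponential moving average can work. This is not Prodigy's actual implementation: the function name `filter_by_ema`, the `decay` parameter and the uncertainty formula are all assumptions made up for illustration. It only shows why a stream can yield fewer examples than it reads in.

```python
from typing import Iterable, Iterator, Optional, Tuple


def filter_by_ema(
    scored_stream: Iterable[Tuple[float, str]], decay: float = 0.5
) -> Iterator[str]:
    """Yield only examples at least as uncertain as the running average.

    Hypothetical sketch: `scored_stream` holds (score, example) pairs, where
    score is the model's predicted probability between 0 and 1.
    """
    ema: Optional[float] = None
    for score, example in scored_stream:
        # Uncertainty is highest (1.0) at a score of 0.5, lowest (0.0) at 0 or 1.
        uncertainty = 1.0 - abs(score - 0.5) * 2.0
        # The first example is always sent out; later ones must beat the average.
        if ema is None or uncertainty >= ema:
            yield example
        # Update the moving average with every example, sent out or not.
        if ema is None:
            ema = uncertainty
        else:
            ema = decay * ema + (1.0 - decay) * uncertainty


stream = [(0.5, "a"), (0.99, "b"), (0.45, "c"), (0.02, "d"), (0.6, "e")]
selected = list(filter_by_ema(stream, decay=0.5))
print(selected)  # ['a', 'c', 'e']
```

In this toy run, five examples go in and only three come out, because the two examples the model is already confident about are skipped.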

So the suggestions you see depend on the annotation decisions you make, the current model state and the annotations that are already in the database. That’s also why the active learning recipes typically work best if you have very large volumes of raw data and want to find the best possible examples.

If you just want to label all examples in your dataset as they come in, you probably want to use a recipe like textcat.manual instead.