Only 25 lines loading from my .jsonl stream

ines · July 28, 2019, 4:21pm

Hi! One thing to keep in mind when using the active learning-powered recipes like textcat.teach is that they don’t necessarily show you all examples you’re loading in. The main concept behind the active learning approach is to show you the most relevant examples for annotation using the model’s predictions. Under the hood, Prodigy uses an exponential moving average of the scores to decide whether to send an example out for annotation or not.

So based on the annotation decisions you make, the model state and the annotations that are already in the database, you’ll see different suggestions. That’s also why the active learning recipes typically work best if you have very large volumes of raw data and want to find the best possible examples.
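To make the filtering idea concrete, here's a minimal sketch of how an exponential moving average can decide which examples get sent out. This is not Prodigy's actual implementation: the function name `ema_filter`, the `alpha` and `initial` parameters and the uncertainty formula are all assumptions made up for illustration. It only shows why a stream can yield fewer examples than it contains.

```python
from typing import Iterable, Iterator, Tuple


def ema_filter(
    scored_stream: Iterable[Tuple[float, str]],
    alpha: float = 0.1,    # hypothetical smoothing factor
    initial: float = 0.5,  # hypothetical starting value for the average
) -> Iterator[str]:
    """Yield only examples whose uncertainty is at least the running average.

    Illustrative sketch only, not Prodigy's real sorter.
    """
    average = initial
    for score, example in scored_stream:
        # A score of 0.5 is maximally uncertain (1.0); 0.0 or 1.0 is certain (0.0).
        uncertainty = 1.0 - abs(score - 0.5) * 2.0
        if uncertainty >= average:
            yield example
        # Update the moving average with every score, sent out or not.
        average = alpha * uncertainty + (1.0 - alpha) * average


# Made-up model scores paired with made-up example IDs.
stream = [(0.5, "a"), (0.99, "b"), (0.45, "c"), (0.01, "d"), (0.6, "e")]
selected = list(ema_filter(stream))
print(selected)  # ['a', 'c', 'e']
```

In this toy run, the two examples the model is already confident about ("b" and "d") are skipped, so only three of the five loaded examples would be shown. In the real recipe the scores also change as the model is updated with your annotations, which the sketch leaves out.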

If you’re looking to just label all examples in your dataset as they come in, you probably want to be using a recipe like textcat.manual instead.
