"No tasks available" even though there's plenty of samples left

@YarrDOpanas Which version of Prodigy are you using?

To read from stdin, you'll need to set the source argument to - (a single hyphen). For example:

cat ./5461_dataset_prodigy.jsonl | prodigy textcat.teach dataset_name en_core_web_lg - --label L1,L2,...

To some extent, this is part of the concept of textcat.teach: the model produces predictions for the given labels, and the sorter function selects the most relevant examples to annotate. By default, these are the examples with the most uncertain scores, where your decision makes the biggest difference. As a result, not every example in your data is shown for annotation. I've shared some more details on this in this thread: "using sorters (prefer_uncertain or prefer_high_scores) result in prodigy showing me the same data samples with different predictions" - #2 by ines
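To illustrate why a sorter can leave you with fewer tasks than raw texts, here is a minimal sketch of uncertainty-based selection. It is a simplified illustration and not Prodigy's actual implementation: the function names, the fixed threshold and the example scores are all assumptions made up for this example.

```python
from typing import Dict, Iterable, Iterator, Tuple

ScoredExample = Tuple[float, Dict[str, str]]


def uncertainty(score: float) -> float:
    """Map a score in [0, 1] to an uncertainty in [0, 1].

    A score of 0.5 is maximally uncertain (1.0); scores of 0.0 or 1.0
    are maximally certain (0.0).
    """
    return 1.0 - abs(score - 0.5) * 2.0


def select_uncertain(
    scored: Iterable[ScoredExample], threshold: float = 0.5
) -> Iterator[Dict[str, str]]:
    """Yield only the examples whose uncertainty reaches the threshold.

    The fixed threshold is a hypothetical simplification for this sketch.
    """
    for score, example in scored:
        if uncertainty(score) >= threshold:
            yield example


# Made-up (score, example) pairs standing in for model predictions.
stream = [
    (0.98, {"text": "a"}),  # confident: skipped
    (0.52, {"text": "b"}),  # uncertain: selected
    (0.05, {"text": "c"}),  # confident: skipped
    (0.40, {"text": "d"}),  # fairly uncertain: selected
]

selected = list(select_uncertain(stream))
print([example["text"] for example in selected])
```

In this sketch, only two of the four examples are yielded. The same effect on a larger stream means the stream can run out before every raw text has been shown.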

However, if you're starting from scratch with a model that knows nothing, it will take a while until it can make meaningful suggestions. 262 labels is an unusually large label scheme for text classification, at least at the top level. So the model will need to see enough examples to make meaningful suggestions for 262 labels, which is going to be very difficult with a cold start, imbalanced classes and only 5461 raw texts to choose from.

Are your labels hierarchical? If so, can you split your classifier into multiple steps? You could start by predicting the top-level categories, then train separate classifiers for the different subcategories (e.g. given the text is about sport, is it about football?). This is likely going to be much easier to learn.
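The multi-step idea could look roughly like the sketch below. It only shows the routing logic: the keyword-based toy classifiers, the label names and the classify_hierarchical helper are hypothetical stand-ins for models you would train separately, each on a much smaller label set.

```python
from typing import Callable, Dict

# A classifier here is any callable that maps a text to a single label.
Classifier = Callable[[str], str]


def classify_hierarchical(
    text: str,
    top_level: Classifier,
    sub_classifiers: Dict[str, Classifier],
) -> str:
    """Predict the top-level category first, then refine it if possible."""
    category = top_level(text)
    sub_classifier = sub_classifiers.get(category)
    if sub_classifier is None:
        # No subcategory model for this category: keep the coarse label.
        return category
    return f"{category}/{sub_classifier(text)}"


# Toy stand-ins for trained models (assumptions for illustration only).
def toy_top_level(text: str) -> str:
    return "sport" if "match" in text.lower() else "politics"


def toy_sport(text: str) -> str:
    return "football" if "goal" in text.lower() else "tennis"


sub_models = {"sport": toy_sport}

print(classify_hierarchical("A late goal decided the match", toy_top_level, sub_models))
print(classify_hierarchical("The election results are in", toy_top_level, sub_models))
```

With this setup, each annotation session only needs to cover one small label set at a time: first the top-level categories, then the subcategories within one category.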