My understanding was that ner.correct is for creating a gold standard and correcting errors, so there is no active learning component involved. Why, then, does Prodigy stop midway (after I've annotated just 1,200 samples, when my dataset has more than 1 million) and say "No tasks available"? Alternatively, how do I continue annotating more samples after this?
Hi! How large are your individual texts in the data? And are you loading from a format that can be read line by line (e.g. JSONL) or a format that needs to be parsed and loaded into memory upfront (e.g. JSON)?
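To illustrate why the file format matters here, below is a minimal sketch in plain Python (not Prodigy's own loader code). The function names `stream_jsonl` and `load_json` are hypothetical and used only for illustration. The JSONL reader can hand over the first record as soon as the first line is read, while the JSON reader has to parse the whole file before anything is available.

```python
import json
from typing import Iterator


def stream_jsonl(path: str) -> Iterator[dict]:
    """Yield one record per non-empty line; the file is read lazily."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)


def load_json(path: str) -> list:
    """Parse the entire file upfront; nothing is returned until it is all read."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

With a file of a million short texts, the lazy version should be able to produce the first batch almost immediately, whereas the upfront version pays the full parsing cost before the first task is served.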
"No tasks available" is shown if Prodigy comes across an empty batch and typically happens when there's nothing more to annotate. If there is more data, this could indicate that the server takes too long to produce a new batch and/or that some race condition causes an empty batch to be sent before the next one.
The easiest workaround should be to just make Prodigy try again and send the next batch, either by refreshing your browser or by restarting the server (which should let you start right where you left off, because Prodigy will skip all questions that are already annotated in the dataset).
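As a rough illustration of the "skip what's already annotated" behaviour, here is a sketch in plain Python. It is not Prodigy's actual implementation: `text_hash` and `skip_annotated` are hypothetical names, and a SHA-1 of the text stands in for whatever identifier is used to recognise a question that has been seen before.

```python
import hashlib
from typing import Iterable, Iterator, Set


def text_hash(text: str) -> str:
    """Stable identifier for a task, derived from its text (an assumption for this sketch)."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def skip_annotated(stream: Iterable[dict], seen: Set[str]) -> Iterator[dict]:
    """Yield only the tasks whose hash is not in the set of already-annotated hashes."""
    for task in stream:
        if text_hash(task["text"]) not in seen:
            yield task


# Usage: pretend the first two texts were annotated before the restart.
tasks = [{"text": "first"}, {"text": "second"}, {"text": "third"}]
already_done = {text_hash("first"), text_hash("second")}
remaining = list(skip_annotated(tasks, already_done))
```

Because the filter still has to walk past every annotated example before reaching new ones, a restart on a large dataset may take a moment before the first fresh batch appears.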
Brilliant. I'll do just that then.
To answer your questions: yes, I am reading JSONL files, and the individual texts are anywhere from 2 to 10 sentences long.