Active learning performs worse than pretrained model

You might enjoy this discussion on active learning:

In general, or at least in my personal experience, it feels safe to say that "active learning doesn't always work", but it's incredibly hard to tell upfront. In situations where one class is relatively rare, it does seem to help, mainly because the strategy gives you a hand in sampling the rare class. In other scenarios, though, you could argue that active learning can get stuck in a local optimum, depending on how everything is set up.

There are also some articles in this space that might be of interest, here's one TIL from my personal blog:

https://koaning.io/til/2022-05-01-active-churning/

At the end of that TIL you'll also notice a benchmark with scikit-learn where "random sampling" beats an active learning strategy.
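To make that comparison concrete, here's a minimal sketch of such a benchmark. This is not the code from the TIL; it's a hypothetical setup with a synthetic dataset, where "uncertainty sampling" (picking the points whose predicted probability is closest to 0.5) is compared against random sampling over a few annotation rounds.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic stand-in for a labelling pool plus a clean validation set.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_pool, y_pool = X[:1500], y[:1500]
X_valid, y_valid = X[1500:], y[1500:]

def run(strategy, n_start=20, n_step=20, n_rounds=10):
    """Grow a labelled subset round by round and track validation accuracy."""
    labelled = list(rng.choice(len(X_pool), size=n_start, replace=False))
    scores = []
    for _ in range(n_rounds):
        model = LogisticRegression(max_iter=1000)
        model.fit(X_pool[labelled], y_pool[labelled])
        scores.append(model.score(X_valid, y_valid))
        remaining = np.setdiff1d(np.arange(len(X_pool)), labelled)
        if strategy == "random":
            picked = rng.choice(remaining, size=n_step, replace=False)
        else:
            # Uncertainty sampling: label the points the current model
            # is least sure about (probability closest to 0.5).
            proba = model.predict_proba(X_pool[remaining])[:, 1]
            picked = remaining[np.argsort(np.abs(proba - 0.5))[:n_step]]
        labelled.extend(picked.tolist())
    return scores

print("random     :", run("random"))
print("uncertainty:", run("uncertainty"))
```

Which curve ends up on top depends on the dataset and the random seed, which is sort of the point: there's no guarantee that the "clever" strategy wins.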

Gut Feelings: Why?!

What I'm about to describe is a gut feeling based on personal experience, but if you're wondering "why?!", consider the following thought experiment.

Suppose that we have a balanced classification use-case (so no rare labels). We have a dataset X that we split up into X_valid and X_train. Let's also assume that X_valid is annotated without error (we have y_valid) and we're about to annotate X_train.

Then which strategy should we apply to annotate?

  1. We should be careful about introducing sampling bias. So the best thing we can do is just sample randomly. That way, the annotated subset of X_train should follow the same distribution as X_valid.
  2. We should do something else.

When you frame the problem this way, it suddenly sounds a bit strange to even introduce active learning, because it deliberately introduces a sampling bias.
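You can see that bias directly in a small demo. Below is a hypothetical sketch (again with a synthetic dataset) where a model trained on a tiny random seed batch picks the 500 most uncertain points from a pool; the feature means of that uncertainty-sampled subset drift away from the pool's means, while a random subset of the same size stays close.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

# Hypothetical balanced pool that we are "about to annotate".
X_train, y_train = make_classification(n_samples=5000, n_features=5, random_state=1)

# Seed a model on a small random batch.
seed = rng.choice(len(X_train), size=50, replace=False)
model = LogisticRegression(max_iter=1000).fit(X_train[seed], y_train[seed])

# Uncertainty sampling: the 500 points closest to the decision boundary.
proba = model.predict_proba(X_train)[:, 1]
uncertain = np.argsort(np.abs(proba - 0.5))[:500]
random_ix = rng.choice(len(X_train), size=500, replace=False)

# The uncertain subset hugs the boundary, so its feature distribution
# no longer resembles the pool; the random subset does.
print("pool mean      :", X_train.mean(axis=0).round(2))
print("random mean    :", X_train[random_ix].mean(axis=0).round(2))
print("uncertain mean :", X_train[uncertain].mean(axis=0).round(2))
```

Whether that bias hurts or helps depends on the situation, but it's exactly the kind of distribution mismatch between the training subset and X_valid that option 1 above tries to avoid.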

When does it work?

I like to keep this argument in the back of my mind at all times, but I also want to acknowledge that there is evidence of situations where active learning does make a difference. The main issue is that there's no clear consensus on the circumstances needed for it to have a big impact. I don't read papers as much as I'd like, so somebody with more experience may correct me, but my understanding is that this is still an area of active research.