In talking to myself here, I discovered a bug in the code that was generating my input JSONL. As mortifying as it is to leave this post here for posterity, perhaps it will serve as a helpful reminder to inspect your input data closely. The eyes tend to glaze over when reading a giant JSONL file, and you might not notice that the same image or text is repeated in every record.
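For anyone who would rather not rely on eyeballing the file, here is a rough sketch of an automated check. It counts how many distinct values each field takes across the records, so a field that is identical everywhere stands out. The field names `image` and `text` and the helper name `duplicate_report` are only assumptions for illustration; substitute whatever keys your own JSONL uses.

```python
import json
from collections import Counter


def duplicate_report(lines, fields):
    """Summarize how often values repeat for the given fields in JSONL lines.

    `fields` holds hypothetical key names; adjust them to your own schema.
    """
    counts = {field: Counter() for field in fields}
    total = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        record = json.loads(line)
        total += 1
        for field in fields:
            # Serialize the value so lists and dicts can be counted as well.
            key = json.dumps(record.get(field), sort_keys=True)
            counts[field][key] += 1
    return {
        field: {
            "unique": len(counter),
            "total": total,
            "most_common_count": max(counter.values(), default=0),
        }
        for field, counter in counts.items()
    }


# Tiny in-memory example; for a real file, pass an open file object instead.
sample = [
    '{"image": "a.png", "text": "first caption"}',
    '{"image": "a.png", "text": "second caption"}',
    '{"image": "a.png", "text": "third caption"}',
]
for field, stats in duplicate_report(sample, ["image", "text"]).items():
    print(field, stats)
```

A field whose `unique` count is 1 while `total` is large is the kind of repetition described above. Whether some repetition is expected depends on your dataset, so treat the numbers as a prompt to look closer rather than a pass/fail test.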