Difference in number of examples between dataset and batch-train

Hi! I wrote about this in some more detail here:

I'm not sure what's in your data and how many unique examples you have – you could check that by looking at how many unique input hashes there are:

from prodigy.components.db import connect
db = connect()
input_hashes = db.get_input_hashes(["energy_patterns"])
print(len(set(input_hashes)))
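
If the unique count turns out lower than the total, it can help to see which inputs are repeated. Here's a minimal sketch of that comparison. It assumes each example is a dict with the `_input_hash` and `_task_hash` keys that Prodigy adds; the `summarize_hashes` helper and the sample records are made up for illustration and aren't part of Prodigy:

```python
from collections import Counter


def summarize_hashes(examples):
    """Compare the total number of examples with the number of unique inputs and tasks.

    Hypothetical helper: assumes every example dict carries "_input_hash"
    and "_task_hash" keys.
    """
    input_counts = Counter(eg["_input_hash"] for eg in examples)
    task_hashes = {eg["_task_hash"] for eg in examples}
    return {
        "total": len(examples),
        "unique_inputs": len(input_counts),
        "unique_tasks": len(task_hashes),
        # Input hashes that occur more than once, with their counts
        "repeated_inputs": {h: n for h, n in input_counts.items() if n > 1},
    }


# Made-up records standing in for a real dataset
sample = [
    {"text": "a", "_input_hash": 1, "_task_hash": 10},
    {"text": "a", "_input_hash": 1, "_task_hash": 11},
    {"text": "b", "_input_hash": 2, "_task_hash": 20},
]
summary = summarize_hashes(sample)
print(summary)
```

To run it on your own data, you should be able to pass in the examples returned by `db.get_dataset("energy_patterns")` instead of the sample list.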

You could also set the environment variable PRODIGY_LOGGING=basic when you run the command to see if anything else is being skipped.