Basic question about batch persistence

Hi! I've explained some of this in more detail in this thread:

My post here has a little example of an "infinite stream" that checks the incoming examples against the hashes in the database to make sure everything is annotated:

Of course, you could also come up with your own custom logic for this. Streams in Prodigy are regular Python generators that yield example dicts, so they can respond to external state and let you control what to send out when.
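To make the generator idea a bit more concrete, here's a minimal sketch of a stream that keeps re-sending examples until their hashes show up as annotated. It's plain Python and not the exact code from the linked post: `infinite_stream` and `get_annotated_hashes` are hypothetical names, the callable is a stand-in for whatever database lookup you use, and I'm assuming each example dict already carries a hash under the `"_task_hash"` key.

```python
from typing import Callable, Dict, Iterator, List, Set


def infinite_stream(
    examples: List[Dict],
    get_annotated_hashes: Callable[[], Set[int]],  # hypothetical: wraps your DB lookup
    hash_key: str = "_task_hash",  # assumption: examples are already hashed under this key
) -> Iterator[Dict]:
    """Loop over the examples until every hash is in the annotated set."""
    while True:
        # Re-read the external state at the start of every pass
        annotated = get_annotated_hashes()
        pending = [eg for eg in examples if eg[hash_key] not in annotated]
        if not pending:
            # Everything has been annotated, so the stream can end
            break
        for eg in pending:
            yield eg
```

Because the generator only asks for the annotated hashes when a new pass starts, anything that was sent out but never saved gets queued up again on the next pass. Note that it will keep looping for as long as something is still unannotated, so in practice you may want to add your own stopping condition.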
