Is it possible to programmatically exit Prodigy when there are no more tasks available?

Is it possible to programmatically exit Prodigy when there are no more tasks available in a custom recipe? Something like sys.exit(0) or similar.

Hi! Prodigy starts a web server under the hood that needs to be stopped as well, so one option would be to kill the process running on the given port. I haven't tried this yet but something like this could work:

That logic would then go at the end of your stream, after all examples have been sent out:

def get_stream(stream):
    for eg in stream:
        yield eg
    # All examples are done, kill the process

@ines thank you! It works. (Although I went with a simpler solution than the one in the link you posted, just os.kill(os.getpid(), signal.SIGTERM))
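For anyone wanting to copy this, a minimal sketch of that approach could look like the following. Only os.kill, os.getpid and signal.SIGTERM come from the post above; the helper name terminate_current_process and the on_exhausted parameter are made up for illustration (the parameter just makes it possible to try the generator without actually killing anything).

```python
import os
import signal


def terminate_current_process():
    # Hypothetical helper: send SIGTERM to the process running the recipe,
    # which also takes down the web server started in that process.
    os.kill(os.getpid(), signal.SIGTERM)


def get_stream(stream, on_exhausted=terminate_current_process):
    for eg in stream:
        yield eg
    # Only reached once the consumer has pulled the last example.
    on_exhausted()
```

Because get_stream is a generator, nothing happens until the stream is consumed, and the kill only fires after the final example has been handed out.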

In case anyone else is looking at this solution:
The only problem is that the stream is exhausted as soon as the last batch is sent out, so the process is killed before that batch can be annotated (unless the batch size is 1, probably). I think this can be mitigated by adding a non-blocking wait with a long timeout before sending SIGTERM, though I haven't tried it myself yet.
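One way the delayed variant might be sketched is with threading.Timer from the standard library, which runs a function on a separate thread after a delay and so does not block the stream. This is untested against Prodigy itself; schedule_shutdown, terminate_current_process and the 600-second default are illustrative assumptions, not anything from the thread.

```python
import os
import signal
import threading


def terminate_current_process():
    # Hypothetical helper: send SIGTERM to the current process.
    os.kill(os.getpid(), signal.SIGTERM)


def schedule_shutdown(delay_seconds, action=terminate_current_process):
    # threading.Timer calls `action` on another thread after the delay,
    # so the caller returns immediately.
    timer = threading.Timer(delay_seconds, action)
    timer.daemon = True  # don't keep the interpreter alive just for the timer
    timer.start()
    return timer


def get_stream(stream, delay_seconds=600, action=terminate_current_process):
    for eg in stream:
        yield eg
    # Stream exhausted: give the annotator time to finish the last batch.
    schedule_shutdown(delay_seconds, action)
```

The obvious weakness is that the delay is a guess: too short and the last batch is still lost, too long and the process lingers after the work is done.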


Oh, that's a good point! Another possible solution could be to use the update callback to kill the server once all answers have been received back from the web app. So when the last batch is sent out, you can toggle a global variable so the update callback knows.

From what I understand, the logic behind this is:

  • the last batch is sent out
  • we flip a global variable, say from LAST_BATCH_SENT = False to LAST_BATCH_SENT = True
  • within the update() callback, if LAST_BATCH_SENT is True, kill the session.

How do you know when the last batch is sent out? I thought the update() method was called automatically once this happens.

The update callback is called every time new examples are received by the server. So if you know how many examples you have, you could just keep a global variable with a counter and increment/check that in the update callback. You definitely want to wait until the last batch is received, not sent – otherwise, you'd be killing the server prematurely and the user wouldn't be able to submit the last batch of annotations anymore.
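The counter idea could be sketched roughly as below. It uses a closure instead of a module-level global, which is a small swap from the description above but does the same job. make_update, total_examples and terminate_current_process are hypothetical names, and the sketch assumes the callback receives the list of answers in each submitted batch.

```python
import os
import signal


def terminate_current_process():
    # Hypothetical helper: send SIGTERM to the current process.
    os.kill(os.getpid(), signal.SIGTERM)


def make_update(total_examples, shutdown=terminate_current_process):
    received = 0  # number of answers received so far

    def update(answers):
        nonlocal received
        received += len(answers)
        # Shut down only once every example has come back annotated.
        if received >= total_examples:
            shutdown()

    return update
```

In a custom recipe, the returned function would go under the "update" key of the components the recipe returns, with total_examples computed up front (for example by loading the stream into a list first). This sketch does not check whether the final answers have already been written to the dataset at the moment the callback runs, so it may be safer to delay the shutdown slightly rather than kill the process immediately.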