Prodigy.Serve & FastAPI

Hi, I was wondering if you could help with an issue we are having using the Prodigy.Serve function with FastAPI.

We have experimented with some of the Prodigy functions and managed to put them behind FastAPI, and this has all worked successfully (i.e. we can create and drop a dataset from a FastAPI endpoint). We also built a Python script that uses prodigy.serve("ner.manual….") and this script executes successfully, i.e. the Prodigy annotation tool starts up and can be viewed in a browser. However, as soon as we put this script behind a FastAPI endpoint, we get a “signal only works in main thread” error. We are running the Python and Prodigy environment on Windows.

Do you have any ideas on how to work around this issue? Or alternatively, is there a plan to make all the other Prodigy commands available via FastAPI, i.e. ner.manual, ner.make-gold, ner.batch-train etc.? If all the Prodigy commands were available via FastAPI, we wouldn’t need to go down the Prodigy.Serve route.

Thanks in advance for your help.

Hey @walsh, the thing is that what a Prodigy command does in the end is compose a function call and then start a separate, isolated REST service. So it probably wouldn't make sense to make the commands available inside another FastAPI app, as each one of them would actually start a new separate server.

In Prodigy, the main entry point is the CLI. Each CLI command in turn creates a new API for that command, so everything starts from and depends on the CLI command.

On the other hand, we are currently working on Prodigy Teams, which will simplify managing teams, Prodigy tasks, etc. all via a web UI :rocket:


Now, about your specific use case: maybe you could share a bit more info so we can try to find the problem. For example, if you could create a minimal example that shows the error, we could see if we spot anything problematic in it.

There are also a couple of things that could be affecting here:

  • Windows in general might behave strangely in some cases and throw errors that might even seem unrelated, and it can end up with stale or corrupt internal state, which is why "turning it off and on again" works in many situations :sweat_smile: You could try it all on a Linux-like system (even WSL2 might do) and see if that helps.
  • If you are calling prodigy.serve from inside a FastAPI path operation, then after receiving a request the app would have to start a new server inside the existing one, block there, and create a new event loop inside the existing one, all before returning a response. That would probably be problematic. In that case, it would be better to start a new sub-process instead of calling prodigy.serve directly.
  • Then, if you are starting a sub-process, it could be trying to listen on the same port that is already being used.
  • Or you could be trying to start multiple processes that are never killed and compete for the ports or something similar.
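One plausible source of the exact error message (an assumption, since we haven't seen the traceback): Python only allows signal handlers to be registered from the main thread, and FastAPI runs regular `def` path operations in a worker thread pool. A server that registers signal handlers at startup would then fail in that thread. The sketch below reproduces the same kind of error with the standard library only; `install_handler` and `run_in_worker_thread` are hypothetical names, and nothing here calls Prodigy.

```python
import signal
import threading


def install_handler():
    """Register a SIGINT handler, as a server might do at startup."""
    signal.signal(signal.SIGINT, signal.default_int_handler)


def run_in_worker_thread():
    """Call install_handler from a non-main thread and collect any error."""
    errors = []

    def target():
        try:
            install_handler()
        except ValueError as exc:
            # Python refuses to register signal handlers outside the main thread.
            errors.append(exc)

    worker = threading.Thread(target=target)
    worker.start()
    worker.join()
    return errors


errors = run_in_worker_thread()
print(errors)
```

If that is what is happening, moving the call into its own process sidesteps it, because the new process has its own main thread.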

The way to go would probably be to start a subprocess running Prodigy without waiting for it to complete (in most cases it will run its own server, and a server never completes).
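As a rough sketch of "start it and don't wait", `subprocess.Popen` returns as soon as the child has been launched. `start_background` is a hypothetical helper, and the stand-in command just sleeps so the snippet runs without Prodigy installed. In your app you would pass your own Prodigy command line instead (for example something like `[sys.executable, "-m", "prodigy", ...]` with your recipe arguments).

```python
import subprocess
import sys


def start_background(command):
    """Start `command` as a child process and return immediately."""
    return subprocess.Popen(
        command,
        stdout=subprocess.DEVNULL,  # discard output so the child never blocks on a full pipe
        stderr=subprocess.DEVNULL,
    )


# Stand-in for a long-running server process.
stand_in = [sys.executable, "-c", "import time; time.sleep(30)"]
proc = start_background(stand_in)
print(proc.pid, proc.poll())  # poll() returns None while the child is still running

# Clean up the demo process.
proc.terminate()
proc.wait()
```

The path operation can then return a response (for example the PID) right away while the child keeps running.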

Each process would need to have its own port (if it runs a server) so they don't compete.
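One way to give each process its own port is to ask the operating system for a free one by binding to port 0, then hand that port to the child. The sketch below assumes the port is passed through a `PRODIGY_PORT` environment variable; please double-check that against the docs for your Prodigy version, or set the port in your config instead. `find_free_port` and `env_with_port` are hypothetical helper names.

```python
import os
import socket


def find_free_port(host="127.0.0.1"):
    """Ask the OS for a currently unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        return sock.getsockname()[1]


def env_with_port(port):
    """Copy the current environment and add the port for the child process."""
    env = dict(os.environ)
    # Assumption: the child reads its port from PRODIGY_PORT.
    env["PRODIGY_PORT"] = str(port)
    return env


port = find_free_port()
child_env = env_with_port(port)
print(port)
```

The resulting dict can be passed as `env=` to `subprocess.Popen`. Note there is a small race: another process could grab the port between the check and the moment the child binds it, so it's worth handling a failed start.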

And then you would need something to track which processes are running, probably by PID. It could be a SQLite DB, a JSON file, a .pid file for each process, etc., so that you can terminate them when you are done with them.
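For the JSON file option, a minimal sketch could look like this. The file layout and the names `load_registry`, `save_registry`, `register` and `stop` are all made up for illustration, and there is no locking, so concurrent requests could overwrite each other's changes.

```python
import json
import os
import signal
from pathlib import Path


def load_registry(path):
    """Return the tracked processes as a dict (empty if the file doesn't exist)."""
    path = Path(path)
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


def save_registry(path, registry):
    """Write the tracked processes back to disk."""
    Path(path).write_text(json.dumps(registry, indent=2), encoding="utf-8")


def register(path, name, pid, port):
    """Remember the PID and port of a process under a name."""
    registry = load_registry(path)
    registry[name] = {"pid": pid, "port": port}
    save_registry(path, registry)


def stop(path, name):
    """Terminate a tracked process; return False if the name is unknown."""
    registry = load_registry(path)
    entry = registry.pop(name, None)
    if entry is None:
        return False
    try:
        # On Windows, os.kill with SIGTERM terminates the process forcibly.
        os.kill(entry["pid"], signal.SIGTERM)
    except OSError:
        pass  # the process has already exited
    save_registry(path, registry)
    return True
```

You would call `register(...)` right after starting the subprocess and `stop(...)` from another path operation when the annotation session is finished.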