ner.openai.correct timeout issue

Hello, I am very new to Prodigy. I am running into issues with the ner.openai.correct recipe. It works for a while, sometimes getting up to 200 approved items, and then annoyingly times out with an error. Any thoughts or suggestions?

```
Task exception was never retrieved
future: <Task finished name='Task-86' coro=<RequestResponseCycle.run_asgi() done, defined at /home/paul/anaconda3/envs/prodigy-dev-2/lib/python3.8/site-packages/uvicorn/protocols/http/h11_impl.py:402> exception=ReadTimeout(ReadTimeoutError("HTTPSConnectionPool(host='api.openai.com', port=443): Read timed out. (read timeout=20)"))>
Traceback (most recent call last):
```

Hi Paul!

Just from a glance, that looks like a timeout from OpenAI. I've seen these happen before, and it's usually not something we can control. OpenAI can get big bursts of traffic, which means the latency on their side can go all over the place ... even to the extent that it causes a timeout once in a while for an average user. Our code already has a retry mechanism in place, but that's not always enough.

Just to check, is there a specific moment where this issue pops up consistently? Or is it somewhat random?

An alternative for now might be to use ner.openai.fetch instead. This recipe allows you to fetch examples upfront, and you can always resume a previous run via the --resume flag. These results can then be used in a ner.manual recipe without having to worry about the OpenAI connection.
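Roughly like this (a minimal sketch; the file names, dataset name, and labels are placeholders for your own):

```
# Fetch the OpenAI suggestions upfront and save them to disk;
# add --resume to continue a previous, partially completed run
python -m prodigy ner.openai.fetch examples.jsonl ./fetched.jsonl --labels PERSON,ORG

# Review the fetched suggestions offline; no OpenAI connection needed
python -m prodigy ner.manual my_dataset blank:en ./fetched.jsonl --label PERSON,ORG
```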

Would this remedy the situation for now? I could dive into our code and see if we might be able to increase the wait time in our retry mechanism, but I fear that this would only address a small part of the problem.

Thanks, it seems to be random, time-wise. I had it fail after 20 records, then 200. I will give fetch a go. It would be good if the code could pause/retry, as this is the best LLM available at the moment. I am on fibre, so I assume it is on OpenAI's end, as they still struggle with performance. Any plans to support locally installed open-source LLMs? Paul

We already have a mechanism in there that does that :sweat_smile: If I recall correctly, the base behavior is to retry up to 10 times in total per run.

One tip: when you use the fetch recipe, make sure you use the --resume flag. That way, if the connection fails, you'll still be able to pick up where you left off.
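Since --resume makes repeated runs safe, you could even wrap the fetch in a retry loop of your own. A minimal sketch, assuming a Unix shell and placeholder file names and labels:

```
# Retry the fetch up to 5 times; --resume means each attempt
# continues from the progress saved by the previous one
for attempt in 1 2 3 4 5; do
  python -m prodigy ner.openai.fetch examples.jsonl ./fetched.jsonl \
    --labels PERSON,ORG --resume && break
  echo "OpenAI timed out, waiting before attempt $((attempt + 1))..." >&2
  sleep $((attempt * 30))
done
```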

Yes! Version 1.13 of Prodigy will support spacy-llm, which will be able to connect to all sorts of backends for all sorts of tasks. We'll likely be able to replace all the OpenAI recipes with recipes that allow you to pick your own provider, as well as offer more predefined tasks out of the box.
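To give a rough idea of the direction (a sketch only, based on spacy-llm's documented config format; the exact Prodigy integration may look different), spacy-llm declares the task and the model backend in a config file, so switching from OpenAI to a locally installed open-source model is mostly a matter of swapping one section:

```
[components.llm]
factory = "llm"

[components.llm.task]
@llm_tasks = "spacy.NER.v2"
labels = ["PERSON", "ORG"]

# A local open-source backend; swap this section for an OpenAI
# model (e.g. "spacy.GPT-3-5.v1") to use the API instead
[components.llm.model]
@llm_models = "spacy.Dolly.v1"
name = "dolly-v2-3b"
```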
