OpenAI integration deprecation

Hey guys,

This morning I tried to switch my workflow from textcat.manual to textcat.openai.correct to save some time. But even after setting all the required environment variables, I ended up in a never-ending loop.

Am I seeing this correctly: choosing text-davinci-003 is no longer possible, because it is not available on the first API that gets called, but is then required by the second one?
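For reference, one way to check which model names your key can actually reach is to list them with the OpenAI Python client. A minimal sketch, assuming openai>=1.0 is installed and OPENAI_API_KEY is set in your environment:

# Sketch: list the models available to this API key.
# Assumes openai>=1.0 and OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY automatically
available = [m.id for m in client.models.list()]
print("text-davinci-003 available:", "text-davinci-003" in available)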

Would it be possible for you to use the textcat.llm.correct recipe instead? As noted in the error message, the .openai.* recipes are deprecated in favor of the more general .llm.* recipes.

Hmm, actually I was just being lazy, because I hoped the .openai.* recipes would be easier to configure.

Now I tried it with this config:

[nlp]
lang = "de"
pipeline = ["llm"]

[components]

[components.llm]
factory = "llm"
save_io = true

[components.llm.task]
@llm_tasks = "spacy.TextCat.v3"
labels = ["Sach- & Haftpflichtversicherungen","Werbungskosten","Privat"]
exclusive_classes = false

[components.llm.task.label_definitions]
"Sach- & Haftpflichtversicherungen" = "xxx"
"Werbungskosten" = "xxx"
"Privat" = "xxx"

[components.llm.model]
@llm_models = "spacy.GPT-4.v3"
config = {"temperature": 0.3}

[components.llm.cache]
@llm_misc = "spacy.BatchCache.v1"
path = "local-cached"
batch_size = 3
max_batches_in_mem = 10

Calling it like this:

prodigy textcat.llm.correct bank_turnovers_categorization configs/spacy_textcat_llm_config.cfg assets/bank_turnovers_versicherung.jsonl

But this results in Prodigy seemingly hanging forever: I can see that it talks to OpenAI, but it never comes back.
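To check whether the slowness was in the model call itself rather than in Prodigy, I ran the pipeline standalone with spacy-llm's assemble helper. A small sketch, assuming spacy-llm is installed, OPENAI_API_KEY is set, and using a made-up example text:

# Sketch: run the spacy-llm pipeline outside of Prodigy to see
# how long a single OpenAI call takes.
import time
from spacy_llm.util import assemble

nlp = assemble("configs/spacy_textcat_llm_config.cfg")
start = time.time()
doc = nlp("Beitrag Haftpflichtversicherung 123,45 EUR")  # hypothetical example text
print(doc.cats, f"({time.time() - start:.1f}s)")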

OK, on further testing it turned out this was just the default model taking too long to respond: since my first config did not set a model name, spacy.GPT-4.v3 fell back to its default, gpt-4.

Using this config instead:

[nlp]
lang = "de"
pipeline = ["llm"]

[components]

[components.llm]
factory = "llm"
save_io = true

[components.llm.task]
@llm_tasks = "spacy.TextCat.v3"
labels = ["Sach- & Haftpflichtversicherungen","Werbungskosten","Privat"]
exclusive_classes = false

[components.llm.task.label_definitions]
"Sach- & Haftpflichtversicherungen" = "xxx"
"Werbungskosten" = "xxx"
"Privat" = "xxx"

[components.llm.model]
@llm_models = "spacy.GPT-4.v3"
name = "gpt-4o"
config = {"temperature": 0.3}

[components.llm.cache]
@llm_misc = "spacy.BatchCache.v1"
path = "local-cached"
batch_size = 3
max_batches_in_mem = 10

it started working.
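For anyone finding this later: since the config sets save_io = true, you can also inspect the exact prompt and raw response for each doc, which is handy for confirming the right model is being called. A sketch, assuming spacy-llm stores this under doc.user_data["llm_io"] keyed by component name, and again using a made-up example text:

# Sketch: with save_io = true, spacy-llm records the prompt and
# raw response per LLM component in doc.user_data["llm_io"].
from spacy_llm.util import assemble

nlp = assemble("configs/spacy_textcat_llm_config.cfg")
doc = nlp("Beitrag Haftpflichtversicherung 123,45 EUR")  # hypothetical example text
io = doc.user_data["llm_io"]["llm"]  # "llm" is the component name from the pipeline
print(io["prompt"])
print(io["response"])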

Hi @toadle,

Glad to hear you managed to implement your workflow with the spacy-llm recipe. I'll go ahead and change the subject to reflect the content of the thread more accurately, now that we know the issue.