OpenAI integration deprecation

Hey guys,

This morning I tried to switch my workflow from textcat.manual to textcat.openai.correct to save some time. But even after setting all the required environment variables, I ended up in a never-ending loop.

Am I seeing this correctly: choosing text-davinci-003 is no longer possible, because it is not available on the first API that gets called, but is then required by the second one?
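For reference, one way to check which model names your key can actually reach is to list them with the OpenAI Python client. A minimal sketch, assuming openai>=1.0 is installed and OPENAI_API_KEY is set in your environment:

# Sketch: list the models available to this API key.
# Assumes openai>=1.0 and OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY automatically
available = [m.id for m in client.models.list()]
print("text-davinci-003 available:", "text-davinci-003" in available)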

Would it be possible for you to use the textcat.llm.correct recipe instead? As noted in the error message, the .openai.* recipes are deprecated in favor of the more general .llm.* recipes.

Hmm, actually I was just being lazy, because I hoped the .openai.* recipes would be easier to configure.

Now I tried it with this config:

[nlp]
lang = "de"
pipeline = ["llm"]

[components]

[components.llm]
factory = "llm"
save_io = true

[components.llm.task]
@llm_tasks = "spacy.TextCat.v3"
labels = ["Sach- & Haftpflichtversicherungen","Werbungskosten","Privat"]
exclusive_classes = false

[components.llm.task.label_definitions]
"Sach- & Haftpflichtversicherungen" = "xxx"
"Werbungskosten" = "xxx"
"Privat" = "xxx"

[components.llm.model]
@llm_models = "spacy.GPT-4.v3"
config = {"temperature": 0.3}

[components.llm.cache]
@llm_misc = "spacy.BatchCache.v1"
path = "local-cached"
batch_size = 3
max_batches_in_mem = 10

Calling it like this:

prodigy textcat.llm.correct bank_turnovers_categorization configs/spacy_textcat_llm_config.cfg assets/bank_turnovers_versicherung.jsonl

But this results in Prodigy seemingly hanging forever: I can see that it talks to OpenAI, but it never comes back.
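To check whether the slowness was in the model call itself rather than in Prodigy, I ran the pipeline standalone with spacy-llm's assemble helper. A small sketch, assuming spacy-llm is installed, OPENAI_API_KEY is set, and using a made-up example text:

# Sketch: run the spacy-llm pipeline outside of Prodigy to see
# how long a single OpenAI call takes.
import time
from spacy_llm.util import assemble

nlp = assemble("configs/spacy_textcat_llm_config.cfg")
start = time.time()
doc = nlp("Beitrag Haftpflichtversicherung 123,45 EUR")  # hypothetical example text
print(doc.cats, f"({time.time() - start:.1f}s)")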

OK, on further testing it turned out this was just the default model taking too long to respond: since my first config did not set a model name, spacy.GPT-4.v3 fell back to its default, gpt-4.

Using this config instead:

[nlp]
lang = "de"
pipeline = ["llm"]

[components]

[components.llm]
factory = "llm"
save_io = true

[components.llm.task]
@llm_tasks = "spacy.TextCat.v3"
labels = ["Sach- & Haftpflichtversicherungen","Werbungskosten","Privat"]
exclusive_classes = false

[components.llm.task.label_definitions]
"Sach- & Haftpflichtversicherungen" = "xxx"
"Werbungskosten" = "xxx"
"Privat" = "xxx"

[components.llm.model]
@llm_models = "spacy.GPT-4.v3"
name = "gpt-4o"
config = {"temperature": 0.3}

[components.llm.cache]
@llm_misc = "spacy.BatchCache.v1"
path = "local-cached"
batch_size = 3
max_batches_in_mem = 10

it started working.
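For anyone finding this later: since the config sets save_io = true, you can also inspect the exact prompt and raw response for each doc, which is handy for confirming the right model is being called. A sketch, assuming spacy-llm stores this under doc.user_data["llm_io"] keyed by component name, and again using a made-up example text:

# Sketch: with save_io = true, spacy-llm records the prompt and
# raw response per LLM component in doc.user_data["llm_io"].
from spacy_llm.util import assemble

nlp = assemble("configs/spacy_textcat_llm_config.cfg")
doc = nlp("Beitrag Haftpflichtversicherung 123,45 EUR")  # hypothetical example text
io = doc.user_data["llm_io"]["llm"]  # "llm" is the component name from the pipeline
print(io["prompt"])
print(io["response"])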

Hi @toadle,

Glad to hear you managed to implement your workflow with the spacy-llm recipe. I'll go ahead and change the subject to reflect the content of the thread more accurately, now that we know the issue.