adding exp backoff retry decorator to openai embedding and completion calls #12
base: main
Conversation
Making some comments to make clear what I already said on Discord.
@retry(3)
private embedMany = async (
I would remove this line as it would retry the whole batch
this.api.createEmbedding({
  ...options,
  input: text.replace(/\n/g, " "),
})
Replace this with a call to this.embedOne. Since embedOne has the retry decorator, each call will then be retried on failure instead of the whole batch.
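To illustrate the suggestion, a hedged sketch (the class shape, model name, and signatures are assumed for the sake of the example, not taken from the actual codebase):

```typescript
import { Configuration, OpenAIApi } from "openai";

// Hedged sketch of the suggested refactor.
class OpenAIEmbeddingProvider {
  private api = new OpenAIApi(
    new Configuration({ apiKey: process.env.OPENAI_API_KEY })
  );

  // embedOne is the unit of retry: the @retry decorator (or an
  // equivalent wrapper) lives here, so each text is retried
  // individually on failure.
  private embedOne = async (text: string): Promise<number[]> => {
    const res = await this.api.createEmbedding({
      model: "text-embedding-ada-002", // assumed model for the sketch
      input: text.replace(/\n/g, " "),
    });
    return res.data.data[0].embedding;
  };

  // embedMany delegates to embedOne instead of calling the API
  // directly, so a transient failure re-runs one text, not the batch.
  private embedMany = async (texts: string[]): Promise<number[][]> =>
    Promise.all(texts.map((text) => this.embedOne(text)));
}
```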
Another comment on checking whether the error is throttling before retrying: you don't want to wait up to 10 seconds before realizing that the request is malformed or that your API key is wrong.
  logger.log(chalk.red(`Maximum retries exceeded`));
  throw error; // re-throw error if maximum retries exceeded
} else {
  logger.log(chalk.yellow(`Retrying...`));
I think you might want to retry only if the error is request throttling. This would make the function less abstract, since we have to assume the error is an axios result and do a check that looks like
if (error.response.status === 429)
So this would only work for axios errors, but I think it makes sense for now, as this is only used with the OpenAI client, which uses axios under the hood.
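A minimal sketch of that check, assuming axios error shapes (isThrottlingError is a hypothetical helper name, not from the PR):

```typescript
import axios from "axios";

// Hypothetical helper: retry only on rate limiting (HTTP 429).
// Anything else (malformed request, bad API key) should be re-thrown
// immediately instead of sitting through the backoff for an error
// that can never succeed.
function isThrottlingError(error: unknown): boolean {
  return axios.isAxiosError(error) && error.response?.status === 429;
}
```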
Another option for the retry logic here that is probably more performant / better supported is an existing retry library. Here's an example implementation from my codebase, wrapping the openai SDK:
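The example referenced here did not survive extraction. As a stand-in, this is a hedged sketch of wrapping the (axios-based) openai v3 SDK with the p-retry library; the choice of p-retry and the model name are illustrative assumptions, not necessarily what the commenter originally used:

```typescript
import pRetry, { AbortError } from "p-retry";
import axios from "axios";
import { Configuration, OpenAIApi } from "openai";

const api = new OpenAIApi(
  new Configuration({ apiKey: process.env.OPENAI_API_KEY })
);

// Retry a single embedding call with exponential backoff. Throttling
// (HTTP 429) is retried; any other error is wrapped in AbortError so
// p-retry fails fast instead of backing off on a request that can
// never succeed.
async function createEmbeddingWithRetry(text: string) {
  return pRetry(
    async () => {
      try {
        return await api.createEmbedding({
          model: "text-embedding-ada-002", // assumed model for the sketch
          input: text.replace(/\n/g, " "),
        });
      } catch (error) {
        if (axios.isAxiosError(error) && error.response?.status === 429) {
          throw error; // retriable: let p-retry back off and try again
        }
        throw new AbortError(error as Error); // not retriable: fail fast
      }
    },
    { retries: 3, minTimeout: 1000, factor: 2 } // waits ~1s, 2s, 4s
  );
}
```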
Thoughts:
I would probably opt to add this at the base ModelProvider level, and I think it's worth considering implementing it as a function decorator so that the retry logic can be added without too much additional boilerplate. That said, thinking deeply about retry logic on an API-specific basis is a really good idea, because what works for OpenAI's APIs might not hold for other service providers.
Just updating here: holding off on adding this for now; we have some other ideas we'd like to try that might be better.
Cool @cfortuner, lmk if you want me to review the solution when you have a PR.
Having read @mathisobadia's comments, I think that makes more sense. There is a trade-off here between batching requests and embedding one by one.
One more thing to consider is how to handle embedding requests that total more than 250k tokens per minute. If we do batching, we have to construct each array to remain below that limit. But one by one, since a single embedding request cannot exceed the model's max token length, we are safe: even if the total of those requests exceeds 250k/min, the retry policies would handle it. So I'd say reducing the time needed to process embeddings and being able to handle large texts is more important (at least for my case). But maybe we can have processing policies that handle both.
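As a hedged sketch of such a "processing policy", one could pack texts into batches that stay under a per-request token budget; batchByTokenBudget and estimateTokens are hypothetical names for illustration (e.g. a tokenizer or a chars/4 heuristic), not part of the PR:

```typescript
// Pack texts into batches whose estimated token count stays under a
// budget, so each batched request remains below the rate limit.
function batchByTokenBudget(
  texts: string[],
  estimateTokens: (text: string) => number,
  budget: number
): string[][] {
  const batches: string[][] = [];
  let current: string[] = [];
  let used = 0;
  for (const text of texts) {
    const cost = estimateTokens(text);
    if (current.length > 0 && used + cost > budget) {
      batches.push(current); // budget reached, start a new batch
      current = [];
      used = 0;
    }
    current.push(text);
    used += cost;
  }
  if (current.length > 0) batches.push(current);
  return batches;
}
```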
Adds a new utility -> Retry!
It's a TypeScript decorator that lets you retry a call up to N times:
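The decorator's code did not survive extraction; below is a minimal sketch of what an exponential-backoff retry decorator like this might look like, with details assumed from the PR title and the diff excerpts above:

```typescript
// Sketch of a retry decorator with exponential backoff (legacy
// TypeScript method decorators, "experimentalDecorators": true).
export function retry(maxRetries: number) {
  return function (
    _target: unknown,
    _propertyKey: string,
    descriptor: PropertyDescriptor
  ): PropertyDescriptor {
    const original = descriptor.value;
    descriptor.value = async function (...args: unknown[]) {
      for (let attempt = 0; ; attempt++) {
        try {
          return await original.apply(this, args);
        } catch (error) {
          if (attempt >= maxRetries) {
            throw error; // maximum retries exceeded: re-throw
          }
          // exponential backoff: 1s, 2s, 4s, ...
          const delayMs = 1000 * 2 ** attempt;
          await new Promise((resolve) => setTimeout(resolve, delayMs));
        }
      }
    };
    return descriptor;
  };
}

// Usage:
//   class Provider {
//     @retry(3)
//     async complete(prompt: string) { /* call the API */ }
//   }
```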
Let me know what you think!