Hi @danielchalef, I'm proposing this PR since I'm currently using this feature in my own project. It may be useful to users running open-source models, or to users who use an Anthropic LLM but would like to use OpenAI embeddings.
The goal is to give the option of configuring a separate embeddings client. In my case, for example, I use Llama 2 70B Chat through an OpenAI-compatible endpoint, but such endpoints often only implement the LLM part, not embeddings (I use Anyscale Endpoints, but the same problem applies to users hosting their own open-source model behind an OpenAI-compatible endpoint, e.g. with vLLM).
With this PR, if the embeddings client is disabled (the default), the LLM endpoint is used for embeddings, as is currently the case.
If you enable the embeddings client, you can configure an endpoint that is used specifically for embeddings (when not using local embeddings); for now, only the OpenAI service option is supported. This allows me, for example, to use the open-source model for intents and summaries, and the OpenAI API for embeddings.
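To illustrate the setup described above, here is a rough sketch of what the configuration could look like; the key names below are illustrative only, not necessarily the exact ones introduced in this PR:

```yaml
llm:
  service: "openai"
  # OpenAI-compatible endpoint serving an open-source model (LLM only, no embeddings)
  openai_endpoint: "https://api.endpoints.anyscale.com/v1"
  model: "meta-llama/Llama-2-70b-chat-hf"

embeddings_client:
  # false by default: embeddings fall back to the LLM endpoint, as today
  enabled: true
  # only the OpenAI service option is supported for now
  service: "openai"
  openai_endpoint: "https://api.openai.com/v1"
```

With `enabled: false`, behavior is unchanged from the current code; with `enabled: true`, embedding requests go to the dedicated endpoint instead of the LLM one.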
The tests pass locally (except the Anthropic ones, since I don't have a key), and I added some new tests for this use case as well. It's also working on Render with the embeddings client disabled or enabled.
Let me know if you think it would be worth merging this feature into the main repo (this is only the second time I've written Go, so please also tell me if I should rework or rewrite some parts).
I could also open a separate PR to allow customising the intent prompt when using open-source models, if you think that would be useful.