
Enable separate Embeddings Client #258

Closed
wants to merge 9 commits into from

Conversation

bricemacias
Contributor

Hi @danielchalef, I'm proposing this PR since I'm currently using this feature in my own project, and it may be useful to users running open source models, or users on the Anthropic LLM who would like to use OpenAI embeddings.

The goal is to give the option to create a separate embeddings client. For example, in my case I use Llama 2 70B Chat through an OpenAI-compatible endpoint, but those endpoints only expose the LLM part, not the embeddings part (I use Anyscale Endpoints, but the same problem applies to users hosting their own open source model behind an OpenAI-compatible endpoint with vLLM, for example).
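To make the failure mode concrete, here is a minimal sketch using the go-openai library; the API key, base URL, and model name are placeholders, not values from this PR:

```go
package main

import (
	"context"
	"fmt"

	openai "github.com/sashabaranov/go-openai"
)

func main() {
	// Point the standard OpenAI client at an OpenAI-compatible endpoint.
	// The key, base URL, and model name below are illustrative only.
	cfg := openai.DefaultConfig("MY_API_KEY")
	cfg.BaseURL = "https://api.endpoints.anyscale.com/v1"
	client := openai.NewClientWithConfig(cfg)

	// Chat completions work against such an endpoint...
	resp, err := client.CreateChatCompletion(context.Background(),
		openai.ChatCompletionRequest{
			Model: "meta-llama/Llama-2-70b-chat-hf",
			Messages: []openai.ChatCompletionMessage{
				{Role: openai.ChatMessageRoleUser, Content: "Hello"},
			},
		})
	if err != nil {
		fmt.Println("chat error:", err)
		return
	}
	fmt.Println(resp.Choices[0].Message.Content)

	// ...but the same endpoint usually has no embeddings route, so this
	// call fails; that gap is what this PR works around.
	if _, err := client.CreateEmbeddings(context.Background(),
		openai.EmbeddingRequest{
			Model: openai.AdaEmbeddingV2,
			Input: []string{"Hello"},
		}); err != nil {
		fmt.Println("embeddings error:", err)
	}
}
```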

With this PR, if the embeddings client is disabled (false by default), the LLM endpoint is used for embeddings, as is currently the case.
If you enable the embeddings client, you can configure an endpoint that will be used specifically for embeddings (when not using local embeddings); only the OpenAI service option is supported for the moment. This allows me, for example, to use the open source model for intents and summaries and the OpenAI API for embeddings.
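A minimal sketch of the fallback logic described above; the config struct and field names are assumptions for illustration, not the PR's actual code:

```go
package llms

import openai "github.com/sashabaranov/go-openai"

// EmbeddingsClientConfig is a hypothetical shape for the new config section;
// the field names are illustrative, not the PR's actual keys.
type EmbeddingsClientConfig struct {
	Enabled  bool   // false by default
	Service  string // only "openai" is supported for now
	APIKey   string
	Endpoint string // optional custom base URL
}

// embeddingsClient picks which client handles embeddings requests.
func embeddingsClient(cfg EmbeddingsClientConfig, llmClient *openai.Client) *openai.Client {
	if !cfg.Enabled {
		// Disabled (the default): reuse the LLM endpoint for embeddings,
		// preserving the current behavior.
		return llmClient
	}
	// Enabled: build a separate client pointed at the embeddings endpoint.
	clientCfg := openai.DefaultConfig(cfg.APIKey)
	if cfg.Endpoint != "" {
		clientCfg.BaseURL = cfg.Endpoint
	}
	return openai.NewClientWithConfig(clientCfg)
}
```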

The tests pass locally (except the Anthropic ones, since I don't have a key), and I've added some new tests for this use case as well. It also works on Render with the embeddings client disabled or enabled.

Tell me if you think it would be worth merging this feature into the main repo (this is only the second time I've written Go, so please also tell me if I need to rework/rewrite some parts).
I could also open a separate PR to allow customising the intent prompt when using open source models, if you think that would be useful.

@danielchalef
Member

Sorry for the delay. This is great! Please allow me a few days to review it.

@danielchalef
Member

We're refactoring how inference works to allow separate endpoints for the various inference actions, so this PR has been superseded.
