
Connect hosted LLM #350

Open · wants to merge 80 commits into main
Conversation

dimits-ts (Collaborator)

Fixes #305

Allows open-source LLMs hosted on any server to be called in the same way Gemini is.

Changes:

  • Abstracted LLM API type (a minimal sketch follows this list)
    • Created an interface to use any kind of API key
    • Exported an interface for LLM responses regardless of API
    • API validity checks can now be extended to any kind of API
  • Added Ollama server support
    • Supports most open-source models (not limited to Llama)
    • Added a frontend section to connect to an Ollama server through an HTTP URL
    • Added a sample Ollama server as a Docker container for testing
  • Included unit tests for most new functions
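
To make the abstraction concrete, here is a minimal sketch (not the PR's actual code; all names are illustrative) of an API-agnostic response type backed by Ollama's /api/chat endpoint:

// Sketch only: an API-agnostic response type and an Ollama-backed call.
export interface ModelResponse {
  text: string; // the LLM's reply, regardless of which API produced it
}

export async function callOllamaChat(
  serverUrl: string, // e.g. "http://localhost:11434"
  modelName: string, // e.g. "llama3.2"
  prompt: string,
): Promise<ModelResponse> {
  // Ollama's chat endpoint; stream: false returns a single JSON object.
  const res = await fetch(`${serverUrl}/api/chat`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: modelName,
      messages: [{ role: "user", content: prompt }],
      stream: false,
    }),
  });
  if (!res.ok) {
    throw new Error(`Ollama request failed: ${res.status}`);
  }
  const data = await res.json();
  // The non-streaming response carries the reply under message.content.
  return { text: data.message?.content ?? "" };
}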

TODOs:

  • The Ollama server may need some form of authentication.
  • While Ollama supports almost all popular open-source LLMs, this patch only allows the use of llama3.2.
    • We could include a text area or dropdown menu in Settings with all available models. The issue is that doing so would require pulling and deploying all such models on the remote Ollama server first (a sketch of one option follows this list).
  • The API key check only verifies that the key is not empty, not that it is valid. This also applies to Gemini keys.
  • Right now, all moderators are either Gemini or Llama. Future versions may allow mixing different kinds of models (see the Llama integration design doc).
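
As one possible direction for the model-selection TODO above, here is a hedged sketch (function name and error handling are illustrative) that lists the models already pulled on a remote Ollama server via its /api/tags endpoint, which could populate a Settings dropdown without hardcoding llama3.2:

// Sketch only: list the models available on the remote Ollama server.
export async function listOllamaModels(serverUrl: string): Promise<string[]> {
  // GET /api/tags returns the models already pulled on that server.
  const res = await fetch(`${serverUrl}/api/tags`);
  if (!res.ok) {
    throw new Error(`Could not list models: ${res.status}`);
  }
  const data = await res.json();
  // Each entry has a "name" such as "llama3.2:latest".
  return (data.models ?? []).map((m: { name: string }) => m.name);
}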

dimits-ts and others added 30 commits October 31, 2024 15:42
we can remove the tests later but they are essential for now
the ollama api necessitates keeping the model state active, so we need a class instead of a set of functions
on second thought it's too much trouble for what it's worth
Standardize docker container creation for local/remote LLM hosting
Collaborator

We probably don't want to duplicate the API libraries between the frontend and functions directories - too easy for the copies to get out of sync. I'm still getting familiar with this codebase, so I'm not sure of the best way to organize this. @cjqian or @vivtsai, any thoughts?

Collaborator

Agreed - I think having API functions in only one place would be best. @dimits-ts, is there a reason to make LLM calls from the frontend? Our initial assumption when setting up the codebase was that backend calls would suffice or be preferred.

Collaborator

At quick glance it looks like the frontend/api is no longer used and could be deleted (since the actual ollama call uses functions/src/api)?

Collaborator Author

You are both correct; I have no idea how the files ended up duplicated. The actual API calls are made to the functions module; the frontend/api module is unused. I am removing it - sorry for the confusion.

Collaborator

I'd recommend calling this ollama.api.ts, and consistently referring to this by "ollama" instead of "llama" (e.g. OLLAMA_CUSTOM_URL instead of LLAMA_CUSTOM_URL). Like you mention in the description, ollama can host plenty of non-llama models - but there are also other methods of hosting llama models that won't work with the ollama API.

Collaborator Author

You are correct; this was a holdover from when I wasn't sure which backend library would be used for hosting the OSS LLMs. I will correct it.

Collaborator

I'm still seeing some references to "llama", mostly the things defined in utils/src/experimenter.ts: llamaApiKey, LlamaServerConfig, LLAMA_CUSTOM_URL.

Collaborator Author

Ah right, I overlooked these. There should be no references to llama anywhere now. Apologies.

* and is managed through the `ollama` framework (https://github.com/ollama/ollama).
* Example docker instance hosting an ollama server: https://github.com/dimits-ts/deliberate-lab-utils/tree/master/llm_server
*
* Note: there already exists a client library for JavaScript, but not for Typescript.
Collaborator

I'm working on adding more API support, and I'm planning to support Ollama either via the openai library (using Ollama's OpenAI API compatibility) or via multi-llm-ts. This looks good in the meantime - just a heads-up that I wouldn't recommend putting a lot more work into polishing this.

dimits-ts (Collaborator Author) · Dec 17, 2024

In that case, we may need to change the name in the future to denote that this functionality is purely for local Ollama support? Admittedly, I'm not familiar with Ollama's OpenAI integration.

Collaborator

The openai API will work fine wherever Ollama is hosted. That method won't touch OpenAI's servers or anything; it's just a client library that we don't need to maintain ourselves.
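
For context, a minimal sketch of what this would look like: the openai client library pointed at a self-hosted Ollama server's OpenAI-compatible endpoint. The base URL and model name below are assumptions for illustration, not values from this PR.

import OpenAI from "openai";

// Sketch only: the openai package used purely as a client library against a
// self-hosted Ollama server (its OpenAI-compatible API lives under /v1).
// No request ever reaches OpenAI's servers.
const client = new OpenAI({
  baseURL: "http://localhost:11434/v1", // assumed local Ollama server
  apiKey: "ollama", // required by the client, ignored by Ollama
});

async function askLocalModel(prompt: string): Promise<string> {
  const completion = await client.chat.completions.create({
    model: "llama3.2", // any model pulled on the Ollama server
    messages: [{ role: "user", content: prompt }],
  });
  return completion.choices[0]?.message?.content ?? "";
}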

dimits-ts (Collaborator Author)

New updates:

  • The user can now select which OSS model will be used through the Settings tab
  • Disambiguated "ollama server" from "llama" both in source files and in HTML
  • Added some HTML text and links in the "Ollama Server" section for ease of use

I think this is enough for this PR feature-wise. The other issues outlined in the TODO section above may need extensive changes, which may impact how we internally handle agents and configurations in general.

P.S. I did encounter a bug where the exported index.ts files wouldn't get updated for some reason (which is one of the reasons these patches were delayed). I can't seem to replicate it now, however. If you encounter any issues with outdated types, give me a heads-up.

}

// Log the response
console.log(response);
Member

Consider removing these console messages (here and elsewhere) or adding more informative statements.

Collaborator Author

I removed most from the functions module and my own code in chat_triggers, but left the warnings and errors as console.warn and console.error, respectively. I didn't touch the logs in the rest of the codebase; let me know if I should.

@@ -0,0 +1,19 @@
import { OllamaChat } from "./llama.api";
Member

Amazing, thank you for adding tests!

*/
type IncomingMessage = {
model: string,
created_at: Date,
Member

Nit: Consider renaming to dateCreated for consistency with the rest of the codebase

Collaborator Author

Upon inspection, this type didn't end up being used anywhere, since we are only extracting the LLM's response. I removed it for now.

model: string,
created_at: Date,
message: LlmMessage,
done_reason: string,
Member

Nit: For this variable and others, rename using mixedCase (e.g. doneReason, modelType) instead of done_reason, model_type for consistency.

dimits-ts (Collaborator Author) · Dec 19, 2024

This was an outdated version of the file that I mistakenly left committed (see Mike's comment above). Sorry for the confusion!

That said, I had indeed missed some; fixed!

apiCheck = html`
<div class="warning">
<b>Note:</b> In order for LLM calls to work, you must add your Gemini
API key under Settings.
<b>Note:</b> In order for LLM calls to work, you must add your Gemini API key
Member

Let's make this more generic: "you must add an API key or server configuration under Experimenter Settings." (Here and in chat_panel.ts)

Collaborator Author

Fixed!

Collaborator

This probably shouldn't go in the repository.

Member

+1

Collaborator Author

Fixed and added to .gitignore.


id: string;
apiKeys: APIKeyConfig;
vivtsai (Collaborator) · Dec 20, 2024

@dimits-ts @mkbehr @cjqian The original idea here was to have APIKeyConfig contain all API key info (as ExperimenterData may eventually contain non-API key settings). What do you think about keeping APIKeyConfig and nesting all API settings under it? E.g.:

export interface ExperimenterData {
  id: string;
  email: string; // added recently in `main` branch commit
  apiConfig: APIKeyConfig;
}

export interface APIKeyConfig {
  geminiApiKey: string;
  llamaApiKey: LlamaServerConfig;
  activeApiKeyType: ApiKeyType;
}

(we also may want geminiApiKey to point to a config, but I'm fine with waiting until we have something else to store there)

Collaborator Author

I agree that this seems like a clearer API. I refactored the types like your example.
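
For illustration only (these are hypothetical shapes, not the PR's actual definitions in utils/src/experimenter.ts), the nested config after the llama-to-ollama rename might look roughly like this:

// Hypothetical sketch; field names are assumptions.
export interface OllamaServerConfig {
  url: string; // HTTP URL of the self-hosted Ollama server
  llmType: string; // model selected in Settings, e.g. "llama3.2"
}

export interface APIKeyConfig {
  geminiApiKey: string;
  ollamaApiKey: OllamaServerConfig;
  activeApiKeyType: ApiKeyType;
}

export interface ExperimenterData {
  id: string;
  email: string;
  apiKeys: APIKeyConfig;
}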

vivtsai (Collaborator) left a comment

Hi @dimits-ts - thanks for working on this! I left a comment about structuring ExperimenterData / API key configs.

cjqian (Member) commented Dec 20, 2024

I took my final pass and pushed a small cosmetic change. I've also verified that Gemini keys continue to work following this change.

Dimitris, this looks phenomenal! Thank you for all of your hard work and this incredible contribution.

Leaving it to @vivtsai to merge when she's done with her final pass.

dimits-ts (Collaborator Author) commented Dec 23, 2024

I have refactored the types as @vivtsai recommended and took the opportunity to make the code in experimenter_data_editor.ts more DRY. The changes seem to be compatible with the latest changes upstream.

Successfully merging this pull request may close these issues: Add a Llama API key option
4 participants