Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for locally hosted models #190

Closed
3coins opened this issue May 24, 2023 · 14 comments
Closed

Support for locally hosted models #190

3coins opened this issue May 24, 2023 · 14 comments
Labels
enhancement New feature or request @jupyter-ai/chatui @jupyter-ai/magics project:extensibility Extension points, routing, configuration
Milestone

Comments

@3coins
Copy link
Collaborator

3coins commented May 24, 2023

Summary

@krassowski brought this up during the JupyterLab weekly meeting. This is important because of privacy concerns and some JupyterLab users would prefer not sending their prompts across the wire. Alternately, we should have a more pronounced messaging so the user is aware that their inputs will be send to the model and embedding providers.

@3coins 3coins added the enhancement New feature or request label May 24, 2023
@3coins
Copy link
Collaborator Author

3coins commented May 24, 2023

@krassowski
Please feel free to add any more context if I missed anything.

@krassowski
Copy link
Member

jupyter-ai currently only contains providers for models accessible via over-the-wire API, although the tooling currently employed (LangChain) supports a number of local models, for example GPT4All, LLama-cpp, or Hugging Face Local Pipelines.

Since jupyter-ai does not support local models out of the box, I and others (#17) have previously asked about the way to register custom providers. The initial jupyter-ai proposal involved a cookiecutter for creating custom providers (back then called engines) and back in March this seemed to still be advised by @dlqqq (#17 (comment)), however documentation for cookiecutters at first degraded and then was removed (#163) and the cookiecutter approach was described as no longer recommended in favour of declaring and registering custom lang chain models (#163 (comment), #17 (comment)). However, there appears to be no way (please correct me if I am wrong) to register custom models as of today (although there is a PR open for some time at #136), nor is it clear how it would work for non lang chain, and in general non-language models (e.g. stable diffusion kind).

The approach proposed in #136 is fine for hacking things together or switching models of pre-defined providers, but when it comes to registration of completely new models it is highly repetitive and would force users to paste chunks of boilerplate code into their notebooks (#136 (review comment)). Therefore it is not a proper replacement for:

  • native definition of providers for offline models which are supported LangChain (I can work on these if you accept such a contribution)
  • well documented way of creating packages AI modules (providers, engines, whatever we call them), whether based on LangChain or not.

@dlqqq
Copy link
Member

dlqqq commented May 30, 2023

@krassowski Wow, thank you for such awesome feedback! It's clear that you've been keeping up with our development very closely. Let me address some of your points:

  • The cookiecutter template still exists in the repository under packages/jupyter-ai-module-cookiecutter. However, we are heavily focused on the core jupyter_ai package and are not prioritizing the robustness and documentation of the cookiecutter. This project is changing so rapidly that maintaining the cookiecutter is unduly burdensome.

  • Supporting local language models is a high-priority issue for us; we received lots of demand for this at JupyterCon, and we're excited to bring local LMs to Jupyter AI. However, there are several subtle technical considerations that need to be addressed before we can bring in local LMs:

    • Platform compatibility -- how do we ensure the best support of each language model on each system? What happens if the LM requires more compute/memory than the platform hardware offers? Etc. There is a lot of investigation to be done here.

    • General interface for local LMs -- are cookiecutter templates really needed? i.e. Is there a way to build a general interface via LangChain for any locally hosted language model without needing to write a custom Jupyter AI module via the cookiecutter?

    • Request/response schemas -- How do we let users specify the request/response schema of an arbitrary local LM? The key difference here is that when using upstream 3P LMs (e.g. OpenAI), those have a defined request/response syntax in the API. However, in the general case of an arbitrary local LM, the schema is unknown to us, and the user must somehow specify this. This is being addressed in Support SageMaker Endpoints in chat #197, but this may prove insufficient for local LMs.

We are working on all of these issues as we speak. We would like local LM support to be as robust and high-quality as possible before we release this feature, so we encourage patience here. We would also like to welcome any and all feedback on this feature request to help guide us as we are implementing this.

@krassowski
Copy link
Member

Platform compatibility [...] There is a lot of investigation to be done here.

In my humble opinion enabling users to test it ASAP would accelerate investigations and expose user expectations.

are cookiecutter templates really needed?

I would be happy with or without a cookiecutter - as long as documentation on entrypoints and APIs exist.

Request/response schemas

Cross-ref #193. Again I think enabling advanced user experimentation would accelerate discovering what needs to be done :)

@krassowski
Copy link
Member

To give an example of what I mean by public API for registering custom models programatically, as simplest (not neccessairly best) solution would be renaming AiMagics._safely_set_target to AiMagics.register_model(name, model) in #136.

@JasonWeill
Copy link
Collaborator

We're about to release Jupyter AI 0.8.0. I'm going to move this to the next release, scheduled for about two weeks from now; let's make local models a priority. This feature has been widely demanded and would add significant value to Jupyter AI.

@dlqqq dlqqq modified the milestones: 0.9.0 Release, 0.10.0 Release Jun 23, 2023
@JasonWeill JasonWeill added the project:extensibility Extension points, routing, configuration label Jul 18, 2023
This was referenced Jul 28, 2023
@JasonWeill
Copy link
Collaborator

In preparation for local models, I'm working on custom prompt templates per provider (see #226) in PR #309.

@FurkanGozukara
Copy link

FurkanGozukara commented Aug 3, 2023

If you add local support hopefully i will make a tutorial for this on my channel

Add a dropdown box that people can select model

It will download the model automatically from hugging face

my channel has over 22k subscriber atm : https://www.youtube.com/SECourses

@egeucak
Copy link

egeucak commented Aug 3, 2023

I am willing to contribute to support huggingface text generation inference endpoints.

@ktaletsk
Copy link

ktaletsk commented Aug 3, 2023

I am testing deploying models in the same Kubernetes cluster as JupyterHub with https://github.com/chenhunghan/ialacol and would like to connect to them from the extension

Since the APIs provided by ialacol are mimicing OpenAPI's, it should be relatively straightforward to support, where instead of OpenAPI key the UI would allow customizing endpoint URL instead.

I understand, this might be different from local models, they are better called "self-hosted", then local.

@dlqqq
Copy link
Member

dlqqq commented Aug 14, 2023

#209 introduces early-stage support for GPT4All, which will allow you to run language models locally in the next release. Requests for additional features can be tracked in separate issues. Thank you all for providing your feedback on this issue! 👍

@dlqqq dlqqq closed this as completed Aug 14, 2023
@dlqqq
Copy link
Member

dlqqq commented Aug 14, 2023

@FurkanGozukara I've created an issue to track your feature request: #343

@dlqqq
Copy link
Member

dlqqq commented Aug 14, 2023

@ktaletsk You will be able to set the OpenAI proxy in the next release. 🎉 See: #322

@surak
Copy link

surak commented Jan 19, 2024

Please have a look at my comment here: #389 (comment)

About self-hosted openai-compatible servers. That allows organizations to centralize the inference and connect all jupyter clients to a single big, fast server.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request @jupyter-ai/chatui @jupyter-ai/magics project:extensibility Extension points, routing, configuration
Projects
None yet
Development

No branches or pull requests

8 participants