
Allow to swap the DefaultInlineCompletionHandler #702

Open
krassowski opened this issue Mar 27, 2024 · 6 comments
Labels
enhancement New feature or request

Comments

@krassowski
Member

Problem

DefaultInlineCompletionHandler makes some choices which may be sub-optimal when using models tuned for code-infill tasks.

For example, the reliance on the langchain template system here:

# build template inputs from the request and run the langchain chain
model_arguments = self._template_inputs_from_request(request)
suggestion = await self.llm_chain.ainvoke(input=model_arguments)
# clean up the raw model output before returning it
suggestion = self._post_process_suggestion(suggestion, request)

makes it hard to use structured prefix/suffix queries as described in #669.

Further, the post-processing assumptions may differ when using an infill model:

def _post_process_suggestion(
    self, suggestion: str, request: InlineCompletionRequest
) -> str:
    # e.g. stripping markdown fences from the output; such assumptions
    # do not necessarily hold for infill-tuned models

Proposed Solution

Either:

  • (a) add a traitlet allowing the DefaultInlineCompletionHandler to be swapped for a different class
  • (b) add an entry point allowing the DefaultInlineCompletionHandler to be swapped for a different class

Any preferences?

Additional context

Chat slash command handlers can be added (and perhaps swapped?) using entry points:

[project.entry-points."jupyter_ai.chat_handlers"]
custom = "custom_package:CustomChatHandler"
@krassowski
Member Author

The traitlet approach is commonly used for swapping classes and can be configured in standard jupyter config files. For example, it is used by jupyter-scheduler here:

scheduler_class = Type(
    default_value="jupyter_scheduler.scheduler.Scheduler",
    klass="jupyter_scheduler.scheduler.BaseScheduler",
    config=True,
    help=_i18n("The scheduler class to use."),
)

and extensively in jupyter-server itself.
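
For jupyter-ai, option (a) might look roughly like the sketch below. The trait name, the class paths, and the owning Configurable are all illustrative assumptions here, not an existing API:

from traitlets import Type
from traitlets.config import Configurable

class AiExtension(Configurable):  # stand-in for the actual extension app class
    # hypothetical trait mirroring jupyter-scheduler's scheduler_class;
    # the default_value/klass paths are assumptions, not confirmed classes
    completion_handler_class = Type(
        default_value="jupyter_ai.completions.handlers.DefaultInlineCompletionHandler",
        klass="jupyter_ai.completions.handlers.BaseInlineCompletionHandler",
        config=True,
        help="The inline completion handler class to use.",
    )

Users could then point this at their own subclass in a standard Jupyter config file, e.g. c.AiExtension.completion_handler_class = "my_package.MyInfillHandler".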

While jupyter-ai uses traitlets for configuration, most of the extensibility is implemented using entry points:

  • jupyter_ai.chat_handlers,
  • jupyter_ai.model_providers,
  • jupyter_ai.default_tasks,
  • jupyter_ai.embeddings_model_providers

However, these entry points currently act as extension points rather than points for overriding the default handlers. In particular, a long-standing issue is that jupyter_ai.chat_handlers cannot be used to override the default handlers (#507).
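
For context, entry-point discovery works roughly as follows (standard importlib.metadata usage, Python 3.10+); note that nothing arbitrates between two installed packages registering under the same group, which is what makes overriding ambiguous:

from importlib.metadata import entry_points

# load every chat handler registered under the group jupyter-ai uses;
# if two packages register an entry with the same name, which one takes
# effect is effectively undefined
for ep in entry_points(group="jupyter_ai.chat_handlers"):
    handler_class = ep.load()
    print(ep.name, handler_class)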

@dlqqq @JasonWeill any thoughts/preferences?

@krassowski
Member Author

Thinking more about it, there is a clear case to favour the traitlets approach:

  • using a traitlet makes sense because there can only ever be one completion handler; this avoids confusion when multiple packages provide entry points, which would lead to undefined behaviour
  • if we wanted multiple models to generate completions, we would probably want some kind of meta-handler rather than registering multiple handlers; the top-level handler has to coordinate sending responses and streaming, so even if we imagine having multiple handlers in the future, there should always be one top-level handler to govern them all, and that one could be traitlets-configurable with a list of child handlers
  • a user may wish to install multiple external providers without having them replace the default handler; installing a package should not be equivalent to agreeing to change the default handler
  • the handler exposes server resources, so swapping it should be a conscious user decision; swapping in a handler which does not follow auth practices may prove costly - another point in favour of the traitlets approach

@krassowski krassowski self-assigned this Apr 4, 2024
@michaelchia
Collaborator

Sorry if I'm intruding, but as a user who has been extending jupyter-ai for use in my organisation, my personal preference would be for an entry point approach with different completion providers, mostly for consistency with the llm and embedding providers (I've gone into a bit more detail in #669). There are customisations in the current default handler that would make more sense to me implemented in some completion provider subclass instead.

My preference would be to generalise the completion handler and have most of the current logic (prompt templates and llm_chain) be part of a subclass of a base completions provider class that can easily be applied to the current llm providers, while the base completions provider class would be more general, allowing extensions to implement some other logic (potentially non-langchain) to provide completions.

Thanks for your consideration.

@krassowski
Member Author

My preference would be to generalise the completion handler and have most of the current logic (prompt templates and llm_chain) be part of a subclass of a base completions provider class that can easily be applied to the current llm providers

I think I agree. FYI, I previously moved the prompt templates from handlers to providers in #581.

The provider could get two methods with default implementations:

  • async generate_inline_completion(request) -> InlineCompletionReply
  • async stream_inline_completion(request) -> AsyncGenerator[InlineCompletionStreamChunk]

This would be backward compatible. The only downside is extending the API surface of BaseProvider, and I am not sure whether maintainers would accept such a contribution - I sometimes find it hard to get a reply here. I will open two PRs, one allowing the default handler to be swapped, and another implementing the above methods on the provider; let's see which one works out better.
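
A minimal sketch of the proposed shape, with placeholder types standing in for jupyter-ai's real completion models; the signatures follow the bullet points above, but everything else is an assumption:

from typing import AsyncGenerator

# placeholder types for this sketch; jupyter-ai defines the real
# InlineCompletionRequest/Reply/StreamChunk models
class InlineCompletionRequest: ...
class InlineCompletionReply: ...
class InlineCompletionStreamChunk: ...

class BaseProvider:
    """Sketch of the proposed additions, not the actual jupyter-ai class."""

    async def generate_inline_completion(
        self, request: InlineCompletionRequest
    ) -> InlineCompletionReply:
        # a default implementation could keep today's behaviour: render the
        # prompt template, invoke the langchain chain, post-process the result
        raise NotImplementedError

    async def stream_inline_completion(
        self, request: InlineCompletionRequest
    ) -> AsyncGenerator[InlineCompletionStreamChunk, None]:
        # a default implementation could yield chunks as the chain streams
        raise NotImplementedError
        yield  # unreachable; marks this method as an async generator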

@michaelchia
Collaborator

Although I am not totally against overloading the same llm provider class, as I can see how you could get it to work, I would have a preference for having a separate provider for inline completion logic.

I think it would be cleaner to separate the provider class for chat and the provider class for completion.
Firstly, you can cleanly separate completion concerns from chat concerns.
Secondly, I personally think chat llms and code completion models are different, based on my own experiments trying to get models like claude v2 and the google vertex models to do code completion reliably in a variety of scenarios (e.g. proper fill-in-the-middle from incomplete lines in prefix/suffix). I found that only true code completion models (e.g. code-llama, starcoder, code-gecko) are suitable for general code completion. Some general LLMs may be able to do it and could be provided in both providers, but that is not generally the case.
However, even with a separate completion provider class, if you do want to offer all llm providers as code completion providers by default, I'm sure there is some way of automatically creating completion provider classes for all llm providers.
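
For illustration only (every name below is hypothetical, not a real jupyter-ai class), such automatic derivation could use the standard three-argument form of type():

# hypothetical stand-ins; this only illustrates programmatic subclassing
class BaseCompletionProvider:
    """Holds the completion-specific logic (prompt format, post-processing)."""

def make_completion_providers(llm_provider_classes):
    # create one completion provider per llm provider by subclassing both
    return {
        cls.__name__: type(
            f"{cls.__name__}Completion", (BaseCompletionProvider, cls), {}
        )
        for cls in llm_provider_classes
    }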

@krassowski
Member Author

I would have a preference for having a separate provider for inline completion logic.

I do not disagree.

I think it would be cleaner to separate the provider class for chat and the provider class for completion.
Firstly, you can cleanly separate completion concerns from chat concerns.

True

Secondly, I personally think chat llms and code completion models are different, based on my own experiments trying to get models like claude v2 and the google vertex models to do code completion reliably in a variety of scenarios (e.g. proper fill-in-the-middle from incomplete lines in prefix/suffix). I found that only true code completion models (e.g. code-llama, starcoder, code-gecko) are suitable for general code completion. Some general LLMs may be able to do it and could be provided in both providers, but that is not generally the case.

Right, this is being addressed in #711 by the ability to mark specific models as suitable for completion only, for chat only, or for both.

However, even with a separate completion provider class, if you do want to offer all llm providers as code completion providers by default, I'm sure there is some way of automatically creating completion provider classes for all llm providers.

Sure. It does not seem pretty to me, but of course we can code anything up.

All this is to say that while I mostly agree with you, I also think the way to get things merged and released today is to make small incremental changes in a non-breaking fashion, and once it is time for a new major release, to rewrite the class hierarchy following the lessons learned along the way.

@krassowski krassowski removed their assignment Oct 9, 2024