Pre-proposal: standardize object representations for ai and a protocol to retrieve them #128
Thank you for the detailed pre-proposal @mlucool. I love the idea of Jupyter-AI and other LLM-powered tools being able to utilize the current state of the kernel in a standardized manner rather than merely using the content of the documents at hand. This approach resonates with the preference many users had for IPython's autocompletion, which offered more dynamic and context-aware suggestions compared to tools that relied purely on static analysis. By utilizing runtime information, we can achieve more accurate and relevant results.
From the investigation I did for @mlucool, I think it is possible (with some tweaks) to reuse the IPython display formatter. That would not only allow defining a dedicated AI formatter:
```python
In [1]: from IPython.core.formatters import BaseFormatter, FormatterABC
   ...:
   ...: class LLMFormatter(BaseFormatter):
   ...:     format_type = "x-vendor/llm"
   ...:     print_method = "_ai_repr_"
   ...:     _return_type = dict
   ...:
   ...: llm_formatter = LLMFormatter()
   ...:
   ...: class Foo:
   ...:     def _ai_repr_(self, *args, **kwargs):
   ...:         return {"text/plain": "this is foo"}
   ...:
   ...: llm_formatter(Foo())
Out[1]: {'text/plain': 'this is foo'}
```

We can also register formatters for external objects:
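For instance, here is a sketch of registering an AI repr for a type we don't control (the `Fraction` example and its output format are my own invention; `BaseFormatter.for_type` is IPython's existing registration hook):

```python
from fractions import Fraction

from IPython.core.formatters import BaseFormatter


class LLMFormatter(BaseFormatter):
    format_type = "x-vendor/llm"
    print_method = "_ai_repr_"
    _return_type = dict


llm_formatter = LLMFormatter()


# Register an AI repr for a third-party type without patching it.
def fraction_ai_repr(obj):
    return {"text/plain": f"the fraction {obj.numerator}/{obj.denominator}"}


llm_formatter.for_type(Fraction, fraction_ai_repr)

print(llm_formatter(Fraction(1, 3)))  # {'text/plain': 'the fraction 1/3'}
```

The registered function takes priority over the `print_method` lookup, so it works even though `Fraction` has no `_ai_repr_` of its own.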
I think as a first step I can:

I'm suggesting we create a package:

```python
from ai_repr import formatter  # or ai_formatter
```

that basically exposes the same functionality outside of IPython, and has a way of registering formatters via entry points so you can, for example, …
@mlucool Thank you for opening this JEP! I'm really excited to see others driving thought leadership on how we can improve Jupyter AI & other AI extensions further. I need to head out now for a personal matter, so I will give this a thorough review tomorrow morning.
Excited to see this being discussed. A past colleague of mine recently asked about these capabilities in Jupyter AI and what it would take to integrate them.

We'll make a PR that includes this in Jupyter AI as part of a larger PR that demos some things we think would be good to discuss with the community in the near future.
@mlucool Thank you for opening this again! I've reviewed this and have left some recommendations & initial thoughts below.

**Explore only returning MIME bundles**

It may be better to only return MIME bundles (a `dict`), never a bare string. Consider:

```python
class RawXmlTree():
    """Stores an arbitrary XML tree."""
    ...

class RawHtmlTree():
    """Stores an arbitrary HTML tree."""
    ...
```

Both of these classes have string representations, but the strings may be so similar in structure that the language model can't infer the format of the string. By requiring the implementation to always return a `dict` keyed by MIME type, the format is always explicit. Using union return types (`str | dict`) also forces every consumer to handle both cases.

**Explore preferring a functional interface**

One issue that may impede adoption of this proposal is that we don't have control over packages outside of Project Jupyter. If a project doesn't wish to implement this JEP, then we wouldn't have a way of computing an AI representation for its classes, since they will lack the `_ai_repr_()` method.

We could subclass each of those classes or implement meta-programming to add the methods immediately on import. However, these approaches add difficulty to the implementation, which may also impede adoption. The most basic implementation of this proposal can be described by a single, top-level function that provides the API:

```python
def compute_ai_repr(obj: Any) -> Dict[str, Any]:
    ...
```

By defining the top-level API as a single function that takes one object instead of multiple methods on every object, we can provide AI representations for objects from packages outside of Project Jupyter, without requiring an upstream change. Objects may still define an `_ai_repr_()` method that the function defers to:

```python
def compute_ai_repr(obj: Any) -> Dict[str, Any]:
    if hasattr(obj, '_ai_repr_'):
        return obj._ai_repr_()
    ...
```

A functional top-level API provides benefits for both implementers & consumers. I believe we should consider this when discussing implementation ideas.

**Explore including implementation guidance**

The JEP should also include guidance on implementing AI representations. This will drive adoption by improving consistency across implementations & by making it easier for future contributors to write new implementations. As we experiment further, we should think deeply about what guidance we should provide to implementers. For example, here are some rough guidelines that I think are worth considering:
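The single-function idea above could be prototyped with `functools.singledispatch`, which gives an external registry essentially for free (all names below are illustrative, not proposed API):

```python
from functools import singledispatch
from typing import Any, Dict


@singledispatch
def compute_ai_repr(obj: Any, **kwargs) -> Dict[str, Any]:
    # Prefer the object's own method when it defines one...
    method = getattr(obj, "_ai_repr_", None)
    if method is not None:
        return method(**kwargs)
    # ...otherwise fall back to a plain-text repr.
    return {"text/plain": repr(obj)}


# Types from packages outside Project Jupyter can be supported
# without any upstream change, via registration:
@compute_ai_repr.register
def _(obj: complex, **kwargs) -> Dict[str, Any]:
    return {"text/plain": f"complex number {obj.real} + {obj.imag}i"}


class Foo:
    def _ai_repr_(self, **kwargs):
        return {"text/plain": "this is foo"}
```

Calling `compute_ai_repr(Foo())` defers to the method, while `compute_ai_repr(1 + 2j)` uses the registered function, and anything else falls back to `repr`.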
I'm really excited to see a proof of concept for this, and would be happy to review your PR! Jupyter AI has support for context commands, which take a dedicated syntax. This source file may be helpful: https://github.com/jupyterlab/jupyter-ai/blob/main/packages/jupyter-ai/jupyter_ai/context_providers/file.py

Note that I will be out of office from tomorrow to Mon Dec 16, so I will only be able to review your PR after that. 👋
Thanks for the reply! Looks like there is consensus forming here.

I think this is somewhat close to what @Carreau proposed with a package and a registry that handles more of this. While I agree that the single function is enough, I'm not sure it adds anything practically here, but maybe I'm missing something.

I am hesitant here to be too prescriptive. I think it's pretty unknown at this point, and I would prefer not to have any "must" requirements describing the output. Even guidance like "must be deterministic" feels extreme. What if you asked an LLM to turn your large repr into something small automatically? Should that be an antipattern if it turns out to be very effective?

As an FYI, the proof of concept is not just limited to this feature and is meant as a discussion point on AI in Jupyter in general. The PR is likely too big to be merged, so we wanted to get feedback on key parts (one of which is this concept).
What about adding bytes as a potential output?
As noted by others above, I agree with this.
@govinda18 made a PR to demo this and many other features in Jupyter AI: jupyterlab/jupyter-ai#1157. The screencast should give readers a pretty good sense of its power: it loads data from a CSV and knows column names, all without the user needing to be explicit. Really, I encourage people to try the PR themselves and get a sense of how it feels. As noted, that is a large PR, so we can further discuss other features there (note: it uses the previous internal name for this method).

As briefly discussed with @krassowski and @Carreau, we'll make a JEP focusing on the protocol change, which is clearly within Jupyter's purview. The other mechanics we can be a bit more agnostic about, and they are not clearly something we need a JEP for.
**Summary**

To deeply integrate AI into Jupyter, we should standardize both a method on objects to represent themselves and a messaging protocol for retrieving these representations. We propose using

```python
_ai_repr_(**kwargs) -> str | dict
```

for objects to return representations. Additionally, we suggest creating a registry in kernels (e.g. IPython) for users to set representations for objects that do not define this method, along with a new message type for retrieving these representations.

**Motivation**

Users should be able to include representations of instances of objects from their kernel as they interact with AI. This capability is what sets a productive Jupyter experience apart from other IDE-based approaches. For example, you should be able to use Jupyter AI and ask `Given @myvar, how do I add another row` or `What's the best way to do X with @myvar`?

While using something like `_repr_*` may have been sufficient, it can slow down display requests and does not allow passing information to hint about the shape of representations. For example, imagine a `Chart`. In a multimodal model, we may want to use a rendered image, but in a text-only model, we may want to pass only a description. Other model parameters or user preferences may also matter, such as the size of the context window or how verbose they want the representation to be.

Because of this, we suggest defining a new standard called `_ai_repr_(**kwargs) -> str | dict`. This method should return either a string or a MIME bundle. Additionally, since many libraries will not have this defined initially, there should be a registry where users can create a set of defaults and/or overrides, allowing them to use this feature without waiting for libraries to define it themselves.

Finally, the UI (e.g., jupyter-ai) needs a way to retrieve these representations for a given object. This is best done by introducing a new message type that can include the object and the kwargs. We expect this process to be slow at times (e.g., generating an image for a chart), so the control channel should not be used. Instead, a normal comms message can be used today, and as support for subshells improves, we can use that to avoid blocking while kernels are busy.
**Example**
Continuing with the chart object example, we may want to add something like below. Typically this fictional chart returns structured data for its JS display to render, but now we want an image for the context, which we expect to be slow to compute (e.g. a headless browser may need to be launched to do this):
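For example (all of the `JSBasedChart` internals below are invented for illustration; the proposal only specifies the `_ai_repr_` hook itself):

```python
class JSBasedChart:
    """A fictional chart that normally hands structured data to a JS renderer."""

    def _repr_javascript_(self):
        # Fast path used for ordinary notebook display.
        return "render({...})"

    def _ai_repr_(self, multimodal=True, **kwargs):
        if not multimodal:
            # Text-only model: a cheap description is enough.
            return {"text/plain": "A line chart of sales per month"}
        # Multimodal model: render a PNG, which may be slow
        # (e.g. launching a headless browser).
        return {"image/png": self._render_png()}

    def _render_png(self):
        # Stand-in for the slow, real rendering step.
        return b"\x89PNG..."
```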
Other MIME types can also be used to enable the caller to represent the object in an optimized way for the model they are using (e.g., XML). For example, we could imagine Pandas DataFrames defining this method:
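Purely for illustration — pandas defines no such method today — a DataFrame implementation might look like this sketch, which hand-builds a small XML document from the first rows:

```python
import pandas as pd


def dataframe_ai_repr(df: pd.DataFrame, max_rows: int = 10, **kwargs) -> dict:
    """Hypothetical _ai_repr_ for DataFrames, returning an XML MIME bundle."""
    head = df.head(max_rows)
    rows = []
    for _, row in head.iterrows():
        # Series.items() yields (column_name, value) pairs here.
        cells = "".join(f"<{col}>{val}</{col}>" for col, val in row.items())
        rows.append(f"<row>{cells}</row>")
    xml = f'<dataframe rows="{len(df)}">' + "".join(rows) + "</dataframe>"
    return {"application/xml": xml, "text/plain": repr(head)}
```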
Now the caller can use this MIME type to render the object in the context window using XML if it chooses (see here):
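A sketch of what such a caller might do with the MIME bundle returned by `_ai_repr_` (the preference order here is my own, not part of the proposal):

```python
def pick_for_model(bundle: dict, multimodal: bool) -> tuple:
    """Choose the best available representation for the target model.

    Illustrative preference order: images only for multimodal models,
    then structured text formats, then plain text.
    """
    preferred = ["image/png"] if multimodal else []
    preferred += ["application/xml", "text/markdown", "text/plain"]
    for mime in preferred:
        if mime in bundle:
            return mime, bundle[mime]
    # Fall back to whatever the object provided.
    return next(iter(bundle.items()))
```

For a text-only model this skips any image entry and drops the XML (or, failing that, the plain text) into the context window.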
This approach intentionally mirrors how `repr` works in the Jupyter ecosystem, but it is focused on non-displayed reprs. In a similar fashion, we don't want to over-specify return types because we want to encourage innovation in this area.

Given the desire to query for this from the front end, we also propose a new message type similar to inspect_request, but allowing kwargs to be passed in by the caller. We intentionally do not want to define what these kwargs are at this early stage, preferring to let extension providers innovate and reach a consensus on what is useful. In the example above, we may pass `multimodal=False` and update the code in `JSBasedChart` to not render an image, or we may pass `context_window=1_000_000` and let the DataFrame repr include statistics per column, or maybe even put small tables into the context window as-is.

CC @Carreau @krassowski @SylvainCorlay