LlamaIndex Instrumentation #24
Conversation
nit: this file is long 😅
Can you split into multiple files based on the part of llamaindex you instrument?
I've moved the LLM instrumentation into a separate class and genericWrapper into utils. I tried to refactor LlamaIndexInstrumentation similarly to how it's done in Python. However, the _wrap method is part of the InstrumentationBase class.
packages/instrumentation-llamaindex/src/custom-llm-instrumentation.ts
if (shouldSendPrompts(plugin.config)) {
  span.setAttribute(
    `${SpanAttributes.LLM_COMPLETIONS}.0.role`,
    result.message.role,
  );
  span.setAttribute(
    `${SpanAttributes.LLM_COMPLETIONS}.0.content`,
    result.message.content,
  );
Two points here:
- Are you sure role exists here?
- Can't there be more than one completion?
- Are you sure role exists here?

Yes, it's a non-optional field of the ChatMessage interface:
interface ChatMessage {
  content: any;
  role: MessageType;
}
- Can't there be more than one completion?

I just checked the return type of chat; it can also return an AsyncGenerator. I'll update the chat and completion wrappers to handle streaming responses.
@galkleinman I've pushed changes to add support for capturing streaming responses.
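For context, here is a minimal sketch of one way streaming capture can be handled, assuming the wrapped chat/complete call may return an AsyncGenerator of string chunks. The helper name, the chunk type, and the SpanAttributes import path are illustrative and not necessarily what this PR implements:

import type { Span } from "@opentelemetry/api";
import { SpanAttributes } from "@traceloop/ai-semantic-conventions"; // import path assumed

// Re-yield streamed chunks to the caller and only record the completion
// and end the span once the stream is exhausted.
async function* accumulateStreamingResponse(
  span: Span,
  stream: AsyncGenerator<string>,
): AsyncGenerator<string> {
  let fullText = "";
  try {
    for await (const chunk of stream) {
      fullText += chunk;
      yield chunk; // pass the chunk through unchanged
    }
    span.setAttribute(
      `${SpanAttributes.LLM_COMPLETIONS}.0.content`,
      fullText,
    );
  } finally {
    span.end();
  }
}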
for (const key in moduleExports) {
  const cls = (moduleExports as any)[key];
  if (this.isLLM(cls.prototype)) {
This one will result in duplicate spans.

Explanation:
We instrument OpenAI (for example) independently of LlamaIndex. Since LlamaIndex uses the OpenAI SDK itself, a call to OpenAI will produce duplicate spans: one reported by the LlamaIndex instrumentation and one by the OpenAI instrumentation.

Judging by the name CustomLLMInstrumentation, I guess you saw my Python implementation. How did I solve this kind of issue there? In Python, LlamaIndex has a class called CustomLLM, which is intended as the base class for LLMs other than the trivial/commercial ones (OpenAI, Cohere, Anthropic, etc.); one example is Ollama. So I instrumented only the classes that inherit from CustomLLM. The commercial ones, which inherit from LLM, don't get instrumented by LlamaIndex and therefore don't produce a span here, but rather produce spans through their designated instrumentations.

Unfortunately, I checked, and it seems there isn't such a CustomLLM base class in LlamaIndexTS (validate me). Maybe we should have an exclusion list of LLMs that have their own instrumentation (a rough sketch follows below).
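For illustration, the exclusion could be a simple check inside the patch loop, roughly like this (the set contents and placement are hypothetical, not part of this PR):

// Hypothetical: classes whose SDKs already have dedicated instrumentations
// and would otherwise produce duplicate spans.
const EXTERNALLY_INSTRUMENTED = new Set(["OpenAI", "Anthropic"]);

for (const key in moduleExports) {
  const cls = (moduleExports as any)[key];
  if (EXTERNALLY_INSTRUMENTED.has(key)) {
    continue; // let the dedicated instrumentation report this call
  }
  if (this.isLLM(cls.prototype)) {
    // ...wrap chat/complete as before
  }
}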
But here only the calls instrumented in LlamaIndex are getting captured.
Code I used for testing:
import * as traceloop from "@traceloop/node-server-sdk";
import { OpenAI } from "llamaindex";

traceloop.initialize({
  appName: "sample_llamaindex",
  apiKey: process.env.TRACELOOP_API_KEY,
  disableBatch: true,
});

class SampleLlamaIndex {
  @traceloop.workflow("sample_query")
  async query() {
    const openai = new OpenAI();
    const res = await openai.complete('How are you?');
    return res;
  }
}

traceloop.withAssociationProperties(
  { user_id: "12345", chat_id: "789" },
  async () => {
    const sampleLlamaIndex = new SampleLlamaIndex();
    const result = await sampleLlamaIndex.query();
    console.log(result);
  },
);
  `${lodash.snakeCase(className)}.completion`,
);

span.setAttribute(SpanAttributes.LLM_VENDOR, "llamaindex");
Is there a way to figure out the actual LLM vendor here? I'd say that if we're using OpenAI, for example, then writing OpenAI here is more correct.
I can replace it with the class name of the LLM; I think that's the closest thing to the vendor. We've mapped LLM_VENDOR to the class name in the Python instrumentation as well (a rough sketch of that mapping is below).
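For example, something along these lines, where llm stands for the wrapped LLM instance (a sketch only; the variable name and the fallback value are assumptions):

// Sketch: use the LLM class name (e.g. "OpenAI", "Ollama") as the vendor
// instead of the hardcoded "llamaindex" string.
const vendor = llm.constructor?.name ?? "llamaindex";
span.setAttribute(SpanAttributes.LLM_VENDOR, vendor);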
Fixes: #22
/claim #22
Testing:
llamaindex.mp4
I had to skip the embedding part in this video because my daily quota for the embedding API was exhausted. I'll update the testing video once I receive a new quota.