
LlamaIndex Instrumentation #24

Merged
merged 20 commits into traceloop:main on Jan 12, 2024

Conversation

@Kartik1397 (Contributor) commented on Dec 31, 2023

Fixes: #22

/claim #22

Testing:

llamaindex.mp4

I skipped the embedding part in this video because my daily quota for the embedding API was exhausted. I'll update the testing video once I receive a new quota.

@Kartik1397 changed the title from [WIP] LlamaIndex Instrumentation to LlamaIndex Instrumentation on Jan 1, 2024
@algora-pbc bot mentioned this pull request on Jan 1, 2024
@Kartik1397 marked this pull request as ready for review on January 1, 2024 14:17
@nirga self-requested a review on January 1, 2024 14:38
@Kartik1397 force-pushed the instrumentation-llamaindex branch 2 times, most recently from 021a28f to 821e8c9 on January 5, 2024 21:10
Member commented:

nit: this file is long 😅
Can you split it into multiple files based on the part of llamaindex you instrument?

@Kartik1397 (Contributor Author) replied:

I've moved the LLM instrumentation into a separate class and genericWrapper into utils. I tried to refactor LlamaIndexInstrumentation similar to how it's done in Python; however, the _wrap method is part of the InstrumentationBase class (a sketch of one workaround follows).
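For what it's worth, a minimal sketch of one way around that (the class wiring here is hypothetical): keep the helper as a plain class and inject the parent instrumentation's bound _wrap, which has shimmer's wrap(object, name, wrapper) shape.

// Hypothetical wiring: the helper receives the parent instrumentation's bound
// _wrap instead of extending InstrumentationBase itself.
type WrapFn = (obj: any, name: string, wrapper: (original: any) => any) => any;

export class CustomLLMInstrumentation {
  constructor(private wrap: WrapFn) {}

  instrument(llmPrototype: any) {
    this.wrap(llmPrototype, "complete", (original) => {
      return function (this: any, ...args: any[]) {
        // span creation/ending omitted for brevity
        return original.apply(this, args);
      };
    });
  }
}

// Inside LlamaIndexInstrumentation (which extends InstrumentationBase):
//   const llm = new CustomLLMInstrumentation(this._wrap.bind(this));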

Resolved review threads (outdated):
packages/instrumentation-llamaindex/src/instrumentation.ts (3 threads)
packages/traceloop-sdk/package.json (1 thread)
@Kartik1397 Kartik1397 requested a review from nirga January 8, 2024 19:53
Comment on lines 146 to 154
if (shouldSendPrompts(plugin.config)) {
  span.setAttribute(
    `${SpanAttributes.LLM_COMPLETIONS}.0.role`,
    result.message.role,
  );
  span.setAttribute(
    `${SpanAttributes.LLM_COMPLETIONS}.0.content`,
    result.message.content,
  );
@galkleinman (Contributor) commented:

Two points here:

  1. Are you sure role exists here?
  2. There can't be more than one completion?

@Kartik1397 (Contributor Author) replied:

  1. Are you sure role exists here?

Yes, it's a non-optional field of the ChatMessage interface:

interface ChatMessage {
  content: any;
  role: MessageType;
}

  2. There can't be more than one completion?

I just checked the return type of chat; it can also return an AsyncGenerator. I'll update the chat and completion wrappers to handle streaming responses (a sketch of one approach follows).
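A minimal sketch of such a streaming wrapper (the helper name is hypothetical; SpanAttributes is assumed to be the same constant used elsewhere in this diff): re-yield each chunk to the caller, accumulate the full completion, and end the span only once the generator is exhausted.

import type { Span } from "@opentelemetry/api";
// Assumption: SpanAttributes comes from the package used elsewhere in this PR.
import { SpanAttributes } from "@traceloop/ai-semantic-conventions";

// Hypothetical helper: keeps the span open while the caller consumes the stream.
async function* wrapStreamingResponse(
  span: Span,
  stream: AsyncGenerator<string>,
): AsyncGenerator<string> {
  let content = "";
  try {
    for await (const chunk of stream) {
      content += chunk; // accumulate the full completion text
      yield chunk; // pass each chunk through to the caller unchanged
    }
    span.setAttribute(`${SpanAttributes.LLM_COMPLETIONS}.0.content`, content);
  } finally {
    span.end(); // end the span even if the consumer stops early
  }
}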

@Kartik1397 (Contributor Author) replied:

@galkleinman I've pushed changes to add support for capturing streaming response.


for (const key in moduleExports) {
  const cls = (moduleExports as any)[key];
  if (this.isLLM(cls.prototype)) {
Contributor commented:

This one will result in duplicate spans.

Explanation:
We instrument OpenAI (for example) independently of LlamaIndex. Since LlamaIndex uses the OpenAI SDK itself, a call to OpenAI will produce duplicate spans: one reported by the LlamaIndex instrumentation and the other by the OpenAI instrumentation.

Judging by the name CustomLLMInstrumentation, I guess you saw my Python implementation. How did I solve this kind of issue there?

In Python, LlamaIndex has a class called CustomLLM, which is intended as the base class for LLMs other than the commercial ones (OpenAI, Cohere, Anthropic, etc.); one example is Ollama. So I instrumented only the classes that inherit from CustomLLM. The commercial ones, which inherit from LLM, don't get instrumented by the LlamaIndex instrumentation and therefore don't produce a span there; they produce spans through their designated instrumentations instead.

Unfortunately, I checked, and it seems there is no such CustomLLM base class in LlamaIndexTS (validate me).

Maybe we should keep an exclusion list of LLMs that have their own instrumentation, as sketched below.
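A minimal sketch of that exclusion-list idea, built on the loop shown in this diff (the list contents here are hypothetical):

// Hypothetical list of LLM classes that already have dedicated instrumentations
// and would otherwise be reported twice.
const EXCLUDED_LLMS = new Set(["OpenAI", "Anthropic"]);

for (const key in moduleExports) {
  const cls = (moduleExports as any)[key];
  if (EXCLUDED_LLMS.has(key)) {
    continue; // spans for these come from their designated instrumentation
  }
  if (cls?.prototype && this.isLLM(cls.prototype)) {
    // wrap chat/complete as before
  }
}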

@Kartik1397 (Contributor Author) replied:

But here, only the calls instrumented in llamaindex are being captured.

Screenshot 2024-01-10 at 12 33 11 AM

Code I used for testing:

import * as traceloop from "@traceloop/node-server-sdk";
import { OpenAI } from "llamaindex";

traceloop.initialize({
  appName: "sample_llamaindex",
  apiKey: process.env.TRACELOOP_API_KEY,
  disableBatch: true,
});

class SampleLlamaIndex {
  @traceloop.workflow("sample_query")
  async query() {
    const openai = new OpenAI();
    const res = await openai.complete('How are you?');
    return res;
  }
}

traceloop.withAssociationProperties(
  { user_id: "12345", chat_id: "789" },
  async () => {
    const sampleLlamaIndex = new SampleLlamaIndex();
    const result = await sampleLlamaIndex.query();
    console.log(result);
  },
);

`${lodash.snakeCase(className)}.completion`,
);

span.setAttribute(SpanAttributes.LLM_VENDOR, "llamaindex");
Member commented:

Is there a way to figure out the actual LLM vendor here? I'd say that if we're using OpenAI, for example, then writing OpenAI here is more correct.

@Kartik1397 (Contributor Author) replied:

I can replace it with the class name of the LLM; I think that's the closest thing to the vendor. We've mapped LLM_VENDOR to the class name in the Python instrumentation.
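A one-line sketch of that suggestion (assuming className is in scope, as in the span-name snippet above):

// Report the concrete LLM class name (e.g. "OpenAI") rather than "llamaindex".
span.setAttribute(SpanAttributes.LLM_VENDOR, className);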

@nirga merged commit 7fea505 into traceloop:main on Jan 12, 2024
2 checks passed
Successfully merging this pull request may close these issues.

🚀 Feature: LlamaIndex Instrumentation
3 participants