Fix the aggregation of tokens in metrics taking into account function calls #1800

Open
1 of 6 tasks
markpollack opened this issue Nov 22, 2024 · 0 comments
markpollack commented Nov 22, 2024

This was done for Amazon Bedrock Converse but still needs to be done for all the other models.
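For illustration, here is a minimal sketch of the kind of aggregation being asked for. It assumes that a chat call involving a function call produces more than one model round trip, each reporting its own token usage, and that the metrics should reflect the sum. `TokenUsage` and its methods are hypothetical names made up for this example; they are not the Spring AI API.

```java
import java.util.List;

// Hypothetical value type for illustration only; not the Spring AI Usage interface.
record TokenUsage(long promptTokens, long generationTokens) {

    static final TokenUsage ZERO = new TokenUsage(0, 0);

    long totalTokens() {
        return promptTokens + generationTokens;
    }

    // Combine the usage reported by two model round trips.
    TokenUsage plus(TokenUsage other) {
        return new TokenUsage(promptTokens + other.promptTokens,
                generationTokens + other.generationTokens);
    }

    // Sum the usage of every round trip that belongs to one logical chat call,
    // e.g. the request that triggered a function call plus the follow-up request
    // that carried the function result back to the model.
    static TokenUsage aggregate(List<TokenUsage> roundTrips) {
        return roundTrips.stream().reduce(ZERO, TokenUsage::plus);
    }
}
```

With two round trips of (120 prompt, 15 generation) and (160 prompt, 40 generation) tokens, `aggregate` yields 280 prompt and 55 generation tokens, where reporting only the last response would give 160 and 40.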

markpollack added this to the 1.0.0-M5 milestone Nov 22, 2024
markpollack added the bug label Nov 22, 2024
tzolov added a commit to tzolov/spring-ai that referenced this issue Nov 29, 2024
- Modify OllamaChatModel to support accumulating tokens and durations across multiple responses
- Update ChatResponse metadata generation to aggregate usage and duration metrics
- Add tests to verify metadata aggregation behavior

Related to spring-projects#1800
tzolov added a commit to tzolov/spring-ai that referenced this issue Nov 29, 2024
…lamaApi

- Implement streaming tool call support in OllamaApi and OllamaChatModel
- Add OllamaApiHelper to manage merging of streaming chat response chunks
- Remove @Disabled annotations for streaming function call tests
- Update documentation to reflect new streaming function call capabilities
- Add a new default constructor for ChatResponse
- Update Ollama chat documentation to clarify streaming support requirements
- Deprecate withContent(), withImages(), and withToolCalls() methods
- Replace them with content(), images(), and toolCalls() methods

Add token and duration aggregation for Ollama chat responses

- Modify OllamaChatModel to support accumulating tokens and durations across multiple responses
- Update ChatResponse metadata generation to aggregate usage and duration metrics
- Add tests to verify metadata aggregation behavior

Refactor Ollama duration fields and tests

- Replace Duration fields in OllamaApi.ChatResponse with Long to represent durations in nanoseconds, ensuring precision and compatibility.
- Update methods to convert Long nanoseconds to Duration objects (getTotalDuration, getLoadDuration, getEvalDuration, getPromptEvalDuration).
- Adjust merge logic in OllamaApiHelper to sum Long values for duration fields.
- Modify test cases in OllamaChatModelTests to align with Long duration representation and Duration.ofNanos conversions.
- Add new test class OllamaDurationFieldsTests to validate JSON deserialization and Duration conversion for duration fields.
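
The duration handling described in the list above could look roughly like the following sketch: raw nanosecond counts are stored as `Long`, exposed as `Duration` through getters, and summed when chunks are merged. `ChunkTimings` is a hypothetical, simplified stand-in rather than the real `OllamaApi.ChatResponse`, and the null handling in `sum` is an assumption, not something taken from the commit.

```java
import java.time.Duration;

// Hypothetical, simplified stand-in for a response chunk; not the real OllamaApi.ChatResponse.
record ChunkTimings(Long totalDuration, Long evalDuration) {

    // Convert raw nanoseconds to a Duration; returns null when the field is absent.
    Duration getTotalDuration() {
        return totalDuration == null ? null : Duration.ofNanos(totalDuration);
    }

    Duration getEvalDuration() {
        return evalDuration == null ? null : Duration.ofNanos(evalDuration);
    }

    // Merge two chunks by summing each nanosecond field.
    ChunkTimings merge(ChunkTimings other) {
        return new ChunkTimings(sum(totalDuration, other.totalDuration),
                sum(evalDuration, other.evalDuration));
    }

    // Null-safe sum (assumption): a missing value is ignored, and the result
    // stays null only when both inputs are missing.
    private static Long sum(Long a, Long b) {
        if (a == null) {
            return b;
        }
        if (b == null) {
            return a;
        }
        return a + b;
    }
}
```

Keeping the fields as plain `Long` values makes the merge a simple addition and defers the `Duration` conversion to the getters.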

Resolves spring-projects#1847
Related to spring-projects#1800
Resolves spring-projects#1796
Related to spring-projects#1307