Ollama: Duration metrics deserialized to the wrong time unit #1796

Open
jorander opened this issue Nov 22, 2024 · 1 comment · May be fixed by #1848
Labels: bug (Something isn't working), ollama

Comments

@jorander

Bug description
Ollama reports the duration values (total_duration, load_duration, prompt_eval_duration and eval_duration) in chat and embed responses as nanoseconds, encoded as JSON integers. Spring AI deserializes these values into Java Duration objects using Jackson's DurationDeserializer from the jsr310 module. That deserializer interprets the integer as seconds instead of nanoseconds, so the resulting Duration is 10^9 times too large.

The cause is that DurationDeserializer expects durations with nanosecond precision to be formatted as decimal values, with a dot separating the seconds part from the nanoseconds part. Durations formatted as integers are interpreted, depending on context settings, as seconds (the default) or milliseconds. Neither interpretation is correct for the durations reported by Ollama.
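To make the size of the error concrete, here is a minimal sketch using only java.time. It does not call Jackson; it only contrasts the two interpretations of the same integer. The class name, method names, and the sample value are made up for illustration.

```java
import java.time.Duration;

// Illustrative only: this class and its methods are hypothetical,
// not part of Spring AI, Jackson, or Ollama.
final class DurationUnitDemo {

    // The integer read as whole seconds (the default interpretation described above).
    static Duration readAsSeconds(long raw) {
        return Duration.ofSeconds(raw);
    }

    // The integer read as nanoseconds, which is what Ollama reports.
    static Duration readAsNanos(long raw) {
        return Duration.ofNanos(raw);
    }

    public static void main(String[] args) {
        long totalDuration = 4_883_583_458L; // made-up sample value

        Duration intended = readAsNanos(totalDuration);
        Duration misread = readAsSeconds(totalDuration);

        System.out.println(intended); // PT4.883583458S, just under 5 seconds
        System.out.println(misread.toDays() + " days"); // roughly 155 years' worth of days

        // The misread value is larger by a factor of exactly 10^9.
        System.out.println(misread.toNanos() / intended.toNanos());
    }
}
```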

Environment
Spring AI: 1.0.0-M4
Java: 21
Ollama: 0.4.2

Steps to reproduce
1. Send a chat request to an Ollama server.
2. Compare the durations reported by Ollama with the values found in OllamaApi.ChatResponse and propagated to the map available at ChatClient.ChatResponse#getMetadata.

Expected behavior
Deserialization of Duration takes into account that Ollama reports duration values as JSON integers in nanoseconds.

Minimal Complete Reproducible example
None at the moment.

jorander added a commit to jorander/spring-ai that referenced this issue Nov 24, 2024
… an incorrect time unit is used while deserializing the duration values coming from Ollama.
@jorander

Modified the unit tests to reproduce the error in the connected PR.

@tzolov tzolov self-assigned this Nov 29, 2024
@tzolov tzolov added ollama bug Something isn't working labels Nov 29, 2024
@tzolov tzolov added this to the 1.0.0-M5 milestone Nov 29, 2024
tzolov added a commit to tzolov/spring-ai that referenced this issue Nov 29, 2024
- Replace Duration fields in OllamaApi.ChatResponse with Long to represent durations in nanoseconds, ensuring precision and compatibility.
- Update methods to convert Long nanoseconds to Duration objects (getTotalDuration, getLoadDuration, getEvalDuration, getPromptEvalDuration).
- Adjust merge logic in OllamaApiHelper to sum Long values for duration fields.
- Modify test cases in OllamaChatModelTests to align with Long duration representation and Duration.ofNanos conversions.
- Add new test class OllamaDurationFieldsTests to validate JSON deserialization and Duration conversion for duration fields.
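The approach described in this commit message can be sketched roughly as follows. This is not the actual Spring AI code: the class TimingFields, its reduced set of two fields, and the null handling are assumptions made for illustration. Only the idea comes from the commit message: store raw nanoseconds as Long, convert with Duration.ofNanos on access, and sum the Long values when merging.

```java
import java.time.Duration;

// Hypothetical stand-in for the response type; NOT the real OllamaApi.ChatResponse.
final class TimingFields {

    // Raw nanosecond counts exactly as they appear in the JSON; null if absent.
    private final Long totalDuration;
    private final Long loadDuration;

    TimingFields(Long totalDuration, Long loadDuration) {
        this.totalDuration = totalDuration;
        this.loadDuration = loadDuration;
    }

    // Convert on access. Returning null for a missing field is an assumption of this sketch.
    Duration getTotalDuration() {
        return totalDuration == null ? null : Duration.ofNanos(totalDuration);
    }

    Duration getLoadDuration() {
        return loadDuration == null ? null : Duration.ofNanos(loadDuration);
    }

    // Null-safe sum: a missing value on one side leaves the other side unchanged.
    private static Long sum(Long a, Long b) {
        if (a == null) {
            return b;
        }
        if (b == null) {
            return a;
        }
        return a + b;
    }

    // Merge two chunks by summing their nanosecond counts field by field.
    TimingFields merge(TimingFields other) {
        return new TimingFields(sum(totalDuration, other.totalDuration),
                sum(loadDuration, other.loadDuration));
    }

    public static void main(String[] args) {
        TimingFields first = new TimingFields(1_500_000_000L, null);
        TimingFields second = new TimingFields(500_000_000L, 250_000_000L);
        TimingFields merged = first.merge(second);
        System.out.println(merged.getTotalDuration()); // PT2S
        System.out.println(merged.getLoadDuration());  // PT0.25S
    }
}
```

Keeping the stored value as a plain Long means the JSON integer maps to the field without any time-unit interpretation during deserialization; the unit is applied only in the accessor.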

Resolves spring-projects#1796
Related to spring-projects#1307
tzolov added a commit to tzolov/spring-ai that referenced this issue Nov 29, 2024
…lamaApi

- Implement streaming tool call support in OllamaApi and OllamaChatModel
- Add OllamaApiHelper to manage merging of streaming chat response chunks
- Remove @Disabled annotations for streaming function call tests
- Update documentation to reflect new streaming function call capabilities
- Add a new default constructor for ChatResponse
- Update Ollama chat documentation to clarify streaming support requirements
- Deprecated withContent(), withImages(), and withToolCalls() methods
- Replaced with content(), images(), and toolCalls() methods

Add token and duration aggregation for Ollama chat responses

- Modify OllamaChatModel to support accumulating tokens and durations across multiple responses
- Update ChatResponse metadata generation to aggregate usage and duration metrics
- Add tests to verify metadata aggregation behavior

Refactor Ollama duration fields and tests

- Replace Duration fields in OllamaApi.ChatResponse with Long to represent durations in nanoseconds, ensuring precision and compatibility.
- Update methods to convert Long nanoseconds to Duration objects (getTotalDuration, getLoadDuration, getEvalDuration, getPromptEvalDuration).
- Adjust merge logic in OllamaApiHelper to sum Long values for duration fields.
- Modify test cases in OllamaChatModelTests to align with Long duration representation and Duration.ofNanos conversions.
- Add new test class OllamaDurationFieldsTests to validate JSON deserialization and Duration conversion for duration fields.

Resolves spring-projects#1847
Related to spring-projects#1800
Resolves spring-projects#1796
Related to spring-projects#1307