Ollama: Duration metrics deserialized to the wrong time unit #1796
jorander added a commit to jorander/spring-ai that referenced this issue on Nov 24, 2024:
… an incorrect time unit is used while deserializing the duration values coming from Ollama. Modified unit tests to reproduce error in connected PR.
tzolov added a commit to tzolov/spring-ai that referenced this issue on Nov 29, 2024:
- Replace Duration fields in OllamaApi.ChatResponse with Long to represent durations in nanoseconds, ensuring precision and compatibility.
- Update methods to convert Long nanoseconds to Duration objects (getTotalDuration, getLoadDuration, getEvalDuration, getPromptEvalDuration).
- Adjust merge logic in OllamaApiHelper to sum Long values for duration fields.
- Modify test cases in OllamaChatModelTests to align with Long duration representation and Duration.ofNanos conversions.
- Add new test class OllamaDurationFieldsTests to validate JSON deserialization and Duration conversion for duration fields.
Resolves spring-projects#1796
Related to spring-projects#1307
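A simplified sketch of the approach described in that commit: keep the wire value as a Long in nanoseconds, exactly as Ollama sends it, and convert to Duration only on access. The record and field names below are illustrative stand-ins, not the actual OllamaApi.ChatResponse code.

```java
import java.time.Duration;

import com.fasterxml.jackson.annotation.JsonProperty;

// Hypothetical stand-in for the duration-carrying part of OllamaApi.ChatResponse.
public record ChatMetrics(
        @JsonProperty("total_duration") Long totalDuration,
        @JsonProperty("eval_duration") Long evalDuration) {

    // Conversion happens here, so the deserialized Long keeps Ollama's nanosecond unit.
    public Duration getTotalDuration() {
        return this.totalDuration != null ? Duration.ofNanos(this.totalDuration) : null;
    }

    public Duration getEvalDuration() {
        return this.evalDuration != null ? Duration.ofNanos(this.evalDuration) : null;
    }
}
```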
tzolov added a commit to tzolov/spring-ai that referenced this issue on Nov 29, 2024:

…lamaApi
- Implement streaming tool call support in OllamaApi and OllamaChatModel
- Add OllamaApiHelper to manage merging of streaming chat response chunks
- Remove @Disabled annotations for streaming function call tests
- Update documentation to reflect new streaming function call capabilities
- Add a new default constructor for ChatResponse
- Update Ollama chat documentation to clarify streaming support requirements
- Deprecated withContent(), withImages(), and withToolCalls() methods
- Replaced with content(), images(), and toolCalls() methods

Add token and duration aggregation for Ollama chat responses
- Modify OllamaChatModel to support accumulating tokens and durations across multiple responses
- Update ChatResponse metadata generation to aggregate usage and duration metrics
- Add tests to verify metadata aggregation behavior

Refactor Ollama duration fields and tests
- Replace Duration fields in OllamaApi.ChatResponse with Long to represent durations in nanoseconds, ensuring precision and compatibility.
- Update methods to convert Long nanoseconds to Duration objects (getTotalDuration, getLoadDuration, getEvalDuration, getPromptEvalDuration).
- Adjust merge logic in OllamaApiHelper to sum Long values for duration fields.
- Modify test cases in OllamaChatModelTests to align with Long duration representation and Duration.ofNanos conversions.
- Add new test class OllamaDurationFieldsTests to validate JSON deserialization and Duration conversion for duration fields.

Resolves spring-projects#1847
Related to spring-projects#1800
Resolves spring-projects#1796
Related to spring-projects#1307
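The "sum Long values for duration fields" merge mentioned above can be sketched with a small null-safe helper; the names here are hypothetical and not the actual OllamaApiHelper implementation.

```java
public final class DurationMergeSketch {

    private DurationMergeSketch() {
    }

    /**
     * Sums two nullable nanosecond counts, as needed when aggregating the
     * duration fields of streamed chat response chunks into a single response.
     */
    static Long mergeNanos(Long previous, Long current) {
        if (previous == null) {
            return current;
        }
        return current == null ? previous : previous + current;
    }
}
```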
Bug description
Ollama reports duration values (total_duration, load_duration, prompt_eval_duration and eval_duration) in chat and embed responses as nanoseconds in JSON INTEGER format. Spring AI uses the Jackson class DurationDeserializer from the jsr310 module to deserialize these values into Java Duration objects, and in that process the integer value is interpreted as seconds instead of nanoseconds, making the resulting Duration 10^9 times too large.
This is because DurationDeserializer expects durations with nanosecond precision to be formatted as decimal values, with a dot separating the seconds part from the nanoseconds part. Integer-formatted durations are, depending on deserialization settings, interpreted as seconds (the default) or milliseconds. Neither interpretation matches the nanosecond values reported by Ollama.
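A minimal standalone demonstration of the mismatch (not part of the original report), assuming jackson-datatype-jsr310 is on the classpath and using a made-up nanosecond value of the kind Ollama returns:

```java
import java.time.Duration;

import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.datatype.jsr310.JavaTimeModule;

public class OllamaDurationDemo {

    public static void main(String[] args) throws Exception {
        ObjectMapper mapper = new ObjectMapper().registerModule(new JavaTimeModule());

        // Suppose Ollama reports total_duration = 4883583458, i.e. about 4.88 seconds in nanoseconds.
        Duration parsed = mapper.readValue("4883583458", Duration.class);

        // DurationDeserializer treats the integer as seconds, so the parsed value is ~154 years.
        System.out.println(parsed);                          // PT1356550H57M38S
        System.out.println(Duration.ofNanos(4883583458L));   // PT4.883583458S - what was intended
    }
}
```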
Environment
Spring AI: 1.0.0-M4
Java: 21
Ollama: 0.4.2
Steps to reproduce
Run a chat request against an Ollama server.
Compare the durations reported by Ollama with the values found in the OllamaApi.ChatResponse and propagated to the map available at ChatClient.ChatResponse#getMetadata.
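A rough sketch of the comparison step, assuming a Spring-managed OllamaChatModel is injected behind the ChatModel interface; class and package names are those of Spring AI 1.0.0-M4, and the surrounding wiring is omitted:

```java
import org.springframework.ai.chat.model.ChatModel;
import org.springframework.ai.chat.model.ChatResponse;
import org.springframework.ai.chat.prompt.Prompt;

public class DurationMetadataCheck {

    private final ChatModel chatModel;   // an OllamaChatModel provided by Spring

    public DurationMetadataCheck(ChatModel chatModel) {
        this.chatModel = chatModel;
    }

    public void compareDurations() {
        ChatResponse response = this.chatModel.call(new Prompt("Why is the sky blue?"));

        // The duration entries printed here come out ~10^9 times larger than the
        // total_duration / eval_duration values in Ollama's raw JSON response.
        System.out.println(response.getMetadata());
    }
}
```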
Expected behavior
Deserialization of Duration takes into account that Ollama reports duration values as JSON INTEGER values in nanoseconds.
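One way to satisfy this, sketched purely as an illustration and not as the project's actual fix, is a custom Jackson deserializer that interprets the JSON integer as a nanosecond count (class name hypothetical):

```java
import java.io.IOException;
import java.time.Duration;

import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.databind.DeserializationContext;
import com.fasterxml.jackson.databind.JsonDeserializer;

/**
 * Reads a JSON integer as a nanosecond count, matching the unit Ollama uses
 * for total_duration, load_duration, prompt_eval_duration and eval_duration.
 */
public class NanosDurationDeserializer extends JsonDeserializer<Duration> {

    @Override
    public Duration deserialize(JsonParser parser, DeserializationContext context) throws IOException {
        return Duration.ofNanos(parser.getLongValue());
    }
}
```

The affected response fields could then opt in with @JsonDeserialize(using = NanosDurationDeserializer.class), leaving the default DurationDeserializer behavior untouched elsewhere.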
Minimal Complete Reproducible example
None at the moment.