-
Notifications
You must be signed in to change notification settings - Fork 674
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update dependency com.tdunning:t-digest to v3.3 #2136
Update dependency com.tdunning:t-digest to v3.3 #2136
Conversation
38913d0
to
ddb0858
Compare
ddb0858
to
3e17b87
Compare
This PR had no visible activity in the past 60 days, labeling it as stale. Any new activity will remove the stale label. To attract more reviewers, please tag someone or notify the [email protected] mailing list. Thank you for your contribution! |
3e17b87
to
eb2cd9b
Compare
So turns out the Streaming tests fail consistently with this upgrade. Have not dived into it, but there must be some incompatible behavior between these versions causing this. Anyone knows why? |
t-digest 3.2 looks fine but 3.3 seems to have different stats assertions? The changes look to be around percentile. |
The nitty gritty details look to be here: and opensearch-project/OpenSearch#3634 basically for small sample sizes there used to be a bug and now its fixed. That seems to be my rough interpretation so its not surprising that our tests are failing. |
Pushed e0d3696 updating the assertions in the percentiles for tests. |
Edited/Blocked NotificationRenovate will not automatically rebase this PR, because it does not recognize the last commit author and assumes somebody else may have edited the PR. You can manually request rebase by checking the rebase/retry box above. ⚠ Warning: custom changes will be lost. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @risdenk
* Update dependency com.tdunning:t-digest to v3.3 * Update percentile assertions based on bug fixes in t-digest --------- Co-authored-by: Kevin Risden <[email protected]>
This PR contains the following updates:
3.1
->3.3
Release Notes
tdunning/t-digest (com.tdunning:t-digest)
v3.2
===========
In release 3.2, the goal is to produce an update to the code given the large number of improvements since the previous release.
There are a few bugs that will survive this release, most notably in the AVLTreeDigest. These have to do with large numbers of repeated data points and are not new bugs.
There is also a lot of work going on with serialization. I need to hear from people about what they are doing with serialization so that we can build some test cases to allow an appropriate migration strategy to future serialization.
The paper continues to be updated. The algorithmic descriptions are getting reasonably clear, but the speed and accuracy sections need a complete revamp with current implementations.
Bugs, fixed and known
Fixed
The following important issues are fixed in this release
Issue #90 Serialization for MergingDigest
Issue #92 Serialization for AVLTreeDigest
Maybe fixed
This issue has substantial progress, but lacks a definitive test to determine whether it should be closed.
Issue 78 Stability under merging.
Pushed
The following issues are pushed beyond this release
Issue #87 Future proof and extensible serialization
Issue #89 Bad handling for duplicate values in AVLTreeDigest
All fixed issues
Here is a complete list of issues resolved in this release:
Issue #55 Add time
decay to t-digest
Issue #52 General
factory method for "fromBytes"
Issue #90
Deserialization of MergingDigest BufferUnderflowException in 3.1
Issue #92 Error in
AVLTreeDigest.fromBytes
Issue #93 high
centroid frequency causes overflow - giving incorrect results
Issue #67 Release of
version 3.2
Issue #81
AVLTreeDigest with a lot of datas : integer overflow
Issue #75 Adjusting
the centroid threshold values to obtain better accuracy at interesting
values
Issue #74 underlying
distribution : powerlaw
Issue #72 Inverse
quantile algorithm is non-contiguous
Issue #65
totalDigest add spark dataframe column / array
Issue #60 Getting
IllegalArgumentException when adding digests
Issue #53
smallByteSize methods are very trappy in many classes -- should be
changed or have warnings in javadocs
Issue #82 TDigest
class does not implement Serializable interface in last release.
Issue #42 Histogram
Issue #40 Improved
constraint on centroid sizes
Issue #37 Allow
arbitrary scaling laws for centroid sizes
Issue #29 Test
method testScaling() always adds values in ascending order
Issue #84 Remove
deprecated kinds of t-digest
Issue #76 Add
serializability
Issue #77 Question:
Proof of bounds on merging digest size
Issue #71 Simple
alternate algorithm using maxima, ranks and fixed cumulative weighting
Issue #61 Possible
improvement to the speed of the algorithm
Issue #58 jdk8
doclint incompatibility
Issue #48 Build is
unstable under some circumstances
Issue #63 Which
TDigest do you recommend?
Issue #62 Very slow
performance; what am I missing?
Issue #47 Make
TDigest serializable
Issue #49
MergingDigest.centroids is wrong on an empty digest
Configuration
📅 Schedule: Branch creation - "* * * * *" (UTC), Automerge - At any time (no schedule defined).
🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.
♻ Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.
🔕 Ignore: Close this PR and you won't be reminded about this update again.
This PR has been generated by Renovate Bot