Releases: caio/go-tdigest
v4.0.1
v4.0.0
v3.1.0
v3.0.0: Major release of go-tdigest
This release brings in some API and internal changes, but behavior should remain the same.
Changes
- A local RNG is used by default, instead of the global one
- Internal counters were upgraded to 64bit integers
- (Cosmetic) Configuring Compression uses a float64 instead of uint32
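The counter change above can be illustrated with a minimal, standard-library-only sketch. It does not reproduce the library's internals; `addCount32` and `addCount64` are hypothetical helpers that only show why counter width matters once a digest has seen roughly 4 billion samples.

```go
package main

import (
	"fmt"
	"math"
)

// addCount32 mimics a 32-bit counter (hypothetical helper):
// unsigned arithmetic in Go wraps silently on overflow.
func addCount32(count, weight uint32) uint32 {
	return count + weight
}

// addCount64 mimics a 64-bit counter fed with the same inputs.
func addCount64(count uint64, weight uint32) uint64 {
	return count + uint64(weight)
}

func main() {
	c32 := addCount32(math.MaxUint32, 1)
	c64 := addCount64(math.MaxUint32, 1)
	fmt.Println(c32) // 0: the 32-bit counter wrapped around
	fmt.Println(c64) // 4294967296: the 64-bit counter keeps counting
}
```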
Minor release of go-tdigest
This release introduces support for configuring the deserialized
digest when using tdigest.FromBytes
Right now this is mostly useful for configuring the RNG the
digest will use, for example:
t1, err := tdigest.FromBytes(payload, tdigest.LocalRandomNumberGenerator(42))
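One reason to pick the seed yourself is reproducibility. The sketch below uses only Go's math/rand and no go-tdigest code; `drawSequence` and `sameSequence` are hypothetical helpers. It assumes the digest's local generator behaves like a privately owned, seeded `*rand.Rand`, which the release notes do not spell out.

```go
package main

import (
	"fmt"
	"math/rand"
)

// drawSequence returns n values from a private generator seeded with seed.
func drawSequence(seed int64, n int) []float64 {
	rng := rand.New(rand.NewSource(seed))
	out := make([]float64, n)
	for i := range out {
		out[i] = rng.Float64()
	}
	return out
}

// sameSequence reports whether two slices hold identical values.
func sameSequence(a, b []float64) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

func main() {
	a := drawSequence(42, 5)
	b := drawSequence(42, 5)
	// Same seed, same private generator type: the sequences match.
	fmt.Println(sameSequence(a, b)) // true
}
```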
Patch release of go-tdigest
This release contains a fix for code introduced in v2.0.0
where deserializing a payload would cause errors due to
unexpected nil pointers.
Minor release of go-tdigest
This release contains several performance-oriented patches
and was made possible mostly by @vmihailenco adding benchmarks,
optimizing existing code and backporting improvements from the
@honeycombio fork written by @ianwilkes.
Notably, the optimizations introduced in v2.0.0 by using a
Fenwick tree to cache prefix sums have been refined (#23) and,
later, removed (#25) after the introduction of more thorough
benchmarks.
New Public API
- TDigest instances may now be duplicated via Clone()
- You may inspect the compression any given digest has been
  configured to use by calling Compression()
- A tdigest instance may be reused/reinitialized (to minimize
  allocations, generally) directly from a buffer via
  FromBytes(bytes)
- You may now opt to destroy a given digest when merging
  for the sake of performance: t1.Merge(t2) may be
  replaced by MergeDestructive for faster execution,
  but you must make sure to not use t2 after this since
  its state is going to be seriously invalid.
Dependency Changes
We don't import yourbasic/fenwick anymore, so this library
now doesn't require any external dependency to be used (test
dependencies remain unchanged).
Other
There's been some discussion about performance and changing
counts to 64bit at #20. Much of what has been discussed is
done and the next changes will likely require an API change,
so this might be one of the last v2 releases, but keep in
mind that the migration to the future v3 should be really
simple; even easier than the v1 to v2 path.
Minor release of go-tdigest
This release adds support for TrimmedMean (#22) and reduces inaccuracies caused by floating point math (#19).
Many thanks to @mcbridne and @vmihailenco for making this happen.
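For readers unfamiliar with the term, here is a rough sketch of the quantity a trimmed mean refers to, computed exactly over a sorted slice. It assumes the trimmed mean is the average of the samples that fall between two quantile bounds; `trimmedMean` is a hypothetical helper, not the library's method, and a digest would only approximate this value.

```go
package main

import (
	"fmt"
	"sort"
)

// trimmedMean averages the samples whose rank lies in [lo, hi),
// expressed as fractions of the sorted data (0 <= lo < hi <= 1 is assumed).
// It returns 0 when the selected range is empty.
func trimmedMean(samples []float64, lo, hi float64) float64 {
	sorted := append([]float64(nil), samples...) // leave the input untouched
	sort.Float64s(sorted)
	n := float64(len(sorted))
	start, end := int(lo*n), int(hi*n)
	if start >= end {
		return 0
	}
	var sum float64
	for _, v := range sorted[start:end] {
		sum += v
	}
	return sum / float64(end-start)
}

func main() {
	data := []float64{100, 1, 7, 2, 6, 3, 5, 4}
	fmt.Println(trimmedMean(data, 0, 1))       // 16: the outlier dominates
	fmt.Println(trimmedMean(data, 0.25, 0.75)) // 4.5: mean of 3, 4, 5, 6
}
```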
Patch release of go-tdigest
v2.0.0: Major release of go-tdigest
This release contains major API changes and significant performance improvements to the tdigest package. All users are encouraged to upgrade.
Performance Improvements
The critical path of this library (adding samples to the digest) has been drastically sped up by making use of a binary indexed tree, so that prefix sums and updates don't necessarily have to scan most of the storage.
Results from benchcmp in a late 2013 MacAir (running Linux):
benchmark             old ns/op     new ns/op     delta
BenchmarkAdd1-4       187           206           +10.16%
BenchmarkAdd10-4      332           274           -17.47%
BenchmarkAdd100-4     1092          325           -70.24%
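To make the binary indexed tree idea concrete, here is a minimal, self-contained Fenwick tree sketch. It is not the code of yourbasic/fenwick nor of this library; the `Fenwick` type and its methods are illustrative names. It only shows how both updates and prefix sums can run in O(log n) instead of scanning the whole storage.

```go
package main

import "fmt"

// Fenwick is a binary indexed tree over n counts (stored 1-based internally).
type Fenwick struct {
	tree []uint64
}

// NewFenwick creates a tree for n counts, all initially zero.
func NewFenwick(n int) *Fenwick {
	return &Fenwick{tree: make([]uint64, n+1)}
}

// Add increases the count at 0-based index i by delta in O(log n).
func (f *Fenwick) Add(i int, delta uint64) {
	for i++; i < len(f.tree); i += i & -i {
		f.tree[i] += delta
	}
}

// PrefixSum returns the sum of the counts at indices [0, i) in O(log n).
func (f *Fenwick) PrefixSum(i int) uint64 {
	var sum uint64
	for ; i > 0; i -= i & -i {
		sum += f.tree[i]
	}
	return sum
}

func main() {
	counts := []uint64{3, 1, 4, 1, 5}
	f := NewFenwick(len(counts))
	for i, c := range counts {
		f.Add(i, c)
	}
	fmt.Println(f.PrefixSum(3)) // 8 = 3 + 1 + 4
	f.Add(1, 10)                // update one count without rebuilding
	fmt.Println(f.PrefixSum(3)) // 18 = 3 + 11 + 4
}
```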
Additionally, it's possible now to create a digest that uses a custom random number generator, which means that if you were suffering from lock contention (due to heavy usage of the shared rng), you can easily enable more speed gains by creating your digests with:
digest := tdigest.New(
	tdigest.Compression(200),
	tdigest.LocalRandomNumberGenerator(),
)
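The contention argument can be sketched with the standard library alone. Go's top-level math/rand functions share one generator across goroutines (historically guarded by a lock), whereas a `*rand.Rand` created with `rand.New` is private to its owner. The example below is an assumption-laden illustration: `sumPerWorker` is a hypothetical helper and no go-tdigest code is involved.

```go
package main

import (
	"fmt"
	"math/rand"
	"sync"
)

// sumPerWorker starts one goroutine per worker. Each goroutine owns a
// private generator, so no goroutine waits on a shared RNG.
func sumPerWorker(workers, draws int) []float64 {
	sums := make([]float64, workers)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(w int) {
			defer wg.Done()
			rng := rand.New(rand.NewSource(int64(w))) // private, seeded per worker
			for i := 0; i < draws; i++ {
				sums[w] += rng.Float64() // each goroutine writes only its own slot
			}
		}(w)
	}
	wg.Wait()
	return sums
}

func main() {
	fmt.Println(len(sumPerWorker(4, 1000))) // 4
}
```

A private generator is not safe for concurrent use, so the pattern is one generator per goroutine (or per digest) rather than one generator shared by many.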
API Changes
The tdigest API has been drastically simplified with the goal of making it more readily usable without requiring people to read up and understand what, for example, compression means.
Modifications
- The Add(float64,uint32) method has been renamed to AddWeighted
Additions
- Construction is now done via New() which accepts configuration parameters while providing sane defaults
- There is a new Add(float64) method that works as a shortcut for AddWeighted(float64,1)
- The Count() method has been introduced to allow users to decide what to do when the digest grows too much
- The CDF(float64) method has been added. It stands for cumulative distribution function and it's useful for asking the inverse of the question asked via Quantile(x): it answers at which fraction (quantile) of the data all seen samples are less than or equal to the given x.
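The inverse relationship between the two questions can be shown exactly on a small sorted slice. This is only a sketch of the concept: `cdf` and `quantile` below are hypothetical free functions over raw samples, while the digest's own methods return approximations computed from its compressed summary.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// cdf returns the fraction of samples that are less than or equal to x.
// The input must be sorted in ascending order.
func cdf(sorted []float64, x float64) float64 {
	// Index of the first sample strictly greater than x equals the count of samples <= x.
	n := sort.Search(len(sorted), func(i int) bool { return sorted[i] > x })
	return float64(n) / float64(len(sorted))
}

// quantile returns the smallest sample v for which cdf(v) >= q.
// The input must be sorted and non-empty.
func quantile(sorted []float64, q float64) float64 {
	idx := int(math.Ceil(q*float64(len(sorted)))) - 1
	if idx < 0 {
		idx = 0
	}
	if idx >= len(sorted) {
		idx = len(sorted) - 1
	}
	return sorted[idx]
}

func main() {
	samples := []float64{10, 20, 30, 40}
	v := quantile(samples, 0.5)
	fmt.Println(v)               // 20: half of the samples are <= 20
	fmt.Println(cdf(samples, v)) // 0.5: asking the inverse question
}
```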
Removals
- There is no Len() method anymore since it provided no real actionable information
- New(float64) doesn't exist anymore, it's been replaced by a simpler New() one
External Dependencies
Two dependencies have been introduced (v1.x had zero):
- yourbasic/fenwick, used to speed up prefix sum computations allowing major performance improvements
- (test only) leesper/go_rng, for generating non-uniform distributions to assist with testing
Other changes
- This project now uses dep for dependency management
- A single digest can be used to summarize more than 4B data points
- We now have contribution guidelines :-)