Refactor compression and XOR last model value #105

Merged
skejserjensen merged 12 commits into master from dev/residuals-xor-model-last
May 24, 2023

Conversation

skejserjensen
Contributor

This branch started as an attempt to improve the compression of residual values by XORing the first residual value with the model's last value instead of storing the first residual value in full. However, this proved very difficult because of the less-than-ideal structure the compression crate had ended up with as functionality was added over time: a combination of functions and builders with an unclear separation of responsibilities. Thus, I had to stop halfway and refactor the compression crate. The crate now consists of the modules listed below, and the compression process is performed by three, hopefully intuitive, builders (ModelBuilder -> CompressedSegmentBuilder -> CompressedSegmentBatchBuilder), while compression.rs only operates as the driver that decides how to represent each part of the time series being compressed.

  • lib.rs: re-exports the few functions and the type (ErrorBound) intended for users of the crate.
  • types.rs: types used throughout the crate.
  • compression.rs: compression of time series as segments containing metadata and models.
  • merge.rs: merging of segments containing metadata and models.
  • models/*.rs: fitting of models to time series and computing aggregates from segments containing metadata and models.

So while this PR makes significant changes to the compression crate, it is mostly refactoring. I have tried to make the PR simpler to review by rewriting the commit history so that the refactoring is done first. Thus, a diff containing only the commits made after the refactoring can be seen here.
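
To illustrate the idea behind XORing the first residual value with the model's last value, here is a minimal sketch. It is not the code in this PR: the function names `xor_with_previous` and `undo_xor_with_previous`, the `seed` parameter, and the sample values are all made up for illustration, and it only shows the XOR step, not the bit-level encoding that follows it. It assumes the values are `f32` and that a seed of `0.0` stands in for storing the first value in full.

```rust
/// XOR the bit pattern of each value with the bit pattern of the previous
/// value. The first value is XORed with `seed` (a hypothetical parameter):
/// a seed of 0.0 has all-zero bits, so the first value is kept in full.
fn xor_with_previous(seed: f32, values: &[f32]) -> Vec<u32> {
    let mut previous_bits = seed.to_bits();
    values
        .iter()
        .map(|value| {
            let bits = value.to_bits();
            let xored = bits ^ previous_bits;
            previous_bits = bits;
            xored
        })
        .collect()
}

/// Invert `xor_with_previous`; the decoder must use the same seed.
fn undo_xor_with_previous(seed: f32, xored_values: &[u32]) -> Vec<f32> {
    let mut previous_bits = seed.to_bits();
    xored_values
        .iter()
        .map(|xored| {
            previous_bits ^= *xored;
            f32::from_bits(previous_bits)
        })
        .collect()
}

fn main() {
    // Assumed example data: the last value produced by the preceding model
    // and the residual values that follow it.
    let model_last_value = 20.5_f32;
    let residuals = [20.6_f32, 20.7, 20.4];

    let stored_in_full = xor_with_previous(0.0, &residuals);
    let xored_with_model = xor_with_previous(model_last_value, &residuals);

    // More leading zero bits means fewer significant bits to encode.
    println!(
        "leading zeros of first value, stored in full: {}",
        stored_in_full[0].leading_zeros()
    );
    println!(
        "leading zeros of first value, XORed with model: {}",
        xored_with_model[0].leading_zeros()
    );

    // The original values are recovered exactly when the seed is known.
    let decoded = undo_xor_with_previous(model_last_value, &xored_with_model);
    assert_eq!(decoded, residuals.to_vec());
}
```

When the model's last value is close to the first residual value, the two share their sign, exponent, and high mantissa bits, so the XOR result has more leading zeros than the value itself. The decoder can recompute the model's last value from the segment, so no extra data needs to be stored for the seed.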

@skejserjensen skejserjensen requested a review from CGodiksen May 21, 2023 16:48
@skejserjensen skejserjensen self-assigned this May 21, 2023
@skejserjensen skejserjensen force-pushed the dev/residuals-xor-model-last branch from b79c26a to 306f8d5 Compare May 22, 2023 13:57
@skejserjensen skejserjensen requested a review from CGodiksen May 23, 2023 14:05
@skejserjensen skejserjensen requested a review from CGodiksen May 23, 2023 15:46
@skejserjensen skejserjensen requested a review from chrthomsen May 23, 2023 17:52
Contributor

@chrthomsen chrthomsen left a comment

This was a challenge to read through :-) I did not find any problems.

@skejserjensen skejserjensen merged commit 916b8dc into master May 24, 2023
@skejserjensen skejserjensen deleted the dev/residuals-xor-model-last branch May 24, 2023 13:19