colblk: experiment with potential performance improvements #4022

RaduBerinde · 2024-10-09T15:16:17Z

This issue is meant to be a running list of ideas to improve performance in the columnar format.


RaduBerinde · 2024-10-31T14:38:09Z

Separate prefix and suffix in index blocks and use compressed prefix encoding. Seeking should be faster, and we'd be able to use a single-level index in more cases.

We experimented with this and it was slower in the benchmarks. The benchmark keys were random and fairly short, so the extra overhead of comparing the block prefix, bundle prefix, and remainder separately outweighed any benefit. On the other hand, if the keys are large and have long common prefixes, the benefit could be significant in theory.
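To illustrate where the extra overhead comes from, here is a minimal sketch of a three-part comparison. The `splitKey` type and `compareSplit` function are hypothetical names made up for this comment, not the actual colblk code; the point is only that each seek step does up to three separate comparisons instead of one.

```go
package main

import (
	"bytes"
	"fmt"
)

// splitKey is a hypothetical representation of a stored key as three pieces:
// a prefix shared by the whole block, a prefix shared by the key's bundle,
// and the per-key remainder. It is an assumption for illustration only.
type splitKey struct {
	blockPrefix, bundlePrefix, remainder []byte
}

// compareSplit compares search against the logical key
// blockPrefix+bundlePrefix+remainder without materializing it.
// It returns a negative, zero, or positive value like bytes.Compare.
func compareSplit(search []byte, k splitKey) int {
	for _, part := range [][]byte{k.blockPrefix, k.bundlePrefix} {
		if len(search) < len(part) {
			// search is shorter than this piece, so the result is never 0
			// and is the same as comparing against the full key.
			return bytes.Compare(search, part)
		}
		if c := bytes.Compare(search[:len(part)], part); c != 0 {
			return c
		}
		search = search[len(part):]
	}
	return bytes.Compare(search, k.remainder)
}

func main() {
	k := splitKey{
		blockPrefix:  []byte("user/"),
		bundlePrefix: []byte("alice/"),
		remainder:    []byte("0042"),
	}
	full := []byte("user/alice/0042")
	// The split comparison should agree in sign with comparing the full key.
	for _, s := range []string{"user/alice/0042", "user/alice/0041", "user/bob", "us"} {
		fmt.Printf("%q: split=%d full=%d\n",
			s, compareSplit([]byte(s), k), bytes.Compare([]byte(s), full))
	}
}
```

With short random keys, the two prefix pieces are usually empty or tiny, so the loop adds branches and calls without skipping any meaningful number of bytes.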

I think what we need here is to make PrefixBytes more adaptive: it should automatically choose between using bundles or not (i.e. bundleSize=1) depending on the data. We should also move the prefixChanged bitmap into PrefixBytes, where the prefix bytes code can use it as well. I added these two items to the issue.
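As a rough sketch of what "adaptive" could mean, the snippet below picks between a candidate bundle size and bundleSize=1 by estimating the bytes a per-bundle prefix would save. Everything here is an assumption for illustration: `chooseBundleSize`, `sharedPrefixLen`, and the `offsetCost` parameter are hypothetical, and bytes saved is only a proxy; a real heuristic would presumably also weigh the comparison cost during seeks.

```go
package main

import "fmt"

// sharedPrefixLen returns the length of the longest common prefix of a and b.
func sharedPrefixLen(a, b []byte) int {
	n := 0
	for n < len(a) && n < len(b) && a[n] == b[n] {
		n++
	}
	return n
}

// chooseBundleSize is a hypothetical heuristic: it returns candidate if
// factoring out per-bundle prefixes saves bytes overall, and 1 otherwise.
// keys must be sorted. offsetCost is an assumed per-bundle overhead in bytes
// for storing the extra bundle prefix entry.
func chooseBundleSize(keys [][]byte, candidate, offsetCost int) int {
	if len(keys) == 0 || candidate <= 1 {
		return 1
	}
	// For sorted keys, the prefix shared by a range of keys is the common
	// prefix of its first and last key.
	blockPrefixLen := sharedPrefixLen(keys[0], keys[len(keys)-1])
	saved := 0
	for i := 0; i < len(keys); i += candidate {
		end := i + candidate
		if end > len(keys) {
			end = len(keys)
		}
		// Bytes shared within the bundle beyond the block prefix.
		extra := sharedPrefixLen(keys[i], keys[end-1]) - blockPrefixLen
		// The bundle prefix is stored once instead of once per key.
		saved += extra*(end-i-1) - offsetCost
	}
	if saved > 0 {
		return candidate
	}
	return 1
}

func main() {
	short := [][]byte{[]byte("a"), []byte("b"), []byte("c"), []byte("d")}
	long := [][]byte{
		[]byte("tenant-0001/row-a"), []byte("tenant-0001/row-b"),
		[]byte("tenant-0002/row-a"), []byte("tenant-0002/row-b"),
	}
	fmt.Println("short keys:", chooseBundleSize(short, 2, 4))
	fmt.Println("long shared prefixes:", chooseBundleSize(long, 2, 4))
}
```

The short keys share nothing within a bundle, so the per-bundle overhead is pure cost and the sketch falls back to bundleSize=1; the keys with long shared prefixes keep the candidate size.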

RaduBerinde assigned jbowens and RaduBerinde Oct 9, 2024

RaduBerinde added this to [Deprecated] Storage Oct 9, 2024

github-project-automation bot moved this to Incoming in [Deprecated] Storage Oct 9, 2024

blathers-crl bot added A-storage T-storage labels Oct 9, 2024

RaduBerinde moved this from Incoming to Next in [Deprecated] Storage Oct 9, 2024

exalate-issue-sync bot unassigned RaduBerinde Oct 18, 2024

RaduBerinde added the C-performance label Oct 22, 2024

RaduBerinde self-assigned this Oct 30, 2024

exalate-issue-sync bot unassigned RaduBerinde Nov 19, 2024

jbowens removed their assignment Nov 25, 2024
