You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
This issue is meant to be a running list of ideas to improve performance in the columnar format.
Don't separate suffix into wall time / logical time / untyped version. The hypothesis is that the current comparison code is too complex and a simple bytes.Compare might be faster [radu] Experimented with this and saw 10-30% regressions in CockroachDataBlockIterShort.
Add a version-is-not-timestamp column to data blocks and have a fast path for blocks or regions that only have timestamps. This column would be empty for all sql table data. [radu] Added a per-block property indicating what types of suffixes it contains.
Separate prefix and suffix in index blocks and use compressed prefix encoding. The seeking should be faster and we'd be able to use a single-level index in more cases.
Expose a PrevPrefix operation that uses the prefix changed bitmap to quickly jump to the previous key. The expectation is that this would reduce CPU during reverse scans. I believe MVCC GC currently scans in reverse and could benefit.
In PrefixBytes.Search try to compare a single byte before calling bytes.Compare.
Move block properties into BlockPropertyCollector-defined columns. This can allow more compact encoding of properties (eg, MVCC timestamps may benefit from uint delta encoding). Likely a minor benefit, but we would additionally avoid decoding O(n) varints where n is the number of BlockPropertyCollectors
Stash block decoders in the table cache to avoid the initial parsing of the structure of a block on block cache hits
Automatically switch between using bundles in prefix bytes (if bundles don't save a lot of space, use bundle size of 1 which is more efficient to search)
Move prefix-changed bitmap into PrefixBytes (and perhaps RawBytes), so it can be used by that code as well
Implement and use colblk.DataBlockIter.SeekPrefixGE
Uint 'column families,' that apply the width-reduction but store the values of a family together for better cache efficiency. This would apply in the context of the index block, allowing the offset and length to be accessed together. It would however require access to not be simple aligned memory loads, but use explicit little endian gets.
Loading an inline-encoded value to call base.MakeInPlaceValue is cheap, but we will do it for every internal iterator positioned. Most of these values will not be used. We could consider making all values follow a 'lazy' retrieval process, which in the case of inlined values would just call Value() on the DataBlockIter.
Separate prefix and suffix in index blocks and use compressed prefix encoding. The seeking should be faster and we'd be able to use a single-level index in more cases.
We experimented with this and it was slower in the benchmarks. The keys in the benchmark were random and fairly short and the extra overhead of comparing the block prefix, bundle prefix, and remainder separately outweighed any benefit. On the other hand, if the keys are large and have long common prefixes, the benefit could be significant in theory.
I think what we need here is to make PrefixBytes be more adaptive - it should automatically choose between using bundles or not (i.e. bundleSize=1) depending on the data. We should also move the prefixChanged bitmap to PrefixBytes where it can be used by the prefix bytes code as well. I added these two items to the issue.
This issue is meant to be a running list of ideas to improve performance in the columnar format.
Don't separate suffix into wall time / logical time / untyped version. The hypothesis is that the current comparison code is too complex and a simple bytes.Compare might be faster[radu] Experimented with this and saw 10-30% regressions inCockroachDataBlockIterShort
.PrefixBytes.Search
try to compare a single byte before callingbytes.Compare
.base.MakeInPlaceValue
is cheap, but we will do it for every internal iterator positioned. Most of these values will not be used. We could consider making all values follow a 'lazy' retrieval process, which in the case of inlined values would just callValue()
on theDataBlockIter
.Jira issue: PEBBLE-273
The text was updated successfully, but these errors were encountered: