db: consider adding IterContext #4115

jbowens · 2024-10-26T14:00:52Z

The iterator stack has a host of configuration and global state. Today a lot of this configuration exists outside the base.InternalIterator interface. This forces iterator construction to propagate this configuration, every internal iterator in the stack to save a copy, and every internal iterator to zero this state on Close. Iterator.Close has an unfortunately high cost on workloads that use an iterator for a one-time seek (#4049) in part due to zeroing these structures.

We might reduce CPU and reduce the memory used by an iterator by propagating some of this information down the iterator stack during an operation by pointer. For example, we could define a base.IterContext struct:

type IterContext struct {
	Compare    Compare
	Equal      Equal
	Split      Split
	LowerBound []byte
	UpperBound []byte
	Prefix     []byte
	Stats      InternalIteratorStats
	Context    context.Context
        Logger     Logger
        Comparer   Comparer
}

Every iteration method of the InternalIterator interface could accept a *IterContext as the first parameter, which would point into a field of the pebble.Iterator struct when iterating using a top-level pebble.Iterator. With a quick survey, it looks like we would be able to remove:

mergingIter:
- logger (16 bytes)
- split (8 bytes)
- prefix (24 bytes)
- lower (24 bytes)
- upper (24 bytes)
- stats (8 bytes)
levelIter:
- ctx (16 bytes)
- logger (16 bytes)
- comparer (8 bytes)
- cmp (8 bytes)
- split (8 bytes)
- lower (24 bytes)
- upper (24 bytes)
- prefix (24 bytes)
singleLevelIterator:
- ctx (16 bytes)
- cmp (8 bytes)
- lower (24 bytes)
- upper (24 bytes)
colblk.DataBlockIter
- cmp (8 bytes)
- split (8 bytes)
colblk.IndexIter
- cmp (8 bytes)
- split (8 bytes)
arenaskl.Iterator
- lower (24 bytes)
- upper (24 bytes)

There are some tricky bits where global state diverges from an individual iterator's state. For example the levelIter avoids propagating bounds if a sstable falls wholly within the bounds. I think we'd still see benefit because, for example, some iterators maintain both global-level bounds and bounds to use within a more constrained iteration context.

Additionally, we might see some benefit to keeping frequently accessed fields (cmp, split) shared by improving cache locality.

Also propagating the iteration prefix through Next will facilitate #3794, because today leaf iterators don't retain the iteration prefix.

Jira issue: PEBBLE-289

The text was updated successfully, but these errors were encountered:

RaduBerinde · 2024-10-30T17:38:56Z

Maybe we can somehow tie the valueBlockReader allocation to the IterContext too - you would only be able to read a lazy value if the IterContext is still valid.

jbowens added T-storage A-storage labels Oct 26, 2024

jbowens added this to [Deprecated] Storage Oct 26, 2024

github-project-automation bot moved this to Incoming in [Deprecated] Storage Oct 26, 2024

itsbilal moved this from Incoming to Backlog in [Deprecated] Storage Oct 29, 2024

itsbilal assigned RaduBerinde Oct 29, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

db: consider adding IterContext #4115

db: consider adding IterContext #4115

jbowens commented Oct 26, 2024 •

edited by cockroach-jira-scripts

Loading

RaduBerinde commented Oct 30, 2024

db: consider adding IterContext #4115

db: consider adding IterContext #4115

Comments

jbowens commented Oct 26, 2024 • edited by cockroach-jira-scripts Loading

RaduBerinde commented Oct 30, 2024

jbowens commented Oct 26, 2024 •

edited by cockroach-jira-scripts

Loading