
gzip: Use sync.pool #1080

Open
wants to merge 1 commit into master

Conversation

Jeremyyang920

@Jeremyyang920 Jeremyyang920 commented Nov 26, 2024

This commit uses a sync.Pool for both the gzip writer and reader so that we reduce the number of allocations and the time GC takes. Previously, every mutation that needed to be gzipped would call newWriter and allocate a new object. This in turn spent a lot of time and created extra, unneeded objects on the heap, which drove up GC time.


This change is Reviewable
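For reference, a minimal sketch of the pooled-writer pattern the commit describes; the names `gzipWriterPool` and `compress` are illustrative, not the PR's actual code:

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"sync"
)

// Pool of reusable *gzip.Writer values. New runs only when the pool is empty,
// so steady-state callers reuse writers instead of allocating per call.
var gzipWriterPool = sync.Pool{
	New: func() any { return gzip.NewWriter(nil) },
}

// compress gzips data using a pooled writer rather than gzip.NewWriter per call.
func compress(data []byte) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzipWriterPool.Get().(*gzip.Writer)
	defer gzipWriterPool.Put(zw)
	zw.Reset(&buf) // re-point the reused writer at a fresh destination
	if _, err := zw.Write(data); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil { // flush remaining data and the gzip footer
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	out, err := compress([]byte("hello gzip"))
	fmt.Println(len(out) > 0, err)
}
```

`Writer.Reset` is what makes pooling safe here: it discards the writer's state as if it had just been created, but writing to the new destination.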

@Jeremyyang920
Author

[image: pprof flame graph]
pprof flame graph for reference over a 15s profile. The majority of time is spent in malloc and GC due to all the object allocations.

```go
// before:
r, err := gzip.NewReader(bytes.NewReader(data))

// after:
gzReader := gzipReaderPool.Get().(*gzip.Reader)
defer gzipReaderPool.Put(gzReader)
```


If Reset() failed below, are we sure we want to put this back in the pool?

Author


Hmm, that's a good question. I'm looking at the Reset code, and I think it just calls bufio.NewReader(r) to reset the underlying source. So the Put call is really just returning the base gzip.Reader{} object to the pool, and Reset internally resets z.r, which is the actual reader.

Does that check out with your understanding of Reset as well?
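If the concern is pooling a reader whose Reset failed, one defensive variant is to return the reader to the pool only after Reset succeeds; this is a sketch, not the PR's code, and `decompressSafe` is a hypothetical name:

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
	"sync"
)

var gzipReaderPool = sync.Pool{
	New: func() any { return new(gzip.Reader) },
}

// decompressSafe defers the Put until after Reset succeeds, so a reader left
// in a questionable state by a failed Reset is simply dropped and a fresh one
// is allocated by the pool's New on the next Get.
func decompressSafe(data []byte) ([]byte, error) {
	zr := gzipReaderPool.Get().(*gzip.Reader)
	if err := zr.Reset(bytes.NewReader(data)); err != nil {
		// Bad gzip header: discard the reader instead of pooling it.
		return nil, err
	}
	defer gzipReaderPool.Put(zr)
	defer zr.Close()
	return io.ReadAll(zr)
}

func main() {
	_, err := decompressSafe([]byte("not gzip"))
	fmt.Println(err != nil)
}
```

The cost of dropping a reader on error is one extra allocation on the next Get, which only happens on the (presumably rare) failure path.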

@ryanluu12345
Contributor

Unrelated to the change itself, but I think @noelcrl mentioned in standup that the container images for v20.1 and v20.2 are no longer present. You can just rebase after this goes in:
#1081

Contributor

@ryanluu12345 ryanluu12345 left a comment


General implementation looks good to me. Pending Steven's question and light testing to see the profile changes before/after the change.

internal/staging/stage/gzip.go (resolved)
internal/staging/stage/gzip.go (resolved)
@cockroachdb cockroachdb deleted a comment from ryanluu12345 Nov 26, 2024
This commit uses a sync.Pool for both the gzip writer and reader
so that we reduce the number of allocations and the time GC takes. Previously,
every mutation that needed to be gzipped would call newWriter
and allocate a new object. This in turn spent a lot of time and created
extra, unneeded objects on the heap, which drove up GC time.