Enable zstd in tests #8368

Merged: arpad-m merged 10 commits into main from arpad/compression_10 on Jul 18, 2024

Conversation

arpad-m (Member) commented on Jul 12, 2024

Successor of #8288, this just enables zstd in tests. It also adds a test that creates easily compressible data.

Part of #5431
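
To make "easily compressible data" concrete, here is a minimal hypothetical sketch (not the actual test code added by this PR; the table name and sizes are invented) of writing highly repetitive rows into Postgres:

```python
# Hypothetical sketch only: long runs of a repeated character compress extremely
# well under zstd, so image layers built from this data should shrink noticeably
# once compression is enabled.
def write_compressible_rows(cur, n_rows: int = 10_000):
    cur.execute("CREATE TABLE compressible (id int, payload text)")
    cur.execute(
        "INSERT INTO compressible "
        "SELECT g, repeat('x', 1024) FROM generate_series(1, %s) AS g",
        (n_rows,),
    )
```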

arpad-m requested a review from skyzh on July 12, 2024 14:19
github-actions bot commented on Jul 12, 2024

3126 tests run: 3005 passed, 0 failed, 121 skipped (full report)


Code coverage* (full report)

  • functions: 32.7% (6991 of 21386 functions)
  • lines: 50.1% (55231 of 110136 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results.
Last updated: 2f675d0 at 2024-07-18T18:14:04.899Z

arpad-m (Member, Author) commented on Jul 12, 2024

test_forward_compatibility gives:

Error: Failed to parse pageserver configuration

Caused by:
    unrecognized pageserver option 'image_compression'

Apparently it still uses a pageserver from commit 939b595.
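
For illustration only (this is not pageserver code), a minimal Python sketch of why a strictly parsed config breaks forward compatibility as soon as a new option such as `image_compression` appears:

```python
# Toy model of strict config parsing: the older binary only knows a fixed set
# of options, so any key added later is rejected outright.
KNOWN_OPTIONS_IN_OLD_BINARY = {"listen_pg_addr", "listen_http_addr"}  # hypothetical subset

def parse_config_strict(config: dict) -> dict:
    for key in config:
        if key not in KNOWN_OPTIONS_IN_OLD_BINARY:
            raise ValueError(f"unrecognized pageserver option '{key}'")
    return config

# The pageserver built at 939b595 predates 'image_compression', so feeding it a
# config that sets the option fails with exactly the error quoted above:
# parse_config_strict({"image_compression": "zstd"})  # -> ValueError
```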

arpad-m (Member, Author) commented on Jul 12, 2024

The test_sharding_compaction failure is interesting:

AssertionError: assert 30983.603960396038 > (131072 / 2)

test_runner/regress/test_compaction.py:200: in test_sharding_compaction
    assert avg_size > compaction_target_size / 2
E   assert 30983.603960396038 > (131072 / 2)

Quoting the test description:

We are looking for bugs that might emerge from the way sharding uses sparse layer files that only contain some of the keys in the key range covered by the layer, such as errors estimating the size of layers that might result in too-small layer files.

I suppose this is precisely what is going on here: compaction is planned using uncompressed sizes, and the images are then compressed on write, so the resulting layer files come out smaller than planned. Chi's work might be useful here, cutting an image layer on demand instead of with the current planning-based approach.
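
A back-of-the-envelope sketch of that effect; the compression ratio below is a made-up number for easily compressible data, not a measured one:

```python
# Compaction plans the image layer cut-over on *uncompressed* bytes; the bytes
# actually written are compressed, so the on-disk layer undershoots the target.
compaction_target_size = 131072        # from the test assertion above
assumed_compression_ratio = 0.25       # hypothetical ratio for very compressible data

planned_uncompressed_bytes = compaction_target_size   # planner fills up to the target
on_disk_bytes = planned_uncompressed_bytes * assumed_compression_ratio

# 32768.0 is well below 131072 / 2 == 65536, roughly what the test observed
# (an average layer size of ~30984 bytes), so `avg_size > compaction_target_size / 2`
# fails even though compaction itself behaved as planned.
assert on_disk_bytes < compaction_target_size / 2
```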

arpad-m (Member, Author) commented on Jul 12, 2024

test_partial_evict_tenant failure link:

AssertionError: assert 0.0817762921333351 < 0.06

test_runner/regress/test_disk_usage_eviction.py:622: in test_partial_evict_tenant
    assert abs_diff < expectation
E   assert 0.0817762921333351 < 0.06

Here we are apparently exceeding the allowed difference. I suppose the test needs to be adjusted in some way to account for the smaller layer sizes; I'm not sure which approach is best.

            # in this test case both relative_spare and relative_equal produce
            # the same outcomes; this must be a quantization effect of similar
            # sizes (-s4 and -s6) and small (5MB) layer size.
            # for pg15 and pg16 the absdiff is < 0.01, for pg14 it is closer to 0.02
            assert abs_diff < expectation

arpad-m force-pushed the arpad/compression_10 branch from b9ae709 to 10377ea on July 17, 2024 03:42
arpad-m (Member, Author) commented on Jul 18, 2024

The test_forward_compatibility failure is fixed, as expected. Nice.

## Problem

We didn't have a way to see whether compression was really working.

## Summary of changes

- Add `pageserver_compression_image_in_bytes_total` and `pageserver_compression_image_out_bytes_total`
- Validate these from test_image_layer_compression (see the sketch below)
- Run that test both with and without compression, to validate not just that compression works but also that switching it off works.
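
A hedged sketch of the kind of metric check described above; `get_metric_value` here is a hypothetical stand-in for however the test harness scrapes the pageserver's Prometheus endpoint, not a specific fixture API:

```python
# Illustrative only: with compression enabled and compressible data, the bytes
# written out by the compression path should be strictly smaller than the bytes
# fed into it.
def assert_compression_effective(get_metric_value) -> None:
    bytes_in = get_metric_value("pageserver_compression_image_in_bytes_total")
    bytes_out = get_metric_value("pageserver_compression_image_out_bytes_total")
    assert bytes_in > 0, "expected some image data to pass through compression"
    assert bytes_out < bytes_in, "compressed output should be smaller than the input"
```

The run with compression disabled would then check the opposite, i.e. that the counters show no such reduction (in whatever way that is exposed when the feature is off).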
jcsp requested a review from a team as a code owner on July 18, 2024 11:59
arpad-m enabled auto-merge (squash) on July 18, 2024 13:17
arpad-m disabled auto-merge on July 18, 2024 14:54
arpad-m (Member, Author) commented on Jul 18, 2024

Addressing John's feedback on Slack:

I was thinking if it's bad, we might choose to disable compression in debug mode and leave it enabled in release. Would prefer to just always have it enabled everywhere though, if the runtimes are okay

Comparing the CI runtime with zstd enabled (link) and with zstd disabled (link):

| job | zstd | v14 | v15 | v16 |
| --- | --- | --- | --- | --- |
| regress-tests debug | yes | 24m 16s | 25m 13s | 23m 6s |
| regress-tests debug | no | 24m 31s | 24m 35s | 24m 38s |
| regress-tests release | yes | 12m 52s | 11m 32s | 11m 17s |
| regress-tests release | no | 11m 39s | 12m 51s | 11m 14s |

I'd say that if there is an impact, it's very small and well within the noise. If it turns out to slow down CI, we can still add such a restriction later in a follow-up.

arpad-m enabled auto-merge (squash) on July 18, 2024 17:23
arpad-m merged commit c96e801 into main on Jul 18, 2024
61 checks passed
arpad-m deleted the arpad/compression_10 branch on July 18, 2024 18:09
problame pushed a commit that referenced this pull request Jul 22, 2024
Successor of #8288, just enable zstd in tests. Also adds a test that
creates easily compressible data.

Part of #5431

---------

Co-authored-by: John Spray <[email protected]>
Co-authored-by: Joonas Koivunen <[email protected]>
arpad-m added a commit that referenced this pull request Aug 18, 2024
After the rollout has succeeded, we now set the default image
compression to be enabled.

We also remove its explicit mention from `neon_fixtures.py` added in
#8368 as it is now the default (and we switch to `zstd(1)` which is a
bit nicer on CPU time).

Part of #5431
VladLazar pushed a commit that referenced this pull request Aug 20, 2024
After the rollout has succeeded, we now set the default image
compression to be enabled.

We also remove its explicit mention from `neon_fixtures.py` added in
#8368 as it is now the default (and we switch to `zstd(1)` which is a
bit nicer on CPU time).

Part of #5431