
Add support for reading and writing compressed blobs #8106

Merged — arpad-m merged 8 commits from arpad/compression_1 into main on Jul 2, 2024

Conversation

@arpad-m (Member) commented Jun 19, 2024

Add support for reading and writing zstd-compressed blobs, for use in image layer generation, but perhaps one day also useful for delta layers. Reading them is unconditional, while writing is controlled by the `image_compression` config variable, allowing for experiments.

For the on-disk format, we reuse some of the bit patterns we currently keep reserved for blobs larger than 256 MiB. This assumes that we have never written any such large blobs to image layers.

After the preparation in #7852, we are now unable to read blobs larger than 256 MiB (or write them).
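
As a rough illustration of the scheme (a sketch only, with assumed names and constants rather than the actual `blob_io.rs` layout): a 4-byte big-endian header whose top nibble tags the blob as plain or zstd-compressed, leaving 28 bits, i.e. up to 256 MiB − 1, for the length.

```rust
// Illustrative sketch only: the exact tag values and the helper names
// are assumptions, not the pageserver's real on-disk format.

const LEN_BITS: u32 = 28;
const MAX_LEN: u32 = (1 << LEN_BITS) - 1; // 256 MiB - 1

const TAG_UNCOMPRESSED: u32 = 0x8; // the existing "long blob" bit pattern
const TAG_ZSTD: u32 = 0x9; // reused from the range reserved for >256 MiB blobs

fn encode_header(len: u32, compressed: bool) -> Result<[u8; 4], &'static str> {
    if len > MAX_LEN {
        return Err("blob larger than 256 MiB cannot be represented");
    }
    let tag = if compressed { TAG_ZSTD } else { TAG_UNCOMPRESSED };
    Ok(((tag << LEN_BITS) | len).to_be_bytes())
}

fn decode_header(header: [u8; 4]) -> Result<(u32, bool), &'static str> {
    let word = u32::from_be_bytes(header);
    let (tag, len) = (word >> LEN_BITS, word & MAX_LEN);
    match tag {
        TAG_UNCOMPRESSED => Ok((len, false)),
        TAG_ZSTD => Ok((len, true)),
        _ => Err("unknown blob header tag"),
    }
}
```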

TODO:

  • Maybe introduce a new format version so that we can give better errors should we encounter legacy image layers with such large blobs. This would insure us against the case where the assumption above is wrong, i.e. there are images larger than 256 MiB. Eventually decided against, as image layers and delta layers have different ways of storing the version number.

A non-goal of this PR is to come up with good heuristics for when to compress a bit pattern. This is left for future work.

Parts of the PR were inspired by #7091.

cc #7879

Part of #5431

@arpad-m arpad-m requested a review from VladLazar June 19, 2024 00:30
@arpad-m arpad-m force-pushed the arpad/compression_1 branch from 22c35d8 to 2d87a15 Compare June 19, 2024 00:34

github-actions bot commented Jun 19, 2024

3000 tests run: 2885 passed, 0 failed, 115 skipped (full report)


Flaky tests (2)

Postgres 14

  • test_secondary_background_downloads: debug
  • test_subscriber_restart: release

Code coverage* (full report)

  • functions: 32.7% (6932 of 21183 functions)
  • lines: 50.1% (54301 of 108414 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
3e0d7f6 at 2024-07-02T13:55:48.181Z :recycle:

pageserver/src/tenant/blob_io.rs — 3 review threads (resolved)
@arpad-m arpad-m marked this pull request as ready for review June 20, 2024 22:29
@arpad-m arpad-m requested a review from a team as a code owner June 20, 2024 22:29
@arpad-m arpad-m requested review from jcsp and VladLazar June 20, 2024 22:29
@VladLazar (Contributor) left a comment

Test needs updating, but otherwise looks good to me. It would have been interesting to make the decompression lazy (i.e. decompress right before walredo).

pageserver/src/tenant/blob_io.rs — 2 review threads (resolved)
@arpad-m arpad-m requested a review from VladLazar June 21, 2024 15:06
@arpad-m (Member, Author) commented Jul 1, 2024

Apparently the CI tests hit neondatabase/tokio-epoll-uring#46. I asked Christian about it, and he suggested changing the API to take a slice, so I wrote #8225.

@arpad-m arpad-m enabled auto-merge (squash) July 2, 2024 13:12
@arpad-m arpad-m merged commit 25eefde into main Jul 2, 2024
57 checks passed
@arpad-m arpad-m deleted the arpad/compression_1 branch July 2, 2024 14:14
arpad-m added a commit that referenced this pull request Jul 3, 2024
…8238)

PR #8106 was created with the assumption that no blob is larger than
`256 MiB`. Since #7852 we check for *writes* of blobs larger than that
limit, but we didn't check for *reads* of such large blobs: in theory,
we could be reading such blobs every day and simply not happen to write
any new ones.

Therefore, we now add a warning for *reads* of such large blobs as well.

To make deploying compression less dangerous, we also only assume a blob
is compressed if the compression setting is present in the config. This
also means that we can't back out of compression once we have enabled
it.

Part of #5431
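
A minimal sketch of that read-path gating, assuming a hypothetical `ImageCompression` config enum and the 4-byte header layout sketched earlier; names and constants are illustrative, not the pageserver's actual API.

```rust
// Sketch: only interpret the header's compression tag when the config
// enables compression; otherwise fall back to the legacy interpretation
// and warn about blobs above the 256 MiB limit.

#[derive(Clone, Copy, PartialEq)]
enum ImageCompression {
    Disabled,
    Zstd,
}

const LEN_BITS: u32 = 28;
const MAX_SUPPORTED_LEN: u32 = (1 << LEN_BITS) - 1; // 256 MiB - 1

/// Interpret a 4-byte blob header word on the read path.
/// Returns (payload length, needs zstd decompression).
fn read_header(word: u32, config: ImageCompression) -> (u32, bool) {
    if config == ImageCompression::Disabled {
        // Compression not configured: legacy interpretation, where
        // everything below the top bit is the length.
        let len = word & 0x7fff_ffff;
        if len > MAX_SUPPORTED_LEN {
            eprintln!("warning: read blob of {len} bytes, above the 256 MiB limit");
        }
        return (len, false);
    }
    // Compression configured: the top nibble is a tag (0x9 = zstd in this sketch).
    let tag = word >> LEN_BITS;
    let len = word & MAX_SUPPORTED_LEN;
    (len, tag == 0x9)
}
```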
arpad-m added a commit that referenced this pull request Jul 11, 2024
We need to pass on the configured compression param during image layer
generation.

This was an oversight in #8106, and likely the reason why #8288 didn't
surface any interesting regressions.

Part of #5431
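
A hypothetical sketch of the kind of plumbing this refers to; `TenantConf`, `ImageLayerWriter`, and the `ImageCompression` enum are stand-ins, not the pageserver's real types.

```rust
// Stand-in types: this only shows that the configured compression
// setting must reach the image layer writer rather than a default.
#[derive(Clone, Copy)]
enum ImageCompression {
    Disabled,
    Zstd,
}

struct TenantConf {
    image_compression: ImageCompression,
}

struct ImageLayerWriter {
    compression: ImageCompression,
}

impl ImageLayerWriter {
    // The oversight: constructing the writer with a hard-coded
    // ImageCompression::Disabled instead of the configured value meant
    // image layer generation never produced compressed blobs.
    fn new(conf: &TenantConf) -> Self {
        Self {
            compression: conf.image_compression,
        }
    }
}
```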