Add metrics for input data considered and taken for compression #8522

Merged
merged 5 commits into from
Jul 30, 2024

@arpad-m arpad-m commented Jul 26, 2024

If compression is enabled, we currently try compressing each image larger than a specific size, and if the compressed version is smaller, we write that one; otherwise we use the uncompressed image. However, this can be wasteful if a substantial number of images don't compress well.

The compression metrics added in #8420, pageserver_compression_image_in_bytes_total and pageserver_compression_image_out_bytes_total, are well suited to answering the question of how space-efficient the compression process is end-to-end, which helps one decide whether to enable it.

To answer the question of how much trial compression is wasted, in terms of CPU time, we add two metrics:

  • one about the images that have been trial-compressed (considered), and
  • one about the images where the compressed image has actually been written (chosen).

There are different ways of weighting them: for example, one could look at the image count or at the compressed size. But the main contributor to compression CPU usage is the amount of data processed, so we weight the images by their uncompressed size. In other words, the two metrics are:

  • pageserver_compression_image_in_bytes_considered
  • pageserver_compression_image_in_bytes_chosen
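
To illustrate how the two counters relate, here is a minimal sketch that is not taken from the PR's diff. The `CompressionMetrics` struct, its field names, and the `record_trial` and `wasted_fraction` helpers are all hypothetical, and plain atomics stand in for whatever metrics types the pageserver actually uses. It assumes the caller has already skipped images below the size threshold, so only trial-compressed images reach `record_trial`.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical stand-ins for the two counters described above.
#[derive(Default)]
pub struct CompressionMetrics {
    /// Uncompressed bytes of every image that was trial-compressed.
    pub in_bytes_considered: AtomicU64,
    /// Uncompressed bytes of images whose compressed form was written.
    pub in_bytes_chosen: AtomicU64,
}

impl CompressionMetrics {
    /// Record one trial compression. Both counters are weighted by the
    /// *uncompressed* size, since that drives compression CPU cost.
    /// Returns true if the compressed version is smaller and should be written.
    pub fn record_trial(&self, uncompressed_len: usize, compressed_len: usize) -> bool {
        self.in_bytes_considered
            .fetch_add(uncompressed_len as u64, Ordering::Relaxed);
        let chosen = compressed_len < uncompressed_len;
        if chosen {
            self.in_bytes_chosen
                .fetch_add(uncompressed_len as u64, Ordering::Relaxed);
        }
        chosen
    }

    /// Fraction of trial-compressed input bytes whose compressed output
    /// was discarded. Returns None if nothing has been considered yet.
    pub fn wasted_fraction(&self) -> Option<f64> {
        let considered = self.in_bytes_considered.load(Ordering::Relaxed);
        if considered == 0 {
            return None;
        }
        let chosen = self.in_bytes_chosen.load(Ordering::Relaxed);
        Some(1.0 - chosen as f64 / considered as f64)
    }
}
```

Under these assumptions, a wasted fraction close to 1 would suggest that most trial compression work is thrown away, while a value close to 0 would suggest that nearly every trial pays off.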

Part of #5431

@arpad-m arpad-m requested a review from a team as a code owner July 26, 2024 02:05
@arpad-m arpad-m requested a review from problame July 26, 2024 02:05

github-actions bot commented Jul 26, 2024

3138 tests run: 3017 passed, 0 failed, 121 skipped (full report)


Flaky tests (1)

Postgres 14

Code coverage* (full report)

  • functions: 32.8% (7022 of 21383 functions)
  • lines: 50.0% (55881 of 111692 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
0e7d493 at 2024-07-30T05:42:01.765Z :recycle:

@arpad-m arpad-m requested a review from jcsp July 29, 2024 09:23
@arpad-m arpad-m merged commit 1c7b06c into main Jul 30, 2024
65 checks passed
@arpad-m arpad-m deleted the arpad/compression_11 branch July 30, 2024 07:59
arpad-m added a commit that referenced this pull request Aug 5, 2024