
Edits for Compression 2021 Chapter #2624

Merged: 1 commit merged into main on Nov 30, 2021

Conversation


@shantsis shantsis commented Nov 30, 2021

Progress on #2525
Closes #2160

Stage link: https://20211116t174953-dot-webalmanac.uk.r.appspot.com/en/2021/compression

Basic edits and some TODOs for the authors to review.

@shantsis shantsis added the editing Content excellence label Nov 30, 2021
@shantsis shantsis added this to the 2021 Launch 🚀 milestone Nov 30, 2021

shantsis commented Nov 30, 2021

I couldn't finish reading through this because I was thrown off by the direction. This is the only chapter I've seen that reads like a guide on how to do something instead of just presenting and analyzing data. Is this the correct approach?

Edit: I see the 2020 chapter is similar, but it's quite odd

@shantsis shantsis requested a review from mo271 November 30, 2021 02:09
@tunetheweb (Member)

Some understanding may be necessary to explain the data being shown, and to advise on how to learn from the findings we present, but yes, some chapters perhaps go too far in this regard.

As this is similar to previous years let's go with it here. I think there's enough interesting data in the chapter that it's still got the "Web Almanac" feel.


## HTTP compression is useful for many types of content

{# TODO - is this title too long? Can it be shortened? #}
@lvandeve lvandeve Nov 30, 2021

Changed it to:

Content types using HTTP compression


If you are currently using only Gzip compression (also known as Deflate or Zlib), you can add support for Brotli. In comparison to Gzip, Brotli compresses to <a hreflang="en" href="https://quixdb.github.io/squash-benchmark/">smaller files at the same speed</a>, decompresses at the same speed, and is widely supported: <a hreflang="en" href="https://caniuse.com/brotli">can I use Brotli</a>.

{# TODO - the wide support has been mentioned twice already #}
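The size comparison in the excerpt above can be tried directly. A minimal Python sketch, assuming the third-party `brotli` binding is available (`pip install Brotli`); the payload and quality settings are illustrative, not from the chapter's methodology:

```python
# Compare Gzip and Brotli output sizes on the same text payload.
# gzip is in the Python standard library; brotli is a third-party
# binding (pip install Brotli), so it is imported defensively.
import gzip

try:
    import brotli
except ImportError:
    brotli = None  # the comparison below then covers Gzip only

html = ("<html><body><p>HTTP compression works well on repetitive "
        "text formats such as HTML, CSS, and JavaScript.</p></body></html>")
data = (html * 100).encode("utf-8")

gzip_size = len(gzip.compress(data, compresslevel=9))
print(f"original: {len(data)} bytes, gzip -9: {gzip_size} bytes")

if brotli is not None:
    brotli_size = len(brotli.compress(data, quality=11))
    print(f"brotli -q11: {brotli_size} bytes")
```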

That's a good point, the part "and is widely supported: can I use Brotli." could be removed, especially since the brotli caniuse link is elsewhere as well


Done, removed this here, and the other two mentions also got merged as far as I know thanks to an earlier TODO


Responses that are Gzip compressed will show "gzip", while those compressed with Brotli will show "br". If the value is blank, no HTTP compression is used. For images this is normal, since these resources are already compressed on their own.

If you hover the mouse over the values in the Size column, you can also see "transferred over network" and "resource size" to compare the compressed and actual sizes. This data can also be seen for the entire site: it is shown as "size" / "transferred size" in Firefox and as "transferred" / "resources" in Chrome, at the bottom left-hand side of the Network tab.

{# TODO is this section above necessary? it has less to do with discussing data #}
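The DevTools checks described in the excerpt above come down to reading the `Content-Encoding` response header. A minimal Python sketch of that logic; the `compression_used` helper is hypothetical, written here only to show how the header values map to the schemes discussed:

```python
# A minimal sketch of the check DevTools performs: map a response's
# Content-Encoding header to the compression scheme in use. The header
# names and values are standard HTTP; the helper itself is hypothetical.

ENCODING_NAMES = {
    "gzip": "Gzip",
    "br": "Brotli",
    "deflate": "Deflate (zlib)",
}

def compression_used(headers: dict) -> str:
    """Name the HTTP compression a response used, like the DevTools column.

    Header lookup is case-insensitive, as HTTP header names are. A blank
    or absent value means no HTTP compression, which is normal for
    resources that are already compressed on their own, such as images.
    """
    lowered = {k.lower(): v for k, v in headers.items()}
    value = lowered.get("content-encoding", "").strip().lower()
    return ENCODING_NAMES.get(value, value) if value else "none"

print(compression_used({"Content-Encoding": "br"}))     # Brotli
print(compression_used({"Content-Type": "image/png"}))  # none
```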
@lvandeve lvandeve Nov 30, 2021

It makes some sense to have it, in that it tells how to analyze your own data (or other websites). If this gets too detailed, it could be shortened by removing the last paragraph above this comment, about the Size column and the transferred network size (since this is just one of the many things you can do). The part higher up about the response header is useful in general, though, as it shows the HTTP content-encoding this chapter is talking about in use. What do you think?


I removed the part about the sizes, and changed the first paragraph to say analysis of "a" site rather than "your" site, since this is generally applicable as a way to analyze compression usage anywhere; that makes this part slightly less instructional. The screenshot remains a useful addition, imho, since it shows the content encodings visually.

@lvandeve

Thank you for the review and edits. I responded to several TODOs, but will instead continue in a pull request with a commit on top of this one to apply the TODOs. I'll keep this updated once the pull request is made

@shantsis

> Thank you for the review and edits. I responded to several TODOs, but will instead continue in a pull request with a commit on top of this one to apply the TODOs. I'll keep this updated once the pull request is made

Sure thing! You can commit directly here

@@ -23,12 +23,14 @@ unedited: true

## Introduction

-Users' time is valuable, and they shouldn't have to wait long for a web page to load. The HTTP protocol allows the responses to be compressed, which decreases the time needed to transfer the content. Even when taking the compression and decompression time into account, compression often leads to significant improvement in the user experience. It can reduce [page weight](./page-weight), improve [web performance](./performance) and boost search rankings, so it's an important part of [Search Engine Optimization](./seo).
+A user's time is valuable, so they shouldn't have to wait a long time for a web page to load. The HTTP protocol allows to compress responses and this decreases the time needed to transfer the content. When taking the compression and decompression time into account, this often leads to significant improvement in the user experience. It can reduce [page weight](./page-weight), improve [web performance](./performance) and boost search rankings. As such it's an important part of [Search Engine Optimization](./seo).

"When taking the compression and decompression time into account" would be clearer as the original "Even when taking ...", because the loading time is faster despite the extra compression time.


I'll remove the compression time part altogether here, and instead describe this a bit better in the compression levels section, since the compression levels determine the CPU time

@lvandeve

> Sure thing! You can commit directly here

I don't know for sure if I have the permissions for that. I created a separate pull request for now (which includes your commit) at #2630. Feel free to cherry pick the commit here instead if that's possible for you, or let me know if there's a way I can add the commit here instead anyway.

I'll now reply to the TODOs in this pull request to explain what changed in the commit.


Practically all text compression is done by one of two HTTP content encodings: <a hreflang="en" href="https://tools.ietf.org/html/rfc1952">Gzip</a> and <a hreflang="en" href="https://github.com/google/brotli">Brotli</a>. Both are widely supported by browsers: <a hreflang="en" href="https://caniuse.com/brotli">can I use Brotli</a> / <a hreflang="en" href="https://caniuse.com/gzip">can I use Gzip</a>. On the server side, most [popular servers](https://en.wikipedia.org/wiki/HTTP_compression#Servers_that_support_HTTP_compression) can be configured to use [Brotli](https://en.wikipedia.org/wiki/Brotli) and/or [Gzip](https://en.wikipedia.org/wiki/Gzip).

Depending on the web server software you use, compression needs to be enabled, and the configuration may be separate for precompressed and dynamically compressed content. Here are a few pointers for two of the most popular web servers. For <a hreflang="en" href="https://httpd.apache.org/">Apache</a>, Brotli can be enabled with <a hreflang="en" href="https://httpd.apache.org/docs/2.4/mod/mod_brotli.html">mod\_brotli</a>, and Gzip with <a hreflang="en" href="https://httpd.apache.org/docs/2.4/mod/mod_deflate.html">mod\_deflate</a>. For <a hreflang="en" href="https://nginx.org/">nginx</a> instructions for <a hreflang="en" href="https://github.com/google/ngx_brotli">enabling Brotli</a> and for <a hreflang="en" href="https://nginx.org/en/docs/http/ngx_http_gzip_module.html">enabling Gzip</a> are available as well.
{# TODO - this paragraph feels like a repeat of above #}
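To make the module links in the excerpt above concrete, here is a hypothetical nginx fragment; the directive names come from the linked ngx_http_gzip_module and ngx_brotli documentation, while the levels and MIME types are placeholder choices rather than recommendations from this chapter:

```nginx
# Dynamic Gzip compression (ngx_http_gzip_module, built in by default)
gzip on;
gzip_comp_level 6;
gzip_types text/css application/javascript application/json image/svg+xml;

# Dynamic Brotli compression (requires the third-party ngx_brotli module)
brotli on;
brotli_comp_level 5;
brotli_types text/css application/javascript application/json image/svg+xml;

# Serve precompressed .gz / .br files from disk when present
# (gzip_static needs nginx built --with-http_gzip_static_module)
gzip_static on;
brotli_static on;
```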

I agree, thanks for spotting this. I merged this paragraph with the above one


- for precompressed (static) content: this content is already compressed beforehand, ideally with the highest level possible, and the web server should be set up to find the appropriate compressed files based on the filename extension, for example.
- for dynamic content, which is compressed on the fly for each request by the web server (or a plugin) itself, for dynamically generated text content

When compressing text with Brotli or Gzip it is possible to select different compression levels. Higher compression levels will result in smaller compressed files, but take a longer time to compress. During decompression, CPU usage tends not to be higher for more heavily compressed files. Rather, files that are compressed with a higher compression level are slightly faster to decode.
{# TODO: these bullets don't flow well grammatically but I don't understand it enough to fix it #}
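The compression-level trade-off described in the excerpt above can be sketched with the standard library alone; zlib implements the same Deflate algorithm that Gzip wraps, so it stands in for Gzip here, and the payload is an arbitrary repetitive example rather than chapter data:

```python
# Higher compression levels produce smaller output but cost more CPU
# time to compress. zlib stands in for Gzip (Gzip is a Deflate stream
# plus a small header); the payload is an illustrative example.
import zlib

payload = b"HTTP compression reduces transfer sizes for text content. " * 200

sizes = {}
for level in (1, 6, 9):  # fastest, default, smallest
    sizes[level] = len(zlib.compress(payload, level))
    print(f"level {level}: {sizes[level]} bytes")

# For repetitive text like this, higher levels never produce larger output.
assert sizes[9] <= sizes[6] <= sizes[1]
```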

I have rewritten these bullet points. I also took the confusion as feedback and added slightly more context by appending "The configuration is different depending on when the content is generated" to the previous sentence.


Depending on the web server software used, compression needs to be enabled, and the configuration may be separate for precompressed and dynamically compressed content. For <a hreflang="en" href="https://httpd.apache.org/">Apache</a>, Brotli can be enabled with <a hreflang="en" href="https://httpd.apache.org/docs/2.4/mod/mod_brotli.html">mod\_brotli</a>, and Gzip with <a hreflang="en" href="https://httpd.apache.org/docs/2.4/mod/mod_deflate.html">mod\_deflate</a>. For <a hreflang="en" href="https://nginx.org/">nginx</a> instructions for <a hreflang="en" href="https://github.com/google/ngx_brotli">enabling Brotli</a> and for <a hreflang="en" href="https://nginx.org/en/docs/http/ngx_http_gzip_module.html">enabling Gzip</a> are available as well.

{# TODO: I've seen at least 2 other chapters mention Brotli and Gzip compression, most recently JavaScript. You could link to that chapter to learn more... or they should be linking to you, not sure actually #}

I took a look and indeed saw the gzip and br analysis. I've linked to the JS article with a specific mention of this, but in the previous section about content types rather than this section here.

chart_url="https://docs.google.com/spreadsheets/d/e/2PACX-1vQtfyTM9VEweN_Hli3IuxxqU1CRap4V5Q28baEs7aEBResoPRgk9Dwp1m_vdS9lzNlfO8J4hZN7GPT7/pubchart?oid=586666706&format=interactive",
sheets_gid="150560131"
)
}}

## How to analyze compression on your sites
{# TODO - should data be broken down more to say which content type? #}
@lvandeve lvandeve Nov 30, 2021

The figure has the breakdown, so I take this to mean describing it in the text. I've done so, with a focus on JavaScript, since the main difference is visible there.

However, in doing so I found an error in the analysis: a very common small JSON file skewed the Gzip 'optimal' stats (this file's compressed sizes are too similar to determine the level). I've fixed the data and figure, updated the text for this, and updated the reference PNG image. For Brotli this made no difference.

@tunetheweb tunetheweb merged commit c81f677 into main Nov 30, 2021
@tunetheweb tunetheweb deleted the compression-2021-edits branch November 30, 2021 18:10