SEO 2020 edits #1625

Merged
tunetheweb merged 40 commits into main from seo-2020-edits
Dec 8, 2020

Conversation

rviscomi (Member) commented Dec 4, 2020

Progress on #908 #1432

Editors' notes:

  • global replace of smart quotes to straight quotes
  • code format for things like robots.txt and HTTP headers
  • Oxford comma
  • replace "homepage" to "home page"
  • right alignment of numeric data in tables
  • replace passive voice with active voice
  • various TODOs and fixes

@rviscomi rviscomi added the editing Content excellence label Dec 4, 2020
@rviscomi rviscomi added this to the 2020 Content Writing milestone Dec 4, 2020
rviscomi (Member Author) left a comment

Adding placeholder comments for the content team to resolve all of the existing TODOs while I start the deeper edit.

@@ -569,7 +568,7 @@ Feature | Mobile | Desktop
`orientation` | 33.48% | 33.49%
`max-device-width` | 26.23% | 28.15%

<figcaption>{{ figure_link(caption="Media query usage.", sheets_gid="1141218471", sql_file="TODO.sql") }}</figcaption>
<figcaption>{{ figure_link(caption="Media query usage.", sheets_gid="1141218471", sql_file="TODO..sql") }}</figcaption>
Member Author

@Tiggerito can you suggest the correct SQL file for this figure?

Contributor

It looks like I stole this from you 😮

I asked the CSS chapter for some help:

#898 (comment)

And you provided the data...

https://docs.google.com/spreadsheets/d/1sMWXWjMujqfAREYxNbG_t1fOJKYCA6ASLwtz4pBQVTw/edit#gid=1374950017

Contributor

@rviscomi what's the solution here? Is there any formal SQL to reference?

@@ -637,7 +636,7 @@ Good | 15.44% | 8.39%
Average | 25.49% | 20.19%
Poor | 59.06% | 71.42%

<figcaption>{{ figure_link(caption="Good, Average and Poor ratios of Lighthouse v5 versus v6", sheets_gid="692150551", sql_file="TODO.sql") }}</figcaption>
<figcaption>{{ figure_link(caption="Good, Average and Poor ratios of Lighthouse v5 versus v6", sheets_gid="692150551", sql_file="TODO..sql") }}</figcaption>
Member Author

@Tiggerito can you suggest the correct SQL file for this figure?

Contributor

I think that came from the performance chapter @fellowhuman1101 ?

Member Author

Ah interesting, so if the query itself doesn't live in the SEO directory then this value wouldn't be very useful and we can omit it.

Member

It’s my intention to handle a full path here, or even relative (../performance/query.sql) in much the same way as we do for images. So would still be good to get the SQL from performance directory.

Member Author

@bazzadp does it currently work with absolute/relative paths or is there an open issue to track that?

Member

It will work with relative but not absolute. Will need to test to be 100% sure but pretty confident. Plan on looking at this, this weekend when I’ll add absolute support too. Will try to finish this out for launch - saw your updates but not had a chance to look at them yet.

Contributor

> I think that came from the performance chapter @fellowhuman1101 ?

Correct, this came from the performance chapter.

Contributor

This figure seems to have been removed, so no issue anymore.

This was referenced Dec 4, 2020
rviscomi and others added 2 commits December 4, 2020 15:01
@rviscomi rviscomi assigned rviscomi and unassigned aleyda and Tiggerito Dec 4, 2020

Let us go through this years’ websites Organic Search optimization main findings.
{# TODO(authors): Is "Organic Search" a proper noun? Or should it be lowercase? #}
Contributor

@rviscomi should be lowercase! Should I directly update the doc? Just let me know the best way to proceed to fix :) Thanks

Contributor

@rviscomi saw your comment below about directly fixing and pushing commits so I already replaced this! Thanks


{# TODO(analysts, authors): Note that mobile and desktop can't be combined into "all devices" since they are overlapping datasets and most websites would be double-counted. When citing stats throughout the chapter, you need to specify which client you're referring to or include a disclaimer in the intro that stats are mobile unless specified otherwise. #}
Member Author

@aleyda @Tiggerito I haven't scanned the rest of the document but I'm afraid that this might be common throughout. Could you suggest changes or push commits directly to this branch to resolve any of these issues?

Contributor

@rviscomi I had a scan through the markdown and only saw this case. I can fix this one. What's the process here, edit the latest draft, and remove the TODO lines, then commit?

I'm around for a few hours then out for the day.


When analyzing the usage of the disallow statement in robots.txt by using Lighthouse-powered data of over 6 million sites, it was found that 97.84% of them were completely crawlable, with only 1.05% using a disallow statement.
This is notable as [Google documentation](https://developers.google.com/search/docs/advanced/robots/intro) states that site owners should not use `robots.txt` as a means to hide web pages from Google Search, as internal linking with descriptive text could result in the page being indexed without a crawler visiting the page. Instead, site owners should use other methods, like a `noindex` directive via meta robots.
{# TODO(authors): Tie this notable fact back to the data: is it notable because the disallow numbers are so low? What does that say about site owners following Google's guidance? #}
Contributor

@rviscomi it's notable because of Google's guidance about not using robots.txt for that purpose and it is already tied to the "source" by linking to the Google documentation where these guidelines are given using the "Google Documentation" text as anchor text.

The URL is: https://developers.google.com/search/docs/advanced/robots/intro where you can see the guidance:

"You should not use robots.txt as a means to hide your web pages from Google Search results."

--- If this is not clear, could you please let me know how to better rephrase it and link to make it clearer?... I thought it was :(

Member Author (rviscomi, Dec 6, 2020)

What you've written is totally relevant and appropriate, my suggestion is to relate it more directly with the data. I'd suggest something like "The low usage of Disallow statements seems to suggest that site owners are adhering to Google's guidance." if you agree with that interpretation.

Contributor

Thanks @rviscomi ! Understood :D I added the context on why adding disallow along more "indexable" pages rather than "noindexed" is what is notable.
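
The distinction this thread relies on, that a `Disallow` rule blocks crawling while a `noindex` directive in meta robots addresses indexing, can be sketched with the Python standard library. This is only an illustration of the two mechanisms, not of how Google evaluates them; the `robots.txt` content, the page markup, the example URLs, and the helper names `is_crawl_allowed` and `has_noindex` are all assumptions made up for the example.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt and page markup, used only for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""
PAGE_HTML = '<html><head><meta name="robots" content="noindex, follow"></head></html>'


def is_crawl_allowed(robots_txt, user_agent, url):
    """Return True if the robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


class MetaRobotsParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        if (attr_map.get("name") or "").lower() != "robots":
            return
        content = attr_map.get("content") or ""
        # Directives are comma-separated, e.g. "noindex, follow".
        self.directives.extend(
            token.strip().lower() for token in content.split(",") if token.strip()
        )


def has_noindex(html):
    """Return True if the page carries a noindex directive in meta robots."""
    parser = MetaRobotsParser()
    parser.feed(html)
    return "noindex" in parser.directives
```

A disallowed URL is never fetched, so a crawler that respects `robots.txt` cannot see a `noindex` directive on that page; this is why the two mechanisms are not interchangeable.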

rviscomi (Member Author) left a comment

Slow progress. More TODOs. Please feel free to work on them while I finish the edits. The best workflow would be to resolve them by suggesting changes in the PR.


As part of our examination, we took a look at the incidence rates of different types of structured markup. The available formats include [RDFa](https://www.w3.org/TR/rdfa-primer/) and [Schema.org](https://schema.org/) which come in both the microformats and [JSON-LD](https://www.w3.org/TR/json-ld11/) flavors. Google has recently [dropped the support for data-vocabulary](https://developers.google.com/search/blog/2020/01/data-vocabulary), a vocabulary that was primarily used to implement breadcrumbs.
{# TODO(authors): Is schema.org itself a "format"? #}
Contributor

@rviscomi I really can't think of another way to call it - could you or maybe @Tiggerito suggest one?

Contributor

schema.org is a vocabulary, it defines what can exist in its world. Formats are things like json-ld, microdata and rdfa which are ways to write information in a vocabulary. data-vocabulary is an alternate vocabulary that Google is dropping.

Chiseling in a stone is a format, the hieroglyphics you make are the vocabulary.

This said after a few 🍷s. So don't quote me, unless it works.

There are more formats. Those are the ones I looked for because they are the main ones that Google officially cares about.

In the future I think it would be worth looking into other things like open graph (a vocabulary+rdfa format) and its hybrid/proprietary use by Facebook, Twitter and Pinterest.
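
To illustrate the vocabulary-versus-format distinction drawn in the comment above, here is a minimal sketch that expresses the same schema.org `BreadcrumbList` in two formats, JSON-LD and microdata. The function names and the sample breadcrumb data are hypothetical and chosen only for the example; the type and property names (`BreadcrumbList`, `ListItem`, `itemListElement`, `position`, `name`, `item`) come from the schema.org vocabulary.

```python
import json
from html import escape

SCHEMA_ORG = "https://schema.org"


def breadcrumbs_as_json_ld(crumbs):
    """Serialize (name, url) pairs as a schema.org BreadcrumbList in JSON-LD.

    The returned JSON would normally be embedded in a
    <script type="application/ld+json"> element.
    """
    data = {
        "@context": SCHEMA_ORG,
        "@type": "BreadcrumbList",
        "itemListElement": [
            {"@type": "ListItem", "position": i, "name": name, "item": url}
            for i, (name, url) in enumerate(crumbs, start=1)
        ],
    }
    return json.dumps(data, indent=2)


def breadcrumbs_as_microdata(crumbs):
    """Render the same BreadcrumbList as HTML annotated with microdata."""
    items = []
    for i, (name, url) in enumerate(crumbs, start=1):
        items.append(
            f'<li itemprop="itemListElement" itemscope '
            f'itemtype="{SCHEMA_ORG}/ListItem">'
            f'<a itemprop="item" href="{escape(url)}">'
            f'<span itemprop="name">{escape(name)}</span></a>'
            f'<meta itemprop="position" content="{i}"></li>'
        )
    return (
        f'<ol itemscope itemtype="{SCHEMA_ORG}/BreadcrumbList">'
        + "".join(items)
        + "</ol>"
    )


# Hypothetical sample data.
CRUMBS = [("Home", "https://example.com/"), ("Docs", "https://example.com/docs")]
```

Both outputs describe the same thing using the same vocabulary; only the way it is written into the page differs.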

Contributor

@Tiggerito omg yes! they're vocabularies :) Will update accordingly @rviscomi!

Tony, you always bring "structured data light", with wine or without :D

{# TODO(authors): Is this disparity really noteworthy? The difference seems quite small. #}
Contributor

@rviscomi Although the disparity between mobile vs. desktop is not "big" it's important to show how there are still slightly more desktop pages featuring one, because of mobile first index. Would you suggest to eliminate the mention of the desktop instead?

Member Author

SGTM, I'll drop this


Additionally, we found that 38.61% of desktop pages and 39.26% of mobile pages feature JSON-LD or microformat structured data in the raw HTML, while 40.09% of desktop pages and 40.97% of mobile pages feature structured data in the rendered DOM.
{# TODO(authors): This section introduces a few stats but doesn't go into your interpretations of the results. What do you hope readers take away from these stats? #}
Contributor

@rviscomi I added context/interpretation of SD raw vs. rendered stats.

With the increasing popularity of mobile devices to browse and search across the web, search engines have been taking mobile friendliness into consideration as a ranking factor for several years.

{# TODO(authors): MFI has been discussed earlier, so the "in fact" doesn't pack as much punch this time. Consider rephrasing. #}
Contributor

@rviscomi the "last year reference" that had been included was correct. The MFI for new sites was launched in July last year (as can be seen here) but the confusion was caused because I hadn't linked to that "announcement page" but instead to one published this July describing the different stages of the process and the next ones. So, it was indeed last year.
What I've done to avoid confusion is to 1) directly add the year "2019" and 2) link to the announcement page about last year's MFI launch for all new sites.

Member Author

Sounds great thank you

{# TODO(authors): MFI has been discussed earlier, so the "in fact" doesn't pack as much punch this time. Consider rephrasing. #}
In fact, [since 2016](https://developers.google.com/search/blog/2016/11/mobile-first-indexing) Google has been moving to a mobile-first index, meaning that the content that is crawled, indexed, and ranked is the one accessible to mobile users and the [Smartphone Googlebot](https://developers.google.com/search/docs/advanced/crawling/googlebot?hl=en).

{# TODO(authors): Can you clarify the timeline? You say "July last year" but the blog post is dated July 2020. Would "July this year" change how you structure this sentence chronologically? #}
Contributor

@rviscomi I reworded this too along the previous change.

{{ figure_link(
caption="Percent of pages that include each media query feature.",
sheets_gid="1141218471",
sql_file="TODO..sql"
Member Author

@Tiggerito I think this change obsoleted the last comment thread, reviving it here

Contributor

My response was:

It looks like I stole this from you 😮

I asked the CSS chapter for some help:

#898 (comment)

And you provided the data...

https://docs.google.com/spreadsheets/d/1sMWXWjMujqfAREYxNbG_t1fOJKYCA6ASLwtz4pBQVTw/edit#gid=1374950017


{# NOTE(authors): I've made some ruthless edits to this section to remove everything related to synthetic measurement of CWV, including the entire Lighthouse discussion, which is orthogonal to the real-user aspect of CWV. Please push back if you disagree with any of these edits. #}
Member Author

⚠️ @aleyda FYI this was a significant edit. I wanted to make sure the CWV discussion was focused on real-user data from CrUX and didn't get confused with lab data from Lighthouse.

Contributor

Thanks for the heads up @rviscomi ! @fellowhuman1101 since these are major edits to content from the performance section, could you please take a look too in case there's something you would like to change/replace? :) I'll check it out too but want to make sure you're ok with them!

Contributor

@aleyda Confirming: no changes requested. I think @rviscomi 's edits are judicious and help disambiguate CrUX data's role as a ranking factor from Lighthouse's lab data.

Contributor

Thanks @fellowhuman1101, that's great - I just took a look too, it's all good @rviscomi :)


{# TODO(analysts): Please double check the following two sql_files, as these metrics are related to Lighthouse. #}
Contributor

Good spot @rviscomi . That chart did not come from lighthouse.sql. It's in:

https://docs.google.com/spreadsheets/d/1ram47FshAjzvbQVJbAQPgxZN7PPOPCKIK67VJZCo92c/edit#gid=996380787

I think @fellowhuman1101 did that one?

I now have 20 GitHub tabs, 10 Drive tabs, and 3 Sheets tabs open. And 15 tabs in my Visual Studio Code. Hopefully no more needed!

Contributor

@rviscomi @Tiggerito The data is from the Performance results workbook (Tab: Web Vitals per device) >> https://docs.google.com/spreadsheets/d/164FVuCQ7gPhTWUXJl1av5_hBxjncNi0TK8RnNseNPJQ/edit#gid=1270303192&range=A1

Copied into the SEO chapter to create the charts.

Contributor

@rviscomi can I reference this by: ../09_Performance/web_vitals_by_device.sql

Contributor

I've updated the sql reference for this chart and the one after using a relative path
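
As a rough sketch of what a relative `sql_file` reference such as `../09_Performance/web_vitals_by_device.sql` would resolve to, the snippet below joins it onto a chapter's query directory. The base directory `sql/2020/07_SEO` and the helper name `resolve_sql_file` are assumptions made for illustration and are not taken from the site's code; rejecting absolute paths simply mirrors the comment in this PR that absolute paths were not yet supported.

```python
import posixpath

# Hypothetical directory holding the SEO chapter's queries (an assumption).
SEO_SQL_DIR = "sql/2020/07_SEO"


def resolve_sql_file(chapter_dir, sql_file):
    """Resolve a figure's sql_file value against its chapter directory.

    Relative values, including ones that climb into a sibling chapter
    with "..", are joined onto chapter_dir and normalized.
    """
    if posixpath.isabs(sql_file):
        # Assumption: absolute paths are treated as unsupported in this sketch.
        raise ValueError(f"absolute sql_file paths are not handled: {sql_file}")
    return posixpath.normpath(posixpath.join(chapter_dir, sql_file))


resolved = resolve_sql_file(SEO_SQL_DIR, "../09_Performance/web_vitals_by_device.sql")
```

With the assumed base directory, `resolved` points into the sibling `09_Performance` directory, while a bare file name stays inside the chapter's own directory.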

rviscomi (Member Author) commented Dec 6, 2020

I've currently edited up through the Performance section and I hope to have the rest edited tonight (New York time). Apologies for the delay! Meanwhile there are a few TODOs for the team to look at while I finish. I'll mark it "ready for review" when I'm done, and open it up to comments on my edits beyond the TODOs.

@rviscomi rviscomi marked this pull request as ready for review December 7, 2020 01:40
antoineeripret and others added 2 commits December 7, 2020 13:47
Several images have been fixed based on Tony's and Rick's comments.
github-actions bot (Contributor) commented Dec 7, 2020

Images automagically compressed by Calibre's image-actions

Compression reduced images by 44.8%, saving 98.65 KB.

Filename | Before | After | Improvement
-- | --: | --: | --:
src/static/images/2020/seo/seo-canonical-implementation-method.png | 29.36 KB | 15.94 KB | -45.7%
src/static/images/2020/seo/seo-nofollow-ugc-sponsored-attributes.png | 31.42 KB | 17.48 KB | -44.4%
src/static/images/2020/seo/seo-presence-of-canonical-tag.png | 33.35 KB | 18.80 KB | -43.6%
src/static/images/2020/seo/seo-presence-of-h-elements.png | 32.67 KB | 17.80 KB | -45.5%
src/static/images/2020/seo/seo-presence-of-non-empty-h-elements.png | 34.43 KB | 19.01 KB | -44.8%
src/static/images/2020/seo/seo-robots-directive-use.png | 29.46 KB | 16.02 KB | -45.6%
src/static/images/2020/seo/seo-title-character-count.png | 29.57 KB | 16.56 KB | -44.0%

492 images did not require optimisation.

Update required: Update image-actions configuration to the latest version before 1/1/21. See README for instructions.

rviscomi (Member Author) commented Dec 7, 2020

Thank you everyone! @aleyda if this looks good to you I can merge.

aleyda (Contributor) commented Dec 8, 2020

> Thank you everyone! @aleyda if this looks good to you I can merge.

It does! Please merge @rviscomi :)

@ipullrank could you please leave a bit of time today to go through the pending reviews/feedback needed from you? I think it's the only thing pending atm. Thanks a lot!

@tunetheweb tunetheweb merged commit c23c72e into main Dec 8, 2020
@tunetheweb tunetheweb deleted the seo-2020-edits branch December 8, 2020 10:37
Labels
editing Content excellence
6 participants