Merge pull request #48 from humanmade/backport-47-to-v7-branch
[Backport v7-branch] Document wp_robots
roborourke authored Apr 14, 2021
2 parents 049beae + 8023ab9 commit 29739d3
Showing 2 changed files with 77 additions and 35 deletions.
35 changes: 0 additions & 35 deletions docs/robots-txt.md

This file was deleted.

77 changes: 77 additions & 0 deletions docs/robots.md
@@ -0,0 +1,77 @@
# Robots

## Robots API

The Robots API provides central control over the `robots` meta tag. This tag gives you granular, page-specific control over how an individual page is indexed and presented in search engine results. The meta tag is automatically placed in the `<head>` section of a page.

```html
<!DOCTYPE html>
<html>
<head>
<meta name="robots" content="max-image-preview:large, follow" />
</head>
```

The Robots API allows you to hook into this meta tag to modify its values. By default, the `robots` meta tag includes a `max-image-preview:large` directive, which sets the maximum size of image previews for images on the page. To remove this directive, call `remove_filter` on the `wp_robots` filter hook.

```php
remove_filter( 'wp_robots', 'wp_robots_max_image_preview_large' );
```

You can modify the contents of the `robots` meta tag using the `wp_robots` filter as well. The values are passed into the filter as an array.

```php
add_filter( 'wp_robots', function( array $robots ) : array {
	$robots['follow'] = true;
	$robots['foo'] = 'bar';
	unset( $robots['max-image-preview'] );

	return $robots;
} );
```

The example above would output the following:

```html
<meta name="robots" content="follow, foo:bar" />
```
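
To make the mapping from array to attribute concrete, here is a minimal sketch of how such an array could be flattened into the `content` string. It follows the behaviour shown above (a `true` value emits the bare directive, a string value emits `name:value`), and it assumes falsy values are skipped. `example_robots_content()` is a hypothetical helper for illustration only, not a function provided by WordPress or this module.

```php
/**
 * Hypothetical helper: flatten a robots array into a content string.
 * Illustration only; not the actual WordPress implementation.
 */
function example_robots_content( array $robots ) : string {
	$parts = [];
	foreach ( $robots as $directive => $value ) {
		if ( is_string( $value ) ) {
			// String values are rendered as "directive:value".
			$parts[] = "{$directive}:{$value}";
		} elseif ( $value ) {
			// Other truthy values render the bare directive name.
			$parts[] = $directive;
		}
		// Falsy values (assumed here to mean "off") are skipped.
	}

	return implode( ', ', $parts );
}

echo example_robots_content( [ 'follow' => true, 'foo' => 'bar', 'noindex' => false ] );
// Prints: follow, foo:bar
```

Under that assumption, setting a directive to `false` drops it from the output just as `unset()` does.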

Note that on local environments, and when the "Search engine visibility" option in the admin Reading settings is set to "Discourage search engines from indexing this site", the `robots` meta tag includes `noindex, nofollow` by default, in addition to any custom parameters, unless these directives are overridden by the filter.
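
If you do need to override those defaults in a particular environment, one option is a later-running filter that removes the directives. The following is a sketch under stated assumptions: `example_allow_indexing()` is a hypothetical function name, and the priority of `20` assumes the defaults are added at an earlier priority, which you should verify for your setup. Use it with care, since it exposes the site to indexing.

```php
/**
 * Hypothetical filter callback: remove the noindex and nofollow directives.
 */
function example_allow_indexing( array $robots ) : array {
	// Leave every other directive untouched.
	unset( $robots['noindex'], $robots['nofollow'] );

	return $robots;
}

// Only register the hook when running inside WordPress.
if ( function_exists( 'add_filter' ) ) {
	// Priority 20 is an assumption: it should run after the defaults are set.
	add_filter( 'wp_robots', 'example_allow_indexing', 20 );
}
```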

For more information, refer to the [`wp_robots` hook developer documentation](https://developer.wordpress.org/reference/hooks/wp_robots/) or [this list of available `robots` meta values](https://yoast.com/robots-meta-tags/).

## Robots.txt

The SEO module will read a custom `robots.txt` file from `/.config/robots.txt` in your project's root directory.

The `robots.txt` file is a standard for providing instructions to the various bots that may visit your site. There is no guarantee that bots will obey its directives, however, so take other measures if content must not be indexed, such as adding `nofollow` attributes to links and a `robots` meta tag with a value of `noindex` to your website's `<head>`.

An example `robots.txt` file may look like the following:

```
# Add a custom sitemap
Sitemap: /custom-sitemap.xml
# Disallow /private for all user agents
User-agent: *
Disallow: /private
# Allow /private/special for one user agent
User-agent: friendly-bot
Allow: /private/special
```

The contents of that file will be appended to the `robots.txt` file generated by the CMS, which is served at `<site-url>/robots.txt`.

Programmatically generated directives may be added to `robots.txt` via the `robots_txt` filter.

```php
add_filter( 'robots_txt', function ( string $output ) : string {
	$output .= '
User-agent: *
Disallow: /private
';

	return $output;
} );
```
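
When the directives come from data rather than a fixed string, it can help to build them in a small helper. The sketch below assumes a hypothetical `example_disallow_paths()` function and an illustrative list of paths; neither is part of the SEO module. It appends one `Disallow` rule per path for all user agents.

```php
/**
 * Hypothetical helper: append one Disallow rule per path for all user agents.
 */
function example_disallow_paths( string $output, array $paths ) : string {
	if ( empty( $paths ) ) {
		return $output;
	}

	$lines = [ 'User-agent: *' ];
	foreach ( $paths as $path ) {
		$lines[] = 'Disallow: ' . $path;
	}

	// Separate the new group from the existing content with a newline.
	return rtrim( $output ) . "\n" . implode( "\n", $lines ) . "\n";
}

// Only register the hook when running inside WordPress.
if ( function_exists( 'add_filter' ) ) {
	add_filter( 'robots_txt', function ( string $output ) : string {
		// The path list is illustrative; replace it with your own source.
		return example_disallow_paths( $output, [ '/private', '/drafts' ] );
	} );
}
```

Keeping the string building in a plain function makes it possible to check the generated rules without loading the CMS.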
