Robots
meta
tag,
data-nosnippet
, and
X-Robots-Tag
specifications
This document details how the page- and text-level settings can be used to adjust how Google
presents your content in search results. You can specify page-level settings by including a
meta
tag on HTML pages or in an HTTP header. You can specify text-level settings with the
data-nosnippet
attribute on HTML elements within a page.
Keep in mind that these settings can be read and followed only if crawlers are allowed to
access the pages that include these settings.
The
<meta name="robots" content="noindex">
rule applies to search engine
crawlers. To block non-search crawlers, such as
AdsBot-Google
, you might need to add rules targeted to the specific
crawler (for example,
<meta name="AdsBot-Google" content="noindex">
).
The
robots
meta
tag lets you utilize a granular, page-specific approach to controlling how an
individual page should be indexed and served to users in Google Search results. Place the
robots
meta
tag in the
<head>
section of a given page, like
this:
<!DOCTYPE html>
<html><head>
<meta name="robots" content="noindex">
(…)
</head>
<body>
(…)
</body>
</html>
In this example, the
robots
meta
tag instructs search
engines not to show the page in search results. The value of the
name
attribute
(
robots
) specifies that the rule applies to all crawlers. Both the
name
and the
content
attributes are case-insensitive. To address a specific crawler,
replace the
robots
value of the
name
attribute with the user agent token
of the crawler that you are addressing. Google supports two user agent tokens in the
robots
meta
tag; other values are ignored:
googlebot
: for all text results.
googlebot-news
: for news results.
For example, to instruct Google specifically not to show a page in its search results, you can
specify
googlebot
as the name of the
meta
tag:
<meta name="googlebot" content="noindex">
To show a page in Google's web search results, but not in Google News, use the
googlebot-news
meta
tag:
<meta name="googlebot-news" content="noindex">
To specify multiple crawlers individually, use multiple
robots
meta
tags:
<meta name="googlebot" content="noindex">
<meta name="googlebot-news" content="nosnippet">
To block indexing of non-HTML resources, such as PDF files, video files, or image files,
use the
X-Robots-Tag
response header
instead.
Using the
X-Robots-Tag
HTTP header
The
X-Robots-Tag
can be used as an element of the HTTP header
response for a given URL. Any rule that can be used in a
robots
meta
tag can also be
specified as an
X-Robots-Tag
. Here's an example of an HTTP
response with an
X-Robots-Tag
instructing crawlers not to index a
page:
HTTP/1.1 200 OK
Date: Tue, 25 May 2010 21:42:43 GMT
(…)
X-Robots-Tag: noindex
(…)
Multiple
X-Robots-Tag
headers can be combined within the HTTP
response, or you can specify a comma-separated list of rules. Here's an example of an
HTTP header response which has a
noarchive
X-Robots-Tag
combined with an
unavailable_after
X-Robots-Tag
.
HTTP/1.1 200 OK
Date: Tue, 25 May 2010 21:42:43 GMT
(…)
X-Robots-Tag: noarchive
X-Robots-Tag: unavailable_after: 25 Jun 2010 15:00:00 PST
(…)
The
X-Robots-Tag
may optionally specify a user agent before the
rules. For instance, the following set of
X-Robots-Tag
HTTP
headers can be used to conditionally allow showing of a page in search results for different
search engines:
HTTP/1.1 200 OK
Date: Tue, 25 May 2010 21:42:43 GMT
(…)
X-Robots-Tag: googlebot: nofollow
X-Robots-Tag: otherbot: noindex, nofollow
(…)
Rules specified without a user agent are valid for all crawlers. The HTTP header, the
user agent name, and the specified values are not case sensitive.
Valid indexing and serving rules
The following rules, also available in
machine-readable format
, can be used to
control indexing and serving of a
snippet
with
the
robots
meta
tag and the
X-Robots-Tag
. Each value represents a specific
rule.
Multiple rules may be combined
in a comma-separated
list or in separate
meta
tags. These rules are case-insensitive.
Rules
|
all
|
There are no restrictions for indexing or serving. This rule is the default value and
has no effect if explicitly listed.
|
noindex
|
Do not show this page, media, or resource in search results. If you don't specify this
rule, the page, media, or resource may be indexed and shown in search results.
To remove information from Google, follow our
step-by-step guide
.
|
nofollow
|
Do not follow the links on this page. If you don't specify this rule, Google may use
the links on the page to discover those linked pages. Learn more about
nofollow
.
|
none
|
Equivalent to
noindex, nofollow
.
|
noarchive
|
Do not show a
cached link
in search results. If you don't specify this rule, Google may generate a cached page
and users may access it through the search results.
|
nositelinkssearchbox
|
Do not show a
sitelinks search box
in the search results for this page. If you don't specify this rule, Google may
generate a search box specific to your site in search results, along with other direct links to your site.
|
nosnippet
|
Do not show a text snippet or video preview in the search results for this page. A
static image thumbnail (if available) may still be visible, when it results in a better
user experience. This applies to all forms of search results (at Google: web search,
Google Images, Discover, AI Overviews) and
will also prevent the content from being used as a direct input for AI Overviews.
If you don't specify this rule, Google may generate a text snippet and video
preview based on information found on the page.
To exclude certain sections of your content from appearing in search result snippets,
use the
data-nosnippet
HTML attribute
.
|
indexifembedded
|
Google is allowed to index the content of a page if it's embedded in another page
through
iframes
or similar HTML tags, in spite of a
noindex
rule.
indexifembedded
only has an effect if it's accompanied by
noindex
.
|
max-snippet:
[number]
|
Use a maximum of [number] characters as a textual snippet for this search result. (Note
that a URL may appear as multiple search results within a search results page.) This
does not affect image or video previews. This applies to all forms of search results
(such as Google web search, Google Images, Discover, Assistant, AI Overviews) and
will also limit how much of the content may be used as a direct input for AI Overviews.
However, this limit
does not apply in cases where a publisher has separately granted permission for use of
content. For instance, if the publisher supplies content in the form of in-page
structured data or has a license agreement with Google, this setting does not interrupt
those more specific permitted uses. This rule is ignored if no parseable [number]
is specified.
If you don't specify this rule, Google will choose the length of the snippet.
Special values:
-
0
: No snippet is to be shown. Equivalent to
nosnippet
.
-
-1
: Google will choose the snippet length that it
believes is most effective to help users discover your content and direct users to
your site.
Examples:
To stop a snippet from displaying in search results:
<meta name="robots" content="max-snippet:0">
To allow up to 20 characters to be shown in the snippet:
<meta name="robots" content="max-snippet:20">
To specify that there's no limit on the number of characters that can be shown in the
snippet:
<meta name="robots" content="max-snippet:-1">
|
max-image-preview:
[setting]
|
Set the maximum size of an image preview for this page in a search results.
If you don't specify the
max-image-preview
rule, Google may show an
image preview of the default size.
Accepted [setting] values:
-
none
: No image preview is to be shown.
-
standard
: A default image preview may be shown.
-
large
: A larger image preview, up to the width of the
viewport, may be shown.
This applies to all forms of search results (such as Google web search, Google Images,
Discover, Assistant). However, this limit does not apply in cases where a publisher has
separately granted permission for use of content. For instance, if the publisher
supplies content in the form of in-page structured data (such as AMP and canonical
versions of an article) or has a license agreement with Google, this setting will not
interrupt those more specific permitted uses.
If you don't want Google to use larger thumbnail images when their AMP pages
and canonical version of an article are shown in Search or Discover, specify a
max-image-preview
value of
standard
or
none
.
Example:
<meta name="robots" content="max-image-preview:standard">
|
max-video-preview:
[number]
|
Use a maximum of [number] seconds as a video snippet for videos on this page in search
results.
If you don't specify the
max-video-preview
rule, Google may show a
video snippet in search results, and you leave it up to Google to decide how long the
preview may be.
Special values:
-
0
: At most, a static image may be used, in accordance to
the
max-image-preview
setting.
-1
: There is no limit.
This applies to all forms of search results (at Google: web search, Google Images,
Google Videos, Discover, Assistant). This rule is ignored if no parseable [number]
is specified.
Example:
<meta name="robots" content="max-video-preview:-1">
|
notranslate
|
Don't offer translation of this page in search results. If you don't specify this
rule, Google may provide a
translation of the title link and snippet
of a search result for results that aren't in the language
of the search query. If the user clicks the translated title link, all further user
interaction with the page is through Google Translate, which will automatically translate
any links followed.
|
noimageindex
|
Do not index images on this page. If you don't specify this value, images on the page
may be indexed and shown in search results.
|
unavailable_after:
[date/time]
|
Do not show this page in search results after the specified date/time. The date/time
must be specified in a widely adopted format including, but not limited to
RFC 822
,
RFC 850
, and
ISO 8601
.
The rule is ignored if no valid date/time is specified. By default there is no
expiration date for content.
If you don't specify this rule, this page may be shown in search results
indefinitely. Googlebot will decrease the crawl rate of the URL considerably after the specified
date and time.
Example:
<meta name="robots" content="unavailable_after: 2020-09-21">
|
Handling combined indexing and serving rules
You can create a multi-rule instruction by combining
robots
meta
tag rules with commas or by using multiple
meta
tags. Here is an example of a
robots
meta
tag that instructs web crawlers to not index the
page and to not crawl any of the links on the page:
Comma-separated list
<meta name="robots" content="noindex, nofollow">
<meta name="robots" content="noindex">
<meta name="robots" content="nofollow">
Here is an example that limits the text snippet to 20 characters, and allows a large image
preview:
<meta name="robots" content="max-snippet:20, max-image-preview:large">
For situations where multiple crawlers are specified along with different rules, the
search engine will use the sum of the negative rules. For example:
<meta name="robots" content="nofollow">
<meta name="googlebot" content="noindex">
The page containing these
meta
tags will be interpreted as having a
noindex, nofollow
rule when crawled by Googlebot.
Using the
data-nosnippet
HTML attribute
You can designate textual parts of an HTML page not to be used as a snippet. This can be done
on an HTML-element level with the
data-nosnippet
HTML attribute on
span
,
div
, and
section
elements. The
data-nosnippet
is considered a
boolean attribute
.
As with all boolean attributes, any value specified is ignored. To ensure machine-readability,
the HTML section must be valid HTML and all appropriate tags must be closed accordingly.
Examples:
<p>This text can be shown in a snippet
<span data-nosnippet>and this part would not be shown</span>.</p>
<div data-nosnippet>not in snippet</div>
<div data-nosnippet="true">also not in snippet</div>
<div data-nosnippet="false">also not in snippet</div>
<!-- all values are ignored -->
<div data-nosnippet>some text</html>
<!-- unclosed "div" will include all content afterwards -->
<mytag data-nosnippet>some text</mytag>
<!-- NOT VALID: not a span, div, or section -->
<p>This text can be shown in a snippet.</p>
<div data-nosnippet>
<p>However, this is not in snippet.</p>
<ul>
<li>Stuff not in snippet</li>
<li>More stuff not in snippet</li>
</ul>
</div>
Google typically renders pages in order to index them, however rendering is not guaranteed.
Because of this, extraction of
data-nosnippet
may happen both
before and after rendering. To avoid uncertainty from rendering, do not add or remove the
data-nosnippet
attribute of existing nodes through JavaScript.
When adding DOM elements through JavaScript, include the
data-nosnippet
attribute as necessary when initially adding the
element to the page's DOM. If custom elements are used, wrap or render them with
div
,
span
, or
section
elements if you need to use
data-nosnippet
.
Using structured data
Robots
meta
tags govern the amount of content that Google extracts automatically from web
pages for display as search results. But many publishers also use schema.org structured data
to make specific information available for
search presentation
.
Robots
meta
tag
limitations don't affect the use of that structured data, with the exception of
article.description
and the
description
values for structured data specified for other
creative works. To specify the maximum length of a preview based on these
description
values, use the
max-snippet
rule. For example,
recipe
structured data on a page is eligible for inclusion in the
recipe carousel, even if the text preview would otherwise be limited. You can limit the length
of a text preview with
max-snippet
, but that
robots
meta
tag
doesn't apply when the information is provided using structured data for rich results.
To manage the use of structured data for your web pages, modify the structured data types and
values themselves, adding or removing information in order to provide only the data you want
to make available. Also note that structured data remains usable for search results when
declared within a
data-nosnippet
element.
Practical implementation of
X-Robots-Tag
You can add the
X-Robots-Tag
to a site's HTTP responses through
the configuration files of your site's web server software. For example, on Apache-based web
servers you can use .htaccess and httpd.conf files. The benefit of using an
X-Robots-Tag
with HTTP responses is that you can specify crawling
rules that are applied globally across a site. The support of regular expressions allows
a high level of flexibility.
For example, to add a
noindex, nofollow
X-Robots-Tag
to the HTTP response for all
.PDF
files across an
entire site, add the following snippet to the site's root
.htaccess
file or
httpd.conf
file on
Apache, or the site's
.conf
file on NGINX.
Apache
<Files ~ "\.pdf$">
Header set X-Robots-Tag "noindex, nofollow"
</Files>
NGINX
location ~* \.pdf$ {
add_header X-Robots-Tag "noindex, nofollow";
}
You can use the
X-Robots-Tag
for non-HTML files like image files
where the usage of
robots
meta
tags in HTML is not possible. Here's an example of adding a
noindex
X-Robots-Tag
rule for
images files (
.png
,
.jpeg
,
.jpg
,
.gif
) across an entire site:
Apache
<Files ~ "\.(png|jpe?g|gif)$">
Header set X-Robots-Tag "noindex"
</Files>
NGINX
location ~* \.(png|jpe?g|gif)$ {
add_header X-Robots-Tag "noindex";
}
You can also set the
X-Robots-Tag
headers for individual static files:
Apache
# the htaccess file must be placed in the directory of the matched file.
<Files "unicorn.pdf">
Header set X-Robots-Tag "noindex, nofollow"
</Files>
NGINX
location = /secrets/unicorn.pdf {
add_header X-Robots-Tag "noindex, nofollow";
}
Combining robots.txt rules with indexing and serving rules
robots
meta
tags and
X-Robots-Tag
HTTP headers are discovered when
a URL is crawled. If a page is disallowed from crawling through the robots.txt file, then any
information about indexing or serving rules will not be found and will therefore be
ignored. If indexing or serving rules must be followed, the URLs containing those
rules cannot be disallowed from crawling.