X-Robots-Tag

Why It Matters

HTML meta tags only work inside HTML documents. Non-HTML resources (PDFs, images, videos, JSON API responses) cannot carry `<meta>` tags, which leaves a gap in indexing control. X-Robots-Tag fills that gap because it is an HTTP response header and can be attached to any response. Search Engine Land has documented e-commerce cases where tens of thousands of PDF catalogs were indexed and hurt rankings as duplicate content; a single X-Robots-Tag rule fixed the problem.

X-Robots-Tag vs Meta Robots vs robots.txt

Method	Location	Scope	Blocks crawling?
robots.txt	`/robots.txt`	URL patterns	Yes, blocks the crawl itself
Meta Robots	HTML `<head>`	That HTML page	No, controls indexing only
X-Robots-Tag	HTTP response header	Any resource type	No, controls indexing only

Critical distinction: robots.txt says "don't crawl," while Meta Robots and X-Robots-Tag say "don't index." For an indexing directive to take effect, Googlebot must be able to fetch the resource and read the directive. Blocking the URL in robots.txt stops the crawl entirely, so Google never sees the indexing instruction, and the URL can still end up in the index if other pages link to it.

Main Directives

Directive	Meaning
`noindex`	Don't show in search results
`nofollow`	Don't follow links on the page
`none`	Same as `noindex, nofollow`
`noarchive`	Don't show a cached copy in SERPs
`nosnippet`	Don't show a text snippet or video preview
`unavailable_after: [date]`	Remove from index after the date
`max-snippet: [n]`	Limit snippet length
`max-image-preview: [setting]`	Limit image preview size
`max-video-preview: [n]`	Limit video preview length

For snippet-level control inside an HTML page, put the `data-nosnippet` attribute on the element that wraps the text you want excluded from search snippets. Google documents it for `span`, `div`, and `section` elements. It differs from X-Robots-Tag in that it hides only the selected text rather than changing how the whole resource is indexed.

Example Configurations

Block PDF indexing (Apache .htaccess):

# Requires mod_headers. Matches any file whose name ends in .pdf.
<FilesMatch "\.pdf$">
  Header set X-Robots-Tag "noindex, nofollow"
</FilesMatch>

Block an image directory (Nginx):

location /private-images/ {
  add_header X-Robots-Tag "noindex";
}

Target a specific crawler (Googlebot only):

X-Robots-Tag: googlebot: noindex

Time-limited indexing:

X-Robots-Tag: unavailable_after: 31 Dec 2026 23:59:59 GMT
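The header forms shown above (a plain directive list, a crawler-scoped value, and a directive that carries a value) can be told apart mechanically. The Python sketch below is one way to do that when auditing responses; it is not how any search engine parses the header. The names `parse_x_robots_tag` and `VALUED_DIRECTIVES` are illustrative, and the sketch assumes dates without commas, as in the example above.

```python
from typing import Dict, Optional, Tuple

# Directives from the table above that take a value after a colon.
VALUED_DIRECTIVES = {
    "unavailable_after",
    "max-snippet",
    "max-image-preview",
    "max-video-preview",
}


def parse_x_robots_tag(value: str) -> Tuple[str, Dict[str, Optional[str]]]:
    """Split one X-Robots-Tag header value into (user_agent, directives).

    user_agent is "*" when the value is not scoped to a specific crawler.
    Directives without a value map to None.
    Limitation: a date that contains commas would be split incorrectly.
    """
    user_agent = "*"
    head, sep, tail = value.partition(":")
    candidate = head.strip().lower()
    # Text before the first colon is a crawler name only if it is a single
    # token and not one of the directives that take a value.
    is_single_token = bool(candidate) and "," not in candidate and " " not in candidate
    if sep and is_single_token and candidate not in VALUED_DIRECTIVES:
        user_agent = candidate
        value = tail

    directives: Dict[str, Optional[str]] = {}
    for piece in value.split(","):
        piece = piece.strip()
        if not piece:
            continue
        name, has_value, argument = piece.partition(":")
        directives[name.strip().lower()] = argument.strip() if has_value else None
    return user_agent, directives
```

For example, `parse_x_robots_tag("googlebot: noindex")` yields the crawler name `googlebot` with a single `noindex` directive, while an unscoped value such as `noindex, nofollow` is reported under `*`.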

Element-level snippet exclusion (HTML):

<p><span data-nosnippet>Do not show this sentence in Google snippets.</span></p>

Practical Gotchas

Don't combine with robots.txt disallow: If robots.txt blocks the URL, Google can't read the header at all. To block indexing, allow crawling in robots.txt and use X-Robots-Tag noindex.
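This conflict can be checked before deploying a `noindex` rule. The sketch below uses Python's standard `urllib.robotparser` to ask whether a crawler is allowed to fetch a URL at all; if it is not, the header on that URL will never be read. The function name `header_is_reachable` and the sample rules are hypothetical, and the standard-library parser follows the original robots.txt conventions, so its matching may differ from Google's for wildcard patterns.

```python
from urllib.robotparser import RobotFileParser


def header_is_reachable(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if robots.txt lets the crawler fetch the URL.

    A False result means an X-Robots-Tag on that URL cannot be seen,
    so a noindex directive there would have no effect.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules: /private/ is blocked from crawling.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    for url in (
        "https://example.com/private/catalog.pdf",
        "https://example.com/docs/catalog.pdf",
    ):
        print(url, header_is_reachable(SAMPLE_ROBOTS_TXT, url))
```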

Requires server configuration: Unlike meta tags, X-Robots-Tag is set wherever the HTTP response is produced: in the web server (Apache, Nginx), at the edge (Cloudflare Workers), or in application code. CMS platforms don't always handle it automatically.
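Where the server configuration cannot be edited, the application can attach the header itself. The following is a minimal sketch assuming a Python WSGI application; the class name `XRobotsTagMiddleware`, the `.pdf` suffix default, and the header value are illustrative choices, not part of any framework.

```python
class XRobotsTagMiddleware:
    """WSGI middleware that adds X-Robots-Tag to responses for matching paths."""

    def __init__(self, app, suffixes=(".pdf",), value="noindex, nofollow"):
        self.app = app
        self.suffixes = tuple(suffixes)  # path endings to mark, lowercase
        self.value = value               # header value to send

    def __call__(self, environ, start_response):
        path = environ.get("PATH_INFO", "").lower()

        def patched_start_response(status, headers, exc_info=None):
            # Append the header only for paths ending in a configured suffix.
            if path.endswith(self.suffixes):
                headers = list(headers) + [("X-Robots-Tag", self.value)]
            return start_response(status, headers, exc_info)

        return self.app(environ, patched_start_response)
```

Wrapping an existing application is then a one-line change, for example `application = XRobotsTagMiddleware(application)`, which keeps the indexing rule out of individual views.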

Verify with Search Console or curl: Use Google Search Console's URL Inspection tool or `curl -I https://example.com/file.pdf` to confirm that the header is actually present in the response.
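The same check can be scripted, which helps when many URLs need verifying. This sketch sends a HEAD request with Python's standard library and returns every X-Robots-Tag value in the response; the function name `fetch_x_robots_tag` is hypothetical, and it assumes the server answers HEAD requests with the same headers it would send for GET.

```python
import urllib.request
from typing import List


def fetch_x_robots_tag(url: str, timeout: float = 10.0) -> List[str]:
    """Return all X-Robots-Tag header values for a URL (empty list if none)."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # A response may carry several X-Robots-Tag headers, so collect all.
        return response.headers.get_all("X-Robots-Tag") or []


# Example call (needs network access):
# print(fetch_x_robots_tag("https://example.com/file.pdf"))
```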

Snippet controls are not privacy controls: nosnippet and data-nosnippet change how Google displays a result, but the underlying content remains publicly accessible. Use authentication or server-side access control for private content.
