
robots.txt Explained: How to Control Google's Crawlers and Protect Your SEO

A misconfigured robots.txt file can accidentally block Google from crawling your entire site. Here is how robots.txt works and how to set it up correctly.

By SearchRankTool · 10 March 2026

What Is a robots.txt File?

A robots.txt file is a plain text file placed at the root of your website (e.g., https://yoursite.com/robots.txt) that tells search engine crawlers which pages they are and are not allowed to crawl. It is one of the first files Googlebot fetches when it visits your site.
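
As a small illustration of the "root of your website" rule, the sketch below derives the robots.txt location for any page URL. The function name robots_txt_url is made up for this example, and yoursite.com is the placeholder domain used above.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt URL that governs page_url (illustrative helper)."""
    parts = urlsplit(page_url)
    # robots.txt sits at the root of the scheme + host (+ port);
    # the page's own path, query string and fragment are discarded.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("https://yoursite.com/blog/post?id=7"))
# prints: https://yoursite.com/robots.txt
```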

The Basic robots.txt Syntax

A robots.txt file is made up of simple directives:

  • User-agent: specifies which crawler the rule applies to. * means all crawlers.
  • Disallow: tells the crawler not to crawl any URL whose path starts with the given value.
  • Allow: explicitly permits access to a path — useful when a parent directory is blocked.
  • Sitemap: tells crawlers the URL of your XML sitemap.
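
To see the four directives working together, here is a minimal sketch that feeds a sample file to Python's standard-library urllib.robotparser and asks which URLs may be crawled. The sample rules and the yoursite.com URLs are assumptions for illustration. Note that Python's parser applies the first rule that matches, while Google applies the most specific (longest) matching rule; listing the Allow line before the broader Disallow line gives the same answer under both.

```python
from urllib.robotparser import RobotFileParser

# Sample file for illustration only; the paths are assumptions.
SAMPLE = """\
User-agent: *
Allow: /admin/help/
Disallow: /admin/
Sitemap: https://yoursite.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(SAMPLE.splitlines())

# Blocked: the path starts with /admin/ and no Allow rule matches it.
print(parser.can_fetch("*", "https://yoursite.com/admin/"))          # False
# Permitted: the Allow rule carves /admin/help/ out of the blocked parent.
print(parser.can_fetch("*", "https://yoursite.com/admin/help/faq"))  # True
# Permitted: no rule mentions /blog/.
print(parser.can_fetch("*", "https://yoursite.com/blog/"))           # True
# Sitemap URLs declared in the file (site_maps() needs Python 3.8+).
print(parser.site_maps())  # ['https://yoursite.com/sitemap.xml']
```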

What You Should Always Block

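robots.txt is publicly readable and only advisory, so it is not a substitute for authentication on private areas. Where the following paths exist, most sites benefit from blocking them to preserve crawl budget for important pages: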

  • /wp-admin/ — WordPress admin dashboard
  • /admin/ — General admin panels
  • /login/ — Login pages
  • /storage/ — Framework storage directories
  • /staging/ — Staging or development environments
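
The list above can be turned into a robots.txt file with a few lines of code. This is only a sketch: build_robots_txt is a hypothetical helper, the sitemap URL is a placeholder, and you should keep only the paths that actually exist on your site.

```python
def build_robots_txt(disallowed_paths, sitemap_url=None, user_agent="*"):
    """Return robots.txt text with one group of Disallow rules (illustrative helper)."""
    lines = [f"User-agent: {user_agent}"]
    lines += [f"Disallow: {path}" for path in disallowed_paths]
    if sitemap_url:
        # Sitemap is independent of any user-agent group, so set it apart.
        lines += ["", f"Sitemap: {sitemap_url}"]
    return "\n".join(lines) + "\n"


paths = ["/wp-admin/", "/admin/", "/login/", "/storage/", "/staging/"]
print(build_robots_txt(paths, sitemap_url="https://yoursite.com/sitemap.xml"))
```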

The Critical Mistake: Blocking Your Whole Site

The most catastrophic robots.txt error is accidentally blocking all crawlers with Disallow: / under User-agent: *. This single rule tells every compliant search engine bot not to crawl any page on your site, and it takes effect as soon as crawlers refetch the file (Google generally caches robots.txt for up to 24 hours). Always review your robots.txt in Google Search Console after making any changes.
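
A quick automated check can catch this mistake before deployment. The sketch below uses Python's standard-library urllib.robotparser and tests whether the site root is crawlable for a given user agent; blocks_site_root is a hypothetical helper name, and checking "/" is used here as a simple proxy for a site-wide block.

```python
from urllib.robotparser import RobotFileParser


def blocks_site_root(robots_txt: str, user_agent: str = "Googlebot") -> bool:
    """Return True if robots_txt stops user_agent from crawling "/" (illustrative helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # "Disallow: /" is a prefix of every path, so it also blocks the root.
    return not parser.can_fetch(user_agent, "/")


print(blocks_site_root("User-agent: *\nDisallow: /\n"))        # True
print(blocks_site_root("User-agent: *\nDisallow: /admin/\n"))  # False
print(blocks_site_root("User-agent: *\nDisallow:\n"))          # False (empty value blocks nothing)
```

A check like this could run in a deployment pipeline so that a staging robots.txt never reaches production unnoticed.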

robots.txt vs noindex

robots.txt prevents crawling. The noindex meta tag prevents indexing. These are fundamentally different. A page blocked by robots.txt can still appear in search results if other sites link to it. To truly remove a page from search results, use the noindex meta tag and leave the page crawlable: if robots.txt blocks the page, crawlers never fetch it and never see the noindex tag.
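
The two mechanisms can work against each other: a crawler that is not allowed to fetch a page cannot read a noindex tag inside it. The sketch below flags that conflict using only the Python standard library. The helper names, the sample HTML and the /private/ path are assumptions for illustration, and the check covers only the robots meta tag.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaFinder(HTMLParser):
    """Records whether a <meta name="robots"> tag contains noindex."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            directives = [d.strip().lower() for d in attr.get("content", "").split(",")]
            if "noindex" in directives:
                self.noindex = True


def noindex_is_hidden(robots_txt, url, html, user_agent="Googlebot"):
    """True when the page has noindex but robots.txt keeps the crawler from seeing it."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return finder.noindex and not rules.can_fetch(user_agent, url)


ROBOTS = "User-agent: *\nDisallow: /private/\n"
PAGE = '<html><head><meta name="robots" content="noindex, follow"></head></html>'

print(noindex_is_hidden(ROBOTS, "https://yoursite.com/private/page", PAGE))  # True
print(noindex_is_hidden(ROBOTS, "https://yoursite.com/public/page", PAGE))   # False
```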

Generate Your robots.txt in Seconds

Use our free Robots.txt Generator to create a correctly formatted robots.txt file with all the rules your site needs. No technical knowledge required — just enter your site URL, choose your rules, and copy the output.
