Free Robots.txt Generator – Create, Optimize & Validate Your robots.txt File
⚙️ Free SEO Tool

Generate a perfect robots.txt file in seconds. Set user-agent rules, disallow/allow directories, crawl-delay, sitemap URL, WordPress & Shopify presets, and block AI bots — all free, no registration required.

✅ 100% Free ⚡ Instant Generation 🔍 Live Preview ✔️ Syntax Validation 📥 Copy & Download 🤖 Block AI Bots
🛠️ Configure Your robots.txt Rules

    Step 1: Download your robots.txt file using the button above.

    Step 2: Upload the file to the root directory of your website using FTP, cPanel File Manager, or your hosting control panel. The file must be accessible at https://yourdomain.com/robots.txt

    Step 3: For WordPress: Upload to /public_html/robots.txt or use an SEO plugin to edit it directly from the dashboard.

    Step 4: Verify your robots.txt is live by visiting https://yourdomain.com/robots.txt in your browser.

    Step 5: Check your robots.txt in Google Search Console's robots.txt report (under Settings) to verify the file was fetched correctly and no important pages are accidentally blocked.

    ⚠️ Important: After uploading, wait 24–48 hours for search engine crawlers to pick up the changes.
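You can also sanity-check your rules locally before uploading, using Python's built-in urllib.robotparser. This is a minimal sketch — the rules and URLs below are illustrative, not your actual file:

```python
from urllib import robotparser

# Illustrative rules mirroring the WordPress preset (not your real file).
rules = """\
User-agent: *
Allow: /wp-admin/admin-ajax.php
Disallow: /wp-admin/
"""

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# Check whether a generic crawler ("*") may fetch each URL
print(rp.can_fetch("*", "https://example.com/wp-admin/settings.php"))    # False
print(rp.can_fetch("*", "https://example.com/wp-admin/admin-ajax.php"))  # True
print(rp.can_fetch("*", "https://example.com/blog/post-1/"))             # True
```

One caveat: Python's parser applies rules in file order (first match wins), while Google uses the longest matching rule — so when testing this way, keep an Allow exception above the Disallow it overrides.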

    250K+ Files Generated · 100% Free, No Sign-Up · <5s Instant Generation · 8 AI Bots You Can Block

    How to Create a robots.txt File in 3 Simple Steps

    Create and deploy your robots.txt file in under a minute — no coding experience required.

    1

    Configure Your Crawler Rules

    Select a user-agent (choose * for all crawlers or target a specific bot like Googlebot). Add the directories you want to disallow — such as /wp-admin/, /checkout/, or parameter URLs. Use the WordPress, Shopify, or General Blog presets for instant configuration.

    2

    Add Sitemap URL and Optional Crawl-Delay

    Enter your XML sitemap URL (e.g., https://example.com/sitemap.xml). Optionally set a crawl-delay in seconds if your server is under load — note that Google ignores this directive; it is respected by Bing, Yandex, and other crawlers.

    3

    Generate, Copy, and Upload

    Click Generate robots.txt to create your file. Review the live preview and validation results. Click Copy to Clipboard or Download as .txt, then upload the file to the root directory of your website so it is accessible at https://yourdomain.com/robots.txt.
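As a concrete illustration, following the three steps with one blocked directory, one Allow exception, an optional crawl-delay, and a sitemap produces a file like this (the domain and paths are placeholders):

```text
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
# Optional — ignored by Google, honored by Bing and Yandex
Crawl-delay: 10

Sitemap: https://example.com/sitemap.xml
```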

    robots.txt Generator Features: Everything You Need in One Free Tool

    Our free robots.txt file generator includes every feature that competitors charge for or simply don’t offer.

    🤖

    User-agent Selector

    Target all crawlers at once with * or apply specific rules to Googlebot, Bingbot, YandexBot, Baiduspider, social bots, and SEO crawlers separately.

    🚫

    Disallow & Allow Directives

    Block crawlers from admin pages, private directories, and parameter URLs. Use Allow rules to override Disallow for specific sub-paths within a blocked directory.

    ⏱️

    Crawl-delay Setting

    Set a crawl-delay to prevent server overload from aggressive crawlers. Respected by Bing, Yandex, and most non-Google bots; Googlebot ignores it and regulates its own crawl rate automatically based on your server's responses.

    🗺️

    Sitemap URL Integration

    Add your XML sitemap URL directly in your robots.txt file. This helps search engines discover your most important pages faster and improves crawl efficiency.

    WordPress & Shopify Presets

    One-click presets for WordPress (blocks /wp-admin/, /wp-includes/), Shopify (blocks /admin/, /checkout/), general blogs, and e-commerce sites.

    👁️

    Real-time Live Preview

    See your robots.txt file update instantly as you configure each rule — no clicking required. Review the exact output before downloading.

    ✔️

    Syntax Validation

    Automatic syntax checking highlights missing directives, empty rules, and configuration issues that could accidentally block important pages from Google.

    🛡️

    Block AI Training Bots

    Block AI data-collection bots including GPTBot (OpenAI), CCBot (Common Crawl), PerplexityBot, Google-Extended (Gemini), Claude-Web, and other AI scrapers with one click.

    📥

    Copy & Download Output

    Copy your robots.txt to clipboard instantly or download it as a ready-to-upload .txt file. No email required, no hidden fees, no account creation.

    robots.txt Directives Explained: User-agent, Disallow, Allow & Crawl-delay

    Understanding the four core robots.txt directives is essential for controlling how search engines crawl your site.

    Directive reference — purpose, example, and notes for each:

    User-agent
    Purpose: Specifies which crawler the rules that follow apply to.
    Example: User-agent: *
    Notes: Use * for all bots. Each new User-agent block starts a fresh rule set for that bot.

    Disallow
    Purpose: Tells the crawler NOT to access a specific URL or directory.
    Example: Disallow: /wp-admin/
    Notes: An empty value means “allow all”. Use / to block everything. Paths are case-sensitive.

    Allow
    Purpose: Overrides a Disallow rule for a specific sub-path.
    Example: Allow: /wp-admin/admin-ajax.php
    Notes: Useful when a Disallow blocks a directory but you need one file accessible.

    Crawl-delay
    Purpose: Sets the wait time (in seconds) between crawler requests.
    Example: Crawl-delay: 10
    Notes: Google ignores this; Bing and Yandex respect it. Use for server protection.

    Sitemap
    Purpose: Points crawlers to your XML sitemap file.
    Example: Sitemap: https://example.com/sitemap.xml
    Notes: Can appear outside any User-agent block; conventionally placed at the end of the file.

    # Comment
    Purpose: Adds explanatory comments that all crawlers ignore.
    Example: # Block admin area
    Notes: Start the line with #. Useful for documenting your rules.
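Putting the directives together, a short file that uses each of them might look like this (the domain and paths are placeholders):

```text
# Block the admin area for all bots
User-agent: *
Disallow: /admin/
Allow: /admin/assets/

# Ask Bing to wait 10 seconds between requests (Google ignores Crawl-delay)
User-agent: Bingbot
Crawl-delay: 10

Sitemap: https://example.com/sitemap.xml
```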

    WordPress robots.txt Generator: Best Rules for WordPress & Shopify Sites

    Different platforms require different robots.txt configurations. Here are the recommended rules for the most popular CMS platforms.

    🔵 WordPress – Recommended robots.txt
    User-agent: *
    Disallow: /wp-admin/
    Disallow: /wp-includes/
    Disallow: /wp-content/plugins/
    Disallow: /trackback/
    Disallow: /?s=
    Allow: /wp-admin/admin-ajax.php

    User-agent: GPTBot
    Disallow: /

    Sitemap: https://example.com/sitemap.xml
    🟢 Shopify – Recommended robots.txt
    User-agent: *
    Disallow: /admin/
    Disallow: /cart/
    Disallow: /checkout/
    Disallow: /orders/
    Disallow: /account/
    Disallow: /collections/*sort_by*
    Disallow: /*?*

    Sitemap: https://example.com/sitemap.xml
    ⚫ Allow All Crawlers (Minimal)
    User-agent: *
    Disallow:

    Sitemap: https://example.com/sitemap.xml
    🔴 Block All Crawlers (Coming Soon)
    User-agent: *
    Disallow: /
    # Warning: This blocks all search engines
    # from crawling your site. Use only for sites
    # under construction or private staging environments.

    How robots.txt Protects Your SEO Crawl Budget & Google Rankings

    robots.txt is one of the most powerful — and most misunderstood — technical SEO tools available to website owners.

    💰

    Save Your Crawl Budget

    Google assigns a limited crawl budget to each website based on its authority and server health. Blocking admin pages, duplicate URLs, filtered category pages, and faceted navigation URLs with robots.txt ensures Googlebot focuses on your most valuable content.

    📊

    Prevent Duplicate Content Indexing

    Parameter-based URLs (like ?sort=price, ?ref=email) create duplicate content that dilutes your page’s SEO value. Blocking these with robots.txt prevents Google from wasting crawl resources on thin or duplicate content.

    🔒

    Protect Sensitive Areas

    Block your WordPress admin area, staging environment, internal search results, user account pages, and payment/checkout pages from appearing in search results. These pages provide no SEO value and should never be indexed.

    Faster Indexing of New Content

    When Googlebot isn’t wasting time crawling irrelevant pages, it discovers and indexes your new blog posts, product pages, and landing pages faster — resulting in quicker ranking improvements after publishing.

    ⚠️ Important Distinction: robots.txt Disallow does NOT remove a page from Google’s index. It only prevents crawling. To remove a page from search results, use the <meta name="robots" content="noindex"> tag inside the page’s HTML, or use Google Search Console’s URL removal tool.

    Common robots.txt Mistakes That Can Hurt Your SEO Rankings

    These are the most frequent robots.txt errors that accidentally block important pages from Google and destroy search rankings.

    🚫

    Blocking Your Entire Website

    Using Disallow: / under User-agent: * blocks all crawlers from your entire site. This is a catastrophic SEO error: Google can no longer crawl your content, and your pages will gradually drop out of search results.

    📄

    Blocking CSS and JavaScript Files

    Disallowing your CSS and JS files prevents Google from rendering your pages properly. Google needs to see your styling and scripts to understand your content and user experience.

    🔀

    Confusing robots.txt with noindex

    robots.txt blocks crawling. noindex prevents indexing. A page blocked by robots.txt can still appear in Google’s index if other sites link to it. Use noindex meta tags to fully remove pages from search results.

    Setting Crawl-delay for Googlebot

    Google officially ignores the crawl-delay directive, so setting it for Googlebot has no effect. Googlebot regulates its own crawl rate based on your server's health; if it is overloading your server, returning temporary 503 responses will slow it down.

    🗺️

    Forgetting to Add Your Sitemap URL

    Not including your sitemap URL in robots.txt is a missed opportunity. Adding Sitemap: https://example.com/sitemap.xml makes it easy for all crawlers to find and index your content quickly.

    📁

    Uploading to the Wrong Directory

    robots.txt must be in your site’s root directory, accessible at https://yourdomain.com/robots.txt. Uploading to a subdirectory (like /blog/robots.txt) means search engines won’t find it.

    robots.txt Best Practices for Website Owners in 2026

    Follow these expert-recommended practices to optimize your robots.txt file for maximum SEO benefit.

    1

    Always Include Your Sitemap URL

    Add Sitemap: https://yourdomain.com/sitemap.xml to every robots.txt file. This one line helps all search engines discover your content immediately.

    2

    Block Non-Public Directories

    Disallow admin areas, login pages, user account pages, and internal scripts. These pages provide zero SEO value and waste your crawl budget if left accessible to bots.

    3

    Block Faceted & Parameter URLs

    E-commerce and content sites often generate thousands of duplicate URLs through sorting, filtering, and tracking parameters. Block these with Disallow: /*?* or specific parameter patterns.
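As an illustration, a hypothetical shop could hide its sorting and tracking parameters while keeping clean URLs crawlable (the patterns follow Google's wildcard matching; adjust them to your own URL scheme):

```text
User-agent: *
# Block every URL containing a query string…
Disallow: /*?*
# …but keep paginated listings crawlable (Google applies the longest matching rule)
Allow: /*?page=
```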

    4

    Consider Blocking AI Training Bots

    In 2026, AI companies use specialized bots to scrape content for model training. If you want to prevent your content from being used to train AI models, block GPTBot, CCBot, Google-Extended, and PerplexityBot.

    5

    Test in Google Search Console

    After uploading, open Google Search Console's robots.txt report (under Settings) to confirm the file was fetched successfully and no important pages are accidentally blocked.

    6

    Keep It Simple and Well-Commented

    Add # comments to explain your rules. Simple, clearly documented robots.txt files are easier to maintain and less likely to contain accidental blocking errors.

    robots.txt Generator — Frequently Asked Questions

    Everything website owners and developers need to know about robots.txt files, directives, and SEO best practices.

    What is a robots.txt file?

    A robots.txt file is a plain text file placed at the root of your website (e.g., example.com/robots.txt) that tells search engine crawlers which pages or directories they can or cannot access. It helps control your crawl budget, prevents indexing of duplicate or sensitive content, and is a fundamental part of technical SEO for any website.

    How do I create a robots.txt file for WordPress?

    Use our free robots.txt generator and click the WordPress Preset button. It automatically adds recommended rules including Disallow for /wp-admin/ and /wp-includes/ while keeping all your posts and pages accessible to search engines. Download the file and upload it to your /public_html/ directory via FTP or cPanel File Manager. The file must be live at yourdomain.com/robots.txt.

    What is the Disallow directive and how does it work?

    The Disallow directive tells search engine crawlers not to access a specific URL, directory, or pattern. For example, Disallow: /wp-admin/ blocks all bots from your WordPress admin area. Use Disallow: / to block everything, or leave it empty (Disallow:) to allow all pages. Disallow is case-sensitive and supports basic wildcard patterns with *.
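A few illustrative patterns (the paths are examples; * and $ behave per Google's documented matching rules):

```text
User-agent: *
Disallow: /private/    # blocks /private/ and everything beneath it
Disallow: /Private/    # must be listed separately — matching is case-sensitive
Disallow: /*.pdf$      # wildcard + end anchor: blocks any URL ending in .pdf
```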

    What is crawl-delay in robots.txt and should I use it?

    Crawl-delay instructs search bots how many seconds to wait between requests. It helps prevent server overload on resource-limited servers. However, Google officially ignores the crawl-delay directive — Googlebot manages its own crawl rate, backing off automatically when your server responds slowly or returns errors. Bing, Yandex, and most other crawlers do respect crawl-delay. Set it only if your server struggles under heavy crawl loads.

    Does robots.txt affect my Google rankings?

    Yes — robots.txt directly affects your SEO crawl budget. Blocking admin pages, duplicate parameter URLs, and thin content helps Googlebot focus on your most valuable pages, leading to faster indexing and better rankings. However, blocking a page in robots.txt does NOT remove it from Google’s index. Pages linked from other sites can still appear in search results even if blocked by robots.txt. Use the noindex meta tag to actually remove pages from Google’s index.

    What is the difference between robots.txt and an XML sitemap?

    robots.txt tells search engine crawlers which pages NOT to visit. An XML sitemap tells them which pages they SHOULD visit and prioritize for indexing. These tools work together: use robots.txt to block non-essential, crawl-wasting URLs, and use a sitemap to guide crawlers to all your important content. Reference your sitemap URL inside your robots.txt file for maximum SEO benefit.

    What is the difference between robots.txt Disallow and noindex?

    Disallow in robots.txt prevents crawlers from accessing a page — but if external websites link to that URL, Google may still index the page without crawling it. The noindex meta tag (placed inside a page’s HTML head) tells Google explicitly not to include that page in search results, even if it crawls it. For reliably removing a page from Google’s index, always use noindex rather than robots.txt Disallow.
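For example, a page you want kept out of search results carries the tag in its HTML head — a minimal sketch, with placeholder title and content:

```html
<!DOCTYPE html>
<html>
  <head>
    <title>Internal search results</title>
    <!-- Explicitly keeps this page out of Google's index -->
    <meta name="robots" content="noindex">
  </head>
  <body>…</body>
</html>
```

Note that the page must not be blocked in robots.txt for this to work: Google has to crawl the page to see the tag.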

    How do I block AI bots like GPTBot in my robots.txt?

    To block AI training bots, add separate user-agent blocks for each bot. For example: User-agent: GPTBot followed by Disallow: / blocks OpenAI’s crawler from your entire site. Our robots.txt generator includes a one-click option to block the most common AI bots: GPTBot (OpenAI), CCBot (Common Crawl), PerplexityBot, Google-Extended (Gemini/Bard), anthropic-ai, Claude-Web, FacebookBot, and Diffbot.
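Written out, the blocks look like this — one two-line block per bot, separated by blank lines:

```text
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: PerplexityBot
Disallow: /
```

Repeat the same pattern for anthropic-ai, Claude-Web, FacebookBot, and Diffbot as needed.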

    Generate Your robots.txt File in Under 30 Seconds

    Free, instant, no sign-up required. The most complete free robots.txt generator available online.

    ⚙️ Create My robots.txt File Free