Robots.txt Generator

Build a valid robots.txt file in seconds — control crawler access, block unwanted bots, set crawl delays and point to your sitemap.

Allow / Disallow rules · Block specific bots · Crawl-delay · Sitemap reference · Instant download

Robots.txt directives explained

User-agent

Specifies which crawler the rules apply to. Use * for all crawlers, or a specific bot name like Googlebot, Bingbot, or GPTBot for targeted rules.
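As a rough illustration of how User-agent groups are selected, the sketch below feeds a hypothetical robots.txt to Python's standard-library urllib.robotparser, which stands in here for a real crawler. The rules and the example.com URLs are assumptions made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: every crawler may fetch everything except /private/,
# while GPTBot is blocked from the whole site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: GPTBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Googlebot has no group of its own, so it falls back to the * group.
googlebot_ok = parser.can_fetch("Googlebot", "https://example.com/blog/")
# GPTBot matches its own group, which disallows everything.
gptbot_ok = parser.can_fetch("GPTBot", "https://example.com/blog/")

print(googlebot_ok, gptbot_ok)
```

A crawler that finds a group naming it uses that group instead of the * group, which is why GPTBot is blocked from /blog/ here while Googlebot is not.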

Allow

Explicitly permits crawling of a URL or path, even when a broader Disallow rule would block it. Useful for allowing a specific file inside a blocked directory.

Disallow

Tells the specified crawler not to fetch any URL whose path starts with the given value. Disallow: / blocks all pages; Disallow: /admin/ blocks the entire admin section.
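To see Allow and Disallow working together, here is a minimal sketch that checks a hypothetical rule set with Python's urllib.robotparser. The paths are invented for illustration. Allow is listed first because Python's parser applies the first rule that matches; Google and RFC 9309 pick the most specific (longest) matching rule instead, so the order would not matter for them.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: block /admin/ but keep one status page crawlable.
ROBOTS_TXT = """\
User-agent: *
Allow: /admin/public-status.html
Disallow: /admin/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

base = "https://example.com"
status_ok = parser.can_fetch("*", base + "/admin/public-status.html")
settings_ok = parser.can_fetch("*", base + "/admin/settings")
about_ok = parser.can_fetch("*", base + "/about")

print(status_ok, settings_ok, about_ok)
```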

Crawl-delay

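Asks the crawler to wait N seconds between requests, which reduces server load. Bing honors it; Yandex has since replaced it with a crawl-rate setting in Yandex Webmaster; Google ignores it. Search Console's crawl-rate limiter was retired in early 2024, so the way to slow Googlebot is to temporarily return 500, 503, or 429 responses.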
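A polite crawler might read the delay like this. The sketch uses crawl_delay() from Python's urllib.robotparser (available since Python 3.6); the rules and the 10-second value are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: only bingbot is asked to slow down.
ROBOTS_TXT = """\
User-agent: bingbot
Crawl-delay: 10
Disallow: /search

User-agent: *
Disallow: /search
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Returns the delay in seconds for the matching group, or None if unset.
bing_delay = parser.crawl_delay("bingbot")
other_delay = parser.crawl_delay("SomeOtherBot")

print(bing_delay, other_delay)
```

A crawler that respects the directive would sleep for bing_delay seconds between requests; one that gets None applies its own default pacing.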

Sitemap

Points crawlers to your XML sitemap so they can discover your full URL list. Use an absolute URL, for example Sitemap: https://example.com/sitemap.xml. The line is independent of User-agent groups, so it can go anywhere in the file; the bottom is a common convention.
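As a sketch of how a crawler might pick up the sitemap reference, the snippet below uses site_maps() from Python's urllib.robotparser (Python 3.8 or later). The file contents are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical file with one rule group and a sitemap reference.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/

Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# site_maps() returns a list of sitemap URLs, or None if the file has none.
sitemaps = parser.site_maps()

print(sitemaps)
```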

Wildcards

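Use * to match any sequence of characters and $ to match the end of a URL. Example: Disallow: /*.pdf$ blocks every URL that ends in .pdf, anywhere on the site. A URL with a query string after .pdf does not end in .pdf, so it is not matched.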
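One way to reason about wildcard rules is to translate them into regular expressions. Python's standard-library robots.txt parser does not implement * and $, so the sketch below does the translation itself. The helper names pattern_to_regex and rule_matches are hypothetical, and this is an approximation of the matching described above rather than any crawler's actual code.

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal parts and join them with ".*" where a * appeared.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def rule_matches(pattern: str, path: str) -> bool:
    """Return True if the rule pattern matches from the start of the path."""
    return pattern_to_regex(pattern).match(path) is not None


pdf_hit = rule_matches("/*.pdf$", "/docs/report.pdf")
pdf_query = rule_matches("/*.pdf$", "/docs/report.pdf?download=1")
prefix_hit = rule_matches("/admin/", "/admin/users")

print(pdf_hit, pdf_query, prefix_hit)
```

Without a trailing $, a rule behaves as a prefix match, which is why /admin/ also covers /admin/users.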

⚠️ robots.txt is not a security measure

Disallowing a URL in robots.txt doesn't stop all bots: malicious scrapers ignore it entirely, and because the file is public, it advertises every path you list. Sensitive pages (admin panels, private APIs) must be protected by authentication. To keep a page out of search results, use a noindex meta tag and leave the page crawlable; a crawler that is blocked from the page never sees the tag.

Frequently asked questions

What is a robots.txt file?

A robots.txt file is a plain-text file placed at the root of your website (e.g. https://example.com/robots.txt) that tells web crawlers which paths they may and may not fetch. It applies only to the host it is served from, so each subdomain needs its own file. Well-behaved crawlers fetch it before crawling anything else on that host.
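Because the file always lives at the root of the host, a crawler can work out its location from any page URL. A minimal sketch, with robots_url as a hypothetical helper name:

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL that governs the given page URL."""
    parts = urlsplit(page_url)
    # Keep scheme and host (including any port); drop path, query, fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


main_site = robots_url("https://example.com/blog/post?id=3")
subdomain = robots_url("https://shop.example.com/cart")

print(main_site)
print(subdomain)
```

The two results differ because each host, subdomains included, is governed by its own robots.txt.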

Does robots.txt prevent pages from being indexed?

No. Disallowing a URL prevents crawlers from fetching the page, but Google can still index the bare URL if other pages link to it. To prevent indexing, use a noindex meta tag on the page itself and do not disallow that page in robots.txt, because a crawler has to fetch the page to see the tag. For truly sensitive content, block access via authentication.

What is the Crawl-delay directive?

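Crawl-delay asks a crawler to wait N seconds between requests to reduce server load. Google does not respect Crawl-delay in robots.txt, and the crawl-rate limiter in Google Search Console was retired in early 2024; to slow Googlebot, temporarily return 500, 503, or 429 responses. Bing does respect the directive. Yandex has replaced it with a crawl-rate setting in Yandex Webmaster.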

Can I use wildcards in robots.txt?

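Yes. Google supports two wildcards: * matches zero or more characters (e.g. Disallow: /*/drafts/ blocks any URL with /drafts/ below a top-level directory), and $ matches the end of a URL (e.g. Disallow: /*.pdf$ blocks all URLs ending in .pdf). A trailing * is redundant because rules already match by prefix: Disallow: /wp-admin/ and Disallow: /wp-admin/* behave the same. Both wildcards are part of RFC 9309 and are also supported by Bing, but smaller crawlers may not implement them.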
