Generate optimized robots.txt files for your website on WordPress, Shopify, Medium, Ghost, Joomla, Drupal, or any other platform. Control search engine crawling and improve your SEO.
Choose your website platform, or select custom, for a tailored robots.txt file.
Select either an open or a closed robots.txt file based on your crawling preferences.
Our tool will generate the appropriate robots.txt file based on your selections, ready to be used on your website.
Block specific directories, allow specific bots, set sitemap paths — all without editing server config files or touching code.
Pre-built templates for WordPress, Shopify, Next.js, and static sites. Pick your platform and get a working robots.txt in under a minute.
Tell Googlebot to skip your staging pages, admin panels, and duplicate content filters — so it spends crawl budget on pages that actually rank.
A clean robots.txt prevents crawl waste. Pages that get crawled faster get indexed faster. Pages that get indexed faster start ranking sooner.
Writing robots.txt by hand means Googling the syntax every time. This generates valid syntax with the directives you actually need.
Follows Google's current robots.txt specification, including the Robots Exclusion Protocol standardized as RFC 9309 in 2022. No outdated directives, no deprecated syntax.
Robots.txt is a plain text file that tells search engine crawlers which URLs on your site they're allowed to access. Every time Googlebot, Bingbot, or any other crawler visits your site, the first thing it checks is yourdomain.com/robots.txt. If the file exists, the crawler reads the rules before crawling anything else. For server-level access control (redirects, password protection, IP blocking), you'd use an .htaccess file instead — robots.txt only handles crawler instructions.
The file always lives at the root of your domain. Not in a subfolder, not with a different name. Google will only look at https://example.com/robots.txt — nothing else counts.
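At its simplest, the file is just plain text served from that root URL. A fully open robots.txt can be as short as two lines; the empty Disallow value means nothing is blocked:

```
User-agent: *
Disallow:
```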
| Directive | What It Does | Example |
|---|---|---|
| User-agent | Specifies which crawler the rules apply to. Use * for all crawlers. | User-agent: * |
| Disallow | Blocks crawling of a path. An empty value means nothing is blocked. | Disallow: /admin/ |
| Allow | Overrides a Disallow for a specific path. Useful for allowing a subfolder inside a blocked folder. | Allow: /admin/public/ |
| Sitemap | Points crawlers to your XML sitemap. Not technically part of the robots protocol, but universally supported. | Sitemap: https://example.com/sitemap.xml |
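Put together, a small robots.txt using all four directives might look like the sketch below. The domain and the /admin/ paths are placeholders, so swap in your own:

```
User-agent: *
Disallow: /admin/
Allow: /admin/public/

Sitemap: https://example.com/sitemap.xml
```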
This trips people up constantly. Robots.txt blocks crawling — it stops the bot from visiting the page. A noindex meta tag blocks indexing — it tells the search engine not to show the page in results. Here's the catch: if you block a page via robots.txt, Google can't see the noindex tag on that page (because it never crawls it). The page can still appear in search results with a "No information is available for this page" snippet if other sites link to it. If you want a page out of Google entirely, use noindex and allow crawling so the bot can actually read the tag.
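As a minimal sketch (the /private-page/ path is a placeholder), the robots.txt side should leave the URL crawlable, and the page itself should carry <meta name="robots" content="noindex"> in its <head>:

```
# Don't do this: if the URL is disallowed, Google never sees its noindex tag.
# User-agent: *
# Disallow: /private-page/

# Instead, leave crawling open and put the noindex meta tag in the page's HTML.
User-agent: *
Disallow:
```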
Blocking CSS and JavaScript is a common mistake: if you disallow /wp-content/themes/ or /wp-includes/, Googlebot can't render your pages properly and effectively sees a blank page. That kills your rankings.
A stray Disallow: / under User-agent: * blocks everything. This happens more often than you'd think, especially during staging-to-production migrations.
AI crawlers can be managed the same way with their own User-agent rules — or use our AI Crawler Inspector to see which AI bots are already hitting your site.
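If you do want to keep one AI crawler out while leaving regular search bots alone, a dedicated User-agent group is all it takes. GPTBot (OpenAI's crawler) is used here purely as an illustration; substitute whichever bots you actually want to block:

```
# Block one specific AI crawler
User-agent: GPTBot
Disallow: /

# Everyone else may crawl normally
User-agent: *
Disallow:
```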
Every platform has different URL structures and admin paths. Here are working robots.txt examples you can use as a starting point for the most common setups.
WordPress is the most common case. You want to block the admin area and internal search results while keeping everything else open. Don't block /wp-content/ — Google needs access to your theme CSS and JS to render pages properly.
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
Disallow: /?s=
Sitemap: https://example.com/sitemap_index.xml
Shopify auto-generates a robots.txt that covers most cases. Since mid-2021, you can customize it via a robots.txt.liquid template in your theme. The default blocks checkout, cart, and internal search — which is usually what you want.
User-agent: *
Disallow: /admin
Disallow: /cart
Disallow: /checkout
Disallow: /search
Sitemap: https://example.com/sitemap.xml
Static sites and Next.js apps usually have minimal paths to block. If you're using Next.js, place the file in your public/ directory. For most static sites, an open robots.txt with just a sitemap reference is enough.
User-agent: *
Disallow: /api/
Disallow: /_next/
Allow: /_next/static/
Sitemap: https://example.com/sitemap.xml
These are starting points. Use the generator above to create a robots.txt tailored to your specific platform and crawling preferences.
A robots.txt file is a text file that tells search engine crawlers which pages or files the crawler can or can't request from your site. It's used to manage website traffic and avoid overloading your site with requests.
Our tool allows you to select your website platform, choose crawling preferences, and optionally block specific search engines. It then generates a robots.txt file based on your selections, following best practices for each platform.
While not mandatory, a robots.txt file is highly recommended for most websites. It helps you control how search engines crawl your site, potentially improving your SEO and server performance.
After generating the file, copy its contents and create a new file named "robots.txt" in the root directory of your website. For most websites, this would be accessible at yourdomain.com/robots.txt.
No. Robots.txt blocks crawling, not indexing. If other websites link to a page you've blocked in robots.txt, Google can still show that URL in search results — it just won't have a snippet because Google never crawled the content. To actually remove a page from search results, use a noindex meta tag on the page itself, and make sure robots.txt allows crawling so Google can read the tag.
Review it whenever you make structural changes to your site — adding new sections, switching CMS platforms, launching a staging environment, or noticing crawl budget issues in Google Search Console. For most sites, checking it once a quarter is enough. If you're running a large site with thousands of pages, you'll want to monitor crawl stats more frequently and adjust your robots.txt to prioritize important content.
Search engines will crawl everything they can find on your site. For small sites, this is usually fine — there's nothing wrong with full crawling. But for larger sites, you're wasting crawl budget on admin pages, internal search results, and other low-value URLs. You're also missing an easy opportunity to point crawlers to your sitemap. Even a minimal robots.txt with just a Sitemap directive is better than nothing.
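That minimal file can be as simple as an all-allow rule plus a Sitemap line; replace example.com with your own domain:

```
User-agent: *
Disallow:

Sitemap: https://example.com/sitemap.xml
```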