Testing Disallowed Path
Example Input
User-agent: *
Disallow: /admin/
Path to test: /admin/settings
Sample Output
Access Blocked (Matches rule Disallow: /admin/ at line 2)
Audit crawl rules locally. Paste your robots.txt file and test if specific URLs are allowed or blocked by search engine crawlers with line-level match precision.
A robots.txt file is placed in the root directory of a website to tell web crawlers (like Googlebot) which files and directories they may access. Spacing issues, typos, or incorrect wildcard usage in your directives can block search engines from crawling crucial pages. A robots.txt validator parses these directives locally and runs path queries against simulated user-agents to verify access control.
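The parse-then-query flow described above can be sketched in a few lines of Python. This is a simplified illustration, not this tool's actual implementation: the names Rule, parse_rules and check are hypothetical, only plain prefix rules are handled (no wildcards), a group is selected only when it names the queried agent exactly, and ties are resolved by longest matching path, with Allow winning on equal length.

```python
from typing import NamedTuple, Optional


class Rule(NamedTuple):
    line_no: int  # 1-based line number in the robots.txt text
    allow: bool   # True for Allow, False for Disallow
    path: str     # path prefix taken from the directive


def parse_rules(robots_txt: str, agent: str = "*") -> list:
    """Collect Allow/Disallow rules from groups that name `agent` exactly."""
    rules = []
    group_agents = []
    in_rules = False  # True once the current group has seen a rule line
    for line_no, raw in enumerate(robots_txt.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()  # directive names are not case-sensitive
        if field == "user-agent":
            if in_rules:  # a User-agent line after rules starts a new group
                group_agents, in_rules = [], False
            group_agents.append(value.lower())
        elif field in ("allow", "disallow"):
            in_rules = True
            # An empty value (e.g. "Disallow:") restricts nothing, so skip it.
            if agent.lower() in group_agents and value:
                rules.append(Rule(line_no, field == "allow", value))
    return rules


def check(robots_txt: str, path: str, agent: str = "*"):
    """Return (allowed, matching_rule); matching_rule is None if nothing matched."""
    matches = [r for r in parse_rules(robots_txt, agent) if path.startswith(r.path)]
    if not matches:
        return True, None
    best = max(matches, key=lambda r: (len(r.path), r.allow))
    return best.allow, best


ROBOTS = "User-agent: *\nDisallow: /admin/\n"
allowed, rule = check(ROBOTS, "/admin/settings")
print("Allowed" if allowed else f"Blocked by line {rule.line_no}")  # Blocked by line 2
```

Keeping the line number on each rule is what makes a line-level report possible: the query returns the winning rule itself, not just a yes or no.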
Always place your robots.txt file at the root of your host (e.g., https://example.com/robots.txt). Remember that the paths in rules are case-sensitive and that crawlers do not all interpret directives the same way. For example, Googlebot recognizes the wildcards * and $, but older user-agents might ignore them. Verify your disallow configurations before publishing to avoid accidental search index drops.
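To see how the * and $ wildcards and case sensitivity affect matching, here is a minimal Python sketch that translates a rule path into a regular expression. The helper names pattern_to_regex and rule_matches are hypothetical, and the translation assumes the common interpretation in which * matches any run of characters and a trailing $ anchors the end of the URL path; individual crawlers may differ.

```python
import re


def pattern_to_regex(pattern: str) -> "re.Pattern":
    """Translate a robots.txt rule path into a compiled regular expression."""
    anchored = pattern.endswith("$")  # trailing $ means "path must end here"
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with ".*" for each * wildcard.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def rule_matches(pattern: str, path: str) -> bool:
    # re.match anchors at the start of the path, like a prefix rule does.
    return pattern_to_regex(pattern).match(path) is not None


print(rule_matches("/admin/", "/admin/settings"))        # True
print(rule_matches("/Admin/", "/admin/settings"))        # False (case-sensitive)
print(rule_matches("/*.pdf$", "/docs/report.pdf"))       # True
print(rule_matches("/*.pdf$", "/docs/report.pdf?x=1"))   # False ($ anchors the end)
```

A crawler that ignores wildcards would treat "/*.pdf$" as a literal prefix and match none of these paths, which is why the same file can behave differently across user-agents.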
No. robots.txt is a public file, and search engines can still index a page that is linked from other sites even if crawling it is disallowed. To keep a page out of search results, use a 'noindex' meta tag instead, and leave the page crawlable so that crawlers can read the tag.
Yes. You can declare specific rules for different web crawlers. For example, you can write one block for 'User-agent: Googlebot' and a separate, fallback block for 'User-agent: *' for general crawlers.
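As a quick illustration of per-crawler blocks, the sketch below uses Python's standard urllib.robotparser module on a hypothetical file with one Googlebot block and one fallback block. The ROBOTS_TXT content and the can_crawl helper are assumptions made for the example. Note that this standard-library parser does plain prefix matching without wildcard support, so its results may not match every real crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: Googlebot gets its own block, everyone else
# falls back to the "*" block, which disallows the whole site.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow: /
"""


def can_crawl(agent: str, path: str) -> bool:
    """Parse the sample file and ask whether `agent` may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, path)


print(can_crawl("Googlebot", "/public/page"))   # True: only /private/ is blocked
print(can_crawl("Googlebot", "/private/page"))  # False
print(can_crawl("OtherBot", "/public/page"))    # False: falls back to the * block
```

The Googlebot block replaces the fallback block for that crawler rather than adding to it, so Googlebot is not affected by "Disallow: /" here.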