How to Set Up a Crawl Delay in Your Robots.txt
A crawl delay is a robots.txt directive that asks web crawlers (such as search engine bots) to wait a specified number of seconds between requests to your server. This prevents server overload during intensive crawling while preserving site performance for human visitors. Important note: Crawl-delay is part of the unofficial Robots Exclusion Protocol extensions, and compliance varies significantly across crawlers.
Step-by-Step Implementation Guide
1. Access/Create Your robots.txt File
Navigate to your website's root directory (typically public_html or www). Access it via:
- FTP client (FileZilla, WinSCP)
- cPanel File Manager
- SSH terminal
New file? Create plain text file → name exactly: robots.txt
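If you manage the server with scripts, the file creation itself can be automated. A minimal sketch (the temporary directory here is a stand-in for your real document root such as public_html):

```python
import tempfile
from pathlib import Path

# Stand-in for your real document root (e.g. public_html or www).
docroot = Path(tempfile.mkdtemp())

# The file must be named exactly robots.txt and live at the site root.
robots = docroot / "robots.txt"
robots.write_text("User-agent: *\nCrawl-delay: 7\n", encoding="utf-8")

print(robots.read_text(encoding="utf-8"))
```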
2. Configure Crawl Delay Directive
Syntax for targeted bots:
User-agent: [Bot-Identifier]
Crawl-delay: [Delay-in-Seconds]
Practical example:
# Target Bing's crawler:
User-agent: Bingbot
Crawl-delay: 5
# Target all compliant crawlers:
User-agent: *
Crawl-delay: 7
Replace [Bot-Identifier] with a specific user agent (e.g., Googlebot) or * for all bots.
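You can sanity-check how a compliant client interprets these rules with Python's standard-library robots.txt parser, which exposes a crawl_delay() lookup:

```python
from urllib import robotparser

# The example rules from above, parsed directly from a string.
ROBOTS_TXT = """\
User-agent: Bingbot
Crawl-delay: 5

User-agent: *
Crawl-delay: 7
"""

rp = robotparser.RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

print(rp.crawl_delay("Bingbot"))       # 5 (specific rule wins)
print(rp.crawl_delay("SomeOtherBot"))  # 7 (falls back to the * group)
```

This is also a quick way to catch syntax mistakes: a malformed Crawl-delay line simply yields no delay for that agent.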
3. Upload & Validate
Save changes and upload to root directory. Verify accessibility at: yoursite.com/robots.txt
Validation Tools:
- Google Search Console → robots.txt report (the legacy robots.txt Tester has been retired)
- SEO tools (Screaming Frog, Ahrefs)
- Online validators (RobotsTesting.com)
Critical Best Practices
- Google-Specific Handling: Google ignores Crawl-delay entirely. Googlebot adjusts its crawl rate automatically, and the legacy crawl-rate setting in Search Console has been retired, so use server-level rate limiting if you must throttle Googlebot
- Strategic Disallows: Combine crawl delays with Disallow rules for low-value pages:
Disallow: /tmp/
Disallow: /private-folder/
- Server Monitoring: Check access logs weekly for crawler compliance:
grep "Googlebot" access.log | wc -l
- Delay Thresholds: Avoid values above 10 seconds; long delays shrink the number of pages a bot can crawl per day and may cause incomplete indexing
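The grep count above tells you how often a bot hit the server; to check whether it actually respected your delay, compute the gaps between its requests. A minimal sketch, assuming Combined Log Format access-log lines (the sample entries below are made up for illustration):

```python
import re
from datetime import datetime

# Hypothetical sample lines in Combined Log Format.
LOG_LINES = [
    '66.249.66.1 - - [10/May/2024:12:00:00 +0000] "GET /a HTTP/1.1" 200 123 "-" "Mozilla/5.0 (compatible; Googlebot/2.1)"',
    '66.249.66.1 - - [10/May/2024:12:00:07 +0000] "GET /b HTTP/1.1" 200 456 "-" "Mozilla/5.0 (compatible; Googlebot/2.1)"',
]

TS_RE = re.compile(r"\[([^\]]+)\]")

def hit_times(lines, bot="Googlebot"):
    """Extract request timestamps for a given bot from access-log lines."""
    times = []
    for line in lines:
        if bot in line:
            m = TS_RE.search(line)
            if m:
                times.append(datetime.strptime(m.group(1), "%d/%b/%Y:%H:%M:%S %z"))
    return times

times = hit_times(LOG_LINES)
gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
print(gaps)  # [7.0] -> the bot waited 7 seconds between requests
```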
Advanced Alternatives
Web Server Rate Limiting
Apache (.htaccess, requires mod_ratelimit; note it throttles bandwidth in KiB/s, not requests per second):
<IfModule mod_ratelimit.c>
    SetOutputFilter RATE_LIMIT
    # Limit matching crawlers to roughly 50 KiB/s
    SetEnvIfNoCase User-Agent "Googlebot" rate-limit=50
</IfModule>
Nginx Configuration
# Define the shared zone in the http {} context (this example keys on the User-Agent string):
limit_req_zone $http_user_agent zone=crawler:10m rate=1r/s;

# Then apply it inside a server block:
location / {
    limit_req zone=crawler burst=20;
}
Cloud Solutions
- Cloudflare Rate Limiting (Web Application Firewall)
- AWS WAF Bot Control
- Akamai Bot Manager
Strategic Implementation Tips
While Crawl-delay provides basic bot management, its effectiveness depends on crawler compliance. For mission-critical sites:
- Use server-level rate limiting for guaranteed enforcement
- Combine with XML sitemaps for efficient crawling paths
- Monitor crawl budget metrics in Search Console
- Set different delays per bot (Bingbot vs. Yandex vs. Baidu)
Always test new configurations during low-traffic periods and verify with multiple bot simulators before full deployment.
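One way to simulate a compliant bot before deployment is a minimal crawler loop that reads the parsed Crawl-delay and sleeps between requests. A sketch (the URL list and agent name are placeholders; a real bot would fetch each URL):

```python
import time
from urllib import robotparser

# Parse a small rule set with a 1-second delay for all agents.
rp = robotparser.RobotFileParser()
rp.parse(["User-agent: *", "Crawl-delay: 1"])

def polite_crawl(urls, user_agent="TestBot"):
    """Visit URLs in order, sleeping Crawl-delay seconds between requests."""
    delay = rp.crawl_delay(user_agent) or 0
    visited = []
    for i, url in enumerate(urls):
        if i:
            time.sleep(delay)   # honor the advertised delay
        visited.append(url)     # a real bot would fetch the URL here
    return visited

start = time.monotonic()
pages = polite_crawl(["https://example.com/a", "https://example.com/b"])
print(len(pages), "pages in", round(time.monotonic() - start, 1), "s")
```

Running this against your own staged robots.txt lets you confirm the delay you published is the delay a well-behaved client would actually apply.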