Tuesday, September 19, 2006
Vanessa's been posting a lot lately, and I'm starting to feel left out. So here's my tidbit of wisdom for you: I've noticed a couple of webmasters confused by "blocked by robots.txt" errors, and I wanted to share the steps I take when debugging robots.txt problems:
A handy checklist for debugging a blocked URL
Let's assume you are looking at crawl errors for your website and notice a URL restricted by robots.txt that you weren't intending to block:
https://www.example.com/amanda.html URL restricted by robots.txt Sep 3, 2006
Check the robots.txt analysis tool
The first thing you should do is go to the robots.txt analysis tool for that site. Make sure you are looking at the correct site for that URL, paying particular attention to the protocol and subdomain. (Subdomains and protocols may each have their own robots.txt file, so https://www.example.com/robots.txt may be different from http://example.com/robots.txt.)
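If you want to reproduce this kind of check locally, here's a minimal sketch using Python's standard-library robots.txt parser. The rules below are hypothetical, standing in for whatever your site's actual robots.txt contains; the point is simply to test whether a given user-agent may fetch a given URL under those rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules that would block /amanda.html for all crawlers.
rules = """\
User-agent: *
Disallow: /amanda.html
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Googlebot has no dedicated group above, so it falls back to the "*" rules.
print(parser.can_fetch("Googlebot", "https://www.example.com/amanda.html"))  # blocked
print(parser.can_fetch("Googlebot", "https://www.example.com/other.html"))   # allowed
```

Remember to run this separately against the robots.txt for each protocol and subdomain you serve, since each one can have different rules.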