Dan Crow has written about this on the Google Blog recently, including an introduction to setting
up your own rules for robots and a description of some of the more advanced options. His first
two posts in the series are:
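To give a concrete sense of the kind of rules those posts walk through, here is a minimal robots.txt sketch. The /private/ and /private/drafts/ paths are just placeholder directories for illustration:

    # Allow all crawlers everywhere except the /private/ directory
    User-agent: *
    Disallow: /private/

    # Googlebot gets its own, more specific record: block only /private/drafts/
    User-agent: Googlebot
    Disallow: /private/drafts/

A crawler follows the most specific User-agent record that matches it, so in this sketch Googlebot would obey only its own rules and ignore the wildcard record.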