Tuesday, April 17, 2007
As a site owner, you control what content from your site is indexed in search engines. The easiest way to let search engines know what content you don't want indexed is to use a robots.txt file or robots meta tag. But sometimes, you want to remove content that's already been indexed. What's the best way to do that?
As always, the answer begins: it depends on the type of content that you want to remove. Our webmaster help center provides detailed information about each situation. Generally, once you've removed or blocked the content and we recrawl the page, we'll remove the content from our index automatically. But if you'd like to expedite the removal rather than wait for the next crawl, the way to do that has just gotten easier.
For sites that you've verified ownership for in your Webmaster Tools account, you'll now see a new option under the Diagnostic tab called URL Removals. To get started, simply click the URL Removals link, then New Removal Request. Choose the option that matches the type of removal you'd like.
Individual URLs
Choose the Individual URLs option if you'd like to remove a URL or image. In order for the URL to be eligible for removal, one of the following must be true:
- The URL must return either a 404 or 410 status code.
- The URL must be blocked by the site's robots.txt file.
- The URL must be blocked by a robots meta tag.
Once the URL is ready for removal, enter the URL and indicate whether it appears in our web search results or image search results. Then click Add. You can add up to 100 URLs in a single request. Once you've added all the URLs you would like removed, click Submit Removal Request.
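Before submitting a request, you can sanity-check the robots.txt condition yourself. Here's a minimal Python sketch, using a hypothetical robots.txt for www.example.com, that tests whether a given URL is blocked for Googlebot with the standard library's robots.txt parser:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; in practice you'd fetch
# http://www.example.com/robots.txt and check the live rules.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/
"""

def blocked_for_googlebot(url: str) -> bool:
    """Return True if the robots.txt rules above block Googlebot from url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return not parser.can_fetch("Googlebot", url)

print(blocked_for_googlebot("http://www.example.com/private/page.html"))  # True
print(blocked_for_googlebot("http://www.example.com/public.html"))        # False
```

A URL that isn't blocked this way would instead need to meet one of the other eligibility conditions above.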
A directory
Choose this option if you'd like to remove all files and folders within a directory on your site.
For instance, if you request removal of http://www.example.com/myfolder, we'll remove all URLs that begin with that path.
In order for a directory to be eligible for removal, you must block it using a robots.txt file.
For instance, for the example above,
http://www.example.com/robots.txt could include the following:
User-agent: Googlebot
Disallow: /myfolder
Your entire site
Choose this option only if you want to remove your entire site from the Google index. This option will remove all subdirectories and files. Don't use this option to keep the non-preferred version of your site's URLs from being indexed. For instance, if you want all of your URLs indexed using the www version, don't use this tool to request removal of the non-www version. Instead, specify the version you want indexed using the Preferred domain tool (and do a 301 redirect to the preferred version, if possible). To use this option, you must block the site using a robots.txt file.
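For a whole-site removal, the robots.txt block must cover the root of the site; a minimal example:

```
User-agent: Googlebot
Disallow: /
```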
Cached copies
Choose this option to remove cached copies of pages in our index. You have two options for making pages eligible for cache removal.
Using a meta noarchive tag and requesting expedited removal
If you don't want the page cached at all, you can add a meta noarchive tag to the page and then request expedited cache removal using this tool. We'll remove the cached copy right away, and as long as the meta noarchive tag remains on the page, we'll never include a cached version. (If you change your mind later, you can simply remove the meta noarchive tag from the page.)
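The noarchive directive is a standard robots meta tag that goes in the page's head section; for example:

```html
<meta name="robots" content="noarchive">
```

You can also target Googlebot specifically with `<meta name="googlebot" content="noarchive">`.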
Changing the page content
If you want to remove the cached version of a page because it contained content that you've since removed and don't want indexed, you can request the cache removal here. We'll check that the content on the live page differs from the cached version, and if so, we'll remove the cached version. We'll automatically make the latest cached version of the page available again after six months (by which point we likely will have recrawled the page, so the cached version will reflect the latest content). Or, if you see that we've recrawled the page sooner than that, you can use this tool to request that we reinclude the cached version earlier.
Checking the status of removal requests
Removal requests show as pending until they have been processed, at which point the status changes to either Denied or Removed. Generally, a request is denied if it doesn't meet the eligibility criteria for removal.
To reinclude content
If a request is successful, it appears in the Removed Content tab and you can reinclude it at any time simply by removing the robots.txt or robots meta tag block and clicking Reinclude. Otherwise, we'll exclude the content for six months. After that six-month period, if the content is still blocked or returns a 404 or 410 status code when we recrawl the page, it won't be reincluded in our index. However, if the page is available to our crawlers after this six-month period, we'll once again include it in our index.
Requesting removal of content you don't own
But what if you want to request removal of content that's located on a site that you don't own? It's just gotten easier to do that as well. Our new Webpage removal request tool steps through the process for each type of removal request.
Since Google indexes the web and doesn't control the content on web pages, we generally can't remove results from our index unless the webmaster has blocked or modified the content or removed the page. If you would like content removed, you can work with the site owner to do so, and then use this tool to expedite the removal from our search results.
If you have found search results that contain specific types of personal information, you can request removal even if you've been unable to work with the site owner. For this type of removal, provide your email address so we can work with you directly.
If you have found search results that shouldn't be returned with SafeSearch enabled, you can let us know using this tool as well.
You can check on the status of pending requests, and as with the version available in Webmaster Tools, the status will change to Removed or Denied once it's been processed. Generally, the request is denied if it doesn't meet the eligibility criteria. For requests that involve personal information, you won't see the status available here, but will instead receive an email with more information about next steps.
What about the existing URL removal tool?
If you've made previous requests with this tool, you can still log in to check on the status of those requests. However, make any new requests with this new and improved version of the tool.