Overview of crawling and indexing topics

The topics in this section describe how you can control Google's ability to find and parse your content so that it can appear in Search and other Google properties, and how to prevent Google from crawling specific content on your site.

Here's a brief description of each page. To get an overview of crawling and indexing, read our How Search works guide.

Topics
File types indexable by Google: Google can index the content of most types of pages and files. Explore a list of the most common file types that Google Search can index.
URL structure: Consider organizing your content so that URLs are constructed logically and in a manner that is most intelligible to humans.
Sitemaps: Tell Google about pages on your site that are new or updated.
Crawler management
robots.txt: A robots.txt file tells search engine crawlers which pages or files the crawler can or can't request from your site.
Canonicalization: Learn what URL canonicalization is and how to tell Google about any duplicate pages on your site in order to avoid excessive crawling. Learn how Google auto-detects duplicate content, how it treats duplicate content, and how it assigns a canonical URL to any duplicate page groups found.
Mobile sites: Learn how you can optimize your site for mobile devices and ensure that it's crawled and indexed properly.
AMP: If you have AMP pages, learn how AMP works in Google Search.
JavaScript: There are some differences and limitations that you need to account for when designing your pages and applications to accommodate how crawlers access and render your content.
Page and content metadata
Removals
Site moves and changes
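
As a rough illustration of the robots.txt topic above, the following Python sketch checks whether a crawler may request a URL under a given set of rules. It uses the standard library's urllib.robotparser, which is a general-purpose parser and may not match Googlebot's behavior in every case. The rules, the example.com URLs, and the can_request helper are hypothetical and exist only for this example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only:
# every crawler is asked not to request anything under /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def can_request(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the rules in robots_txt let user_agent request url."""
    parser = RobotFileParser()
    # parse() accepts an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://example.com/blog/post.html",
        "https://example.com/private/report.html",
    ):
        verdict = "allowed" if can_request(ROBOTS_TXT, "Googlebot", url) else "disallowed"
        print(f"{url}: {verdict}")
```

In practice you would point the parser at your site's live robots.txt file rather than an inline string. The inline version is used here so the sketch runs without a network connection.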