How Google crawls locale-adaptive pages
If your site has locale-adaptive pages (that is, your site returns different content based on the perceived country or preferred language of the visitor), Google might not crawl, index, or rank all your content for different locales. This is because the default IP addresses of the Googlebot crawler appear to be based in the USA. In addition, the crawler sends HTTP requests without setting Accept-Language in the request header.

Important: We recommend using separate locale URL configurations and annotating them with rel="alternate" hreflang annotations.
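To make the consequence concrete, here is a minimal sketch of a locale-adaptive handler, assuming a Flask app and a hypothetical country_from_ip geolocation helper (neither is part of the original page). A crawler arriving from a US IP address without an Accept-Language header always falls through to the default US English variant, so the other variants may never be fetched.

```python
# Hedged sketch of locale-adaptive serving; Flask and country_from_ip
# are illustrative assumptions, not Google's implementation.
from flask import Flask, request

app = Flask(__name__)

def country_from_ip(ip: str) -> str:
    """Hypothetical IP-to-country lookup; a real site would query a GeoIP database."""
    geo_table = {"203.0.113.5": "AU"}  # illustrative entry only
    return geo_table.get(ip, "US")     # unknown IPs fall back to US

CONTENT = {
    ("US", "en"): "US English content",
    ("AU", "en"): "Australian English content",
}

@app.route("/")
def home():
    country = country_from_ip(request.remote_addr or "")
    # With no Accept-Language header, best_match() returns the default,
    # which is exactly what happens for Googlebot's requests.
    lang = request.accept_languages.best_match(["en"], default="en")
    return CONTENT.get((country, lang), CONTENT[("US", "en")])
```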
Geo-distributed crawling
Googlebot crawls with IP addresses based outside the USA, in addition to the US-based IP addresses.
As we have always recommended, when Googlebot appears to come from a certain country, treat it like you would treat any other user from that country. This means that if you block USA-based users from accessing your content but allow visitors from Australia to see it, your server should block Googlebot when it appears to come from the USA, but allow access when it appears to come from Australia.
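A hedged sketch of that rule, reusing the hypothetical country_from_ip helper from the sketch above: the access decision depends only on the apparent country of the request, with no special case for Googlebot.

```python
# Country-based access control that treats Googlebot like any other
# visitor; ALLOWED_COUNTRIES and country_from_ip are assumptions.
ALLOWED_COUNTRIES = {"AU"}

def is_request_allowed(ip: str) -> bool:
    # No user-agent check here: Googlebot coming from a US IP is blocked,
    # and Googlebot coming from an Australian IP is allowed, just like
    # human visitors from those countries.
    return country_from_ip(ip) in ALLOWED_COUNTRIES
```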
Other considerations

- Googlebot uses the same user agent string for all crawling configurations. Learn more about the user agent strings used by Google crawlers.
- You can verify Googlebot geo-distributed crawls using reverse DNS lookups (see the sketch after this list).
- If your site is using the robots exclusion protocol, make sure you apply it consistently across locales. This means that robots meta tags and the robots.txt file must specify the same rules in each locale.
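The reverse DNS check mentioned above can be done with a forward-confirmed lookup: resolve the IP to a hostname, check that the domain is googlebot.com or google.com, then resolve the hostname back and confirm it includes the original IP. Below is a minimal Python sketch of that procedure (error handling simplified; it is an illustration, not Google's verification tool).

```python
import socket

def is_googlebot(ip: str) -> bool:
    """Forward-confirmed reverse DNS check for Googlebot IPs."""
    try:
        # Reverse lookup: Googlebot PTR records end in googlebot.com or google.com.
        hostname, _, _ = socket.gethostbyaddr(ip)
        if not hostname.endswith((".googlebot.com", ".google.com")):
            return False
        # Forward-confirm: the hostname must resolve back to the same IP.
        return ip in socket.gethostbyname_ex(hostname)[2]
    except (socket.herror, socket.gaierror):
        return False

# Example: 66.249.66.1 resolves to crawl-66-249-66-1.googlebot.com.
print(is_googlebot("66.249.66.1"))
```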
Last updated (UTC): 2025-08-04.