URL encoding
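Some characters cannot be part of a URL (for example, the space), and others have a special meaning in a URL. In HTML forms, the character = separates a name from a value. The URI generic syntax uses percent-encoding to deal with this problem, while HTML forms make some additional substitutions rather than applying percent-encoding to all such characters.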
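For example, spaces in a string are either encoded as %20 or replaced with the plus sign (+). If you use a pipe character (|) as a separator, be sure to encode the pipe as %7C. A comma in a string should be encoded as %2C.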
Use your platform's standard URL-building libraries to encode your URLs automatically, so that they are escaped correctly for your platform.
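As an illustration of the two space conventions and the pipe and comma encodings described above, here is a minimal sketch using Python's standard urllib.parse module. The helper names encode_component and encode_form_value are hypothetical and chosen only for this example; other platforms provide their own equivalents.

```python
from urllib.parse import quote, quote_plus


def encode_component(value: str) -> str:
    """Percent-encode a single URL component (hypothetical helper).

    safe="" means every character outside letters, digits and - _ . ~
    is percent-encoded, including / , | and the space.
    """
    return quote(value, safe="")


def encode_form_value(value: str) -> str:
    """Encode a value the way HTML forms do: spaces become + signs."""
    return quote_plus(value)


if __name__ == "__main__":
    print(encode_component("New York"))   # New%20York
    print(encode_form_value("New York"))  # New+York
    print(encode_component("a|b,c"))      # a%7Cb%2Cc
```

Either space convention is usually accepted inside a query string, but only %20 is a percent-encoding in the URI generic syntax sense; the plus sign is the HTML form substitution.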
Building a valid URL
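You may think that a "valid" URL is self-evident, but that's not quite the case. A URL entered in a browser's address bar, for example, may contain special characters (such as "上海+中國"); the browser needs to translate those characters internally into a different encoding before transmission. By the same token, any code that generates or accepts UTF-8 input might treat URLs with UTF-8 characters as "valid", but would also need to translate those characters before sending them to a web server. This process is called URL encoding or percent-encoding.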
Special characters
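We need to translate special characters because all URLs must conform to the syntax specified by the Uniform Resource Identifier (URI) specification. In effect, this means that URLs must contain only a special subset of ASCII characters: the familiar alphanumeric symbols, a few unreserved punctuation marks, and some reserved characters that act as control characters (delimiters) within URLs. This table summarizes these characters: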
Summary of valid URL characters
Set | Characters | URL usage |
Alphanumeric | a b c d e f g h i j k l m n o p q r s t u v w x y z A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 0 1 2 3 4 5 6 7 8 9 | Text strings, scheme usage (http), port (8080), etc. |
Unreserved | - _ . ~ | Text strings |
Reserved | ! * ' ( ) ; : @ & = + $ , / ? % # [ ] | Control characters and/or text strings |
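When building a valid URL, you must ensure that it contains only the characters shown in the "Summary of valid URL characters" table. Conforming a URL to this character set generally raises two issues, one of omission and one of substitution: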
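- Characters that you want to handle fall outside the above set. For example, non-ASCII text such as 上海+中國 needs to be encoded using the above characters. By popular convention, spaces (which are not allowed within URLs) are often represented by the plus character '+' as well.
- Characters exist within the above set as reserved characters, but need to be used literally. For example, ? is used within URLs to indicate the beginning of the query string; if you want to use the string "? and the Mysterians", you need to encode the '?' character.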
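The omission issue can be detected mechanically. The following is a minimal sketch that transcribes the character table into Python sets and reports any characters in a string that fall outside it. The function name invalid_characters and the example.com URLs are assumptions made for illustration.

```python
import string

# Character sets transcribed from the "Summary of valid URL characters" table.
ALPHANUMERIC = set(string.ascii_letters + string.digits)
UNRESERVED = set("-_.~")
RESERVED = set("!*'();:@&=+$,/?%#[]")
VALID_URL_CHARS = ALPHANUMERIC | UNRESERVED | RESERVED


def invalid_characters(url: str) -> list:
    """Return the distinct characters of `url` that are outside the table,
    in order of first appearance (hypothetical helper)."""
    return [ch for ch in dict.fromkeys(url) if ch not in VALID_URL_CHARS]


if __name__ == "__main__":
    print(invalid_characters("https://example.com/a?b=c"))
    print(invalid_characters("https://example.com/search?q=上海 中國"))
```

An empty result only means that every character is in the allowed set. It does not detect the substitution issue (a reserved character used literally), and it does not confirm that the URL is otherwise well formed.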
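All characters to be URL-encoded are encoded using a '%' character followed by a two-character hex value for each byte of the character's UTF-8 representation. For example, 上海+中國 in UTF-8 would be URL-encoded as %E4%B8%8A%E6%B5%B7%2B%E4%B8%AD%E5%9C%8B. The string ? and the Mysterians would be URL-encoded as %3F+and+the+Mysterians or %3F%20and%20the%20Mysterians.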
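To make the byte-level rule concrete, the sketch below percent-encodes a string by hand: it converts the text to UTF-8 bytes and writes %XX for every byte that is not alphanumeric or unreserved. The function name percent_encode is hypothetical; in practice a library call such as Python's urllib.parse.quote does the same job, and the example compares the two.

```python
import string
from urllib.parse import quote

# Characters that may appear unencoded: alphanumerics plus - _ . ~
UNRESERVED = set(string.ascii_letters + string.digits + "-_.~")


def percent_encode(text: str) -> str:
    """Encode each UTF-8 byte outside UNRESERVED as %XX (illustrative only)."""
    pieces = []
    for byte in text.encode("utf-8"):
        ch = chr(byte)
        if ch in UNRESERVED:
            pieces.append(ch)
        else:
            pieces.append(f"%{byte:02X}")
    return "".join(pieces)


if __name__ == "__main__":
    for sample in ("上海+中國", "? and the Mysterians"):
        encoded = percent_encode(sample)
        # The hand-rolled result should match the standard library's.
        print(encoded, encoded == quote(sample, safe=""))
```

Each Chinese character in the sample occupies three bytes in UTF-8, which is why it expands to three %XX groups.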
Common characters that need encoding
Some common characters that must be encoded are:
Unsafe character | Encoded value |
Space | %20 |
" | %22 |
< | %3C |
> | %3E |
# | %23 |
% | %25 |
Pipe (|) | %7C |
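The table can be reproduced, and reversed, with a standard library. The sketch below assumes Python's urllib.parse; the names UNSAFE and ENCODED are chosen only for this example. It also shows one consequence of % itself being encoded: running the encoder over text that is already encoded escapes it a second time.

```python
from urllib.parse import quote, unquote

# The unsafe characters listed in the table above.
UNSAFE = [" ", '"', "<", ">", "#", "%", "|"]

# Map each character to its percent-encoded form.
ENCODED = {ch: quote(ch, safe="") for ch in UNSAFE}

if __name__ == "__main__":
    for ch, enc in ENCODED.items():
        # unquote reverses the encoding, so the round trip is lossless.
        print(repr(ch), enc, unquote(enc) == ch)

    # Encoding twice turns the % of %20 into %25.
    print(quote("%20", safe=""))  # %2520
```

For this reason it is usually safer to encode raw values exactly once, at the point where the URL is assembled.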
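Converting a URL that you receive from user input is sometimes tricky. For example, a user may enter an address as "5th&Main St." Inserted unencoded into a query string, the & would be read as a parameter separator. Generally, you should construct your URL from its parts, treating any user input as literal characters.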
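A minimal sketch of building a URL from its parts follows, assuming Python's urllib.parse. The endpoint https://example.com/geocode, the parameter names address and key, and the helper build_url are placeholders for illustration, not a real service's API.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_url(base: str, params: dict) -> str:
    """Join a base URL and query parameters, encoding every value
    as literal text (hypothetical helper)."""
    return f"{base}?{urlencode(params)}"


if __name__ == "__main__":
    base = "https://example.com/geocode"  # placeholder endpoint
    user_input = "5th&Main St."

    # Naive concatenation: the & splits the address into two parameters.
    naive = f"{base}?address={user_input}&key=YOUR_API_KEY"
    print(parse_qs(urlsplit(naive).query).get("address"))

    # Built from parts: the & is encoded as %26 and the value survives intact.
    safe = build_url(base, {"address": user_input, "key": "YOUR_API_KEY"})
    print(safe)
    print(parse_qs(urlsplit(safe).query).get("address"))
```

urlencode uses the HTML form convention, so the space in the address becomes a plus sign. Passing quote_via=urllib.parse.quote to urlencode yields %20 instead, if a service requires it.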
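Additionally, URLs are limited to 16384 characters for all Google Maps Platform web services and static web APIs. For most services, this limit is seldom approached. Note, however, that certain services have several parameters that may result in long URLs.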
Last updated (UTC): 2023-08-02.