Compression
This document applies to the following method:
Update API (v4): threatListUpdates.fetch.
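About compression
Compression is a key feature of the Safe Browsing APIs (v4). It significantly reduces bandwidth requirements, which matters particularly, though not exclusively, for mobile devices. The Safe Browsing server currently supports Rice compression. Additional compression methods may be added in the future.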
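Compression is configured through the supportedCompressions field, whose values come from the CompressionType enum. Clients should use the RICE and RAW compression types. If no compression type is set, Safe Browsing treats it as COMPRESSION_TYPE_UNSPECIFIED and substitutes RAW.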
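As a rough illustration, a list update request could declare both compression types in its constraints. The sketch below only builds a JSON body and sends nothing. The client ID, client version, and threat list values are placeholders, and every field name other than supportedCompressions is an assumption based on the threatListUpdates.fetch reference rather than on this page.

```python
import json


def build_fetch_body(state: str = "") -> dict:
    """Build an illustrative threatListUpdates.fetch request body."""
    return {
        # Placeholder client identification (hypothetical values).
        "client": {"clientId": "example-client", "clientVersion": "1.0"},
        "listUpdateRequests": [
            {
                # Assumed list selectors; adjust to the lists you actually use.
                "threatType": "MALWARE",
                "platformType": "ANY_PLATFORM",
                "threatEntryType": "URL",
                "state": state,
                "constraints": {
                    # Advertise both compression types named on this page.
                    "supportedCompressions": ["RICE", "RAW"],
                },
            }
        ],
    }


body = build_fetch_body()
print(json.dumps(body, indent=2))
```

Listing RAW alongside RICE leaves the server free to fall back to uncompressed data, for example for hash prefixes longer than 4 bytes.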
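Regardless of the compression type selected, the Safe Browsing server also applies standard HTTP compression to responses, as long as the client sets the correct HTTP compression header (see the Wikipedia article "HTTP compression").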
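A minimal sketch of the client side of HTTP compression, assuming gzip is the encoding in use: the request advertises gzip through the Accept-Encoding header, and the response body is decompressed only when the server reports that it applied gzip. The URL is a placeholder and the helper names are hypothetical.

```python
import gzip
import urllib.request
from typing import Optional


def make_request(url: str, body: bytes) -> urllib.request.Request:
    """Prepare a POST request that advertises gzip support."""
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json", "Accept-Encoding": "gzip"},
        method="POST",
    )


def decode_body(raw: bytes, content_encoding: Optional[str]) -> bytes:
    """Undo HTTP-level gzip compression if the response declares it."""
    if content_encoding and content_encoding.lower() == "gzip":
        return gzip.decompress(raw)
    return raw


# Local round trip only; no network request is made here.
request = make_request("https://example.invalid/fetch", b"{}")
sample = b'{"listUpdateResponses": []}'
print(decode_body(gzip.compress(sample), "gzip") == sample)
```

With a real response, the value passed as content_encoding would come from the Content-Encoding response header.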
Rice compression
As noted, the Safe Browsing server currently supports Rice compression (see the Wikipedia article "Golomb coding" for a full discussion of Golomb-Rice coding).
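Compression/decompression
The RiceDeltaEncoding object represents Rice-Golomb encoded data and is used to send compressed removal indices or compressed 4-byte hash prefixes. (Hash prefixes longer than 4 bytes are not compressed and are served in raw format instead.)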
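For removal indices, the list of indices is sorted in ascending order and then delta encoded using RICE encoding. For additions, the 4-byte hash prefixes are reinterpreted as little-endian uint32s, sorted in ascending order, and then delta encoded using RICE encoding. Note the difference in hash format between RICE compression and RAW: raw hashes are lexicographically sorted bytes, whereas Rice hashes are uint32s sorted in ascending order (after decompression).
That is, the list of integers [1, 5, 7, 13] is encoded as 1 (the first value) and the deltas [4, 2, 6].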
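The delta step, and the effect of the little-endian reinterpretation on sort order, can be sketched as follows. The function names are hypothetical, and the two sample prefixes are made up to show that byte order and uint32 order can disagree.

```python
import struct


def prefixes_to_uint32(prefixes):
    """Reinterpret 4-byte hash prefixes as little-endian uint32s, sorted ascending."""
    return sorted(struct.unpack("<I", p)[0] for p in prefixes)


def delta_encode(values):
    """Split an ascending, non-empty list into (first value, successive differences)."""
    deltas = [b - a for a, b in zip(values, values[1:])]
    return values[0], deltas


def delta_decode(first_value, deltas):
    """Rebuild the ascending list from the first value and its deltas."""
    values = [first_value]
    for d in deltas:
        values.append(values[-1] + d)
    return values


first, deltas = delta_encode([1, 5, 7, 13])
print(first, deltas)  # 1 [4, 2, 6]

# Made-up prefixes: already in lexicographic byte order (the RAW ordering),
# yet they swap places once read as little-endian integers.
raw = [b"\x01\x00\x00\x02", b"\x02\x00\x00\x01"]
print(sorted(raw) == raw)       # True
print(prefixes_to_uint32(raw))  # [16777218, 33554433]
```

A client that stores raw-ordered prefixes would therefore need to convert decoded uint32s back to 4-byte strings and re-sort them before merging.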
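The first value is stored in the firstValue field, and the deltas are encoded using a Golomb-Rice encoder. The Rice parameter k (see below) is stored in riceParameter. The numEntries field contains the number of deltas encoded by the Rice encoder (3 in the example above, not 4). The encodedData field contains the actual encoded deltas.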
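To make the roles of the four fields concrete, here is a hypothetical in-memory mirror of them, together with a helper that rebuilds the value list once the deltas have been decoded. The class and function names are assumptions for illustration. Decoding encodedData is out of scope here, so the example leaves it empty and supplies the deltas directly.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RiceDeltaEncodingSketch:
    """Hypothetical mirror of the fields described above (not an official type)."""
    first_value: int     # firstValue: first integer in the sorted list
    rice_parameter: int  # riceParameter: the Rice parameter k
    num_entries: int     # numEntries: number of encoded deltas
    encoded_data: bytes  # encodedData: the Rice-coded deltas


def rebuild_values(enc: RiceDeltaEncodingSketch, decoded_deltas: List[int]) -> List[int]:
    """Combine firstValue with already-decoded deltas into the full list."""
    if len(decoded_deltas) != enc.num_entries:
        raise ValueError(
            "expected %d deltas, got %d" % (enc.num_entries, len(decoded_deltas))
        )
    values = [enc.first_value]
    for d in decoded_deltas:
        values.append(values[-1] + d)
    return values


# encoded_data is a placeholder; the deltas are given as if already decoded.
enc = RiceDeltaEncodingSketch(first_value=1, rice_parameter=2, num_entries=3, encoded_data=b"")
print(rebuild_values(enc, [4, 2, 6]))  # [1, 5, 7, 13]
```

The result always holds numEntries + 1 values, because firstValue is carried outside the encoded deltas.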
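Encoder/decoder
In the Rice encoder/decoder, every delta n is encoded as q and r, where n = (q<<k) + r (or, n = q * (2**k) + r). k is a constant and a parameter of the Rice encoder/decoder. The values of q and r are written to the bit stream using different encoding schemes.
The quotient q is encoded in unary coding followed by a 0. That is, 3 is encoded as 1110, 4 as 11110, and 7 as 11111110. The quotient q is decoded first.
The remainder r is encoded using truncated binary encoding. Only the least significant k bits of r are written to (and therefore read from) the bit stream. The remainder r is decoded after q.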
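The quotient and remainder scheme above can be sketched on a plain list of bits. One detail is an assumption: the text does not say in which order the k remainder bits are written, so this sketch writes the least significant bit first. The function names are hypothetical.

```python
def rice_encode(n, k):
    """Encode one delta as a list of bits: unary quotient, then k remainder bits."""
    q = n >> k
    r = n & ((1 << k) - 1)
    bits = [1] * q + [0]  # q ones followed by the terminating 0
    # Assumption: remainder bits are emitted least significant bit first.
    bits += [(r >> i) & 1 for i in range(k)]
    return bits


def rice_decode(bits, k, pos=0):
    """Decode one delta starting at bits[pos]; return (value, next position)."""
    q = 0
    while bits[pos] == 1:  # count ones up to the terminating 0
        q += 1
        pos += 1
    pos += 1  # skip the terminating 0
    r = 0
    for i in range(k):  # same bit order as rice_encode
        r |= bits[pos + i] << i
    return (q << k) + r, pos + k


k = 2
stream = []
for delta in [4, 2, 6]:
    stream += rice_encode(delta, k)

decoded, pos = [], 0
while pos < len(stream):
    value, pos = rice_decode(stream, k, pos)
    decoded.append(value)
print(decoded)  # [4, 2, 6]
```

A larger k shortens the unary part for large deltas at the cost of k fixed bits per value, which is why k travels with the data in riceParameter.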
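Bit encoder/decoder
The Rice encoder relies on a bit encoder/decoder to which single bits can be appended. This is needed, for example, to encode a quotient q whose code may be only two bits long.
The bit encoder is a list of (8-bit) bytes. Bits are set from the least significant bit of the first byte to the most significant bit of the first byte. If all bits of a byte are already set, a new byte (initialized to zero) is appended to the end of the byte list. If the last byte is not fully used, its most significant bits are set to zero. Example: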
| Bits added | BitEncoder after adding bits |
|------------|------------------------------|
|            | []                           |
| 0          | [00000000]                   |
| 1          | [00000010]                   |
| 1          | [00000110]                   |
| 1,0,1      | [00101110]                   |
| 0,0,0      | [00101110, 00000000]         |
| 1,1,0      | [00101110, 00000110]         |
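The table above can be reproduced with a small bit encoder. This is a minimal sketch, and the class and method names are hypothetical. It fills each byte from its least significant bit and appends a zeroed byte whenever the current one is full.

```python
class BitEncoder:
    """Append-only bit buffer that fills each byte from its least significant bit."""

    def __init__(self):
        self.data = bytearray()
        self.bit_count = 0

    def append(self, bit):
        if self.bit_count % 8 == 0:
            self.data.append(0)  # start a new zero-initialized byte
        if bit:
            self.data[-1] |= 1 << (self.bit_count % 8)
        self.bit_count += 1

    def extend(self, bits):
        for bit in bits:
            self.append(bit)

    def __str__(self):
        return "[" + ", ".join(format(b, "08b") for b in self.data) + "]"


encoder = BitEncoder()
snapshots = [str(encoder)]
for group in ([0], [1], [1], [1, 0, 1], [0, 0, 0], [1, 1, 0]):
    encoder.extend(group)
    snapshots.append(str(encoder))

for snapshot in snapshots:
    print(snapshot)
```

Each printed line corresponds to one row of the table, starting from the empty encoder.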
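Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-07-25 UTC.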