<<

Brotli Compression Algorithm motivation Main differences with Areas ● entropy code reuse ● window size ● context modeling ● literal count coding ● distance cache ● match lengths ● block end codes Entropy code reuse

Entropy code requires some to be described.

Deflate: the whole : a block may entropy code has to re-use past entropy be described before codes (within the the data. same metablock). Window size

Deflate: deflate can Brotli: brotli can refer remember past 32 back to past 16 MB. kB.

TBD: freeze a resource sweet spot for the specification, perhaps just 4 MB window. Context modeling

Deflate: each literal Brotli: literal bytes is coded can have an entropy independently. code that depends on one, two or three last bytes. Literal count coding

Deflate: every symbol Brotli: the number of pure has a probability for literals is codified, next literals, backward copy the literals with a literal length, or end code. only entropy code. These are entropy coded individually for every symbol. Distance cache

Deflate: each distance Brotli: backward codified separately. references can be described in the form of past four distances

...abcdeX12345...... abcdeY12345... Match lengths

Deflate: match lengths of Brotli: match lengths of 3–258. Match lengths 2–enough. Match lengths codified in the same codified in joint entropy entropy code as literals. code with literal lengths. Block end code

Deflate: each symbol can Brotli: block length is be the block end code, codified in the beginning stealing some efficiency of the block. from the entropy when unique symbol count is small.