Brotli Compression Algorithm motivation Main differences with deflate Areas ● entropy code reuse ● window size ● context modeling ● literal count coding ● distance cache ● match lengths ● block end codes Entropy code reuse
Entropy code requires some bytes to be described.
Deflate: the whole Brotli: a block may entropy code has to re-use past entropy be described before codes (within the the data. same metablock). Window size
Deflate: deflate can Brotli: brotli can refer remember past 32 back to past 16 MB. kB.
TBD: freeze a resource sweet spot for the specification, perhaps just 4 MB window. Context modeling
Deflate: each literal Brotli: literal bytes byte is coded can have an entropy independently. code that depends on one, two or three last bytes. Literal count coding
Deflate: every symbol Brotli: the number of pure has a probability for literals is codified, next literals, backward copy the literals with a literal length, or end code. only entropy code. These are entropy coded individually for every symbol. Distance cache
Deflate: each distance Brotli: backward codified separately. references can be described in the form of past four distances
...abcdeX12345...... abcdeY12345... Match lengths
Deflate: match lengths of Brotli: match lengths of 3–258. Match lengths 2–enough. Match lengths codified in the same codified in joint entropy entropy code as literals. code with literal lengths. Block end code
Deflate: each symbol can Brotli: block length is be the block end code, codified in the beginning stealing some efficiency of the block. from the entropy when unique symbol count is small.