2017 IEEE International Conference on Consumer Electronics (ICCE)

LZ4m: A Fast Compression Algorithm for In-Memory Data

Se-Jun Kwon∗, Sang-Hoon Kim†, Hyeong-Jun Kim∗, and Jin-Soo Kim∗
∗College of Information and Communication Engineering, Sungkyunkwan University
Email: [email protected], [email protected], [email protected]
†Korea Advanced Institute of Science and Technology (KAIST)
Email: [email protected]

Abstract—Compressing in-memory data is a cost-effective solution for dealing with the memory demand of data-intensive applications. This paper proposes a fast compression algorithm for in-memory data that improves performance by utilizing the characteristics frequently observed in in-memory data.

I. INTRODUCTION

Dealing with the ever-increasing memory requirements of data-intensive applications is the subject of intensive work in both industry and academia. In particular, it has become an important issue in consumer devices, as their fast evolution places a high demand on memory. As a cost-effective response to this demand, manufacturers attempt to compress in-memory data, thereby increasing the effective memory size and lowering manufacturing cost [1], [2]. In-memory data compression techniques are becoming especially important as the market for consumer devices becomes extremely price sensitive and sales shift from premium models to cheaper ones equipped with less memory [3].

Data compression, or compression for short, refers to techniques that reduce the size of data by representing it in a more concise form. There have been many studies that employ compression for in-memory data, ranging from computer architecture to operating systems. For instance, some memory technologies [4] compress cache lines at the memory controller before writing them to memory, providing a larger amount of memory than is physically equipped. Compcache and zram [5] find rarely referenced page frames and compress them to reduce their memory footprint.

A compression algorithm for in-memory data must be fast in both compression and decompression while providing an acceptable compression ratio (throughout the paper, we use the term compression ratio to refer to the ratio of the compressed size to the original size), as the performance of memory accesses heavily influences overall system performance. To this end, Wilson et al. proposed the WKdm algorithm, which utilizes the regularities frequently observed in in-memory data [6]. However, although WKdm exhibits outstanding compression speed, its poor compression ratio is the major concern that hinders its application. On the other hand, general-purpose compression algorithms such as Deflate [7], LZ4 [8], and LZO [9] compress in-memory data better than WKdm, but they exhibit poor compression and decompression speed, thereby impairing system performance.

This paper proposes a novel compression algorithm called LZ4m, which stands for LZ4 for in-memory data. We expedite the scanning of the input stream of the original LZ4 algorithm by utilizing the characteristics frequently observed in in-memory data. We also revise the encoding scheme of the LZ4 algorithm so that the compressed data can be represented in a more concise form. Our evaluation with data obtained from a running mobile device shows that LZ4m outperforms previous compression algorithms in compression and decompression speed by up to 2.1× and 1.8×, respectively, with a marginal loss of less than 3% in compression ratio.

(This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea Government (MSIP) (No. NRF-2016R1A2A1A05005494).)

II. ZIV-LEMPEL ALGORITHM AND LZ4

Many compression algorithms aiming at low compression latency are based on the Lempel-Ziv algorithm [10]. The key strategy of the Lempel-Ziv algorithm is to scan an input stream and find the longest match between the substring starting at the current scanning position and the substrings in the already-scanned part of the input stream. If such a match is found, the matched substring at the current position is substituted with a pair of the backward offset to the match and the length of the match. However, identifying the longest match can incur high overheads in time and space. Thus, algorithms in practice usually alter the strategy to quickly identify a substring that is long enough to be compressed [7], [8], [9].

LZ4 is one of the most popular and successful compression algorithms based on the Lempel-Ziv algorithm. Figure 1 illustrates how LZ4 identifies a match and encodes an input stream. LZ4 scans the input stream with a 4-byte window ("DABE" at position 10 in this case) and checks whether the substring in the window appeared before. To assist the matching, LZ4 maintains a hash table that maps a 4-byte substring to a position from the start of the input stream. If the hash table contains an entry for the current 4-byte substring in the scanning window (Figure 1(a)-①), it implies that the current substring appeared at a certain position in the previously scanned stream (Figure 1(a)-②). Therefore, the substring starting at that position has an identical prefix with the substring starting at the current scanning position.
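As a concrete illustration of this lookup, the sketch below shows one way to hash the 4-byte scanning window in C. This is a minimal sketch under our reading of the scheme, not the authors' code: the name hash4, HASH_LOG, and the table geometry are assumptions, though the multiplicative constant is the Knuth-style hash popularized by LZ4-like compressors.

    #include <stdint.h>
    #include <string.h>

    #define HASH_LOG  12
    #define HASH_SIZE (1u << HASH_LOG)   /* 4096 slots; illustrative size */

    /* Map the 4 bytes under the scanning window to a table slot. */
    static uint32_t hash4(const uint8_t *p)
    {
        uint32_t v;
        memcpy(&v, p, sizeof(v));        /* unaligned-safe 4-byte read */
        return (v * 2654435761u) >> (32 - HASH_LOG);
    }

    /* table[hash4(p)] records the most recent position whose 4-byte
     * substring hashed to that slot; a later lookup yields a candidate
     * match, which is then verified by comparing the actual bytes. */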


Fig. 1. Matching process and encoding of LZ4. [Figure: (a) matching — a 4-byte scanning window slides over an example stream; a hash table maps 4-byte substrings to positions (FCDA → 0, CDAB → 1, DABE → 2, matched again at position 10). (b) encoding — token: literal length (4 bits), match length (4 bits); body: extended literal length (0+ bytes), literal data (0+ bytes), match offset (2 bytes), extended match length (0+ bytes).]

Fig. 2. Matching process and encoding of LZ4m. [Figure: (a) matching — the same stream scanned at 4-byte granularity, with hash entries FCDA → 0, BEGE → 4 (matched at 12), ABDA → 8. (b) encoding — token: match offset (2 bits), literal length (3 bits), match length (3 bits); body: extended literal length (0+ bytes), literal data (0+ bytes), match offset (1 byte), extended match length (0+ bytes).]

LZ4 utilizes this property and finds the longest match starting from these two positions. In this example, the longest match is "DABEGEA". Then, the corresponding hash table entry is updated to the current scanning position and the scanning window slides forward by the length of the match (Figure 1(a)-③). If no entry for the current substring is found in the hash table, a new entry is inserted into the hash table and the scanning window is advanced by one byte. The scanning is repeated until the scanning window reaches the end of the stream.

According to this matching scheme, the input stream is partitioned into a number of substrings, called literals and matches. A literal is a substring that does not match any previously appeared substring. A literal is always followed by one or more matches unless it is the last substring of the input stream. LZ4 encodes the substrings into encoding units comprised of a token and a body, as shown in Figure 1(b). Every encoding unit starts with a 1-byte token. If the substring to encode is a literal followed by a match, the upper 4 bits of the token are set to the length of the literal and the lower 4 bits to the length of the following match. If 4 bits are not enough to represent a length, i.e., the substring is longer than 15 bytes, the corresponding bits are set to 1111(2) and the length minus 15 is placed in the body, which follows the token. The body contains the literal data and the description of the match, which is comprised of a backward offset to the match and the length of the match. As the offset is encoded in 2 bytes, LZ4 can look back up to 64 KB to find a match. Additional matches immediately following a match are encoded in a similar way, except that the literal length field in the token is set to 0000(2) and the body omits the part for the literal.

III. PROPOSED LZ4M ALGORITHM

LZ4 focuses on compression and decompression speed with an acceptable compression ratio. However, as LZ4 is designed as a general-purpose compression algorithm, it does not utilize the inherent characteristics of in-memory data.

In-memory data consists of virtual memory pages that contain data from the stack and the heap regions of applications. The stack and the heap regions contain constants, variables, pointers, and arrays of basic types, which are usually structured and aligned to the word boundary of the system [6]. Based on this observation, we can expect that data compression can be accelerated without a significant loss of compression opportunity by scanning and matching data at word granularity. In addition, as the larger granularity requires fewer bits to represent the length and the offset, the substrings (i.e., literals and matches) can be encoded in a more concise form.

We revise the LZ4 algorithm to utilize these characteristics and propose the LZ4m algorithm, which stands for LZ4 for in-memory data. Figure 2(a) illustrates the modified compression scheme and the unit encoding of LZ4m. LZ4m uses the same scanning window and hash table as the original LZ4. In contrast to the original LZ4 algorithm, LZ4m scans the input stream and finds matches at a 4-byte granularity. If the hash table indicates that no prefix match exists, LZ4m advances the window by 4 bytes and repeats identifying the prefix match. As shown in Figure 2(a), after examining the prefix at position 8, the next position of the window is 12 instead of 9. If a prefix match is found, LZ4m compares subsequent data and finds the longest match at the 4-byte granularity as well. In the example, although there is a 6-byte match starting from position 12, LZ4m only considers the 4-byte match from position 12 to 15, and slides the scanning window forward by four bytes, to position 16 (Figure 2(a)-③).
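The following C sketch illustrates this modified scan loop. It is a minimal illustration under our reading of the paper, not the authors' code: lz4m_scan and emit_unit are assumed names, hash4 and the includes come from the earlier sketch, and the candidate validation a production implementation needs is reduced to a byte comparison.

    /* hash4() and HASH_SIZE as in the earlier sketch. */
    static uint32_t hash4(const uint8_t *p);

    /* Defined in the encoding sketch below (Section III). */
    static uint8_t *emit_unit(uint8_t *out, const uint8_t *lit, size_t lit_len,
                              size_t match_off, size_t match_len);

    /* Scan a page: advance in 4-byte steps, match in 4-byte units.
     * table has HASH_SIZE entries and is zeroed by the caller. */
    static uint8_t *lz4m_scan(const uint8_t *in, size_t len,
                              uint16_t *table, uint8_t *out)
    {
        size_t pos = 0, anchor = 0;      /* anchor: start of pending literal */

        while (pos + 4 <= len) {
            uint32_t h = hash4(in + pos);
            size_t cand = table[h];      /* candidate earlier position */
            table[h] = (uint16_t)pos;

            /* No prefix match: advance by 4 bytes, not 1 as in LZ4. */
            if (cand == pos || memcmp(in + cand, in + pos, 4) != 0) {
                pos += 4;
                continue;
            }

            /* Prefix match: extend the match in 4-byte units only. */
            size_t mlen = 4;
            while (pos + mlen + 4 <= len &&
                   memcmp(in + cand + mlen, in + pos + mlen, 4) == 0)
                mlen += 4;

            /* Emit literals in [anchor, pos) plus the (offset, length) pair. */
            out = emit_unit(out, in + anchor, pos - anchor, pos - cand, mlen);
            pos += mlen;
            anchor = pos;
        }
        /* Bytes in [anchor, len) remain as trailing literals. */
        return out;
    }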


We also optimize the encoding scheme to utilize the 4-byte granularity. As the offset is aligned to the 4-byte boundary and the length is a multiple of 4 bytes, the two least significant bits of these fields are always 00(2). Thus, every field representing a length or an offset (the literal length and the match length in the token, and the literal length, the match length, and the match offset in the body) can be shortened by 2 bits. Moreover, as LZ4m targets compressing 4 KB pages at a 4-byte granularity, the field for the match offset requires at most 10 bits. Consequently, we place the two most significant bits of the match offset in the token and put the remaining 8 bits in the body.
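To make the layout concrete, the sketch below packs one LZ4m encoding unit, completing the emit_unit helper assumed in the scan sketch above. The field widths follow Figure 2(b), but the bit order within the token and the single extension byte for oversized lengths are our assumptions; the original LZ4 chains further extension bytes for very long lengths.

    /* Pack one LZ4m unit: token = [offset bits 9-8 : 2][literal len : 3]
     * [match len : 3]. Offsets and lengths are multiples of 4, so their
     * two always-zero low bits are dropped before packing. */
    static uint8_t *emit_unit(uint8_t *out, const uint8_t *lit, size_t lit_len,
                              size_t match_off, size_t match_len)
    {
        size_t ll  = lit_len   >> 2;     /* stored in units of 4 bytes */
        size_t ml  = match_len >> 2;
        size_t off = match_off >> 2;     /* at most 10 bits in a 4 KB page */

        *out++ = (uint8_t)(((off >> 8) << 6) |         /* 2 MSBs of offset */
                           ((ll < 7 ? ll : 7) << 3) |  /* 3-bit literal length */
                            (ml < 7 ? ml : 7));        /* 3-bit match length */

        if (ll >= 7) *out++ = (uint8_t)(ll - 7);       /* extended literal length */
        memcpy(out, lit, lit_len);                     /* literal data */
        out += lit_len;
        *out++ = (uint8_t)(off & 0xff);                /* low 8 bits of offset */
        if (ml >= 7) *out++ = (uint8_t)(ml - 7);       /* extended match length */

        return out;
    }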

IV. EVALUATION

We evaluated the proposed LZ4m algorithm using real in-memory data obtained from a mobile device. We evaluated LZ4 and LZO1x as representatives of general-purpose algorithms, and WKdm as a specialized one. To collect in-memory data, particularly the kind that in-memory data compression schemes target, we collected data evicted from main memory via swapping. We set up a mobile system platform on a mobile system development board and configured the kernel to swap data to a custom block device emulated with an SSD development platform board. The custom block device stores swap data in a particular partition that can be examined later off-line. We collected 98,995 pages (386.7 MB) of swap data while launching 24 popular applications 512 times in total.

Figure 3 compares the algorithms in terms of their compression ratio and compression/decompression speed. Recall that the reported compression ratio is averaged over pages, and a smaller compression ratio implies a smaller compressed size for the same data. Also, the speeds are normalized to that of LZ4m. Compared to the general-purpose algorithms (i.e., LZ4 and LZO1x), LZ4m shows a comparable compression ratio, degraded by only up to 3%. However, LZ4m outperforms these algorithms in speed by up to 2.1× and 1.8× for compression and decompression, respectively. On the other hand, LZ4m outperforms WKdm significantly in compression ratio and decompression speed at the cost of a 21% slowdown in compression speed. These results suggest that LZ4m substantially improves the compression/decompression speed of LZ4 with a marginal loss of compression ratio.

Fig. 3. Compression ratio and compression/decompression speed of various compression algorithms. [Figure: (a) compression ratio of LZ4m, LZ4, LZO1x, and WKdm; (b) compression and decompression speed normalized to LZ4m.]

Figure 4 shows the cumulative distribution of compression ratio over pages. The compression ratio curve of LZ4m does not fall much behind those of the LZO and LZ4 algorithms. However, WKdm shows a distinctively poor compression ratio curve that lags far behind the other algorithms. In addition, 6.8% of pages cannot be compressed at all with WKdm, whereas less than 1% cannot with the others. This suggests that WKdm's speed-up in compression can be offset by its poor compression ratio.

Fig. 4. Cumulative distribution of compression ratio. [Figure: cumulative ratio of pages over compression ratio for LZ4m, LZ4, LZO1x, and WKdm.]

To further analyze the implication of the 4-byte granularity of the match offset and the match length, we counted the lengths of match substrings in the trace. Figure 5 compares the results from the original LZ4 and LZ4m as a cumulative count of matches over match length. The inset magnifies the original result for match lengths between 0 and 32. We can verify that the increased granularity decreases the total occurrence of matches by only 2.5%, which implies that the 4-byte granularity scheme has little influence on the opportunity to find a match. Also, the distribution of the match lengths shows that an n-byte match in the original scheme can be substituted with a ⌊n/4⌋ × 4-byte match. This suggests that the disadvantage of the 4-byte granularity in the match length is also marginal.


Fig. 5. Cumulative count of matches over match length. [Figure: cumulative match count (×10^6) over match lengths from 0 to 4096 bytes for LZ4 and LZ4m; the inset magnifies match lengths between 0 and 32.]

Fig. 7. Average decompression speed over compression ratio. [Figure: decompression time (microseconds) over compression ratio for LZ4m, LZ4, LZO1x, and WKdm.]

Figure 6 shows the relationship between the compression speed and the compression ratio of the algorithms. The speed is obtained by measuring the time to compress each page and averaging the times of pages having the same compression ratio. The times to compress well-compressed pages (pages with small compression ratio values) are similar across the algorithms. For badly-compressed pages (pages with large compression ratio values), however, LZ4m shows outstanding compression speed compared to LZ4 and LZO1x. This considerable speedup originates from the scanning process of LZ4m, which advances the scanning window by 4 bytes when no prefix match is found, expediting the scanning by four times on hardly-compressible pages.

Fig. 6. Average compression speed over compression ratio. [Figure: compression time (microseconds) over compression ratio for LZ4m, LZ4, LZO1x, and WKdm.]

Figure 7 shows the relationship between the decompression speed and the compression ratio of the algorithms. The speed is obtained in the same way as the average compression speed. We can confirm that LZ4m outperforms the other algorithms in decompression speed over almost the entire range of compression ratios. Interestingly, WKdm does not excel the other algorithms in decompression speed, especially for pages that are not compressed well, although it promises outstanding compression speed. This is a critical drawback for a compression algorithm for in-memory data, since decompression usually happens on performance-critical paths in memory subsystems, such as fault handling or swap-in.

V. CONCLUSION

We optimized a popular general-purpose compression algorithm by utilizing the inherent characteristics of in-memory data. Evaluation with real-life data confirms that the proposed LZ4m greatly boosts the compression/decompression speed without substantial loss in compression ratio. We plan to apply LZ4m to a real in-memory compression system and to measure its implications on overall system performance.

REFERENCES

[1] Samsung Electronics Co., "Galaxy Note 3." [Online]. Available: http://www.samsung.com/uk/consumer/mobile-devices/smartphones/galaxy-note/SM-N9005ZKEBTU
[2] Android, "Low RAM." [Online]. Available: https://source.android.com/devices/tech/low-ram.html
[3] International Data Corporation (IDC), "Worldwide quarterly mobile phone tracker," Aug. 2015. [Online]. Available: http://www.idc.com/tracker/showtrackerhome.jsp
[4] S. Arramreddy, D. Har, K.-K. Mak, R. B. Tremaine, and M. Wazlowski, "IBM "MXT" memory compression technology debuts in a ServerWorks northbridge," in Proceedings of Hot Chips 12, 2000.
[5] F. Douglis, "The compression cache: Using on-line compression to extend physical memory," in Proceedings of the Winter USENIX Conference, 1993, pp. 519–529.
[6] P. R. Wilson, S. F. Kaplan, and Y. Smaragdakis, "The case for compressed caching in virtual memory systems," in Proceedings of the 1999 USENIX Annual Technical Conference, June 1999, pp. 101–116.
[7] J.-l. Gailly and M. Adler. (2013) zlib: A massively spiffy yet delicately unobtrusive compression library. [Online]. Available: http://www.zlib.net/
[8] Y. Collet. (2013) LZ4: Extremely fast compression algorithm. [Online]. Available: http://code.google.com/p/lz4
[9] M. F. Oberhumer. (2011) LZO real-time data compression library. [Online]. Available: http://www.oberhumer.com/opensource/lzo
[10] J. Ziv and A. Lempel, "A universal algorithm for sequential data compression," IEEE Transactions on Information Theory, vol. 23, 1977.
