Nakamichi 'Dragoneye' highlights:

- The latest Zennish LZSS Microdeduplicator, 100% FREE;
- File-to-File [de]compressor;
- Superfast decompression rates, superslow compression rates;
- On big (500++MB) textual data, second only to Hamid's LzTurbo 29, ratiowise, resourcewise and speedwise - TRIPLE TRUMP :P;
- Single-threaded Non-SIMD console tool written in plain C, compilable under Windows and POSIX environments;
- An LZSS (Lempel–Ziv–Storer–Szymanski) implementation with Greedy Parsing and 1TB Sliding Window;
- Ability to deduplicate chunks as short as 64 bytes, up to 1TB backwards;
- Targets huge textual datasets (mainly English), weak-'n'-slow on binary data;
- One goal is to boost traversing (full-text parsing) of the whole XML dump of Wikipedia, ~64GB strong, via TRANSPARENT decompression;
- The first matchfinder using both the fastest memmem() Railgun ‘Trolldom’ and B-trees;
- The first parser using either Internal or External RAM, decided by a single command line option - 'i' or 'e';
- Hashpot/hashpool (residing in Physical RAM) can be tuned via command line parameter, thus lessening the B-trees' heights;
- The B-trees form the second layer, the first being a HASH table handled by FNV1A-Jesteress;
- The Leprechaunesque (Internal/External) B-trees of order 3 (2 keys MAX) are highly optimized;
- DEPRECATED (too slow): To keep the LEAF’s footprint small, keys 36/64 bytes long are hashed by SHA3-224, otherwise left intact;
- The building of B-trees is done in 128 PASSES, thus LOCALITY/LOCALIZATION leads to cache-friendliness; for example, instead of confusing/blinding the SSD controller with building 2^27 ~= 128M B-trees at a time, the 'PASSES' revision lowers the "noise/mayhem" 128 times by processing 1M B-trees at a time;
- SCALABLE! Gets faster when more Physical and/or External RAM is available; on servers with 1TB RAM (or desktops with 64GB and 1TB Optane SSD) it will dance...

HOMEPAGE: http://www.sanmayce.com/Nakamichi/index.html#DOWNLOAD
Downloadable at:
https://software.intel.com/en-us/forums/intel-moderncode-for-parallel-architectures/topic/520602#comment-1943095
https://gist.githubusercontent.com/Sanmayce/33e5047d45cdcb8e7711cd7d3ed52c7f/raw/d72e7126c8fbfde07c0d727dcb353b0267b8196c/Nakamichi_Ryuugan-ditto-1TB.c
https://community.centminmod.com/threads/a-lzss-microdeduplicator-tagetting-huge-texts-with-c-source.16427/#post-75533

How to compile?:

_MakeELF_Nakamichi_GCC.sh:
gcc -O3 -static -msse4.1 -fomit-frame-pointer Nakamichi_Ryuugan-ditto-1TB_btree.c -o Nakamichi_Ryuugan-ditto-1TB_btree.elf -D_N_XMM -D_N_prefetch_4096 -D_N_alone -DHashInBITS=24 -DHashChunkSizeInBITS=24 -DRAMpoolInKB=5120 -DBtreeHEURISTIC -D_POSIX_ENVIRONMENT_ -DLongestLineInclusive=64

_MakeEXE_Nakamichi_GCC.bat:
gcc -O3 -msse4.1 -fomit-frame-pointer Nakamichi_Ryuugan-ditto-1TB_btree.c -o Nakamichi_Ryuugan-ditto-1TB_RAM_(5GB)_GCC730.exe -D_N_XMM -D_N_prefetch_4096 -D_N_alone -D_N_HIGH_PRIORITY -DHashInBITS=24 -DHashChunkSizeInBITS=24 -DRAMpoolInKB=5120 -DBtreeHEURISTIC -D_WIN32_ENVIRONMENT_ -DLongestLineInclusive=64

Corpus ‘XML’:

E:\Nakamichi_2019-Aug-06>Nakamichi_Ryuugan-ditto-1TB_btree.exe

[ASCII-art banner printed by the program; the character layout did not survive extraction]

Nakamichi 'Ryuugan-ditto-1TB', written by Kaze, inspired by Haruhiko Okumura sharing, based on Nobuo Ito's LZSS source, babealicious suggestion by m^2 enforced, muffinesque suggestion by Jim Dempsey enforced.
Note0: Nakamichi 'Dragoneye' is 100% FREE, licenseless that is.
Note1: Hamid Buzidi's LzTurbo ([a] FASTEST [Textual] Decompressor, Levels 19/29/39) retains kingship, his TurboBench (2017-Apr-07) proves the supremacy of LzTurbo, Turbo-Amazing!
Note2: Conor Stokes' LZSSE2 ([a] FASTEST Textual Decompressor, Level 17) is embedded, all credits along with many thanks go to him.
Note3: The matchfinder is either 'Railgun_Trolldom' (matches longer than 18, except 36 and 64) or Leprechaun's B-tree order 3.
Note4: Instead of '_mm_loadu_si128' '_mm_lddqu_si128' is used.
Note5: Maximum compression ratio is 44:1, for 704 bytes long matches within 1TB Sliding Window.
Note6: Please send me (at [email protected]) decompression results obtained on machines with fast CPU-RAM subsystems.
Note7: In this compile, clock() was replaced with time() - to counter bigtime stats misreporting.
Note8: Multi-way hashing allows each KeySize to occupy its own HASH pool, thus less RAM is in use - the LEAF is smaller.
Note9: In this revision, B-tree heuristics are in use, allowing skipping many unnecessary memmem() invocations.
NoteA: The file being compressed should be 64 bytes or longer due to Building-Blocks being in range 4..18, 36, 64.
NoteB: In this compile, the keysizes in the LEAF are not HEXed i.e. not doubled.
NoteC: In this latest (2019-Aug-06) compile, keysizes 36/64 are no longer hashed with SHA3-224, it is slow for this case.
Syntax: Nakamichi infile [outfile hashsize treesize treetype]
        hashsize - hash pool in bits, 0..32, 0 meaning 2^0=1 B-tree per keysize
        treesize - B-trees pool in MB
        treetype - i|e or I|E, meaning (Internal|External) or (Internal|External but building B-trees in 128 passes)
Example1: Nakamichi OSHO.TXT
Example2: Nakamichi OSHO.TXT.Nakamichi
Example3: Nakamichi OSHO.TXT OSHO.TXT.Nakamichi 24 49000 i
Note1: Example above uses (8x2^24)x10 bytes for hash and ~48GB for B-trees of physical RAM.
Note2: The bigger the hash pool, the lesser B-tree tiering, i.e. significantly faster the compression is.
Note3: The 'outfile' name is a dummy, it always is 'infile'+'.Nakamichi', not a bug, just enforcing avoidance of filename mayhem.

Large Text Compression Benchmark
Matt Mahoney, Last update: July 25, 2019
http://mattmahoney.net/dc/text.html

Compression                     Compressed size          Decompresser  Total size   Time (ns/byte)
Program                Options  enwik8      enwik9       size (zip)    enwik9+prog  Comp    Decomp  Mem    Alg   Note

------E:\Nakamichi_2019-Aug-06> ------

phda9 1.8                       15,010,414  116,544,849  42,944 xd     116,587,793  86182   86305   6319   CM    83
cmix v17                        14,877,373  116,394,271  208,263 s     116,602,534  641189  645651  25258  CM    83
...
cabarc 1.00.0601       -m :21   28,465,607  250,756,595  51,917 xd     250,808,853  1619    15      20     LZ77
sr3                             28,926,691  253,031,980  9,399 s       253,054,625  148     160     68     SR    26
bzip2 1.0.2            -9       29,008,736  253,977,839  30,036 x      254,007,875  379     129     8      BWT
...
libzling 20160107      e4       29,721,114  259,475,639  35,582 s      259,511,221  83      27      28     ROLZ  48
...
lzc v0.08              10       30,611,315  266,565,255  11,364 x      266,576,619  302     63      550    LZ77
Nakamichi 'Dragoneye'           32,917,888  277,293,058  112,899       277,405,957          1.3            LZSS  85
crush 1.00             cx       31,731,711  279,491,430  2,489 s       279,493,919  948     2.9     148    LZ77  60
xeloz 0.3.5.3          c889     32,441,272  283,621,211  18,771 s      283,639,982  1079    8       230    LZ77  48
bzp 0.2                         31,563,865  283,908,295  36,808 x      283,945,103  110     120     3      LZP
ha 0.98                a2       31,250,524  285,739,328  28,404 x      285,767,732  2010    1800    0.8    PPM
ulz 0.06               c9       32,945,292  291,028,084  49,450 x      291,077,534  325     1.1     490    LZ77  82

Nakamichi_2019-Aug-06.zip (112,899 bytes): https://drive.google.com/file/d/1wQyl7MhUXDtr-ZBxwwN6n1KRbo5axa6B/view?usp=sharing
enwik8.Nakamichi (32,917,888 bytes): https://drive.google.com/file/d/1IqeHzpzoHZGvMkUbGRxnuiqCAHZ-eO3L/view?usp=sharing
enwik9.Nakamichi (277,293,058 bytes): https://drive.google.com/file/d/1f1NJjwPXCO8FvnQ-7nzY_I4FEoKdq0kW/view?usp=sharing

60. Tested by Ilia Muravyov on an Intel Core i7-3770K, 4.8 GHz, 16 GB Corsair Vengeance LP 1800 MHz CL9, Corsair Force GS 240 GB SSD, Windows 7 SP1.
82. Tested by Ilia Muraviev on an Intel Core i7-4790K @ 4.6GHz, 32GB @ 1866MHz DDR3 RAM, RAMDisk.
85. Tested by Georgi Marinov on i5-7200U @ 3.1GHz, 8GB @ 2133MHz DDR4 RAM, Windows 10.

Nakamichi ‘The-Eye-of-the-Dragon’ sets a Pareto efficiency (OPEN-SOURCE), only Oodle ‘Mermaid’ and LzTurbo 29 outperform Nakamichi, they set the REAL PARETO FRONTIER!

Decompression rate in nanoseconds per byte, 1.3ns/B:
enwik9.Nakamichi 725MB/s is 725x1024x1024B per 1,000,000,000ns
enwik9 1,000,000,000B per (1,000,000,000B/(725x1024x1024B))x1,000,000,000ns = 1,315,412,850ns
Or, Nakamichi decompresses enwik9 in 1.3s on a laptop.

Needed memory for Compression:   4N         + 293N                + (HASHPOOL=5N) <= 302N or 302GB
                                 [Physical]   [Physical|External]   [Physical]
Needed memory for Decompression: <= 2N or 2GB
Compression Rate (Nakamichi.exe enwik9 enwik9.Nakamichi 26 370000 E): 122B/s or 1000000000/122/3600/24 =~ 95 days

D:\TEXTORAMIC_benchmarking>lzbench173 -c4 -i1,15 -o3 -ecrush,2/lzo1b,999/libdeflate,12/lizard,19,29,49/lzfse/lzsse2,17/xpack,9/,9 enwik9
lzbench 1.7.3 (64-bit Windows) Assembled by P.Skibinski
Compressor name                Compress.  Decompress. Orig. size  Compr. size Ratio   Filename
Nakamichi 'Dragoneye'                     725 MB/s                277293058   ! Outside lzbench !
crush 1.0 -2                   0.35 MB/s  269 MB/s    1000000000  279083341   27.91   enwik9
xpack 2016-06-02 -9            10 MB/s    746 MB/s    1000000000  300716430   30.07   enwik9
libdeflate 0.7 -12             5.65 MB/s  559 MB/s    1000000000  310824785   31.08   enwik9
lizard 1.0 -49                 1.38 MB/s  1046 MB/s   1000000000  318854201   31.89   enwik9
lzfse 2017-03-08               47 MB/s    596 MB/s    1000000000  319756993   31.98   enwik9
zlib 1.2.11 -9                 17 MB/s    253 MB/s    1000000000  322789230   32.28   enwik9
lizard 1.0 -29                 1.44 MB/s  1191 MB/s   1000000000  323348239   32.33   enwik9
lzsse2 2016-05-14 -17          2.26 MB/s  2923 MB/s   1000000000  340270593   34.03   enwik9
lzo1b 2.09 -999                12 MB/s    459 MB/s    1000000000  363178533   36.32   enwik9
lizard 1.0 -19                 5.39 MB/s  2507 MB/s   1000000000  372092974   37.21   enwik9

♦ Home of lzbench: https://github.com/inikep/lzbench
♦ Home of TurboBench: https://github.com/powturbo/TurboBench
♦ Home of The-Eye-of-the-Dragon: http://www.sanmayce.com/Nakamichi/Nakamichi_2019-Aug-06.zip; The B-tree boosted variant: https://community.centminmod.com/posts/75533/
♦ Testmachine: Laptop 'Compressionette' Lenovo Ideapad 310; i5-7200u @2.5GHz; 8GB DDR4 @1066MHz (2133MHz) CL15 CR2T; L2 cache: 2x256KB; L3 cache: 3MB
♦ Starfox – Kaze’s Superfast Decompression Textual Showdown, Ryuugan vs ‘Usual Suspects’; update: 2019-Aug-09; tester: Kaze, https://twitter.com/Sanmayce

D:\TEXTORAMIC_benchmarking>lzbench173 -c4 -i1,15 -o3 -etornado,16/blosclz,9/brieflz/crush,2/csc,5/density,3/fastlz,2/gipfeli/lzo1b,999/lzham,4/lzham24,4/libdeflate,1,12/lz4hc,1,10,12/lizard,19,29,39,49/lzf,1/lzfse/lzg,9/lzham,1//lzlib,9/lzma,9/,5/lzsse2,17/lzsse4,17/lzsse8,17/lzvn/pithy,9/quicklz,3//slz_zlib,3/ucl_nrv2b,9/ucl_nrv2d,9/ucl_nrv2e,9/xpack,1,9/xz,9/yalz77,12/yappy,99/zlib,1,5,9/zling,4/shrinker/wflz/lzmat enwik9
lzbench 1.7.3 (64-bit Windows) Assembled by P.Skibinski
Compressor name                Compress.  Decompress. Orig. size  Compr. size Ratio   Filename
csc 2016-10-13 -5              2.33 MB/s  59 MB/s     1000000000  213296889   21.33   enwik9
lzma 16.04 -9                  1.12 MB/s  81 MB/s     1000000000  213337819   21.33   enwik9
xz 5.2.3 -9                    1.19 MB/s  79 MB/s     1000000000  213337866   21.33   enwik9
lzham 1.0 -d26 -4              0.81 MB/s  188 MB/s    1000000000  215673584   21.57   enwik9
lzlib 1.8 -9                   1.07 MB/s  57 MB/s     1000000000  216832124   21.68   enwik9
tornado 0.6a -16               1.31 MB/s  175 MB/s    1000000000  217735325   21.77   enwik9
lzham24 1.0 -4                 1.01 MB/s  188 MB/s    1000000000  227368900   22.74   enwik9
zling 2016-01-10 -4            27 MB/s    144 MB/s    1000000000  259449164   25.94   enwik9
lzham 1.0 -d26 -1              1.97 MB/s  191 MB/s    1000000000  259506946   25.95   enwik9
Nakamichi 'Ryuugan-ditto-1TB'             725 MB/s                277293058   ! outside lzbench !
crush 1.0 -2                   0.35 MB/s  269 MB/s    1000000000  279083341   27.91   enwik9
xpack 2016-06-02 -9            10 MB/s    746 MB/s    1000000000  300716430   30.07   enwik9
libdeflate 0.7 -12             5.65 MB/s  559 MB/s    1000000000  310824785   31.08   enwik9
lizard 1.0 -49                 1.38 MB/s  1046 MB/s   1000000000  318854201   31.89   enwik9
lzfse 2017-03-08               47 MB/s    596 MB/s    1000000000  319756993   31.98   enwik9
zlib 1.2.11 -9                 17 MB/s    253 MB/s    1000000000  322789230   32.28   enwik9
lizard 1.0 -29                 1.44 MB/s  1191 MB/s   1000000000  323348239   32.33   enwik9
zlib 1.2.11 -5                 30 MB/s    249 MB/s    1000000000  327365805   32.74   enwik9
ucl_nrv2e 1.03 -9              1.30 MB/s  248 MB/s    1000000000  332405521   33.24   enwik9
ucl_nrv2d 1.03 -9              1.30 MB/s  250 MB/s    1000000000  335533150   33.55   enwik9
lzsse2 2016-05-14 -17          2.26 MB/s  2923 MB/s   1000000000  340270593   34.03   enwik9
ucl_nrv2b 1.03 -9              1.27 MB/s  242 MB/s    1000000000  341785796   34.18   enwik9
lzsse4 2016-05-14 -17          2.51 MB/s  3123 MB/s   1000000000  344520599   34.45   enwik9
lzsse8 2016-05-14 -17          2.33 MB/s  3032 MB/s   1000000000  346637623   34.66   enwik9
libdeflate 0.7 -1              122 MB/s   587 MB/s    1000000000  355647167   35.56   enwik9
xpack 2016-06-02 -1            114 MB/s   575 MB/s    1000000000  358741520   35.87   enwik9
lzo1b 2.09 -999                12 MB/s    459 MB/s    1000000000  363178533   36.32   enwik9
lzmat 1.01                     31 MB/s    255 MB/s    1000000000  367262723   36.73   enwik9
lz4hc 1.8.0 -12                11 MB/s    2127 MB/s   1000000000  371677964   37.17   enwik9
lizard 1.0 -19                 5.39 MB/s  2507 MB/s   1000000000  372092974   37.21   enwik9
lz4hc 1.8.0 -10                20 MB/s    2145 MB/s   1000000000  373026340   37.30   enwik9
zlib 1.2.11 -1                 71 MB/s    245 MB/s    1000000000  378355076   37.84   enwik9
lizard 1.0 -39                 5.27 MB/s  2337 MB/s   1000000000  378912990   37.89   enwik9
lzg 1.0.8 -9                   0.84 MB/s  447 MB/s    1000000000  386976301   38.70   enwik9
brieflz 1.1.0                  86 MB/s    135 MB/s    1000000000  388557049   38.86   enwik9
yalz77 2015-09-19 -12          17 MB/s    269 MB/s    1000000000  394059341   39.41   enwik9
quicklz 1.5.0 -3               39 MB/s    589 MB/s    1000000000  395494056   39.55   enwik9
lzvn 2017-03-08                39 MB/s    708 MB/s    1000000000  395814407   39.58   enwik9
lz4hc 1.8.0 -1                 86 MB/s    2030 MB/s   1000000000  400818043   40.08   enwik9
gipfeli 2016-07-13             205 MB/s   318 MB/s    1000000000  411940549   41.19   enwik9
density 0.12.5 beta -3         252 MB/s   263 MB/s    1000000000  432914184   43.29   enwik9
lzrw 15-Jul-1991 -5            101 MB/s   355 MB/s    1000000000  436682071   43.67   enwik9
pithy 2011-12-24 -9            201 MB/s   1200 MB/s   1000000000  437729417   43.77   enwik9
slz_zlib 1.0.0 -3              164 MB/s   213 MB/s    1000000000  478256185   47.83   enwik9
fastlz 0.1 -2                  210 MB/s   370 MB/s    1000000000  487260752   48.73   enwik9
yappy 2014-03-22 -99           72 MB/s    1574 MB/s   1000000000  492824491   49.28   enwik9
lzf 3.6 -1                     214 MB/s   458 MB/s    1000000000  492987190   49.30   enwik9
blosclz 2015-11-10 -9          170 MB/s   634 MB/s    1000000000  498688572   49.87   enwik9
snappy 1.1.4                   214 MB/s   889 MB/s    1000000000  507860747   50.79   enwik9
wflz 2015-09-16                165 MB/s   572 MB/s    1000000000  559915321   55.99   enwik9
lzjb 2010                      176 MB/s   329 MB/s    1000000000  665072021   66.51   enwik9
shrinker 0.1                   197 MB/s   5596 MB/s   1000000000  968576855   96.86   enwik9
memcpy                         11010 MB/s 10010 MB/s  1000000000  1000000000  100.00  enwik9

D:\TEXTORAMIC_benchmarking>"turbobench_v18.05_-_build_04_May_2018" enwik9 -esnappy_c/yappy/bzip2/lzlib,9d30fb273/lzham,4fb258:x4:d30/lzma,9d30:fb273:mf=bt4/oodle,89,91,95,99,111,115,119,129,139/lzturbo,19,12,10,29,22,20,39,32,30,59/,11d30/zstd,1,5,12,22,22d30/lizard,11,19,21,29,31,39,41,49/trle -I3 -J31 -k1 -B2G
Compr.size Ratio Comp.   Decomp.  Compressor              Filename
           (%)   (MB/s)  (MB/s)
167396049  16.7  6.20    23.19    lzturbo 59              enwik9
198005976  19.8  0.28    87.13    lzma 9d30:fb273:mf=bt4  enwik9
199103934  19.9  0.22    233.04   brotli 11d30            enwik9
202142749  20.2  0.03    610.95   lzturbo 39              enwik9
205667751  20.6  0.94    559.23   zstd 22d30              enwik9
206978999  20.7  0.18    450.44   oodle 139 ‘Leviathan’   enwik9
207970574  20.8  0.14    544.96   oodle 129 ‘Hydra’       enwik9
207973249  20.8  0.21    546.08   oodle 89 ‘Kraken’       enwik9
215393166  21.5  1.14    581.24   zstd 22                 enwik9
252380843  25.2  31.79   609.15   lzturbo 32              enwik9
253977895  25.4  8.08    23.56    bzip2                   enwik9
262776183  26.3  0.27    1308.03  oodle 99 ‘Mermaid’      enwik9
265937799  26.6  0.47    1342.20  oodle 95 ‘Mermaid’      enwik9
271849741  27.2  7.55    536.40   zstd 12                 enwik9
277293058                725      Nakamichi 'Ryuugan-ditto-1TB' ! outside turbobench !
277467220  27.7  1.41    772.44   lizard 49               enwik9
305640093  30.6  87.16   625.00   zstd 5                  enwik9
322789234  32.3  16.32   245.60   zlib 9                  enwik9
323348243  32.3  1.47    1183.03  lizard 29               enwik9
327365809  32.7  27.68   242.01   zlib 5                  enwik9
330778607  33.1  30.67   973.07   lzturbo 22              enwik9
331985915  33.2  0.28    1983.41  oodle 119 ‘Selkie’      enwik9
334059045  33.4  0.50    2033.86  oodle 115 ‘Selkie’      enwik9
334314023  33.4  5.45    1536.01  lizard 39               enwik9
346761203  34.7  167.92  947.82   lzturbo 30              enwik9
358200480  35.8  106.75  1554.31  oodle 91 ‘Mermaid’      enwik9
358512497  35.9  244.64  823.24   zstd 1                  enwik9
368598857  36.9  126.30  905.38   lizard 41               enwik9
371850400  37.2  0.02    3018.56  lzturbo 19              enwik9
372092978  37.2  5.85    2274.69  lizard 19               enwik9
378355080  37.8  59.19   238.50   zlib 1                  enwik9
396080892  39.6  49.13   2977.63  lzturbo 12              enwik9
417363747  41.7  151.17  1628.11  lizard 31               enwik9
448489061  44.8  154.99  1662.09  lizard 21               enwik9
449484382  44.9  196.15  2322.59  lizard 11               enwik9
461711589  46.2  116.72  2864.88  oodle 111 ‘Selkie’      enwik9
474984506  47.5  314.25  1996.00  lzturbo 20              enwik9
494865421  49.5  75.58   1541.27  yappy                   enwik9
504440877  50.4  353.25  2740.71  lzturbo 10              enwik9
507711414  50.8  308.63  923.10   snappy_c                enwik9
ERROR at 0:3c, c3                 lzham 4fb258:x4:d30     enwik9
ERROR at 0:3c, c3                 lzlib 9d30fb273         enwik9
VirtualAlloc failed               lzturbo 29              enwik9
VirtualAlloc failed               lzturbo 49              enwik9

Notes:
- Latest available ‘oo2core_6_win64.dll’ was used;
- The tweak value spacespeedtradeoff=[64..1024] “tweak size vs decode time” has not been played with!
- Oodle Levels: HyperFast = -4..-1; None = 0; SuperFast = 1; VeryFast = 2; Fast = 3; Normal = 4; Optimal1,2,3,4 = 5,6,7,8;
- Oodle Compressors: Kraken = 8 (Default); Leviathan = 13 (Best); Mermaid = 9 (Crazy fast); Selkie = 11 (Fastest); Hydra = 12 (Tuneable composite of above);

Sadly, the 8GB RAM plus 96000MB “Virtual” RAM were not enough to show my favorite compressor mode LzTurbo 29 performing :(

D:\TEXTORAMIC_benchmarking_2019-Aug-06>sha1sum enwik9
2996e86fb978f93cca8f566cc56998923e7fe581 enwik9

153,626,112 enwik9.method511.zpaq | "zpaq_v7.05_x64.exe" add enwik9.method511.zpaq enwik9 -method 511 -threads 1 |
165,862,216 enwik9.ST0Block1024.bsc | "bsc_v3.1.0_x64.exe" e enwik9 enwik9.ST0Block1024.bsc -b1024 -m0 -cp -Tt |
174,271,235 enwik9.512M.rz | rz_1.01.exe a -d 512M enwik9.512M.rz enwik9 | Note: RAZOR fits nicely in 8GB (the footprint is: Virtual Memory = 6484 MB), moreover it COMBINES deduplicator and compressor :)
184,182,264 enwik9.ST6Block1024.bsc | "bsc_v3.1.0_x64.exe" e enwik9 enwik9.ST6Block1024.bsc -b1024 -m6 -cp -Tt |
197,114,230 enwik9.O16.PPMd_varI | PPMd_varI_rev2_Intel15_32bit.exe e -o16 -m256 -fenwik9.O16.PPMd_varI enwik9 |
197,232,198 enwik9.O6.PPMd_varI | PPMd_varI_rev2_Intel15_32bit.exe e -o6 -m256 -fenwik9.O6.PPMd_varI enwik9 |
200,302,581 enwik9.MX9Dict1024.7z | "7za_x64_v1900.exe" a -t7z -mx9 -md=30 enwik9.MX9Dict1024.7z enwik9 |
200,332,976 enwik9.L9Dict1024.xz | "xz_v5.2.3_x64.exe" -z -k -f -9 -e -v -v --lzma2=dict=1024MiB --threads=1 enwik9 |
205,362,522 enwik9.2GB.L22.zst | zstd-v1.4.2-win64.exe --ultra -22 --zstd=wlog=31,clog=30,hlog=30,slog=26 enwik9 |
232,248,535 enwik9.rar560_m5_m1g | rar-x64-560.exe a -m5 -ma5 -md1g enwik9.rar560_m5_m1g enwik9 |
243,538,554 enwik9.ST3Block1024.bsc | "bsc_v3.1.0_x64.exe" e enwik9 enwik9.ST3Block1024.bsc -b1024 -m3 -cp -Tt |
244,711,837 enwik9.method211.zpaq | "zpaq_v7.05_x64.exe" add enwik9.method211.zpaq enwik9 -method 211 -threads 1 |
277,293,058 enwik9.Nakamichi | "Nakamichi_Ryuugan-ditto-1TB_btree.exe" enwik9 enwik9.Nakamichi 31 300123 i | Note: This Fastest (ALL-IN-RAM) compression mode needs 4*1GB + 2^31*(8*10)B + 300123MB =~ 464GB Physical RAM :P
277,293,058 enwik9.Nakamichi | "Nakamichi_Ryuugan-ditto-1TB_btree.exe" enwik9 enwik9.Nakamichi 26 300123 E | Note: This Hybrid compression mode needs 4*1GB + 2^26*(8*10)B =~ 9GB Internal RAM plus 300GB External RAM :(
310,706,136 enwik9.MX9.zip | "7za_x64_v1900.exe" a -tgzip -mx9 enwik9.MX9.zip enwik9 |
372,443,347 enwik9.12.lz4 | lz4_v1_9_1_win64.exe -12 enwik9 enwik9.12.lz4 |

D:\TEXTORAMIC_benchmarking_2019-Aug-06>zstd-v1.4.2-win64.exe -b1e22 -i9 --priority=rt "enwik9"
Note : switching to real-time priority
 1#enwik9 :1000000000 -> 357408193 (2.798), 256.4 MB/s , 749.8 MB/s
 2#enwik9 :1000000000 -> 329020812 (3.039), 165.7 MB/s , 679.9 MB/s
 3#enwik9 :1000000000 -> 313417240 (3.191), 117.8 MB/s , 650.6 MB/s
 4#enwik9 :1000000000 -> 307416113 (3.253), 98.7 MB/s , 623.6 MB/s
 5#enwik9 :1000000000 -> 301594332 (3.316), 61.9 MB/s , 600.2 MB/s
 6#enwik9 :1000000000 -> 295044831 (3.389), 40.4 MB/s , 616.2 MB/s
 7#enwik9 :1000000000 -> 284665291 (3.513), 30.2 MB/s , 658.1 MB/s
 8#enwik9 :1000000000 -> 280669567 (3.563), 24.2 MB/s , 684.8 MB/s
 9#enwik9 :1000000000 -> 278213009 (3.594), 17.1 MB/s , 691.7 MB/s
Nakamichi 'Ryuugan-ditto-1TB' 277293058 725 MB/s ! outside zstdbench !
10#enwik9 :1000000000 -> 273413148 (3.657), 12.3 MB/s , 644.4 MB/s
11#enwik9 :1000000000 -> 270882974 (3.692), 9.65 MB/s , 622.2 MB/s
12#enwik9 :1000000000 -> 268803119 (3.720), 5.87 MB/s , 622.9 MB/s
13#enwik9 :1000000000 -> 265614307 (3.765), 5.75 MB/s , 652.3 MB/s
14#enwik9 :1000000000 -> 260814193 (3.834), 4.60 MB/s , 598.1 MB/s
15#enwik9 :1000000000 -> 257582679 (3.882), 3.52 MB/s , 547.9 MB/s
16#enwik9 :1000000000 -> 249939606 (4.001), 3.22 MB/s , 639.5 MB/s
17#enwik9 :1000000000 -> 242716889 (4.120), 2.35 MB/s , 562.4 MB/s
18#enwik9 :1000000000 -> 239533930 (4.175), 1.99 MB/s , 480.5 MB/s
19#enwik9 :1000000000 -> 235586643 (4.245), 1.66 MB/s , 449.1 MB/s
20#enwik9 :1000000000 -> 226026294 (4.424), 1.41 MB/s , 590.3 MB/s
21#enwik9 :1000000000 -> 220258348 (4.540), 1.23 MB/s , 592.8 MB/s
22#enwik9 :1000000000 -> 215052701 (4.650), 1.10 MB/s , 591.8 MB/s

D:\TEXTORAMIC_benchmarking_2019-Aug-06>lz4_v1_9_1_win64.exe -b1e12 -i9 --no-frame-crc enwik9
Benchmarking levels from 1 to 12
 1#enwik9 :1000000000 -> 509196023 (1.964), 381.6 MB/s ,2895.8 MB/s
 2#enwik9 :1000000000 -> 509196023 (1.964), 381.5 MB/s ,2895.2 MB/s
 3#enwik9 :1000000000 -> 387882359 (2.578), 67.2 MB/s ,2671.4 MB/s
 4#enwik9 :1000000000 -> 380593309 (2.627), 55.5 MB/s ,2707.0 MB/s
 5#enwik9 :1000000000 -> 376871181 (2.653), 45.7 MB/s ,2732.1 MB/s
 6#enwik9 :1000000000 -> 375209520 (2.665), 39.0 MB/s ,2755.6 MB/s
 7#enwik9 :1000000000 -> 374513912 (2.670), 34.7 MB/s ,2760.1 MB/s
 8#enwik9 :1000000000 -> 374223597 (2.672), 32.0 MB/s ,2758.2 MB/s
 9#enwik9 :1000000000 -> 374083516 (2.673), 30.0 MB/s ,2757.5 MB/s
10#enwik9 :1000000000 -> 372315756 (2.686), 19.0 MB/s ,2763.1 MB/s
11#enwik9 :1000000000 -> 371711026 (2.690), 15.9 MB/s ,2754.1 MB/s
12#enwik9 :1000000000 -> 371677964 (2.691), 13.6 MB/s ,2716.0 MB/s

My TOP 4 (textual-decompression-speed-wise):
- Hamid’s LzTurbo 29;
- Conor’s LZSSE2 17;
- Bloom’s Oodle 99 ‘Mermaid’;
- Przemyslaw’s and Yann’s Lizard 49.

In order to enrich the experience in the textual realm, along with the classic XML type, DNA and HTM are to be included; IT IS INTERESTING WHETHER the ROSTERS CHANGE ACROSS THESE 3 corpora ...

Corpus ‘DNA’: SILVA_132_SSURef_Nr99_tax_silva.fasta (1,108,994,702 bytes):

Corpus ‘HTM’: Sacred_Texts_7_(97830_htm_files).tar (1,198,966,784 bytes):

John Bruno Hare (JBH or Bruno), the founder and architect of ISTA (Internet Sacred Text Archive), passed away on April 27, 2010 after a four-year battle with melanoma. JBH’s life mission was to keep the archive free and available worldwide, forever, and ISTA is his legacy. Bruno’s efforts placed this website, sacred-texts.com, among the top 10,000 read websites in the United States, and among the top 20,000 read websites of the entire Internet. He dedicated ISTA to religious tolerance and scholarship, calling it a “QUIET PLACE IN CYBERSPACE.”
