IBM z15 Hardware Compression
The Integrated Accelerator for zEnterprise Data Compression

November 2020 —

Andreas Krebbel

Compiler & Toolchain for Linux on Z

November 11, 2020

Compression Algorithm

Compression: uncompressed → LZ77 deduplication → Huffman encoding → compressed

Decompression: compressed → Huffman decoding → LZ77 expansion → uncompressed

LZ77 (Lempel-Ziv 77)

● Root of an entire family of compression methods.
● Variations can be found in almost all modern compression algorithms.
● Find duplicate strings and replace them with back-references consisting of <distance,length> tuples:

a ship shipping ship is shipping ships

a ship<5,5>ping<14,6>is<17,14>s
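The example above can be checked with a small decoder. This is a minimal sketch, assuming each tuple means <distance back from the current position, number of characters to copy>, which is the reading that reproduces the original sentence. The token list and the function name `lz77_decode` are illustrative, not part of any real Deflate API.

```python
def lz77_decode(tokens):
    """Expand literals (str) and (distance, length) back-references."""
    out = []
    for token in tokens:
        if isinstance(token, str):
            out.extend(token)               # literal characters are copied as-is
        else:
            distance, length = token
            start = len(out) - distance     # where the referenced text begins
            for i in range(length):
                # Copy one character at a time so that a reference may
                # overlap the text it is currently producing.
                out.append(out[start + i])
    return "".join(out)


# The encoded form from the slide, written as a token list.
tokens = ["a ship", (5, 5), "ping", (14, 6), "is", (17, 14), "s"]
decoded = lz77_decode(tokens)
print(decoded)
```

Copying character by character matters for overlapping references such as `["ab", (2, 6)]`, where the copy reads characters it has just written.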

Huffman Encoding

● Find compact bit encodings of the values depending on their frequency.
● Applied to characters and back-references from LZ77.
● The mapping from bit patterns to values becomes part of the output file.

Example: "Hello World"

Char    Freq    Fixed code    Bits    Huffman code    Bits
" "     1       000           3       1111            4
d       1       001           3       1110            4
e       1       010           3       010             3
H       1       011           3       011             3
l       3       100           9       10              6
o       2       101           6       00              4
r       1       110           3       1100            4
W       1       111           3       1101            4
Total                         33                      32
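The totals in the "Hello World" table can be reproduced with a short script. This is a sketch of the textbook Huffman construction, not the code-length-limited variant Deflate actually uses; the helper name `huffman_code_lengths` is made up for illustration. The individual code lengths may differ from the table when ties are broken differently, but the total bit count of an optimal code is the same.

```python
import heapq
from collections import Counter


def huffman_code_lengths(text):
    """Return {symbol: code length in bits} for a Huffman code of `text`."""
    freq = Counter(text)
    # Heap entries: (weight, unique tie breaker, {symbol: depth so far}).
    heap = [(w, i, {sym: 0}) for i, (sym, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, a = heapq.heappop(heap)      # the two lightest subtrees ...
        w2, _, b = heapq.heappop(heap)
        # ... are merged; every symbol below the new node gets one bit deeper.
        merged = {sym: depth + 1 for sym, depth in {**a, **b}.items()}
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


text = "Hello World"
freq = Counter(text)
lengths = huffman_code_lengths(text)
huffman_bits = sum(freq[sym] * lengths[sym] for sym in freq)
fixed_bits = 3 * len(text)                  # 8 distinct symbols need 3 bits each
print(fixed_bits, huffman_bits)             # prints "33 32"
```

The saving is small here because the text is short and the frequencies are nearly uniform; only "l" and "o" occur more than once.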

IBM Integrated Accelerator for zEnterprise Data Compression

● One accelerator per chip (1 per 12 cores)
● Full hardware implementation of Deflate compression and decompression, providing a huge speedup over software.
● New instruction to interface with the accelerator → DFLTCC
● Synchronous execution → Core using accelerator is blocked during execution.

zip, gzip, and zlib

● The ZIP file format and the Deflate algorithm were invented by Phil Katz for his pkzip compression tool.
● 1990: The Info-ZIP group implemented free versions of Deflate as zip/unzip.
● Early 1990s: The gzip format was developed. gzip's Deflate implementation was derived from the Info-ZIP tools.
● The PNG format was developed in response to the GIF patent dispute. To make Deflate usable there, the gzip implementation was turned into a library → zlib

● gzip itself does NOT use zlib. It still uses its own implementation.
● gzip is A) a tool and B) a file format.
● tar does NOT use zlib. It uses gzip via pipes.

zlib/gzip Configuration

● By default only compression level 1 is accelerated!
● Specify a compression level of 1 in zlib API calls to enable HW compression: deflateInit, deflateParams, gzopen, ...
● Override zlib/gzip default behavior using environment variables:
  – DFLTCC=0: Disable hardware compression and decompression. Default is 1.
  – DFLTCC_LEVEL_MASK=0x1fe: Bit mask indicating the compression levels to be accelerated. Hardware decompression is not affected.
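The mask value can be derived rather than memorized. The sketch below assumes that bit n of DFLTCC_LEVEL_MASK stands for compression level n; this is inferred from the deck, where 0x1fe is the value given for levels 1-8, and is not taken from the zlib sources. The function name `dfltcc_level_mask` is hypothetical.

```python
def dfltcc_level_mask(levels):
    """Build a level bit mask, assuming bit n represents compression level n."""
    mask = 0
    for level in levels:
        if not 0 <= level <= 9:
            raise ValueError(f"not a zlib compression level: {level}")
        mask |= 1 << level
    return mask


print(hex(dfltcc_level_mask(range(1, 9))))  # levels 1-8, prints "0x1fe"
print(hex(dfltcc_level_mask([1])))          # level 1 only, prints "0x2"
print(hex(dfltcc_level_mask(range(1, 7))))  # levels 1-6, prints "0x7e"
```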

How to set the Environment Variables

● Enable hardware compression for levels 1-8:
  – Single command invocation:
      env DFLTCC_LEVEL_MASK=0x1fe COMMAND [ARGS]
  – Per user (for the interactive shell):
      echo 'export DFLTCC_LEVEL_MASK=0x1fe' >>~/.bashrc
  – Global (if invoked from a user's session):
      echo DFLTCC_LEVEL_MASK=0x1fe >>/etc/environment
  – For all systemd services:
      mkdir /etc/systemd/system.conf.d
      printf "[Manager]\nDefaultEnvironment=DFLTCC_LEVEL_MASK=0x1fe\n" >/etc/systemd/system.conf.d/dfltcc.conf
      systemctl daemon-reload
      systemctl restart SERVICE
  – For a Docker (podman) container, use the -e option to add the setting to the container environment:
      docker run -it --rm -e DFLTCC_LEVEL_MASK=0x1fe IMAGE COMMAND [ARGS]
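When a compression job is launched from a script rather than a shell, the same setting can be passed through the child's environment. This is a minimal sketch of the "single command" case; the helper name `run_with_level_mask` is made up, and the child command here only echoes the variable so that the effect is visible on any machine.

```python
import os
import subprocess
import sys


def run_with_level_mask(argv, mask=0x1fe):
    """Run argv with DFLTCC_LEVEL_MASK set for that one process only."""
    env = dict(os.environ, DFLTCC_LEVEL_MASK=hex(mask))
    return subprocess.run(argv, env=env, check=True,
                          capture_output=True, text=True)


# Stand-in child process: report the value it received.
result = run_with_level_mask(
    [sys.executable, "-c", "import os; print(os.environ['DFLTCC_LEVEL_MASK'])"]
)
seen = result.stdout.strip()
print(seen)
```

The parent's own environment stays untouched, which mirrors the `env VAR=value COMMAND` form.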

Performance - zEDC vs Software Compression

[Chart: x times faster than software compression]

Most files are too small to see bigger speedups.

Compression level 1 comparison using minigzip on the squash benchmark suite with 128 kB buffers.

Performance - zEDC vs Software Decompression

[Chart: x times faster than software decompression]

Most files are too small to see bigger speedups.

Comparison using minigzip on the squash benchmark suite with 128 kB buffers.

Buffer Size

● Number of bytes passed to the deflate compression/decompression call.
● Pass larger buffers to reduce overhead when using the zlib raw interface.
● gzip and the zlib-gzip-API already do internal buffering to handle that.

[Diagram: eight 16k buffers compared with one 128k buffer. Legend: millicode entry, parameter validation, synchronization; millicode exit, continuation buffer preparation, synchronization; actual compression/decompression work]

Performance - Buffer Sizes

● zEDC benefits more from larger input buffers than software compression does.
● Adjust zlib calls to pass 128k buffers!

Geomean across squash benchmark files bigger than 1MB
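The call pattern behind this advice can be sketched with Python's zlib binding. The example only counts how many compress calls each chunk size needs for the same input; it does not measure speed, and how a given binding forwards each chunk to deflate is an assumption here. The helper name `compress_stream` is illustrative.

```python
import zlib

CHUNK = 128 * 1024                            # 128k, as recommended above


def compress_stream(data, level=1, chunk_size=CHUNK):
    """Compress data chunk by chunk; return (compressed bytes, number of calls)."""
    comp = zlib.compressobj(level)
    parts = []
    calls = 0
    for offset in range(0, len(data), chunk_size):
        parts.append(comp.compress(data[offset:offset + chunk_size]))
        calls += 1
    parts.append(comp.flush())                # emit whatever is still buffered
    return b"".join(parts), calls


data = b"abcdefghijklmnopqrstuvwxyz" * 100000  # 2,600,000 bytes
_, calls_16k = compress_stream(data, chunk_size=16 * 1024)
blob, calls_128k = compress_stream(data)
print(calls_16k, calls_128k)                  # prints "159 20"
```

Each call carries a fixed entry and exit cost, so cutting the number of calls by a factor of eight removes most of that overhead for the same payload.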

Software Compression Levels

Input:                         xabcdeabcdefdef
Fast Deflate (levels 1-3):     xabcde<5,5>fdef
Slow Deflate (levels 4-9):     xabcde<5,5>f<3,3>

● Fast Deflate, levels 1-3: no lazy matching. Matches inside other matches cannot be found.
● Slow Deflate, levels 4-9: lazy matching. Expensive search to further improve existing matches.

● The compression level only affects the LZ77 search for duplicates.
● 6 is the default level used if no compression level is specified.
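The effect of the level on output size can be observed with software zlib. This is a sketch on made-up sample data; the sizes it prints depend on the input and on the zlib build, so no particular numbers are claimed. Omitting the level selects zlib's Z_DEFAULT_COMPRESSION, which the deck states corresponds to level 6.

```python
import zlib

# Illustrative input: a repetitive part plus some less regular bytes.
data = b"xabcdeabcdefdef" * 2000 + bytes(range(256)) * 50

sizes = {}
for level in range(1, 10):
    sizes[level] = len(zlib.compress(data, level))
    kind = "fast" if level <= 3 else "slow"   # grouping as described above
    print(f"level {level} ({kind}): {sizes[level]} bytes")

# No level given: the binding passes Z_DEFAULT_COMPRESSION to zlib.
default_size = len(zlib.compress(data))
print(f"default: {default_size} bytes")
```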

Compression Ratio

[Chart: file size ratio compressed / uncompressed, with Deflate Fast and Deflate Slow marked. Geomean across squash benchmarks.]

Is it working?

Is the feature enabled?

$ cat /proc/cpuinfo
...
features : esan3 zarch stfle msa ldisp eimm dfp edat etf3eh highgprs te vx vxd vxe gs vxe2 vxp sort dflt sie

Yes? Then the following should be faster than usual ...

$ time python -c 'import gzip; gzip.open("test.gz", "wb", \
    compresslevel=1).write(b"abcdefghijklmnopqrstuvwxyz" * 50000000)'

This command returns after 1.5s if your zlib is zEDC enabled. It needs more than 4s if it is not or you accidentally ran it on x86.
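The feature check can also be scripted. The sketch below assumes that the relevant flag in the features line is `dflt`, which is the only Deflate-related entry in the sample output above; the function name `has_dflt_feature` is hypothetical. It works on text so that it can be tried on any machine.

```python
def has_dflt_feature(cpuinfo_text):
    """Return True if a 'features' line of /proc/cpuinfo lists 'dflt'."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "features":
            return "dflt" in value.split()    # match whole words only
    return False


# Sample line copied from the output shown above.
sample = ("features : esan3 zarch stfle msa ldisp eimm dfp edat etf3eh "
          "highgprs te vx vxd vxe gs vxe2 vxp sort dflt sie")
print(has_dflt_feature(sample))               # prints "True"

# On a real system the input would come from the file itself:
#   with open("/proc/cpuinfo") as f:
#       print(has_dflt_feature(f.read()))
```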

zEDC - CPU Counters

Command                              Kernel version
perf stat -e DFLT_CC                 >= 5.7.0
perf stat -e cpum_cf/config=264/     >= 5.5.0
perf stat -e r108                    < 5.5.0

Note: CPU counters must be enabled. Check with lscpumf.

Counter           Number   Hex      Description
DFLT_ACCESS       247      0xf7     Cycles CPU spent obtaining access to Deflate unit
DFLT_CYCLES       252      0xfc     Cycles CPU is using Deflate unit
DFLT_CC           264      0x108    Increments by one for every DEFLATE CONVERSION CALL instruction executed
DFLT_CCFINISH*    265      0x109    Increments by one for every DEFLATE CONVERSION CALL instruction executed that ended in Condition Codes 0, 1 or 2

* The name of that counter used to be DFLT_CCERROR. But the counter is incremented for successful completions. So it had to be renamed.

zEDC - CPU Counters - Example

$ perf stat -e DFLT_ACCESS,DFLT_CYCLES,DFLT_CC,DFLT_CCFINISH \
    python -c 'import gzip; gzip.open("test.gz", "wb", \
    compresslevel=1).write(b"abcdefghijklmnopqrstuvwxyz" * \
    50000000)'

Performance counter stats for 'python -c import gzip; gzip.open("test.gz", "wb", \
compresslevel=1).write(b"abcdefghijklmnopqrstuvwxyz" * \
50000000)':

   605083      DFLT_ACCESS
436249038      DFLT_CYCLES
     9230      DFLT_CC
     1865      DFLT_CCFINISH
...
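Two rough figures can be derived from these counts: the average number of accelerator cycles per DFLTCC instruction, and the share of DFLTCC executions that ended with condition code 0, 1 or 2. This is a sketch that parses the sample numbers above; it assumes plain "count name" lines, while real perf output may add separators and extra columns. The function name `parse_counters` is illustrative.

```python
SAMPLE = """\
605083 DFLT_ACCESS
436249038 DFLT_CYCLES
9230 DFLT_CC
1865 DFLT_CCFINISH
"""


def parse_counters(text):
    """Parse 'count name' lines into a {name: count} dictionary."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0].isdigit():
            counters[parts[1]] = int(parts[0])
    return counters


counters = parse_counters(SAMPLE)
cycles_per_call = counters["DFLT_CYCLES"] / counters["DFLT_CC"]
finished_share = counters["DFLT_CCFINISH"] / counters["DFLT_CC"]
print(f"{cycles_per_call:.0f} cycles per DFLTCC, "
      f"{finished_share:.0%} ended with CC 0, 1 or 2")
```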

Use Cases

● Offline - Time vs Storage
  – Filesystem backups
    ● Via tar (gzip) or tools using zlib
  – Database backups
    ● Backup usually has to be finished within a time slot.
● Online - Storage/Memory vs CPU
  – Java Workloads
    ● WebSphere
  – Compressed file systems
    ● zram (compressed RAM disk)
  – Network connections
    ● ssh compression (-C uses zlib compression)
    ● Web servers (Nginx, Apache, …)
    ● Communication with mobile devices → Costs
  – In-Memory compression (e.g. zram, Java Inflater/Deflater)

Reproducibility

● Take care with integrity checks on compressed files!
● zEDC might produce a different result when it is interrupted during compression.
  – Compressed output is correct but different!
● Checksums on the compressed output might miscompare even if the uncompressed data is identical.
● Reproducible builds:
  – Distros set SOURCE_DATE_EPOCH to make builds independent of the build time.
  – zlib and gzip fall back to software compression if SOURCE_DATE_EPOCH is set.
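One way to make an integrity check robust against this is to hash the uncompressed content instead of the compressed bytes. The sketch below uses two software compression levels as a stand-in for "same input, different but valid compressed output"; that substitution is an assumption made so the example runs on any machine, not a reproduction of the hardware behavior. The helper name `content_digest` is made up.

```python
import hashlib
import zlib

data = b"a ship shipping ship is shipping ships\n" * 1000

# Two valid compressed representations of the same input.
blob_a = zlib.compress(data, 1)
blob_b = zlib.compress(data, 9)


def content_digest(blob):
    """Checksum of the uncompressed content, independent of the encoding."""
    return hashlib.sha256(zlib.decompress(blob)).hexdigest()


same_bytes = blob_a == blob_b
same_content = content_digest(blob_a) == content_digest(blob_b)
print(f"compressed bytes identical: {same_bytes}")
print(f"uncompressed content identical: {same_content}")
```

The price is one extra decompression per check, which is the operation the accelerator makes cheap.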

Linux Distro Support

Distro    Releases (table columns: zlib, gzip)
RHEL      8.1, >=8.2
SLES      12 SP5, 15 SP1, >=15 SP2
Ubuntu    >=19.10

● For all these distros, hardware acceleration is enabled by default only for compression level 1.
● We plan to enable it for levels 1-6 in future distro releases (e.g. 20.10).
● This behavior can always be overridden using the environment variable DFLTCC_LEVEL_MASK.

Java Support

● Workloads using the Java compression APIs benefit immediately without any code changes: WebSphere, …
● Accelerated Java APIs:
  – java/util/zip/GZIPInputStream, GZIPOutputStream
  – java/util/zip/InflaterStream, DeflaterStream
  – java/util/zip/Inflater, Deflater
● Supported versions:
  – IBM SDK for Java 8 SR6 FP16 required to exploit Integrated zEDC
    ● zEDC compression enabled for levels 1-6 by default.
    ● Includes own copy of zlib and does not require zlib distro support.
    ● DFLTCC_LEVEL_MASK settings do affect Java workloads!
  – OpenJDK 8/11 with OpenJ9 0.17 uses system zlib - see "Linux Distro Support"

Conclusion

Thanks to the Integrated Accelerator for zEDC, compression and decompression become cheap enough to be enabled everywhere.

References

● On zip/gzip/zlib history: https://stackoverflow.com/questions/20762094/how-are-zlib-gzip-and-zip-related-what-do-they-have-in-common-and-how-are-they
● Accelerated Data Compression with Linux on IBM z15 – Managing Data Growth: https://mediacenter.ibm.com/media/t/1_n8rnkcdj
● IBM z15 CPUMF Counters: https://www.ibm.com/support/pages/sites/default/files/inline-files/119190_SA23-2261-06.pdf
● How to enable IBM LinuxONE III zEDC Integrated Accelerator: https://github.com/iii-i/zlib/tree/dfltcc-howto/contrib/s390
● Linux and Mainframe - Blog from Eberhard Pasch: https://linux.mainframe.blog/zlib-acceleration/
● HOWTO: Exploiting Hardware Compression in zlib with IBM z15: http://linux-on-z.blogspot.com/2019/10/howto-exploiting-hardware-compression.html
● RFC1951 - DEFLATE Compressed Data Format Specification version 1.3: https://tools.ietf.org/html/rfc1951
● Squash Compression Benchmark Suite: https://github.com/quixdb/squash-benchmark
