Tools for Managing Big Data Analytics on z/OS

Mike Stebner, Joe Sturonas PKWARE, Inc.

Wednesday, March 12, 2014 Session ID 14948

Test link: www.SHARE.org

Introduction Heterogeneous Analysis

Addressing the process of packaging and transferring z/OS-based information to an off-board analytic platform in an effective, cost-efficient, and secure manner.

What are some major hurdles that exploitation of advanced System z facilities can overcome in this venue?


Introduction Heterogeneous Analysis

• Data Transformation
  • Code page differences (EBCDIC/ASCII)
  • Data Structures (Binary, Endian mode numerics, Parsing)
  • Portability between dissimilar formats
• Data Packaging (multiple discrete components)
• Data Protection
• Data Volume
  • Total raw size
  • Number of exchanges
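The transformation hurdles listed above can be made concrete with a small sketch. It parses a hypothetical 12-byte host record on an off-board platform: EBCDIC text, a big-endian binary integer, and a packed-decimal amount. The record layout, the choice of code page 037, and the function names are assumptions made for illustration and do not come from the presentation.

```python
import struct

# Hypothetical 12-byte host record: 5 bytes of EBCDIC text, a 4-byte
# big-endian signed integer, and a 3-byte packed-decimal (COMP-3) amount.
RECORD = bytes.fromhex("C8C5D3D3D6" "0001E240" "12345C")


def unpack_comp3(raw: bytes) -> int:
    """Decode packed decimal: two digits per byte, sign in the last nibble."""
    digits = "".join(f"{byte >> 4}{byte & 0x0F}" for byte in raw[:-1])
    digits += str(raw[-1] >> 4)
    sign = -1 if (raw[-1] & 0x0F) == 0x0D else 1
    return sign * int(digits)


def parse_record(record: bytes) -> dict:
    """Split the record into fields and convert each to a portable value."""
    name = record[0:5].decode("cp037")            # EBCDIC code page 037 -> str
    (count,) = struct.unpack(">i", record[5:9])   # big-endian 32-bit integer
    amount = unpack_comp3(record[9:12])
    return {"name": name, "count": count, "amount": amount}


fields = parse_record(RECORD)
print(fields)
```

Converting field by field, rather than translating the whole record as text, is what keeps the binary and packed fields intact: a blanket EBCDIC-to-ASCII translation would corrupt them.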

Finding the Sweet Spot

What is the business impact of selected designs and facilities?


Focus on experiences with System z Facilities that help address two areas

• Data Transformation
  • Code page differences (EBCDIC/ASCII)
  • Data Structures (Binary, Endian numerics, Parsing)
  • Portability between dissimilar file system formats
• Data Packaging (multiple discrete components)
• Data Protection – Encryption
• Data Volume – Hardware Assisted Compression
  • Total raw size
  • Number of exchanges

Data Protection Data-Centric Encryption using ICSF

Machine               z10-EC   z10-BC   z196     z114     zEC12    zBC12
Machine type          2097     2098     2817     2818     2827     2828
Algorithms supported  DES, 3DES, and AES 128/192/256 on every machine
Crypto hardware       CPACF    CPACF    CPACF    CPACF    CPACF    CPACF
                      CEX2C    CEX2C    CEX3C    CEX3C    CEX3C    CEX3C
                      CEX3C    CEX3C                      CEX4C    CEX4C

Application Design Cryptographic Design Influences

• Data Exchange Format
  • Collection with associative constructs
• Data Transport (Container Format)
  • In-flight and ‘at rest’ security
  • Authentication and decryption service availability
• Cryptographic Identity and Associated Key Management
  • Dynamic vs. Static Keys
  • Inter-system Key Coordination
  • Data Recovery (Contingency Keys)
• Resource Capacity
  • Timeliness of service

Key Exposures – The need for Key Management

Crypto Facilities

[Diagram labels: Application Services; ICSF (CKDS & PKDS); OpenPGP Keyrings; RACF/ACF2/Top Secret; Proprietary Certificate Store; LDAP Administration; CEXnC / CPACF / Software Crypto; Native X.509 Certificates; Public Certificate Authority X.509 Certificates]

Data-Centric Encryption ICSF Data Encipherment Algorithms

• RSA PKI Encryption
  • Losing ground for longevity due to the high processing cost of increased key lengths
• Symmetric Clear Key
  • DES class, AES (128 – 256 bit key strength)
  • May be employed with a passphrase-generated key or a CKDS-stored key
• Symmetric Protected Key (SYMCPACFWRAP)
• CKDS Secure Key
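As a rough illustration of a passphrase-generated symmetric key, the sketch below derives an AES-sized key with PBKDF2-HMAC-SHA256 from the Python standard library. The slides do not say which derivation function is used with ICSF, so PBKDF2, the salt size, the iteration count, and the function name are all assumptions made for this example. It runs off-platform and does not call ICSF.

```python
import hashlib
import os


def derive_aes_key(passphrase: str, salt: bytes, bits: int = 256,
                   iterations: int = 200_000) -> bytes:
    """Derive an AES-sized key from a passphrase with PBKDF2-HMAC-SHA256.

    PBKDF2 is an assumption for this sketch, not the method named in the slides.
    """
    if bits not in (128, 192, 256):
        raise ValueError("AES keys are 128, 192, or 256 bits")
    return hashlib.pbkdf2_hmac("sha256", passphrase.encode("utf-8"),
                               salt, iterations, dklen=bits // 8)


salt = os.urandom(16)   # stored alongside the ciphertext; it is not secret
key = derive_aes_key("correct horse battery staple", salt)
print(len(key) * 8, "bit key derived")
```

Anyone who learns the passphrase can regenerate the key, which is the exposure behind the "Fast, but Risky" label applied to clear keys in the comparison that follows.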

Symmetric Key Operational Comparison

              “Clear”                  “Protected”             “Secure”
              Fast, but Risky          Fast & Secure           Slow

Executes on   ICSF Software -or-       System z CPACF          Cryptographic Card
              System z CPACF

Key source    Passphrase Value -or-    ICSF CKDS Registered    ICSF CKDS Registered
              ICSF CKDS Registered     (encrypted)             (encrypted)
              (clear)

Leverage ICSF CKDS to Protect Passphrase Derived Keys

Illustrate Registered ICSF CKDS Key Set

CKDS Policy Control – Duplicate Key Value Protection

RACF key ring/certificate with PKDS

Label: MSTEBNERSHARETEST   ← RACF Label (r_datalib API access)
Certificate ID: 2QPVweLV4uPFwtXF2fLw8P1A
Status: TRUST
Start Date: 2013/12/17 19:00:25
End Date: 2014/01/18 19:00:24
Serial Number: 10F0F1FF3C718DEE4D24BBEDA47A49D0

Issuer's Name: CN=UTN-USERFirst-Client Authentication and Email.OU=http://www.usertrust.com.O=The USERTRUST Network.L=Salt Lake City.SP=UT.C=US

Subject's Name: [email protected]=Mike Stebner.OU=Corporate Secure Email.OU=Issued through PKWARE E-PKI Manager.O=PKWARE.648 N PLANKINTON AVE.L=MILWAUKEE.SP=WI.53203.C=US

Key Usage: HANDSHAKE
Key Type: RSA
Key Size: 2048
Private Key: YES
PKDS Label: SHARE2014MSTEBNER   ← ICSF PKDS Label (implied access)

What is the business impact of selected designs and facilities?

Inherited OpenPGP Data Flow

• Onion layer concept
  • Encryption Layer
  • Compression Layer
  • Literal Data Layer
• Data stream packets on each layer

Consider the Basic Data Flow

Simple copies from phase to phase

Understand OpenPGP Internal Stream Formatting (RFC 2440 or 4880)
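To give a feel for the stream formatting, here is a minimal sketch of RFC 4880 new-format packet framing: a Literal Data packet wrapped inside a Compressed Data packet. The encryption layer, partial body lengths, and old-format headers are left out, and the helper names are hypothetical. It illustrates the framing rules in the RFC and is not the implementation discussed in the session.

```python
import struct
import zlib

TAG_COMPRESSED = 8   # Compressed Data packet (RFC 4880, section 5.6)
TAG_LITERAL = 11     # Literal Data packet (RFC 4880, section 5.9)


def encode_length(n: int) -> bytes:
    """Encode a new-format body length (RFC 4880, section 4.2.2)."""
    if n < 192:
        return bytes([n])                          # one-octet form
    if n < 8384:
        n -= 192
        return bytes([(n >> 8) + 192, n & 0xFF])   # two-octet form
    return b"\xff" + struct.pack(">I", n)          # five-octet form


def packet(tag: int, body: bytes) -> bytes:
    """Build a new-format packet: tag octet 0b11tttttt, length, body."""
    return bytes([0xC0 | tag]) + encode_length(len(body)) + body


def literal_packet(data: bytes, name: bytes = b"") -> bytes:
    # 'b' marks binary data; then filename length, filename, 4-byte timestamp.
    body = b"b" + bytes([len(name)]) + name + struct.pack(">I", 0) + data
    return packet(TAG_LITERAL, body)


def compressed_packet(inner: bytes) -> bytes:
    # Algorithm 1 (ZIP) carries a raw RFC 1951 Deflate stream.
    deflater = zlib.compressobj(wbits=-15)
    deflated = deflater.compress(inner) + deflater.flush()
    return packet(TAG_COMPRESSED, b"\x01" + deflated)


message = compressed_packet(literal_packet(b"payload " * 64, b"demo.bin"))
print(message[:4].hex())
```

Each layer adds its own header and needs the length of the finished inner layer. That is one reason the flow involves more than simple copies from phase to phase.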

OpenPGP Data Flow Overhead

Additional data manipulation logic from phase to phase

Illustration of Container Format Influence on Encipherment Facilities

[Matrix: columns are Symmetric Keys, X.509 Certificates, and OpenPGP; rows are RACF/ACF/CA-TSS, ICSF PKDS, ICSF CKDS, and FIPS 140-2; legend is GOOD, WORK REQUIRED, NOT AVAILABLE]

Compression Why is it important?

[Diagram: Data acquisition → Application Services (GCP/zIIP/zEDC) → Result: Compressed & Encrypted Data on Target Platform]

Data is offloaded, encrypted, and compressed.

What Compression Facilities are Available on System z?

Software-based
• General CP (e.g., OpenPGP, PKZIP, zlib)
  • Any viable cross-platform compatible algorithm chosen for implementation
  • Deflate (RFC 1951) is a commonly used algorithm that combines LZ77 sliding dictionary compression with Huffman coding
• Software using zIIP offload
  • Execute software routines on a System z9 or later
  • Requires APF authorization to run SRB enclave scheduling
  • Provides economical compression, but may not improve latency
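A software Deflate round trip can be sketched with Python's standard `zlib` module, which wraps the zlib library named above. The negative window-bits value requests a raw RFC 1951 stream with no zlib or gzip wrapper. The sample data and function names are made up for illustration.

```python
import zlib


def deflate_raw(data: bytes, level: int = 6) -> bytes:
    """Compress to a raw RFC 1951 stream (no zlib or gzip wrapper)."""
    deflater = zlib.compressobj(level, zlib.DEFLATED, -15)
    return deflater.compress(data) + deflater.flush()


def inflate_raw(stream: bytes) -> bytes:
    """Decompress a raw RFC 1951 stream."""
    return zlib.decompress(stream, wbits=-15)


sample = b"ACCOUNT-RECORD 0000000000 " * 400   # highly repetitive input
packed = deflate_raw(sample)
print(len(sample), "->", len(packed), "bytes")
```

Repetitive record data like this suits the LZ77 stage well, because each repeat is replaced by a short back-reference before Huffman coding is applied.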

What Compression Facilities are Available on System z?

Hardware-based
• System z CMPSC Static Dictionary hardware compression
  • Available since the early 1990s
  • Static dictionary LZ77
  • Limited applicability outside of z/OS

• System zEnterprise Data Compression (zEDC) hardware
  • New with zEC12 and zBC12 systems
  • PCIE adapter card
  • Implements Deflate algorithm

Compression Facility Functional Comparison

Facility                   Requirements
Software on General CP     General CP capacity
Software on zIIP           System z9, zIIP capacity (APF)
CMPSC Static Dictionary    Pre-defined data structures
zEDC                       zEC12/zBC12, z/OS 2.1, zEDC card

[Matrix: each facility is also rated for "Portable" and "Generalized Compression"; legend is GOOD, WORK REQUIRED, NOT AVAILABLE]

IBM zEnterprise Data Compression for z/OS and the zEDC Express Feature (I)

IBM Announcement; Document Number: ZSB03059USEN
• Implements RFC 1951 Deflate compression
• “When zlib uses zEDC, there can be up to 118X reduction in CPU and up to 24X throughput improvement”
• One or more PCIE cards servicing multiple partitions (15)
• Currently supported only under a native z/OS LPAR
  • Check IBM statements of direction
• Optimized for larger amounts of data
  • Has configurable minimum size limits (4K floor)
• PTFs available for z/OS 1.12 and 1.13 to inflate
  • Also see SMP/E FIXCAT(IBM.Function.ZEDC)

IBM zEnterprise Data Compression for z/OS and the zEDC Express Feature (III)

• System Use Cases
  • SMF
• Phased Roll-out intentions
  • BSAM/QSAM (infrastructure layer)
  • DFSMSdss™/DFSMShsm™ backup/restore
  • z/OS Java™ Technology Edition, Version 7
• Detailed SHARE sessions
  • 15209: Experiences with IBM zAware and zEDC
  • 15099: zEnterprise Data Compression: What is it and How Do I Use it? (Wed. 4:30 PM)
  • 15080: z/OS zEnterprise Data Compression Usage and Configuration

IBM zEnterprise Data Compression for z/OS and the zEDC Express Feature (IV)

• z/OS V2R1.0 MVS Callable Services for HLL (Ch. 13-15)
  • Deflate stream compatible with GZIP, PKZIP, OpenPGP
  • Runtime checks to determine hardware availability
  • IBM-provided compatible C library functions
  • APF Authorized API for single-block compress/inflate
  • Unauthorized zlib interface (streaming data)
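The streaming style of the zlib interface can be sketched off-platform with Python's `zlib` module, which exposes the same stream-oriented model as the C `z_stream` API. This example uses software zlib only and does not touch zEDC. The record contents and the generator name are assumptions for illustration.

```python
import struct
import zlib


def gzip_stream(chunks, level: int = 6):
    """Yield gzip-framed output while consuming input chunk by chunk."""
    deflater = zlib.compressobj(level, zlib.DEFLATED, 31)  # 31 = gzip wrapper
    for chunk in chunks:
        out = deflater.compress(chunk)
        if out:                      # the compressor may buffer small inputs
            yield out
    yield deflater.flush()           # emit remaining data plus the trailer


records = [b"record %06d\n" % i for i in range(5000)]
compressed = b"".join(gzip_stream(records))

# gzip trailer: CRC32 of the uncompressed data, then its length mod 2**32.
crc, size = struct.unpack("<II", compressed[-8:])
print(f"crc32={crc:08x} size={size}")
```

Because the output is a standard gzip member, any gzip-compatible reader on the target platform can inflate it and verify the CRC32 without knowing how the stream was produced.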

IBM zEnterprise Data Compression for z/OS and the zEDC Express Feature (V)

• z/OS V2R1.0 MVS Callable Services for HLL (Ch. 13-15)
  • Unauthorized zlib interface (streaming data)
    • Uses zlib.net z_stream programming interface (subset)
    • Raw Deflate Stream or GZIP modes (CRC32 with GZIP)
    • libzz.a include wrapper
    • Controlled by SAF-protected FACILITY class resource FPZ.ACCELERATOR.COMPRESSION
    • z/OS UNIX _HZC_COMPRESSION_METHOD environment control variable
    • May fall back to zlib software routines depending on zEDC requirements, including size limitations
    • PARMLIB IQPPRMxx DEFMINREQSIZE (4K) and INFMINREQSIZE (16K)
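The size-based fallback can be pictured with a small planning helper. This is a hypothetical sketch only: the function is not part of any IBM interface, it models just the two minimum-size values quoted above, and treating a request exactly at the threshold as eligible is an assumption. The real decision is made inside the zlib wrapper and depends on further conditions.

```python
DEFMINREQSIZE = 4 * 1024    # deflate threshold quoted on the slide (4K)
INFMINREQSIZE = 16 * 1024   # inflate threshold quoted on the slide (16K)


def expected_path(operation: str, request_bytes: int,
                  accelerator_available: bool = True) -> str:
    """Estimate whether a request is a hardware candidate or software fallback.

    Hypothetical helper; the inclusive comparison is an assumption.
    """
    thresholds = {"deflate": DEFMINREQSIZE, "inflate": INFMINREQSIZE}
    if operation not in thresholds:
        raise ValueError(f"unknown operation: {operation}")
    if accelerator_available and request_bytes >= thresholds[operation]:
        return "zEDC"
    return "software"


for op, size in [("deflate", 2048), ("deflate", 65536), ("inflate", 8192)]:
    print(op, size, "->", expected_path(op, size))
```

A helper like this is mainly useful when sizing buffers: an application that hands the interface many small buffers may never reach the accelerator at all.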

IBM zEnterprise Data Compression PKWARE Early Test Program Experience

• Objective
  • Assess compression using software GCP, zIIP and zEDC
• zEC12
  • 5 General CPs, 2 zIIPs, 1 zEDC
• Workloads – Single system (no LPAR sharing of zEDC)
  • “Large” (1 GB+) linear with multiple parallel (80 concurrent)
  • “Small” (256 KB) high volume
• Metrics
  • Elapsed Time
  • Processor time
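The two metrics can be captured separately around any compression call. The sketch below is a generic stand-in that times software zlib in Python on a 256 KB buffer. It is not the harness used in the test program, and the payload and function name are made up.

```python
import time
import zlib


def measure(func, payload: bytes):
    """Return (result, elapsed_seconds, cpu_seconds) for one call."""
    wall_start = time.perf_counter()     # elapsed (wall-clock) time
    cpu_start = time.process_time()      # processor time used by this process
    result = func(payload)
    cpu = time.process_time() - cpu_start
    elapsed = time.perf_counter() - wall_start
    return result, elapsed, cpu


small = bytes(range(256)) * 1024         # 256 KB, like the "small" workload
compressed, elapsed, cpu = measure(zlib.compress, small)
print(f"{len(small)} -> {len(compressed)} bytes, "
      f"elapsed {elapsed:.4f}s, cpu {cpu:.4f}s")
```

Keeping the two numbers apart matters when work is offloaded: processor time on the general CPs can drop sharply while elapsed time stays bound by I/O.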

zEDC Operations

Console Display General PCIE Status

zEDC Operations

Display zEDC PCIE Adapter Status

zEDC Operational Monitoring (II)

zEDC Processing Characteristics

• Multi-tasking with the zlib API is available
• zlib API may not run on the zEDC hardware (per design)
  • Different minimum buffer size thresholds for deflate & inflate
• Only one ‘level’ of zEDC Deflate compression
  • 9 levels available in zlib software
• Internal implementations of RFC 1951 Deflate may differ
  • May experience varying compression ratios (based on level) right around the minimum buffer size restriction
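The effect of the software levels can be observed with a short experiment. The sketch below compresses the same synthetic data at zlib levels 1, 6, and 9 and reports the space saving, defined here as 1 minus compressed size over original size. That definition, the sample data, and the function name are assumptions for this example and may differ from how ratios were calculated in the session.

```python
import zlib


def space_saving(original: bytes, level: int) -> float:
    """Fraction of bytes removed by zlib at the given level (1-9)."""
    compressed = zlib.compress(original, level)
    return 1 - len(compressed) / len(original)


text = b"".join(b"customer=%05d;status=ACTIVE;region=MIDWEST\n" % (i % 977)
                for i in range(20000))
savings = {level: space_saving(text, level) for level in (1, 6, 9)}
for level, saving in savings.items():
    print(f"level {level}: {saving:.1%} smaller")
```

Every level produces a valid Deflate stream that any inflater can read, so the choice affects only size and processing cost, not compatibility.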

IBM zEnterprise Data Compression PKWARE Early Test Program Experience

Initial Results Overview (I)
• zEDC sustained 1 GB+ per second of raw compression
• zEDC capacity exceeded application resource constraints
  • The effects of I/O and application processing prevented saturation of zEDC
• Under appropriate conditions, zIIP met or exceeded application performance when compared to zEDC
• Optimized zlib C routines showed benefits over the libzz.a wrapper code under some conditions
  • Small files under the minimum buffer size
  • Inflation

IBM zEnterprise Data Compression PKWARE Early Test Program Experience

Initial Results Overview (II)
• ETP limitations of first implementation identified
  • Buffer allocation issues
    • Buffer release
    • Rejected concurrent requests for the same size buffer
  • Compression ratio (77% vs. 89% for software implementations)

Effect of Resource Availability zEDC vs. zIIP

Incorporate Design with Facility Transactional Example (1.5 MB each)

Summary

• The mainframe is typically the source of record for critical business data
• Data needs to move off the mainframe quickly, efficiently, and securely
• Numerous facilities on z/OS exist to make this quick, efficient, and secure: zIIP, Crypto Express4S, CPACF, zEDC
• Proper transformation is critical to reduce hardware dependencies and facilitate long-term viability
