Intel® ® Scalable Processor: The Foundation of Data Centre Innovation Software Developer Conference – London, 2017 Notices and Disclaimers

This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest forecast, schedule, specifications and roadmaps.

Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Learn more at intel.com, or from the OEM or retailer. No computer system can be absolutely secure.

Tests document performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration will affect actual performance. Consult other sources of information to evaluate performance as you consider your purchase. For more complete information about performance and benchmark results, visit http://www.intel.com/performance.

Cost reduction scenarios described are intended as examples of how a given Intel-based product, in the specified circumstances and configurations, may affect future costs and provide cost savings. Circumstances will vary. Intel does not guarantee any costs or cost reduction.

Statements in this document that refer to Intel’s plans and expectations for the quarter, the year, and the future, are forward-looking statements that involve a number of risks and uncertainties. A detailed discussion of the factors that could affect Intel’s results and plans is included in Intel’s SEC filings, including the annual report on Form 10-K.

The products described may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.

No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.

Intel does not control or audit third-party benchmark data or the web sites referenced in this document. You should visit the referenced web site and confirm whether referenced data are accurate.

Intel, the Intel logo, Intel Optane and Xeon are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the united states and other countries.

* Other names and brands may be claimed as the property of others. © 2017 Intel Corporation. © 2017 Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks of Intel Corporation or its subsidiaries in the U.S. and/or other countries. *Other names and brands may be claimed as the property of others. 2 For more complete information about compiler optimizations, see our Optimization Notice. Agenda

• Intel® Xeon® Scalable Processor Overview, Platform Features • Skylake-SP CPU Architecture • Performance Summary

Content Acknowledgement • Akhilesh Kumar, Skylake-SP CPU Architect • Malay Trivedi, Lewisburg PCH Architect

© 2017 Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks of Intel Corporation or its subsidiaries in the U.S. and/or other countries. *Other names and brands may be claimed as the property of others. 3 For more complete information about compiler optimizations, see our Optimization Notice. Intel® Xeon® Processor Roadmap

2016 2017 2018 Intel® Xeon® Processor E7 Brickland Platform Purley Platform Targeted at mission critical applications that value a scale-up Skylake Cascade Lake E7 v3 E7 v4 system with leadership18 cores memory capacity and advanced RAS Intel® Xeon® PLATINUM Intel Xeon GOLD Intel® Xeon® Processor E5 Grantley-EP Platform Targeted at a wide variety of Intel Xeon SILVER applications that value a balanced E5 v3 E5-4600 v4 (4S) system with leadership Intel Xeon BRONZE E5 v3 E5-2600 v4 performance/watt/$ Converged platform with innovative Skylake-SP

© 2017 Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks of Intel Corporation or its subsidiaries in the U.S. and/or other countries. *Other names and brands may be claimed as the property of others. 4 For more complete information about compiler optimizations, see our Optimization Notice. Intel® Xeon® Scalable Processor Feature Overview

3x16 PCIe* 3x16 PCIe Feature Details 2 or 3 Intel® UPI Gen3 Gen3 Socket Scalability 2S, 4S, 8S, and >8S (with node controller support) Skylake-SP DDR4 Skylake-SP CPU TDP 70W – 205W CPU 2666 CPU Intel® C620 Series (code name Lewisburg) OPA OPA Networking Intel® Omni-Path Fabric (integrated or discrete) 1x 100Gb OPA 1x 100Gb OPA 4x10GbE (integrated w/ chipset) Fabric Fabric 100G/40G/25G discrete options DMI CPU VRs Compression and Intel® QuickAssist Technology to support 100Gb/s OPA VRs Crypto comp/decomp/crypto Lewisburg PCH Mem VRs Acceleration 100K RSA2K public key Intel®QAT ME High USB3 Storage Integrated QuickData Technology, VMD, and NTB IE Speed IO PCIe3 Intel® Optane™ SSD, Intel® 3D-NAND NVMe & 4x10GbE NIC SATA3 BMC SATA SSD GPIO Firmware eSPI/LPC Security CPU enhancements (MBE, PPK, MPX) 10GbE SPI Manageability Engine TPM Firmware Intel® Platform Trust Technology Intel® Key Protection Technology

BMC: Baseboard Management Controller PCH: Intel® IE: Innovation Engine Manageability Innovation Engine (IE) Intel® OPA: Intel® Omni-Path Architecture Intel QAT: Intel® QuickAssist Technology ME: Manageability Engine Intel® Node Manager NIC: Network Interface Controller VMD: Volume Management Device NTB: Non-Transparent Bridge Intel® Datacenter Manager

© 2017 Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks of Intel Corporation or its subsidiaries in the U.S. and/or other countries. *Other names and brands may be claimed as the property of others. 5 For more complete information about compiler optimizations, see our Optimization Notice. Platform Topologies 2S Configurations 4S Configurations 8S Configuration

LBG LBG

SKL SKL SKL SKL SKL SKL Intel® UPI

** DMI x4 LBG LBG LBG 3x16 3x16 PCIe* 1x100G PCIe* 1x100G SKL Intel® OP Fabric Intel® OP Fabric SKL

SKL SKL (2S-2UPI & 2S-3UPI shown)

DMI SKL SKL LBG 3x16 PCIe*

(4S-2UPI & 4S-3UPI shown) SKL SKL Intel® Xeon® Scalable Processor supports DMI 3x16 PCIe* configurations ranging from 2S-2UPI to 8S LBG LBG

© 2017 Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks of Intel Corporation or its subsidiaries in the U.S. and/or other countries. *Other names and brands may be claimed as the property of others. 6 For more complete information about compiler optimizations, see our Optimization Notice. Intel® Xeon® Scalable Processor Re-architected from the Ground Up

• Skylake core microarchitecture, with data • New mesh interconnect architecture • Intel® Speed Shift Technology center specific enhancements • Enhanced memory subsystem • Security & Virtualization enhancements • Intel® AVX-512 with 32 DP per core (MBE, PPK, MPX) • Modular IO with integrated devices • Data center optimized hierarchy – • Optional Integrated Intel® Omni-Path • New Intel® Ultra Path Interconnect 1MB L2 per core, non-inclusive L3 Fabric (Intel® OPA) (Intel® UPI)

Features Intel® Xeon® Processor E5-2600 v4 Intel® Xeon® Scalable Processor 6 Channels DDR4

Cores Per Socket Up to 22 Up to 28 DDR4 Core Core 2 or 3 UPI Threads Per Socket Up to 44 threads Up to 56 threads DDR4 Core Core UPI Last-level Cache (LLC) Up to 55 MB Up to 38.5 MB (non-inclusive) DDR4 QPI/UPI Speed (GT/s) 2x QPI channels @ 9.6 GT/s Up to 3x UPI @ 10.4 GT/s UPI DDR4 Core Core PCIe* Lanes/ 40 / 10 / PCIe* 3.0 (2.5, 5, 8 GT/s) UPI Controllers/Speed(GT/s) 48 / 12 / PCIe 3.0 (2.5, 5, 8 GT/s) DDR4 Shared L3 4 channels of up to 3 RDIMMs, 6 channels of up to 2 RDIMMs, Omni-Path Memory Population Omni-Path HFI LRDIMMs, or 3DS LRDIMMs LRDIMMs, or 3DS LRDIMMs DDR4 Max Memory Speed Up to 2400 Up to 2666 48 Lanes DMI3 PCIe* 3.0 TDP (W) 55W-145W 70W-205W

© 2017 Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks of Intel Corporation or its subsidiaries in the U.S. and/or other countries. *Other names and brands may be claimed as the property of others. 7 For more complete information about compiler optimizations, see our Optimization Notice. Core Microarchitecture Enhancements

DecodersDecoders 5 Broadwell 32KB L1 I$ Pre decode Inst Q Decoders uArch Skylake uArch 6 μop Front End Branch Prediction Unit μop Cache Queue Out-of-order 192 224 Window Load Store Reorder Buffer Buffer Buffer Allocate/Rename/Retire In-flight Loads + In order 72 + 42 72 + 56 Scheduler OOO Stores Port 0 Port 1 Port 5 Port 6 Port 4 Port 2 Port 3 Port 7 Scheduler Entries 60 97 Registers – ALU ALU ALU ALU 168 + 168 180 + 168 Store Data Load/STA Load/STA STA Shift LEA LEA Shift Integer + FP INT JMP 2 MUL JMP 1 Allocation Queue 56 64/ FMA FMA FMA Load Data 2 Memory ALU ALU ALU Load Data 3 Memory Control L1D BW (B/Cyc) –

VEC Shift Shift Shuffle 64 + 32 128 + 64 DIV Load + Store 1MB L2$ Fill Buffers 32KB L1 D$ 4K+2M: 1536 L2 Unified TLB 4K+2M: 1024 Fill Buffers 1G: 16

• Larger and improved branch predictor, higher throughput decoder, larger window to extract ILP • Improved scheduler and execution engine, improved throughput and latency of divide/sqrt • More load/store bandwidth, deeper load/store buffers, improved prefetcher • Data center specific enhancements: Intel® AVX-512 with 2 FMAs per core, larger 1MB MLC About 10% performance improvement per core on integer applications at same frequency

© 2017 Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks of Intel Corporation or its subsidiaries in the U.S. and/or other countries. *Other names and brands may be claimed as the property of others. 8 For more complete information about compiler optimizations, see our Optimization Notice. Intel® Advanced Vector Extensions 512 (Intel® AVX-512)

• 512-bit wide vectors Microarchitecture Instruction Set SP FLOPs / cycle DP FLOPs / cycle Intel® AVX-512 & • 32 operand registers Skylake 64 32 FMA • 8 64b mask registers Haswell / Broadwell Intel AVX2 & FMA 32 16 • Embedded broadcast Sandybridge Intel AVX (256b) 16 8 • Embedded rounding Nehalem SSE (128b) 8 4

Intel AVX-512 Instruction Types AVX-512-F AVX-512 Foundation Instructions AVX-512-VL Vector Length Orthogonality : ability to operate on sub-512 vector sizes AVX-512-BW 512-bit /Word support AVX-512-DQ Additional D/Q/SP/DP instructions (converts, transcendental support, etc.) AVX-512-CD Conflict Detect : used in vectorizing loops with potential address conflicts Powerful instruction set for data-parallel computation

© 2017 Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks of Intel Corporation or its subsidiaries in the U.S. and/or other countries. *Other names and brands may be claimed as the property of others. 9 For more complete information about compiler optimizations, see our Optimization Notice. Performance and Efficiency with Intel® AVX-512

GFLOPs / Watt 6.00 4.83 LINPACK Performance 5.00 4.00 2.92 3.00 3500 3.1 3259 3.5 1.74 2.8 2.00

3000 3 GFLOPs/Watt 1.00 2.5 1.00

2500 2.5 SSE4.2to Normalized 2.1 0.00 System Power System 2000 2 SSE4.2 AVX AVX2 AVX512

2034 1500 1178 1.5 GFLOPs, GFLOPs,

1000 669 1 Frequency Core GFLOPs / GHz 8.00 7.19 500 0.5 760 768 791 767 7.00 0 0 6.00 SSE4.2 AVX AVX2 AVX512 5.00 3.77 4.00 3.00 1.95 GFLOPs/GHz 2.00 1.00 GFLOPs Power (W) Frequency (GHz) 1.00 Normalized to SSE4.2to Normalized 0.00 SSE4.2 AVX AVX2 AVX512 Intel® AVX-512 delivers significant performance and efficiency gains

Source as of June 2017: Intel internal measurements on platform with Xeon Platinum 8180, Turbo enabled, UPI=10.4, SNC1, 6x32GB DDR4-2666 per CPU, 1 DPC. Software and workloads used in performance tests may have been optimized for performance only on Intel . Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.

© 2017 Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks of Intel Corporation or its subsidiaries in the U.S. and/or other countries. *Other names and brands may be claimed as the property of others. 10 For more complete information about compiler optimizations, see our Optimization Notice. New Mesh Interconnect Architecture

Broadwell EX 24-core die Skylake-SP 28-core die

QPI QPI PCI-E PCI-E PCI-E PCI-E 2x UPI x20 PCIe** x16 PCIe x16 On Pkg 1x UPI x20 PCIe x16 Link Link X16 X16 X8 X4 (ESI) UBox PCU CB DMA DMI x 4 PCIe x16 R3QPI R2PCI IOAPIC QPI Agent IIO CBDMA

CHA/SF/LLC CHA/SF/LLC CHA/SF/LLC CHA/SF/LLC CHA/SF/LLC CHA/SF/LLC U D P N

SKX Core SKX Core SKX Core SKX Core SKX Core SKX Core IDI/QPII IDI/QPII IDI Core Cache LLC LLC Cache IDI Core Core U D Cache LLC LLC Cache D U Core IDI Core IDI Core Core Core Bo Bo 2.5MB 2.5MB Bo Bo Bo P N Bo 2.5MB 2.5MB Bo N P Bo IDI/QPII SAD SAD IDI/QPII SAD SAD CBO CBO CBO CBO DDR4 MC CHA/SF/LLC CHA/SF/LLC CHA/SF/LLC CHA/SF/LLC MC DDR4 IDI/QPII IDI/QPII IDI Core Cache LLC LLC Cache IDI Core Core U D Cache LLC LLC Cache D U Core DDR4 DDR4 IDI Core IDI Core Core Core Bo Bo 2.5MB 2.5MB Bo Bo Bo P N Bo 2.5MB 2.5MB Bo N P Bo IDI/QPII SAD SAD IDI/QPII SAD SAD SKX Core SKX Core SKX Core SKX Core CBO CBO CBO CBO DDR4 DDR4 IDI/QPII IDI/QPII IDI Core Cache LLC LLC Cache IDI Core Core U D Cache LLC LLC Cache D U Core IDI Core IDI Core Core Core Bo Bo 2.5MB 2.5MB Bo Bo Bo P N Bo 2.5MB 2.5MB Bo N P Bo IDI/QPII SAD SAD IDI/QPII SAD SAD CHA/SF/LLC CHA/SF/LLC CHA/SF/LLC CHA/SF/LLC CHA/SF/LLC CHA/SF/LLC CBO CBO CBO CBO IDI/QPII IDI/QPII IDI Core Cache LLC LLC Cache IDI Core Core U D Cache LLC LLC Cache D U Core IDI Core IDI Core Core Core Bo Bo 2.5MB 2.5MB Bo Bo Bo P N Bo 2.5MB 2.5MB Bo N P Bo SKX Core SKX Core SKX Core SKX Core SKX Core SKX Core IDI/QPII SAD SAD IDI/QPII SAD SAD CBO CBO CBO CBO IDI/QPII IDI/QPII IDI Core Cache LLC LLC Cache IDI Core Core U D Cache LLC LLC Cache D U Core CHA/SF/LLC CHA/SF/LLC CHA/SF/LLC CHA/SF/LLC CHA/SF/LLC CHA/SF/LLC IDI Core IDI Core Core Core Bo Bo 2.5MB 2.5MB Bo Bo Bo P N Bo 2.5MB 2.5MB Bo N P Bo IDI/QPII SAD SAD IDI/QPII SAD SAD CBO CBO CBO CBO IDI/QPII IDI/QPII IDI Core Cache LLC LLC Cache IDI Core Core U D Cache LLC LLC Cache D U Core SKX Core SKX Core SKX Core SKX Core SKX Core SKX Core IDI Core IDI Core Core Core Bo Bo 2.5MB 2.5MB Bo Bo Bo P N Bo 2.5MB 2.5MB Bo N P Bo IDI/QPII SAD SAD IDI/QPII SAD SAD CBO CBO CBO CBO CHA/SF/LLC CHA/SF/LLC CHA/SF/LLC CHA/SF/LLC CHA/SF/LLC CHA/SF/LLC U D P N

UP SKX Core SKX Core SKX Core SKX Core SKX Core SKX Core DN

Home Agent Home Agent CHA – Caching and Home Agent ; SF – Snoop Filter ; LLC – Last Level Cache ; DDR DDR DDR DDR Mem Ctlr Mem Ctlr SKX Core – Skylake Server Core ; UPI – Intel® UltraPath Interconnect Mesh Improves Scalability with Higher Bandwidth and Reduced Latencies

© 2017 Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks of Intel Corporation or its subsidiaries in the U.S. and/or other countries. *Other names and brands may be claimed as the property of others. 11 For more complete information about compiler optimizations, see our Optimization Notice. Re-Architected L2 & L3 Cache Hierarchy Previous Architectures Skylake-SP Architecture

Shared L3 Shared L3 1.375MB/core 2.5MB/core (non-inclusive) (inclusive)

L2 L2 L2 (1MB private) (1MB private) (1MB private) L2 L2 L2 (256KB private) (256KB private) (256KB private)

Core Core Core Core Core Core

• On-chip cache balance shifted from shared-distributed (prior architectures) to private-local (Skylake architecture): • Shared-distributed  shared-distributed L3 is primary cache • Private-local  private L2 becomes primary cache with shared L3 used as overflow cache • Shared L3 changed from inclusive to non-inclusive: • Inclusive (prior architectures)  L3 has copies of all lines in L2 • Non-inclusive (Skylake architecture)  lines in L2 may not exist in L3 Skylake-SP cache hierarchy architected specifically for Data center use case

© 2017 Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks of Intel Corporation or its subsidiaries in the U.S. and/or other countries. *Other names and brands may be claimed as the property of others. 12 For more complete information about compiler optimizations, see our Optimization Notice. Memory Subsystem

1x16/2x8/4x4 2 Memory Controllers, 3 channels each  total of 2x UPI x20 @ 1x16/2x8/4x4 PCIe @ 8GT/s 1x16/2x8/4x4 10.4GT/s PCIe @ 8GT/s x4 DMI PCIe @ 8GT/s 6 memory channels • DDR4 up to 2666, 2 DIMMs per channel 2x UPI x20 PCIe* x16 PCIe x16 PCIe x16 • Support for RDIMM, LRDIMM, and 3DS-LRDIMM DMI x4 CBDMA • 1.5TB Max Memory Capacity per Socket (2 DPC with 128GB DIMMs) CHA/SF/LLC CHA/SF/LLC CHA/SF/LLC CHA/SF/LLC • >60% increase in Memory BW per Socket compared to Core Core Core Core Intel® Xeon® processor E5 v4 DDR 4 MC CHA/SF/LLC CHA/SF/LLC MC DDR 4 2666 2666 - - Supports XPT prefetch to reduce LLC miss latency DDR 4 DDR 4 DDR 4 Core Core DDR 4 Introduces a new memory device failure detection 3x DDR4 3x 3X DDR4 3X CHA/SF/LLC CHA/SF/LLC CHA/SF/LLC CHA/SF/LLC and recovery scheme with Adaptive Double Device

Core Core Core Core Data Correction (ADDDC)

Significant memory bandwidth and capacity improvements

© 2017 Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks of Intel Corporation or its subsidiaries in the U.S. and/or other countries. *Other names and brands may be claimed as the property of others. 13 For more complete information about compiler optimizations, see our Optimization Notice. Memory Performance Bandwidth-Latency Profile

Source as of June 2017: Intel internal measurements on platform with Xeon Platinum 8180, Turbo enabled, UPI=10.4, SNC1/SNC2, 6x32GB DDR4-2400/2666 per CPU, 1 DPC, and platform with E5-2699 v4, Turbo enabled, 4x32GB DDR4-2400, RHEL 7.0. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance

© 2017 Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks of Intel Corporation or its subsidiaries in the U.S. and/or other countries. *Other names and brands may be claimed as the property of others. 14 For more complete information about compiler optimizations, see our Optimization Notice. 1.65x Average1 Generational Gains on 2-Socket Servers with Intel® Xeon® Scalable Processor Higher is better 2.5 1-node 2x Intel® Xeon® processor E5-26xx v4 ("Broadwell-EP 2S") 1-node 2x Intel® Xeon® Scalable processor 2.27

2 1.87 1.73 1.73 1.77 1.65 1.65 Average 1.53 1.58 1.65 1.40 1.44 1.5 1.33

1 Relative Performance 2S

0.5 jOPS Molecular Dynamics Molecular Options Pricing Options – – Java* Business Ops BusinessJava* Critical Memory Bandwidth Database OLTP Performance HPC Infrastructure Infrastructure virtualization App 0 BrokerageFirm OLTP EnterpriseandSales (Linux) Distribution App General Integer Throughput App Compute Technical Throughput Network L3 Packet Forwarding FSI Throughput LINPACK E5-26xx v4 TPC*-E SPECvirt_ Two-tier SPECint* SPECjbb2015* SPECfp* STREAM* HammerDB* LAMMPS DPDK L3 Black Intel® Baseline sc* 2013 SAP SD* _rate MultiJVM _rate Triad Packet Scholes Distribution (Linux) _base2006 critical-jOPS _base2006 Forwarding for LINPACK

1 Geomean based on Normalized Generational Performance (estimates based on Intel internal testing and published results of TPC-E, SPECvirt_sc*2013, SAP SD 2-Tier, SPEC*int_rate_base2006, SPEC*fp_rate_base2006, SPECjbb2015* MultiJVM, STREAM* triad, HammerDB, LAMMPS, DPDK L3 Packet Forwarding, Black-Scholes, Intel Distribution for LINPACK. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance Intel does not control or audit the design or implementation of third party benchmark data or Web sites referenced in this document. Intel encourages all of its customers to visit the referenced Web sites or others where similar performance benchmark data are reported and confirm whether the referenced benchmark data are accurate and reflect performance of systems available for purchase. Configurations: see slides 23, 24. *Other names and brands may be claimed as the property of others. © 2017 Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks of Intel Corporation or its subsidiaries in the U.S. and/or other countries. *Other names and brands may be claimed as the property of others. 15 For more complete information about compiler optimizations, see our Optimization Notice. HPC Financial services Intel® Xeon® Scalable Processors Intel® AVX-512 multi-gen

Monte Carlo European Option increased performance with the 2S Intel® Xeon® Gold 6148 MontE Carlo European options 1 processor Application: 3 Monte Carlo is a numerical method that uses statistical Up to sampling techniques to approximate solutions to quantitative 3.1X problems. In finance, Monte Carlo algorithms are used to AT A GLANCE faster evaluate complex instruments, portfolios, and investments. 2.38X Hardware: This is compute bound, double precision workload. 2S Intel® Xeon® Gold 6148 Processor 2 Potential Customer Benefits: Platform Features: . Higher performance allow either doing the same work Intel® Advanced Vector Extensions 512 faster leading to improved TCO or simulation of more paths (Intel® AVX-512) leading to higher confidence in results. More cores Improved memory hierarchy 1 Up to Performance Factors:

Normalized Normalized Performance 1.3X . Using Intel® AVX-512 SIMD vectorization improved Software Tools/Libraries: faster performance by 1.85X over Intel® AVX2. Intel® Parallel Studio XE 2017 . Higher core counts of Intel Xeon® Gold 6148 processor Composer Edition (C++) contributes to higher performance. 0 . Better memory hierarchy adds to the performance 2S Intel® Xeon® processor E5-2697 v3 . Code modernization strategy: Parallelizing outer loop over 2S Intel® Xeon® processor E5-2697 v4 options and vectorize inner loop of paths.

2S Intel® Xeon® Gold 6148 processor Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, Performance Metric: Speed-up using options/sec operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of 1 See configurations on slide 76. that product when combined with other products. For more complete information visit http://www.intel.com/performance. *Other names and brands may be claimed as the property of others.

© 2017 Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks of Intel Corporation or its subsidiaries in the U.S. and/or other countries. *Other names and brands may be claimed as the property of others. 16 For more complete information about compiler optimizations, see our Optimization Notice. Intel® Xeon® Scalable Processor

The Secure, Agile, Next-Generation Platform for Multi-Cloud Infrastructures Pervasive performance for Actionable Insights Security without compromise Agile service delivery

Skylake-SP cores Intel® AVX-512 Intel® Volume Management Device Technology Intel® AVX-512 PPK, MPX, MBE Intel® RAS Feeds: UPI, 6x DDR4, 3x16 PCIe, Intel® QAT w/ Secure Key PURLEY Management PRESALESOpen Stack Software Optimizations Intel® SSDs 30-3-30 Intel® Boot Guard Integration: Intel® Ethernet / Omni-Path / Intel® QuickAssist / Intel® Trusted Infrastructure FPGA

© 2017 Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks of Intel Corporation or its subsidiaries in the U.S. and/or other countries. *Other names and brands may be claimed as the property of others. 17 For more complete information about compiler optimizations, see our Optimization Notice. Code that performs and outperforms Download a free, 30-day trial of Intel® Parallel Studio XE 2018 today https://software.intel.com/en-us/intel-parallel-studio-xe/try-buy

And Don’t Forget… To check your inbox for the evaluation survey which will be emailed after this presentation. P.S. Everyone who fills out the survey will receive a personalized certificate indicating completion of the training!

© 2017 Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks of Intel Corporation or its subsidiaries in the U.S. and/or other countries. *Other names and brands may be claimed as the property of others. For more complete information about compiler optimizations, see our Optimization Notice. © 2017 Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks of Intel Corporation or its subsidiaries in the U.S. and/or other countries. *Other names and brands may be claimed as the property of others. For more complete information about compiler optimizations, see our Optimization Notice.

Configurations: Average Generational Gains on 2S Servers

1. 1.65x Average Performance: Geomean based on Normalized Generational Performance (estimates based on Intel internal testing and published results of TPC-E, SPECvirt_sc*2013, SAP SD 2-Tier, SPEC*int_rate_base2006, SPEC*fp_rate_base2006, SPECjbb2015* MultiJVM, STREAM* triad, HammerDB, LAMMPS, DPDK L3 Packet Forwarding, Black-Scholes, Intel Distribution for LINPACK). a) Up to 1.33x on TPC*-E: 1-Node, 2 x Intel® Xeon® Processor E5-2699 v4 on Group Limited with 512 GB Total Memory on Windows Server* 2012 Standard using SQL Server 2016 Enterprise Edition. Data Source:http://www.tpc.org/tpce/results/tpce_result_detail.asp?id=116032402 , Benchmark: TPC Benchmark* E (TPC-E), Score: 4938.14 vs. 1-Node, 2 x Intel® Xeon® Platinum 8180 processor on Lenovo Group Limited with 1536 GB Total Memory on Windows Server* 2016 Standard using SQL Server 2017 Enterprise Edition. Data Source: http://www.tpc.org/tpce/results/tpce_result_detail.asp?id=117062701, Benchmark: TPC Benchmark* E (TPC-E), Score: 6598.36. Higher is better b) Up to 1.40x on SPECvirt_sc* 2013: Claim based on best-published 2-soclet SPECvirt_sc* 2013 result submitted to/published at http://www.spec.org/virt_sc2013/results/res2016q3/virt_sc2013-20160823-00060-perf.html as of 11 July 2017, Score: 2360 @ 137 VMs vs. 1-Node, 2 x Intel® Xeon® Platinum 8180 Processor with 768 GB (24 x 32 GB, 2R x4 PC4-2666 DDR4 2666MHz RDIMM) Total Memory on SUSE Linux Enterprise Server 12 SP2. Data Source: http://www.spec.org, Benchmark: SPECvirt_sc* 2013, Score: 3323 @ 189 VMs Higher is better c) Up to 1.44x on 2-Tier SAP* SD : Claim based on best-published two-socket SAP SD 2-Tier on Linux* result published at http://global.sap.com/solutions/benchmark/sd2tier.epx as of 11 July 2017. New configuration: 2-tier, 2 x Intel® Xeon® Platinum 8180 Processor (56 cores/112 threads) on DellEMC PowerEdge* R740xd with 768 GB total memory on Red Hat Enterprise Linux* 7.3 using SAP Enhancement Package 5 for SAP ERP 6.0, SAP NetWeaver 7.22 pl221, and Sybase ASE 16.0. Source: Certification #: 2017017: www.sap.com/benchmark, SAP* SD 2-Tier enhancement package 5 for SAP ERP 6.0 score: 32,085 benchmark users. d) Up to 1.53x on SPECint*_rate_base2006 : Claim based on best-published two-socket SPECint*_rate_base2006 result submitted to/published at http://www.spec.org/cpu2006/results/ as of 11 July 2017. New configuration: 1-Node, 2 x Intel® Xeon® Platinum 8180 Processor on Huawei 2288H V5 with 384 GB total memory on SUSE Linux Enterprise Server 12 SP2 (x86_64) Kernel 4.4.21-69-default, using C/C++: Version 17.0.1.132 of Intel C/C++ Compiler for Linux. Source: submitted to www.spec.org, SPECint*_rate_base2006 Score: 2800. Results are pending SPEC approval; they are considered estimates until SPEC approves e) Up to 1.58x on SPECjbb*2015 MultiJVM critical-jOPS: Claim based on best-published two-socket SPECjbb*2015 MultiJVM critical-jOPS results published at http://www.spec.org/jbb2015/results/jbb2015multijvm.html as of 11 July 2017. New configuration: 1-Node, 2 x Intel® Xeon® Platinum 8180 Processor on Cisco* Systems UCS C240 M5 with 1536 GB total memory on Red Hat Enterprise Linux* 7.3 (Maipo) using Java* HotSpot 64-bit Server VM, version 1.8.0_131. Source: submitted to http://www.spec.org, SPECjbb2015* - MultiJVM scores: 141,360 max-jOPS and 118,551 critical-jOPS f) Up to 1.65x on SPECfp*_rate_base2006 :Claim based on best-published two-socket SPECfp*_rate_base2006 result submitted to/published at http://www.spec.org/cpu2006/results/ as of 11 July 2017. New configuration: 1-Node, 2 x Intel® Xeon® Platinum 8180 Processor on Huawei 2288H V5 with 384 GB total memory on SUSE Linux Enterprise Server 12 SP2 (x86_64) Kernel 4.4.21-69-default, using C/C++ and Fortran: Version 17.0.0.098 of Intel C/C++ and Intel Fortran Compiler for Linux. Source: submitted to www.spec.org, SPECfp*_rate_base2006 Score: 1850. g) Up to 1.65x on STREAM - triad: 1-Node, 2 x Intel® Xeon® Processor E5-2699 v4 on Grantley-EP (Wellsburg) with 256 GB Total Memory on Red Hat Enterprise Linux* 6.5 kernel 2.6.32-431 using Stream NTW avx2 measurements. Data Source: Request Number: 1709, Benchmark: STREAM - Triad, Score: 127.7 Higher is better vs. 1-Node, 2 x Intel® Xeon® Platinum 8180 Processor on Neon City with 384 GB Total Memory on Red Hat Enterprise Linux* 7.2-kernel 3.10.0-327 using STREAM AVX 512 Binaries. Data Source: Request Number: 2500, Benchmark: STREAM - Triad, Score: 199 Higher is better Configurations: Average Generational Gains on 2S Servers h) Up to 1.73x on HammerDB:1-Node, 2 x Intel® Xeon® Processor E5-2699 v4 on Grantley-EP (Wellsburg) with 384 GB Total Memory on Red Hat Enterprise Linux* 7.1 kernel 3.10.0-229 using Oracle 12.1.0.2.0 (including database and grid) with 800 warehouses, HammerDB 2.18. Data Source: Request Number: 1645, Benchmark: HammerDB, Score: 4.13568e+006 Higher is better vs. 1-Node, 2 x Intel® Xeon® Platinum 8180 Processor on Purley-EP (Lewisburg) with 768 GB Total Memory on Oracle Linux* 7.2 using Oracle 12.1.0.2.0, HammerDB 2.18. Data Source: Request Number: 2510, Benchmark: HammerDB, Score: 7.18049e+006 Higher is better i) Up to 1.73x on LAMMPS: LAMMPS is a classical molecular dynamics code, and an acronym for Large-scale Atomic/Molecular Massively Parallel Simulator. It is used to simulate the movement of atoms to develop better therapeutics, improve alternative energy devices, develop new materials, and more. E5-2697 v4: 2S Intel® Xeon® processor E5-2697 v4, 2.3GHz, 36 cores, Intel® Turbo Boost Technology and Intel® Hyperthreading Technology on, BIOS 86B0271.R00, 8x16GB 2400MHz DDR4, Red Hat Enterprise Linux* 7.2 kernel 3.10.0-327. Gold 6148: 2S Intel® Xeon® Gold 6148 processor, 2.4GHz, 40 cores, Intel® Turbo Boost Technology and Intel® Hyperthreading Technology on, BIOS 86B.01.00.0412.R00, 12x16GB 2666MHz DDR4, Red Hat Enterprise Linux* 7.2 kernel 3.10.0-327. j) Up to 1.77x on DPDK L3 Packet Forwarding: E5-2658 v4: 5 x Intel® XL710-QDA2, DPDK 16.04. Benchmark: DPDK l3fwd sample application Score: 158 Gbits/s packet forwarding at 256B packet using cores. Gold 6152: Estimates based on Intel internal testing on Intel Xeon 6152 2.1 GHz, 2x Intel®, FM10420(RRC) Gen Dual Port 100GbE Ethernet controller (100Gbit/card) 2x Intel® XXV710 PCI Express Gen Dual Port 25GbE Ethernet controller (2x25G/card), DPDK 17.02. Score: 281 Gbits/s packet forwarding at 256B packet using cores, IO and memory on a single socket k) Up to 1.87x on Black-Scholes: which is a popular mathematical model used in finance for European option valuation. This is a double precision version. E5-2697 v4: 2S Intel® Xeon® processor CPU E5-2697 v4 , 2.3GHz, 36 cores, turbo and HT on, BIOS 86B0271.R00, 128GB total memory, 8 x16GB 2400 MHz DDR4 RDIMM, 1 x 1TB SATA, Red Hat Enterprise Linux* 7.2 kernel 3.10.0-327. Gold 6148: Intel® Xeon® Gold processor 6148@ 2.4GHz, H0QS, 40 cores 150W. QMS1, turbo and HT on, BIOS SE5C620.86B.01.00.0412.020920172159, 192GB total memory, 12 x 16 GB 2666 MHz DDR4 RDIMM, 1 x 800GB INTEL SSD SC2BA80, Red Hat Enterprise Linux* 7.2 kernel 3.10.0-327 l) Up to 2.27x on LINPACK*: 1-Node, 2 x Intel® Xeon® Processor E5-2699 v4 on Grantley-EP (Wellsburg) with 64 GB Total Memory on Red Hat Enterprise Linux* 7.0 kernel 3.10.0-123 using MP_LINPACK 11.3.1 (Composer XE 2016 U1). Data Source: Request Number: 1636, Benchmark: Intel® Distribution of LINPACK, Score: 1446.4 Higher is better vs. 1-Node, 2 x Intel® Xeon® Platinum 8180 Processor on Wolf Pass SKX with 384 GB Total Memory on Red Hat Enterprise Linux* 7.3 using mp_linpack_2017.1.013. Data Source: Request Number: 3753, Benchmark: Intel® Distribution of LINPACK, Score: 3295.57 Higher is better Monte Carlo Benchmark Configuration Summary

.

Monte Carlo – Testing conducted on Monte Carlo software comparing 2S Intel® Xeon® Gold 6148 processor to 2S Intel® Xeon® Processor E5-2697 v3 and to 2S Intel® Xeon® Processor E5-2697 v4. OS: Red Hat Enterprise Linux* 7.2 kernel 3.10.0-327. Testing by Intel March 2017. BASELINE: 2S Intel® Xeon® processor E5-2697 v3, 2.6GHz, 28 cores, turbo and HT on, BIOS 86B.0036.R05, 64GB total memory, 8x8GB 2133 MHz DDR4, Fedora release 20 kernel 3.15.10-200. NEXT GEN: 2S Intel® Xeon® processor E5- 2697 v4 , 2.3GHz, 36 cores, turbo and HT on, BIOS 86B0271.R00, 128GB total memory, 8 x16GB 2400 MHz DDR4 RDIMM, 1 x 1TB SATA, Red Hat Enterprise Linux* 7.2 kernel 3.10.0-327. NEW: 2S Intel® Xeon® Gold 6148 processor @ 2.4GHz, H0QS, 40 cores 150W. QMS1, turbo and HT on, BIOS SE5C620.86B.01.00.0412.020920172159, 192GB total memory, 12 x 16 GB 2666 MHz DDR4 RDIMM, 1 x 800GB Intel® SSD SC2BA80, Red Hat Enterprise Linux* 7.2 kernel 3.10.0-327.

23