The IBM POWER9 Scale up Processor

Total Page:16

File Type:pdf, Size:1020Kb

The IBM POWER9 Scale up Processor The IBM POWER9 Scale Up Processor Jeff Stuecheli William Starke POWER Systems, IBM Systems © 2018 IBM Corporation POWER Processor Technology Roadmap POWER9 Family 14nm POWER8 Family Built for the Cognitive Era 22nm POWER7+ − Enhanced Core and Chip 32 nm Enterprise & Architecture Optimized for POWER7 Big Data Optimized Emerging Workloads 45 nm Enterprise − Processor Family with - Up to 12 Cores Scale-Up and Scale-Out Enterprise - 2.5x Larger L3 cache - SMT8 - On-die acceleration - CAPI Acceleration Optimized Silicon - 8 Cores - Zero-power core idle state - High Bandwidth GPU Attach - SMT4 − Premier Platform for - eDRAM L3 Cache Accelerated Computing 1H10 2H12 1H14 – 2H16 2H17 – 2H18+ © 2018 IBM Corporation 2 POWER9 Processor – Common Features SMP/Accelerator Signaling Memory Signaling Leadership New Core Microarchitecture Hardware Acceleration Platform • Stronger thread performance Core Core Core Core • Enhanced on-chip acceleration • Efficient agile pipeline L2 L2 L2 L2 • Nvidia NVLink 2.0: High bandwidth • POWER ISA v3.0 L3 Region L3 Region L3 Region L3 Region and advanced new features (25G) L3 Region L3 Region L3 Region L3 Region • CAPI 2.0: Coherent accelerator and L2 L2 L2 L2 Enhanced Cache Hierarchy storage attach (PCIe G4) • 120MB NUCA L3 architecture Core Core Core Core • OpenCAPI: Improved latency and bandwidth, open interface (25G) • 12 x 20-way associative regions PCIe Signaling PCIe On-Chip Accel Signaling SMP L3 Region L3 Region SMP Interconnect & L3 Region L3 Region • Advanced replacement policies L2 L2 Chip Accelerator Enablement L2 L2 State of the Art I/O Subsystem - • Fed by 7 TB/s on-chip bandwidth Off Core Core Core Core • PCIe Gen4 – 48 lanes Cloud + Virtualization Innovation SMP/Accelerator Signaling Memory Signaling High Bandwidth Signaling Technology • Quality of service assists 14nm finFET Semiconductor Process • 16 Gb/s interface • New interrupt architecture • Improved device performance and – Local SMP • Workload optimized frequency reduced energy • PowerAXON 25 GT/sec Link interface • Hardware enforced trusted execution • 17 layer metal stack and eDRAM – Accelerator, remote SMP • 8.0 billion transistors © 2018 IBM Corporation 3 POWER9 – Dual Memory Subsystems Scale Out Scale Up Direct Attach Memory Buffered Memory 8 Direct DDR4 Ports 8 Buffered Channels • Up to 150 GB/s of sustained bandwidth • Up to 230GB/s of sustained bandwidth • Low latency access • Extreme capacity – up to 8TB / socket • Commodity packaging form factor • Superior RAS with chip kill and lane sparing • Adaptive 64B / 128B reads • Compatible with POWER8 system memory • Agnostic interface for alternate memory innovations © 2018 IBM Corporation 4 PowerAXON High-speed 25 GT/s Signaling Power Processor Utilize Best-of-Breed Power Processor 25 GT/s Optical-Style Flexible & Modular Coherence Packaging Signaling Technology OpenCAPI Infrastructure App FPGA © 2018 IBM Corporation 5 Scale Out Scale Up Direct Attach Memory Buffered Memory 2 Socket SMP 16 Socket SMP PowerAXON PowerAXON x24 x48 DDRx4 DMIx4 Mem Cntl Mem Cntl SMPx2 SMPx3 PCIe PCIe Mem Cntl Mem Cntl DDRx4 DMIx4 PowerAXON PowerAXON x24 x48 © 2018 IBM Corporation 6 Scale Up Chipset Block Diagram PCI G4 Processor Accel Local 3x32 x48 I/O Cores SMP Attach Links (16 GHz) Cache and (16 GHz) Interconnect 32B 4 ops Bypass Remote 32B SMP Link Active 4 KB Operations OpenCAPI NVLINK Control 64 + 12 2 KB Control Control Connect to Framing / 4x8 6x8 4x24 Memory other POWER9 Parsing PowerAXON 25 GHz Signaling Scale-up chips, Control GPU’s, and x96 OpenCAPI Devices 8 Channels Command, 1B write, per Processor 2B read @ 9.6 GHz 4 ports of Centaur Memory Buffer 1600 MHz 10B DDR4 DDR4 Operation Framing / Sched DRAM and Data Parsing 16 MB DRAM 8 x Control Cache DRAM © 2018 IBM Corporation DRAM 7 Four Socket Server Topology 16 GHz x32 1 TB POWER9 POWER9 Buffered POWER9 POWER9 Scale-Up Scale-Up Memory Scale-Up Scale-Up © 2018 IBM Corporation 8 Sixteen Socket Server Topology 16 GHz x32 25 GHz x24 POWER9 POWER9 POWER9 POWER9 Scale-Up Scale-Up Scale-Up Scale-Up POWER9 POWER9 POWER9 POWER9 Scale-Up Scale-Up Scale-Up Scale-Up POWER9 POWER9 POWER9 POWER9 Scale-Up Scale-Up Scale-Up Scale-Up POWER9 POWER9 POWER9 POWER9 Scale-Up Scale-Up Scale-Up Scale-Up © 2018 IBM Corporation 9 Power E980 Server Modular, Scalable POWER9 server . 1 to 4 x 5U CEC drawers + 2U Control Unit POWER9 Enterprise processor Up to 192 cores in a single system Up to 64 TB DDR4 memory Up to 32 PCIe Gen4 slots PowerAXON 25Gb/s ports . Used for SMP cabling between nodes - 4x bandwidth improvement . Enabled for OpenCAPI accelerators © 2018 IBM Corporation 10 PCIe Expansion Drawer PCIe NVMe Bays SMP/etc Conn. Cable Card CXP Conn FPGA CXP Conn Cable Card CXP Conn FPGA CXP Conn Fan-out Module Fan-out Module © 2018 IBM Corporation 11 GPU on PowerAXON 16 GHz x32 25 GHz x8 POWER9 POWER9 POWER9 POWER9 Scale-Up Scale-Up Scale-Up Scale-Up GPU GPU GPU GPU GPU GPU GPU GPU GPU GPU GPU GPU GPU GPU GPU GPU GPU GPU GPU GPU GPU GPU GPU GPU © 2018 IBM Corporation Demonstration of POWER9 capability, does not imply specific system plans 12 Storage Class Memory on PowerAXON 16 GHz x32 25 GHz x8 POWER9 POWER9 POWER9 POWER9 Scale-Up Scale-Up Scale-Up Scale-Up 32 TB SCM 32 TB SCM 32 TB SCM 32 TB SCM 32 TB SCM 32 TB SCM 32 TB SCM 32 TB SCM 32 TB SCM 32 TB SCM 32 TB SCM 32 TB SCM 32 TB SCM 32 TB SCM 32 TB SCM 32 TB SCM © 2018 IBM Corporation Demonstration of POWER9 capability, does not imply specific system plans 13 Future Evolution of System Architecture Yesterday’s Plumbing Tomorrow’s Differentiation B B B B B B B B B B B B B B B B B B JEDEC Buffer OpenCAPI Northbound Cores & Cores & Cores & Caches System Bus Caches Caches 768 GB/s OpenCAPI & PCI Yesterday’s Plumbing PCI Gen X OpenCAPI / NVLink Tomorrow’s Differentiation CPU/Accelerator Bandwidth 1x 2x 5x 7-10x System bottleneck 14 © 2018 IBM Corporation 14 OpenCAPI Memory • Signaling AXON @25.6GHz vs DDR4 @ 3200 MHz – 4 times the bandwidth per IO • Idle latency over traditional DDR – POWER8/9 Centaur design ~10 ns – OpenCAPI target of ~5 ns • Centaur One proprietary design • OpenCAPI Open SCM © 2018 IBM Corporation 15 Proposed POWER Processor Technology and I/O Roadmap Todays talk POWER7 Architecture POWER8 Architecture POWER9 Architecture POWER10 2010 2012 2014 2016 2017 2018 2019 2020+ POWER7 POWER7+ POWER8 POWER8 P9 SO P9 SU P9 P10 8 cores 8 cores 12 cores w/ NVLink 12/24 cores 12/24 cores w/ Adv. I/O TBA cores 45nm 32nm 22nm 12 cores 14nm 14nm 12/24 cores 22nm 14nm New Micro- Enhanced New Micro- Enhanced New Micro- Enhanced New Micro- Architecture Micro- Architecture Micro- Architecture Micro- Enhanced Architecture Architecture Architecture Architecture Micro- With NVLink Direct attach Architecture memory Buffered New Process New Process New Process Memory New New Technology Technology Technology New Process Memory Technology Technology Subsystem 65 GB/s 65 GB/s 210 GB/s 210 GB/s 150 GB/s 210 GB/s 350+ GB/s 435+ GB/s Sustained Memory Bandwidth PCIe Gen2 PCIe Gen2 PCIe Gen3 PCIe Gen3 PCIe Gen4 x48 PCIe Gen4 x48 PCIe Gen4 x48 PCIe Gen5 Standard I/O Interconnect 20 GT/s 25 GT/s 25 GT/s N/A N/A N/A 25 GT/s 32 & 50 GT/s 160GB/s Advanced I/O Signaling 300GB/s 300GB/s 300GB/s CAPI 2.0, CAPI 2.0, CAPI 2.0, CAPI 1.0 , OpenCAPI3.0, OpenCAPI3.0, OpenCAPI4.0, N/A N/A CAPI 1.0 NVLink 1.0 TBA Advanced I/O Architecture NVLink2.0 NVLink2.0 NVLink3.0 © 2018 IBM Corporation Statement of Direction, Subject to Change 16 Thanks Questions? © 2018 IBM Corporation.
Recommended publications
  • Investigations on Hardware Compression of IBM Power9 Processors
    Investigations on hardware compression of IBM Power9 processors Jérome Kieffer, Pierre Paleo, Antoine Roux, Benoît Rousselle Outline ● The bandwidth issue at synchrotrons sources ● Presentation of the evaluated systems: – Intel Xeon vs IBM Power9 – Benchmarks on bandwidth ● The need for compression of scientific data – Compression as part of HDF5 – The hardware compression engine NX-gzip within Power9 – Gzip performance benchmark – Bitshuffle-LZ4 benchmark – Filter optimizations – Benchmark of parallel filtered gzip ● Conclusions – on the hardware – on the compression pipeline in HDF5 Page 2 HDF5 on Power9 18/09/2019 Bandwidth issue at synchrotrons sources Data analysis computer with the main interconnections and their associated bandwidth. Data reduction Upgrade to → Azimuthal integration 100 Gbit/s Data compression ! Figures from former generation of servers Kieffer et al. Volume 25 | Part 2 | March 2018 | Pages 612–617 | 10.1107/S1600577518000607 Page 3 HDF5 on Power9 18/09/2019 Topologies of Intel Xeon servers in 2019 Source: intel.com Page 4 HDF5 on Power9 18/09/2019 Architecture of the AC922 server from IBM featuring Power9 Credit: Thibaud Besson, IBM France Page 6 HDF5 on Power9 18/09/2019 Bandwidth measurement: Xeon vs Power9 Computer Dell R840 IBM AC922 Processor 4 Intel Xeon (12 cores) 2 IBM Power9 (16 cores) 2.6 GHz 2.7 GHz Cache (L3) 19 MB 8x 10 MB Memory channels 4x 6 DDR4 2x 8 DDR4 Memory capacity → 3TB → 2TB Memory speed theory 512 GB/s 340 GB/s Measured memory speed 160 GB/s 270 GB/s Interconnects PCIe v3 PCIe v4 NVlink2 & CAPI2 GP-GPU co-processor 2Tesla V100 PCIe v3 2Tesla V100 NVlink2 Interconnect speed 12 GB/s 48 GB/s CPU ↔ GPU Page 8 HDF5 on Power9 18/09/2019 Strength and weaknesses of the OpenPower architecture While amd64 is today’s de facto standard in HPC, it has a few competitors: arm64, ppc64le and to a less extend riscv and mips64.
    [Show full text]
  • Wind Rose Data Comes in the Form >200,000 Wind Rose Images
    Making Wind Speed and Direction Maps Rich Stromberg Alaska Energy Authority [email protected]/907-771-3053 6/30/2011 Wind Direction Maps 1 Wind rose data comes in the form of >200,000 wind rose images across Alaska 6/30/2011 Wind Direction Maps 2 Wind rose data is quantified in very large Excel™ spreadsheets for each region of the state • Fields: X Y X_1 Y_1 FILE FREQ1 FREQ2 FREQ3 FREQ4 FREQ5 FREQ6 FREQ7 FREQ8 FREQ9 FREQ10 FREQ11 FREQ12 FREQ13 FREQ14 FREQ15 FREQ16 SPEED1 SPEED2 SPEED3 SPEED4 SPEED5 SPEED6 SPEED7 SPEED8 SPEED9 SPEED10 SPEED11 SPEED12 SPEED13 SPEED14 SPEED15 SPEED16 POWER1 POWER2 POWER3 POWER4 POWER5 POWER6 POWER7 POWER8 POWER9 POWER10 POWER11 POWER12 POWER13 POWER14 POWER15 POWER16 WEIBC1 WEIBC2 WEIBC3 WEIBC4 WEIBC5 WEIBC6 WEIBC7 WEIBC8 WEIBC9 WEIBC10 WEIBC11 WEIBC12 WEIBC13 WEIBC14 WEIBC15 WEIBC16 WEIBK1 WEIBK2 WEIBK3 WEIBK4 WEIBK5 WEIBK6 WEIBK7 WEIBK8 WEIBK9 WEIBK10 WEIBK11 WEIBK12 WEIBK13 WEIBK14 WEIBK15 WEIBK16 6/30/2011 Wind Direction Maps 3 Data set is thinned down to wind power density • Fields: X Y • POWER1 POWER2 POWER3 POWER4 POWER5 POWER6 POWER7 POWER8 POWER9 POWER10 POWER11 POWER12 POWER13 POWER14 POWER15 POWER16 • Power1 is the wind power density coming from the north (0 degrees). Power 2 is wind power from 22.5 deg.,…Power 9 is south (180 deg.), etc… 6/30/2011 Wind Direction Maps 4 Spreadsheet calculations X Y POWER1 POWER2 POWER3 POWER4 POWER5 POWER6 POWER7 POWER8 POWER9 POWER10 POWER11 POWER12 POWER13 POWER14 POWER15 POWER16 Max Wind Dir Prim 2nd Wind Dir Sec -132.7365 54.4833 0.643 0.767 1.911 4.083
    [Show full text]
  • IBM Power System POWER8 Facts and Features
    IBM Power Systems IBM Power System POWER8 Facts and Features April 29, 2014 IBM Power Systems™ servers and IBM BladeCenter® blade servers using IBM POWER7® and POWER7+® processors are described in a separate Facts and Features report dated July 2013 (POB03022-USEN-28). IBM Power Systems™ servers and IBM BladeCenter® blade servers using IBM POWER6® and POWER6+™ processors are described in a separate Facts and Features report dated April 2010 (POB03004-USEN-14). 1 IBM Power Systems Table of Contents IBM Power System S812L 4 IBM Power System S822 and IBM Power System S822L 5 IBM Power System S814 and IBM Power System S824 6 System Unit Details 7 Server I/O Drawers & Attachment 8 Physical Planning Characteristics 9 Warranty / Installation 10 Power Systems Software Support 11 Performance Notes & More Information 12 These notes apply to the description tables for the pages which follow: Y Standard / Supported Optional Optionally Available / Supported N/A or - Not Available / Supported or Not Applicable SOD Statement of General Direction announced SLES SUSE Linux Enterprise Server RHEL Red Hat Enterprise Linux a One x8 PCIe slots must contain a 4-port 1Gb Ethernet LAN available for client use b Use of expanded function storage backplane uses one PCIe slot Backplane provides dual high performance SAS controllers with 1.8 GB write cache expanded up to 7.2 GB with c compression plus Easy Tier function plus two SAS ports for running an EXP24S drawer d Full benchmark results are located at ibm.com/systems/power/hardware/reports/system_perf.html e Option is supported on IBM i only through VIOS.
    [Show full text]
  • March 11, 2010 Presentation
    IBM Power Systems POWER7TM Announcement, The Next Generation of Power Systems Power your planet. February 25, 2010 IBM Power Systems 2 February 25, 2010 IBM Power Systems POWER7 System Highlights .Balance System Design - Cache, Memory, and IO .POWER7 Processor Technology - 6th Implementation of multi-core design - On chip L2 & L3 caches .POWER7 System Architecture - Blades to High End offerings - Enhances memory implementation - PCIe, SAS / SATA .Built in Virtualization - Memory Expansion - VM Control .Green Technologies - Processor Nap & Sleep Mode - Memory Power Down support - Aggressive Power Save / Capping Modes 600 500 .Availability 400 - Processor Instruction Retry 300 - Alternate Process Recovery 200 100 - Concurrent Add & Services 0 JS23 JS43 520 550 560 570/16 570/32 595 3 February 25, 2010 IBM Power Systems 4 February 25, 2010 IBM Power Systems Power Processor Technology IBM investment in the Power Franchise Dependable Execution for a decade POWER8 POWER7 45 nm Globalization and globally POWER6 available resources 65 nm •Performance/System Capacity POWER5 •4-5X increase from Power6 130 nm •Multi Core – Up to 8 POWER4 •SMT4 – 4 threads/core 180 nm . Dual Core •On-Chip eDRAM . High Frequencies • Energy . Dual Core . Virtualization + . Enhanced Scaling • Efficiency: 3-4X Power6 . Memory Subsystem + . SMT • Dynamic Energy . Dual Core . Altivec . Distributed Switch + Management . Chip Multi Processing . Instruction Retry . Distributed Switch . Core Parallelism + • Reliability + . Dyn Energy Mgmt . Shared L2 . FP Performance + . SMT + •Memory DIMM – DRAM . Dynamic LPARs (32) . Memory bandwidth + . Protection Keys Sparing . Virtualization •N+2 Voltage Regulator Redundancy •Protection Keys + 5 February 25, 2010 IBM Power Systems POWER6 – POWER7 Compare Wireless world Mobile platforms are developing as new means of identification.
    [Show full text]
  • POWER® Processor-Based Systems
    IBM® Power® Systems RAS Introduction to IBM® Power® Reliability, Availability, and Serviceability for POWER9® processor-based systems using IBM PowerVM™ With Updates covering the latest 4+ Socket Power10 processor-based systems IBM Systems Group Daniel Henderson, Irving Baysah Trademarks, Copyrights, Notices and Acknowledgements Trademarks IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. These and other IBM trademarked terms are marked on their first occurrence in this information with the appropriate symbol (® or ™), indicating US registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is available on the Web at http://www.ibm.com/legal/copytrade.shtml The following terms are trademarks of the International Business Machines Corporation in the United States, other countries, or both: Active AIX® POWER® POWER Power Power Systems Memory™ Hypervisor™ Systems™ Software™ Power® POWER POWER7 POWER8™ POWER® PowerLinux™ 7® +™ POWER® PowerHA® POWER6 ® PowerVM System System PowerVC™ POWER Power Architecture™ ® x® z® Hypervisor™ Additional Trademarks may be identified in the body of this document. Other company, product, or service names may be trademarks or service marks of others. Notices The last page of this document contains copyright information, important notices, and other information. Acknowledgements While this whitepaper has two principal authors/editors it is the culmination of the work of a number of different subject matter experts within IBM who contributed ideas, detailed technical information, and the occasional photograph and section of description.
    [Show full text]
  • Openpower AI CERN V1.Pdf
    Moore’s Law Processor Technology Firmware / OS Linux Accelerator sSoftware OpenStack Storage Network ... Price/Performance POWER8 2000 2020 DRAM Memory Chips Buffer Power8: Up to 12 Cores, up to 96 Threads L1, L2, L3 + L4 Caches Up to 1 TB per socket https://www.ibm.com/blogs/syst Up to 230 GB/s sustained memory ems/power-systems- openpower-enable- bandwidth acceleration/ System System Memory Memory 115 GB/s 115 GB/s POWER8 POWER8 CPU CPU NVLink NVLink 80 GB/s 80 GB/s P100 P100 P100 P100 GPU GPU GPU GPU GPU GPU GPU GPU Memory Memory Memory Memory GPU PCIe CPU 16 GB/s System bottleneck Graphics System Memory Memory IBM aDVantage: data communication and GPU performance POWER8 + 78 ms Tesla P100+NVLink x86 baseD 170 ms GPU system ImageNet / Alexnet: Minibatch size = 128 ADD: Coherent Accelerator Processor Interface (CAPI) FPGA CAPP PCIe POWER8 Processor ...FPGAs, networking, memory... Typical I/O MoDel Flow Copy or Pin MMIO Notify Poll / Int Copy or Unpin Ret. From DD DD Call Acceleration Source Data Accelerator Completion Result Data Completion Flow with a Coherent MoDel ShareD Mem. ShareD Memory Acceleration Notify Accelerator Completion Focus on Enterprise Scale-Up Focus on Scale-Out and Enterprise Future Technology and Performance DriVen Cost and Acceleration DriVen Partner Chip POWER6 Architecture POWER7 Architecture POWER8 Architecture POWER9 Architecture POWER10 POWER8/9 2007 2008 2010 2012 2014 2016 2017 TBD 2018 - 20 2020+ POWER6 POWER6+ POWER7 POWER7+ POWER8 POWER8 P9 SO P9 SU P9 SO 2 cores 2 cores 8 cores 8 cores 12 cores w/ NVLink
    [Show full text]
  • IBM Power System E850 the Most Agile 4-Socket System in the Marketplace, Optimized for Performance, Reliability and Expansion
    IBM Systems Data Sheet IBM Power System E850 The most agile 4-socket system in the marketplace, optimized for performance, reliability and expansion Businesses today are demanding faster insights that analyze more data in Highlights new ways. They need to implement applications in days versus months, and they need to achieve all these goals while reducing IT costs. This is ●● ●●Designed for data and analytics, delivers creating new demands on IT infrastructures, requiring new levels of per- secure, reliable performance in a compact, 4-socket system formance and the flexibility to respond to new business opportunities, all at an affordable price. ●● ●●Can flexibly scale to rapidly respond to changing business needs The IBM® Power® System E850 server offers a unique blend of ●● ●●Can reduce IT costs through application enterprise-class capabilities in a space-efficient, 4-socket system with consolidation, higher availability and excellent price performance. With up to 48 IBM POWER8™ processor virtualization to yield over 70 percent utilization cores, advanced IBM PowerVM® virtualization that can yield over 70 percent system utilization and Capacity on Demand (CoD), no other 4-socket system in the industry delivers this combination of performance, efficiency and business agility. These capabilities make the Power E850 server an ideal platform for medium-size businesses and as a departmental server or data center building block for large enterprises. Designed for the demands of big data and analytics Businesses are amassing a wealth of data and IBM Power Systems™, built with innovation to support today’s data demands, can store it, secure it and, most important, extract actionable insight from it.
    [Show full text]
  • IBM Power Systems Performance Report Apr 13, 2021
    IBM Power Performance Report Power7 to Power10 September 8, 2021 Table of Contents 3 Introduction to Performance of IBM UNIX, IBM i, and Linux Operating System Servers 4 Section 1 – SPEC® CPU Benchmark Performance 4 Section 1a – Linux Multi-user SPEC® CPU2017 Performance (Power10) 4 Section 1b – Linux Multi-user SPEC® CPU2017 Performance (Power9) 4 Section 1c – AIX Multi-user SPEC® CPU2006 Performance (Power7, Power7+, Power8) 5 Section 1d – Linux Multi-user SPEC® CPU2006 Performance (Power7, Power7+, Power8) 6 Section 2 – AIX Multi-user Performance (rPerf) 6 Section 2a – AIX Multi-user Performance (Power8, Power9 and Power10) 9 Section 2b – AIX Multi-user Performance (Power9) in Non-default Processor Power Mode Setting 9 Section 2c – AIX Multi-user Performance (Power7 and Power7+) 13 Section 2d – AIX Capacity Upgrade on Demand Relative Performance Guidelines (Power8) 15 Section 2e – AIX Capacity Upgrade on Demand Relative Performance Guidelines (Power7 and Power7+) 20 Section 3 – CPW Benchmark Performance 19 Section 3a – CPW Benchmark Performance (Power8, Power9 and Power10) 22 Section 3b – CPW Benchmark Performance (Power7 and Power7+) 25 Section 4 – SPECjbb®2015 Benchmark Performance 25 Section 4a – SPECjbb®2015 Benchmark Performance (Power9) 25 Section 4b – SPECjbb®2015 Benchmark Performance (Power8) 25 Section 5 – AIX SAP® Standard Application Benchmark Performance 25 Section 5a – SAP® Sales and Distribution (SD) 2-Tier – AIX (Power7 to Power8) 26 Section 5b – SAP® Sales and Distribution (SD) 2-Tier – Linux on Power (Power7 to Power7+)
    [Show full text]
  • IBM Energyscale for POWER8 Processor-Based Systems
    IBM EnergyScale for POWER8 Processor-Based Systems November 2015 Martha Broyles Christopher J. Cain Todd Rosedahl Guillermo J. Silva Table of Contents Executive Overview...................................................................................................................................4 EnergyScale Features.................................................................................................................................5 Power Trending................................................................................................................................................................5 Thermal Reporting...........................................................................................................................................................5 Fixed Maximum Frequency Mode...................................................................................................................................6 Static Power Saver Mode.................................................................................................................................................6 Fixed Frequency Override...............................................................................................................................................6 Dynamic Power Saver Mode...........................................................................................................................................7 Power Management's Effect on System Performance................................................................................................7
    [Show full text]
  • Towards a Portable Hierarchical View of Distributed Shared Memory Systems: Challenges and Solutions
    Towards A Portable Hierarchical View of Distributed Shared Memory Systems: Challenges and Solutions Millad Ghane Sunita Chandrasekaran Margaret S. Cheung Department of Computer Science Department of Computer and Information Physics Department University of Houston Sciences University of Houston TX, USA University of Delaware Center for Theoretical Biological Physics, [email protected],[email protected] DE, USA Rice University [email protected] TX, USA [email protected] Abstract 1 Introduction An ever-growing diversity in the architecture of modern super- Heterogeneity has become increasingly prevalent in the recent computers has led to challenges in developing scientifc software. years given its promising role in tackling energy and power con- Utilizing heterogeneous and disruptive architectures (e.g., of-chip sumption crisis of high-performance computing (HPC) systems [15, and, in the near future, on-chip accelerators) has increased the soft- 20]. Dennard scaling [14] has instigated the adaptation of heteroge- ware complexity and worsened its maintainability. To that end, we neous architectures in the design of supercomputers and clusters need a productive software ecosystem that improves the usability by the HPC community. The July 2019 TOP500 [36] report shows and portability of applications for such systems while allowing how 126 systems in the list are heterogeneous systems confgured every parallelism opportunity to be exploited. with one or many GPUs. This is the prevailing trend in current gen- In this paper, we outline several challenges that we encountered eration of supercomputers. As an example, Summit [31], the fastest in the implementation of Gecko, a hierarchical model for distributed supercomputer according to the Top500 list (June 2019) [36], has shared memory architectures, using a directive-based program- two IBM POWER9 processors and six NVIDIA Volta V100 GPUs.
    [Show full text]
  • IBM Power System S814 and S824 Technical Overview and Introduction
    Front cover IBM Power Systems S814 and S824 Technical Overview and Introduction Outstanding performance based on POWER8 processor technology 4U scale-out desktop and rack-mount servers Improved reliability, availability, and serviceability features Alexandre Bicas Caldeira Bartłomiej Grabowski Volker Haug Marc-Eric Kahle Cesar Diniz Maciel Monica Sanchez ibm.com/redbooks Redpaper International Technical Support Organization IBM Power System S814 and S824 Technical Overview and Introduction August 2014 REDP-5097-00 Note: Before using this information and the product it supports, read the information in “Notices” on page vii. First Edition (August 2014) This edition applies to IBM Power System S814 (8286-41A) and IBM Power System S824 (8286-42A) systems. © Copyright International Business Machines Corporation 2014. All rights reserved. Note to U.S. Government Users Restricted Rights -- Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp. Contents Notices . vii Trademarks . viii Preface . ix Authors. ix Now you can become a published author, too! . xi Comments welcome. xi Stay connected to IBM Redbooks . xi Chapter 1. General description . 1 1.1 Systems overview . 2 1.1.1 Power S814 server . 2 1.1.2 Power S824 server . 3 1.2 Operating environment . 3 1.3 Physical package . 4 1.3.1 Tower model . 4 1.3.2 Rack-mount model . 5 1.4 System features . 7 1.4.1 Power S814 system features . 7 1.4.2 Power S824 system features . 8 1.4.3 Minimum features . 9 1.4.4 Power supply features . 9 1.4.5 Processor module features . 9 1.4.6 Memory features .
    [Show full text]
  • System Automation for Z/OS : Planning and Installation About This Publication
    System Automation for z/OS Version 4.Release 1 Planning and Installation IBM SC34-2716-01 Note Before using this information and the product it supports, read the information in Appendix H, “Notices,” on page 201. Edition Notes This edition applies to IBM System Automation for z/OS® (Program Number 5698-SA4) Version 4 Release 1, an IBM licensed program, and to all subsequent releases and modifications until otherwise indicated in new editions. This edition replaces SC34-2716-00. © Copyright International Business Machines Corporation 1996, 2017. US Government Users Restricted Rights – Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp. Contents Figures................................................................................................................. xi Tables................................................................................................................ xiii Accessibility........................................................................................................ xv Using assistive technologies...................................................................................................................... xv Keyboard navigation of the user interface................................................................................................. xv About this publication........................................................................................ xvii Who Should Use This Publication.............................................................................................................xvii
    [Show full text]