8335-GTB Hardware Architecture Overview

Chris Mann
Power Systems Development
January 6, 2017
IBM CONFIDENTIAL
© 2014 IBM Corporation

Garrison Architecture Overview - Agenda
• Hardware Overview
• Processor Interfaces
• Memory Architecture
• GPU Interconnect
• I/O Interconnect

Hardware Overview

Garrison Design - 2 Socket P8 with NVLink, 4 GPU System
• POWER8 with NVLink (2x)
  • 190W Sort
  • Integrated NVLink 1.0
• NVIDIA GPU
  • SXM2 form factor
  • NVLink 1.0
  • 300 W
  • Max of 2 per socket
• Memory DIMM Riser (8x)
  • 4 IS DDR4 DIMMs per riser
  • Single Centaur per riser
  • 32 IS DIMMs total
  • 32-1024 GB memory capacity
• PCIe slot (2x)
  • Gen3 PCIe
  • HHHL Adapter
• PCIe slot (1x)
  • Gen3 PCIe
  • HHHL Adapter
• Service Controller Card
  • BMC Content
• Power Supplies (2x)
  • 1300W
  • Common Form Factor Supply
• Cooling Fans (4x)
  • 80mm Counter-Rotating Fans
  • Hot swap
• Storage Option (2x)
  • 0-2 SATA HDD/SSD
  • Tray design for install/removal
  • Hot Swap

Garrison Design - 2 Socket P8 with NVLink, 4 GPU System (second view)
• Power Supplies (2x)
  • 1300W
  • Common Form Factor Supply
• POWER8 with NVLink (2x)
  • 190W Sort
  • Integrated NVLink 1.0
• Service Controller Card
  • BMC Content
• PCIe slot (1x)
  • Gen3 PCIe
  • HHHL Adapter
• PCIe slot (2x)
  • Gen3 PCIe
  • HHHL Adapter
• Water Cooling Access
  • Removable panel for water line access
• NVIDIA GPU
  • SXM2 form factor
  • NVLink 1.0
  • 300 W
  • Max of 2 per socket

Garrison Design - Water Cooled 2 Socket P8 with NVLink, 4 GPU Server
• Power Supplies (2x)
  • 1300W
  • Common Form Factor Supply
• Service Controller Card
  • Firestone BMC Content
  • Daughter card due to planar space constraints
• PCIe slot (1x)
  • Gen3 PCIe
  • x16 HHHL Adapter
• PCIe slot (2x)
  • Gen3 PCIe
  • 1, x16 HHHL Adapter
  • 1, x8 HHHL Adapter
• NVIDIA GPU (4x)
  • SXM2 form factor
  • NVLink 1.0
  • 300 W
  • 2 per socket
• POWER8 with NVLink (2x)
  • 190W Sort
  • Integrated NVLink 1.0
• Memory DIMM Riser (8x)
  • 4 IS DDR4 DIMMs per riser
  • Single Centaur per riser
  • 32 IS DIMMs total

Processor Interfaces

POWER8 with NVLink Overview
• Processor Chip
  • 659 mm², 22nm SOI with embedded DRAM
  • 12 processor cores per chip
  • Up to 8 threads per core (SMT8)
  • 4 concurrent LPARs per core
  • Garrison uses 8- and 10-core sorts
• Integrated SMP interconnect
  • Single 8B X-Bus at 4.8GHz
• Integrated Memory Controller
  • 4 High Speed DMI ports at 9.6GHz
  • 16MB memory cache / buffer chip (Centaur)
• GPU Interfaces
  • 4 Bricks (32 lanes) NVLink 1.0
  • Each brick running at 19.2GB/s
• Integrated I/O Sub-system
  • 40 PCIe Gen 3 lanes
  • 2, x16 and 1, x8 interfaces
  • Support for 2 CAPI connections over the x16 PCIe
• Fine Grained Power Management
  • On chip controller, power gating and on chip VRM

POWER8 with NVLink Module Interfaces
[Diagram labels]
• X-Bus: 8B @ 4.8GHz
• Memory: 2 x (2 Memory DMI Ports, 9.6GHz)
• PCIe: Gen3 x8, Gen3 x16, Gen3 x16
• NVLink: 2 x (2 Bricks (16 Lanes), 19.2 GHz)

Memory Architecture

Garrison Memory Interconnect Summary
• DMI Interface
  • Unidirectional interface to allow for high frequency and high effective data rates
• Memory Controller in Buffer
  • 2 memory controllers located in the Centaur buffer chip, each controlling 2, 8 byte interfaces to the memory DIMMs
• 115 GB/second Peak DMI Bandwidth
  • 3 bytes x 4 channels x 9.6 Gbits/sec
• 205 GB/second Peak Memory Bandwidth
  • 32 bytes x 4 buffers x 1.6 Gb/s data rate
• Industry Standard DIMMs
  • 4, 8, 16 and 32GB DIMMs
  • 128-1024 GB system capacity
[Diagram: Memory Riser Card. Four Centaur buffers (Centaur0 to Centaur3), each with DIMM positions A, B, C and D, attached through two groups of 2 Memory DMI Ports at 9.6GHz]

GPU Interconnect

GPU Interconnect
• NVLink Interface
  • High bandwidth interface far exceeding any existing or planned future PCIe interface
  • 16 Lanes CPU to GPU
  • 16 Lanes GPU to GPU
• PCIe Interface
  • Used for initiation, control and in-band reporting of GPU status
  • PCIe x16 interface on the GPU; Garrison uses this interface in x8 mode
[Diagram labels: GPU connectors NV0 to NV3 (NVLink) plus "PCIe, Power, Misc"; PCIe Gen3 x8 links; PCIe Gen3 x16; 2 Bricks (16 Lanes) 19.2 GHz links]

I/O Interconnect

Garrison I/O Interconnect
• 3 Gen3 PCIe adapter slots
  • All slots CAPI enabled
  • HHHL adapter support
• Gen 3 PCIe x8 interface to PCIe switch module (PLX)
  • BMC interconnect
  • Ethernet, SATA and USB controller
[Diagram labels: X-Bus 8B @ 4.8GHz; PCIe Gen3 x8, PCIe Gen3 x16, PCIe Gen3 x8, PCIe Gen3 x16; x2 G3 NSCI; Internal USB for disaster recovery; PCI Gen 3 x16 Connector, PCI Gen 3 x8 Connector, PCI Gen 3 x16 Connector]

Garrison I/O Interconnect (continued)
[Diagram only]

Backup

Garrison Concept - Water Cooled 2 Socket P8 with NVLink, 4 GPU Server
• Power Supplies (2x)
  • 1300W
  • Configuration limits for redundancy
  • Hot Swap
  • Common Form Factor Supply
• Service Controller Card
  • Firestone BMC Content
  • Daughter card due to planar space constraints
• HDD Option (2x)
  • Tray design for install/removal
  • Hot Swap
  • Front Service
  • Replaces fan module
• Cooling Fans (4x)
  • 80mm Counter-Rotating Fans
  • Hot swap
• PCIe slot (1x)
  • Gen3 PCIe
  • x16 HHHL Adapter
• PCIe slot (2x)
  • Gen3 PCIe
  • 1, x16 HHHL Adapter
  • 1, x8 HHHL Adapter
• NVIDIA GPU (2/4x)
  • SXM2 form factor
  • NVLink 1.0
  • 300 W
  • 1-2 per socket
• Memory DIMM Riser (8x)
  • 4 IS DDR4 DIMMs per riser
  • Single Centaur per riser
  • 32 IS DIMMs total
• POWER8 with NVLink (2x)
  • 190W Sort
  • Integrated NVLink 1.0

Garrison Memory Interconnect
[Diagram: Memory Riser Card. Four Centaur buffers (Centaur0 to Centaur3), each with DIMM positions A, B, C and D, attached through two groups of 2 Memory DMI Ports at 9.6GHz]
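
The peak figures quoted in the Processor Interfaces and Memory Architecture slides follow from simple width x count x rate arithmetic. The sketch below reproduces that arithmetic in Python as a sanity check. The helper name `peak_gbytes_per_s` and all variable names are hypothetical. The code also makes three assumptions that the slides do not state: that the DMI and memory figures apply to one processor socket (4 of the 8 risers), that the X-Bus figure is simply 8 bytes x 4.8 GHz, and that the 32 NVLink lanes divide evenly across the 4 bricks.

```python
# Sanity check of the peak-bandwidth and capacity figures quoted in the slides.
# All names here are illustrative; none come from the original deck.

def peak_gbytes_per_s(width_bytes, units, rate_g_per_s):
    """Peak GB/s = bytes per transfer x parallel units x giga-transfers per second."""
    return width_bytes * units * rate_g_per_s

# Memory Interconnect Summary slide
dmi_peak = peak_gbytes_per_s(3, 4, 9.6)    # 3 bytes x 4 channels x 9.6 Gbits/sec
mem_peak = peak_gbytes_per_s(32, 4, 1.6)   # 32 bytes x 4 buffers x 1.6 Gb/s

# POWER8 with NVLink Overview slide
xbus_peak = peak_gbytes_per_s(8, 1, 4.8)   # assumption: 8B bus x 4.8 GHz
lanes_per_brick = 32 // 4                  # assumption: 32 lanes split evenly over 4 bricks
brick_peak = lanes_per_brick * 19.2 / 8    # lanes x 19.2 Gb/s per lane / 8 bits per byte
cpu_to_gpu_peak = 2 * brick_peak           # each GPU attaches through 2 bricks (16 lanes)

pcie_lanes = 2 * 16 + 1 * 8                # 2, x16 and 1, x8 interfaces

# Capacity with all 32 DIMM positions populated
dimms_total = 8 * 4                        # 8 risers x 4 DIMMs per riser
capacity_min_gb = dimms_total * 4          # all 4GB DIMMs
capacity_max_gb = dimms_total * 32         # all 32GB DIMMs

print(f"Peak DMI bandwidth:     {dmi_peak:.1f} GB/s")
print(f"Peak memory bandwidth:  {mem_peak:.1f} GB/s")
print(f"X-Bus (assumed):        {xbus_peak:.1f} GB/s")
print(f"NVLink per brick:       {brick_peak:.1f} GB/s")
print(f"NVLink CPU to GPU:      {cpu_to_gpu_peak:.1f} GB/s")
print(f"PCIe Gen3 lanes:        {pcie_lanes}")
print(f"Capacity, fully populated: {capacity_min_gb}-{capacity_max_gb} GB")
```

The DMI and memory results (115.2 and 204.8 GB/s) round to the 115 and 205 GB/second quoted on the Memory Interconnect Summary slide, and the fully populated capacity range matches its 128-1024 GB figure. The 32 GB lower bound on the first hardware slide would correspond to a partially populated system, which this sketch does not model.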
