Cray XT and Cray XE System Overview
Customer Documentation and Training
Overview Topics
• System Overview
  – Cabinets, Chassis, and Blades
  – Compute and Service Nodes
  – Components of a Node
    • Opteron Processor
    • SeaStar ASIC
      – Portals API Design
    • Gemini ASIC
• System Networks
• Interconnection Topologies
Cray XT System
System Overview
[Diagram: system overview; compute nodes, login node, network node, boot/syslog/database nodes, and I/O and metadata nodes on the X, Y, Z interconnect; SMW and RAID subsystem attached via GigE, 10 GigE, and Fibre Channel]
Cabinet
– The cabinet contains three chassis, a blower for cooling, a power distribution unit (PDU), a control system (CRMS), and the compute and service blades (modules)
– All components of the system are air cooled
  • A blower in the bottom of the cabinet cools the blades within the cabinet
  • Other rack-mounted devices within the cabinet have their own internal fans for cooling
– The PDU is located behind the blower in the back of the cabinet
Liquid Cooled Cabinets
[Diagram: liquid-cooled cabinet; cages 0–2 with cage VRMs, backplane assemblies, cage ID controllers, interconnect network cable connections, and cage inlet air temperature sensors; heat exchangers (LC cabinets only) and airflow conditioner; 48 Vdc shelves 1–3 with flexible buses; L1 controller; blower speed controller (VFD); blower; PDU line filter; XDP temperature and humidity sensor and XDP interface board (behind perforated panels, XT5-HE LC cabinets only)]
Blades
• The system contains two types of blades:
  – Compute blades
    • 4 SeaStar or 2 Gemini ASICs
    • 4 nodes
  – Service blades
    • SIO (XT systems)
      – 4 SeaStar ASICs
      – 2 nodes
      – One dual-slot PCI-X or PCIe riser assembly per node
    • XIO (XE systems)
      – 2 Gemini ASICs
      – 4 nodes
      – One single-slot PCIe riser per node (except on the boot node)
PCI Cards
• Cray supports the following types of PCIe cards:
  – Gigabit Ethernet
  – 10 Gigabit Ethernet
  – Fibre Channel (FC2, FC4, and FC8)
  – InfiniBand
Node Components
– Opteron processor
  • Single-, dual-, quad-, six-, eight-, and twelve-core versions
– Node memory
  • 2 to 8 GB of DDR1 memory in Cray XT3 compute nodes
  • 2 to 8 GB of DDR1 memory in Cray XT service (SIO) nodes
  • 4 to 16 GB of DDR2 memory in Cray XT4 compute nodes
  • 4 to 32 GB of DDR2 memory in Cray XT5 compute nodes
  • 8 to 64 GB of DDR3 memory in Cray XT6 compute nodes
  • 8 to 64 GB of DDR3 memory in Cray XE6 compute nodes
  • 8 to 16 GB of DDR2 memory in Cray XE6 service (XIO) nodes
– SeaStar or Gemini ASIC
Cray XT4 Compute Blade
[Diagram: Cray XT4 compute blade; nodes 0–3 with node memory, node VRMs, and SeaStar ASICs; L0 controller; 48 Vdc to 12 Vdc VRMs]
XT4 Node Block Diagram
[Diagram: XT4 node block diagram; an x86-64 Opteron processor with 4 DIMM slots connects over a HyperTransport link to a SeaStar ASIC, which attaches to the high-speed network (HSN) interconnect; the blade is managed by an L0 controller]
SIO Service Blade
[Diagram: SIO service blade; nodes 0 and 3, each with a PCI riser (PCI-X or PCIe) and PCI cards]
SeaStar ASIC
– Direct HyperTransport connection to the Opteron
– The DMA engine transfers data between host memory and the network
  • Separate read and write sides in the DMA engine
  • The Opteron does not directly load/store to the network
  • On the receive side, a set of content addressable memory (CAM) locations is used to route incoming messages
    – There are 256 CAM entries
– Designed for the Sandia Portals API
SeaStar Diagram
[Diagram: SeaStar ASIC block diagram; a HyperTransport interface connects to the Opteron (core, northbridge, DDR SDRAM memory interface) and, on service blades, to PCI-X; the SeaStar contains a DMA engine, local memory, a PPU, an SSI connection to the L0 controller, and a router with x, y, and z links]
Node Connection Bandwidths
[Diagram: node connection bandwidths; each of the SeaStar's six HSN links (+X, -X, +Y, -Y, +Z, -Z) runs at 9.6 GB/s]
HyperTransport: Cray XT3, 6.4 GB/s bidirectional (2.17 GB/s sustained); Cray XT4, 8.0 GB/s bidirectional
Memory: Cray XT3, DDR1 6.4 GB/s (400 MHz DIMMs), 5.7 GB/s sustained, 51.6 ns measured latency; Cray XT4, DDR2 10.4 GB/s (667 MHz DIMMs), 8.5 GB/s sustained, 50 ns measured latency, or DDR2 12.8 GB/s (800 MHz DIMMs); Cray XT5, DDR2 10.4 GB/s (667 MHz DIMMs) or 12.8 GB/s (800 MHz DIMMs)
High-speed network: 9.6 GB/s bidirectional per link, 6.5 GB/s sustained
Portals API Design
– Designed for messaging among thousands of nodes
  • Performance is critical only in terms of scalability
  • Success is measured by how large an application is allowed to scale, not by a 2-node ping-pong time
– Designed to avoid scalability limitations
  • Is network independent
  • Bypasses the OS – no memory copies into or out of the kernel; few interrupts
  • Bypasses the application – does not require activity by the application to ensure progress
  • Reliable, ordered delivery of messages between two nodes
Portals API Design
– Receiver (target) managed – no memory is reserved for thousands of potential senders
  • Is connectionless; no explicit connection is established with other processes in the application
    – When connected, maintains a minimum amount of state
  • The target process determines how to respond to the incoming message
    – The target node can choose to accept or ignore a message from any node
  • The destination of any message is not an address – instead, "match bits" enable the receiver to place it
  • Unexpected messages are handled by the unexpected message queue
  • All buffers are in user space
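To make the receiver-managed matching concrete, below is a minimal, self-contained C sketch of the match-bits idea: the target posts entries with match and ignore bits, and an incoming message either lands in a posted user-space buffer or falls to the unexpected queue. The structures and function names are hypothetical illustrations of the concept, not the actual Portals API.

    #include <stdint.h>
    #include <stdio.h>

    /* Hypothetical match entry: the receiver posts these in advance.      */
    /* ignore_bits mask out bits that do not participate in matching.      */
    struct match_entry {
        uint64_t match_bits;
        uint64_t ignore_bits;
        void    *buffer;          /* user-space buffer owned by the target */
    };

    /* Return the first posted entry whose match/ignore bits accept the    */
    /* incoming message, or NULL so the caller can queue it as unexpected. */
    static struct match_entry *
    match_incoming(struct match_entry *list, int n, uint64_t incoming_bits)
    {
        for (int i = 0; i < n; i++) {
            uint64_t relevant = ~list[i].ignore_bits;
            if (((incoming_bits ^ list[i].match_bits) & relevant) == 0)
                return &list[i];
        }
        return NULL;   /* no match: message goes to the unexpected queue */
    }

    int main(void)
    {
        char bufA[1024], bufB[1024];
        struct match_entry posted[] = {
            { 0x10ULL, 0x00ULL, bufA },  /* accepts match bits 0x10 exactly        */
            { 0x00ULL, 0xffULL, bufB },  /* ignores low 8 bits: accepts 0x00-0xff  */
        };

        struct match_entry *e = match_incoming(posted, 2, 0x10);
        printf("message with bits 0x10 -> %s\n", e == &posted[0] ? "bufA" : "other");
        return 0;
    }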
X6 Compute Blade with Gemini
[Diagram: X6 compute blade with Gemini; nodes 0 and 1 share Gemini 0, nodes 2 and 3 share Gemini 1]
XE6 Node Block Diagram
[Diagram: XE6 node block diagram; two x86-64 Opteron processors, each with 4 DIMM slots, connect over HyperTransport links to a SeaStar (XT) or Gemini (XE) ASIC on the high-speed network (HSN) interconnect; the blade is managed by an L0 controller]
XIO Service Blades
• Four nodes; each node contains:
  – One AMD Opteron processor with up to 16 GB of DDR2 memory
    • The processor is a six-core Opteron
  – A connection to a Gemini ASIC
  – Voltage regulating modules (VRMs)
• L0 controller
• Gemini mezzanine card
  – Contains two Gemini ASICs
• Four PCIe risers
  – One riser per node
XIO Blade
[Diagram: XIO blade; nodes 0–3 with Gemini ASICs 0 and 1; nodes 1 and 3 connect to the lower mezzanine]
XIO Riser Configuration
[Diagram: XIO riser configuration; each node's Opteron, with DDR2 memory, connects through a HyperTransport mezzanine bridge to an x16 PCIe Gen2 riser slot (inner 75-watt and outer 225-watt slots) and through HyperTransport to a Gemini; two nodes sit in the upper bay and two in the lower bay]
XIO Boot Node Configuration
[Diagram: XIO boot node configuration; the same layout as the standard XIO riser configuration, except that node 0 uses an x8 PCIe Gen2 boot riser]
Opteron Processors
[Diagram: block diagrams of AMD Opteron single-, dual-, and quad-core processors, each showing CPU cores with L1 and L2 caches (plus a shared L3 cache on the quad-core), a system request queue, crossbar, memory controller, and HyperTransport links]
Single-core: Cray XT3 systems; Socket 940, DDR1 buffered DIMMs, 3 HyperTransports
Dual-core: Cray XT3 systems (Socket 940, DDR1 buffered DIMMs, 3 HyperTransports) and Cray XT4 systems (Socket AM2, DDR2 unbuffered DIMMs, one HyperTransport)
Quad-core: Cray XT4 systems (Socket AM2, DDR2 unbuffered DIMMs, one HyperTransport) and Cray XT5 systems (Socket F, DDR2 registered (buffered) DIMMs, 3 HyperTransports)
Opteron Processors
[Diagram: block diagrams of AMD Opteron six-core and twelve-core processors, each showing CPU cores with L1 and L2 caches, a shared L3 cache, system request queue, crossbar, memory controller, and HyperTransport links]
Six-core: Cray XT5 systems; Socket F, DDR2 unbuffered DIMMs, 3 HyperTransports; 6 MB L3 cache
Twelve-core: dual-die socket (an eight-core socket removes two CPUs from each die); Cray XT6 and Cray XE6 systems; Socket G34, DDR3 unbuffered DIMMs, 4 HyperTransports; L3 cache of 6 MB per die, 12 MB per socket
Gemini
• Gemini is designed to pass data to and from the network with less control
  – Gemini is suited for SMP
  – Gemini allows for adaptive routing
  – Hardware support for PGAS languages
    • Remote memory address translation, as with Cray X2 systems
  – Atomic memory operations
• The Gemini chip uses HyperTransport version 3
Gemini Block Diagram
[Diagram: Gemini block diagram; node 0 and node 1, each with two Opterons, connect via HyperTransport to NIC 0 and NIC 1, which share a common router; an L0 processor manages the device]
Terminology
• BTE – Block Transfer Engine
• DMAPP – Distributed Memory Application
• FMA – Fast Memory Access
• GHAL – Generic Hardware Abstraction Layer
• GNI – Generic Network Interface
• PGAS – Partitioned Global Address Space
  – Programming language extensions such as Unified Parallel C (UPC) and Co-Array Fortran (CAF); the addressing idea is sketched below
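As a conceptual illustration of PGAS addressing (not actual UPC, CAF, or DMAPP code; the names and the block distribution are assumptions made for the example), the following C sketch translates a global array index into a (processing element, local offset) pair, which is the kind of remote-address translation the Gemini hardware is said to support.

    #include <stdio.h>

    /* Conceptual PGAS addressing: a "global" array of n elements is       */
    /* partitioned into equal blocks across npes processing elements.      */
    struct pgas_addr { int pe; int local; };

    static struct pgas_addr translate(int global_index, int n, int npes)
    {
        int block = (n + npes - 1) / npes;              /* elements per PE */
        struct pgas_addr a = { global_index / block, global_index % block };
        return a;
    }

    int main(void)
    {
        struct pgas_addr a = translate(1000, 4096, 16); /* 4096 elements on 16 PEs */
        printf("global 1000 -> PE %d, local %d\n", a.pe, a.local); /* PE 3, local 232 */
        return 0;
    }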
Performance and Resiliency Features
• Automatic link-level and HT3 retries
  – HT3 supports an improved CRC
• Congestion feedback to enable routing around network bottlenecks
  – Packets are reordered in the receive buffer
    • Separate completion notification when all are stored
• Automatic link failure handling with software help
  – To route around a failed link, system traffic must be quiesced to prevent deadlock and preserve ordering
  – This feature is also used to support hot blade swap
• Router errors are detected and reported at the point of error
Gemini ASIC Diagram
[Diagram: Gemini ASIC; two HT3 interfaces feed NIC 0 and NIC 1, which connect through the Netlink block to a 48-port YARC router built from 48 tiles (tile 0 at coordinate 0,0 through tile 47 at coordinate 5,7)]
Node Types
– Compute nodes
  • No user login
  • Only for execution of parallel applications
  • Diskless – no permanent storage
  • CNL operating system
    – Linux symmetric multi-processing (SMP)
    – Supports the OpenMP programming model
    – apinit is part of ALPS (Application Level Placement Scheduler)
– Service nodes
  • Provide services based on hardware/software configuration
  • Node names: boot node, SDB, syslog, login, I/O, and network nodes
  • Run the Linux operating system (SLES)
Service Nodes
– Boot node
  • First node booted
  • Has its own root file system
  • Exports the "shared root" file system to the other service nodes
– System database (SDB) node
  • Stores system configuration information and system state
  • Implemented with MySQL Pro
– Syslog node
  • Collects syslog data from other service nodes
– Login nodes
  • Provide users with a familiar Linux environment
  • Edit, compile, submit, and monitor interactive or batch jobs
    – PBS Pro is the standard batch system
Service Nodes
– I/O nodes
  • Attach to RAID storage devices
  • Perform remote DMA I/O through the HSN
  • Function as metadata servers (MDSs) and object storage servers (OSSs) for Lustre file systems
– Network nodes
  • 10 Gigabit Ethernet (10GbE) adapter
  • Connect to other file servers
Service Node Software Components
– ALPS – Application Level Placement Scheduler
  • Application placement, launch, and management for CNL
  • Includes several daemons and client processes
– RCA – Resiliency Communication Agent
  • Part of the kernel, on all nodes
  • Sends a heartbeat message to the SMW (CRMS/HSS/CMS)
  • Detects and responds to failing application launchers
System Networks
• The high-speed network (HSN), or interconnect network
  – Connects the service and compute nodes
  – The preferred topology is a folded torus
    • A torus is a circle – each dimension wraps around on itself (see the sketch after this list)
    • Folding enables the use of the shortest cables possible to connect all nodes within a dimension
    • Depending on the class, the topology may be a mesh
  – The class of the system governs the configuration
  – Applications move data across the HSN
• The Hardware Supervisory System (HSS) network
  – Monitors and controls the system
  – An Ethernet network with the System Management Workstation (SMW) at the top of the network
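A minimal C sketch of the torus wraparound mentioned above (the dimension size and coordinates are example values only): neighbor positions are computed with modular arithmetic, so the last position in a dimension is adjacent to the first; a mesh would clamp at the edges instead of wrapping.

    #include <stdio.h>

    /* Neighbor coordinate in a torus dimension of a given size:         */
    /* coordinates wrap around, so position size-1 is adjacent to 0.     */
    static int torus_neighbor(int coord, int step, int size)
    {
        return ((coord + step) % size + size) % size;  /* handles step = -1 too */
    }

    int main(void)
    {
        int z_size = 8;   /* example: 8 positions in the Z dimension */
        int z = 7;        /* last position in the dimension          */
        printf("+Z neighbor of z=%d is z=%d\n", z, torus_neighbor(z, +1, z_size)); /* 0 */
        printf("-Z neighbor of z=0 is z=%d\n", torus_neighbor(0, -1, z_size));     /* 7 */
        return 0;
    }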
Interconnect Network Topology Classes
• Configurations are based on topology classes:
  – Class 0 contains 1 to 9 chassis (1 to 3 cabinets)
    • 1 row
  – Class 1 contains 4 to 16 cabinets
    • 1 row
  – Class 2 contains 16 to 48 cabinets
    • 2 equal-length rows
  – Class 3 contains 48 to 320 cabinets
    • Three rows of equal length
      – A mesh in one or more dimensions
    • An even number of 4 or more equal-length rows
      – Torus in all dimensions
A Single-chassis Configuration
• The basic building block is a single chassis
  – A chassis is 1 x 4 x 8
    • Dimensions are: X x Y x Z
  – Each node on a blade is connected in the Y dimension (mezzanine)
  – Each node in a chassis is connected in the Z dimension (backplane)
  – All X-dimension connections are cables
[Diagram: a single chassis with X, Y, and Z axes; the 4 SeaStars in a blade run along the Y dimension]
Network Connectors
Class 0 Topology
Class 0 Cable Drawing, 3 Cabinets
Class 1 Topology
Class 1 Cable Drawing, 4 Cabinets
Class 2 Topology
Class 2 Cable Drawing, 16 Cabinets
Class 3 Topology
HSN Cabling Photo
Cray XT5m
• Cray XT5m topology
  – 1 to 18 chassis, 1 to 6 cabinets
  – All cabinets are in a single row
  – The system scales in units of chassis
  – 2D torus topology (Y and Z)
  – (Y x 4) x 8 topology, where Y is the number of populated chassis in the system
    • For example, a fully populated 6-cabinet system has 18 chassis, giving a (18 x 4) x 8, that is 72 x 8, torus in Y and Z
Identifying Components
• System components are labeled according to physical ID (HSS identification), node ID, IP address, or class

Component       Format          Description
System          s0, all         All components attached to the SMW
Cabinet         cX-Y            Cabinet number and row; this is the L1 host name
Cage            cX-Yc#          Physical cage in cabinet: 0, 1, or 2, numbered from bottom to top
Blade or slot   cX-Yc#s#        Physical blade slot in cage: 0 – 7, numbered from left to right; this is the L0 host name
Node            cX-Yc#s#n#      Opteron chip on a blade: 0 – 3 for compute blades, 0 and 3 for SIO blades
SeaStar ASIC    cX-Yc#s#s#      Cray SeaStar ASIC on a blade: 0 – 3
Gemini ASIC     cX-Yc#s#g#      Cray Gemini ASIC on a blade: 0 – 1
Link            cX-Yc#s#s#l#    Link port of a SeaStar ASIC: 0 – 5
                cX-Yc#s#g#l#    Link port of a Gemini ASIC: 0 – 57
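As an illustration of the physical ID format in the table above, this minimal C sketch (a hypothetical helper, not a Cray utility) builds the cname of a node from its cabinet, row, cage, slot, and node numbers.

    #include <stdio.h>

    /* Build the HSS physical ID (cname) of a node: cX-Yc#s#n#            */
    /* cab = cabinet number, row = row number,                            */
    /* cage = 0-2, slot = 0-7, node = 0-3 (0 and 3 on SIO blades).        */
    static void node_cname(char *buf, size_t len,
                           int cab, int row, int cage, int slot, int node)
    {
        snprintf(buf, len, "c%d-%dc%ds%dn%d", cab, row, cage, slot, node);
    }

    int main(void)
    {
        char cname[32];
        node_cname(cname, sizeof cname, 0, 0, 1, 4, 2);
        printf("%s\n", cname);  /* c0-0c1s4n2: cabinet 0, row 0, cage 1, slot 4, node 2 */
        return 0;
    }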
Node ID (NID)
• The node ID (NID) is a unique hexadecimal or decimal number that identifies each CLE node
  – The NID reflects the node's location in the network
    • In a SeaStar-based system, NIDs are assigned on 128-number boundaries for each cabinet
    • In a Gemini-based system, NIDs are sequential
  – The NID format is nidnnnnn (decimal) when it refers to the hostname of a node, for example: nid00003
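A small C sketch of the NID naming rule above (illustrative only): it formats a decimal NID as its nidnnnnn hostname and shows the 128-per-cabinet boundary used in SeaStar-based systems.

    #include <stdio.h>

    /* Format a decimal NID as its hostname, e.g. 3 -> "nid00003". */
    static void nid_hostname(char *buf, size_t len, int nid)
    {
        snprintf(buf, len, "nid%05d", nid);
    }

    int main(void)
    {
        char name[16];
        nid_hostname(name, sizeof name, 3);
        printf("%s\n", name);                              /* nid00003 */

        /* SeaStar-based systems assign NIDs on 128-number boundaries   */
        /* per cabinet, so the first NID of cabinet n is n * 128        */
        /* (Gemini-based systems simply number nodes sequentially).     */
        int cabinet = 2;
        printf("first NID in cabinet %d: %d\n", cabinet, cabinet * 128);
        return 0;
    }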
Cabinet 0 NID Numbering Example
[Diagram: NID numbering example for cabinet 0, comparing a Cray XT (SeaStar-based) system with a Cray XE (Gemini-based) system; NIDs are listed per blade slot in each cage]
Cray System
External Services for Cray Systems
– Implementation of services external to the Cray XT and Cray XE systems to address customer requirements
  • More flexible user access
  • More options for data management and data protection
  • Leverage commodity components in customer-specific implementations
  • Provide faster access to new devices and technologies
  • Repeatable solutions that remain open to custom configuration
  • Each solution can be used, scaled, and configured independently
esFS esLogin esDM
Specifically
– esFS
  • Provides globally shared data between multiple systems
    – Cray systems and others
  • Provides access to other file systems
    – Data Virtualization Service (DVS) is used to project Panasas or StorNext to the compute nodes
– esLogin
  • Less dependence on a single Cray system for data access and compiles
  • An enhanced user environment
    – Larger memory, swap space, and more horsepower
    – Increased availability of data and system services to users
– esDM
  • More options for data management and data protection
esLogin
[Diagram: esLogin architecture; an esLogin node on the user network connects to an esMS, to other services, and to a Cray XE system service/MOM node; the esLogin node hosts the XE programming environment, XE client-side commands, modules, and batch client software]