
The Rainier System: Integrated Scalar/Vector Computing

Per Nyberg
11th ECMWF Workshop on HPC in Meteorology
26 October 2004

Topics

• Current Product Overview
• Cray Technology Strengths
• Rainier System: Integrated Scalar/Vector Computing
  – Overview
  – Motivation
  – Benefits
• Cascade Project Overview

Cray's Family of Supercomputers

Cray X1
• 1 to 50+ TFLOPS
• 4 – 4,096 processors
• Vector system for uncompromised sustained performance

Cray XT3
• 1 to 50+ TFLOPS
• 256 – 10,000+ processors
• Compute system for large-scale sustained performance

Cray XD1
• 48 GFLOPS – 2+ TFLOPS
• 12 – 288+ processors
• Entry/mid-range system optimized for sustained performance

Purpose-Built High Performance Computers

Growing the Addressable Market

[Chart: addressable HPC market, $0 – $7,000M, 2002 – 2007 (Source: IDC 2003). Segments and CAGR: Capability (2.4%), Enterprise (8.2%), Divisional (5.3%), Departmental (7.4%). The Cray X1, XT3 and XD1 successively address more of the market, a roughly 5X expansion.]

Taking Success Formula to Broader HPC Market

Recent XD1 Announcements

Recent XT3 Announcements

Recent X1 Announcements

ORNL National Leadership Computing Facility

A Wealth of Technology

• XD1: RapidArray and FPGA
• XT3: Compute PE (Red Storm architecture)
• X1: Vector nodes and global address space
• Active management

Cray's Vision… Scalable High-Bandwidth Computing

[Roadmap figure: the vector line (X1, 2004 → X1E → 'BlackWidow'), the Red Storm line ('Mt. Adams', 2005) and the XD1 line (2004) converge into 'Rainier' integrated computing in 2006, on the path to 'Cascade' and sustained petaflops by 2010.]

Rainier Integrated Computing

• The Concept:
  – Single system:
    • Common infrastructure and high performance network.
    • Common global address space.
    • Common OS, storage and administration.
  – Variety of compute nodes:
    • Follow-on nodes for the vector (X1/X1E) and scalar (XT3/XD1) lines.
  – Opteron-based login, service and I/O nodes.
• First customer shipment in 2006.

Integrated Product Concept

• Single hardware infrastructure (cabinet, power, cooling, etc.)
• Common high speed network
• Heterogeneous compute capabilities
• Service nodes based on COTS processors
• Global address space across the machine (see the sketch after the diagram below)
• Linux-based OS

[Diagram: a high performance switch connects vector compute nodes, scalar compute nodes and service nodes; the service nodes attach via optional Fibre Channel to a global file system and to the customer network/grid.]
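To make the "common global address space" bullet above concrete, the following is a minimal sketch in the style of the standard OpenSHMEM library, chosen here only as a familiar one-sided model; the slides do not specify Rainier's actual programming interface. Any PE can write directly into symmetric memory on any other PE, whichever node type it happens to run on.

```c
/* Illustrative only: an OpenSHMEM-style sketch of a common global address
 * space across heterogeneous nodes. Rainier's real API is not defined in
 * these slides; the calls below are standard OpenSHMEM. */
#include <shmem.h>
#include <stdio.h>

static long halo[8];   /* symmetric: exists at the same address on every PE */

int main(void) {
    shmem_init();
    int me   = shmem_my_pe();
    int npes = shmem_n_pes();

    long local[8];
    for (int i = 0; i < 8; i++) local[i] = 100 * me + i;

    /* One-sided put: write my boundary data directly into the halo buffer
     * of the next PE, with no matching receive on the target side. */
    shmem_long_put(halo, local, 8, (me + 1) % npes);

    shmem_barrier_all();   /* ensure all puts are globally visible */
    printf("PE %d received halo[0] = %ld\n", me, halo[0]);

    shmem_finalize();
    return 0;
}
```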

Motivation

• Different algorithms are appropriate for different architectures.
• Applications have different requirements for:
  – compute vs. memory bandwidth
  – local memory size
  – mixed MPI/SMP vs. pure MPI (see the sketch below)
  – granularity of computation and communication
  – regular vs. irregular memory accesses and communication
  – network bandwidth
  – global vs. nearest-neighbor communication
  – ability to tune the application
  – capacity vs. capability
• Benchmark suites are often split as to the best platform.
⇒ One size does not fit all.
⇒ Design a single system with heterogeneous computing capabilities.
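As a hedged illustration of the "mixed MPI/SMP vs. pure MPI" distinction listed above, the sketch below runs one MPI rank per SMP node with OpenMP threads inside it; a pure-MPI code would instead place one rank per processor. It uses only standard MPI and OpenMP, nothing Cray-specific.

```c
/* Hybrid MPI + OpenMP sketch: one MPI rank per SMP node, threads within it. */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

#define N 1000000

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, nranks;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    static double a[N], b[N];
    for (int i = 0; i < N; i++) a[i] = rank + i * 1e-6;

    /* SMP level: threads share the node's memory, no message passing needed. */
    #pragma omp parallel for
    for (int i = 1; i < N - 1; i++)
        b[i] = 0.5 * (a[i - 1] + a[i + 1]);

    /* MPI level: combine a checksum across nodes. */
    double local = b[N / 2], global = 0.0;
    MPI_Allreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

    if (rank == 0) printf("checksum %g from %d ranks\n", global, nranks);
    MPI_Finalize();
    return 0;
}
```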

Benefits of Integrated Computing

• Customer:
  – Single solution for diverse workloads
  – Maintain and foster architectural diversity
  – Reduced administration and training costs
  – Single, unified user interface and environment
  – Better login, pre/post processing environment for vector machines
  – More configuration and upgrade flexibility
  – Improved performance by matching the processor to the job
• Cray:
  – Better focus
  – Reduced development costs through commonality
  – Reduced manufacturing costs through increased volumes
  – Able to support specialized computing (vectors, MTA, FPGAs)

Fit to Earth Science Requirements

• The Rainier architecture offers a strong fit for a diverse, complex and evolving workload:
  – Heterogeneity is ideal for coupled modeling (see the sketch below).
  – Capability features are well suited to:
    • the production workload
    • advanced research requiring high resolution and complexity (e.g. high-resolution sub-scale processes, atmospheric chemistry)
  – Ability to explore alternative processor architectures (MT, FPGA).
  – Architecture flexibility at upgrade points, better leveraging the investment.
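A minimal sketch of how coupled modeling might exploit heterogeneous partitions, assuming a hypothetical placement in which one component's ranks land on vector nodes and the other's on scalar nodes. The even/odd split used as the MPI_Comm_split color is purely a stand-in, since the slides do not describe how jobs map to Rainier node types.

```c
/* Sketch: one executable, two model components on different node partitions.
 * The even/odd split is a placeholder for "ranks on vector nodes" vs.
 * "ranks on scalar nodes"; the real placement policy is not defined here. */
#include <mpi.h>
#include <stdio.h>

void run_atmosphere(MPI_Comm comm) { (void)comm; /* vector-friendly component */ }
void run_ocean(MPI_Comm comm)      { (void)comm; /* cache-friendly component  */ }

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int color = (rank % 2 == 0) ? 0 : 1;   /* stand-in for node type */
    MPI_Comm part;
    MPI_Comm_split(MPI_COMM_WORLD, color, rank, &part);

    if (color == 0) run_atmosphere(part);  /* e.g. spectral dynamics   */
    else            run_ocean(part);       /* e.g. ocean or chemistry  */

    /* Coupling fields would be exchanged between the two partitions
     * through MPI_COMM_WORLD or an inter-communicator. */
    MPI_Comm_free(&part);
    MPI_Finalize();
    return 0;
}
```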

Increased Modeling Complexity

Source: Ruddiman, 2001

Increased Multi-Disciplinary Coupling

MPI-M SDEM (Structural Dynamical Economic Model)

Scalable, Common Infrastructure

[Diagram: a common interconnection network links Opteron ('Adams') based compute nodes, vector ('BlackWidow') based compute nodes, other future compute nodes (e.g. FPGA, multi-threaded), Opteron-based login and service nodes (compile, debug) and I/O service nodes (OST), each with its own memory.]

• Scalability and growth
  – Configurable network, sized according to need at whatever scale
  – Grow network bandwidth over time to maintain balance
  – Parallel filesystem
  – Grow the scalar and/or vector partitions as needs develop
  – Configure the service and I/O partition and grow it as needs develop

Compute Nodes

• Adams (Opteron scalar nodes)
  – Excellent scalar performance
  – Very low memory latency
  – Many applications available (Linux + x86-64)
  – Potential for both uniprocessor and SMP nodes, single and dual core
  – But requires a high degree of cache effectiveness
• BlackWidow (Cray vector nodes)
  – Fast, high-bandwidth processors
  – 4-way vector SMP nodes
    • Large local memory
    • Support for hierarchical parallelism
  – Latency tolerance for global and irregular references
  – But requires vectorizable code (see the loop sketches below)
• Other planned future compute capabilities
  – FPGA: direct hardware execution of kernels
  – MTA: highly threaded access to global memory
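The scalar/vector contrast above can be illustrated with two generic C loops, neither of them Cray-specific: the first is long, unit-stride and dependence-free, the kind of code a BlackWidow-class vector node is built for; the second is cache-blocked so that an Opteron "Adams" node gets the cache effectiveness it needs.

```c
/* Generic illustrations of the two styles of node-friendly code. */

/* (a) Vector-friendly: long, unit-stride, no loop-carried dependences,
 *     so a vectorizing compiler can issue it as vector operations. */
void daxpy(int n, double alpha, const double *x, double *y) {
    for (int i = 0; i < n; i++)
        y[i] += alpha * x[i];
}

/* (b) Cache-friendly: blocked matrix multiply; each BLK x BLK tile is
 *     reused from cache many times, which suits a cache-based Opteron. */
#define BLK 64
void blocked_matmul(int n, const double *a, const double *b, double *c) {
    for (int ii = 0; ii < n; ii += BLK)
        for (int kk = 0; kk < n; kk += BLK)
            for (int jj = 0; jj < n; jj += BLK)
                for (int i = ii; i < ii + BLK && i < n; i++)
                    for (int k = kk; k < kk + BLK && k < n; k++)
                        for (int j = jj; j < jj + BLK && j < n; j++)
                            c[i * n + j] += a[i * n + k] * b[k * n + j];
}
```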

Reliability and Scaling Features

• Fault detection, diagnosis and recovery
  – Enhanced diagnostic error reporting
  – Memory retries for transmission-induced multi-bit errors
  – Timeouts and self-cleansing datapaths (no cascading errors)
• Hardware firewalls for fault containment
  – Secure, hierarchical boundaries between kernel groups
  – Protect the rest of the system even if a kernel is corrupted
• Graceful network degradation
  – Auto-degrade rails rather than lose a whole link
  – Hot-swappable boards and reconfigurable routing tables
• Full node translation tables (NTTs)
  – Allow scheduling of parallel jobs across an arbitrary collection of processors, with efficient, scalable address translation (see the sketch below)
⇒ Much higher system utilization under heavy workloads
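As a software-level analogy for the node translation tables above (the real NTT is a hardware mechanism whose format the slides do not give), the sketch below maps a job's dense logical node numbers onto an arbitrary, scattered set of physical nodes assigned by the scheduler.

```c
/* Software analogy of a node translation table (NTT): illustrates only the
 * idea of dense logical node IDs over an arbitrary physical allocation. */
#include <stdio.h>
#include <stdlib.h>

typedef struct {
    int  njob_nodes;     /* nodes assigned to this job            */
    int *phys;           /* phys[logical] -> physical node number */
} node_translation_table;

/* Translate a job-local (logical) node ID to the physical node. */
static int ntt_translate(const node_translation_table *t, int logical) {
    if (logical < 0 || logical >= t->njob_nodes) {
        fprintf(stderr, "NTT: logical node %d out of range\n", logical);
        exit(EXIT_FAILURE);
    }
    return t->phys[logical];
}

int main(void) {
    /* A scattered allocation: the scheduler assigned nodes 7, 12, 13, 40. */
    int scattered[] = { 7, 12, 13, 40 };
    node_translation_table ntt = { 4, scattered };

    /* The application addresses logical nodes 0..3; translation hides the
     * fragmentation, which is what raises utilization under heavy load. */
    for (int l = 0; l < ntt.njob_nodes; l++)
        printf("logical %d -> physical %d\n", l, ntt_translate(&ntt, l));
    return 0;
}
```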

Cascade Project

Cray Inc. • Stanford • Caltech/JPL • Notre Dame

High Productivity Computing Systems

Goals:
• Provide a new generation of economically viable high productivity computing systems for the national security and industrial user community (2007 – 2010)

Impact:
• Performance (efficiency): improve critical national security applications by a factor of 10X to 40X
• Productivity (time-to-solution)
• Portability (transparency): insulate research and operational application software from the system
• Robustness (reliability): apply all known techniques to protect against outside attacks, hardware faults and programming errors

Applications:
• Intelligence/surveillance, reconnaissance, cryptanalysis, weapons analysis, airborne contaminant modeling and biotechnology

Fill the Critical Technology and Capability Gap: Today (late 80's HPC technology) … to … Future (Quantum/Bio Computing)

HPCS Phases

Phase I: Concept Development (CRAY, SGI, Sun, HP, IBM)
– 1 year, 2H 2002 – 1H 2003, $3M/year
– Forecast available technology
– Propose HPCS hardware/software concepts
– Explore productivity metrics
– Develop research plan for Phase II

Phase II: Concept Validation (CRAY, Sun, IBM)
– 3 years, 2H 2003 – 1H 2006, $17M/year
– Focused R&D
– Hardware and software prototyping
– Experimentation and simulation
– Risk assessment and mitigation

Phase III: Full Scale Product Development (participants TBD)
– 4 years, 2H 2006 – 2010, $?/year
– Commercially available system by 2010
– Outreach and cooperation in software and applications areas

The HPCS program lets us explore technologies we otherwise couldn't: a three-year head start on the typical development cycle.

Cray's Approach to HPCS

• High system efficiency at scale
  – Bandwidth is the most critical and expensive part of scalability
  – Enable very high (but configurable) global bandwidth
  – Design the processor and system to use this bandwidth wisely
  – Reduce bandwidth demand architecturally
• High human productivity and portability
  – Support legacy and emerging languages
  – Provide strong compiler, tools and runtime support
  – Support a mixed UMA/NUMA programming model
  – Develop higher-level programming language and tools
• System robustness
  – Provide excellent fault detection and diagnosis
  – Implement automatic reconfiguration and fault containment
  – Make all resources virtualized and dynamically reconfigurable

Summary

• Cray offers a unique range of science-driven technologies:
  – XD1, XT3, X1/X1E
• The Rainier architecture offers a strong fit for the diverse, complex and evolving earth sciences workload.
• Cray continues to support sustained innovation to meet the needs of the scientific community:
  – Rainier: integrated computing capability in 2006
  – Cascade: aggressive research program for 2010

What Do You Need To Know?
