High-Performance Computing at SGI and the Status of Climate and Weather Codes on the SGI Altix Gerardo Cisneros, Ph.D

Gerardo Cisneros, Ph.D.
Scientist

©2004 Silicon Graphics, Inc. All rights reserved. Silicon Graphics, SGI, IRIX, Origin, Onyx, Onyx2, IRIS, Altix, InfiniteReality, Challenge, Reality Center, Geometry Engine, ImageVision Library, OpenGL, XFS, the SGI logo and the SGI cube are registered trademarks and CXFS, Onyx4, InfinitePerformance, IRIS GL, Power Series, Personal IRIS, Power Challenge, NUMAflex, REACT, Open Inventor, OpenGL Performer, OpenGL, Optimizer, OpenGL Volumizer, OpenGL Shader, OpenGL Multipipe, OpenGL Vizserver, SkyWriter, RealityEngine, SGI ProPack, Performance Co-Pilot, SGI Advanced Linux, UltimateVision and The Source of Innovation and Discovery are trademarks of Silicon Graphics, Inc., in the U.S. and/or other countries worldwide. Linux is a registered trademark of Linus Torvalds in several countries, used with permission by Silicon Graphics, Inc. MIPS is a registered trademark of MIPS Technologies, Inc., used under license by Silicon Graphics, Inc. Intel and Itanium are registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. Red Hat and all Red Hat-based trademarks are trademarks or registered trademarks of Red Hat, Inc. in the United States and other countries. Linux penguin logo created by Larry Ewing. All other trademarks mentioned herein are the property of their respective owners.
(04/04)
9/9/2004 Slide 2

Overview
• Company focus
• SGI Altix: present and future
• Performance of NWS codes
• Conclusions
Slide 3

Silicon Graphics
Providing the Industry’s Highest-Performing Compute, Storage and Visualization Products
• Exclusively focused on the technical computing market
• Technology is designed to enable the most significant scientific and creative breakthroughs of the 21st century
• Products and services are mission critical to government and defense, science and research, manufacturing, energy and media industries
Slide 4
Images courtesy of SCI Institute, University of Utah, Navy Rehearsal TOPSCENE Program, Magic Earth LLC, and WETA Digital

Strategic Focus Areas
• High Performance Computing: High-performance NUMAflex™ architecture delivers unprecedented flexibility and performance.
• Storage: CXFS™ shared file system allows transparent, heterogeneous file access everywhere, without copying data.
• Advanced Visualization: Onyx4™ allows users to combine multiple industry-standard graphics cards in a high-bandwidth, low-latency architecture, for cost-effective high performance visualization.
Slide 5

Architecture Designed for HPC; Choice of Deployments
NUMAflex™ Global Shared-Memory Architecture
• Balanced, scalable performance
• Operating environment optimized for HPC
• Low-latency memory access
• Easily Deployable
Deployments:
• MIPS® and IRIX®: SGI® Origin® Family
• Intel® Itanium® 2 and Linux®: SGI® Altix® Family
Slide 6

Modular SGI® NUMAflex™ Architecture
• Performance: High-bandwidth interconnect with very low latency
• Flexibility: Tailored configurations for different dimensions of scalability
• Investment protection: Add new technologies as they evolve
• Scalability: No central bus or switch; just modules and NUMAlink™ cables
Modules: System Interconnect; CPU & Memory; System I/O; Standard I/O Expansion; High-Bandwidth I/O Expansion; Graphics Expansion; Storage Expansion
Slide 7

SGI Family of Scalable Linux® Solutions
• SGI® Altix® 350: Servers and Clusters
• SGI® Altix® 3000: Servers and Superclusters
Positioning labels: Mid-range, Departmental, Capability
Image courtesy: NASA Ames
Slide 8

SGI® Altix® 3000 Scaling Roadmap
World's Most Scalable Linux® Supercomputer
[Chart: system size by date (Jan 2003, Apr 2003, Sept 2003, Oct 2003, Mar 2004, mid-2004, late 2004, 2005); series: SSI Scalability and Shared Memory Scalability; labeled sizes: 64p, 128p, 256p, 512p, 1024p, 2048p; annotation: 16,384 processor capability]
Slide 9
This slide contains forward-looking statements. The results and forecasts as stated may vary. Other risks and uncertainties relating to this slide may be found in the "Safe-Harbor" statement at the beginning of this presentation.
SGI® Altix® 3000 C-Brick Detail
• 4x Intel® Itanium® 2 processors
• 2 processors per 6.4GB/sec frontside bus
• 4–64GB memory
• SHUB memory controller: 8.51–10.2GB/sec memory bandwidth (varies with memory speed, 133 MHz vs. 166 MHz)
• 6.4GB/sec aggregate interconnect bandwidth
• 4.8GB/sec aggregate I/O bandwidth
[Diagram: C-brick with two SHUBs; labels: 16 x PC2100 or PC2700 DDR SDRAM; 8 to 16GB of memory per node; 8.51–10.2GB/sec memory b/w]
Slide 10

Other Altix Bricks
• IX-brick: Base I/O module
• D-brick2: Disk expansion
• R-brick2: Router interconnect
• PX-brick: PCI-X expansion
• M-brick: Memory expansion
Slide 11

Altix 3700: 128 processors
Slide 12

Altix 3700: 512 processors
Slide 13

Next generation Altix
• (8) Fewer External Routers
• (1) Less Power Bay
• (1) Less Rack
• (20) Fewer Cables
[Image label: Altix 3700 “Tornado”]
Slide 14

Next generation Altix
Slide 15

The Altix® 350: “Expand on Demand” Growth Path
Right-size systems for the ultimate price/performance
Growth steps:
• Base Unit: 1 or 2P/2-24GB
• 1-4P/2-48GB
• 1-8P/2-96GB
• 1-12P/2-144GB
• 1-16P/2-192GB
Expansion modules: CPU Expansion Module; Memory Expansion Module; I/O Expansion Module (4-32)
• Independently scale CPU, memory, I/O
• One Linux® instance to manage
• Investment protection & leverage current assets
• Allocate budget and resources to ongoing needs
Slide 16

Customers with Altix systems running climate and weather applications
• NASA – 10240p – ECCO, CAM, CCSM, etc.
• NCSA – 1024p – WRF
• GFDL – 608p (2x256, 1x96) – MOM4, CM2.1
• NRL – 384p (1x128, 1x256) – various
• ORNL – 256p – CCSM, CAM, POP
• LANL – 256p – POP, HYCOM
• BAMS – 20p – MM5 and MAQSIP
• CMMACS – 12p Altix350 – MOM4
• APAT – 8p Altix350 – MM5
• Romanian Nat’l Met Admin – 2p Altix350 – HRM
Slide 17

IFS - Performance on SGI Altix 3000
T511 performance in forecast days per day
[Chart: forecast days per day, y-axis 0–700; systems: IBM Power4 1.3GHz, SGI Altix 3000 1.3GHz, SGI Altix 3000 1.5GHz; series: 32, 64, 128, 256, 384 and 512 CPUs]
Altix 3000 @1.5GHz is 2.22x faster than IBM Power4 @1.3GHz
Slide 18
Source: Roland Richter, SGI, May 2004

IFS - Scalability on SGI Altix 3000
Itanium 2 @1.5GHz is 1.3x faster than Itanium 2 @1.3GHz because of the larger cache and higher clock rate
Slide 19
Source: Roland Richter, SGI, May 2004

HRM Performance on the Altix 350
[Chart: simulation speed, y-axis 0–350, vs. Nproc (1, 2, 4, 8, 10, 12, 16); series: Origin 350/600 MHz and Altix 350/1.4 GHz]
78h forecast in 2340 time steps over a 181x217 horizontal grid with 26 vertical levels
Slide 20

LM-RAPS 2.1 (Optimization using shared memory)
Scalability on Altix using Intel Compiler and SGI’s MPT library
Slide 21
Benchmark case used by Emy (Greece) in 2003; grid 363x263x45

MM5 3.6.3 - Standard benchmark, 2004
[Chart: higher is better]
Altix version was compiled with the Intel 8.1.007 (beta) ifort Fortran compiler and the Intel 8.1.010 (beta) icc C compiler, and linked with SGI's MPI from MPT 1.10.
Source: http://www.mmm.ucar.edu/mm5/mpp/helpdesk/20040304a.html
Slide 22

WRF 1.3 - Scalability & Performance on Altix 3000
Altix 3000 is 1.28x faster than HP cluster at the same clock frequency.
Source: http://www.mmm.ucar.edu/wrf/bench and Gerardo Cisneros, SGI
Slide 23

WRF 2.0.2 on a Large Problem
WRF SI 5km CONUS 48h forecast (980x720x37, 5km, 30s)
[Chart: elapsed seconds, y-axis 0–50000, vs. NCPUs (32, 64, 128, 256, 512); series: 1.5GHz/6MB L3 and 1.3GHz/3MB L3]
Slide 24

WRF 2.0.2 on a Large Problem
WRF SI 5km CONUS 48h forecast (980x720x37, 5km, 30s)
[Chart: simulation speed, y-axis 0–40, vs. NCPUs (32, 64, 128, 256, 512); series: 1.5GHz/6MB L3 and 1.3GHz/3MB L3]
Slide 25

GFS Performance on the Altix 3700 (1.5GHz)
GFS T240L30 (720x360x30) 120h forecast
[Chart: simulation speed, y-axis 0.00–400.00, vs. NCPUs (1, 2, 4, 8, 16, 32, 64, 96, 128); data labels: 2.23, 4.21, 9.58, 23.68, 51.15, 98.34, 190.01, 255.20, 343.91]
Slide 26

MOM4 - Performance on SGI Altix 3700
0.25° Ocean Model (1440x600x24)
[Chart: elapsed time (left axis, 0–18,000) and speedup (right axis, 0–120) vs. number of processors (16, 32, 64, 128); series: Altix 3000 @1.5GHz w/ dynamic memory compilation, Altix 3000 @1.5GHz w/ static memory compilation, Speedup; data labels: 15,476; 10,185; 7,813; 4,640; 3,822; 2,308; 2,397]
Source: Gerardo Cisneros, SGI, May 2004
Slide 27

POP 1.4.3 (Optimization by reducing the # of synchronizations)
Scalability on Altix using Intel Compiler and SGI’s MPT library
Slide 28
Increasing message length and reducing synchronization frequency improved the performance by 20% at 256 CPUs

POP 1.4.3 - Performance on SGI Altix 3000 SSI
POP 1.4.3 - Performance on 1 Degree “X1” Problem
Configuration: Altix 3000 @1.3GHz, 512P Altix 3000 SSI, Intel compiler 7.1.035, MPT 1.9
[Chart: simulated years per wallclock day, y-axis 0.0–100.0, x-axis 0–384, higher is better; series: Altix 3000 @1.5 GHz (SGI), Altix 3000 @1.3 GHz (SGI), Altix 3000 @1.5GHz (NASA); inset: speedup vs. ideal, y-axis 0–96, x-axis 0–384]
Nr.
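
The simulation speeds labeled on the "GFS Performance on the Altix 3700 (1.5GHz)" chart can be turned into speedup and parallel efficiency figures. The sketch below is illustrative only: it assumes the nine data labels pair with the nine NCPUs values in ascending order (the flattened chart text does not state the pairing explicitly), and the function name `scaling_table` is hypothetical rather than anything from the presentation.

```python
# Simulation speeds labeled on the GFS T240L30 chart.
# ASSUMPTION: labels are paired with NCPUs in ascending order; the
# extracted slide text lists the labels and the axis ticks separately.
gfs_speed = {
    1: 2.23, 2: 4.21, 4: 9.58, 8: 23.68, 16: 51.15,
    32: 98.34, 64: 190.01, 96: 255.20, 128: 343.91,
}


def scaling_table(speed_by_ncpus):
    """Return (ncpus, speedup, efficiency) rows relative to the smallest run.

    speedup    = speed(n) / speed(n0)
    efficiency = speedup / (n / n0)
    where n0 is the smallest CPU count present in the data.
    """
    base_n = min(speed_by_ncpus)
    base_speed = speed_by_ncpus[base_n]
    rows = []
    for n in sorted(speed_by_ncpus):
        speedup = speed_by_ncpus[n] / base_speed
        efficiency = speedup / (n / base_n)
        rows.append((n, speedup, efficiency))
    return rows


if __name__ == "__main__":
    for n, speedup, efficiency in scaling_table(gfs_speed):
        print(f"{n:4d} CPUs  speedup {speedup:7.2f}  efficiency {efficiency:5.2f}")
```

Under that pairing, efficiency relative to the 1-CPU run comes out above 1.0 at the larger CPU counts. The slides do not explain this; superlinear figures of this kind often reflect cache effects or a single-CPU baseline that is limited by memory, so the 1-CPU value may not be the most informative reference point.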
