The Nec Sx-6 Asia Nec Hpc Marketing Supercomputer with Single-Chip Vector Processor Promotion Division

SX-6 Datenblatt Single Node 28.09.2001 9:37 Uhr Seite 1 Courtesy of HLRS, Stuttgart/Germany, Cave Simulation Courtesy of ESI Group For Information Contact: THE NEC SX-6 ̈ ASIA NEC HPC MARKETING SUPERCOMPUTER WITH SINGLE-CHIP VECTOR PROCESSOR PROMOTION DIVISION 7-1 Shiba, 5-chome Minato-ku, Tokyo 108-8001 SOFTWARE Japan +81-3-3798-9131 phone +81-3-3798-9132 fax [email protected] Operating System and Applications ̈ EUROPE NEC EUROPEAN With traditional NEC reliabili- Applications development is SUPERCOMPUTER SYSTEMS ty and the SUPER-UX operating made easy through the simplicity system, now in its 12th year, SX-6 of shared memory programming Prinzenallee 11 is designed for production sites to within a node, and portability is D-40549 Düsseldorf Germany perform production computing. assured by industry-standard MPI +49-211-5369-0 phone All the functions other systems and OpenMP support. NEC’s +49-211-5369-199 fax promise, like checkpoint-restart PSUITE Integrated Development [email protected] or a robust batch environment, Environment provides all of the ̈ LATIN AMERICA are available today. tools and utilities necessary under NEC DO BRASIL S.A. SX-OFFICE All third party applications rel- a single package for project man- evant to parallel vector supercom- agement, editing, compiling, opti- Rua Arabé, 71 CEP 04042-070 V.Clementino puters are available and provide mizing, and test/debugging. The São Paulo SP industry-leading performance. cross development environment Brasil Turnaround time for solutions is PSUITE is available on all popular +55-11-5591-7147 phone minimized. Complex analysis workstation class products as well +55-11-5591-7146 fax [email protected] streams can be completed over- as Linux personal computers to night instead of taking the week. maximize accessibility and devel- ̈ OCEANIA opment efficiency. The languages, NEC AUSTRALIA PTY.LTD. HPCD libraries and tools available in- VECTOR SUPERCOMPUTING: 635 Ferntree Gully Road clude Fortran90, OpenMP, C++, Glen Waverly, VIC 3150 THE STATE-OF-THE-ART MPI and Vampir/SX. Australia +61-3-9262-1209 phone +61-3-9262-1534 fax Products are turned out at Development is quicker. Time is duction sites. Now SX-6 breaks [email protected] CONFIGURATION TABLE an accelerating rate. Research is precious. new ground, introducing high- completed at record pace. In Vector supercomputers have end scalable parallel vector super- Selected Machine Configurations single-node Systems the time required to read this always provided the absolute high- computing to the technical server sentence each SX-6 processor est performance available. High- competitive market. With power- performs 40 billion computa- performance high-bandwidth ful shared-memory nodes and Model Group Name SX-6A SX-6B tions. Solutions are faster. Sim- memory, powerful processors and unmatched communications band- Model Name 8A 7A 6A 5A 4A 4B 3B 2B ulations are more exhaustive. commercial robustness for pro- width it is second to none. CPU Number of CPUs 8 7 6 5 4 4 3 2 Peak Vector Performance 64GF 56GF 48GF 40GF 32GF 32GF 24GF 16GF Vector Registers 144kbx8 144kbx7 144kbx6 144kbx5 144kbx4 144kbx4 144kbx3 144kbx2 Scalar Registers 64 bits x128x8 64 bitsx1288x7 64 bitsx1288x6 64 bitsx1288x5 64 bitsx1288x4 64 bitsx1288x4 64 bitsx1288x3 64 bitsx1288x2 MMU Memory Architecture Shared Memory Capacity 32 GB/48 GB/64 GB 16 GB/24 GB/32 GB Peak Data Transfer Rate 256 GB/s 224 GB/s 192 GB/s 160 GB/s 128 GB/s 128 GB/s 96 GB/s 64 GB/s Input/Output Processor Number of IOPs 1~4 1~2 Max. Number of Channels 127 channels 43 channels Peak Data Transfer Rate 8 GB/s 4 GB/s Design: Zimmermann & Jung, Stuttgart, Germany SX-6 Datenblatt Single Node 28.09.2001 9:38 Uhr Seite 3 Courtesy of ESI Group Dr. Ulrich Rist, Institut für Aero- und Gasdynamik der Universität Stuttgart, Direct numerical Simulation of laminar- Courtesy of MSC Software turbulent Transition in a laminar Separation Bubble Courtesy of HLRS, Stuttgart/Germany, Cave Simulation PERFORMANCE HARDWARE FEATURES TECHNOLOGY Ultra-high-speed The Worlds first Single-chip Model Configuration Single-node System: Vector and Scalar Unit Vector Processor A SX-6 node is a complete par- The SX-6 series single-node The vector unit of the SX-6 The high gate density possible allel vector system consisting of 2 models scale up to 8CPUs, deliv- series processor consists of vector for the state-of-the-art CMOS to 8 vector processors each with 8 ering from 16 up to 64GFLOPS of registers and 8 sets of pipelines technology and LSI design en- SX-6 single-chip GFLOPS of peak performance. vector performance and offering for logical operations, multiplica- abled NEC to implement the vec- vector processor The processors are coupled to an a maximum of 64GB main mem- tion, add/shift operations, di- tor processor on just one chip. uniform shared main memory of ory in shared memory architec- vision, masked operations and This LSI and packaging techno- 16 to 64 GB capacity. The SX ture. A model “A” chassis can be load/store. The scalar unit real- logy leads to a performance of Main Memory Unit Ease of Installation series solution for large scalability equipped with up to 8CPUs, a izes ultra-high-speed scalar per- 8 GFLOPS on a single LSI. This is a hybrid system. A powerful model “B” chassis can house up to formance through a 4-way super ultrahigh integration leads to The SX-6 series utilizes ultra- The SX-6 models’ power con- SMP single node with uniform 4CPUs. The maximum I/O band- scalar design. The combination of improved internal latencies and high-speed double data rate syn- sumption and space requirements shared memory provides very width is 8GB/s for an “A” model the single-chip vector micropro- performance in comparison with chronous DRAM. Single-node have been reduced by 80% when high levels of performance for and 4GB/s for a model “B”. cessor with a reduced clock cycle former generation designs, which systems have a memory capacity compared with the previous gener- both capacity and capability decreases the processing time for used dozens of chips to imple- of up to 64 GB and a memory ation of the SX series. The low requirements. For performances each instruction. This leads to the ment a processor, as well as highly bandwidth 256 GB per second. power consumption allows all mod- beyond the ones provided by a Multi-node system: superior short vector and scalar reduced memory latencies by Multi-node systems have a mem- els to be fully air-cooled. These two single node, a multi-node config- performance. drastically narrowing the distance ory capacity of up to 8 Terabytes elements contribute to a great uration having distributed mem- Up to 128 nodes can be con- between memory and processors. and a maximum bandwidth of 32 reduction of installation costs and ory is available. The SX-6 series is nected through an Internode Terabytes per second. complexities. compatible with its predecessor Crossbar Switch (IXS). SX-6 series SX-5 series. It excels in the total multi-node models can be scaled Input/Output Subsystem High Reliability balance of processing performance, up to 1024 CPUs and a peak per- memory bandwidth, input/out- formance of 8TFLOPS. Although The Input/Output subsystem The usage of highly-integrated put throughput in much the same they are built on a distributed of the SX-6 series can be config- CMOS technology has led to way as the former SX series sys- memory architecture, multi-node ured to deliver a bandwidth of up greatly reduced number of com- tems did. Existing applications systems still provide a single sys- to 8 GB per second on a single- ponents in a single system. This, and resources can be easily tem image. Extreme execution node system. Multi-node systems in turn, leads to a tremendously migrated to the SX-6. performance can be obtained on scale up to an I/O capacity of 1 improved hardware reliability. a wide range of applications with TB per second. the powerful single node-systems connected through the ultra-high- speed crossbar system. Dr. Ulrich Lang, HLRS VISiT – Intuitive Industrial Design in Virtual Environments Courtesy of HLRS, Stuttgart/Germany, Cave Simulation Courtesy of ESI, Electromagnetic simulation SX-6 Memory module.

The Nec Sx-6 Asia Nec Hpc Marketing Supercomputer with Single-Chip Vector Processor Promotion Division

User's Manual

DMI-HIRLAM on the NEC SX-6

Hardware Technology of the Earth Simulator 1

Shared-Memory Vector Systems Compared

Recent Supercomputing Development in Japan

Hardware-Oblivious SIMD Parallelism for In-Memory Column-Stores

Performance Evaluation of Supercomputers Using HPCC and IMB Benchmarks

Optimizing Sparse Matrix-Vector Multiplication in NEC SX-Aurora Vector Engine

SX-Aurora TSUBASA Program Execution Quick Guide

NEC SX-Aurora Tsubasa System at ICM UW User Guide

Vampirtrace 5.14.4 User Manual

Beyond Earth Simulator