Programming Guide for Sicortex
Total Page:16
File Type:pdf, Size:1020Kb
SiCortex® System Programming Guide For Software Version 3.0 Field Test Trademarks Cray the a registered trademark of Cray, Inc. Intel is the registered trademark of Intel Corporation. Linux is the registered trademark of Linus Torvalds in the U.S. and other countries. The registered trademark Linux is used pursuant to a sublicense from the Linux Mark Institute, the exclusive licensee of Linus Torvalds, owner of the mark in the U.S. and other countries. Lustre is the registered trademark of Cluster File Systems, Inc. MIPS and MIPS64 are registered trademarks of MIPS Technologies, Inc. NIST is a registered trademark of the National Institute of Standards and Technology, U.S. Department of Commerce. OpenMP is a trademark of Silicon Graphics, Inc. PCI, PCI Express, and PCIe are registered trademarks, and EXPRESSMODULE is a trademark of PCI-SIG. Perl is the registered trademark of The Perl Foundation. SiCortex is a registered trademark, and the SiCortex logo, SC5832, SC648, and PathScale are trademarks of SiCortex, Incorporated. TAU Performance System is a trademark of the joint developers: University of Oregon Performance Research Lab; Los Alamos National Laboratory Advanced Computing Laboratory; and The Research Centre Jülich, ZAM, Germany. TotalView is a registered trademark of TotalView Technologies, LLC. Vampir is a registered trademark of Wolfgang E. Nagel. All other brand and product names are trademarks or service marks of their respective owners. Copyrights Copyright© 2007-2008 SiCortex Incorporated. All rights reserved. Disclaimer The content of this document is furnished for informational use only, is subject to change without notice, and should not be construed as a com- mitment by SiCortex, Inc. Document Number 2906-03 Rev. 01 Published August 26, 2008 (PN 2906-03 Rev. 01) i Contacting SiCortex and Getting Support SiCortex is on-line at http://www.sicortex.com. Our Web pages provide information on the company and products, including access to technical information and documentation, product overviews, and product announcements. You can search the SiCortex Knowledge Base or participate in forum dis- cussions online at http://www.sicortex.com/support after you register. You can reach SiCortex Technical Support by e-mailing questions to [email protected] or by calling: • 978.897.0214 main number • +1 877 SICORTE x289 (+1 877.742.6783 x289) toll free number What’s this Book About and Who’s it for? This manual targets C, C++, and Fortran application developers, who have experience coding programs that run on Linux systems. With few exceptions, the Linux environment on SiCortex systems mirrors that on any other Linux system. This manual describes the exceptions and how to work with them. Perl and Python programmers will notice no difference in the SiCortex Linux environment. Conventions of Notation Bold Denotes a selection to make in a GUI program. For example, File>Process>Startup directs the user to select File located on the application’s toolbar, then Process, and then Startup. monospaced Denotes code examples wherever they occur and command font sequences and their arguments, which are entered at the sys- tem prompt. Italics Denotes a term or a cross reference in general text. m Denotes a caution or warning, such as a dependency that must be satisfied before continuing a process. Denotes a tip, hint, or reminder. ii (PN 2906-03 Rev. 01) Table of Contents Chapter 1 Introducing the SiCortex System ............................................................7 Overview of the SiCortex System Architecture 7 The Application Development Environment 11 Chapter 2 Running Applications.............................................................................15 Logging on to the System 16 Running and Managing Multinode Applications 17 Running and Managing Single-Node Applications 21 Running n32 Applications 22 Using a FabriCache File System 22 Troubleshooting SLURM Jobs 24 Chapter 3 Compiling and Linking Applications....................................................25 Installing the Cross-Development Toolkit 25 Choosing a Compiler 25 Using Compiler Options 26 Summary of Simple Build Methods 30 Porting or Building an Application Natively on the System 32 Building an Application on the Cross-Development Workstation 34 Troubleshooting Autotools-Based Cross-Compile Errors 35 Compiling Reference Information 37 Chapter 4 Debugging Applications......................................................................... 39 Compiling Tips for Debugging 39 Debugging with gdb 39 Debugging with TotalView 43 Memory Debugging with DUMA 45 Memory Debugging with Mudflap 47 Chapter 5 Optimizing Application Performance .................................................. 51 General Procedure for Optimizing an Application 52 SCTICK Fast Timers 56 Application Performance Tools 57 Invoking the Tools 60 Displaying Available Hardware Performance Counter Events 63 Using Papiex 64 Using Mpipex 71 Using HPCex 74 Using TAU 79 Using Tauex 81 Using Vampirtrace 82 Using GPTL 87 Using Gptlex 90 Using Ioex 92 Using Pfmon 94 Using Oprofile 94 Hardware Performance Counter Events 95 Performance Tool Program Examples 99 Chapter 6 Using the Optimized Math and Science Libraries ............................ 103 Libscm Tuned Math Library 104 Libscs Tuned Scientific Library 107 Libscstr and Libscfstr Tuned String Libraries 110 Math and Science Libraries 112 Linking the Optimized Atlas Library for Fast BLAS 114 Linking the PETSc Library 114 Linking Interdependent Libraries 115 Chapter 7 Developing MPI Applications ............................................................. 117 SiCortex MPI Implementation 117 MPI Feature Support 118 Compiling and Linking MPI Applications 118 MPI Debugging Hook 120 MPI Performance Tips 120 Thread Support 122 MPI Reference Information 123 iv (PN 2906-03 Rev.01) Chapter 8 Writing Threaded Applications...........................................................125 Multithreaded Programming Considerations 125 OpenMP and Hybrid OpenMP/MPI Applications 127 Chapter 9 Processor and Memory System Functional Features ........................131 Node Details 131 Memory System Operation 132 Chapter 10 Understanding the Application Binary Interfaces ..........................135 What is an ABI? 135 Data Formats 136 Register Usage 138 Alignment Rules 139 Overriding the Default ABI 139 Interlanguage Programming Considerations 140 Appendix A SLURM I/O Buffering ......................................................................147 SLURM I/O Paths 147 Buffering Basics 148 Complications of Buffering 149 Controlling Buffering 149 Recommended Strategy 150 Index.............................................................................................................................. i (PN 2906-03 Rev.01) v vi (PN 2906-03 Rev.01) Overview of the SiCortex System Architecture Chapter 1 Introducing the SiCortex System In this section: • Overview of the SiCortex System Architecture • Node Components • The Interconnect Fabric • System I/O • The Application Development Environment • Software Development Suites • Compiler Suites • GNU Tools and Utilities • Libraries • Debugging Tools • Performance Tools Built to support the dominant High Performance Technical Computing (HPTC) software model, the SiCortex System with its MPI/Linux soft- ware suite empowers users to quickly develop applications that can tackle the most complex and computationally intensive problems that face the scientific, engineering, and financial communities today. Overview of the SiCortex System Architecture A SiCortex System (hereafter, in this document, called System) consists of a number of six-way, symmetric multiprocessing (SMP) compute nodes connected by an Interconnect Fabric. An SC5832 has 972 nodes, and an SC648 has 108. Node Components Each node consists of one SiCortex node chip (Figure 1 on page 8) and two industry-standard DDR2 memory modules. A node chip contains six 64-bit processors, their L1 and L2 caches, two memory controllers (one for each memory module), the Interconnect Fabric interface components (the Fabric Links, the Fabric Switch, and the DMA engine), and a PCI Express® (PCIe®) interface. The PCIe controller provides control for external I/O devices only, not for the Interconnect Fabric. Chapter 1 Introducing the SiCortex System (PN 2906-03 Rev. 01) 7 Overview of the SiCortex System Architecture . For architectural details and programming considerations related to the node components, see Chapter 9, Processor and Memory System Functional Features on page 131 . On the node chip, the DMA engine, Fabric Switch, and Fabric Links pro- vide the interface to the Interconnect Fabric. The DMA engine connects the memory system to the Fabric Switch, which forwards traffic between incoming and outgoing links, and to and from the DMA engine. CPU CPU CPU CPU Six 64-bit MIPS CPUs CPU CPU L1 Cache L1 Cache L1 Cache L1 Cache L1 Cache L1 Cache PCI DMA Engine Coherent L2 Cache Express Controller DDR-2 DDR-2 Fabric Switch Controller Controller Node Chip DDR-2 DIMM DDR-2 DIMM External I/O From other To other nodes nodes Fabric Links Figure 1. Overview of SiCortex node All nodes in a System are connected through a degree-3 directed Kautz network. Twenty-seven nodes populate a module, and all modules plug into the System’s backplane. Of the twenty-seven nodes on a module, three have their PCIe busses connected to EXPRESSMODULE™ slots, and a fourth is attached to an on-module PCIe dual gigabit-Ethernet con- troller. The PCIe interface on all other nodes is disabled. The Interconnect