
SGI® GPU and Coprocessor Software Guide

007-6424-001 COPYRIGHT © 2015 Silicon Graphics International Corp. All rights reserved; provided portions may be copyright in third parties, as indicated elsewhere herein. No permission is granted to copy, distribute, or create derivative works from the contents of this electronic documentation in any manner, in whole or in part, without the prior written permission of SGI.

LIMITED RIGHTS LEGEND The software described in this document is “commercial computer software” provided with restricted rights (except as to included open/free source) as specified in the FAR 52.227-19 and/or the DFAR 227.7202, or successive sections. Use beyond license provisions is a violation of worldwide intellectual property laws, treaties and conventions. This document is provided with limited rights as defined in 52.227-14.

The electronic (software) version of this document was developed at private expense; if acquired under an agreement with the USA government or any contractor thereto, it is acquired as “commercial computer software” subject to the provisions of its applicable license agreement, as specified in (a) 48 CFR 12.212 of the FAR; or, if acquired for Department of Defense units, (b) 48 CFR 227-7202 of the DoD FAR Supplement; or sections succeeding thereto. Contractor/manufacturer is SGI.

TRADEMARKS AND ATTRIBUTIONS Silicon Graphics, SGI, the SGI logo, Rackable, and SGI Management Center are trademarks or registered trademarks of Silicon Graphics International Corp. or its subsidiaries in the United States and/or other countries worldwide.

Intel, Xeon, and Xeon Phi are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the U.S. and/or other countries. Mellanox is a registered trademark of Mellanox Corporation. NVIDIA, Quadro, and Tesla are trademarks or registered trademarks of NVIDIA Corporation in the U.S. and/or other countries. Red Hat and all Red Hat-based trademarks are trademarks or registered trademarks of Red Hat, Inc. in the United States and other countries. SLES, SUSE, and YAST are registered trademarks of SUSE LLC in the United States and other countries.

All other trademarks mentioned herein are the property of their respective owners.

Record of Revision

Version Description

001 September 2015 Initial printing.


Contents

Record of Revision ...... iii

About This Guide ...... xi
Audience ...... xi
Related Publications ...... xii
Product Support ...... xiii

1 Downloading NVIDIA® Software and Building RPMs ...... 1
Downloading the Software ...... 1
Downloading a Driver ...... 1
Specifying a GPU Hardware Type ...... 2
Selecting a Version from an Archive ...... 2
Using FTP to Download a Specific Version ...... 2
Downloading the CUDA Toolkit and CUDA SDK Samples ...... 3
Building RPMs ...... 3
Accessing the sgi-package-nvidia Script ...... 3
Creating the RPMs from Run Files ...... 4
Using Additional Options of the sgi-package-nvidia Script ...... 5
2 Installing NVIDIA Software in a Cluster Image ...... 7
Select a Compute Image as a Base Image ...... 8
Clone the Base Image ...... 9
Install Linux Distribution Packages into the Image ...... 9
Kernel Software ...... 9
Miscellaneous Packages ...... 10
Create RPMs from NVIDIA Run Files ...... 10
Install the NVIDIA Software from the RPMs ...... 11
Make Any Custom Modifications to the New Image ...... 12
Push the Image to the Desired Compute Nodes ...... 12


3 Installing NVIDIA Software on a Standalone System ...... 15
Installing NVIDIA Software from SGI RPMs ...... 16
Customizing the Installation ...... 16
Executing NVIDIA Run Files ...... 16
Customizing the Install Script ...... 17
Using the sgi-update-x Script ...... 18
Managing Your X Configuration ...... 19
Querying Drivers ...... 20
Miscellaneous Options ...... 20
Installation Options (deprecated) ...... 21
Using the nvidia-xconfig Tool ...... 22
Defining More Complicated X Configurations ...... 22
SLES 11 Workaround for a Problem When Starting X ...... 24

4 Upgrades and Tuning ...... 25
Upgrading to New NVIDIA Software ...... 25
SMC Cluster ...... 25
SGI UV and Other Standalone Systems ...... 26
Upgrading New NVIDIA Software after Upgrading to a New Kernel ...... 26
Latest Supported NVIDIA Driver Software ...... 26
Latest Supported CUDA Toolkit and Samples ...... 27
Upgrading on an SMC Cluster ...... 27
Upgrading on an SGI UV or Other Standalone System ...... 27
Performance Tuning ...... 28
Error-Correcting Code (ECC) Memory ...... 28
Tuning on SGI UV Systems ...... 28
Avoiding Long Startup Delays ...... 29
Addressing Large Memories ...... 29
Maximum PCIe Bandwidth ...... 30

5 Managing Intel® Xeon Phi™ Software ...... 31
Components of the Intel Xeon Phi Software Stack ...... 31
Downloading Intel MPSS Software ...... 32
Installing MPSS on Standalone Systems with Intel Xeon Phi Devices ...... 33


Installing MPSS in SMC Cluster Compute Images ...... 33
Upgrading Intel MPSS Software ...... 34
Performance-Tuning Xeon Phi Devices ...... 34
Local Device Assignment (all systems) ...... 34
Local Device Assignment (UV300 and UV2000 systems only) ...... 34

A Manually Installing NVIDIA Software (RHEL) ...... 35
Perform the General Setup ...... 36
Make the NVIDIA Modules for the Kernel Source ...... 36
Make Symbol and Weak Module Updates ...... 37
Install Non-kernel Parts of the Driver ...... 38
Install the CUDA Toolkit and Sample Code ...... 40
Set File Permissions for Use of CUDA ...... 41

B Manually Installing NVIDIA Software (SLES) ...... 43
Perform the General Setup ...... 44
Make the NVIDIA Modules for the Kernel Source ...... 45
Make Symbol and Weak Module Updates ...... 46
Install Non-kernel Parts of the Driver ...... 47
Install the CUDA Toolkit and Sample Code ...... 49
Set File Permissions for Use of CUDA ...... 50


Tables


Table 1-1  Options for the sgi-package-nvidia Script ...... 5
Table 3-1  sgi-update-x Options–Managing X Configurations ...... 19
Table 3-2  sgi-update-x Options–Querying Drivers ...... 20
Table 3-3  sgi-update-x Options–Miscellaneous ...... 20
Table 3-4  sgi-update-x Options–Installation ...... 21


About This Guide

This guide describes the installation and management of software for NVIDIA® GPUs and Intel® Xeon Phi™ devices on SGI® systems. The SGI systems include the following:
• SGI clusters managed by SGI Management Center™ (SMC) 3.x or later
• SGI UV™ systems and other standalone systems (for example, SGI Rackable™ servers)

Audience

This guide is written for system administrators and software developers. It assumes the reader is familiar with Linux software installation and clusters.


Related Publications

The following SGI documents might be helpful:
• SGI Foundation Software (SFS) User Guide (007-6410-xxx)
• SGI Management Center (SMC) Installation and Configuration Guide for Clusters (007-6359-xxx)
• SGI Management Center (SMC) Administration Guide for Clusters (007-6358-xxx)

You can obtain SGI documentation in the following ways:
• Refer to the SGI Technical Publications Library (TPL) at http://docs.sgi.com. Various formats are available. The TPL contains the most recent and most comprehensive set of books and man pages.
  To get the latest revision of a document on the TPL, use the core publication number as your search string. For example, use 007-1234 as your search string to get the latest version of the document with part number 007-1234-xxx.
• Refer to the SGI support webpage for release notes and other documents whose access requires a support contract. See “Product Support” on page xiii.

Note: For information about third-party system components, see the documentation provided by the manufacturer/supplier.


Product Support

SGI provides a comprehensive product support and maintenance program for its products. SGI also offers services to implement and integrate Linux applications in your environment.
• Refer to http://www.sgi.com/support/
• If you are in North America, contact the Technical Assistance Center at +1 800 800 4SGI or contact your authorized service provider.
• If you are outside North America, contact the SGI subsidiary or authorized distributor in your country.

Be sure to have the following information before you call Technical Support:
• Product serial number
• Product model name and number
• Applicable error messages
• Add-on boards or hardware
• Third-party hardware or software
• Operating system type and revision level


Reader Comments

If you have comments about the technical accuracy, content, or organization of this document, contact SGI. Be sure to include the title and document number of the manual with your comments. (Online, the document number is located in the front matter of the manual. In printed manuals, the document number is located at the bottom of each page.)

You can contact SGI in any of the following ways:
• Send e-mail to the following address: [email protected]
• Contact your customer service representative and ask that an incident be filed in the SGI incident tracking system: http://www.sgi.com/support/supportcenters.html

SGI values your comments and will respond to them promptly.

Chapter 1. Downloading NVIDIA® Software and Building RPMs

This chapter describes how to download NVIDIA driver software and how to package that software into RPMs for easy installation.

Downloading the Software

The NVIDIA software needed to use your GPUs both for graphics and as compute accelerators consists of the following:
• A driver
• The CUDA Toolkit
• CUDA SDK samples (optional)

Downloading a Driver

You can use one of several methods to download the NVIDIA software:
• “Specifying a GPU Hardware Type” on page 2
• “Selecting a Version from an Archive” on page 2
• “Using FTP to Download a Specific Version” on page 2

In all cases, the result of the download is file NVIDIA-Linux-x86_64-version.run.


Specifying a GPU Hardware Type

The NVIDIA DRIVERS DOWNLOAD webpage allows you to select and download the current driver for a GPU hardware type:

http://www.nvidia.com/Download/index.aspx?lang=en-us

Use Linux 64-bit for the operating system.

Selecting a Version from an Archive

The NVIDIA Unix Driver Archive webpage allows you to select and download a driver version from a list of current or archived versions:

http://www.nvidia.com/object/unix.html

Choose a version from the Linux x86_64/AMD64/EM64T category.

Using FTP to Download a Specific Version

If you know the specific version you want to download, use the following FTP site:

ftp://download.nvidia.com/XFree86/Linux-x86_64/
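For example, if you already know you want driver version 346.59, you could fetch it non-interactively with wget. This is only a sketch: the version number is an example, and it assumes the FTP site keeps each release in a directory named after its version:

# wget ftp://download.nvidia.com/XFree86/Linux-x86_64/346.59/NVIDIA-Linux-x86_64-346.59.run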


Downloading the CUDA Toolkit and CUDA SDK Samples

To download the CUDA software, use the following webpage:

https://developer.nvidia.com/cuda-downloads

Click the Linux86 tab and download a run file installer for your Linux distribution. An example download file would be cuda_7.0.28_linux.run.

This run file actually contains three run files: one for the toolkit, one for the samples, and one for a driver. To extract the individual run files in this example, type the following:

# ./cuda_7.0.28_linux.run --extract=target-directory

This command creates the following three files in target-directory:

cuda-linux64-rel-7.0.28-19326674.run
cuda-samples-linux-7.0.28-19326674.run
NVIDIA-Linux-x86_64-346.46.run

You may discard the driver file, NVIDIA-Linux-x86_64-346.46.run, unless it happens to be the one you want to install.

Building RPMs

Most of the installation procedures in this guide require software packaged in RPMs, not run files. SGI provides the sgi-package-nvidia script to build the necessary RPMs. The RPMs that this script builds include an sgi-install-nvidia script as their post-install scriptlet. If you want, you can edit this latter script to customize the standard install procedure.
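If you want to review the post-install scriptlet embedded in a driver RPM you have already built, one way to do so (a sketch; the RPM file name is an example) is to query the RPM scripts:

# rpm -qp --scripts nvidia-346.59-rhel7.x86_64.rpm | less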

Accessing the sgi-package-nvidia Script

The sgi-package-nvidia script is part of the SGI Foundation Software, starting with release 2.13. This script will be in /opt/sgi/bin. You can also access this script with the following command:

wget ftp://oss.sgi.com/www/projects/libnuma/download/sgi-package-nvidia


Creating the RPMs from Run Files

The sgi-package-nvidia script requires that the input run files for the driver and CUDA adhere to the following naming conventions:

Run File Type    Naming Convention
Driver           NVIDIA-Linux-x86_64-release.run
CUDA toolkit     cuda-linux64-rel-release.run
CUDA samples     cuda-samples-linux-release.run

In the file name conventions, the release item represents the numerical release version.

To create an RPM file from a run file, use the following command:

sgi-package-nvidia --distro=linux-distro runfile

You can specify rhel6, rhel7, sles11, or sles12 for linux-distro. If you omit the --distro= option, it defaults to the Linux distribution on the system where you are running.
• Example build of a driver RPM for RHEL 7:
  sgi-package-nvidia --distro=rhel7 NVIDIA-Linux-x86_64-346.59.run
  This command creates the file nvidia-346.59-rhel7.x86_64.rpm.
• Example build of a CUDA toolkit RPM for SLES 11:
  sgi-package-nvidia --distro=sles11 cuda-linux64-rel-7.0.28.run
  This command creates the file cudatoolkit-7.0.28-sgi.sles11.x86_64.rpm.
• Example build of a CUDA samples RPM for SLES 12:
  sgi-package-nvidia --distro=sles12 cuda-samples-linux-7.0.28.run
  This command creates the file gpucomputing_sdk_samples-7.0.28-sgi.sles12.x86_64.rpm.


Using Additional Options of the sgi-package-nvidia Script

The previous section described the use of the --distro= option of the sgi-package-nvidia script. The script has three other options that you may want to use. These options affect how the RPM is built and, therefore, apply to every installation of the RPM. Table 1-1 describes all of the options for the script.

Table 1-1 Options for the sgi-package-nvidia Script

Option Description

--distro= Specifies the Linux distribution for your installation environment. Use rhel6, rhel7, sles11, or sles12. If you omit the --distro= option, it defaults to the Linux distribution on the system where you are running.

--world Opens up world read/write on the NVIDIA device. Installing the RPM appends the following to the file /etc/modprobe.d/nvidia-0666.conf:

options nvidia NVreg_DeviceFileMode=0666

--nodisplay Informs X that there are no monitors connected to the GPUs. Installing the RPM produces the same effect as the following:

nvidia-xconfig --use-display-device=NONE

--nopersist Starting the nvidia-persistenced daemon will not set the GPUs into persistence mode; that is, the nvidia-persistenced daemon will not be started with the --persistence-mode option on its command line. If you want to add the option later, edit the following file:

RHEL 6 upstart: /etc/init/nvidia-persistenced.conf RHEL 7, SLES 12 systemd: /usr/lib/systemd/system/nvidia-persistenced.service SLES 11 System V: /etc/init.d/nvidia-persistenced


Chapter 2. Installing NVIDIA Software in a Cluster Image

This chapter describes the installation of NVIDIA software in a compute image of a cluster managed by SGI Management Center™ (SMC) 3.x or later. Specifically, this chapter describes the installation of a GPU driver, the CUDA toolkit, and CUDA sample code.

The following are the general steps:
1. “Select a Compute Image as a Base Image.” on page 8
2. “Clone the Base Image.” on page 9
3. “Install Linux Distribution Packages into the Image.” on page 9
4. “Create RPMs from NVIDIA Run Files.” on page 10
5. “Install the NVIDIA Software from the RPMs.” on page 11
6. “Make Any Custom Modifications to the New Image.” on page 12
7. “Push the Image to the Desired Compute Nodes.” on page 12

Note: This chapter describes the installation of the software via RPMs. If you want to manually install the NVIDIA run files, see Appendix A, “Manually Installing NVIDIA Software (RHEL)”, and Appendix B, “Manually Installing NVIDIA Software (SLES)”.


Select a Compute Image as a Base Image.

On the admin node, display the existing images and their associated kernel(s):
# cinstallman --show-images
Image Name      BT  VCS  Compat_Distro
rhel7.1         0   1    rhel7
    3.10.0-229.el7.x86_64
    0-rescue-f80b1858d9f44f1b8cc667b2004fcb00
ice-rhel7.1     0   1    rhel7
    3.10.0-229.el7.x86_64
    0-rescue-62c9c0ed8f5747bfb49fa8b76efa69ef
lead-rhel7.1    0   1    rhel7
    3.10.0-229.el7.x86_64
    0-rescue-c1f4b3df21ed4bb2b0af2d7382729fe7

The preceding example shows images for an SGI ICE X cluster, where the rhel7.1 image is for a flat compute node (legacy service node), the ice-rhel7.1 image is for an SGI ICE compute node, and lead-rhel7.1 is for a leader node.

If you have a large repository of images and want to view them in a sorted manner, the following sequence of commands might be helpful:
cd /var/lib/systemimager/images
find . -maxdepth 3 | grep sgi-compute-node
find . -maxdepth 3 | grep sgi-lead-node
find . -maxdepth 3 | grep sgi-service-node

For an SGI Rackable cluster (flat cluster), select an image for a flat compute node (legacy service node). For an SGI ICE X cluster, select an image for an SGI ICE compute node.


Clone the Base Image.

To create a clone of a base image, use the cinstallman command. For example, the following commands clone a base image named rhel6.6 to a test image named nvidia1 for a flat compute node:
BASE=rhel6.6
IMG=nvidia1
cinstallman --create-image --clone --source $BASE --image $IMG

Note: For convenience, this section will use these environment variable definitions going forward.

Install Linux Distribution Packages into the Image.

You need to ensure that the new image contains the required software from your Linux distribution, such as the kernel packages and miscellaneous packages needed for booting images and building the NVIDIA software. If required packages are missing, use the cinstallman command to install them.

Kernel Software

Ensure that the new image contains the following:

RHEL 6 and RHEL 7:
kernel-version
kernel-devel-version

SLES 11:
kernel-default-base-version
kernel-default-version
kernel-source-version
kernel-default-devel-version (optional)


SLES 12:
kernel-default-version
kernel-devel-version
kernel-source-version
kernel-default-devel-version (optional)

The following example ensures that the RHEL kernel source files are present:
# cinstallman --yum-image --image $IMG install kernel-devel

Miscellaneous Packages

The build of the NVIDIA driver and CUDA sample codes, or the boot of the compute image, may fail if certain prerequisite packages are not installed. Ensure that the following prerequisite packages are installed:
device-mapper-libs (RHEL 6, for building SDK samples)
gcc
glibc
freeglut-devel (for building SDK samples)

To install a package, use the following command:
# cinstallman --yum-image --image $IMG install package
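For example, the following commands install the prerequisite packages listed above into the image. This is a sketch and assumes that cinstallman passes the package arguments through to yum or zypper unchanged; you can also install them one at a time as shown above:

# cinstallman --yum-image --image $IMG install gcc glibc freeglut-devel
# cinstallman --yum-image --image $IMG install device-mapper-libs

The second command applies to RHEL 6 images only.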

Create RPMs from NVIDIA Run Files.

The sgi-package-nvidia script packages the NVIDIA run files as RPMs. For a description of how to build the RPMs with this script and file naming conventions for the RPMs, see “Building RPMs” on page 3.


Install the NVIDIA Software from the RPMs.

You install the driver, CUDA toolkit, and CUDA samples from the RPMs just as you would install any other product from an RPM:
1. On the admin node, copy the newly created RPMs into a directory dedicated to the RPMs. For example, if your chosen repository for the RPMs is /tftpboot/nvidia, use the following commands:
   # cp driver-rpm /tftpboot/nvidia
   # cp cudatoolkit-rpm /tftpboot/nvidia
   # cp cuda-samples-rpm /tftpboot/nvidia
2. Formally register your chosen directory in the installation repository with the crepo command and select it:
   # crepo --add /tftpboot/nvidia --custom repo-name
   # crepo --select repo-name
   For repo-name, use a descriptor for the directory of RPMs you are adding, such as nvidia-upgrade.
3. Install the NVIDIA software into the image with the following commands:
   # cinstallman --yum-image --image $IMG install nvidia
   # cinstallman --yum-image --image $IMG install cudatoolkit
   # cinstallman --yum-image --image $IMG install gpucomputing_sdk_samples
   Note that the package names match the first part of the names of the RPMs. See “Creating the RPMs from Run Files” on page 4.


Make Any Custom Modifications to the New Image.

If you need to make some custom modifications to the new image, you must do so in a chroot environment. The location of the image is /var/lib/systemimager/images/$IMG.

To make the modifications, do the following:
1. Go to the chroot environment:
   # chroot /var/lib/systemimager/images/$IMG
2. Make the desired modifications to the files in that environment.
3. Exit that environment:
   # exit

Push the Image to the Desired Compute Nodes.

Push the new image to the desired compute nodes as you would with any new image, as follows:

For a flat compute node image:
1. (Optional) Be sure the miniroot builds successfully:
   # cinstallman --update-miniroot --image $IMG
2. Blacklist any nouveau driver in the image (from the boot command line):
   # cadmin --image $IMG --set-kernel-extra-params rdblacklist=nouveau
   # cinstallman --refresh-netboot --node target-nodes
3. Associate the new image and boot kernel with the nodes:
   # cinstallman --assign-image --node target-nodes --image $IMG \
     --kernel 2.6.32-504.el6.x86_64
   This example uses 2.6.32-504.el6.x86_64 for the boot kernel.
4. Finish the push by setting up for network boot:
   # cinstallman --next-boot image --node target-nodes --transport udpcast
   # cpower node cycle target-nodes
5. To watch boot progress on a node:
   # console target-node


For an SGI ICE compute node image:
1. (Optional) Be sure the miniroot builds successfully:
   # cinstallman --update-miniroot --image $IMG
2. Blacklist any nouveau driver in the image (from the boot command line):
   # cadmin --image $IMG --set-kernel-extra-params rdblacklist=nouveau
   # cinstallman --refresh-netboot --node target-nodes
3. Push the image to the affected leader nodes:
   # cimage --push-rack $IMG "r*"
4. Assign the image and boot kernel to the desired nodes:
   # cimage --set $IMG 2.6.32-504.el6.x86_64 target-nodes
   This example uses 2.6.32-504.el6.x86_64 for the boot kernel.
5. Finish the push by setting up for network boot:
   # cpower node cycle target-nodes
6. To watch boot progress on a node:
   # console target-node


Chapter 3. Installing NVIDIA Software on a Standalone System

This chapter describes how to install an NVIDIA driver, the CUDA toolkit, and CUDA code samples on SGI standalone (non-clustered) systems. The standalone systems include the following:
• SGI UV™ 2000 systems
• SGI UV™ 3000 systems
• SGI UV™ 300 systems
• SGI UV™ 20 systems
• SGI UV™ 30 systems
• SGI Rackable™ servers

The chapter includes the following sections:
• “Installing NVIDIA Software from SGI RPMs” on page 16
• “Customizing the Installation” on page 16
• “Using the sgi-update-x Script” on page 18
• “Using the nvidia-xconfig Tool” on page 22
• “Defining More Complicated X Configurations” on page 22
• “SLES 11 Workaround for a Problem When Starting X” on page 24


Installing NVIDIA Software from SGI RPMs

Perform the following steps to install the NVIDIA software:
1. Create RPMs from the NVIDIA run files. The sgi-package-nvidia script packages the NVIDIA run files as RPMs. For a description of how to build the RPMs with this script and the file naming conventions for the RPMs, see “Building RPMs” on page 3.
2. Use yum or zypper to install the RPMs:
   # yum install driver-rpm
   # yum install cudatoolkit-rpm
   # yum install cuda-samples-rpm
   The driver RPM uses the sgi-install-nvidia script as its post-install scriptlet. The CUDA samples are optional but are useful for testing the system.

Customizing the Installation

This section describes two alternatives to installing the NVIDIA software from RPMs:
• Executing NVIDIA run files
• Customizing the install script

Executing NVIDIA Run Files

An NVIDIA run file can be executed directly, and it will install the driver on your system. However, the run file will not create weak modules for multiple kernels, nor will it install the nvidia-persistenced daemon as the sgi-install-nvidia script does.


Customizing the Install Script

You can install the run files yourself in a custom manner and still retain the benefits of the script just cited. You can do so by tweaking and executing the sgi-install-nvidia script:
1. Extract the sgi-install-nvidia script from the sgi-package-nvidia script in the following manner:
   # ./sgi-package-nvidia -s
2. Customize the install script. Note that /opt/sgi/Factory-Install/nvidia is the default location of the run file. The install script looks for the run file there.
3. Execute sgi-install-nvidia in this way:
   # ./sgi-install-nvidia --pkg nnn.nn install
   The nnn.nn is the version of the NVIDIA driver, which is part of the name of the run file.


Using the sgi-update-x Script

The sgi-update-x script, which is typically in /opt/sgi/bin, can do any of the following:
• Update the X configuration for your GPUs.
• Query and download a driver.
• Install or upgrade a driver. For installing a driver, this script is superseded by the sgi-package-nvidia tool. For an upgrade of a driver, remove the old driver and install the new one.

The sgi-update-x script also assumes that the NVIDIA run file has been installed by an SGI RPM into the directory /opt/sgi/Factory-Install/nvidia/ as NVIDIA-Linux-x86_64-version.run.

A typical usage:
sgi-update-x -u -n -l :0

Option   Description
-u       Updates the xorg configuration file (modifies /etc/X11/xorg.conf).
-n       Indicates that no display is attached on any GPU.
-l :0    Uses ServerLayout name :0 (screen :0.n corresponds to GPU n).

Use sgi-update-x -h to see all the options.


Managing Your X Configuration

You can use the sgi-update-x options in Table 3-1 to manage your X configuration.

Table 3-1 sgi-update-x Options–Managing X Configurations

Option Description

-I --new-install Performs initial installation of the driver and X (implies -d -i -u).

-u --update Updates NVIDIA xorg config file. Causes current options in xorg.conf to be added/modified/removed back to their defaults.

--display=:N Specifies the X display :N (as in :N.x).

-l --layoutname=X Specifies ServerLayout name. Typical entry is :0. Can also be set with environment variable SGI_UPDATE_X_LAYOUT_NAME.

--layout=X Do update operation only on Layout X.

-d --device=X Do update operation only on DeviceX.

--screen=X Do update operation only on ScreenX.

--mode= Specifies display mode.

--modedebug Turns on mode debug flag.

--virtual=WxH Specifies desktop virtual size.

--separate Specifies separate X screens for each GPU. Duplicates each device and screen entry to create a second entry.

--twinview=X Specifies twinview type: RightOf, LeftOf, Above, Below, or Clone.

--twc Shortcut for --twinview=clone

-n --nodisplay Indicates there is no display attached on any GPU. Use --device= or --screen= to set only a single GPU.

--nv_opt=option Adds a specific NVIDIA option to xconfig. Can occur multiple times.

--newconf Creates a new /etc/X11/xorg.conf instead of using the current one.


Querying Drivers

You can use the sgi-update-x options in Table 3-2 to query the present driver or obtain a new driver package. However, use the procedure in “Downloading a Driver” on page 1 to select a new driver.

Table 3-2 sgi-update-x Options–Querying Drivers

Option Description

-q --query Shows currently downloaded NVIDIA package versions (installed, latest and downloaded).

--query-inst Shows currently installed NVIDIA package.

-Q --ftpquery Shows latest available NVIDIA version.

-d --download Downloads latest version from NVIDIA ftp site.

--file=X Downloads file X instead of the latest from the NVIDIA ftp site. Supported transports: ftp:, http:, user@host: (scp), and cp. Can be specified as ftp:version.

Miscellaneous Options

Table 3-3 describes some miscellaneous sgi-update-x options.

Table 3-3 sgi-update-x Options–Miscellaneous

Option Description

-x --extract-only Extract contents of package XXXXXX.run file to directory XXXXXX.

--blacklist Blacklists the nouveau driver in /etc/modprobe.d/blacklist-nouveau.conf and in /etc/sysconfig/uvconfig.

--clean Clean up and restore modified files using command remove_sysconf_files.

-D --debug Use verbose debug messages.


Installation Options (deprecated)

Use the sgi-package-nvidia tool to create RPMs for use in installing or uninstalling an NVIDIA driver. Table 3-4 describes the old sgi-update-x alternatives to the new tool.

Table 3-4 sgi-update-x Options–Installation

Option Description

-i --install Installs downloaded NVIDIA package.

--pkg= Specifies NVIDIA package name.

-A --accept Automatically accepts NVIDIA licensing terms during install.

-k Updates kernel module for current kernel.

--kernel=kernel Updates kernel module kernel.

--kip=PATH Uses PATH as kernel install path.

--keep Keeps current version of NVIDIA driver.

-f --force Forces update of kernel module even if no GPUs are present.

-S --standard Installs standard (non-patched) version.

--patch= Specifies patch for kernel driver.


Using the nvidia-xconfig Tool

Systems with NVIDIA GPUs but also with built-in VGA adapters may fail to start the X server when X is initially configured with nvidia-xconfig or SaX. In the log file /var/log/Xorg.0.log, you might see the following entry:
Fatal server error: no screens found

X attempts to start with the driver specified (nvidia) but tries to start on the built-in VGA adapter. To prevent this behavior, X must use a specific BusID for each card. To find the BusID, enter the following command:
# nvidia-xconfig --query-gpu-info
GPU #0:
  Name      : Quadro K6000
  UUID      : GPU-94b0aece-7bf4-3e1a-d138-5ce52986bba7
  PCI BusID : PCI:2@3:0:0

You instruct X to use the PCI BusID with the following command:
# nvidia-xconfig --allow-empty-initial-configuration --busid=PCI:2@3:0:0

This will set up a new xorg.conf with a single screen section. It should enable you to start X even with no monitor attached (allow-empty-initial-configuration). Once started, you can use the nvidia-settings command to configure the running X server for additional options, screens, monitors, etc.

Defining More Complicated X Configurations

You can use the nvidia-settings command to define more complicated X configurations, such as for Xinerama or Mosaic. The configuration file is /etc/X11/xorg.conf. The nvidia-settings command provides a convenient GUI to modify this file. The following command shows the GUI on your local display:
# nvidia-settings -c :0

The command has a --help option that displays all available options.

Figure 3-1 shows the opening screen for the nvidia-settings GUI.


Figure 3-1 Opening Screen for nvidia-settings GUI

After you make a change to the configuration file, you must restart X for the changes to take effect. To restart X, perform the following steps:
1. Kill the X program:
   # killall X
2. If X persists, stop gdm or xdm. For example:
   # kill -9 pid
   The pid is the process ID for gdm or xdm.
3. Restart X:
   # startx &


SLES 11 Workaround for a Problem When Starting X

If you have permission problems with the file /var/lib/gdm/.ICEAuthority when starting X, you can try the following:
# chown gdm:gdm /var/lib/gdm

Chapter 4. Upgrades and Tuning

This chapter describes the following topics:
• “Upgrading to New NVIDIA Software” on page 25
• “Upgrading New NVIDIA Software after Upgrading to a New Kernel” on page 26
• “Performance Tuning” on page 28

Upgrading to New NVIDIA Software

This section describes how to upgrade to new NVIDIA software on a cluster managed by SGI Management Center (SMC) 3.x or later, on an SGI UV system, or on another SGI standalone system.

Note: This section assumes that there has been no change to the kernel.

SMC Cluster

To upgrade your NVIDIA software on an SMC cluster, perform the following steps:
1. Remove the old packages from the image. For example:
   # cinstallman --yum-image --image $IMG remove nvidia
   # cinstallman --yum-image --image $IMG remove cudatoolkit
   # cinstallman --yum-image --image $IMG remove gpucomputing_sdk_samples
2. Create new RPMs for the new software. See “Building RPMs” on page 3.


3. Install the new software. For example:
   # cinstallman --yum-image --image $IMG install nvidia
   # cinstallman --yum-image --image $IMG install cudatoolkit
   # cinstallman --yum-image --image $IMG install gpucomputing_sdk_samples

SGI UV and Other Standalone Systems

To upgrade the NVIDIA software on an SGI UV or other standalone system using the sgi-install-nvidia tool, you use the same procedure as for an initial installation, except that you must first remove the old driver software.

Perform the following steps to upgrade:
1. Remove the old packages with yum (or zypper). For example (if you do not recall the exact package names, see the query sketch after these steps):
   # yum remove old-driver-rpm
   # yum remove old-cudatoolkit-rpm
   # yum remove old-cuda-samples-rpm
2. Install the new packages with yum (or zypper). For example:
   # yum install new-driver-rpm
   # yum install new-cudatoolkit-rpm
   # yum install new-cuda-samples-rpm
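The following query lists the NVIDIA-related packages that are currently installed, so you know which old RPM names to remove in step 1. This is a sketch; the grep patterns assume the SGI package naming conventions shown in Chapter 1:

# rpm -qa | grep -i -e '^nvidia-' -e '^cudatoolkit' -e '^gpucomputing_sdk_samples'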

Upgrading New NVIDIA Software after Upgrading to a New Kernel

As a general rule, you will probably want to use the latest supported version of the NVIDIA software after upgrading to a new kernel.

Latest Supported NVIDIA Driver Software

For the latest NVIDIA driver, use the instructions from section “Selecting a Version from an Archive” on page 2. Select the version labeled as follows: Latest Long Lived Branch version

Download and package that version as described in Chapter 1, “Downloading NVIDIA® Software and Building RPMs.”


Latest Supported CUDA Toolkit and Samples

Confirm that the latest CUDA software is supported by your Linux distribution. See “Downloading the CUDA Toolkit and CUDA SDK Samples” on page 3.

If the desired CUDA software is supported, download and package that version as described in Chapter 1, “Downloading NVIDIA® Software and Building RPMs.”

Upgrading on an SMC Cluster

To upgrade your NVIDIA software on an SMC cluster, use the same procedure as with an initial installation. See Chapter 2, “Installing NVIDIA Software in a Cluster Image.”

Upgrading on an SGI UV or Other Standalone System

To upgrade the NVIDIA software on an SGI UV or other standalone system using the sgi-install-nvidia tool, perform the following steps:
1. Remove the old packages with yum (or zypper). For example:
   # yum remove old-driver-rpm
   # yum remove old-cudatoolkit-rpm
   # yum remove old-cuda-samples-rpm
2. Upgrade to the new kernel.
3. Install the new packages with yum (or zypper). For example:
   # yum install new-driver-rpm
   # yum install new-cudatoolkit-rpm
   # yum install new-cuda-samples-rpm


Performance Tuning

This section describes performance tuning topics pertinent to all SGI systems and topics pertinent to only specific SGI systems.

Error-Correcting Code (ECC) Memory

You can run both Tesla™ and Quadro™ GPUs with ECC enabled or disabled. By default, Tesla GPUs are configured with ECC enabled to accommodate CUDA applications, where accuracy is vital. By default, Quadro GPUs are configured with ECC disabled to accommodate graphics, where a few wrong pixels may be inconsequential. Since ECC consumes about 6% of a GPU’s memory, performance can be enhanced with ECC disabled.

If there is reason to suspect that your GPUs are not in the desired ECC state, you can query their state using the following command:
# nvidia-smi -q
...
    Ecc Mode
        Current : Enabled
        Pending : Enabled

You can enable or disable ECC with the following command:
# nvidia-smi -e n

A value of 0 for n disables ECC mode and a value of 1 enables ECC mode. When ECC mode is toggled, a reboot is required to put the new state into effect. The nvidia-smi -e n command will note that a reboot is required. Once set, the ECC mode will persist across reboots.
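For example, to disable ECC on GPU 0 only and then confirm the pending state, you could enter the following. This is a sketch; the -i option selects a single GPU, and the exact output format can vary with the driver version:

# nvidia-smi -i 0 -e 0
# nvidia-smi -i 0 -q | grep -A 2 "Ecc Mode"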

Tuning on SGI UV Systems

This section describes how you can affect system performance on SGI UV systems in the following areas:
• “Avoiding Long Startup Delays” on page 29
• “Addressing Large Memories” on page 29
• “Maximum PCIe Bandwidth” on page 30


Avoiding Long Startup Delays

To avoid long startup delays, you must ensure that persistence mode is turned on. You can turn on persistence mode at installation time with the following command:
# nvidia-smi -pm 1

The installation of the NVIDIA driver also installs the nvidia-persistenced daemon, which will enable persistence mode but only after a reboot.
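After the daemon starts (or after the next reboot), you can confirm that persistence mode is on with a query such as the following sketch; the exact output format can vary with the driver version:

# nvidia-smi -q | grep -i "Persistence Mode"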

Addressing Large Memories

If an SGI UV system has an address space larger than 1 TB, the IOMMU must be enabled. This is because the NVIDIA GPUs have only a 40-bit maximum address (2^40 bytes = 1 TB). Without the IOMMU, programs will fail or hang randomly as the GPUs attempt to address buffers above the 1 TB limit; the GPUs will be addressing the wrong memory.

To query the address space size, enter the following command:

# sudo topology -v -s | grep Partition

The following subsections show how to alter the boot loader command line to enable the IOMMU. In each case, the machine must be rebooted after making the adjustment.

Automatically Enabling the IOMMU on Any Distribution

Use the uvconfig or nvidia_config file to set up the boot line to turn on the IOMMU automatically. In the file /etc/sysconfig/uvconfig, use the following setting to enable the IOMMU automatically:
UV_UPDATE_IOMMU="yes"

Use the following setting in that file to disable the IOMMU (this is not recommended for production use):
UV_UPDATE_IOMMU="no"

Manually Enabling the IOMMU on SLES 11 sp3

To ensure that the IOMMU is enabled at boot time on SLES 11 sp3, add the options intel_iommu=on iommu=pt to the boot line in file /etc/elilo.conf.


Manually Enabling the IOMMU on RHEL 6

To ensure that the IOMMU is enabled at boot time on RHEL 6, add the options intel_iommu=on,forcedac,pt64 to the boot line in file /boot/grub/menu.lst.

Manually Enabling the IOMMU on RHEL 7 or SLES 12

To ensure that the IOMMU is enabled at boot time on RHEL 7 and SLES 12, add the options intel_iommu=on iommu=pt to the boot line in the file /etc/grub.d/42_sgi.last and enter the following command:
# grub2-mkconfig
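After rebooting, you can confirm that the IOMMU options took effect. The following is only a sketch; the exact dmesg wording varies by kernel version:

# grep -o 'intel_iommu=[^ ]*' /proc/cmdline
# dmesg | grep -i -e DMAR -e IOMMU | head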

Maximum PCIe Bandwidth

On SGI UV systems, each processor's C1E bit in MSR_POWER_CTL must be cleared to achieve maximum PCIe bandwidth. This bit is cleared automatically in SLES 12 and RHEL 7 and can be cleared by the sgi-perf service in SLES 11 and RHEL 6. See the sgi-perf man page.

The following command should show bit 1 as 0 (the second bit from the right):
# rdmsr --hexadecimal 0x1fc
2504005a

In this example, the low-order hex digit is a, which is binary 1010; bit 1 is set, so C1E is on.
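If you need to clear the bit by hand, the following is a minimal sketch. It assumes the msr-tools package is installed and the msr kernel module is loaded; note that per-socket values can differ, and that on SLES 12, RHEL 7, or systems running sgi-perf this adjustment is already handled for you:

# modprobe msr
# VAL=$(rdmsr -c 0x1fc)
# wrmsr -a 0x1fc $(printf '0x%x' $(( VAL & ~0x2 )))
# rdmsr --hexadecimal 0x1fc

The arithmetic clears bit 1 (the C1E bit), and the final read confirms the new value.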

Chapter 5. Managing Intel® Xeon Phi™ Software

This chapter describes the following topics:
• “Components of the Intel Xeon Phi Software Stack” on page 31
• “Downloading Intel MPSS Software” on page 32
• “Installing MPSS on Standalone Systems with Intel Xeon Phi Devices” on page 33
• “Upgrading Intel MPSS Software” on page 34
• “Performance-Tuning Xeon Phi Devices” on page 34

Components of the Intel Xeon Phi Software Stack

The Intel Xeon Phi software stack consists of the following software:
• Intel Manycore Platform Software Stack (MPSS)
• Intel compilers with the following:
  – Native Xeon Phi code targeting
  – Support for Intel Language Extensions for Offload (LEO)
  – Support for accelerators via OpenMP language constructs
• Intel MPI with Xeon Phi support

The Intel MPSS software can be freely downloaded, but Intel compilers and Intel MPI must be purchased from SGI or from Intel directly.


Downloading Intel MPSS Software

Note: The MPSS software can be freely downloaded, but software support entitlement is through the Intel compiler support license.

You can download the software from the following website:

https://software.intel.com/en-us/articles/intel-manycore-platform-software-stack-mpss

Download the following MPSS components:
• MPSS tar file for Linux
• MPSS readme file
• MPSS Software for Coprocessor OS
• releasenotes-linux.txt
• MPSS user guide in PDF form
• Intel MPSS Licensing Agreement (text file)


Installing MPSS on Standalone Systems with Intel Xeon Phi Devices

The MPSS software can be installed and configured in one of two general configurations:
• MPSS basic configuration
  – Supports LEO offload via compiler directives.
  – Supports OpenMP offload constructs.
  – Supports direct login to Xeon Phi devices.
  – Omits the MPSS OFED RPMs and is, therefore, compatible with any version of OFED.
• MPSS software-rich configuration
  This configuration supports what is available in the basic configuration plus the following features:
  – Provides DAPL provider layers on Xeon Phi devices optimized for InfiniBand.
  – Provides local network access to Xeon Phi devices on other nodes.
  – Includes some RPMs that replace components of OFED.
  – Is compatible with specified OFED versions: community OFED and Mellanox® OFED.

Detailed information about installing MPSS is in the readme document that is downloaded with MPSS. The MPSS user guide describes the other aspects of system configuration that you will need.

Installing MPSS in SMC Cluster Compute Images

The basic MPSS installation instructions in the MPSS readme file and MPSS user guide work only on a system that has Xeon Phi devices attached. In an SGI Management Center chroot image, the Xeon Phi devices will not be found. Therefore, you must take some special steps to delay the Xeon Phi configuration until after boot time. Place the configuration steps described in the readme file in a system script that executes on every boot, as in the sketch below. This ensures that the Xeon Phi devices are automatically discovered and configured on every system boot.
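As an illustration, on a systemd-based distribution you might put those configuration steps in a small site script (the path /opt/sgi/bin/configure-xeon-phi.sh below is a placeholder) and run it from a oneshot unit. This is only a sketch of the mechanism; the actual commands inside the script must come from your MPSS readme:

/etc/systemd/system/configure-xeon-phi.service:

[Unit]
Description=Configure Intel Xeon Phi devices at boot

[Service]
Type=oneshot
ExecStart=/opt/sgi/bin/configure-xeon-phi.sh
RemainAfterExit=yes

[Install]
WantedBy=multi-user.target

Enable it in the image so that it runs on every boot:

# chmod +x /opt/sgi/bin/configure-xeon-phi.sh
# systemctl enable configure-xeon-phi.service

On a System V or upstart distribution, an equivalent init script or /etc/init entry serves the same purpose.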


Upgrading Intel MPSS Software

Intel MPSS software is upgraded by performing an uninstall of the current version of MPSS followed by a fresh install of the new version of MPSS. See the MPSS readme file for the uninstall procedure.

Performance-Tuning Xeon Phi Devices

This section describes performance tuning for Xeon Phi devices, covering topics pertinent to all systems and topics pertinent only to specific systems.

Local Device Assignment (all systems)

Data transfers between the host system and Xeon Phi devices can be more efficient if care is taken to co-locate the host threads and the Xeon Phi devices. This is important on all systems but is crucial on SGI UV300 and UV2000 systems. The nt_mic_local_device(3) function, provided by the Numatools RPM, identifies the Xeon Phi device that is connected to the CPU socket to which the calling process is pinned. See the nt_mic_local_device(3) and nt_node_i_am_pinned_to(3) man pages for details. Numatools is provided with SGI Performance Suite.

Local Device Assignment (UV300 and UV2000 systems only)

On SGI ccNUMA systems, it is very advantageous to run the host processes doing offloads on the same CPU sockets that are directly attached via PCIe links to the targeted Xeon Phi devices. One way to discover the NUMA nodes associated with the Xeon Phi devices in the system is to use the SGI topology command, as follows:
# topology -cops -v
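For example, if the topology output shows that the target Xeon Phi device hangs off the PCIe root attached to NUMA node 2 (the node number here is only an illustration), you could start the offloading host process pinned to that node with numactl:

# numactl --cpunodebind=2 --membind=2 ./offload_app

The numactl --hardware command lists the available NUMA nodes; offload_app is a placeholder for your own executable.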

Appendix A. Manually Installing NVIDIA Software (RHEL)

Note: You do not need to perform the procedures described in this appendix. They are performed automatically by the installation of the RPMs built with the sgi-package-nvidia tool. This appendix describes the procedures found in script sgi-install-nvidia, which you can extract in the following manner:

# sgi-package-nvidia -s

This appendix describes the installation of NVIDIA software in the following environment:
• In a compute image of a cluster managed by SGI Management Center (SMC) 3.x or later
• Using the NVIDIA run files rather than the RPMs provided by the sgi-package-nvidia tool
• Using a RHEL Linux distribution

Specifically, this appendix describes the installation of a GPU driver, the CUDA toolkit, and CUDA sample code.

This appendix describes the following steps:
1. “Perform the General Setup.” on page 36
2. “Make the NVIDIA Modules for the Kernel Source.” on page 36
3. “Make Symbol and Weak Module Updates.” on page 37
4. “Install Non-kernel Parts of the Driver.” on page 38
5. “Install the CUDA Toolkit and Sample Code” on page 40
6. “Set File Permissions for Use of CUDA” on page 41


Perform the General Setup.

Use the following steps:
1. Create a base image and modify it as necessary. See the first three steps in Chapter 2, “Installing NVIDIA Software in a Cluster Image.”
2. Ensure that gcc is installed in the image.
3. Disable X. Use init 3 to go to run level 3.
4. Ensure that the nouveau driver is not installed.
5. Remove any running NVIDIA module (e.g., nvidia or nvidia-uvm). A running nvidia-persistenced can be using the module.
6. Copy the driver run file to the image and chroot to the image:
   # cp ./NVIDIA-Linux-x86_64-${NVIDIA_VERSION}.run \
     /var/lib/systemimager/images/image
   # chroot /var/lib/systemimager/images/image
7. Expand the run file:
   # sh ./NVIDIA-Linux-x86_64-${NVIDIA_VERSION}.run -x

This creates directory ./NVIDIA-Linux-x86_64-${NVIDIA_VERSION}.

Make the NVIDIA Modules for the Kernel Source.

Follow these steps to make the NVIDIA modules for the kernel source.
1. Ensure that the kernel-devel package is installed before proceeding. Check for the following files:
   /usr/src/kernels/version/.config
   /usr/src/kernels/version/include/linux/version.h
   If they do not exist, enter the following:
   # cd /usr/src/kernels/version
   # make oldconfig
   # make init


2. Make the NVIDIA modules for kernel version version:
   # cd NVIDIA-Linux-x86_64-${NVIDIA_VERSION}/kernel
   # export IGNORE_CC_MISMATCH=1
   # export SYSSRC=/usr/src/kernels/version
   # export SYSOUT=/usr/src/kernels/version
   # make module
   # cd uvm
   # make module
3. Copy the modules into place:
   # cd NVIDIA-Linux-x86_64-${NVIDIA_VERSION}
   # mkdir -p /lib/modules/version/video
   # cp kernel/nvidia.ko /lib/modules/version/video/nvidia.ko
   # cp kernel/uvm/nvidia-uvm.ko /lib/modules/version/video/nvidia-uvm.ko

Make Symbol and Weak Module Updates.

In the directory NVIDIA-Linux-x86_64-${NVIDIA_VERSION}, the /boot/symvers-xxx.gz file for each target kernel needs the nvUvmInterface symbols; otherwise, nvidia-uvm.ko is falsely seen as incompatible with compatible kernels.
1. Ensure that kernel/uvm/Module.symvers is built. For each kernel ${KERN} in /lib/modules, enter the following in the directory NVIDIA-Linux-x86_64-${NVIDIA_VERSION}:
   # cp /boot/symvers-${KERN}.gz bootsymvers.gz
   # gunzip bootsymvers.gz
   # sort -u bootsymvers kernel/uvm/Module.symvers > /boot/symvers-${KERN}
   # gzip -f /boot/symvers-${KERN}
2. Run weak-modules on /lib/modules/nvidia.modules:
   # export WM=/sbin/weak-modules
   # $WM --verbose --add-modules < /lib/modules/nvidia.modules
3. Make new module dependencies:
   # depmod


Install Non-kernel Parts of the Driver.

To install the non-kernel parts of the driver, do the following:
1. Go to the directory with the NVIDIA driver source:
   # cd NVIDIA-Linux-x86_64-${NVIDIA_VERSION}
2. Invoke the NVIDIA installer:
   # ./nvidia-installer --ui=none --no-questions --run-nvidia-xconfig \
     --accept-license --no-kernel-module
   The NVIDIA-Linux-x86_64-${NVIDIA_VERSION} directory can be removed.
3. Install the nvidia-persistenced daemon:
   # cd /usr/share/doc/NVIDIA_GLX-1.0/sample
   # tar xfj nvidia-persistenced-init.tar.bz2
   # cd nvidia-persistenced-init
   # ./install.sh


4. Adjust the nvidia-persistenced startup:
   • For systems using systemd:
     – Add --persistence-mode to the startup command line of /usr/lib/systemd/system/nvidia-persistenced.service.
     – Start the daemon:
       # systemctl start nvidia-persistenced
   • For systems using System V:
     – Add --persistence-mode to the startup command line of /etc/init.d/nvidia-persistenced.
     – Start the daemon:
       # /etc/init.d/nvidia-persistenced start
   • For systems using upstart:
     – Add --persistence-mode to the startup command line of /etc/init/nvidia-persistenced.conf.
     – Start the daemon:
       # initctl start nvidia-persistenced
5. Exit from the chroot:
   # exit


Install the CUDA Toolkit and Sample Code

To install the CUDA toolkit and CUDA samples, do the following:
1. Copy cuda-linux64-rel-cuda-version.run to the image and chroot to the image:
   # cp cuda-linux64-rel-cuda-version.run \
     /var/lib/systemimager/images/image
   # chroot /var/lib/systemimager/images/image
2. Install /usr/local/cuda in the image:
   # sh cuda-linux64-rel-cuda-version.run -noprompt --nox11 \
     -cudaprefix=/usr/local/cuda
   The -noprompt option silences all prompts during the install and implies acceptance of the EULA. The --nox11 option suppresses the creation of an xterm.

If you wish to install and make the CUDA sample codes, do the following:
1. Ensure that the image contains the packages freeglut-devel, gcc, and gcc-c++.
2. Copy cuda-samples-linux-cuda-version.run to your image.
3. Install the sample code:
   # chroot /var/lib/systemimager/images/image
   # sh cuda-samples-linux-cuda-version.run -noprompt --nox11 \
     -cudaprefix=/usr/local/cuda
   # export PATH=$PATH:/usr/local/cuda/bin
   # export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib64:/lib64:\
     /usr/local/cuda/lib64:/usr/local/cuda/nvvm/lib64
   # cd /usr/local/cuda/samples
   # make -j 8
   # exit


Set File Permissions for Use of CUDA

To use CUDA, users need to belong to Unix file group video.

To add the current user to that group, enter the following:
# usermod -a -G video $USER

Alternatively, you can make the device available to everyone by ensuring that the file /etc/modprobe.d/50-nvidia.conf contains this entry:
options nvidia NVreg_DeviceFileUID=0 NVreg_DeviceFileGID=33 \
NVreg_DeviceFileMode=0666

This entry sets the permissions on the /dev/nvidia* device files. The GID of 33 is the video group (see /etc/group). The mode specification of 0666 grants read and write access to all users.
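After a reboot with that modprobe entry in place, you can confirm the resulting permissions with a quick check (a sketch; the number of device nodes depends on how many GPUs are installed). Each node should show mode crw-rw-rw-:

# ls -l /dev/nvidia*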


Appendix B. Manually Installing NVIDIA Software (SLES)

Note: You do not need to perform the procedures described in this appendix. They are performed automatically by the installation of the RPMs built with the sgi-package-nvidia tool. This appendix describes the procedures found in script sgi-install-nvidia, which you can extract in the following manner:

# sgi-package-nvidia -s

This appendix describes the installation of NVIDIA software in the following environment:
• In a compute image of a cluster managed by SGI Management Center (SMC) 3.x or later
• Using the NVIDIA run files rather than the RPMs provided by the sgi-package-nvidia tool
• Using a SLES Linux distribution

Specifically, this appendix describes the installation of a GPU driver, the CUDA toolkit, and CUDA sample code.

This appendix describes the following steps:
1. “Perform the General Setup.” on page 44
2. “Make the NVIDIA Modules for the Kernel Source.” on page 45
3. “Make Symbol and Weak Module Updates.” on page 46
4. “Install Non-kernel Parts of the Driver.” on page 47
5. “Install the CUDA Toolkit and Sample Code” on page 49
6. “Set File Permissions for Use of CUDA” on page 50


Perform the General Setup.

Use the following steps:
1. Create a base image and modify it as necessary. See the first three steps in Chapter 2, “Installing NVIDIA Software in a Cluster Image.”
2. Ensure that gcc is installed in the image.
3. Disable X. Use init 3 to go to run level 3.
4. Ensure that the nouveau driver is not installed.
5. Remove any running NVIDIA module (e.g., nvidia or nvidia-uvm). A running nvidia-persistenced can be using the module.
6. Copy the driver run file to the image and chroot to the image:
   # cp ./NVIDIA-Linux-x86_64-${NVIDIA_VERSION}.run \
     /var/lib/systemimager/images/image
   # chroot /var/lib/systemimager/images/image
7. Expand the run file:
   # sh ./NVIDIA-Linux-x86_64-${NVIDIA_VERSION}.run -x

This creates directory ./NVIDIA-Linux-x86_64-${NVIDIA_VERSION}.


Make the NVIDIA Modules for the Kernel Source.

Follow these steps to make the NVIDIA modules for the kernel source.
1. Ascertain the version of the kernel source to which the directory /usr/src/linux points. The kernel-source package creates the directory /usr/src/linux as a symlink to the installed source tree. This will be the kernel version, version, used in subsequent steps.
2. Ensure that the following directories and files all contain entries for that version:
   /boot/symvers*
   /lib/modules/*
   /boot/config-version
   If not, remove the old packages and install the proper kernel and kernel-source packages. The /usr/src/version-obj directory may be present because of other administrator procedures on the directory. That is okay.
3. If the /usr/src/version/.config file does not exist, create it:
   # cd /usr/src/version
   # make oldconfig
4. Make enough of the kernel to provide the basic header files:
   # make init
5. Ensure that the file utsrelease.h exists. The directory may be /usr/src/version or /usr/src/version-obj. One of the following should exist:
   x86_64/default/include/generated/utsrelease.h
   include/generated/utsrelease.h
   If neither exists, enter the following:
   # cd /usr/src/linux
   # make oldconfig
   # make init
6. Create the directory /lib/modules/version/build, if it does not exist, by making a symlink to /usr/src/version:
   # ln -s /usr/src/version /lib/modules/version/build
   The module make uses this build directory.


7. If there is no Module.symvers entry in the directory /usr/src/version, create it by doing the following:
   # cp /boot/symvers-version.gz /usr/src/version/Module.symvers.gz
   # gunzip /usr/src/version/Module.symvers.gz

8. Make the modules for kernel version version by entering the following:
   # cd NVIDIA-Linux-x86_64-${NVIDIA_VERSION}/kernel
   # export IGNORE_CC_MISMATCH=1
   # export SYSSRC=/usr/src/version
   # export SYSOUT=/usr/src/version
   # make module
   # cd uvm
   # make module
9. Copy the modules into place by entering the following:
   # cd NVIDIA-Linux-x86_64-${NVIDIA_VERSION}
   # mkdir -p /lib/modules/version/kernel/drivers/video/nvidia
   # cp kernel/nvidia.ko \
     /lib/modules/version/kernel/drivers/video/nvidia/nvidia.ko
   # cp kernel/uvm/nvidia-uvm.ko \
     /lib/modules/version/kernel/drivers/video/nvidia/nvidia-uvm.ko

Make Symbol and Weak Module Updates.

In the directory NVIDIA-Linux-x86_64-${NVIDIA_VERSION}, the /boot/symvers-xxx.gz file for each target kernel needs the nvUvmInterface symbols; otherwise, nvidia-uvm.ko is falsely seen as incompatible with compatible kernels.
1. Ensure that kernel/uvm/Module.symvers is built. For each kernel ${KERN} in /lib/modules, enter the following in the directory NVIDIA-Linux-x86_64-${NVIDIA_VERSION}:
   # cp /boot/symvers-${KERN}.gz bootsymvers.gz
   # gunzip bootsymvers.gz
   # sort -u bootsymvers kernel/uvm/Module.symvers > /boot/symvers-${KERN}
   # gzip -f /boot/symvers-${KERN}


2. Run weak-modules on /lib/modules/nvidia.modules:
   # export WM=/usr/lib/module-init-tools/weak-modules
   # $WM --verbose --add-modules < /lib/modules/nvidia.modules
3. Make new module dependencies:
   # depmod

Install Non-kernel Parts of the Driver.

To install the non-kernel parts of the driver, do the following:
1. Go to the directory with the NVIDIA driver source:
   # cd NVIDIA-Linux-x86_64-${NVIDIA_VERSION}
2. Invoke the NVIDIA installer:
   # ./nvidia-installer --ui=none --no-questions --run-nvidia-xconfig \
     --accept-license --no-kernel-module
   The NVIDIA-Linux-x86_64-${NVIDIA_VERSION} directory can be removed.
3. Install the nvidia-persistenced daemon:
   # cd /usr/share/doc/NVIDIA_GLX-1.0/sample
   # tar xfj nvidia-persistenced-init.tar.bz2
   # cd nvidia-persistenced-init
   # ./install.sh


4. Adjust the nvidia-persistenced startup:
   • For systems using systemd:
     – Add --persistence-mode to the startup command line of /usr/lib/systemd/system/nvidia-persistenced.service.
     – Start the daemon:
       # systemctl start nvidia-persistenced
   • For systems using System V:
     – Add --persistence-mode to the startup command line of /etc/init.d/nvidia-persistenced.
     – Start the daemon:
       # /etc/init.d/nvidia-persistenced start
   • For systems using upstart:
     – Add --persistence-mode to the startup command line of /etc/init/nvidia-persistenced.conf.
     – Start the daemon:
       # initctl start nvidia-persistenced
5. Exit from the chroot:
   # exit


Install the CUDA Toolkit and Sample Code

To install the CUDA toolkit and CUDA samples, do the following:
1. Copy cuda-linux64-rel-cuda-version.run to the image and chroot to the image:
   # cp cuda-linux64-rel-cuda-version.run \
     /var/lib/systemimager/images/image
   # chroot /var/lib/systemimager/images/image
2. Install /usr/local/cuda in the image:
   # sh cuda-linux64-rel-cuda-version.run -noprompt --nox11 \
     -cudaprefix=/usr/local/cuda
   The -noprompt option silences all prompts during the install and implies acceptance of the EULA. The --nox11 option suppresses the creation of an xterm.

If you wish to install and make the CUDA sample codes, do the following:
1. Ensure that the image contains the packages freeglut-devel, gcc, and gcc-c++.
2. Copy cuda-samples-linux-cuda-version.run to your image.
3. Install the sample code:
   # chroot /var/lib/systemimager/images/image
   # sh cuda-samples-linux-cuda-version.run -noprompt --nox11 \
     -cudaprefix=/usr/local/cuda
   # export PATH=$PATH:/usr/local/cuda/bin
   # export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib64:/lib64:\
     /usr/local/cuda/lib64:/usr/local/cuda/nvvm/lib64
   # cd /usr/local/cuda/samples
   # make -j 8
   # exit


Set File Permissions for Use of CUDA

To use CUDA, users need to belong to Unix file group video.

To add the current user to that group, enter the following:
# usermod -A video $USER

Alternatively, you can make the device available to everyone by ensuring that the file /etc/modprobe.d/50-nvidia.conf contains this entry:
options nvidia NVreg_DeviceFileUID=0 NVreg_DeviceFileGID=33 \
NVreg_DeviceFileMode=0666

This entry sets the permissions on the /dev/nvidia* device files. The GID of 33 is the video group (see /etc/group). The mode specification of 0666 grants read and write access to all users.
