SGI™ 2100 Owner’s Guide

Document Number 007-4114-001

CONTRIBUTORS

Written by M. Schwenden, Bruce Miles, and Kameran Kashani
Illustrated by Dan Young and Cheri Brown
Production by Amy Swenson
Engineering contributions by Brad Morrow, Ed Reidenbach, Philip Montalban, Jim Ammon, Joan Roy, Sameer Gupta, and Dean Olson.

St. Peter’s Basilica image courtesy of ENEL SpA and InfoByte SpA. Disk Thrower image courtesy of Xavier Berenguer, Animatica.

© 1999, Silicon Graphics, Inc. All Rights Reserved

The contents of this document may not be copied or duplicated in any form, in whole or in part, without the prior written permission of Silicon Graphics, Inc.

RESTRICTED RIGHTS LEGEND

Use, duplication, or disclosure of the technical data contained in this document by the Government is subject to restrictions as set forth in subdivision (c) (1) (ii) of the Rights in Technical Data and Computer Software clause at DFARS 52.227-7013 and/or in similar or successor clauses in the FAR, or in the DOD or NASA FAR Supplement. Unpublished rights reserved under the Copyright Laws of the United States. Contractor/manufacturer is SGI, 1600 Amphitheatre Pkwy., Mountain View, CA 94043-1351.

Silicon Graphics, the Silicon Graphics logo, CHALLENGE, IRIS, IRIX, and Onyx are registered trademarks, and Origin, Origin2000, Origin Vault, SGI, and XIO are trademarks, of Silicon Graphics, Inc. R10000 is a trademark of MIPS Technologies, Inc. UNIX is a registered trademark in the United States and other countries, licensed exclusively through X/Open Company, Ltd. VME is a trademark of Motorola.

Contents

List of Figures
List of Tables
About This Guide
    Finding Additional Information
    Online Reference (Manual) Pages
    Release Notes
    World Wide Web-Accessible Documentation
    Conventions
    Compliance Information
1. Introducing the SGI 2100
    System Features
    SGI 2100 Functional Overview
    Linked Microprocessors
    ccNUMA Architecture and Memory
    The Node Boards
    The I/O Subsystem
    About the XIO Boards
    The System Midplane
    Module System Controller
    Internal Drives
    System Location and Environment
2. Chassis Tour
    SGI 2100 System Physical Description
    Components and Controls on the Front of the System
    Components and Controls on the Rear of the System
    Power Connector and Switch
    System Node Board Locations
    Node Board LEDs
    The System Midplane
    System Configuration Guidelines
    Node and Router Board Combinations
    Maximum Number of CPUs
    Node and XIO Board Combinations
    XIO Board Slots
    The BaseIO Panel
3. Getting Started
    System Operation Guidelines
    Operating Voltages
    Safety Precautions
    Sliding Open the Front Door Panel
    Removing the System’s Plastic Covers
    System Drives
    Connecting to an Ethernet
    Powering On the SGI 2100 System
    Powering Off the SGI 2100 System
4. SGI 2100 Interface and Cabling Information
    The Ethernet Interface Connection
    Standard Serial Ports
    The Standard SCSI Connector
5. Installing and Replacing Customer Replaceable Units
    Installing or Removing the System Disk and Optional Hard Drives
    Removing or Inserting a Data Disk
    Replacing the Module System Controller or CD-ROM Drive
    Installing External Drives
6. Using the Module System Controller
    The MSC Front Panel
    Understanding the MSC’s LEDs and Switches
    MSC Features and Functions
    MSC Status Messages
7. Basic Troubleshooting
    General Guidelines
    Operating Guidelines
    Power Supply Problems
    The Amber (Yellow) LED
    The Green LED
    The Red LED
    Crash Recovery
    Rebooting the System
    Restoring System Software
    Restoring From Backup Tapes
    Restoring a Filesystem From the System Maintenance Menu
    Recovery After System Corruption
    MSC Shutdown
    Fixing the MSC Shutdown
    Hardware Graph and hinv Commands
    Hardware Graph Information
    hinv Information
Index

List of Figures

Figure i     Information Sources for the SGI 2100 System
Figure ii    VCCI Information
Figure iii   Regulatory Insignia
Figure 1-1   The SGI 2100 Server
Figure 1-2   Node Board Example
Figure 2-1   SGI 2100 System Components
Figure 2-2   Opening the Front of the SGI 2100 System
Figure 2-3   CD-ROM and Module System Controller
Figure 2-4   The System Disk and Optional Drive Bays
Figure 2-5   Component and Control Locations on the Back
Figure 2-6   Node Board LEDs
Figure 2-7   The SGI 2100 Midplane (Front View)
Figure 2-8   SGI 2100 Midplane (Rear View)
Figure 2-9   SGI 2100 Router and Node Board Configurations
Figure 2-10  Node and XIO Board Functional Configurations
Figure 2-11  XIO Board Slots
Figure 2-12  BaseIO Panel Connections and Indicators
Figure 3-1   Opening and Closing the Sliding Front Panel
Figure 3-2   Removing the Front Plastic Panel
Figure 3-3   Removing the Top Plastic Panel
Figure 3-4   SGI 2100 Internal Drive Bays
Figure 3-5   MSC Keyswitch and Front-Panel Controls
Figure 3-6   System Power Cable and Switch
Figure 4-1   Standard Ethernet on the SGI 2100
Figure 4-2   Serial Port Location and Pinouts
Figure 4-3   68-Pin Single-Ended SCSI Connector
Figure 5-1   Installing or Removing the System Disk
Figure 5-2   Removing a Data Disk Drive Module
Figure 5-3   Installing or Replacing the MSC or CD-ROM Drive
Figure 5-4   External Origin Drive Expansion Box
Figure 6-1   MSC Interface Location
Figure 6-2   MSC Status Panel and Switches
Figure 6-3   MSC Front Diagnostic Port Pinouts
Figure 6-4   MSC Rear Diagnostic Serial Connector

List of Tables

Table 1-1   Air Clearance Requirements for the SGI 2100 System
Table 2-1   SGI 2100 System Physical Specifications
Table 2-2   Functional Configuration Overview
Table 2-3   BaseIO Connectors
Table 4-1   Ethernet 100-BASE T Ethernet Port Pin Assignments
Table 4-2   68-Pin Single-Ended, High-Density SCSI Pinouts
Table 6-1   MSC Messages

About This Guide

This guide is designed to help you learn to use, manage, troubleshoot, and upgrade your SGI 2100 server and is organized as follows:

Chapter 1, “Introducing the SGI 2100,” describes the system and its capabilities and contrasts them with other server technology. A brief overview of the system’s compute and interface capabilities is provided.

Chapter 2, “Chassis Tour,” describes all of the system components and reviews all of the controls, indicators, and connectors.

Chapter 3, “Getting Started,” reviews hardware-specific operating procedures. The chapter covers booting the system, graceful shutdown, and proper use of optional console terminals.

Chapter 4, “SGI 2100 Interface and Cabling Information,” covers the use of Ethernet, serial, and external SCSI interfaces. The chapter also describes optional types of connections that make the system operational.

Chapter 5, “Installing and Replacing Customer Replaceable Units,” describes installation and replacement procedures for disk, CD-ROM, and System Controller assemblies. Includes basic information on external peripherals.

Chapter 6, “Using the Module System Controller,” describes the basic System Controller and interface panel used with the SGI 2100 server.

Chapter 7, “Basic Troubleshooting,” offers information on tracking down and fixing simple problems.

Start at the beginning to familiarize yourself with the features of your new system, or proceed directly to the information you need using the table of contents as your guide.


Additional software-specific information is found in the following software guides:
• Personal System Administration Guide
• IRIX Admin: System Configuration and Operation
• IRIX Admin: Software Installation and Licensing

Finding Additional Information

This SGI 2100 Owner’s Guide covers many basic and useful topics that are necessary for setting up, operating, and maintaining your system. Refer to it whenever you need help with the basic hardware aspects of your system. The system and the procedures in this guide are designed to help you maintain the system without the help of a trained technician. However, do not feel that you must work with the hardware yourself. You can always contact your maintenance provider to have an authorized service provider work with the hardware instead.

Figure i and the following sections describe multiple, additional sources of information that you may find helpful or vital to your work with the SGI 2100.


Figure i Information Sources for the SGI 2100 System
(Sources shown: hard copy, the SGI 2100 Owner's Guide and the optional IRIX Admin manual set for IRIX 6.X systems, which is also available online; online, InSight books on CD and reference (man) pages; World Wide Web, http://techpubs.sgi.com)


Online Reference (Manual) Pages

Your system comes with a set of IRIX® reference (manual) pages, formatted in the standard UNIX® “man page” style. These are found online on the internal system disk (or CD-ROM) and are displayed using the man command. For example, to display the reference page for the Add_disk command, enter the following command at a shell prompt:

man Add_disk

Important system configuration files as well as commands are documented on reference pages. References in the documentation to these reference pages include the name of the command and the section number in which the command is found. For example, “Add_disk(1)” refers to the Add_disk command and indicates that it is found in section 1 of the IRIX reference.

For additional information about displaying reference pages using the man command, see man(1).

In addition, the apropos command locates reference pages based on keywords. For example, to display a list of reference pages that describe disks, enter the following command at a shell prompt:

apropos disk

For information about setting up and using apropos, see apropos(1) and makewhatis(1M).

Release Notes

You can view the release notes for a variety of SGI™ products and software subsystems using one of two utilities:
relnotes     Text-based viewer for online release notes.
grelnotes    Graphical viewer for online release notes.

To see a list of available release notes, type the following at a shell prompt:

relnotes

For more information, see the relnotes(1) and grelnotes(1) reference pages.


World Wide Web-Accessible Documentation

SGI makes its manuals available in a variety of formats via the World Wide Web (WWW). Using your Web browser, open the following URL:

http://techpubs.sgi.com/

Conventions

The SGI 2100 Owner’s Guide uses these conventions:
• References to documents are in italics.
• References to other chapters and sections within this guide are in quotation marks.
• Names of commands that you type at the shell prompt are in italics, as are IRIX filenames.
• Steps to perform tasks are in numbered sentences. When a numbered step needs more explanation, the explanation follows the step.


Compliance Information

FCC WARNING

This equipment has been tested and found compliant with the limits for a Class A digital device, pursuant to Part 15 of the FCC rules. These limits are designed to provide reasonable protection against harmful interference when the equipment is operated in a commercial environment. This equipment generates, uses, and can radiate radio frequency energy and, if not installed and used in accordance with the instruction manual, may cause harmful interference to radio communications. This product requires the use of external shielded cables in order to maintain compliance. Changes or modifications to this product not expressly approved by the party responsible for compliance could void the user’s authority to operate the equipment. Operation of this equipment in a residential area is likely to cause harmful interference, in which case users will be required to correct the interference at their own expense.

You may find the following booklet, prepared by the Federal Communications Commission, helpful: Interference Handbook 1993 Edition. This booklet is available from the U.S. Government Printing Office, Superintendent of Documents, Mail Stop: SSOP, Washington D.C. 20402-9328, ISBN 0-16-041736-8.

Canadian Department of Communications Statement

This digital apparatus does not exceed the Class A limits for radio noise emissions from digital apparatus as set out in the Radio Interference Regulations of the Canadian Department of Communications.

Attention

Le présent appareil numérique n’émet pas de perturbations radioélectriques dépassant les normes applicables aux appareils numériques de Classe A prescrites dans le Règlement sur les interférences radioélectriques établi par le Ministère des Communications du Canada.


Figure ii VCCI Information

Figure iii Regulatory Insignia
(Insignia shown: TUV geprufte Sicherheit and NRTL/C)

Manufacturer’s Regulatory Declarations

This workstation conforms to several national and international specifications and European directives as listed on the “Manufacturer’s Declaration of Conformity,” which is included with each computer system and peripheral. The CE insignia displayed on each device is an indication of conformity to the European requirements.

Your workstation has several governmental and third-party approvals, licenses, and permits. Do not modify this product in any way that is not expressly approved by Silicon Graphics, Inc. If you do, you may lose these approvals and your governmental agency authority to operate this device.


Chapter 1: Introducing the SGI 2100

The SGI 2100 system, model CMN A015, is a high-performance server that scales up to eight processors in a compact enclosure. This guide contains end-user hardware information about the system.

System Features

The SGI 2100 server comes with one to four combined CPU and memory boards called node boards. Each node board uses one or two MIPS 64-bit CPU microprocessors. The basic SGI 2100 uses one CPU and the system can use up to eight CPUs when fully configured.

The following standard features come with every SGI 2100 server:
• One (CPU and memory) node board with one or two microprocessors.
• Slots for up to 12 optional XIO boards. Note that the SGI 2100 server does not come with any VME slots.
• An independent system status monitor (System Controller) that records error information during any unplanned shutdown.
• Spaces for up to five half-height single-connector assembly (SCA) SCSI disk drives, plus a 5.25-inch internal drive bay that supports a CD-ROM drive.
• A minimum of 256 MB of RAM on each system node board installed.

Available options include:
• additional node boards (up to four per system)
• additional hard disk drives
• a system console ASCII terminal
• memory upgrades
• XIO boards providing additional I/O, mass storage connections, and graphics capabilities
• a three-board optional peripheral component interconnect (PCI) internal adaptor that connects to the XIO slot directly below the BaseIO board (each of the three PCI slots in the adaptor supports a 25-watt PCI board)

Figure 1-1 The SGI 2100 Server
(Views shown: front and rear)

The SGI 2100 server is similar in size to previous SGI deskside systems. However, there are some design differences between it and Origin systems, notably an upper limit of eight CPUs for a system.


SGI 2100 Functional Overview

The SGI 2100 server is a symmetric multiprocessing system that uses a distributed shared-memory architecture called ccNUMA (cache-coherent Non-Uniform Memory Architecture).

Linked Microprocessors

The node boards within the system use links that differ from bus technology. While a bus is a resource that can be used by only one processor at a time, the communications fabric in the SGI 2100 makes connections from processor to processor as they are needed. Each node board contains either one or two processors, a portion of main memory, a directory to maintain cache coherence, and two interfaces:
• The first interface connects to multiple I/O devices.
• The second interface connects to other node boards.

This web of connections differs from a bus in the same way that multiple dimensions differ from a single dimension. You could describe a bus as a one-dimensional line while the SGI 2100 uses a multi-dimensional mesh.

The multiple data paths used are constructed as they are needed by router ASICs, which act as switches. As you add node boards, you add to and scale the system bandwidth.

ccNUMA Architecture and Memory

Main memory on each node board in the system can be distributed and shared among the system microprocessors. This shared memory is accessible to all processors through the interconnection fabric and can be accessed with low latency.

Each node board added to the system is another independent memory source, and each node board is capable of optionally supporting up to 4 GB of memory. A directory memory keeps track of information necessary for hardware coherency and protection.

Each node board uses a “Hub” ASIC that is the distributed shared-memory controller. It is responsible for providing all of the processors and I/O devices with transparent access to all of the distributed memory in a cache-coherent manner. Cache coherence is the ability to keep data consistent throughout a system. In the SGI 2100, data can be copied and shared among all the processors and their caches. Moving data into a cache may cause the cached copy to become inconsistent with the same data stored elsewhere. The cache coherence protocol is designed to keep data consistent and to disperse the most recent version of data to wherever it is being used.
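As an illustration of the idea only, the following Python sketch models a toy directory that records which caches hold a copy of a memory line and invalidates stale copies on a write. The class names, the write-through policy, and the invalidation scheme are assumptions made for this example; this guide does not describe the Hub ASIC's actual protocol.

```python
class Cache:
    """A processor cache holding private copies of memory lines."""

    def __init__(self, cache_id):
        self.cache_id = cache_id
        self.lines = {}  # address -> cached value


class Directory:
    """Toy directory: remembers which caches share each memory line."""

    def __init__(self, memory):
        self.memory = dict(memory)  # address -> value in main memory
        self.sharers = {}           # address -> set of cache ids with a copy

    def read(self, cache, addr):
        # Copy the line into the cache and record the new sharer.
        cache.lines[addr] = self.memory[addr]
        self.sharers.setdefault(addr, set()).add(cache.cache_id)
        return cache.lines[addr]

    def write(self, cache, addr, value, caches):
        # Invalidate every other cached copy so no stale data survives.
        for cache_id in self.sharers.get(addr, set()) - {cache.cache_id}:
            caches[cache_id].lines.pop(addr, None)
        # Write through to memory (a simplification) and keep one sharer.
        self.memory[addr] = value
        cache.lines[addr] = value
        self.sharers[addr] = {cache.cache_id}
```

In this sketch, `caches` is a list indexed by cache id. After one cache writes a line, any other cache that held it must read it again and so receives the most recent value.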

Although memory is physically dispersed across the system node boards, special page migration hardware moves data into memory closer to a processor that frequently uses it. This page migration scheme reduces memory latency (the time it takes to retrieve data from memory). Although main memory is distributed, it is universally accessible and shared between all the processors in the system. Similarly, I/O devices are distributed among the nodes, and each device is accessible to every processor in the system.
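The page migration idea can be pictured with a small counter-based sketch in Python. Everything here is hypothetical: the `Page` class, the `record_access` function, and the threshold of 4 are invented for illustration, and the guide does not document how the migration hardware actually decides when to move a page.

```python
from collections import Counter


class Page:
    """A memory page with a home node and per-node access counts."""

    def __init__(self, home_node):
        self.home_node = home_node
        self.accesses = Counter()  # node id -> accesses since last migration


def record_access(page, node, threshold=4):
    """Count an access; migrate the page when a remote node leads the
    home node by at least `threshold` accesses. Returns True on migration."""
    page.accesses[node] += 1
    lead = page.accesses[node] - page.accesses[page.home_node]
    if node != page.home_node and lead >= threshold:
        page.home_node = node      # move the page closer to its heavy user
        page.accesses.clear()      # start counting afresh
        return True
    return False
```

A page that starts on node 0 and is then touched repeatedly from node 2 stays put for the first few accesses and moves once node 2's lead reaches the threshold.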

The Node Boards

The SGI 2100’s microprocessor and primary memory are located on a processor board called a node board. Each node board (up to four maximum) in the SGI 2100 can house one or two MIPS microprocessors. Each CPU uses a customized two-way interleaved data cache, and has dedicated second-level cache support.

A high-performance bus interface links each CPU processor directly with supporting SRAM. The node board’s main memory slots can be populated with 32-MB or 64-MB memory modules. See Figure 1-2 for an example node board illustration.


Figure 1-2 Node Board Example
(Callouts: main memory DIMMs (16); directory memory DIMM slots (8), not used in the SGI 2100; Hub chip with heat sink; 300-pin compression connector; power/ground connectors; MIPS processor and secondary cache (HIMM) with heat sink)

Note that directory memory is used only in large-scale rackmounted systems and is not used in the SGI 2100.


The I/O Subsystem

The standard I/O subsystem consists of a base I/O board assembly (BaseIO) that supports:
• two nine-pin serial ports (selectable for RS-232 or RS-422 operation)
• a 100-Mb per second (100 Base-T) Ethernet connection
• a 68-pin single-ended Ultra SCSI and SCSI-2 compatible connector

Additional I/O connection capabilities are available with optional XIO boards or by ordering an expanded version of the BaseIO.

About the XIO Boards

XIO boards give the SGI 2100 system a wide range of optional interfaces in a manner similar to older VME interfaces. Optional XIO boards can support interfaces such as:
• PCI
• Fibre Channel
• HIPPI
• Ultra (FAST-20) SCSI and SCSI-2
• ATM
• Ethernet
• Gigabit Ethernet
• Digital Video (HDTV)
• DIVO
• GSN

Check with your SGI sales or support representative for information on these or other optional interfaces available on XIO boards.


The System Midplane

The SGI 2100 enclosure uses a midplane to which boards, disk drives, and other devices can attach from both the front and rear of the system.

Module System Controller

Located between the disk drive slots and the optional CD-ROM drive bay is the Module System Controller (MSC). The MSC is a microprocessor-controlled subsystem that is mounted directly to the system midplane by way of an extender board. It monitors various system operations, including chassis temperature, system fan speed, midplane voltage levels, and the system clock.

When any operating parameter exceeds or drops past a specified limit, the MSC executes a controlled shutdown of the system. For details on using the MSC, see Chapter 6 in this document.

Internal Drives

Each SGI 2100 comes standard with a system disk installed in drive bay one (next to the MSC). Four additional internal hard drives may be installed.

The CD-ROM drive is installed directly to the left of the MSC. Note that single-ended ultra SCSI and SCSI-2 drives are the only internal devices supported by the SGI 2100 system.

System Location and Environment

This section covers the basic requirements for physical location to ensure proper chassis operation.

The SGI 2100 chassis is designed to fit into a typical work environment. Take care to maintain the following operating conditions:
• The chassis should be kept in a clean, dust-free location to reduce maintenance problems.
• The available power should be rated for computer operation.
• The chassis should be protected from harsh environments that produce excessive vibration, heat, and similar conditions.
• The chassis should ideally have a six-inch (15-cm) minimum air clearance above the top. The first line of Table 1-1 shows the side clearances required if the chassis is positioned under a desk or other equipment and the top air clearance is less than six inches (15 cm). The side air clearances should always be at least as great as those listed on the second line of Table 1-1.

Table 1-1 Air Clearance Requirements for the SGI 2100 System

Top Clearance          Left Side      Right Side     Front          Back

6” (15 cm) or less     6” (15 cm)     6” (25 cm)     8” (20 cm)     8” (20 cm)
More than 6” (15 cm)   1” (2.5 cm)    1” (2.5 cm)    6” (15 cm)     6” (15 cm)

For more information on system specifications, see Table 2-1 in Chapter 2.

If you have additional questions concerning physical location or site preparation, see the Site Preparation for Origin Family and Onyx2 manual (P/N 007-3452-nnn). If you are unable to find the information you need, contact your SGI System Support Engineer (SSE) or other authorized support organization representative.

Chapter 2: Chassis Tour

This chapter is intended to familiarize you with the physical, electrical, and mechanical aspects of the SGI 2100 server. Standard controls and connectors are described and illustrated along with major components that go in the SGI 2100 chassis. All boards, drives, and other components are housed in a single, upright enclosure and, with its small physical dimensions and quiet operation, the server fits into a lab, server room, or a normal office environment.

Commonly used optional components are also shown and discussed in this chapter.

SGI 2100 System Physical Description

The SGI 2100 system is a compact, high-performance server that easily fits in most office environments.

The unit weighs a minimum of 120 pounds (54.5 kg) but is easily moved about on its four rollers. When fully loaded, the system could weigh as much as 170 pounds (77.3 kg). Never attempt to lift the unit without the assistance of other people.

Table 2-1 provides the basic physical specifications for the SGI 2100 system.

See the Site Preparation for Origin Family and Onyx2 manual (P/N 007-3452-nnn) for additional information.


Table 2-1 SGI 2100 System Physical Specifications

Parameter                    Specification

Dimensions (installed)
  Height                     26.5” (67.3 cm)
  Width                      20” (50.8 cm)
  Depth                      24” (61 cm)

Weight
  Minimum                    120 lbs (54.5 kg)
  Maximum                    170 lbs (77.3 kg)
  Shipping (max.)            190 lbs (86.4 kg)

Floor Loading
  Minimum                    36 lb/ft² (175 kg/m²)
  Maximum                    51 lb/ft² (250 kg/m²)

Air Temperature
  Operating (< 5000 ft)      41° to 95° F (5° to 35° C)
  Operating (> 5000 ft)      41° to 86° F (5° to 30° C)
  Non-operating              −15° to 107° F (−20° to 60° C)

Altitude
  Operating                  10,000 ft (3,048 m) MSL, maximum
  Non-operating              40,000 ft (12,192 m) MSL, maximum

Humidity
  Operating                  10% - 90% (non-condensing)
  Non-operating              10% - 95% (non-condensing)

Acoustics
  Typical                    50 dBa
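The operating-temperature rows of Table 2-1 reduce to a simple lookup by altitude. The Python sketch below is an illustration only: the function names are invented for this example, and treating exactly 5,000 ft as part of the lower-altitude band is an assumption, since the table does not say which band that altitude falls in.

```python
MAX_OPERATING_ALTITUDE_FT = 10_000  # Table 2-1, operating altitude limit


def operating_temp_range_c(altitude_ft):
    """Return the (min, max) operating air temperature in Celsius."""
    if altitude_ft > MAX_OPERATING_ALTITUDE_FT:
        raise ValueError("above the maximum operating altitude")
    if altitude_ft > 5_000:
        return (5, 30)  # Table 2-1: operating (> 5000 ft)
    return (5, 35)      # Table 2-1: operating (< 5000 ft); 5000 ft assumed here


def celsius_to_fahrenheit(temp_c):
    """Convert Celsius to Fahrenheit."""
    return temp_c * 9 / 5 + 32


def within_operating_range(temp_c, altitude_ft):
    """True if the air temperature is inside the operating band."""
    low, high = operating_temp_range_c(altitude_ft)
    return low <= temp_c <= high
```

For example, a 33° C room is inside the operating band at 1,000 ft but outside it at 8,000 ft, where the upper limit drops to 30° C.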

Figure 2-1 shows the SGI 2100 chassis and some of its major components.


Figure 2-1 SGI 2100 System Components
(Callouts: cap, upper plenum, left side panel assembly, right side panel assembly, System Controller, CD-ROM, system disk, blank drive panels, front door, front panel)


Components and Controls on the Front of the System

The front of the SGI 2100 system has a number of controls and components that you should be familiar with. The system’s removable media drive bay, Module System Controller (MSC), and hard disk drive bays are all accessible by opening the sliding front plastic cover (door).

Open the front sliding door panel by pushing it down until it catches (see Figure 2-2).

The removable media bay and MSC front panel are next to each other in the upper left corner of the system. Figure 2-3 shows the location of each of these units.

Figure 2-2 Opening the Front of the SGI 2100 System

The MSC is located between the disk drive bays and the CD-ROM drive bay. The MSC is a microprocessor-controlled subsystem that is mounted directly to the system midplane by way of an “extender” board. It monitors various system operations, including ambient temperature, system fan speed, midplane voltage levels, and the system clock.


When any operating parameter exceeds or drops past a specified limit, the MSC executes a controlled shutdown of the system. During such a shutdown procedure, the controller maintains a log with the last error message(s) received before the shutdown.

For information on using the MSC, see Chapter 6 in this guide.
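The monitoring behavior described above can be pictured as a limit check followed by logging and shutdown. The Python sketch below is a conceptual model, not MSC firmware: the parameter names and the fan and voltage limits are hypothetical, and only the temperature band is borrowed from the operating range in Table 2-1.

```python
# Hypothetical limits: this guide does not list the MSC's real thresholds.
LIMITS = {
    "temperature_c": (5.0, 35.0),    # borrowed from Table 2-1 (operating)
    "fan_rpm": (1500.0, 6000.0),     # invented for illustration
    "midplane_volts": (4.75, 5.25),  # invented for illustration
}


def out_of_range(readings, limits=LIMITS):
    """Return one message per reading that exceeds or drops past a limit."""
    faults = []
    for name, value in readings.items():
        low, high = limits[name]
        if value < low:
            faults.append(f"{name} {value} below limit {low}")
        elif value > high:
            faults.append(f"{name} {value} above limit {high}")
    return faults


def monitor_step(readings, log, limits=LIMITS):
    """Check one set of readings; log faults and request a shutdown."""
    faults = out_of_range(readings, limits)
    if faults:
        log.extend(faults)  # keep the last error messages for later review
        return "shutdown"
    return "ok"
```

Passing a shared `log` list to each call keeps the fault messages available after the shutdown decision, which mirrors the error log the controller maintains.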

Figure 2-3 CD-ROM and Module System Controller
(Callouts: System Controller, CD-ROM, system disk)


Figure 2-4 The System Disk and Optional Drive Bays
(Callouts: system disk, blank drive panel, optional drive bays)


Components and Controls on the Rear of the System

The rear of the system houses the following components:
• The system node board(s). See Figure 2-5.
• The power connector and system power switch.
• The BaseIO system interface panel.
• Slots and carriers for optional PCI and XIO interface boards.

Power Connector and Switch

The system’s main power connector is located on the lower left side of the chassis. The main system power switch is located opposite it on the lower right side.

System power is on when the switch is up and off when it is down.


Figure 2-5 Component and Control Locations on the Back
(Callouts: BaseIO panel, node 4, node 3, node 2, node 1, AC input, main power switch)


System Node Board Locations

The system node board slots are located on the left side at the rear of the chassis. The first node board is always installed in the right-hand slot. See “System Configuration Guidelines” later in this chapter for information on how the node boards are used in conjunction with other system components.

Node Board LEDs

On the back of each node board are a total of 18 LEDs (see Figure 2-6 for an example). Two red LEDs are located near the top of the board and a set of 16 yellow ones are located near the middle of the board.

The two LEDs near the top of the board should light only when there is a voltage inconsistency or problem on the node board. If these LEDs light up frequently, the board may need service. If all the top LEDs on all the node boards in the system light up, it indicates a system-wide power problem. In this case, call your service representative for assistance.

The LEDs grouped near the middle of the board are divided into two vertical sets of eight LEDs (16 total). Each vertical set of eight LEDs represents one of the microprocessors installed on the node board. When only one CPU is installed, you can expect to see LED activity on only one vertical set of LEDs.

As a general rule, the bottom LED should always show some activity while the system is powered on. The bottom LED serves as a kind of “heartbeat” that indicates the CPU is alive, even if the system is not generally active.

The other seven LEDs light up as the number of processes on the CPU increases. The more work the CPU is doing, the more LED activity you see on the back of the node board.


Figure 2-6 Node Board LEDs

The System Midplane

The SGI 2100 enclosure uses a midplane to which boards, disk drives, and other devices can attach from both sides of the system. This allows for maximum functionality and expansion in a compact unit. Figure 2-7 shows a front view of the midplane, while Figure 2-8 shows a rear view.

Single-ended ultra SCSI and SCSI-2 drives are the only devices internally supported by the SGI 2100 system.


Figure 2-7 The SGI 2100 Midplane (Front View)
(Callouts: SCSI drive connectors ID 1 through ID 5, System Controller connector, CD-ROM connector, router slots 0 and 1, XBOW 0 and XBOW 1, node 300-pin connector backing plates, system NIC, midplane NIC, and power/ground sockets)


Figure 2-8 SGI 2100 Midplane (Rear View)
(Callouts: XIO compression connectors, BaseIO connectors, power connectors, router connector backing plates, and 300-pin compression connectors for nodes 1 through 4)


System Configuration Guidelines

The SGI 2100 system is designed to expand in functionality depending on your hardware computing needs. Standard and optional boards can be combined to build a maximum functional configuration.

Node and Router Board Combinations

The node and router boards are interdependent and enable high-speed communications within the system among other duties. All SGI 2100 systems ship with the first router board installed. Your system can operate without any router boards, but you can use only one node board and one bank of XIO slots. The first row of configuration information in Table 2-2 shows the restrictions. You must have at least one router board to operate two node boards. Figure 2-9 illustrates the node board and router configuration interdependence.

Maximum Number of CPUs

The maximum number of CPUs is eight, installed on four node boards (up to two CPUs per node board).

Node and XIO Board Combinations

Table 2-2 provides an overview of interdependent system boards that combine to build the chassis into a maximum compute server.

Node board slots are counted from right to left. Router board and XIO board slots are counted from left to right. Figure 2-10 shows a functional view of the back of the system.


The circles (O) and triangles (∆) represent the interdependence of the XIO slots and the node boards that support them.

Table 2-2 Functional Configuration Overview

1st Router Board | 2nd Router Board | 1st and 2nd Node Bd. Slots | 3rd and 4th Node Bd. Slots | XIO Slots 1-6 | XIO Slots 7-12
Not installed | Not installed | 1st node (O) is operational | Not usable | Operational (O) | (∆) Not usable
Installed | Not installed | 2nd node (∆) operational if installed | Not usable | Operational (O) | (∆) Operational with use of 2nd node board
Installed | Installed | Operational | Operational | Operational (O) | (∆) Operational if node 2 installed
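The rows of Table 2-2 can be read as a small lookup. The following Python sketch is illustrative only: the function name usable_resources and its return format are assumptions made for this example and are not part of any SGI software. It also assumes that node boards are installed starting from the first node slot.

```python
# Illustrative sketch of the Table 2-2 rules. All names are hypothetical.
CPUS_PER_NODE_BOARD = 2  # each node board holds up to two processors


def usable_resources(router_boards, node_boards):
    """Return the node boards, CPUs, and XIO slot banks that can operate.

    router_boards: number of installed router boards (0, 1, or 2)
    node_boards:   number of installed node boards (1 to 4)
    """
    if router_boards not in (0, 1, 2):
        raise ValueError("router_boards must be 0, 1, or 2")
    if not 1 <= node_boards <= 4:
        raise ValueError("node_boards must be between 1 and 4")

    # Row 1: no router, one node. Row 2: one router, two nodes.
    # Row 3: two routers, up to four nodes.
    max_nodes = {0: 1, 1: 2, 2: 4}[router_boards]
    active_nodes = min(node_boards, max_nodes)

    xio_banks = ["1-6"]  # XIO slots 1-6 are supported by the first node board
    if active_nodes >= 2:
        xio_banks.append("7-12")  # XIO slots 7-12 need the second node board

    return {
        "active_node_boards": active_nodes,
        "max_cpus": active_nodes * CPUS_PER_NODE_BOARD,
        "xio_slot_banks": xio_banks,
    }
```

For example, usable_resources(0, 4) reports a single active node board and only XIO slots 1-6, which matches the first row of the table.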


Figure 2-9 SGI 2100 Router and Node Board Configurations

The figure pairs each number of node boards with a router board type:
• 1 node board (up to 2 processors): no router board
• 2 node boards (up to 4 processors): Null Router Board
• 3 node boards (up to 6 processors): Router Boards with IR1 jumpers
• 4 node boards (up to 8 processors): Router Boards with IR1 jumpers

Legend: N = Node Board, NR = Null Router Board, R = Router Board


Figure 2-10 Node and XIO Board Functional Configurations

Rear chassis diagram callouts: Router 1 and Router 2; Node 1 through Node 4 (node slots); XIO 1 through XIO 12 (XIO slots); BaseIO with single-ended SCSI, Ethernet, and serial connectors.

Block diagram callouts: Node 1 through Node 4, the router boards, Crossbow 0 and Crossbow 1, the XIO slots, and XIO boards such as FDDI, ATM, Quad SCSI, SE to Diff. Converter, and Fibre Channel.


XIO Board Slots

Each SGI 2100 system comes with 12 XIO board slots. Various types of optional interface boards are supported in the XIO slots. These may include
• peripheral component interface (PCI)
• high-performance parallel interface (HIPPI)
• Fibre Channel
• graphics interface (SI Viz Console Board)

There are certain installation restrictions that must be followed when XIO boards are installed or removed. Failure to follow these configuration rules may result in system or peripheral malfunction.

Always
• Keep the BaseIO (IO6S) board installed in XIO slot 1.
• Fill the top XIO slots first (XIO slots 3 and 5 should be filled first).
• Have the PCI module installed in XIO slot 2.

Never
• Move the BaseIO (IO6S or IO6G) board to a slot other than XIO slot 1.
• Have a SCSI board installed in XIO slot 2.
• Have an XIO board installed in an unsupported slot (see Figure 2-10).
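The installation rules above lend themselves to a simple pre-installation check. The Python sketch below is a hypothetical planning aid, not an SGI tool. The board labels ("BaseIO", "PCI", "SCSI") and the function name check_xio_layout are assumptions made for this example. It treats the PCI rule as "a PCI module, if present, belongs in slot 2", and it does not model the fill-order rule or the per-configuration slot support shown in Figure 2-10.

```python
# Illustrative check of the XIO slot rules. Board labels are assumptions.
VALID_SLOTS = range(1, 13)  # the system has 12 XIO slots


def check_xio_layout(layout):
    """Return a list of rule violations for a {slot_number: board_label} dict."""
    problems = []
    if layout.get(1) != "BaseIO":
        problems.append("BaseIO board must be installed in XIO slot 1")
    for slot, board in sorted(layout.items()):
        if slot not in VALID_SLOTS:
            problems.append(f"slot {slot} is not an XIO slot (valid: 1-12)")
        elif board == "BaseIO" and slot != 1:
            problems.append(f"BaseIO board found in slot {slot}; it belongs in slot 1")
        elif board == "SCSI" and slot == 2:
            problems.append("a SCSI board must not be installed in XIO slot 2")
        elif board == "PCI" and slot != 2:
            problems.append(f"PCI module found in slot {slot}; it belongs in slot 2")
    return problems
```

An empty list means that none of the modeled rules is violated; it does not guarantee that the configuration is supported.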


Figure 2-11 XIO Board Slots

Callouts: midplane, XIO compression connector, IO cage, BaseIO, middle card guide, filler panel, lower XIO board slots, and rear of chassis.


The BaseIO Panel

The main system I/O panel is the BaseIO (also known as the IO6). It is used to connect external devices to the system. The BaseIO panel configuration for SGI 2100 systems is shown in Figure 2-12.

Figure 2-12 BaseIO Panel Connections and Indicators

Callouts: 68-pin SCSI connector; additional serial port (tty_2); serial console port (tty_1, Console); Interrupt out; Interrupt in; LEDs 1 through 4 (1 = SCSI, 2 = 100Mb/s Ethernet, 3 = DUP, 4 = Link); Ethernet connector with TX and RX LEDs.


Devices supported by the BaseIO include an Ethernet network connection, ASCII terminals, printers, or modems, and single-ended ultra SCSI or SCSI-2 peripherals.

Note: If you disconnect a cable from a peripheral device, you should also disconnect it from the I/O connector on the I/O panel. This helps prevent the system from picking up external electrical noise.

Table 2-3 lists a description of the connectors on the BaseIO.

Table 2-3 BaseIO Connectors

Connector Type | Connector Description | Connector Function
100BaseT | 8-pin Jack | 100-Mb per second Ethernet
Serial | 9-pin DIN | RS-232 and 422 Serial
SCSI | 68-pin | (FAST-20) Ultra SCSI (Single-ended)

See Chapter 4 in this document for a complete description and pin identification for each of the standard and optional BaseIO connectors.

3. Getting Started

This chapter describes all the basic procedures needed to operate your SGI 2100 server. For more detailed information about specific components refer to the table of contents or index.

The design of the SGI 2100 provides customer maintenance access only to specific components within the system.

The following components must be serviced or replaced only by SGI trained and approved system support personnel:
• The system midplane.
• The system fan tray.
• The node board(s).
• The router board(s).
• The XIO boards.
• The power supply.

Other components and options within the system can be installed or replaced by the end user.

Note: This product requires the use of external shielded cables in order to maintain compliance with Part 15 of the FCC rules.


System Operation Guidelines

The operating procedures described in the following subsections are designed to ensure your safety and the integrity of your new system.

Operating Voltages

The SGI 2100 chassis can be configured for either 110-120 VAC or 220-240 VAC operation. The system requires alternating current (AC) service at specified voltage and current ratings for proper operation. The power supply is “auto ranging” and automatically adjusts for operation with either voltage range.

Caution: The SGI 2100 requires a 220-240 volt electrical source whenever it is configured with more than four CPUs (that is, more than two node boards). Other factors may also apply; contact your service provider before upgrading your 110 volt system.
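The voltage rule in the caution above can be expressed as a one-line check. This Python sketch is only an illustration; the function name is hypothetical, and it covers only the CPU-count rule. The "other factors" mentioned in the caution are not modeled, so it is no substitute for contacting your service provider.

```python
# Illustrative only: encodes the CPU-count rule from the caution above.
MAX_CPUS = 8  # four node boards with up to two CPUs each


def acceptable_supply_ranges(cpu_count):
    """Return the AC voltage ranges that the CPU-count rule allows."""
    if not 1 <= cpu_count <= MAX_CPUS:
        raise ValueError("cpu_count must be between 1 and 8")
    if cpu_count > 4:
        # More than four CPUs requires the higher voltage range.
        return ["220-240 VAC"]
    return ["110-120 VAC", "220-240 VAC"]
```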

Verify that the correct AC line voltages are selected for any external peripheral you use with your system.

Before connecting or disconnecting any terminal, peripheral, or front-loading drive, be sure the module System Controller’s (MSC) keyswitch is turned to standby and the system circuit breaker located on the back of the chassis is in the off position.

Read the following safety statements carefully before you install or remove any standard or optional components.


Safety Precautions

Warning: Read the following safety information carefully before you install or remove standard or optional components. To avoid electric shock and/or a fire hazard, do not disassemble the chassis. No user-serviceable parts are located inside.

This equipment is sensitive to damage from electrostatic discharge (ESD) caused by the buildup of electrical potential on clothing and other materials.

Before connecting or disconnecting any terminal, peripheral, or front-loading drive, be sure the system is powered off and the primary power source is disconnected.

Attach a ground strap to your wrist when working on the system.

Sliding Open the Front Door Panel

To access the drives and MSC you must open the front door of the system.

Use the following information to open or close the system’s front sliding door panel:
1. Push down on the rectangular panel near the top on the front of the system.
2. Slide it downward until it locks in position. You should have clear access to the drives and the MSC interface panel (see Figure 3-1).
3. Close the panel by pushing down until you feel it release.
4. Let it slide back up into its original closed position.


Figure 3-1 Opening and Closing the Sliding Front Panel

Removing the System’s Plastic Covers

Under certain circumstances you may wish to remove some or all of the plastic covers from the SGI 2100 chassis. See “Sliding Open the Front Door Panel” on page 31 if you need to access only the drives or the MSC front panel.

Note: Do not operate the SGI 2100 with the plastic covers removed. Disruption to normal air flow patterns may cause system overheating.

Use the following steps to remove the plastic covers from the SGI 2100:


1. Be sure that the system power is turned off and the power cable is disconnected from the back of the system.
2. Remove the plastic front panel cover by undoing the captive Phillips-head retaining screw located near the center of the bottom grill (see Figure 3-2).
3. Lower the panel approximately 0.15 inches (4 mm) and pull it forward off the chassis.
4. To remove the top cover you must first remove the four corner-positioned cover “caps” by using a screwdriver to release the latch for each one (see Figure 3-3).
5. Push the screwdriver through the upper grill and unlatch a cap, then lift the cap off. If you have trouble lifting the cap, press down on the outside corner of the cap until it pops up.
6. Remove the top plastic panel by undoing the captive screw under each cap and lifting the cover straight up off the chassis.

Note: It is highly unlikely that you will need to remove the rear plastic “bumper” cover or side panels. These three plastic covers are interconnected and somewhat difficult to remove. If you find it necessary, contact your service provider for assistance.


Captive screw

Figure 3-2 Removing the Front Plastic Panel


Cap securing latch

Figure 3-3 Removing the Top Plastic Panel


System Drives

The SGI 2100 system comes standard with six drive bays. The first is located in the upper left sector on the front of the system and holds one 5.25-inch half-height CD-ROM drive. This single-ended drive bay is “hard wired” as SCSI ID 6 on the backplane. To the right of this bay is the MSC and then a bank of five 3.5-inch disk bays (see Figure 3-4).

The system disk is always SCSI ID 1 and is always installed in the disk drive bay directly to the right of the MSC. Each of the five disk drive bays is “hard wired” on the backplane to a single SCSI ID number (1-5). This prevents ever assigning the same SCSI ID to more than one drive installed in the SGI 2100.

Caution: When you remove a drive be sure that you always put it back in the same bay that you removed it from. Placing the system disk in the wrong bay results in the system being unable to boot. Replacing a data disk in a different bay may cause file corruption, data loss, or other malfunction.
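Because each bay is hard wired to one SCSI ID, a simple record of where a drive came from is enough to catch a misplaced drive before power-on. The Python sketch below is a hypothetical bookkeeping aid. The function names are invented for this example, and it assumes that disk bay n is wired to SCSI ID n; the guide states only that the bay to the right of the MSC is SCSI ID 1 and that the five bays use IDs 1 through 5.

```python
# Illustrative bookkeeping for drive bays. Names and bay numbering are assumptions.
CDROM_SCSI_ID = 6    # the CD-ROM bay is hard wired as SCSI ID 6
SYSTEM_DISK_BAY = 1  # bay directly to the right of the MSC, SCSI ID 1


def scsi_id_for_disk_bay(bay):
    """Return the SCSI ID for a disk bay, assuming bay n is wired to ID n."""
    if bay not in range(1, 6):
        raise ValueError("disk bays are numbered 1 through 5")
    return bay


def reinstall_warning(removed_from_bay, reinstalled_in_bay):
    """Return None if the drive went back to its own bay, else a warning."""
    if removed_from_bay == reinstalled_in_bay:
        return None
    if removed_from_bay == SYSTEM_DISK_BAY:
        return "system disk is in the wrong bay: the system will be unable to boot"
    return "data disk is in a different bay: possible file corruption or data loss"
```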

See the information in Chapter 5 for details on removing, replacing, or installing drives.

Caution: Use proper handling and storage procedures to avoid the loss of data and equipment. Do not remove disk drives while they are operating. Always power off the system before removing a drive.

Be sure to use standard electrostatic discharge prevention precautions when removing, storing, transporting, or replacing drives.

All hard disk drives installed in the system must be Ultra SCSI or SCSI-2 compatible and use 80-pin single-connector assembly (SCA) drive sleds.

Use of external SCSI devices is supported through the BaseIO and optional XIO boards that install in the back of the system. See Chapter 5, “Installing and Replacing Customer Replaceable Units,” for additional information.


System disk Optional drive bays

Figure 3-4 SGI 2100 Internal Drive Bays


Connecting to an Ethernet

Your SGI 2100 system comes standard with an 8-pin 100-Mb-per-second Ethernet connector.

Note: Always verify the type of signal being transmitted over your network cable before plugging in the connector. Some networks use a twisted-pair cabling system that carries AUI signals. These networks use an RJ-45 connector that is meant to be plugged into an IEEE 802.3 Transceiver unit.

You can order optional SGI XIO boards for additional Ethernet connections.

Observe the following procedures when making Ethernet connections:
1. Identify the Ethernet drop intended for your system, and route it to the rear of the chassis. Repeat for any additional connections.
2. Plug in the Ethernet connector (make sure to properly secure the 8-pin connector).
3. Continue with any additional peripheral connections or installations.
4. Restart the system.


Powering On the SGI 2100 System

Use the following procedures to power on your new SGI 2100 system:
1. Make sure the power switches on all of the equipment are turned off.
2. Plug the power cord into each component. Make sure to connect the cords to grounded outlets only.
3. Turn on the power switches in the following order:
■ breaker switch located on the power-in panel on the back of the chassis
■ monitors, terminals, or other video output devices
■ printer (if installed)
■ MSC key switch
4. After you turn the MSC’s switch to the On position, watch the LED panel for the [SYS OK] message. See Figure 3-5 for the keyswitch and other front panel switch locations.

Note: Pushing either of the reset buttons during the boot process causes the system to abort the normal boot process.

To better understand the MSC and its front panel interface, see Chapter 6, “Using the Module System Controller,” for more detailed information.


Figure 3-5 MSC Keyswitch and Front-Panel Controls

Callouts: module reset switch, module NMI switch, fan hi-speed indicator LED, AC OK LED, DC OK LED, ambient over-temperature LED, 8-digit LED display, security key switch (Standby, On, and Diagnostic positions), and 8-pin mini DIN diagnostic port.


Powering Off the SGI 2100 System

The SGI 2100 system should be completely powered off only for relocation, routine maintenance, or repair. Warn everyone who uses the system before you shut it down. Before beginning this procedure, log out and shut down the software using the instructions that follow:
1. To halt operating system activity and prepare the system for power off, become superuser and enter /etc/halt in a functional IRIX window. The /etc/halt command gracefully shuts down the system software and leaves you at the PROM monitor level. If you are remotely logged in to the system, you will be prompted before the shutdown procedure is executed.
2. Turn the MSC key switch to the standby position to eliminate power to the boards and peripherals.
3. Switch the system circuit breaker to the off position to eliminate all power to the midplane and power supply (see Figure 3-6 for the location of the switch).
4. Unplug the power cord from the socket if you need to cut off all electrical power to the system.


AC input

Main power switch

ON OFF

Figure 3-6 System Power Cable and Switch

4. SGI 2100 Interface and Cabling Information

When your SGI 2100 system is initially set up in the work area, a trained system support engineering (SSE) technician should configure and connect it.

Your SGI 2100 system is fully functional as a standalone server using Ethernet, modem, optional ATM, HIPPI, or other interconnect technologies.

Proper configuration and interconnection of any optional XIO interconnect cables or hardware can be accomplished by the SSE at initial system installation or when an upgrade is ordered.


The Ethernet Interface Connection

The system comes standard with a single 100 Base-T 8-pin Ethernet connector. Optional boards supporting additional Ethernet connectors are available.

Table 4-1 shows the cable pinout assignments for the 100 Base-T Ethernet port.

Table 4-1 100 Base-T Ethernet Port Pin Assignments

Pin Assignment

1 TRANSMIT+

2 TRANSMIT–

3 RECEIVE+

4 (Reserved)

5 (Reserved)

6 RECEIVE–

7 (Reserved)

8 (Reserved)

Figure 4-1 shows the location of the standard Ethernet connector on the SGI 2100.

There are two LEDs on the RJ-45 Ethernet connector. The top (green) LED lights only when the system is transmitting. The bottom (yellow) LED lights whenever it sees any packet on the wire, including packets not destined for your system.

Just above the RJ-45 Ethernet connector is a set of four LEDs. They have the following functions:
• The yellow LED on the far left (LED 1) lights to indicate SCSI activity on the BaseIO single-ended SCSI connector.
• The green LED (LED 2) lights to indicate 100 Mb-per-second packet activity.
• The yellow LED on the right (LED 3) lights when the Ethernet is operating in full duplex mode.
• The rightmost green LED (LED 4) shows the Ethernet link test. It lights when the link state is valid.
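When you note LED states for a service call, a lookup table keeps the four indicators straight. The Python sketch below is illustrative; the dictionary and function names are assumptions for this example, and the meanings are taken directly from the list above.

```python
# Illustrative lookup of the four BaseIO status LEDs described above.
BASEIO_LED_MEANINGS = {
    1: "SCSI activity on the BaseIO single-ended SCSI connector",
    2: "100 Mb-per-second packet activity",
    3: "Ethernet operating in full duplex mode",
    4: "Ethernet link state is valid",
}


def describe_lit_leds(lit):
    """Return the meanings of the lit LEDs, in LED-number order."""
    unknown = set(lit) - set(BASEIO_LED_MEANINGS)
    if unknown:
        raise ValueError(f"unknown LED numbers: {sorted(unknown)}")
    return [BASEIO_LED_MEANINGS[number] for number in sorted(set(lit))]
```

For example, describe_lit_leds([4, 2]) returns the meanings of LED 2 and LED 4, in that order.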

Figure 4-1 Standard Ethernet on the SGI 2100

Callouts: LEDs 1 through 4 and the 100 Base-T connector, with pins 1 through 8 assigned as listed in Table 4-1.


Standard Serial Ports

Each SGI 2100 system comes with two standard 9-pin serial ports. These ports can support either RS-232 or RS-422 interface devices. Figure 4-2 shows the location and pinouts for a serial port. Optional additional serial ports are also available for your system.

Note: You cannot use serial cables that work with Silicon Graphics CHALLENGE®, Onyx®, and earlier deskside systems on the SGI 2100. You can, however, use serial cables that work with Origin™ 2000 and Onyx2® systems.

The RS-232 standard recommends the use of cables no longer than 50 feet (15.2 meters). This standard should also be applied to the RS-422 connector. Longer runs introduce a greater possibility of line noise occurring. This can affect data transmission and cause errors. For cable runs longer than 50 feet (15.2 meters), use an appropriate extender device.
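The 50-foot guideline converts to metric as 50 × 0.3048 = 15.24 meters, which the text rounds to 15.2 meters. A planned cable run can be checked against it with a short calculation. The function name in this Python sketch is hypothetical.

```python
# Illustrative check of a serial cable run against the 50-foot guideline.
METERS_PER_FOOT = 0.3048
MAX_SERIAL_RUN_FEET = 50


def needs_extender(run_length_meters):
    """Return True if the run is longer than the 50-foot (15.24 m) guideline."""
    if run_length_meters < 0:
        raise ValueError("run length cannot be negative")
    return run_length_meters > MAX_SERIAL_RUN_FEET * METERS_PER_FOOT
```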

Note: Do not run cables through areas that are electrically noisy, such as areas where large electric motors, welding apparatus, or X-ray machines operate. Bury outside wiring in conduit, as lightning strikes can damage the system.

Figure 4-2 Serial Port Location and Pinouts

Callouts: console serial port and serial port.

Pin 1 Data Carrier Detect (DCD)
Pin 2 Receive Data (RD)
Pin 3 Transmit Data (TD)
Pin 4 Data Terminal Ready (DTR)
Pin 5 Ground
Pin 6 Data Set Ready (DSR)
Pin 7 Request to Send (RTS)
Pin 8 Clear to Send (CTS)
Pin 9 Ringing Indicator (RI)


The Standard SCSI Connector

A single external 68-pin SCSI connector is provided on the BaseIO panel. This connector supports both Ultra SCSI and SCSI-2 devices. The connector sends single-ended SCSI signals only.

Optional additional SCSI ports can be implemented using SGI XIO option boards.

A hyphen preceding a signal name indicates that the signal is active low. Note that 8-bit devices that connect to the P-cable leave these signals open: -DB(8), -DB(9), -DB(10), -DB(11), -DB(12), -DB(13), -DB(14), -DB(15), -DB(P1). All other signals are connected as shown in Table 4-2.

Table 4-2 68-Pin Single-Ended, High-Density SCSI Pinouts

Signal Name Pin Number Pin Number Signal Name

Ground 1 35 -DB(12)

Ground 2 36 -DB(13)

Ground 3 37 -DB(14)

Ground 4 38 -DB(15)

Ground 5 39 -DB(P1)

Ground 6 40 -DB(0)

Ground 7 41 -DB(1)

Ground 8 42 -DB(2)

Ground 9 43 -DB(3)

Ground 10 44 -DB(4)

Ground 11 45 -DB(5)

Ground 12 46 -DB(6)

Ground 13 47 -DB(7)

Ground 14 48 -DB(P)

Ground 15 49 Ground


Ground 16 50 Ground

TERMPWR 17 51 TERMPWR

TERMPWR 18 52 TERMPWR

Reserved 19 53 Reserved

Ground 20 54 Ground

Ground 21 55 -ATN

Ground 22 56 Ground

Ground 23 57 -BSY

Ground 24 58 -ACK

Ground 25 59 -RST

Ground 26 60 -MSG

Ground 27 61 -SEL

Ground 28 62 -C/D

Ground 29 63 -REQ

Ground 30 64 -I/O

Ground 31 65 -DB(8)

Ground 32 66 -DB(9)

Ground 33 67 -DB(10)

Ground 34 68 -DB(11)
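Table 4-2 can be captured as a lookup table, which makes it easy to list the pins that an 8-bit device leaves open. The Python sketch below transcribes the table; the variable and function names are assumptions made for this example.

```python
# Illustrative transcription of Table 4-2 (68-pin single-ended SCSI pinout).
# Pins 1-34 are Ground except 17-18 (TERMPWR) and 19 (Reserved).
SCSI_PINOUT = {pin: "Ground" for pin in range(1, 35)}
SCSI_PINOUT[17] = SCSI_PINOUT[18] = "TERMPWR"
SCSI_PINOUT[19] = "Reserved"

# Pins 35-68, in order.
_PINS_35_TO_68 = (
    ["-DB(12)", "-DB(13)", "-DB(14)", "-DB(15)", "-DB(P1)"]
    + [f"-DB({bit})" for bit in range(8)]
    + ["-DB(P)", "Ground", "Ground", "TERMPWR", "TERMPWR", "Reserved",
       "Ground", "-ATN", "Ground", "-BSY", "-ACK", "-RST", "-MSG",
       "-SEL", "-C/D", "-REQ", "-I/O",
       "-DB(8)", "-DB(9)", "-DB(10)", "-DB(11)"]
)
for offset, signal in enumerate(_PINS_35_TO_68):
    SCSI_PINOUT[35 + offset] = signal

# Signals that 8-bit devices leave open: the upper data byte and its parity.
OPEN_ON_8_BIT = {f"-DB({bit})" for bit in range(8, 16)} | {"-DB(P1)"}


def open_pins_for_8_bit_device():
    """Return the pin numbers that an 8-bit device leaves unconnected."""
    return sorted(pin for pin, sig in SCSI_PINOUT.items() if sig in OPEN_ON_8_BIT)
```

The nine open pins are 35 through 39 and 65 through 68, which are the upper data byte and its parity signal in the table.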


SCSI connector (68-pin)

Pin 1 Pin 35

Pin 34 Pin 68

Figure 4-3 68-Pin Single-Ended SCSI Connector

5. Installing and Replacing Customer Replaceable Units

This chapter explains how to remove, replace, or add the system disk, data disk(s), CD-ROM drive, or module System Controller (MSC) in the SGI 2100 chassis.

Only SGI trained System Support Engineers (SSEs) should remove or replace the system midplane, router board(s), fan tray, node boards, power supply, or XIO boards.

Note: If your system is under warranty, or if you have a full service maintenance contract, call your service provider before removing or replacing any parts.

Be sure to carefully read and follow all the safety information regarding power and static discharge in Chapter 3 before performing any of the installation or replacement procedures in this chapter.

Installing or Removing the System Disk and Optional Hard Drives

The main system disk (disk one, SCSI ID 1) always goes in the drive bay immediately to the right of the MSC. The front of the system has five 3.5-inch disk bays that use 80-pin single-connector assembly (SCA) installation sleds.

Note: You must use an SCA-ready disk drive and drive sled mount on all drives being installed in these five bays. Non-SCA drives and sleds from older SGI systems do not fit or function in the SGI 2100 drive bays.

The CD-ROM bay at the upper left section of the system uses a different mounting scheme.

Disk drive modules are aligned vertically at the front of the chassis, as shown in Figure 2-1. In the server chassis, note that the left-most disk drive—the system drive—is oriented differently from the others.


Caution: Do not remove disk drives while they are operating. Always power off the system prior to removing a drive. When you remove a drive(s) be sure that you always put it back in the same bay that you removed it from. Placing the system disk in the wrong bay results in the system being unable to boot.

To remove the system disk drive module:
1. Power off the system.
2. Unlock the handle by moving it to the right (the handle is centered and in the open position, as shown in Figure 5-1). Note that the handle opens to the left in bays two through five; see the next section “Removing or Inserting a Data Disk.”
3. Pull the disk and sled assembly straight out of the bay.

To insert a hard disk assembly, follow these steps:
1. If necessary, snap the handle to the open position so that it is centered.
2. Align the disk module with the drive guide.
3. Gently but firmly slide the disk module on the guides over the pin.
4. When the system disk assembly is in all the way, snap the handle leftward to the closed position.
5. Use the packaging for the new disk module for repackaging the old disk module.


Handle in Handle in closed position open position

System disk Drive bracket guide

Figure 5-1 Installing or Removing the System Disk


Removing or Inserting a Data Disk

Caution: Use proper handling and storage procedures to avoid the loss of data and equipment. Do not remove disk drives while they are operating. Always power off the system prior to removing a drive. When you remove a drive(s) be sure that you always put it back in the same bay that you removed it from. Replacing a data disk in a different bay may cause file corruption, data loss, or other malfunction.

To remove a data disk drive module:
1. Snap the handle to the left to the open position.
2. Center the handle, as shown in Figure 5-2.
3. Pull the disk module straight out.

To insert a data disk drive module, follow these steps:
1. If necessary, snap the handle to the open position so that it is centered.
2. If you are adding a drive, remove the drive filler plate that covers the drive slot you want to use.
3. Align the new disk module with the drive guide.
4. Gently but firmly slide the disk module on the guides over the pin.
5. When the disk module is in all the way, snap the handle rightward to the closed position.
6. If you have replaced a data disk module, repackage it for shipment back to SGI, following instructions included with the replacement shipment.


Handle in Handle in closed position open position

Optional disk Drive bracket guide

Figure 5-2 Removing a Data Disk Drive Module


Replacing the Module System Controller or CD-ROM Drive

The MSC and CD-ROM drive are packaged together in one assembly. To replace either unit, you must remove the entire assembly and then replace the faulty component.

To replace the MSC or CD-ROM drive, follow these steps:
1. Notify all users to log off, turn the MSC key to Standby, and push the system power switch on the back down (off). See “Powering Off the SGI 2100 System” in Chapter 3 if you need more information.
2. Remove the front cover. See “Removing the System’s Plastic Covers” in Chapter 3 if you are unfamiliar with the procedure.
3. Use a #1 Phillips-head screwdriver to loosen and remove the four screws that hold the assembly in place on the chassis, as shown in Figure 5-3.
4. Grasp the assembly with both hands and gently tug the connectors loose from the midplane. There is an indentation near the upper right sector that provides a good finger grip.
5. Pull the assembly all the way out of the chassis and set it carefully on an anti-static work surface.
6. Remove the four screws that connect the CD drive or MSC (whichever you are replacing) to the sheet-metal assembly frame.
7. Install the new unit using the screws from the last step.
8. Slide the assembly into the chassis carefully until the two connectors are perfectly aligned with the connectors on the system midplane.
9. Seat the connectors firmly into the midplane, then screw in and tighten the four screws that fasten the assembly to the chassis.
10. Reinstall the front cover and power on the system.


Figure 5-3 Installing or Replacing the MSC or CD-ROM Drive


Installing External Drives

There are a number of optional peripheral devices that can be used with your SGI 2100 system. Figure 5-4 shows an example of an Origin™ Vault drive expansion box connected to the SGI 2100 server.

Note: Always use the shortest possible SCSI cable when connecting to a single-ended SCSI device such as the Origin Vault.

Figure 5-4 External Origin Drive Expansion Box

6. Using the Module System Controller

This chapter describes the functionality of the SGI 2100 module System Controller (MSC). The MSC interacts with the power supply, fan-tray module, node board(s), midplane, and other boards that have on-board regulators in the server.

Note: The MSC is designed to control only a single chassis, and should not be confused with the multi-module system controller (MMSC) used in Origin 2000 and Onyx2 rackmount and other multi-chassis systems.

The MSC is located in the upper left section on the front of the SGI 2100 system. It is between the CD-ROM drive bay and the hard disk bays. Figure 6-1 shows the MSC location.


Module System Controller (MSC)

Figure 6-1 MSC Interface Location

The MSC Front Panel

The MSC front panel is shown in Figure 6-2. The MSC provides environmental monitoring for safe operation of the system. The MSC connects to the system midplane via an extender board and provides easy user access to switches and displays at the front of the system.

Figure 6-2 MSC Status Panel and Switches

Callouts: module reset switch, module NMI switch, fan hi-speed indicator LED, AC OK LED, DC OK LED, ambient over-temperature LED, 8-digit LED display, security key switch (Standby, On, and Diagnostic positions), and 8-pin mini DIN diagnostic port.

Pinouts for the controller’s 8-pin diagnostic serial connector are shown in Figure 6-3.

Figure 6-3 MSC Front Diagnostic Port Pinouts

Callouts: DC OK LED, AC OK LED, System Controller, module reset switch, module NMI switch, fan hi-speed indicator LED, ambient over-temperature LED, 8-digit LED display, security key switch, and 8-pin mini DIN diagnostic port.

Pin 1: Data terminal ready (DTR)
Pin 2: Clear to send (CTS)
Pin 3: Transmit data (TXD)
Pin 4: Signal ground (GND)
Pin 5: Receive data (RXD)
Pin 6: Request to send (RTS)
Pin 7: Data carrier detect (DCD)
Pin 8: Signal ground (GND)

In the lower right section on the back of the system is a 9-pin alternate console diagnostic serial connector that is a direct mirror of the 8-pin diagnostic connector on the front panel. Figure 6-4 shows the location and pinouts of the 9-pin rear-mounted MSC diagnostic connector.


Note: You may not connect serial devices to both the front diagnostic port and rear MSC diagnostic serial connector at the same time. The connectors are wired through the same circuitry and cannot accept or send signals through both ports at the same time.

Figure 6-4 MSC Rear Diagnostic Serial Connector

Callout: Module System Controller diagnostic serial port (DB-9).

Pin 1 Ground
Pin 2 Data Terminal Ready (DTR)
Pin 3 Transmit Data (TXD)
Pin 4 Receive Data (RXD)
Pin 5 Data Carrier Detect (DCD)
Pin 6 Not Used
Pin 7 Clear to Send (CTS)
Pin 8 Request to Send (RTS)
Pin 9 Not Used


Understanding the MSC’s LEDs and Switches

The MSC has one keyswitch, two push buttons, and four LED indicators. The following paragraphs provide information on the use or significance of each control or indicator.

The Front Panel Keyswitch selects Standby, On, or Diagnostic status for the system.

The System Reset push button initiates a system-wide reset of the system. The keyswitch must be in the diagnostic position to use this button.

The Non-Maskable Interrupt (NMI) switch issues a reset signal to all node boards in the system. The keyswitch must be in the diagnostic position to use this button.

The AC Power OK green LED lights when the system is plugged into an outlet and the AC circuit breaker is turned on. When it is lit, the MSC is receiving DC voltage (V_5 Aux) through the midplane, as are other boards that require it.

The DC Power OK green LED lights three and one-half seconds after the keyswitch is turned to the On position. This indicates the system power supply is enabled and operating properly.

The Fan Speed High amber warning LED lights as an indication that the ambient temperature is higher than optimal, or a non-critical fan has failed. When a non-critical fan fails, the remaining fans are set at full speed to compensate. In this case, a service call should be placed immediately.

The Over Temperature Fault amber warning LED lights when the MSC’s incoming air temperature or fan failure detection causes a shutdown of the system. If the environmental temperature exceeds the system’s tolerance, or if a critical fan fails, the MSC shuts down the system. In some cases, a service call should be placed immediately. See the section “MSC Shutdown” in Chapter 7 for tips on how to troubleshoot this problem area.
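The relationship between fan failures, temperature, and the two amber LEDs can be summarized as a small decision function. The Python sketch below is an interpretation of the descriptions in this chapter, not MSC firmware. The function and parameter names are hypothetical, and the rule that two or more failed non-critical fans lead to a shutdown is taken from the list in “MSC Features and Functions.” It describes the conditions that trigger each indicator and does not model how long an LED stays lit.

```python
# Illustrative model of the MSC's environmental responses. Names are assumptions.
def msc_environment_state(ambient_above_optimal, ambient_over_limit,
                          noncritical_fans_failed, critical_fans_failed):
    """Return the fan-speed and shutdown conditions for the given inputs."""
    # Shutdown: temperature beyond tolerance, any critical fan failed,
    # or two or more non-critical fans failed.
    shutdown = (
        ambient_over_limit
        or critical_fans_failed >= 1
        or noncritical_fans_failed >= 2
    )
    # High fan speed: temperature above optimal, or any fan failure.
    fans_high = (
        ambient_above_optimal
        or ambient_over_limit
        or noncritical_fans_failed >= 1
        or critical_fans_failed >= 1
    )
    return {
        "fan_speed_high_led": fans_high,
        "over_temperature_fault_led": shutdown,
        "shutdown": shutdown,
    }
```

A single failed non-critical fan therefore raises the fan speed without shutting the system down, which is the case in which the guide asks you to place a service call immediately.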


MSC Features and Functions

The MSC has the following basic features and functions:
• Issues a reset signal at power-on.
• The front-panel mounted keyswitch provides a soft power-off to standby condition.
• A front-panel mounted push-button system reset switch.
• A front-panel mounted push-button non-maskable interrupt (NMI) switch.
• Monitors ambient incoming air temperature into the system and adjusts fan speed accordingly (two speeds). A soft power-off of the system results when ambient temperature becomes too high for safe operation.
• LED display of ambient over-temperature conditions.
• NVRAM for storing configuration information (1024 x 8 bits).
• Monitors fan rotation and automatically increases to high speed fan operation when a fan fails. Signals an impending shutdown when a single critical fan fails, or two or more non-critical fans fail.
• LED display of high fan speed and possible fan tray failure (fan high-speed LED).
• LED display of power supply operation. AC OK LED indicates AC voltage applied to system. DC OK indicates all Power Supply DC voltages (+12 V, +5 V, +3.45 V), and remote DC voltages (3.3 V, 2.4 V, 1.6 V) are present with no error conditions in the system. The DC OK LED does not indicate regulation or accuracy of the DC voltages present.
• Provides a 100-Kbps bidirectional communication path between the MSC, mid-plane, and Hub ASIC IO space on each node board in the system. This communication path allows the MSC to receive system status messages from all node boards in a system, and to provide status messages from the MSC and all node boards in a system. This communication path is referred to as the I2C interface.
• Provides ability to request the system serial number and configuration information via the I2C Interface.
• Eight-digit alphanumeric status display. This display is updated by the MSC or the node cards in the system via the I2C interface.


• Provides a seven-wire 9600-baud alternate console diagnostic port for off-line configuration and troubleshooting. Also communicates with the node board(s) when the IO console port or a system console is not available or functional. This interface also supports the minimum requirements for modem support.
• Software Reset, NMI, and soft power-off commands through the alternate console diagnostic port.
• Supports alternate console diagnostic port command-line power supply voltage margining. Margining allows the 3.45-V or 5-V outputs of the power supply to be moved 5% higher or lower independently. This does not affect remote regulated termination voltages (1.6 V, 2.4 V, router 3.3 V).
• Supports alternate console diagnostic port command-line regulated termination voltage margining for the termination voltages 1.6 V, 2.4 V, and 3.3 V (all termination voltages are margined 5% higher or lower together, not independently). This does not affect the power supply voltages.
• Sends an early warning high priority interrupt (Panic Interrupt) to all node boards warning of impending shutdown due to AC power fail, ambient over-temperature, or the switch being placed in the standby position.
• Provides an interlock (removable keyswitch) to prevent unauthorized personnel from turning the system to on or standby, and to limit operation of the System Reset and NMI functions. The software password allows access and permissions through the alternate diagnostic console port.
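The margining arithmetic in the list above can be sketched in Python. This is only an illustration of the 5% figures quoted there; the function and constant names are hypothetical and do not correspond to any MSC command or firmware interface.

```python
# Illustrative sketch only: these names are hypothetical, not MSC commands.
# Nominal voltages and the 5% margin come from the feature list above.

SUPPLY_VOLTAGES = (3.45, 5.0)            # margined independently
TERMINATION_VOLTAGES = (1.6, 2.4, 3.3)   # margined together
MARGIN = 0.05


def margin(nominal, direction):
    """Return nominal moved 5% up (+1), 5% down (-1), or unchanged (0)."""
    if direction not in (-1, 0, 1):
        raise ValueError("direction must be -1, 0, or +1")
    return round(nominal * (1 + direction * MARGIN), 4)


def margin_termination(direction):
    """All termination voltages move in the same direction at once."""
    return [margin(v, direction) for v in TERMINATION_VOLTAGES]


if __name__ == "__main__":
    for v in SUPPLY_VOLTAGES:
        print(v, "V margins from", margin(v, -1), "to", margin(v, +1))
    print("termination, margined high:", margin_termination(+1))
```

Under these assumptions, the 3.45-V output moves between roughly 3.28 V and 3.62 V, and the 5-V output between 4.75 V and 5.25 V.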


MSC Status Messages

The MSC front panel has an eight-character LED readout that reports system status. If a message points to a power supply problem, see the section “Power Supply Problems” in Chapter 7.

Table 6-1 lists the MSC messages and explains what each one means.

Table 6-1 MSC Messages

Error Message   Meaning of Message

SYS OK          The system is operating normally.
R PWR UP        The system is being powered on remotely via the MSC’s serial connection.
TEMP OK         The system temperature is within normal operating parameters.
PSTMP OK        The power supply operating temperature is OK.
POWER UP        The system is being powered on from the front panel switch.
PFW FAIL        The power supplied to the system has failed or dropped below acceptable parameters. The system has shut down.
PS OT FL        The system’s power supply temperature has exceeded safety limits and the system has shut down.
PS FAIL         The internal power supply has failed and the system has shut down.
OVR TEMP        The system’s temperature has exceeded acceptable limits and the system has shut down.
KEY OFF         The MSC’s switch has been turned to standby.
RESET           The Controller’s switch has been turned to the diagnostic position and the reset button pushed.
NMI             The Controller’s switch has been turned to the diagnostic position and the non-maskable interrupt (NMI) button pushed.
M FAN FL        More than one fan has failed and the system has shut down.
R PWR DN        The system has been powered off from a remote location.
PWR CYCL        The system has received the command to power cycle from the console or a remote user.
HBT TO          The system has registered a heartbeat time-out. A non-maskable interrupt is generated, followed by a system reset.
FAN FAIL        A system fan has failed. If it is fan 1, 2, or 3, the system shuts down. A service call should be placed as soon as possible.
PS HITMP        The internal power supply unit is running at higher than normal temperatures.
POK FAIL        A power OK failure occurred on an unidentified board.
POK N 0         A power OK failure occurred on the first node board.
POK N 1         A power OK failure occurred on the second node board.
POK N 2         A power OK failure occurred on the third node board.
POK N 3         A power OK failure occurred on the fourth node board.
POK RT 0        A power OK failure occurred on the first router board.
POK RT 1        A power OK failure occurred on the second router board.
SP INT 1        The MSC’s firmware generated a spurious timer interrupt signal.
SP INT 2        The MSC’s firmware generated a spurious clock signal.
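If you record MSC display messages in a log, a small lookup such as the following Python sketch may help when triaging them. It restates only part of Table 6-1 (the messages that the table says accompany a shutdown, plus the numbered POK messages). The function name `describe` and the returned tuple are hypothetical; the MSC provides no such programming interface.

```python
# Hypothetical helper that restates part of Table 6-1.
# The MSC itself exposes no such API; names here are illustrative only.

SHUTDOWN_MESSAGES = {
    "PFW FAIL": "Input power failed or dropped below acceptable limits.",
    "PS OT FL": "Power supply temperature exceeded safety limits.",
    "PS FAIL": "The internal power supply failed.",
    "OVR TEMP": "System temperature exceeded acceptable limits.",
    "M FAN FL": "More than one fan failed.",
}

ORDINALS = ("first", "second", "third", "fourth")
# Board kinds and the number of each that Table 6-1 lists.
BOARD_KINDS = {"N": ("node", 4), "RT": ("router", 2)}


def describe(message):
    """Return (meaning, shutdown_stated_in_table) for a display string."""
    parts = message.split()
    text = " ".join(parts)  # normalize spacing
    if text in SHUTDOWN_MESSAGES:
        return SHUTDOWN_MESSAGES[text], True
    # POK N <n> and POK RT <n> identify the board with the power OK failure.
    if len(parts) == 3 and parts[0] == "POK" and parts[2].isdigit():
        kind = BOARD_KINDS.get(parts[1])
        index = int(parts[2])
        if kind is not None and index < kind[1]:
            return f"Power OK failure on the {ORDINALS[index]} {kind[0]} board.", False
    return "Not covered by this sketch; see Table 6-1.", False


if __name__ == "__main__":
    for msg in ("OVR TEMP", "POK N 2", "SYS OK"):
        print(msg, "->", describe(msg))
```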

7. Basic Troubleshooting

This chapter contains hardware-specific information that can be helpful if you are having trouble with your SGI 2100 system. This information supplements the Module System Controller (MSC) information provided in the previous chapter.

This chapter is intended to give you some basic guidelines to help keep your hardware and the software that runs on it in good working order.

General Guidelines

To keep your system in good running order, follow these guidelines:
• Do not enclose the system in a small, poorly ventilated area (such as a closet), crowd other large objects around it, or drape anything (such as a jacket or blanket) over the system.
• Do not connect cables or add other hardware components while the system is turned on.
• Do not leave the front panel key switch in the diagnostic position.
  Note: There is clearance provided for the front panel to close while a key is inserted into the MSC. However, the door may snag on any additional keys you have attached to the MSC’s main key.
• Do not lay the system on its side.
• Do not power off the system frequently; leave it running over nights and weekends, if possible. If a system console terminal is installed, it can be powered off when it is not being used.
• Do not place liquids, food, or extremely heavy objects on the system.
• Ensure that all cables are plugged in completely.
• Ensure that the system has power surge protection.


Operating Guidelines

When your system is up and running, follow these operational guidelines:
• Do not turn off power to a system that is currently started up and running software.
• Do not use the root account unless you are performing administrative tasks.
• Make regular backups (weekly for the whole system, nightly for individual users) of all information.
• Keep two sets of backup tapes to ensure the integrity of one set while doing the next backup.
• Protect the root account with a password:
  – Check for root UID = 0 accounts (for example, diag) and set passwords for these accounts.
  – Consider giving passwords to courtesy accounts such as guest and lp.
  – Look for empty password fields in the /etc/passwd file.
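The password checks in the list above can be sketched in Python. The sketch assumes the standard colon-separated passwd layout (name, password, UID, GID, and so on); the function name `audit_passwd` and the sample entries are made up for illustration and are not taken from a real system.

```python
# Sketch: audit passwd-format text for two of the checks listed above.
# Assumes the standard colon-separated layout: name:password:UID:GID:...
# The sample entries below are invented for illustration.

def audit_passwd(text):
    """Return (accounts with an empty password field, accounts with UID 0)."""
    empty_password, uid_zero = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) < 3:
            continue  # malformed entry; skip rather than guess
        name, password, uid = fields[0], fields[1], fields[2]
        if password == "":
            empty_password.append(name)
        if uid == "0":
            uid_zero.append(name)
    return empty_password, uid_zero


SAMPLE = """\
root:x9Qz:0:0:Super-User:/:/bin/csh
diag::0:996:Hardware Diagnostics:/usr/diags:/bin/csh
guest::998:998:Guest Account:/usr/people/guest:/bin/csh
lp:*:9:9:Print Spooler Owner:/var/spool/lp:/bin/sh
"""

if __name__ == "__main__":
    no_password, root_equivalent = audit_passwd(SAMPLE)
    print("empty password field:", no_password)
    print("UID 0 accounts:", root_equivalent)
```

To check a real system, you would pass the contents of /etc/passwd to the function instead of the sample text. If the system keeps passwords in a separate shadow file, the password field in /etc/passwd is not meaningful and this check does not apply as written.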

If your system behaves marginally or is faulty, first do a physical inspection using the checklist below. If all of the connections seem solid, go to the previous chapter and use the MSC to try to isolate the problem. If the problem persists, run the diagnostic tests from the System Maintenance menu or PROM Monitor. See the IRIX Admin: System Configuration and Operation manual for more information about diagnostic tests.

If this does not help, contact your system administrator or service provider.

Check every item on this list:
• The terminal and MSC power switches are turned on.
• The main system power switch is not turned to off.
• The fans are running and the fan inlets/outlets are not blocked.
• The MSC display: look for a fault message or warning.

Before you continue, shut down the system and turn off the power.


Check all of the following cable connections:
• The terminal power cable is securely connected to the terminal at one end and the power source at the other end.
• The system power cable is securely connected to the main unit at one end and plugged into the proper AC outlet at the other end.
• The Ethernet cable is connected to the connector port labeled Ethernet.
• Serial port cables are plugged securely into their corresponding connectors.
• All cable routing is safe from foot traffic.

If you find any problems with hardware connections, correct them and turn on the power to the main unit. The MSC may help to determine if internal system problems exist.

Power Supply Problems

The power supply in your SGI 2100 is not considered an end-user replaceable component. There are certain basic checks you can make to determine if a system problem is related directly to the power supply.

If the system will not power on at all, check the following:
• Confirm that the system circuit breaker is up (in the On position).
• Check to make sure the power cable is firmly plugged in at both the system connector and the wall socket.
• Remove the front cover and confirm that the cable connecting the power supply to the fan tray is secure.

In some cases the power supply may be unable to supply enough voltage to meet system requirements. When the MSC indicates a power supply related problem, you can remove the front cover and check the status of the three LEDs on the front of the power supply. For help on properly removing the front cover, see “Removing the System’s Plastic Covers” in Chapter 3.


The Amber (Yellow) LED

The amber LED on the power supply (also known as the AC_OK indicator) lights when the AC input voltage is applied and the system circuit breaker is in the On position.

If the amber LED is not lit, you should check the following:
• The AC outlet
• The system power cord and power switch
• The fan tray to power supply cable

If none of these items is a problem, check the other LEDs on the power supply for any indications.

The Green LED

The green LED indicator (also known as the Power Good indicator) lights when power supply outputs are within specification.

If this LED starts to blink on and off, it is a warning that the supply is overloaded. This may indicate a condition such as a 110-volt system that is overloaded with too many node boards or other options. In this case, contact your service provider for information and assistance.

The Red LED

The red LED (also known as the Fault indicator) lights up whenever the power supply shuts off because of insufficient air flow, or when a system over-temperature shutdown occurs.

A blinking condition on this LED indicates that an undervoltage condition exists. It means that the supply has dropped below acceptable limits on one of the +3.45, +5, or +12 volt outputs. The supply can be reset by power-cycling the system. Note that this could be a symptom of other problems; contact your service provider for additional information.
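The three LED descriptions above can be summarized as a small decision helper. The following Python sketch only restates those descriptions; the state names ("off", "on", "blinking") and the function `diagnose` are hypothetical, and LED states that this section does not describe produce no finding.

```python
# Sketch only: restates the power supply LED descriptions as a lookup.
# The state names and diagnose() are hypothetical, chosen for illustration.

VALID_STATES = {"off", "on", "blinking"}


def diagnose(amber, green, red):
    """Return a list of findings for the three power supply LEDs."""
    for state in (amber, green, red):
        if state not in VALID_STATES:
            raise ValueError(f"unknown LED state: {state!r}")
    findings = []
    if amber == "off":
        findings.append("No AC input: check the outlet, power cord and "
                        "switch, and the fan tray to power supply cable.")
    if green == "blinking":
        findings.append("Supply overloaded: contact your service provider.")
    if red == "on":
        findings.append("Supply shut off: insufficient air flow or "
                        "over-temperature shutdown.")
    elif red == "blinking":
        findings.append("Undervoltage on the +3.45, +5, or +12 V output: "
                        "power-cycle the system to reset the supply.")
    return findings


if __name__ == "__main__":
    for finding in diagnose(amber="on", green="blinking", red="off"):
        print(finding)
```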


Crash Recovery

To minimize data loss from a system crash, back up your system daily and verify the backups. Often a graceful recovery from a crash depends upon good backups.

Your system may have crashed if it fails to boot or respond normally to input devices such as the keyboard. The most common form of system crash is terminal lockup—your system fails to accept any commands from the keyboard. Sometimes when a system crashes, data is damaged or lost.

Before going through a crash recovery process, check your terminal configuration and cable connections. If everything is in order, try accessing the system remotely from another workstation or from the system console terminal (if present).

If none of the solutions in the previous paragraphs is successful, you can fix most problems that occur when a system crashes by using the methods described in the following paragraphs. You can prevent additional problems by recovering your system properly after a crash.

The following sections present several ways to recover your system from a crash. The simplest method, rebooting the system, is presented first. If that fails, go on to the next method, and so on. These sections are an overview of the different crash recovery methods.

Rebooting the System

Rebooting usually fixes problems associated with a simple system crash.

Restoring System Software

If you do not find a simple hardware connection problem and you cannot reboot the system, a system file might be damaged or missing. In this case, you need to copy system files from the installation source to your hard disk. Some site-specific information might be lost.


Restoring From Backup Tapes

If restoring system software fails to recover your system fully, you must restore from backup tapes. Complete and recent backup tapes contain copies of important files. Some user- and site-specific information might be lost. Read the following section for information on file restoration.

Restoring a Filesystem From the System Maintenance Menu

If your root filesystem is damaged and your system cannot boot, you can restore your system from the System Maintenance Menu. This is the menu that appears when you interrupt the boot sequence before the operating system takes over the system. To perform this recovery, you need two different tapes: your system backup tape and a bootable tape with the miniroot.

If a backup tape is to be used with the System Recovery option of the System Maintenance Menu, it must have been created with the System Manager or with the Backup command, and must be a full system backup (beginning in the root directory (/) and containing all the files and directories on your system). Although the Backup command is a front-end interface to the bru command, Backup also writes the disk volume header on the tape so that the “System Recovery” option can reconstruct the boot blocks, which are not written to the tape using other backup tools. For information on creating the system backup, see the IRIX Admin: Backup, Security, and Accounting manual.

If you do not have a full system backup made with the Backup command or System Manager —and your root or usr filesystems are so badly damaged that the operating system cannot boot—you have to reinstall your system.

If you need to reinstall the system to read your tapes, install a minimal system configuration and then read your full system backup (made with any backup tool you prefer) over the freshly installed software.

This procedure should restore your system to its former state.

Caution: Existing files of the same pathname on the disk are overwritten during a restore operation, even if they are more recent than the files on tape.

1. Start the system. You should see a message like the following:
   Starting up the system....
   To perform system maintenance instead, press <Esc>
2. Press the <Esc> key. You see the following menu:
   System Maintenance Menu
   1 Start System
   2 Install System Software
   3 Run Diagnostics
   4 Recover System
   5 Enter Command Monitor
3. Enter the numeral 4 and press <Enter>. You see this message:
   System Recovery...
   Press Esc to return to the menu.
   After a few moments, you see the message:
   Insert the installation tape, then press <Enter>:
4. Insert your bootable tape and press the <Enter> key. You see some messages while the miniroot is loaded. Next you see the message:
   Copying installation program to disk....
   Several lines of dots appear on your screen while this copy takes place.
5. You see this message:
   CRASH RECOVERY
   You may type sh to get a shell prompt at most questions.
   Remote or local restore: ([r]emote, [l]ocal): [l]
6. Press <Enter> for a local restoration. If your tape drive is on another system accessible by the network, press r and then <Enter>. You are prompted for the name of the remote host and the name of the tape device on that host. If you press <Enter> to select a local restoration, you see this message:
   Enter the name of the tape device: [/dev/tape]
   You may need to enter the exact device name of the tape device on your system, since the miniroot may not recognize the link to the convenient /dev/tape filename. As an example, if your tape drive is drive #6 on your integral SCSI bus (bus 0), the most likely device name is /dev/rmt/tps0d6nr. If it is drive #3, the device is /dev/rmt/tps0d3nr. The system prompts you to insert the backup tape. When the tape has been read back onto your system disk, you are prompted to reboot your system.
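The two device names given in step 6 follow a visible pattern: the SCSI bus number, then the drive number. The Python sketch below builds a name from that pattern. The pattern is inferred only from the two examples above, and the function name `tape_device` is hypothetical, so treat the result as a likely name to try rather than a guaranteed one.

```python
# Sketch: build a likely tape device name from the pattern in the two
# examples above (/dev/rmt/tps0d6nr and /dev/rmt/tps0d3nr).
# tape_device() is a hypothetical helper, not an IRIX command.

def tape_device(bus, drive):
    """Return the likely device name for a drive number on a SCSI bus."""
    for label, value in (("bus", bus), ("drive", drive)):
        if not isinstance(value, int) or value < 0:
            raise ValueError(f"{label} must be a non-negative integer")
    return f"/dev/rmt/tps{bus}d{drive}nr"


if __name__ == "__main__":
    print(tape_device(0, 6))
    print(tape_device(0, 3))
```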


Recovery After System Corruption

From time to time you may experience a system crash caused by file corruption. Systems cease operating (“crash”) for a variety of reasons. Most common are software crashes, followed by power failures of some sort, and least common are actual hardware failures. Regardless of the type of system crash, if your system files are lost or corrupted, you may need to recover your system from backups to its pre-crash configuration.

Once you repair or replace any damaged hardware, you are ready to recover the system. Regardless of the nature of your crash, you should refer to the information in the section “Restoring a Filesystem from the System Maintenance Menu” in the IRIX Admin: Backup, Security, and Accounting manual.

The System Maintenance Menu recovery command is designed for use as a full backup system recovery. After you have done a full restore from your last complete backup, you may restore newer files from incremental backups at your convenience. This command is designed to be used with archives made using the Backup utility or through the System Manager. The System Manager is described in detail in the Personal System Administration Guide. System recovery from the System Maintenance Menu is not intended for use with the tar, cpio, dd, or dump utilities. You can use these other utilities after you have recovered your system.

You may also be able to restore filesystems from the miniroot. For example, if your root filesystem has been corrupted, you may be able to boot the miniroot, unmount the root filesystem, and then use the miniroot version of restore, xfs_restore, bru, cpio, or tar to restore your root filesystem. Refer to the reference (man) pages on these commands for details on their application.

Refer to the IRIX Admin: System Configuration and Operation manual for instructions on good general system administration practices.


MSC Shutdown

Under specific circumstances the MSC may shut down the system. Usually this occurs when the operating environment becomes too warm due to fan failure, high ambient temperatures, or a combination of the two.

The MSC automatically shuts down the system and lights the “Over Temperature Fault” LED if any of the following situations occur:
• Failure of two or more of the system’s nine fans.
• Failure of one fan plus a high ambient temperature.
• Failure of any (critical) fan directly responsible for cooling the power supply or a router board.
• An unacceptably high ambient temperature.

Only the last situation can be dealt with completely by the end user. The first three require a service call by a qualified support technician.
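The four shutdown conditions, and which of them need a service call, can be written out as a short decision function. The Python sketch below is a reading of the list above, not the MSC firmware logic. The ambient levels "normal", "high", and "unacceptable" are hypothetical labels (this guide gives no temperature thresholds), and `shutdown_reason` is an illustrative name.

```python
# Sketch of the shutdown conditions listed above. The ambient labels and
# the function name are assumptions; this is not the MSC firmware logic.

AMBIENT_LEVELS = ("normal", "high", "unacceptable")


def shutdown_reason(failed_fans, critical_fan_failed, ambient):
    """Return (reason, needs_service_call), or (None, False) if no shutdown."""
    if ambient not in AMBIENT_LEVELS:
        raise ValueError(f"ambient must be one of {AMBIENT_LEVELS}")
    if not 0 <= failed_fans <= 9:  # the system has nine fans
        raise ValueError("failed_fans must be between 0 and 9")
    if critical_fan_failed:
        return "critical fan failed", True
    if failed_fans >= 2:
        return "two or more fans failed", True
    if failed_fans == 1 and ambient != "normal":
        return "one fan failed and ambient temperature is high", True
    if ambient == "unacceptable":
        # The only case the end user can deal with completely.
        return "ambient temperature is unacceptably high", False
    return None, False


if __name__ == "__main__":
    print(shutdown_reason(1, False, "high"))
    print(shutdown_reason(0, False, "unacceptable"))
```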

Fixing the MSC Shutdown

If you determine that a critical fan or fans have failed, you should immediately place a service call. The system is not usable until the faulty fan(s) are replaced.

If the problem involves the combined failure of a single non-critical fan and a high ambient temperature, you should place a service call. You may be able to keep the system running by lowering the ambient temperature of the operating environment while waiting for service.

To lower the ambient temperature around the system, try these methods:
• Lower the air conditioning temperature.
• Move the system to a cooler environment.
• Use one or more portable fans to circulate more air around the system.
• Use a portable air-conditioner to lower the temperature of the system.

If the problem is simply a high ambient temperature, you will need to either lower the work environment temperature, or move the system to an area with a lower ambient temperature.


Hardware Graph and hinv Commands

If you are having trouble determining what options and standard components are installed in your SGI 2100, you may wish to use one or several of the commands listed in the next sections.

Hardware Graph Information

The hardware graph is a tool for inventorying the I/O devices of the SGI 2100 system. Unlike hinv, the hardware graph is a UNIX® filesystem, whose branching character accommodates the possibility of multiple nodes, each with multiple I/O devices of several types. The hardware graph keeps track of information in the kernel that is associated with the hardware.

Most of the hardware graph directories are much like their /dev counterparts, but module numbers are persistent across reboots and hardware changes (until you change the module numbers).

To see the hardware graph, use the ls command. For example:
# ls /hw
console mem module rdisk ttys scsi_ctlr unknown
disk kmem mmem null scsi ttys zero

In this output, module, rdisk, ttys, scsi, scsi_ctlr, and ttys are subdirectories containing files. For example:
# ls /hw/ttys
tty4d1 tty4f1 tty4m1 ttyc1 ttyd1 ttyf1 ttym1
tty4d2 tty4f2 tty4m2 ttyc2
# ls /hw/scsi
sc1d2l0
# ls /hw/rdisk
dks1d2s0 dks1d2vh root volume_header
dks1d2s1 dks1d2vol swap
# ls /hw/scsi_ctlr
0 1


To determine I/O devices within a system, follow the directory structure. For example:
# ls /hw/module/1/slot/n4/node/link/cpu
0 1
# ls /hw/module/1/slot/n4/node/link/xtalk
0

hinv Information

Use the hinv command to obtain basic information regarding the general configuration of your system. You should see output similar to the following (although it varies from system to system, depending upon how each system is configured):
# hinv
System SGI-IP27
4 250 MHZ IP27 Processors
Main memory size: 512 Mbytes
Integral SCSI controller 0
Integral SCSI controller 1
Integral Fast Ethernet
IOC3 serial port
Disk drive: unit 1 on SCSI Controller 0, (dksc(0,1,0))

>> hinv -v
IP27 Node Board, Module 1, Slot n1
    ASIC HUB Rev 3, 100 MHz, (nasid 0)
    Processor A: 250 MHz R10000, Rev 3.4, 4M 250MHz secondary cache, (cpu 0)
        R10000FPC Rev 0
    Processor B: 250 MHz R10000, Rev 3.4, 4M 250MHz secondary cache, (cpu 1)
        R10000FPC Rev 0
    Memory on board, 64 MBytes (Standard)
        Bank 0, 64 MBytes (Standard) <-- (Physical Bank 0)
IP27 Node Board, Module 1, Slot n2
    ASIC HUB Rev 5, 100 MHz, (nasid 1)
    Processor A: 250 MHz R10000, Rev 3.4, 4M 250MHz secondary cache, (cpu 2)
        R10000FPC Rev 0
    Processor B: 250 MHz R10000, Rev 3.4, 4M 250MHz secondary cache, (cpu 3)
        R10000FPC Rev 0
    Memory on board, 512 MBytes (Standard)
        Bank 0, 128 MBytes (Standard) <-- (Physical Bank 0)
        Bank 1, 128 MBytes (Standard)
        Bank 2, 128 MBytes (Standard)
        Bank 3, 128 MBytes (Standard)
BASEIO IO Board, Module 1, Slot io1
    ASIC BRIDGE Rev 3, (widget 8)
        adapter PCI-SCSI Rev 5, (pci id 0)
            peripheral SCSI DISK, ID 1, SGI IBM DORS-32160W
        adapter PCI-SCSI Rev 5, (pci id 1)
        adapter IOC3 Rev 1, (pci id 2)
            controller multi function SuperIO
            controller Ethernet Rev 1
ASIC XBOW Rev 2, on midplane of Module 1
ASIC XBOW Rev 2, on midplane of Module 1
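If you save hinv output to a file (for example, to compare configurations before and after adding options), the summary lines can be picked out with a few regular expressions. The Python sketch below assumes the line formats shown in the sample hinv output above; real output varies from system to system, and `parse_hinv` and its dictionary keys are hypothetical names.

```python
# Sketch: extract a few fields from saved hinv summary output.
# Assumes the line formats shown in the sample above; parse_hinv() and
# the dictionary keys are illustrative names, not part of IRIX.
import re

SAMPLE = """\
System SGI-IP27
4 250 MHZ IP27 Processors
Main memory size: 512 Mbytes
Integral SCSI controller 0
Integral SCSI controller 1
Integral Fast Ethernet
IOC3 serial port
Disk drive: unit 1 on SCSI Controller 0, (dksc(0,1,0))
"""


def parse_hinv(text):
    """Return a dict of the fields this sketch recognizes."""
    info = {}
    cpu = re.search(r"(\d+)\s+(\d+)\s*MHZ\s+(\S+)\s+Processors", text)
    if cpu:
        info["cpus"] = int(cpu.group(1))
        info["mhz"] = int(cpu.group(2))
        info["board"] = cpu.group(3)
    mem = re.search(r"Main memory size:\s*(\d+)\s*Mbytes", text)
    if mem:
        info["memory_mb"] = int(mem.group(1))
    # Case-sensitive on purpose: the "Disk drive" line uses "Controller".
    info["scsi_controllers"] = [
        int(n) for n in re.findall(r"Integral SCSI controller (\d+)", text)
    ]
    return info


if __name__ == "__main__":
    print(parse_hinv(SAMPLE))
```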

Glossary

100-Base-TX Twisted-pair variant of 100BASE-X. Uses the physical characteristics of FDDI’s TP-PMD, but uses Ethernet framing and CSMA/CD. Used in Silicon Graphics Origin and Onyx2 systems.

100-Base-X 100Mbps CSMA/CD 802.3/Ethernet-like LAN also known as Fast Ethernet. There are two types: 100BASE-TX (used in Silicon Graphics Origin and Onyx2 systems) and 100BASE-T4.

byte A measure of data equal to 8 bits. See also kilobyte, megabyte, and gigabyte.

cache Memory used exclusively by the CPU for temporary storage and calculations.

channel A specific I/O bus; typically used to describe SCSI bus numbers (for example, SCSI channel 1). See also controller.

chassis The metal framework to which components of a computer are attached.

controller Literally the circuitry (typically an ASIC) that controls a bus or device. Also used to describe a SCSI bus as a synonym for “channel.” See also channel.

dual in-line memory module (DIMM) A module to which are attached SDRAM chips.


ECC Error correction code; used in memory to correct for single-bit errors.

electromagnetic interference (EMI) Electromagnetic radiation that is produced by electronic equipment, and that can interfere with radio and television reception and cause problems with other electronic devices. Computer equipment is designed to contain EMI to various extents as determined by regulatory agencies.

fabric A method of interconnecting CPUs and memory whereby CPUs can communicate with other CPUs and access memory associated with other CPUs without using a single bus.

Fast-20 The specification for SCSI-2, on a 16-bit bus, running at 20 MHz, and capable of 40 megatransfers per second. See also Ultra SCSI.

gigabyte (GB) A measure of data equal to 1024 megabytes (MB). See also byte, kilobyte, and megabyte.

hertz (Hz) Frequency in cycles per second.

kilobyte (KB) A measure of data equal to 1024 bytes. See also byte, megabyte, and gigabyte.

kilogram (kg) A unit of weight equal to 1000 grams (about 2.2 pounds). See also pound.

MAC address Unique hexadecimal serial number assigned to each Ethernet network device to identify it on the network. With Ethernet devices (as with most other network types), this address is permanently set at the time of manufacture. Each Ethernet device has a unique MAC address so that it can exclusively copy packets that are meant for it off the network.

megabyte (MB) A measure of data equal to 1024 kilobytes (KB). See also byte, kilobyte, and gigabyte.


megatransfer One million bytes of combined SCSI commands and data transferred across a bus.

module An SGI 2100 chassis, CPU, memory, drives, and I/O interfaces contained in a single chassis.

module system controller (MSC) The component that contains logic to control low-level system functions, such as power-on, power-off, and fan speed.

number in a can (NIC) A device mounted on a circuit board that contains information such as the serial number of the board and other information. In Origin200 systems, the NIC mounted on the module system controller of the master CPU module contains the system serial number.

node (node board) A board that contains one or two processors and their caches, a section of the global main memory, a crosstalk I/O port, and an inter-node routing network connection.

peripheral component interconnect (PCI) A bus specification.

pound A unit of weight equal to 16 ounces (about 0.45 kilograms). See also kilogram.

programmable read-only memory (PROM) Memory that contains data, often configuration information and low-level programs, that usually is not updated by the system. Certain types of PROMs can be updated, for example, during software installation.

RAID Redundant array of inexpensive disks.

random access memory (RAM) Memory used for calculations and temporary storage. See also cache.


read-only memory (ROM) Memory that contains data (often configuration information and low-level programs) that cannot be updated by the system.

SCSI Small computer systems interface. See also channel and controller.

system A general term for all the components of a computer. In the case of an SGI 2100 system, this consists of one or more node boards contained within a single chassis. See also node.

system console A device to which IRIX prints various messages, and which is the first serial I/O device to become active when the system boots. In the SGI 2100 server, the system console is connected to serial port 1.

terminal emulation The process by which software emulates a character-based (ASCII) terminal. See also terminal emulator.

terminal emulator Software that emulates a character-based (ASCII) terminal. Typically used with modems for remote, dial-up access, a terminal emulator can also be used over a serial cable connected directly between two computers. Terminal emulators are often called “modem software” or “communication software.”

Ultra SCSI The specification for SCSI-2, on a 16-bit bus, running at 20 MHz, and capable of 40 megatransfers per second. See also Fast-20.

Uniform Resource Locator (URL) The address by which resources (files) are identified on the World Wide Web.

XIO An SGI high-speed bus technology; provides greater bandwidth than PCI.


Tell Us About This Manual

As a user of Silicon Graphics products, you can help us to better understand your needs and to improve the quality of our documentation.

Any information that you provide will be useful. Here is a list of suggested topics:
• General impression of the document
• Omission of material that you expected to find
• Technical errors
• Relevance of the material to the job you had to do
• Quality of the printing and binding

Please send the title and part number of the document with your comments. The part number for this document is 007-4114-001.

Thank you!

Three Ways to Reach Us
• To send your comments by electronic mail, use either of these addresses:
  – On the Internet: [email protected]
  – For UUCP mail (through any backbone site): [your_site]!sgi!techpubs
• To fax your comments (or annotated copies of manual pages), use this fax number: 650-932-0801
• To send your comments by traditional mail, use this address:
  Technical Publications
  Silicon Graphics, Inc.
  1600 Amphitheatre Parkway, M/S 535
  Mountain View, California 94043-1351