Validation of an HPC Cluster: A Sometimes Neglected Aspect of System Administration
A walk-through of methods and procedures

Michael Hebenstreit, Intel® Corp., CRT Datacenter, Senior Cluster Architect

tut118 1 SC2010

Agenda
• CRT-DC – the Customer Response Data Center
• The problem
• Tier 1: the hardware
• Tier 2: the installed image
• Tier 3: performance tests
• A look at MS Windows*
• Commercial solutions


CRT Datacenter Challenges
• Support for variety
  – Multitude of different hardware architectures
  – Early access often leads to alpha and beta systems used in cluster configuration
• Support for different customers
  – OEMs, end users, ISVs
  – Some want their own configuration
  – Manage access while preserving security of data for each user
  – Protect the internal network and Intel IP from external disclosure
• Support for scaling
  – Often requires an exclusive period due to custom configurations
  – Compute nodes are taken out of circulation for the duration of the project

CRT-DC cluster configuration

[Figure: CRT-DC cluster diagram. Labels: Panasas* /home; DDN* Lustre long-term storage; LFS4 (HDD) and LFS5 (SSD); 360 Urbanna and 64 Supermicro* compute nodes (24 GB RAM each; 400 GB SAS HD, 500 GB SATA HD); admin1, admin2, pbs-serv1, pbs-serv2, login and compile nodes; Force10* 1GbE network; QDR InfiniBand network.]

Exemplary Configurations
• Nodes
  – 360 Intel SR1600UR: Xeon® X5670 (WSM), 2.93 GHz, 12 cores/node, 24 GB
  – 64 Supermicro 6026T-NTR+: 34 Xeon® X5560 (NHM, 2.8 GHz, 8 cores/node), 40 Xeon® X5677 (WSM, 3.47 GHz, 8 cores/node), all 24 GB
• Cluster File System
  – Panasas* (70 TB storage)
  – DDN* Lustre (28 TB storage)
  – HDD Lustre (23 TB storage)
  – SSD Lustre (3 TB storage)
• Distributed GigE:
  – Force10* Networks -300 backbone, Force10 Networks S50N top-of-rack
• Distributed InfiniBand*:
  – Mellanox* MTS3600Q, 18 spine, 28 leaf switches, 504 ports
• Software stack:
  – RedHat* EL5, OFED 1.3+, Lustre 1.6.4.3+
• The cluster has been on the Top 500 list since June 2006 (best ranking #68, worst #153)


Classification
• Hardware and software defects: a system is dead or does not operate correctly
• Inconsistencies: the configuration (config files, installed rpms…) is not identical across the cluster
• Degradation: the system works correctly but has lost performance -> keep log files for comparison

The Toolbox
• Executing commands in parallel – pdsh*
• Consolidating pdsh output – dshbak*
• cat, grep, sum, sed, awk…
• shell scripting
• advanced programming languages like Python* or Perl*

pdsh homepage: http://sourceforge.net/projects/pdsh


redirect: Sends the output of a command into a file.

[smartuser@server1~]$ echo "\"To err is human -" > text1
[smartuser@server1~]$ echo "and to blame it on a computer is even more so."\" > text2
[smartuser@server1~]$ echo "Robert Orben" > text3

------
cat (concatenate): Displays the contents of one or more files on standard output. It is most commonly used to display a single file on a monitor.

[smartuser@server1~]$ cat text1
"To err is human -
[smartuser@server1~]$ cat text2
and to blame it on a computer is even more so."
[smartuser@server1~]$ cat text3
Robert Orben

[smartuser@server1~]$ cat text1 text2 text3
"To err is human -
and to blame it on a computer is even more so."
Robert Orben

[smartuser@server1~]$ cat text1 text2 text3 > text4

[smartuser@server1~]$ cat text4
"To err is human -
and to blame it on a computer is even more so."
Robert Orben

------
grep: Finds a text pattern within a file and returns the line(s) containing the pattern. It is most commonly used to find a word, but it can match a character, phrase, sentence or any regular expression.

[smartuser@server1~]$ grep computer text4
and to blame it on a computer is even more so."

grep -i: Because grep is case sensitive, -i is used to ignore case.

[smartuser@server1~]$ grep to text4
and to blame it on a computer is even more so."

[smartuser@server1~]$ grep -i to text4
"To err is human -
and to blame it on a computer is even more so."

grep -c: Counts the number of lines which contain the expression.

[smartuser@server1~]$ grep -c is text4
2

grep -v: Searches for lines which do not contain the expression.

[smartuser@server1~]$ grep -v is text4
Robert Orben

grep -q: Searches quietly, printing nothing, and exits as soon as the expression is found. The exit code is stored in the shell variable $?; echoing $? shows whether the expression is present. Success = 0, failure = 1. This is useful in "if" statements to avoid confusing output to a user.

[smartuser@server1~]$ grep -q man text4; echo $?
0
[smartuser@server1~]$ grep -q woman text4; echo $?
1

------
sum: Computes a 16-bit checksum for each given file and counts the blocks each file occupies. Typically the checksum is calculated after a file transfer and compared to the checksum of the original file to ensure file integrity.

[smartuser@server1~]$ sum text4
05333 1

[smartuser@server1~]$ sum text1 text2 text3
24872 1 text1
63331 1 text2
20594 1 text3
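The integrity check described for sum can be scripted. The following is a minimal sketch, assuming a POSIX shell; the function names checksum_of and same_checksum and the file names are made up for this illustration.

```shell
#!/bin/sh
# checksum_of FILE: print only the checksum field reported by sum(1).
checksum_of() {
    sum "$1" | awk '{print $1}'
}

# same_checksum A B: succeed (exit code 0) when both files have the same checksum.
same_checksum() {
    [ "$(checksum_of "$1")" = "$(checksum_of "$2")" ]
}

# Demonstration on throwaway files (hypothetical names).
tmpdir=$(mktemp -d)
printf 'Robert Orben\n' > "$tmpdir/original"
cp "$tmpdir/original" "$tmpdir/copy"
printf 'Robert Orben?\n' > "$tmpdir/damaged"

if same_checksum "$tmpdir/original" "$tmpdir/copy"; then
    echo "copy: checksum matches"
fi
if ! same_checksum "$tmpdir/original" "$tmpdir/damaged"; then
    echo "damaged: checksum differs"
fi
rm -rf "$tmpdir"
```

A 16-bit checksum can collide, so a match is a strong hint rather than proof that two files are identical.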

------
awk (printing a specific column): awk is generally used to search output or a file for a pattern and then manipulate the matching lines. When awk processes a line, it assigns each whitespace-separated field of that line to a variable, e.g. $1 $2 $3 $4 $NF. The smart user can then manipulate the values by using these variables.

[smartuser@server1~]$ cat text4
"To err is human -
and to blame it on a computer is even more so."
Robert Orben

[smartuser@server1~]$ awk '{print $2}' text4
err
to
Orben

To limit the output, we can add a pattern telling awk to consider only the line that begins with "and":

[smartuser@server1~]$ awk '/^and/ {print $3" "$6" "$7}' text4
blame a computer

piping with "|": The pipe lets us direct output from one command directly into another. Here is another way to get the same output:

[smartuser@server1~]$ grep blame text4 | awk '{print $3" "$6" "$7}'
blame a computer

------
sed (changing text): sed is most useful for making text transformations on an input stream, whether from a file or a pipeline. The single quotes contain the logic sed is to follow: s = substitute, computer is the expression to find and dog is the expression to put in its place. g means global and tells sed not to stop at the first occurrence on a line, but to make the change wherever the expression "computer" occurs.

[smartuser@server1~]$ grep blame text4 | awk '{print $3" "$6" "$7}' | sed 's/computer/dog/g'
blame a dog

Or

[smartuser@server1~]$ awk '/^and/ {print $3" "$6" "$7}' text4 | sed 's/computer/cat/g'
blame a cat

For fun:

[smartuser@server1~]$ OTHERS="horse pig mouse goat"
[smartuser@server1~]$ echo $OTHERS
horse pig mouse goat

[smartuser@server1~]$ for i in $OTHERS; do awk '/^and/ {print "You should "$3" "$6" "$7}' text4 | sed 's/computer/'$i'/g'; done
You should blame a horse
You should blame a pig
You should blame a mouse
You should blame a goat

sort is used to sort either alphabetically or numerically. If you pipe (|) output into sort, the sorted results appear on your monitor. You can redirect (>) the output of sort into a file or pipe it into another command.

[smartuser@server1~]$ for i in $OTHERS; do awk '/^and/ {print "You should "$3" "$6" "$7}' text4 | sed 's/computer/'$i'/g'; done | sort
You should blame a goat
You should blame a horse
You should blame a mouse
You should blame a pig

uniq is used to manage successive identical lines. Most commonly it is used to omit duplicate lines from standard output. Pipe through sort first so that identical lines are adjacent and uniq can find them.

[smartuser@server1~]$ cat other
horse
pig
mouse
goat
horse
pig
mouse

[smartuser@server1~]$ for i in `cat other`; do awk '/^and/ {print "You should "$3" "$6" "$7}' text4 | sed 's/computer/'$i'/g'; done | sort | uniq
You should blame a goat
You should blame a horse
You should blame a mouse
You should blame a pig

Conventions
• Examples are in Courier, output will appear in black, commands entered are set in bold blue
• Commands usable without special privileges are prefixed by [user]$, those requiring administrative privileges by [root]#

[user]$ ls -l /etc/shadow
-r-------- 1 root root 29218 Aug 10 16:42 /etc/shadow

[root]# sum /etc/shadow
47530 29

Executing commands in parallel
• Requirements:
  – grouping of nodes
  – ssh based
  – options for timeout and fanout
  – an optional leading node name on each output line is helpful
• pdsh is one (but not the only) solution

[user]$ pdsh -w et01,et03,et[10-15] -f 2 -u 5 'date; sleep 1'
et01: Fri Aug 6 17:36:48 PDT 2010
et03: Fri Aug 6 17:36:48 PDT 2010
et10: Fri Aug 6 17:36:49 PDT 2010
et11: Fri Aug 6 17:36:49 PDT 2010
et13: Fri Aug 6 17:36:50 PDT 2010
et14: Fri Aug 6 17:36:52 PDT 2010
et15: Fri Aug 6 17:36:53 PDT 2010
pdsh@eln1: et12: command timeout
sending SIGTERM to ssh et12 pid 27323
pdsh@eln1: et12: ssh exited with exit code 0

Consolidating pdsh output: dshbak -c
• Requirements:
  – consolidate into groups of equal output
  – lists of nodes compatible with pdsh

[user]$ pdsh -w et[01,03,10-11,13-15] pwd | dshbak -c
----------------
et[01,03,10-11,13-15]
----------------
/home/user1

[user]$ pdsh -w et[01,03,10-11,13-15] 'ps aux | wc -l' | dshbak -c
----------------
et[11,13,15]
----------------
306
----------------
et[01,03,10,14]
----------------
308

Connecting to the Linux cluster
• Using Putty*:
  – enter the IP address into the “Hostname” field
  – press open
  – enter username
  – enter password

Putty* homepage: http://www.chiark.greenend.org.uk/~sgtatham/putty/

How to protect against disconnects
• Use screen*:
  – to start, type “screen”
  – to detach, press “ctrl-a d”
  – to create a new window, press “ctrl-a c”
  – to switch to window 0, press “ctrl-a 0”
  – to switch to window 1, press “ctrl-a 1”
  – to reconnect after a connection loss, type “screen -x” (lower case x)


Components of cluster

[Figure: generic cluster diagram. Labels: /home and long-term storage (HPFS); admin1, admin2, pbs-serv1, pbs-serv2; login and compile nodes; compute nodes; 1GbE network; QDR InfiniBand (IB) network.]

Components of a node

[Figure: node diagram. Labels: motherboard, CPU, RAM, BIOS, BMC, power supply, fans, disk, RAID controller, network adapter, IB card.]

CPU - /proc/cpuinfo
• contains information on all processors in the system
• particularly interesting: “model name”, “cpu MHz”, “cache size”
• Note: Hyperthreading cannot be directly discerned

[user]$ pdsh -w et[60-78] -u 3 'grep MHz /proc/cpuinfo | sort | uniq' | dshbak -c
----------------
et[60-64,69-73,77]
----------------
cpu MHz : 3458.000
----------------
et[65-68]
----------------
cpu MHz : 2926.000
----------------
et[74-76,78]
----------------
cpu MHz : 2793.000
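Although /proc/cpuinfo has no explicit Hyperthreading field, the state can usually be inferred by comparing “siblings” (logical CPUs per package) with “cpu cores” (physical cores per package). The sketch below assumes a kernel that reports both fields; the function name ht_status and the sample values are illustrative only.

```shell
#!/bin/sh
# ht_status [FILE]: compare "siblings" with "cpu cores" in a cpuinfo-style
# file (default: /proc/cpuinfo) and print "HT on", "HT off" or "unknown".
ht_status() {
    awk -F': *' '
        /^siblings/  { siblings = $2 }
        /^cpu cores/ { cores = $2 }
        END {
            if (siblings == "" || cores == "") print "unknown"
            else if (siblings + 0 > cores + 0) print "HT on"
            else print "HT off"
        }' "${1:-/proc/cpuinfo}"
}

# Demonstration on a made-up excerpt: 8 logical CPUs on 4 physical cores.
sample=$(mktemp)
printf 'siblings\t: 8\ncpu cores\t: 4\n' > "$sample"
ht_status "$sample"
rm -f "$sample"
```

Saved as a script that calls ht_status without arguments, this can be run across nodes with pdsh and consolidated with dshbak -c like the other checks.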

[user]$ cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 23
model name : Intel(R) Xeon(R) CPU E5462 @ 2.80GHz
stepping : 6
cpu MHz : 2400.000
cache size : 6144 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 4
apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr sse4_1 lahf_lm
bogomips : 5585.99
clflush size : 64
cache_alignment : 64
address sizes : 38 bits physical, 48 bits virtual
power management:
…….

Methodology of Cluster Validation
• divide the cluster into groups of identical nodes
• run a series of tests, watching for differences or simply incorrect values
• if a new problem appears, include a test to check for it
• re-validate once per week, after changes and after reboots
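One way to automate this methodology is to record a few facts from a known-good node of each group and compare every node against that reference. The sketch below is an assumption about how such a check might look, not the procedure used at CRT-DC; collect_facts, validate_node and the reference file are hypothetical names.

```shell
#!/bin/sh
# collect_facts [CPUINFO] [MEMINFO]: print "name=value" lines for this node.
collect_facts() {
    cpuinfo=${1:-/proc/cpuinfo}
    meminfo=${2:-/proc/meminfo}
    echo "cpus=$(grep -c '^processor' "$cpuinfo")"
    awk -F': *' '/^model name/ { print "model=" $2; exit }' "$cpuinfo"
    awk '/^MemTotal/ { print "mem_mb=" int($2 / 1024) }' "$meminfo"
}

# validate_node REFERENCE [CPUINFO] [MEMINFO]: succeed when the collected
# facts are identical to the reference file of the node group.
validate_node() {
    reference=$1
    shift
    collect_facts "$@" | diff "$reference" - > /dev/null
}

# Demonstration on made-up input files instead of /proc.
work=$(mktemp -d)
printf 'processor\t: 0\nmodel name\t: Example CPU\nprocessor\t: 1\nmodel name\t: Example CPU\n' > "$work/cpuinfo"
printf 'MemTotal:       24673976 kB\n' > "$work/meminfo"
collect_facts "$work/cpuinfo" "$work/meminfo" > "$work/reference"
if validate_node "$work/reference" "$work/cpuinfo" "$work/meminfo"; then
    echo "node matches reference"
fi
rm -rf "$work"
```

When a new problem appears, adding one more line to collect_facts extends the check to every future validation run.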


More elaborate Example
/proc/cpuinfo shows a different number of cores on et73 because HT is off

[user]$ cat parse_cpuinfo.sh
grep "model name" /proc/cpuinfo | sort | uniq
FREQS=`grep "cpu MHz[[:space:]]*:" /proc/cpuinfo | sort | uniq | awk '{print $NF}'`
for I in $FREQS
do
  COUNT=`grep -c "cpu MHz[[:space:]]*: $I" /proc/cpuinfo`
  echo " $COUNT cores at $I MHz"
done

[user]$ pdsh -w et[60-64,69-73] -u 3 'sh parse_cpuinfo.sh' | dshbak -c
----------------
et[73]
----------------
model name : Intel(R) Xeon(R) CPU X5677 @ 3.47GHz
 8 cores at 3458.000 MHz
----------------
et[60-64,69-72]
----------------
model name : Intel(R) Xeon(R) CPU X5677 @ 3.47GHz
 16 cores at 3458.000 MHz

Sleep and Turbo states of the CPU
• modern CPUs can optimize power consumption according to load
• idle cores might go to sleep, busy cores could switch into Turbo mode
• information is available on a “per core” basis
• requires that the corresponding kernel modules are loaded

Redhat* Enterprise Linux: look for kernel module “acpi_cpufreq”

Information on Sleep/Turbo states
• C-states: cat /proc/acpi/processor/CPU#/power
• Intel® SpeedStep: cd /sys/devices/system/cpu/cpu#/cpufreq/
  – available frequencies: cat scaling_available_frequencies
    2927000 2926000 2793000 … 1729000 1596000
  – possible control options: cat scaling_available_governors
    ondemand userspace performance
  – currently used frequency: cat scaling_cur_freq
    2400000
  – currently used controller: cat scaling_governor
    ondemand

Examples for C-states:

[user]$ cat /proc/acpi/processor/CPU0/power
active state: C1
max_cstate: C8
bus master activity: 00000000
states:
*C1: type[C1] promotion[C2] demotion[--] latency[000] usage[03603131] duration[00000000000000000000]
C2: type[C2] promotion[--] demotion[C1] latency[040] usage[1215642043] duration[00000004126523448417]

[user]$ ssh en001 cat /proc/acpi/processor/CPU0/power
active state: C1
max_cstate: C1
bus master activity: 00000000
states:
*C1: type[C1] promotion[--] demotion[--] latency[000] usage[00000000] duration[00000000000000000000]

Note: in the second example C-states were limited to C1 for performance reasons.

Controlling Frequency
• Governor must be “userspace”
• for all CPUs: write the same value into “scaling_setspeed”

[root]# cd /sys/devices/system/cpu/
[user]$ cat cpu*/cpufreq/scaling_governor | sort | uniq
ondemand
[root]# for I in cpu*/cpufreq/scaling_governor; do echo userspace > $I; done
[user]$ cat cpu*/cpufreq/scaling_governor | sort | uniq
userspace
[user]$ cat cpu*/cpufreq/scaling_cur_freq | sort | uniq
2400000
[root]# for I in cpu*/cpufreq/scaling_setspeed ; do echo 2800000 > $I; done
[user]$ cat cpu*/cpufreq/scaling_cur_freq | sort | uniq
2800000
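For validation it helps to see not only which values occur but also how many cores report each one, so that a single deviating core stands out. The following read-only sketch assumes the sysfs layout shown above; the function name cpufreq_summary and the optional root argument (used here to demonstrate on a fake directory tree) are made up.

```shell
#!/bin/sh
# cpufreq_summary NAME [ROOT]: print each distinct value of the cpufreq file
# NAME (e.g. scaling_governor) together with the number of CPUs reporting it.
# ROOT defaults to /sys/devices/system/cpu.
cpufreq_summary() {
    root=${2:-/sys/devices/system/cpu}
    cat "$root"/cpu[0-9]*/cpufreq/"$1" 2>/dev/null |
        sort | uniq -c | awk '{ print $2 " x" $1 }'
}

# Demonstration on a fake sysfs tree: three cores on "ondemand", one deviating.
fake=$(mktemp -d)
for n in 0 1 2 3; do
    mkdir -p "$fake/cpu$n/cpufreq"
    echo ondemand > "$fake/cpu$n/cpufreq/scaling_governor"
done
echo userspace > "$fake/cpu3/cpufreq/scaling_governor"
cpufreq_summary scaling_governor "$fake"
rm -rf "$fake"
```

On a healthy node a single line is expected, for example “ondemand x16”; more than one line points at an inconsistent setting.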

Memory and swap
• /proc/meminfo: information on the memory subsystem; important fields:
  – MemTotal: recognized total memory
  – MemFree, Buffers, Cached: after boot
• /proc/swaps: lists mounted swap space

[user]$ pdsh -w et[60-64,69-73] -u 3 'grep SwapTotal /proc/meminfo' | dshbak -c
----------------
et[60-64,69-73]
----------------
SwapTotal: 8393952 kB

[user]$ pdsh -w et[60-64,69-73] -u 3 'grep MemTotal /proc/meminfo' | dshbak -c
----------------
et[60-64,69-72]
----------------
MemTotal: 24673976 kB
----------------
et73
----------------
MemTotal: 24673984 kB

[root@eln1 ~]# cat /proc/swaps
Filename    Type        Size     Used   Priority
/dev/sda2   partition   8393952  58672  -1

[root@eln1 ~]# cat /proc/meminfo
MemTotal: 65984928 kB
MemFree: 44880144 kB
Buffers: 56252 kB
Cached: 19738156 kB
SwapCached: 5724 kB
Active: 774872 kB
Inactive: 19527424 kB
HighTotal: 0 kB
HighFree: 0 kB
LowTotal: 65984928 kB
LowFree: 44880144 kB
SwapTotal: 8393952 kB
SwapFree: 8335284 kB
Dirty: 2964 kB
Writeback: 0 kB
AnonPages: 505000 kB
Mapped: 13928 kB
Slab: 696548 kB
PageTables: 35260 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
CommitLimit: 41386416 kB
Committed_AS: 982696 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 278100 kB
VmallocChunk: 34359458135 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
Hugepagesize: 2048 kB

Motherboard and BIOS tools
• usually work without special requirements
  – lspci
  – dmidecode
• depend on hardware and vendor
  – ipmitool
  – syscfg

lspci
• generates a list of PCI devices
• lspci can also read configuration space
• Note: setpci can MODIFY configuration space

[user]$ pdsh -w et[01,03,10-15] '/sbin/lspci | grep InfiniBand' | dshbak -c
----------------
et[01,03,10-15]
----------------
04:00.0 InfiniBand: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 …

[root]# pdsh -w et[01,03,10-15] '/sbin/lspci -xxx -s 04:00.0 ' | dshbak -c
----------------
et[01,03,10-15]
----------------
04:00.0 InfiniBand: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 …
00: b3 15 3c 67 06 04 10 00 a0 00 06 0c 10 00 00 00
10: 04 00 e0 fb 00 00 00 00 0c 00 00 f8 00 00 00 00
…
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
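Instead of decoding the raw hex dump by hand, the negotiated PCIe link can also be read from the “LnkSta” line that lspci -vv prints when run as root. The sketch below parses that line; the function name link_status is made up and the sample input is illustrative, not captured from the systems described here. A PCIe Gen2 link reports 5GT/s, whereas 2.5GT/s indicates that the link trained at Gen1 speed.

```shell
#!/bin/sh
# link_status: read `lspci -vv` output on stdin and print the negotiated
# link speed and width taken from the first "LnkSta:" line.
link_status() {
    awk '/LnkSta:/ {
        for (i = 1; i <= NF; i++) {
            if ($i == "Speed") speed = $(i + 1)
            if ($i == "Width") width = $(i + 1)
        }
        gsub(/,/, "", speed)   # remove the trailing comma after the value
        gsub(/,/, "", width)
        print speed, width
        exit
    }'
}

# Demonstration with illustrative input; on a node one would run, as root:
#   /sbin/lspci -vv -s 04:00.0 | link_status
printf '\t\tLnkCap:\tPort #8, Speed 5GT/s, Width x8\n\t\tLnkSta:\tSpeed 5GT/s, Width x8, TrErr-\n' |
    link_status
```

Because the result is a single short line per node, it consolidates well with dshbak -c.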

A typical output might contain:

[root]# lspci
00:00.0 Host bridge: Intel Corporation 5520 I/O Hub to ESI Port (rev 22)
00:01.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 1 (rev 22)
00:03.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 3 (rev 22)
00:05.0 PCI bridge: Intel Corporation 5520/X58 I/O Hub PCI Express Root Port 5 (rev 22)
00:07.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 7 (rev 22)
00:09.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 9 (rev 22)
00:13.0 PIC: Intel Corporation 5520/5500/X58 I/O Hub I/OxAPIC Interrupt Controller (rev 22)
00:14.0 PIC: Intel Corporation 5520/5500/X58 I/O Hub System Management Registers (rev 22)
00:14.1 PIC: Intel Corporation 5520/5500/X58 I/O Hub GPIO and Scratch Pad Registers (rev 22)
00:14.2 PIC: Intel Corporation 5520/5500/X58 I/O Hub Control Status and RAS Registers (rev 22)
00:14.3 PIC: Intel Corporation 5520/5500/X58 I/O Hub Throttle Registers (rev 22)
00:1a.0 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #4
00:1a.2 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #6
00:1a.7 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB2 EHCI Controller #2
00:1c.0 PCI bridge: Intel Corporation 82801JI (ICH10 Family) PCI Express Port 1
00:1c.4 PCI bridge: Intel Corporation 82801JI (ICH10 Family) PCI Express Port 5
00:1d.0 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #1
00:1d.1 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #2
00:1d.2 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #3
00:1d.7 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB2 EHCI Controller #1
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 90)
00:1f.0 ISA bridge: Intel Corporation 82801JIR (ICH10R) LPC Interface Controller
00:1f.2 SATA controller: Intel Corporation 82801JI (ICH10 Family) SATA AHCI Controller
00:1f.3 SMBus: Intel Corporation 82801JI (ICH10 Family) SMBus Controller
01:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
01:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
04:00.0 InfiniBand: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE] (rev a0)
06:00.0 PCI bridge: Intel Corporation 6700PXH PCI Express-to-PCI Bridge A (rev 09)
06:00.2 PCI bridge: Intel Corporation 6700PXH PCI Express-to-PCI Bridge B (rev 09)
0a:00.0 IDE interface: JMicron Technology Corp. JMB368 IDE controller
0b:01.0 VGA compatible controller: ATI Technologies Inc ES1000 (rev 02)


The configuration space can be dumped as well, shown here for the InfiniBand card. For instance, you could check (and correct) whether the card is correctly initialized as a PCIe Gen2 card.

[root]# lspci -xxx -s 04:00.0
04:00.0 InfiniBand: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE] (rev a0)
00: b3 15 3c 67 06 04 10 00 a0 00 06 0c 10 00 00 00
10: 04 00 e0 fb 00 00 00 00 0c 00 00 f8 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 b3 15 08 00
30: 00 00 00 00 40 00 00 00 00 00 00 00 0b 01 00 00
40: 01 48 03 00 00 00 00 00 03 9c ff 7f 11 11 00 00
50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
60: 10 00 02 00 01 8e e8 07 20 10 00 00 82 f4 03 08
70: 00 00 82 00 00 00 00 00 00 00 00 00 00 00 00 00
80: 00 00 00 00 1f 00 00 00 00 00 00 00 00 00 00 00
90: 02 00 00 00 00 00 00 00 00 00 00 00 11 60 ff 80
a0: 00 c0 07 00 00 d0 07 00 05 00 8a 00 00 00 00 00
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

Memory refined
• common problem: slight differences in recognized memory

[user]$ pdsh -w et[60-64,69-73] -u 3 'awk '\''/MemTotal/{print $1,int($2/1024),"MB"}'\'' /proc/meminfo' | dshbak -c
----------------
et[60-64,69-73]
----------------
MemTotal: 24095 MB

[user]$ cat memtotal.sh
#!/bin/sh

awk '/MemTotal/{print $1,int($2/1024),"MB"}' /proc/meminfo

[user]$ pdsh -w et[60-64,69-73] -u 3 sh memtotal.sh | dshbak -c
----------------
et[60-64,69-73]
----------------
MemTotal: 24095 MB
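Rounding to MB hides differences of a few kB, but values that straddle a MB boundary would still split into two groups. A further option is to compare against an expected value with an explicit tolerance, as in the sketch below; the function name mem_within and the chosen expected value and tolerance are assumptions for illustration.

```shell
#!/bin/sh
# mem_within MEMINFO EXPECTED_MB TOLERANCE_MB: print the recognized memory
# and succeed only if it lies within the tolerance of the expected value.
# Exit codes: 0 = within tolerance, 1 = outside, 2 = MemTotal not found.
mem_within() {
    awk -v want="$2" -v tol="$3" '
        /^MemTotal/ {
            have = int($2 / 1024)
            diff = have - want
            if (diff < 0) diff = -diff
            printf "MemTotal %d MB (expected %d +/- %d)\n", have, want, tol
            found = 1
            status = (diff <= tol) ? 0 : 1
            exit status
        }
        END { if (!found) exit 2 }' "$1"
}

# Demonstration on a sample file holding the MemTotal value shown above.
sample=$(mktemp)
printf 'MemTotal:       24673976 kB\n' > "$sample"
if mem_within "$sample" 24096 16; then
    echo "memory size OK"
fi
rm -f "$sample"
```

A missing or unrecognized DIMM produces a difference far larger than any sensible tolerance, so the check still catches the failures that matter.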

Accessing BIOS with dmidecode
• lists hardware and vendor information
• entire table is large
• needs root privileges
• highly useful:
  – BIOS version/date and supported features
  – Board type
  – CPU type and features
  – Memory DIMM size, type, manufacturer…

[root]# pdsh -w et[01,03,10-11,13-15] 'dmidecode | grep Date' | dshbak -c
----------------
et[01,03,10-11,13-15]
----------------
Release Date: 04/19/2010
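A plain grep for “Version” would also match the System and Base Board sections, so extracting the BIOS version needs a little more care. The sketch below restricts the match to the “BIOS Information” section of the dmidecode output; the function name bios_summary is made up, and the sample values are illustrative. Where available, dmidecode -s bios-version and dmidecode -s bios-release-date are a shorter route to the same information.

```shell
#!/bin/sh
# bios_summary: read dmidecode output on stdin and print the BIOS version
# and release date found in the "BIOS Information" section only.
bios_summary() {
    awk -F': *' '
        /^Handle /          { in_bios = 0 }
        /^BIOS Information/ { in_bios = 1; next }
        in_bios && /^[ \t]*Version:/      { version = $2 }
        in_bios && /^[ \t]*Release Date:/ { date = $2 }
        END { print version " (" date ")" }'
}

# Demonstration on a shortened, illustrative excerpt; on a node one would
# run, as root:  dmidecode | bios_summary
printf '%s\n' \
    'Handle 0x0002, DMI type 1, 27 bytes' \
    'System Information' \
    '	Version: Not Specified' \
    '' \
    'Handle 0x0005, DMI type 0, 24 bytes' \
    'BIOS Information' \
    '	Vendor: Intel Corporation' \
    '	Version: S5400.86B.06.00.0030.112620081512' \
    '	Release Date: 11/26/2008' |
    bios_summary
```

Printing version and date on one line keeps the dshbak -c groups compact when the check runs across the cluster.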

dmidecode dumps the Desktop Management Interface (DMI) information (also known as System Management BIOS aka SMBIOS). It contains a list of hardware components as well as system vendor information like mother board type and BIOS version. The information is possibly unreliable, but in most cases highly useful.

# dmidecode 2.10 SMBIOS 2.5 present. 80 structures occupying 3493 bytes. Table at 0x7FA32000.

Handle 0x0001, DMI type 38, 18 bytes IPMI Device Information Interface Type: KCS (Keyboard Control Style) Specification Version: 2.0 I2C Slave Address: 0x10 NV Storage Device: Not Present Base Address: 0x0000000000000CA2 (I/O) Register Spacing: Successive Byte Boundaries

Handle 0x0002, DMI type 1, 27 bytes System Information Manufacturer: Intel Product Name: S5400SF Version: Not Specified Serial Number: ...... UUID: 76A516B5-84B5-11DC-BA6C-001517470362 Wake-up Type: LAN Remote SKU Number: ...... Family: Not Specified

Handle 0x0003, DMI type 2, 16 bytes Base Board Information Manufacturer: Intel Product Name: S5400SF Version: FRU Ver 0.03 Serial Number: BZSR74300552 Asset Tag: Not Specified Features: Board is a hosting board Board is replaceable Location In Chassis: Not Specified Chassis Handle: 0x0000 Type: Motherboard Contained Object Handles: 0

Handle 0x0004, DMI type 3, 22 bytes Chassis Information Manufacturer: ..... Type: Rack Mount Chassis Lock: Not Present Version: Not Specified Serial Number: Not Specified Asset Tag: Not Specified Boot-up State: Safe Power Supply State: Safe Thermal State: Safe Security Status: Unknown OEM Information: 0x81581CF8 Height: 1 U Number Of Power Cords: 1 Contained Elements: 0

Handle 0x0005, DMI type 0, 24 bytes BIOS Information Vendor: Intel Corporation Version: S5400.86B.06.00.0030.112620081512 Release Date: 11/26/2008 Address: 0xE8000 Runtime Size: 96 kB ROM Size: 4096 kB Characteristics: PCI is supported PNP is supported BIOS is upgradeable BIOS shadowing is allowed Boot from CD is supported Selectable boot is supported EDD is supported Print screen service is supported (int 5h) 8042 keyboard services are supported (int 9h) Serial services are supported (int 14h) CGA/mono video services are supported (int 10h) ACPI is supported USB legacy is supported LS-120 boot is supported ATAPI Zip drive boot is supported BIOS boot specification is supported Function key-initiated network boot is supported Targeted content distribution is supported BIOS Revision: 6.0 Firmware Revision: 0.0

Handle 0x0006, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: J9A1 Internal Connector Type: None External Reference Designator: Keyboard External Connector Type: PS/2 Port Type: Keyboard Port

Handle 0x0007, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: J9A1 Internal Connector Type: None External Reference Designator: Mouse External Connector Type: PS/2 Port Type: Mouse Port

Handle 0x0008, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: J1B1 Internal Connector Type: DB-9 male External Reference Designator: COM 1 External Connector Type: DB-9 male Port Type: Serial Port 16550A Compatible

Handle 0x0009, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: J9A2 Internal Connector Type: None External Reference Designator: COM 2 External Connector Type: RJ-45 Port Type: Serial Port 16550A Compatible

Handle 0x000A, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: J1J1 Internal Connector Type: None External Reference Designator: USB0 HEADER External Connector Type: Access Bus (USB) Port Type: USB

Handle 0x000B, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: J1J1 Internal Connector Type: None External Reference Designator: USB1 HEADER External Connector Type: Access Bus (USB) Port Type: USB

Handle 0x000C, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: J4G1 Internal Connector Type: None External Reference Designator: USB2 BRIDGE BOARD External Connector Type: Access Bus (USB) Port Type: USB

Handle 0x000D, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: J4G1 Internal Connector Type: None External Reference Designator: USB3 BRIDGE BOARD External Connector Type: Access Bus (USB) Port Type: USB

Handle 0x000E, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: J4G1 Internal Connector Type: None External Reference Designator: USB4 BRIDGE BOARD External Connector Type: Access Bus (USB) Port Type: USB

Handle 0x000F, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: J5A1 Internal Connector Type: None External Reference Designator: USB5 PORT External Connector Type: Access Bus (USB) Port Type: USB

Handle 0x0010, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: J5A1 Internal Connector Type: None External Reference Designator: USB6 PORT External Connector Type: Access Bus (USB) Port Type: USB

Handle 0x0011, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: J1C2 Internal Connector Type: None External Reference Designator: USB7 RMM CONNECTOR External Connector Type: Access Bus (USB) Port Type: USB

Handle 0x0012, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: J9K1 Internal Connector Type: Other External Reference Designator: CPU1 FAN External Connector Type: None Port Type: Other

Handle 0x0013, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: J4K1 Internal Connector Type: Other External Reference Designator: CPU2 FAN External Connector Type: None Port Type: Other

Handle 0x0014, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: J3K2 Internal Connector Type: Other External Reference Designator: FRNT FAN1 External Connector Type: None Port Type: Other

Handle 0x0015, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: J3K3 Internal Connector Type: Other External Reference Designator: FRNT FAN2 External Connector Type: None Port Type: Other

Handle 0x0016, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: J1B3 Internal Connector Type: Other External Reference Designator: FRNT FAN3 External Connector Type: None Port Type: Other

Handle 0x0017, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: J1C1 Internal Connector Type: Other External Reference Designator: FRNT FAN4 External Connector Type: None Port Type: Other

Handle 0x0018, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: J3G2 Internal Connector Type: On Board IDE External Reference Designator: OnBoard Primary IDE External Connector Type: None Port Type: Other

Handle 0x0019, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: JA8A2 Internal Connector Type: None External Reference Designator: LAN 1 External Connector Type: RJ-45 Port Type: Network Port

Handle 0x001A, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: JA8A1 Internal Connector Type: None External Reference Designator: LAN 2 External Connector Type: RJ-45 Port Type: Network Port

Handle 0x001B, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: J6A1 Internal Connector Type: None External Reference Designator: Onboard Video External Connector Type: DB-15 female Port Type: Video Port

Handle 0x001C, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: J1H1 Internal Connector Type: Other External Reference Designator: SATA_0 External Connector Type: None Port Type: SATA

Handle 0x001D, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: J1G2 Internal Connector Type: Other External Reference Designator: SATA_1 External Connector Type: None Port Type: SATA

Handle 0x001E, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: J1G1 Internal Connector Type: Other External Reference Designator: SATA_2 External Connector Type: None Port Type: SATA

Handle 0x001F, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: J1F2 Internal Connector Type: Other External Reference Designator: SATA_3 External Connector Type: None Port Type: SATA

Handle 0x0020, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: J1F1 Internal Connector Type: Other External Reference Designator: SATA_4 External Connector Type: None Port Type: SATA

Handle 0x0021, DMI type 8, 9 bytes Port Connector Information Internal Reference Designator: J1E2 Internal Connector Type: Other External Reference Designator: SATA_5 External Connector Type: None Port Type: SATA

Handle 0x0022, DMI type 10, 6 bytes On Board Device Information Type: Video Status: Enabled Description: ATI Rage XL

Handle 0x0023, DMI type 10, 6 bytes On Board Device Information Type: Ethernet Status: Enabled Description: Intel 82563EB Ethernet 1

Handle 0x0024, DMI type 10, 6 bytes On Board Device Information Type: Ethernet Status: Enabled Description: Intel 82563EB Ethernet 2

Handle 0x0025, DMI type 10, 6 bytes On Board Device Information Type: Other Status: Enabled Description: ESB2 Integrated PATA Controller

Handle 0x0026, DMI type 10, 6 bytes On Board Device Information Type: Other Status: Enabled Description: ESB2 Integrated SATA Controller

Handle 0x0027, DMI type 10, 6 bytes On Board Device Information Type: Other Status: Enabled Description: NS PC87427 SIO3

Handle 0x0028, DMI type 13, 22 bytes BIOS Language Information Installable Languages: 1 en|US|iso8859-1 Currently Installed Language: en|US|iso8859-1

Handle 0x0029, DMI type 32, 20 bytes System Boot Information Status: No errors detected

Handle 0x002A, DMI type 11, 5 bytes OEM Strings String 1: String 2: String 3: String 4: String 5:

Handle 0x002B, DMI type 12, 5 bytes System Configuration Options Option 1: J1D2 2-3: Close to clear Password Option 2: J1D3 2-3: Close to clear CMOS Option 3: J3H1 1-2: Close to run BIOS Low Bank Option 4: J1D1 2-3: Close to Force Update Mode Option 5: J1D5: SATA RAID Key

Handle 0x002C, DMI type 129, 8 bytes OEM-specific Type Header and Data: 81 08 2C 00 01 01 02 00 Strings: Intel ASF Intel ASF_001

Handle 0x002D, DMI type 4, 40 bytes Processor Information Socket Designation: CPU_1 Type: Central Processor Family: Xeon Manufacturer: Intel(R) Corporation ID: 76 06 01 00 FF FB EB BF Signature: Type 0, Family 6, Model 23, Stepping 6 Flags: FPU (Floating-point unit on-chip) VME (Virtual mode extension) DE (Debugging extension) PSE (Page size extension) TSC (Time stamp counter) MSR (Model specific registers) PAE (Physical address extension) MCE (Machine check exception) CX8 (CMPXCHG8 instruction supported) APIC (On-chip APIC hardware supported) SEP (Fast system call) MTRR (Memory type range registers) PGE (Page global enable) MCA (Machine check architecture) CMOV (Conditional move instruction supported) PAT (Page attribute table) PSE-36 (36-bit page size extension) CLFSH (CLFLUSH instruction supported) DS (Debug store) ACPI (ACPI supported) MMX (MMX technology supported) FXSR (Fast floating-point save and restore) SSE (Streaming SIMD extensions) SSE2 (Streaming SIMD extensions 2) SS (Self-snoop) HTT (Hyper-threading technology) TM (Thermal monitor supported) PBE (Pending break enabled) Version: Intel(R) Xeon(R) CPU E5462 @ 2.80GHz Voltage: 1.1 V External Clock: 1600 MHz Max Speed: 3400 MHz Current Speed: 2800 MHz Status: Populated, Enabled Upgrade: ZIF Socket L1 Cache Handle: 0x002F L2 Cache Handle: 0x002E L3 Cache Handle: Not Provided Serial Number: Not Specified Asset Tag: Not Specified Part Number: Not Specified Core Count: 4 Core Enabled: 4 Thread Count: 4 Characteristics: 64-bit capable

Handle 0x002E, DMI type 7, 19 bytes Cache Information Socket Designation: L2-Cache Configuration: Enabled, Not Socketed, Level 2 Operational Mode: Write Back Location: Internal Installed Size: 12288 kB Maximum Size: 12288 kB Supported SRAM Types: Asynchronous Installed SRAM Type: Asynchronous Speed: Unknown Error Correction Type: Single-bit ECC System Type: Unified Associativity: 24-way Set-associative

Handle 0x002F, DMI type 7, 19 bytes Cache Information Socket Designation: L1-Cache Configuration: Enabled, Not Socketed, Level 1 Operational Mode: Write Back Location: Internal Installed Size: 128 kB Maximum Size: 128 kB Supported SRAM Types: Asynchronous Installed SRAM Type: Asynchronous Speed: Unknown Error Correction Type: Single-bit ECC System Type: Data Associativity: 8-way Set-associative

Handle 0x0030, DMI type 4, 40 bytes Processor Information Socket Designation: CPU_2 Type: Central Processor Family: Xeon Manufacturer: Intel(R) Corporation ID: 76 06 01 00 FF FB EB BF Signature: Type 0, Family 6, Model 23, Stepping 6 Flags: FPU (Floating-point unit on-chip) VME (Virtual mode extension) DE (Debugging extension) PSE (Page size extension) TSC (Time stamp counter) MSR (Model specific registers) PAE (Physical address extension) MCE (Machine check exception) CX8 (CMPXCHG8 instruction supported) APIC (On-chip APIC hardware supported) SEP (Fast system call) MTRR (Memory type range registers) PGE (Page global enable) MCA (Machine check architecture) CMOV (Conditional move instruction supported) PAT (Page attribute table) PSE-36 (36-bit page size extension) CLFSH (CLFLUSH instruction supported) DS (Debug store) ACPI (ACPI supported) MMX (MMX technology supported) FXSR (Fast floating-point save and restore) SSE (Streaming SIMD extensions) SSE2 (Streaming SIMD extensions 2) SS (Self-snoop) HTT (Hyper-threading technology) TM (Thermal monitor supported) PBE (Pending break enabled) Version: Intel(R) Xeon(R) CPU E5462 @ 2.80GHz Voltage: 1.1 V External Clock: 1600 MHz Max Speed: 3400 MHz Current Speed: 2800 MHz Status: Populated, Enabled Upgrade: ZIF Socket L1 Cache Handle: 0x0032 L2 Cache Handle: 0x0031 L3 Cache Handle: Not Provided Serial Number: Not Specified Asset Tag: Not Specified Part Number: Not Specified Core Count: 4 Core Enabled: 4 Thread Count: 4 Characteristics: 64-bit capable

Handle 0x0031, DMI type 7, 19 bytes Cache Information Socket Designation: L2-Cache Configuration: Enabled, Not Socketed, Level 2 Operational Mode: Write Back Location: Internal Installed Size: 12288 kB Maximum Size: 12288 kB Supported SRAM Types: Asynchronous Installed SRAM Type: Asynchronous Speed: Unknown Error Correction Type: Single-bit ECC System Type: Unified Associativity: 24-way Set-associative

Handle 0x0032, DMI type 7, 19 bytes Cache Information Socket Designation: L1-Cache Configuration: Enabled, Not Socketed, Level 1 Operational Mode: Write Back Location: Internal Installed Size: 128 kB Maximum Size: 128 kB Supported SRAM Types: Asynchronous Installed SRAM Type: Asynchronous Speed: Unknown Error Correction Type: Single-bit ECC System Type: Data Associativity: 8-way Set-associative

Handle 0x0033, DMI type 16, 15 bytes Physical Memory Array Location: System Board Or Motherboard Use: System Memory Error Correction Type: Multi-bit ECC Maximum Capacity: 128 GB Error Information Handle: Not Provided Number Of Devices: 16

Handle 0x0034, DMI type 19, 15 bytes Memory Array Mapped Address Starting Address: 0x00000000000 Ending Address: 0x003FFFFFFFF Range Size: 16 GB Physical Array Handle: 0x0033 Partition Width: 0

Handle 0x0035, DMI type 17, 27 bytes Memory Device Array Handle: 0x0033 Error Information Handle: Not Provided Total Width: 72 bits Data Width: 64 bits Size: 2048 MB Form Factor: FB-DIMM Set: 1 Locator: ONBOARD DIMM_A1 Bank Locator: Channel A Type: DDR2 FB-DIMM Type Detail: Synchronous Speed: 667 MHz Manufacturer: 80CE Serial Number: 5107EA63 Asset Tag: Not Specified Part Number: M395T5750EZ4-CE65

Handle 0x0036, DMI type 20, 19 bytes Memory Device Mapped Address Starting Address: 0x00000000000 Ending Address: 0x0007FFFFFFF Range Size: 2 GB Physical Device Handle: 0x0035 Memory Array Mapped Address Handle: 0x0034 Partition Row Position: 1 Interleave Position: 1 Interleaved Data Depth: 1

Handle 0x0037, DMI type 17, 27 bytes Memory Device Array Handle: 0x0033 Error Information Handle: Not Provided Total Width: 72 bits Data Width: 64 bits Size: 2048 MB Form Factor: FB-DIMM Set: 2 Locator: ONBOARD DIMM_A2 Bank Locator: Channel A Type: DDR2 FB-DIMM Type Detail: Synchronous Speed: 667 MHz Manufacturer: 80CE Serial Number: 5107EA88 Asset Tag: Not Specified Part Number: M395T5750EZ4-CE65

Handle 0x0038, DMI type 20, 19 bytes Memory Device Mapped Address Starting Address: 0x00080000000 Ending Address: 0x000FFFFFFFF Range Size: 2 GB Physical Device Handle: 0x0037 Memory Array Mapped Address Handle: 0x0034 Partition Row Position: 1 Interleave Position: 1 Interleaved Data Depth: 1

Handle 0x0039, DMI type 17, 27 bytes Memory Device Array Handle: 0x0033 Error Information Handle: Not Provided Total Width: Unknown Data Width: Unknown Size: No Module Installed Form Factor: FB-DIMM Set: 3 Locator: ONBOARD DIMM_A3 Bank Locator: Channel A Type: DDR2 FB-DIMM Type Detail: Synchronous Speed: Unknown Manufacturer: MemUndefined Serial Number: MemUndefined Asset Tag: Not Specified Part Number: MemUndefined

Handle 0x003A, DMI type 17, 27 bytes Memory Device Array Handle: 0x0033 Error Information Handle: Not Provided Total Width: Unknown Data Width: Unknown Size: No Module Installed Form Factor: FB-DIMM Set: 4 Locator: ONBOARD DIMM_A4 Bank Locator: Channel A Type: DDR2 FB-DIMM Type Detail: Synchronous Speed: Unknown Manufacturer: MemUndefined Serial Number: MemUndefined Asset Tag: Not Specified Part Number: MemUndefined

Handle 0x003B, DMI type 17, 27 bytes Memory Device Array Handle: 0x0033 Error Information Handle: Not Provided Total Width: 72 bits Data Width: 64 bits Size: 2048 MB Form Factor: FB-DIMM Set: 1 Locator: ONBOARD DIMM_B1 Bank Locator: Channel B Type: DDR2 FB-DIMM Type Detail: Synchronous Speed: 667 MHz Manufacturer: 80CE Serial Number: 5107EA2F Asset Tag: Not Specified Part Number: M395T5750EZ4-CE65

Handle 0x003C, DMI type 20, 19 bytes Memory Device Mapped Address Starting Address: 0x00100000000 Ending Address: 0x0017FFFFFFF Range Size: 2 GB Physical Device Handle: 0x003B Memory Array Mapped Address Handle: 0x0034 Partition Row Position: 2 Interleave Position: 2 Interleaved Data Depth: 1

Handle 0x003D, DMI type 17, 27 bytes Memory Device Array Handle: 0x0033 Error Information Handle: Not Provided Total Width: 72 bits Data Width: 64 bits Size: 2048 MB Form Factor: FB-DIMM Set: 2 Locator: ONBOARD DIMM_B2 Bank Locator: Channel B Type: DDR2 FB-DIMM Type Detail: Synchronous Speed: 667 MHz Manufacturer: 80CE Serial Number: 5107EAB5 Asset Tag: Not Specified Part Number: M395T5750EZ4-CE65

Handle 0x003E, DMI type 20, 19 bytes Memory Device Mapped Address Starting Address: 0x00180000000 Ending Address: 0x001FFFFFFFF Range Size: 2 GB Physical Device Handle: 0x003D Memory Array Mapped Address Handle: 0x0034 Partition Row Position: 2 Interleave Position: 2 Interleaved Data Depth: 1

Handle 0x003F, DMI type 17, 27 bytes Memory Device Array Handle: 0x0033 Error Information Handle: Not Provided Total Width: Unknown Data Width: Unknown Size: No Module Installed Form Factor: FB-DIMM Set: 3 Locator: ONBOARD DIMM_B3 Bank Locator: Channel B Type: DDR2 FB-DIMM Type Detail: Synchronous Speed: Unknown Manufacturer: MemUndefined Serial Number: MemUndefined Asset Tag: Not Specified Part Number: MemUndefined

Handle 0x0040, DMI type 17, 27 bytes Memory Device Array Handle: 0x0033 Error Information Handle: Not Provided Total Width: Unknown Data Width: Unknown Size: No Module Installed Form Factor: FB-DIMM Set: 4 Locator: ONBOARD DIMM_B4 Bank Locator: Channel B Type: DDR2 FB-DIMM Type Detail: Synchronous Speed: Unknown Manufacturer: MemUndefined Serial Number: MemUndefined Asset Tag: Not Specified Part Number: MemUndefined

Handle 0x0041, DMI type 17, 27 bytes Memory Device Array Handle: 0x0033 Error Information Handle: Not Provided Total Width: 72 bits Data Width: 64 bits Size: 2048 MB Form Factor: FB-DIMM Set: 5 Locator: ONBOARD DIMM_C1 Bank Locator: Channel C Type: DDR2 FB-DIMM Type Detail: Synchronous Speed: 667 MHz Manufacturer: 80CE Serial Number: 5107EA75 Asset Tag: Not Specified Part Number: M395T5750EZ4-CE65

Handle 0x0042, DMI type 20, 19 bytes Memory Device Mapped Address Starting Address: 0x00200000000 Ending Address: 0x0027FFFFFFF Range Size: 2 GB Physical Device Handle: 0x0041 Memory Array Mapped Address Handle: 0x0034 Partition Row Position: 1 Interleave Position: 1 Interleaved Data Depth: 1

Handle 0x0043, DMI type 17, 27 bytes Memory Device Array Handle: 0x0033 Error Information Handle: Not Provided Total Width: 72 bits Data Width: 64 bits Size: 2048 MB Form Factor: FB-DIMM Set: 6 Locator: ONBOARD DIMM_C2 Bank Locator: Channel C Type: DDR2 FB-DIMM Type Detail: Synchronous Speed: 667 MHz Manufacturer: 80CE Serial Number: 5107EA71 Asset Tag: Not Specified Part Number: M395T5750EZ4-CE65

Handle 0x0044, DMI type 20, 19 bytes Memory Device Mapped Address Starting Address: 0x00280000000 Ending Address: 0x002FFFFFFFF Range Size: 2 GB Physical Device Handle: 0x0043 Memory Array Mapped Address Handle: 0x0034 Partition Row Position: 1 Interleave Position: 1 Interleaved Data Depth: 1

Handle 0x0045, DMI type 17, 27 bytes Memory Device Array Handle: 0x0033 Error Information Handle: Not Provided Total Width: Unknown Data Width: Unknown Size: No Module Installed Form Factor: FB-DIMM Set: 7 Locator: ONBOARD DIMM_C3 Bank Locator: Channel C Type: DDR2 FB-DIMM Type Detail: Synchronous Speed: Unknown Manufacturer: MemUndefined Serial Number: MemUndefined Asset Tag: Not Specified Part Number: MemUndefined

Handle 0x0046, DMI type 17, 27 bytes Memory Device Array Handle: 0x0033 Error Information Handle: Not Provided Total Width: Unknown Data Width: Unknown Size: No Module Installed Form Factor: FB-DIMM Set: 8 Locator: ONBOARD DIMM_C4 Bank Locator: Channel C Type: DDR2 FB-DIMM Type Detail: Synchronous Speed: Unknown Manufacturer: MemUndefined Serial Number: MemUndefined Asset Tag: Not Specified Part Number: MemUndefined

Handle 0x0047, DMI type 17, 27 bytes Memory Device Array Handle: 0x0033 Error Information Handle: Not Provided Total Width: 72 bits Data Width: 64 bits Size: 2048 MB Form Factor: FB-DIMM Set: 5 Locator: ONBOARD DIMM_D1 Bank Locator: Channel D Type: DDR2 FB-DIMM Type Detail: Synchronous Speed: 667 MHz Manufacturer: 80CE Serial Number: 5107EA70 Asset Tag: Not Specified Part Number: M395T5750EZ4-CE65

Handle 0x0048, DMI type 20, 19 bytes Memory Device Mapped Address Starting Address: 0x00300000000 Ending Address: 0x0037FFFFFFF Range Size: 2 GB Physical Device Handle: 0x0047 Memory Array Mapped Address Handle: 0x0034 Partition Row Position: 2 Interleave Position: 2 Interleaved Data Depth: 1

Handle 0x0049, DMI type 17, 27 bytes Memory Device Array Handle: 0x0033 Error Information Handle: Not Provided Total Width: 72 bits Data Width: 64 bits Size: 2048 MB Form Factor: FB-DIMM Set: 6 Locator: ONBOARD DIMM_D2 Bank Locator: Channel D Type: DDR2 FB-DIMM Type Detail: Synchronous Speed: 667 MHz Manufacturer: 80CE Serial Number: 5107EA6D Asset Tag: Not Specified Part Number: M395T5750EZ4-CE65

Handle 0x004A, DMI type 20, 19 bytes Memory Device Mapped Address Starting Address: 0x00380000000 Ending Address: 0x003FFFFFFFF Range Size: 2 GB Physical Device Handle: 0x0049 Memory Array Mapped Address Handle: 0x0034 Partition Row Position: 2 Interleave Position: 2 Interleaved Data Depth: 1

Handle 0x004B, DMI type 17, 27 bytes Memory Device Array Handle: 0x0033 Error Information Handle: Not Provided Total Width: Unknown Data Width: Unknown Size: No Module Installed Form Factor: FB-DIMM Set: 7 Locator: ONBOARD DIMM_D3 Bank Locator: Channel D Type: DDR2 FB-DIMM Type Detail: Synchronous Speed: Unknown Manufacturer: MemUndefined Serial Number: MemUndefined Asset Tag: Not Specified Part Number: MemUndefined

Handle 0x004C, DMI type 17, 27 bytes Memory Device Array Handle: 0x0033 Error Information Handle: Not Provided Total Width: Unknown Data Width: Unknown Size: No Module Installed Form Factor: FB-DIMM Set: 8 Locator: ONBOARD DIMM_D4 Bank Locator: Channel D Type: DDR2 FB-DIMM Type Detail: Synchronous Speed: Unknown Manufacturer: MemUndefined Serial Number: MemUndefined Asset Tag: Not Specified Part Number: MemUndefined

Handle 0x004D, DMI type 9, 13 bytes System Slot Information Designation: Slot 1, PCI EXP x16 Type: x16 PCI Express Current Usage: In Use Length: Long ID: 1 Characteristics: 3.3 V is provided PME signal is supported SMBus signal is supported

Handle 0x004E, DMI type 9, 13 bytes System Slot Information Designation: I/O Module Type: x4 PCI Express Current Usage: Available Length: Short ID: 10 Characteristics: 3.3 V is provided PME signal is supported SMBus signal is supported

Handle 0x004F, DMI type 24, 5 bytes Hardware Security Power-On Password Status: Not Implemented Keyboard Password Status: Not Implemented Administrator Password Status: Disabled Front Panel Reset Status: Disabled

Handle 0xFEFF, DMI type 127, 4 bytes End Of Table

IPMI* Intelligent Platform Management Interface
IPMI is an industry standard for controlling a system at the hardware level
– depends on the presence of a BMC (baseboard management controller)
– defines local and remote interfaces
– sometimes implemented together with a remote management module providing a web interface

picture from: http://download.intel.com/design/servers/ipmi/IPMI_and_CIM_Spring2005_IDF.pdf

IPMI homepage: http://developer.intel.com/design/servers/ipmi/index.htm

IPMI* Features For Validation
• SEL: system event log
• Sensors: current information on system status
• accessible via: ipmitool
– ipmitool sensor         lists current sensor reading
– ipmitool sel list       lists system event log
– ipmitool sel info       information on system event log (free space)
– ipmitool power reset    hardware reset


The IPMI OS driver has to be loaded. Under Red Hat* Enterprise Linux, the driver can be started via:
/etc/init.d/ipmi start


Exemplary output:
[root]# ipmitool sensor list
BB 12V AUX | 11.904 | Volts | ok | na | 10.416 | 10.726 | 13.144 | 13.578 | na
BB 1.1V Vtt | 1.103 | Volts | ok | na | 1.002 | 1.033 | 1.184 | 1.216 | na
BB 1.5V AUX | 1.474 | Volts | ok | na | 1.334 | 1.373 | 1.622 | 1.669 | na
BB 1.5V ESB | 1.513 | Volts | ok | na | 1.357 | 1.404 | 1.591 | 1.638 | na
Proc 1 Vcc | 1.159 | Volts | ok | na | na | na | na | na | na
Proc 2 Vcc | 1.135 | Volts | ok | na | na | na | na | na | na
BB 3.3V | 3.354 | Volts | ok | na | 2.941 | 3.027 | 3.578 | 3.681 | na
BB 5V | 5.044 | Volts | ok | na | 4.446 | 4.576 | 5.408 | 5.564 | na
BB 1.25V_FXD | 1.235 | Volts | ok | na | 1.092 | 1.131 | 1.365 | 1.417 | na
BB 1.8V | 1.795 | Volts | ok | na | 1.622 | 1.673 | 1.907 | 1.969 | na
BB 1.5V FBD | 1.532 | Volts | ok | na | 1.316 | 1.354 | 1.626 | 1.673 | na
BB 0.9V | 0.898 | Volts | ok | na | 0.811 | 0.835 | 0.955 | 0.979 | na
BB 3.3V STB | 3.354 | Volts | ok | na | 2.958 | 3.044 | 3.578 | 3.681 | na
BB Temp | 37.000 | degrees C | ok | na | 5.000 | 10.000 | 61.000 | 66.000 | na
Front Panel Temp | 29.000 | degrees C | ok | na | 0.000 | 5.000 | 44.000 | 48.000 | na
MCH Therm Margin | -38.000 | degrees C | ok | na | na | na | 3.000 | 20.000 | na
Mem Therm Margin | na | degrees C | na | na | na | na | 6.000 | 10.000 | na
Fan 1A | 12412.000 | RPM | ok | na | 1044.000 | 2030.000 | na | na | na
Fan 2A | 12354.000 | RPM | ok | na | 1044.000 | 2030.000 | na | na | na
Fan 3A | 12702.000 | RPM | ok | na | 1044.000 | 2030.000 | na | na | na
Fan 4A | 12470.000 | RPM | ok | na | 1044.000 | 2030.000 | na | na | na
Fan 5A | 12296.000 | RPM | ok | na | 1044.000 | 2030.000 | na | na | na
Fan 1B | 15984.000 | RPM | ok | na | 2442.000 | 3404.000 | na | na | na
Fan 2B | 15910.000 | RPM | ok | na | 2442.000 | 3404.000 | na | na | na
Fan 3B | 15910.000 | RPM | ok | na | 2442.000 | 3404.000 | na | na | na
Fan 4B | 16280.000 | RPM | ok | na | 2442.000 | 3404.000 | na | na | na
Fan 5B | 16132.000 | RPM | ok | na | 2442.000 | 3404.000 | na | na | na
P1 Therm Margin | -62.000 | degrees C | ok | na | na | na | na | na | na
P2 Therm Margin | -69.000 | degrees C | ok | na | na | na | na | na | na
P1 Therm Ctrl % | 0.000 | unspecified | ok | na | na | na | na | 49.530 | na
P2 Therm Ctrl % | 0.000 | unspecified | ok | na | na | na | na | 49.530 | na
HSBP Temp | 29.000 | degrees C | ok | na | 0.000 | 5.000 | 50.000 | 54.000 | na
Power Unit Stat | 0x0 | discrete | 0x0000| na | na | na | na | na | na
Watchdog | 0x0 | discrete | 0x0000| na | na | na | na | na | na
Platform Secu V | 0x0 | discrete | 0x0000| na | na | na | na | na | na
Physical Scrty | 0x0 | discrete | 0x0000| na | na | na | na | na | na
FP Interrupt | 0x0 | discrete | 0x0000| na | na | na | na | na | na
System Event Log | 0x0 | discrete | 0x0000| na | na | na | na | na | na
Session Audit | 0x0 | discrete | 0x0000| na | na | na | na | na | na
System Event | 0x0 | discrete | 0x0000| na | na | na | na | na | na
BB Vbat | 0x0 | discrete | 0x0000| na | na | na | na | na | na
ACPI State | 0x0 | discrete | 0x0100| na | na | na | na | na | na
Button | 0x0 | discrete | 0x0000| na | na | na | na | na | na
SMI Timeout | 0x0 | discrete | 0x0000| na | na | na | na | na | na
NMI Signal State | 0x0 | discrete | 0x0000| na | na | na | na | na | na
SMI Signal State | na | discrete | na | na | na | na | na | na | na
Proc 1 Status | 0x0 | discrete | 0x8000| na | na | na | na | na | na
Proc 2 Status | 0x0 | discrete | 0x8000| na | na | na | na | na | na
PCIe Link0 | 0x0 | discrete | 0x0000| na | na | na | na | na | na
PCIe Link1 | 0x0 | discrete | 0x0000| na | na | na | na | na | na
PCIe Link2 | 0x0 | discrete | 0x0000| na | na | na | na | na | na
PCIe Link3 | 0x0 | discrete | 0x0000| na | na | na | na | na | na
PCIe Link4 | 0x0 | discrete | 0x0000| na | na | na | na | na | na
PCIe Link5 | 0x0 | discrete | 0x0000| na | na | na | na | na | na
PCIe Link6 | 0x0 | discrete | 0x0000| na | na | na | na | na | na
PCIe Link7 | 0x0 | discrete | 0x0000| na | na | na | na | na | na
PCIe Link8 | 0x0 | discrete | 0x0000| na | na | na | na | na | na
PCIe Link9 | 0x0 | discrete | 0x0000| na | na | na | na | na | na
PCIe Link10 | 0x0 | discrete | 0x0000| na | na | na | na | na | na
PCIe Link11 | 0x0 | discrete | 0x0000| na | na | na | na | na | na
PCIe Link12 | 0x0 | discrete | 0x0000| na | na | na | na | na | na
PCIe Link13 | 0x0 | discrete | 0x0000| na | na | na | na | na | na
Proc 1 VRD Hot | 0x0 | discrete | 0x0000| na | na | na | na | na | na
Proc 2 VRD Hot | 0x0 | discrete | 0x0000| na | na | na | na | na | na
Proc 1 Vcc OOR | 0x0 | discrete | 0x0000| na | na | na | na | na | na
Proc 2 Vcc OOR | 0x0 | discrete | 0x0000| na | na | na | na | na | na
CPU Popul Error | 0x0 | discrete | 0x0000| na | na | na | na | na | na
DIMM A1 | 0x0 | discrete | 0x0400| na | na | na | na | na | na
DIMM A2 | 0x0 | discrete | 0x0400| na | na | na | na | na | na
DIMM A3 | 0x0 | discrete | 0x0000| na | na | na | na | na | na
DIMM A4 | 0x0 | discrete | 0x0000| na | na | na | na | na | na
DIMM B1 | 0x0 | discrete | 0x0400| na | na | na | na | na | na
DIMM B2 | 0x0 | discrete | 0x0400| na | na | na | na | na | na
DIMM B3 | 0x0 | discrete | 0x0000| na | na | na | na | na | na
DIMM B4 | 0x0 | discrete | 0x0000| na | na | na | na | na | na
Memory Error A | 0x0 | discrete | 0x0000| na | na | na | na | na | na
Memory Error B | 0x0 | discrete | 0x0000| na | na | na | na | na | na
Memory Error C | 0x0 | discrete | 0x0000| na | na | na | na | na | na
Memory Error D | 0x0 | discrete | 0x0000| na | na | na | na | na | na
B0 DIMM Spar En | na | discrete | na | na | na | na | na | na | na
B0 DIMM Spar Red | na | discrete | na | na | na | na | na | na | na
B1 DIMM Spar En | na | discrete | na | na | na | na | na | na | na
B1 DIMM Spar Red | na | discrete | na | na | na | na | na | na | na
DIMM C1 | 0x0 | discrete | 0x0400| na | na | na | na | na | na
DIMM C2 | 0x0 | discrete | 0x0400| na | na | na | na | na | na
DIMM C3 | 0x0 | discrete | 0x0000| na | na | na | na | na | na
DIMM C4 | 0x0 | discrete | 0x0000| na | na | na | na | na | na
DIMM D1 | 0x0 | discrete | 0x0400| na | na | na | na | na | na
DIMM D2 | 0x0 | discrete | 0x0400| na | na | na | na | na | na
DIMM D3 | 0x0 | discrete | 0x0000| na | na | na | na | na | na
DIMM D4 | na | discrete | na | na | na | na | na | na | na
Drv 1 Stat | 0x0 | discrete | 0x008e| na | na | na | na | na | na
Drv 2 Stat | 0x0 | discrete | 0x008e| na | na | na | na | na | na
Drv 3 Stat | 0x0 | discrete | 0x008e| na | na | na | na | na | na
Drv 1 Pres | 0x0 | discrete | 0x0280| na | na | na | na | na | na
Drv 2 Pres | 0x0 | discrete | 0x0280| na | na | na | na | na | na
Drv 3 Pres | 0x0 | discrete | 0x0280| na | na | na | na | na | na
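For validation, the sensor listing is mostly useful when it is checked automatically on every node. The following is a minimal Python sketch of such a check, not part of the original tutorial. It assumes the pipe-separated layout shown above, with the sensor name, reading, unit and status in the first four columns. The function names are hypothetical, and the "Fan 2A" row with status "cr" in the sample is made up to illustrate a failing sensor. Discrete sensors are skipped because their status column holds a hex state whose meaning is vendor specific.

```python
import subprocess

# Rows copied from the listing above, plus one MADE-UP failing row
# ("Fan 2A" with status "cr") to show what a problem would look like.
SENSOR_SAMPLE = """\
BB 12V AUX | 11.904 | Volts | ok | na | 10.416 | 10.726 | 13.144 | 13.578 | na
Mem Therm Margin | na | degrees C | na | na | na | na | 6.000 | 10.000 | na
Fan 1A | 12412.000 | RPM | ok | na | 1044.000 | 2030.000 | na | na | na
Fan 2A | 900.000 | RPM | cr | na | 1044.000 | 2030.000 | na | na | na
Power Unit Stat | 0x0 | discrete | 0x0000| na | na | na | na | na | na
"""


def parse_sensors(text):
    """Turn pipe-separated sensor lines into a list of dicts."""
    rows = []
    for line in text.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) < 4:
            continue  # not a sensor row
        rows.append({
            "name": fields[0],
            "value": fields[1],
            "unit": fields[2],
            "status": fields[3],
        })
    return rows


def suspicious_sensors(rows):
    """Names of analog sensors whose status is neither 'ok' nor 'na'."""
    return [
        row["name"]
        for row in rows
        if row["unit"] != "discrete" and row["status"] not in ("ok", "na")
    ]


def read_local_sensors():
    """Run ipmitool on the local node (needs root and a loaded IPMI driver)."""
    result = subprocess.run(
        ["ipmitool", "sensor"], capture_output=True, text=True, check=True
    )
    return result.stdout


if __name__ == "__main__":
    print(suspicious_sensors(parse_sensors(SENSOR_SAMPLE)))
```

On a real node, the sample text would be replaced by the result of read_local_sensors(), and the list of suspicious sensors could be collected per node and compared across the cluster.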

[root]# ipmitool sel info
SEL Information
Version : 1.5 (v1.5, v2 compliant)
Entries : 1921
Free Space : 27112 bytes
Percent Used : 53%
Last Add Time : 07/27/2010 21:28:43
Last Del Time : 12/03/2007 14:23:13
Overflow : false
Supported Cmds : 'Delete' 'Partial Add' 'Reserve' 'Get Alloc Info'
# of Alloc Units : 3276
Alloc Unit Size : 20
# Free Units : 1355
Largest Free Blk : 1355
Max Record Size : 5

[root]# ipmitool sel list 2>&1 | head -20
4 | 12/03/2007 | 14:23:13 | Event Logging Disabled #0x09 | Log area reset/cleared | Asserted
18 | 12/03/2007 | 14:23:13 | Processor #0x90 | Presence detected | Asserted
2c | 12/03/2007 | 14:23:14 | Processor #0x91 | Presence detected | Asserted
40 | 12/03/2007 | 14:23:34 | System Event #0x83 | Timestamp Clock Sync | Asserted
54 | 12/03/2007 | 14:23:34 | System Event #0x83 | Timestamp Clock Sync | Asserted
68 | 12/03/2007 | 14:24:20 | System Event #0x01 | OEM System boot event | Asserted
7c | 12/03/2007 | 14:25:48 | System ACPI Power State #0x82 | S0/G0: working | Asserted
90 | 12/03/2007 | 14:29:55 | Button #0x84 | Power Button pressed | Asserted
a4 | 12/03/2007 | 14:29:56 | System Event #0x83 | Timestamp Clock Sync | Asserted
b8 | 12/03/2007 | 14:29:56 | System Event #0x83 | Timestamp Clock Sync | Asserted
cc | 12/03/2007 | 14:29:56 | Power Unit #0x01 | Power off/down | Asserted
e0 | 12/03/2007 | 14:30:00 | Power Unit #0x01 | Power off/down | Deasserted
f4 | 12/03/2007 | 14:30:01 | Button #0x84 | Power Button pressed | Asserted
108 | 12/03/2007 | 14:30:16 | Drive Slot #0x09 | Device Present
11c | 12/03/2007 | 14:30:18 | System Event #0x83 | Timestamp Clock Sync | Asserted
130 | 12/03/2007 | 14:30:18 | System Event #0x83 | Timestamp Clock Sync | Asserted
144 | 12/03/2007 | 14:31:14 | System Event #0x01 | OEM System boot event | Asserted
158 | 12/03/2007 | 14:32:52 | System ACPI Power State #0x82 | S0/G0: working | Asserted
16c | 12/03/2007 | 14:58:35 | System Event #0x83 | Timestamp Clock Sync | Asserted
180 | 12/03/2007 | 14:58:35 | System Event #0x83 | Timestamp Clock Sync | Asserted

Note: Not all of this information is standardized. A vendor-specific tool might be necessary to retrieve the complete information, both for the SEL and for the sensors.

ipmitool Examples

[root]# ipmitool sensor
BB 12V AUX | 11.904 | Volts | ok | na | 10.416 | 10.726 | 13.144 | 13.578 | na
Front Panel Temp | 29.000 | degrees C | ok | na | 0.000 | 5.000 | 44.000 | 48.000 | na
MCH Therm Margin | -38.000 | degrees C | ok | na | na | na | 3.000 | 20.000 | na
Mem Therm Margin | na | degrees C | na | na | na | na | 6.000 | 10.000 | na
Fan 1A | 12412.000 | RPM | ok | na | 1044.000 | 2030.000 | na | na | na
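A sensor listing like this can be screened automatically. The following is only a minimal sketch: it assumes the pipe-separated layout of the sample rows, with the status in the fourth column, and the function name flag_bad_sensors is hypothetical. Discrete sensors, which report hex states such as 0x0000 in that column, are skipped.

```shell
# flag_bad_sensors: read "ipmitool sensor" output on stdin and print every
# analog sensor whose status column is neither "ok" nor "na".
# Assumption: pipe-separated rows with the status in the 4th field.
flag_bad_sensors() {
    awk -F'|' '
        NF >= 4 {
            name = $1; status = $4
            gsub(/^[ \t]+|[ \t]+$/, "", name)      # trim surrounding blanks
            gsub(/^[ \t]+|[ \t]+$/, "", status)
            # discrete sensors show hex states (0x....) here; skip them
            if (status != "ok" && status != "na" && status !~ /^0x/)
                print name ": " status
        }'
}
```

Possible use on a single node: ipmitool sensor | flag_bad_sensors. No output means that no analog sensor reported a problem.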

Vendor Tools to access BIOS: syscfg
• Many vendors provide tools to read and modify BIOS settings from the command line
• For Intel servers: syscfg

Example to dump complete BIOS settings:

[root]# cd /usr/local/syscfg
[root]# ./syscfg /s ini /b
[root]# less syscfg.INI


[root]# cat /usr/local/syscfg/syscfg.INI
; Warning!!! Warning!!! Warning!!!
; ------
; This file has been generated in a system with the BIOS/Firmware
; specifications as mentioned under [SYSTEM] section. Please do not
; modify or edit any information in this section. Attempt to restore
; these information in incompatible systems could cause serious
; problems to the sytems and could lead the system non-functional.
; Note: The file is best seen using wordpad.

[SYSTEM]
BIOSVersion=S5500.86B.01.00.0050.050620101605 ; This field should not be edited
FWBootVersion=16 ; This field should not be edited
FWOpcodeVersion=53 ; This field should not be edited
PIAVersion=53 ; This field should not be edited

[BIOS]

[BIOS::ADVANCED]

[BIOS::ADVANCED::MEMORY CONFIGURATION]

[BIOS::ADVANCED::MEMORY CONFIGURATION::MEMORY RAS AND PERFORMANCE CONFIGURATION]
Select Memory RAS Configuration=0 ;Options: 3=Sparing: 0=Maximum Performance
NUMA Optimized=1 ;Options: 1=Enabled: 0=Disabled

[BIOS::ADVANCED::MASS STORAGE CONTROLLER CONFIGURATION]
Intel(R) SAS RAID Module=1 ;Options: 1=Enabled: 0=Disabled
Configure Intel(R) SAS RAID Module=0 ;Options: 1=Intel(R) ESRTII: 0=IT/IR RAID
Onboard SATA Controller=1 ;Options: 1=Enabled: 0=Disabled
SATA Mode=1 ;Options: 2=SW RAID: 1=AHCI: 3=COMPATIBILITY: 0=ENHANCED
AHCI Option ROM=1 ;Options: 1=Enabled: 0=Disabled

[BIOS::ADVANCED::SERIAL PORT CONFIGURATION]

[BIOS::ADVANCED::SERIAL PORT CONFIGURATION::SERIAL A ENABLE]
Serial A Enable=1 ;Options: 1=Enabled: 0=Disabled
Address=1016 ;Options: 744=2E8: 1000=3E8: 760=2F8: 1016=3F8
IRQ=4 ;Options: 4=4: 3=3

[BIOS::ADVANCED::SERIAL PORT CONFIGURATION::SERIAL B ENABLE]
Serial B Enable=1 ;Options: 1=Enabled: 0=Disabled
Address=760 ;Options: 744=2E8: 1000=3E8: 760=2F8: 1016=3F8
IRQ=3 ;Options: 4=4: 3=3

[BIOS::ADVANCED::USB CONFIGURATION]
USB Controller=1 ;Options: 1=Enabled: 0=Disabled
Legacy USB Support=0 ;Options: 2=Auto: 1=Disabled: 0=Enabled
Port 60/64 Emulation=1 ;Options: 1=Enabled: 0=Disabled
Make USB Devices Non-Bootable=0 ;Options: 1=Enabled: 0=Disabled
Device Reset Timeout=1 ;Options: 3=40 seconds: 2=30 seconds: 1=20 seconds: 0=10 seconds
USB 2.0 Controller=1 ;Options: 1=Enabled: 0=Disabled

[BIOS::ADVANCED::PCI CONFIGURATION]
Maximize Memory below 4GB=0 ;Options: 1=Enabled: 0=Disabled
Memory Mapped I/O above 4GB=0 ;Options: 1=Enabled: 0=Disabled
Onboard Video=0 ;Options: 1=Disabled: 0=Enabled
Dual Monitor Video=0 ;Options: 1=Enabled: 0=Disabled
Onboard NIC1 ROM=1 ;Options: 1=Enabled: 0=Disabled
Onboard NIC2 ROM=1 ;Options: 1=Enabled: 0=Disabled

[BIOS::ADVANCED::SYSTEM ACOUSTICS AND PERFORMANCE CONFIGURATION]
Set Throttling Mode=2 ;Options: 2=CLTT: 1=OLTT: 0=Auto
Altitude=900 ;Options: 3000=Higher than 1500m: 1500=901m - 1500m: 900=301m - 900m: 300=300m or less
Fan PWM Offset=0 ;Options: N/A

[BIOS::MEMORY CONFIGURATION]

[BIOS::DIMM DISABLE]

[BIOS::THERMAL THROTTLING]

[BIOS::MEMORY MAP]

[BIOS::TYLERSBURG]

[BIOS::TYLERSBURG IOH 0]

[BIOS::TYLERSBURG CONFIGURATION]

[BIOS::INTEL(R) VT FOR DIRECTED I/O (VT-D)]

[BIOS::IOH DEVICE AND FUNCTION HIDE OPTIONS]

[BIOS::PCI EXPRESS PORT 0]
PCIe Port VPP=0 ;Options: 1=Enable: 0=Disable
VPP SMBUS Address=1 ;Options: 7=7: 6=6: 5=5: 4=4: 3=3: 2=2: 1=1: 0=0

[BIOS::PCI EXPRESS PORT 1]
Hot Plug Capable=0 ;Options: 1=Enable: 0=Disable
PCIe Port VPP=0 ;Options: 1=Enable: 0=Disable
VPP SMBUS Address=2 ;Options: 7=7: 6=6: 5=5: 4=4: 3=3: 2=2: 1=1: 0=0

[BIOS::PCI EXPRESS PORT 2]
Hot Plug Capable=0 ;Options: 1=Enable: 0=Disable
PCIe Port VPP=0 ;Options: 1=Enable: 0=Disable
VPP SMBUS Address=3 ;Options: 7=7: 6=6: 5=5: 4=4: 3=3: 2=2: 1=1: 0=0

[BIOS::PCI EXPRESS PORT 3]
Hot Plug Capable=0 ;Options: 1=Enable: 0=Disable
PCIe Port VPP=0 ;Options: 1=Enable: 0=Disable
VPP SMBUS Address=4 ;Options: 7=7: 6=6: 5=5: 4=4: 3=3: 2=2: 1=1: 0=0

[BIOS::PCI EXPRESS PORT 4]
Hot Plug Capable=0 ;Options: 1=Enable: 0=Disable
PCIe Port VPP=0 ;Options: 1=Enable: 0=Disable
VPP SMBUS Address=5 ;Options: 7=7: 6=6: 5=5: 4=4: 3=3: 2=2: 1=1: 0=0

[BIOS::PCI EXPRESS PORT 5]
Hot Plug Capable=0 ;Options: 1=Enable: 0=Disable
PCIe Port VPP=0 ;Options: 1=Enable: 0=Disable
VPP SMBUS Address=1 ;Options: 7=7: 6=6: 5=5: 4=4: 3=3: 2=2: 1=1: 0=0

[BIOS::PCI EXPRESS PORT 6]
Hot Plug Capable=0 ;Options: 1=Enable: 0=Disable
PCIe Port VPP=0 ;Options: 1=Enable: 0=Disable
VPP SMBUS Address=2 ;Options: 7=7: 6=6: 5=5: 4=4: 3=3: 2=2: 1=1: 0=0

[BIOS::PCI EXPRESS PORT 7]
Hot Plug Capable=0 ;Options: 1=Enable: 0=Disable
PCIe Port VPP=0 ;Options: 1=Enable: 0=Disable
VPP SMBUS Address=3 ;Options: 7=7: 6=6: 5=5: 4=4: 3=3: 2=2: 1=1: 0=0

[BIOS::PCI EXPRESS PORT 8]
Hot Plug Capable=0 ;Options: 1=Enable: 0=Disable
PCIe Port VPP=0 ;Options: 1=Enable: 0=Disable
VPP SMBUS Address=4 ;Options: 7=7: 6=6: 5=5: 4=4: 3=3: 2=2: 1=1: 0=0

[BIOS::PCI EXPRESS PORT 9]
Hot Plug Capable=0 ;Options: 1=Enable: 0=Disable
PCIe Port VPP=0 ;Options: 1=Enable: 0=Disable
VPP SMBUS Address=5 ;Options: 7=7: 6=6: 5=5: 4=4: 3=3: 2=2: 1=1: 0=0

[BIOS::PCI EXPRESS PORT 10]
Hot Plug Capable=0 ;Options: 1=Enable: 0=Disable
PCIe Port VPP=0 ;Options: 1=Enable: 0=Disable
VPP SMBUS Address=6 ;Options: 7=7: 6=6: 5=5: 4=4: 3=3: 2=2: 1=1: 0=0

[BIOS::ICH9/ICH10 CONFIGURATION]

[BIOS::ICH PCIE CONFIGURATION]

[BIOS::ICH MISC DEVICES CONFIGURATION]
System State After Power Failure=1 ;Options: 1=On: 0=Off

[BIOS::ICH SATA CONFIGURATION]

[BIOS::ICH USB CONFIGURATION]

[BIOS::PROCESSOR CONFIGURATION]
Intel(R) QPI Frequency Select=0 ;Options: 32=Auto Strap: 3=6.4 GT/s: 2=5.866 GT/s: 1=4.8 GT/s: 0=Auto Max
Intel(R) Turbo Boost Technology=1 ;Options: 1=Enabled: 0=Disabled
Enhanced Intel SpeedStep(R) Tech=1 ;Options: 1=Enabled: 0=Disabled
Turbo Boost Performance/Watt Mode=0 ;Options: 1=Power Optimized: 0=Traditional
Processor C3=0 ;Options: 2=ACPI C3: 1=ACPI C2: 0=Disabled
Processor C6=0 ;Options: 1=Enabled: 0=Disabled
Intel(R) Hyper-Threading Tech=0 ;Options: 0=Enabled: 1=Disabled
Core Multi-Processing=0 ;Options: 5=5: 4=4: 3=3: 2=2: 1=1: 0=All
Execute Disable Bit=1 ;Options: 1=Enabled: 0=Disabled
Intel(R) Virtualization Technology=1 ;Options: 1=Enabled: 0=Disabled
Intel(R) VT for Directed I/O=0 ;Options: 1=Enabled: 0=Disabled
Hardware Prefetcher=0 ;Options: 0=Enabled: 1=Disabled
Adjacent Cache Line Prefetch=0 ;Options: 0=Enabled: 1=Disabled
Direct Cache Access (DCA)=1 ;Options: 1=Enabled: 0=Disabled

[BIOS::MAIN]
Quiet Boot=0 ;Options: 1=Enabled: 0=Disabled
POST Error Pause=0 ;Options: 1=Enabled: 0=Disabled

[BIOS::SECURITY]
Front Panel Lockout=0 ;Options: 1=Enabled: 0=Disabled

[BIOS::SERVER MANAGEMENT]
Assert NMI on SERR=1 ;Options: 1=Enabled: 0=Disabled
Assert NMI on PERR=1 ;Options: 1=Enabled: 0=Disabled
Resume on AC Power Loss=0 ;Options: 2=Reset: 1=Last state: 0=Stay Off
Clear System Event Log=0 ;Options: 1=Enabled: 0=Disabled
FRB-2 Enable=1 ;Options: 1=Enabled: 0=Disabled
OS Boot Watchdog Timer=0 ;Options: 1=Enabled: 0=Disabled
Plug & Play BMC Detection=0 ;Options: 1=Enabled: 0=Disabled
ACPI 1.0 Support=0 ;Options: 1=Enabled: 0=Disabled

[BIOS::SERVER MANAGEMENT::CONSOLE REDIRECTION]
Console Redirection=0 ;Options: 2=Serial Port B: 1=Serial Port A: 0=Disabled

[BIOS::SERVER MANAGEMENT::BMC LAN CONFIGURATION]
IP source=0 ;Options: 2=Dynamic: 1=Static
IP source=0 ;Options: 2=Dynamic: 1=Static
User ID=0 ;Options: 5=User5: 4=User4: 3=User3: 2=root: 1=anonymous

[BIOS::SYSTEM BOOTORDER]
1=IBA GE Slot 0100 v1327
2=#0600 ID01 LUN0 SEAGATE ST3400
3=Internal EFI Shell
4=KVM vmDisk-CD 0.01
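For validation, such a dump is most useful when it can be compared between nodes. The sketch below is an assumption about how that could be done, not part of syscfg: it strips the ";Options: ..." comments and blank lines, so that only section headers and "name=value" lines remain. The function name bios_settings is hypothetical.

```shell
# bios_settings: read a syscfg INI dump on stdin and print only the
# section headers and "name=value" lines, without comments or blank lines.
bios_settings() {
    awk '{
        sub(/[ \t]*;.*$/, "")          # drop comments such as ";Options: ..."
        if ($0 !~ /^[ \t]*$/) print    # skip lines that are now empty
    }'
}
```

Running bios_settings < syscfg.INI | sum on every node would then give one checksum per node, which can be compared in the same way as the file checksums in the other examples. A different BIOS version also changes the checksum, because the [SYSTEM] section is kept.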

Disks and mounts
• /proc/partitions: lists all drives and partitions recognized by the kernel
• /proc/mounts: information on all mounted file systems; similar to the output of the mount command and to /etc/mtab, but considered more accurate
• Convenient for checking network-mounted file systems

[user]$ pdsh -w et[60-64,69-73] cat /proc/mounts | grep " /home " | sort | dshbak -c
----------------
et[60-64,69-73]
----------------
36.101.255.10:/volatile3 /home nfs rw,vers=3,rsize=32768,wsize=32768,namlen=255,hard,nointr,nolock,proto=udp,timeo=20,retrans=3,sec=sys,addr=36.101.255.10 0 0

S.M.A.R.T.* capable Disk Drives
• Current disk drives monitor their own health and expose it through the S.M.A.R.T. interface
• Smartmontools* are used to access the data
• Note: Google found certain S.M.A.R.T.* attributes to be more conclusive than the overall health status

[root]# pdsh -w et[60-64,69-73] -u 3 /usr/sbin/smartctl -a /dev/sda | grep -i Health | dshbak -c
----------------
et[60-64,69-73]
----------------
SMART overall-health self-assessment test result: PASSED
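Since individual attributes can say more than the overall result, it may be worth extracting a few of them as well. The sketch below assumes the attribute table that smartctl -A prints for ATA/SATA drives, with the attribute name in the second and the raw value in the tenth column. The three attribute names are an assumption about what is worth watching, not a list taken from the Google study, and the function name smart_raw_values is hypothetical.

```shell
# smart_raw_values: read "smartctl -A" output on stdin and print
# "name=raw_value" for a few selected attributes.
# Assumption: ATA attribute table, name in column 2, raw value in column 10.
smart_raw_values() {
    awk '
        $2 == "Reallocated_Sector_Ct"  ||
        $2 == "Current_Pending_Sector" ||
        $2 == "Offline_Uncorrectable"  { print $2 "=" $10 }'
}
```

Possible use on a node: /usr/sbin/smartctl -A /dev/sda | smart_raw_values. Non-zero raw values are a reason to look at the drive more closely. Drives that do not report ATA attributes produce no output.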

http://en.wikipedia.org/wiki/S.M.A.R.T.
Google findings: http://labs.google.com/papers/disk_failures.pdf

Debug output from Kernel - dmesg
• The kernel already prints host information during boot
• dmesg can be used to dump the current kernel message buffer
• /var/log/boot contains the boot information
• Run-time information is stored as configured for syslog

[user]$ cat /etc/syslog.conf
# Log all kernel messages to the console.
# Logging much else clutters up the screen.
kern.*                                          /dev/console

# Log anything (except mail) of level info or higher.
# Don't log private authentication messages!
*.info;mail.none;authpriv.none;cron.none        /var/log/messages

Interconnect Hardware
• ethtool: information on Ethernet ports; the important values are link speed and duplex mode
• ibstat: information on the InfiniBand adapter; apart from state and rate it also shows the adapter type and both hardware and firmware version

[root]# ethtool eth0 | grep -E 'Speed|Duplex'
Speed: 1000Mb/s
Duplex: Full

[user]$ ibstat | grep -E 'CA type|version|Physical state|Rate'
CA type: MT26428
Firmware version: 2.7.0
Hardware version: a0
Physical state: LinkUp
Rate: 40
Physical state: Polling
Rate: 10
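The ibstat check can be turned into a pass/fail test. The sketch below assumes that each "Physical state:" line is followed by the "Rate:" line of the same port, as in the sample output. It succeeds only if at least one port is in state LinkUp and every such port reports the expected rate; ports in other states, like the one in state Polling in the sample, are ignored. The function name check_ib_rate is hypothetical.

```shell
# check_ib_rate: read ibstat output on stdin; exit status 0 if at least
# one port is LinkUp and all LinkUp ports report the expected rate ($1).
check_ib_rate() {
    awk -v want="$1" '
        /Physical state:/ { up = ($NF == "LinkUp") }
        /Rate:/ && up     { seen++; if ($NF != want) bad++ }
        END               { exit (seen > 0 && bad == 0) ? 0 : 1 }'
}
```

Possible use: ibstat | check_ib_rate 40 || echo "IB link problem". The value 40 corresponds to the rate shown for the active port in the sample output.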

[root]# ethtool eth0
Settings for eth0:
        Supported ports: [ TP ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Supports auto-negotiation: Yes
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Advertised auto-negotiation: Yes
        Speed: 1000Mb/s
        Duplex: Full
        Port: Twisted Pair
        PHYAD: 1
        Transceiver: internal
        Auto-negotiation: on
        Supports Wake-on: pumbag
        Wake-on: g
        Current message level: 0x00000001 (1)
        Link detected: yes

Interconnect Configuration
• ifconfig: basic network configuration; if it is incorrect, ssh will usually not work; MTU and counters are usable for validation
• netstat -rn: routing information; in most cases it should be consistent across the cluster

[user]$ pdsh -w et[60-64,69-73] "/sbin/ifconfig | awk '/error/{print \$3}'" | dshbak -c
----------------
et[60-64,69-73]
----------------
errors:0
…
errors:0

[user]$ pdsh -w et[60-64,69-73] netstat -rn | dshbak -c
----------------
et[60-64,69-73]
----------------
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
36.102.0.0      0.0.0.0         255.255.0.0     U         0 0          0 ib0
36.101.0.0      0.0.0.0         255.255.0.0     U         0 0          0 eth0
0.0.0.0         36.101.181.109  0.0.0.0         UG        0 0          0 eth0

[user]$ /sbin/ifconfig
eth0      Link encap:Ethernet  HWaddr 00:0E:0C:E3:06:3E
          inet addr:36.101.203.4  Bcast:36.101.255.255  Mask:255.255.0.0
          inet6 addr: fe80::20e:cff:fee3:63e/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:9000  Metric:1
          RX packets:16632647 errors:0 dropped:0 overruns:0 frame:0
          TX packets:9276607 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:34571169969 (32.1 GiB)  TX bytes:27344483841 (25.4 GiB)
          Memory:99020000-99040000

ib0       Link encap:InfiniBand  HWaddr 80:00:00:48:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
          inet addr:36.102.203.4  Bcast:36.102.255.255  Mask:255.255.0.0
          inet6 addr: fe80::202:c903:2:8bb1/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:9000  Metric:1
          RX packets:2229365 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2856 errors:0 dropped:5 overruns:0 carrier:0
          collisions:0 txqueuelen:3000
          RX bytes:124844440 (119.0 MiB)  TX bytes:171368 (167.3 KiB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:263598 errors:0 dropped:0 overruns:0 frame:0
          TX packets:263598 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:97560817 (93.0 MiB)  TX bytes:97560817 (93.0 MiB)
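The counters in this output can be checked for more than just "errors". The sketch below assumes the classic ifconfig layout shown above ("RX packets:... errors:0 dropped:0 ...") and prints every non-zero counter together with interface and direction. The function name nonzero_counters is hypothetical.

```shell
# nonzero_counters: read classic ifconfig output on stdin and print
# "interface direction counter:value" for every non-zero counter on the
# "RX packets" / "TX packets" lines (errors, dropped, overruns, ...).
nonzero_counters() {
    awk '
        /Link encap:/ { iface = $1 }            # remember the current interface
        /(RX|TX) packets:/ {
            for (i = 3; i <= NF; i++) {         # fields after "packets:N"
                split($i, kv, ":")
                if (kv[2] + 0 > 0) print iface, $1, $i
            }
        }'
}
```

Applied to the output above, /sbin/ifconfig | nonzero_counters would report the five dropped TX packets on ib0 and nothing else.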

[user]$ netstat -rn
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
36.102.0.0      0.0.0.0         255.255.0.0     U         0 0          0 ib0
36.101.0.0      0.0.0.0         255.255.0.0     U         0 0          0 eth0
0.0.0.0         36.101.181.109  0.0.0.0         UG        0 0          0 eth0

InfiniBand speed
• OFED provides a number of performance commands: ib_read_bw, ib_read_lat, ib_write_bw, ib_write_lat…
• Start the command on one system without arguments (“server” mode), then start it on a second system with the name of the “server” as parameter
• If both sides run on the same system, the test measures the PCI speed

[user]$ ib_read_bw
------------------------------------------------------------------
                    RDMA_Read BW Test
Connection type : RC
  local address:  LID 0x54, QPN 0x34005c, PSN 0xeaa6ef RKey 0x8000252c VAddr 0x002b35ae308000
  remote address: LID 0x54, QPN 0x34005d, PSN 0x6c5676, RKey 0x8000262c VAddr 0x002adf8c741000
Mtu : 2048

[user]$ ib_read_bw localhost
------------------------------------------------------------------
                    RDMA_Read BW Test
Connection type : RC
  local address:  LID 0x54, QPN 0x34005d, PSN 0x6c5676 RKey 0x8000262c VAddr 0x002adf8c741000
  remote address: LID 0x54, QPN 0x34005c, PSN 0xeaa6ef, RKey 0x8000252c VAddr 0x002b35ae308000
Mtu : 2048
------------------------------------------------------------------
 #bytes #iterations    BW peak[MB/sec]    BW average[MB/sec]
  65536        1000            3122.17               3122.17
------------------------------------------------------------------
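For validation, the measured bandwidth has to be compared against an expected value. The sketch below assumes the result row format shown above (bytes, iterations, peak, average) and succeeds if the average reaches a given minimum. The function name bw_ok is hypothetical, and any threshold has to come from measurements on known good nodes.

```shell
# bw_ok: read ib_read_bw output on stdin; exit status 0 if the average
# bandwidth of the last result row is at least $1 (in MB/sec).
bw_ok() {
    awk -v min="$1" '
        # result rows have four columns: bytes, iterations, peak, average
        NF == 4 && $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ { avg = $4; found = 1 }
        END { exit (found && avg + 0 >= min + 0) ? 0 : 1 }'
}
```

Possible use on the client side: ib_read_bw localhost | bw_ok 3000 || echo "bandwidth too low". The value 3000 is only an example chosen to fit the sample result above.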


Problems with Files
• Existence: test -e FILE; ls DIR
• Consistency: sum FILE
• Content: cat FILE

[user]$ pdsh -w et[60-64,69-73] 'test -e /usr/kerberos/bin/rsh;echo $?' | dshbak -c
----------------
et[60-64,69-73]
----------------
0

[user]$ pdsh -w et[60-64,69-73] sum /etc/security/limits.conf | dshbak -c
----------------
et[60-64,69-72]
----------------
52315 2
----------------
et73
----------------
49231 2

[user]$ pdsh -w et[60-64,69-73] cat /etc/ntp/step-tickers | dshbak -c
----------------
et[60-64,69-73]
----------------
36.101.201.1
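When more than one file has to be watched, the existence and consistency checks can be combined. The sketch below prints one line per file, either the output of sum or the word MISSING; the function name file_sums is hypothetical. It assumes that sum prints only checksum and block count when it reads from standard input.

```shell
# file_sums: for every file given as argument print "checksum blocks path",
# or "MISSING path" if the file does not exist.
file_sums() {
    for f in "$@"; do
        if [ -e "$f" ]; then
            printf '%s %s\n' "$(sum < "$f")" "$f"
        else
            printf 'MISSING %s\n' "$f"
        fi
    done
}
```

Piping the result through sum again, as in file_sums /etc/hosts /etc/fstab | sum, gives a single fingerprint per node. To use this with pdsh, the function would have to be available on every node, for example as a small script on a shared file system.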

Typical files to watch:

/etc/passwd
/etc/group
/etc/shadow
/etc/hosts
/etc/fstab
/etc/rc.d/rc.local
/etc/infiniband/openib.conf
/etc/ntp.conf
/etc/ntp/step-tickers
/etc/modprobe.conf
/etc/nsswitch.conf
/etc/ld.so.conf/*
/etc/profile
/etc/profile.d/*
local config files for batch queuing systems, e.g. /etc/pbs.conf and /var/spool/PBS/mom_priv/{config,prologue,epilogue}
local spool directories, e.g. /var/spool/PBS/mom_priv/jobs

Image consistency
• Modern Linux installations use rpm or a similar package manager
• “rpm -q -a” lists all installed packages
• This checks for CONSISTENCY between nodes, not whether the image is correct
• Detailed checks soon become very time consuming

[user]$ pdsh -w et[60-64,69-73] 'rpm -q -a | sort | sum' | dshbak -c
----------------
et[60-64,69-73]
----------------
50276 1
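If the checksums differ, the next question is which packages are different. A minimal sketch, assuming the sorted package lists of two nodes have been saved to files first; the function name pkg_diff and the file names in the usage note are hypothetical.

```shell
# pkg_diff: compare two sorted package lists (one package per line).
# Lines marked "<" exist only in the first list, ">" only in the second.
pkg_diff() {
    comm -3 "$1" "$2" |
        awk -F'\t' '{ if ($1 != "") print "< " $1; else print "> " $2 }'
}
```

Possible use: ssh et60 rpm -q -a | sort > /tmp/rpm.et60, the same for et73, then pkg_diff /tmp/rpm.et60 /tmp/rpm.et73. Both lists should be sorted on the same machine, because comm expects the sort order of the current locale.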

Detailed checks could be done with tools like “tripwire”, keeping the database on a network-mounted directory. Some care has to be taken when checking directories with volatile content.

Consistency on File Level - find
• Use “find” on a known good node to iterate over a directory and create a checksum for every file
• Create the same kind of file on the suspect nodes
• Compare both files via diff

[root]# ssh et60 'find /root/.ssh -type f -exec md5sum {} \; > /opt/admin/tmp/base'
[root]# pdsh -w et[60-64,69-73] 'find /root/.ssh -type f -exec md5sum {} \; > /tmp/file.host; diff /tmp/file.host /opt/admin/tmp/base; exit 0' | dshbak -c
----------------
et[71-73]
----------------
10d9
< 70829cad65dcd9f95e53bcb7aa498091 /root/.ssh/old.authorized_keys
11a11
> 70829cad65dcd9f95e53bcb7aa498091 /root/.ssh/old.authorized_keys
----------------
et62
----------------
10c10
< d562ac2169f0a3abac36a8f16206897f /root/.ssh/known_hosts
---
> 2d57374ba51db1d96d159e2e25b64374 /root/.ssh/known_hosts
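The difference reported for et[71-73] appears to be one of ordering only: the same checksum and file name are removed at one line and added back one line later. find does not guarantee any particular order, so sorting the list by path avoids such false alarms. A minimal sketch; the function name manifest is hypothetical.

```shell
# manifest: print "md5sum path" for every regular file below directory $1,
# sorted by path so that all nodes list their files in the same order.
manifest() {
    find "$1" -type f -exec md5sum {} \; | sort -k 2
}
```

On the nodes, the find command in the example above would then be followed by "| sort -k 2" before the output is redirected, both for the base file and for the per-node file.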


In detail:

find /root/.ssh -type f -exec md5sum {} \;

find                  PART1
/root/.ssh            PART2
-type f               PART3
-exec md5sum {} \;    PART4

PART1  the find command
PART2  search /root/.ssh and everything below it
PART3  only files
PART4  execute the command “md5sum” on every file found; “{}” represents the file found, “\;” tells find that the command is complete

pdsh -w et[60-64,69-73] '
  find /root/.ssh -type f -exec md5sum {} \; > /tmp/file.host;    STEP1
  diff /tmp/file.host /opt/admin/tmp/base;                        STEP2
  exit 0                                                          STEP3
' | dshbak -c

STEP1  run the find command and create a local file
STEP2  find the differences to the reference file
STEP3  exit 0, so that pdsh does not report an error when diff returns a non-zero exit status

Kernel: Simple Consistency Checks
• “uname -a” reports the running kernel
• “lsmod” lists all loaded kernel modules
• “/proc/cmdline” contains the parameters used to start the kernel

[user]$ pdsh -w et[61-64,69-72] uname -r | dshbak -c
----------------
et[61-64,69-72]
----------------
2.6.18-164.11.1.el5.crt1

[user]$ pdsh -w et[61-64,69-72] "/sbin/lsmod | awk '{print \$1}' | sort | sum" | dshbak -c
----------------
et[61-64,69-72]
----------------
09568 2

[user]$ pdsh -w et[61-64,69-72] cat /proc/cmdline | dshbak -c
----------------
et[61-64,69-72]
----------------
load_ramdisk=1 prompt_ramdisk=0 initrd=initrd-2.6.18-164.11.1.el5.crt1.img.cpio.gz root=/dev/sda3 root=/dev/sda3 rw ip=dhcp sshd ramdisk_size=131071 noinstall console=tty0 processor.max_cstate=1 panic=30 BOOT_IMAGE=vmlinuz-2.6.18-164.11.1.el5.crt1

Even if the same modules are loaded, ongoing use of the system changes the state of the modules over time and therefore changes the output of lsmod. /proc/modules contains the original information.

[user]$ pdsh -w et[61-64,69-72] "/sbin/lsmod | sort | sum" | dshbak -c
----------------
et[61-64,69-72]
----------------
39521 6

Some time later:

[user]$ pdsh -w et[61-64,69-72] "/sbin/lsmod | sort | sum" | dshbak -c
----------------
et[69-70,72]
----------------
39521 6
----------------
et[61-64,71]
----------------
46507 6

But we still have the same modules loaded:

[user]$ pdsh -w et[61-64,69-72] "/sbin/lsmod | awk '{print \$1}' | sort | sum" | dshbak -c
----------------
et[61-64,69-72]
----------------
09568 2

Kernel: into the depths
• “sysctl -a” lists ALL parameters available in the kernel
• The list depends on the loaded modules
• Most parameters can be modified at run time
• Parameters can also be found, read and modified via /proc/sys
• The output is too volatile to be checked as a whole; you have to know exactly what you are looking for
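Because the complete list is too volatile, a check has to be restricted to a handful of named parameters. The sketch below reads them directly from /proc/sys, replacing the dots in the parameter name by slashes. The function name sysctl_fingerprint and the variable SYSCTL_ROOT (only there to allow testing against another directory) are hypothetical. Parameter names that themselves contain a dot, such as some interface names, are not handled.

```shell
# sysctl_fingerprint: print "key = value" for the parameters given as
# arguments, reading each value from /proc/sys (or from $SYSCTL_ROOT).
sysctl_fingerprint() {
    root=${SYSCTL_ROOT:-/proc/sys}
    for key in "$@"; do
        path="$root/$(printf '%s' "$key" | tr . /)"   # net.ipv4.x -> net/ipv4/x
        if [ -r "$path" ]; then
            printf '%s = %s\n' "$key" "$(tr '\t' ' ' < "$path")"
        else
            printf '%s = MISSING\n' "$key"
        fi
    done
}
```

Possible use: sysctl_fingerprint net.ipv4.tcp_rmem net.ipv4.tcp_wmem | sum on every node, followed by a comparison of the checksums. Which parameters belong on the list depends on the site.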


Example output:

[root]# sysctl -a
lustre.max_dirty_mb = 13184
lustre.alloc_fail_rate = 0
lustre.ldlm_timeout = 20
lustre.pagesused_max = 0
lustre.memused_max = 1069506478
lustre.pagesused = 0
lustre.memused = 1068709182
lustre.dump_on_eviction = 0
lustre.dump_on_timeout = 0
lustre.debug_peer_on_timeout = 0
lustre.timeout = 100
lustre.fail_val = 0
lustre.fail_loc = 0
lnet.nis = nid refs peer max tx min
lnet.nis = 0@lo 2 0 0 0 0
lnet.nis = 36.102.21.1@o2ib 28 8 64 64 -116
lnet.buffers = pages count credits min
lnet.buffers = 0 0 0 0
lnet.buffers = 1 0 0 0
lnet.buffers = 256 0 0 0
lnet.peers = nid refs state max rtr min tx min queue
lnet.peers
= 36.102.223.1@o2ib 1 ~rtr 8 8 8 8 4 0 lnet.peers = 36.102.225.1@o2ib 1 ~rtr 8 8 8 8 2 0 lnet.peers = 36.102.223.11@o2ib 1 ~rtr 8 8 8 8 2 0 lnet.peers = 36.102.223.12@o2ib 1 ~rtr 8 8 8 8 3 0 lnet.peers = 36.102.223.13@o2ib 1 ~rtr 8 8 8 8 2 0 lnet.peers = 36.102.223.14@o2ib 1 ~rtr 8 8 8 8 2 0 lnet.peers = 36.102.223.15@o2ib 1 ~rtr 8 8 8 8 1 0 lnet.peers = 36.102.223.16@o2ib 1 ~rtr 8 8 8 8 2 0 lnet.peers = 36.102.223.17@o2ib 1 ~rtr 8 8 8 8 2 0 lnet.peers = 36.102.223.18@o2ib 1 ~rtr 8 8 8 8 2 0 lnet.peers = 36.102.225.11@o2ib 1 ~rtr 8 8 8 8 -4 0 lnet.peers = 36.102.225.12@o2ib 1 ~rtr 8 8 8 8 -4 0 lnet.peers = 36.102.225.13@o2ib 1 ~rtr 8 8 8 8 -4 0 lnet.peers = 36.102.225.14@o2ib 1 ~rtr 8 8 8 8 -4 0 lnet.peers = 36.102.225.15@o2ib 1 ~rtr 8 8 8 8 -4 0 lnet.peers = 36.102.225.16@o2ib 1 ~rtr 8 8 8 8 -4 0 lnet.peers = 36.102.225.17@o2ib 1 ~rtr 8 8 8 8 -4 0 lnet.peers = 36.102.225.18@o2ib 1 ~rtr 8 8 8 8 -4 0 lnet.peers = 36.102.224.1@o2ib 1 ~rtr 8 8 8 8 2 0 lnet.peers = 36.102.224.11@o2ib 1 ~rtr 8 8 8 8 -2 0 lnet.peers = 36.102.224.12@o2ib 1 ~rtr 8 8 8 8 -3 0 lnet.peers = 36.102.224.13@o2ib 1 ~rtr 8 8 8 8 -3 0 lnet.peers = 36.102.224.14@o2ib 1 ~rtr 8 8 8 8 -4 0 lnet.peers = 36.102.224.15@o2ib 1 ~rtr 8 8 8 8 -4 0 lnet.peers = 36.102.224.16@o2ib 1 ~rtr 8 8 8 8 -4 0 lnet.peers = 36.102.224.17@o2ib 1 ~rtr 8 8 8 8 -4 0 lnet.peers = 36.102.224.18@o2ib 1 ~rtr 8 8 8 8 -4 0 lnet.routers = ref rtr_ref alive_cnt state last_ping router lnet.routes = Routing disabled lnet.routes = net hops state router lnet.stats = 0 200 0 620082 590942 0 106 104438667666 66599515067 0 21200 lnet.debug_mb = 121 lnet.panic_on_lbug = 0 lnet.catastrophe = 0 lnet.memused = 10711066 lnet.debug_log_upcall = /usr/lib/lustre/lnet_debug_log_upcall lnet.upcall = /usr/lib/lustre/lnet_upcall lnet.debug_path = /tmp/lustre-log lnet.console_backoff = 2 lnet.console_min_delay_centisecs = 50 lnet.console_max_delay_centisecs = 60000 lnet.console_ratelimit = 1 lnet.printk = warning error emerg console lnet.subsystem_debug 
= undefined mdc mds osc ost class log llite rpc lnet lnd pinger filter echo ldlm lov lquota lmv sec gss mgc mgs fid fld lnet.debug = ioctl neterror warning error emerg ha config console sunrpc.max_resvport = 1023 sunrpc.min_resvport = 665 sunrpc.tcp_slot_table_entries = 16 sunrpc.udp_slot_table_entries = 16 sunrpc.transports = tcp 1048576 sunrpc.transports = udp 32768 sunrpc.nlm_debug = 0 sunrpc.nfsd_debug = 0 sunrpc.nfs_debug = 0 sunrpc.rpc_debug = 0 crypto.fips_enabled = 0 abi.vsyscall32 = 1 dev.parport.default.spintime = 500 dev.parport.default.timeslice = 200 dev.cdrom.check_media = 0 dev.cdrom.lock = 1 dev.cdrom.debug = 0 dev.cdrom.autoeject = 0 dev.cdrom.autoclose = 1 dev.cdrom.info = CD-ROM information, Id: cdrom.c 3.20 2003/12/17 dev.cdrom.info = dev.cdrom.info = drive name: sr0 dev.cdrom.info = drive speed: 0 dev.cdrom.info = drive # of slots: 1 dev.cdrom.info = Can close tray: 0 dev.cdrom.info = Can open tray: 0 dev.cdrom.info = Can lock tray: 1 dev.cdrom.info = Can change speed: 1 dev.cdrom.info = Can select disk: 0 dev.cdrom.info = Can read multisession: 1 dev.cdrom.info = Can read MCN: 1 dev.cdrom.info = Reports media changed: 1 dev.cdrom.info = Can play audio: 1 dev.cdrom.info = Can write CD-R: 0 dev.cdrom.info = Can write CD-RW: 0 dev.cdrom.info = Can read DVD: 0 dev.cdrom.info = Can write DVD-R: 0 dev.cdrom.info = Can write DVD-RAM: 0 dev.cdrom.info = Can read MRW: 1 dev.cdrom.info = Can write MRW: 1 dev.cdrom.info = Can write RAM: 1 dev.cdrom.info = dev.cdrom.info = dev.scsi.logging_level = 0 dev.raid.speed_limit_max = 200000 dev.raid.speed_limit_min = 1000 dev.hpet.max-user-freq = 64 dev.rtc.max-user-freq = 64 debug.exception-trace = 1 net.unix.max_dgram_qlen = 10 net.token-ring.rif_timeout = 600000 net.ipv4.conf.ib0.promote_secondaries = 0 net.ipv4.conf.ib0.force_igmp_version = 0 net.ipv4.conf.ib0.disable_policy = 0 net.ipv4.conf.ib0.disable_xfrm = 0 net.ipv4.conf.ib0.arp_accept = 0 net.ipv4.conf.ib0.arp_ignore = 0 net.ipv4.conf.ib0.arp_announce 
= 0 net.ipv4.conf.ib0.arp_filter = 0 net.ipv4.conf.ib0.tag = 0 net.ipv4.conf.ib0.log_martians = 0 net.ipv4.conf.ib0.bootp_relay = 0 net.ipv4.conf.ib0.medium_id = 0 net.ipv4.conf.ib0.proxy_arp = 0 net.ipv4.conf.ib0.accept_source_route = 0 net.ipv4.conf.ib0.send_redirects = 1 net.ipv4.conf.ib0.rp_filter = 1 net.ipv4.conf.ib0.shared_media = 1 net.ipv4.conf.ib0.secure_redirects = 1 net.ipv4.conf.ib0.accept_redirects = 1 net.ipv4.conf.ib0.mc_forwarding = 0 net.ipv4.conf.ib0.forwarding = 0 net.ipv4.conf.eth0.promote_secondaries = 0 net.ipv4.conf.eth0.force_igmp_version = 0 net.ipv4.conf.eth0.disable_policy = 0 net.ipv4.conf.eth0.disable_xfrm = 0 net.ipv4.conf.eth0.arp_accept = 0 net.ipv4.conf.eth0.arp_ignore = 0 net.ipv4.conf.eth0.arp_announce = 0 net.ipv4.conf.eth0.arp_filter = 0 net.ipv4.conf.eth0.tag = 0 net.ipv4.conf.eth0.log_martians = 0 net.ipv4.conf.eth0.bootp_relay = 0 net.ipv4.conf.eth0.medium_id = 0 net.ipv4.conf.eth0.proxy_arp = 0 net.ipv4.conf.eth0.accept_source_route = 0 net.ipv4.conf.eth0.send_redirects = 1 net.ipv4.conf.eth0.rp_filter = 1 net.ipv4.conf.eth0.shared_media = 1 net.ipv4.conf.eth0.secure_redirects = 1 net.ipv4.conf.eth0.accept_redirects = 1 net.ipv4.conf.eth0.mc_forwarding = 0 net.ipv4.conf.eth0.forwarding = 0 net.ipv4.conf.lo.promote_secondaries = 0 net.ipv4.conf.lo.force_igmp_version = 0 net.ipv4.conf.lo.disable_policy = 1 net.ipv4.conf.lo.disable_xfrm = 1 net.ipv4.conf.lo.arp_accept = 0 net.ipv4.conf.lo.arp_ignore = 0 net.ipv4.conf.lo.arp_announce = 0 net.ipv4.conf.lo.arp_filter = 0 net.ipv4.conf.lo.tag = 0 net.ipv4.conf.lo.log_martians = 0 net.ipv4.conf.lo.bootp_relay = 0 net.ipv4.conf.lo.medium_id = 0 net.ipv4.conf.lo.proxy_arp = 0 net.ipv4.conf.lo.accept_source_route = 1 net.ipv4.conf.lo.send_redirects = 1 net.ipv4.conf.lo.rp_filter = 0 net.ipv4.conf.lo.shared_media = 1 net.ipv4.conf.lo.secure_redirects = 1 net.ipv4.conf.lo.accept_redirects = 1 net.ipv4.conf.lo.mc_forwarding = 0 net.ipv4.conf.lo.forwarding = 0 
net.ipv4.conf.default.promote_secondaries = 0 net.ipv4.conf.default.force_igmp_version = 0 net.ipv4.conf.default.disable_policy = 0 net.ipv4.conf.default.disable_xfrm = 0 net.ipv4.conf.default.arp_accept = 0 net.ipv4.conf.default.arp_ignore = 0 net.ipv4.conf.default.arp_announce = 0 net.ipv4.conf.default.arp_filter = 0 net.ipv4.conf.default.tag = 0 net.ipv4.conf.default.log_martians = 0 net.ipv4.conf.default.bootp_relay = 0 net.ipv4.conf.default.medium_id = 0 net.ipv4.conf.default.proxy_arp = 0 net.ipv4.conf.default.accept_source_route = 0 net.ipv4.conf.default.send_redirects = 1 net.ipv4.conf.default.rp_filter = 1 net.ipv4.conf.default.shared_media = 1 net.ipv4.conf.default.secure_redirects = 1 net.ipv4.conf.default.accept_redirects = 1 net.ipv4.conf.default.mc_forwarding = 0 net.ipv4.conf.default.forwarding = 0 net.ipv4.conf.all.promote_secondaries = 0 net.ipv4.conf.all.force_igmp_version = 0 net.ipv4.conf.all.disable_policy = 0 net.ipv4.conf.all.disable_xfrm = 0 net.ipv4.conf.all.arp_accept = 0 net.ipv4.conf.all.arp_ignore = 0 net.ipv4.conf.all.arp_announce = 0 net.ipv4.conf.all.arp_filter = 0 net.ipv4.conf.all.tag = 0 net.ipv4.conf.all.log_martians = 0 net.ipv4.conf.all.bootp_relay = 0 net.ipv4.conf.all.medium_id = 0 net.ipv4.conf.all.proxy_arp = 0 net.ipv4.conf.all.accept_source_route = 0 net.ipv4.conf.all.send_redirects = 1 net.ipv4.conf.all.rp_filter = 0 net.ipv4.conf.all.shared_media = 1 net.ipv4.conf.all.secure_redirects = 1 net.ipv4.conf.all.accept_redirects = 1 net.ipv4.conf.all.mc_forwarding = 0 net.ipv4.conf.all.forwarding = 0 net.ipv4.neigh.ib0.base_reachable_time_ms = 14400000 net.ipv4.neigh.ib0.retrans_time_ms = 1000 net.ipv4.neigh.ib0.locktime = 99 net.ipv4.neigh.ib0.proxy_delay = 79 net.ipv4.neigh.ib0.anycast_delay = 99 net.ipv4.neigh.ib0.proxy_qlen = 64 net.ipv4.neigh.ib0.unres_qlen = 3 net.ipv4.neigh.ib0.gc_stale_time = 30 net.ipv4.neigh.ib0.delay_first_probe_time = 5 net.ipv4.neigh.ib0.base_reachable_time = 14400 net.ipv4.neigh.ib0.retrans_time 
= 99 net.ipv4.neigh.ib0.app_solicit = 0 net.ipv4.neigh.ib0.ucast_solicit = 3 net.ipv4.neigh.ib0.mcast_solicit = 3 net.ipv4.neigh.eth0.base_reachable_time_ms = 30000 net.ipv4.neigh.eth0.retrans_time_ms = 1000 net.ipv4.neigh.eth0.locktime = 99 net.ipv4.neigh.eth0.proxy_delay = 79 net.ipv4.neigh.eth0.anycast_delay = 99 net.ipv4.neigh.eth0.proxy_qlen = 64 net.ipv4.neigh.eth0.unres_qlen = 3 net.ipv4.neigh.eth0.gc_stale_time = 60 net.ipv4.neigh.eth0.delay_first_probe_time = 5 net.ipv4.neigh.eth0.base_reachable_time = 30 net.ipv4.neigh.eth0.retrans_time = 99 net.ipv4.neigh.eth0.app_solicit = 0 net.ipv4.neigh.eth0.ucast_solicit = 3 net.ipv4.neigh.eth0.mcast_solicit = 3 net.ipv4.neigh.lo.base_reachable_time_ms = 30000 net.ipv4.neigh.lo.retrans_time_ms = 1000 net.ipv4.neigh.lo.locktime = 99 net.ipv4.neigh.lo.proxy_delay = 79 net.ipv4.neigh.lo.anycast_delay = 99 net.ipv4.neigh.lo.proxy_qlen = 64 net.ipv4.neigh.lo.unres_qlen = 3 net.ipv4.neigh.lo.gc_stale_time = 60 net.ipv4.neigh.lo.delay_first_probe_time = 5 net.ipv4.neigh.lo.base_reachable_time = 30 net.ipv4.neigh.lo.retrans_time = 99 net.ipv4.neigh.lo.app_solicit = 0 net.ipv4.neigh.lo.ucast_solicit = 3 net.ipv4.neigh.lo.mcast_solicit = 3 net.ipv4.neigh.default.base_reachable_time_ms = 30000 net.ipv4.neigh.default.retrans_time_ms = 1000 net.ipv4.neigh.default.gc_thresh3 = 1024 net.ipv4.neigh.default.gc_thresh2 = 512 net.ipv4.neigh.default.gc_thresh1 = 128 net.ipv4.neigh.default.gc_interval = 30 net.ipv4.neigh.default.locktime = 99 net.ipv4.neigh.default.proxy_delay = 79 net.ipv4.neigh.default.anycast_delay = 99 net.ipv4.neigh.default.proxy_qlen = 64 net.ipv4.neigh.default.unres_qlen = 3 net.ipv4.neigh.default.gc_stale_time = 60 net.ipv4.neigh.default.delay_first_probe_time = 5 net.ipv4.neigh.default.base_reachable_time = 30 net.ipv4.neigh.default.retrans_time = 99 net.ipv4.neigh.default.app_solicit = 0 net.ipv4.neigh.default.ucast_solicit = 3 net.ipv4.neigh.default.mcast_solicit = 3 net.ipv4.udp_wmem_min = 4096 
net.ipv4.udp_rmem_min = 4096 net.ipv4.udp_mem = 2318304 3091072 4636608 net.ipv4.cipso_rbm_strictvalid = 1 net.ipv4.cipso_rbm_optfmt = 0 net.ipv4.cipso_cache_bucket_size = 10 net.ipv4.cipso_cache_enable = 1 net.ipv4.tcp_slow_start_after_idle = 1 net.ipv4.tcp_dma_copybreak = 4096 net.ipv4.tcp_workaround_signed_windows = 0 net.ipv4.tcp_base_mss = 512 net.ipv4.tcp_mtu_probing = 0 net.ipv4.tcp_abc = 0 net.ipv4.tcp_congestion_control = bic net.ipv4.tcp_tso_win_divisor = 3 net.ipv4.tcp_moderate_rcvbuf = 1 net.ipv4.tcp_no_metrics_save = 0 net.ipv4.ipfrag_max_dist = 64 net.ipv4.ipfrag_secret_interval = 600 net.ipv4.tcp_low_latency = 0 net.ipv4.tcp_frto = 0 net.ipv4.tcp_tw_reuse = 0 net.ipv4.icmp_ratemask = 6168 net.ipv4.icmp_ratelimit = 1000 net.ipv4.tcp_adv_win_scale = 2 net.ipv4.tcp_app_win = 31 net.ipv4.tcp_rmem = 4096 87380 16777216 net.ipv4.tcp_wmem = 4096 65536 16777216 net.ipv4.tcp_mem = 16777216 16777216 16777216 net.ipv4.tcp_dsack = 1 net.ipv4.tcp_ecn = 0 net.ipv4.tcp_reordering = 3 net.ipv4.tcp_fack = 1 net.ipv4.tcp_orphan_retries = 0 net.ipv4.inet_peer_gc_maxtime = 120 net.ipv4.inet_peer_gc_mintime = 10 net.ipv4.inet_peer_maxttl = 600 net.ipv4.inet_peer_minttl = 120 net.ipv4.inet_peer_threshold = 65664 net.ipv4.igmp_max_msf = 10 net.ipv4.igmp_max_memberships = 20 net.ipv4.route.rt_cache_rebuild_count = 4 net.ipv4.route.secret_interval = 600 net.ipv4.route.min_adv_mss = 256 net.ipv4.route.min_pmtu = 552 net.ipv4.route.mtu_expires = 600 net.ipv4.route.gc_elasticity = 8 net.ipv4.route.error_burst = 5000 net.ipv4.route.error_cost = 1000 net.ipv4.route.redirect_silence = 20480 net.ipv4.route.redirect_number = 9 net.ipv4.route.redirect_load = 20 net.ipv4.route.gc_interval = 60 net.ipv4.route.gc_timeout = 300 net.ipv4.route.gc_min_interval_ms = 500 net.ipv4.route.gc_min_interval = 0 net.ipv4.route.max_size = 8388608 net.ipv4.route.gc_thresh = 524288 net.ipv4.route.max_delay = 10 net.ipv4.route.min_delay = 2 net.ipv4.icmp_errors_use_inbound_ifaddr = 0 
net.ipv4.icmp_ignore_bogus_error_responses = 1 net.ipv4.icmp_echo_ignore_broadcasts = 1 net.ipv4.icmp_echo_ignore_all = 0 net.ipv4.ip_local_port_range = 32768 61000 net.ipv4.tcp_max_syn_backlog = 1024 net.ipv4.tcp_rfc1337 = 0 net.ipv4.tcp_stdurg = 0 net.ipv4.tcp_abort_on_overflow = 0 net.ipv4.tcp_tw_recycle = 0 net.ipv4.tcp_syncookies = 1 net.ipv4.tcp_fin_timeout = 60 net.ipv4.tcp_retries2 = 15 net.ipv4.tcp_retries1 = 3 net.ipv4.tcp_keepalive_intvl = 75 net.ipv4.tcp_keepalive_probes = 9 net.ipv4.tcp_keepalive_time = 7200 net.ipv4.ipfrag_time = 30 net.ipv4.ip_dynaddr = 0 net.ipv4.ipfrag_low_thresh = 196608 net.ipv4.ipfrag_high_thresh = 262144 net.ipv4.tcp_max_tw_buckets = 180000 net.ipv4.tcp_max_orphans = 65536 net.ipv4.tcp_synack_retries = 5 net.ipv4.tcp_syn_retries = 5 net.ipv4.ip_nonlocal_bind = 0 net.ipv4.ip_no_pmtu_disc = 0 net.ipv4.ip_default_ttl = 64 net.ipv4.ip_forward = 0 net.ipv4.tcp_retrans_collapse = 1 net.ipv4.tcp_sack = 0 net.ipv4.tcp_window_scaling = 1 net.ipv4.tcp_timestamps = 0 net.core.netdev_budget = 300 net.core.somaxconn = 128 net.core.xfrm_larval_drop = 0 net.core.xfrm_acq_expires = 30 net.core.xfrm_aevent_rseqth = 2 net.core.xfrm_aevent_etime = 10 net.core.optmem_max = 16777216 net.core.message_burst = 10 net.core.message_cost = 5 net.core.netdev_max_backlog = 250000 net.core.dev_weight = 64 net.core.rmem_default = 16777216 net.core.wmem_default = 16777216 net.core.rmem_max = 16777216 net.core.wmem_max = 16777216 vm.max_writeback_pages = 1024 vm.flush_mmap_pages = 1 vm.pagecache = 100 vm.min_slab_ratio = 5 vm.min_unmapped_ratio = 1 vm.zone_reclaim_mode = 1 vm.swap_token_timeout = 300 0 vm.legacy_va_layout = 0 vm.vfs_cache_pressure = 100 vm.block_dump = 0 vm.laptop_mode = 0 vm.max_map_count = 65536 vm.percpu_pagelist_fraction = 0 vm.min_free_kbytes = 32768 vm.drop_caches = 1 vm.lowmem_reserve_ratio = 256 256 32 vm.hugetlb_shm_group = 0 vm.nr_hugepages = 0 vm.swappiness = 60 vm.nr_pdflush_threads = 2 vm.dirty_expire_centisecs = 2999 
vm.dirty_writeback_centisecs = 499 vm.mmap_min_addr = 4096 vm.dirty_ratio = 40 vm.dirty_background_ratio = 10 vm.page-cluster = 3 vm.overcommit_ratio = 50 vm.panic_on_oom = 0 vm.overcommit_memory = 0 kernel.vsyscall64 = 1 kernel.max_lock_depth = 1024 kernel.compat-log = 1 kernel.acpi_video_flags = 0 kernel.randomize_va_space = 1 kernel.bootloader_type = 50 kernel.panic_on_unrecovered_nmi = 0 kernel.unknown_nmi_panic = 0 kernel.ngroups_max = 65536 kernel.printk_ratelimit_burst = 10 kernel.printk_ratelimit = 5 kernel.panic_on_oops = 1 kernel.pid_max = 32768 kernel.overflowgid = 65534 kernel.overflowuid = 65534 kernel.pty.nr = 0 kernel.pty.max = 4096 kernel.random.uuid = 31886b82-0407-44f9-b4e5-dad23050a76d kernel.random.boot_id = 6d1ae9df-7393-4be6-9693-f11dd6f28526 kernel.random.write_wakeup_threshold = 128 kernel.random.read_wakeup_threshold = 64 kernel.random.entropy_avail = 183 kernel.random.poolsize = 4096 kernel.threads-max = 421888 kernel.cad_pid = 1 kernel.sysrq = 0 kernel.sem = 250 32000 32 128 kernel.msgmnb = 65536 kernel.msgmni = 16 kernel.msgmax = 65536 kernel.shmmni = 4096 kernel.shmall = 4294967296 kernel.shmmax = 22683613184 kernel.acct = 4 2 30 kernel.hotplug = kernel.modprobe = /sbin/modprobe kernel.printk = 1 4 1 7 kernel.ctrl-alt-del = 0 kernel.real-root-dev = 0 kernel.cap-bound = -257 kernel.tainted = 67 kernel.core_pattern = core kernel.core_uses_pid = 1 kernel.print-fatal-signals = 0 kernel.exec-shield = 1 kernel.panic = 30 kernel.domainname = (none) kernel.hostname = en001 kernel.version = #2 SMP Tue Jan 26 10:58:02 PST 2010 kernel.osrelease = 2.6.18-164.11.1.el5.crt1 kernel.ostype = Linux kernel.sched_interactive = 2 fs.nfs.nfs_congestion_kb = 158720 fs.nfs.nfs_mountpoint_timeout = 500 fs.nfs.idmap_cache_timeout = 600 fs.nfs.nfs_callback_tcpport = 0 fs.nfs.nsm_local_state = 0 fs.nfs.nsm_use_hostnames = 0 fs.nfs.nlm_tcpport = 0 fs.nfs.nlm_udpport = 0 fs.nfs.nlm_timeout = 10 fs.nfs.nlm_grace_period = 0 fs.mqueue.msgsize_max = 8192 
fs.mqueue.msg_max = 10 fs.mqueue.queues_max = 256 fs.quota.warnings = 1 fs.quota.syncs = 24 fs.quota.free_dquots = 0 fs.quota.allocated_dquots = 0 fs.quota.cache_hits = 0 fs.quota.writes = 0 fs.quota.reads = 0 fs.quota.drops = 0 fs.quota.lookups = 0 fs.suid_dumpable = 0 fs.inotify.max_queued_events = 16384 fs.inotify.max_user_watches = 8192 fs.inotify.max_user_instances = 128 fs.aio-max-nr = 65536 fs.aio-nr = 0 fs.lease-break-time = 45 fs.dir-notify-enable = 1 fs.leases-enable = 1 fs.overflowgid = 65534 fs.overflowuid = 65534 fs.dentry-state = 2684 60 45 0 0 0 fs.file-max = 2341405 fs.file-nr = 1530 0 2341405 fs.inode-state = 2805 80 0 0 0 0 0 fs.inode-nr = 2805 80 fs.binfmt_misc.jexec = enabled fs.binfmt_misc.jexec = interpreter /usr/java/default/lib/jexec fs.binfmt_misc.jexec = flags: fs.binfmt_misc.jexec = offset 0 fs.binfmt_misc.jexec = magic 504b0304 fs.binfmt_misc.status = enabled
[root@eln1 init.d]#

Configuration of Daemons
• “chkconfig --list” provides a table of all daemons properly listed in /etc/init.d
• Again: checking for consistency only

[user]$ pdsh -w et[60-64,69-73] '/sbin/chkconfig --list | sum' | dshbak -c
----------------
et[60-64,69-73]
----------------
08348 3
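When the checksums disagree, the next question is which nodes deviate. The following is a minimal sketch, not part of the original slides: the helper name `find_outliers` is hypothetical, and it assumes the plain "node: value" lines that pdsh prints before dshbak folds them. It treats the most frequent value as the reference; with a tie the choice of reference is arbitrary.

```sh
#!/bin/sh
# Hypothetical helper: read "node: value" lines on stdin and print every
# node whose value differs from the value reported by most nodes.
find_outliers() {
    awk -F ': ' '
        { value[$1] = $2; count[$2]++ }
        END {
            best = ""; max = 0
            # pick the most frequent value as the reference
            for (v in count) if (count[v] > max) { max = count[v]; best = v }
            for (n in value) if (value[n] != best) print n
        }' | sort
}
```

Possible use: pdsh -w et[60-64,69-73] '/sbin/chkconfig --list | sum' | find_outliers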


a complete output:

[mhebenst@eln1 ~]$ pdsh -w et[60-64,69-73] /sbin/chkconfig --list | dshbak -c
----------------
et[60-64,69-73]
----------------
acpid 0:off 1:off 2:on 3:on 4:on 5:on 6:off
atd 0:off 1:off 2:off 3:on 4:on 5:on 6:off
cpuspeed 0:off 1:on 2:on 3:on 4:on 5:on 6:off
crond 0:off 1:off 2:on 3:off 4:on 5:on 6:off
exim 0:off 1:off 2:off 3:off 4:on 5:on 6:off
gpm 0:off 1:off 2:off 3:off 4:on 5:on 6:off
haldemon 0:off 1:off 2:off 3:off 4:on 5:on 6:off
ipmi 0:off 1:off 2:off 3:off 4:off 5:off 6:off
ipmievd 0:off 1:off 2:off 3:off 4:off 5:off 6:off
irqbalance 0:off 1:off 2:on 3:on 4:on 5:on 6:off
jexec 0:on 1:on 2:on 3:on 4:on 5:on 6:on
kudzu 0:off 1:off 2:off 3:off 4:on 5:on 6:off
lm_sensors 0:off 1:off 2:off 3:off 4:on 5:on 6:off
lvm2-monitor 0:off 1:on 2:off 3:off 4:on 5:on 6:off
mcstrans 0:off 1:off 2:off 3:off 4:on 5:on 6:off
mdmonitor 0:off 1:off 2:off 3:off 4:on 5:on 6:off
mdmpd 0:off 1:off 2:off 3:off 4:off 5:off 6:off
messagebus 0:off 1:off 2:off 3:off 4:on 5:on 6:off
microcode_ctl 0:off 1:off 2:on 3:on 4:on 5:on 6:off
multipathd 0:off 1:off 2:off 3:off 4:off 5:off 6:off
netconsole 0:off 1:off 2:off 3:off 4:off 5:off 6:off
netfs 0:off 1:off 2:off 3:on 4:on 5:on 6:off
netplugd 0:off 1:off 2:off 3:off 4:off 5:off 6:off
network 0:off 1:off 2:on 3:on 4:on 5:on 6:off
nfs 0:off 1:off 2:off 3:off 4:off 5:off 6:off
nfslock 0:off 1:off 2:off 3:on 4:on 5:on 6:off
ntpd 0:off 1:off 2:off 3:off 4:off 5:off 6:off
openibd 0:off 1:off 2:on 3:on 4:on 5:on 6:off
opensmd 0:off 1:off 2:off 3:off 4:off 5:off 6:off
panfs 0:off 1:off 2:off 3:on 4:on 5:on 6:off
portmap 0:off 1:off 2:off 3:on 4:on 5:on 6:off
psacct 0:off 1:off 2:off 3:off 4:off 5:off 6:off
rawdevices 0:off 1:off 2:off 3:on 4:on 5:on 6:off
rdisc 0:off 1:off 2:off 3:off 4:off 5:off 6:off
readahead_early 0:off 1:off 2:on 3:on 4:on 5:on 6:off
readahead_later 0:off 1:off 2:off 3:off 4:off 5:on 6:off
restorecond 0:off 1:off 2:off 3:off 4:on 5:on 6:off
rpcgssd 0:off 1:off 2:off 3:on 4:on 5:on 6:off
rpcidmapd 0:off 1:off 2:off 3:on 4:on 5:on 6:off
rpcsvcgssd 0:off 1:off 2:off 3:off 4:off 5:off 6:off
smartd 0:off 1:off 2:off 3:off 4:on 5:on 6:off
sshd 0:off 1:off 2:on 3:on 4:on 5:on 6:off
syslog 0:off 1:off 2:off 3:off 4:on 5:on 6:off
sysstat 0:off 1:off 2:off 3:off 4:off 5:on 6:off
tgtd 0:off 1:off 2:off 3:off 4:off 5:off 6:off
uptrack 0:off 1:off 2:on 3:on 4:on 5:on 6:off
vncserver 0:off 1:off 2:off 3:off 4:off 5:off 6:off
xfs 0:off 1:off 2:on 3:on 4:on 5:on 6:off
xinetd 0:off 1:off 2:off 3:off 4:on 5:on 6:off

xinetd based services:
chargen-dgram: off
chargen-stream: off
daytime-dgram: off
daytime-stream: off
discard-dgram: off
discard-stream: off
echo-dgram: off
echo-stream: off
rmcp: off
rsync: off
tcpmux-server: off
time-dgram: off
time-stream: off

Consistency of running Daemons
• start files should be present in /etc/init.d
• “/etc/init.d/DAEMON status” should work
• return codes are often messy

[root@eln1 init.d]# /etc/init.d/ntpd status
ntpd (pid 7470) is running...
[root@eln1 init.d]# echo $?
0
[root@eln1 init.d]# /etc/init.d/psacct status
Process accounting is disabled.
[root@eln1 init.d]# echo $?
3
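To make the return codes easier to read in a report, they can be translated into text. This is a small sketch, not from the original slides; the function name `status_text` is hypothetical, and the mapping assumes the init script follows the LSB convention for the "status" action. Scripts that ignore the convention will still be misreported.

```sh
#!/bin/sh
# Hypothetical helper: translate the exit code of "/etc/init.d/NAME status"
# into text, assuming the script follows the LSB convention.
status_text() {
    case "$1" in
        0) echo "running" ;;
        1) echo "dead, pid file exists" ;;
        2) echo "dead, lock file exists" ;;
        3) echo "not running" ;;
        4) echo "status unknown" ;;
        *) echo "non-standard code $1" ;;
    esac
}
```

Possible use: /etc/init.d/psacct status >/dev/null 2>&1; status_text $?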

[root]# pdsh -w et[60-64,69-73] 'for I in /etc/init.d/* ; do $I status 2>/dev/null 1>/dev/null; printf "%-30s %s\n" $I $?; done | sum ' | dshbak -c
----------------
et[60-64,69-73]
----------------
56504 2

listing services that report no error:
[root]# pdsh -w et[60-64,69-73] ' for I in /etc/init.d/* ; do $I status 2>/dev/null 1>/dev/null; RET=$?; if [ $RET = 0 ]; then printf "%-30s %s\n" $I $RET; fi; done' | dshbak -c
----------------
et[60-64,69-73]
----------------
/etc/init.d/acpid 0
/etc/init.d/atd 0
/etc/init.d/functions 0
/etc/init.d/irqbalance 0
/etc/init.d/jexec 0
/etc/init.d/lm_sensors 0
/etc/init.d/mdmpd 0
/etc/init.d/microcode_ctl 0
/etc/init.d/mst 0
/etc/init.d/netfs 0
/etc/init.d/network 0
/etc/init.d/nfslock 0
/etc/init.d/ntpd 0
/etc/init.d/openibd 0
/etc/init.d/panfs 0
/etc/init.d/pbs 0
/etc/init.d/portmap 0
/etc/init.d/rawdevices 0
/etc/init.d/readahead_early 0
/etc/init.d/readahead_later 0
/etc/init.d/restorecond 0
/etc/init.d/rpcidmapd 0
/etc/init.d/sep3 0
/etc/init.d/single 0
/etc/init.d/sshd 0
/etc/init.d/syscfgdrv 0
/etc/init.d/sysstat 0
/etc/init.d/vtune 0
/etc/init.d/xfs 0
/etc/init.d/xinetd 0

Network Consistency
• “hostname” should report correctly
• “/etc/nsswitch.conf” configures the name services
• “/etc/hosts” contains all hosts and their IP addresses

[user]$ pdsh -w t01,et[61-64,69-72] hostname | awk -F ": " '{if($1 != $2){print $1": fail "$2}else{print $1": OK"}}' | dshbak -c
----------------
et[61-64,69-72]
----------------
OK
----------------
t01
----------------
fail et77

[user]$ pdsh -w et[61-64,69-72] sum /etc/hosts /etc/nsswitch.conf | dshbak -c
----------------
et[61-64,69-72]
----------------
45524 20 /etc/hosts
11894 2 /etc/nsswitch.conf

original output of pdsh command:
[mhebenst@eln1 ~]$ pdsh -w t01,et[61-64,69-72] hostname
t01: et77
et61: et61
et72: et72
et70: et70
et62: et62
et69: et69
et63: et63
et71: et71
et64: et64

awk -F ": " means fields are separated by the combination ": " (":" + space)
awk command with better formatting:

'{
  if($1 != $2) {print $1": fail "$2}
  else {print $1": OK"}
}'

including the ":" in the two print commands makes the output compatible with dshbak

Network Interface Configuration
• “ifconfig” has to report the same address as “/etc/hosts”
• loopback device has to exist

[user]$ pdsh -w t01,et[61-64,69-72] '/sbin/ifconfig lo >/dev/null 2>/dev/null; echo "lo $?"; /sbin/ifconfig ib2 >/dev/null 2>/dev/null; echo "ib2 $?" ' | dshbak -c
----------------
et[61-64,69-72],t01
----------------
lo 0
ib2 1

[user]$ cat test_hostname.sh
#!/bin/sh
# resolve the node's own name and check that eth0 carries that address
NAME=`hostname`
IP=`python -c "import socket; print(socket.gethostbyname('$NAME'))"`
if /sbin/ifconfig eth0 | grep -q "$IP"; then echo OK; else echo "FAIL $IP"; fi

[user]$ pdsh -w t01,et[61-64,69-72] sh test_hostname.sh | dshbak -c
----------------
et[61-64,69-72],t01
----------------
OK
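One caveat with a plain "grep -q" on the address: the dots act as regular-expression wildcards, and 10.0.0.1 would also match a line containing 10.0.0.10. A stricter comparison could look like the sketch below. It is not part of the original slides; the function name `has_ip` is hypothetical, and it assumes the older net-tools output layout "inet addr:A.B.C.D  Bcast:...".

```sh
#!/bin/sh
# Hypothetical helper: succeed if the ifconfig text on stdin lists exactly
# the address given as $1. Assumes the layout "inet addr:A.B.C.D  Bcast:...".
has_ip() {
    # -F: fixed string, so dots are literal; the trailing blank rejects
    # longer addresses that merely start with the same digits
    grep -qF "inet addr:$1 "
}
```

Possible use inside test_hostname.sh: if /sbin/ifconfig eth0 | has_ip "$IP"; then echo OK; else echo "FAIL $IP"; fi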

Remote Access Test (all to all)
• ping
• ssh

[user]$ NODES=`pdsh -w et[61-64,69-72] -Q | tail -1 | sed 's|,| |g'`

[user]$ echo $NODES et61 et62 et63 et64 et69 et70 et71 et72

[user]$ pdsh -w et[61-64,69-72] "sh test_remote.sh $NODES" | dshbak -c
----------------
et[61-64,69-72]
----------------
OK:et[61-64,69-72]

FAIL:

Network Test Program

[user]$ cat test_remote.sh
#!/bin/sh

SUCCESS=""; FAIL=""
for I in $*; do
  # a node counts as reachable only if both ping and a remote command work
  if ping -q -c 1 $I 2>&1 | grep -q "1 packets transmitted, 1 received, 0% packet loss"
  then
    if pdsh -w $I -u 3 pwd >/dev/null 2>/dev/null
    then
      SUCCESS="$SUCCESS $I"
      continue
    fi
  fi
  FAIL="$FAIL $I"
done

# fold the node lists with dshbak and keep only the compressed node range
printf "OK:"; for I in $SUCCESS; do echo $I:; done | dshbak -c | grep -v -e "---" | head -1; echo
printf "FAIL:"; for I in $FAIL; do echo $I:; done | dshbak -c | grep -v -e "---" | head -1; echo

Date and Time
• “date” returns current value; needs to be consistent (licensing, file stamps)
• ntp system is the standard to ensure consistency
• “ntpq -p” returns status on the current ntp setting
• config files are /etc/ntp.conf and /etc/ntp/step-tickers

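[mhebenst@eln1 ~]$ pdsh -w et[61-64,69-72] /usr/sbin/ntpq -p | dshbak -c
----------------
et[72]
----------------
remote refid st t when poll reach delay offset jitter
==============================================================================
*endeavour3 129.6.15.29 2 u 441 1024 377 1.185 1.079 0.141

----------------
et[71]
----------------
remote refid st t when poll reach delay offset jitter
==============================================================================
*endeavour3 129.6.15.29 2 u 278 1024 377 1.149 -0.115 0.307

----------------
et[70]
----------------
remote refid st t when poll reach delay offset jitter
==============================================================================
*endeavour3 129.6.15.29 2 u 292 1024 377 1.172 0.859 0.190

----------------
et[69]
----------------
remote refid st t when poll reach delay offset jitter
==============================================================================
*endeavour3 129.6.15.29 2 u 523 1024 377 1.124 1.551 0.113

----------------
et[64]
----------------
remote refid st t when poll reach delay offset jitter
==============================================================================
*endeavour3 129.6.15.29 2 u 318 1024 377 1.127 0.585 0.170

----------------

Date and Time examples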

[user]$ pdsh -w et[61-64,69-72] date | dshbak -c
----------------
et[61-64,69-72]
----------------
Tue Aug 17 13:33:49 PDT 2010

[user]$ pdsh -w et[61-64,69-72] /usr/sbin/ntpq -p|sed -e 's,[- .0-9]*$,,'| dshbak -c
----------------
et[61,64,69-72]
----------------
remote refid st t when poll reach delay offset jitter
==============================================================================
*endeavour3 129.6.15.29 2 u
----------------
et[62-63]
----------------
remote refid st t when poll reach delay offset jitter
==============================================================================
*endeavour3 132.163.4.103 2 u

[user]$ pdsh -w et[61-64,69-72] sum /etc/ntp.conf /etc/ntp/step-tickers | dshbak -c
----------------
et[61-64,69-72]
----------------
55640 3 /etc/ntp.conf
10282 1 /etc/ntp/step-tickers

Additional File System testing
• tests are similar for local FS, network mounted FS and swap space
• test might include mount status/options/point, permissions, accessibility
• “df” and “ls -l” are preferred over “ls”
• consider the impact of your load on network FS servers when designing the test (imagine 200 nodes executing an access at the same time)

Sample File Systems Tests

[user]$ pdsh -w et[61-64,69-72] cat /proc/mounts | awk '{if($3=="/home"){print $0}}' | dshbak -c
----------------
et[61-64,69-72]
----------------
36.101.255.10:/volatile3 /home nfs rw,vers=3,rsize=32768,wsize=32768,namlen=255,hard,nointr,nolock,proto=udp,timeo=20,retrans=3,sec=sys,addr=36.101.255.10 0 0

[user]$ pdsh -w et[61-64,69-72] df /home | dshbak -c
----------------
et[61-64,69-72]
----------------
Filesystem 1K-blocks Used Available Use% Mounted on
36.101.255.10:/volatile3 923029888 137288672 738853984 16% /home

[user]$ pdsh -w et[61-64,69-72] ls -ld /tmp | awk '{if($2=="drwxrwxrwt"){print $1,"/tmp OK"}else{print $1,"/tmp ",$2}}' | dshbak -c
----------------
et[61-64,69-72]
----------------
/tmp OK
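The mount check can also be turned into an OK/FAIL answer per node. The sketch below is not part of the original slides; the function name `check_mount` is hypothetical, and it only looks at the mount point and file system type columns of /proc/mounts, not at the mount options.

```sh
#!/bin/sh
# Hypothetical helper: read /proc/mounts-style text on stdin and report
# whether mount point $1 is mounted with file system type $2.
check_mount() {
    awk -v mp="$1" -v fs="$2" '
        $2 == mp { seen = 1; if ($3 == fs) ok = 1 }
        END {
            if (ok)        { print mp " OK" }
            else if (seen) { print mp " FAIL wrong type"; exit 1 }
            else           { print mp " FAIL not mounted"; exit 1 }
        }'
}
```

Possible use on a node: check_mount /home nfs < /proc/mounts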

Users
• Consistency of files: nsswitch.conf, passwd, shadow, group
• Consistency of user environment: printenv

[root]# pdsh -w et[61-64,69-72] sum /etc/nsswitch.conf /etc/passwd /etc/shadow /etc/group | dshbak -c
----------------
et[61-64,69-72]
----------------
11894 2 /etc/nsswitch.conf
16869 24 /etc/passwd
10750 29 /etc/shadow
08786 11 /etc/group

[user]$ pdsh -w et[61-64,69-72] "printenv | grep -v SSH_ | sum" | dshbak -c
----------------
et[61-64,69-72]
----------------
31634 1

Unwanted or Stale Processes
• “ps aux” is the standard way, /proc the alternative
– on an empty system no user processes are allowed
– services are already tested

[root]# pdsh -w et[61-64,69-72] 'ps ax -o user,comm | sort | sum ' | dshbak -c
----------------
et[61-64,69-72]
----------------
02290 6

[root]# pdsh -w et[61-64,69-72] 'ps ax -o user | sort | uniq ' | dshbak -c
----------------
et[61-64,69-72]
----------------
USER
ganglia
ntp
root
rpc
xfs
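Instead of reading the user list by eye, it can be compared against the accounts that are expected on an empty node. A minimal sketch, not part of the original slides: the function name `unexpected_users` is hypothetical, and the list of allowed accounts is an assumption taken from the output above.

```sh
#!/bin/sh
# Hypothetical helper: read one user name per line on stdin (for example
# from "ps ax -o user") and print every name that is not in the
# blank-separated list given as $1.
unexpected_users() {
    awk -v allowed="$1" '
        BEGIN {
            n = split(allowed, a, " ")
            for (i = 1; i <= n; i++) ok[a[i]] = 1
            ok["USER"] = 1            # header line printed by ps
        }
        NF && !($1 in ok) && !seen[$1]++ { print $1 }'
}
```

Possible use on a node: ps ax -o user | unexpected_users "ganglia ntp root rpc xfs"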

System Load
• should be close to zero
• load values are averaged over specified intervals
– testing load as part of a larger test suite therefore requires ample time for the system to cool off
• check load via “uptime” or /proc/loadavg

[user]$ pdsh -w et[61-64,69-72] 'uptime'
et61: 14:58:02 up 5 days, 4:12, 0 users, load average: 0.00, 0.00, 0.00
et63: 14:58:02 up 5 days, 4:12, 0 users, load average: 0.00, 0.00, 0.00
et62: 14:58:02 up 5 days, 2:36, 0 users, load average: 0.00, 0.00, 0.00
et69: 14:58:02 up 5 days, 4:13, 0 users, load average: 0.00, 0.00, 0.00
et71: 14:58:02 up 5 days, 4:13, 0 users, load average: 0.11, 0.03, 0.01
et72: 14:58:02 up 5 days, 4:13, 0 users, load average: 0.00, 0.00, 0.00
et70: 14:58:02 up 5 days, 4:13, 0 users, load average: 0.00, 0.00, 0.00
et64: 14:58:02 up 5 days, 4:13, 0 users, load average: 0.00, 0.00, 0.00
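For an automated suite the load can be compared with a limit instead of being inspected by eye. A sketch, not part of the original slides: the function name `check_load` and the limit are assumptions, and only the 1-minute average (first field of /proc/loadavg) is tested.

```sh
#!/bin/sh
# Hypothetical helper: read /proc/loadavg-style text on stdin and compare
# the 1-minute load average (first field) with the limit given as $1.
check_load() {
    awk -v limit="$1" '
        NR == 1 {
            if ($1 + 0 > limit + 0) { print "FAIL " $1; exit 1 }
            print "OK"
        }'
}
```

Possible use on a node: check_load 0.05 < /proc/loadavg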

Application Correctness Test
• should involve as many components as possible
• tests needed both for single and multiple systems
• can be used for performance test as well
• HPLinpack* 2.0 is used at CRT-DC as it tests:
– CPU including SIMD unit
– Memory
– MPI software stack
– Interconnect

HPLinpack* 2.0 Example

HPLinpack 2.0 -- High-Performance Linpack benchmark -- September 10, 2008
….
The following parameter values will be used:
….
--------------------------------------------------------------------------------
- The matrix A is randomly generated for each test.
- The following scaled residual check will be computed:
  ||Ax-b||_oo / ( eps * ( || x ||_oo * || A ||_oo + || b ||_oo ) * N )
- The relative machine precision (eps) is taken to be 1.110223e-16
- Computational tests pass if scaled residuals are less than 16.0

T/V N NB P Q Time Gflops
--------------------------------------------------------------------------------
WR01C2R4 53248 168 2 4 1189.86 8.459e+01
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)= 0.0032127 ...... PASSED

Finished 1 tests with the following results:
1 tests completed and passed residual checks,
…..
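For a test suite, only two facts from this output matter: did every residual check pass, and is the Gflops figure plausible. The following sketch extracts both. It is not part of the original slides; the function name `check_hpl` and the minimum value are assumptions, and with several runs in one output only the last Gflops value is compared.

```sh
#!/bin/sh
# Hypothetical helper: read xhpl output on stdin; fail if a residual check
# did not pass or if the reported Gflops value is below the minimum $1.
check_hpl() {
    awk -v min="$1" '
        /^W[RC][0-9]/ { gflops = $7 + 0; runs++ }   # result line, e.g. WR01C2R4
        /FAILED/      { failed++ }
        /PASSED/      { passed++ }
        END {
            if (runs == 0 || passed == 0 || failed > 0) { print "FAIL residual"; exit 1 }
            if (gflops < min + 0) { print "FAIL " gflops " Gflops"; exit 1 }
            print "OK " gflops " Gflops"
        }'
}
```

Possible use: mpiexec -np 8 /opt/admin/icsmoke/bin/xhpl.impi | check_hpl 80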


steps to start xhpl:

[user]$ . /opt/intel/mpi/3.1.038/bin64/mpivars.sh
[user]$ . /opt/intel/cce/11.0.81/bin/iccvars.sh intel64
[user]$ echo "starting mpdboot"
[user]$ mpdboot -n 1
[user]$ mpdtrace
[user]$ cd /tmp
[user]$ /opt/admin/icsmoke/scripts/mkhpl 0.9 `hostname` > HPL.dat
[user]$ echo "starting xhpl"
[user]$ mpiexec -np 8 /opt/admin/icsmoke/bin/xhpl.impi
[user]$ mpdallexit

================================================================================
HPLinpack 2.0 -- High-Performance Linpack benchmark -- September 10, 2008
Written by A. Petitet and R. Clint Whaley, Innovative Computing Laboratory, UTK
Modified by Piotr Luszczek, Innovative Computing Laboratory, UTK
Modified by Julien Langou, University of Colorado Denver
================================================================================

An explanation of the input/output parameters follows:
T/V    : Wall time / encoded variant.
N      : The order of the coefficient matrix A.
NB     : The partitioning blocking factor.
P      : The number of process rows.
Q      : The number of process columns.
Time   : Time in seconds to solve the linear system.
Gflops : Rate of execution for solving the linear system.

The following parameter values will be used:

N      : 53248
NB     : 168
PMAP   : Row-major process mapping
P      : 2
Q      : 4
PFACT  : Right
NBMIN  : 4
NDIV   : 2
RFACT  : Crout
BCAST  : 1ringM
DEPTH  : 0
SWAP   : Mix (threshold = 64)
L1     : transposed form
U      : transposed form
EQUIL  : yes
ALIGN  : 8 double precision words

--------------------------------------------------------------------------------

- The matrix A is randomly generated for each test.
- The following scaled residual check will be computed:
  ||Ax-b||_oo / ( eps * ( || x ||_oo * || A ||_oo + || b ||_oo ) * N )
- The relative machine precision (eps) is taken to be 1.110223e-16
- Computational tests pass if scaled residuals are less than 16.0

Column=000336 Fraction=0.005 Mflops=85502.71
Column=000672 Fraction=0.010 Mflops=85598.72
Column=000840 Fraction=0.015 Mflops=85653.06
…..
================================================================================
T/V N NB P Q Time Gflops
--------------------------------------------------------------------------------
WR01C2R4 53248 168 2 4 1189.86 8.459e+01
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)= 0.0032127 ...... PASSED
================================================================================

Finished 1 tests with the following results:
1 tests completed and passed residual checks,
0 tests completed and failed residual checks,
0 tests skipped because of illegal input values.
--------------------------------------------------------------------------------

End of Tests.
================================================================================

General
• tests should be short
• keep your tests identical over life of cluster
• keep results archived
• absolute performance depends on hardware
• threshold and variation over time show degradation
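The last two points can be combined: a new result is compared with the mean of the archived results rather than with a fixed number. The sketch below is not part of the original slides. The function name `check_against_history`, the one-number-per-line archive format and the default of 95 percent are all assumptions.

```sh
#!/bin/sh
# Hypothetical helper: read archived results (one number per line) on stdin
# and fail if the new value $1 is below $2 percent of their mean
# (default: 95 percent).
check_against_history() {
    awk -v new="$1" -v pct="${2:-95}" '
        NF { sum += $1; n++ }
        END {
            if (n == 0) { print "no history"; exit 2 }
            mean = sum / n
            if (new + 0 < mean * pct / 100) { print "DEGRADED"; exit 1 }
            print "OK"
        }'
}
```

Possible use, with a hypothetical archive file: check_against_history 6100 < /var/tmp/sendrecv_history.txt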

Single node tests
• Disk: dd, iozone*, bonnie++*
• Memory: STREAM
• CPU tests:
– Single Thread vs Multi Thread
– Integer vs Floating Point
– Dhrystone*
– Linpack*

dd – standard Unix
www.iozone.org
http://www.coker.com.au/bonnie++/
http://www.cs.virginia.edu/stream/
http://www.top500.org/project/linpack
for instance: http://www.anime.net/~goemon/benchmarks.html

Example: iozone
[user]$ cat run_iozone.sh
cd /tmp
# select the result line by file size and record length, print rewrite and write rate
/opt/admin/bin/iozone -I -r 256 -s 700000 -i 0 | \
awk '$1 == 700000 && $2 == 256 {printf "%10d %10d\n",$4,$3}'

[user]$ pdsh -w et[61-64,69-72] sh run_iozone.sh
et61: 76414 77044
et62: 76379 76888
et69: 75700 77648
et64: 74326 75087
et70: 72444 74298
et63: 72207 72110
et72: 71340 71385
et71: 71535 71280

[user]$ cat run_iozone_refined.sh
cd /tmp
# report FAIL if write or rewrite rate drops below 72000 KB/s
/opt/admin/icsmoke/bin/iozone -I -r 256 -s 700000 -i 0 | \
awk '$1 == 700000 && $2 == 256 {if($3 < 72000 || $4 < 72000){printf "FAIL %10d %10d\n",$4,$3}else{print "OK"}}'

[user]$ pdsh -w et[61-64,69-72] sh run_iozone_refined.sh | dshbak -c
----------------
et72
----------------
FAIL 71709 71794
----------------
et[61-62,64,69-70]
----------------
OK
…

complete output of the iozone command:
[user]$ /opt/admin/bin/iozone -I -r 256 -s 700000 -i 0
Iozone: Performance Test of File I/O
Version $Revision: 3.263 $
Compiled for 64 bit mode.
Build: linux-AMD64

Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins Al Slater, Scott Rhine, Mike Wisner, Ken Goss Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR, Randy Dunlap, Mark Montague, Dan Million, Jean-Marc Zucconi, Jeff Blomberg, Erik Habbinga, Kris Strecker, Walter Wong.

Run began: Fri Aug 20 17:07:30 2010

O_DIRECT feature enabled
Record Size 256 KB
File size set to 700000 KB
Command line used: /opt/admin/bin/iozone -I -r 256 -s 700000 -i 0
Output is in Kbytes/sec
Time Resolution = 0.000001 seconds.
Processor cache size set to 1024 Kbytes.
Processor cache line size set to 32 bytes.
File stride size set to 17 * record size.
random random bkwd record stride
KB reclen write rewrite read reread read write read rewrite read fwrite frewrite fread freread
700000 256 44595 44297
iozone test complete.

Multi Node Tests
• MPI Stack(s)
• Node to Node performance
• Full bisectional bandwidth
• examples:
– writing it yourself
– Intel® IMB
– Microway® MPI Link-Checker*
– hpcc*
– HPL Linpack* 2.0 (within limits)

Example of an MPI Test

[user]$ cat run_mpi.sh
#!/bin/sh
HOSTS=$1
HOSTFILE=`mktemp`
MASTER=`hostname`
ALLNODES=`pdsh -w $HOSTS -N -u 5 hostname | sort`
. /opt/intel/mpi/4.0.0.028/bin64/mpivars.sh
MPIBIN=`which IMB-MPI1`
VAL=""

# run Sendrecv between this node and every other node, one pair at a time
for I in $ALLNODES
do
  if [ $I = $MASTER ]
  then
    VAL="$VAL ------"
  else
    echo $MASTER > $HOSTFILE; echo $I >> $HOSTFILE
    # keep the bandwidth (column 6) of the 4 MB message row only
    VAL="$VAL `mpirun --file=$HOSTFILE -perhost 1 -n 2 $MPIBIN Sendrecv | awk '$1 == 4194304 {printf("%8.0f\n",$6)}'`"
  fi
done

echo "$VAL"
/bin/rm $HOSTFILE

Output of the IMB test:
[user]$ mpirun --file=$HOSTFILE -perhost 1 -n 2 $MPIBIN Sendrecv
#---------------------------------------------------
# Intel (R) MPI Benchmark Suite V3.2.1, MPI-1 part
#---------------------------------------------------
# Date : Fri Aug 20 18:50:10 2010
# Machine : x86_64
# System : Linux
# Release : 2.6.18-164.11.1.el5.crt1
# Version : #2 SMP Tue Jan 26 10:58:02 PST 2010
# MPI Version : 2.1
# MPI Thread Environment: MPI_THREAD_SINGLE

# Calling sequence was:

# /opt/intel/mpi/4.0.0.028/intel64/bin/IMB-MPI1 Sendrecv

# Minimum message length in bytes: 0
# Maximum message length in bytes: 4194304
#
# MPI_Datatype : MPI_BYTE
# MPI_Datatype for reductions : MPI_FLOAT
# MPI_Op : MPI_SUM
#
#

# List of Benchmarks to run:

# Sendrecv

#------# Benchmarking Sendrecv # #processes = 2 #------#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] Mbytes/sec 0 1000 1.53 1.53 1.53 0.00 1 1000 1.56 1.56 1.56 1.22 2 1000 1.56 1.56 1.56 2.44 4 1000 1.53 1.53 1.53 4.99 8 1000 1.59 1.59 1.59 9.59 16 1000 1.60 1.60 1.60 19.05 32 1000 1.66 1.66 1.66 36.68 64 1000 1.72 1.72 1.72 70.85 128 1000 1.89 1.89 1.89 128.98 256 1000 3.00 3.01 3.00 162.49 512 1000 3.29 3.29 3.29 297.09 1024 1000 3.94 3.94 3.94 495.85 2048 1000 5.03 5.03 5.03 776.13 4096 1000 6.02 6.02 6.02 1297.74 8192 1000 8.17 8.17 8.17 1911.79 16384 1000 12.49 12.49 12.49 2501.37 32768 1000 21.18 21.21 21.19 2947.26 65536 640 48.48 48.53 48.50 2575.84 131072 320 48.92 48.93 48.93 5109.55 262144 160 90.18 90.19 90.19 5543.62 524288 80 172.06 172.09 172.08 5811.00 1048576 40 332.03 332.08 332.05 6022.73 2097152 20 651.75 651.80 651.78 6136.85 4194304 10 1303.01 1303.29 1303.15 6138.31

# All processes entering MPI_Finalize
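The Mbytes/sec column can be cross-checked against the timing columns. The following is a minimal Python 3 sketch, not part of the original slides. It assumes that IMB reports Sendrecv throughput as 2 × bytes / t_max with 1 Mbyte = 2^20 bytes, an assumption that reproduces the rows shown above up to rounding. The names `parse_imb_rows` and `sendrecv_mbytes_per_sec` are hypothetical helpers.

```python
import re

# One data row of an IMB table:
# bytes, repetitions, t_min, t_max, t_avg, Mbytes/sec
ROW = re.compile(
    r"^\s*(\d+)\s+(\d+)\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)\s*$"
)

def parse_imb_rows(text):
    """Return {message_size: (t_max_usec, reported_mbytes_per_sec)}."""
    rows = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:  # header and comment lines do not match and are skipped
            rows[int(m.group(1))] = (float(m.group(4)), float(m.group(6)))
    return rows

def sendrecv_mbytes_per_sec(nbytes, t_max_usec):
    # Assumption: every process sends and receives nbytes per repetition,
    # hence the factor 2; 1 Mbyte is taken as 2**20 bytes.
    return 2.0 * nbytes / (t_max_usec * 1e-6) / 2**20

# Two rows copied from the output above
SAMPLE = """\
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec] Mbytes/sec
  1048576           40       332.03       332.08       332.05      6022.73
  4194304           10      1303.01      1303.29      1303.15      6138.31
"""

rows = parse_imb_rows(SAMPLE)
for size, (t_max, reported) in sorted(rows.items()):
    print(size, reported, round(sendrecv_mbytes_per_sec(size, t_max), 2))
```

The recomputed values differ from the reported ones only in the last digits, because t_max is printed with two decimals. The same parser could replace the awk filter in run_mpi.sh if more than one message size is of interest.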

Output of an MPI Test

[user]$ pdsh -w et[61-64,69-72] -f 1 sh run_mpi.sh et[61-64,69-72]
et61: ------   6128   6134   6142   6148   6136   6140   6138
et62:   6138 ------   6137   6137   6143   6135   6137   6132
et63:   6137   6133 ------   6138   6139   6137   6141   6143
et64:   6134   6142   6139 ------   6138   6145   6134   6139
et69:   6130   6127   6133   6142 ------   6145   6142   6133
et70:   6140   6143   6128   6133   6131 ------   6138   6141
et71:   6147   6149   6136   6146   6134   6137 ------   6138
et72:   6135   6131   6124   6137   6127   6140   6142 ------
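On a healthy fabric all entries of this matrix are close to each other, so a degraded link shows up as an outlier. A minimal Python 3 sketch of such a check follows. It is not from the original slides and assumes that the entries are separated by whitespace and that the columns appear in the same (sorted) order as the rows. `parse_matrix`, `slow_links`, the 5 % tolerance, and the sample values are all hypothetical and chosen for illustration only.

```python
import re
from statistics import median

# A matrix row looks like "host: v1 v2 ...", where "------" marks the host itself
ROW = re.compile(r"^\s*([\w.-]+):\s*(.+)$")

def parse_matrix(text):
    """Return {(source, target): bandwidth} for all measured pairs."""
    rows = [m.groups() for m in map(ROW.match, text.splitlines()) if m]
    hosts = [host for host, _ in rows]  # assumed to equal the column order
    bandwidths = {}
    for src, values in rows:
        for dst, value in zip(hosts, values.split()):
            if not value.startswith("-"):  # skip the self entry
                bandwidths[(src, dst)] = float(value)
    return bandwidths

def slow_links(bandwidths, tolerance=0.05):
    """Return pairs whose bandwidth is more than `tolerance` below the median."""
    reference = median(bandwidths.values())
    limit = (1.0 - tolerance) * reference
    return sorted(pair for pair, bw in bandwidths.items() if bw < limit)

# Made-up values for illustration, not measurements: the n2 <-> n3 link is degraded
SAMPLE = """\
n1: ------ 6130 6140
n2: 6135 ------ 3100
n3: 6138 3090 ------
"""

print(slow_links(parse_matrix(SAMPLE)))
```

Using the median as reference keeps a few bad links from shifting the threshold. An absolute limit derived from the expected QDR InfiniBand bandwidth would work equally well.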

Connecting to the Windows Server

• open “Putty”
  – enter IP Address
  – open Connection/SSH/Tunnels
  – Enter tunnel data: Source Port: 13389 Destination: localhost:3389
  – open connection
• open “Remote Desktop”
  – connect to: localhost:13389

The Windows* toolbox
• Powershell* + WMI* Interface
• can query remote nodes
• Rich repository of information
• Problem: getting it coherent
• Solution: write a simple script

PS > get-wmiObject -computer emsw001 win32_service | format-table

ExitCode  Name                  ProcessId  StartMode  State    Status
--------  ----                  ---------  ---------  -----    ------
0         AeLookupSvc           0          Manual     Stopped  OK
1077      ALG                   0          Manual     Stopped  OK
1077      AppIDSvc              0          Manual     Stopped  OK
1077      Appinfo               0          Manual     Stopped  OK
1077      AppMgmt               0          Manual     Stopped  OK
1077      AudioEndpointBui...   0          Manual     Stopped  OK
1077      AudioSrv              0          Manual     Stopped  OK
0         BFE                   304        Auto       Running  OK
0         BITS                  828        Manual     Running  OK
1077      Browser               0          Disabled   Stopped  OK
0         CertPropSvc           828        Manual     Running  OK

Code

import os,sys,getopt,re
import random, threading,time
import subprocess

STDOUT=threading.RLock()    # serializes printing
WAKEUP=threading.Event()    # set by a worker when it has finished
WLock =threading.RLock()

# Default Values
Fanout = 12
Timeout = 3600
ExcludeNodes = ""
Nodelist = ""
Verbose = 0

def usage():
    print sys.argv[0], "[-c] [-h] [--help] "
    print " -h, --help this message"
    print " -w NODES list of nodes to use"
    print " -X NODES list of nodes to exclude from -w RANGE"
    print " -u TIMEOUT timeout in s, defaults to 3600"
    print " -f FANOUT maximum number of parallel requests, defaults to 12"
    print " -v verbose mode"

def ExpandNodes(Nodelist):

    # replace all "," inside a range indicated by "[...]" by ";"
    Temp = ""
    Colon = ","
    for Char in Nodelist:
        if Char == "[":
            Colon = ";"
        elif Char == "]":
            Colon = ","

        if Char == ",":
            Temp = Temp + Colon
        else:
            Temp = Temp + Char

    Array = Temp.split(",")
    Return = []

    for Part in Array:
        # split "prefix[ranges]postfix" into its three parts
        (Prefix,Range,Postfix) = re.search(r"(.*?)\[?([0-9;-]*)]?(\D*)$",Part).groups()
        RangeParts = Range.split(";")
        for Item in RangeParts:
            if Item.count("-"):
                (Start,End) = re.search(r"(\d*)-(\d*)$",Item).groups()
                # keep leading zeros, e.g. 000-001
                Len = max(len(Start),len(End))
                Format = "%0"+str(Len)+"d"

                Start = int(Start)
                End = int(End)

                Count = Start
                while Count <= End:
                    Item = Format % (Count)
                    Return.append(Prefix + Item + Postfix)
                    Count = Count + 1
            else:
                Return.append(Prefix + Item + Postfix)

    return Return

class Worker(threading.Thread):

    def __init__(self,ID,Host,Com):
        self.ID = ID
        self.Host = Host
        # "%N" in the command is replaced by the node name
        self.Com = Com.replace("%N",Host)
        threading.Thread.__init__(self)

    def run(self):
        global STDOUT,WLock,WAKEUP

        if Verbose:
            STDOUT.acquire()
            print self.ID,self.Host,":",self.Com
            STDOUT.release()

        Result = subprocess.Popen(["powershell", self.Com], stdout=subprocess.PIPE).communicate()[0]
        Result = Result.split("\n")

        # prefix every output line with the node name
        Return = []
        for Line in Result:
            Return.append(self.Host+": "+Line.rstrip())
        Return = "\n".join(Return)

        if Verbose:
            STDOUT.acquire()
            print self.ID,self.Host
            STDOUT.release()

        STDOUT.acquire()
        print Return
        STDOUT.release()

        # wake up the main loop so the thread slot can be reused
        WLock.acquire()
        WAKEUP.set()
        WLock.release()

###############################################################
#
# Program starts
#
###############################################################

if len(sys.argv) == 1:
    usage()
    sys.exit(0)
else:
    Options,Arguments = getopt.getopt(sys.argv[1:],"chw:X:u:f:v", ["help"])
    for O,V in Options:
        if O in ("-h","--help"):
            usage()
            sys.exit(0)

        if O == "-w":
            Nodelist = V
        elif O == "-X":
            ExcludeNodes = V
        elif O == "-u":
            Timeout = int(V)
        elif O == "-f":
            Fanout = int(V)    # must be an integer, it is used with range()
        elif O == "-v":
            Verbose = 1
        else:
            pass

Command = " ".join(Arguments)
if Verbose:
    print "Nodelist: ",Nodelist
    print "Fanout: ",Fanout
    print "Timeout: ",Timeout
    print "ExcludeNodes: ",ExcludeNodes
    print "Verbose: ",Verbose
    print "Command: ",Command
if Nodelist == "":
    print " no nodes added, exiting"
    sys.exit(1)
if Command == "":
    print " no command, exiting"
    sys.exit(2)

Nodes = ExpandNodes(Nodelist)
for Node in ExpandNodes(ExcludeNodes):
    if Nodes.count(Node):
        Nodes.remove(Node)

Nodes.sort()
if Verbose:
    print Nodes

MaxUsed = Fanout
Used = 0
Threads = range(Fanout)
IDs = range(Fanout)
TermEvents = range(Fanout)
Pool = range(Fanout)    # free thread slots
Pool.reverse()
Hosts = Nodes[:]

WLock.acquire()
while len(Hosts):
    # start workers as long as slots and hosts are available
    while len(Pool) and len(Hosts):
        ID = Pool.pop()
        Host = Hosts.pop()
        Threads[ID] = Worker(ID,Host,Command)
        Threads[ID].start()

        if Verbose:
            STDOUT.acquire()
            print "started Thread",ID,Host
            STDOUT.release()

    # wait until a worker signals completion (or 1 s has passed)
    WAKEUP.clear()
    WLock.release()
    WAKEUP.wait(1)
    WLock.acquire()
    # return the slots of finished workers to the pool
    for ID in range(Fanout):
        if Pool.count(ID): continue
        if Threads[ID].isAlive(): continue
        Pool.append(ID)
        if Verbose:
            STDOUT.acquire()
            print "end thread",ID,"workload",len(Hosts),"free",len(Pool)
            STDOUT.release()
    if Verbose:
        STDOUT.acquire()
        print Pool
        STDOUT.release()

WLock.release()

Example Querying Multiple Nodes

Z:\admin>C:\crtdc\Python27\python.exe pdsh.py -w emsw000,emsw001 "get-wmiObject -computer %N win32_product | sort -property IdentifyingNumber" | C:\crtdc\Python27\python.exe dshbak.py -c
------
emsw[000-001]
------
IdentifyingNumber : {CD5190DD-A85D-4844-9BF4-AC6B04EB1A12}
Name              : Microsoft HPC Pack 2008 R2 Server Components
Vendor            : Microsoft Corporation
Version           : 3.0.2369.0
Caption           : Microsoft HPC Pack 2008 R2 Server Components

IdentifyingNumber : {CED243AB-C7BA-3D42-9609-14EF5A6FC601}
Name              : Microsoft Report Viewer Redistributable 2008 (KB971119)
Vendor            : Microsoft Corporation
Version           : 9.0.30731
Caption           : Microsoft Report Viewer Redistributable 2008 (KB971119)

IdentifyingNumber : {D3299935-57F7-403A-9D7B-0B8F9F56F44B}
Name              : Microsoft HPC Pack 2008 R2 MS-MPI Redistributable Pack
Vendor            : Microsoft Corporation
Version           : 3.0.2369.0
Caption           : Microsoft HPC Pack 2008 R2 MS-MPI Redistributable Pack

IdentifyingNumber : {D86BF5A7-BB6E-423F-AA1D-02B5F59C38B0}
Name              : Microsoft HPC Pack 2008 R2 Client Components
Vendor            : Microsoft Corporation
Version           : 3.0.2369.0
Caption           : Microsoft HPC Pack 2008 R2 Client Components
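The dshbak.py script used in the pipeline above is not shown on the slides. A minimal Python 3 sketch of what such a collation step might do: group the "host: text" lines written by pdsh.py per host, then merge hosts whose output is identical, so that deviating nodes stand out. The function name `collate` and the sample lines are hypothetical; the node emsw002 and its version string are made up for illustration. Unlike the real tool, this sketch does not compress host names into ranges such as emsw[000-001].

```python
from collections import defaultdict

def collate(lines):
    """Merge hosts with identical output; return [(hosts, output), ...]."""
    per_host = defaultdict(list)
    for line in lines:
        # split at the first ":" only, the payload may contain further colons
        host, _, text = line.partition(":")
        if text.startswith(" "):
            text = text[1:]  # drop the single separator blank
        per_host[host].append(text)

    groups = defaultdict(list)
    for host, output in per_host.items():
        groups["\n".join(output)].append(host)

    # order the groups by their first host name for stable output
    merged = [(sorted(hosts), output) for output, hosts in groups.items()]
    return sorted(merged, key=lambda group: group[0])

# Illustrative input in the format printed by pdsh.py; emsw002 is made up
SAMPLE = [
    "emsw000: Version : 3.0.2369.0",
    "emsw001: Version : 3.0.2369.0",
    "emsw002: Version : 3.0.0000.0",
]

for hosts, output in collate(SAMPLE):
    print("------")
    print(",".join(hosts))
    print("------")
    print(output)
```

With the sample above, emsw000 and emsw001 end up in one group and emsw002 in a second one, which is exactly the information needed to spot an inconsistently installed node.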

Win32* Subsystems

Win32_DiskDrive                          Win32_ActiveRoute
Win32_BIOS                               Win32_IP4PersistedRouteTable
Win32_CacheMemory                        Win32_IP4RouteTable
Win32_MemoryArray                        Win32_IP4RouteTableEvent
Win32_MemoryArrayLocation                Win32_NetworkClient
Win32_MemoryDevice                       Win32_NetworkConnection
Win32_MemoryDeviceArray                  Win32_NetworkProtocol
Win32_MemoryDeviceLocation               Win32_NTDomain
Win32_PhysicalMemory                     Win32_PingStatus
Win32_PhysicalMemoryArray                Win32_ProtocolBinding
Win32_PhysicalMemoryLocation             Win32_BootConfiguration
Win32_Processor                          Win32_ComputerSystem
Win32_SystemMemoryResource               Win32_ComputerSystemProcessor
Win32_NetworkAdapter                     Win32_OperatingSystem
Win32_NetworkAdapterConfiguration        Win32_SystemNetworkConnections
Win32_NetworkAdapterSetting              Win32_SystemOperatingSystem
Win32_SystemDriver                       Win32_SystemProcesses
Win32_DiskPartition                      Win32_SystemProgramGroups
Win32_LogicalDisk                        Win32_SystemResources
Win32_LogicalMemoryConfiguration         Win32_SystemServices
Win32_SystemLogicalMemoryConfiguration   Win32_SystemSetting
                                         Win32_SystemSystemDriver

Some Examples
• Microsoft HPC Cluster 2008R2* has a validation subsystem
• Science+Computing SC.Venus* and SCluster* (Windows + Linux)
• Intel® Cluster Checker (Linux)

Summary
• Consistent validation is a tool to ensure cluster health
• should be done after changes and on a regular basis
• can grow from simple scripts to complete commercial solutions

Legal Disclaimer
• INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL® PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL’S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL® PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.
• Intel may make changes to specifications and product descriptions at any time, without notice.
• All products, dates, and figures specified are preliminary based on current expectations, and are subject to change without notice.
• Intel processors, chipsets, and desktop boards may contain design defects or errors known as errata, which may cause the product to deviate from published specifications. Current characterized errata are available on request.
• Nehalem, Westmere and other code names featured are used internally within Intel to identify products that are in development and not yet publicly announced for release. Customers, licensees and other third parties are not authorized by Intel to use code names in advertising, promotion or marketing of any product or services and any such use of Intel's internal code names is at the sole risk of the user.
• Performance tests and ratings are measured using specific computer systems and/or components and reflect the approximate performance of Intel products as measured by those tests. Any difference in system hardware or software design or configuration may affect actual performance.
• Intel, Xeon and the Intel logo are trademarks of Intel Corporation in the United States and other countries.
• *Other names and brands may be claimed as the property of others.
• Copyright ©2010 Intel Corporation.
