Ipmitool –Syscfg

Validation of an HPC Cluster: A Sometimes Neglected Aspect of System Administration
Walk-through of methods and procedures

Michael Hebenstreit
INTEL® corp. CRT Datacenter, Senior Cluster Architect
tut118, SC2010

Agenda
• CRT-DC – the Customer Response Data Center
• The problem
• Tier 1: The hardware
• Tier 2: the installed image
• Tier 3: performance tests
• A look at MS Windows*
• Commercial solutions

CRT Datacenter Challenges
• Support for variety
  – Multitude of different hardware architectures
  – Early access often leads to alpha and beta systems used in cluster configuration
• Support for different customers
  – OEMs, End users, ISVs
  – Some want their own configuration
  – Manage access while preserving security of data for each user
  – Protect the internal network and Intel IP from external disclosure
• Support for scaling
  – Often requires exclusive period due to custom configurations
  – Remove compute nodes out of circulation for the duration of the project

CRT-DC cluster configuration
[Diagram: 360 Urbanna compute nodes (24 GB RAM, 400 GB SAS HD) and 64 Supermicro* compute nodes (24 GB RAM, 500 GB SATA HD); service nodes admin1, admin2, pbs-serv1, pbs-serv2, login, login 2, compile; storage on Panasas* (/home, long-term storage), DDN* Lustre, LFS4 (HDD) and LFS5 (SSD); all connected by a Force10* 1GbE network and a QDR InfiniBand network]

Exemplary Configurations
• Nodes
  – 360 Intel SR1600UR: Xeon® X5670 (WSM), 2.93 GHz, 12 cores/node, 24 GB
  – 64 Supermicro 6026T-NTR+: 34 Xeon® X5560 (NHM, 2.8 GHz, 8 cores/node), 40 Xeon® X5677 (WSM, 3.47 GHz, 8 cores/node), all 24 GB
• Cluster File System
  – Panasas* (70 TB storage)
  – DDN* Lustre (28 TB storage)
  – HDD Lustre (23 TB storage)
  – SSD Lustre (3 TB storage)
• Distributed GigE:
  – Force10* Networks C-300 backbone, Force10 Networks S50N top-of-rack
• Distributed InfiniBand*:
  – Mellanox* MTS3600Q, 18 spine, 28 leaf switches, 504 ports
• Software stack:
  – RedHat* EL5, OFED 1.3+, Lustre 1.6.4.3+
• Has been on Top 500 since June 2006 (best ranking #68, worst #153)

Classification
• Hardware and software defects: system is dead or does not operate correctly
• Inconsistencies: configuration (config files, installed rpms…) is not identical across the cluster
• Degradation: system operates correctly but has lost performance -> keep log files

The Linux Toolbox
• Executing commands in parallel
  – pdsh*
• Consolidating pdsh output
  – dshbak*
• cat, grep, sum, sed, awk…
• shell scripting
• advanced programming languages like Python* or Perl*
pdsh homepage: http://sourceforge.net/projects/pdsh

redirect – To send the output of a file or command into another file
[smartuser@server1~]$ echo "\"To err is human -" > text1
[smartuser@server1~]$ echo "and to blame it on a computer is even more so."\" > text2
[smartuser@server1~]$ echo "Robert Orben" > text3
------------------------------------------------------------
cat (concatenate) – Displays the contents of one or more files to standard output. It is most commonly used to display a single file on a monitor.
[smartuser@server1~]$ cat text1
"To err is human -
[smartuser@server1~]$ cat text2
and to blame it on a computer is even more so."
[smartuser@server1~]$ cat text3
Robert Orben
[smartuser@server1~]$ cat text1 text2 text3
"To err is human -
and to blame it on a computer is even more so."
Robert Orben
[smartuser@server1~]$ cat text1 text2 text3 > text4
[smartuser@server1~]$ cat text4
"To err is human -
and to blame it on a computer is even more so."
Robert Orben
------------------------------------------------------------
grep – Used to find a text pattern within a file and return the line(s) containing the pattern. Most commonly used to find a word, but can find a character, phrase, sentence or any regular expression.
[smartuser@server1~]$ grep computer text4
and to blame it on a computer is even more so."

grep -i – Because grep is case sensitive, -i is used to ignore case.
[smartuser@server1~]$ grep to text4
and to blame it on a computer is even more so."
[smartuser@server1~]$ grep -i to text4
"To err is human -
and to blame it on a computer is even more so."

grep -c – Counts the number of lines that contain the expression.
[smartuser@server1~]$ grep -c is text4
2

grep -v – Searches for lines that do not contain the expression.
[smartuser@server1~]$ grep -v is text4
Robert Orben

grep -q – Searches quietly: nothing is printed, and grep exits as soon as the expression is found. When grep finishes, its exit code is stored in the variable $?. Echoing $? shows whether the expression is present: success = 0, failure = 1. This is useful in "if" statements to avoid confusing output to a user.
[smartuser@server1~]$ grep -q man text4; echo $?
0
[smartuser@server1~]$ grep -q woman text4; echo $?
1
------------------------------------------------------------
sum – Computes a 16-bit checksum for each given file and counts the blocks each file occupies. The checksum is calculated after a file transfer and compared to the checksum of the original file to ensure file integrity.
[smartuser@server1~]$ sum text4
05333 1
[smartuser@server1~]$ sum text1 text2 text3
24872 1 text1
63331 1 text2
20594 1 text3
------------------------------------------------------------
awk (printing a specific column) – awk is generally used to search output or a file for a pattern and then manipulate it. When awk finds a specified pattern in a line, it assigns each whitespace-separated part of that line to its own variable, e.g. $1 $2 $3 $4 $NF. The smart user can then manipulate the values by using the variables.
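The field variables described above can be sketched as follows. This is only an illustration: it recreates the three-line sample file in a temporary directory so the block runs on its own, and the names workdir, file and first_and_last are made up for this sketch.

```shell
# Recreate the tutorial's sample file in a scratch directory (hypothetical names).
workdir=$(mktemp -d)
file="$workdir/text4"
printf '%s\n' '"To err is human -' \
  'and to blame it on a computer is even more so."' \
  'Robert Orben' > "$file"

# For every line: $1 is the first field, $NF the last field,
# and NF the number of whitespace-separated fields on that line.
first_and_last=$(awk '{print $1, $NF, NF}' "$file")
echo "$first_and_last"
```

Because no pattern is given, awk applies the print action to every line; the comma in print separates the values with a single space.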
[smartuser@server1~]$ cat text4
"To err is human -
and to blame it on a computer is even more so."
Robert Orben
[smartuser@server1~]$ awk '{print $2}' text4
err
to
Orben

To limit the output we can use a pattern telling awk to only consider the line that begins with "and".
[smartuser@server1~]$ awk '/^and/ {print $3" "$6" "$7}' text4
blame a computer

piping with "|" – The pipe lets us direct output from one command directly into another. So here is another way to get the same output.
[smartuser@server1~]$ grep blame text4 | awk '{print $3" "$6" "$7}'
blame a computer
------------------------------------------------------------
sed (changing text) – sed is most useful for making text transformations on an input stream, whether from a file or a pipeline. The single quotes contain the logic sed is to follow: s = substitute, computer is the expression to find, and dog is the expression to put in its place. g means global and tells sed not to stop at the first occurrence on a line, but to make the change everywhere the expression "computer" occurs.
[smartuser@server1~]$ grep blame text4 | awk '{print $3" "$6" "$7}' | sed 's/computer/dog/g'
blame a dog

Or
[smartuser@server1~]$ awk '/^and/ {print $3" "$6" "$7}' text4 | sed 's/computer/cat/g'
blame a cat

For fun
[smartuser@server1~]$ OTHERS="horse pig mouse goat"
[smartuser@server1~]$ echo $OTHERS
horse pig mouse goat
[smartuser@server1~]$ for i in $OTHERS; do awk '/^and/ {print "You should "$3" "$6" "$7}' text4 | sed "s/computer/$i/g"; done
You should blame a horse
You should blame a pig
You should blame a mouse
You should blame a goat
------------------------------------------------------------
sort – Used to sort either alphabetically or numerically. If you pipe standard output to sort, you will see sorted results on your monitor.
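The difference between alphabetical and numerical sorting can be sketched as follows. The sample values and the variable names are invented for illustration; -n is the standard sort option for numeric comparison.

```shell
# Hypothetical sample values, one per line.
values=$(printf '%s\n' 25 100 9)

# Without options sort compares the lines as text, character by character,
# so "100" comes before "25" because "1" sorts before "2".
alpha=$(echo "$values" | sort)

# With -n the lines are compared as numbers.
numeric=$(echo "$values" | sort -n)

echo "alphabetical:"; echo "$alpha"
echo "numerical:";    echo "$numeric"
```

When comparing numbers collected from many nodes, such as timings or sizes, the numeric form is usually the one that is wanted.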
