
Introduction to Today’s Class

• What is a cluster?

• History of UNIX
• Why is UNIX the operating system of scientific computing?

• UNIX basics – Working from the command line

• Practice Problems

Connecting to the UH cluster

• Connect via ssh (“secure shell”)
• Installed by default on any UNIX-based machine (Mac/Linux)
• Mac: Terminal (Applications->Utilities)
• Linux Machine (Terminal)
• Windows – PuTTY, a free ssh client

• Need:
• An account
• IP address
• Format:
$ ssh username@IP_address
• For example:
$ ssh [email protected]
$ ssh [email protected]

Check out a node to use in interactive mode:
$ qsub -I -l walltime=04:00:00

Do not run jobs on the login nodes!

Using a cluster

• Log onto the “login” node
• Home – store your files
• Do not run processes on the login nodes!

To use the “compute” nodes:
• Submit a job (“qsub -I”; “exit” when done)
• Check out many nodes at once using a job script

What is a cluster?

Loosely connected or tightly connected computers that work together so that in many respects they can be viewed as a single system.
• http://en.wikipedia.org/wiki/Computer_cluster

- Connected through fast LANs – high speed data transfer
- Each “node” (or server) runs its own instance of the operating system
- User logs into a single “login” node, where they have a home directory to store files
- Request to use multiple nodes via a job scheduler software
- Job scheduler assigns nodes to the particular user, runs the job, and returns the results.
- Check out single nodes for interactive use

Beowulf Cluster

Off-the-shelf computers connected to one another via a network

One host and multiple clients (“master” node and “slave” nodes)

Easy to grow and update – just add computers

Two ways of using a cluster:

(1) One big job, distributed across the nodes: “parallel computing”
(2) Many individual jobs, each run on its own computer: “batch” computing

Any given run on a node is probably no better/faster than a run on your own computer.

History of UNIX

• Created in 1969 as part of AT&T’s Bell Labs
• Multi-user, multi-tasking operating system – lets many programmers simultaneously work on the same computer
• OS coordinates the use of the computer’s resources (real-time sharing)
• Each user is unaware of the activities of other users
• First attempt called

• Key Features:
• Multi-user
• Multi-tasking
• System Portability
• UNIX tools (modular design - many small programs rather than one monolithic program)

Source: http://www.bell-labs.com/history/unix/tutorial.html

[Photo: porting UNIX to the PDP-11]

History of UNIX (cont’d)

• 1976-1977 Ken Thompson took a sabbatical at UC Berkeley – taught a course on UNIX

• Students and professors at Berkeley continued to develop the system – led to the development of Berkeley Software Distribution (BSD) UNIX

• UNIX eventually became licensed/trademarked, and ‘true’ UNIX systems follow the UNIX specification

• Many operating systems in the UNIX family are not official UNIX, but “Unix-like” – these include Linux and Mac OS X

Evolution of UNIX and UNIX-like Systems

Source: http://en.wikipedia.org/wiki/Unix

GNU + Linux = Open source UNIX

Richard Stallman, Free Software Foundation
Linus Torvalds

GNU is not UNIX!

GNU: , utilities

Linux: kernel

GNU General Public License (GPL)

• Most widely used free software license

Users are free to use, modify, and redistribute the software

However, the same terms and conditions apply to the derivative work.

GPL is a “copyleft” license. Copyrights limit use, modification and redistribution. “Copyleft” does the opposite – ensures that the code will never become proprietary.

Three Levels of the UNIX Operating System

• Kernel – core of the operating system; controls the hardware

• Shell – acts as the interpreter between the user and the computer

• Applications/Utilities

UNIX directory structure

/bin = GNU/Linux utilities
/home = user accounts
/lib = libraries

/usr = where additional/optional binaries (/usr/bin) or libraries (/usr/lib) are installed

“which” or “whereis” can help you figure out where a given program or library is installed

User has RWX (read-write-execute) permissions only for their own home directory (and sub-directories)

“root” (administrator) access is required to install at the system level

“su” (superuser mode) or “sudo” (in front of a command) lets you execute commands as root
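
The points above can be tried out with a short script. This is a minimal sketch, assuming a bash shell; looking up `ls` and probing `/usr/bin` are illustrative choices, not part of the original slides.

```shell
#!/usr/bin/env bash
# Where is a given program installed? (`ls` is just an illustrative choice.)
# `command -v` is a shell built-in fallback in case `which` is not installed.
ls_path=$(which ls 2>/dev/null || command -v ls)
echo "ls is installed at: $ls_path"

# Show the permission string of your home directory.
# The first "rwx" triplet after the leading "d" belongs to the owner (you).
ls -ld "$HOME"

# Check whether you may write to a directory before trying to install there.
for dir in "$HOME" /usr/bin; do
    if [ -w "$dir" ]; then
        echo "$dir: writable by $(whoami)"
    else
        echo "$dir: not writable by $(whoami); root access needed"
    fi
done
```

On a typical cluster account the home directory should be reported as writable and `/usr/bin` as not writable, which is why system-level installs go through the sys admin.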

UNIX directory structure

• Files are organized into directories.
• Everything is a file (including devices, executables, scripts, links)
• File suffixes are for convention only (but encouraged!)

• Directories are organized in a hierarchical manner

• Pathnames show the location of the file
• UNIX uses forward slashes (“/”) to separate directories
• Pathnames can be absolute or relative

~    Home directory
.    This directory
..   Up one level in the directory

• The trailing “/” when navigating to a directory is optional:
“cd /home/ostrow” == “cd /home/ostrow/” == “cd ~” (when /home/ostrow is your home directory)
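
A minimal sketch of absolute versus relative pathnames, assuming a bash shell. The practice directories (`project/data`) are hypothetical and are created in a temporary location so that nothing in your home directory is touched.

```shell
#!/usr/bin/env bash
# Build a small practice tree in a temporary location (hypothetical names).
practice=$(mktemp -d)
mkdir -p "$practice/project/data"

cd "$practice/project/data"    # absolute path: starts from "/"
abs_dir=$(pwd)

cd ..                          # relative path: up one level, into project
cd ./data                      # relative path: "." is the current directory
rel_dir=$(pwd)

cd "$practice/project/data/"   # a trailing "/" makes no difference
slash_dir=$(pwd)

echo "absolute: $abs_dir"
echo "relative: $rel_dir"
echo "trailing: $slash_dir"

cd ~                           # "~" is shorthand for your home directory
echo "home:     $(pwd)"

rm -rf "$practice"             # clean up the practice tree
```

All three `pwd` results should name the same directory: the route taken (absolute, relative, or with a trailing slash) does not change where you end up.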

Navigating on a UNIX machine

• Home directory
• List contents (ls)
• Change directory (cd)
• Print working directory (pwd)

- cd .
- cd /
- cd ~
- cd ./
- cd ..
- cd /project
- cd /home/ostrow

- ls
- ls -l
- ls -la
- ls -lrt

- pwd
- locate

“Root” (= “Administrator”)

• Dangerous for any user to be able to modify files on a multi-user UNIX cluster

• A typical user has “write” privileges that are limited to his or her own directory (and its subdirectories) • Will have broad read/execute privileges, so can run applications

• Need to be a superuser (“root”) to be able to write at the system level

• Ask sys admin to install software at the system level OR do a “local” install (i.e., install in your own directory).

Three ways to have root privileges

1. Log in as root
$ ssh [email protected]

2. Go into “superuser” mode:
$ su

(You must be on the superuser list.)

3. Issue a command as “superuser”
$ sudo [your command here]

http://xkcd.com/149/

A Note to Mac Users

• Although Macs are UNIX-based, many of the standard utilities are not installed by default

• Download the Apple Command-Line Tools. Two ways: install Xcode (free from the Apple website) OR register as an Apple Developer and use the downloads site

• Consider installing a package manager (fink, MacPorts, Homebrew)

• With command-line tools, you will have most UNIX utilities installed and ready to use

• Windows machine – Try a Linux virtual machine or Cygwin

UNIX command line

• General format of UNIX commands: command [-options] [target]

For example:
ls
ls ~
ls /users/eaostrow
ls -altr /users/eaostrow
less countATGC.rb
countATGC.rb countAGCT.rb
cut -f1 myfile.dat
wc -l snpfile

Useful Commands

Consult the manpages if you can’t remember the syntax!
e.g., “man ls”, “info ls”

Navigation:
ls (list files)
pwd (print working directory)
cd path_to_dir (change directory)

Creating Files:
touch filename

Renaming or Moving Files:
cp filename destination_dir
scp filename user@hostname:path_to_dest
mv filename new_filename
rm filename #DANGEROUS
mkdir my_directory
rmdir my_directory
rm -rf ./mydir #DANGEROUS

Useful Commands (cont’d)

Reading files:
less [-S] filename
more filename
cat filename

Keystrokes in less:
q to quit;
SPACE to scroll down;
/ to search;

Finding text in files:
grep pattern filename
sort filename
cut [-f]
wc [-l] filename (=word count)

Printing lines:
head filename
tail filename

Redirect:
> (redirect to file, replace)
>> (redirect to file, append)

Grep and sort have many powerful options!
grep: -c, -n, -A, -B, -v
sort: -k -n -r

UNIX command line (cont’d)

• The power of the UNIX command line is the ability to build complex tasks by chaining commands together:

Using “pipes”, the output of one command becomes the input to the next

Example: $ less QS73_m15q338.pileup | grep -c DDB0169550 (counts number of mitochondrial sites)
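
A minimal sketch of pipes and redirection on a file small enough to check by eye, assuming a bash shell. The file `toy.pileup` and its four tab-separated columns (chromosome ID, position, reference base, coverage) are made up for illustration and are not real data; `cat` stands in for `less`, which behaves the same way when its output goes into a pipe.

```shell
#!/usr/bin/env bash
# Tiny made-up stand-in for a pileup file (tab-separated columns:
# chromosome ID, position, reference base, coverage). Not real data.
workdir=$(mktemp -d)
printf 'DDB0232428\t1\tA\t12\n'  > "$workdir/toy.pileup"
printf 'DDB0232428\t2\tT\t15\n' >> "$workdir/toy.pileup"
printf 'DDB0169550\t1\tG\t90\n' >> "$workdir/toy.pileup"

# Pipe: cat writes the file to stdout, grep -c counts the matching lines.
n_sites=$(cat "$workdir/toy.pileup" | grep -c DDB0169550)
echo "lines matching DDB0169550: $n_sites"

# Pipe plus redirect: keep the DDB0232428 lines, cut out columns 2 and 4,
# and write the result to a file (">" replaces any existing content).
cat "$workdir/toy.pileup" | grep DDB0232428 | cut -f2,4 > "$workdir/chr1.cover.dat"

# ">>" appends to the file instead of replacing it.
echo "# end of chr1" >> "$workdir/chr1.cover.dat"

n_lines=$(wc -l < "$workdir/chr1.cover.dat" | tr -d ' ')
cat "$workdir/chr1.cover.dat"
```

Note that `grep -c` prints only a count, so it belongs at the end of a chain; to pass the matching lines on to `cut`, use plain `grep`. Remove the practice directory with `rm -rf "$workdir"` when done.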

• Re-direct output to file (>, >>)
$ less QS73_m15q338.pileup | grep DDB0232428 | cut -f2,4 > chr1.cover.dat

• By chaining, grepping, and redirecting, you can efficiently filter and customize data sets.

Keystrokes

• Tab – auto-completes command (Tab again to see completion options)
• Up-arrow / Down-arrow – cycle through previous commands
!!      Executes the previous command
reset   Resets the terminal if not displaying properly
clear   Clears the screen

Ctrl-A  Beginning of the line
Ctrl-B  Moves cursor backwards
Ctrl-C  Cancel currently running command
Ctrl-E  End of the line
Ctrl-F  Moves cursor forward
Ctrl-H  Delete / Backspace
Ctrl-K  Delete forward from cursor
Ctrl-L  Redraws the screen (same as clear)
Ctrl-P  Pastes previous line (same as up-arrow)
Ctrl-U  Delete backwards from cursor
Ctrl-W  Deletes last word typed
Ctrl-Z  Suspends a running process

Killing a process

• Cancel a job you’ve started on the cluster:
$ qstat -u username
$ canceljob job_ID (the job ID is listed by qstat)

• If your computer becomes unresponsive:
- Ctrl-C
- Open a second terminal and type “kill psid”, where psid is your process ID
- To find your psid: type “ps aux” or “ps aux | grep pattern”

For Monday’s Quiz

• Know how to navigate up and down the UNIX directory structure
• Get back to your home directory (efficiently)
• Print your working directory
• List directory contents

• Create, copy, remove, move, and rename directories and files (touch, cp, rm, mv, mkdir, rmdir)

• Difference between a relative and an absolute path; how to find and specify both (for example: “cd ../”, “cd ./”, “cd ~”, “pwd”, “less ../myfile.dat”, “less ~/myfile.dat”, etc.)

• Log on remotely (ssh); copy files from one machine to another (scp)

• How to obtain a node for an interactive job (qsub)

• How to exit from a shell (exit); how to kill a process (Ctrl-C)

• Keystrokes to type efficiently on the command line; tab completion

• How to access, interpret, and use manpages

• How to read the contents of a file (less, cat, more, >, >>, head, tail)
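
One way to rehearse the file-management items on this list is a short practice run in a throwaway directory. This is a sketch, assuming a bash shell; every file and directory name in it (`quiz_dir`, `notes.txt`, `renamed.txt`) is hypothetical.

```shell
#!/usr/bin/env bash
# Practice area in a temporary directory; all names here are hypothetical.
sandbox=$(mktemp -d)
cd "$sandbox"

mkdir quiz_dir                              # create a directory
touch notes.txt                             # create an empty file
printf 'line %s\n' 1 2 3 4 5 > notes.txt    # fill it with five short lines

cp notes.txt quiz_dir/                      # copy the file into the directory
mv notes.txt renamed.txt                    # rename a file
mv renamed.txt quiz_dir/                    # move a file into a directory

ls -l quiz_dir                              # list directory contents
first=$(head -n 1 quiz_dir/notes.txt)       # first line of the file
last=$(tail -n 1 quiz_dir/notes.txt)        # last line of the file
count=$(wc -l < quiz_dir/notes.txt | tr -d ' ')   # number of lines
echo "first='$first' last='$last' lines=$count"

rm quiz_dir/notes.txt quiz_dir/renamed.txt  # remove files (there is no undo!)
rmdir quiz_dir                              # rmdir only removes an empty directory
remaining=$(ls -A "$sandbox" | wc -l | tr -d ' ')

cd ~                                        # back to the home directory
rmdir "$sandbox"                            # remove the now-empty practice area
```

Typing the commands one at a time, with `pwd` and `ls` in between, is a good way to confirm each step did what you expected.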