Introduction to High Performance Computing at Case Western Reserve University
HPC Access & Linux
[email protected]
KSL Data Center

Working on the Cluster
• The command line: How does it work?
• Connecting to the cluster
• Files & directories
  • Navigating
  • Creating, Editing, Moving/Renaming
• The file structure: What & How?
• The session environment
  • Variables
  • Modules & Hierarchies

Why do we learn to use the shell?
• Repeating a workflow can be labour intensive
  • The shell lets you automate repetitive tasks
  • Capture small data manipulation steps to make research reproducible
• Manual manipulation of data files:
  • is often not captured in documentation
  • is hard to reproduce, troubleshoot, review, or improve
• The shell allows:
  • Workflows to be automated through shell scripts
  • A focus on easy string manipulation (e.g. sort, grep)
  • Every step to be captured, enhancing reproducibility and troubleshooting

What does the "Shell" do?
• The shell is a process providing the command-line interface (CLI)
• The heart of a CLI is a read-evaluate-print loop, or REPL: you type a command and press Enter, then the shell:
  • Reads it
  • Executes (or "evaluates") it
  • Prints the output
  • Restores the prompt and waits for you to enter another command

Linux Shell Concepts
• path/absolute path: specifies the location of a file or directory
• current working directory: applies within a shell session
• filename extension: a label that indicates the data type
• home directory: 'privileged' location, easy to reference
• shell script: interpreted program to automate tasks
• standard input: the input channel to a process
• standard output & error: the output channels
• redirection: changes the standard 'flow' (source)

Access the Cluster
You can log in from anywhere. You will need:
• An approved cluster account
• Your CaseID and Single Sign-On password
• An ssh (secure shell) utility [detailed instructions for all platforms]
• A web browser (https), via the OnDemand web portal
If you are off campus, connect through the VPN, using two-factor authentication.
Case Guest wireless == "off-campus"

Accessing the Cluster
• Using the ssh command: form and options
  • ssh [options] <user>@<hostname>
  • e.g. ssh -X [email protected], which results in:
    • a network connection through 'rider.case.edu' to a head node
    • a graphical channel, created by the '-X' option
• Usage:
  • must be run for each window
  • each run creates an independent process

Ownership & Permissions
[Figure labels: Owner, Group, Permissions]
Ownership categories: user, group, other, all
• When creating files, they will be 'owned' by your account
• Only one group membership is active at any time
Permissions: read, write, execute
• execute: on directories, allows 'traversal'; on files, marks the file as executable

Files & Directories: Navigating
• How to see or find files and directories
  ➡ listing, finding and searching (within) files
• How to move within the file system
  ➡ changing directories or computers
• How to specify locations within the cluster
  ➡ paths: current, absolute, relative

Files & Directories: How to see or find files and directories
• The list command: ls
  • ls [OPTION]... [FILE]...
  • let's ask for help in an example, using '--help' [demo]
• options
  • -l : long listing, provides details (shows directory contents)
  • -d : list only the directory itself, not its contents
  • -Sh : sort by size, with 'human'-readable output (typically for storage utilization)
  • -t : order by time
  • -r : reverse the ordering

Files & Directories: How to specify locations
• what's in a correct path name?
  • adequate information for the shell to recognize the destination
  • the references must respect the filesystem (we'll review later)
  • the current working directory is the reference point
  • moving downward just requires a subdirectory name
  • moving outside requires a full pathname or a proper relative path
• Check the current path
  • 'pwd' : prints the path to the current working directory

Files & Directories: How to move within the file system
• The 'change directory' command: cd
  • cd <destination path>
  • <destination path> may be a full or a relative path name
• usage
  • 'cd /home/<caseid>' : change to /home/<caseid>
  • 'cd $HOME' : $HOME is an environment variable
  • 'cd' : no argument == go to the home directory
  • 'cd /scratch/pbsjobs/job.8364141.hpc' : full pathname
  • 'cd ..' : move up one directory; a special case of a relative path

Files & Directories: Creating, Editing, Moving/Renaming
• Creating
  • mkdir : directories
  • touch : file (empty)
  • vi, nano, emacs : file, opened in an editor
• Moving/Renaming
  • cp [-r] <source-path> <destination-path> : copy (-r for directories)
  • mv <source-path> <destination-path> : move or rename
  • rm -r <target-path> : remove recursively; use with caution, the shell does not forgive

Accessing the Cluster
• Web Portal service: https://ondemand.case.edu
  • No installation; access via browser using SSO
  • https rather than ssh: no X11 graphics software required
  • shell; file manager; job status & submission
  • desktop environment; Jupyter and RStudio interactive web applications
  • More interactive applications to follow
  • From within the CWRU campus network

Accessing the Cluster
• Graphical Environment Options via X11
  • ssh -X <user>@<hostname>
  • e.g. ssh -X [email protected], which results in:
    • a network connection through 'rider.case.edu' to a head node
    • a graphical channel, created by the '-X' option
• x2go-client: improved efficiency & capabilities
  • Full desktop environment
  • Install & config instructions: CWRU HPCC documentation

Rider Cluster Components
[Diagram. Recoverable labels: University; rider.case.edu; ondemand.case.edu; Firewall; Head Nodes; Admin Nodes; SLURM Master; Science DMZ; Resource Manager; Data Transfer Nodes; Disk Storage; Batch nodes; GPU nodes; SMP nodes]

Markov Cluster Components
[Diagram. Recoverable labels: University; markov.case.edu; ondemand.case.edu; Firewall; Head Nodes; Admin Nodes; SLURM Master; Science DMZ; Resource Manager; Data Transfer Nodes; Disk Storage; class (GPU) nodes]

Working within Group Allocations - I
• Review: What are Linux groups?
  • Manage affiliations in the multiuser environment
  • Set "in-between" permissions: u <- g -> o
  • Groups are administered: contact [email protected]
• Switching the active group: "newgrp - <groupname>"

    [mrd20@hpc3 ~] groups
    tas35 oscsys gaussian hpcadmin schrodinger ccm4 singularity
    [mrd20@hpc3 ~] newgrp - hpcadmin
    [mrd20@hpc3 ~] groups
    hpcadmin oscsys gaussian tas35 schrodinger ccm4 singularity

Working within Group Allocations - II
• Creating a file/directory with the intended ownership
  • newgrp hpcadmin
  • mkdir <new directory>
• Changing ownership after creating a file/directory
  • mkdir new-directory
  • ls -l new-directory

HPC Environment: Group Cluster Resources
Your HPC account, sponsored by your PI, provides:
• Group affiliation: resources shared amongst group members
• Storage
  • /home : permanent storage, replicated & "snapshot" protected
  • /scratch/pbsjobs : up to 1 TB temporary storage
  • /scratch/users : small-scale temporary storage
  ➡ exceeding quota(s) will prevent you from using your account!
• Cores: member groups receive an allocation of, typically, 64 per unit
• Wall-time: 320-hour limit for member shares (36 hours for guest units)

HPC Environment: Your /home
• Allocated storage space in the HPC filesystem for your work
• Create subdirectories underneath your /home/<CaseID>; ideally each job has its own subdirectory
cd : Linux command to change the current directory
Examples to change to "home":
‣ cd /home/<CaseID>
‣ cd ~<CaseID>
‣ cd $HOME
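The home-directory forms of cd listed above, together with the advice to give each job its own subdirectory, can be tried out with a minimal sketch like the following. The directory names hpc_intro_demo and job_001 are hypothetical, chosen only for illustration, and the sketch assumes $HOME is set and writable, as it normally is in a login session.

```shell
# Three equivalent ways to reach your home directory.
cd "$HOME";  home_via_var=$(pwd)
cd ~;        home_via_tilde=$(pwd)
cd;          home_via_bare=$(pwd)
echo "$home_via_var" "$home_via_tilde" "$home_via_bare"

# Hypothetical project and job names, for illustration only.
jobdir="$HOME/hpc_intro_demo/job_001"

# -p creates missing parent directories and does not fail if the
# directory already exists.
mkdir -p "$jobdir"

# Work inside the job directory, then step back up one level.
cd "$jobdir"
pwd
cd ..
pwd
```

The three paths printed by echo should be identical. The last two pwd calls show the job directory and then its parent, which is the 'cd ..' relative-path case.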
$HOME is an environment variable that points to /home/<CaseID>.

HPC Environment: File Structure
/ [root]
  /home
    /home/<caseid> : xxx accounts
  /scratch
    /scratch/pbsjobs : for scheduled jobs
    /scratch/users : temp storage
  /mnt
    /mnt/pan : additional high-performance storage
    /mnt/projects : general purpose research storage
  /usr
    /usr/bin, /usr/lib : system executables and libraries
    /usr/local : installed software

HPC Environment: Environment Variables on Rider
Keeping organized
‣ echo $PATH
/usr/local/intel-17/openmpi/2.0.1/bin:/usr/local/intel/17/compilers_and_libraries_2017/linux/bin/intel64:/usr/lib64/qt-3.3/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/opt/dell/srvadmin/bin
‣ echo $LD_LIBRARY_PATH
/usr/local/intel-17/openmpi/2.0.1/lib:/usr/local/intel/17/tbb/lib/intel64/gcc4.7:/usr/local/intel/17/compilers_and_libraries_2017/linux/mkl/lib/intel64:/usr/local/intel/17/compilers_and_libraries_2017/linux/lib/intel64::/usr/local/lib
Changes: slurm commands in /usr/bin -> no longer referenced in the PATH, etc.

Modules and Environment
Module command: avail, spider, list, load, unload
Manage the environment necessary to run your applications (binary, libraries, shortcuts)
Modify environment variables using module commands:
>> module avail & spider : learn what is available and how to load it
>> module list (shows modules loaded in your environment)
>> module load python (loads default version)
>> module load python/3.5.1 (loads specific version)
>> module unload python/3.5.1 (unloads specific version)
>> module purge : {when all else fails, and you don't want to start a new shell session...}

Modules and Environment
On Rider, you might need to load a particular version of a compiler and OpenMPI in order to find your module.

Command              Outcome
module avail         Shows the list of the currently loadable modules of a hierarchy. It also shows, visually, which modules are loaded.
module spider        Shows the list of all modules and versions available.
module spider/<ver>  Shows how to load the specific module version.

Modules and Environment
Lua module hierarchies: independence & accountability
• Core: persistent, independent; no run-time dependence on other packages
• Compilers