Working with Shell Scripting

Daniel Balagué

Editing Text Files

We offer many text editors in the HPC cluster.

● Command-Line Interface (CLI) editors:
  ○ vi / vim
  ○ nano (very intuitive and easy to use if you are a new user; it has no undo)
  ○ emacs
  ○ etc.
● Graphical User Interface (GUI) editors:
  ○ gedit
  ○ kwrite
  ○ etc.

What is a Shell Script?

● In a shell script, the code is not compiled; it is interpreted. The interpreter reads the lines of the code and executes them sequentially. Shell scripts are often less efficient than compiled code.
● Then, why would you use a shell script? Because they are faster to write than C/C++ code (and do not require compilation), with acceptable performance. That makes the tradeoff worthwhile.

Why Shell Scripts?

1. Simplicity: the shell is a high-level language; you can express complex operations clearly and simply using it.
2. Portability: use just the POSIX-specified features and your script will likely run "unchanged" on other systems.
3. Ease of development: you can often write a powerful script in little time.

Examples of scripting languages: awk, BASH, Perl, Python, Ruby, Lua, etc.

Creating a BASH Script: The First Line

BASH scripts start with the following line:

    #!/path/to/bash

Usually it is:

    #!/bin/bash

(it may change in different Linux or Unix flavors).

Executing the Script

Use the following command:

    ./script.sh            (for a relative call)

or

    /path/to/script.sh     (for an absolute call)

Here, the . (dot) means the current working directory.

"Hello World" (C vs. BASH)

C:

    #include <stdio.h>

    int main(int argc, char* argv[])
    {
        printf("Hello World\n");
        return 0;
    }

BASH:

    #!/bin/bash
    echo "Hello World"

You are Ready!

It is time to log into the HPC cluster!

    ssh <caseID>@rider.case.edu

The SLURM Script

The SLURM script is the script that is read by the job scheduler and used to allocate your resources. You can think of the SLURM script as a "personalized" BASH script.
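One way to see why a SLURM script can be thought of as a BASH script: lines that start with #SBATCH begin with #, so BASH reads them as comments. The sketch below writes a small script and runs it with plain bash. The file name demo.slurm and the two directives in it are illustrative assumptions, not part of the original slides.

```shell
#!/bin/bash
# Sketch: a SLURM-style script is still a valid BASH script.
# demo.slurm and its directives are hypothetical, chosen only for illustration.
workdir=$(mktemp -d)

cat > "$workdir/demo.slurm" <<'EOF'
#!/bin/bash
#SBATCH --time=00:01:00
#SBATCH -N 1
echo "Hello World"
EOF

# bash treats every "#SBATCH" line as a comment, so only the echo runs here.
demo_output=$(bash "$workdir/demo.slurm")
echo "$demo_output"

# On the cluster, the same file would instead be handed to the scheduler:
#   sbatch demo.slurm

# Clean up the temporary directory.
rm -r "$workdir"
```

Run this way, no resources are requested; the directives only take effect when the scheduler reads the file.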
Structure of the SLURM Script

These are some (but not all) of the options for the configuration:

    #!/bin/bash
    #SBATCH --account=<group_account>      # Group account (-A)
    #SBATCH --nodes=1                      # Number of nodes required (-N)
    #SBATCH --ntasks-per-node=1            # Number of tasks per node
    #SBATCH --cpus-per-task=2              # Number of CPUs per task (-c)
    #SBATCH --partition=<partition_name>   # Partition type (-p)
    #SBATCH -J jobname                     # A job name for the array
    #SBATCH --mem=1Gb                      # Memory reserved
    #SBATCH --time=00:02:00                # Time limit, HH:MM:SS (here 2 minutes)
    #SBATCH --mail-user=<email_address>    # User email
    #SBATCH --mail-type=ALL                # Email notifications
    #SBATCH -o readgroup_job.o%j           # Screen output (%j is the job ID)

Allocate Resources Correctly

Common errors HPCC users encounter:

● Long waiting time in the queue.
● Job cancelled because there was not enough memory allocated.
● Job cancelled because it exceeded the time limit.
● OpenMP jobs running in serial.

We recommend reading our Batch & Interactive Job documentation, as well as the SLURM sbatch documentation.

Example of a Simple SLURM Script

    #!/bin/bash
    #SBATCH -o test.o%j
    #SBATCH --time=00:30:00
    #SBATCH -N 1
    #SBATCH -n 1
    #SBATCH -c 1

    module load <software-module>

    ./<executable> [options] [inputs]

Using Scratch Space (explained)

basic_script.slurm:

    #!/bin/bash
    # (The line above must be the first line of the file.)

    # Scheduler options
    #SBATCH -o test.o%j
    #SBATCH --time=00:30:00
    #SBATCH -N 1
    #SBATCH -n 1

    # Load necessary modules
    module load <software-module>

    # Copy files to scratch
    cp -r my_files "$PFSDIR"

    # Move to the scratch folder
    cd "$PFSDIR"

    # Execute your code / tool in scratch
    ./<executable> [options] [inputs]

    # Copy just the newly generated files back
    cp -ru * "$SLURM_SUBMIT_DIR"

Useful SLURM Variables

    Variable                 Description
    $PFSDIR                  The scratch folder created for the job
    $SLURM_SUBMIT_DIR        Directory from which the SLURM script was launched
    $SLURM_ARRAY_TASK_ID     Index of the current task in the job array
    $CUDA_VISIBLE_DEVICES    Shows the GPU identifier

Repetitive Jobs - Use a Job Array

You can launch the same job over and over with different parameters using job arrays.
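As a rough sketch of how one script might pick a different parameter for each array task, the example below maps $SLURM_ARRAY_TASK_ID to an input file name. The helper input_for_task and the input_NNNN.dat naming scheme are hypothetical, and the fallback to 1 is an assumption added so the script can also be tried outside the scheduler.

```shell
#!/bin/bash
#SBATCH -o array_demo.o%j
#SBATCH --time=00:05:00
#SBATCH -N 1
#SBATCH -n 1

# Outside a job array SLURM_ARRAY_TASK_ID is unset, so fall back to 1
# (an assumption made here to allow local testing).
TASK_ID="${SLURM_ARRAY_TASK_ID:-1}"

# Hypothetical naming scheme: task 3 maps to input_0003.dat.
input_for_task() {
    printf "input_%04d.dat" "$1"
}

INPUT_FILE=$(input_for_task "$TASK_ID")
echo "Task $TASK_ID would process $INPUT_FILE"
```

Submitted with, for example, sbatch --array=1-3 on the script, each of the three tasks would see its own value of $SLURM_ARRAY_TASK_ID and therefore its own input file name.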
What are Job Arrays?

● They are a way to launch your jobs as soon as resources are available.
● Jobs (sort of) run "in parallel".
● You only need one script!

More on Job Arrays

● Run your job array with:

    sbatch --array=1-N your_script.slurm

● Control the parameters inside your script with $SLURM_ARRAY_TASK_ID.

Thank you for your attention! Questions?

It is time for some exercises!

ADVANCED MATERIAL

Using Variables in BASH

hello2.sh:

    #!/bin/bash
    FIRST_NAME="Daniel"
    LAST_NAME="Balague"
    echo "Hello $FIRST_NAME $LAST_NAME"

WARNING: Do not put spaces between the variable name, the equal sign, and the value of the variable.

BASH Script that Takes Parameters

hello3.sh:

    #!/bin/bash
    echo "Hello $1 $2"

Here $1 and $2 are the first and second arguments passed when we call the script.

Fancier Output: printf

We can modify the way numbers or strings are displayed on the screen.

print_numbers.sh:

    #!/bin/bash
    NUMBER=12
    printf "The number is %04d\n" "$NUMBER"

Assign the Output of a Command to a Variable

print_numbers2.sh:

    #!/bin/bash
    NUMBER=12
    STRING=$(printf "The number is %04d\n" "$NUMBER")
    echo "$STRING"

This produces the same output as before.
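The pieces above (variables, positional parameters, printf, and command substitution) can be combined in one short sketch. The function name greet, the variable names, and the sample arguments are illustrative assumptions rather than part of the original exercises.

```shell
#!/bin/bash
# Sketch combining variables, parameters, printf, and command substitution.
# greet and its default values are hypothetical, chosen only for illustration.
greet() {
    # Inside a function, $1 and $2 are the function's own first and second
    # arguments, just as they are for a script called from the command line.
    local first_name="${1:-Daniel}"
    local last_name="${2:-Balague}"
    printf "Hello %s %s" "$first_name" "$last_name"
}

NUMBER=12

# Command substitution captures what printf writes instead of showing it.
PADDED=$(printf "%04d" "$NUMBER")
GREETING=$(greet "Ada" "Lovelace")

echo "$GREETING, your number is $PADDED"
```

Quoting "$GREETING" and "$PADDED" keeps the captured text intact even if it contains spaces.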
