Working with Shell Scripting

Daniel Balagué

Editing Text Files

We offer many text editors in the HPC cluster.

● Command-Line Interface (CLI) editors:
  ○ vi / vim
  ○ nano (very intuitive and easy to use if you are a new user - it has no undo)
  ○ emacs
  ○ etc.

● Graphical User Interface (GUI) editors:
  ○ gedit
  ○ kwrite
  ○ etc.

What is a Shell Script? & Why Shell Scripts?

● In a shell script, the code is not compiled; it is interpreted. The interpreter reads the lines of the code and executes them sequentially. Scripts are often less efficient than compiled code.

● Then, why would you use a shell script? Because scripts are faster to write than code in compiled languages such as C++ (and do not require compilation) while delivering acceptable performance. That makes the tradeoff worthwhile.

Shell Scripts, why?

1. Simplicity: the shell is a high-level language; you can express complex operations clearly and simply using it.

2. Portability: use just the POSIX-specified features and your script will likely run “unchanged” on other systems.

3. Ease of development: you can often write a powerful script in little time.

Examples of scripting languages: Python, Ruby, Lua, etc.

Creating a BASH Script: The first line

BASH scripts start with the following line:

    #!/path/to/bash

Usually it is:

    #!/bin/bash

(the path may differ between systems or flavors).

Executing the Script
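Before a script can be called directly it needs execute permission, which a freshly written file usually lacks. The following is a minimal sketch, assuming a throwaway file named script.sh created in a temporary directory (both are illustrative choices, not part of the cluster setup). It sets the permission with chmod and then calls the script through a relative and an absolute path.

```shell
#!/bin/bash
# Create a throwaway script in a temporary directory.
# The name script.sh and the temporary location are assumptions for this sketch.
workdir=$(mktemp -d)
cat > "$workdir/script.sh" <<'EOF'
#!/bin/bash
echo "Hello from script.sh"
EOF

# Give the file execute permission; without it, a direct call is refused.
chmod +x "$workdir/script.sh"

# Relative call: the leading ./ points at the current working directory.
cd "$workdir" || exit 1
relative_output=$(./script.sh)

# Absolute call: the full path works from any directory.
absolute_output=$("$workdir/script.sh")

echo "$relative_output"
echo "$absolute_output"
```

Both calls run the same file, so the two printed lines are identical.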

Use the following command:

    ./script.sh            (for a relative call)

Here, the . (dot) means the current working directory.

or

    /path/to/script.sh     (for an absolute call)

“Hello World” (C vs. BASH)

C:

    #include <stdio.h>

    int main(int argc, char* argv[])
    {
        printf("Hello World\n");
        return 0;
    }

BASH:

    #!/bin/bash

    echo "Hello World"

You are Ready!

It is time to log into the HPC cluster!!!

    ssh @rider.case.edu

The SLURM Script

The SLURM script is the script that the job scheduler reads and uses to allocate your resources.
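To the shell, the scheduler options are ordinary comments: bash skips every line that starts with #, while sbatch reads the ones that begin with #SBATCH. A minimal sketch, assuming arbitrary option values chosen only for illustration, that behaves the same whether it is run by hand or submitted:

```shell
#!/bin/bash
#SBATCH --time=00:01:00   # read by sbatch, skipped by bash as a comment
#SBATCH -N 1              # same here: one node requested

# Everything below is plain BASH and runs on whichever machine executes it.
NODE_NAME=$(uname -n)
echo "Running on $NODE_NAME"
```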

You can think of the SLURM script as a “personalized” BASH script.

Structure of the SLURM Script

These are some (but not all) of the options for the configuration:

    #!/bin/bash
    #SBATCH --account=             # Group account (-A)
    #SBATCH --nodes=1              # Number of nodes required (-N)
    #SBATCH --ntasks-per-node=1    # Number of tasks per node
    #SBATCH --cpus-per-task=2      # Number of CPUs per task (-c)
    #SBATCH --partition=           # Partition (-p)
    #SBATCH -J jobname             # A job name for the array
    #SBATCH --mem=1G               # Memory reserved
    #SBATCH --time=00:02:00        # Time limit (HH:MM:SS), here 2 minutes
    #SBATCH --mail-user=           # User email
    #SBATCH --mail-type=ALL        # Email notifications
    #SBATCH -o readgroup_job.o%j   # Screen output

Allocate resources correctly

Common errors HPCC users encounter:

• Long waiting time in the queue.

• Job cancelled because there was not enough memory allocated.

• Job cancelled because it exceeded the time limit.

• OpenMP jobs running in serial.
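For the last item, one possible cause (not necessarily the one behind every such case) is that the program is never told how many threads it may use. A minimal sketch, assuming a multithreaded program that honors the standard OMP_NUM_THREADS variable and a request of four CPUs chosen only for illustration:

```shell
#!/bin/bash
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -c 4    # four CPUs for one multithreaded task (illustrative value)

# Slurm sets SLURM_CPUS_PER_TASK when -c is given. The fallback to 1 is an
# assumption so that the script also runs outside a job for testing.
export OMP_NUM_THREADS="${SLURM_CPUS_PER_TASK:-1}"

echo "OpenMP threads: $OMP_NUM_THREADS"
```

Tying the thread count to the -c request keeps the two numbers in step, so changing the allocation does not require editing the rest of the script.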

We recommend reading our Batch & Interactive Job documentation, as well as the SLURM sbatch documentation.

Example of a Simple SLURM Script

    #!/bin/bash
    #SBATCH -o .o%j
    #SBATCH --time=00:30:00
    #SBATCH -N 1
    #SBATCH -n 1
    #SBATCH -c 1

    module load
    ./ [options] [inputs]

Using Scratch Space (explained)

basic_script.slurm

    #!/bin/bash
    # (the line above is the first line)

    # Scheduler options
    #SBATCH -o test.o%j
    #SBATCH --time=00:30:00
    #SBATCH -N 1
    #SBATCH -n 1

    # Load necessary modules
    module load

    # Copy files to scratch
    cp -r my_files $PFSDIR

    # Move to the scratch folder
    cd $PFSDIR

    # Execute your code / tool in scratch
    ./ [options] [inputs]

    # Copy just new generated files back
    cp -ru * $SLURM_SUBMIT_DIR

Useful SLURM Variables

Variable                 Description

$PFSDIR                  The scratch folder created for the job
$SLURM_SUBMIT_DIR        Directory from which the Slurm script was launched
$SLURM_ARRAY_TASK_ID     Index of the current task in the job array
$CUDA_VISIBLE_DEVICES    Identifier(s) of the GPU(s) assigned to the job

Repetitive Jobs - Use a Job Array

You can launch the same job over and over with different parameters using job arrays.

What are job arrays?

● They are a way to launch your jobs as soon as resources are available.

● Jobs (kind of) run “in parallel”.

● You only need one script!

On Job Arrays
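As a sketch of the "one script" idea, each array task can read its own index and derive its input from it. The file naming scheme (input_0001.dat and so on) and the time limit are assumptions made up for this illustration, and the fallback index of 1 is there only so the script can be tried outside the scheduler.

```shell
#!/bin/bash
#SBATCH --time=00:05:00
#SBATCH -N 1
#SBATCH -n 1

# Slurm sets SLURM_ARRAY_TASK_ID for each task of a job array.
# Default to 1 when it is unset (assumption for testing without Slurm).
TASK_ID="${SLURM_ARRAY_TASK_ID:-1}"

# Hypothetical naming scheme: input_0001.dat, input_0002.dat, ...
INPUT_FILE=$(printf "input_%04d.dat" "$TASK_ID")

echo "Task $TASK_ID would process $INPUT_FILE"
```

Every task runs the same file; only the index differs, so the parameter sweep lives in the naming scheme rather than in copies of the script.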

● Run your job array with

    sbatch --array=1-N your_script.slurm

● Control the parameters inside your script with:

    $SLURM_ARRAY_TASK_ID

Thank you for your attention!

Questions?

It is time for some exercises!

ADVANCED MATERIAL

Using Variables in BASH

hello2.sh

    #!/bin/bash
    FIRST_NAME="Daniel"
    LAST_NAME="Balague"

    echo "Hello $FIRST_NAME $LAST_NAME"

WARNING: Do not put spaces between the variable name, the equal sign, and the value when assigning a variable.

BASH Script that Takes Parameters

hello3.sh

    #!/bin/bash
    echo "Hello $1 $2"

Here $1 and $2 are the first and second arguments passed when we call the script.

Fancier Output: printf

We can modify the way numbers or strings are displayed on the screen:

print_numbers.sh

    #!/bin/bash
    NUMBER=12
    printf "The number is %04d\n" "$NUMBER"

Assign the Output of a Command into a Variable

print_numbers2.sh

    #!/bin/bash
    NUMBER=12
    STRING=$(printf "The number is %04d\n" "$NUMBER")
    echo "$STRING"

This produces the same output as before:

    The number is 0012
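The pieces from the advanced material (variables, script parameters, printf, and command substitution) can be combined in one script. The following is a minimal sketch, assuming a hypothetical script name greet.sh and fallback values that are used only when an argument is missing.

```shell
#!/bin/bash
# Hypothetical usage: ./greet.sh FIRST_NAME LAST_NAME NUMBER

# Positional parameters with fallback values, so the script also runs
# when called without arguments (the defaults are assumptions).
FIRST_NAME="${1:-Daniel}"
LAST_NAME="${2:-Balague}"
NUMBER="${3:-12}"

# Capture the text formatted by printf in a variable (command substitution).
LABEL=$(printf "%04d" "$NUMBER")

echo "Hello $FIRST_NAME $LAST_NAME, your number is $LABEL"
```

Called without arguments, the script prints "Hello Daniel Balague, your number is 0012"; called as ./greet.sh Ada Lovelace 7, it prints "Hello Ada Lovelace, your number is 0007".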