HPC wiki Documentation
Release 2.0

Hurng-Chun Lee, Daniel Sharoh, Edward Gerrits, Marek Tyc, Mike van Engelenburg, Mariam Zabihi

Sep 15, 2021

Contents

1 About the wiki
2 Table of Contents
  2.1 High Performance Computing for Neuroimaging Research
  2.2 Linux tutorial
  2.3 Introduction to the Linux BASH shell
  2.4 The HPC cluster
  2.5 The project storage
  2.6 Linux & HPC workshops

Chapter 1: About the wiki

This wiki contains materials used by the Linux and HPC workshop held regularly at the Donders Centre for Cognitive Neuroimaging (DCCN). The aim of this workshop is to provide researchers with the basic knowledge needed to use the High-Performance Computing (HPC) cluster for data analysis. During the workshop, the wiki is used in combination with lectures and hands-on exercises; nevertheless, the contents of the wiki are written in such a way that they can also be used for self-learning and reference.

There are two major sessions in this wiki. The Linux basics session covers the usage of the Linux operating system and gives an introduction to the Bash scripting language. After following this session, you should be able to create text-based data files on a Linux system and write a bash script to perform a simple data analysis on such files. The cluster usage session focuses on the general approach to running computations on the Torque/Moab cluster. After this session, you should know how to distribute data analysis computations to the Torque/Moab cluster at DCCN.

Chapter 2: Table of Contents

2.1 High Performance Computing for Neuroimaging Research

Fig. 1: the HPC environment at DCCN.

2.1.1 HPC Cluster

The HPC cluster at DCCN consists of two groups of computers:

• access nodes: mentat001 ~ mentat005, serving as login nodes.
• compute nodes: a pool of powerful computers with more than 1000 CPU cores.

Compute nodes are managed by the Torque job manager and the Moab job scheduler. While the access nodes can be reached via either an SSH terminal or a VNC session, compute nodes are only accessible by submitting computational jobs.

2.1.2 Central Storage

The central storage provides a shared file system amongst the Windows desktops within DCCN and the computers in the HPC cluster. On the central storage, every user has a personal folder with a so-called office quota (20 gigabytes by default). This personal folder is referred to as the M:\ drive on the Windows desktops.

Storage space granted to research projects (following the project proposal meeting (PPM)) is also provided by the central storage. The project folders are organised under the directory /project, which is referred to as the P:\ drive on the Windows desktops.

The central storage also hosts a set of commonly used software/tools for neuroimaging data processing and analysis. This area of the storage is only accessible from computers in the HPC cluster, as the software/tools stored there require the Linux operating system.
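On a Linux access node, the same storage areas are reached via paths instead of drive letters. A minimal sketch, assuming (as is common on such clusters) that your personal folder is your Linux home directory; the project number 3010000.01 is made up for illustration:

$ ls ~                     # your personal folder, i.e. the M:\ drive on the Windows desktops
$ ls /project/3010000.01   # a project folder under /project, i.e. on the P:\ drive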
2.1.3 Identity Manager

The identity manager maintains the information used to authenticate users who access the HPC cluster. It is also used to check users' identities when they log into the Windows desktops at DCCN. In fact, the user account received from the DCCN check-in procedure is managed by this identity manager.

Note: The user account concerned here (and throughout the entire wiki) is the one received via the DCCN check-in procedure. It is, in most cases, a combination of the first three letters of your first name and the first three letters of your last name. It is NOT the account (i.e. U/Z/S-number) from the Radboud University.

2.1.4 Supported Software

A list of supported software can be found here.

2.2 Linux tutorial

2.2.1 Very short introduction to Linux

Linux is an operating system originally developed by Linus Torvalds in the 1990s as a clone of the Unix operating system for personal computers (PCs). It is now one of the most renowned software projects developed and managed by the open-source community. With its open nature of software development, free (re-)distribution, and many features inherited directly from Unix, the Linux system provides an ideal and affordable environment for software development and scientific computation. This is why Linux is widely used in most scientific computing systems nowadays.

Architecture

The figure above illustrates a simplified view of the Linux architecture. From the inside out, the core of the system is called the kernel. It interacts with hardware devices and provides the upper layers with low-level functions that hide the complexity of, for example, arranging concurrent access to hardware. The shell is an interface to the kernel: it takes commands from the user (or from an application) and executes the kernel's functions accordingly. Applications generally refer to system utilities providing advanced functionality of the operating system, such as the tool cp for copying files.

File and process

Everything in Linux is either a file or a process. A process in Linux refers to an executing program identified by a unique process identifier (PID). Processes are internally managed by the Linux kernel, which arbitrates their access to hardware resources (e.g. CPU, memory, etc.). In most cases, a file in Linux is a collection of data; files are created by users using text editors, running compilers, etc. Hardware devices are also represented as files in Linux.

Linux distributions

Nowadays Linux is made available as a collection of selected software packages built around the Linux kernel: the so-called Linux distribution. As of today, many different Linux distributions are available on the market, each addressing the needs of a certain user community. In the HPC cluster at DCCN, we use the CentOS Linux distribution. It is a well-maintained distribution developed closely with RedHat, a company providing commercial Linux distributions and support, and it is widely used in many scientific computing systems around the world.

2.2.2 Getting started with Linux

By following this wiki, you will log in to one of the access nodes of the HPC cluster, learn about the Linux shell, and issue a very simple Linux command on the virtual terminal.

Obtain a user account

Please refer to this guide.

SSH login with PuTTY

Please refer to this guide.

The prompt of the shell

After you log in to the access node, the first thing you see is a welcome message together with a couple of news messages. Following the messages are a few lines of text that look similar to the example below:

honlee@mentat001:~ 999$

Every logged-in user is given a shell to interact with the system. The example above is technically called the prompt of the Linux shell. It waits for your commands to the system; following the prompt, you type in commands to run programs.

Note: For simplicity, we will use the symbol $ to denote the prompt of the shell.

Environment variables

Every Linux shell comes with a set of variables that can affect the way running processes behave. These variables are called environment variables. The command to list all environment variables in the current shell is

$ env

Tip: The practical action of running the above command is to type env after the shell prompt, and then press the Enter key.

Generally speaking, users need to set or modify some default environment variables to get a particular program running properly. A very common case is to adjust the PATH variable so that the system can find the location of a program's executable when the program is launched. Another example is to extend LD_LIBRARY_PATH to include the directory where the dynamic libraries needed for running a program can be found; a sketch of both adjustments is shown below.

In the HPC cluster, a set of environment variables has been prepared for the data analysis software supported in the cluster. Loading (or unloading) these variables in a shell is made easy using the Environment Modules. For average users, it is not even necessary to load the variables explicitly, as a default set of variables corresponding to commonly used neuroimaging software is loaded automatically upon login. More details about using software in the HPC cluster can be found here.
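A minimal sketch of such adjustments; the directory /opt/mytools is made up for illustration:

$ export PATH=/opt/mytools/bin:$PATH                        # let the shell find executables in /opt/mytools/bin
$ export LD_LIBRARY_PATH=/opt/mytools/lib:$LD_LIBRARY_PATH  # let programs find shared libraries in /opt/mytools/lib

On the cluster you rarely need to type such lines yourself: the Environment Modules mentioned above wrap these settings into a single command of the form module load <name>.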
Knowing who you are in the system

The Linux system is designed to support multiple concurrent users. Every user has an account (i.e. a user id); it is the one you used to log in to the access node. Every user account is associated with at least one group in the system. In the HPC cluster at DCCN, the system groups are created to mirror the research (i.e. PI) groups, and user accounts are associated with groups according to the registration made during the check-in procedure.

To find out your user id and the system group(s) you are associated with, simply type id and press the Enter key to issue the command on the prompt. For example:

$ id
uid=10343(honlee) gid=601(tg) groups=601(tg)

Using online manuals

A Linux command comes with options for additional functionality; the online manual provides a handy way to find the options supported by a command. To access the online manual of a command, one uses the command man followed by the command in question. For example, to get all possible options of the id command, one does

$ man id

2.2.3 Understanding the Linux file system

Data and software programs in the Linux system are stored in files organised in directories (i.e. folders).
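Two commands are enough to start exploring this organisation; a minimal sketch (the output differs per system):

$ pwd    # print the directory you are currently working in
$ ls /   # list the top-level directories at the root of the file system tree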