Design and Implementation of High Performance Computing Cluster for Educational Purpose
Dissertation submitted in partial fulfillment of the requirements for the degree of Master of Technology, Computer Engineering

by
SURAJ CHAVAN
Roll No: 121022015

Under the guidance of
PROF. S. U. GHUMBRE

Department of Computer Engineering and Information Technology
College of Engineering, Pune
Pune - 411005
June 2012

Dedicated to My Mother, Smt. Kanta Chavan

DEPARTMENT OF COMPUTER ENGINEERING AND INFORMATION TECHNOLOGY, COLLEGE OF ENGINEERING, PUNE

CERTIFICATE

This is to certify that the dissertation titled "Design and Implementation of High Performance Computing Cluster for Educational Purpose" has been successfully completed by SURAJ CHAVAN (121022015) and is approved for the degree of Master of Technology, Computer Engineering.

PROF. S. U. GHUMBRE
Guide,
Department of Computer Engineering and Information Technology,
College of Engineering, Pune,
Shivaji Nagar, Pune-411005.

DR. JIBI ABRAHAM
Head,
Department of Computer Engineering and Information Technology,
College of Engineering, Pune,
Shivaji Nagar, Pune-411005.

Date :

Abstract

This project addresses the problem of bringing high performance computing (HPC) education to those who do not have access to a dedicated clustering environment, in an easy, fully functional, and inexpensive manner, through the use of ordinary old PCs, Fast Ethernet, and free and open source software such as Linux, MPICH, Torque, and Maui. Many undergraduate institutions in India do not have the facilities, time, or money to purchase hardware, maintain user accounts, configure software components, and keep ahead of the latest security advisories for a dedicated clustering environment. The project's primary goal is to provide an instantaneous, distributed computing environment.
A consequence of providing such an environment is the ability to promote education in high performance computing at the undergraduate level, by turning ordinary off-the-shelf networked computers into a non-invasive, fully functional cluster. The cluster is used to solve problems that require a high degree of computation, such as the satisfiability problem for Boolean circuits, the Radix-2 FFT algorithm, the one-dimensional time-dependent heat equation, and others. The cluster is also benchmarked using High Performance Linpack and the HPCC benchmark suite. This cluster can be used for research on data mining applications with large data sets, object-oriented parallel languages, recursive matrix algorithms, network protocol optimization, graphical rendering, Fast Fourier transforms, building the college's private cloud, etc. Using this cluster, students and faculty will gain extensive experience in the configuration, troubleshooting, utilization, debugging, and administration issues uniquely associated with parallel computing on such a cluster. Several students and faculty can use it for their project and research work in the near future.

Acknowledgments

It is a great pleasure for me to acknowledge the assistance and contributions of the many individuals who helped me in my project titled Design and Implementation of HPCC for Educational Purpose. First and foremost, I would like to express my deepest gratitude to my guide, Prof. S. U. Ghumbre, who has encouraged, supported, and guided me during every step of the project. Without his invaluable advice, completion of this project would not have been possible. I take this opportunity to thank our Head of Department, Prof. Dr. Jibi Abraham, for her able guidance and for providing all the necessary facilities, which were indispensable in the completion of this project. I am also thankful to the staff of the Computer Engineering Department for their invaluable suggestions and advice.
I thank the college for providing the required magazines, books, and access to the Internet for collecting information related to the project. I am thankful to Dr. P. K. Sinha, Senior Director HPC, C-DAC, Pune, for granting me permission to study C-DAC's PARAM Yuva facility. I am also thankful to Dr. Sandeep Joshi, Mr. Rishi Pathak, and Mr. Vaibhav Pol of the PARAM Yuva Supercomputing facility, C-DAC, Pune, for their continuous encouragement and support throughout the course of this project. Last but not least, I am also grateful to my friends for their valuable comments and suggestions.

Contents

Abstract
Acknowledgments
List of Figures

1 Introduction
  1.1 High Performance Computing
    1.1.1 Types of HPC architectures
    1.1.2 Clustering
  1.2 Characteristics and features of clusters
  1.3 Motivation
    1.3.1 Problem Definition
    1.3.2 Scope
    1.3.3 Objectives

2 Literature Survey
  2.1 HPC opportunities in Indian Market
  2.2 HPC at Indian Educational Institutes
  2.3 C-DAC
    2.3.1 C-DAC and HPC
  2.4 PARAM Yuva
  2.5 Grid Computing
    2.5.1 GARUDA: The National Grid Computing Initiative of India
    2.5.2 Garuda: Objectives
  2.6 Flynn's Taxonomy
  2.7 Single Program, Multiple Data (SPMD)
  2.8 Message Passing and Parallel Programming Protocols
    2.8.1 Message Passing Models
  2.9 Speedup and Efficiency
    2.9.1 Speedup
    2.9.2 Efficiency
    2.9.3 Factors affecting performance
    2.9.4 Amdahl's Law
  2.10 Maths Libraries
  2.11 HPL Benchmark
    2.11.1 Description of the HPL.dat File
    2.11.2 Guidelines for HPL.dat configuration
  2.12 HPCC Challenge Benchmark

3 Design and Implementation
  3.1 Beowulf Clusters: A Low cost alternative
  3.2 Logical View of proposed Cluster
  3.3 Hardware Configuration
    3.3.1 Master Node
    3.3.2 Compute Nodes
    3.3.3 Network
  3.4 Software
    3.4.1 MPICH2
    3.4.2 HYDRA: Process Manager
    3.4.3 TORQUE: Resource Manager
    3.4.4 MAUI: Cluster Scheduler
  3.5 System Considerations

4 Experiments
  4.1 Finding Prime Numbers
  4.2 PI Calculation
  4.3 Circuit Satisfiability Problem
  4.4 1D Time Dependent Heat Equation
    4.4.1 The finite difference discretization
    4.4.2 Using MPI to compute the solution
  4.5 Fast Fourier Transform
    4.5.1 Radix-2 FFT algorithm
  4.6 Theoretical Peak Performance
  4.7 Benchmarking
  4.8 HPL
    4.8.1 HPL Tuning
    4.8.2 Run HPL on cluster
    4.8.3 HPL results
  4.9 Run HPCC on cluster
    4.9.1 HPCC Results

5 Results and Applications
  5.1 Discussion on Results
    5.1.1 Observations about Small Tasks
    5.1.2 Observations about Larger Tasks
  5.2 Factors affecting Cluster performance
  5.3 Benefits
  5.4 Challenges of parallel computing
  5.5 Common applications of high-performance computing clusters

6 Conclusion and Future Work
  6.1 Conclusion
  6.2 Future Work

Bibliography

Appendix A PuTTY
  A.1 How to use PuTTY to connect to a remote computer
  A.2 PSCP
    A.2.1 Starting PSCP
    A.2.2 PSCP Usage

List of Figures

1.1 Basic Cluster
2.1 Evolution of PARAM Supercomputers & HPC Roadmap
2.2 Block Diagram of PARAM Yuva
2.3 Single Instruction, Single Data stream (SISD)
2.4 Single Instruction, Multiple Data streams (SIMD)
2.5 Multiple Instruction, Single Data stream (MISD)
2.6 Multiple Instruction, Multiple Data streams (MIMD)
2.7 General MPI Program Structure
2.8 Speedup of a program using multiple processors
3.1 The Schematic structure of proposed cluster
3.2 Logical view of proposed cluster
3.3 The Network interconnection
4.1 Graph showing performance for Finding Primes
4.2 Graph showing performance for Calculating π
4.3 Graph showing performance for solving C-SAT Problem
4.4 Graph showing performance for solving 1D Time Dependent Heat Equation
4.5 Symbolic relation between four nodes
4.6 Graph showing performance of Radix-2 FFT algorithm
4.7 8-point Radix-2 FFT: Decimation in frequency form
4.8 Graph showing High Performance