Document downloaded from:

http://hdl.handle.net/10251/103725

This paper must be cited as:

López Rodríguez, PJ.; Baydal Cardona, ME. (2017). On a course on computer cluster configuration and administration. Journal of Parallel and Distributed Computing. 105:127-137. doi:10.1016/j.jpdc.2017.01.009

The final publication is available at

https://doi.org/10.1016/j.jpdc.2017.01.009

Copyright Elsevier


On a Course on Computer Cluster Configuration and Administration

Pedro López a,∗, Elvira Baydal a

a DISCA Department, Universitat Politècnica de València, Camino de Vera 14, 46022 Valencia, Spain

Abstract

Computer clusters are today a cost-effective way of providing high performance and/or high availability. Their configuration flexibility allows them to fit the needs of multiple environments, from small servers to SMEs and large Internet servers. For these reasons, their usage has expanded not only in academia but also in many companies. However, each environment needs a different "cluster flavour". High-performance and high-throughput computing are required in universities and research centres, while high-performance service and high availability are mostly required in companies. Despite this fact, most university cluster computing courses continue to cover only high-performance computing, usually ignoring the other possibilities. In this paper, a master-level course which attempts to fill this gap is discussed. It explores the different types of cluster computing as well as their functional basis, from a very practical point of view. As part of the teaching methodology, each student builds a computer cluster from scratch based on a virtualization tool. The entire process is designed to be scalable. The goal is to be able to apply it to an actual computer cluster with a larger number of nodes, such as those the students may subsequently encounter in their professional life.

∗Corresponding author. Email addresses: [email protected] (Pedro López), [email protected] (Elvira Baydal)

Keywords: computer engineering education, computer cluster configuration and administration, lab project

1. Introduction

Computer clusters are nowadays the most cost-effective alternative to build high-performance computer systems. A cluster is a type of parallel or distributed processing system which consists of a collection of interconnected stand-alone computers working together as a single, integrated computing resource [4]. Although the idea of using clusters to improve system reliability dates back to the 60s [23], their usage to improve system performance is more recent. The Beowulf project [3] built a $40,000 cluster with 16 personal computers that was able to reach 1 GFLOPS. Since then, computer clusters have become a low-cost alternative to build high-performance systems. The fact of being composed of several computers can be exploited in two ways. The traditional approach is to obtain overall high performance by combining the performance of each individual computer. In addition, component redundancy can be exploited to improve system availability. The excellent price-performance ratio of clusters has led to their widespread usage. Their configuration flexibility aims to fit the needs of multiple environments, from small servers to Small and Medium Enterprises (SMEs) and large Internet servers.

Popular Internet services are provided by large-scale clusters, usually located in different places to provide resilience against natural disasters. On the other hand, the largest supercomputers are based on computer clusters. Since 2007, more than 80% of the systems in the TOP500 list of supercomputers [27] are clusters. Taking into account their importance, computer engineering curricula may include some subjects about clusters [12]. The topics to cover are related to system architecture, networking, operating systems, parallel programming and applications. Although the core of these topics will have already been taught previously, a cluster subject should apply them to a particular system: the computer cluster. Instructors have a golden opportunity to integrate and apply the knowledge acquired in previous core subjects. In particular, in this paper, we describe a course on computer cluster configuration and administration. As cluster computing is an advanced topic [13], the proposed course is targeted at graduate students, although its contents could be applied as well to an advanced undergraduate course in Computer Engineering. Selected topics of the course include cluster configuration, system installation, running sequential and parallel applications, storage, load balancing and high availability. In addition, a computer cluster course requires hands-on sessions in a realistic environment to readily apply the concepts taught in the classroom. In particular, we propose building a computer cluster from scratch, installing the operating system, configuring the services and installing the required software packages. Taking into account the typical usage "flavours" of clusters [11], two approaches are simultaneously considered.

The first one is a high-performance/high-throughput computing cluster where several users will run their sequential or parallel applications. The second one is a high-availability/load-balanced Internet server. This second usage of clusters is often ignored in cluster-related subjects, even though most cluster installations fit in this second category. Another possible flavour to consider in the near future is a cluster specifically designed for storing and analyzing huge amounts of unstructured data (i.e., a big data cluster). The straightforward approach to implement the course consists of using a real hardware infrastructure to provide this environment. With this approach, the students have the opportunity to see, touch and interact with actual hardware components of a computer cluster: processing nodes, storage nodes, network switches, PDUs, KVM switches, cluster cabinets, etc. Moreover, they can directly experience the issues related to power consumption and space management. However, the problem of using real hardware in lab sessions is that each student should use his/her own cluster, composed of several nodes. Sharing a physical cluster is not an option because, throughout the project, the students must have privileged (root) access to the system to allow them to modify its configuration. Therefore, their work may interfere with each other. Indeed, their activity will result in system reboots or even system crashes, with catastrophic consequences for the work of other students. As a part of their learning process, these crashes will appear occasionally along the project. However, the alternative of purchasing a cluster for every student has a prohibitive cost for the University and would result in a very inefficient resource utilization. Indeed, this hardware may remain largely idle during holidays and weekends and between major assignment due dates [14].

Fortunately, the availability of free virtualization software and low-cost but high-performance personal and laptop computers makes it possible for the students to run a cluster of virtual machines on the available lab desktop computers or even on their own laptops. While the students need administrative privileges to install and manage their virtual machines, they need no privileges on the host machine. In addition, they can update or install cluster software and infrastructure without compromising the work done by other students. However, notice that the use of virtual machines to build a cluster is only for academic purposes; by no means do we want to suggest that an actual cluster should be built using virtual machines. To reduce system requirements, the size of the cluster should be small. As a minimum configuration, the cluster that each student builds is composed of two director nodes (to provide high availability), three server nodes and a storage node. Despite the reduced number of server nodes, the course is designed as if a large computer cluster were deployed. For instance, installing the system individually on each server node is easy for three servers, but is unfeasible for a large cluster. Thus, a scalable way of installing the system and configuring cluster nodes is proposed. More details can be found in Section 3. The main contribution of this paper is the proposal of a computer cluster subject that i) considers not only the traditional high-performance/high-throughput computing usage of clusters but also other usage flavours, like their usage as a high-availability/load-balanced Internet server; and ii) deploys a computer cluster based on virtual machines from scratch, installing and configuring all the required packages.

As a consequence, the course contributes to bridging the gap between industry and academia by teaching, from a practical point of view, those contents that are used in today's servers. This approach has been recommended by the ACM and the IEEE Computer Society in their upcoming Computer Engineering curricula [12]. The rest of the paper is organized as follows. Section 2 describes the syllabus. Section 3 presents the hands-on part of the course, detailing the steps followed by each student to build his/her cluster. Section 4 discusses the teaching methodology and evaluation criteria. Section 5 discusses some related work and, finally, some conclusions are drawn.

2. Description of the Subject

2.1. Context

One of the key points to consider when instructors have to choose which topics to include in a postgraduate course is the previous background of the students. The proposed course belongs to a Master's programme in Computer and Network Engineering [18], taught by the Computer Engineering Department at the Universitat Politècnica de València. It is targeted at graduates in Computer Science, Computer Engineering or Telecommunications Engineering. Usual topics covered in these bachelor's degrees include programming, computer organization and architecture, computer networks and operating systems. Moreover, instructors have to consider the topics included in other subjects of the same Master. In particular, in our Master, interconnection topologies used in high-performance networks are already covered. Topics like security in distributed systems or GPU programming are also developed in other courses of the same Master.

On the other hand, there is another Master's degree at the same university, focused on parallel and cloud computing [19], which offers our course as an optional subject.

2.2. Syllabus

Taking into account the above considerations and the main objective of the course, namely to provide experience in how to design, configure and manage a computer cluster (with the two aforementioned usage flavours), we chose a syllabus that allows a course with a strong practical component without neglecting the theoretical basis.

2.2.1. Introduction

Clusters represent a good option to obtain cost-effective systems. They are the dominant system architecture in HPC, with 85% of the TOP500 list [28], but they are also present in many companies offering web or database services. This unit reviews the factors that have enabled this success: improvements in personal computer performance and operating systems. In addition, the limitations in the performance of uniprocessor systems and the need for high performance in applications are also discussed. The main goal of the unit is to establish what a cluster is, what kinds of applications it can run and which factors have driven its popularity. Moreover, in order to structure their study, clusters are classified into four groups: high throughput computing, high performance computing, high availability and high performance service [11]. The main characteristics of each group are presented.

2.2.2. Cluster Configuration

This unit is devoted to the selection of the cluster components, analysing the different factors to consider in that selection. It presents an overview; some of the elements will be developed in detail in later units:

• Main components of the cluster nodes: processor (computing power, relevant benchmarks, power consumption, ...), memory and I/O.

• Interconnection network: Concepts about High Speed Interconnects (HSI). Latency, bandwidth and overhead of the network interfaces. HSI topologies are only mentioned as they are covered in another subject.

• Storage system: Evolution of network storage: Direct Attached Storage (DAS), Network Attached Storage (NAS) and Storage Area Networks (SAN).

• Auxiliary components: racks, KVM switches, PDUs, wiring and UPSs.

• Other aspects to consider: electrical power needs, space, noise, cooling and overall cluster cost.

2.2.3. System Infrastructure

This unit summarizes basic computer network concepts applied to clusters. The main services of use in a cluster are reviewed, including DHCP, NTP, DNS, NIS and SSH. General recommendations are given about where (on which nodes) to run those services and which services need redundancy. In addition, how to transport the information (data, management and control networks), as well as IP address assignment and NAT, are discussed. Finally, how to improve network performance through the use of jumbo frames, as well as channel bonding to increase link bandwidth or provide redundancy, is also introduced.
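As an illustration of these last two techniques, channel bonding and jumbo frames could be configured on an Ubuntu 14.04 node along the lines of the following sketch; the interface names, address and bonding mode are assumptions, and the ifenslave package is required.

# Fragment of /etc/network/interfaces (minimal sketch, assumed names and addresses)
auto bond0
iface bond0 inet static
    address 192.168.1.100
    netmask 255.255.255.0
    mtu 9000                  # jumbo frames; the switch must support them
    bond-slaves eth1 eth2     # physical interfaces aggregated into the bond
    bond-mode balance-rr      # round-robin; active-backup would provide redundancy instead
    bond-miimon 100           # link monitoring interval (ms)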

2.2.4. System Installation

In this unit, Linux is presented as the operating system of choice due to performance and cost constraints. The DistroWatch site [7] is used to show the available Linux distributions and the ten most popular ones. A set of criteria is given as a guide to choose the most appropriate distribution. Then, the Linux boot process is revisited, including disk partitioning and the boot loader. Both GRUB2 and PXE are discussed, emphasizing their configuration files. Several alternatives for system installation are discussed as well. As stated above, the students must always keep in mind that they have to install a large system, composed of a potentially large number of nodes. Therefore, manual or repeated installation processes are not acceptable. The approach that we propose is to prepare a PXE server and then boot the server nodes from the network. Once the server nodes are up, they can be remotely managed to prepare their hard disks and install the system on them. We will perform this kind of installation in the hands-on project. See Section 3 for details.
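As an example of the configuration files emphasized in this unit, a PXE boot entry for network-booted server nodes could be sketched as follows, assuming the pxelinux boot loader and an NFS-exported root filesystem; all paths and addresses are illustrative assumptions.

# /var/lib/tftpboot/pxelinux.cfg/default (minimal sketch)
DEFAULT cluster
LABEL cluster
  KERNEL vmlinuz
  APPEND initrd=initrd.img root=/dev/nfs nfsroot=192.168.1.100:/srv/nfsroot ip=dhcp rw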

2.2.5. Storage Systems

Servers often use storage systems with multiple disks, connected locally or through the network. This set of units studies all the aspects related to storage in clusters, and the study can be divided into two main parts. First, high-performance storage technologies are introduced, beginning with the different alternatives to manage disk arrays. The Logical Volume Manager (LVM) is used as an example of a flexible way of aggregating disks, but without redundancy. Then, we introduce RAID (Redundant Arrays of Inexpensive/Independent Disks) and DRBD (Distributed Replicated Block Devices) [8] to improve fault tolerance. Next, we review how to connect computers and storage. Broadly speaking, there are two possibilities: direct attached storage through protocols like SCSI or SAS, or network storage. In the latter case, protocols such as Fibre Channel, iSCSI or Fibre Channel over Ethernet (FCoE) are widely used to access remote disks through the network. The main features of each one are discussed, as well as their suitability depending on the type of cluster considered. Notice that InfiniBand is covered in another subject of the same Master. The course also covers the file systems used in clusters. Two primary types of storage architecture are available: Network Attached Storage (NAS) and Storage Area Networks (SAN). Although many protocols for network-based systems exist, NFS [20] is the most widely used in the Linux environment. For this reason, we choose it as the NAS example to study. In addition, the limitations and problems of such protocols are analysed and compared with the advantages of SAN storage systems, which enable better performance and data consistency for concurrent access. Cluster file systems, also known as shared-disk file systems, can provide advantages not only in scalability but also in the handling of system metadata, lock management and fencing. Different examples of cluster file systems are compared: OCFS2 [21], PVFS2 [24] and GlusterFS [10].
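To give a flavour of the commands handled in these units, a simple LVM setup over two additional disks might look like the following sketch; device names, sizes and volume names are assumptions.

# Minimal LVM example (assumed devices /dev/sdb and /dev/sdc)
pvcreate /dev/sdb /dev/sdc            # initialize the disks as physical volumes
vgcreate vg_data /dev/sdb /dev/sdc    # group them into a volume group
lvcreate -L 5G -n lv_home vg_data     # carve out a 5 GB logical volume
mkfs -t ext4 /dev/vg_data/lv_home
mount /dev/vg_data/lv_home /home
lvextend -L +2G /dev/vg_data/lv_home  # the volume can later be grown
resize2fs /dev/vg_data/lv_home        # and the filesystem resized to match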

2.2.6. Compute Cluster (I): Running Sequential Applications

One of the main goals of a cluster is running applications. This unit introduces job management systems, analysing two different approaches: MOSIX [16] and HTCondor [6]. Both of them allow users to run sequential and parallel applications, performing dynamic load balancing among the cluster nodes. With MOSIX, users have the illusion that applications run locally on the home-node where they were launched. Actually, MOSIX checks the cluster resources and migrates processes to the least loaded nodes. As the load changes in the cluster, or if a node is disconnected from the cluster, the workload is dynamically adjusted, migrating processes to other nodes. Moreover, as MOSIX works transparently to the applications, users do not need to modify them, and they can use local commands on the home-node as if the processes were running locally. Probably, its main disadvantage lies in I/O, which is performed via the home-node of the process, reducing performance for processes with a significant amount of I/O and/or file system access. Alternatively, users may want to execute some applications in a non-interactive mode (jobs). Batch queuing systems help users to manage their jobs, allocating resources to them, providing a framework for submitting, executing and monitoring jobs, and arbitrating contention for resources. Popular batch execution systems for clusters are HTCondor and the SLURM workload manager [25]. The main features of job schedulers for clusters are explained. In particular, HTCondor defines different running universes for sequential and parallel applications, with distinct requirements and limitations: e.g., the "standard universe" requires linking applications with the HTCondor I/O library but, in return, it allows checkpointing and remote I/O without needing a shared file system; on the contrary, the "vanilla universe" works transparently to the applications but does not include those facilities. In addition, a summary of the HTCondor commands to submit and control user jobs, as well as the DAGMan mechanism to describe job dependencies, is also introduced.
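As an illustration of the kind of job description introduced here, a minimal HTCondor submit file could be sketched as follows; the executable, input files and number of runs are assumptions. The job would be submitted with condor_submit and monitored with condor_q.

# sim.sub: minimal HTCondor submit description (assumed names)
universe   = vanilla
executable = simulator
arguments  = input_$(Process).dat    # a different input file for each run
output     = sim_$(Process).out
error      = sim_$(Process).err
log        = sim.log
queue 10                             # enqueue 10 independent jobs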

2.2.7. Compute Cluster (II): Running Parallel Programs

Although it is not the aim of this course to teach parallel computing, the students still need some knowledge about parallel applications. In this way, they will better understand which cluster resources the execution of this kind of application requires. From another point of view, we also want the students to know the capabilities a cluster provides for running parallel applications. We devote two units to giving a basic introduction to parallel programming. The first unit deals with the message passing programming paradigm using MPI (Message Passing Interface) [17]. The structure of an MPI program is explained, the basic communication primitives (MPI_Send, MPI_Recv, MPI_Bcast and MPI_Reduce) are introduced, and simple programs based on the manager-worker pattern are shown. A lab session completes the unit: a simple parallel program to compute the dot product of two vectors is provided, and the students have to insert some send/receive primitives and develop a new version that exchanges blocks of data instead of scalars and makes use of collective communications. The second unit is devoted to shared-memory parallel programming. In particular, the OpenMP programming model is presented [22]. Basic OpenMP directives to parallelize loops and sections are introduced. The shared and private scopes of variables and how to deal with critical sections are shown to the students. A lab session that computes the π constant by numerical integration is used as a working example to parallelize. To complete the part devoted to parallel programming, a hybrid MPI-OpenMP version of the dot product is developed. This example illustrates the fact that current clusters are composed of several nodes, and that every node comprises several processors and/or cores. A way of exploiting this structure is to use message passing among nodes and shared variables among the cores inside a node.
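For reference, building and launching such a hybrid example with the OpenMPI toolchain might look like the following sketch; the source file name, host list and process/thread counts are assumptions.

# Build and run a hybrid MPI+OpenMP program (assumed file dot_hybrid.c)
mpicc -fopenmp -O2 -o dot_hybrid dot_hybrid.c
export OMP_NUM_THREADS=4                 # threads per MPI process (one process per node)
mpirun -np 3 --hostfile hosts -x OMP_NUM_THREADS ./dot_hybrid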

2.2.8. Internet Server Cluster (I): Load Balancing

As stated above, clusters are often used to build high-availability/load-balanced Internet servers. This unit deals with load balancing. The Load Balancer (LB), or director, is the set of hardware and software tools that allows a system administrator to implement usage policies and the allocation of resources in the cluster. It distributes client requests among cluster members transparently to the clients. Therefore, it is one of the key elements in the cluster operation. Depending on the packet information considered, load balancing can work at different layers of the OSI architecture, offering different criteria for load distribution. Load balancing at layer 4 uses only information from the IP protocol (layer 3) and the transport protocols (layer 4), TCP and UDP, while working at the application layer (layer 7) allows analysing the content of application messages, thus enabling more complex decisions; for example, for the HTTP protocol, decisions based on URLs or on cookies can be made. The main algorithms for both layers are explained. Even though all the packets of a TCP connection are sent to the same internal node, problems can still arise, for example with HTTP sessions spread across different TCP connections. This drawback can be solved using source IP affinity or persistence based on application-layer information like HTTP cookies. While a load balancer operating at layer 7 behaves as a reverse proxy, working at layer 4 allows more possibilities for packet routing: although NAT is the most frequently used one, Direct Routing and Tunneling are also introduced. At the end of the unit, several open source Linux LBs, such as LVS and HAProxy, are compared.
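As a concrete illustration of layer-4 balancing with LVS, a NAT-forwarded virtual web service could be defined with ipvsadm as sketched below; the virtual and real addresses and the scheduler are assumptions.

# Minimal LVS (IPVS) configuration in NAT mode (assumed addresses)
VIP=10.0.2.100
ipvsadm -A -t $VIP:80 -s rr                    # virtual TCP service, round-robin scheduler
ipvsadm -a -t $VIP:80 -r 192.168.1.11:80 -m    # real servers; -m selects masquerading (NAT)
ipvsadm -a -t $VIP:80 -r 192.168.1.12:80 -m
ipvsadm -a -t $VIP:80 -r 192.168.1.13:80 -m
ipvsadm -L -n                                  # show the resulting table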

2.2.9. Internet Server Cluster (II): High Availability

The load balancer needs to know which nodes are available in order to forward requests to them. Some load balancers, such as HAProxy, perform this function themselves, but others, such as LVS, rely on external tools like ldirectord. In particular, Cluster Resource Managers (CRMs) can be used. A CRM is additional software that checks the node state and the node resource configuration, stops the node services when there is a problem and informs the load balancer of the available nodes. To meet those goals, the node state is usually verified through heartbeats or connections to the offered services. Moreover, the LB itself is a single point of failure that should be replicated and monitored to obtain a high-availability system. Tools like keepalived or corosync can do this job. In addition, concepts related to node problems, such as split-brain situations, fencing and STONITH, are also introduced in this unit. Finally, the structure and configuration of Pacemaker are shown as a CRM example.
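To give an idea of what a Pacemaker configuration looks like, the following crm shell sketch defines a floating virtual IP monitored every ten seconds; the address, resource name and relaxed cluster properties are assumptions suited only to a two-node lab setting.

# Minimal Pacemaker resource definition with the crm shell (assumed VIP)
crm configure property stonith-enabled=false      # no fencing device in the lab
crm configure property no-quorum-policy=ignore    # two-node cluster, ignore quorum loss
crm configure primitive vip_ext ocf:heartbeat:IPaddr2 \
    params ip=10.0.2.100 cidr_netmask=24 op monitor interval=10s
crm status                                        # shows on which director the VIP runs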

3. Lab Project

This section presents the hands-on part of the course. As stated above, an academic cluster project based on the use of virtual machines will be deployed. At the beginning of the course, the specifications of the computer cluster to be built are given to the students. Care must be taken to reduce the size of the cluster as well as the specifications of each node, so that they are representative of actual requirements while keeping complexity within reasonable limits. The virtual nodes of the cluster will run either on the desktop computers of the lab or on the students' own laptops. At the time of writing, machines with a 4-core processor, 4 or 8 GB of RAM and more than 500 GB of disk storage are common. We propose a cluster composed of six nodes: two director nodes, three server nodes and one storage node. The two director nodes provide high availability. The server nodes do the computational tasks or provide the required service. The number of server nodes could be increased if resources are available on the host machine. In any case, the course encourages a cluster deployment that keeps in mind that the number of server nodes could potentially be very high. Finally, the storage node provides common storage for user and system data. Concerning the specifications of each node, they depend on the requirements of the operating system and the applications. As the cluster is only used for academic purposes, we only consider the minimal requirements of the operating system. At the time of writing, Ubuntu 14.04 Server LTS (see below) requires a minimum of 192 MB of RAM and 1.4 GB of storage. We have successfully worked with uniprocessor configurations with 512 MB of RAM at the director nodes and 256 MB of RAM at the server and storage nodes. An 8 GB virtual disk is used in all nodes except the storage one, where 10 GB are provided. The lab project consists of the following steps (see Figure 1).

1. Configuration  2. Installation, 1st part  3. Installation, 2nd part
4. Storage  5. Compute cluster  6. Internet server

Figure 1: Lab project steps

First, the virtual machines that will become the nodes of the cluster will be created and configured. Next, the operating system will be installed on the cluster. As stated above, we will install two different systems to allow the cluster to run in two different flavours. Once the cluster is configured and working, several storage alternatives are also configured and tested. As the first flavour corresponds to a compute cluster, we install the required packages to manage sequential jobs and to run parallel programs in the cluster. Finally, we install and configure the corresponding software tools to provide the load balancing and high availability that allow the cluster to work as an Internet server.

3.1. Cluster Configuration

As the very first step in building a cluster, its nodes must be carefully selected and configured. In the lab project, this step is mirrored by creating and configuring the virtual machines. Any virtualization platform can be used; we have successfully used VirtualBox [29] since the first edition of the course.

Figure 2: Block diagram of the cluster

VirtualBox is freely available as Open Source Software under the terms of the GNU General Public License (GPL) version 2 and is available for Windows, Mac OS X and Linux hosts, thereby allowing the project to be developed on any desktop or laptop computer. Using the graphical interface of VirtualBox, the six nodes that compose the cluster are created (see Figure 2); a scripted alternative is sketched after the list:

Master, StandBy: Director nodes. 512 MB of RAM. Boot order: DVD→HD, 8 GB of SATA storage. Two network interfaces (adapter 1 connected to NAT Network; adapter 2 connected to internal network).

Server1, Server2, . . . : Server nodes. 256 MB of RAM. Boot order: Network→HD, 8 GB of SATA storage. One network interface (adapter 1 connected to internal network).

NAS: Storage node. 256 MB of RAM. Boot order: DVD→HD, 10 GB of SATA storage. One network interface (adapter 1 connected to internal network).

NAT Network: “nat-network-1”. DHCP enabled.


Internal network: “intnet”.
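For students who prefer scripting the node creation instead of using the graphical interface, something along the lines of the following VBoxManage sketch could be used; the names, sizes and network labels follow the configuration above but are otherwise assumptions.

# Create one server node from the command line (minimal sketch)
VM=Server1
VBoxManage createvm --name $VM --ostype Ubuntu_64 --register
VBoxManage modifyvm $VM --memory 256 --boot1 net --boot2 disk --nic1 intnet --intnet1 intnet
VBoxManage createhd --filename $VM.vdi --size 8192            # 8 GB virtual disk
VBoxManage storagectl $VM --name SATA --add sata
VBoxManage storageattach $VM --storagectl SATA --port 0 --device 0 --type hdd --medium $VM.vdi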

3.2. System Installation (I)

Once the virtual nodes are configured, the student is ready to install the system. Notice that, taking advantage of the virtualization platform, we could use the trick of installing one node and cloning it with the available tools. However, as this would not be feasible in a real cluster, we avoid it. As the operating system, we have successfully used Ubuntu Linux in several course editions. In particular, we choose the Long Term Support (LTS) server editions (14.04 LTS in the last course edition), which would also be a proper choice in a real cluster. The steps followed to install the cluster are shown in Figure 3 and summarized below.

Figure 3: Installing the system on the cluster

First, we will completely install the NAS node. Next, to install the system on the server nodes, we will perform an installation of one server node (the "canonical node"). Afterwards, a copy of the "canonical node" system will be stored at the NAS node. Then, a PXE server will be set up at the Master node to boot the server nodes, mounting their root filesystem from the NAS node. As the network is running, all the server nodes are accessible through ssh, and the Master node will issue the proper commands to complete the installation. Once the installation of the server nodes is done, the PXE server is stopped, the machines are rebooted from their own hard disks, and the cluster is installed. Let us analyze the process step by step. First, the NAS node is individually installed by booting from an ISO image inserted into the virtual DVD drive. To take full control of the installation process, the students are encouraged to manually define the disk partitions (two partitions plus swap). After rebooting, the root user is activated, the network is statically configured and some additional packages (the ssh and NFS servers) are installed and configured. At every step of the process, the students are given the commands to issue and/or the files to configure, with a brief explanation of their purpose. Next, the Master node is installed to generate the "canonical node" that will be used to clone the server nodes. Again, to take full control of the installation, the boot loader (GRUB2) is manually configured. The SSH keys are generated and configured to allow public key validation, and the /etc/hosts file is created. The system is now ready to be copied to the NAS node. Care must be taken to edit the network configuration in the just-copied system to avoid bringing the network interface up automatically, as it will have already been brought up by PXE booting. In addition, the /etc/fstab file must be edited so that the server nodes mount their root filesystem from the NAS node through NFS. Now, it is time to finish the Master node installation so that it becomes the cluster director node. The system variables and the iptables firewall are configured to provide NAT to the internal network.
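As a hint of the files involved in this step, the NFS export on the NAS node and the root filesystem entry placed in the copied system could be sketched as follows; the paths, network and NAS address are assumptions.

# On the NAS node, /etc/exports publishes the copied "canonical" system:
#   /srv/nfsroot  192.168.1.0/24(rw,no_root_squash,async,no_subtree_check)
exportfs -ra     # reload the export table
# In the copied system, /etc/fstab mounts the root filesystem over NFS:
#   192.168.1.100:/srv/nfsroot  /  nfs  defaults  0  0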

A lightweight window system might also be installed to ease system interaction. Then, the PXE support files (including dnsmasq) are installed and properly configured so that the server nodes mount their root filesystem from the NAS node; the dnsmasq utility will also be in charge of the DHCP and DNS services. Finally, the PXE server is started. Then, the server nodes are started and booted from PXE, mounting their root file system from the NAS node. The students can check the boot process from the screen messages and from the dnsmasq log file at the Master node. As soon as all the server nodes are started, we can proceed to install the system on all of them. We use two simple but valuable scripts, referred to as psh and pscp ("parallel shell" and "parallel secure copy", see Figure 4), to launch a given command on all the server nodes and to broadcast a file to all nodes, respectively. Although there are packages (for instance, pdsh) that provide this functionality, the students can check by themselves how easy it can be to install and administer a cluster using simple tools. As an example, Figure 5 shows the command sequence issued from the Master node to partition and format the disks of the server nodes. Then, all the server nodes copy the root filesystem (mounted from the NAS) to their own disks. Once the copy is done, the filesystems, network configuration, boot loader and hostname are set up. All of this is performed from the Master node, using the psh script or some simple variations of it. Notice that the installation process is the same regardless of the number of server nodes. At this point, the PXE server is stopped on the Master node and the server nodes are ready to reboot. The system is installed.
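For reference, a dnsmasq configuration providing DHCP, DNS and TFTP on the Master node could look like the following sketch; the interface, address range and paths are assumptions.

# Fragment of /etc/dnsmasq.conf on the Master node (minimal sketch)
interface=eth1                               # listen only on the internal network
dhcp-range=192.168.1.50,192.168.1.90,12h     # leases handed out to the server nodes
dhcp-boot=pxelinux.0                         # boot loader served to PXE clients
enable-tftp
tftp-root=/var/lib/tftpboot                  # holds pxelinux.0, the kernel and the initrd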

When the cluster is booted from the 1st partition, it should work as an HPC/HTC-oriented cluster. Therefore, some support is required to ease user management. To do so, and for the sake of simplicity, the NIS service is installed in the system. In an actual cluster, the NIS service would be provided by some administrative nodes; in our cluster, it is provided by the NAS node. In previous course editions it was installed on the director nodes, which may seem more logical. However, as the director nodes can be booted in two working modes and NIS is only required in the HPC/HTC mode, this can lead to situations where a node hangs for a long time waiting for the NIS server. Thus, in our project, the NIS server package is installed on the NAS node, and the NIS client package on the first partition of the director and server nodes of the cluster. The configuration files are updated and some tests adding and deleting users are performed. Alternatively, the LDAP service could be used for user management. However, administering an LDAP server from the command line is quite cumbersome. For this reason, it is advisable to install a GUI (Graphical User Interface) administration tool like phpldapadmin.

3.3. System Installation (II)

At this step, another system is installed in the 2nd partition of the node hard disk (all nodes but the NAS one). The fact of having two different installed systems can be justified from different points of view. A first reason is fault tolerance: if there is a system corruption due to faults or even user errors, the other partition comes to the rescue. However, there are more interesting reasons. For example, both a minimal and a production system could be installed; the minimal system helps to install and upgrade the production one. On the other hand, different systems or applications could be installed in each partition.

#!/bin/bash
# psh: run a command on all the server nodes
#SERVER_NR --> nr of servers
#SERVER_NAME --> server prefix
echo $0 $*

i=1
while [ $i -le $SERVER_NR ]
do
  echo "======"
  echo $SERVER_NAME$i
  echo "------"
  ssh $SERVER_NAME$i $@
  let i=i+1
done

#!/bin/bash
# pscp: copy a file to all the server nodes
echo $0 $*

i=1
while [ $i -le $SERVER_NR ]
do
  scp $1 $SERVER_NAME$i:$2
  let i=i+1
done

Figure 4: psh and pscp scripts

This latter approach is what we follow in the course. As already stated, the computer cluster can be used in two working modes. The first one corresponds to a compute cluster, where users submit and execute their jobs. The second one is an Internet server and will be installed in the 2nd partition. A web server package is required at the server nodes, and load-balancing and high-availability packages will be installed at the director nodes. The system will be installed in the 2nd partition following a simple approach.

...
# HD Partitioning
# Master
echo "Partitioning, formatting servers ..."
echo "Take a copy of master partition table"
sfdisk -d /dev/sda > $NFS/root/sda.out

# Servers
echo "Partition from a file"
psh "sfdisk -f /dev/sda < sda.out"
echo "Format partition 1"
psh "mkfs -t ext4 /dev/sda1"
echo "Preparing swap"
psh "mkswap /dev/sda3"
psh "swapon /dev/sda3"
...

Figure 5: Example of psh script usage to clone the partition table and prepare the storage at the servers

Once the cluster is started and operational from the first partition, the second partition is formatted and the system is copied onto it. The filesystem and GRUB boot loader configuration files must then be updated. It is important to notice that all the commands are issued on the Master node: we try to instill in the students that cluster management should be centralized to minimize the administration effort.
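The copy to the second partition can be driven entirely from the Master node with the psh script; a rough sketch, with assumed device names and mount point, could be:

# Populate the 2nd partition of every server node from the Master node (sketch)
psh "mkfs -t ext4 /dev/sda2"
psh "mount /dev/sda2 /mnt"
psh "cp -ax / /mnt"      # -x keeps the copy within the root filesystem
# the copied /mnt/etc/fstab and the GRUB configuration must then be updated
psh "umount /mnt"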

3.4. Storage Systems

Once the operating system is installed in both cluster partitions, it is time to play with storage alternatives. A first enhancement to the NFS storage configuration consists of using link bonding: a second network interface is added to the NAS node and the system configuration is updated accordingly, doubling the network bandwidth of the NFS node. On the other hand, by adding some (virtual) disks to the NAS node, the flexibility of the Logical Volume Manager can be explored. In addition, to provide fault tolerance, the students build a software RAID on the storage node of the cluster; for instance, a RAID-5 can be configured. After setting up the configuration files, some tests can be run to force an error and launch the recovery process. Other alternatives are also feasible. For instance, a DRBD can also be easily configured and tested. The students can also play with the iSCSI protocol by configuring a target device at the NAS node that is mounted on some nodes of the cluster, which work as initiators. The iSCSI target device can be simultaneously mounted on several nodes if a cluster filesystem is used. Therefore, as a next step, the OCFS2 filesystem is configured on the initiators and some simple tests are run. To finish this part, a small SAN with GlusterFS is deployed. Several disks are added to the server nodes of the cluster and several GlusterFS configurations are tested. In particular, both replicated and distributed GlusterFS volumes are configured and mounted from the Master node. Then, some files are copied onto the filesystem, checking whether they are replicated in all bricks or distributed among them, respectively. The replicated volume also allows us to perform some fault-tolerance tests.
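As an example of the RAID exercise, a software RAID-5 and a forced-failure test could be driven with mdadm as sketched below; device names are assumptions.

# Build a RAID-5 array on the storage node and exercise a failure (sketch)
mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sdb /dev/sdc /dev/sdd
mkfs -t ext4 /dev/md0
cat /proc/mdstat                    # watch the initial synchronization
mdadm /dev/md0 --fail /dev/sdc      # simulate a disk failure
mdadm /dev/md0 --remove /dev/sdc
mdadm /dev/md0 --add /dev/sde       # add a replacement disk; the array rebuilds itself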

3.5. Compute Cluster

Next, some packages are installed in the 1st partition to allow the cluster to work as an HPC/HTC platform. We consider three possible scenarios. The first scenario corresponds to a batch-oriented system where the users submit their jobs, possibly composed of multiple executions of a given application (for instance, a simulator). In the second scenario, the users log in to the system and interactively run their programs. The third scenario corresponds to a parallel machine, where the users want to run their parallel applications. The system is configured to support the three scenarios. The first one is achieved with the HTCondor package. We chose this package since we have used it for managing simulations in our research group for several years, and it is useful for the master students who also do research in our group. The Master node is set up as the central manager, as well as being able to submit and execute jobs. The server nodes can submit and execute jobs. As in other parts of the project, all the steps are performed thinking of a large cluster composed of not only three nodes but a relatively high number of nodes. Once HTCondor is installed, a simple job consisting of a simulator with different input data sets is submitted to it, examining its behavior. To support the second scenario, the MOSIX package is installed in all the nodes of the cluster. To demonstrate how it works, some test programs are run from one node and it is observed how the load is balanced among all the nodes. Finally, to run parallel programs, both an MPI programming environment (for instance, the OpenMPI package) and the OpenMP development library are installed in all the nodes of the cluster.

Some simple parallel programs are written and run to check that everything is working properly.
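In keeping with the centralized administration philosophy, these packages can be installed on every node from the Master node using the psh script; a sketch with Ubuntu 14.04 package names (an assumption) could be:

# Install the parallel programming environment cluster-wide from the Master node (sketch)
psh "apt-get -y install openmpi-bin libopenmpi-dev"   # MPI runtime and headers
psh "apt-get -y install gcc libgomp1"                 # GCC with the OpenMP runtime
psh "which mpirun"                                    # quick sanity check on every node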

3.6. Internet Server Cluster

In this part of the project, a high-availability load-balanced web server is configured in the 2nd partition. As a first step, the Apache web server is installed on all the server nodes, including the PHP module. A simple PHP page is written to return the current date, time and the IP address of the server. Then, the load-balancing tool is installed on the Master node. Several options are available; in the project, we have successfully configured, for several editions, a layer-4 load balancer based on the IPVS kernel module. The HAProxy package would be another widely used alternative. After preparing the configuration file, the load balancer is ready to run. By repeatedly issuing requests to the URL that contains the aforementioned PHP page, the students verify how the load is balanced among the server nodes by checking the IP address of the machine answering each request. Once the load-balancing part is done, it is time to prepare the StandBy machine. In an actual scenario, a complete installation would be performed from scratch. In the project, taking into account the scarce time available, the StandBy node is obtained by cloning the current image of the Master node; this is the only time when virtual machine cloning is allowed. After the cloning procedure, the IP address and the hostname are updated and the machine is rebooted. At this moment, the two director nodes are running, and it is time to install the required high-availability packages. We have successfully used the corosync, pacemaker and ldirectord packages for this purpose; alternatively, the keepalived package could also be used. The corosync package provides the infrastructure to check the health of the director nodes.

The pacemaker package is a CRM that allows defining resources to be shared between both director nodes, and starting, stopping and migrating those resources. In the project, two virtual IP addresses (external and internal) are defined. These VIPs will float from one director to the other in the event of a node failure. Finally, the ldirectord resource checks the server health, thereby avoiding sending requests to faulty server nodes. Once the high-availability load-balanced web server is configured, it is time to check its behaviour. In particular, the students issue several HTTP requests to the server and verify that: (i) when all the components are up, requests are balanced among all the server nodes; (ii) when a server fails, requests are balanced among the remaining servers; and (iii) when a load-balancer node fails, the other one takes over the service. In the configured cluster, it is very easy to simulate a faulty node by disconnecting the (virtual) network interface link on the fly. If there is still time available, some tests can also be performed using simple benchmarking tools like ApacheBench.
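The checks described above can be scripted from any client machine; a possible sketch, where the VIP, the page name and the request counts are assumptions, is:

# Verify load balancing and fail-over from a client (sketch)
VIP=10.0.2.100
for i in $(seq 1 10); do
  curl -s http://$VIP/info.php       # the page returns the date, time and server IP
done
# disconnect a server (or a director) and repeat the loop to observe the fail-over
ab -n 1000 -c 10 http://$VIP/info.php   # simple load test with ApacheBench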

4. Teaching Methodology

This section presents the teaching methodology and the assessment criteria of the course. The subject has 40 assigned hours in the syllabus, spread over 16 sessions of 2.5 hours each. The course is taught in an intensive way, with two lectures per week, lasting for about two months. The sessions are organized so that approximately 40-50% of the available time is devoted to the lectures and the rest to completing small laboratory exercises related to the contents explained that day and to working on the hands-on project.

In addition, we reserve four complete sessions throughout the course to install and configure the computer cluster using virtual machines. A total of 15 classroom hours is devoted to the deployment of the computer cluster by the students. Although the reserved time slots are enough for most of the students, some of them must advance some work at home and/or attend meetings with the teachers to solve some issues. In order to help the students to debug problems when their cluster does not behave as expected, we give them a FAQ (Frequently Asked Questions) document that includes information about the most frequent problems and how to solve them. Typical issues are why Internet access fails on the nodes (usually because the network, the NAT or the DNS are not properly configured); in addition, the server nodes may have problems booting from the PXE server while they are being installed, due to errors in their configuration files. Other mistakes are related to the NFS service, usually due to typos in the configuration files. Finally, typical problems with high-availability tools such as pacemaker, or load balancers such as IPVS, are also described in the FAQ document. The course assessment includes the following aspects. First, the students are encouraged to attend and participate in all the lectures (10%). The successful completion of all the laboratory exercises (15%) is also assessed. In order to cover the part of the course concerning cluster architecture and configuration, the students have to select the components of a hypothetical cluster given some specifications. After selecting the components, they also have to estimate the peak computing power in MFLOPS, the cost and the energy consumption.

Three configurations are considered: high-performance, low-cost and energy-efficient. A final report is written and presented (10%). The most important part of the course assessment corresponds to the hands-on project regarding the deployment of a virtual-machine-based computer cluster. The project has a weight of 35%. A written exam with a 30% weight completes the assessment. The topics of the exam comprise all the contents taught throughout the course, but with special emphasis on the configured cluster. Figure 6 shows some sample questions we have asked the students. It must be noticed that the exam is done in the lab, so that every student may have his/her own cluster up and running. Therefore, they can access the configuration files and/or issue commands to their cluster if they want to do so. Concerning the achieved results, in general, the students like the subject very much. They are interested in the contents and very motivated by the hands-on project. Each time a project step is completed, they are proud of their work. In particular, there are three milestones in the computer cluster development. The first one corresponds to the basic installation of the system, once the cluster is able to boot on its own and the students are able to log in to the master node and issue commands to the server nodes. The second one corresponds to the installation of the load management tools (i.e., HTCondor and MOSIX). At that time, the cluster can be used to submit user jobs. The final and most important milestone is achieved when the high-availability load-balanced web server is set up. The students are very excited when they check that the HTTP requests are being balanced, or when they simulate faults on the server or director nodes and the web server continues working properly.

29 • What file would you modify to configure network interfaces in Ubuntu? Write its contents to automatically assign an IP address to the eth0 interface.

• Which commands must a normal user issue to configure passwordless ssh access from the master node to the servers (public key validation)?

• Which services does the dnsmasq utility provide? Which of them are required at installation time, and which once the system is installed? Which nodes of the cluster must run dnsmasq?

• Explain the purpose of the /etc/ethers file. Which service is this file related to?

• After attaching a new physical disk to a node, which command can you issue to know the assigned device name (for instance, /dev/sdc)?

Figure 6: Sample questions of the written exam

Overall, the students' opinion is that they have learned practical and useful aspects of today's servers. The results obtained in the yearly survey of the University confirm student satisfaction. A five-point Likert scale was used in the questionnaire, with the typical format (Strongly Agree, Agree, Undecided, Disagree, Strongly Disagree). In the last available results, our course obtained 80% Strongly Agree and 20% Agree responses in the questions related to teaching methodology, resources used and learning activities. From a quantitative point of view, the assessment results are overall very good. Usually, all the students pass the subject with very good grades. Typically, only 10% of the students obtain a Fair grade. This latter group corresponds to students who had not devoted enough time to the hands-on project and either did not finish it or did not fully understand all the steps involved in the installation, so they were not able to fully answer the written exam questions.

5. Related Work

Some previous papers have described experiences teaching advanced undergraduate or graduate cluster computing courses. For instance, [1] presents a selection of possible topics for cluster computing courses, based on the experience of the authors teaching this subject at different universities in the USA and Australia. The proposed material is quite broad and covers many different aspects of cluster computing. The goal was to make a proposal that allowed instructors to select the contents best suited to their course objectives. In the actual courses shown as examples in the paper, we find that an important part of the course is frequently focused on parallel programming.

Moreover, depending on the university, more or fewer contents on system architecture for parallel and distributed systems are also covered. Only in one case is part of the course also devoted to building a cluster, shared among the students, to evaluate the performance of different network technologies, network topologies and file systems. Most of these courses were still available in the 2015-16 catalogue. Other works focus on building clusters to use them in different courses. In [2], [5] and [15], real hardware clusters are implemented, while [26] and [30] choose virtualization. In [2], several options to implement clusters based on cluster building kits are compared. However, the paper highlights that usage and cluster management at an advanced level are more flexible and offer more possibilities when manual configuration is employed. A single cluster, consisting of one master and 10 slave nodes, is built and shared among all the students. [5] proposes the construction and management of small high-performance clusters as part of a course aimed at engineering and science students. The goal is to train them since they may need to manage and/or build these systems for their research labs. Each student group develops a 2-node Linux cluster in the laboratory sessions. Most of the services are configured on only one of the nodes and loaded by the other through the DRBL package [9]. Other services, such as SSH and NTP, are manually installed using the YaST tool, thus making the process not scalable to a large number of nodes. The work in [15] presents the development of a cluster computing project for business undergraduate students. A small-scale Linux cluster was implemented, taking advantage of used equipment donated by universities and major corporations.

The course focuses on the benefits of the project for the business students as well as for faculty members. The students did not only implement the cluster but also managed it, providing them with a practical understanding of the studied technologies. Regarding academic staff, the cluster was a valuable classroom resource for teaching several courses related to cluster computing. Some general suggestions about how to teach a cluster computing course are also given. Finally, [26] and [30] draw on virtualization to develop "training" clusters. [26] proposes a very simple Linux cluster made of only two nodes that could be used for developing and debugging parallel software (e.g., using MPI). It is not intended to be a complete cluster solution at all, since it only offers very basic services. Moreover, as the second node is added by cloning the first one, the solution does not scale to hardware (non-virtual) clusters. The paper is mainly focused on describing the configuration of the VirtualBox machine. [30] describes the development of a virtual cluster through the VCNet tool. VCNet is a virtual cluster solution based on Windows 2008 HPC Server. It provides a reliable environment where users can test and debug applications before installing them on the university hardware cluster. The performance of VCNet under different conditions is also evaluated. All the clusters developed in these courses are based on the Linux operating system, apart from [30]. Overall, they use quite basic configurations that do not represent real-life installations. Regarding cluster usage, they are intended for the high performance computing flavour [11], usually ignoring the high performance service one. However, this latter usage represents a very high percentage of cluster installations [11]. Internet servers that handle a high demand on a given service (web, mail, file transfer, database, videos, etc.) are everywhere and are usually based on clusters.

Finally, almost no course addresses high availability or advanced cluster storage. On the contrary, the course proposed in this paper considers both the high performance/high throughput computing and the high availability/high performance service flavours of computer cluster usage. In addition, the course is taught using a hands-on approach, taking advantage of virtualization to allow each student to build his/her own cluster. Despite the reduced size of the cluster deployed, the teaching methodology guides the students to think as if they were building a large cluster.

6. Conclusions

In this paper we have described a master-level course on computer clusters. The course has been developed around four keystones. Firstly, a hands-on but in-depth design, trying to bring the course contents closer to professional practice, as is widely recommended (for instance, by the IEEE Computer Society and the Association for Computing Machinery in their latest Computer Engineering curricula). Secondly, it deals with the main concepts of all types of computer cluster usage, including those that are not usually covered in most cluster-related courses at other universities but are widely used in companies, i.e., High-Availability/High-Performance Service clusters. Thirdly, the course relies on virtualization (and the Linux operating system) to allow every student to develop his/her own cluster without interfering with the work of other students. Fourthly, despite the small size of the implemented cluster, every step of the process is designed to be scalable, by thinking about a large cluster.

Both authors have taught the course for several years. The student assessment results are overall very good. Indeed, student satisfaction at the end of the course is very high, as shown by the results of the surveys conducted by the university. Finally, all the contents of the course have been described in detail so that the paper can provide guidance to other teachers interested in using this approach.

Acknowledgements

This work was supported in part by the Spanish Ministerio de Economía y Competitividad (MINECO) and by FEDER funds under Grant TIN2015-66972-C5-1-R.

[1] A. Apon, R. Buyya, H. Jin, J. Mache. Cluster computing in the classroom: topics, guidelines, and experiences. In: IEEE/ACM Int. Symposium on Cluster Computing and the Grid, pp. 476-483, (2001).

[2] S. Aydin and O.F. Bay. Building a high performance computing cluster to use in computing course applications. Procedia - Social and Behavioral Sciences, 1(1): pp. 2396-2401, (2009).

[3] D.J. Becker, J. Salmon, T. Sterling and D.F. Savarese, How to Build a Beowulf: A Guide to the Implementation and Application of PC Clusters, (MIT Press, 1999).

[4] R. Buyya, High Performance Cluster Computing: Architectures and Systems, Vol. 1, (Prentice Hall, 1999).

[5] M-H. Chen and T-L. Li. Construction of a High-Performance Computing Cluster: A Curriculum for Engineering and Science Students. Computer Applications in Engineering Education, 19(4), pp. 678-684, (2011).

[6] HTCondor: High Throughput Computing, https://research.cs.wisc.edu/htcondor/ (Accessed 10-06-2016).

[7] Top Ten Linux Distributions, http://distrowatch.com/dwres.php?resource=major (Accessed 10-05-2016).

[8] DRBD - User’s guide 9.0, http://www.drbd.org/en/doc/users-guide-90 (Accessed 12-06-2016).

[9] National Center for High-Performance Computing: DRBL: Diskless Remote Boot in Linux. http://drbl.sourceforge.net/ (Accessed 18-05-2016)

[10] Storage for your cloud: GlusterFS, https://www.gluster.org/ (Accessed 12-06-2016)

[11] D. Hyde, M. Baker (editor). Cluster Computing White Paper, pp. 110-119, (2000). Available from http://arxiv.org/pdf/cs/0004014.pdf.

[12] Computer Engineering Curricula 2016 (draft). Association for Computing Machinery (ACM) / IEEE Computer Society, (2015). https://www.computer.org/cms/Computer.org/professional-education/curricula/ComputerEngineeringCurricula2016.pdf

[13] S. K. Prasad, A. Chtchelkanova, F. Dehne, M. Gouda, A. Gupta, J. Jaja, K. Kant, A. La Salle, R. LeBlanc, A. Lumsdaine, D. Padua, M. Parashar, V. Prasanna, Y. Robert, A. Rosenberg, S. Sahni, B. Shirazi, A. Sussman, C. Weems, J. Wu. NSF/IEEE-TCPP Curriculum Initiative on Parallel and Distributed Computing - Core Topics for Undergraduates, Version I, 55 pages, (2012). http://www.cs.gsu.edu/~tcpp/curriculum/index.php

[14] E. Johnson, P. Garrity, T. Yates, R. A. Brown. Performance of a Virtual Cluster in a General-Purpose Teaching Laboratory. In: IEEE Int. Conf. on Cluster Computing, poster 35 (2011).

[15] Integrating IS Curriculum Knowledge through a Cluster-Computing Project - A successful Experiment. Journal of Information Technology Education, 3, pp. 263-278, (2004).

[16] MOSIX Cluster Management System, http://www.mosix.cs.huji.ac.il/ (Accessed 18-05-2016)

[17] Message Passing Interface Forum, https://www.mpi-forum.org/ (Accessed 25-05-2016)

[18] Master’s Degree in Computer and Network Engineering, https://www.upv.es/titulaciones/MUIC/index-en.html (Accessed 15-05-2016).

[19] Master’s Degree in Parallel and Distributed Computing, https://www.upv.es/titulaciones/MUCPD/indexi.html (Accessed 15-05-2016).

[20] T. Haynes, D. Noveck, Network File System (NFS) Version 4 Protocol, https://tools.ietf.org/html/rfc7530, March 2015. (Accessed 12-05-2016)

[21] Project: OCFS2, https://oss.oracle.com/projects/ocfs2/ (Accessed 12-05-2016)

[22] The OpenMP API specification for parallel programming, http://openmp.org/ (Accessed 18-05-2016)

[23] G.J. Pfister, In Search of Clusters, (Prentice Hall, 1998).

[24] Welcome to the PVFS Project, http://www.pvfs.org/ (Accessed 12-05-2016)

[25] Slurm workload manager, http://slurm.schedmd.com/ (Accessed 18-05-2016)

[26] G.K. Thiruvathukal et al. Virtualization for Computational Scientists, Computing in Science & Engineering, 12(4), pp. 52-61, (2010).

[27] TOP500 Lists, http://www.top500.org/lists (Accessed 12-06-2016)

[28] TOP500 Overview, http://www.mellanox.com/page/top_500 (Accessed 6-05-2016)

[29] Oracle VM VirtualBox, https://www.virtualbox.org/ (Accessed 12-06-2016)

[30] R.L. Warrender, J. Tindle and D. Nelson, Development of a Virtual Cluster, In: Int. Conf. on High Performance Computing and Simulation (HPCS), pp. 545-551, (2013).
