Concept and Implementation of CLUSTERIX: National Cluster of Linux Systems

Roman Wyrzykowski1, Norbert Meyer2, and Maciej Stroinski2

1 Czestochowa University of Technology, Institute of Computer & Information Sciences, Dabrowskiego 73, 42-200 Czestochowa, Poland, [email protected], http://icis.pcz.pl
2 Poznan Supercomputing and Networking Center, Noskowskiego 10, 61-704 Poznan, Poland, {meyer, stroins}@man.poznan.pl, http://www.man.poznan.pl

Abstract. This paper presents the concept and implementation of the National Cluster of Linux Systems (CLUSTERIX) - a distributed PC-cluster (or metacluster) of a new generation, based on the Polish Optical Network PIONIER. Its implementation makes it possible to deploy a production Grid environment, which consists of local PC-clusters with 64- and 32-bit Linux machines, located in independent centers across Poland. The management software developed as Open Source allows for dynamic changes in the metacluster configuration. The resulting system will be tested on a set of pilot distributed applications developed as a part of the project. The project is implemented by 12 Polish supercomputing centers and metropolitan area networks.

1 Introduction

PC-clusters using Open Source software such as Linux are now the most common and widely available parallel systems. At the same time, the capabilities of Gigabit/s wide area networks are increasing rapidly, to the point where it becomes feasible, and indeed interesting, to think of a high-end integrated metacluster environment rather than a set of disjoint local clusters. Such metaclusters [3,17,18] can be viewed as key elements of the modern Grid infrastructure, used by scientists and engineers to solve computationally and data demanding problems. In Poland, we have access to all the crucial elements necessary to build a national Linux metacluster. The most important among them is the Polish Optical Network PIONIER [15,16]. It is an intelligent, multi-channel optical network using DWDM technology, with a bandwidth of n x (10, 40, ...) Gb/s, based on the IP protocol. On the transport layer this network provides allocation of dedicated resources for specified applications, Grids, and thematic networks.

2 Project Goals and Status

The main objective of the CLUSTERIX project [1] is to develop mechanisms and tools that allow for the deployment of a production Grid environment whose backbone consists of dedicated, local Linux clusters with 64-bit machines. Local clusters are placed in geographically distant, independent centers connected by the Polish Optical Network PIONIER. It is assumed that, in theory, any Linux cluster may be attached to the backbone dynamically, as a so-called dynamic cluster. As a result, a geographically distributed Linux cluster is obtained, with a dynamically changing configuration, fully operational, and integrated with services offered by other projects. The project started in December 2003 and lasts 32 months. It is divided into two stages: (i) research and development, with an estimated duration of 20 months, and (ii) the deployment stage. The project is implemented by 12 Polish supercomputing centers and metropolitan area networks affiliated with Polish universities, with Czestochowa University of Technology as the project coordinator.

It is important to note the phrase "production Grid", meaning the development of a software/hardware infrastructure accessible for real computing, fully operational and integrated with services offered by other projects related to the PIONIER program [16], e.g., the National Computational Cluster based on the LSF batch system, the National Data Warehouse, and the virtual laboratory project. Delivering advanced and specialized services integrated into a single coherent system requires additional mechanisms not available in the existing pilot installations (see, e.g., the CrossGrid testbed [2]). They are commonly constrained by the assumption of a static infrastructure in terms of the number of nodes and services provided, as well as the number of users organized into virtual organizations. On the contrary, in CLUSTERIX we provide mechanisms and tools for an automated attachment of dynamic clusters; for example, non-dedicated clusters or labs may be attached to the backbone during the night or at weekends.

In the CLUSTERIX project, a lot of emphasis is laid on the usage of the IPv6 protocol [8] and its added functionality - enhanced reliability and QoS. This functionality, delivered to the application level and at least used in the middleware, would allow for a better quality of services. No production, IPv6-based Grid infrastructure exists at present, but taking into account the duration of the project it may be assumed that the IPv6 standard will be widely used. Therefore, the developed tools will support both IPv6 and IPv4. After the system is built, it will be tested on a set of pilot applications created as a part of the project. An important goal of the project is also to support potential CLUSTERIX users in the preparation of their Grid applications, thus creating a group of people able to use the cluster in an optimal way after the research and deployment works are finished.

3 Pilot Installation

The CLUSTERIX project includes a pilot installation (Fig. 1) consisting of 12 local clusters located in independent centers across Poland. They are interconnected via dedicated 1 Gb/s channels provided by the PIONIER optical network.


Fig. 1. Pilot installation in the CLUSTERIX project

The core of the testbed is equipped with 127 Intel Itanium2 nodes managed by the Linux OS (Debian distribution, kernel 2.6.x). A computational node includes two Itanium2 processors (1.3 GHz, 3 MB cache), 4 GB or 8 GB RAM, a 73 GB or 146 GB SCSI HDD, as well as two network interfaces (Gigabit Ethernet, and InfiniBand or Myrinet). This dual network interface allows for creating two independent communication channels, dedicated to the exchange of messages during computations and to NFS support. Efficient access to the PIONIER backbone is provided through a Gigabit Ethernet L2/L3 coupling switch (see Fig. 2).

Fig. 2. Architecture of the CLUSTERIX infrastructure

Selected 32-bit machines are dedicated to the management of local clusters and the entire infrastructure. While users' tasks are allowed to execute only on computational nodes, each local cluster is equipped with an access node where the Globus Toolkit [5] and the local batch system are running. All machines inside a local cluster are protected by a firewall, which is also used as a router for the attachment of dynamic clusters. Access to resources of the National Linux Cluster is allowed only from machines called entry points; physical users can possess their accounts only on these dedicated nodes. It is assumed that end-users' applications are submitted to the CLUSTERIX system through WWW portals. An important element of the pilot installation is the Data Storage System. Before the execution of an application, input data are fetched from storage elements and transferred to access nodes; after the execution, output data are returned from access nodes to storage elements. The Data Storage System includes a distributed implementation of a data broker. Currently each storage element is equipped with a 2 TB HDD.

4 Pilot Applications

The National Linux Cluster will be used for running HTC applications, as well as large-scale distributed applications that require the parallel use of resources of one or more local clusters (meta-applications). In the project, selected end-users' applications are being developed for the experimental verification of the project assumptions and deliverables, as well as to achieve real application results. It is clear that applications and their ability to use distributed resources efficiently will ultimately decide on the success of computational Grids. Because of the hierarchical architecture of the CLUSTERIX infrastructure, it is not a trivial issue to adapt an application for efficient execution on the metacluster. This requires parallelization on several levels corresponding to the metacluster architecture, taking into account heterogeneity in both the computing power of different nodes and the network performance between various subsystems. Another problem is the variable availability of Grid components. In the CLUSTERIX project, the MPICH-G2 tool [10], based on the Globus Toolkit, is used as a Grid-enabled implementation of the MPI standard. The list of pilot applications includes, among others:

– FEM modeling of castings solidification;
– modeling of transonic flows and design of advanced tip devices;
– prediction of protein structures from a sequence of amino acids, and simulation of protein folding;
– investigation of properties of bio-molecular systems, for drug design;
– large-scale simulations of blood circulation in micro-capillaries;
– astrophysical simulations;
– the GAMESS package in the CLUSTERIX environment.

5 CLUSTERIX Middleware

5.1 Technologies and Architecture

The middleware developed in the project should allow for:

– managing clusters with a dynamically changing configuration, including temporarily attached clusters;
– submitting, executing and monitoring HPC/HTC applications according to users' preferences;
– efficient management of users and virtual organizations;
– effective management of network resources, with the use of IPv6 protocols;
– integration of services delivered as the outcome of other projects, especially those related to the PIONIER program, e.g., the data warehouse and other computational services;
– respecting local policies of administration and management within independent domains;
– convenient access to resources and applications, using an integrated interface;
– a high level of reliability and security in the heterogeneous environment.

The CLUSTERIX software is developed as Open Source, and is based on the Globus Toolkit 2.4 and Web Services, with Globus 2.4 available in the Globus 3.2 distribution. The use of Web Services makes the created software easier to reuse, and allows for interoperability with other Grid systems on the service level. It is important to note that initially the OGSI/OGSA concept [4], implemented in Globus 3, was assumed to be used in CLUSTERIX. However, due to the rapid transition to Globus 4 and the WS-Resource Framework, we had to reject the initial decision and chose Globus 2.4 as the only possibility to build the CLUSTERIX production environment, taking into account the time limitations of the project. The usage of the Open Source approach allows anybody to access the project source code, modify it and publish the changes. This makes the software more reliable and secure. Open software is easier to integrate with newly developed and existing software, like the GridLab resource management system [6], which is adopted in the CLUSTERIX project. The architecture of the CLUSTERIX middleware is shown in Fig. 3. In the successive subsections, we describe concisely some key components of this middleware.

5.2 Resource Management System

In CLUSTERIX, we build on the GridLab Resource Management System (GRMS) developed in the GridLab project [6]. The main functionality of GRMS includes:

– ability to choose the best resource for the task execution, according to the job description and a chosen mapping algorithm;
– submitting the GRMS task according to the job description;

Fig. 3. Architecture of the CLUSTERIX middleware

– ability to migrate the GRMS task to a better resource;
– ability to cancel the task;
– providing information about the task status, and other information about tasks, e.g., the name of the host where the task is/was running;
– ability to transfer input and output files.
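To make this list of operations more concrete, the sketch below walks through a minimal job life cycle (submit, query status, migrate, cancel) against a toy broker. The class and method names are purely illustrative and do not reproduce the actual GRMS Web Service interface or its XML job descriptions.

```python
# Illustrative sketch of a GRMS-style job life cycle (hypothetical API,
# not the real GRMS Web Service interface).
from dataclasses import dataclass
from itertools import count
from typing import Dict, Optional

_ids = count(1)

@dataclass
class Task:
    description: str            # job description (in GRMS this is an XML document)
    host: Optional[str] = None
    status: str = "QUEUED"

class ResourceBroker:
    """Toy broker mimicking the GRMS operations listed above."""

    def __init__(self, resources):
        self.resources = resources              # e.g. access nodes of local clusters
        self.tasks: Dict[int, Task] = {}

    def submit(self, description: str) -> int:
        task_id = next(_ids)
        task = Task(description)
        task.host = self._choose_resource(task)  # map the task to the "best" resource
        task.status = "RUNNING"
        self.tasks[task_id] = task
        return task_id

    def _choose_resource(self, task: Task) -> str:
        # Real GRMS consults the information system and a mapping algorithm;
        # here we simply pick the first available resource.
        return self.resources[0]

    def migrate(self, task_id: int, new_host: str) -> None:
        self.tasks[task_id].host = new_host      # move the task to a better resource

    def cancel(self, task_id: int) -> None:
        self.tasks[task_id].status = "CANCELLED"

    def status(self, task_id: int) -> str:
        t = self.tasks[task_id]
        return f"{t.status} on {t.host}"

broker = ResourceBroker(["access.poznan", "access.czestochowa"])
job = broker.submit("<grmsjob> ... </grmsjob>")  # placeholder job description
print(broker.status(job))                        # RUNNING on access.poznan
```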

This approach implies the necessity to integrate GRMS with services developed in the CLUSTERIX project, such as the monitoring/information system, the data management system, the checkpointing mechanism, and the management of users' accounts and virtual organizations. The additional functionality of GRMS developed for CLUSTERIX includes: communication with resource management systems in different domains, cooperation with the network resource management system, support for MPICH-G2, and a prediction module. The prediction module is crucial for providing an efficient use of available resources. The basic functionality of this module includes:

– ability to predict execution times and resource demands of tasks, using available information about resources and tasks;
– prediction of the time spent by tasks in queues of local batch systems;
– ability to take into account prediction errors, and find resource assignments that are the least sensitive to these errors.

The prediction module uses reasoning techniques based on knowledge discovery, statistical approaches and rough sets.
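As a rough illustration of how such predictions can drive resource selection, the following sketch ranks resources by a pessimistic completion-time estimate, i.e., the prediction inflated by the relative error observed for that resource. The data layout and numbers are invented for illustration and do not come from the actual prediction module.

```python
# Toy illustration of prediction-based, error-aware resource selection
# (illustrative only, not the CLUSTERIX prediction module).

def predicted_completion(resource, task):
    """Predicted completion time = predicted queue wait + predicted execution time."""
    return resource["queue_wait"] + task["work"] / resource["speed"]

def choose_resource(resources, task):
    # Rank resources by a pessimistic estimate: the prediction inflated by the
    # relative error seen for that resource in the past, so that assignments
    # least sensitive to prediction errors are preferred.
    def pessimistic(resource):
        return predicted_completion(resource, task) * (1.0 + resource["rel_error"])
    return min(resources, key=pessimistic)

resources = [
    {"name": "cluster-A", "speed": 10.0, "queue_wait": 300.0, "rel_error": 0.8},
    {"name": "cluster-B", "speed": 6.0,  "queue_wait": 120.0, "rel_error": 0.1},
]
task = {"work": 3600.0}   # abstract amount of work

print(choose_resource(resources, task)["name"])   # cluster-B wins despite lower speed
```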

5.3 Data Management System

Grid applications deal with large volumes of data. Consequently, effective data-management solutions are vital for Grids. For the CLUSTERIX project, the Clusterix Data Management System (CDMS) has been developed, based on the analysis of existing implementations and users' requirements [12]. Special attention has been paid to making the system user-friendly and efficient, aiming at the creation of a reliable and secure Data Storage System [9]. Taking into account Grid-specific networking parameters - different bandwidth, current load and network technologies between geographically distant sites - CDMS tries to optimize data throughput via replication and replica selection techniques. Another key feature to be considered during the implementation of Grid data services is fault tolerance. In CDMS, the modular design and distributed operation model assure the elimination of a single point of failure. In particular, multiple instances of the data broker are running concurrently, and their coherence is provided by a synchronization subsystem. The basic technologies used in the development of CDMS include the GridFTP and GSI components of Globus 2.4, as well as Web Services implemented using the GSOAP plugin from GridLab [6].
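As a simple illustration of the replica selection idea mentioned above, a broker can score each replica by the expected transfer time derived from the measured bandwidth and current load of the link to the requesting site. The sketch below follows this idea with hypothetical data structures; it is not the actual CDMS interface.

```python
# Hypothetical sketch of bandwidth-aware replica selection (not the CDMS API).

def transfer_time(replica, file_size_mb):
    """Estimated transfer time: size / (bandwidth scaled down by current load)."""
    effective_bw = replica["bandwidth_mbps"] * (1.0 - replica["load"])
    return (file_size_mb * 8.0) / max(effective_bw, 1e-3)   # seconds

def select_replica(replicas, file_size_mb):
    # Pick the replica with the lowest estimated transfer time to the client.
    return min(replicas, key=lambda r: transfer_time(r, file_size_mb))

replicas = [
    {"site": "poznan",      "bandwidth_mbps": 1000.0, "load": 0.7},
    {"site": "czestochowa", "bandwidth_mbps": 1000.0, "load": 0.1},
    {"site": "gdansk",      "bandwidth_mbps": 100.0,  "load": 0.2},
]

best = select_replica(replicas, file_size_mb=2048)
print("fetch from:", best["site"])   # czestochowa (lightly loaded gigabit link)
```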

5.4 Network Resource Management System

The Polish Optical Network PIONIER, which is used in CLUSTERIX as the backbone interconnect, is based on the DWDM optical technology and the 10 Gigabit Ethernet standard. PIONIER allows the creation of dedicated VLANs based on the 802.1q standard, as well as setting traffic priorities based on the 802.1p standard. Additionally, the Black Diamond 6808 switches from Extreme Networks, which are installed in the backbone, support a proprietary protocol which allows us to guarantee bandwidth for a given VLAN.

Based on the requirements of the CLUSTERIX middleware and pilot applications, it has been decided to establish two dedicated VLANs within the PIONIER network:

– a computational network with a bandwidth of 1 Gb/s;
– a management network with a bandwidth of 100 Mb/s, dedicated to the configuration of the metacluster, measurement purposes, software upgrading, etc.

In accordance with the project goals, the IPv6 protocol is extensively deployed in the system infrastructure and middleware. Apart from the advantages mentioned in Section 2, the use of IPv6 offers other benefits, e.g., the mobile IPv6 protocol allows us to use fixed IPv6 addresses for dynamic clusters, irrespective of the place of their attachment. Note that the IPSec mechanism is used to improve security on the network level, for both the IPv6 and IPv4 protocols.

In CLUSTERIX, the SNMP protocol is used for the management of all elements that are critical for network operation, like backbone and coupling switches, access nodes, storage elements, and firewalls. This allows us to build the Network Resource Management System, which contains the following main components:

– measurement agents;
– a database containing the results of measurements;
– a network resource manager (network broker);
– a graphical user interface.

The results of measurements are then used for traffic management and in the Clusterix Data Management System. Providing the required quality of network services is extremely important to deliver the performance capacities of the CLUSTERIX infrastructure to end-users' applications. For this aim, the following techniques are deployed:

– creation of two VLANs in the computational network, with different priorities (normal and high);
– tagging of IP packets (especially promising for the IPv6 protocol);
– differentiated services.
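Returning to the measurement database and the network broker listed among the components above, the sketch below shows how measurement records might be stored and queried before scheduling a transfer. The record layout and names are assumptions made for illustration, not the real system.

```python
# Illustrative measurement store and network broker query (hypothetical layout,
# not the CLUSTERIX Network Resource Management System).
import time

class MeasurementStore:
    """Keeps the most recent measurement per (src, dst) link."""

    def __init__(self):
        self.links = {}

    def record(self, src, dst, bandwidth_mbps, rtt_ms):
        # A measurement agent would call this periodically for each monitored link.
        self.links[(src, dst)] = {
            "bandwidth_mbps": bandwidth_mbps,
            "rtt_ms": rtt_ms,
            "timestamp": time.time(),
        }

    def available_bandwidth(self, src, dst):
        entry = self.links.get((src, dst))
        return entry["bandwidth_mbps"] if entry else None

# Measurement agents populate the store (the values below are made up).
store = MeasurementStore()
store.record("poznan", "czestochowa", bandwidth_mbps=870.0, rtt_ms=9.5)
store.record("poznan", "gdansk",      bandwidth_mbps=430.0, rtt_ms=6.1)

# A network broker (or CDMS) can then query the store before scheduling a transfer.
print(store.available_bandwidth("poznan", "czestochowa"))
```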

5.5 Management of Users' Accounts and Virtual Organizations

Grids, as geographically distributed, large-scale environments, place strong demands on the management of users' accounts and virtual organizations (VOs) [4].

Apart from scalability, fault tolerance and security, the main requirement is flexibility, because management tools must be able to take into account the different roles of:

– the user,
– the administrator of a resource,
– the manager of a virtual organization,
– the manager of a group of resources.

Unfortunately, existing tools do not support a flexible policy of resource authorization and accounting.

A new management tool developed in CLUSTERIX features an open architecture based on plugins. This allows for different methods of authorization. The architecture of this tool is highly distributed in order to provide scalability and fault tolerance. Another important feature is the dynamic assignment of accounts, since a pool of accounts is assigned to users when required. The availability of a Site Accounting Information System (SAIS) and a Virtual Organization Information System (VOIS) for every site and VO, respectively, gives the tool the ability to collect and store accounting information across all sites and VOs. The kernel of this tool is the Globus Authorization Module (GAM) - an extension of the Globus gatekeeper. GAM provides different authorization plugins, which implement different authorization policies such as:

– accept all users from the grid-mapfile;
– ban users put on the black list;
– accept all users from a certain VO;
– query a remote authorization system;
– accept all users with a certificate matching a given template.

GAM also collects basic accounting information (time, user, account, etc.), which is stored and processed in SAISs and VOISs.
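The plugin-based authorization policies listed above can be pictured as a chain of small policy objects, each of which may accept, deny, or abstain for a given certificate subject and VO. The following sketch is a schematic illustration only; it is not the GAM code and does not reflect the Globus gatekeeper interface.

```python
# Schematic illustration of plugin-based authorization (not the actual GAM code).
from fnmatch import fnmatch

class GridMapFilePlugin:
    """Accept users listed in a grid-mapfile-like mapping."""
    def __init__(self, mapping):
        self.mapping = mapping                      # {certificate subject: local account}
    def decide(self, subject, vo):
        return "accept" if subject in self.mapping else "abstain"

class BlackListPlugin:
    """Ban users put on the black list."""
    def __init__(self, banned):
        self.banned = set(banned)
    def decide(self, subject, vo):
        return "deny" if subject in self.banned else "abstain"

class VOPlugin:
    """Accept all users belonging to a given virtual organization."""
    def __init__(self, accepted_vo):
        self.accepted_vo = accepted_vo
    def decide(self, subject, vo):
        return "accept" if vo == self.accepted_vo else "abstain"

class TemplatePlugin:
    """Accept users whose certificate subject matches a given template."""
    def __init__(self, template):
        self.template = template
    def decide(self, subject, vo):
        return "accept" if fnmatch(subject, self.template) else "abstain"

def authorize(plugins, subject, vo):
    # A deny from any plugin wins; otherwise any accept grants access.
    decisions = [p.decide(subject, vo) for p in plugins]
    if "deny" in decisions:
        return False
    return "accept" in decisions

plugins = [
    BlackListPlugin(["/C=PL/O=Bad/CN=Eve"]),
    GridMapFilePlugin({"/C=PL/O=PCz/CN=Alice": "clx001"}),
    VOPlugin("clusterix-chemistry"),
    TemplatePlugin("/C=PL/O=PSNC/*"),
]
print(authorize(plugins, "/C=PL/O=PSNC/CN=Bob", vo="other-vo"))   # True (template match)
```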

5.6 User Interface

The main design goal is to create a flexible portal interface that can be easily extended and adapted for use with different applications. We focus on the separation of the visualization part from the application's logic, as well as on the possibility of extending the framework at run-time. Support for the VRML, X3D, SVG and chart (JPEG, PNG) output presentation formats is also an important goal, as well as security and fault tolerance. It is necessary to provide adaptation of ready end-users' applications, and their seamless installation on multiple hosts. These requirements constrain us to use SSH as the communication channel (see Fig. 4), where the only way of communication between the portal and CLUSTERIX services is interaction with the SSH session server. The security of the interface is provided by using encrypted communication protocols, and by storing certificates on entry points, not in portals. A new feature is the use of XML-based application extensions, called by us parsers, which allow us to describe the rules of interaction between users and applications, including the input data format, output data parsing and visualization (a schematic illustration is given at the end of this subsection).


Fig. 4. Architecture of the user interface

Parsers are dynamically loaded Perl components, which are generated based on the XML descriptions provided by users. In combination with SSH sessions and pseudo-terminals, which have been successfully used for the implementation of the Web Condor Interface [11], parsers and application-specific managers allow us to gain persistence and interaction possibilities in Grids. The resulting SSH Session Server Framework allows for a fully distributed implementation, which is an efficient way to provide fault-tolerant features. The proposed framework is used together with the GridSphere Portlet Framework [7], an outcome of the GridLab project. This gives us the possibility to use a variety of built-in features of the GridSphere technology, such as users' secure space, chart generation, etc. Fig. 5 presents a portal screenshot for a demo application which has been developed [14] for the FEM modeling of heat transfer in castings.
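The role of the XML application descriptions can be illustrated schematically: an XML document declares how result values are recognized in an application's text output, and a small generated parser applies those rules. The element names and the Python rendering below are assumptions made for illustration; the real parsers are generated Perl components with their own schema.

```python
# Schematic illustration of XML-driven output parsing (element names are assumed;
# the real CLUSTERIX parsers are generated Perl components).
import re
import xml.etree.ElementTree as ET

DESCRIPTION = """
<application name="heat-transfer-demo">
  <output>
    <value name="max_temperature" pattern="Tmax\\s*=\\s*([0-9.]+)"/>
    <value name="iterations"      pattern="iterations:\\s*([0-9]+)"/>
  </output>
</application>
"""

def build_parser(xml_text):
    """Compile the <value> rules of an application description into regexes."""
    root = ET.fromstring(xml_text)
    rules = {v.get("name"): re.compile(v.get("pattern"))
             for v in root.iter("value")}

    def parse(output_text):
        # Apply each rule to the application's text output.
        return {name: (m.group(1) if (m := rx.search(output_text)) else None)
                for name, rx in rules.items()}
    return parse

parse = build_parser(DESCRIPTION)
print(parse("iterations: 124\nTmax = 1453.7\n"))
# -> {'max_temperature': '1453.7', 'iterations': '124'}
```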

6 Conclusions

This paper presents the concept and implementation of CLUSTERIX, a geographically distributed Linux cluster based on the Polish Optical Network PIONIER.

Fig. 5. A portal screenshot for the heat transfer demo application

The main objective of the CLUSTERIX project is to develop mechanisms and tools that allow for the deployment of a production Grid environment. The CLUSTERIX backbone consists of dedicated, local Linux clusters with 64-bit machines. It is assumed that so-called dynamic clusters may be attached to the backbone dynamically. For example, non-dedicated clusters and labs may be attached to the backbone during the night or at weekends. As a result, CLUSTERIX is an open and complex structure whose efficient management is not a trivial issue. Among the most important problems not covered in this paper are: security issues, management of cluster software, monitoring of cluster nodes, and checkpointing. For example, the possible attachment of dynamic clusters results in an untrusted environment which is difficult to secure comprehensively. While solving the security issues in CLUSTERIX, the major approach is to integrate existing products and find the configuration which provides the best security level for both types of clusters - backbone and dynamic ones.

Acknowledgements. The CLUSTERIX project has been funded by the Polish Ministry of Science and Information Society Technologies under grant 6T11 2003C/06098. We would also like to thank Intel Corporation for sponsoring the project and for help in building the pilot installation.

References

1. CLUSTERIX Project Home Page, http://clusterix.pcz.pl
2. CrossGrid Exploitation Website, http://www.crossgrid.org
3. DemogGrid Project, http://www.lpds.sztaki.hu
4. Foster, I., Kesselman, C., Nick, J.M., Tuecke, S.: The Physiology of the Grid. In: Grid Computing - Making the Global Infrastructure a Reality, J. Wiley & Sons (2003) 217-249
5. Globus Project Home Page, http://www.globus.org
6. GridLab: A Grid Application Toolkit and Testbed, http://www.gridlab.org
7. GridSphere Portal, http://www.gridsphere.org
8. IPv6: The Next Generation, http://www.ipv6.org
9. Karczewski, K., Kuczynski, L., Wyrzykowski, R.: Secure Data Transfer and Replication Mechanisms in Grid Environments. In: Proc. Cracow'03 Grid Workshop, Cracow (2003) 190-196
10. Karonis, N., Toonen, B., Foster, I.: MPICH-G2: A Grid-Enabled Implementation of the Message Passing Interface. Journal of Parallel and Distributed Computing (JPDC), Vol. 63, No. 5 (2003) 551-563
11. Kuczynski, T., Wyrzykowski, R.: Cluster Monitoring and Management in the WebCI Environment. Lect. Notes in Comp. Sci. 3019 (2004) 375-382
12. Kuczynski, L., Karczewski, K., Wyrzykowski, R.: Clusterix Data Management System. In: Proc. Cracow'04 Grid Workshop, Cracow (2004) (in print)
13. Olas, T., Karczewski, K., Tomas, A., Wyrzykowski, R.: FEM Computations on Clusters Using Different Models of Parallel Programming. Lect. Notes in Comp. Sci. 2328 (2002) 170-182
14. Olas, T., Wyrzykowski, R.: Porting Thermomechanical Applications to the CLUSTERIX Environment. In: Proc. Cracow'04 Grid Workshop, Cracow (2004) (in print)
15. PIONIER Home Page, http://www.pionier.gov.pl
16. Weglarz, J.: Poznan Networking and Supercomputing Center: 10 Years of Experience in Building IT Infrastructure for e-Science in Poland, http://www.man.poznan.pl/10years/papers/weglarz.ppt
17. The TeraGrid: A Primer, http://www.teragrid.org
18. Wyrzykowski, R., Meyer, N., Stroinski, M.: PC-Based LINUX Metaclusters as Key Elements of Grid Infrastructure. In: Proc. Cracow'02 Grid Workshop, Cracow (2002) 96-103