Designing and Implementing a Parallel Computing Curriculum Based on Beowulf Clustering

2006-1733: DESIGNING AND IMPLEMENTING A PARALLEL COMPUTING CURRICULUM BASED ON BEOWULF CLUSTERING

Fitra Khan, University of Texas-Brownsville
Mahmoud Quweider, University of Texas-Brownsville
Juan Iglesias, University of Texas-Brownsville
Amjad Zaim, University of Texas-Brownsville

© American Society for Engineering Education, 2006

Introduction

The Computer Science/Computer Information Systems (CS/CIS) Department at The University of Texas at Brownsville (UTB) has improved its curriculum by including parallel computing topics based on a computing and networking laboratory (CNL) [1]. CNL is built around a 24-node distributed Beowulf [2,3] supercomputer. Its main goal is to enhance the understanding of parallel computing principles in key courses of the Bachelor of Science in Computer Science (BS-CS) degree, the two-year Associate in Applied Science in Computer Information Systems (AAS-CIS), and the four-year Bachelor of Applied Technology in Computer Information Systems Technology (BAT-CIST). The strategy has been to use this supercomputer as the main instrument to infuse concepts and principles into targeted courses by creating a set of laboratory modules and capstone projects. Such a project framework in CS education is strongly emphasized in the ACM/IEEE-CS curricula model [4]. CNL has helped motivate students by engaging them in integrating distributed computing and networking concepts into their course work through these laboratory modules and capstone projects.

There are benefits in joining the practice and theory of different computer science areas via an integrated laboratory environment such as the one provided by CNL. First, it is easier to develop laboratory modules that help students put different theoretical concepts together [5,6].
Second, an integrated laboratory is a low-cost solution compared to developing separate physical laboratories to serve different areas of computer science.

The laboratory has proved to be a dynamic educational tool for providing in-depth understanding of essential concepts by incorporating state-of-the-art technologies into the curricula. This has allowed educators to keep developing new laboratory modules to enrich their courses. In addition to the modules currently implemented in areas such as networking, databases, and operating systems, new modules are planned in areas such as encryption, autonomous intelligent systems, and web design and programming.

After being supported originally by NSF, the CNL project has reached maturity and is now institutionalized. This paper details the rationale, scope, and achievements of the project. The methodology used is also discussed, with emphasis on considerations and feasibility for implementing a similar computing and networking environment at peer institutions.

This material is based upon work supported by the National Science Foundation under Grant No. 0101648.

Laboratory Design

The CNL project is built around the concept of a laboratory that offers laboratory projects in key courses of computer science. The equipment consists mainly of 24 computers, three Alpha workstations, and network hardware used to build a 24-node rack-mounted Beowulf cluster. The cluster currently runs the open source Linux operating system. The clustering software is based on a Message Passing Interface (MPI) package called MPICH [7], which is available free of cost. MPI-based software packages and toolkits are used to familiarize students with real-world tools for developing and implementing algorithms with a short development cycle.
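As a rough illustration of the message-passing style that MPI packages support, the following sketch splits a list of numbers across several workers, has each worker compute a partial sum, and combines the partial results. It is only a sketch under stated assumptions: it does not use MPICH or any MPI library, threads stand in for cluster nodes, queues stand in for send and receive operations, and the names `worker` and `parallel_sum` are hypothetical and do not come from the laboratory modules.

```python
import queue
import threading


def worker(rank, inbox, outbox):
    """Receive one chunk of numbers, sum it, and send the partial result back."""
    chunk = inbox.get()              # blocking receive (stand-in for an MPI receive)
    outbox.put((rank, sum(chunk)))   # send to the master (stand-in for an MPI send)


def parallel_sum(data, num_workers=4):
    """Scatter data across workers and reduce their partial sums to one total."""
    inboxes = [queue.Queue() for _ in range(num_workers)]
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(rank, inboxes[rank], results))
        for rank in range(num_workers)
    ]
    for t in threads:
        t.start()

    # Scatter: worker r receives every num_workers-th element, starting at r.
    for rank in range(num_workers):
        inboxes[rank].put(data[rank::num_workers])

    # Reduce: collect exactly one partial sum per worker and add them up.
    total = 0
    for _ in range(num_workers):
        _rank, partial = results.get()
        total += partial

    for t in threads:
        t.join()
    return total


if __name__ == "__main__":
    # 24 workers is an illustrative choice mirroring the 24-node cluster.
    print(parallel_sum(list(range(1, 101)), num_workers=24))  # prints 5050
```

On an actual cluster the workers would be separate processes on different nodes and the queues would be replaced by the MPI library's communication calls, but the scatter, compute, and reduce structure stays the same.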
The Beowulf cluster is complemented with devices for developing real-world laboratory projects that enhance student understanding of important computer science concepts. For example, network devices that simulate leased lines of a Public Switched Network (PSN) [8] are installed to provide a realistic network environment. As another example, image capturing devices were acquired to capture an image of an object for recognition by a neural-network-based pattern recognition algorithm.

The hardware is interconnected using auxiliary network devices to provide a locally simulated PSN that models the real-world connectivity environment. The network hardware includes a 400Mbps network switching matrix with a 100Mbps Fast Ethernet uplink to the building's Gigabit backbone. The backplane is built from 10Mbps switches (VN900EE and VN900EA) providing a total of one ATM port and 36 10Mbps switched ports. The building's LAN is connected to the Internet via a GigaMAN circuit leased from the local phone provider.

One of the three Alpha workstations is the management station for the Beowulf: it distributes tasks to the cluster and accepts tasks from connected users. The second Alpha workstation is used to compile and analyze results produced by the Beowulf; it is a dedicated user workstation because of the graphics required to analyze data. The third Alpha workstation is placed on the far side of the simulated PSN. Among its many tasks, it is used to simulate congestion on the PSN by transferring large amounts of data back and forth across it.

Figure 1 shows the general schematic of CNL. The laboratory houses the 24 computers that constitute the 24-node rack-mounted Beowulf as a central component of B-CEIL. Network devices are required to simulate a real-world PSN.
These consist of a pair of T1-to-V.35 devices to simulate a leased line [8], a pair of DACs to aggregate or cross-connect different T1 channels, a pair of routers to provide WAN-to-LAN connectivity at each end of the leased line, and VoIP units at each end to simulate real-world voice-grade channels. The Beowulf nodes and other LAN equipment are connected by a hub backplane. A LAN switch provides connectivity to the LAN devices on the other side of the simulated PSN.

Figure 1: Overall CNL configuration.

Hardware is also available to introduce image processing algorithms [9]. Transducers are used to convert video signals to a binary frame format fit for image processing [10]. A high-resolution Charge-Coupled Device (CCD) camera and a comparable image capturing card are used to capture high-resolution real-world images for processing. A pair of video codecs is used for benchmarking student algorithms.

Implementation

The authors participated in the implementation of the project; each one was scheduled to teach two different courses per semester, for which the corresponding laboratory modules (LMs) were developed. A total of eight courses were selected to use B-CEIL in the first year of the project: COSC 3330 Networking and Database Management Systems, COSC 3310 Systems Programming and Concurrent Processes, COSC 3325 Digital Logic and Computer Organization, COSC 4310 Operating Systems, COSC 3355 Principles of Programming Languages, COSC 4342 Database Management Systems, COSC 4360 Numerical Methods, and COSC 4380 Image Processing.

Two levels of student laboratory projects were developed for curriculum enrichment. Appendix A presents a finer breakdown of the LMs, including the subject areas in which they were used to enhance understanding of essential concepts.
The first level of student laboratory projects was related directly to the Beowulf cluster itself, specifically its hardware architecture, connectivity, and existence as a logical cluster. This entails the development of laboratory projects on topics such as computer interfacing, Local Area Networking (LAN), clustering, task scheduling and optimization, and benchmarking.

The second level of student laboratory projects was focused on setting up a PSN and associated data/voice channels to model real-world connectivity, and on building applications for the Beowulf in the simulated PSN environment. This includes the development of laboratory projects in Wide Area Networking (WAN), to enhance understanding of real-world PSN-based connectivity, and in computationally intensive fields such as artificial neural networks, image compression, image analysis, numerical analysis, and distributed databases, where parallel processing concepts may be used to speed up computations.

LMs were developed so that students could complete them in one to two weeks of regular semester instruction. In addition to LMs, course projects (CPs) were also proposed to students. These CPs were long-term in nature: they were intended to be developed over the entire semester and carried on across different offerings of the same course from one semester to another. Topics of CPs were not restricted to the scope of a single course. Instead, CPs were developed with a cross-disciplinary emphasis that could integrate different areas of computer science. Appendix B gives a more detailed description of the CPs. As Appendix B shows, the topics of CPs are wide-ranging, from an “Integrated Monitoring System” for public networks to the “Parallel Simulation of Electromagnetic Wave Propagation” and “Optimization Based on Genetic Algorithms”.
This variety is in fact a reflection of the versatility and generality of the CNL.

Results

During the three years of its implementation, the project has proven to be successful. The laboratory was opened in fall 2002 and it has remained operational ever since. During this
