Clutter to Cluster
COVER STORY: PelicanHPC
Turn your desktop computer into a high-performance cluster with PelicanHPC. Crunch big numbers with your very own high-performance computing cluster.
By Mayank Sharma
Cover image: Andrea Danti, Fotolia

If your users are clamoring for the power of a data center but your penurious employer tells you to make do with the hardware you already own, don't give up hope. With some time, a little effort, and a few open source tools, you can transform your mild-mannered desktop systems into a number-crunching supercomputer. For the impatient, the PelicanHPC Live CD will cobble off-the-shelf hardware into a high-performance cluster in no time.

The PelicanHPC project is the natural evolution of ParallelKnoppix, a remastered Knoppix with packages for clustering. Michael Creel developed PelicanHPC for his own research work. Creel was interested in learning about clustering, and because adding packages was so easy, he added PVM, cluster tools like the Ganglia monitor, applications like GROMACS, and so forth. He also included some simple examples of parallel computing in Fortran, C, Python, and Octave to provide basic working examples for beginners.

However, the process of maintaining the distribution was pretty time consuming, especially when it came to updating packages such as X and KDE. That's when Creel discovered Debian Live, spent time wrapping his head around the live-helper package, and created a more systematic way to make a Live distro for clustering. So in essence, PelicanHPC is a single script that fetches the required packages off a Debian repository, adds some configuration scripts and example software, and outputs a bootable ISO.

Boot PelicanHPC

Later in the article, I'll use the script to create a custom version. For now, I'll use the stock PelicanHPC release (v1.8) from the website [1] to put those multiple cores to work. Both 32-bit and 64-bit versions are available, so grab the one that matches your hardware.

The developer claims that with PelicanHPC you can get a cluster up and running in five minutes. However, this is a complete exaggeration – you can do it in under three.

First, make sure you get all the ingredients right: You need a computer to act as a front-end node and others that'll act as slave computing nodes. The front end and the slave nodes connect via the network, so they need to be part of a local LAN. Although you can connect them via wireless, depending on the amount of data being exchanged, you could run into network bottlenecks. Also, make sure the router between the front end and the slaves isn't running a DHCP server, because the front end doles out IP addresses to the slaves.

Although you don't really need a monitor, keyboard, or mouse on the slave nodes, you do need them on the front end. If you have a dual core with enough memory, it wouldn't be a bad idea to run the front end on a virtual machine and the slaves on physical machines. Primarily, PelicanHPC runs in memory, so make sure you have plenty. If you're doing serious work on the cluster, you can make it save your work to the hard disk, in which case, make sure you have a hard disk attached. In fact, to test PelicanHPC, you can run it completely on virtual hardware with virtual network connections, provided you have the juice on the physical host to power so much virtual hardware.

With the hardware in place, pop the Live CD into the front-end node and let it boot. If you want to choose a custom language, turn off ACPI, or tweak some other boot parameters, you can explore the boot options behind the F1 key; press Enter to boot with the default options.

During bootup, PelicanHPC prompts you thrice. First it wants you to select a permanent storage device that'll house the /home directory. The default option, ram1, stores the data in physical RAM. If you want something more permanent, you just need to enter the device, such as hda1 or sda5. The device can be a hard disk partition or a USB disk – just make sure it's formatted as ext2 or ext3. If you replace the default option ram1 with a device, PelicanHPC will create a user directory at the root of that device.
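If you plan to point PelicanHPC at a real partition instead of ram1, that partition has to exist and carry an ext2 or ext3 filesystem before you boot the CD. The following is only a minimal sketch of how you might prepare one from an existing Linux installation; the device name sda5 is just the example mentioned above, so substitute your own spare partition and remember that mkfs erases whatever is on it:

# List the partition tables to double-check which device is the spare one.
fdisk -l

# Put an ext3 filesystem on the partition you intend to hand to PelicanHPC.
mkfs.ext3 /dev/sda5

At the storage prompt, you would then type sda5 in place of ram1.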
Next, PelicanHPC asks whether it should copy all the configuration scripts and the examples to the home directory on the specified device. If this is the first time you are running PelicanHPC, you'll want to choose Yes. If you've selected a permanent storage location, such as a partition of the disk, you should choose No here on subsequent boots. Of course, if you are running PelicanHPC from RAM, you'll always have to choose Yes.

Finally, you're prompted to change the default password. This password is for the default account, user, on the front-end node, as well as on the slave nodes. PelicanHPC is designed for a single user, and the password is stored in cleartext.

When it has this info, PelicanHPC will boot the front-end node and drop you into the Xfce desktop environment.

Set Up the Cluster

Now that the front-end node is up and running, it's time to set it up for clustering. PelicanHPC has a set of scripts for this purpose. Either call the scripts manually or use the master pelican_setup script, which calls all the other scripts that start the various servers and connect to the slave nodes.

To start setting up the cluster, open a terminal window and type:

sh pelican_hpc

If you have multiple network interfaces on the machine, you'll be asked to select the one that is connected to the cluster. Next, you're prompted to allow the scripts to start the DHCP server, followed by a confirmation to start the services that'll allow the slave nodes to join the cluster. At first, the constant confirmations seem irritating, but they are necessary to prevent you from throwing the network into a tizzy with conflicting DHCP services or from accidentally interrupting ongoing computations.

Once it has your permission to start the cluster, the script asks you to turn on the slave nodes.

Slave nodes are booted over the network, so make sure the Network boot option is prioritized over other forms of booting in the BIOS. When it sees the front-end node, the slave displays the PelicanHPC splash screen and lets you enter any boot parameters (language, etc.), just as it did on the front-end node earlier. Instead of booting into Xfce, when it's done booting, the slave node displays a notice that it's part of a cluster and shouldn't be turned off (Figure 1). Of course, if your slave nodes don't have a monitor, just make sure the boot parameters in the BIOS are in the correct order and turn them on.

Figure 1: If your slave node isn't headless, this is what it'll say.

When the node is up and running, head back to the front end and press the No button, which rescans the cluster and updates the number of connected nodes (Figure 2). When the number of connected nodes matches the number of slaves you turned on, press Yes. PelicanHPC displays a confirmation message and points you to the script that's used to reconfigure the cluster when you decide to add or remove a node (Figure 3). To resize the cluster, run the following script:

sh pelican_restarthpc

That's it. Your cluster is up and running, waiting for your instructions.

Figure 2: Two nodes up and running; continue scanning for more.
Figure 3: Two nodes are up and running besides the front end.
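Before you throw real work at it, you might want to convince yourself that the slaves are actually participating. One quick, generic check is to ask the MPI launcher to run a trivial command once per process. This is just a sketch using the standard mpirun front end; the process count of 3 is an arbitrary assumption, and exactly how PelicanHPC wires up its node list depends on the MPI implementation it ships, so don't treat this as the distribution's documented procedure:

# Start three instances of hostname; every node that runs one prints its name.
mpirun -np 3 hostname

If names other than the front end's show up in the output, jobs really are being farmed out to the slaves.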
Crunchy Bar

The developer, Creel, is a professor of economics at the Autonomous University of Barcelona in Catalonia, Spain. He works in econometrics, which involves a lot of number crunching. Therefore, you'll find some text and example GNU Octave code related to Creel's research and teaching. If you're interested in econometrics, take a look at the econometrics.pdf file under the /home/user/Econometrics directory. […] example2, shown in Figure 4, shows the result of the Monte Carlo test.

[…] Python-based apps for scientific computing. PelicanHPC also has the MPI toolbox (MPITB) for Octave, which lets you call MPI library routines from within Octave.

Creel also suggests that PelicanHPC can be used for molecular dynamics with the open source software GROMACS (GROningen MAchine for Chemical Simulations). The distributed project for studying protein folding, Folding@home, also uses GROMACS, and Creel believes that one could also replicate this setup on a cluster created by PelicanHPC.

Passing the Buck

If you're new to parallel programming, you might not be aware of MPI (Message-Passing Interface), which is key to parallel computing. It is a software system that allows you to write message-passing parallel programs that run on a cluster. MPI isn't a programming language, but a library that can pass messages between multiple processes, whether the processes are on a local machine or running across the various nodes of the cluster. Popular languages for writing MPI programs are C, C++, and Fortran.
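The bundled C and Fortran examples suggest that the compilers and MPI development files are already on the live system, so building and running your own MPI program on the front end should boil down to the usual wrapper-compiler routine. This is a generic sketch rather than PelicanHPC's documented workflow, and hello_mpi.c is a placeholder name for your own source file:

# mpicc is the MPI implementation's wrapper around the C compiler; it adds
# the MPI include and library paths, so ordinary C code builds unchanged.
mpicc hello_mpi.c -o hello_mpi

# Launch four copies of the program; MPI gives each one a rank it can use
# to divide up the work.
mpirun -np 4 ./hello_mpi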