<<


Turn your computer into a high-performance cluster with PelicanHPC

Crunch big numbers with your very own high-performance computing cluster.

By Mayank Sharma

If your users are clamoring for the power of a data center but your penurious employer tells you to make do with the hardware you already own, don't give up hope. With some time, a little effort, and a few open source tools, you can transform your mild-mannered desktop systems into a number-crunching supercomputer. For the impatient, the PelicanHPC Live CD will cobble off-the-shelf hardware into a high-performance cluster in no time.

The PelicanHPC project is the natural evolution of ParallelKnoppix, which was a remastered Knoppix with packages for clustering. Michael Creel developed PelicanHPC for his own research work. Creel was interested in learning about clustering, and because adding packages was so easy, he added PVM, cluster tools like the Ganglia monitor, applications like GROMACS, and so forth. He also included some simple examples of parallel computing in Fortran, C, Python, and Octave to provide some basic working examples for beginners.

However, the process of maintaining the distribution was pretty time consuming, especially when it came to updating packages such as X and KDE. That's when Creel discovered Debian Live, spent time wrapping his head around the live-helper package, and created a more systematic way to make a Live distro for clustering. So in essence, PelicanHPC is a single script that fetches the required packages off a Debian repository, adds some configuration scripts and example software, and outputs a bootable ISO.

Boot PelicanHPC

Later in the article, I'll use the script to create a custom version. For now, I'll use the stock PelicanHPC release (v1.8) from the website [1] to put those multiple cores to work. Both 32-bit and 64-bit versions are available, so grab the one that matches your hardware. The developer claims that with PelicanHPC you can get a cluster up and running in five minutes. However, this is a complete exaggeration – you can do it in under three.

First, make sure you get all the ingredients right: You need a computer to act as a front-end node and others that'll act as slave computing nodes. The front end and the slave nodes connect via the network, so they need to be part of a local LAN. Although you can connect them via wireless, depending on the amount of data being exchanged, you could run into network bottlenecks. Also, make sure the router between the front end and the slaves isn't running a DHCP server, because the front end doles out IP addresses to the slaves.

Although you don't really need a monitor, keyboard, or mouse on the slave nodes, you need these on the front end.


If you have a dual-core machine with enough memory, it wouldn't be a bad idea to run the front end on a virtual machine and the slaves on physical machines. Primarily, PelicanHPC runs in memory, so make sure you have plenty. If you're doing serious work on the cluster, you can make it save your work to the hard disk, in which case, make sure you have a hard disk attached. In fact, to test PelicanHPC, you can run it completely on virtual hardware with virtual network connections, provided you have the juice on the physical host to power so much virtual hardware.

With the hardware in place, pop the Live CD into the front-end node and let it boot. If you want to choose a custom language, turn off ACPI, or tweak some other boot parameters, you can explore the boot options from the F1 key; press Enter to boot with the default options.

During bootup, PelicanHPC prompts you thrice. First it wants you to select a permanent storage device that'll house the /home directory. The default option, ram1, stores the data in the physical RAM. If you want something more permanent, you just need to enter the device, such as hda1 or sda5. The device can be a hard disk partition or a USB disk – just make sure it's formatted as ext2 or ext3. If you replace the default option ram1 with a device, PelicanHPC will create a user directory at the root of that device.

Next, PelicanHPC asks whether it should copy all the configuration scripts and the examples to the home directory on the specified device. If this is the first time you are running PelicanHPC, you'll want to choose Yes. If you've selected a permanent storage location, such as a partition of the disk, on subsequent boots you should choose No here. Of course, if you are running PelicanHPC from RAM, you'll always have to choose Yes.

Finally, you're prompted to change the default password. This password is for the user "user" on the front-end node, as well as on the slave nodes. PelicanHPC is designed for a single user, and the password is stored in cleartext.

When it has this info, PelicanHPC boots the front-end node and drops you into the Xfce desktop environment.

Set Up the Cluster

Now that the front-end node is up and running, it's time to set it up for clustering. PelicanHPC has a set of scripts for this purpose. Either call the scripts manually or use the master pelican_setup script, which calls all the other scripts that start the various servers and connect with the slave nodes.

To start setting up the cluster, open a terminal window and type:

sh pelican_hpc

If you have multiple network interfaces on the machine, you'll be asked to select the one that is connected to the cluster. Next, you're prompted to allow the scripts to start the DHCP server, followed by a confirmation to start the services that'll allow the slave nodes to join the cluster. At first, the constant confirmations seem irritating, but they are necessary to prevent you from throwing the network into a tizzy with conflicting DHCP services or from accidentally interrupting ongoing computations.

Once it has your permission to start the cluster, the script asks you to turn on the slave nodes.

Slave nodes are booted over the network, so make sure the network boot option is prioritized over other forms of booting in the BIOS. When it sees the front-end node, the slave displays the PelicanHPC splash screen and lets you enter any boot parameters (language, etc.), just as it did on the front-end node earlier. Instead of booting into Xfce, when it's done booting, the slave node displays a notice that it's part of a cluster and shouldn't be turned off (Figure 1). Of course, if your slave nodes don't have a monitor, just make sure the boot parameters in the BIOS are in the correct order and turn them on.

Figure 1: If your slave node isn't headless, this is what it'll say.

When the node is up and running, head back to the front end and press the No button, which rescans the cluster and updates the number of connected nodes (Figure 2). When the number of connected nodes matches the number of slaves you turned on, press Yes. PelicanHPC displays a confirmation message and points you to the script that's used to reconfigure the cluster when you decide to add or remove a node (Figure 3). To resize the cluster, run the following script:

sh pelican_restarthpc

Figure 2: Two nodes up and running; continue scanning for more.

Figure 3: Two nodes are up and running besides the front end.

That's it. Your cluster is up and running, waiting for your instructions.

Crunchy Bar

The developer, Creel, is a professor of economics at the Autonomous University of Barcelona in Catalonia, Spain. He works in econometrics, which involves a lot of number crunching.


Therefore, you'll find some text and example GNU Octave code related to Creel's research and teaching. If you're interested in econometrics, the econometrics.pdf file under the /home/user/Econometrics directory is a good starting point. Also check out the ParallelEconometrics.pdf file under /home/user/Econometrics/ParallelEconometrics. This presentation is a nice introduction to parallel computing and econometrics.

For the uninitiated, GNU Octave [2] is "a high-level computational language for numerical computations." It is the free software alternative to the proprietary MATLAB program, both of which are used for hardcore arithmetic.

Some sample code is in the /home/user/Econometrics/Examples/ directory for performing tests such as kernel density [3] and maximum likelihood estimations, as well as for running Monte Carlo simulations of how a new econometric estimator performs.

Run Tests

To run the tests, open a terminal and start GNU Octave by typing octave on the command line, which brings you to the Octave interface. Here you can run various examples of sample code by typing in a name. For example, the kernel estimations are performed by typing kernel_example. Similarly, pea_example shows the parallel implementation of the parameterized expectations algorithm, and mc_example2, shown in Figure 4, shows the result of the Monte Carlo test.

Figure 4: Gnuplot plots the results of a Monte Carlo test example.

Creel also suggests that PelicanHPC can be used for molecular dynamics with the open source software GROMACS (GROningen MAchine for Chemical Simulations). The distributed project for studying protein folding, Folding@home, also uses GROMACS, and Creel believes that one could replicate this setup on a cluster created by PelicanHPC.

Creel also suggests that users solely interested in learning about high-performance computing should look to ParallelKnoppix, the last version of which is still available for download [4].

Parallel Programming with PelicanHPC

One of the best uses for PelicanHPC is compiling and running parallel programs. If this is all you want to use PelicanHPC for, you don't really need the slave nodes, because the tools can compile your programs on the front-end node itself.

PelicanHPC includes several tools for writing and processing parallel code. OpenMPI compiles programs in C, C++, and Fortran. SciPy and NumPy [5] are Python-based apps for scientific computing. PelicanHPC also has the MPI toolbox (MPITB) for Octave, which lets you call MPI library routines from within Octave.

Passing the Buck

If you're new to parallel programming, you might not be aware of MPI (Message-Passing Interface), which is key to parallel computing. It is a software system that allows you to write message-passing parallel programs that run on a cluster. MPI isn't a programming language, but a library that can pass messages between multiple processes. The processes can be on a local machine or running across the various nodes of the cluster.

Popular languages for writing MPI programs are C, C++, and Fortran. MPICH was the first implementation of the MPI 1.x specification. LAM/MPI is another implementation that also covers significant bits of the MPI 2.x spec. LAM/MPI can pass messages via TCP/IP, shared memory, or InfiniBand. The most popular implementation of MPI is OpenMPI, which is developed and maintained by a consortium and combines the best of various projects, such as LAM/MPI. Many of the Top 500 supercomputers use it, including IBM's Roadrunner, which is currently the fastest.
MPI

PelicanHPC includes two MPI implementations: LAM/MPI and OpenMPI. When writing parallel programs in C or C++, make sure you include the mpi.h header file (#include <mpi.h>). To compile the programs, you need mpicc for C programs, mpic++ or mpiCC for C++ programs, and mpif77 for Fortran.

Listing 1 has a sample "Hello, World" program in C that uses the MPI library to print a message from all the nodes in the cluster.

Listing 1: "Hello, World" in C with MPI

#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
  int rank, size;

  MPI_Init(&argc, &argv);

  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  printf("We are borg! I am %d of %d\n", rank, size);

  MPI_Finalize();
  return 0;
}

Compile it with mpicc:

mpicc borg-greeting.c -o borg-greeting

To run the program, use mpirun:

mpirun -np 4 borg-greeting

This command tells the MPI library to explicitly run four copies of the hello app, scheduling them on the CPUs in the cluster in a round-robin fashion. Depending on the number of nodes in your cluster, you'll see something like:

We are borg! I am 1 of 4
We are borg! I am 3 of 4
We are borg! I am 0 of 4
We are borg! I am 2 of 4
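The hello program reports each process's rank but never actually passes a message. As a minimal sketch of MPI's point-to-point calls – my own illustration, not one of the examples shipped with PelicanHPC – the following program has rank 0 send an integer to rank 1 with MPI_Send, which rank 1 collects with the matching MPI_Recv:

#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
  int rank;
  int token = 42;       /* the payload to pass between nodes */
  MPI_Status status;

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  if (rank == 0) {
    /* send one MPI_INT to rank 1, using message tag 0 */
    MPI_Send(&token, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
  } else if (rank == 1) {
    /* block until the tag-0 message from rank 0 arrives */
    MPI_Recv(&token, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
    printf("Rank 1 received %d\n", token);
  }

  MPI_Finalize();
  return 0;
}

Compile and launch it like the hello program, but with at least two processes (mpirun -np 2), because ranks 0 and 1 must both exist.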


Several MPI tutorials reside on the web [6]. Professor José Luis at the University of Seville in Spain uses PelicanHPC to teach his parallel programming course. He recommends that new programmers try the examples available online from Peter Pacheco's book, Parallel Programming with MPI [7]. See the OpenMPI website for additional documentation, including a very detailed FAQ [8].
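To see these pieces working together in something closer to the Monte Carlo experiments mentioned earlier, here is a small self-contained sketch – again my own illustration rather than PelicanHPC example code – that estimates pi by having every node throw random darts at a unit square and then sums the per-node hit counts on rank 0 with MPI_Reduce:

#include <stdio.h>
#include <stdlib.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
  int rank, size;
  long n = 1000000;                 /* draws per node */
  long i, local_hits = 0, total_hits = 0;

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  srand(rank + 1);                  /* a different random stream per node */
  for (i = 0; i < n; i++) {
    double x = rand() / (double) RAND_MAX;
    double y = rand() / (double) RAND_MAX;
    if (x * x + y * y <= 1.0)       /* dart landed inside the quarter circle */
      local_hits++;
  }

  /* collect the per-node counts on rank 0 */
  MPI_Reduce(&local_hits, &total_hits, 1, MPI_LONG, MPI_SUM,
             0, MPI_COMM_WORLD);

  if (rank == 0)
    printf("Estimated pi = %f\n", 4.0 * total_hits / (double)(n * size));

  MPI_Finalize();
  return 0;
}

Because each process seeds its random number generator with its own rank, the nodes sample different points, and adding nodes increases the number of draws you can push through in the same wall-clock time.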

Build Your Own PelicanHPC

If you're just interested in learning parallel programming, PelicanHPC provides more than enough. But the main goal of the Live CD is to help you get a cluster up and running without much effort. The focus is on maintainability and ease of customization, which is why the releases do not include a lot of packages.

Once you test the Live CD and think it'll work for you, you are encouraged to make your own versions via the Debian live-helper package and Pelican's make_pelican script. You'll need a Debian or Ubuntu installation to produce the Live CD, which can be a minimal installation or even a virtual machine on a host with lots of RAM and a fast dual-core processor, which is what I use.

So to roll out your own ISO or USB image, first install a recent Ubuntu or Debian release. I've used Lenny to create a customized PelicanHPC release. Next, grab the live-helper package from the distro's repository. Finally, grab the latest version of the make_pelican script (currently v1.8) from Pelican's download page [4].

Open the script in your favorite text editor. The script is divided into various sections. After the initial comments, which include a brief changelog, the first section lists the packages that will be available on the ISO. Here is where you make your changes. Listing 2 shows a modified version of this section, in which I've commented out the binary blobs for networking, because I don't need them for my networks. I've also added AbiWord and the GROMACS package. Because these packages are fetched off your distribution's repositories, make sure you spell them as they appear there. GROMACS has several dependencies, but you don't have to worry about adding them because they'll be fetched automatically.

Listing 2: Packages for Your PelicanHPC Live CD

### packages to add - place names of packages you want here ###
cat <<PACKAGELIST > addlist
# basic stuff needed for cluster setup
ssh dhcp3-server nfs-kernel-server nfs-common atftpd ifenslave
# binary blobs for networking
# firmware-bnx2 firmware-iwlwifi firmware-ralink linux-wlan-ng-firmware
# resource management
slurm-llnl slurm-llnl-sview slurm-llnl-basic-plugins
# configuration and tools
wget bzip2 dialog less net-tools rsync fping screen
make htop fail2ban locales console-common
# mail support
bsd-mailx liblockfile1 mailx postfix ssl-cert
# MPI
lam-runtime lam4-dev openmpi-bin openmpi-dev
# Octave
octave3.0 octave3.0-headers gnuplot
# Python
python-scipy python-matplotlib python-numpy ipython lampython
# other scientific
gfortran libatlas-headers libatlas3gf-base
# GROMACS
gromacs
# X stuff
abiword xorg xfce4 konqueror ksysguard ksysguardd kate kpdf
konsole kcontrol kdenetwork kdeadmin
PACKAGELIST
### END OF PACKAGELIST ###

The next bit of the make_pelican script you have to tinker with specifies the architecture you want to build the ISO for, whether you want an ISO or a USB image, and the series of network addresses doled out by PelicanHPC:

PELICAN_NETWORK="10.11.12"
MAXNODES="100"
#ARCHITECTURE="amd64"
#KERNEL="amd64"
ARCHITECTURE="i386"
KERNEL="686"
IMAGETYPE="iso"
#IMAGETYPE="usb-hdd"
DISTRIBUTION="lenny"
MIRROR="en"


The rest of the script deals with PelicanHPC internals and shouldn't be tweaked unless you know what you're doing. However, it's advisable to browse through the other sections to get a better idea of how PelicanHPC magically transforms ordinary machines into extraordinary computing clusters.

When you've tweaked the script, execute it from the console:

sh make_pelican

Now sit back and enjoy, or if you have a slow connection and are running this on a slow computer, you'd better do your taxes because it'll take a while to fetch all the packages and compile them into a distro. When it's done, you'll have a shiny new ISO named binary.iso under either the i386/ or the amd64/ directory, depending on the architecture you built for. Now transfer the USB image onto a USB stick, or test the ISO image with VirtualBox or Qemu before burning it onto a disc. Figure 5 shows the password screen of a modified PelicanHPC Live CD.

Figure 5: Tweak the make_pelican script to create your custom image.

PelicanHPC is designed with ease of use in mind for anyone who wants to use their spare computers to do some serious number crunching. Building on the experience of ParallelKnoppix, the developer has put a lot of effort behind PelicanHPC's no-fuss approach to getting your cluster off the ground in a jiffy. The customization abilities are the icing on the cake and make PelicanHPC an ideal platform for building your own custom cluster environment.

INFO

[1] PelicanHPC: http://pareto.uab.es/mcreel/PelicanHPC/
[2] GNU Octave: http://www.gnu.org/software/octave/
[3] Kernel density estimation: http://en.wikipedia.org/wiki/Kernel_density_estimation
[4] ParallelKnoppix download: http://pareto.uab.es/mcreel/PelicanHPC/download/
[5] SciPy and NumPy: http://www.scipy.org/
[6] MPI tutorial: http://www.dartmouth.edu/~rc/classes/intro_mpi/
[7] Parallel Programming with MPI: http://www.cs.usfca.edu/mpi/
[8] OpenMPI FAQ: http://www.open-mpi.org/faq/?category=mpi-apps

THE AUTHOR

Mayank Sharma has written for various Linux publications, including Linux.com, IBM developerWorks, and Linux Format, and has published two books through Packt on administering Elgg and Openfire. Occasionally he teaches FLOSS technologies. You can reach him via http://www.geekybodhi.net.

