Installation Procedures for Clusters

Installation Procedures for Clusters

The 2nd Workshop on High Performance Computing, IPM & Shahid Beheshti University, Tehran, Iran InstallationInstallation ProceduresProcedures forfor ClustersClusters Moreno Baricevic CNR-INFM DEMOCRITOS, Trieste Antun Balaz Scientifc Computing Laboratory, Institute of Physics Belgrade,Serbia The 2nd Workshop on HPC, Tehran, Iran, 21 Jan – 01 Feb 2009 AgendaAgenda Cluster Services Overview on Installation Procedures Confguration and Setup of a NETBOOT Environment Troubleshooting Cluster Management Tools Notes on Security Hands-on Laboratory Session 2 What'sWhat's aa cluster?cluster? INTERNET Commodity Cluster HPC CLUSTER LANLAN NETWORK servers, workstations, laptops, ... master-node computing nodes 3 CLUSTERCLUSTER SERVICESSERVICES CLUSTER INTERNAL NETWORK NTP NTP CLUSTER-WIDE TIME SYNC DNS DNS DYNAMIC HOSTNAMES RESOLUTION DHCP INSTALLATION / CONFIGURATION LAN (+ network devices confguration and backup) TFTP NFS SHARED FILESYSTEM REMOTE ACCESS SSH SSH FILE TRANSFER PARALLEL COMPUTATION (MPI) SERVER / MASTERNODE SERVER LDAP/NIS/... LDAP/NIS/... AUTHENTICATION ... 4 HPCHPC SOFTWARESOFTWARE INFRASTRUCTUREINFRASTRUCTURE OverviewOverview Users' Parallel Applications Users' Serial Applications Parallel Environment: MPI/PVM Software Tools for Applications (compilers, scientifc libraries) Resources Management Software System Management Software (installation, administration, monitoring) O.S. Network Storage GRID-enabling software + (fast interconnection (shared and parallel services among nodes) fle systems) 5 HPCHPC SOFTWARESOFTWARE INFRASTRUCTUREINFRASTRUCTURE OverviewOverview (our(our experience)experience) Fortran, C/C++ codes Fortran, C/C++ codes MVAPICH / MPICH / openMPI / LAM INTEL, PGI, GNU compilers BLAS, LAPACK, ScaLAPACK, ATLAS, ACML, FFTW libraries PBS/Torque batch system + MAUI scheduler SSH, C3Tools, ad-hoc utilities and scripts, IPMI, SNMP Ganglia, Nagios gLite 3.x NFS Gigabit Ethernet LUSTRE, LINUX Infniband GPFS, GFS Myrinet SAN 6 CLUSTERCLUSTER MANAGEMENTMANAGEMENT InstallationInstallation Installation can be performed: - interactively - non-interactively Interactive installations: - fner control Non-interactive installations: - minimize human intervention and let you save a lot of time - are less error prone - are performed using programs (such as RedHat Kickstart) which: - “simulate” the interactive answering - can perform some post-installation procedures for customization 7 CLUSTERCLUSTER MANAGEMENTMANAGEMENT InstallationInstallation MASTERNODE Ad-hoc installation once forever (hopefully), usually interactive: - local devices (CD-ROM, DVD-ROM, Floppy, ...) - network based (PXE+DHCP+TFTP+NFS/HTTP/FTP) CLUSTER NODES One installation reiterated for each node, usually non-interactive. Nodes can be: 1) disk-based 2) disk-less (not to be really installed) 8 CLUSTERCLUSTER MANAGEMENTMANAGEMENT ClusterCluster NodesNodes InstallationInstallation 1) Disk-based nodes - CD-ROM, DVD-ROM, Floppy, ... Time expensive and tedious operation - HD cloning: mirrored raid, dd and the like A “template” hard-disk needs to be swapped or a disk image needs to be available for cloning, confguration needs to be changed either way - Distributed installation: PXE+DHCP+TFTP+NFS/HTTP/FTP More eforts to make the frst installation work properly (especially for heterogeneous clusters), (mostly) straightforward for the next ones 2) Disk-less nodes - Live CD/DVD/Floppy - ROOTFS over NFS - ROOTFS over NFS + UnionFS - initrd (RAM disk) 9 CLUSTERCLUSTER MANAGEMENTMANAGEMENT ExistentExistent toolkitstoolkits Are generally made of an ensemble of already available software packages thought for specifc tasks, but confgured to operate together, plus some add-ons. Sometimes limited by rigid and not customizable confgurations, often bounded to some specifc LINUX distribution and version. May depend on vendors' hardware. Free and Open - OSCAR (Open Source Cluster Application Resources) - NPACI Rocks - xCAT (eXtreme Cluster Administration Toolkit) - Warewulf/PERCEUS - SystemImager - Kickstart (RH/Fedora), FAI (Debian), AutoYaST (SUSE) Commercial - Scyld Beowulf - IBM CSM (Cluster Systems Management) - HP, SUN and other vendors' Management Software... 10 Network-basedNetwork-based DistributedDistributed InstallationInstallation OverviewOverview PXE DHCP TFTP INITRD INSTALLATION ROOTFS over NFS Kickstart/Anaconda NFS NFS + UnionFS Customization through Dedicated mount Customization Post-installation point for each node through of the cluster UnionFS layers 11 NetworkNetwork bootingbooting (NETBOOT)(NETBOOT) PXEPXE ++ DHCPDHCP ++ TFTPTFTP ++ KERNELKERNEL ++ INITRDINITRD DHCPDISCOVER PXE DHCP DHCPOFFER IP Address / Subnet Mask / Gateway / ... Network Bootstrap Program (pxelinux.0) DHCPREQUEST PXE DHCP DHCPACK PXE DHCP tftp get pxelinux.0 PXE TFTP TFTP INITRD tftp get pxelinux.cfg/HEXIP PXE+NBP TFTP SERVER / MASTERNODE SERVER CLIENT / COMPUTING NODE tftp get kernel foobar PXE+NBP TFTP tftp get initrd foobar.img kernel foobar TFTP 12 Network-basedNetwork-based DistributedDistributed InstallationInstallation NETBOOTNETBOOT ++ KICKSTARTKICKSTART INSTALLATIONINSTALLATION get NFS:kickstart.cfg kernel + initrd NFS get RPMs anaconda+kickstart NFS tftp get tasklist kickstart: %post TFTP tftp get task#1 kickstart: %post TFTP Installation tftp get task#N kickstart: %post TFTP SERVER / MASTERNODE SERVER CLIENT / COMPUTING NODE tftp get pxelinux.cfg/default kickstart: %post TFTP tftp put pxelinux.cfg/HEXIP kickstart: %post TFTP 13 DisklessDiskless NodesNodes NFSNFS BasedBased NETBOOTNETBOOT ++ NFSNFS mount /nodes/rootfs/ kernel + initrd NFS mount /nodes/IPADDR/ kernel + initrd NFS bind /nodes/IPADDR/FS kernel + initrd NFS mount /tmp SERVER / MASTERNODE SERVER kernel + initrd TMPFS CLIENT / COMPUTING NODE ROOTFS over NFS ROOTFS /tmp/ as tmpfs (RAM) RW (volatile) /nodes/10.10.1.1/var/ RW (persistent) /nodes/10.10.1.1/etc/ RW (persistent) /nodes/rootfs/ RO Resultant fle system RW RO RW RO RW RO 14 DrawbacksDrawbacks Removable media (CD/DVD/foppy): – not fexible enough – needs both disk and drive for each node (drive not always available) ROOTFS over NFS: – NFS server becomes a single point of failure – doesn't scale well, slow down in case of frequently concurrent accesses – requires enough disk space on the NFS server ROOTFS over NFS+UnionFS: – same as ROOTFS over NFS – some problems with frequently random accesses RAM disk: – need enough memory – less memory available for processes Local installation: – upgrade/administration not centralized – need to have an hard disk (not available on disk-less nodes) 15 ConfgurationConfguration andand setupsetup ofof NETBOOTNETBOOT servicesservices ● clientclient setupsetup ● PXE ● BIOS ● serverserver setupsetup ● DHCP ● TFTP + PXE ● NFS ● Kickstart SettingSetting upup thethe clientclient NIC that supports network booting (or etherboot) BIOS boot-sequence 1. Floppy 2. CD/DVD 3. USB/External devices 4. NETWORK 5. Local Hard Disk Information gathering (client MAC address) documentation (don't rely on this) motherboard BIOS (if on-board) NIC BIOS, initialization, PXE booting (need to monitor the boot process) network snifer (suitable for automation) 17 CollectingCollecting MACMAC addressesaddresses # tcpdump -c1 -i any -qtep port bootpc and port bootps and ip broadcast tcpdump: verbose output suppressed, use -v or -vv for full protocol decode listening on any, link-type LINUX_SLL (Linux cooked), capture size 96 bytes B 00:30:48:2c:61:8e 592: IP 0.0.0.0.bootpc > 255.255.255.255.bootps: UDP, length 548 1 packets captured 1 packets received by flter 0 packets dropped by kernel (see /etc/services for details on ports assignment) 18 SettingSetting upup DHCPDHCP ddns-update-style none; ddns-updates off; It's a protocol that allows the authoritative; dynamic confguration of the deny unknown-clients; network settings for a client # cluster network subnet 10.10.0.0 netmask 255.255.0.0 { option domain-name "cluster.network”; We need DHCP software for option domain-name-servers 10.10.0.1; option ntp-servers 10.10.0.1; both the server and the clients option subnet-mask 255.255.0.0; option broadcast-address 10.10.255.255; (PXE implements a DHCP # TFTP server client internally) next-server 10.10.0.1; # NBP filename "/pxe/pxelinux.0"; Steps needed default-lease-time -1; min-lease-time 864000; } – DHCP server package # client section – host node01.cluster.network { DHCP confguration hardware ethernet 00:30:48:2c:61:8e; fixed-address 10.10.1.1; – client confguration option host-name "node01"; } – a TFTP server to supply the PXE bootloader 19 SettingSetting upup DHCPDHCP # client section host node01.cluster.network { hardware ethernet 00:30:48:2c:61:8e; ddns-update-style none; fixed-address 10.10.1.1; ddns-updates off; option host-name "node01"; authoritative; } deny unknown-clients; # cluster network subnet 10.10.0.0 netmask 255.255.0.0 { option domain-name "cluster.network”; Parameters starting with the option domain-name-servers 10.10.0.1; option keyword correspond option ntp-servers 10.10.0.1; to actual DHCP options, option subnet-mask 255.255.0.0; while parameters that do option broadcast-address 10.10.255.255; not start with the option # TFTP server keyword either control the behavior of the DHCP server next-server 10.10.0.1; or specify client parameters # NBP that are not optional in the filename "/pxe/pxelinux.0"; DHCP protocol. default-lease-time -1; (man dhcpd.conf) min-lease-time 864000; } 20 TFTPTFTP andand PXEPXE What is TFTP – Trivial File Transfer Protocol: is a simpler, faster, session-less and “unreliable” (based on UDP) implementation of

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    38 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us