Building and managing virtual machines at the Tier 1

Jason A. Smith, John DeStefano, James Pryor

System Workflow: From new unconfigured to fully configured & monitored

1. A new, powered-off system is PXE booted against the Cobbler server and installed with RHEL via a network kickstart. Puppet is installed during the kickstart, and the machine reboots into a fresh RHEL system.

2. The FusionInventory Agent reports the system's assets to the GLPI Asset Management Service, and we manually tell GLPI which puppet classes to assign to the system.

3. The system contacts the Puppet server, which sends the catalog to the puppet client. Puppet configures the system via the catalog & makes changes, and reports back to the Puppet server via puppet's exported resources. The system is now configured.

4. The Puppet server gives the Nagios server a new config catalog. Now Nagios knows about the new machine & service and will automatically begin to monitor it. The system is now configured & monitored.

Cobbler provisioning system

System Provisioning: Cobbler & Koan

Cobbler is a provisioning tool-set that allows for the rapid setup & installation of systems over the network. It has both a web GUI and a command-line interface.
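As a minimal sketch of that command-line interface (the hostname, MAC address, and profile name here are hypothetical, not the site's actual values), registering a new machine for PXE installation looks like this:

    # Add a system record tied to an existing install profile:
    cobbler system add --name=vmhost01.racf.example.org \
        --profile=rhel5-x86_64 --mac=00:16:3E:00:00:01
    # Push the new PXE configuration out:
    cobbler sync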

Provisioning with Cobbler:

● Just about all non-LinuxFarm machines are PXE booted & provisioned with Cobbler and its companion tool koan.

● We use Cobbler as the PXE boot kickstart source for RPM packages.

● During the post-install phase of the kickstart, we register the machine against the local Satellite, which becomes the sole repo for packages & updates.

● Satellite allows us to manage packages & errata across all RHEL machines.

● Puppet is bootstrapped in the kickstart post-install (see the sketch after this list).

● Koan is mostly used for RHEL 5.X VMs.
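A hedged sketch of what that post-install bootstrap might look like in the kickstart %post section (the Satellite URL, activation key, and puppet master name are hypothetical):

    %post
    # Register against the local Satellite so it becomes the sole
    # package & update source:
    rhnreg_ks --serverUrl=https://satellite.example.org/XMLRPC \
        --activationkey=1-rhel5-base
    # Bootstrap puppet and point it at the puppet master:
    yum -y install puppet
    cat >> /etc/puppet/puppet.conf <<'EOF'
    [main]
        server = puppet.example.org
    EOF
    chkconfig puppet on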

Provisioning VMs with Cobbler & Koan:

    koan -s name.of.cobbler.server --virt -y FQDN.of.VM.stored.in.cobbler \
        -V shorthostname --force-path

Or create a KVM VM directly with virt-install:

    virt-install -b br0 -n VMhostname -r 8096 --vcpus=4 --pxe \
        --os-type=linux --os-variant=rhel5.4 --disk=/dev/vmvg/VMhostname \
        -m 00:16:3E:1A:4D:84 --virt-type kvm --autostart --vnc

Change & Configuration Management with Puppet & Git

Change Management & Monitoring: Puppet with Git & Nagios

User Workflow: initiating a configuration change

1. The user gets the latest changes from Git with a clone command.

2. The user makes changes to the puppet manifests (ex: the dcache class) & sends them to Git for approval.

3. When the change is reviewed & approved, it gets merged to the Production branch, which updates the Puppet Server.

4. We manually tell GLPI (the Asset Management Service) to assign puppet classes to the system (ex: dcache, web, grid).

5. The system contacts the Puppet server, which sends the catalog to the puppet client.

6. Puppet configures the system via the catalog and its assigned classes (ex: the dcache class installs dcache), leaving the system configured w/ dcache.
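A minimal sketch of the Git side of this workflow (repository URL, branch, and file names are hypothetical):

    # Get the latest manifests:
    git clone git@git.example.org:puppet.git && cd puppet
    # Make the change on a topic branch (ex: the dcache class):
    git checkout -b dcache-update
    vi modules/dcache/manifests/init.pp
    git commit -am "Update dcache class"
    # Send it to Git for review; once approved it is merged
    # to the Production branch:
    git push origin dcache-update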

Puppet & Git:

● Puppet classes and common foundation

● Simple 'base' class foundation

● Allow other groups to create their own puppet classes on top of the base class

● Puppet-Git Branches

● Puppet Approval

● Puppet-CA/Run web
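The Puppet-CA/Run item refers to a web front-end; as a hedged illustration, the equivalent RHEL 5-era command-line steps look like this (hostname hypothetical):

    # On the puppet master: sign the new client's certificate request
    puppetca --sign vmhost01.racf.example.org
    # On the client: trigger an immediate puppet run
    puppetd --test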

Asset Management

Asset Management: GLPI + OCSInventory, FusionInventory Agent, Linux Farm DB, Puppet Dashboard

Asset Management and Puppet:

● Linux Farm DB

● GLPI

● Puppet Dashboard
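A minimal sketch of pointing the inventory agent at GLPI (the server URL is hypothetical):

    # Report this machine's hardware & software inventory to GLPI:
    fusioninventory-agent \
        --server=https://glpi.example.org/plugins/fusioninventory/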

Virtualization:

● We began using virtualization at RACF GCE in the Spring of 2008, just as a preliminary test. The platform was RHEL 5.2 with the Xen hypervisor.

● On Jan 1 2009 we had about 20 VMs on 5 hosts. By Jan 1 2010 that had grown to over 100 VMs on 10 hosts. Currently we have over 150 VMs on 17 hosts, with a majority of para-virtualization and very little full-virtualization.

● Our first goal was to virtualize lightweight & redundant USATLAS Grid services like VOMS & GUMS. It worked very well and demonstrated that virtualization was both easy & effective.

● We discovered a method to completely copy a filesystem from a physical machine, fully intact, and make it run as a fully-virtualized VM. This allowed us to retire old HW without having to rebuild the OS & software stack in a new VM.
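The poster does not spell out the exact method; one hedged sketch of such a physical-to-virtual copy (device paths and names are hypothetical, and this assumes a virt-install recent enough to support --import):

    # Copy the physical machine's disk onto an LVM volume over ssh:
    ssh root@oldbox 'dd if=/dev/sda bs=1M' | dd of=/dev/vmvg/oldbox bs=1M
    # Boot it as a fully-virtualized guest without re-installing:
    virt-install -n oldbox -r 4096 --import \
        --disk=/dev/vmvg/oldbox --vnc --autostart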

Virtualization:

● Up until very recently, the Xen hypervisor was not part of the mainline kernel, and the kernel developers and Red Hat decided to back KVM as the future of Linux kernel-based virtualization. Red Hat added KVM in RHEL 5.4 and we began using KVM in RHEL 5.5. We have about 15 VMs running on 5 hosts with KVM on both RHEL 5.6 and RHEL 6.0.

● This transition from Xen to KVM showed that we need to be flexible and not rely too heavily on Xen-specific tools or methods. Instead we have used the libvirt toolset and accompanying tools like virt-install, virt-manager, libguestfs and guestfish.
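As a hedged example of staying on the hypervisor-neutral toolset (the VM name and volume path are hypothetical):

    # libvirt commands work the same against Xen or KVM:
    virsh list --all
    virsh start vmhost01
    # Inspect a powered-off guest's disk with libguestfs:
    virt-filesystems -a /dev/vmvg/vmhost01 --long
    # Open an interactive guestfish shell on the image:
    guestfish --rw -a /dev/vmvg/vmhost01 -i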

Virtualization:

● Due to bugs in the Xen & Linux kernels with regard to clock skew, we were reluctant to virtualize time-sensitive services like DNS/NTP. Now that the bugs have been resolved, we virtualize core infrastructure services like Kerberos, OpenAFS, DNS and NTP.

● Now that we rely on virtualization for core services, and with the ever-growing number of VMs & VM host machines, we recognized the need for a better management tool-set. Libvirt tools are fine but do not give the “datacenter” view of virtualization.

● We need a virtualization management toolset to oversee all aspects of consumable virtualization resources like storage, CPU & RAM. We also need an easy-to-use GUI that lets us see virtualization as part of the overall datacenter and then drill down into individual VMs.

Virtualization:

● We are now evaluating Red Hat's RHEV 2.x product. While it is not fully libvirt-based, is not open source, and requires a Windows Server, it seems to be the tool that will satisfy our desire for a better management solution.

● Version 2.x is based on KVM and RHEL 5.4+. Version 3.0 of the product will be based on RHEL 6.1's KVM, will no longer need a Windows Server, & should be released by the end of 2011.

● So far the RHEV eval is meeting expectations.