Enabling Grids for E-sciencE

Xen – Performance Measurements and LCG-2.4 installation

Marcus Hardt & Dr.Rüdiger Berlich Forschungszentrum Karlsruhe GmbH

www.eu-egee.org

INFSO-RI-508833 Overview

• Why Virtualisation? – Use Cases, Techniques and Products • – What is Xen; Paravirtualisation – Configuration / Starting / Migrating • Xen Performance – What to measure? – (How measured) – Results • LCG-2.4 on Xen – How installed – Caveats • Life show – Booting a Xen grid site – Submitting a job on fully virtual hardware – Discussion: Is the job real? – Migration

Marcus Hardt, Dr. Rüdiger Berlich The Grid in a box, Xen and LCG, June 06 2 Virtualisation: Use Cases

• Consolidation of different services on one machine • Compute Center, ISP, Webhoster, ... • Sandbox – secure environments • More efficient usage of resources • Cluster, farms (e.g. Dubna) • Independence of user's location • “Grid in a box” • Independence from external testbeds • Training of configuration of particular Grid components • Testing new techniques • Leveraging training resources more efficiently - training is not CPU-intensive

Marcus Hardt, Dr. Rüdiger Berlich The Grid in a box, Xen and LCG, June 06 3 Techniques and Products (OS)

Virtualisierung Guest-OS is a process: with hardware higher overhead, but or specialised easier to implement master-OS (e.g. VMWare Workst. OS1 OS2 OS3 microkernel) GSX Server Prozess Prozess Prozess Usermode IBM zSeries OS1 OS2 OS3 Host-OS Win4Lin XEN ESX Server CPU1, CPU2, ... CPU1, CPU2, ... Virtual PC

Migration in Cluster or Grid ? CPU1, CPU2, ...

Marcus Hardt, Dr. Rüdiger Berlich The Grid in a box, Xen and LCG, June 06 4 Techniques and Products (misc.)

• Mosix: Process migration in cluster • Global filesystems: storage location • globale security mechanisms: user location • Grid: Location of calculation

Marcus Hardt, Dr. Rüdiger Berlich The Grid in a box, Xen and LCG, June 06 5 Xen: History

• Approx. 2 years old • Started by the Systems Research Group of the University of Cambridge, UK • Originally part of the Xenoserver project, which aims to build a public infrastructure for wide-area distributed computing. • Idea: Provide a distributed network of OS environments tailored to the user's needs • Xen is thus closely related to the ideas of Grid Computing ! • Now available in Version 2.06 • Outlook: Native execution of arbitrary Intel-based OS feasible using hardware virtualisation features (Intel Vanderpool) • Ports to 64 bit platforms underway (with the help of AMD, Intel, ...)

Marcus Hardt, Dr. Rüdiger Berlich The Grid in a box, Xen and LCG, June 06 6 Xen: Paravirtualisation

• Priviledged calls are done through dedicated interface in domain 0 • Advantage: Very high per- formance (low overhead, very little emulation necessary) • Disadvantage: Guest-OS must be ported to Xen (but not the applications !) • But: very minor adaptations, in the range of O(3000 LOC)

Marcus Hardt, Dr. Rüdiger Berlich The Grid in a box, Xen and LCG, June 06 7 Xen: Configuration, Starting

• Configuration with Python Script • Starting with the command “xm create -c myconfig” • Possibility to attach X output, e.g. with VNC • External IP assigned e.g. via DHCP • From the outside, domains cannot be distinguished from physical hosts

Marcus Hardt, Dr. Rüdiger Berlich The Grid in a box, Xen and LCG, June 06 8 Xen: Cross Platform !

Marcus Hardt, Dr. Rüdiger Berlich The Grid in a box, Xen and LCG, June 06 9 Xen: Networking

• Domain 0 provides bridged networking to DomU's (i.e. guest-OSs) • DomU's get access to external networking environment • Can be assigned IP by external DHCP server • DomU's can be reached from external hosts, appear like standard physical hosts • A physical network card can be assigned to a DomU (if available)

Marcus Hardt, Dr. Rüdiger Berlich The Grid in a box, Xen and LCG, June 06 10 Xen: Migration

Xen can migrate domains between different physical hosts while keeping the network connection alive ! • Create a copy of the memory allocated to a given domain, while the it is still running • During migration, only an incremental backup of the domain's memory needs to be copied • Network connections are kept alive, including IP • No check-pointing needed !!! • Downtime in the range of milliseconds • Disadvantage: disk image must be on shared storage !

Marcus Hardt, Dr. Rüdiger Berlich The Grid in a box, Xen and LCG, June 06 11 Xen Performance

• What to measure? – CPU performance: “four in a row” – Memory performance: “pCompress” – Network throughput: “t-bench” – Disk I/O: “dd” – Overall performance: “kernel compilation” • How measured – Reference System: . PIII-700MHz / 1GB RAM / 40GB Disk / 100Mbit/s – Benchmark installation booted and run on 1-4 xen instances – Reference Measurement 1-4 parallel runs on plain smp

Marcus Hardt, Dr. Rüdiger Berlich The Grid in a box, Xen and LCG, June 06 12 Results – CPU

Marcus Hardt, Dr. Rüdiger Berlich The Grid in a box, Xen and LCG, June 06 13 Results – Memory

Marcus Hardt, Dr. Rüdiger Berlich The Grid in a box, Xen and LCG, June 06 14 Results – Network

Marcus Hardt, Dr. Rüdiger Berlich The Grid in a box, Xen and LCG, June 06 15 Results – Disk

Marcus Hardt, Dr. Rüdiger Berlich The Grid in a box, Xen and LCG, June 06 16 Results – Kernel compilation

Marcus Hardt, Dr. Rüdiger Berlich The Grid in a box, Xen and LCG, June 06 17 LCG-2.4 on Xen

• Installation – Scientific Linux 3.0.4 – Prepare one basic Image for use with Xen => sl-3.0.4-ready-to-yaim.img – Grow image to 4GB – Multiply image . CE, SE, WN, UI, MON – Yaim/latest – 98% work

Marcus Hardt, Dr. Rüdiger Berlich The Grid in a box, Xen and LCG, June 06 18 LCG-2.4 on Xen

• Caveats – openldap/edg . -> uses libdb4 -> uses glibc/tls => slap* tools crash => use openldap/SL-3.0.4 which works fine . Patches to yaim available at CERN savannah or at marcus – TLS is not a Xen problem – Memory . 1 GB not enough for 5 Nodes . Use most RAM for CE . Use swap (1/2GB per node)

Marcus Hardt, Dr. Rüdiger Berlich The Grid in a box, Xen and LCG, June 06 19 Conclusion

• Stable, high-performance environment • Very active user community • Commercial support available • Supported by large processor manufacturers • Unique live-migration capability (“stay tuned ...”) • Proven ability to serve as the basis of a “Grid in a box” • Try it out !

Marcus Hardt, Dr. Rüdiger Berlich The Grid in a box, Xen and LCG, June 06 20