A Hybrid OS Cluster Solution
Architect of an Open World™

A Hybrid OS Cluster Solution: Dual-Boot and Virtualization with Windows HPC Server 2008 and Linux Bull Advanced Server for Xeon

Published: June 2009

Dr. Patrice Calegari, HPC Application Specialist, BULL S.A.S.
Thomas Varlet, HPC Technology Solution Professional, Microsoft

The proof of concept presented in this document is neither a product nor a service offered by Microsoft or BULL S.A.S.

The information contained in this document represents the current view of Microsoft Corporation and BULL S.A.S. on the issues discussed as of the date of publication. Because Microsoft and BULL S.A.S. must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft or BULL S.A.S., and Microsoft and BULL S.A.S. cannot guarantee the accuracy of any information presented after the date of publication.

This White Paper is for informational purposes only. MICROSOFT and BULL S.A.S. MAKE NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS DOCUMENT.

Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this document may be reproduced, stored in or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), or for any purpose, without the express written permission of Microsoft Corporation and BULL S.A.S.

Microsoft and BULL S.A.S. may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this document. Except as expressly provided in any written license agreement from Microsoft or BULL S.A.S., as applicable, the furnishing of this document does not give you any license to these patents, trademarks, copyrights, or other intellectual property.

© 2008, 2009 Microsoft Corporation and BULL S.A.S. All rights reserved.
NovaScale is a registered trademark of Bull S.A.S. Microsoft, Hyper-V, Windows, Windows Server, and the Windows logo are trademarks of the Microsoft group of companies. PBS GridWorks®, GridWorks™, PBS Professional®, PBS™ and Portable Batch System® are trademarks of Altair Engineering, Inc. The names of actual companies and products mentioned herein may be the trademarks of their respective owners.

Initial publication: release 1.2, 52 pages, published in June 2008
Minor updates: release 1.5, 56 pages, published in Nov. 2008
This paper with meta-scheduler implementation: release 2.0, 76 pages, published in June 2009

Abstract

The choice of an operating system (OS) for a high performance computing (HPC) cluster is a critical decision for IT departments. The goal of this paper is to show that simple techniques are available today to optimize the return on investment by making that choice unnecessary, and keeping the HPC infrastructure versatile and flexible. This paper introduces Hybrid Operating System Clusters (HOSC). An HOSC is an HPC cluster that can run several OS's simultaneously. This paper addresses the situation where two OS's are running simultaneously: Linux Bull Advanced Server for Xeon and Microsoft® Windows® HPC Server 2008. However, most of the information presented in this paper can apply to three or more simultaneous OS's, possibly from other OS distributions, with slight adaptations.

This document gives general concepts as well as detailed setup information. Firstly, the technologies necessary to design an HOSC are defined (dual-boot, virtualization, PXE, resource manager and job scheduler). Secondly, different approaches to HOSC architectures are analyzed and technical recommendations are given with a focus on computing performance and management flexibility.
The recommendations are then implemented to determine the best technical choices for designing an HOSC prototype. The installation setup of the prototype and the configuration steps are explained. A meta-scheduler based on Altair PBS Professional is implemented. Finally, basic HOSC administrator operations are listed and ideas for future work are proposed.

This paper can be downloaded from the following web sites:
http://www.bull.com/techtrends
http://www.microsoft.com/downloads
http://technet.microsoft.com/en-us/library/cc700329(WS.10).aspx

ABSTRACT
1 INTRODUCTION
2 CONCEPTS AND PRODUCTS
  2.1 MASTER BOOT RECORD (MBR)
  2.2 DUAL-BOOT
  2.3 VIRTUALIZATION
  2.4 PXE
  2.5 JOB SCHEDULERS AND RESOURCE MANAGERS IN A HPC CLUSTER
  2.6 META-SCHEDULER
  2.7 BULL ADVANCED SERVER FOR XEON
    2.7.1 Description
    2.7.2 Cluster installation mechanisms
  2.8 WINDOWS HPC SERVER 2008
    2.8.1 Description
    2.8.2 Cluster installation mechanisms
  2.9 PBS PROFESSIONAL
3 APPROACHES AND RECOMMENDATIONS
  3.1 A SINGLE OPERATING SYSTEM AT A TIME
  3.2 TWO SIMULTANEOUS OPERATING SYSTEMS
  3.3 SPECIALIZED NODES
    3.3.1 Management node
    3.3.2 Compute nodes
    3.3.3 I/O nodes
    3.3.4 Login nodes
  3.4 MANAGEMENT SERVICES
  3.5 PERFORMANCE IMPACT OF VIRTUALIZATION
  3.6 META-SCHEDULER FOR HOSC
    3.6.1 Goals
    3.6.2 OS switch techniques
    3.6.3 Provisioning and distribution policies
4 TECHNICAL CHOICES FOR DESIGNING AN HOSC PROTOTYPE
  4.1 CLUSTER APPROACH
  4.2 MANAGEMENT NODE