System Architectures Based on Functionality Offloading

Total Page:16

File Type:pdf, Size:1020Kb

System Architectures Based on Functionality Offloading SYSTEM ARCHITECTURES BASED ON FUNCTIONALITY OFFLOADING BY ANIRUDDHA BOHRA A dissertation submitted to the Graduate School—New Brunswick Rutgers, The State University of New Jersey in partial fulfillment of the requirements for the degree of Doctor of Philosophy Graduate Program in Computer Science Written under the direction of Liviu Iftode and approved by New Brunswick, New Jersey January, 2008 c 2008 Aniruddha Bohra ALL RIGHTS RESERVED ABSTRACT OF THE DISSERTATION System Architectures Based on Functionality Offloading by Aniruddha Bohra Dissertation Director: Liviu Iftode Offloading to hardware components that support the primary task of a system enables sep- aration of concerns and allows both the primary and offloaded components of a system to be easy to understand, manage, and evolve independent of other components. In this dissertation, we explore the software mechanisms required to effectively offload functionality to idle processing elements. We present the design, implementation, and evalua- tion of three system architectures – TCPServers, Orion, and FileWall, which offload functional- ity for improving performance (TCPServers), improving availability (Orion), and for extending functionality (FileWall). We explore software mechanisms to offload functionality to a subset of processors in an Symmetric Multiprocessor (SMP) system, a programmable network inter- face, and an interposing network middlebox to realize the three system architectures. TCPServers is a system architecture that offloads network processing to a subset of proces- sors in an SMP system. Network processing imposes direct and indirect overheads on server systems. It directly affects system performance since it executes at a higher priority than ap- plication tasks and prevents other components of the system from executing simultaneously on the processors. It indirectly affects performance by causing cache pollution and Trans- lation Lookaside Buffer (TLB) flushes, which lead to degraded memory system performance. Through offloading, TCPServers isolates the network processing and eliminates the direct over- heads. We also present mechanisms for increasing network stack concurrency and connection ii scheduling, which significantly improve the performance in a multi-threaded network stack. Orion is a system architecture that enables Remote Healing, where the monitoring and healing actions are performed externally, by offloading such functionality to external hardware. Fine grained monitoring of computer systems for performance anomalies, intrusion detection, and failure detection imposes significant overhead on the target system. Performance critical systems, where such fine grained monitoring is essential, cannot be subjected to these over- heads. Orion offloads the healing functionality to a programmable network interface, which provides an alternate path to the memory of the target system. Using Orion, monitoring and healing actions can be performed nonintrusively, without involving the processors of the target system. We present the design and implementation of mechanisms for remote monitoring and remote repair of damaged OS state using Orion. FileWall is a system architecture that enables extension of network file system functionality. Network file system evolution is constrained by the tight binding between clients and servers. This binding limits the evolution of file system protocols and hinders deployment of file system or protocol extensions. For example, current network file systems have limited support for implementing file access policies. We propose message transformation as a mechanism to separate the client and server systems for protocol enhancement. FileWall offloads message transformation and policy enforcement to an interposing network element on the client-server path. We have used FileWall to implement file access policies and present experimental results showing policy enforcement at FileWall imposes minimal overheads on the base file system protocol. The main conclusion of our research is that system architectures based on functionality of- floading can be realized simply and effectively through efficient software mechanisms, using only commodity off the shelf hardware. With the availability of resources at idle processing cores in a multiprocessor system, intelligent peripherals, and unused nodes in a cluster, of- floading is a practical solution for improving performance and introducing new functionality in computer systems and networks. iii Acknowledgements This dissertation would not be possible without the help, support, and guidance of my family, teachers, and friends - I want to thank them for all their efforts. I want to thank my advisor, Liviu Iftode, and my committee members - Ricardo Bianchini, Rich Martin, S. Muthukrishnan, and Jason Nieh, for valuable suggestions that made this dissertation better. The last-minute help from Rich actually made the dissertation possible. Several professors at Rutgers, whose courses I have taken and whose talks I have attended, helped me understand how little I know and how much I need to learn. I thank them all for their efforts and help through the years. My advisor, Liviu Iftode, has been a firm believer in me and my work, more than I ever was. He has been a source of energy and ideas, many of which were wasted on me, but he has never given up. Through the years, Liviu has been a part of hundreds of arguments with me, some where I had something useful and productive to contribute, others where I was argumentative and foolish. I thank him for never discouraging me and helping me grow as a researcher and as a person. Ricardo has been as much a professor and teacher to me as a friend and mentor. He has given valuable advice whenever I went up to him (and it was often). His enthusiasm and focus on the soccer field as much as in research has been an inspiration. I cannot imagine staying on in graduate school without his support and help. Over the years, I have worked on projects with many people. At the beginning of my graduate study, Florin Sultan took me under his wing and made me realize the importance of patience, perseverance, and attention to detail. I still hear his words, “Don’t jump!”, whenever I am chasing bugs and getting nowhere. To him I owe more than I can thank him for. Stephen Smaldone has been a friend and my partner in the research through the last six years. He never lets me forget what the goal is in work as well as in life. It has been a pleasure working with Steve and I am honored to be his friend. Members of Discolab, especially Murali Rangarajan, iv Cristian Borcea, Pascal Gallard, Iulian Neamtiu, Yufei Pan, Porlin Kang, and Arati Baliga, who over the years have contributed ideas, participated in discussions, and even helped with building systems and running experiments, have been a crucial part of my years at Rutgers. Lu Han and Tzvika Chumash gave valuable feedback and help during the final days before my defense. The long nights in the lab were always an interesting time due to all of them. I have been lucky to have many friends who made the long time away from home enjoyable and interesting. Unfortunately, I cannot possibly thank each one of you here. I have known Budha and Hari for more than a decade and have always had a great time with them discussing science, cricket, history, politics, and everything else under the sun. I also want to acknowledge the colleagues at NEC Labs, who were understanding and supportive of me continuing to work on my dissertation while working there. This dissertation would not have been possible without Kavitha, who through her support and continuous encouragement has kept me going. She is also responsible for introducing me to the pleasure of having pets, and for enriching my life with Mishki, Puff, Ollie, and Nicky. Together they have suffered my absences and handled the ups and downs of this journey. Finally, my parents and my brother are responsible for all that is good in me, the faults are all mine. Thank you all! v Dedication To my parents vi Table of Contents Abstract ::::::::::::::::::::::::::::::::::::::::: ii Acknowledgements ::::::::::::::::::::::::::::::::::: iv Dedication :::::::::::::::::::::::::::::::::::::::: vi List of Tables :::::::::::::::::::::::::::::::::::::: ix List of Figures :::::::::::::::::::::::::::::::::::::: x List of Abbreviations :::::::::::::::::::::::::::::::::: xii 1. Introduction ::::::::::::::::::::::::::::::::::::: 1 1.1. Thesis . 1 1.2. Offloading Functionality in Computer Systems . 1 1.3. Offloading Architectures . 2 1.4. The Offloading Debate: Opportunity and Challenges . 8 1.5. Summary of Dissertation Contributions . 16 1.6. Contributors to the Dissertation . 18 1.7. Organization of the Dissertation . 18 2. TCPServers: Offloading for Improved Network Performance :::::::::: 19 2.1. Problem Statement . 19 2.2. Network Processing Overview . 20 2.3. TCPServers Architecture . 25 2.4. TCPServers Design . 26 2.5. Prototype Implementation . 34 2.6. Evaluation . 36 2.7. Related Work . 42 vii 2.8. Summary . 50 3. Orion: Offloading Monitoring for Improved Availability ::::::::::::: 51 3.1. Problem Statement . 51 3.2. Operating System and Application Monitoring . 53 3.3. Orion Architecture . 56 3.4. Orion Design . 58 3.5. Case Studies . 69 3.6. Prototype Implementation . 76 3.7. Evaluation . 78 3.8. Related Work . 83 3.9. Summary . 88 4. FileWall: Offloading Policy Enforcement in Network File Systems :::::::: 89 4.1. Problem Statement . 89 4.2. FileWall Model . 91 4.3. FileWall Design . 98 4.4. FileWall Policy Templates . 102 4.5.
Recommended publications
  • Theendokernel: Fast, Secure
    The Endokernel: Fast, Secure, and Programmable Subprocess Virtualization Bumjin Im Fangfei Yang Chia-Che Tsai Michael LeMay Rice University Rice University Texas A&M University Intel Labs Anjo Vahldiek-Oberwagner Nathan Dautenhahn Intel Labs Rice University Abstract Intra-Process Sandbox Multi-Process Commodity applications contain more and more combina- ld/st tions of interacting components (user, application, library, and Process Process Trusted Unsafe system) and exhibit increasingly diverse tradeoffs between iso- Unsafe ld/st lation, performance, and programmability. We argue that the Unsafe challenge of future runtime isolation is best met by embracing syscall()ld/st the multi-principle nature of applications, rethinking process Trusted Trusted read/ architecture for fast and extensible intra-process isolation. We IPC IPC write present, the Endokernel, a new process model and security Operating System architecture that nests an extensible monitor into the standard process for building efficient least-authority abstractions. The Endokernel introduces a new virtual machine abstraction for Figure 1: Problem: intra-process is bypassable because do- representing subprocess authority, which is enforced by an main is opaque to OS, sandbox limits functionality, and inter- efficient self-isolating monitor that maps the abstraction to process is slow and costly to apply. Red indicates limitations. system level objects (processes, threads, files, and signals). We show how the Endokernel Architecture can be used to develop enforces subprocess access control to memory and CPU specialized separation abstractions using an exokernel-like state [10, 17, 24, 25, 75].Unfortunately, these approaches only organization to provide virtual privilege rings, which we use virtualize minimal parts of the CPU and neglect tying their to reorganize and secure NGINX.
    [Show full text]
  • Impact Analysis of System and Network Attacks
    View metadata, citation and similar papers at core.ac.uk brought to you by CORE provided by DigitalCommons@USU Utah State University DigitalCommons@USU All Graduate Theses and Dissertations Graduate Studies 12-2008 Impact Analysis of System and Network Attacks Anupama Biswas Utah State University Follow this and additional works at: https://digitalcommons.usu.edu/etd Part of the Computer Sciences Commons Recommended Citation Biswas, Anupama, "Impact Analysis of System and Network Attacks" (2008). All Graduate Theses and Dissertations. 199. https://digitalcommons.usu.edu/etd/199 This Thesis is brought to you for free and open access by the Graduate Studies at DigitalCommons@USU. It has been accepted for inclusion in All Graduate Theses and Dissertations by an authorized administrator of DigitalCommons@USU. For more information, please contact [email protected]. i IMPACT ANALYSIS OF SYSTEM AND NETWORK ATTACKS by Anupama Biswas A thesis submitted in partial fulfillment of the requirements for the degree of MASTER OF SCIENCE in Computer Science Approved: _______________________ _______________________ Dr. Robert F. Erbacher Dr. Chad Mano Major Professor Committee Member _______________________ _______________________ Dr. Stephen W. Clyde Dr. Byron R. Burnham Committee Member Dean of Graduate Studies UTAH STATE UNIVERSITY Logan, Utah 2008 ii Copyright © Anupama Biswas 2008 All Rights Reserved iii ABSTRACT Impact Analysis of System and Network Attacks by Anupama Biswas, Master of Science Utah State University, 2008 Major Professor: Dr. Robert F. Erbacher Department: Computer Science Systems and networks have been under attack from the time the Internet first came into existence. There is always some uncertainty associated with the impact of the new attacks.
    [Show full text]
  • The Challenges of Dynamic Network Interfaces
    The Challenges of Dynamic Network Interfaces by Brooks Davis brooks@{aero,FreeBSD}.org The Aerospace Corporation The FreeBSD Project EuroBSDCon 2004 October 29-31, 2004 Karlsruhe, Germany Introduction ● History of Dynamic Interfaces ● Problems ● Possible Solutions ● Advice to Implementors ● Future Work Early UNIX ● Totally static. ● All devices must be compiled in to kernel ● Fast and easy to program ● Difficult to maintain as the number of devices grows Autoconfiguration ● Introduced in 4.1BSD (June 1981) ● One kernel can serve multiple hardware configurations ● Probe – Test for existence of devices, either using stored addresses or matching devices on self-identifying buses ● Attach – Allocate a driver instance (as of 6.0, this must be fully dynamic) Kernel Modules ● Allows drivers to be added at run time ● LKM (Loadable Kernel Modules) – Introduced in 2.0 by Terry Lambert – Modeled after the facility in SunOS ● KLD (dynamic kernel linker) – Introduced along with newbus in 3.0 by Doug Rabson – Added a generic if_detach() function PC Card & CardBus ● Initial PC Card (PCMCIA) support via PAO in 2.0 ● Fairly good support in 3.0 ● Most PAO changes merged in 4.0 – PAO development ceased ● CardBus support in 5.0 Other Removable Devices ● USB Ethernet (4.0) ● Firewire (fwe(4) in 4.8, fwip(4) in 5.3) ● Bluetooth (5.2) ● Hot plug PCI ● Compact PCI ● PCI Express ● Express Card Netgraph ● Node implement network functions ● Arbitrary connection of nodes allowed ● ng_iface(4) node creates interfaces on demand Interface Cloning ● Handles most pseudo
    [Show full text]
  • The Design and Implementation of Multiprocessor Support for an Industrial Operating System Kernel
    Blekinge Institute of Technology Research Report 2005:06 The Design and Implementation of Multiprocessor Support for an Industrial Operating System Kernel Simon Kågström Håkan Grahn Lars Lundberg Department of Systems and Software School of Engineering Blekinge Institute of Technology The Design and Implementation of Multiprocessor Support for an Industrial Operating System Kernel Simon Kågström, Håkan Grahn, and Lars Lundberg Department of Systems and Software Engineering School of Engineering Blekinge Institute of Technology P.O. Box 520, SE-372 25 Ronneby, Sweden {ska, hgr, llu}@bth.se Abstract The ongoing transition from uniprocessor to multiprocessor computers requires support from the op- erating system kernel. Although many general-purpose multiprocessor operating systems exist, there is a large number of specialized operating systems which require porting in order to work on multipro- cessors. In this paper we describe the multiprocessor port of a cluster operating system kernel from a producer of industrial systems. Our initial implementation uses a giant locking scheme that serializes kernel execution. We also employed a method in which CPU-local variables are placed in a special sec- tion mapped to per-CPU physical memory pages. The giant lock and CPU-local section allowed us to implement an initial working version with only minor changes to the original code, although the giant lock and kernel-bound applications limit the performance of our multiprocessor port. Finally, we also discuss experiences from the implementation. 1 Introduction A current trend in the computer industry is the transition from uniprocessors to various kinds of multipro- cessors, also for desktop and embeddedsystems. Apart from traditional SMP systems, many manufacturers are now presenting chip multiprocessors or simultaneous multithreaded CPUs [9, 15, 16] which allow more efficient use of chip area.
    [Show full text]
  • Introduzione Al Mondo Freebsd
    Introduzione al mondo FreeBSD Corso avanzato Netstudent Netstudent http://netstudent.polito.it E.Richiardone [email protected] maggio 2009 CC-by http://creativecommons.org/licenses/by/2.5/it/ The FreeBSD project - 1 ·EÁ un progetto software open in parte finanziato ·Lo scopo eÁ mantenere e sviluppare il sistema operativo FreeBSD ·Nasce su CDROM come FreeBSD 1.0 nel 1993 ·Deriva da un patchkit per 386BSD, eredita codice da UNIX versione Berkeley 1977 ·Per problemi legali subisce un rallentamento, release 2.0 nel 1995 con codice royalty-free ·Dalla release 5.0 (2003) assume la struttura che ha oggi ·Disponibile per x86 32 e 64bit, ia64, MIPS, ppc, sparc... ·La mascotte (Beastie) nasce nel 1984 The FreeBSD project - 2 ·Erede di 4.4BSD (eÁ la stessa gente...) ·Sistema stabile; sviluppo uniforme; codice molto chiaro, ordinato e ben commentato ·Documentazione ufficiale ben curata ·Licenza molto permissiva, spesso attrae aziende per progetti commerciali: ·saltuariamente esterni collaborano con implementazioni ex-novo (i.e. Intel, GEOM, atheros, NDISwrapper, ZFS) ·a volte no (i.e. Windows NT) ·Semplificazione di molte caratteristiche tradizionali UNIX Di cosa si tratta Il progetto FreeBSD include: ·Un sistema base ·Bootloader, kernel, moduli, librerie di base, comandi e utility di base, servizi tradizionali ·Sorgenti completi in /usr/src (~500MB) ·EÁ giaÁ abbastanza completo (i.e. ipfw, ppp, bind, ...) ·Un sistema di gestione per software aggiuntivo ·Ports e packages ·Documentazione, canali di assistenza, strumenti di sviluppo ·i.e. Handbook,
    [Show full text]
  • Model Driven Scheduling for Virtualized Workloads
    Model Driven Scheduling for Virtualized Workloads Proefschrift voorgelegd op 28 juni 2013 tot het behalen van de graad van Doctor in de Wetenschappen – Informatica, bij de faculteit Wetenschappen, aan de Universiteit Antwerpen. PROMOTOREN: prof. dr. Jan Broeckhove dr. Kurt Vanmechelen Sam Verboven RESEARCH GROUP COMPUTATIONAL MODELINGAND PROGRAMMING Dankwoord Het behalen van een doctoraat is een opdracht die zonder hulp onmogelijk tot een goed einde kan gebracht worden. Gelukkig heb ik de voorbije jaren de kans gekregen om samen te werken met talrijke stimulerende collega’s. Stuk voor stuk hebben zij bijgedragen tot mijn professionele en persoonlijke ontwikkeling. Hun geduldige hulp en steun was essentieel bij het overwinnen van de vele uitdagingen die met een doctoraat gepaard gaan. Graag zou ik hier dan ook enkele woorden van dank neerschrijven. Allereerst zou ik graag prof. dr. Jan Broeckhove en em. prof. dr. Frans Arickx bedanken om mij de kans te geven een gevarieerde en boeiend academisch traject te starten. Bij het beginnen van mijn doctoraat heeft Frans mij niet alleen geholpen om een capabel onderzoeker te worden, ook bij het lesgeven heeft hij mij steeds met raad en daad bijgestaan. Na het emiraat van Frans heeft Jan deze begeleiding overgenomen en er voor gezorgd dat ik het begonnen traject ook succesvol kon beeïndigen. Beiden hebben ze mij steeds grote vrijheid gegeven in mijn zoektocht om interessante onderzoeksvragen te identificeren en beantwoorden. Vervolgens zou ik graag dr. Peter Hellinckx en dr. Kurt Vanmechelen bedanken voor hun persoonlijke en vaak intensieve begeleiding. Zelfs voor de start van mijn academische carrière, bij het schrijven van mijn Master thesis, heeft Peter mij klaar- gestoomd voor een vlotte start als onderzoeker.
    [Show full text]
  • Exercises for Portfolio
    Faculty of Computing, Engineering and Technology Module Name: Operating Systems Module Number: CE01000-3 Title of Assessment: Portfolio of exercises Module Learning Outcomes for This Assessment 3. Apply to the solution of a range of problems, the fundamental concepts, principles and algorithms employed in the operation of a multi-user/multi-tasking operating systems. Hand in deadline: Friday 26th November, 2010. Assessment description Selected exercises from the weekly tutorial/practical class exercises are to be included in a portfolio of exercises to be submitted Faculty Office in a folder by the end of week 12 (Friday 26th November 2010). The weekly tutorial/practical exercises that are to be included in the portfolio are specified below. What you are required to do. Write up the selected weekly exercises in Word or other suitable word processor. The answers you submit should be your own work and not the work of any other person. Note the University policy on Plagiarism and Academic Dishonesty - see Breaches in Assessment Regulations: Academic Dishonesty - at http://www.staffs.ac.uk/current/regulations/academic/index.php Marks The marks associated with each selected exercise will be indicated when you are told which exercises are being selected. Week 1. Tutorial questions: 6. How does the distinction between supervisor mode and user mode function as a rudimentary form of protection (security) system? The operating system assures itself a total control of the system by establishing a set of privileged instructions only executable in supervisor mode. In that way the standard user won’t be able to execute these commands, thus the security.
    [Show full text]
  • The Challenges of Dynamic Network Interfaces
    The Challenges of Dynamic Network Interfaces Brooks Davis The FreeBSD Project Seattle, WA brooks@{aero,FreeBSD}.org Abstract vices to the modern age of near complete dynamism. Following this history, the problems caused by this dynamism are discussed in detail. Then solutions to On early BSD systems, network interfaces were some of these problems are proposed and analyzed, static objects created at kernel compile time. To- and advice to implementers of userland applications day the situation has changed dramatically. PC is given. Finally, the issues are summarized and fu- Card, USB, and other removable buses allow hard- ture work is discussed. ware interfaces to arrive and depart at run time. Pseudo-device cloning also allows pseudo-devices to be created dynamically. Additionally, in FreeBSD and Dragonfly, interfaces can be renamed by the ad- 2 History ministrator. With these changes, interfaces are now dynamic objects which may appear, change, or dis- appear at any time. This dynamism invalidates a In early versions of UNIX, the exact set of devices number of assumptions that have been made in the on the system had to be compiled in to the kernel. If kernel, in external programs, and even in standards the administrator attempted to use a device which such as SNMP. This paper explores the history of was compiled in, but not installed, a panic or hang the transition of network interfaces from static to dy- was nearly certain. This system was easy to pro- namic. Issues raised by these changes are discussed gram and efficient to execute. Unfortunately, it was and possible solutions suggested.
    [Show full text]
  • A Universal Framework for (Nearly) Arbitrary Dynamic Languages Shad Sterling Georgia State University
    Georgia State University ScholarWorks @ Georgia State University Undergraduate Honors Theses Honors College 5-2013 A Universal Framework for (nearly) Arbitrary Dynamic Languages Shad Sterling Georgia State University Follow this and additional works at: https://scholarworks.gsu.edu/honors_theses Recommended Citation Sterling, Shad, "A Universal Framework for (nearly) Arbitrary Dynamic Languages." Thesis, Georgia State University, 2013. https://scholarworks.gsu.edu/honors_theses/12 This Thesis is brought to you for free and open access by the Honors College at ScholarWorks @ Georgia State University. It has been accepted for inclusion in Undergraduate Honors Theses by an authorized administrator of ScholarWorks @ Georgia State University. For more information, please contact [email protected]. A UNIVERSAL FRAMEWORK FOR (NEARLY) ARBITRARY DYNAMIC LANGUAGES (A THEORETICAL STEP TOWARD UNIFYING DYNAMIC LANGUAGE FRAMEWORKS AND OPERATING SYSTEMS) by SHAD STERLING Under the DireCtion of Rajshekhar Sunderraman ABSTRACT Today's dynamiC language systems have grown to include features that resemble features of operating systems. It may be possible to improve on both by unifying a language system with an operating system. Complete unifiCation does not appear possible in the near-term, so an intermediate system is desCribed. This intermediate system uses a common call graph to allow Components in arbitrary languages to interaCt as easily as components in the same language. Potential benefits of such a system include signifiCant improvements in interoperability,
    [Show full text]
  • PC-BSD 9 Turns a New Page
    CONTENTS Dear Readers, Here is the November issue. We are happy that we didn’t make you wait for it as long as for October one. Thanks to contributors and supporters we are back and ready to give you some usefull piece of knowledge. We hope you will Editor in Chief: Patrycja Przybyłowicz enjoy it as much as we did by creating the magazine. [email protected] The opening text will tell you What’s New in BSD world. It’s a review of PC-BSD 9 by Mark VonFange. Good reading, Contributing: especially for PC-BSD users. Next in section Get Started you Mark VonFange, Toby Richards, Kris Moore, Lars R. Noldan, will �nd a great piece for novice – A Beginner’s Guide To PF Rob Somerville, Erwin Kooi, Paul McMath, Bill Harris, Jeroen van Nieuwenhuizen by Toby Richards. In Developers Corner Kris Moore will teach you how to set up and maintain your own repository on a Proofreaders: FreeBSD system. It’s a must read for eager learners. Tristan Karstens, Barry Grumbine, Zander Hill, The How To section in this issue is for those who enjoy Christopher J. Umina experimenting. Speed Daemons by Lars R Noldan is a very good and practical text. By reading it you can learn Special Thanks: how to build a highly available web application server Denise Ebery with advanced networking mechanisms in FreeBSD. The Art Director: following article is the �nal one of our GIS series. The author Ireneusz Pogroszewski will explain how to successfully manage and commission a DTP: complex GIS project.
    [Show full text]
  • Filesystem Performance on Freebsd
    Filesystem Performance on FreeBSD Kris Kennaway [email protected] BSDCan 2006, Ottawa, May 12 Introduction ● Filesystem performance has many aspects ● No single metric for quantifying it ● I will focus on aspects that are relevant for my workloads (concurrent package building) ● The main relevant filesystem workloads seem to be – Concurrent tarball extraction – Recursive filesystem traversals ● Aim: determine relative performance of FreeBSD 4.x, 5.x and 6.x on these workloads – Overall performance and SMP scaling – Evaluate results of multi-year kernel locking strategy as it relates to these workloads Outline ● SMP architectural differences between 4/5/6.x ● Test methodology ● Hardware used ● Parallel tarball extraction test – Disk array and memory disk ● Scaling beyond 4 CPUs ● Recursive filesystem traversal test ● Conclusions & future work SMP Architectural Overview ● FreeBSD 4.x; rudimentary SMP support – Giant kernel lock restricts kernel access to one process at a time – SPL model; interrupts may still be processed in parallel ● FreeBSD 5.x; aim towards greater scalability – Giant-locked to begin with; then finer-grained locking pushdown ● FreeBSD 5.3; VM Giant-free ● FreeBSD 5.4; network stack Giant-free (mostly) ● Many other subsystems/drivers also locked – Interrupts as kernel threads; compete for common locks (if any) with everything else ● FreeBSD 6.x; – Consolidation; further pushdown; payoff! – VFS subsystem, UFS filesystem Giant-free FreeBSD versions ● FreeBSD 4.11-STABLE (11/2005) – Needed for amr driver fixes after 4.11-RELEASE ● FreeBSD 5.4-STABLE (11/05) – No patches needed ● FreeBSD 6.0-STABLE (11/05) – patches: ● Locking reworked in amr driver by Scott Long for better performance ● All relevant changes merged into FreeBSD 6.1 – A kernel panic was encountered at very high I/O loads ● Also fixed in 6.1 Test aims and Methodology ● Want to measure – overall performance difference between FreeBSD branches under varying (concurrent process I/O) loads – scaling to multiple CPUs ● Avoid saturating hardware resources (e.g.
    [Show full text]
  • Towards Better Performance Per Watt in Virtual Environments on Asymmetric Single-ISA Multi-Core Systems
    Towards Better Performance Per Watt in Virtual Environments on Asymmetric Single-ISA Multi-core Systems Viren Kumar Alexandra Fedorova Simon Fraser University Simon Fraser University 8888 University Dr 8888 University Dr Vancouver, Canada Vancouver, Canada [email protected] [email protected] ABSTRACT performance per watt than homogeneous multicore proces- Single-ISA heterogeneous multicore architectures promise to sors. As power consumption in data centers becomes a grow- deliver plenty of cores with varying complexity, speed and ing concern [3], deploying ASISA multicore systems is an performance in the near future. Virtualization enables mul- increasingly attractive opportunity. These systems perform tiple operating systems to run concurrently as distinct, in- at their best if application workloads are assigned to het- dependent guest domains, thereby reducing core idle time erogeneous cores in consideration of their runtime proper- and maximizing throughput. This paper seeks to identify a ties [4][13][12][18][24][21]. Therefore, understanding how to heuristic that can aid in intelligently scheduling these vir- schedule data-center workloads on ASISA systems is an im- tualized workloads to maximize performance while reducing portant problem. This paper takes the first step towards power consumption. understanding the properties of data center workloads that determine how they should be scheduled on ASISA multi- We propose that the controlling domain in a Virtual Ma- core processors. Since virtual machine technology is a de chine Monitor or hypervisor is relatively insensitive to changes facto standard for data centers, we study virtual machine in core frequency, and thus scheduling it on a slower core (VM) workloads. saves power while only slightly affecting guest domain per- formance.
    [Show full text]