Predicting the CPU Availability of Time-Shared Unix Systems


Predicting the CPU Availability of Time-shared Unix Systems on the Computational Grid

Rich Wolski, Neil Spring, and Jim Hayes

Abstract

In this paper we focus on the problem of making short and medium term forecasts of CPU availability on time-shared Unix systems. We evaluate the accuracy with which availability can be measured using Unix load average, the Unix utility vmstat, and the Network Weather Service CPU sensor that uses both. We also examine the autocorrelation between successive CPU measurements to determine their degree of self-similarity. While our observations show a long-range autocorrelation dependence, we demonstrate how this dependence manifests itself in the short and medium term predictability of the CPU resources in our study.

1 Introduction

Improvements in network technology have made the distributed execution of performance-starved applications feasible. High-bandwidth, low-latency local-area networks form the basis of low-cost distributed systems such as the HPVM [14], the Berkeley NOW [9], and various Beowulf [25] configurations. Similarly, common carrier support for high-performance wide-area networking is fueling an interest in aggregating geographically dispersed and independently managed computers. These large-scale metacomputers are capable of delivering greater execution performance to an application than is available at any one of their constituent sites [4]. Moreover, performance-oriented distributed software infrastructures such as Globus [12], Legion [18], Condor [26], NetSolve [7], and HPJava [6] are attempting to knit vast collections of machines into computational grids [13] from which compute cycles can be obtained in the way electrical power is obtained from an electrical power utility.

One vexing quality of these ensemble systems is that their performance characteristics vary dynamically. In particular, it is often economically infeasible to dedicate a large collection of machines and networks to a single application, particularly if those resources belong to separate organizations. Resources, therefore, must be shared, and the contention that results from this sharing causes the deliverable performance to vary over time. To make the best use of the resources that are at hand, an application scheduler (be it a program or a human being) must make a prediction of what performance will be available from each.

In this paper, we examine the problem of predicting available CPU performance on Unix time-shared systems for the purpose of building dynamic schedulers. In this vein, the contributions it makes are:

- an exposition of the measurement error associated with popular Unix load measurement facilities with respect to estimating the available CPU time a process can obtain,

- a study of one-step-ahead forecasting performance obtained by the Network Weather Service [29, 30] (a distributed, on-line performance forecasting system) when applied to CPU availability measurements,

- an analysis of this forecasting performance in terms of the autocorrelation present between consecutive measurements of CPU availability, and

- verification of this analysis through its application to longer-term forecasting of CPU availability.

Our results are somewhat surprising in that they demonstrate the possibility of making short and medium term predictions of available CPU performance despite the presence of long-range autocorrelation and potential self-similarity.
Since chaotic systems typically exhibit self-similar performance characteristics, self-similarity is often interpreted as an indication of unpredictability. The predictions we obtained, however, exhibited a mean absolute error of less than 10%, typically making them accurate enough for use in dynamic process scheduling.

In the next section (Section 2), we detail the accuracy obtained by three performance measurement methodologies when applied to a collection of compute servers and workstations located in the U.C. San Diego Computer Science and Engineering Department. Section 3 briefly describes the Network Weather Service (NWS) [30, 29] – a distributed performance forecasting system designed for use by dynamic schedulers. The NWS treats measurement histories as time series and uses simple statistical techniques to make short-term forecasts. Section 3 also presents the analysis of the forecasting errors generated by the NWS forecasting system for the resources we monitored in this study. Finally, we conclude with a discussion of these results in the context of dynamic scheduling for metacomputing and computational grid systems, and point to future research directions in Section 4.

2 Measurements and measurement error

For dynamic process scheduling, the goal of CPU prediction is to gauge the degree to which a process can be expected to occupy a processor over some fixed time interval. For example, if "50% of the CPU" is available from a time-shared CPU, a process should be able to obtain 50% of the time-slices over some time interval. Typically, the availability percentage is used as an expansion factor [11, 23, 15, 1] to determine the potential execution time of a process. If only 50% of the time-slices are available, for example, a process is expected to take twice as long to execute as it would if the CPU were completely unloaded.¹ We have used CPU availability to successfully schedule parallel programs in shared distributed environments [24, 2]. Other scheduling systems such as Prophet [27], Winner [1], and MARS [15] use Unix load average to accomplish the same purpose.

In this section, we detail the error that we observed while measuring CPU availability. We report results gathered by monitoring a collection of workstations and compute servers in the Computer Science and Engineering Department at UCSD. We used the CPU monitoring facilities currently implemented as part of the Network Weather Service [30]. Each series of measurements spans a 24-hour period of "production" use during August of 1998. Despite the usual summertime hiatus from classwork, the machines in this study experienced a wide range of loads as many of the graduate students took the opportunity to devote themselves to research that had been neglected during the school year. As such, we believe that the data is a representative sample of the departmental load behavior.

2.1 Measurement methods

For each system, we use the NWS CPU monitor to periodically generate three separate measurements of current CPU availability. The first is based on the Unix load average metric, which is a one-minute smoothed average of run queue length. Almost all Unix systems gather and report load average values, although the smoothing factor and sampling rates vary across implementations. The NWS sensor uses the utility uptime to obtain a load average reading without special access privileges.² Using a load average measurement, we compute the available CPU percentage as

    availability_loadavg = (1 / (load_average + 1)) * 100%    (1)

indicating the percentage of CPU time that would be available to a newly created process.

Fearing load average to be insensitive to short-term load variability, we implemented a second measurement technique based on the utility vmstat, which provides periodically updated readings of idle CPU time, consumed user time, and consumed system time, presented as percentages. From these we compute

    availability_vmstat = idle + (user + w * system) / (n + 1)    (2)

where n is a smoothed average of the number of running processes over the previous set of measurements, and w is a weighting factor equal to the user time. The rationale is that a process is entitled to the current idle time, and a fair share of the user and system time. However, if a machine is used as a network gateway (as was the case at one time in the UCSD CSE Department), user-level processes may be denied CPU time as the kernel services network-level packet interrupts. In our experience, the percentage of system time that is shared fairly is directly proportional to the percentage of user time, hence the w factor.

Lastly, we implemented a hybrid sensor that combines Unix load average and vmstat measurements with a small probe process. The probe process occupies the CPU for a short period of time (currently 1.5 seconds) and reports the ratio of the CPU time it used to the wall-clock time that passed as a measure of the availability it experienced. The hybrid runs its probe process much less frequently than it measures availability_loadavg and availability_vmstat, as these quantities may be obtained much less intrusively. The method (vmstat or load average) that reports the CPU availability measure closest to that experienced by the probe is chosen to generate all measurements until the next probe is run. In the experiments described in this paper, the NWS hybrid sensor calculated availability_loadavg and availability_vmstat every 10 seconds (6 times per minute), and probed once per minute. The method that was most accurate with respect to the probe was used for the subsequent 5 measurements each minute.

    Host Name    Load Average    vmstat    NWS Hybrid
    thing2           9.0%        11.2%       11.1%
    thing1           6.4%         7.5%        6.1%
    conundrum       34.1%        32.7%        4.4%
    beowulf          6.3%         6.5%        7.5%

Supported by NSF grant ASC-9701333 and Advanced Research Projects Agency/ITO under contract #N66001-97-C-8531. Author email: [email protected]

¹ This relationship assumes that the execution interval is sufficiently large with respect to the length of a time-slice so that possible truncation and round-off errors are insignificant.

² The current implementation of the NWS runs completely without privileged access, both to reduce the possibility of breaching site security and to provide data that is representative of the performance an "average" user can obtain. For a more complete account of the portability and implementation issues associated with the NWS, see [30].
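The two availability estimates described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the NWS sensor itself: the uptime parsing assumes the common "load average: 0.42, 0.30, 0.25" output format, and the vmstat-based function simply takes the idle/user/system percentages and the smoothed run count as arguments rather than invoking vmstat.

```python
import re
import subprocess

def availability_from_load_average():
    """Estimate the percentage of the CPU a newly created process could
    obtain, from the one-minute Unix load average (as in the load-average
    formula above: 1 / (load + 1), expressed as a percentage)."""
    out = subprocess.run(["uptime"], capture_output=True, text=True).stdout
    # Assumes the usual "load average: 0.42, 0.30, 0.25" output format.
    load = float(re.search(r"load averages?:\s*([\d.]+)", out).group(1))
    return 1.0 / (load + 1.0) * 100.0

def availability_from_vmstat(idle, user, system, n):
    """Estimate availability from vmstat-style percentages: the idle
    percentage plus a fair share of the user and system percentages,
    split among the n running processes plus the newcomer. The system
    share is weighted by w, the user-time fraction, since only that
    portion of system time is assumed to be shared fairly."""
    w = user / 100.0
    return idle + (user + w * system) / (n + 1.0)
```

For example, an otherwise idle machine (idle=100, n=0) yields 100% availability, while a half-busy machine with one running process yields roughly half.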
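The hybrid sensor's probe and its selection rule can be approximated as follows. This is a hedged sketch of the mechanism described above, not the NWS implementation: the probe spins for a fixed wall-clock interval and reports the fraction of CPU time it actually received, and the selection function picks whichever cheap estimate came closest to the probe's observation.

```python
import time

def cpu_probe(duration=1.5):
    """Occupy the CPU for `duration` wall-clock seconds and report the
    ratio of CPU time consumed to wall-clock time elapsed, as a
    percentage -- the availability the probe actually experienced."""
    cpu_start = time.process_time()    # CPU time charged to this process
    wall_start = time.perf_counter()   # wall-clock time
    while time.perf_counter() - wall_start < duration:
        pass                           # busy-wait so the probe competes for time-slices
    cpu_used = time.process_time() - cpu_start
    wall_elapsed = time.perf_counter() - wall_start
    return (cpu_used / wall_elapsed) * 100.0

def pick_method(probe_value, loadavg_value, vmstat_value):
    """Hybrid selection: trust whichever cheap estimate fell closer to
    what the probe observed, until the next probe run."""
    if abs(loadavg_value - probe_value) <= abs(vmstat_value - probe_value):
        return "load_average"
    return "vmstat"
```

In the scheme described in the paper, the probe runs once per minute while the two cheaper measurements run every 10 seconds; the winning method then supplies the intervening measurements.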