Systems Performance: Enterprise and the Cloud

Total pages: 16
File type: PDF, size: 1020 KB

Systems Performance: Enterprise and the Cloud, Second Edition
Brendan Gregg

Boston • Columbus • New York • San Francisco • Amsterdam • Cape Town • Dubai • London • Madrid • Milan • Munich • Paris • Montreal • Toronto • Delhi • Mexico City • São Paulo • Sydney • Hong Kong • Seoul • Singapore • Taipei • Tokyo

Publisher: Mark L. Taub
Executive Editor: Greg Doench
Managing Producer: Sandra Schroeder
Sr. Content Producer: Julie B. Nahil
Project Manager: Rachel Paul
Copy Editor: Kim Wimpsett
Indexer: Ted Laux
Proofreader: Rachel Paul
Compositor: The CIP Group

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and the publisher was aware of a trademark claim, the designations have been printed with initial capital letters or in all capitals.

The author and publisher have taken care in the preparation of this book, but make no expressed or implied warranty of any kind and assume no responsibility for errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of the use of the information or programs contained herein.

For information about buying this title in bulk quantities, or for special sales opportunities (which may include electronic versions; custom cover designs; and content particular to your business, training goals, marketing focus, or branding interests), please contact our corporate sales department at [email protected] or (800) 382-3419.

For government sales inquiries, please contact [email protected].

For questions about sales outside the U.S., please contact [email protected].

Visit us on the Web: informit.com/aw

Library of Congress Control Number: 2020944455

Copyright © 2021 Pearson Education, Inc. All rights reserved.

This publication is protected by copyright, and permission must be obtained from the publisher prior to any prohibited reproduction, storage in a retrieval system, or transmission in any form or by any means, electronic, mechanical, photocopying, recording, or likewise. For information regarding permissions, request forms and the appropriate contacts within the Pearson Education Global Rights & Permissions Department, please visit www.pearson.com/permissions.

Cover images by Brendan Gregg
Page 9, Figure 1.5: Screenshot of system metrics GUI (Grafana) © 2020 Grafana Labs
Page 84, Figure 2.32: Screenshot of Firefox timeline chart © Netflix
Page 164, Figure 4.7: Screenshot of sar(1) sadf(1) SVG output © 2010 W3C
Page 560, Figure 10.12: Screenshot of Wireshark © Wireshark
Page 740, Figure 14.3: Screenshot of KernelShark © KernelShark

ISBN-13: 978-0-13-682015-4
ISBN-10: 0-13-682015-8

For Deirdré Straughan, an amazing person in technology, and an amazing person—we did it.
Contents at a Glance

Contents ix
Preface xxix
Acknowledgments xxxv
About the Author xxxvii
1 Introduction 1
2 Methodologies 21
3 Operating Systems 89
4 Observability Tools 129
5 Applications 171
6 CPUs 219
7 Memory 303
8 File Systems 359
9 Disks 423
10 Network 499
11 Cloud Computing 579
12 Benchmarking 641
13 perf 671
14 Ftrace 705
15 BPF 751
16 Case Study 783
A USE Method: Linux 795
B sar Summary 801
C bpftrace One-Liners 803
D Solutions to Selected Exercises 809
E Systems Performance Who's Who 811
Glossary 815
Index 825

Contents

Preface xxix
Acknowledgments xxxv
About the Author xxxvii

1 Introduction 1
  1.1 Systems Performance 1
  1.2 Roles 2
  1.3 Activities 3
  1.4 Perspectives 4
  1.5 Performance Is Challenging 5
    1.5.1 Subjectivity 5
    1.5.2 Complexity 5
    1.5.3 Multiple Causes 6
    1.5.4 Multiple Performance Issues 6
  1.6 Latency 6
  1.7 Observability 7
    1.7.1 Counters, Statistics, and Metrics 8
    1.7.2 Profiling 10
    1.7.3 Tracing 11
  1.8 Experimentation 13
  1.9 Cloud Computing 14
  1.10 Methodologies 15
    1.10.1 Linux Perf Analysis in 60 Seconds 15
  1.11 Case Studies 16
    1.11.1 Slow Disks 16
    1.11.2 Software Change 18
    1.11.3 More Reading 19
  1.12 References 19

2 Methodologies 21
  2.1 Terminology 22
  2.2 Models 23
    2.2.1 System Under Test 23
    2.2.2 Queueing System 23
  2.3 Concepts 24
    2.3.1 Latency 24
    2.3.2 Time Scales 25
    2.3.3 Trade-Offs 26
    2.3.4 Tuning Efforts 27
    2.3.5 Level of Appropriateness 28
    2.3.6 When to Stop Analysis 29
    2.3.7 Point-in-Time Recommendations 29
    2.3.8 Load vs. Architecture 30
    2.3.9 Scalability 31
    2.3.10 Metrics 32
    2.3.11 Utilization 33
    2.3.12 Saturation 34
    2.3.13 Profiling 35
    2.3.14 Caching 35
    2.3.15 Known-Unknowns 37
  2.4 Perspectives 37
    2.4.1 Resource Analysis 38
    2.4.2 Workload Analysis 39
  2.5 Methodology 40
    2.5.1 Streetlight Anti-Method 42
    2.5.2 Random Change Anti-Method 42
    2.5.3 Blame-Someone-Else Anti-Method 43
    2.5.4 Ad Hoc Checklist Method 43
    2.5.5 Problem Statement 44
    2.5.6 Scientific Method 44
    2.5.7 Diagnosis Cycle 46
    2.5.8 Tools Method 46
    2.5.9 The USE Method 47
    2.5.10 The RED Method 53
    2.5.11 Workload Characterization 54
    2.5.12 Drill-Down Analysis 55
    2.5.13 Latency Analysis 56
    2.5.14 Method R 57
    2.5.15 Event Tracing 57
    2.5.16 Baseline Statistics 59
    2.5.17 Static Performance Tuning 59
    2.5.18 Cache Tuning 60
    2.5.19 Micro-Benchmarking 60
    2.5.20 Performance Mantras 61
  2.6 Modeling 62
    2.6.1 Enterprise vs. Cloud 62
    2.6.2 Visual Identification 62
    2.6.3 Amdahl's Law of Scalability 64
    2.6.4 Universal Scalability Law 65
    2.6.5 Queueing Theory 66
  2.7 Capacity Planning 69
    2.7.1 Resource Limits 70
    2.7.2 Factor Analysis 71
    2.7.3 Scaling Solutions 72
  2.8 Statistics 73
    2.8.1 Quantifying Performance Gains 73
    2.8.2 Averages 74
    2.8.3 Standard Deviation, Percentiles, Median 75
    2.8.4 Coefficient of Variation 76
    2.8.5 Multimodal Distributions 76
    2.8.6 Outliers 77
  2.9 Monitoring 77
    2.9.1 Time-Based Patterns 77
    2.9.2 Monitoring Products 79
    2.9.3 Summary-Since-Boot 79
  2.10 Visualizations 79
    2.10.1 Line Chart 80
    2.10.2 Scatter Plots 81
    2.10.3 Heat Maps 82
    2.10.4 Timeline Charts 83
    2.10.5 Surface Plot 84
    2.10.6 Visualization Tools 85
  2.11 Exercises 85
  2.12 References 86

3 Operating Systems 89
  3.1 Terminology 90
  3.2 Background 91
    3.2.1 Kernel 91
    3.2.2 Kernel and User Modes 93
    3.2.3 System Calls 94
    3.2.4 Interrupts 96
    3.2.5 Clock and Idle 99
    3.2.6 Processes 99
    3.2.7 Stacks 102
    3.2.8 Virtual Memory 104
    3.2.9 Schedulers 105
    3.2.10 File Systems 106
    3.2.11 Caching 108
    3.2.12 Networking 109
    3.2.13 Device Drivers 109
    3.2.14 Multiprocessor 110
    3.2.15 Preemption 110
    3.2.16 Resource Management 110
    3.2.17 Observability 111
  3.3 Kernels 111
    3.3.1 Unix 112
    3.3.2 BSD 113
    3.3.3 Solaris 114
  3.4 Linux 114
    3.4.1 Linux Kernel Developments 115
    3.4.2 systemd 120
    3.4.3 KPTI (Meltdown) 121
    3.4.4 Extended BPF 121
  3.5 Other Topics 122
    3.5.1 PGO Kernels 122
    3.5.2 Unikernels 123
    3.5.3 Microkernels and Hybrid Kernels 123
    3.5.4 Distributed Operating Systems 123
  3.6 Kernel Comparisons 124
  3.7 Exercises 124
  3.8 References 125
    3.8.1 Additional Reading 127

4 Observability Tools 129
  4.1 Tool Coverage 130
    4.1.1 Static Performance Tools 130
    4.1.2 Crisis Tools 131
  4.2 Tool Types 133
    4.2.1 Fixed Counters 133
    4.2.2 Profiling 135
    4.2.3 Tracing 136
    4.2.4 Monitoring 137
  4.3 Observability Sources 138
    4.3.1 /proc 140
    4.3.2 /sys 143
    4.3.3 Delay Accounting 145
    4.3.4 netlink 145
    4.3.5 Tracepoints 146
    4.3.6 kprobes 151
    4.3.7 uprobes 153
    4.3.8 USDT 155
    4.3.9 Hardware Counters (PMCs) 156
    4.3.10 Other Observability Sources 159
  4.4 sar 160
    4.4.1 sar(1) Coverage 161
    4.4.2 sar(1) Monitoring 161
    4.4.3 sar(1) Live 165
    4.4.4 sar(1) Documentation 165
  4.5 Tracing Tools 166
  4.6 Observing Observability 167
  4.7 Exercises 168
  4.8 References 168

5 Applications 171
  5.1 Application Basics 172
    5.1.1 Objectives 173
    5.1.2 Optimize the Common Case 174
    5.1.3 Observability 174
    5.1.4 Big O Notation 175
  5.2 Application Performance Techniques 176
    5.2.1 Selecting an I/O Size 176
    5.2.2 Caching 176
    5.2.3 Buffering 177
    5.2.4 Polling 177
    5.2.5 Concurrency and Parallelism 177
    5.2.6 Non-Blocking I/O 181
    5.2.7 Processor Binding 181
    5.2.8 Performance Mantras 182
  5.3 Programming Languages 182
    5.3.1 Compiled Languages 183
    5.3.2 Interpreted Languages 184
    5.3.3 Virtual Machines 185
    5.3.4 Garbage Collection 185
  5.4 Methodology 186
    5.4.1 CPU Profiling 187
    5.4.2 Off-CPU Analysis 189
    5.4.3 Syscall Analysis 192
    5.4.4 USE Method 193
    5.4.5 Thread State Analysis 193
    5.4.6 Lock Analysis 198
    5.4.7 Static Performance Tuning 198
    5.4.8 Distributed Tracing 199
  5.5 Observability Tools 199
    5.5.1 perf 200
    5.5.2 profile 203
    5.5.3 offcputime 204
    5.5.4 strace 205
    5.5.5 execsnoop 207
    5.5.6 syscount 208
    5.5.7 bpftrace 209
  5.6 Gotchas 213
    5.6.1 Missing Symbols 214
    5.6.2 Missing Stacks 215
  5.7 Exercises 216
  5.8 References 217

6 CPUs 219
  6.1 Terminology 220
  6.2 Models 221
    6.2.1 CPU Architecture 221
    6.2.2 CPU Memory Caches 221
    6.2.3 CPU Run Queues 222
  6.3 Concepts 223
    6.3.1 Clock Rate 223
    6.3.2 Instructions 223
    6.3.3 Instruction Pipeline 224
    6.3.4 Instruction Width 224
    6.3.5 Instruction Size 224
    6.3.6 SMT 225
    6.3.7 IPC, CPI 225
    6.3.8 Utilization 226
    6.3.9 User Time/Kernel Time 226
    6.3.10 Saturation 226
    6.3.11 Preemption 227
    6.3.12 Priority Inversion 227
    6.3.13 Multiprocess, Multithreading 227
    6.3.14 Word Size 229
    6.3.15 Compiler Optimization 229
  6.4 Architecture 229
    6.4.1 Hardware 230
    6.4.2 Software 241
  6.5 Methodology 244
    6.5.1 Tools Method 245
    6.5.2 USE Method 245
    6.5.3 Workload Characterization 246
    6.5.4 Profiling 247
    6.5.5 Cycle Analysis 251
    6.5.6 Performance Monitoring 251
    6.5.7 Static Performance Tuning 252
    6.5.8 Priority Tuning 252
    6.5.9 Resource Controls
Recommended publications
  • Thinking Methodically About Performance
    PERFORMANCE Thinking Methodically about Performance The USE method addresses shortcomings in other commonly used methodologies Brendan Gregg, Joyent Performance issues can be complex and mysterious, providing little or no clue to their origin. In the absence of a starting point—or a methodology to provide one—performance issues are often analyzed randomly: guessing where the problem may be and then changing things until it goes away. While this can deliver results—if you guess correctly—it can also be time-consuming, disruptive, and may ultimately overlook certain issues. This article describes system-performance issues and the methodologies in use today for analyzing them, and it proposes a new methodology for approaching and solving a class of issues. Systems-performance analysis is complex because of the number of components in a typical system and their interactions. An environment may be composed of databases, Web servers, load balancers, and custom applications, all running upon operating systems—either bare-metal or virtual. And that’s just the software. Hardware and firmware, including external storage systems and network infrastructure, add many more components to the environment, any of which is a potential source of issues. Each of these components may require its own field of expertise, and a company may not have staff knowledgeable about all the components in its environment. Performance issues may also arise from complex interactions between components that work well in isolation. Solving this type of problem may require multiple domains of expertise to work together. As an example of such complexity within an environment, consider a mysterious performance issue we encountered at Joyent for a cloud-computing customer: the problem appeared to be a memory leak, but from an unknown location.
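The USE method the article proposes (for each resource, check Utilization, Saturation, and Errors) can be turned into a quick first-pass script. A minimal sketch, assuming a Linux /proc with standard field layouts; it takes coarse since-boot readings rather than the interval sampling a real check would use:

```shell
#!/bin/sh
# USE method quick pass: Utilization, Saturation, Errors per resource.

# CPU utilization: aggregate non-idle ticks since boot (coarse).
cpu_line=$(head -1 /proc/stat)            # "cpu  user nice system idle iowait ..."
idle=$(echo "$cpu_line" | awk '{print $5}')
total=$(echo "$cpu_line" | awk '{s=0; for (i=2; i<=NF; i++) s+=$i; print s}')
echo "CPU busy ticks: $((total - idle)) of $total"

# CPU saturation: load average vs. CPU count (a rough proxy for run-queue length).
echo "Load average: $(awk '{print $1}' /proc/loadavg) on $(nproc) CPUs"

# Memory utilization: MemAvailable vs. MemTotal.
awk '/MemTotal|MemAvailable/ {print}' /proc/meminfo

# Memory saturation: swap-in/out counters (nonzero growth indicates pressure).
grep -E '^pswp' /proc/vmstat 2>/dev/null || true

# Network errors: per-interface receive/transmit error counters.
awk 'NR>2 {gsub(/:/," "); print $1, "rx_errs="$4, "tx_errs="$12}' /proc/net/dev
```

This is only a starting point: it flags which resource deserves drill-down analysis, not the root cause.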
  • Improving Route Scalability: Nexthops As Separate Objects
    Improving Route Scalability: Nexthops as Separate Objects. September 2019. David Ahern, Cumulus Networks. Agenda: executive summary ("if you remember nothing else about this talk..."); driving use case; review of the legacy route API; dive into the nexthop API; benefits of the new API. With the legacy route API, every route carries its own copy of the nexthop data (prefix/len, dev, gateway). The nexthop API splits next hops from routes: routes reference standalone nexthop objects by id, and nexthop groups can reference multiple nexthop objects for a single route. This dramatically improves route scalability, with the potential for constant insert times. A networking operating system using Linux APIs: a routing daemon or utility (FRR, ip) manages entries in kernel FIBs via rtnetlink APIs, which enables other control-plane software to use Linux networking APIs for data-path connections, stats, and troubleshooting. Management of hardware offload is separate, keeping hardware in sync with the kernel: a userspace driver with an SDK leverages kernel FIB notifications to program the switch ASIC. NOS with switchdev driver: an in-kernel switchdev driver leverages
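The route/nexthop split the slides describe maps onto iproute2 commands (nexthop objects landed around Linux 5.3). A hedged sketch; the addresses and device name are placeholders, and these are privileged configuration commands requiring root:

```shell
# Legacy API: each route carries its own gateway/device ("nexthop info").
ip route add 198.51.100.0/24 via 192.0.2.1 dev eth0

# Nexthop API: create the nexthop once as a standalone object...
ip nexthop add id 1 via 192.0.2.1 dev eth0
# ...then have many routes reference it by id. Replacing nexthop id 1
# (e.g., after a link failure) updates every route that points at it.
ip route add 203.0.113.0/24 nhid 1

# Nexthop groups allow ECMP across several nexthop objects.
ip nexthop add id 2 via 192.0.2.2 dev eth0
ip nexthop add id 100 group 1/2
ip route replace 203.0.113.0/24 nhid 100
```

The scalability win comes from the indirection: a gateway change touches one nexthop object instead of every route that uses it.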
  • Performance Counters in Htop 3.0
    Talk: Performance counters in htop 3.0. Presenter: Hisham Muhammad, @[email protected], https://hisham.hm. Date: 2018-08-25. (Slide theme: PID USER PRI NI VIRT RES SHR S CPU% MEM% TIME+ Command.) About me: original author of htop, a project started in 2004 (http://hisham.hm/htop/); lead dev of LuaRocks, package manager for Lua (http://luarocks.org/); co-founder of the GoboLinux distribution (http://gobolinux.org/); developer at Kong, FLOSS API gateway (http://getkong.org/) (we're hiring!). What is htop: an interactive process manager, intended to be "a better top"; by this all I originally meant was: scrolling! (versions of top improved a lot since!). Hello, htop! htop beyond Linux: Linux, MacOS, FreeBSD, OpenBSD, DragonFlyBSD, Solaris (illumos). Then Apple released a broken kernel..
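htop 3.0 reads these per-process counters via the kernel's perf_event_open(2) interface; the same hardware counters can be inspected from the shell with perf(1). A sketch that degrades gracefully where perf is missing or PMU access is restricted:

```shell
#!/bin/sh
# Count cycles and instructions (the inputs to IPC) for a short command.
if command -v perf >/dev/null 2>&1; then
    # perf stat writes its counts to stderr; access may be restricted by
    # /proc/sys/kernel/perf_event_paranoid, hence the fallback message.
    perf stat -e cycles,instructions -- sleep 0.2 2>&1 || \
        echo "PMU access unavailable"
else
    echo "perf(1) not installed"
fi
```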
  • 20 Linux System Monitoring Tools Every Sysadmin Should Know by Nixcraft on June 27, 2009 · 315 Comments · Last Updated November 6, 2012
    Need to monitor Linux server performance? Try these built-in commands and a few add-on tools. Most Linux distributions are equipped with tons of monitoring tools. These tools provide metrics which can be used to get information about system activities. You can use these tools to find the possible causes of a performance problem. The commands discussed below are some of the most basic commands when it comes to system analysis and debugging server issues such as: 1. Finding out bottlenecks. 2. Disk (storage) bottlenecks. 3. CPU and memory bottlenecks. 4. Network bottlenecks. #1: top - Process Activity Command. The top program provides a dynamic real-time view of a running system, i.e. actual process activity. By default, it displays the most CPU-intensive tasks running on the server and updates the list every five seconds. (Fig.01: Linux top command.) Commonly used hot keys: the top command provides several useful hot keys. t: Displays summary information off and on. m: Displays memory information off and on. A: Sorts the display by top consumers of various system resources. Useful for quick identification of performance-hungry tasks on a system. f: Enters an interactive configuration screen for top. Helpful for setting up top for a specific task. o: Enables you to interactively select the ordering within top. r: Issues renice command.
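The interactive hot keys above have non-interactive equivalents that are handy in scripts and cron jobs. A sketch assuming procps-ng versions of top and ps:

```shell
#!/bin/sh
# -b batch mode, -n 1 one iteration: the same summary header and process
# table as interactive top, but suitable for logging to a file.
if command -v top >/dev/null 2>&1; then
    top -b -n 1 | head -15
fi

# The "A" hot key's top-consumers view, scripted with ps instead:
# sort all processes by CPU usage, descending, and show the top five.
ps -eo pid,user,%cpu,%mem,comm --sort=-%cpu | head -6 || true
```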
  • Linux: Kernel Release Number, Part II
    Linux: Kernel Release Number, Part II Posted by jeremy on Friday, March 4, 2005 - 07:05 In the continued discussion on release numbering for the Linux kernel [story], Linux creator Linus Torvalds decided against trying to add meaning to the odd/even least significant number. Instead, the new plan is to go from the current 2.6.x numbering to a finer-grained 2.6.x.y. Linus will continue to maintain only the 2.6.x releases, and the -rc releases in between. Others will add trivial patches to create the 2.6.x.y releases. Linus cautions that the task of maintaining a 2.6.x.y tree is not going to be enjoyable: "I'll tell you what the problem is: I don't think you'll find anybody to do the parallell 'only trivial patches' tree. They'll go crazy in a couple of weeks. Why? Because it's a _damn_ hard problem. Where do you draw the line? What's an acceptable patch? And if you get it wrong, people will complain _very_ loudly, since by now you've 'promised' them a kernel that is better than the mainline. In other words: there's almost zero glory, there are no interesting problems, and there will absolutely be people who claim that you're a dick-head and worse, probably on a weekly basis." He went on to add, "that said, I think in theory it's a great idea. It might even be technically feasible if there was some hard technical criteria for each patch that gets accepted, so that you don't have the burn-out problem." His suggested criteria included limiting the patch to being 100 lines or less, and requiring that it fix an oops, a hang, or an exploitable security hole.
  • Reducing Power Consumption in Mobile Devices by Using a Kernel
    IEEE TRANSACTIONS ON MOBILE COMPUTING, VOL. Z, NO. B, AUGUST 2017 1 Reducing Event Latency and Power Consumption in Mobile Devices by Using a Kernel-Level Display Server Stephen Marz, Member, IEEE and Brad Vander Zanden and Wei Gao, Member, IEEE E-mail: [email protected], [email protected], [email protected] Abstract—Mobile devices differ from desktop computers in that they have a limited power source, a battery, and they tend to spend more CPU time on the graphical user interface (GUI). These two facts force us to consider different software approaches in the mobile device kernel that can conserve battery life and reduce latency, which is the duration of time between the inception of an event and the reaction to the event. One area to consider is a software package called the display server. The display server is middleware that handles all GUI activities between an application and the operating system, such as event handling and drawing to the screen. In both desktop and mobile devices, the display server is located in the application layer. However, the kernel layer contains most of the information needed for handling events and drawing graphics, which forces the application-level display server to make a series of system calls in order to coordinate events and to draw graphics. These calls interrupt the CPU which can increase both latency and power consumption, and also require the kernel to maintain event queues that duplicate event queues in the display server. A further drawback of placing the display server in the application layer is that the display server contains most of the information required to efficiently schedule the application and this information is not communicated to existing kernels, meaning that GUI-oriented applications are scheduled less efficiently than they might be, which further increases power consumption.
  • Jitk: a Trustworthy In-Kernel Interpreter Infrastructure
    Jitk: A trustworthy in-kernel interpreter infrastructure. Xi Wang, David Lazar, Nickolai Zeldovich, Adam Chlipala, Zachary Tatlock. MIT and University of Washington. Modern OSes run untrusted user code in the kernel through in-kernel interpreters: Seccomp (sandboxing, Linux), BPF (packet filtering), INET_DIAG (socket monitoring), DTrace (instrumentation). These are critical to overall system security: any interpreter bugs are serious! Many bugs have been found in interpreters. Kernel-space bugs: control-flow errors (incorrect jump offset, ...), arithmetic errors (incorrect result, ...), memory errors (buffer overflow, ...), information leaks (uninitialized read). Kernel-user interface bugs: incorrect encoding/decoding. User-space bugs: incorrect input generated by tools/libraries. Some have security consequences: CVE-2014-2889, ... See our paper for a case study of bugs. How to get rid of all these bugs at once? Theorem proving can help kill all these bugs: seL4, a provably correct microkernel [SOSP'09]; CompCert, a provably correct C compiler [CACM'09]. This talk: Jitk, a provably correct interpreter for running untrusted user code, a drop-in replacement for Linux's seccomp, built using the Coq proof assistant and CompCert. Theorem proving, an overview: specification, proof, implementation. The proof is machine-checkable (Coq proof assistant), and shows that a correct specification implies a correct implementation. The specification should be much simpler than the implementation. Challenges: What is the specification? How to translate systems properties into proofs? How to extract a running
  • Mesalock Linux: Towards a Memory-Safe Linux Distribution
    MesaLock Linux: towards a memory-safe Linux distribution. Mingshen Sun, MesaLock Linux maintainer, Baidu X-Lab, USA. Shanghai Jiao Tong University, 2018. whoami: senior security researcher at Baidu X-Lab, Baidu USA; PhD, The Chinese University of Hong Kong; works on system security, mobile security, IoT security, and car hacking; projects include MesaLock Linux, TaintART, Pass for iOS, etc.; mssun @ GitHub | https://mssun.me. MesaLock Linux: why, what, how. Why: memory corruption occurs in a computer program when the contents of a memory location are unintentionally modified; this is termed violating memory safety. Memory safety is the state of being protected from various software bugs and security vulnerabilities when dealing with memory access, such as buffer overflows and dangling pointers. Stack buffer overflow demo: https://youtu.be/T03idxny9jE. Types of memory errors: access errors, buffer overflow, race condition, use after free, uninitialized variables, memory leak, double free. Memory safety in user space: CVE-2017-13089, wget stack-based buffer overflow in HTTP protocol handling. A stack-based buffer overflow when processing chunked, encoded HTTP responses was found in wget. By tricking an unsuspecting user into connecting to a malicious HTTP server, an attacker could exploit this flaw to potentially execute arbitrary code. https://bugzilla.redhat.com/show_bug.cgi?id=1505444; PoC: https://github.com/r1b/CVE-2017-13089. What: a Linux distribution with a memory-safe user space. A Linux distribution (often abbreviated as distro) is an operating system made from a software collection, which is based upon the Linux kernel and, often, a package management system. Linux distros: server (CentOS, Fedora, RedHat, Debian); desktop (Ubuntu); mobile (Android); embedded (OpenWRT, Yocto); hard-core (Arch Linux, Gentoo); misc (ChromeOS, Alpine Linux). Security and safety? Gentoo Hardened: enables several risk-mitigating options in the toolchain, supports PaX, grSecurity, SELinux, TPE and more.
  • Kernel Boot-Time Tracing
    Kernel Boot-time Tracing Linux Plumbers Conference 2019 - Tracing Track Masami Hiramatsu <[email protected]> Linaro, Ltd. Speaker Masami Hiramatsu - Working for Linaro and Linaro members - Tech Lead for a Landing team - Maintainer of Kprobes and related tracing features/tools Why Kernel Boot-time Tracing? Debug and analyze boot time errors and performance issues - Measure performance statistics of kernel boot - Analyze driver init failure - Debug boot up process - Continuously tracing from boot time etc. What We Have There are already many ftrace options on kernel command line ● Setup options (trace_options=) ● Output to printk (tp_printk) ● Enable events (trace_events=) ● Enable tracers (ftrace=) ● Filtering (ftrace_filter=,ftrace_notrace=,ftrace_graph_filter=,ftrace_graph_notrace=) ● Add kprobe events (kprobe_events=) ● And other options (alloc_snapshot, traceoff_on_warning, ...) See Documentation/admin-guide/kernel-parameters.txt Example of Kernel Cmdline Parameters In grub.conf linux /boot/vmlinuz-5.1 root=UUID=5a026bbb-6a58-4c23-9814-5b1c99b82338 ro quiet splash tp_printk trace_options=”sym-addr” trace_clock=global ftrace_dump_on_oops trace_buf_size=1M trace_event=”initcall:*,irq:*,exceptions:*” kprobe_event=”p:kprobes/myevent foofunction $arg1 $arg2;p:kprobes/myevent2 barfunction %ax” What Issues? Size limitation ● kernel cmdline size is small (< 256bytes) ● A half of the cmdline is used for normal boot Only partial features supported ● ftrace has too complex features for single command line ● per-event filters/actions, instances, histograms. Solutions? 1. Use initramfs - Too late for kernel boot time tracing 2. Expand kernel cmdline - It is not easy to write down complex tracing options on bootloader (Single line options is too simple) 3. Reuse structured boot time data (Devicetree) - Well documented, structured data -> V1 & V2 series based on this. Boot-time Trace: V1 and V2 series V1 and V2 series posted at June.
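Putting a few of the parameters listed above together: a hedged grub.conf sketch (the kernel path is illustrative and the root device is deliberately left elided) that enables tracepoints and a tracer from early boot:

```shell
# grub.conf sketch: boot-time ftrace via kernel command-line parameters.
# trace_event=        enable tracepoints before initcalls run
# ftrace=             select a tracer (here the function graph tracer)
# trace_buf_size=     size the per-CPU ring buffer
# ftrace_dump_on_oops dump the trace buffer to the console on a crash
linux /boot/vmlinuz-5.1 root=UUID=... ro quiet \
    ftrace=function_graph \
    trace_event="initcall:*,irq:*" \
    trace_buf_size=1M \
    ftrace_dump_on_oops
```

This illustrates the size problem the slides raise: even this modest configuration consumes a large share of the command-line budget.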
  • iVoyeur: inotify
    COLUMNS. iVoyeur: inotify. Dave Josephsen. (Dave Josephsen is the author of Building a Monitoring Infrastructure with Nagios (Prentice Hall PTR, 2007) and is Senior Systems Engineer at DBG, Inc., where he maintains a gaggle of geographically dispersed server farms. He won LISA '04's Best Paper award for his co-authored work on spam mitigation, and he donates his spare time to the SourceMage GNU Linux Project. [email protected]) The last time I changed jobs, the magnitude of the change didn't really sink in until the morning of my first day, when I took a different combination of freeways to work. The difference was accentuated by the fact that the new commute began the same as the old one, but on this morning, at a particular interchange, I would zig where before I zagged. It was an unexpectedly emotional and profound metaphor for the change. My old place was off to the side, and down below, while my future was straight ahead, and literally under construction. The fact that it was under construction was poetic but not surprising. Most of the roads I travel in the Dallas/Fort Worth area are under construction and have been for as long as anyone can remember. And I don't mean a lane closed here or there. Our roads drift and wander like leaves in the water—here today and tomorrow over there. The exits and entrances, neither a part of this road or that, seem unable to anticipate the movements of their brethren, and are constantly forced to react.
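inotify, the kernel file-event facility this column goes on to cover, is easy to try from the shell with inotifywait(1) from the inotify-tools package (assumed installed; the sketch skips itself otherwise):

```shell
#!/bin/sh
# Create a scratch directory, schedule a file creation in the background,
# and block until inotify reports the create event (or a 10 s timeout).
dir=$(mktemp -d)
( sleep 1; touch "$dir/hello" ) &    # simulate another process writing

if command -v inotifywait >/dev/null 2>&1; then
    inotifywait -q -t 10 -e create "$dir" && echo "create event seen"
else
    echo "inotify-tools not installed"
fi

wait
rm -rf "$dir"
```

The same events are available programmatically via inotify_init(2)/inotify_add_watch(2), which is what monitoring tools build on.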
  • Chapter 1. Origins of Mac OS X
    Chapter 1. Origins of Mac OS X. "Most ideas come from previous ideas." (Alan Curtis Kay) The Mac OS X operating system represents a rather successful coming together of paradigms, ideologies, and technologies that have often resisted each other in the past. A good example is the cordial relationship that exists between the command-line and graphical interfaces in Mac OS X. The system is a result of the trials and tribulations of Apple and NeXT, as well as their user and developer communities. Mac OS X exemplifies how a capable system can result from the direct or indirect efforts of corporations, academic and research communities, the Open Source and Free Software movements, and, of course, individuals. Apple has been around since 1976, and many accounts of its history have been told. If the story of Apple as a company is fascinating, so is the technical history of Apple's operating systems. In this chapter,[1] we will trace the history of Mac OS X, discussing several technologies whose confluence eventually led to the modern-day Apple operating system. [1] This book's accompanying web site (www.osxbook.com) provides a more detailed technical history of all of Apple's operating systems. 1.1. Apple's Quest for the[2] Operating System. [2] Whereas the word "the" is used here to designate prominence and desirability, it is an interesting coincidence that "THE" was the name of a multiprogramming system described by Edsger W. Dijkstra in a 1968 paper. It was March 1988. The Macintosh had been around for four years.
  • Linux Pocket Guide.Pdf
    Linux Pocket Guide: Essential Commands, 3rd Edition. Daniel J. Barrett. Copyright © 2016 Daniel Barrett. All rights reserved. Printed in the United States of America. Published by O'Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472. O'Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://safaribooksonline.com). For more information, contact our corporate/institutional sales department: 800-998-9938 or [email protected]. Editor: Nan Barber. Production Editor: Nicholas Adams. Copyeditor: Jasmine Kwityn. Proofreader: Susan Moritz. Indexer: Daniel Barrett. Interior Designer: David Futato. Cover Designer: Karen Montgomery. Illustrator: Rebecca Demarest. June 2016: Third Edition. Revision history for the third edition: 2016-05-27, first release. See http://oreilly.com/catalog/errata.csp?isbn=9781491927571 for release details. The O'Reilly logo is a registered trademark of O'Reilly Media, Inc. Linux Pocket Guide, the cover image, and related trade dress are trademarks of O'Reilly Media, Inc. While the publisher and the author have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the author disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work. Use of the information and instructions contained in this work is at your own risk. If any code samples or other technology this work contains or describes is subject to open source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such licenses and/or rights.