RHEL 7 Performance Tuning

Total Page:16

File Type:pdf, Size:1020Kb

RHEL 7 Performance Tuning RHEL 7 Performance Tuning Joe Mario Senior Principal Software Engineer Sept 22, 2016 1 RED HAT CONFIDENTIAL #rhconvergence Agenda ● The RH performance team ● Problem areas where we spend most of our time: ● System tuning ● Numa issues DividerBackup Slide ● RHEL 7 performance enhancements ● Where to learn more. ● Backup slides – if interest and time permits. 2 RED HAT CONFIDENTIAL #rhconvergence •Performance Engineering Team 30+ Engineers around the globe Striving for best performance for - Micro-Benchmarks - Applications/Benchmarks - Application Scaling - OS Scaling - New performance feature support - Partner support - Customer support ... 3 RED HAT CONFIDENTIAL #rhconvergence RHEL Platform(s) Performance Coverage Benchmarks Application Performance ● CPU – linpack, lmbench ● Linpack MPI, SPEC CPU, SPECjbb 05/13 ● Memory – lmbench, McCalpin STREAM ● AIM 7 – shared, filesystem, db, ● Disk IO – iozone, fio – SCSI, FC, iSCSI compute ● Filesystems – iozone, ext3/4, xfs, gfs2, ● Database: DB2, Oracle 11/12, gluster Sybase 15.x , MySQL, MariaDB, ● Networks – netperf – 10/40Gbit, PostgreSQL, Mongo Infiniband/RoCE, Bypass ● OLTP – Bare Metal, KVM, RHEV-M clusters – TPC-C/Virt ● Bare Metal, RHEL6/7 KVM ● DSS – Bare Metal, KVM, RHEV-M, ● White box AMD/Intel, with our OEM IQ, TPC-H/Virt partners ● SPECsfs NFS ● TPC-C, TPC-H based worloads for database testing ● SAP – SLCS, SD, HANA ● YCSB and SPECvirt for virtualizaton testing 4 RED HAT CONFIDENTIAL Red#rhconvergence Hat Confidential RHEL Performance Evolution RHEL5 RHEL6 RHEL7 RH Cloud Static Hugepages Transparent Tuned - RHEV tuned Hugepages throughput- profile CPU Sets performance Tuned - Choose (default) RHEL OSP7 Ktune on/of Profile Tuned, NUMA, Automatic NUMA- SR-IOV CPU Affinity NUMAD - balancing kernel (taskset) userspace scheduler RHEL Atomic Host, Atomic Ent NUMA Pinning cgroups Containers/Docker (numactl) OpenShift v3 irqbalance - Irqbalance - irqbalance NUMA enhanced NUMA enhanced CloudForms 5 RED HAT CONFIDENTIAL #rhconvergence Agenda Problem areas where we spend most of our time. First: System tuning – with “tuned” DividerBackup Slide 6 RED HAT CONFIDENTIAL #rhconvergence Performance Metrics Latency==Speed. Throughput==Bandwidth Latency – Speed Limit Throughput – Bandwidth - # lanes in Highway - Ghz of CPU, Memory PCI - Width of data path / cachelines - Small transfers, disable - Bus Bandwidth, QPI links, PCI 1-2-3 aggregation – TCP nodelay - Network 1 / 10 / 40 Gb – aggregation, NAPI - Dataplane optimization DPDK - Fiberchannel 4/8/16, SSD, NVME Drivers What is “tuned” ? • Tuning profile delivery mechanism • Red Hat ships tuned profiles that improve performance for many workloads...hopefully yours! • Customize your own profile. 8 RED HAT CONFIDENTIAL #rhconvergence Is your OS is optimally tuned? What is my system currently tuned to? •# tuned-adm active Current active profile: balanced How do I change my current tuning setting? •# tuned-adm profile network-latency 9 RED HAT CONFIDENTIAL #rhconvergence Tuned (cont) # tuned-adm list Available profiles: - balanced - desktop - latency-performance - network-latency - network-throughput - powersave - throughput-performance - virtual-guest - virtual-host Current active profile: balanced 10 RED HAT CONFIDENTIAL #rhconvergence Tuned Updates for RHEL7 ● Installed by default! ● Profiles updated for RHEL7 features and characteristics ● Profiles automatically set based on installation – Desktop/Workstation: balanced profile – Server/HPC: throughput-performance profile ● Optional hook/callout capability ● Concept of Inheritance (just like httpd.conf) 11 RED HAT CONFIDENTIAL #rhconvergence Tuned: Network Latency Performance Boost Performance Latency Network Tuned: 12 Latency (Microseconds) C-state lock improves determinism, reduces jitter reduces determinism, C-state lock improves 250 100 150 200 50 0 C6 C6 C3 C1 C0 Time (1-sec intervals) (1-sec Time RED HAT CONFIDENTIAL HAT RED #rhconvergence Max Tuned: Storage Performance Boost RHEL7 File System In Cache Perf Intel I/O (iozone - geoM 1m-4g, 4k-1m) 4500 4000 c e S 3500 / B 3000 M 2500 n i t 2000 u p 1500 h g 1000 u o r 500 h T 0 ext3 ext4 xfs gfs2 not tuned tuned Larger is better 13 RED HAT CONFIDENTIAL #rhconvergence throughput-performance (RHEL7 default) • governor=performance • energy_perf_bias=performance • min_perf_pct=100 • readahead=4096 • kernel.sched_min_granularity_ns = 10000000 • kernel.sched_wakeup_granularity_ns = 15000000 • vm.dirty_background_ratio = 10 • vm.swappiness=10 14 RED HAT CONFIDENTIAL #rhconvergence Tuned: Profile Inheritance (throughput) throughput-performance governor=performance energy_perf_bias=performance min_perf_pct=100 readahead=4096 kernel.sched_min_granularity_ns = 10000000 kernel.sched_wakeup_granularity_ns = 15000000 vm.dirty_background_ratio = 10 vm.swappiness=10 network-throughput net.ipv4.tcp_rmem="4096 87380 16777216" net.ipv4.tcp_wmem="4096 16384 16777216" net.ipv4.udp_mem="3145728 4194304 16777216" 15 RED HAT CONFIDENTIAL #rhconvergence Tuned: Profile Inheritance Parents throughput-performance balanced latency-performance Children network-throughput desktop network-latency virtual-host virtual-guest Your-Web Your-DB Your-Middleware 16 RED HAT CONFIDENTIAL | Joe Mario #rhconvergence Agenda Two hottest categories where we spend most of our time: Second: NUMA issues DividerBackup Slide 17 RED HAT CONFIDENTIAL #rhconvergence Typical NUMA System Node 0 Node 1 Node 0 RAM Node 1 RAM Core 0 L3 Core 1 Core 0 L3 Core 1 Cache Cache Core 2 Core 3 Core 2 Core 3 QPI links, IO, etc. QPI links, IO, etc. Node 2 Node 3 Node 2 RAM Node 3 RAM Core 0 L3 Core 1 Core 0 L3 Core 1 Cache Cache Core 2 Core 3 Core 2 Core 3 QPI links, IO, etc. QPI links, IO, etc. 18 RED HAT CONFIDENTIAL #rhconvergence Goal: Align process memory and CPU threads within nodes Before: processes use After: processes have cpus & memory from more “localized” cpu & multiple nodes. memory usage. Node 0 Node 1 Node 2 Node 3 Node 0 Node 1 Node 2 Node 3 Process 37 Process 29 Proc Proc Proc 19 Proc 37 Process 19 29 61 Process 61 19 RED HAT CONFIDENTIAL #rhconvergence Numa Due Diligence ● Know your hardware ● lstopo ● numactl --hardware ● Install adapters “close” to the CPU that will run the critical application ● When BIOS reports locality, irqbalance handles NUMA/IRQ affinity automatically. 20 RED HAT CONFIDENTIAL #rhconvergence Numa Due Diligence (cont) ● Know your application's memory usage ● numastat -mcv <proc_name> ● Understand where processes are executing and the memory they access ● Run “top”, then enter “f”, then select “Last used cpu” field ● ps -T -o pid,tid,psr,comm ● Use process placement tools ● numactl, taskset ● mbind, set_mempolicy, sched_setaffinity, pthread_setaffinity_np ● Virtualized environments – just as important! 21 RED HAT CONFIDENTIAL #rhconvergence numastat: per-node meminfo (new) # numastat -mczs Node 0 Node 1 Total ------ ------ ------ MemTotal 65491 65536 131027 MemFree 60366 59733 120099 MemUsed 5124 5803 10927 Active 2650 2827 5477 FilePages 2021 3216 5238 Active(file) 1686 2277 3963 Active(anon) 964 551 1515 AnonPages 964 550 1514 Inactive 341 946 1287 Inactive(file) 340 946 1286 Slab 380 438 818 SReclaimable 208 207 415 SUnreclaim 173 230 403 AnonHugePages 134 236 370 22 RED HAT CONFIDENTIAL #rhconvergence numastat – per-PID mode # numastat -c java (default scheduler – non-optimal) Per-node process memory usage (in MBs) PID Node 0 Node 1 Node 2 Node 3 Total ------------ ------ ------ ------ ------ ----- 57501 (java) 755 1121 480 698 3054 57502 (java) 1068 702 573 723 3067 57503 (java) 649 1129 687 606 3071 57504 (java) 1202 678 1043 150 3073 ------------ ------ ------ ------ ------ ----- Total 3674 3630 2783 2177 12265 # numastat -c java (numabalance close to opt) Per-node process memory usage (in MBs) PID Node 0 Node 1 Node 2 Node 3 Total ------------ ------ ------ ------ ------ ----- 56918 (java) 49 2791 56 37 2933 56919 (java) 2769 76 55 32 2932 56920 (java) 19 55 77 2780 2932 56921 (java) 97 65 2727 47 2936 ------------ ------ ------ ------ ------ ----- Total 2935 2987 2916 2896 11734 23 RED HAT CONFIDENTIAL #rhconvergence Visualize CPUs via lstopo 24 RED HAT CONFIDENTIAL #rhconvergence Visualize NUMA Topology: lstopo NUMA Node 0 NUMA Node 1 How can I visualize my system's NUMA PCI Devices topology in Red Hat Enterprise Linux? https://access.redhat.com/site/solutions/62879 25 RED HAT CONFIDENTIAL #rhconvergence NUMA layout via numactl # numactl --hardware available: 4 nodes (0-3) node 0 cpus: 0 4 8 12 16 20 24 28 32 36 node 0 size: 65415 MB node 0 free: 63482 MB node 1 cpus: 2 6 10 14 18 22 26 30 34 38 node 1 size: 65536 MB node 1 free: 63968 MB node 2 cpus: 1 5 9 13 17 21 25 29 33 37 node 2 size: 65536 MB node 2 free: 63897 MB node 3 cpus: 3 7 11 15 19 23 27 31 35 39 node 3 size: 65536 MB node 3 free: 63971 MB node distances: node 0 1 2 3 0: 10 21 21 21 1: 21 10 21 21 2: 21 21 10 21 3: 21 21 21 10 26 RED HAT CONFIDENTIAL #rhconvergence RHEL NUMA Scheduler ● RHEL6 ● numactl, numastat enhancements ● numad – usermode tool, dynamically monitor, auto-tune ● RHEL7 – auto numa balancing ● Moves tasks (threads or processes) closer to the memory they are accessing. ● Moves application data to memory closer to the tasks that reference it. ● A win for most apps. ● Enable / Disable ● sysctl kernel.numabalancing={0,1} 27 RED HAT CONFIDENTIAL #rhconvergence NUMA Performance – SPECjbb2005 on DL980 Westmere EX RHEL7 Auto-Numa-Balance SPECjbb2005 multi-instance - bare metal + kvm 8 socket, 80 cpu, 1TB mem 3500000 3000000 RHEL7 – No_NUMA 2500000 Auto-Numa-Balance – BM Auto-Numa-Balance
Recommended publications
  • FIPS 140-2 Non-Proprietary Security Policy
    Kernel Crypto API Cryptographic Module version 1.0 FIPS 140-2 Non-Proprietary Security Policy Version 1.3 Last update: 2020-03-02 Prepared by: atsec information security corporation 9130 Jollyville Road, Suite 260 Austin, TX 78759 www.atsec.com © 2020 Canonical Ltd. / atsec information security This document can be reproduced and distributed only whole and intact, including this copyright notice. Kernel Crypto API Cryptographic Module FIPS 140-2 Non-Proprietary Security Policy Table of Contents 1. Cryptographic Module Specification ..................................................................................................... 5 1.1. Module Overview ..................................................................................................................................... 5 1.2. Modes of Operation ................................................................................................................................. 9 2. Cryptographic Module Ports and Interfaces ........................................................................................ 10 3. Roles, Services and Authentication ..................................................................................................... 11 3.1. Roles .......................................................................................................................................................11 3.2. Services ...................................................................................................................................................11
    [Show full text]
  • Metrics for Performance Advertisement
    Metrics for Performance Advertisement Vojtěch Horký Peter Libič Petr Tůma 2010 – 2021 This work is licensed under a “CC BY-NC-SA 3.0” license. Created to support the Charles University Performance Evaluation lecture. See http://d3s.mff.cuni.cz/teaching/performance-evaluation for details. Contents 1 Overview 1 2 Operation Frequency Related Metrics 2 3 Operation Duration Related Metrics 3 4 Benchmark Workloads 4 1 Overview Performance Advertisement Measuring for the purpose of publishing performance information. Requirements: – Well defined meaning. – Simple to understand. – Difficult to game. Pitfalls: – Publication makes results subject to pressure. – Often too simple to convey meaningful information. – Performance in contemporary computer systems is never simple. Speed Related Metrics Responsiveness Metrics – Time (Task Response Time) – How long does it take to finish the task ? Productivity Metrics – Rate (Task Throughput) – How many tasks can the system complete per time unit ? Utilization Metrics – Resource Use (Utilization) – How much is the system loaded when working on a task ? – Share of time a resource is busy or over given load level. – Helps identify bottlenecks (most utilized resources in the system). 1 Metric for Webmail Performance What metric would you choose to characterize performance of a web mail site ? – User oriented metric would be end-to-end operation time. – Server oriented metric would be request processing time. – How about metrics in between ? – And would you include mail delivery time ? How is throughput related to latency ? How is utilization defined for various resources ? 2 Operation Frequency Related Metrics Clock Rate Clock rate (frequency) of the component (CPU, bus, memory) in MHz. Most often we talk about CPU frequency.
    [Show full text]
  • The Nutanix Design Guide
    The FIRST EDITION Nutanix Design Guide Edited by AngeloLuciani,VCP By Nutanix, RenévandenBedem, incollaborationwith NPX,VCDX4,DECM-EA RoundTowerTechnologies Table of Contents 1 Foreword 3 2 Contributors 5 3 Acknowledgements 7 4 Using This Book 7 5 Introduction To Nutanix 9 6 Why Nutanix? 17 7 The Nutanix Eco-System 37 8 Certification & Training 43 9 Design Methodology 49 & The NPX Program 10 Channel Charter 53 11 Mission-Critical Applications 57 12 SAP On Nutanix 69 13 Hardware Platforms 81 14 Sizer & Collector 93 15 IBM Power Systems 97 16 Remote Office, Branch Office 103 17 Xi Frame & EUC 113 Table of Contents 18 Xi IoT 121 19 Xi Leap, Data Protection, 133 DR & Metro Availability 20 Cloud Management 145 & Automation: Calm, Xi Beam & Xi Epoch 21 Era 161 22 Karbon 165 23 Acropolis Security & Flow 169 24 Files 193 25 Volumes 199 26 Buckets 203 27 Prism 209 28 Life Cycle Manager 213 29 AHV 217 30 Move 225 31 X-Ray 229 32 Foundation 241 33 Data Center Facilities 245 34 People & Process 251 35 Risk Management 255 The Nutanix Design Guide 1 Foreword Iamhonoredtowritethisforewordfor‘TheNutanixDesignGuide’. Wehavealwaysbelievedthatcloudwillbemorethanjustarented modelforenterprises.Computingwithintheenterpriseisnuanced, asittriestobalancethefreedomandfriction-freeaccessofthe publiccloudwiththesecurityandcontroloftheprivatecloud.The privateclouditselfisspreadbetweencoredatacenters,remoteand branchoffices,andedgeoperations.Thetrifectaofthe3laws–(a) LawsoftheLand(dataandapplicationsovereignty),(b)Lawsof Physics(dataandmachinegravity),and(c)LawsofEconomics
    [Show full text]
  • The Xen Port of Kexec / Kdump a Short Introduction and Status Report
    The Xen Port of Kexec / Kdump A short introduction and status report Magnus Damm Simon Horman VA Linux Systems Japan K.K. www.valinux.co.jp/en/ Xen Summit, September 2006 Magnus Damm ([email protected]) Kexec / Kdump Xen Summit, September 2006 1 / 17 Outline Introduction to Kexec What is Kexec? Kexec Examples Kexec Overview Introduction to Kdump What is Kdump? Kdump Kernels The Crash Utility Xen Porting Effort Kexec under Xen Kdump under Xen The Dumpread Tool Partial Dumps Current Status Magnus Damm ([email protected]) Kexec / Kdump Xen Summit, September 2006 2 / 17 Introduction to Kexec Outline Introduction to Kexec What is Kexec? Kexec Examples Kexec Overview Introduction to Kdump What is Kdump? Kdump Kernels The Crash Utility Xen Porting Effort Kexec under Xen Kdump under Xen The Dumpread Tool Partial Dumps Current Status Magnus Damm ([email protected]) Kexec / Kdump Xen Summit, September 2006 3 / 17 Kexec allows you to reboot from Linux into any kernel. as long as the new kernel doesn’t depend on the BIOS for setup. Introduction to Kexec What is Kexec? What is Kexec? “kexec is a system call that implements the ability to shutdown your current kernel, and to start another kernel. It is like a reboot but it is indepedent of the system firmware...” Configuration help text in Linux-2.6.17 Magnus Damm ([email protected]) Kexec / Kdump Xen Summit, September 2006 4 / 17 . as long as the new kernel doesn’t depend on the BIOS for setup. Introduction to Kexec What is Kexec? What is Kexec? “kexec is a system call that implements the ability to shutdown your current kernel, and to start another kernel.
    [Show full text]
  • “Freedom” Koan-Sin Tan [email protected] OSDC.Tw, Taipei Apr 11Th, 2014
    Understanding Android Benchmarks “freedom” koan-sin tan [email protected] OSDC.tw, Taipei Apr 11th, 2014 1 disclaimers • many of the materials used in this slide deck are from the Internet and textbooks, e.g., many of the following materials are from “Computer Architecture: A Quantitative Approach,” 1st ~ 5th ed • opinions expressed here are my personal one, don’t reflect my employer’s view 2 who am i • did some networking and security research before • working for a SoC company, recently on • big.LITTLE scheduling and related stuff • parallel construct evaluation • run benchmarking from time to time • for improving performance of our products, and • know what our colleagues' progress 3 • Focusing on CPU and memory parts of benchmarks • let’s ignore graphics (2d, 3d), storage I/O, etc. 4 Blackbox ! • google image search “benchmark”, you can find many of them are Android-related benchmarks • Similar to recently Cross-Strait Trade in Services Agreement (TiSA), most benchmarks on Android platform are kinda blackbox 5 Is Apple A7 good? • When Apple released the new iPhone 5s, you saw many technical blog showed some benchmarks for reviews they came up • commonly used ones: • GeekBench • JavaScript benchmarks • Some graphics benchmarks • Why? Are they right ones? etc. e.g., http://www.anandtech.com/show/7335/the-iphone-5s-review 6 open blackbox 7 Android Benchmarks 8 http:// www.anandtech.com /show/7384/state-of- cheating-in-android- benchmarks No, not improvement in this way 9 Assuming there is not cheating, what we we can do? Outline • Performance benchmark review • Some Android benchmarks • What we did and what still can be done • Future 11 To quote what Prof.
    [Show full text]
  • The Linux 2.4 Kernel's Startup Procedure
    The Linux 2.4 Kernel’s Startup Procedure William Gatliff 1. Overview This paper describes the Linux 2.4 kernel’s startup process, from the moment the kernel gets control of the host hardware until the kernel is ready to run user processes. Along the way, it covers the programming environment Linux expects at boot time, how peripherals are initialized, and how Linux knows what to do next. 2. The Big Picture Figure 1 is a function call diagram that describes the kernel’s startup procedure. As it shows, kernel initialization proceeds through a number of distinct phases, starting with basic hardware initialization and ending with the kernel’s launching of /bin/init and other user programs. The dashed line in the figure shows that init() is invoked as a kernel thread, not as a function call. Figure 1. The kernel’s startup procedure. Figure 2 is a flowchart that provides an even more generalized picture of the boot process, starting with the bootloader extracting and running the kernel image, and ending with running user programs. Figure 2. The kernel’s startup procedure, in less detail. The following sections describe each of these function calls, including examples taken from the Hitachi SH7750/Sega Dreamcast version of the kernel. 3. In The Beginning... The Linux boot process begins with the kernel’s _stext function, located in arch/<host>/kernel/head.S. This function is called _start in some versions. Interrupts are disabled at this point, and only minimal memory accesses may be possible depending on the capabilities of the host hardware.
    [Show full text]
  • Kdump, a Kexec-Based Kernel Crash Dumping Mechanism
    Kdump, A Kexec-based Kernel Crash Dumping Mechanism Vivek Goyal Eric W. Biederman Hariprasad Nellitheertha IBM Linux NetworkX IBM [email protected] [email protected] [email protected] Abstract important consideration for the success of a so- lution has been the reliability and ease of use. Kdump is a crash dumping solution that pro- Kdump is a kexec based kernel crash dump- vides a very reliable dump generation and cap- ing mechanism, which is being perceived as turing mechanism [01]. It is simple, easy to a reliable crash dumping solution for Linux R . configure and provides a great deal of flexibility This paper begins with brief description of what in terms of dump device selection, dump saving kexec is and what it can do in general case, and mechanism, and plugging-in filtering mecha- then details how kexec has been modified to nism. boot a new kernel even in a system crash event. The idea of kdump has been around for Kexec enables booting into a new kernel while quite some time now, and initial patches for preserving the memory contents in a crash sce- kdump implementation were posted to the nario, and kdump uses this feature to capture Linux kernel mailing list last year [03]. Since the kernel crash dump. Physical memory lay- then, kdump has undergone significant design out and processor state are encoded in ELF core changes to ensure improved reliability, en- format, and these headers are stored in a re- hanced ease of use and cleaner interfaces. This served section of memory. Upon a crash, new paper starts with an overview of the kdump de- kernel boots up from reserved memory and pro- sign and development history.
    [Show full text]
  • Taming Hosted Hypervisors with (Mostly) Deprivileged Execution
    Taming Hosted Hypervisors with (Mostly) Deprivileged Execution Chiachih Wu†, Zhi Wang*, Xuxian Jiang† †North Carolina State University, *Florida State University Virtualization is Widely Used 2 “There are now hundreds of thousands of companies around the world using AWS to run all their business, or at least a portion of it. They are located across 190 countries, which is just about all of them on Earth.” Werner Vogels, CTO at Amazon AWS Summit ‘12 “Virtualization penetration has surpassed 50% of all server workloads, and continues to grow.” Magic Quadrant for x86 Server Virtualization Infrastructure June ‘12 Threats to Hypervisors 3 Large Code Bases Hypervisor SLOC Xen (4.0) 194K VMware ESXi1 200K Hyper-V1 100K KVM (2.6.32.28) 33.6K 1: Data source: NOVA (Steinberg et al., EuroSys ’10) Hypervisor Vulnerabilities Vulnerabilities Xen 41 KVM 24 VMware ESXi 43 VMware Workstation 49 Data source: National Vulnerability Database (‘09~’12) Threats to Hosted Hypervisors 4 Applications … Applications Guest OS Guest OS Hypervisor Host OS Physical Hardware Can we prevent the compromised hypervisor from attacking the rest of the system? DeHype 5 Decomposing the KVM hypervisor codebase De-privileged part user-level (93.2% codebase) Privileged part small kernel module (2.3 KSLOC) Guest VM Applications … Applications Applications Applications … Guest OS Guest OS De-privilege Guest OS Guest OS DeHyped DeHyped KVM KVM’ HypeLet KVM ~4% overhead Host OS Host OS Physical Hardware Physical Hardware Challenges 6 Providing the OS services in user mode Minimizing performance overhead Supporting hardware-assisted memory virtualization at user-level Challenge I 7 Providing the OS services in user mode De-privileged Hypervisor Hypervisor User Kernel Hypervisor HypeLet Host OS Host OS Physical Hardware Physical Hardware Original Hosted Hypervisor DeHype’d Hosted Hypervisor Dependency Decoupling 8 Abstracting the host OS interface and providing OS functionalities in user mode For example Memory allocator: kmalloc/kfree, alloc_page, etc.
    [Show full text]
  • Linux Core Dumps
    Linux Core Dumps Kevin Grigorenko [email protected] Many Interactions with Core Dumps systemd-coredump abrtd Process Crashes Ack! 4GB File! Most Interactions with Core Dumps Poof! Process Crashes systemd-coredump Nobody abrtd Looks core kdump not Poof! Kernel configured Crashes So what? ● Crashes are problems! – May be symptoms of security vulnerabilities – May be application bugs ● Data corruption ● Memory leaks – A hard crash kills outstanding work – Without automatic process restarts, crashes lead to service unavailability ● With restarts, a hacker may continue trying. ● We shouldn't be scared of core dumps. – When a dog poops inside the house, we don't just `rm -f $poo` or let it pile up, we try to figure out why or how to avoid it again. What is a core dump? ● It's just a file that contains virtual memory contents, register values, and other meta-data. – User land core dump: Represents state of a particular process (e.g. from crash) – Kernel core dump: Represents state of the kernel (e.g. from panic) and process data ● ELF-formatted file (like a program) User Land User Land Crash core Process 1 Process N Kernel Panic vmcore What is Virtual Memory? ● Virtual Memory is an abstraction over physical memory (RAM/swap) – Simplifies programming – User land: process isolation – Kernel/processor translate virtual address references to physical memory locations 64-bit Process Virtual 8GB RAM Address Space (16EB) (Example) 0 0 16 8 EB GB How much virtual memory is used? ● Use `ps` or similar tools to query user process virtual memory usage (in KB): – $ ps -o pid,vsz,rss -p 14062 PID VSZ RSS 14062 44648 42508 Process 1 Virtual 8GB RAM Memory Usage (VSZ) (Example) 0 0 Resident Page 1 Resident Page 2 16 8 EB GB Process 2 How much virtual memory is used? ● Virtual memory is broken up into virtual memory areas (VMAs), the sum of which equal VSZ and may be printed with: – $ cat /proc/${PID}/smaps 00400000-0040b000 r-xp 00000000 fd:02 22151273 /bin/cat Size: 44 kB Rss: 20 kB Pss: 12 kB..
    [Show full text]
  • SUSE Linux Enterprise Server 15 SP2 Autoyast Guide Autoyast Guide SUSE Linux Enterprise Server 15 SP2
    SUSE Linux Enterprise Server 15 SP2 AutoYaST Guide AutoYaST Guide SUSE Linux Enterprise Server 15 SP2 AutoYaST is a system for unattended mass deployment of SUSE Linux Enterprise Server systems. AutoYaST installations are performed using an AutoYaST control le (also called a “prole”) with your customized installation and conguration data. Publication Date: September 24, 2021 SUSE LLC 1800 South Novell Place Provo, UT 84606 USA https://documentation.suse.com Copyright © 2006– 2021 SUSE LLC and contributors. All rights reserved. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or (at your option) version 1.3; with the Invariant Section being this copyright notice and license. A copy of the license version 1.2 is included in the section entitled “GNU Free Documentation License”. For SUSE trademarks, see https://www.suse.com/company/legal/ . All other third-party trademarks are the property of their respective owners. Trademark symbols (®, ™ etc.) denote trademarks of SUSE and its aliates. Asterisks (*) denote third-party trademarks. All information found in this book has been compiled with utmost attention to detail. However, this does not guarantee complete accuracy. Neither SUSE LLC, its aliates, the authors nor the translators shall be held liable for possible errors or the consequences thereof. Contents 1 Introduction to AutoYaST 1 1.1 Motivation 1 1.2 Overview and Concept 1 I UNDERSTANDING AND CREATING THE AUTOYAST CONTROL FILE 4 2 The AutoYaST Control
    [Show full text]
  • Guest-Transparent Instruction Authentication for Self-Patching Kernels
    Guest-Transparent Instruction Authentication for Self-Patching Kernels Dannie M. Stanley, Zhui Deng, and Dongyan Xu Rick Porter Shane Snyder Department of Computer Science Applied Communication Sciences US Army CERDEC Purdue University Piscataway, NJ 08854 Information Assurance Division West Lafayette, IN 47907 Email: [email protected] Aberdeen Proving Ground, MD Email: fds,deng14,[email protected] Abstract—Attackers can exploit vulnerable programs that are system. Security mechanisms like NICKLE have been created running with elevated permissions to insert kernel rootkits into a to prevent kernel rootkits by relocating the vulnerable physical system. Security mechanisms have been created to prevent kernel system to a guest virtual machine and enforcing a W ⊕ KX rootkit implantation by relocating the vulnerable physical system to a guest virtual machine and enforcing a W ⊕ KX memory memory access control policy from the host virtual machine access control policy from the host virtual machine monitor. monitor (VMM)[1]. The W ⊕ KX memory access control Such systems must also be able to identify and authorize the policy guarantees that no region of guest memory is both introduction of known-good kernel code. Previous works use writable and kernel-executable. cryptographic hashes to verify the integrity of kernel code at The guest system must have a way to bypass the W ⊕ KX load-time. The hash creation and verification procedure depends on immutable kernel code. However, some modern kernels restriction to load valid kernel code, such as kernel drivers, into contain self-patching kernel code; they may overwrite executable memory. To distinguish between valid kernel code and mali- instructions in memory after load-time.
    [Show full text]
  • Ubuntu Server Guide Basic Installation Preparing to Install
    Ubuntu Server Guide Welcome to the Ubuntu Server Guide! This site includes information on using Ubuntu Server for the latest LTS release, Ubuntu 20.04 LTS (Focal Fossa). For an offline version as well as versions for previous releases see below. Improving the Documentation If you find any errors or have suggestions for improvements to pages, please use the link at thebottomof each topic titled: “Help improve this document in the forum.” This link will take you to the Server Discourse forum for the specific page you are viewing. There you can share your comments or let us know aboutbugs with any page. PDFs and Previous Releases Below are links to the previous Ubuntu Server release server guides as well as an offline copy of the current version of this site: Ubuntu 20.04 LTS (Focal Fossa): PDF Ubuntu 18.04 LTS (Bionic Beaver): Web and PDF Ubuntu 16.04 LTS (Xenial Xerus): Web and PDF Support There are a couple of different ways that the Ubuntu Server edition is supported: commercial support and community support. The main commercial support (and development funding) is available from Canonical, Ltd. They supply reasonably- priced support contracts on a per desktop or per-server basis. For more information see the Ubuntu Advantage page. Community support is also provided by dedicated individuals and companies that wish to make Ubuntu the best distribution possible. Support is provided through multiple mailing lists, IRC channels, forums, blogs, wikis, etc. The large amount of information available can be overwhelming, but a good search engine query can usually provide an answer to your questions.
    [Show full text]