Optimistic Compiler Optimizations for Network Systems Sapan Bhatia

Total Page:16

File Type:pdf, Size:1020Kb

Optimistic Compiler Optimizations for Network Systems Sapan Bhatia Optimistic compiler optimizations for network systems Sapan Bhatia To cite this version: Sapan Bhatia. Optimistic compiler optimizations for network systems. Software Engineering [cs.SE]. Université Sciences et Technologies - Bordeaux I, 2006. English. tel-00402492 HAL Id: tel-00402492 https://tel.archives-ouvertes.fr/tel-00402492 Submitted on 7 Jul 2009 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. i Ecole´ doctorale de Math´ematiques et Informatique de Bordeaux/No. d’ordre 3169 Optimisations de Compilateur Optimistes pour les Syst`emes R´eseaux (Optimistic compiler optimizations for network systems) THESE` pr´esent´ee et soutenue publiquement le 9 Juin 2006 pour l’obtention du Doctorat de l’universit´ede Bordeaux 1 (sp´ecialit´einformatique) par Sapan Bhatia Composition du jury Pr´esident : Raymond Namyst (Professeur) Rapporteurs : Gilles Muller (Professeur) Marc Shapiro (Directeur de recherche) Directeur de th`ese: Charles Consel (Professeur) Laboratoire Bordelais de Recherche en Informatique ii iii This thesis is dedicated to you, the reader. iv Contents List of Figures viii 1 Introduction 1 1.1 Overview: optimizations and system architecture ........... 2 1.2 Contributions ............................... 3 1.2.1 Protocol-stack specialization ................... 4 1.2.2 Remote specialization ....................... 4 1.2.3 Memory-manager/scheduler co-design to improve server perfor- mance ............................... 5 1.2.4 The broomstick toolkit ...................... 5 1.3 Thesis organization ............................ 6 2 Related Work 7 2.1 Optimizations for network software ................... 8 2.1.1 Optimizing protocol stacks .................... 8 2.1.2 Network-server optimization ................... 9 2.2 Extensible operating systems ....................... 10 2.3 Program specialization .......................... 11 2.4 Cache optimizations ........................... 12 2.5 Miscellaneous work ............................ 12 3 Protocol-Stack Specialization 14 3.1 Specialization of protocol stacks: concept and mechanism ....... 15 3.1.1 Specialization opportunities in the sockets, udp and ip layers . 16 3.1.2 Code guards ............................ 21 3.2 Enabling the specialization ........................ 23 3.2.1 Describing the specialization opportunities to the specializer . 24 3.2.2 Specialization process: local or remote? ............. 25 3.2.3 Specializing locally ........................ 25 3.2.4 Specializing remotely ....................... 25 3.3 Experimental performance evaluation and analysis ........... 26 3.3.1 Experiments ............................ 26 v 3.3.2 Size and performance of specialized code ............ 26 3.4 Conclusion ................................. 29 3.5 Shortcomings and possible extensions .................. 29 4 Remote specialization 30 4.1 Remote Specialization: A birds-eye view ................ 32 4.1.1 From A Programmer’s Point of View .............. 34 4.1.2 From an OS developer’s point of View ............. 36 4.2 Specialization Infrastructure ....................... 38 4.2.1 Functional overview ........................ 38 4.2.2 Detailed descriptions ....................... 40 4.2.3 Code manager ........................... 41 4.2.4 Helper process ........................... 43 4.2.5 Run-time layer .......................... 43 4.2.6 Specialization templates ..................... 44 4.2.7 Specializer and compiler ..................... 44 4.2.8 TCP client and server ...................... 45 4.3 Discussion ................................. 46 4.3.1 Specialization latency ....................... 46 4.4 Discussion ................................. 46 4.4.1 Applicability of our approach .................. 46 4.4.2 Module dependencies ....................... 47 4.5 Conclusion and Future Work ....................... 47 5 Memory-manager/Scheduler Co-design 49 5.1 Event-driven servers ........................... 51 5.1.1 Overview ............................. 51 5.1.2 Performance of event-driven servers ............... 52 5.1.3 Caching behavior ......................... 53 5.2 Eliminating data-cache misses ...................... 55 5.2.1 The stingy allocator ....................... 55 5.2.2 Constraints ............................ 56 5.2.3 Objective function ........................ 57 5.2.4 Scheduling for cache efficiency .................. 58 5.3 Performance evaluation .......................... 58 5.3.1 Benchmarking methodology ................... 58 5.3.2 Tools ................................ 58 5.3.3 Environment ........................... 60 5.4 Performance analysis ........................... 60 5.4.1 Httperf .............................. 60 5.4.2 Apachebench ........................... 60 vi 5.4.3 Analysis .............................. 62 5.5 Shortcomings and possible extensions .................. 63 6 The Broomstick Optimizer 64 6.1 A programmer’s eye view ......................... 64 6.2 Analyses and transformations ...................... 67 6.2.1 Identifying the stages and the scheduler ............. 67 6.2.2 Memory analysis ......................... 69 6.2.3 Parameterizing the stingy allocator ............... 71 6.2.4 Code annotations ......................... 71 6.3 Application to real programs ....................... 72 6.4 Applicability ................................ 72 6.5 Conclusion ................................. 73 7 Case Study: The TUX server 76 7.1 The tux web server ............................ 76 7.1.1 Overview ............................. 76 7.1.2 Server architecture ........................ 77 7.1.3 Staging and scheduling ...................... 77 7.1.4 Memory management. ...................... 78 7.2 Specializing TUX ............................. 79 7.2.1 Protocol served .......................... 79 7.2.2 Number of CPUs, Sockets .................... 80 7.2.3 Server actions ........................... 80 7.2.4 Mime-types handled ....................... 81 7.2.5 Other options ........................... 81 7.2.6 Experiments ............................ 81 7.3 Cache-related slowdowns in TUX .................... 82 7.4 Cache-optimizing the TUX server: an experience study ........ 83 7.4.1 Preparatory step ......................... 83 7.4.2 Memory analysis with memwalk ................. 84 7.4.3 Generating a customized allocator using stingygen ...... 86 7.4.4 Modifying the server to use the customized allocator ..... 87 7.4.5 Modifying the scheduler ..................... 87 7.5 Performance evaluation .......................... 91 7.5.1 Benchmarking methodology ................... 91 7.5.2 Tools ................................ 91 7.5.3 Environment ........................... 92 7.6 Performance analysis ........................... 92 7.6.1 Httperf .............................. 93 7.6.2 Apachebench ........................... 93 vii 7.6.3 Analysis .............................. 93 7.7 Shortcomings and possible extensions .................. 95 8 Conclusion 97 Publications derived from this thesis 99 Bibliography of Related Work 100 viii List of Figures 1.1 High-level view of the optimization framework. (a) Specialization of protocol stacks. (b) Optimization of an event-driven network server. 2 3.1 Fast-path of the UDP send operation in BSD ............. 22 3.2 Specialization declarations ........................ 24 3.3 Specialization results: Performance, code size and overhead. ..... 27 3.4 Original vs. specialized code: maximum throughput on the PIII, 486 and iPAQ for UDP (±5% at 99%) confidence and TCP (±2.5% at 99% confidence) ................................ 28 4.1 Requesting and using a specialized functionality. ............ 33 4.2 A fallback functionality. ......................... 33 4.3 From an application developer’s point of view. ............. 34 4.4 A fragment of the binding-time annotations for the send system call . 37 4.5 Configuration information provided to the stub generator ....... 38 4.6 Taxonomy of functional units of the specialization infrastructure ... 39 4.7 User-support functions for the TCP/IP stack. ............. 40 4.8 Code excerpt: do˙specialize ........................ 42 4.9 Code excerpt: done˙specialize ...................... 42 4.10 Code excerpt: specialized ......................... 43 4.11 Remote specialization infrastructure ................... 45 5.1 An event-driven server. .......................... 52 5.2 Per-task state during program execution ................ 54 5.3 Throughput degradation with increasing L2 cache misses. ....... 54 5.4 (a) Throughput of TUX with increasing concurrency. (b) Correspond- ing increase in L2 cache misses (c) Peak performance of TUX for uni- form load. ................................. 61 5.5 Comparison of the performance of the original thttpd server to that of the optimized thttpd server ....................... 62 6.1 Example annotations and wrappers for TUX. ............. 65 ix 6.2 Interface used to extract the structure and memory utilization behavior. 66 6.3 Modifying the scheduler: A high-level conceptual overview ...... 66 6.4 A fragment of the Stage Call Graph (SCG) of TUX. nf
Recommended publications
  • Security Analysis of Docker Containers in a Production Environment
    Security analysis of Docker containers in a production environment Jon-Anders Kabbe Master of Science in Communication Technology Submission date: June 2017 Supervisor: Colin Alexander Boyd, IIK Co-supervisor: Audun Bjørkøy, TIND Technologies Norwegian University of Science and Technology Department of Information Security and Communication Technology Title: Security Analysis of Docker Containers in a Production Environment Student: Jon-Anders Kabbe Problem description: Deployment of Docker containers has achieved its popularity by providing an au- tomatable and stable environment serving a particular application. Docker creates a separate image of a file system with everything the application require during runtime. Containers run atop the regular file system as separate units containing only libraries and tools that particular application require. Docker containers reduce the attack surface, improves process interaction and sim- plifies sandboxing. But containers also raise security concerns, reduced segregation between the operating system and application, out of date or variations in libraries and tools and is still an unsettled technology. Docker containers provide a stable and static environment for applications; this is achieved by creating a static image of only libraries and tools the specific application needs during runtime. Out of date tools and libraries are a major security risk when exposing applications over the Internet, but availability is essential in a competitive market. Does Docker raise some security concerns compared to standard application deploy- ment such as hypervisor-based virtual machines? Is the Docker “best practices” sufficient to secure the container and how does this compare to traditional virtual machine application deployment? Responsible professor: Colin Alexander Boyd, ITEM Supervisor: Audun Bjørkøy, TIND Technologies Abstract Container technology for hosting applications on the web is gaining traction as the preferred mode of deployment.
    [Show full text]
  • Firecracker: Lightweight Virtualization for Serverless Applications
    Firecracker: Lightweight Virtualization for Serverless Applications Alexandru Agache, Marc Brooker, Andreea Florescu, Alexandra Iordache, Anthony Liguori, Rolf Neugebauer, Phil Piwonka, and Diana-Maria Popa, Amazon Web Services https://www.usenix.org/conference/nsdi20/presentation/agache This paper is included in the Proceedings of the 17th USENIX Symposium on Networked Systems Design and Implementation (NSDI ’20) February 25–27, 2020 • Santa Clara, CA, USA 978-1-939133-13-7 Open access to the Proceedings of the 17th USENIX Symposium on Networked Systems Design and Implementation (NSDI ’20) is sponsored by Firecracker: Lightweight Virtualization for Serverless Applications Alexandru Agache Marc Brooker Andreea Florescu Amazon Web Services Amazon Web Services Amazon Web Services Alexandra Iordache Anthony Liguori Rolf Neugebauer Amazon Web Services Amazon Web Services Amazon Web Services Phil Piwonka Diana-Maria Popa Amazon Web Services Amazon Web Services Abstract vantage over traditional server provisioning processes: mul- titenancy allows servers to be shared across a large num- Serverless containers and functions are widely used for de- ber of workloads, and the ability to provision new func- ploying and managing software in the cloud. Their popularity tions and containers in milliseconds allows capacity to be is due to reduced cost of operations, improved utilization of switched between workloads quickly as demand changes. hardware, and faster scaling than traditional deployment meth- Serverless is also attracting the attention of the research com- ods. The economics and scale of serverless applications de- munity [21,26,27,44,47], including work on scaling out video mand that workloads from multiple customers run on the same encoding [13], linear algebra [20, 53] and parallel compila- hardware with minimal overhead, while preserving strong se- tion [12].
    [Show full text]
  • A Reliable Booting System for Zynq Ultrascale+ Mpsoc Devices
    A reliable booting system for Zynq Ultrascale+ MPSoC devices An embedded solution that provides fallbacks in different parts of the Zynq MPSoC booting process, to assure successful booting into a Linux operating system. A thesis presented for the Bachelor of Science in Electrical Engineering at the University of Applied Sciences Utrecht Name Nekija Dˇzemaili Student ID 1702168 University supervisor Corn´eDuiser CERN supervisors Marc Dobson & Petr Zejdlˇ Field of study Electrical Engineering (embedded systems) CERN-THESIS-2021-031 17/03/2021 February 15th, 2021 Geneva, Switzerland A reliable booting system for Zynq Ultrascale+ MPSoC devices Disclaimer The board of the foundation HU University of Applied Sciences in Utrecht does not accept any form of liability for damage resulting from usage of data, resources, methods, or procedures as described in this report. Duplication without consent of the author or the college is not permitted. If the graduation assignment is executed within a company, explicit consent of the company is necessary for duplication or copying of text from this report. Het bestuur van de Stichting Hogeschool Utrecht te Utrecht aanvaardt geen enkele aansprakelijkheid voor schade voortvloeiende uit het gebruik van enig gegeven, hulpmiddel, werkwijze of procedure in dit verslag beschreven. Vermenigvuldiging zonder toestemming van de auteur(s) en de school is niet toegestaan. Indien het afstudeerwerk in een bedrijf is verricht, is voor vermenigvuldiging of overname van tekst uit dit verslag eveneens toestemming van het bedrijf vereist. N. Dˇzemaili page 1 of 110 A reliable booting system for Zynq Ultrascale+ MPSoC devices Preface This thesis was written for the BSc Electrical Engineering degree of the HU University of Applied Sciences Utrecht, the Netherlands.
    [Show full text]
  • SAS RAID Controllers for Linux on the M6-700 Or M7-700
    SAS RAID controllers for Linux on the M6-700 or M7-700 ESCALA Power7 REFERENCE 86 A1 65FF 01 ESCALA Power7 SAS RAID controllers for Linux on the M6-700 or M7-700 This publication concerns the following models: - Bull Escala E5-700 (Power 750 / 8233-E8B) - Bull Escala M6-700 (Power 770 / 9117-MMB) - Bull Escala M7-700 (Power 780 / 9179-MHB) - Bull Escala E1-700 (Power 710 / 8231-E2B) - Bull Escala E2-700 / E2-700T (Power 720 / 8202-E4B) - Bull Escala E3-700 (Power 730 / 8231-E2B) - Bull Escala E4-700 / E4-700T (Power 740 / 8205-E6B) References to Power 755 / 8236-E8C models are irrelevant. Hardware September 2010 BULL CEDOC 357 AVENUE PATTON B.P.20845 49008 ANGERS CEDEX 01 FRANCE REFERENCE 86 A1 65FF 01 The following copyright notice protects this book under Copyright laws which prohibit such actions as, but not limited to, copying, distributing, modifying, and making derivative works. Copyright Bull SAS 2007-2010 Printed in France Suggestions and criticisms concerning the form, content, and presentation of this book are invited. A form is provided at the end of this book for this purpose. To order additional copies of this book or other Bull Technical Publications, you are invited to use the Ordering Form also provided at the end of this book. Trademarks and Acknowledgements We acknowledge the right of proprietors of trademarks mentioned in this book. The information in this document is subject to change without notice. Bull will not be liable for errors contained herein, or for incidental or consequential damages in connection with the use of this material.
    [Show full text]
  • 2018 NSFOCUS Technical Report on Container Security NSFOCUS Star Cloud Laboratory
    01010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101 2018 NSFOCUS Technical Report on Container Security NSFOCUS Star Cloud Laboratory © 2018 NSFOCUS About NSFOCUS NSFOCUS is an iconic internet and application security company with over 18 years of proven industry experience. Today, we are operating globally with 2000+ employees at two headquarters in Beijing, China and 40+ offices worldwide including the IBD HQ in Santa Clara, CA, USA. NSFOCUS protects four of the ten largest global telecommunications companies and four of the five largest global financial institutions. With its multi-tenant and distributed cloud security platform, NSFOCUS effectively moves security into the internet backbone by: operating in data centers around the world, enabling organizations to fully leverage the promise of cloud computing, providing unparalleled and uncompromising protection and performance, and empowering our partners to provide better security as a service in a smart and simple way. NSFOCUS delivers holistic, carrier-grade, hybrid DDoS and web security powered by industry leading threat intelligence. About NeuVector NeuVector is the first company to develop Docker/Kubernetes security products. It is the leader in container network security and delivers the first and only multivector container security platform. NeuVector has committed to guaranteeing the security of enterprise-level container platforms, with its products applicable to cloud, multi-cloud, and
    [Show full text]
  • Professional Linux Programming / Jon Masters, Richard Blum
    01_776130 ffirs.qxp 2/2/07 10:13 PM Page iii Professional Linux® Programming Jon Masters Richard Blum 01_776130 ffirs.qxp 2/2/07 10:13 PM Page ii 01_776130 ffirs.qxp 2/2/07 10:13 PM Page i Professional Linux® Programming 01_776130 ffirs.qxp 2/2/07 10:13 PM Page ii 01_776130 ffirs.qxp 2/2/07 10:13 PM Page iii Professional Linux® Programming Jon Masters Richard Blum 01_776130 ffirs.qxp 2/2/07 10:13 PM Page iv Professional Linux® Programming Published by Wiley Publishing, Inc. 10475 Crosspoint Boulevard Indianapolis, IN 46256 www.wiley.com Copyright © 2007 by Wiley Publishing, Inc., Indianapolis, Indiana Published simultaneously in Canada Manufactured in the United States of America 10 9 8 7 6 5 4 3 2 1 Library of Congress Cataloging-in-Publication Data: Masters, Jon, 1981- Professional Linux programming / Jon Masters, Richard Blum. p. cm. Includes index. ISBN: 978-0-471-77613-0 (paper/website) 1. Linux. 2. Operating systems (Computers) I. Blum, Richard, 1962- II. Title. QA76.76.O63M37153 2007 005.4’32—dc22 2006102202 No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 646-8600.
    [Show full text]
  • Inline and Sideline Approaches for Low-Cost Memory Safety in C
    UCAM-CL-TR-954 Technical Report ISSN 1476-2986 Number 954 Computer Laboratory Inline and sideline approaches for low-cost memory safety in C Myoung Jin Nam February 2021 15 JJ Thomson Avenue Cambridge CB3 0FD United Kingdom phone +44 1223 763500 https://www.cl.cam.ac.uk/ c 2021 Myoung Jin Nam This technical report is based on a dissertation submitted November 2020 by the author for the degree of Doctor of Philosophy to the University of Cambridge, Selwyn College. Technical reports published by the University of Cambridge Computer Laboratory are freely available via the Internet: https://www.cl.cam.ac.uk/techreports/ ISSN 1476-2986 Inline and Sideline Approaches for Low-cost Memory Safety in C Myoung Jin Nam System languages such as C or C++ are widely used for their high performance, however the allowance of arbitrary pointer arithmetic and typecast introduces a risk of memory corruptions. These memory errors cause unexpected termination of pro- grams, or even worse, attackers can exploit them to alter the behavior of programs or leak crucial data. Despite advances in memory safety solutions, high and unpredictable overhead remains a major challenge. Accepting that it is extremely difficult to achieve com- plete memory safety with the performance level suitable for production deployment, researchers attempt to strike a balance between performance, detection coverage, in- teroperability, precision, and detection timing. Some properties are much more desir- able, e.g. the interoperability with pre-compiled libraries. Comparatively less critical properties are sacrificed for performance, for example, tolerating longer detection de- lay or narrowing down detection coverage by performing approximate or probabilistic checking or detecting only certain errors.
    [Show full text]
  • Linux Kernel Security Overview
    Linux Kernel Security Overview Kernel Conference Australia Brisbane, 2009 James Morris [email protected] Introduction Historical Background ● Linux started out with traditional Unix security – Discretionary Access Control (DAC) ● Security has been enhanced, but is constrained by original Unix design, POSIX etc. ● Approach is continual retrofit of newer security schemes, rather than fundamental redesign “The first fact to face is that UNIX was not developed with security, in any realistic sense, in mind; this fact alone guarantees a vast number of holes.” Dennis Ritchie, “On the Security of UNIX”, 1979 DAC ● Simple and quite effective, but inadequate for modern environment: – Does not protect against flawed or malicious code ● Linux implementation stems from traditional Unix: – User and group IDs – User/group/other + read/write/execute – User controls own policy – Superuser can violate policy “It must be recognized that the mere notion of a super-user is a theoretical, and usually practical, blemish on any protection scheme.” Ibid. Extended DAC ● POSIX Capabilities (privileges) – Process-based since Linux kernel v2.2 ● Limited usefulness – File-based support relatively recent (v2.6.24) ● May help eliminate setuid root binaries ● Access Control Lists (ACLs) – Based on abandoned POSIX spec – Uses extended attributes API Linux Namespaces ● File system namespaces introduced in 2000, derived from Plan 9. – Not used much until mount propagation provided more flexibility (e.g. shared RO “/”) – Mounts private by default ● Syscalls unshare(2) and
    [Show full text]
  • Container Security Fundamental Technology Concepts That Protect Containerized Applications
    Cloud Native Security Unsure where your organization stands today? This self-assessment evaluates what you should work on to maximize your cloud native security posture Get Started > Container Security Fundamental Technology Concepts that Protect Containerized Applications Liz Rice Beijing Boston Farnham Sebastopol Tokyo Container Security by Liz Rice Copyright © 2020 Vertical Shift Ltd. All rights reserved. Printed in the United States of America. Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472. O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://oreilly.com). For more information, contact our corporate/institutional sales department: 800-998-9938 or [email protected]. Acquisitions Editor: John Devins Indexer: Devon Thomas Developmental Editor: Virginia Wilson Interior Designer: David Futato Production Editor: Nan Barber Cover Designer: Karen Montgomery Copyeditor: Arthur Johnson Illustrator: Rebecca Demarest Proofreader: Kim Wimpsett April 2020: First Edition Revision History for the First Edition 2020-04-01: First Release See http://oreilly.com/catalog/errata.csp?isbn=9781492056706 for release details. The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. Container Security, the cover image, and related trade dress are trademarks of O’Reilly Media, Inc. The views expressed in this work are those of the authors and do not represent the publisher’s views. While the publisher and the author have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the author disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work.
    [Show full text]
  • Performance Analysis for Machine Learning Applications
    Performance Analysis for Machine Learning Applications The Harvard community has made this article openly available. Please share how this access benefits you. Your story matters Citation Wang, Yu. 2020. Performance Analysis for Machine Learning Applications. Doctoral dissertation, Harvard University, Graduate School of Arts & Sciences. Citable link https://nrs.harvard.edu/URN-3:HUL.INSTREPOS:37365132 Terms of Use This article was downloaded from Harvard University’s DASH repository, and is made available under the terms and conditions applicable to Other Posted Material, as set forth at http:// nrs.harvard.edu/urn-3:HUL.InstRepos:dash.current.terms-of- use#LAA Performance Analysis for Machine Learning Applications a dissertation presented by Yu Wang to The Department of School of Engineering and Applied Sciences in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the subject of Computer Science Harvard University Cambridge, Massachusetts Sep 2019 ©2019 – Yu Wang all rights reserved. Thesis advisor: Professor David Brooks and Gu-Yeon Wei Yu Wang Performance Analysis for Machine Learning Applications Abstract Performance analysis has been driving the advancements of software and hardware systems for decades. Proper analysis can reveal system and architectural bottlenecks, provide essential infor- mation for choosing frameworks and platforms, and further lead to performance optimizations. Systematic performance analysis is difficult since it involves various problem domains, algorithms, software stacks, systems, and hardware. Recently, the surge of machine learning applications and domain-specific architectures stimulates rapid evolution of all those dimensions, thus making perfor- mance analysis increasingly challenging. To tackle this problem, this dissertation conducts deep and systematic performance analysis for a variety of workloads, software frameworks, and hardware systems, and demonstrates proper ways to apply several performance analysis methodologies.
    [Show full text]
  • Report of Contributions
    Linux Plumbers Conference 2020 Report of Contributions https://www.linuxplumbersconf.org/e/LPC2020 Linux Plumbers … / Report of Contributions Data-race detection in the Linux k … Contribution ID: 17 Type: not specified Data-race detection in the Linux kernel Monday, 24 August 2020 10:00 (45 minutes) In this talk, we will discuss data-race detection in the Linux kernel. The talk starts by briefly providing background on data races, how they relate to the Linux-kernel Memory Consistency Model (LKMM), and why concurrency bugs can be so subtle and hard to diagnose (with a few examples). Following that, we will discuss past attempts at data-race detectors for the Linux kernel and why they never reached production quality to make it into the mainline Linux kernel. We argue that a key piece to the puzzle is the design of the data-race detector: it needs to be as non- intrusive as possible, simple, scalable, seamlessly evolve with the kernel, and favor false negatives over false positives. Following that, we will discuss the Kernel Concurrency Sanitizer (KCSAN) and its design and some implementation details. Our story also shows that a good baseline design only gets us so far, and most important was early community feedback and iterating. We also discuss how KCSAN goes even further, and can help detect concurrency bugs that are not data races. Tentative Outline: - Background – What are data races? – Concurrency bugs are subtle: some examples - Data-race detection in the Linux kernel – Past attempts and why they never made it upstream – What is
    [Show full text]
  • Android Internals::A Confectioner's Cookbook (Volume I)
    RL Android Internals A Confectioner's Cookbook Volume I: The Power User's View With updates for Android M, Preview Release 1 (6/2015) By Jonathan Levin Cambridge, MA In memoriam: Frank R. Dye. I missed you by a day, and will miss you for a lifetime. Android Internals - A Confectioner's Cookbook - Volume I - The Power User's View © 2015 by Jonathan Levin. All rights reserved. No part of this work may be reproduced, transmitted in any form or any means, electronic or mechanical, including photocopying, recording, or by any information storage or retrieval system, without the prior written permission of the author. Printed in Cambridge, MA, USA First Printing5-Mp1-as ISBN-10: 9910555-2-4 ISBN-13: 978-0-9910555-2-4 Publisher: Technologeeks.com Production Editor: Jonathan Levin Interior Design: Jonathan Levin Technical Reviewers: Moshe Kravchik, Arie Haenel Cover Illustration: Dino Tsiopanos For information on distribution, translations, or bulk sales, please contact Jonathan Levin, at: Jonathan Levin Phone: (617)-3000-667 Email: [email protected] Web: www.NewAndroidBook.com Company and product names mentioned in this work may be trademarked by their respective owners. Any and every such name is used solely in an editorial fashion, to the benefit of the trademark owner, with no intention whatsoever of infringement. The Android robot is reproduced and/or modified from work created and shared by Google, according to the terms of the Creative Commons 3.0 Attribution License. Limit of Liability/Disclaimer of Warranty: The author neither makes or implies any representations or warranties with respect to the accuracy or completeness of the contents of this work.
    [Show full text]