Is Parallel Programming Hard, And, If So, What Can You Do About It?


Edited by:
Paul E. McKenney
Linux Technology Center
IBM Beaverton
[email protected]

December 16, 2011

Legal Statement

This work represents the views of the authors and does not necessarily represent the view of their employers.

IBM, zSeries, and Power PC are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both.

Linux is a registered trademark of Linus Torvalds.

i386 is a trademark of Intel Corporation or its subsidiaries in the United States, other countries, or both.

Other company, product, and service names may be trademarks or service marks of such companies.

The non-source-code text and images in this document are provided under the terms of the Creative Commons Attribution-Share Alike 3.0 United States license (http://creativecommons.org/licenses/by-sa/3.0/us/). In brief, you may use the contents of this document for any purpose, personal, commercial, or otherwise, so long as attribution to the authors is maintained. Likewise, the document may be modified, and derivative works and translations made available, so long as such modifications and derivations are offered to the public on equal terms as the non-source-code text and images in the original document.

Source code is covered by various versions of the GPL (http://www.gnu.org/licenses/gpl-2.0.html). Some of this code is GPLv2-only, as it derives from the Linux kernel, while other code is GPLv2-or-later. See the CodeSamples directory in the git archive (git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/perfbook.git) for the exact licenses, which are included in comment headers in each file. If you are unsure of the license for a given code fragment, you should assume GPLv2-only.

Combined work © 2005-2011 by Paul E. McKenney.
Contents

1 Introduction
  1.1 Historic Parallel Programming Difficulties
  1.2 Parallel Programming Goals
    1.2.1 Performance
    1.2.2 Productivity
    1.2.3 Generality
  1.3 Alternatives to Parallel Programming
    1.3.1 Multiple Instances of a Sequential Application
    1.3.2 Make Use of Existing Parallel Software
    1.3.3 Performance Optimization
  1.4 What Makes Parallel Programming Hard?
    1.4.1 Work Partitioning
    1.4.2 Parallel Access Control
    1.4.3 Resource Partitioning and Replication
    1.4.4 Interacting With Hardware
    1.4.5 Composite Capabilities
    1.4.6 How Do Languages and Environments Assist With These Tasks?
  1.5 Guide to This Book
    1.5.1 Quick Quizzes
    1.5.2 Sample Source Code

2 Hardware and its Habits
  2.1 Overview
    2.1.1 Pipelined CPUs
    2.1.2 Memory References
    2.1.3 Atomic Operations
    2.1.4 Memory Barriers
    2.1.5 Cache Misses
    2.1.6 I/O Operations
  2.2 Overheads
    2.2.1 Hardware System Architecture
    2.2.2 Costs of Operations
  2.3 Hardware Free Lunch?
    2.3.1 3D Integration
    2.3.2 Novel Materials and Processes
    2.3.3 Special-Purpose Accelerators
    2.3.4 Existing Parallel Software
  2.4 Software Design Implications

3 Tools of the Trade
  3.1 Scripting Languages
  3.2 POSIX Multiprocessing
    3.2.1 POSIX Process Creation and Destruction
    3.2.2 POSIX Thread Creation and Destruction
    3.2.3 POSIX Locking
    3.2.4 POSIX Reader-Writer Locking
  3.3 Atomic Operations
  3.4 Linux-Kernel Equivalents to POSIX Operations
  3.5 The Right Tool for the Job: How to Choose?

4 Counting
  4.1 Why Isn’t Concurrent Counting Trivial?
  4.2 Statistical Counters
    4.2.1 Design
    4.2.2 Array-Based Implementation
    4.2.3 Eventually Consistent Implementation
    4.2.4 Per-Thread-Variable-Based Implementation
    4.2.5 Discussion
  4.3 Approximate Limit Counters
    4.3.1 Design
    4.3.2 Simple Limit Counter Implementation
    4.3.3 Simple Limit Counter Discussion
    4.3.4 Approximate Limit Counter Implementation
    4.3.5 Approximate Limit Counter Discussion
  4.4 Exact Limit Counters
    4.4.1 Atomic Limit Counter Implementation
    4.4.2 Atomic Limit Counter Discussion
    4.4.3 Signal-Theft Limit Counter Design
    4.4.4 Signal-Theft Limit Counter Implementation
    4.4.5 Signal-Theft Limit Counter Discussion
  4.5 Applying Specialized Parallel Counters
  4.6 Parallel Counting Discussion

5 Partitioning and Synchronization Design
  5.1 Partitioning Exercises
    5.1.1 Dining Philosophers Problem
    5.1.2 Double-Ended Queue
    5.1.3 Partitioning Example Discussion
  5.2 Design Criteria
  5.3 Synchronization Granularity
    5.3.1 Sequential Program
    5.3.2 Code Locking
    5.3.3 Data Locking
    5.3.4 Data Ownership
    5.3.5 Locking Granularity and Performance
  5.4 Parallel Fastpath
    5.4.1 Reader/Writer Locking
    5.4.2 Hierarchical Locking
    ...