Linux Kernel Versions the Linux Kernel Development Community Before We Begin Chapter 2

Table of Contents

Linux Kernel Development, Second Edition
By Robert Love

Publisher: Sams Publishing
Pub Date: January 12, 2005
ISBN: 0-672-32720-1
Pages: 432

Copyright
Foreword
Preface
    So Here We Are
    Kernel Version
    Audience
    Book Website
    Second Edition Acknowledgments
About the Author
We Want to Hear from You!
    Reader Services

Chapter 1. Introduction to the Linux Kernel
    Along Came Linus: Introduction to Linux
    Overview of Operating Systems and Kernels
    Linux Versus Classic Unix Kernels
    Linux Kernel Versions
    The Linux Kernel Development Community
    Before We Begin

Chapter 2. Getting Started with the Kernel
    Obtaining the Kernel Source
    The Kernel Source Tree
    Building the Kernel
    A Beast of a Different Nature
    So Here We Are

Chapter 3. Process Management
    Process Descriptor and the Task Structure
    Process Creation
    The Linux Implementation of Threads
    Process Termination
    Process Wrap Up

Chapter 4. Process Scheduling
    Policy
    The Linux Scheduling Algorithm
    Preemption and Context Switching
    Real-Time
    Scheduler-Related System Calls
    Scheduler Finale

Chapter 5. System Calls
    APIs, POSIX, and the C Library
    Syscalls
    System Call Handler
    System Call Implementation
    System Call Context
    System Calls in Conclusion

Chapter 6. Interrupts and Interrupt Handlers
    Interrupts
    Interrupt Handlers
    Registering an Interrupt Handler
    Writing an Interrupt Handler
    Interrupt Context
    Implementation of Interrupt Handling
    Interrupt Control
    Don't Interrupt Me; We're Almost Done!

Chapter 7. Bottom Halves and Deferring Work
    Bottom Halves
    Softirqs
    Tasklets
    Work Queues
    Which Bottom Half Should I Use?
    Locking Between the Bottom Halves
    The Bottom of Bottom-Half Processing
    Endnotes

Chapter 8. Kernel Synchronization Introduction
    Critical Regions and Race Conditions
    Locking
    Deadlocks
    Contention and Scalability
    Locking and Your Code

Chapter 9. Kernel Synchronization Methods
    Atomic Operations
    Spin Locks
    Reader-Writer Spin Locks
    Semaphores
    Reader-Writer Semaphores
    Spin Locks Versus Semaphores
    Completion Variables
    BKL: The Big Kernel Lock
    Preemption Disabling
    Ordering and Barriers
    Synchronization Summarization

Chapter 10. Timers and Time Management
    Kernel Notion of Time
    The Tick Rate: HZ
    Jiffies
    Hardware Clocks and Timers
    The Timer Interrupt Handler
    The Time of Day
    Timers
    Delaying Execution
    Out of Time

Chapter 11. Memory Management
    Pages
    Zones
    Getting Pages
    kmalloc()
    vmalloc()
    Slab Layer
    Slab Allocator Interface
    Statically Allocating on the Stack
    High Memory Mappings
    Per-CPU Allocations
    The New percpu Interface
    Reasons for Using Per-CPU Data
    Which Allocation Method Should I Use?

Chapter 12. The Virtual Filesystem
    Common Filesystem Interface
    Filesystem Abstraction Layer
    Unix Filesystems
    VFS Objects and Their Data Structures
    The Superblock Object
    The Inode Object
    The Dentry Object
    The File Object
    Data Structures Associated with Filesystems
    Data Structures Associated with a Process
    Filesystems in Linux

Chapter 13. The Block I/O Layer
    Anatomy of a Block Device
    Buffers and Buffer Heads
    The bio structure
    Request Queues
    I/O Schedulers
    Summary

Chapter 14. The Process Address Space
    The Memory Descriptor
    Memory Areas
    Manipulating Memory Areas
    mmap() and do_mmap(): Creating an Address Interval
    munmap() and do_munmap(): Removing an Address Interval
    Page Tables
    Conclusion

Chapter 15. The Page Cache and Page Writeback
    Page Cache
    Radix Tree
    The Buffer Cache
    The pdflush Daemon
    To Make a Long Story Short

Chapter 16. Modules
    Hello, World!
    Building Modules
    Installing Modules
    Generating Module Dependencies
    Loading Modules
    Managing Configuration Options
    Module Parameters
    Exported Symbols
    Wrapping Up Modules

Chapter 17. kobjects and sysfs
    kobjects
    ktypes
    ksets
    Subsystems
    Structure Confusion
    Managing and Manipulating kobjects
    Reference Counts
    sysfs
    The Kernel Events Layer
    kobjects and sysfs in a Nutshell

Chapter 18. Debugging
    What You Need to Start
    Bugs in the Kernel
    printk()
    Oops
    Kernel Debugging Options
    Asserting Bugs and Dumping Information
    Magic SysRq Key
    The Saga of a Kernel Debugger
    Poking and Probing the System
    Binary Searching to Find the Culprit Change
    When All Else Fails: The Community

Chapter 19. Portability
    History of Portability in Linux
    Word Size and Data Types
    Data Alignment
    Byte Order
    Time
    Page Size
    Processor Ordering
    SMP, Kernel Preemption, and High Memory
    Endnotes

Chapter 20. Patches, Hacking, and the Community
    The Community
    Linux Coding Style
    Chain of Command
    Submitting Bug Reports
    Generating Patches
    Submitting Patches
    Conclusion

Appendix A. Linked Lists
    Circular Linked Lists
    The Linux Kernel's Implementation
    Manipulating Linked Lists
    Traversing Linked Lists

Appendix B. Kernel Random Number Generator
    Design and Implementation
    Interfaces to Input Entropy
    Interfaces to Output Entropy

Appendix C. Algorithmic Complexity
    Algorithms
    Big-O Notation
    Big Theta Notation
    Putting It All Together
    Perils of Time Complexity

Bibliography and Reading List
    Books on Operating System Design
    Books on Unix Kernels
    Books on Linux Kernels
    Books on Other Kernels
    Books on the Unix API
    Books on the C Programming Language
    Other Works
    Websites

Index

Copyright

Copyright © 2005 by Pearson Education, Inc. All rights reserved.
No part of this book shall be reproduced, stored in a retrieval system, or transmitted by any means, electronic, mechanical, photocopying, recording, or otherwise, without written permission from the publisher. No patent liability is assumed with respect to the use of the information contained herein. Although every precaution has been taken in the preparation of this book, the publisher and author assume no responsibility for errors or omissions. Nor is any liability assumed for damages resulting from the use of the information contained herein.

Library of Congress Catalog Card Number: 2004095004

Printed in the United States of America

First Printing: January 2005

08 07 06 05 4 3 2 1

Trademarks

All terms mentioned in this book that are known to be trademarks or service marks have been appropriately capitalized. Novell Press cannot attest to the accuracy of this information. Use of a term in this book should not be regarded as affecting the validity of any trademark or service mark.

Warning and Disclaimer

Every effort has been made to make this book as complete and as accurate as possible, but no warranty or fitness is implied. The information provided is on an "as is" basis. The author and the publisher shall have neither liability nor responsibility to any person or entity with respect to any loss or damages arising from the information contained in this book.

Special and Bulk Sales

Pearson offers excellent discounts on this book when ordered in quantity for bulk purchases or special sales. For more information, please contact

U.S. Corporate and Government Sales
1-800-382-3419
[email protected]

For sales outside of the U.S., please contact

International Sales
[email protected]

Credits

Senior Editor: Scott D. Meyers
Managing Editor: Charlotte Clapp
Project Editor: George Nedeff
Copy Editor: Margo Catts
Indexer: Chris Barrick
Proofreader: Tracy Donhardt
Technical Editors: Adam Belay, Martin Pool, Chris Rivera
Publishing Coordinator: Vanessa Evans
Book Designer: Gary Adair
Page Layout: Michelle Mitchell

Dedication

To Doris and Helen.

Foreword

As the Linux kernel and the applications that use it become more widely used, we are seeing an increasing number of system software developers who wish to become involved in the development and maintenance of Linux. Some of these engineers are motivated purely by personal interest, some work for Linux companies, some work for hardware manufacturers, and some are involved with in-house development projects.

But all face a common problem: The learning curve for the kernel is getting longer and steeper. The system is becoming increasingly complex, and it is very large. And as the years pass, the current members of the kernel development team gain deeper and broader knowledge of the kernel's internals, which widens the gap between them and newcomers.

I believe that this declining accessibility of the Linux source base is already a problem for the quality of the kernel, and it will become more serious over time. Those who care for Linux clearly have an interest in increasing the number of developers who can contribute to the kernel.

One approach to this problem is to keep the code clean: sensible interfaces, consistent layout, "do one thing, do it well," and so on. This is Linus Torvalds' solution.

The approach that I counsel is to liberally apply commentary to the code: words that the reader can use to understand what the coder intended to achieve at the time. (The process of identifying divergences between the intent and the implementation is known as debugging. It is hard to do this if the intent is not known.)

But even code commentary does not provide the broad-sweep view of what a major subsystem is intended to do, and how its developers set about doing it. This, the starting point of understanding, is what the written word serves best.

Robert Love's contribution provides a means by which experienced developers can gain that essential view of what services the kernel subsystems are supposed to provide, and how they set about providing them. This will be sufficient knowledge for many people: the curious, the application developers, those who wish to evaluate the kernel's design, and others.