Shared-Memory Synchronization

Michael L. Scott
University of Rochester

Synthesis Lectures on Computer Architecture #1
Morgan & Claypool Publishers

Abstract

Ever since the advent of time sharing in the 1960s, designers of concurrent and parallel systems have needed to synchronize the activities of threads of control that share data structures in memory. In recent years, the study of synchronization has gained new urgency with the proliferation of multicore processors, on which even relatively simple user-level programs must frequently run in parallel.

This monograph offers a comprehensive survey of shared-memory synchronization, with an emphasis on “systems-level” issues. It includes sufficient coverage of architectural details to understand correctness and performance on modern multicore machines, and sufficient coverage of higher-level issues to understand how synchronization is embedded in modern programming languages.

The primary intended audience is “systems programmers”: the authors of operating systems, library packages, language run-time systems, and server and utility programs. Much of the discussion should also be of interest to application programmers who want to make good use of the synchronization mechanisms available to them, and to computer architects who want to understand the ramifications of their design decisions on systems-level code.

Keywords

Atomicity, barriers, busy-waiting, conditions, locality, locking, memory models, monitors, multiprocessor architecture, nonblocking algorithms, scheduling, semaphores, synchronization, transactional memory.

To Kelly, my wife and partner of more than 30 years.

Contents

1 Introduction
    1.1 Atomicity
    1.2 Condition Synchronization
    1.3 Spinning vs. Blocking
    1.4 Safety and Liveness
    1.5 The Rest of this Monograph

2 Architectural Background
    2.1 Cores and Caches: Basic Shared-Memory Architecture
        2.1.1 Temporal and Spatial Locality
        2.1.2 Cache Coherence
        2.1.3 Processor (Core) Locality
    2.2 Cache Consistency
        2.2.1 Sources of Inconsistency
        2.2.2 Memory Fence Instructions
        2.2.3 Example Architectures
    2.3 Atomic Primitives
        2.3.1 The ABA Problem
        2.3.2 Other Synchronization Hardware

3 Some Useful Theory
    3.1 Safety
        3.1.1 Deadlock Freedom
        3.1.2 Atomicity
    3.2 Liveness
        3.2.1 Nonblocking Progress
        3.2.2 Fairness
    3.3 The Consensus Hierarchy
    3.4 Memory Models
        3.4.1 Formal Framework
        3.4.2 Data Races
        3.4.3 Real-World Models

4 Practical Spin Locks
    4.1 Classical load-store-only Algorithms
    4.2 Centralized Algorithms
        4.2.1 Test and set Locks
        4.2.2 The Ticket Lock
    4.3 Queued Spin Locks
        4.3.1 The MCS Lock
        4.3.2 The CLH Lock
        4.3.3 Which Spin Lock Should I Use?
    4.4 Special-case Optimizations
        4.4.1 Nested Locks
        4.4.2 Locality-conscious Locking
        4.4.3 Double-checked Locking
        4.4.4 Asymmetric Locking

5 Spin-based Conditions and Barriers
    5.1 Flags
    5.2 Barrier Algorithms
        5.2.1 The Sense-Reversing Centralized Barrier
        5.2.2 Software Combining
        5.2.3 The Dissemination Barrier
        5.2.4 Tournament Barriers
        5.2.5 Static Tree Barriers
        5.2.6 Which Barrier Should I Use?
    5.3 Barrier Extensions
        5.3.1 Fuzzy Barriers
        5.3.2 Adaptive Barriers
        5.3.3 Barrier-like Constructs

6 Read-mostly Atomicity
    6.1 Reader-writer Locks
        6.1.1 Centralized Algorithms
        6.1.2 Queued Reader-writer Locks
    6.2 Sequence Locks
    6.3 Read-Copy Update

7 Synchronization and Scheduling
    7.1 Scheduling
    7.2 Semaphores
    7.3 Monitors
    7.4 Other Language Mechanisms
        7.4.1 Conditional Critical Regions
        7.4.2 Futures
        7.4.3 Series-Parallel Execution
    7.5 Kernel-Only Mechanisms
    7.6 Performance Considerations
        7.6.1 Spin-Then-Wait
        7.6.2.

Details

  • File Type
    pdf
  • Content Languages
    English
  • File Pages
    116 pages
