Concurrent Objects and Linearizability



Nir Shavit (subbing for N. Lynch), Fall 2003
© 2003 Herlihy and Shavit

Concurrent Computation
[diagram: threads applying methods to shared objects through shared memory]

Objectivism
• What is a concurrent object?
– How do we describe one?
– How do we implement one?
– How do we tell if we're right?

FIFO Queue: Enqueue Method
[diagram: q.enq(x) places an item at the tail of the queue]

FIFO Queue: Dequeue Method
[diagram: q.deq() removes the item at the head of the queue]

Sequential Objects
• Each object has a state
– Usually given by a set of fields
– Queue example: sequence of items
• Each object has a set of methods
– Only way to manipulate state
– Queue example: enq and deq methods

Sequential Specifications
• If (precondition)
– the object is in such-and-such a state
– before you call the method,
• Then (postcondition)
– the method will return a particular value
– or throw a particular exception.
• And (postcondition, con't)
– the object will be in some other state
– when the method returns.
• You got a problem with that?

Pre- and Postconditions for Dequeue
• Precondition:
– Queue is non-empty
• Postcondition:
– Returns first item in queue
• Postcondition:
– Removes first item in queue

Pre- and Postconditions for Dequeue (empty case)
• Precondition:
– Queue is empty
• Postcondition:
– Throws Empty exception
• Postcondition:
– Queue state unchanged

Why Sequential Specifications Totally Rock
• Documentation size linear in number of methods
– Each method described in isolation
• Interactions among methods captured by side-effects on object state
– State meaningful between method calls

Why Sequential Specifications Totally Rock (con't)
• Can add new methods (by subclassing)
– Without changing descriptions of old methods
• These properties are so familiar, we don't think about them
– But perhaps we should …

Methods Take Time
[diagram: a q.enq(…) call on a time line; invocation event at 12:00, response (void) at 12:01; the call occupies the interval in between]

Sequential vs Concurrent
• Sequential
– Methods take time? Who knew?
• Concurrent
– Method call is not an event
– Method call is an interval.

Concurrent Methods Take Overlapping Time
[diagram: method-call intervals from different threads overlapping on the time line]

Sequential vs Concurrent
• Sequential:
– Object needs meaningful state only between method calls
• Concurrent:
– Because method calls overlap, object might never be between method calls

Sequential vs Concurrent
• Sequential:
– Each method described in isolation
• Concurrent:
– Must characterize all possible interactions with concurrent calls
– What if two enqs overlap? Two deqs? enq and deq? …

Sequential vs Concurrent
• Sequential:
– Can add new methods without affecting older methods
• Concurrent:
– Everything can potentially interact with everything else

The High-Order Bit
• What does it mean for a concurrent object to be correct?
• What is a concurrent FIFO queue?
– FIFO means strict temporal order
– Concurrent means ambiguous temporal order
– Um, like, that's why they call it "FIFO"?

Linearizability Manifesto
• Each method should
– "take effect"
– Instantaneously
– Between invocation and response events
• Any such concurrent object is
– Linearizable™

Comments on Manifesto
• Common sense, not science
• Scientific justification:
– Facilitates reasoning
– Nice mathematical properties
• Common-sense justification:
– Preserves real-time order
– Matches my intuition (sorry about yours)

More Comments on Manifesto
• Proposed in 1990
– Since then accepted almost everywhere
– Don't leave home without it …
• Not universally adopted
– Usually for good, but specialized reasons
– But most feel need to justify why not

Reasoning
• People are
– OK with sequential reasoning
– Challenged by concurrent reasoning
• Air traffic control
• Toddler room in day-care center
• Most concurrent models
– Propose some kind of equivalence
– Make concurrent problems sequential
– Except they sometimes do it wrong …

Concurrent Specifications
• Naïve approach: world of pain
– We must specify unspeakable number of possible multi-way interactions
• Linearizability: same as it ever was
– Methods still described by pre- and postconditions

Example
[timing diagram: A runs q.enq(x) then q.deq() returning y; B runs q.enq(y) then q.deq() returning x; the overlapping enqueues admit a legal order: linearizable]

Example
[timing diagram: the same four calls arranged so that no legal order exists: not linearizable]

Example
[timing diagram: q.enq(x) against q.enq(y) followed by q.deq() returning x: not linearizable]

Comme Ci, Comme Ça Example
[timing diagram: A: q.enq(x) then q.deq() returning y; B: q.enq(y) then q.deq() returning x; multiple orders OK, i.e. more than one valid linearization]

Read/Write Variable Example
[timing diagram: write(0), read(1), write(2) against write(1), read(0): not linearizable]
[timing diagram: write(0), read(1), write(2) against write(1), read(1): not linearizable, write(1) already happened]
[timing diagram: write(0), write(2) against write(1), read(1): linearizable]
[timing diagram: write(0), read(1), write(2) against write(1), read(2): linearizable]

Formal Model
• Define precisely what we mean
– Ambiguity is bad when intuition is weak
• Allow reasoning
– Formal
– But mostly informal
• In the long run, actually more important
– Ask me why!

Split Method Calls into Two Events
• Invocation
– method name & args
– q.enq(x)
• Response
– result or exception
– q.enq(x) returns void
– q.deq() returns x
– q.deq() throws empty

Invocation Notation
A q.enq(x): thread A, object q, method enq, arguments (x)

Response Notation
A q:void: thread A, object q, result void
A q:empty(): thread A, object q, exception empty

History
A history is a sequence of invocations and responses:
H = A q.enq(3)
    A q:void
    A q.enq(5)
    B p.enq(4)
    B p:void
    B q.deq()
    B q:3

Definition
• An invocation and a response match if
– Thread names agree
– Object names agree
• A matching invocation/response pair is a method call:
A q.enq(3)
A q:void

Object Projections
H|q = A q.enq(3)
      A q:void
      B q.deq()
      B q:3

Thread Projections
H|B = B p.enq(4)
      B p:void
      B q.deq()
      B q:3

Complete Subhistory
• An invocation is pending if it has no matching response
– A pending invocation may or may not have taken effect
• Complete(H) discards pending invocations:
Complete(H) = A q.enq(3)
              A q:void
              B p.enq(4)
              B p:void
              B q.deq()
              B q:3

Sequential Histories
• Each invocation is immediately followed by its matching response
– A final pending invocation is OK:
A q.enq(3)
A q:void
B p.enq(4)
B p:void
B q.deq()
B q:3
A q.enq(5)

Well-Formed Histories
• Per-thread projections are sequential:
H = A q.enq(3)
    B p.enq(4)
    B p:void
    B q.deq()
    A q:void
    B q:3
H|A = A q.enq(3)
      A q:void
H|B = B p.enq(4)
      B p:void
      B q.deq()
      B q:3

Equivalent Histories
• Threads see the same thing in both: H|A = G|A and H|B = G|B
H = A q.enq(3)
    B p.enq(4)
    B p:void
    B q.deq()
    A q:void
    B q:3
G = A q.enq(3)
    A q:void
    B p.enq(4)
    B p:void
    B q.deq()
    B q:3

Sequential Specifications
• A sequential specification is some way of telling whether a
– Single-thread, single-object history
– Is legal
• My favorite way is using
– Pre- and postconditions
– But plenty of other techniques exist …

Legal Histories
• A sequential (multi-object) history H is legal if
– For every object x
– H|x is in the sequential spec for x

Method Calls
[diagram: a method call is the interval between its invocation and response events]

Precedence
• A method call precedes another if its response event precedes the other's invocation event
[diagram: A q.enq(3) … A q:void completes before B q.deq() … B q:3 begins]

Non-Precedence
• Some method calls overlap one another, so neither precedes the other
[diagram: overlapping method-call intervals]

Notation
• Given
– History H
– Method executions m0 and m1 in H
• We say m0 →_H m1 if
– m0 precedes m1
• Relation →_H is a
– Partial order
– Total order if H is sequential

Linearizability
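The dequeue pre- and postconditions in the slides above can be rendered as executable code. A minimal sketch in Python; the class name SeqQueue and the Empty exception name are my rendering, not part of the lecture:

```python
from collections import deque

class Empty(Exception):
    """Raised when deq() is called on an empty queue (the empty-case postcondition)."""

class SeqQueue:
    """Sequential FIFO queue: state is a sequence of items, manipulated only via enq/deq."""
    def __init__(self):
        self._items = deque()

    def enq(self, x):
        # Postcondition: x is appended at the tail; nothing is returned.
        self._items.append(x)

    def deq(self):
        # Precondition violated: throw Empty, leaving the queue state unchanged
        # (we raise before touching the state).
        if not self._items:
            raise Empty()
        # Postcondition: returns and removes the first item in the queue.
        return self._items.popleft()
```

Each method is described in isolation, exactly as the "Why Sequential Specifications Totally Rock" slide promises: the only interaction between enq and deq is through the queue state.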
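The formal machinery above (the history H, the projections H|q and H|B, Complete(H), and well-formedness) can be sketched concretely. A hypothetical encoding, assuming each event carries an explicit inv/res kind; the names Event, proj, is_sequential, is_well_formed, and complete are mine:

```python
from collections import namedtuple

Event = namedtuple("Event", "thread obj label kind")  # kind is "inv" or "res"

# The example history H from the slides; A's enq(5) is a pending invocation.
H = [
    Event("A", "q", "enq(3)", "inv"),
    Event("A", "q", "void",   "res"),
    Event("A", "q", "enq(5)", "inv"),
    Event("B", "p", "enq(4)", "inv"),
    Event("B", "p", "void",   "res"),
    Event("B", "q", "deq()",  "inv"),
    Event("B", "q", "3",      "res"),
]

def proj(h, *, thread=None, obj=None):
    """H|A or H|q: keep only the events of one thread or one object."""
    return [e for e in h
            if (thread is None or e.thread == thread)
            and (obj is None or e.obj == obj)]

def is_sequential(h):
    """Each invocation is immediately followed by its matching response;
    a final pending invocation is OK."""
    i = 0
    while i < len(h):
        if h[i].kind != "inv":
            return False
        if i + 1 == len(h):
            return True  # final pending invocation OK
        nxt = h[i + 1]
        if nxt.kind != "res" or nxt.thread != h[i].thread or nxt.obj != h[i].obj:
            return False
        i += 2
    return True

def is_well_formed(h):
    """Every per-thread projection is sequential."""
    return all(is_sequential(proj(h, thread=t)) for t in {e.thread for e in h})

def complete(h):
    """Complete(H): discard pending invocations, i.e. each thread's
    trailing unmatched invocation."""
    pending = set()
    for t in {e.thread for e in h}:
        ht = proj(h, thread=t)
        if ht and ht[-1].kind == "inv":
            pending.add(id(ht[-1]))
    return [e for e in h if id(e) not in pending]
```

Note that H itself is well-formed but not sequential: the calls by A and B interleave, yet each thread's own projection alternates invocation/response.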
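For tiny histories, the "take effect instantaneously somewhere between invocation and response" idea can be checked by brute force: try every total order of the method calls that extends the precedence relation →_H and replay it against the sequential FIFO spec. A sketch under those assumptions; the interval endpoints and the helper names precedes, legal_fifo, and linearizable are invented for illustration:

```python
from collections import deque
from itertools import permutations

# A method call as (inv_time, res_time, op, val); times only fix ->_H.
def precedes(c0, c1):
    """c0 ->_H c1: c0's response event precedes c1's invocation event."""
    return c0[1] < c1[0]

def legal_fifo(seq):
    """Replay a candidate sequential order against the FIFO queue spec."""
    q = deque()
    for _inv, _res, op, val in seq:
        if op == "enq":
            q.append(val)
        else:  # deq must return the current first item
            if not q or q.popleft() != val:
                return False
    return True

def linearizable(calls):
    """Some total order extending ->_H is legal (exponential, demo only)."""
    n = len(calls)
    for order in permutations(calls):
        respects = all(not precedes(order[j], order[i])
                       for i in range(n) for j in range(i + 1, n))
        if respects and legal_fifo(order):
            return True
    return False

# Overlapping enqueues (like the first Example slide): either enqueue may
# take effect first, so dequeuing y before x is fine.
ok = [
    (0, 4, "enq", "x"),
    (1, 2, "enq", "y"),
    (5, 6, "deq", "y"),
    (7, 8, "deq", "x"),
]
# enq(x) finishes before enq(y) starts, yet y comes out first: no legal order.
bad = [
    (0, 1, "enq", "x"),
    (2, 3, "enq", "y"),
    (4, 5, "deq", "y"),
    (6, 7, "deq", "x"),
]
```

The "multiple orders OK" slide corresponds to the first history: more than one permutation passes both tests, and any of them is an acceptable linearization.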