A Design Framework for Highly Concurrent Systems

Total Pages: 16

File Type: PDF, Size: 1020 KB

A Design Framework for Highly Concurrent Systems

Matt Welsh, Steven D. Gribble, Eric A. Brewer, and David Culler
Computer Science Division, University of California, Berkeley
Berkeley, CA 94720 USA
{mdw,gribble,brewer,culler}@cs.berkeley.edu

Abstract

Building highly concurrent systems, such as large-scale Internet services, requires managing many information flows at once and maintaining peak throughput when demand exceeds resource availability. In addition, any platform supporting Internet services must provide high availability and be able to cope with burstiness of load. Many approaches to building concurrent systems have been proposed, which generally fall into the two categories of threaded and event-driven programming. We propose that threads and events are actually on the ends of a design spectrum, and that the best implementation strategy for these applications is somewhere in between.

We present a general-purpose design framework for building highly concurrent systems, based on three design components (tasks, queues, and thread pools) which encapsulate the concurrency, performance, fault isolation, and software engineering benefits of both threads and events. We present a set of design patterns that can be applied to map an application onto an implementation using these components. In addition, we provide an analysis of several systems (including an Internet services platform and a highly available, distributed, persistent data store) constructed using our framework, demonstrating its benefit for building and reasoning about concurrent applications.

1 Introduction

Large Internet services must deal with concurrency at an unprecedented scale. The number of concurrent sessions and hits per day to Internet sites translates into an even higher number of I/O and network requests, placing enormous demands on underlying resources. Microsoft's web sites receive over 300 million hits with 4.1 million users a day; Lycos has over 82 million page views and more than a million users daily. As the demand for Internet services grows, and as their functionality expands, new system design techniques must be used to manage this load.

In addition to high concurrency, Internet services have three other properties which necessitate a fresh look at how these systems are designed: burstiness, continuous demand, and human-scale access latency. Burstiness of load is fundamental to the Internet; dealing with overload conditions must be designed into the system from the beginning. Internet services must also exhibit very high availability, with a downtime of no more than a few minutes a year. Finally, because access latencies for Internet services are at human scale and are limited by WAN and modem access times, an important engineering tradeoff is to optimize for high throughput rather than low latency.

Building highly concurrent systems is inherently difficult. Structuring code to achieve high throughput is not well supported by existing programming models. While threads are a commonly used device for expressing concurrency, the high resource usage and scalability limits of many thread implementations have led many developers to prefer an event-driven approach. However, these event-driven systems are generally built from scratch for particular applications, and depend on mechanisms not well supported by most languages and operating systems. In addition, using event-driven programming for concurrency can be more complex to develop and debug than threads.

We argue that threads and events are best viewed as the opposite ends of a design spectrum; the key to developing highly concurrent systems is to operate in the middle of this spectrum.
Event-driven techniques are useful for obtaining high concurrency, but when building real systems, threads are valuable (and in many cases required) for exploiting multiprocessor parallelism and dealing with blocking I/O mechanisms. Most developers are aware that this spectrum exists, utilizing both thread- and event-oriented approaches for concurrency. However, the dimensions of this spectrum are not currently well understood.

We propose a general-purpose design framework for building highly concurrent systems. The key idea behind our framework is to use event-driven programming for high throughput, but leverage threads (in limited quantities) for parallelism and ease of programming. In addition, our framework addresses the other requirements for these applications: high availability and maintenance of high throughput under load. The former is achieved by introducing fault boundaries between application components; the latter by conditioning the load placed on system resources.

This framework provides a means to reason about the structural and performance characteristics of the system as a whole. We analyze several different systems in terms of the framework, including a distributed persistent store and a scalable Internet services platform. This analysis demonstrates that our design framework provides a useful model for building and reasoning about concurrent systems.

2 Motivation: Robust Throughput

To explore the space of concurrent programming styles, consider a hypothetical server (as illustrated in Figure 1) that receives A tasks per second from a number of clients, imposes a server-side delay of L seconds per task before returning a response, but overlaps as many tasks as possible. For each response that a client receives, it immediately issues another task to the server; this is therefore a closed-loop system. We denote the task completion rate of the server as S. A concrete example of such a server would be a web proxy cache; if a request to the cache misses, there is a large latency while the page is fetched from the authoritative server, but during that time the task doesn't consume CPU cycles.

Figure 1: Concurrent server model: The server receives A tasks per second, handles each task with a latency of L seconds, and has a service response rate of S tasks per second; at any moment, A x L tasks are resident in the server. The system is closed loop: each service response causes another task to be injected into the server; thus, S = A in steady state.
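The steady-state accounting in Figure 1 is Little's law applied to a closed loop; a worked restatement in the paper's notation (the numeric example is ours, not from the paper):

```latex
% Number of tasks resident in the server (Little's law):
%   A = task arrival rate (tasks/sec), L = per-task latency (sec)
N = A \cdot L
% The loop is closed: every response immediately triggers a new task,
% so in steady state the completion rate equals the arrival rate:
S = A
% Example: A = 1000 tasks/sec and L = 0.05 sec give N = 50 tasks in flight.
```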
There are two prevalent strategies for handling concurrency in modern systems: threads and events. Threading allows programmers to write straight-line code and rely on the operating system to overlap computation and I/O by transparently switching across threads. The alternative, events, allows programmers to manage concurrency explicitly by structuring code as a single-threaded handler that reacts to events (such as non-blocking I/O completions, application-specific messages, or timer events). We explore each of these in turn, and then formulate a robust hybrid design pattern, which leads to our general design framework.

2.1 Threaded Servers

A simple threaded implementation of this server (Figure 2) uses a single, dedicated thread to service the network, and hands off incoming tasks to individual task-handling threads, which step through all of the stages of processing that task. One handler thread is created per task. An optimization of this simple scheme creates a pool of several threads in advance and dispatches tasks to threads from this pool, thereby amortizing the high cost of thread creation and destruction. In steady state, the number of threads T that execute concurrently in the server is S x L. As the per-task latency increases, there is a corresponding increase in the number of concurrent threads needed to absorb this latency while maintaining a fixed throughput; likewise, the number of threads scales linearly with throughput for fixed latency.

Figure 2: Threaded server: For each task that arrives at the server, a thread is either dispatched from a statically created pool (dispatch( )) or created on demand (create( )) to handle the task; each handler simulates the service delay (thread.sleep( L secs )). At any given time, there are a total of T threads executing concurrently, where T = A x L.

Threads have become the dominant form of expressing concurrency. Thread support is standardized across most operating systems, and is so well established that it is incorporated in modern languages, such as Java [9]. Programmers are comfortable coding in the sequential programming style of threads, and tools are relatively mature. In addition, threads allow applications to scale with the number of processors in an SMP system, as the operating system can schedule threads to execute concurrently on separate processors.

Thread programming presents a number of correctness and tuning challenges. Synchronization primitives (such as locks, mutexes, or condition variables) are a common source of bugs. Lock contention can cause serious performance degradation as the number of threads competing for a lock increases.

Figure 3: Threaded server throughput degradation: This benchmark has a very fast client issuing many concurrent 150-byte tasks over a single TCP connection to a threaded server as in Figure 2, with L = 50 ms, on a 167 MHz UltraSPARC running Solaris 5.6. The arrival rate determines the number of concurrent threads; sufficient threads are preallocated for the load. As the number of concurrent threads T increases, throughput increases until T >= T', after which the throughput of the system degrades substantially. (The plot shows server throughput S, in tasks/sec, against the number of threads T executing in the server, from 1 to 10,000 on a logarithmic axis, peaking near T = T'.)

Regardless of how well the threaded server is crafted, as the number of threads in a system grows, operating system overhead (scheduling and aggregate memory footprint) increases, leading to a decrease in the overall performance of the system.

... event-driven programmers. However, event-driven programming avoids many of the bugs associated with synchronization ...
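The threaded design of Figure 2, and the event-driven alternative the excerpt begins to describe, can both be sketched in a few lines of Java. First the pooled thread-per-task server (a minimal sketch; the paper gives no code, and the port, pool size, and 50 ms simulated latency are illustrative assumptions):

```java
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch of the pooled variant of Figure 2: one dedicated thread services the
// network and hands each incoming task to a worker from a fixed pool,
// amortizing thread creation/destruction costs.
public class ThreadedServer {
    static final long L_MILLIS = 50;   // per-task latency L (Figure 3 uses 50 ms)
    static final int POOL_SIZE = 64;   // in steady state the pool needs about S x L threads

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(POOL_SIZE);
        try (ServerSocket listen = new ServerSocket(8080)) {
            while (true) {                          // the single network-servicing thread
                Socket task = listen.accept();
                pool.execute(() -> handle(task));   // dispatch( ) from the pre-created pool
            }
        }
    }

    static void handle(Socket s) {
        try (s) {
            Thread.sleep(L_MILLIS);                 // stands in for thread.sleep( L secs )
            s.getOutputStream().write("done\n".getBytes());
        } catch (Exception e) {
            // a failure here is isolated to this task's worker thread
        }
    }
}
```

In steady state the pool needs roughly S x L threads; the degradation in Figure 3 comes from pushing T well past the saturation point T'. For contrast, a single-threaded event loop over java.nio readiness events (again an illustration of the event-driven style, not code from the paper):

```java
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

// One thread, no locks: concurrency comes from interleaving small event
// handlers instead of from many blocked threads.
public class EventDrivenServer {
    public static void main(String[] args) throws Exception {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(8080));
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);

        ByteBuffer buf = ByteBuffer.allocate(512);
        while (true) {
            selector.select();                       // wait for any readiness event
            Iterator<SelectionKey> it = selector.selectedKeys().iterator();
            while (it.hasNext()) {
                SelectionKey key = it.next();
                it.remove();
                if (key.isAcceptable()) {            // new-connection event
                    SocketChannel c = server.accept();
                    c.configureBlocking(false);
                    c.register(selector, SelectionKey.OP_READ);
                } else if (key.isReadable()) {       // non-blocking I/O completion event
                    SocketChannel c = (SocketChannel) key.channel();
                    buf.clear();
                    if (c.read(buf) < 0) { c.close(); continue; }
                    buf.flip();
                    c.write(buf);                    // echo; a real server queues further stages
                }
            }
        }
    }
}
```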
Recommended publications
  • Applying the Proactor Pattern to High-Performance Web Servers
APPLYING THE PROACTOR PATTERN TO HIGH-PERFORMANCE WEB SERVERS. James Hu ([email protected]), Irfan Pyarali ([email protected]), Douglas C. Schmidt ([email protected]), Department of Computer Science, Washington University, St. Louis, MO 63130, USA. This paper is to appear at the 10th International Conference on Parallel and Distributed Computing and Systems, IASTED, Las Vegas, Nevada, October 28-31, 1998.

ABSTRACT: Modern operating systems provide multiple concurrency mechanisms to develop high-performance Web servers. Synchronous multi-threading is a popular mechanism for developing Web servers that must perform multiple operations simultaneously to meet their performance requirements. In addition, an increasing number of operating systems support asynchronous mechanisms that provide the benefits of concurrency while alleviating much of the performance overhead of synchronous multi-threading. This paper provides two contributions to the study of high-performance Web servers.

1 INTRODUCTION: Computing power and network bandwidth on the Internet have increased dramatically over the past decade. High-speed networks (such as ATM and Gigabit Ethernet) and high-performance I/O subsystems (such as RAID) are becoming ubiquitous. In this context, developing scalable Web servers that can exploit these innovations remains a key challenge for developers. Thus, it is increasingly important to alleviate common Web server bottlenecks, such as inappropriate choice of concurrency and dispatching strategies, excessive filesystem access, and unnecessary data copying. Our research vehicle for exploring the performance impact of applying various Web server optimization techniques is the JAWS Adaptive Web Server [1]. JAWS is both an adaptive Web server and a development framework for Web servers that runs on multiple OS platforms. (A proactor-style sketch follows this entry.)
    [Show full text]
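The proactor style is easiest to see against an asynchronous-I/O API; a minimal sketch using Java's AsynchronousServerSocketChannel (an illustration of the pattern only, not JAWS code; JAWS itself is a C++ framework):

```java
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousServerSocketChannel;
import java.nio.channels.AsynchronousSocketChannel;
import java.nio.channels.CompletionHandler;

// Proactor style: the OS performs the I/O asynchronously and *completes*
// into our handlers, instead of us reacting to readiness and doing the I/O.
public class ProactorEcho {
    public static void main(String[] args) throws Exception {
        AsynchronousServerSocketChannel server =
            AsynchronousServerSocketChannel.open().bind(new InetSocketAddress(8080));

        server.accept(null, new CompletionHandler<AsynchronousSocketChannel, Void>() {
            @Override public void completed(AsynchronousSocketChannel conn, Void a) {
                server.accept(null, this);           // re-arm for the next connection
                ByteBuffer buf = ByteBuffer.allocate(512);
                conn.read(buf, buf, new CompletionHandler<Integer, ByteBuffer>() {
                    @Override public void completed(Integer n, ByteBuffer b) {
                        if (n < 0) return;           // peer closed
                        b.flip();
                        conn.write(b);               // handler runs when the read has completed
                    }
                    @Override public void failed(Throwable t, ByteBuffer b) { }
                });
            }
            @Override public void failed(Throwable t, Void a) { }
        });
        Thread.sleep(Long.MAX_VALUE);                // handlers run on the runtime's channel group
    }
}
```

The inversion relative to a reactor is that handlers receive completed I/O (the buffer is already filled) rather than readiness notifications.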
  • Thread Management for High Performance Database Systems - Design and Implementation
Nr.: FIN-003-2018. Thread Management for High Performance Database Systems: Design and Implementation. Technical Report. Robert Jendersie, Johannes Wuensche, Johann Wagner, Marten Wallewein-Eising, Marcus Pinnecke, and Gunter Saake. Database and Software Engineering Group, Fakultät für Informatik, Otto-von-Guericke-Universität Magdeburg. Electronic technical-report series of the Fakultät für Informatik, ISSN 1869-5078. Editorial deadline: 21.08.2018. Contact: Marcus Pinnecke, [email protected]; http://www.cs.uni-magdeburg.de/Technical_reports.html
    [Show full text]
  • Concurrency & Parallel Programming Patterns
Concurrency & Parallel Programming Patterns. Evgeny Gavrin.

Outline: 1. Concurrency vs Parallelism. 2. Patterns by groups. 3. Detailed overview of parallel patterns. 4. Summary. 5. Proposal for language.

Concurrency vs Parallelism:
● Parallelism is the simultaneous execution of computations: "doing lots of things at once".
● Concurrency is the composition of independently executing processes: "dealing with lots of things at once".

Patterns by groups. Architectural Patterns: these patterns define the overall architecture for a program.
● Pipe-and-filter: view the program as filters (pipeline stages) connected by pipes (channels). Data flows through the filters, which take input and transform it into output. (A minimal sketch of this pattern follows this entry.)
● Agent and Repository: a collection of autonomous agents update state managed on their behalf in a central repository.
● Process control: the program is structured analogously to a process control pipeline, with monitors and actuators moderating feedback loops and a pipeline of processing stages.
● Event-based implicit invocation: the program is a collection of agents that post events they watch for and issue events for other agents. The architecture enforces a high-level abstraction, so invocation of an agent is implicit, i.e. not hardwired to a specific controlling agent.
● Model-view-controller: an architecture with a central model for the state of the program, a controller that manages the state, and one or more agents that export views of the model appropriate to different uses of the model.
● Bulk iterative (AKA bulk synchronous): a program that proceeds iteratively: update state, check against a termination condition, complete coordination, and proceed to the next iteration.
● Map-reduce: the program is represented in terms of two classes of functions.
    [Show full text]
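As a concrete rendering of the pipe-and-filter entry above, a minimal two-stage pipeline in Java with blocking queues as the pipes (stage names and data are invented for illustration):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Pipe-and-filter: each stage is a thread (filter) consuming from an input
// queue (pipe) and producing to an output queue.
public class PipeAndFilter {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> raw = new ArrayBlockingQueue<>(100);
        BlockingQueue<String> upper = new ArrayBlockingQueue<>(100);

        Thread stage1 = new Thread(() -> {          // transform filter
            try {
                while (true) upper.put(raw.take().toUpperCase());
            } catch (InterruptedException e) { }
        });
        Thread stage2 = new Thread(() -> {          // sink filter
            try {
                while (true) System.out.println(upper.take());
            } catch (InterruptedException e) { }
        });
        stage1.setDaemon(true); stage2.setDaemon(true);
        stage1.start(); stage2.start();

        for (String s : new String[] {"pipe", "and", "filter"}) raw.put(s);
        Thread.sleep(200);                          // let the pipeline drain (sketch only)
    }
}
```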
  • Model Checking Multithreaded Programs With
Model Checking Multithreaded Programs with Asynchronous Atomic Methods. Koushik Sen and Mahesh Viswanathan, Department of Computer Science, University of Illinois at Urbana-Champaign. {ksen,vmahesh}@uiuc.edu

Abstract. In order to make multithreaded programming manageable, programmers often follow a design principle where they break the problem into tasks which are then solved asynchronously and concurrently on different threads. This paper investigates the problem of model checking programs that follow this idiom. We present a programming language SPL that encapsulates this design pattern. SPL extends a simplified form of sequential Java, to which we add the capability of making asynchronous method invocations in addition to the standard synchronous method calls, and the ability to execute asynchronous methods in threads atomically and concurrently. Our main result shows that the control state reachability problem for finite SPL programs is decidable. Therefore, such multithreaded programs can be model checked using the counter-example guided abstraction-refinement framework.

1 Introduction. Multithreaded programming is often used in software as it leads to reduced latency, improved response times of interactive applications, and more optimal use of processing power. Multithreaded programming also allows an application to progress even if one thread is blocked for an I/O operation. However, writing correct programs that use multiple threads is notoriously difficult, especially in the presence of a shared mutable memory. Since threads can interleave, there can be unintended interference through concurrent access of shared data, resulting in software errors due to data races and atomicity violations. (A plain-Java sketch of the asynchronous-atomic idiom follows this entry.)
    [Show full text]
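SPL is the paper's own language, but the idiom it captures, synchronous calls plus asynchronous method invocations that execute atomically, can be approximated in plain Java; a hedged sketch (class and method names are ours, and the single-threaded executor is just one way to obtain atomicity):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Asynchronous invocations are queued and each runs atomically with respect
// to the shared state, here enforced by a single-threaded executor.
public class AsyncAtomic {
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private int counter = 0;                // shared state touched only by the worker

    // standard synchronous method call: runs on the caller's thread
    public int peek() { return counter; }   // safe here only after shutdown completes

    // asynchronous method invocation: queued, executed atomically later
    public void asyncIncrement() {
        worker.execute(() -> counter++);     // no interleaving within the method body
    }

    public static void main(String[] args) throws Exception {
        AsyncAtomic a = new AsyncAtomic();
        for (int i = 0; i < 1000; i++) a.asyncIncrement();
        a.worker.shutdown();
        a.worker.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println(a.peek());        // prints 1000: invocations never interfered
    }
}
```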
  • Lecture 26: Creational Patterns
Creational Patterns. CSCI 4448/5448: Object-Oriented Analysis & Design, Lecture 26, 11/29/2012. © Kenneth M. Anderson, 2012.

Goals of the Lecture:
• Cover material from Chapters 20-22 of the textbook
• Lessons from design patterns: factories
• Singleton pattern
• Object Pool pattern
• Also discuss: Builder pattern, lazy instantiation

Pattern Classification:
• The Gang of Four classified patterns in three ways
• The behavioral patterns are used to manage variation in behaviors (think Strategy pattern)
• The structural patterns are useful to integrate existing code into new object-oriented designs (think Bridge)
• The creational patterns are used to create objects: Abstract Factory, Builder, Factory Method, Prototype & Singleton

Factories & Their Role in OO Design:
• It is important to manage the creation of objects. Code that mixes object creation with the use of objects can quickly become non-cohesive.
• A system may have to deal with a variety of different contexts, with each context requiring a different set of objects; in design patterns, the context determines which concrete implementations need to be present.
• The code to determine the current context, and thus which objects to instantiate, can become complex, with many different conditional statements.
• If you mix this type of code with the use of the instantiated objects, your code becomes cluttered: the use scenarios can often happen in a few lines of code, but if combined with creational code, the operational code gets buried behind the creational code.

Factories Provide Cohesion:
• The use of factories can address these issues: the conditional code can be hidden within them. Pass in the parameters associated with the current context and get back the objects you need for the situation; then use those objects to get your work done.
• Factories concern themselves just with creation, letting your code focus on other things. (A minimal factory sketch follows this entry.)
    [Show full text]
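The cohesion argument reads more concretely in code; a minimal hypothetical example of hiding context-dependent creation behind a factory (all names invented for illustration):

```java
// Without the factory, this conditional would clutter every use site.
interface Renderer { void draw(String s); }

class ConsoleRenderer implements Renderer {
    public void draw(String s) { System.out.println(s); }
}

class HtmlRenderer implements Renderer {
    public void draw(String s) { System.out.println("<p>" + s + "</p>"); }
}

// The factory concerns itself just with creation...
class RendererFactory {
    static Renderer forContext(String context) {
        return context.equals("web") ? new HtmlRenderer() : new ConsoleRenderer();
    }
}

public class FactoryDemo {
    public static void main(String[] args) {
        // ...letting the operational code stay focused on its own work.
        Renderer r = RendererFactory.forContext("web");
        r.draw("hello");
    }
}
```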
  • An Enhanced Thread Synchronization Mechanism for Java
SOFTWARE—PRACTICE AND EXPERIENCE. Softw. Pract. Exper. 2001; 31:667–695 (DOI: 10.1002/spe.383). An Enhanced Thread Synchronization Mechanism for Java. Hsin-Ta Chiao and Shyan-Ming Yuan, Department of Computer and Information Science, National Chiao Tung University, 1001 Ta Hsueh Road, Hsinchu 300, Taiwan.

SUMMARY: The thread synchronization mechanism of Java is derived from Hoare's monitor concept. In the authors' view, however, it is oversimplified and suffers from the following four drawbacks. First, it belongs to the category of no-priority monitors, whose design, as reported in the literature on concurrent programming, is not well rated. Second, it offers only one condition queue; where more than one long-term synchronization event is required, this restriction both degrades performance and further complicates the ordering problems that a no-priority monitor presents. Third, it lacks support for building more elaborate scheduling programs. Fourth, during nested monitor invocations, deadlock may occur. In this paper, we first analyze these drawbacks in depth before proceeding to present our own proposal, which is a new monitor-based thread synchronization mechanism that we term EMonitor. This mechanism is implemented solely in Java, thus avoiding the need for any modification to the underlying Java Virtual Machine. A preprocessor is employed to translate the EMonitor syntax into pure Java code that invokes the EMonitor class libraries. We conclude with a comparison of the performance of the two monitors and allow the experimental results to demonstrate that, in most cases, replacing the Java version with the EMonitor version for developing concurrent Java objects is perfectly feasible. (A sketch of the single-condition-queue limitation follows this entry.)
    [Show full text]
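The second drawback (a single condition queue) is concrete in code: Java's built-in synchronized monitor gives each object one wait set, whereas java.util.concurrent.locks permits one Condition per long-term event. A sketch of the latter, not of EMonitor itself:

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// A bounded buffer with two separate condition queues ("not full" and
// "not empty"), which a plain synchronized monitor cannot express directly.
public class BoundedBuffer<T> {
    private final Object[] items = new Object[16];
    private int head, tail, count;
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition notFull  = lock.newCondition(); // producers wait here
    private final Condition notEmpty = lock.newCondition(); // consumers wait here

    public void put(T x) throws InterruptedException {
        lock.lock();
        try {
            while (count == items.length) notFull.await();
            items[tail] = x; tail = (tail + 1) % items.length; count++;
            notEmpty.signal();               // wake exactly the waiters that care
        } finally { lock.unlock(); }
    }

    @SuppressWarnings("unchecked")
    public T take() throws InterruptedException {
        lock.lock();
        try {
            while (count == 0) notEmpty.await();
            T x = (T) items[head]; head = (head + 1) % items.length; count--;
            notFull.signal();
            return x;
        } finally { lock.unlock(); }
    }
}
```

With plain synchronized, producers and consumers would share a single wait set, and notifyAll would wake both kinds of waiter.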
  • Designing for Performance: Concurrency and Parallelism COS 518: Computer Systems Fall 2015
Designing for Performance: Concurrency and Parallelism. COS 518: Computer Systems, Fall 2015. Logan Stafman. Adapted from slides by Mike Freedman.

Definitions:
• Concurrency: execution of two or more tasks overlaps in time.
• Parallelism: execution of two or more tasks occurs simultaneously.

Concurrency without parallelism? Parts of tasks interact with other subsystems (network I/O, disk I/O, GPU, ...); another task can be scheduled while the first waits on the subsystem's response.

Scheduling for fairness: on a time-sharing system we also want to schedule between tasks, even if one is not blocking; otherwise, certain tasks can keep processing, which leads to starvation of other tasks. Preemptive scheduling interrupts the processing of one task to process another (why with tasks and not network packets?). There are many scheduling disciplines: FIFO, Shortest Remaining Time, Strict Priority, Round-Robin.

Concurrency with parallelism: execute code concurrently across CPUs (clusters, cores). CPU parallelism differs from distributed systems in the ready availability of shared memory; yet to avoid the difference between parallelism across local and remote cores, many apps just use message passing for both (like HPC's use of MPI). Symmetric Multiprocessors (SMPs); Non-Uniform Memory Architectures (NUMA). Pros of NUMA: applications split between different processors can share memory close to the hardware, and bus bandwidth usage is reduced. Cons: one must ensure that applications sharing memory run on processors that share that memory.

Forms of task parallelism:
• Processes: isolated process address space; higher overhead when switching between processes.
• Threads: concurrency within a process; shared address space. Three forms: kernel threads (1:1; kernel support, can leverage hardware parallelism), user threads (N:1; thread library in the system runtime, fastest context switching, but cannot benefit from multi-threaded/multiprocessor hardware), and hybrid (M:N; schedule M user threads on N kernel threads).
    [Show full text]
  • Assessment of Barrier Implementations for Fine-Grain Parallel Regions on Current Multi-Core Architectures
Assessment of Barrier Implementations for Fine-Grain Parallel Regions on Current Multi-core Architectures. Simon A. Berger and Alexandros Stamatakis, The Exelixis Lab, Department of Computer Science, Technische Universität München, Boltzmannstr. 3, D-85748 Garching b. München, Germany. Email: [email protected]; WWW: http://wwwkramer.in.tum.de/exelixis/

Abstract: Barrier performance for synchronizing threads on current multi-core systems can be critical for scientific applications that traverse a large number of relatively small parallel regions, that is, that exhibit an unfavorable computation-to-synchronization ratio. By means of a synthetic and a real-world benchmark we assess 4 alternative barrier implementations on 7 current multi-core systems with 2 up to 32 cores. We find that barrier performance is application- and data-specific with respect to cache utilization, but that a rather naïve lock-free barrier implementation yields good results across all applications and multi-core systems tested. We also assess distinct implementations of reduction operations that are computed in conjunction with the barriers. The synthetic and real-world benchmarks are made available as open-source code for further testing. Keywords: barriers; multi-cores; threads; RAxML

Figure 1. Classic fork-join paradigm: a master thread forks worker threads for each parallel region and joins them afterwards.

I. INTRODUCTION: The performance of barriers for synchronizing threads on modern general-purpose multi-core systems is of vital importance for the efficiency of scientific codes. Barrier performance can become critical if a scientific code exhibits a high number of relatively small parallel regions. In addition, we analyze the efficient implementation of reduction operations (sums over the double values produced by each for-loop iteration) that are frequently required in conjunction with barriers. (A fork-join barrier sketch follows this entry.)
    [Show full text]
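A minimal Java rendering of the fork-join cycle in the paper's Figure 1, with the reduction folded into the barrier action (the paper's own implementations are lower-level; this only shows the shape, and the thread count and workload are illustrative):

```java
import java.util.concurrent.CyclicBarrier;

// Fork-join with a barrier per parallel region: workers compute partial
// sums, and the barrier action performs the reduction at each join point.
public class BarrierReduction {
    static final int THREADS = 4;
    static final double[] partial = new double[THREADS];

    public static void main(String[] args) {
        CyclicBarrier barrier = new CyclicBarrier(THREADS, () -> {
            double sum = 0;
            for (double p : partial) sum += p;       // reduction at the join
            System.out.println("iteration sum = " + sum);
        });
        for (int t = 0; t < THREADS; t++) {
            final int id = t;
            new Thread(() -> {
                for (int iter = 0; iter < 3; iter++) { // many small parallel regions
                    partial[id] = work(id, iter);
                    try { barrier.await(); }           // join, then the next fork
                    catch (Exception e) { return; }
                }
            }).start();
        }
    }

    static double work(int id, int iter) { return id + 0.1 * iter; }
}
```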
  • Migrating Thread-Based Intentional Concurrent Programming to a Task-Based Paradigm
University of New Hampshire, University of New Hampshire Scholars' Repository. Master's Theses and Capstones, Student Scholarship, Fall 2016. MIGRATING THREAD-BASED INTENTIONAL CONCURRENT PROGRAMMING TO A TASK-BASED PARADIGM. Seth Adam Hager, University of New Hampshire, Durham. Follow this and additional works at: https://scholars.unh.edu/thesis. Recommended Citation: Hager, Seth Adam, "MIGRATING THREAD-BASED INTENTIONAL CONCURRENT PROGRAMMING TO A TASK-BASED PARADIGM" (2016). Master's Theses and Capstones. 885. https://scholars.unh.edu/thesis/885. This Thesis is brought to you for free and open access by the Student Scholarship at University of New Hampshire Scholars' Repository. It has been accepted for inclusion in Master's Theses and Capstones by an authorized administrator of University of New Hampshire Scholars' Repository. For more information, please contact [email protected]. BY Seth Hager, B.M., University of Massachusetts Lowell, 2004. THESIS submitted to the University of New Hampshire in partial fulfillment of the requirements for the degree of Master of Science in Computer Science, September 2016. This thesis has been examined and approved in partial fulfillment of the requirements for the degree of Master of Science in Computer Science by: thesis director Philip J. Hatcher, Professor of Computer Science; Michel H. Charpentier, Associate Professor of Computer Science; R. Daniel Bergeron, Professor of Computer Science. August 16th, 2016. Original approval signatures are on file with the University of New Hampshire Graduate School. DEDICATION: For Lily and Jacob. ACKNOWLEDGMENTS: I would like to thank the members of my committee for all of their time and effort.
    [Show full text]
  • Intrinsically-Typed Mechanized Semantics for Session Types
Intrinsically-Typed Mechanized Semantics for Session Types. Peter Thiemann, University of Freiburg, Germany. [email protected]

ABSTRACT: Session types have emerged as a powerful paradigm for structuring communication-based programs. They guarantee type soundness and session fidelity for concurrent programs with sophisticated communication protocols. As type soundness proofs for languages with session types are tedious and technically involved, it is rare to see mechanized soundness proofs for these systems. We present an executable intrinsically typed small-step semantics for a realistic functional session type calculus. The calculus includes linearity, recursion, and recursive sessions with subtyping. Asynchronous communication is modeled with an encoding. The semantics is implemented in Agda as an intrinsically typed, interruptible CEK machine. This implementation proves type preservation and a particular notion of progress by construction.

There are synchronous and asynchronous variants [Gay and Vasconcelos 2010; Lindley and Morris 2015; Wadler 2012], or variants with additional typing features (e.g., dependent types [Toninho and Yoshida 2018], context-free [Padovani 2017; Thiemann and Vasconcelos 2016]), as well as extensions to deal with more than two participants in a protocol [Honda et al. 2008]. They also found their way into functional and object-oriented languages [Dezani-Ciancaglini et al. 2009]. Generally, session types are inspired by linear type systems and systems with typestate, as channels change their type at each communication operation and must thus be handled linearly. Some variants are directly connected to linear logic via the Curry-Howard correspondence [Caires and Pfenning 2010; Toninho et al. 2013; Wadler 2012].
    [Show full text]
  • A Comparison of Parallel Design Patterns for Game Development
Faculty of Technology & Society (Teknik och samhälle), Computer Science and Media Technology (Datavetenskap och medieteknik). Bachelor-level thesis, 15 credits (Examensarbete 15 högskolepoäng, grundnivå).

A Comparison of Parallel Design Patterns for Game Development (En Jämförelse av Parallella Designmönster för Spelutveckling). Robin Andblom, Carl Sjöberg. Malmö, Sweden. Degree: Bachelor, 180 hp. Main field: Computer Science. Program: Game Development. Supervisor: Carl Johan Gribel. Examiner: Carl-Magnus Olsson. Date of final seminar: 2018-01-15.

Abstract: As processor performance capabilities can only be increased through the use of a multicore architecture, software needs to be developed to utilize the parallelism offered by the additional cores. Especially game developers need to seize this opportunity to save cycles and decrease the general rendering time. One of the existing advances towards this potential has been the creation of multithreaded game engines that take advantage of the additional processing units. In such engines, different branches of the game loop are parallelized. However, the specifics of the parallel design patterns used are not outlined. Neither are any ideas of how to combine these patterns proposed. These missing factors are addressed in this article, to provide a guideline for when to use which one of two parallel design patterns: fork-join and pipeline parallelism.
    [Show full text]
  • Leader/Followers a Design Pattern for Efficient Multi-Threaded I/O
Leader/Followers: A Design Pattern for Efficient Multi-threaded I/O Demultiplexing and Dispatching. Douglas C. Schmidt and Carlos O'Ryan ({schmidt,coryan}@uci.edu), Electrical and Computer Engineering Dept., University of California, Irvine, CA 92697, USA; Michael Kircher ([email protected]), Siemens Corporate Technology, Munich, Germany; Irfan Pyarali ([email protected]), Department of Computer Science, Washington University, St. Louis, MO 63130, USA.

1 Intent: The Leader/Followers design pattern provides a concurrency model where multiple threads can efficiently demultiplex events and dispatch event handlers that process I/O handles shared by the threads.

2 Example: Consider the design of a multi-tier, high-volume, on-line transaction processing (OLTP) system shown in the following figure. In this design, front-end communication servers route requests to back-end database servers. The front-end communication servers are actually "hybrid" client/server applications that perform two primary tasks. First, they receive requests arriving simultaneously from hundreds or thousands of remote clients over wide-area communication links, such as X.25 or TCP/IP. Second, they validate the remote client requests and forward valid requests over TCP/IP connections to back-end database servers. In contrast, the back-end database servers are "pure" servers that perform the designated transactions. After a transaction commits, the database server returns its results to the associated communication server, which then forwards the results back to the originating remote client. The servers in this OLTP system spend most of their time processing various types of I/O operations in response to requests. (A minimal Java sketch of the pattern follows this entry.)
    [Show full text]
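A minimal shape sketch of the pattern in Java (the pattern's published examples are C++; the accept()-based handle, port, and pool size here are illustrative assumptions):

```java
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Leader/Followers: all threads share one handle; exactly one (the leader)
// waits for an event, promotes a follower, then processes the event itself,
// avoiding a hand-off of the event between threads.
public class LeaderFollowers implements Runnable {
    private final ServerSocket handle;                  // shared handle: a listening socket
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition followers = lock.newCondition();
    private boolean leaderPresent = false;

    LeaderFollowers(ServerSocket s) { handle = s; }

    public void run() {
        while (true) {
            lock.lock();                                // join the follower set
            try {
                while (leaderPresent) followers.awaitUninterruptibly();
                leaderPresent = true;                   // this thread is now the leader
            } finally { lock.unlock(); }

            Socket event;
            try { event = handle.accept(); }            // leader demultiplexes I/O
            catch (Exception e) { promoteFollower(); return; }

            promoteFollower();                          // a follower becomes the new leader
            process(event);                             // ...while this thread dispatches
        }
    }

    private void promoteFollower() {
        lock.lock();
        try { leaderPresent = false; followers.signal(); }
        finally { lock.unlock(); }
    }

    private void process(Socket s) {
        try (s) { s.getOutputStream().write('!'); } catch (Exception e) { }
    }

    public static void main(String[] args) throws Exception {
        LeaderFollowers lf = new LeaderFollowers(new ServerSocket(8080));
        for (int i = 0; i < 4; i++) new Thread(lf).start();  // a small pool of peers
    }
}
```

The point of the protocol is that the event is processed by the very thread that detected it; there is no producer/consumer queue between an I/O thread and worker threads.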