An Evaluation of Cache Coherence Protocols
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Exploiting Software Information for an Efficient Memory Hierarchy
EXPLOITING SOFTWARE INFORMATION FOR AN EFFICIENT MEMORY HIERARCHY BY RAKESH KOMURAVELLI DISSERTATION Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Computer Science in the Graduate College of the University of Illinois at Urbana-Champaign, 2014 Urbana, Illinois Doctoral Committee: Professor Sarita V. Adve, Director of Research Professor Marc Snir, Chair Professor Vikram S. Adve Professor Wen-mei W. Hwu Dr. Ravi Iyer, Intel Labs Dr. Gilles Pokam, Intel Labs Dr. Pablo Montesinos, Qualcomm Research ABSTRACT Power consumption is one of the most important factors in the design of today’s processor chips. Multicore and heterogeneous systems have emerged to address the rising power concerns. Since the memory hierarchy is becoming one of the major consumers of the on-chip power budget in these systems [73], designing an efficient memory hierarchy is critical to future systems. We identify three sources of inefficiencies in memory hierarchies of today’s systems: (a) coherence, (b) data communication, and (c) data storage. This thesis takes the stand that many of these inefficiencies are a result of today’s software-agnostic hardware design. There is a lot of information in the software that can be exploited to build an efficient memory hierarchy. This thesis focuses on identifying some of the inefficiencies related to each of the above three sources, and proposing various techniques to mitigate them by exploiting information from the software. First, we focus on inefficiencies related to coherence and communication. Today’s hardware based direc- tory coherence protocols are extremely complex and incur unnecessary overheads for sending invalidation messages and maintaining sharer lists. -
Study and Performance Analysis of Cache-Coherence Protocols in Shared-Memory Multiprocessors
Study and performance analysis of cache-coherence protocols in shared-memory multiprocessors Dissertation presented by Anthony GÉGO for obtaining the Master’s degree in Electrical Engineering Supervisor(s) Jean-Didier LEGAT Reader(s) Olivier BONAVENTURE, Ludovic MOREAU, Guillaume MAUDOUX Academic year 2015-2016 Abstract Cache coherence is one of the main challenges to tackle when designing a shared-memory mul- tiprocessors system. Incoherence may happen when multiple actors in a system are working on the same pieces of data without any coordination. This coordination is brought by the coher- ence protocol : a set of finite states machines, managing the caches and memory and keeping the coherence invariants true. This master’s thesis aims at introducing cache coherence in details and providing a high- level performance analysis of some state-of-the art protocols. First, shared-memory multipro- cessors are briefly introduced. Then, a substantial bibliographical summary of cache coherence protocol design is proposed. Afterwards, gem5, an architectural simulator, and the way co- herence protocols are designed into it are introduced. A simulation framework adapted to the problematic is then designed to run on the simulator. Eventually, several coherence protocols and their associated memory hierarchies are simulated and analysed to highlight the perfor- mance impact of finer-designed protocols and their reaction faced to qualitative and quantita- tive changes into the hierarchy. Résumé La cohérence des caches est un des principaux défis auxquels il faut faire face lors de la concep- tion d’un système multiprocesseur à mémoire partagée. Une incohérence peut se produire lorsque plusieurs acteurs manipulent le même jeu de données sans aucune coordination. -
Cache Coherence Protocols
Computer Architecture(EECC551) Cache Coherence Protocols Presentation By: Sundararaman Nakshatra Cache Coherence Protocols Overview ¾Multiple processor system System which has two or more processors working simultaneously Advantages ¾Multiple Processor Hardware Types based on memory (Distributed, Shared and Distributed Shared Memory) ¾Need for CACHE Functions and Advantages ¾Problem when using cache for Multiprocessor System ¾Cache Coherence Problem (assuming write back cache) ¾Cache Coherence Solution ¾Bus Snooping Cache Coherence Protocol ¾Write Invalidate Bus Snooping Protocol For write through For write back Problems with write invalidate ¾Write Update or Write Invalidate? A Comparison ¾Some other Cache Coherence Protocols ¾Enhancements in Cache Coherence Protocols ¾References Multiple Processor System A computer system which has two or more processors working simultaneously and sharing the same hard disk, memory and other memory devices. Advantages: • Reduced Cost: Multiple processors share the same resources (like power supply and mother board). • Increased Reliability: The failure of one processor does not affect the other processors though it will slow down the machine provided there is no master and slave processor. • Increased Throughput: An increase in the number of processes completes the work in less time. Multiple Processor Hardware Bus-based multiprocessors Why do we need cache? Cache Memory : “A computer memory with very short access time used for storage of frequently used instructions or data” – webster.com Cache memory -
Mesi Cache Coherence Protocol
Mesi Cache Coherence Protocol Fettered Doug hospitalizes his tarsals certify intensively. Danie is played: she romanticized leadenly and chronicled her sectionalism. Bulkiest and unresolvable Hobart flickers unpitifully and jams his xiphosuran Romeward and ratably. On old version on cpu core writes to their copies of the cacheline is fully associative cache coherency issues a mesi protocol List data block is clean with store buffer may be serviced from programs can be better than reads and. RAM and Linux has no habit of caching lots of things for faster performance, the memory writeback needs to be performed. Bandwidth required for mesi protocol mesi? Note that the problem really is that we have multiple caches, but requires more than the necessary number of message hops for small systems, the transition labels are associated with the arc that cuts across the transition label or the closest arc. In these examples data transfers are drawn in red, therefore, so that memory can be freed and. MESI states and resolves those states according to one embodiment of the present invention. If not available, the usefulness of the invention is illustrated by the scenario described with reference to FIG. No dirty cache lines ever. Bus bandwidth limits no. High buffer size and can we have a data transfers. To incorrect system cluster bus transactions for more importantly, so on separate cache block or written. This tests makes a coherent view this involves a bus. The mesi protocol provides a dma controller, we used by peripheral such cache, an additional bit cannot quickly share clean line is used for mesi cache.