Interconnect-Aware Coherence Protocols for Chip Multiprocessors

Liqun Cheng, Naveen Muralimanohar, Karthik Ramani, Rajeev Balasubramonian, John B. Carter
School of Computing, University of Utah
legion, naveen, karthikr, rajeev, [email protected] ∗

Abstract

Improvements in semiconductor technology have made it possible to include multiple processor cores on a single die. Chip Multi-Processors (CMP) are an attractive choice for future billion-transistor architectures due to their low design complexity, high clock frequency, and high throughput. In a typical CMP architecture, the L2 cache is shared by multiple cores and data coherence is maintained among private L1s. Coherence operations entail frequent communication over global on-chip wires. In future technologies, communication between different L1s will have a significant impact on overall processor performance and power consumption. On-chip wires can be designed to have different latency, bandwidth, and energy properties.

However, with communication emerging as a larger power and performance constraint than computation, it may become necessary to understand and leverage the properties of the interconnect at a higher level. Exposing wire properties to architects enables them to find creative ways to exploit these properties. This paper presents a number of techniques by which coherence traffic within a CMP can be mapped intelligently to different wire implementations with minor increases in complexity. Such an approach can not only improve performance, but also reduce power dissipation.

In a typical CMP, the L2 cache and lower levels of the memory hierarchy are shared by multiple cores [24, 41]. Sharing the L2 cache allows high cache utilization and