Opportunities and Challenges for Better Than Worst-Case Design
Todd Austin, Valeria Bertacco, David Blaauw, and Trevor Mudge
Advanced Computer Architecture Lab
The University of Michigan
[email protected]

ABSTRACT

The progressive trend of fabrication technologies towards the nanometer regime has created a number of new physical design challenges for computer architects. Design complexity, uncertainty in environmental and fabrication conditions, and single-event upsets all conspire to compromise system correctness and reliability. Recently, researchers have begun to advocate a new design strategy, called Better Than Worst-Case design, that couples a complex core component with a simple reliable checker mechanism. By delegating the responsibility for correctness and reliability of the design to the checker, it becomes possible to build correct-certified designs that effectively address the challenges of deep submicron design.

In this paper, we present the concepts of Better Than Worst-Case design and highlight two exemplary designs: the DIVA checker and Razor logic. We show how this approach to system implementation relaxes design constraints on core components, which reduces the effects of physical design challenges and creates opportunities to optimize performance and power characteristics. We demonstrate the advantages of relaxed design constraints for the core components by applying typical-case optimization (TCO) techniques to an adder circuit. By analyzing the carry-propagation characteristics of real programs, it is possible to design an adder circuit that when incorporated into a Better Than Worst-Case design exhibits significantly reduced latency. Finally, we discuss the challenges and opportunities posed to CAD tools in the context of Better Than Worst-Case design. In particular, additional support is required for analyzing run-time characteristics of designs, and many opportunities are created to incorporate typical-case optimizations into synthesis, testing and verification.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Copyright 2005 ACM X-XXXXX-XX-X/XX/XX ...$5.00.

1. INTRODUCTION

The advent of nanometer feature sizes in silicon fabrication has triggered a number of new design challenges for computer architects. These challenges include design complexity, device uncertainty and soft-errors. It should be noted that these new challenges add to the many challenges that architects already face in order to scale systems' performance while meeting power and reliability budgets.

The first challenge of concern is design complexity. As silicon feature sizes decrease, architects have available increasingly large transistor budgets. According to Moore's law, which has been tracked for decades by the semiconductor industry, architects can expect to have available twice the number of transistors every 18 months. In pursuit of increased system performance, they typically employ these transistors in components that contribute to increased instruction level parallelism and reduced operational latency. While many of these transistors are assigned to regular, easy-to-verify components, such as branch predictors and caches, many others find their way into complex devices that increase the burden of verification placed on the design team. For example, the Intel Pentium IV architecture (follow-on of the Pentium Pro) introduced a number of complex components, including a trace cache, instruction replay unit, vector arithmetic, and staggered ALUs [13]. These new devices, made affordable by generous transistor budgets, lead to even more challenging verification efforts. In a recent paper detailing the design and verification of the Pentium IV processor, it was observed that its verification required 250 person-years of effort, a full three-fold increase in human resources compared to the design of the earlier Pentium Pro processor [6].

The second challenge architects face is the design uncertainty that is created by increasing environmental and process variations. Environmental variation is caused by changes in temperature and supply voltage. Process variation results from device dimension and doping concentration variation that occurs during silicon fabrication. Process variation is of particular concern because its effects on devices are amplified as device dimensions shrink [2]. Architects are forced to deal with these variations by designing for worst-case device characteristics (usually, a 3-sigma variation from typical conditions), which leads to overly conservative designs. The effect of this conservative design approach is most evident by examining the extent to which hobbyists can overclock high-end microprocessors. For example, AMD's best-of-class Barton 3200+ microprocessor is specified to run at 2.2 GHz, yet it has been successfully overclocked up to 3.1 GHz [1]. This is accomplished by optimizing device cooling and voltage supply quality and by tuning system performance to the specific process conditions of their individual chip.

The third challenge of growing concern is soft errors that are caused by charged particles (such as alpha particles or neutrons) that strike the bulk silicon portion of a die. The striking particle creates extra charge that can migrate into the channel of a transistor, and temporarily turn it on or off. The end result is a logic glitch that can potentially corrupt logic computation or state bits. While a variety of studies have been performed to demonstrate the unlikeliness of such events [16], great concern remains in the architecture and circuit communities, fueled by the trends of reduced supply voltage and increased transistor budgets, both of which exacerbate a design's vulnerability to soft errors.

The combined effect of these three design challenges is that architects are forced to work harder and harder just to keep up with system performance, power and reliability design goals. The unsurmountable task of meeting these goals with limited resource budgets and increasing time-to-market pressures has raised these design challenges to crisis proportion. In this paper, we highlight a novel design strategy to address these challenges, called Better Than Worst-Case design, that embraces a design style which separates the concerns of correctness and robustness from the ones of performance and power. The approach decouples designs into two primary components: a core design component and a simple checker. The core design component is responsible for performance and power efficient computing, and the checker is responsible for verifying that the core computation is correct. By concentrating the concerns of correctness into the simple checker component, the majority of the design is freed from these overarching concerns. With relaxed correctness constraints in the core component, architects can more effectively address the three highlighted design challenges. We have demonstrated in prior work (highlighted herein) that it is possible to decompose a variety of important processing problems into effective core/checker pairs. The designs we have constructed are faster, cooler and more reliable than traditional worst-case designs.

The remainder of this paper is organized as follows. Section 2 overviews the Better Than Worst-Case design approach and presents two effective design solutions: DIVA checker and Razor logic. Better Than Worst-Case designs have the unique property that their performance is related to the typical-case operation of the core component. This is in direct contrast to worst-case designs, where system performance is bound by the worst-case performance of any component in the system. In Section 3, we demonstrate how typical-case optimization (TCO) can improve the performance of a Better Than Worst-Case design. We show

Than Worst-Case design¹ underlines the improvement that this approach represents over traditional worst-case design techniques.

[Figure 1: Better Than Worst-Case Design Concept. Diagram labels: Input; Performance/Power Optimized Core Component; Checker (Detects and Corrects Operational Faults); Well-defined Operations; Verified Output.]

Traditional worst-case design techniques construct complete systems which must satisfy guarantees of correctness and robust operation. The previously highlighted design challenges conspire to make this an increasingly untenable design technique. Better Than Worst-Case designs take a markedly different approach, as illustrated in Figure 1. In a Better Than Worst-Case design, the core component of the design is coupled with a checker mechanism that validates the semantics of the core operations. The advantage of such designs is that all efforts with respect to correctness and robustness are concentrated on the checker component. The performance and power efficiency concerns of the design are relegated to the core component, and they are addressed independently of any correctness concerns. By removing the correctness concerns from the core component, its design constraints are significantly relaxed, making this approach much more amenable to address physical design challenges.

To find success with a Better Than Worst-Case design style, the checker component must meet three design requirements: i) it must be simple to implement lest the checker increase overall design complexity, ii) it must be capable of validating all core computation at its maximum processing rate lest the checker slow system operation, and iii) it must be correctly implemented lest it introduce processing errors into the system. In the following subsections, we present two Better Than Worst-Case designs that demonstrate how simple checker components can meet these requirements. The DIVA checker is an instruction checker that validates the operations of a complex microarchitecture design.
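
As a rough software illustration of the core/checker split described above, the following Python sketch pairs a typical-case adder with a simple checker. It is not the DIVA or Razor design and not the adder circuit studied in this paper: the names speculative_add and checked_add, the 8-bit carry window, and the 32-bit width are all assumptions chosen for illustration. The core computes each sum bit from a short window of lower-order bits, which is correct whenever no carry chain is longer than the window; the checker compares against an exact sum and corrects the rare mismatch.

```python
def speculative_add(a: int, b: int, window: int = 8, width: int = 32) -> int:
    """Hypothetical 'core': computes each sum bit from only the `window`
    bit positions below it, assuming the carry into that window is zero.
    The result is wrong only when a carry chain spans the whole window."""
    result = 0
    for i in range(width):
        lo = max(0, i - window)                # lowest bit this output bit looks at
        seg_mask = (1 << (i - lo + 1)) - 1     # selects bits lo..i
        seg_sum = ((a >> lo) & seg_mask) + ((b >> lo) & seg_mask)
        result |= ((seg_sum >> (i - lo)) & 1) << i
    return result


def checked_add(a: int, b: int, window: int = 8, width: int = 32):
    """Hypothetical 'checker': validates the core's answer against an exact
    sum (a stand-in for a simple, reliable reference) and corrects it.
    Returns (verified_sum, was_corrected)."""
    guess = speculative_add(a, b, window, width)
    exact = (a + b) & ((1 << width) - 1)
    if guess == exact:
        return guess, False                    # typical case: core was right
    return exact, True                         # rare case: checker recovers


if __name__ == "__main__":
    print(checked_add(1, 2))        # short carry chain, core result accepted
    print(checked_add(0xFFFF, 1))   # 16-bit carry chain, checker corrects
```

In this sketch the verified output is always exact, so correctness rests on the checker alone, while the fraction of additions that need correction determines how much of the core's typical-case speed is actually realized.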