
Attaining Single-Chip, High-Performance Computing Through 3D Systems with Active Cooling

This article explores the benefits and the challenges of 3D design and discusses novel techniques to integrate predictive cooling control with chip-level thermal-management methods such as job scheduling and voltage-frequency scaling. Using 3D liquid-cooled systems with intelligent runtime management provides an energy-efficient solution for designing single-chip many-core architectures.

BIG CHIPS

Ayse K. Coskun and Jie Meng, Boston University
David Atienza and Mohamed M. Sabry, École Polytechnique Fédérale de Lausanne

IEEE Micro, July/August 2011. 0272-1732/11/$26.00 © 2011 IEEE. Published by the IEEE Computer Society.

Performance demands are increasing in data centers and high-performance computing clusters, which today run various applications from document and media processing to scientific computing and complex modeling. In tandem, the industry has moved to building many-core systems, where a single chip has dozens of cores. Intel's Single-Chip Cloud Computer1 and Tilera's 64-core processors are recent examples of such systems. Although many-core systems could provide immense computational capacity, achieving high performance in such systems is highly challenging. Communication latency and memory bandwidth limit many-core performance. Many-core systems have other challenges as well. Because of the larger die sizes, the manufacturing yield is lower, high reliability is more difficult to achieve, process variations are more severe, and the production cost is higher compared to smaller chips. These challenges accelerate with smaller technology nodes.

An emerging design technique called 3D stacking addresses these challenges. First, because a 3D stacked system has a smaller footprint (that is, per-chip area), the manufacturing yield is higher, reducing the overall design cost.2,3 The on-chip interconnects are shorter in 3D systems, reducing the wire delay and capacitance. Because the through-silicon vias (TSVs) connecting the layers do not adhere to the chip's pin-out restrictions, we can build interconnect architectures with higher bandwidth between cores and memory blocks. However, 3D design accelerates the thermal challenges because of the higher thermal resistivity resulting from vertical stacking. In fact, high temperatures and cooling challenges are among the major gating factors in building high-performance 3D systems.4

Prior research has addressed the thermal challenges in 3D systems through design techniques such as temperature-aware floorplanning5 and dynamic management techniques such as temperature-aware job scheduling and dynamic voltage and frequency scaling (DVFS).6 Although these approaches provide substantial benefits in reducing temperatures below critical levels, they are not always sufficient or energy efficient for managing 3D many-core systems. Temperature can increase dramatically in 3D systems (for example, above 100°C), making well-known thermal-management techniques inadequate for controlling temperature without considerably hurting performance.

Active cooling—in which liquid (for example, water) flowing through built-in microchannels (or cold plates) cools the chip—has emerged as a viable cooling alternative for high-performance 3D systems in the past decade.7 Recently, a prototype 3D system with built-in microchannels was manufactured8 (see Figures 1 and 2). Liquid cooling removes heat more efficiently than conventional heat sinks and fans, and therefore can address the pressing thermal challenges in 3D systems. Liquid-cooled 3D systems, however, bring novel challenges in cooling control and in efficient integration with chip-level thermal-management techniques.

Figure 1. Liquid-cooled 3D chip prototype, built by IBM Zürich and École Polytechnique Fédérale de Lausanne (EPFL).8 Liquid cooling can more effectively remove heat compared to conventional heat sinks and fans.

Many-core 3D design with active cooling is highly complex, with a number of constraints such as cost, peak power, energy, reliability, and yield. Solving these challenges through design-time optimization adds to the area overhead, increases time-to-market and cost, and often results in suboptimal operation owing to dynamic variations in workload. We propose integrating active-cooling control with thermally aware job scheduling and DVFS to optimize the energy efficiency of high-performance 3D systems while maintaining low and stable temperature profiles. Temperature and energy benefits as compared to conventional cooling systems are remarkable: we reduce the peak temperature to 50°C from critically high levels, while achieving over 2× cooling-energy savings on a 64-core 3D system.

Figure 2. Side view of the 3D liquid-cooled system prototype showing the built-in microchannels. The liquid arrives at the chip through a pipe, gets distributed among the microchannels, flows through the chip, and is collected into another pipe at the outlet.

A manufacturing perspective on 3D design

Recent chip sizes for many-core systems reach 500 to 600 mm². An important advantage of 3D stacking comes from silicon economics: individual chip yield, which is inversely correlated with area, increases when a larger number of chips with smaller areas are manufactured.2 We can estimate the cost of manufacturing a 3D system, C, using the wafer cost C_wafer; the wafer utilization U_3D (that is, how many 3D systems can be cut from the wafer, considering the chip, TSV, and scribe area); and the yield Y_system (see Equation 1).3

    C = C_wafer / (U_3D × Y_system)    (1)

We calculate a 3D system's yield by extending the negative binomial-distribution model.

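As a sanity check on this silicon-economics argument, the cost model of Equation 1 can be sketched in a few lines of Python. The yield expression, the nominal TSV area overhead, and the dies-per-wafer estimate below are simplifying assumptions for illustration (no wafer-edge loss), so the output approximates the trends of the article's analysis rather than its exact numbers.

```python
# Sketch of Equation 1, C = C_wafer / (U_3D * Y_system), with an
# illustrative negative-binomial-style yield model. A_TSV and the
# dies-per-wafer estimate are assumptions, not the article's values.
import math

C_WAFER = 3000.0                       # estimated 300-mm wafer cost ($)
WAFER_AREA = math.pi * 150.0 ** 2      # mm^2, ignoring edge loss
A_TOTAL = 321.0                        # total chip area split into n tiers (mm^2)
D = 0.001                              # defect density per mm^2
ALPHA = 4.0                            # defect clustering ratio
A_TSV = 1.0                            # assumed TSV area overhead per die (mm^2)
P_STACK = 0.99                         # probability of a successful stacking step

def die_yield(area_mm2):
    """Negative-binomial die yield: (1 + D*A/alpha)^-alpha."""
    return (1.0 + D * area_mm2 / ALPHA) ** (-ALPHA)

def system_cost(n):
    """Cost per n-tier system: wafer cost / (systems per wafer * yield)."""
    die_area = A_TOTAL / n + (A_TSV if n > 1 else 0.0)
    u3d = (WAFER_AREA / die_area) / n          # systems cut from one wafer
    y_system = die_yield(die_area) * P_STACK ** (n - 1)
    return C_WAFER / (u3d * y_system), y_system

for n in (1, 2, 4):
    cost, y = system_cost(n)
    print(f"{n}-tier: yield {y:.2f}, cost per system ${cost:.2f}")
```

With these assumptions, the four-tier system shows a higher per-die yield and a lower cost per system than the 2D baseline, mirroring the direction (though not the exact magnitude) of the cost savings reported for the 64-core chip.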
Equation 2 shows the yield computation for 3D systems with known-good-die (KGD) bonding, where the dies are tested prior to bonding. In the equation, D is the defect density, with a typical range between 0.001 per mm² and 0.005 per mm² (D = 0.001/mm² in this work); A is the total chip area to be split into n layers; a is the defect clustering ratio (we set a to 4); A_TSV is the area overhead of the TSVs; P_stack is the probability of having a successful stacking operation for KGDs; and n is the number of stacked layers. In wafer-to-wafer bonding, the yield loss is higher because individual dies are not tested prior to bonding.

    Y_system = (1 + (D × (A/n + A_TSV)) / a)^(−a) × P_stack^(n−1)    (2)

Figure 3 shows the manufacturing yield and cost per 3D system for building a 64-core chip using 45-nm technology. We assume a standard 300-mm wafer with an estimated cost of $3,000, and we compare 3D designs with two and four layers to a single-layered many-core system with a total area of 321 mm². Table 1 provides the system properties, which are based on the sizes and architectures of the cores and caches in the Intel 48-core Single-Chip Cloud Computer (SCC).1 This example illustrates the benefits of 3D stacking for designing big chips with respect to cost and yield: building the same many-core chip as a four-tier 3D system instead of a 2D system decreases the per-system manufacturing cost by 24 percent. This analysis assumes a mature, reliable bonding process with a probability-of-success value (P_stack) of 0.99.3,9 If the stacking success is lower, system yield drops with the number of layers, increasing the cost of 3D design (see Equation 2). Therefore, a key aspect for making 3D design cost-effective is the deployment of a mature, reliable bonding process, which is being pursued by manufacturers in both high-performance computing and embedded markets.

Figure 3. Comparing the cost and yield for building a 64-core chip using 2D and 3D design, assuming a mature stacking process (see Table 1 for chip properties). Building the 64-core chip as a four-tier 3D system decreases the manufacturing cost by 24 percent compared to the 2D chip. This decrease is mainly due to the increase in yield. For the four-tier system, wafer utilization increases slightly as well (by 9 percent), because smaller chips use the wafer area more efficiently.

Table 1. Properties of the 64-core big chip.
  Process technology: 45 nm
  No. of cores: 64
  No. of private Level 2 (L2) caches: 64
  L2 cache size and area: 256 Kbytes, 0.74 mm²
  Core architecture: similar to Pentium-class cores in the Intel Single-Chip Cloud Computer (SCC)1
  Core area: 4.3 mm²
  Through-silicon via (TSV) dimensions: 50 µm × 50 µm, 100-µm pitch
  On-chip bus width: 128 bits

Building liquid-cooled systems requires an additional etching phase for the microchannels. This process is similar to the existing process for the TSV design, which includes etching and filling the etched space with copper. Based on our discussions with industry, the additional etching phase for the microchannels incurs negligible additional cost with respect to a 3D system without liquid cooling. The bonding phase for liquid-cooled 3D systems introduces additional complexity, because the microchannels remove part of the surface touching the chip, and the gluing of the layers must be performed more carefully. This complexity translates to around 20 percent additional manufacturing costs compared to 3D systems with TSVs only (without microchannels). Recently, researchers have built and tested a practical implementation of silicon-based microchannel coolers for high-power chips.10

Not only does 3D design improve the manufacturing yield for big chips, but it also helps integrate a larger number of transistors within the same footprint without scaling the technology node. In addition, 3D design allows for putting tested KGDs together to build a large chip. Moving to process nodes beyond 45 nm will be prohibitively expensive for some manufacturers. Therefore, 3D design is both a promising method to pack more functionality on a single chip and a key enabler for the design of big chips.

Thermal challenges of 3D design

3D systems improve manufacturing yield, reduce design cost, and enable high-density integration. Stacking, however, makes it difficult to cool systems effectively through conventional heat sinks and fans, because the thermal resistivity is high for the layers away from the heat sink. According to International Technology Roadmap for Semiconductors (ITRS) predictions, junction-to-air thermal resistance must drop dramatically below 0.18°C/W to cool future high-performance 3D chips. In practice, such cooling efficiency is nearly impossible to achieve at acceptable packaging and cooling costs and dimensions. Even though conventional heat-sink-based cooling is inadequate for high-performance 3D systems, it is viable for lower-power 3D designs. Therefore, 3D systems developed for embedded-computing environments, where a liquid-cooling infrastructure is not feasible, rely on thermal-management methods to find reliable, energy-efficient operating points.

Thermally aware floorplanning plays a crucial role in controlling peak temperature in 3D systems. Fundamental layout guidelines for temperature-aware 3D design include avoiding placing power-hungry components close together and making effective use of low-power components (such as memory units) to help spread the heat. Thermal modeling is a key component of thermally aware design and runtime management. Today, well-known tools such as HotSpot include 3D-modeling features.11 In recent work, we extended the HotSpot simulator to model the thermal impact of TSVs.12 Because copper has higher thermal conductivity than silicon or the interlayer glue material, chip areas with a high TSV density observe a reduction in temperature. This temperature decrease, although usually limited to a few degrees, can be effective in reducing the peak temperature in hot zones.

Although temperature-aware design has considerable benefits in reducing peak and average temperatures, it is often insufficient for eliminating the thermal hot spots in 3D systems, especially when there are more than two layers in the stack. Thus, dynamic thermal-management policies, such as job scheduling and DVFS, are essential in 3D systems, especially in the absence of active cooling.

In Figure 4, we show the effects of dynamic thermal-management policies on 3D systems with conventional cooling infrastructures (that is, only heat sinks and fans, no liquid cooling). Table 2 summarizes the package and floorplan parameters used in the thermal simulations. Table 3 includes the liquid-cooling-related design assumptions (we discuss results with active cooling in the next section). We modeled a heat sink with a convection resistance of 0.1 K/W, representing the heat sinks in modern CPUs. In these experiments, we leveraged a 64-core processor manufactured at 45 nm as our target system. Each core has two-way-issue out-of-order execution, two integer units and one floating-point unit, a 16-Kbyte private Level 1 (L1) instruction cache, a 16-Kbyte private L1 data cache, and a 256-Kbyte private L2 cache. The core architecture is based on the cores used in the Intel SCC1 (see Table 1).

We run eight benchmarks from the PARSEC13 (blackscholes, bodytrack, canneal, and fluidanimate) and NAS14 (cg, dc, mg, and ua) parallel benchmark suites on the M5 performance simulator15 to collect performance traces for a range of workloads running on the 64-core system. We use McPAT to derive the power values corresponding to the performance traces.16

Figure 4. Comparing the peak and average temperatures for the 64-core 2D and two-tier 3D systems without active cooling for no thermal management (No DTM), temperature-aware load balancing (TALB), and TALB combined with dynamic voltage and frequency scaling (TALB + DVFS). TALB reduces the peak temperature below the critical value of 85°C. TALB + DVFS reduces the temperatures further. Peak and average temperatures exceed 100°C and 90°C, respectively, for the four-tier 3D system even with TALB + DVFS, making that design infeasible.
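To make the TALB idea concrete, here is a minimal greedy sketch: each incoming job is placed on the core whose thermally weighted utilization (utilization multiplied by a per-core thermal-stress factor) would remain lowest. The thermal factors and the job stream are illustrative assumptions, not values from our experiments.

```python
# Sketch of temperature-aware load balancing (TALB): jobs go to the
# core with the lowest *weighted* utilization, i.e., utilization scaled
# by a thermal factor reflecting how hard the core's location is to
# cool (e.g., tiers far from the heat sink). Values are illustrative.

def talb_assign(jobs, thermal_factor):
    """Greedy balancer: place each job's utilization on the core that
    would have the smallest weighted utilization after placement."""
    n = len(thermal_factor)
    util = [0.0] * n
    placement = []
    for load in jobs:
        # weighted utilization = actual utilization * thermal factor
        core = min(range(n), key=lambda c: (util[c] + load) * thermal_factor[c])
        util[core] += load
        placement.append(core)
    return util, placement

# Four cores: two on an easy-to-cool tier (factor 1.0), two on a
# harder-to-cool tier (assumed factor 1.4).
factors = [1.0, 1.0, 1.4, 1.4]
util, placement = talb_assign([0.2] * 10, factors)
print(util, placement)
```

Cores with the larger thermal factor (the harder-to-cool tier) end up with less load, which is exactly the bias TALB introduces.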

In McPAT, we set the supply voltage to 1.14 V and the frequency to 1 GHz.1 We calibrate the power results from McPAT to match the reported power values for the cores in the Intel SCC. We assume the cores have the following additional voltage-frequency pairs available: (0.75 V, 366 MHz), (0.9 V, 590 MHz), and (1.05 V, 866 MHz). The average core power at the highest speed (dynamic and leakage) is 1.8 W, and the leakage component is 0.7 W at 75°C. Each L2 cache consumes 0.3 W dynamic power and 0.17 W leakage power at 75°C. We compute the temperature dependence of leakage power using a common exponential formulation.17

Table 2. Floorplan and package parameters used in the thermal simulations.
  Silicon conductivity: 130 W/(m·K)
  Silicon capacitance: 1,635,660 J/(m³·K)
  Wiring-layer conductivity: 2.25 W/(m·K)
  Wiring-layer capacitance: 2,174,502 J/(m³·K)
  Heat sink conductivity: 10 W/K
  Heat sink capacitance: 140 J/K
  Total area of each tier in the two-tier, 64-core 3D stack: 176 mm²
  Total area of each tier in the four-tier, 64-core 3D stack: 88 mm²

In the 2D floorplan, each core is next to its private L2 cache on an 8 × 8 grid. In the two-tier and four-tier 3D systems, each layer has 32 and 16 core-cache pairs, respectively. In this 64-core system, the core and cache areas differ greatly, so we do not place the cores and caches onto separate layers, to avoid wasting substantial area. When core and cache sizes are similar, however, it is beneficial to separate the core and memory layers to exploit the heat dissipation from hotter cores to cooler caches. In fact, we expect temperature increases to be within manageable ranges for low-power cache-layer stacking options.

The temperature-aware load balancing (TALB) in Figure 4 balances the weighted utilization among the cores,12 which is the actual utilization of the cores multiplied by a thermal factor representing the average thermal stress on the core. In this way, cores at locations more prone to hot spots (such as cores on the layers away from a heat sink) receive a lower workload compared to cores at easier-to-cool locations on the chip. The DVFS policy we implement adjusts the voltage-frequency level according to the estimated utilization of the cores. TALB has a negligible performance cost and can reduce the temperature below 85°C for all workloads. Combining TALB with DVFS reduces the temperatures further. Such dynamic-management techniques are essential for providing low-cost thermal management of air-cooled 3D systems, and they also offer opportunities to improve cooling efficiency in active-cooling environments.

Table 3. Liquid-cooling design assumptions used in the thermal simulations.
  Water conductivity: 0.6 W/(m·K)
  Water capacitance: 4,183 J/(kg·K)
  Channel width: 50 µm
  Channel thickness: 100 µm
  Channel pitch: 150 µm
  Channel length: 11 mm
  Number of channels per cavity in the two-tier, 64-core 3D stack: 106
  Number of channels per cavity in the four-tier, 64-core 3D stack: 53

High-performance 3D systems with active cooling

Researchers have studied the use of convection in microchannels to cool high-power-density chips since the initial work by Tuckerman and Pease.18 Their liquid-cooling system can remove 1,000 W/cm²; however, the volumetric flow rate and pressure drop are large, making the approach unsuitable for chip-level implementation. Recent work shows that backside liquid cold plates, such as staggered-microchannel and distributed-return-jet plates, can handle up to 400 W/cm² in single-chip applications.19 Using pin-fin structures at a chip size of 1 cm², the heat-removal performance is more than 200 W/cm² for TSV pitches larger than 50 µm.7 Although the cooling capacity of active cooling is remarkable, several key challenges must be addressed to enable reliable and efficient operation.

One such challenge is dynamic flow-rate control. The advanced cooling capacity comes with a pumping-energy cost to push the fluid into the microchannels. For example, for a cluster with 60 computing stacks, as proposed in the Aquasar data center design, the cooling infrastructure consumes up to 70 W (similar to the power consumption of a many-core chip). Workload changes significantly over time in real-life systems, and there are many idle cycles—especially for data centers, which must operate at medium utilization levels to handle unexpected rises in service requests. Thus, adjusting the liquid flow dynamically to meet the cooling demand saves cooling energy compared to setting a sufficiently high flow rate for handling the worst-case temperatures. Adjusting the flow rate of pumps, however, typically has an overhead in the range of several hundred milliseconds. This overhead requires implementing a predictive control strategy to adjust the cooling before thermal emergencies occur. We also must minimize flow-rate changes to ensure stable thermal profiles and reliable pump operation.

Another challenge is integrating flow-rate control with chip-level thermal-management techniques. Chip-level techniques such as DVFS and job scheduling successfully reduce and balance the temperature,6 improving cooling efficiency. Combining various control knobs in a single low-overhead controller, however, is challenging because the control parameters differ in their time constants, performance and energy overheads, and benefits. For example, changing the DVFS setting typically takes tens to hundreds of microseconds, whereas flow-rate changes take hundreds of milliseconds. The overhead for workload scheduling is typically low. However, when the system is highly utilized, job scheduling is not sufficient to control the temperature, requiring more aggressive techniques such as DVFS or flow-rate control. Simultaneously using distinct control knobs at runtime requires a controller capable of making intelligent decisions quickly.

To address these challenges, we propose a dynamic thermal-management strategy to enable high-performance, energy-efficient, and reliable operation of liquid-cooled 3D systems. The dynamic approach uses a

design-time thermal analysis of the liquid-cooled 3D system with respect to flow rate, DVFS, and task-scheduling decisions, exploring the thermal impact of tuning each thermal-control knob on the temperature profile. Then, the integrated controller monitors input variables (such as temperature and core utilization), allowing a degree of uncertainty, and tunes the control knobs through a set of predefined rules.

Figure 5 shows the schematic diagram of the combined design-time and runtime thermal-management approach. This approach starts with identifying the input parameters to the controller, the control knobs, and each knob's reaction time. Using this initial analysis, we derive a set of management rules for each knob and combine the rules in a superposition phase to create a global lookup table of control decisions. The thermal controller uses this set of rules at runtime to dynamically set each core's DVFS setting and the chip's coolant flow rate. The objective is to minimize the energy consumption of the cooling infrastructure and the performance degradation while keeping the 3D stack's temperature within safe bounds.

Figure 5. Our proposed design-time and runtime thermal-management approach using a fuzzy-logic controller for DVFS and flow-rate control. The design-time phase covers thermal-control parameter identification, thermal-response analysis, derivation of control policies, and a superposition-based lookup table; the runtime phase maps measurements and real-time requirements to control decisions through the runtime control actuators. Our goal is to minimize energy consumption while maintaining the desired performance and keeping temperature within safe bounds.

To enable the 3D system's design-time thermal analysis, we use an automated thermal-modeling tool, 3D-ICE,20 which can model the microchannels and stacked layers. The modeling principles are similar to those in compact automated thermal models such as HotSpot,11 except that the interlayer material in our model is heterogeneous, to address the differences in the thermal-resistivity values of the microchannels carrying the liquid and the glue material. Additionally, depending on the flow rate provided to push water into the microchannels, the junction temperature of the cells adjacent to the channels changes according to the heat absorption along the channel. We compute the total temperature rise on the junction, ΔT_j, as

    ΔT_j = ΔT_conduction + ΔT_convection + ΔT_sensible heat

The thermal gradient due to heat conduction through the back-end-of-line (BEOL) layer is computed using the BEOL resistivity. The convection rate depends on the heat-transfer coefficients of the materials and is independent of the flow rate when the system reaches boundary conditions. The temperature change resulting from the absorption of sensible heat along the microchannel depends on the volumetric flow rate and is computed iteratively along the channel. In fact, the heat absorption along the channels typically causes a noticeable thermal gradient across the die, as shown in Figure 6. Our prior work provides a detailed treatment of the thermal model for liquid-cooled 3D systems.12,20

We leverage a multiple-input, multiple-output fuzzy controller to integrate flow-rate control with DVFS for enabling energy-efficient temperature management on liquid-cooled 3D systems. The controller is


based on the Takagi-Sugeno fuzzy model,21 and its inputs include the distance from the inlet port, the predicted maximum chip temperature, and the utilization of each core. Thermal prediction is performed through regression-based methods.12 The outputs are the flow rate of the liquid going into the stack and the DVFS settings for each core. The controller makes decisions using the rule base formed during the offline-analysis stage. For example, if the distance from the inlet port is long (that is, cores are in the hotter zone), core utilization is high, and temperature is high, we apply a high flow rate and a medium DVFS level. If a core is close to the inlet port, we can maintain the temperature at a low level using a high DVFS level (incurring no performance overhead) and a low flow-rate setting (please refer to Sabry et al. for the complete set of rules21). The global controller can explore energy-efficient solutions and trade-offs that would not be possible using independent control mechanisms.

Figure 6. Temperature change across the microchannels on the five-tier liquid-cooled prototype developed by IBM Zürich and EPFL.7 The temperature change across the channel reaches 15°C for this system. 3D-ICE explicitly models the gradients across the channels.

Figure 7 compares the thermal profiles in the liquid-cooled 64-core chip with dynamic control (Fuzzy + TALB) against liquid-cooled systems that have a static flow-rate setting to handle worst-case temperatures (No DTM and TALB). Table 3 provides the characteristics of the microchannels we used in these experiments. In addition, because each 3D system's footprint determines the total number of microchannels, we include the number of available microchannels per cavity in the table (a cavity is the cooling layer, including a set of microchannels, between two adjacent active layers). The number of microchannels per cavity is lower in the four-tier system, whereas the number of cavities is larger compared to the two-tier system.

We use a maximum flow-rate setting of 46.9 ml/minute and 23.5 ml/minute per cavity for the two-tier and four-tier 3D systems, respectively. We selected these flow rates on the basis of the pumps suitable for liquid-cooled 3D systems (such as 12-V miniature gear pumps), with a pressure drop of 1 bar. The per-cavity flow rate is lower in the four-tier system because we keep the overall flow rate coming into the stack the same in both cases. We use straight microchannels, each with a cross section of 50 µm × 100 µm and a 150-µm pitch to account for TSV placement in the channel walls.

Notice from Figure 7 that all the experiments with liquid cooling reduce the peak and average temperatures to below 50°C. The fuzzy controller combines dynamic flow-rate adjustment with DVFS and saves cooling energy by letting the temperature rise within safe margins. We could reduce the temperature to a similar level as Fuzzy + TALB by aggressively applying DVFS or other

thermal-management techniques, at the cost of severe performance degradation. Fuzzy + TALB applies liquid-cooling management with performance-aware scheduling and DVFS, limiting the performance cost to less than 1 percent of the original (No DTM) case.

Figure 7. Comparing maximum and average temperatures between liquid-cooled 3D systems and the 2D air-cooled baseline. Liquid cooling dramatically reduces temperatures across all the workloads. The No DTM and TALB policies on the liquid-cooled systems use a static flow rate to cool the chip even under worst-case temperatures. Our fuzzy controller combined with TALB trades off temperature reduction with cooling energy; we allow temperature to increase within safe margins by reducing the flow rate when the cooling demand is lower. The 0.5-bar setting reduces the liquid flow to relax temperature constraints further for larger energy savings.

Fuzzy + TALB prevents overcooling and reduces the associated cooling energy by adjusting the flow rate to match the system's cooling needs. For example, for the four-tier 3D liquid-cooled system, cooling energy is reduced by 61 percent compared to the cooling energy spent in No DTM and TALB. Because fuzzy control includes DVFS, the chip-level energy drops as well—for example, by 15 percent on average for the four-tier system. The overall energy savings (chip and cooling) on the two-tier and four-tier systems are 16 and 21 percent, respectively. To reduce cooling energy further, we also experiment with a lower flow rate (0.5-bar pressure drop with a flow rate of 16.45 ml/minute) to let the temperature rise without exceeding 80°C. In comparison to the 1-bar setting, the low flow-rate setting reduces cooling energy by 23 percent. However, we see a substantial increase in thermal gradients: while the original setting maintains gradients below 15°C, the 0.5-bar setting causes spatial gradients on the chip to grow significantly higher. Large spatial gradients decrease the cooling efficiency and are undesirable because they can lead to design complexity or timing problems on the die.

3D systems with active cooling, when intelligently managed by adaptive techniques aware of the physical properties and constraints, offer a cost-efficient solution for building high-performance many-core big chips. Many interesting research problems remain in this area. Our results indicate the need for developing novel temperature-aware floorplanning methods for 3D chips with liquid cooling, addressing the thermal impact of both TSVs and the thermal heterogeneity caused by the liquid flow. We have experimented with straight microchannels in our work, following the recent developments and prototypes in the industry. Two-phase cooling and different microchannel geometries bring several new areas to explore, along with new challenges in manufacturing and control. In addition, studying the reliability of 3D systems is crucial for developing future 3D stacks. For liquid-cooled systems, we expect interesting reliability challenges due to the presence of water. Furthermore, DRAM stacking is a promising feature for enabling higher performance. To thoroughly understand the performance and temperature impact of memory stacking, we need various levels of design-space exploration and potentially new modeling techniques. Finally, for


future data centers containing many liquid-cooled stacks, we need approaches to enable reuse of the wasted cooling energy in the hot water coming out of chip stacks. Recent research contributions in this direction include the IBM Aquasar liquid-cooled server rack (including 2D nodes), which recovers a significant part of the cooling energy by reusing the hot water for building heating.

Acknowledgments
This research has been partially funded by the Nano-Tera RTD project CMOSAIC (ref. 123618), financed by the Swiss Confederation and scientifically evaluated by SNSF, and the PRO3D European Union FP7-ICT-248776 project. We thank Thomas Brunschwiler and Bruno Michel from the Advanced Packaging Group of IBM Zürich, as well as Yusuf Leblebici from the Microelectronic Systems Laboratory of EPFL, for their contributions in experimentally validating the thermal model for 3D ICs with intertier liquid cooling.

References
1. J. Howard et al., "A 48-Core IA-32 Message-Passing Processor with DVFS in 45nm CMOS," Proc. IEEE Int'l Solid-State Circuits Conf., IEEE Press, 2010, pp. 108-109.
2. S. Reda, G. Smith, and L. Smith, "Maximizing the Functional Yield of Wafer-to-Wafer 3D Integration," IEEE Trans. Very Large Scale Integration Systems, vol. 17, no. 9, 2009, pp. 1357-1362.
3. A.K. Coskun, A.B. Kahng, and T. Rosing, "Temperature- and Cost-Aware Design of 3D Multiprocessor Architectures," Proc. 12th Euromicro Conf. Digital System Design, Architectures, Methods, and Tools, IEEE CS Press, 2009, pp. 183-190.
4. G.H. Loh and Y. Xie, "3D Stacked Microprocessor: Are We There Yet?" IEEE Micro, vol. 30, no. 3, 2010, pp. 60-64.
5. M. Healy et al., "Multiobjective Microarchitectural Floorplanning for 2D and 3D ICs," IEEE Trans. Computer-Aided Design of Integrated Circuits and Systems, vol. 26, no. 1, 2007, pp. 38-52.
6. C. Zhu et al., "Three-Dimensional Chip-Multiprocessor Runtime Thermal Management," IEEE Trans. Computer-Aided Design of Integrated Circuits and Systems, vol. 27, no. 8, 2008, pp. 1479-1492.
7. T. Brunschwiler et al., "Interlayer Cooling Potential in Vertically Integrated Packages," Microsystem Technologies, vol. 15, no. 1, 2008, pp. 57-74.
8. T. Brunschwiler et al., "Validation of the Porous-Medium Approach to Model Interlayer-Cooled 3D-Chip Stacks," Proc. IEEE Int'l Conf. 3D System Integration, IEEE CS Press, 2009, pp. 1-10.
9. X. Dong and Y. Xie, "System-Level Cost Analysis and Design Exploration for Three-Dimensional Integrated Circuits (3D ICs)," Proc. Asia and South Pacific Design Automation Conf., IEEE Press, 2009, pp. 234-241.
10. E.G. Colgan et al., "A Practical Implementation of Silicon Microchannel Coolers for High Power Chips," IEEE Trans. Components and Packaging Technologies, vol. 30, no. 2, 2007, pp. 218-225.
11. K. Skadron et al., "Temperature-Aware Microarchitecture," Proc. 30th Ann. Int'l Symp. Computer Architecture, ACM Press, 2003, pp. 2-13.
12. A.K. Coskun et al., "Energy-Efficient Variable-Flow Liquid Cooling in 3D Stacked Architectures," Proc. Design Automation and Test in Europe, IEEE Press, 2010, pp. 111-116.
13. C. Bienia, "Benchmarking Modern Multiprocessors," doctoral dissertation, Computer Science Dept., Princeton University, 2011.
14. D. Bailey et al., The NAS Parallel Benchmarks, tech. report RNR-94-007, NASA Ames Research Center, 1994.
15. N.L. Binkert et al., "The M5 Simulator: Modeling Networked Systems," IEEE Micro, July/Aug. 2006, pp. 52-60.
16. S. Li et al., "McPAT: An Integrated Power, Area, and Timing Modeling Framework for Multicore and Many-Core Architectures," Proc. 42nd Ann. IEEE/ACM Int'l Symp. Microarchitecture, ACM Press, 2009, pp. 469-480.
17. S. Heo, K. Barr, and K. Asanovic, "Reducing Power Density through Activity Migration," Proc. Int'l Symp. Low Power Electronics and Design, ACM Press, 2003, pp. 217-222.
18. D.B. Tuckerman and R.F.W. Pease, "High-Performance Heat Sinking for VLSI," IEEE Electron Device Letters, vol. 2, no. 5, 1981, pp. 126-129.
19. T. Brunschwiler et al., "Direct Liquid-Jet Impingement Cooling with Micron-Sized Nozzle Array and Distributed Return Architecture," Proc. 10th Intersociety Conf. Thermal and Thermomechanical Phenomena in Electronics Systems, IEEE Press, 2006, pp. 196-203.
20. A. Sridhar et al., "3D-ICE: Fast Compact Transient Thermal Modeling for 3D-ICs with Intertier Liquid Cooling," Proc. IEEE/ACM Int'l Conf. Computer-Aided Design, IEEE Press, 2010, pp. 463-470.
21. M. Sabry, A.K. Coskun, and D. Atienza, "Fuzzy Control for Enforcing Energy Efficiency in High-Performance 3D Systems," Proc. IEEE/ACM Int'l Conf. Computer-Aided Design, IEEE Press, 2010, pp. 642-648.

Ayse K. Coskun is an assistant professor in the Electrical and Computer Engineering Department at Boston University. Her research interests include energy-efficient computing, multicore architectures, 3D stacked architectures, embedded systems, and software. Coskun has a PhD in computer science and engineering from the University of California, San Diego. She's a member of IEEE and the ACM.

Jie Meng is a PhD student in electrical and computer engineering at Boston University. Her research interests include many-core and 3D stacked architectures, focusing on energy awareness and performance improvement for future systems. Meng has an MS in electrical engineering from McMaster University in Canada.

David Atienza is a professor of electrical engineering and the director of the Embedded Systems Laboratory at the École Polytechnique Fédérale de Lausanne, and an adjunct professor in the Computer Architecture Department at the Complutense University of Madrid (UCM). His research interests include system-level design methodologies for high-performance multiprocessor systems on chips (MPSoCs) and embedded systems, including 2D and 3D thermally aware design, wireless sensor networks, hardware and software reconfigurable systems, dynamic-memory optimizations, and network-on-chip design. Atienza has a PhD in computer science from the Inter-University Microelectronics Center, Belgium, and the UCM. He's an associate editor of IEEE Transactions on CAD, IEEE Embedded Systems Letters, and Integration: The VLSI Journal.

Mohamed M. Sabry is a PhD student in the Electrical Engineering Department and a member of the Embedded Systems Laboratory at the École Polytechnique Fédérale de Lausanne. His research interests include system design and resource management methodologies in embedded systems and MPSoCs, especially temperature and reliability management of 2D and 3D MPSoCs. Sabry has an MS in electrical engineering from Ain Shams University, Egypt.

Direct questions or comments about this article to Ayse K. Coskun, Electrical and Computer Engineering Department, Boston University, 8 Saint Mary's Street, Boston, MA 02215; [email protected].

