Cooling a Microprocessor Chip
INVITED PAPER

Heat spreading, choice of thermal interface materials, and proper heat-sink design can enhance cooling of microprocessor packages and systems.

By Ravi Mahajan, Senior Member IEEE, Chia-pin Chiu, Member IEEE, and Greg Chrysler

ABSTRACT | Increasing microprocessor performance has historically been accompanied by increasing power and increasing on-chip power density, both of which present a cooling challenge. In this paper, the historical evolution of power is traced and the impact of power and power density on thermal solution designs is summarized. Industrial and academic researchers have correspondingly increased their focus on elucidating the problem and developing innovative solutions in devices, circuits, architectures, packaging and system level heatsinking. Examples of some of the current packaging and system thermal solutions are provided to illustrate the strategies used in their design. This is followed by a brief discussion of some of the future trends in demand and solution strategies that are being developed by academic and industrial researchers to meet these demands. Potential opportunities and limitations with these strategies are reviewed.

KEYWORDS | Density factor; heatsinks; heat spreaders; microprocessors; packaging; thermal interface materials; thermal management

Manuscript received February 8, 2005; revised February 3, 2006. The authors are with the Assembly Technology Development, Intel Corporation, Chandler, AZ 85226 USA (e-mail: [email protected]).
Digital Object Identifier: 10.1109/JPROC.2006.879800
Proceedings of the IEEE | Vol. 94, No. 8, August 2006. 0018-9219/$20.00 © 2006 IEEE

I. INTRODUCTION

The past few decades have seen a revolutionary increase in computing performance and computers have become increasingly pervasive in all aspects of modern life. Evolution of the microprocessor is one of the most visible and representative facets of the computing revolution. Following Moore’s law [1], the semiconductor industry has successfully doubled transistor density every two years and the microprocessor has been the flagship product, successfully exploiting the increased performance with each new technology generation along Moore’s law. One of the historical consequences of increasing microprocessor performance is an associated increase in power dissipation. This is not a new issue and was highlighted as early as 1965 [1]. This paper will elaborate on the issues and solutions associated with thermal management of the microprocessor. It will highlight the increased understanding of the thermal management challenges and show some of the innovations developed to meet these challenges.

The microprocessor typically requires thermal management in three distinct environments:
1) Cooling is required during functional test to prevent transient temperature rises from causing false performance readouts or failures.
2) During burn-in, where infant mortality fails are identified as a function of the temperature, requiring accurate temperature control.
3) During functioning in a user environment. Managing the thermal environment is essential to ensure reliable, long-term performance.

Item 1) requires transient thermal management, while items 2) and 3), to the first order, require steady-state thermal management. The focus in this paper is on 3), i.e., cooling the microprocessor in a user environment. The paper begins with a description of the thermal problem and describes the impact of cooling hot spots on the die. This is followed by a description of the thermal design strategies and the constraints under which the thermal problem needs to be solved. The paper concludes with some brief comments on the pros and cons of some of the technologies being developed by academic and industrial researchers to meet future challenges.

II. THERMAL MANAGEMENT OF MICROPROCESSORS

A. Cooling Demand

Fig. 1 shows the evolution of Thermal Design Power (TDP) as a function of frequency, which is one measure of performance. TDP is of primary interest to the thermal solution designer and it represents the maximum sustained power dissipated by the microprocessor, across a set of realistic applications. It is possible for there to be brief bursts of activity where power dissipated by the microprocessor is larger than TDP, but if the bursts are within the thermal time constant and do not violate the thermal specifications of the microprocessor, they are not of interest to the thermal designer [2]. Modern microprocessors also include advanced thermal monitors and fail-safe mechanisms that prevent catastrophic failures by automatically shutting down the microprocessor if the temperature exceeds a predefined limit [3]. Thus, designing for TDP is adequate to ensure reliable long-term performance.

Fig. 1. Historical trend chart of microprocessor TDP versus frequency. Data is based on Intel microprocessors and different symbols in the graph represent different classes of microprocessors. The TDP trend for each class is expected to level off with the transition to multicore processors.

As seen from Fig. 1, TDP has in the past increased steadily with increasing microprocessor performance, requiring increased focus on thermal management. The transition to multicore microprocessors should alleviate the growth in TDP with increased performance, and the expectation is that there will be a power cap for each product segment. However, in addition to the TDP, thermal design engineers need to account for thermal nonuniformity (typically referred to as hot spots, where power densities of 300+ W/cm² are possible) caused by nonuniform distribution of power on die (Fig. 2). The thermal impact of nonuniform power distribution is schematically illustrated in Fig. 3.

Fig. 2. Illustrative IR images of two typical microprocessors showing on-die hot spots due to nonuniform power distributions on die. Die sizes are not to scale.

The thermal management problem is one of transporting the TDP from the die surface, where the hot spot temperature is maintained at or below a certain temperature specification (typically referred to as the junction temperature $T_j$) to ensure reliable performance, to the ambient air at temperature $T_a$. The transfer of the TDP is by conduction through the many solid layers and interfaces of the package into the heatsink, conduction through the base and into the fins of the heatsink, and finally convection from the fins of the heatsink to the cooling air stream. Using a simple thermal resistance model, the cooling demand may be represented as

$$\text{Required thermal resistance} = \frac{T_j - T_a}{\mathrm{TDP}}. \tag{1}$$

In general terms, the temperature difference $(T_j - T_a)$ is expected to slowly reduce over time, since $T_j$ can be forced lower by reliability and performance expectations, and $T_a$ can be forced higher due to heating of the inside box air caused by increased integration and shrinking box sizes. The thermal problem can thus become increasingly difficult either due to increases in TDP, reductions in $(T_j - T_a)$ or a combination of both. The thermal solution designer is faced with the challenge of developing a thermal solution that has a thermal resistance at or below the required thermal resistance.

B. Cooling Solution Design and Development Strategies

The general strategy for thermal management focuses on:
1) minimizing impact of local hot spots by improving heat spreading;
2) increasing the power-dissipation capability of the thermal solutions;
3) expanding the thermal envelopes of systems;
4) developing thermal solutions that meet cost constraints imposed by business considerations;
5) developing solutions that fit within form factor considerations of the chassis.

Thermal management is thus a technical and economic challenge. Cost is one of the most important considerations in the selection of a thermal solution and is often the reason why a new technology may find barriers to introduction, especially if it cannot displace an existing technology on a cost-performance basis. Fit considerations are also especially important when increasingly higher power components need to be accommodated in the same chassis to prolong use of the same form factor.

Fig. 3. Schematic illustrating typical die power map and the hot spots on the corresponding die temperature map. The red region represents the highest temperature spot.

Fig. 4. Schematic of the two basic thermal architectures, illustrating their primary heat transfer path. (a) Architecture I. (b) Architecture II.

Most of today’s high-performance microprocessors use an area array, flip chip interconnect scheme to connect the active (circuit) side of the die to an organic or ceramic package substrate (see Fig. 4 for a schematic illustration). The package substrate is either soldered to the computer motherboard through a grid array of solder joints, or has pins that are inserted into a socket that is soldered to the motherboard (an alternative is the land grid array socket, where socket fingers contact pads on the surface of the package). In all cases, when dealing with high cooling demand, and in attempting to establish cooling envelopes, a reasonable first order assumption is that the bulk of the heat will have to be removed from the inactive side that is farther away from the motherboard. Given the
heat while transporting it from the die to the heatsink.