[3B2-9] mmi2012020006.3d 10/3/012 14:56 6

...... Guest Editors’ Introduction ...... HOT CHIPS 23

...... Hot Chips has become a bellwether presented at Hot Chips specifically targeted of computer architectural trends. We can the conflicting goals of keeping threads as trace innovation and maturity in the industry busy as possible, while using as little power by the rise and fall of these trends, on the as possible. basis of product categories that had multiple The articles in this special issue (and sub- Hot Chips sessions in some years, but sequent issues) exemplify these trends with whose numbers shrank in following years. innovative approaches to achieving high per- These numbers didn’t shrink because of in- formance under power constraints. dustry failure, but often because they were successful, and are so accepted that they no Throughput performance from multiple longer need to be presented. cores and threads We’veseenthisinHotChipsoverthe Oracle’s ‘‘Sparc T4: A Dynamically years with reduced-instruction-set comput- Threaded Server-on-a-Chip,’’ by Manish ing (RISC) versus complex-instruction-set Shah et al., shows stunning performance computing (CISC), new instruction set increases over previous generations of chips architectures, instruction set parallelism, by optimizing what is shared—and what graphics and video processors, very long in- isn’t—in a system optimized for throughput. struction word (VLIW) and digital signal Its eight cores process 64 threads, and processing (DSP) architectures, network pro- the chip is implemented in a 40-nm cessors, and the growing trend of entire sys- technology. Allen Baum tems, including servers, on a chip. Power management in servers Many cores, many threads, much power Intel’s ‘‘Power-Management Architecture Hot Chips 23 was clearly the year of of the Intel Code-Named Bevan Baas many-core and many-thread processors ,’’ by Efraim Rotem et al., (either sharing core hardware or as a separate shows the effort needed to actively and dy- University of California, many-core chip, or even as partially shared namically manage power of multiple cores cores); there wasn’t a product category that at extremely fine-grained time intervals. Davis didn’t have that as a defining feature. The ‘‘AMD Fusion APU: Llano’’ by products presented stood out not by their Alexander Branover et al. combines core sheer number of threads, but by the way power management with GPU power man- the cores and threads communicated and agement, dynamically sharing the power synchronized, and the architectural innova- constraints between them as application tions that enabled the threads to be effi- demands change. The 32-nm silicon on in- ciently used. sulator (SOI) die contains four cores, But the parallelism enabled by multiple each of which has a 1-Mbyte L2 cache, threads isn’t enough in a new world of and a graphics processor containing 400 power constraints. The product innovations stream cores......

6 Published by the IEEE Computer Society 0272-1732/12/$31.00 c 2012 IEEE [3B2-9] mmi2012020006.3d 10/3/012 14:56 Page 7

Coordinating cores and threads and area-efficient computation. Recent pro- ‘‘Godson-T: An Efficient Many-Core jects include the 36-processor and 167- Processor Exploring Thread-Level Parallel- processor AsAP chips and applications. Baas ism’’ by Dongrui Fan et al. primarily con- has a PhD in electrical engineering from cerns itself with using dedicated hardware Stanford University. He is a senior mem- to accelerate how multiple cores synchronize ber of IEEE. and communicate in a single application to avoid idle cycles. The 16-core, 130-nm, Direct questions and comments about 300 MHz prototype chip gives a good pic- this article to [email protected]. ture of what can be expected in the upcom- ing 64-processor Godson-T. Finally, in ‘‘The IBM Blue Gene/Q Compute Chip,’’ Ruud A. Haring and the IBM Blue Gene team show how cache, memory, floating-point architectures, and a server network interface can all be combined in a single chip to yield performance that can best humans at what humans (think they) do best. The 45-nm SOI die contains 18 process- ors and a massive 32-Mbyte L2 cache made from embedded DRAM.

f course, these weren’t the only O interesting presentations made at Hot Chips, but only those that could fit into an IEEE Micro special issue. The full set of presentation slides for Hot Chips 23 can be accessed online at http://www.hotchips. org/conference-archives/hot-chips-23. We hope you enjoy this special issue. We thank our numerous reviewers for insightful comments and the Hot Chips program com- mittee for putting together a program with so many fine presentations. MICRO

Allen Baum is a computer architect at Intel. He has worked in the industry for 40 years on products ranging from the Apple I and II to HP’s HP45 and PA-RISC, ARM and StrongArm, and Alpha, , and x86 server processors. Baum has an MS in electrical engineering from the Massachusetts Institute of Technology. He is a senior member of IEEE and a member of the ACM and the IEEE Computer Society.

Bevan Baas is an associate professor in the Department of Electrical and Computer Engineering at the University of California, Davis. His research interests include algo- rithms, architectures, circuits, and VLSI design for high-performance, energy-efficient, ......

MARCH/APRIL 2012 7