The Green500 List: Encouraging Sustainable Supercomputing

Wu-chun Feng and Kirk W. Cameron, Virginia Tech

The performance-at-any-cost design mentality ignores supercomputers' excessive power consumption and need for heat dissipation, which will ultimately limit their performance. Without a fundamental change in the design of supercomputing systems, the performance advances common over the past two decades won't continue.

Although there's now been a 10,000-fold increase since 1992 in the performance of supercomputers running parallel scientific applications, performance per watt has improved only 300-fold and performance per square foot only 65-fold. In response to the lagging power and space-efficiency improvements, researchers have had to design and construct new machine rooms, and in some cases, entirely new buildings.

Compute nodes' exponentially increasing power requirements are a primary driver behind this less efficient use of power and space. In fact, the top supercomputers' peak power consumption has been on the rise over the past 15 years, as Figure 1 shows.

Today, the 10 most powerful supercomputers on the TOP500 List (www.top500.org) each require up to 10 megawatts of peak power, enough to sustain a city of 40,000. And even though IBM BlueGene/L, the world's fastest machine, was custom-built with low-power components, the system still consumes several megawatts of power. At anywhere from $200,000 to $1.2 million per megawatt, per year, these are hardly low-cost machines.

THE ENERGY CRISIS IN SUPERCOMPUTING

Power is a disruptive technology that requires us to rethink supercomputer design. As a supercomputer's nodes consume and dissipate more power, they must be spaced out and aggressively cooled. Without exotic cooling facilities, overheating makes traditional supercomputers too unreliable for application scientists to use. Unfortunately, building exotic cooling facilities can cost as much as the supercomputer itself, and operating and maintaining the facilities costs even more.

As "The Energy-Efficient Green Destiny" sidebar details, the low-power supercomputer that we developed was extremely reliable, with no unscheduled downtime in its two-year lifespan, despite residing in a dusty warehouse without cooling, humidification, or air filtration. The hourly cost of such downtime ranges from $90,000 for a catalog sales operation to nearly $6.5 million for a brokerage operation, according to Contingency Planning Research's 2001 cost-of-downtime survey. There's still no guarantee that a supercomputer won't fail, as Table 1 illustrates. Total cost of ownership now exceeds initial acquisition costs.

Performance at any cost

The performance-at-any-cost supercomputer design paradigm is no longer feasible. Clearly, without significant change in design, the performance gains of the past two decades won't continue. Unfortunately, performance-only metrics don't capture improvements in power efficiency. Nonetheless, performance-only metrics derived from the Linpack benchmarks and Standard Performance Evaluation Corp.'s (SPEC) code suite have significantly influenced the design of modern high-performance systems, including servers and supercomputers.

Figure 1. Rising power requirements. Peak power consumption of the top supercomputers has steadily increased over the past 15 years. [The figure plots, from 1993 to 2009, performance from 1 Gflop to 10^6 Gflops (Top5 average Linpack Rmax versus real application performance, with the difference labeled the efficiency gap) alongside the peak power requirements of representative systems (TMC CM-5, 5 kW; Numerical Wind Tunnel, 100 kW; Intel ASCI Red, 850 kW; IBM SP ASCI White, 6,000 kW; Earth Simulator, 12,000 kW) and reference points (residential air conditioner, 15 kW; commercial data center, 1,374 kW; high-speed electric train, 10,000 kW; small power plant generating capacity, 300,000 kW).]

The Energy-Efficient Green Destiny

As a first step toward reliable and available energy-efficient supercomputing, in 2002 we built a low-power supercomputer at Los Alamos National Laboratory. Dubbed Green Destiny, the 240-processor supercomputer took up 5 square feet (the size of a standard computer rack) and had a 3.2-kilowatt power budget (the equivalent of two hairdryers) when booted diskless.1,2 Its 101-gigaflop Linpack rating (equivalent to a 256-processor SGI Origin 2000 supercomputer or a Cray T3D MC1024-8) would have placed it at no. 393 on the 2002 TOP500 List.

Garnering widespread media attention, Green Destiny delivered reliable supercomputing with no unscheduled downtime in its two-year lifetime. It endured sitting in a dusty warehouse at temperatures of 85-90 degrees Fahrenheit (29-32 degrees Celsius) and an altitude of 7,400 feet (2,256 meters). Furthermore, it did so without air-conditioning, humidification control, air filtration, or ventilation.

Yet despite Green Destiny's accomplishments, not everyone was convinced of its potential. Comments ranged from Green Destiny being so low power that it ran just as fast when it was unplugged to the notion that no one in HPC would ever care about power and cooling.

However, in the past year, we've seen a dramatic attitude shift with respect to power and energy, particularly in light of how quickly supercomputers' thermal power envelopes have increased in size, thus adversely impacting the systems' power and cooling costs, reliability, and availability.

The laboratory's Biosciences Division bought a Green Destiny replica about six months after Green Destiny's debut. In 2006, we donated Green Destiny to the division so it could run a parallel bioinformatics code called mpiBLAST. Both clusters are run in the same environment, yet half of the nodes are inoperable on the replica, which uses higher-powered processors. Hence, although the original Green Destiny was 0.150 gigahertz slower in clock speed, its productivity in answers per month was much better than that of the faster but often inoperable replica.

Green Destiny is no longer used for computing and resides in the Computer History Museum in Mountain View, California.

References
1. W. Feng, "Making a Case for Efficient Supercomputing," ACM Queue, Oct. 2003, pp. 54-64.
2. W. Feng, "The Importance of Being Low Power in High-Performance Computing," Cyberinfrastructure Technology Watch, Aug. 2005, pp. 12-21.

Table 1. Reliability and availability of large-scale computing systems.

System | Processors | Reliability and availability
ASC Q | 8,192 | Mean time between interrupts: 6.5 hours; 114 unplanned outages/month. Outage sources: storage, CPU, memory
ASC White | 8,192 | Mean time between failures: 5 hours (2001) and 40 hours (2003). Outage sources: storage, CPU, third-party hardware
PSC Lemieux | 3,016 | Mean time between interrupts: 9.7 hours. Availability: 98.33 percent
Google | 450,000 (estimate) | 600 reboots/day; 2-3 percent replacement/year. Outage sources: storage and memory. Availability: ~100 percent

Source: D.A. Reed

Developing new metrics

Performance-only metrics are likely to remain valuable for comparing existing systems prior to acquisition and for helping drive system design. Nonetheless, we need new metrics that capture design differences in energy efficiency. For example, two hypothetical high-performance machines could both achieve 100 teraflops running Linpack and secure a high equivalent ranking on the TOP500 List. But enable smart-power-management hardware or software1,2 on one machine that can sustain performance and reduce energy consumption by 10 percent, and the TOP500 rankings remain the same.

Unfortunately, metric development is fraught with technical and political challenges. On the technical side, operators must perceive the metric and its associated benchmarks as representative of the workloads typically running on the production system. On the political side, metrics and benchmarks need strong community buy-in.

THE GREEN500 LIST

We've been working to improve awareness of energy-efficient supercomputer (and data-center) design since the turn of the century. After interacting with government agencies, vendors, and academics, we identified a need for metrics to fairly evaluate large systems that run scientific production codes. We considered a number of methodologies for use in ranking supercomputer efficiency. To promote community buy-in, the initial Green500 List uses a single metric and a widely accepted workload, while the intent is to extend the Green500 methodology to eventually include rankings for a suite of parallel scientific applications.

The Green500 List ranks supercomputers based on the amount of power needed to complete a fixed amount of work. This effort is focused on data-center-sized deployments used primarily for scientific production codes. In contrast, the SPECPower subcommittee of SPEC is developing power-performance efficiency benchmarks for servers running commercial production codes. The diverse types of evaluations that efforts like the Green500 and SPECPower (www.spec.org/specpower) provide will give users more choice in determining efficiency metrics for their systems and applications.

Measuring efficiency

In the Green500 effort, we treat both performance (speed) and power consumption as first-class design constraints for supercomputer deployments.

Speed and workload. The supercomputing community already accepts the flops metric for the Linpack benchmark, which the TOP500 List uses. Although TOP500 principals acknowledge that Linpack isn't the be-all or end-all benchmark for high-performance computing (HPC), it continues to prevail despite the emergence of other benchmarks. As other benchmark suites gain acceptance, most notably the SPEChpc3 and HPC Challenge benchmarks,4 we plan to extend our Green500 List methodology as mentioned earlier. For now, since the HPC community seems to identify with the notion of a clearly articulated and easily understood single number that indicates a machine's prowess, we opt to use floating-point operations per second (flops) as the speed metric for supercomputer performance and the Linpack benchmark as a scalable workload.

EDn metric. There are many possibilities for performance-efficiency metrics, including circuit design's EDn metric, with E standing for the energy a system uses while running a benchmark, D for the time to complete that same benchmark,5-8 and n a weight for the delay term. However, the EDn metrics are biased when applied to supercomputers, particularly as n increases. For example, with large values of n, the delay term dominates, so that very small changes in execution time impact the metric dramatically and render changes in E undetectable in comparisons.
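To make that bias concrete, the following minimal Python sketch (an editorial illustration, not part of the Green500 methodology) compares two hypothetical systems under the EDn metric; the energy and delay figures are assumptions chosen only to show the effect of increasing n.

```python
def edn(energy_joules, delay_seconds, n):
    """Energy-delay metric from circuit design: E * D^n (lower is better)."""
    return energy_joules * delay_seconds ** n

# Hypothetical systems: B saves 10 percent energy but runs 5 percent slower than A.
system_a = {"energy": 1.0e9, "delay": 1000.0}   # illustrative values, not measurements
system_b = {"energy": 0.9e9, "delay": 1050.0}

for n in (1, 2, 3):
    a = edn(system_a["energy"], system_a["delay"], n)
    b = edn(system_b["energy"], system_b["delay"], n)
    better = "B" if b < a else "A"
    print(f"n={n}: ED^n favors system {better} (ratio B/A = {b / a:.3f})")
```

With n = 3, the 5 percent slowdown already outweighs the 10 percent energy saving, illustrating how the delay term dominates as n grows.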
Flops per watt. For the Green500 List, we opted to use flops per watt as the power-efficiency metric. However, this metric might be biased toward smaller supercomputing systems. A supercomputer's wattage will scale (at least) linearly with the number of compute nodes, while the flops performance will scale (at most) linearly for embarrassingly parallel problems and sublinearly for all other problems. This implies that smaller systems would have better ratings on such a scale.

Nonetheless, flops per watt is easy to measure and has traction in the scientific community. Furthermore, we can reduce the bias toward small systems by ranking only systems that first achieve a minimum performance rating. We simply set a minimum flops threshold for entry into the Green500 List and allow bigger supercomputers to rerun their Linpack benchmark, if desired, to meet this minimum threshold and obtain the corresponding power consumption during the benchmark's rerun. That is, the Green500 List ranks supercomputers based on the amount of power needed to complete a fixed amount of work at a rate greater than or equal to the minimum flops threshold.
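The ranking rule can be illustrated with a short sketch, again using placeholder system names, threshold, and numbers rather than official Green500 data: filter out systems below a minimum Linpack rating, then order the rest by flops per watt.

```python
# Sketch of the ranking rule: filter by a minimum Linpack rating,
# then rank the qualifying systems by flops per watt (here Mflops/W).
# The threshold and the entries below are illustrative placeholders.

MIN_LINPACK_GFLOPS = 4_000  # e.g., the rating of the no. 500 system on the latest TOP500 List

systems = [
    # (name, Linpack Gflops, measured power in kW)
    ("SystemA", 280_600, 1_200),
    ("SystemB", 12_250, 310),
    ("SystemC", 3_500, 40),     # falls below the threshold and is excluded
]

def mflops_per_watt(gflops, kilowatts):
    # Gflops / kW equals Mflops / W because numerator and denominator both scale by 1,000.
    return gflops / kilowatts

qualified = [s for s in systems if s[1] >= MIN_LINPACK_GFLOPS]
ranking = sorted(qualified, key=lambda s: mflops_per_watt(s[1], s[2]), reverse=True)

for rank, (name, gflops, kw) in enumerate(ranking, start=1):
    print(f"{rank}. {name}: {mflops_per_watt(gflops, kw):.2f} Mflops/W")
```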

Measuring power consumption

Even after choosing the benchmark and the power-efficiency metric, issues surrounding the use of the flops-per-watt metric for a given supercomputer remained unresolved. First, starting with the metric's numerator, what will be the minimum flops threshold for entry into the Green500 List? We intend to use the flops rating that the no. 500 supercomputer achieves on the latest TOP500 List as the Green500 List's minimum flops threshold.

With the numerator addressed, this leaves wattage as the denominator. Surprisingly to some, the denominator for flops per watt might be more difficult to determine, since there are many permutations for what we could measure and report. For example, we could

• measure the entire supercomputer's power consumption,
• measure a single supercomputer node and extrapolate it to the entire supercomputer, or
• use the manufacturers' advertised peak power numbers (as we used in Figure 1).

Measuring wattage for a supercomputer the size of a basketball court is difficult. However, using advertised peak power numbers could result in overinflated power figures. We suggest measuring a single compute node's power consumption and multiplying by the number of compute nodes (loosely defining a node as an encased chassis, whether the chassis has the form factor of a standard 1U server or an entire rack).
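A minimal sketch of the suggested extrapolation, with an assumed per-node power and node count (illustrative values only, not measurements of any particular machine):

```python
def estimate_system_power_kw(measured_node_watts, node_count):
    """Extrapolate whole-system power from one measured node (a 'node' here is
    loosely an encased chassis, whether a 1U server or an entire rack)."""
    return measured_node_watts * node_count / 1000.0

# Illustrative numbers: a compute node drawing 320 W, replicated 1,024 times.
print(f"Estimated system power: {estimate_system_power_kw(320, 1024):.0f} kW")
```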

Power meters. To measure power consumption, we propose using a power meter that can sample consumption at granularities of one second or less. The digital meters range in capability from the commodity Watts Up? Pro (www.wattsupmeters.com) to the industrial-strength Yokogawa WT210/WT230 (http://yokogawa.com/tm/wtpz/wt210/tm-wt210_01.htm). Figure 2 shows a high-level diagram of how a digital power meter measures a given system under test (a single compute node) via a common power strip and logs the measurements to a profiling computer.

Figure 2. Power-measurement infrastructure. A digital power meter measures a system under test via a common power strip and logs the measurements to a profiling computer.

Duration. We also need to address how long to measure power and what we should record and report. Given the meters' recording capabilities, we suggest measuring and recording power consumption for the duration of the Linpack run and using the average power consumption over the entire run. Coupling average power consumption with the Linpack run's execution time adds inferred overall energy consumption, where energy is average power multiplied by time.
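The bookkeeping during a run reduces to a few lines; the sketch below assumes one sample per second from the meter and uses made-up readings in place of a real log.

```python
def average_power_and_energy(samples_watts, sample_interval_s=1.0):
    """Average power over a run and the inferred energy (average power x elapsed time)."""
    elapsed_s = len(samples_watts) * sample_interval_s
    avg_watts = sum(samples_watts) / len(samples_watts)
    return avg_watts, avg_watts * elapsed_s  # (watts, joules)

# Illustrative one-sample-per-second readings; in practice these would be parsed
# from the digital power meter's log for the full duration of the Linpack run.
samples = [312.0, 318.5, 325.1, 330.4, 327.8, 321.2]

avg_w, energy_j = average_power_and_energy(samples)
print(f"Average power: {avg_w:.1f} W over {len(samples)} s; "
      f"inferred energy: {energy_j:.0f} J ({energy_j / 3.6e6:.6f} kWh)")
```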
Cooling. Finally, we considered whether to include the cooling-facility power consumption in the measurement. We decided against inclusion because the Green500 List is intended to measure the supercomputer's power efficiency, not that of cooling systems (which vary widely in power efficiency). Even if we considered cooling for the supercomputer under test, it would be difficult to break out and measure cooling's specific contribution for one supercomputer, given that cooling facilities are designed to support all the machines in a given machine room.

TOP500 VERSUS GREEN500

Table 2 presents the Green500 and TOP500 rankings of eight supercomputers, as well as their flops ratings and their peak power usage. The table also shows the results of applying the flops-per-watt metric to these supercomputers using their peak performance number (for peak power efficiency) and their Linpack performance number (for actual power efficiency).

As mentioned, using peak power numbers for comparisons isn't optimal. Nonetheless, relative comparisons using peak power numbers are useful for gauging power-efficiency progress. Beginning with the November 2007 Green500 List, we'll use metered measurements in rankings whenever available. As the list matures, we anticipate metering and verifying all measurements.

Various presentations, Web sites, and magazine and newspaper articles provide the sources for these peak power numbers. For the IBM BlueGene/L supercomputer at Lawrence Livermore National Laboratory, the TOP500 wiki reports 1.5 MW as its peak power consumption. LLNL's Web site reports that 7.5 MW is needed to power and cool ASC Purple, while Eurekalert estimates it uses 8 MW. According to LLNL, for every watt of power the system consumes, 0.7 watts of power is required to cool it. Hence, the power required to merely run ASC Purple would be between 4.4 and 4.7 MW, which matches the 4.5 MW figure provided in a presentation at a BlueGene/L workshop.
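That inference is easy to check with the figures quoted above (7.5 to 8 MW total, plus 0.7 watts of cooling per watt of system power); the snippet below is a back-of-the-envelope confirmation, not an independent measurement.

```python
# If total facility power = system power + 0.7 x system power = 1.7 x system power,
# then system power = total / 1.7. Totals quoted in the text: 7.5 and 8 MW.
for total_mw in (7.5, 8.0):
    system_mw = total_mw / 1.7
    print(f"{total_mw} MW total -> about {system_mw:.1f} MW just to run the system")
# Prints roughly 4.4 and 4.7 MW, matching the range cited above.
```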

Table 2. June 2007 Green500 and TOP500 rankings.

Green500 rank (peak power efficiency) | Supercomputer | Peak performance (Gflops) | Linpack performance (Gflops) | Peak power (kW) | Peak power efficiency (Mflops/W) | Green500 rank (power efficiency) | Power efficiency (Mflops/W) | TOP500 rank
1 | BlueGene/L | 367,000 | 280,600 | 1,200 | 305.83 | 1 | 233.83 | 1
2 | MareNostrum | 94,208 | 62,630 | 1,344 | 70.10 | 2 | 46.60 | 9
3 | Jaguar | 119,350 | 101,700 | 2,087 | 57.19 | 4 | 48.73 | 2
4 | System X | 20,240 | 12,250 | 310 | 65.29 | 3 | 39.52 | 71
5 | Columbia | 60,960 | 51,870 | 2,000 | 30.48 | 5 | 25.94 | 13
6 | ASC Purple | 92,781 | 75,760 | 4,500 | 20.62 | 6 | 16.84 | 6
7 | ASC Q | 20,480 | 13,880 | 2,000 | 10.24 | 7 | 6.94 | 62
8 | Earth Simulator | 40,960 | 35,860 | 7,680 | 5.33 | 8 | 4.67 | 20

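The two efficiency columns in Table 2 follow directly from the raw numbers; the sketch below recomputes them for two of the listed systems (Gflops divided by kW equals Mflops/W).

```python
# Recompute Table 2's efficiency columns: peak efficiency uses peak Gflops,
# actual efficiency uses Linpack Gflops; both are divided by peak power in kW.
table2 = {
    # name: (peak Gflops, Linpack Gflops, peak power kW)
    "BlueGene/L":      (367_000, 280_600, 1_200),
    "Earth Simulator": (40_960, 35_860, 7_680),
}

for name, (peak, linpack, kw) in table2.items():
    print(f"{name}: peak {peak / kw:.2f} Mflops/W, actual {linpack / kw:.2f} Mflops/W")
# BlueGene/L: 305.83 and 233.83; Earth Simulator: 5.33 and 4.67, as listed in Table 2.
```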

Jaguar at Oak Ridge National Laboratory is a hybrid system consisting of 56 XT3 cabinets and 68 XT4 cabinets. The peak power consumption of an XT3 cabinet is 14.5 kW, while that of an XT4 cabinet is 18.75 kW, per Cray datasheets. Thus, Jaguar's aggregate peak power is about 2 MW.

The 4,800-processor MareNostrum debuted fifth on the June 2005 TOP500 List with an estimated 630-kW power budget to run the machine. More recently, Barcelona Supercomputing Center's MareNostrum was upgraded and expanded into a 10,240-processor BladeCenter JS21 cluster. If we extrapolate from the original MareNostrum's 630-kW power budget, the 10,240-processor MareNostrum would have a power budget of 1.3 MW.

For the Columbia supercomputer at NASA Ames Research Center, the reported power usage just to run the system is 2 MW. The thermal design power of the Itanium 2 processor is 130 watts, so it takes 1.33 MW to run Columbia's 10,240 processors. Therefore, 2 MW seems reasonable if Columbia's other components use only 700 kW of power, consistent with our Itanium-based server's power profile.

In November 2005, it took an estimated 1.33 MW to power Jaguar's 5,200 processors. Jaguar's doubling in size to 10,424 processors a year later raised the extrapolated power budget to 2.66 MW.

Powering and cooling Japan's 5,120-processor Earth Simulator requires 11.9 MW, enough to power a city of 40,000 and a 27,000-student university. The Earth Simulator configures the 5,120 processors into 640 eight-way nodes, where each eight-way node uses 20 kilovolt-amperes. Assuming a typical power-factor conversion of 0.6, each node then consumes 20 kVA × 0.6 = 12 kW. Thus, power consumption for the entire 640-node Simulator is 640 × 12 kW = 7,680 kW, leaving 4,220 kW for cooling.
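The cabinet- and node-level arithmetic above can be reproduced directly from the quoted figures; the following sketch is simply a sanity check of those numbers, not new data.

```python
# Jaguar: 56 XT3 cabinets at 14.5 kW plus 68 XT4 cabinets at 18.75 kW (per Cray datasheets).
jaguar_kw = 56 * 14.5 + 68 * 18.75
print(f"Jaguar aggregate peak power: {jaguar_kw:.0f} kW (~2 MW)")  # 2,087 kW

# Earth Simulator: 640 eight-way nodes at 20 kVA each, power factor 0.6,
# out of 11.9 MW total for power and cooling.
node_kw = 20 * 0.6                 # 12 kW per node
compute_kw = 640 * node_kw         # 7,680 kW
cooling_kw = 11_900 - compute_kw   # 4,220 kW left for cooling
print(f"Earth Simulator: {compute_kw:.0f} kW compute, {cooling_kw:.0f} kW cooling")
```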
The power budgets for ASC Q and ASC White run at approximately 2 MW each, while System X at Virginia Tech consumes a paltry 310 kW, as measured directly from System X's power distribution units. As Table 2 shows, despite its large size, BlueGene/L is the only custom low-power supercomputer among the TOP500. It's routinely the highest-ranking supercomputer on both the TOP500 and Green500 lists, with a performance-power ratio that's up to two orders of magnitude better than the other supercomputers in Table 2.

The power efficiencies of MareNostrum (semicommodity) and System X (commodity) are 2.5 times better than those of the other supercomputers, and this ranked them second and fourth on the June 2007 Green500 List, as shown in Table 2. Interestingly, Apple, IBM, and Motorola's commodity PowerPC processor drives both of these power-efficient supercomputers. On the other hand, ASC Purple, which ranked sixth on the TOP500 List, is also based on the PowerPC processor, albeit the Power5, its higher-powered relative. The Power5 ultimately contributes to ASC Purple's lower power efficiency and its sixth-place ranking on the 2007 Green500.

OPERATIONAL COSTS AND RELIABILITY

Power consumption has become an increasingly important issue in HPC. Ignoring power consumption as a design constraint results in an HPC system with high operational costs and diminished reliability, which often translates into lost productivity.

With respect to high operational costs, ASC Purple has a 7.5-MW appetite (approximately 4.5 MW to power the system and 3 MW for cooling). With a utility rate of 12 cents per kilowatt-hour, the annual electric bill for this system would run nearly $8 million. If we scaled this architecture to a petaflop machine, powering up and cooling down the machine would require approximately 75 MW. The system's annual power bill could run to $80 million, assuming energy costs remained the same.
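The operational-cost estimate follows directly from the utility rate; here is a minimal sketch assuming the quoted 12 cents per kilowatt-hour and a constant, year-round draw.

```python
HOURS_PER_YEAR = 24 * 365
RATE_PER_KWH = 0.12  # dollars, as quoted in the text

def annual_power_bill(megawatts):
    """Annual electricity cost for a constant draw at the quoted utility rate."""
    return megawatts * 1_000 * HOURS_PER_YEAR * RATE_PER_KWH

print(f"ASC Purple (7.5 MW): ${annual_power_bill(7.5) / 1e6:.1f} million/year")           # ~7.9
print(f"Petaflop-scale machine (75 MW): ${annual_power_bill(75) / 1e6:.0f} million/year")  # ~79
```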

Table 1 shows that the reliability and availability of large-scale systems, ranging from supercomputers to a large-scale server farm, is often measured in hours. Further scaling of such supercomputers and data centers would result in several failures per minute.9 This diminished reliability results in millions of dollars per hour in lost productivity.

In light of the above, the HPC community could use an EnergyGuide sticker, such as the Green Destiny sticker shown in Figure 4. The community could also use studies showing that annual server power and cooling costs are approaching annual spending on new machines.

The HPC community needs a Green500 List to rank supercomputers on speed and power requirements and to supplement the TOP500 List. Vendors and system architects worldwide take substantial pride in, and invest tremendous effort toward, making the biannual TOP500 List. We anticipate that the Green500 List effort will do the same and encourage the HPC community and operators of Internet data centers to design more power-efficient supercomputers and large-scale data centers. For the latest Green500 List, visit www.green500.org.

Acknowledgments

We thank David Bailey, , John Shalf, and Horst Simon for their support. Intel, IBM, Virginia Tech, the US Department of Energy, and the NSF (CNS 0720750 and 0615276; CCF 0709025 and 0634165) sponsored portions of this work. We also acknowledge colleagues who suggested a Green500 List after an April 2005 keynote at the IEEE International Parallel & Distributed Processing Symposium and a follow-up talk at North Carolina State University. This article is dedicated to those who lost their lives in the 16 April 2007 tragedy at Virginia Tech.

Figure 4. EnergyGuide sticker for Green Destiny. Such a sticker could remind those in the HPC community of a computer's energy use and hourly operating costs.

References
1. C. Hsu and W. Feng, "A Power-Aware Run-Time System for High-Performance Computing," Proc. ACM/IEEE SC Conf. (SC|05), IEEE CS Press, 2005, p. 1.
2. K.W. Cameron et al., "High-Performance, Power-Aware Distributed Computing for Scientific Applications," Computer, Nov. 2005, pp. 40-47.
3. M. Mueller, "Overview of SPEC HPC Benchmarks," BOF presentation, ACM/IEEE SC Conf. (SC|06), 2006.
4. J. Dongarra and P. Luszczek, Introduction to the HPC Challenge Benchmark Suite, tech. report, Univ. Tennessee, 2004; www.cs.utk.edu/~luszczek/pubs/hpcc-challenge-intro.pdf.
5. A. Martin, "Towards an Energy Complexity of Computation," Information Processing Letters, vol. 77, no. 2-4, 2001, pp. 181-187.
6. D. Brooks and M. Martonosi, "Dynamically Exploiting Narrow Width Operands to Improve Processor Power and Performance," Proc. 5th Int'l Symp. High-Performance Computer Architecture, IEEE CS Press, 1999, p. 13.
7. R. Gonzalez and M. Horowitz, "Energy Dissipation in General-Purpose Microprocessors," IEEE J. Solid-State Circuits, Sept. 1996, pp. 1277-1284.
8. A. Martin, M. Nyström, and P. Penzes, ET2: A Metric for Time and Energy Efficiency of Computation, Kluwer Academic Publishers, 2002.
9. S. Graham, M. Snir, and C. Patterson, eds., Getting Up to Speed: The Future of Supercomputing, Nat'l Academies Press, 2005.

Wu-chun Feng is an associate professor of computer science and electrical and computer engineering at Virginia Tech. His research interests are high-performance networking and computing. Feng received a PhD in computer science from the University of Illinois at Urbana-Champaign. He is a senior member of the IEEE Computer Society. Contact him at [email protected].

Kirk W. Cameron is an associate professor of computer science at Virginia Tech and director of its Scalable Performance Lab. His research interests are power and performance in high-performance applications and systems. Cameron received a PhD in computer science from Louisiana State University. He is a member of the IEEE Computer Society. Contact him at [email protected].

