Overview of the K computer System

 Hiroyuki Miyazaki  Yoshihiro Kusano  Naoki Shinjou  Fumiyoshi Shoji  Mitsuo Yokokawa  Tadashi Watanabe

RIKEN and Fujitsu have been working together to develop the K computer, with the aim of beginning shared use by the fall of 2012, as a part of the High-Performance Computing Infrastructure (HPCI) initiative led by Japan's Ministry of Education, Culture, Sports, Science and Technology (MEXT). Since the K computer involves over 80 000 compute nodes, building it with low power consumption and high reliability was important from the availability point of view. This paper describes the K computer system and the measures taken for reducing power consumption and achieving high reliability and high availability. It also presents the results of implementing those measures.

1. Introduction

Fujitsu has been actively developing and providing advanced supercomputers for over 30 years since its development of the FACOM 230-75 APU, Japan's first supercomputer, in 1977 (Figure 1). As part of this effort, it has been developing its own hardware, including original processors, and software, building up its technical expertise in supercomputers along the way.

The sum total of this technical expertise has been applied to developing a massively parallel computer system, the K computer,1) note)i which has been ranked as the top-performing supercomputer in the world.

The K computer was developed jointly by RIKEN and Fujitsu as part of the High Performance Computing Infrastructure (HPCI) initiative overseen by Japan's Ministry of Education, Culture, Sports, Science and Technology (MEXT). As the name "Kei" in Japanese implies, one objective of this project was to achieve a computing performance of 10^16 floating-point operations per second (10 PFLOPS). The K computer, moreover, was developed not just to achieve peak performance in benchmark tests but also to ensure high effective performance in applications used in actual research. Furthermore, to enable the entire system to be installed and operated at one location, it was necessary to reduce power consumption and provide a level of reliability that could ensure the total operation of a large-scale system.

To this end, four development targets were established.
• A high-performance CPU for scientific computation
• New interconnect architecture for massively parallel computing
• Low power consumption
• High reliability and high availability

The CPU and interconnect architecture are described in detail in other papers of this special issue.2),3) In this paper, we present an overview of the K computer system, describe the measures taken for reducing power consumption and achieving high reliability and high availability at the system level of the K computer, and present the results of implementing those measures.

note)i "K computer" is the English name that RIKEN has been using for the supercomputer of this project since July 2010. "K" comes from the Japanese word "Kei," which means ten peta or 10 to the 16th power.


Figure 1
History of supercomputer development at Fujitsu (timeline from about 1985 to 2015, spanning vector machines, scalar machines/MPP, and x86 clusters up to the peta-scale K computer).

2. Compute-node configuration in the K computer

We first provide an overview of the compute nodes, which lie at the center of the K computer system. A compute node consists of a CPU, memory, and an interconnect.

1) CPU
We developed an 8-core CPU with a theoretical peak performance of 128 GFLOPS called the "SPARC64 VIIIfx" as the CPU for the K computer (Figure 2).4) The SPARC64 VIIIfx features Fujitsu's advanced 45-nm semiconductor process technology and a world-class power performance of 2.2 GFLOPS/W achieved by implementing power-reduction measures from both process-technology and design points of view. This CPU applies High Performance Computing-Arithmetic Computational Extensions (HPC-ACE) applicable to scientific computing and analysis. It also uses a 6-MB/12-way sector cache as an L2 cache and uses a Virtual Single Processor by Integrated Multicore Architecture (VISIMPACT), whose effectiveness has previously been demonstrated on Fujitsu's FX1 high-end technical computing server. As a result of these features, the SPARC64 VIIIfx achieves high execution performance in the HPC field. In addition, the system-control and memory-access-control (MAC) functions, which were previously implemented on separate chips, are now integrated in the CPU, resulting in high memory throughput and low latency.
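The quoted peak and efficiency figures are consistent with simple arithmetic, sketched below. The 8 floating-point operations per core per cycle is an inferred figure (it is what 128 GFLOPS implies at the 2.0-GHz operating frequency mentioned in Section 5 with 8 cores), and the roughly 58-W processor power is taken from the title of reference 9; this is a cross-check, not a specification.

```python
# Peak-performance cross-check for the SPARC64 VIIIfx (illustrative only).
# Assumption: 8 double-precision floating-point operations per core per cycle,
# which is what the quoted 128 GFLOPS implies at 2.0 GHz with 8 cores.

cores = 8
clock_hz = 2.0e9            # operating frequency (see Section 5)
flops_per_core_cycle = 8    # inferred from the published peak figure

peak_flops = cores * clock_hz * flops_per_core_cycle
print(f"Peak performance: {peak_flops / 1e9:.0f} GFLOPS")   # 128 GFLOPS

# Power efficiency, using the ~58 W processor power cited in reference 9.
processor_power_w = 58
print(f"Power efficiency: {peak_flops / 1e9 / processor_power_w:.1f} GFLOPS/W")  # ~2.2
```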


Figure 2
SPARC64 VIIIfx.

Figure 3
Tofu interconnect.

2) Memory
Commercially available DDR3-SDRAM DIMM is used as main memory. The dual in-line memory module (DIMM) is a commodity module that is also used in servers and PC clusters, which provides a number of advantages. For example, it can be multi-sourced, a stable supply of about 700 000 modules can be obtained in a roughly one-year manufacturing period with a stable level of quality, and modules with superior power-consumption specifications can be selected. Besides this, an 8-channel memory interface per CPU provides a peak memory bandwidth of 64 GB/s, surpassing that of competitors' computers, and the high memory throughput deemed necessary for scientific computing.

3) Interconnect
To provide a computing network (interconnect), we developed and implemented an interconnect architecture called "Tofu" for massively parallel computers in excess of 80 000 nodes.5) The Tofu interconnect (Figure 3) constitutes a direct interconnection network that provides scalable connections with low latency and high throughput for a massively parallel group of CPUs and achieves high availability and operability by using a 6D mesh/torus configuration. It employs an extended dimension-order routing algorithm that enables the provision of a CPU group joined in a 1–3 dimensional torus so that a faulty node in the system becomes non-existent as far as the user is concerned. The Tofu interconnect also makes it possible to allocate to the user a complete CPU group joined in a 1–3 dimensional torus even if part of the system has been extracted for use.

A compute node of the K computer consists of the CPU/DIMM described above and an InterConnect Controller (ICC) chip for implementing the interconnect. Since the Tofu interconnect forms a direct interconnection network, the ICC includes a function for relaying packets between other nodes (Figure 4). If the node section of a compute node should fail, only the job using that compute node will be affected, but if the router section of that compute node should fail, all jobs using that compute node as a packet relay point will be affected. The failure rate of the router section is nearly proportional to the amount of circuitry that it contains, and it needs to be made significantly lower than the failure rate of the node section. ICC-chip faults are therefore categorized in accordance with their impact, which determines the action taken at the time of a fault occurrence. The end result is a mechanism for continuing system operation in the event of a fault in a node section.
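To make the six-dimensional addressing concrete, the sketch below models a node address as an (x, y, z, a, b, c) tuple and enumerates its directly connected neighbors. The axis sizes and the choice of which axes wrap around are hypothetical, chosen only to illustrate the mesh/torus distinction; they are not the actual Tofu dimensions.5)

```python
# Illustrative model of 6D mesh/torus addressing; axis sizes and wrap-around
# choices below are hypothetical, not the actual Tofu parameters.

AXES = ("x", "y", "z", "a", "b", "c")
SIZES = {"x": 4, "y": 4, "z": 4, "a": 2, "b": 3, "c": 2}       # hypothetical
IS_TORUS = {"x": True, "y": True, "z": True, "a": False, "b": True, "c": False}

def neighbors(coord):
    """Return the coordinates directly linked to a node in the 6D network."""
    result = []
    for i, axis in enumerate(AXES):
        for step in (-1, +1):
            value = coord[i] + step
            if IS_TORUS[axis]:
                value %= SIZES[axis]          # torus axis: wrap around the ring
            elif not 0 <= value < SIZES[axis]:
                continue                      # mesh axis: no link past the edge
            result.append(coord[:i] + (value,) + coord[i + 1:])
    return result

origin = (0, 0, 0, 0, 0, 0)
print(len(neighbors(origin)), neighbors(origin)[:3])
```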


Figure 4
Conceptual diagram of node and router sections.
(MAC: memory access controller; ICC: InterConnect Controller; TNI: Tofu network interface; IF: interface; PCIe: Peripheral Component Interconnect Express)
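The impact-based categorization of ICC faults described above can be illustrated with a small sketch. The fault categories and job bookkeeping below are hypothetical (the paper does not describe the actual interface); the sketch only captures the stated distinction that a node-section fault affects the local job alone, whereas a router-section fault also affects every job relaying packets through that node.

```python
# Hypothetical illustration of fault-impact categorization for an ICC fault.
# A node-section fault affects only the job running on that node; a
# router-section fault also affects every job relaying packets through it
# (assumed here to include the local job as well).

def affected_jobs(fault_section, local_job, relaying_jobs):
    """Return the set of jobs impacted by a fault in the given chip section."""
    if fault_section == "node":
        return {local_job} if local_job else set()
    if fault_section == "router":
        impacted = set(relaying_jobs)
        if local_job:
            impacted.add(local_job)
        return impacted
    raise ValueError(f"unknown section: {fault_section}")

# Example: a node runs job A and relays traffic for jobs B and C.
print(affected_jobs("node", "job-A", ["job-B", "job-C"]))    # {'job-A'}
print(affected_jobs("router", "job-A", ["job-B", "job-C"]))  # {'job-A', 'job-B', 'job-C'}
```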

3. Rack configuration in the K computer

Compute nodes of the K computer are installed in specially developed racks. A compute rack can accommodate 24 system boards (SBs), each of which mounts 4 compute nodes (Figure 5), and 6 I/O system boards (IOSBs), each of which mounts an I/O node for system booting and file-system access. A total of 102 nodes can therefore be mounted in one rack.

Figure 5
System board and I/O system board.

Water cooling is applied to the CPU/ICCs used for the compute nodes and I/O nodes and to on-board power-supply devices. Specifically, chilled water is supplied to each SB and IOSB via water-cooling pipes and hoses mounted on the side of the rack. Water cooling is used to maintain system reliability, enable a high-density implementation, and lower power consumption (decrease leakage current). In contrast, the DIMMs mounted on the SBs and IOSBs are air cooled since they make use of commercially available components.


Since these boards are mounted horizontally inside the rack, air must be blown in a horizontal direction. To enable the racks themselves to be arranged in a high-density configuration, the SBs are mounted diagonally within the rack to form a structure in which air is sucked in from the front, where the boards are angled toward the opening, and passes through the SBs to cool the DIMMs before exiting from the rear (Figure 6).

The six I/O nodes mounted in each rack are classified by application:
• Boot-IO node (BIO): two nodes
• Local-IO node (LIO): three nodes
• Global-IO node (GIO): one node

The BIO nodes connect via a fiber-channel interface up to 8 Gbit/s (8G-FC) to the system-boot disk (ETERNUS DX80) installed in the rack. These nodes start up autonomously at power-on and function as a boot server via Tofu for the remaining 100 nodes in the rack.6) One plays the role of active system, and one plays the role of standby system. If the one playing the role of active system fails, the standby one assumes that role.

The LIO nodes connect to local disks (ETERNUS DX80) mounted on disk racks installed near the compute racks.

The GIO node connects to external global storage systems via an InfiniBand quad data rate (QDR) interface. These storage systems use the Fujitsu Exabyte File System (FEFS)7) to provide a large-scale, high-performance, and high-reliability file system. If the LIO nodes or GIO node in a rack should fail, file-system access can be continued via a path to LIO or GIO nodes mounted in a neighboring rack.
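The failover behavior just described can be summarized in a small sketch. The function names and inputs below are hypothetical, not the actual control software; they only capture the two rules stated above: the standby BIO node takes over boot service when the active one fails, and file-system access falls back to LIO/GIO nodes in a neighboring rack when the local ones fail.

```python
# Hypothetical sketch of I/O-node failover within and across racks.

def select_boot_node(active_bio_ok, standby_bio_ok):
    """BIO nodes form an active/standby pair inside each rack."""
    if active_bio_ok:
        return "local active BIO"
    if standby_bio_ok:
        return "local standby BIO"
    return None  # boot service unavailable in this rack

def select_fs_path(local_io_ok, neighbor_io_ok):
    """LIO/GIO access falls back to the neighboring rack's I/O nodes."""
    if local_io_ok:
        return "local LIO/GIO"
    if neighbor_io_ok:
        return "neighboring-rack LIO/GIO"
    return None

print(select_boot_node(False, True))   # local standby BIO
print(select_fs_path(False, True))     # neighboring-rack LIO/GIO
```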


Figure 6
Compute rack configuration and direction of air flow.


Each rack also includes power supply units (PSUs), air conditioning fans, and service processors (SPs), all in a redundant configuration so that a single failure does not bring down the entire rack.

4. Overall system configuration

The K computer system consists of 864 compute racks in total. As shown in Figure 7, two adjacent compute racks are connected by a Z-axis cable (connecting 17 nodes including I/O nodes). There is one disk rack for every four compute racks, and 45 racks are arranged along the Y-axis (36 compute racks + 9 disk racks) and 24 racks along the X-axis. The K computer therefore has a rectangular floor configuration of 24×45 racks. Each disk rack mounts 12 local disks, and each local disk connects to 2 adjacent LIO nodes in the X-direction.

There is more to the K computer than simply compute racks, disk racks, and a parallel file system: a control mechanism consisting of peripheral server hardware and operation/management software for system and job management is an absolute necessity.8) Maintenance operations based on a remote alarm system and maintenance server are also needed to replace hardware after a failure. An overview of the entire K computer system installed at RIKEN is given in Figure 8.
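The headline counts in Figure 8 follow directly from the per-rack layout described in Sections 3 and 4. The short calculation below reproduces them; the only assumption made here is that each I/O node carries the same 128-GFLOPS processor as a compute node, which is why the system-wide peak exceeds the compute-node-only figure.

```python
# Reconstructing the system-wide counts from the per-rack figures in
# Sections 3 and 4. Assumption: I/O nodes carry the same 128-GFLOPS
# SPARC64 VIIIfx as the compute nodes.

compute_nodes_per_rack = 24 * 4      # 24 SBs x 4 compute nodes
io_nodes_per_rack = 6                # 6 IOSBs x 1 I/O node
compute_racks = 24 * 36              # 24 rows x 36 compute racks per row

compute_nodes = compute_racks * compute_nodes_per_rack   # 82 944
io_nodes = compute_racks * io_nodes_per_rack             # 5 184
total_nodes = compute_nodes + io_nodes                   # 88 128

peak_pflops = total_nodes * 128e9 / 1e15                 # ~11.28 PFLOPS
print(compute_nodes, io_nodes, total_nodes, round(peak_pflops, 2))
```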

Figure 7
Configuration of compute racks (24 rows; 36 compute racks + 9 disk racks + 2 aisles per row; X-, Y-, and Z-axis cables and I/O cables).

5. Power-reduction measures at the system level

To reduce power consumption across the entire K computer, the first step was to implement drastic power-reduction measures at the CPU/ICC development stage. Dynamic current was reduced through clock gating, while operating frequency was lowered to 2.0 GHz and leakage current was significantly decreased by adopting high-Vth transistors.9) It was also decided to employ water cooling since leakage current increases rapidly with the junction temperature. This had the effect of dropping the junction temperature from the usual 85°C to 30°C, resulting in a significant decrease in leakage current.

The use of a clock-gating design to reduce LSI power consumption makes load fluctuation even more noticeable. This makes it necessary to improve transient response characteristics in the power supply. An intermediate bus converter system was adopted here to improve transient response characteristics and feed power efficiently to a large number of racks while enabling a high-density implementation. Specifically, transient response characteristics were improved by arranging non-isolated, point-of-load (POL) power supplies near loads, and power feeding to the entire rack was made more efficient by supplying 48 V of power to each SB from the rack power supply. The mounting of an insulated type of bus converter on each SB makes it possible to step down the supplied 48 V and supply power to the POL power supplies. This approach reduces the number of isolated transformers and achieves a high-efficiency and high-density power-supply system.
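As a rough illustration of the intermediate-bus arrangement described above, the sketch below chains per-stage conversion efficiencies from the 48-V rack feed through the insulated bus converter on each SB to the non-isolated POL regulators near the loads. The stage names mirror the text, but the efficiency values are hypothetical placeholders rather than measured figures for the K computer.

```python
# Hypothetical efficiency chain for the intermediate-bus power architecture:
# rack supply (48 V) -> insulated bus converter on the SB -> non-isolated
# point-of-load (POL) regulators next to the loads. Efficiencies are
# illustrative placeholders only.

stages = {
    "rack supply (AC -> 48 V DC)": 0.92,
    "on-board bus converter (48 V -> intermediate bus)": 0.96,
    "POL regulators (intermediate bus -> chip voltages)": 0.90,
}

overall = 1.0
for name, eff in stages.items():
    overall *= eff
    print(f"{name}: {eff:.0%}")

print(f"overall delivery efficiency: {overall:.1%}")   # product of the stages
```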

Figure 8
Overview of entire system: 432 cabinets, 864 racks, 82 944 compute nodes, 5184 I/O nodes (11.280 PFLOPS), 216 disk racks (11 PB), plus the peripheral servers for control, job management, file-system metadata/object storage, log-in, and maintenance, connected by InfiniBand QDR and Gigabit Ethernet networks.

6. High-reliability/high-availability measures at the system level

Adopting a system-on-a-chip (SOC) configuration means that the amount of circuitry on the chip increases while the number of components outside the chip decreases. Consequently, failures in large-scale chips account for a greater percentage of system failures, which means that reducing the failure rate of large-scale chips is essential for stable operation of the K computer.

6.1 High-reliability measures
In the K computer, reliability is improved by improving the reliability of individual components and adopting an appropriate method for using those components. In this regard, efforts to improve system reliability are performed within the frameworks of job continuity and system availability.

As shown in Table 1, giving major components a redundant configuration helps to ensure operational continuity and therefore job continuity, and replacing faulty components by "hot-swapping" without halting the entire system helps to ensure system availability.

In more detail, measures within the job-continuity framework include the application of redundant hardware configurations, a reduction in the number of components by using a 1-node/1-CPU configuration, error correction by using error correcting code (ECC) and retries, and reduction in the occurrence rate of semiconductor failure by lowering operating temperature through the water cooling of LSIs and POL power supplies. These measures lower the probability of a node or job crash due to a component failure. Another measure to help ensure job continuity is the use of electrical cables in the Tofu interconnect.

6.2 High-availability measures
From the viewpoint of ensuring system availability, a Tofu coordinate arrangement has been implemented within SBs so that the faulty-node-avoidance mechanism based on a 6D mesh/torus can be extended to SB-fault handling and hot maintenance. This enables detour routing despite the fact that each SB mounts only four nodes. Specifically, Figure 9 shows how the four nodes of an SB are confined to the same Y/B coordinates and interconnected along the A-axis and C-axis on the SB so that a connection along the B-axis offers a way out of the SB.

In addition, I/O path redundancy is provided for the three types of I/O nodes (BIO, LIO, GIO) so that the I/O paths of compute nodes under a faulty I/O node can still be used.

Table 1
High reliability by using redundant configurations and hot swapping.

Major components | Redundant configuration | Hot swapping
Rack power supply | Yes | Yes
Cooling fans | Yes | Yes
Service processors (SPs) | Yes (duplicate/switchable) | Yes
System boards (SBs/IOSBs) | No ⇒ Fault avoidance along B-axis | Yes
CPU/ICC | No ⇒ Reliability improved by using water cooling, retries, and error correcting code (ECC) | (SB hot swapping)
POL power supplies | No ⇒ Reliability secured by water cooling | (SB hot swapping)
Intermediate converters | Yes | (SB hot swapping)
DIMMs | No ⇒ Data rescue by using extended ECC | (SB hot swapping)
System disks | Yes ⇒ Controller and power supply: redundant; HDD: RAID5 configuration + hot spare | Yes (module)


This is done by adopting a configuration that incorporates alternate I/O nodes having common connection destinations, which enables a compute-node group to access alternate I/O nodes via the Tofu interconnect (Figure 10). There are two BIO nodes per compute rack in a redundant active/standby configuration, and there are three LIO nodes per compute rack that take on a redundant configuration with LIO nodes in another compute rack that share local disks mounted in a disk rack. Similarly, the single GIO node in a compute rack takes on a redundant configuration with the GIO node of the other compute rack.

Figure 9
Conceptual diagram of Tofu detour.
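The B-axis escape route described in Section 6.2 can be sketched in a few lines. The coordinates, axis size, and routing function below are hypothetical simplifications (only a (y, b) slice of the 6D address is modeled); the sketch merely shows how stepping onto the adjacent B coordinate lets a route bypass a faulty node, or a whole system board withdrawn for maintenance, and then rejoin its original path.

```python
# Hypothetical sketch of the B-axis detour shown in Figure 9. Coordinates are
# reduced to (y, b) pairs and the axis size is invented; the point is only that
# a single hop along the B axis offers a way around a faulty node even though
# each SB holds just four nodes.

B_SIZE = 3  # hypothetical length of the B ring

def detour(path, faulty):
    """Route around one faulty intermediate waypoint via the adjacent B ring.

    Every step changes a single coordinate by one, mirroring link-by-link hops.
    """
    if faulty not in path:
        return path
    i = path.index(faulty)
    before, after = path[:i], path[i + 1:]
    b_side = (faulty[1] + 1) % B_SIZE                       # neighbouring B coordinate
    step_off = (before[-1][0], b_side)                      # leave the B ring before the fault
    alongside = [(y, b_side) for y, _ in [faulty] + after]  # run parallel to the blocked stretch
    return before + [step_off] + alongside + [after[-1]]    # rejoin the original B ring

route = [(0, 0), (1, 0), (2, 0)]
print(detour(route, faulty=(1, 0)))  # [(0, 0), (0, 1), (1, 1), (2, 1), (2, 0)]
```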

7. Benchmark results

Though still in its configuration stage, the K computer developed according to the targets described at the beginning of this paper achieved a performance of 8.162 PFLOPS on the HPC LINPACK benchmark test using only part of the system (672 racks).

Figure 10
I/O path redundancy.
(FC: Fibre Channel; N/C: not connected; GbE: Gigabit Ethernet)


This achievement won the K computer the No. 1 spot on the TOP500 list announced in June 2011.10) Despite the fact that this LINPACK benchmark test ran for more than 28 hours, all of the 68 544 nodes involved operated continuously without a failure, demonstrating high system reliability and high job continuity. Details on these benchmark results and those announced in November 2011 are given in Table 2.

Table 2
TOP500/Green500 results.

Item | June 2011 | November 2011 | Unit
TOP500 ranking | 1 | 1 | Rank
No. of cores | 548 352 | 705 024 | Units
Performance (Rmax) | 8162.00 | 10 510.00 | TFLOPS
Power | 9898.56 | 12 659.9 | kW
Green500 ranking | 6 | 32 | Rank
Power performance | 824.56 | 830.18 | MFLOPS/W

The K computer also participated in the HPC Challenge benchmark test that assesses the overall performance of a supercomputer. In this test, it received a No. 1 ranking in all four categories {Global HPL, Global RandomAccess, EP STREAM (Triad) per system, and Global FFT} of the HPC Challenge Award (Class 1) announced in November 2011. Additionally, the K computer was awarded the Gordon Bell Peak Performance prize announced in November 2011 in recognition of its achievements with actual applications. These results show that the K computer, far from being a machine specialized for the LINPACK benchmark, possesses general-purpose properties that can support a wide range of applications.

8. Conclusion

The development objectives established for the K computer called for a large-scale high-performance computing system with importance placed on high reliability and high availability from the design stage. RIKEN and Fujitsu have been working to achieve the target peak performance and effective application performance and to construct a high-reliability and high-availability system. The K computer was ranked No. 1 on the TOP500 list of June 2011 and retained that ranking on the list of November 2011, when it achieved more than 10 PFLOPS of LINPACK performance. These achievements demonstrate that the objectives are being steadily met as planned. To commence full service once the performance-tuning phase comes to an end, we will be evaluating the performance of actual applications and actual system operation.

References
1) M. Yokokawa et al.: The K computer: Japanese next-generation supercomputer development project. ISLPED, pp. 371–372 (2011).
2) T. Yoshida et al.: SPARC64 VIIIfx: CPU for the K computer. Fujitsu Sci. Tech. J., Vol. 48, No. 3, pp. 274–279 (2012).
3) Y. Ajima et al.: Tofu: Interconnect for the K computer. Fujitsu Sci. Tech. J., Vol. 48, No. 3, pp. 280–285 (2012).
4) T. Maruyama et al.: SPARC64 VIIIfx: A New-Generation Octocore Processor for Petascale Computing. IEEE Micro, Vol. 30, Issue 2, pp. 30–40 (2010).
5) Y. Ajima et al.: Tofu: A 6D Mesh/Torus Interconnect for Exascale Computers. Computer, Vol. 42, No. 11, pp. 36–40 (2009).
6) J. Moroo et al.: Operating System for the K computer. Fujitsu Sci. Tech. J., Vol. 48, No. 3, pp. 295–301 (2012).
7) K. Sakai et al.: High-Performance and Highly Reliable File System for the K computer. Fujitsu Sci. Tech. J., Vol. 48, No. 3, pp. 302–309 (2012).
8) K. Hirai et al.: Operations Management Software for the K computer. Fujitsu Sci. Tech. J., Vol. 48, No. 3, pp. 310–316 (2012).
9) H. Okano et al.: Fine Grained Power Analysis and Low-Power Techniques of a 128GFLOPS/58W SPARC64 VIIIfx Processor for Peta-scale Computing. Proc. VLSI Circuits Symposium, pp. 167–168 (2010).
10) TOP500 Supercomputing Sites. http://www.top500.org


Hiroyuki Miyazaki, Fujitsu Ltd.
Mr. Miyazaki is engaged in the drafting of specifications and the development of LSI circuits for next-generation HPC systems.

Fumiyoshi Shoji, RIKEN
Dr. Shoji is engaged in system development.

Yoshihiro Kusano, Fujitsu Ltd.
Mr. Kusano is engaged in development work for next-generation HPC systems.

Mitsuo Yokokawa, RIKEN
Dr. Yokokawa is engaged in overall control of system development.

Naoki Shinjou, Fujitsu Ltd.
Mr. Shinjou is engaged in overall development work for next-generation HPC systems.

Tadashi Watanabe, RIKEN
Dr. Watanabe is engaged in overall control of the development project.
