HPC in China

Xiaomei Zhang, IHEP, 2019-03-05

 Six National Supercomputing Centers (SC) in China, located in six provinces

 Tianjin, Changsha, Jinan, Guangzhou, Shenzhen, Wuxi

 Cities and universities also have their own SCs, but on a smaller scale

 CAS (Chinese Academy of Sciences) has its own supercomputing center, located in Beijing

 Managed by Computer Network Information Center (CNIC) of CAS

2 SuperComputing Center of CAS

 Earliest public HPC service provider in China

3 New Petascale - Era

 ERA - 元(Yuan)

 CAS HPC moves from terascale (T) to petascale (P) - a new period

 Peak performance - 2.36 Petaflops

 The 6th generation supercomputer in SCCAS

 Installation

 Site: Huairou Campus of CNIC

 Completed in three stages

 Stage 1: announced on June 19

 Stage 2: came online in December 2015

 Stage 3: announced in 2018 – AI Platform

 Will integrate a computing capability of 20,000+ GPUs

 It will be the biggest AI computing and data platform in China

 Computing Resources: CNGrid & China Science and Technology Cloud

[Platform stack diagram] SaaS: new applications - Weather Forecast, Drug Design, Smart Medicine, Smart Manufacture, and others; PaaS: Data Processing, Caffe Engine; IaaS: China Science and Technology Cloud hardware and software

 Resource infrastructure

 CPU 700 Tflops; GPU and MIC 1.6 Pflops; total system memory 140 TB; storage 6 PB with 56Gb/100Gb InfiniBand

 CPU

 Intel Xeon E5-2680 v2/v3 processors (2.5 GHz, 12 cores, 256 GB memory)

 Each node, with two processors, has a peak capacity of 0.96 Tflops
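The 0.96 Tflops figure is consistent with the slide's clock and core counts if we assume 16 double-precision flops per cycle per core (AVX2 with two fused multiply-add units, as on Xeon E5 v3; this flops-per-cycle figure is an assumption, not stated in the slides):

```python
# Peak double-precision performance of one ERA CPU node.
# Clock, core, and socket counts are from the slide; the
# flops-per-cycle value assumes AVX2 FMA (Xeon E5 v3 era).
GHZ = 2.5             # clock speed
CORES = 12            # cores per processor
SOCKETS = 2           # processors per node
FLOPS_PER_CYCLE = 16  # assumed: 2 FMA units x 4-wide DP AVX2

peak_tflops = GHZ * CORES * SOCKETS * FLOPS_PER_CYCLE / 1000
print(peak_tflops)  # 0.96
```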

 MIC

 Intel Xeon Phi (stage I), now?

 GPU

 Nvidia Tesla K20 (stage I), now?

 OS

 Linux (CentOS)

 Software

 Compilers, Math Libs, OpenMP, MPI

SuperComputing Environment over CNGrid

• With CNGrid, integrates 200 PB storage and 260+ PF computing capability

• SCCAS as head center

• 9 regional centers, including Era, TaihuLight and Tianhe-II

• 18 institution centers, 11 GPU centers

• CNIC is the operation and management center

7 SCE - Middleware for CNGrid

 SCE
 – Scientific computing
 – Lightweight
 – Stable

 Diversity
 – CLI
 – Portal
 – GUI
 – API

 Developed by the HPC Dept. of SCCAS

Support for ATLAS Monte Carlo Simulation

 In 2015, CNGrid began working with ATLAS teams

 SCEAPI works as a bridge between ARC-CE middleware and CNGrid resources

 ATLAS simulation jobs run on Chinese HPCs including TianHe-1A and ERA

SCEAPI

*In collaboration with the Institute of High Energy Physics, CAS and CERN

The top 2 SC centers in China

TaihuLight

 Developed by China’s National Research Center of Parallel Computer Engineering & Technology (NRCPC)

 At the National Supercomputing Center in Wuxi, Jiangsu province

 93 Pflop/s

 Tianhe-2A (Milky Way-2A)

 Developed by China’s National University of Defense Technology (NUDT)

 At the National Supercomputer Center in Guangzhou

 61.4 Pflop/s

10 Wuxi National Supercomputing Center

 Owns the Sunway TaihuLight system

 One thing to note

 Its CPU, the Sunway SW26010 (260 cores), was designed by the Shanghai High-Performance Integrated Circuit Design Center

 Not a general x86 CPU architecture

 Much effort is needed to migrate HEP software to run on it with good performance

11 Guangzhou National SC Center

 Owns the Tianhe-2A system

 16,000 nodes

 Each node: 2*12-core Intel Xeon E5-2692 v2 + 2 Matrix-2000

 Memory per node: 64 GB

 Interconnect: proprietary internal high-speed interconnect TH Express-2, 14 Gbps × 8 lanes

 Network interface: 2 Gigabit Ethernet interfaces

 Storage: 19PB, 1TB/s

 Resources can be provided through cloud platform

 Support Big Data & Deep Learning

12 Tianjin National SC Center

 ~10PFlops

 Tianhe-1 was deployed

 7168 nodes, 4.7PFlops

 Intel Xeon X5670, 2.93 GHz, 12 cores/node, 24/48 GB memory

 New system

 Intel Xeon E5-2690 v4, 2.6 GHz, 28 cores/node, 128 GB memory

 Tianhe-3 is being deployed

 No direct access from public network

 Access through VPN

 Worker nodes (WN) have no access to the outside; even the login node does not

13 Use and Charge

 None of the SC centers are free, even for science and education

 Normal prices are published (for example, for Tianhe-II)

 Normal nodes: 0.1 RMB/core·hour (2×12-core Intel Xeon E5-2692 v2, 64 GB memory)

 Storage: 2000 yuan/TB/year

 Special prices for large-scale scientific usage? Needs further confirmation
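At the published rates, the cost of a campaign can be estimated directly. The job size below is a made-up example, not a figure from the slides:

```python
# Rough cost estimate at the published Tianhe-II rates:
# 0.1 RMB per core-hour, 2000 RMB per TB per year of storage.
CORE_HOUR_RMB = 0.1
STORAGE_TB_YEAR_RMB = 2000

nodes = 100          # hypothetical allocation
cores_per_node = 24  # 2 x 12-core Xeon E5-2692 v2
hours = 720          # roughly one month of wall time
storage_tb = 50      # hypothetical storage need, one year

compute_cost = nodes * cores_per_node * hours * CORE_HOUR_RMB
storage_cost = storage_tb * STORAGE_TB_YEAR_RMB
print(compute_cost, storage_cost)  # 172800.0 100000
```

Even this modest hypothetical run costs on the order of 170k RMB per month, which is why the negotiated pricing discussed below matters for large science projects.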

 In the SCCAS case,

 “In case of large consumption of computing resources, one needs to seek financial support from project cooperation”

 “For special use of HPC resources, the charge policy can be negotiated case by case”