
LEADERSHIP COMPUTING IN THE AGE OF CLOUD
Dan Stanzione
Executive Director, Texas Advanced Computing Center
Associate Vice President for Research, The University of Texas at Austin
Amazon Seminar, November 2019

THE BASIC OUTLINE
• What the heck is TACC?
• What is Frontera (and why do you care)?
• What sort of stuff needs to run on Frontera?
• What sort of stuff would be better off on AWS?
• What do future science workflows look like, and why will they run where they run?

WHAT IS TACC?
The Texas Advanced Computing Center, at UT Austin, is a (primarily) NSF-funded center that provides and applies large-scale computing resources to the open science community.
[Photos: Grendel, 1993; Frontera, 2019]

TACC AT A GLANCE - 2019
Personnel: 185 staff (~70 PhD)
Facilities: 12 MW data center capacity; two office buildings, three data centers, two visualization facilities, and a chilling plant.
Systems and Services: >7 billion compute hours per year; >5 billion files, >100 petabytes of data; NSF Frontera (Track 1), Stampede2 (XSEDE flagship), Jetstream (cloud), Chameleon (cloud testbed) systems.
Usage: >15,000 direct users in >4,000 projects; >50,000 web/portal users; user demand 8x available system time; thousands of training/outreach participants annually.

MODERN COMPUTATIONAL SCIENCE
• Simulation: computationally query our *mathematical models* of the world.
• Machine Learning/AI (depending on technique, also called deep learning): computationally query our *data sets*.
• Analytics: computationally analyze our *experiments* (driven by instruments that produce lots of digital information).
I would argue that modern science and engineering combine all three.

TACC LAUNCHED IN JUNE 2001 AFTER EXTERNAL REVIEW
In 2001: budget of $600k, staff of 12 (some shared), and a 50 GF computing resource (1/200,000th of the current system).

RAPID GROWTH FROM THEN TO NOW…
2003 – First terascale Linux cluster for open science (#26)
2004 – NSF funding to join the TeraGrid
2006 – UT System partnership to provide Lonestar-3 (#12)
2007 – $59M NSF award – largest in UT history – to deploy Ranger, the world's largest open system (#4)
2008 – Funding for new vis software and launch of revamped visualization lab
2009 – $50M iPlant Collaborative award (largest NSF bioinformatics award) moves a major component to TACC; life sciences group launched
In 2009, we reached 65 employees.

NOW, A WORLD LEADER IN CYBERINFRASTRUCTURE
2010 – TACC becomes a core partner (1 of 4) in XSEDE, the TeraGrid replacement
2012 – Stampede replaces Ranger with a new $51.5M NSF award
2013 – iPlant is renewed, expanded to $100M
2015 – Wrangler, the first data-intensive supercomputer, is deployed
2015 – Chameleon cloud is launched
2015 – DesignSafe, the cyberinfrastructure for natural hazards engineering, is launched
2016 – Stampede2 awarded, the largest academic system in the United States, 2017-2021
2019 – Frontera

HPC DOESN'T LOOK LIKE IT USED TO.
• HPC-Enabled Jupyter Notebooks: narrative analytics and exploration environment
• Web Portal: data management and accessible batch computing
• Event-Driven Data Processing: extensible end-to-end framework to integrate planning, experimentation, validation, and analytics
From batch processing and single simulations of many MPI tasks, to that plus new modes of computing, automated workflows, users who avoid the command line, reproducibility and data reuse, collaboration, and end-to-end data management.
• Simulation where we have models
• Machine learning where we have data or incomplete models
And most things are a blend of most of these.

SUPPORTING AN EVOLVING CYBERINFRASTRUCTURE
Success in computational/data-intensive science and engineering takes more than systems. Modern cyberinfrastructure requires many modes of computing, many skillsets, and many parts of the scientific workflow: data lifecycle, reproducibility, sharing and collaboration, event-driven processing, APIs, etc.
Our team and software investments are larger than our system investments.
• Advanced Interfaces – web front ends, REST APIs, Vis/VR/AR
• Algorithms – partnerships with ICES @ UT to shape future systems, applications, and libraries

FRONTERA SYSTEM --- PROJECT
A new, NSF-supported project to do three things:
• Deploy a system in 2019 for the largest problems scientists and engineers currently face.
• Support and operate this system for 5 years.
• Plan a potential phase 2 system, with 10x the capabilities, for the future challenges scientists will face.
Frontera is the #5 ranked system in the world – and the fastest at any university in the world. It is the highest-ranked Dell system ever and the fastest primarily Intel-based system. Frontera and Stampede2 are #1 and #2 among US universities (and Lonestar5 is still in the Top 10). On the current Top 500 list, TACC provides 77% of *all* performance available to US universities.

FRONTERA IS A GREAT MACHINE – AND MORE THAN A MACHINE

A LITTLE ON HARDWARE AND INFRASTRUCTURE
"Main" compute partition: 8,008 nodes
Node: dual-socket, 192GB, HDR-100 IB interface, local drive.
Processor: Intel 8280 "Cascade Lake", Intel 2nd-generation scalable Xeon; 28 cores; 2.7GHz clock "rate" (sometimes); 6 DIMM channels, 2933MHz DIMMs.
Core count +15%, clock rate +30%, memory bandwidth +15% vs. Skylake.
Why? They are universal, and not experimental.

FRONTERA SYSTEM --- INFRASTRUCTURE
Frontera consumes almost 6 megawatts of power at peak (measured HPL power): 59+ kW/rack, 5,400 kW from compute nodes.
Direct water cooling of primary compute racks (CoolIT/DellEMC); oil immersion cooling (GRC); solar and wind inputs.
[Photos: TACC machine room; chilled water plant]

INTERCONNECT
Mellanox HDR, fat-tree topology.
8,008 nodes = 88 nodes/rack x 91 compute racks.
Mellanox ASICs have 40 HDR ports; chassis switches have 800 ports.
Each rack is divided in half, with its own TOR switch:
• 44 compute nodes at HDR-100 = 22 HDR ports
• 18 uplink 200Gb HDR ports; 3 links (600Gb) to each of 6 core switches
No oversubscription in the higher layers of the tree (11:9 within the rack). No oversubscription to storage, DTNs, or service nodes (all connected to all 6 core switches).
8,200+ cards, 182 TOR switches, 6 core switches, 50 miles of cable.
Good news: 8,008 compute nodes use only 3,276 fibers to connect to the core (the sketch below works through these numbers).
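Since the fabric figures above all follow from a handful of constants, here is a minimal back-of-the-envelope sketch, in plain C, that reproduces them: the TOR port budget, the 11:9 in-rack oversubscription, and the 3,276 uplink fibers. The constants are taken from the slide; the variable names are illustrative, not TACC tooling.

    /* Back-of-the-envelope check of the Frontera fat-tree numbers above.
     * Illustrative sketch only; constants are quoted from the slide. */
    #include <stdio.h>

    int main(void) {
        const int racks            = 91;  /* compute racks                         */
        const int nodes_per_rack   = 88;  /* 88 * 91 = 8,008 nodes                 */
        const int tors_per_rack    = 2;   /* each rack split in half, one TOR each */
        const int nodes_per_tor    = 44;  /* half a rack                           */
        const int hdr100_per_port  = 2;   /* two HDR-100 node links per HDR port   */
        const int uplinks_per_tor  = 18;  /* 3 x 200Gb links to each of 6 cores    */
        const int asic_ports       = 40;  /* Mellanox HDR ASIC port count          */

        int nodes        = racks * nodes_per_rack;           /* 8,008               */
        int tor_switches = racks * tors_per_rack;             /* 182                 */
        int down_ports   = nodes_per_tor / hdr100_per_port;  /* 22 HDR ports down   */
        int used_ports   = down_ports + uplinks_per_tor;     /* 40: fills the ASIC  */
        int core_fibers  = tor_switches * uplinks_per_tor;   /* 3,276 uplink fibers */

        printf("nodes                    : %d\n", nodes);
        printf("TOR switches             : %d\n", tor_switches);
        printf("TOR ports used           : %d of %d\n", used_ports, asic_ports);
        printf("in-rack oversubscription : %d:%d (= 11:9)\n", down_ports, uplinks_per_tor);
        printf("fibers to core           : %d\n", core_fibers);
        return 0;
    }

Running it just prints the counts quoted on the slide; the point is that 44 HDR-100 node links pair into 22 HDR ports, which together with 18 uplinks exactly fills a 40-port HDR switch ASIC, and 182 TORs x 18 uplinks gives the 3,276 fibers to the core.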
FILESYSTEMS
Lustre, POSIX, and that's it.
Disk: 50PB. Flash: 3PB.
We have come to believe that most users' codes accessing the filesystem look like this:

    while (1) {
        fork();
        fopen();
        fclose();   // optional
    }

    mpirun -np 80000 kill_the_filesystem

FILESYSTEMS
We no longer need to scale filesystem size to scale bandwidth. The size of the filesystem is mostly to support concurrent users; bandwidth is the limit for an individual user (or IOPS for pathological ones).
So we aren't going to build one big filesystem any more:
• /home1, /home2, /home3
• /scratch1, /scratch2, /scratch3 (initial assignment round-robin)
• Flash will be a separate filesystem with some clever name, like /flash. This will require you to request access, or to be identified by our analytics as maxing out a filesystem.
Roughly 100GB/s to each scratch, 1.2TB/s to /flash.
The code on the previous slide can trash, at most, 1/7th of the available filesystems. (Seriously, we have put in some tools to limit those; we may ask you to use a library we have that wraps open() and limits the number of calls per second. A sketch of that general idea follows.)
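The wrapper library mentioned above is not shown in the deck, so the following is only a hedged sketch of the general technique under my own assumptions: an LD_PRELOAD shim (the file name libopen_throttle.so and the MAX_OPENS_PER_SEC budget are placeholders I invented) that interposes open() and sleeps whenever a process opens files faster than an allowed rate. It is not the actual TACC library.

    /* Illustrative LD_PRELOAD sketch of an open() rate limiter.
     * NOT the TACC wrapper referenced above; a minimal example of the idea.
     * Build:  gcc -shared -fPIC -o libopen_throttle.so open_throttle.c -ldl
     * Use:    LD_PRELOAD=$PWD/libopen_throttle.so ./my_app   (my_app is a placeholder)
     * Caveat: libc-internal calls (e.g., from fopen) may bypass this shim. */
    #define _GNU_SOURCE
    #include <dlfcn.h>
    #include <fcntl.h>
    #include <stdarg.h>
    #include <sys/types.h>
    #include <time.h>
    #include <unistd.h>

    #define MAX_OPENS_PER_SEC 100.0          /* placeholder per-process budget */

    static int (*real_open)(const char *, int, ...);
    static double last_call;                 /* time of the previous open()     */

    static double now_sec(void)
    {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return (double)ts.tv_sec + ts.tv_nsec / 1e9;
    }

    int open(const char *path, int flags, ...)
    {
        if (!real_open)
            real_open = (int (*)(const char *, int, ...))dlsym(RTLD_NEXT, "open");

        /* Throttle: if calls arrive faster than the budget, sleep the gap.
         * Not thread-safe; a real shim would guard this state. */
        double min_gap = 1.0 / MAX_OPENS_PER_SEC;
        double elapsed = now_sec() - last_call;
        if (elapsed < min_gap)
            usleep((useconds_t)((min_gap - elapsed) * 1e6));
        last_call = now_sec();

        mode_t mode = 0;
        if (flags & O_CREAT) {               /* mode argument only with O_CREAT */
            va_list ap;
            va_start(ap, flags);
            mode = (mode_t)va_arg(ap, int);  /* default promotion: read as int  */
            va_end(ap);
        }
        return real_open(path, flags, mode);
    }

A real deployment would also need to cover openat() and open64(), be thread-safe, and coordinate the budget across MPI ranks; this sketch deliberately skips all of that.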
WHY DO WE HAVE COMMERCIAL CLOUD PARTNERSHIPS ON FRONTERA?
AWS is part of the Frontera project! Cloud/HPC is not, in my opinion, an either/or question. It's OK to have more than one tool in the toolbox.
We want to utilize the strengths of the commercial cloud, hence we are partnering in three areas:
• Long-term data publication
• Access to unique and ever-changing hardware (you deploy faster than we do!)
• Hybrid workflows stitched together via web services (more on this later)

WHAT KINDS OF THINGS REALLY NEED TO RUN ON FRONTERA?

CENTER FOR THE PHYSICS OF LIVING CELLS
ALEKSEI AKSIMENTIEV, UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN
• The nuclear pore complex serves as a gatekeeper, regulating the transport of biomolecules in and out of the nucleus of a biological cell.
• To uncover the mechanism of such selective transport, the Aksimentiev lab at UIUC constructed a computational model of the complex.
• The team simulated the model using memory-optimized NAMD 2.13, 8 tasks/node, MPI+SMP.
• Ran on up to 7,780 nodes on Frontera.
• One of the largest biomolecular simulations ever performed.
• Scaled close to linearly on up to half of the machine.
• Plan to build a new system twice as large to take advantage of large runs.

FRONTIERS OF COARSE-GRAINING
GREGORY VOTH, UNIVERSITY OF CHICAGO
• Mature HIV-1 capsid proteins self-assemble into large fullerene-cone structures.
• These capsids enclose the infective genetic material of the virus and transport viral DNA from virion particles into the nucleus of newly infected cells.
• On Frontera, Voth's team simulated a viral capsid containing RNA and stabilizing cellular factors in full atomic detail for over 500 ns.
• First molecular simulations of HIV capsids that contain biological components of the virus within the capsid.
• The team ran on 4,000 nodes on Frontera.
• Measured the response of the capsid to molecular components, such as genetic cargo and cellular factors, that affect the stability of the capsid.

"State-of-the-art supercomputing resources like Frontera are an invaluable resource for researchers. Molecular processes that determine the chemistry of life are often interconnected and difficult to probe in isolation. Frontera enables large-scale simulations that examine these processes, and this type of science simply cannot be performed on smaller supercomputing resources."
- Alvin Yu, Postdoctoral Scholar in Voth Group

LATTICE GAUGE THEORY AT THE INTENSITY FRONTIER
CARLTON DETAR, UNIVERSITY OF UTAH
• Ab initio numerical simulations of quantum chromodynamics (QCD) help obtain precise predictions for the strong-interaction environment of the decays of mesons that contain a heavy bottom quark.