The Tianhe-2 Supercomputer: Less Than Meets the Eye?
Recommended publications
Genomics As a Service: a Joint Computing and Networking Perspective
G. Reali (1), M. Femminella (1,4), E. Nunzi (2), and D. Valocchi (3)
(1) Dept. of Engineering, University of Perugia, Perugia, Italy; (2) Dept. of Experimental Medicine, University of Perugia, Perugia, Italy; (3) Dept. of Electrical and Electronic Engineering, UCL, London, United Kingdom; (4) Consorzio Interuniversitario per le Telecomunicazioni (CNIT), Italy

Abstract—This paper shows a global picture of the deployment of networked processing services for genomic data sets. Many current research and medical activities make an extensive use of genomic data, which are massive and rapidly increasing over time. They are typically stored in remote databases, accessible by using Internet connections. For this reason, the quality of the available network services could be a significant issue for effectively handling genomic data through networks. A first contribution of this paper consists in identifying the still unexploited features of genomic data that could allow optimizing their networked management. The second and main contribution is a methodological classification of computing and networking alternatives, which can be used to deploy what we call the Genomics-as-a-Service (GaaS) paradigm.

…intense research. At that time, although the importance of results was clear, the possibility of handling the human genome as a commodity was far from imagination due to the costs and complexity of sequencing and analyzing complex genomes. Today the situation is different. The progress of DNA (deoxyribonucleic acid) sequencing technologies has reduced the cost of sequencing a human genome down to the order of 1000 € [2]. Since the decrease of these costs is faster than Moore's law [4], two main consequences are expected. First, it is easy to predict that in a few years many applicative and societal fields, including academia, business, and public health…

ONAP Architecture Overview
Open Network Automation Platform (ONAP) Architecture White Paper

1 Introduction

The ONAP project addresses the rising need for a common automation platform for telecommunication, cable, and cloud service providers—and their solution providers—to deliver differentiated network services on demand, profitably and competitively, while leveraging existing investments. The challenge that ONAP meets is to help operators of telecommunication networks keep up with the scale and cost of the manual changes required to implement new service offerings, from installing new data center equipment to, in some cases, upgrading on-premises customer equipment. Many are seeking to exploit SDN and NFV to improve service velocity, simplify equipment interoperability and integration, and reduce overall CapEx and OpEx costs. In addition, the current, highly fragmented management landscape makes it difficult to monitor and guarantee service-level agreements (SLAs). These challenges are still very real now, as ONAP creates its fourth release.

ONAP is addressing these challenges by developing global and massive-scale (multi-site and multi-VIM) automation capabilities for both physical and virtual network elements. It facilitates service agility by supporting data models for rapid service and resource deployment, by providing a common set of northbound REST APIs that are open and interoperable, and by supporting model-driven interfaces to the networks. ONAP's modular and layered nature improves interoperability and simplifies integration, allowing it to support multiple VNF environments by integrating with multiple VIMs, VNFMs, SDN controllers, as well as legacy equipment (PNF). ONAP's consolidated xNF requirements publication enables commercial development of ONAP-compliant xNFs. This approach allows network and cloud operators to optimize their physical and virtual infrastructure for cost and performance; at the same time, ONAP's use of standard models reduces integration and deployment costs of heterogeneous equipment.

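As a purely illustrative sketch of what "a common set of northbound REST APIs" means in practice for a client application, the snippet below posts a model-driven service request to a platform endpoint. The host, resource path, and payload fields are hypothetical placeholders, not actual ONAP API definitions.

```python
import requests

# Hypothetical northbound request: ask the automation platform to instantiate a
# service from a model, rather than configuring individual devices directly.
# ONAP_NBI, the resource path, and the payload keys are illustrative only.
ONAP_NBI = "https://onap.example.org/nbi/api/v1"

payload = {
    "serviceModel": "vFirewall",          # model-driven: reference a service model by name
    "customer": "demo-customer",
    "parameters": {"site": "dc-east", "bandwidthMbps": 500},
}

resp = requests.post(f"{ONAP_NBI}/service-instances", json=payload, timeout=30)
resp.raise_for_status()
print("Service instance id:", resp.json().get("id"))
```

The point of the sketch is the shape of the interaction: the client names a model and a few parameters, and the platform is responsible for translating that into device-level changes.
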
Lessons Learned in Deploying the World’s Largest Scale Lustre File System
Galen M. Shipman, David A. Dillow, Sarp Oral, Feiyi Wang, Douglas Fuller, Jason Hill, Zhe Zhang
Oak Ridge Leadership Computing Facility, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA

Abstract: The Spider system at the Oak Ridge National Laboratory's Leadership Computing Facility (OLCF) is the world's largest scale Lustre parallel file system. Envisioned as a shared parallel file system capable of delivering both the bandwidth and capacity requirements of the OLCF's diverse computational environment, the project had a number of ambitious goals. To support the workloads of the OLCF's diverse computational platforms, the aggregate performance and storage capacity of Spider exceed that of our previously deployed systems by a factor of 6x (240 GB/sec) and 17x (10 Petabytes), respectively. Furthermore, Spider supports over 26,000 clients concurrently accessing the file system, which exceeds our previously deployed systems by nearly 4x.

1 Introduction: The Oak Ridge Leadership Computing Facility (OLCF) at Oak Ridge National Laboratory (ORNL) hosts the world's most powerful supercomputer, Jaguar [2, 14, 7], a 2.332 Petaflop/s Cray XT5 [5]. OLCF also hosts an array of other computational resources such as a 263 Teraflop/s Cray XT4 [1], visualization, and application development platforms. Each of these systems requires a reliable, high-performance and scalable file system for data storage. Parallel file systems on leadership-class systems have traditionally been tightly coupled to single simulation platforms. This approach had resulted in the deployment of a dedicated file system for each computational…

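A quick back-calculation from the scaling factors quoted in the abstract gives the approximate figures for the previously deployed file systems:

$$
\frac{240\ \text{GB/s}}{6} = 40\ \text{GB/s}, \qquad
\frac{10\ \text{PB}}{17} \approx 0.59\ \text{PB}, \qquad
\frac{26{,}000\ \text{clients}}{4} \approx 6{,}500\ \text{clients}.
$$
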
ISTORE: Introspective Storage for Data-Intensive Network Services
Aaron Brown, David Oppenheimer, Kimberly Keeton, Randi Thomas, John Kubiatowicz, and David A. Patterson
Computer Science Division, University of California at Berkeley, 387 Soda Hall #1776, Berkeley, CA 94720-1776
{abrown, davidopp, kkeeton, randit, kubitron, patterson}@cs.berkeley.edu

Abstract: Today's fast-growing data-intensive network services place heavy demands on the backend servers that support them. This paper introduces ISTORE, a novel server architecture that couples LEGO-like plug-and-play hardware with a generic framework for constructing adaptive software that leverages continuous self-monitoring. ISTORE exploits introspection to provide high availability, performance, and scalability while drastically reducing the cost and complexity of administration. An ISTORE-based server monitors and adapts to changes in the imposed workload and to unexpected system events such as hardware failure. This adaptability is enabled by a combination of intelligent self-monitoring hardware components and an extensible…

…ware [8][9][10][19]. Such maintenance costs are inconsistent with the entrepreneurial nature of the Internet, in which widely utilized, large-scale network services may be operated by single individuals or small companies. In this paper we turn our attention to the needs of the providers of new data-intensive services, and in particular to the need for servers that overcome the scalability and manageability problems of existing solutions. We argue that continuous monitoring and adaptation, or introspection, is a crucial component of such servers. Introspective systems are naturally self-administering, as they can monitor for exceptional situations or changes in their imposed workload and immediately react to maintain availability and performance goals.

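To make the notion of introspection concrete, here is a minimal, self-contained sketch of a monitor-and-react loop. It is not code from the ISTORE project; the metric source, threshold, and adaptation action (read_request_latency_ms, LATENCY_GOAL_MS, add_replica) are invented placeholders for whatever sensors and actuators a real introspective server would expose.

```python
import random
import time

# Conceptual sketch of an introspective control loop: continuously observe a
# metric, compare it against a performance goal, and react automatically.
LATENCY_GOAL_MS = 50.0

def read_request_latency_ms() -> float:
    """Stand-in for a real self-monitoring hook (e.g., per-disk or per-node sensors)."""
    return random.gauss(40.0, 15.0)

def add_replica() -> None:
    """Stand-in adaptation action, e.g., spreading load across more components."""
    print("adapting: provisioning an additional replica")

def introspective_loop(iterations: int = 10) -> None:
    for _ in range(iterations):
        latency = read_request_latency_ms()
        if latency > LATENCY_GOAL_MS:   # exceptional situation detected
            add_replica()               # react immediately to maintain goals
        time.sleep(0.1)

if __name__ == "__main__":
    introspective_loop()
```
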
Titan: a New Leadership Computer for Science
Presented to: DOE Advanced Scientific Computing Advisory Committee, November 1, 2011. Arthur S. Bland, OLCF Project Director, Office of Science.

Statement of Mission Need
- Increase the computational resources of the Leadership Computing Facilities by 20-40 petaflops.
- The INCITE program is oversubscribed (2.5 to 3.5 times over 2007-2012).
- Programmatic requirements for leadership computing continue to grow.
- Needed to avoid an unacceptable gap between the needs of the science programs and the available resources.
- Approved by Raymond Orbach, January 9, 2009; the OLCF-3 project comes out of this requirement.

What is OLCF-3
- The next phase of the Leadership Computing Facility program at ORNL.
- An upgrade of Jaguar from 2.3 Petaflops (peak) today to between 10 and 20 PF by the end of 2012, with operations in 2013.
- Built with Cray's newest XK6 compute blades.
- When completed, the new system will be called Titan.

Cray XK6 Compute Node
- AMD Opteron 6200 “Interlagos” 16-core processor @ 2.2 GHz.
- Tesla M2090 “Fermi” @ 665 GF with 6 GB GDDR5 memory.
- Host memory: 32 GB 1600 MHz DDR3.
- Gemini high-speed interconnect.
- Upgradeable to NVIDIA's next-generation “Kepler” processor in 2012.
- Four compute nodes per XK6 blade; 24 blades per rack.

ORNL's “Titan” System
- Upgrade of the existing Jaguar Cray XT5.
- Cray Linux Environment…

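Taking the packaging figures on these slides at face value, the accelerator side alone contributes roughly the following peak performance per cabinet (GPU contribution only, ignoring the Opteron hosts):

$$
24\ \text{blades/rack} \times 4\ \text{nodes/blade} \times 665\ \text{GF/GPU} \approx 63.8\ \text{TF per rack}.
$$
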
Musings (Rik Farrow, Opinion)
Musings RIK FARROWOPINION Rik is the editor of ;login:. While preparing this issue of ;login:, I found myself falling down a rabbit hole, like [email protected] Alice in Wonderland . And when I hit bottom, all I could do was look around and puzzle about what I discovered there . My adventures started with a casual com- ment, made by an ex-Cray Research employee, about the design of current super- computers . He told me that today’s supercomputers cannot perform some of the tasks that they are designed for, and used weather forecasting as his example . I was stunned . Could this be true? Or was I just being dragged down some fictional rabbit hole? I decided to learn more about supercomputer history . Supercomputers It is humbling to learn about the early history of computer design . Things we take for granted, such as pipelining instructions and vector processing, were impor- tant inventions in the 1970s . The first supercomputers were built from discrete components—that is, transistors soldered to circuit boards—and had clock speeds in the tens of nanoseconds . To put that in real terms, the Control Data Corpora- tion’s (CDC) 7600 had a clock cycle of 27 .5 ns, or in today’s terms, 36 4. MHz . This was CDC’s second supercomputer (the 6600 was first), but included instruction pipelining, an invention of Seymour Cray . The CDC 7600 peaked at 36 MFLOPS, but generally got 10 MFLOPS with carefully tuned code . The other cool thing about the CDC 7600 was that it broke down at least once a day . -
Framework for Path Finding in Multi-Layer Transport Networks
UvA-DARE (Digital Academic Repository)
Framework for path finding in multi-layer transport networks
Dijkstra, F.
Publication date: 2009. Document version: final published version. Link to publication.

Citation for published version (APA): Dijkstra, F. (2009). Framework for path finding in multi-layer transport networks.

General rights: It is not permitted to download or to forward/distribute the text or part of it without the consent of the author(s) and/or copyright holder(s), other than for strictly personal, individual use, unless the work is under an open content license (like Creative Commons).

Disclaimer/complaints regulations: If you believe that digital publication of certain material infringes any of your rights or (privacy) interests, please let the Library know, stating your reasons. In case of a legitimate complaint, the Library will make the material inaccessible and/or remove it from the website. Please ask the Library: https://uba.uva.nl/en/contact, or send a letter to: Library of the University of Amsterdam, Secretariat, Singel 425, 1012 WP Amsterdam, The Netherlands. You will be contacted as soon as possible.

UvA-DARE is a service provided by the library of the University of Amsterdam (https://dare.uva.nl). Download date: 29 Sep 2021.

[Cover: "Framework for Path Finding in Multi-Layer Transport Networks", Freek Dijkstra; the cover illustration names network locations such as Quebec, Amsterdam, StarLight, CAnet, MAN LAN, and NetherLight.]

Title page: Framework for Path Finding in Multi-Layer Transport Networks. Academisch Proefschrift (academic dissertation) for the degree of doctor at the Universiteit van Amsterdam, under the authority of the Rector Magnificus, Ms. …

Jaguar Supercomputer
Jake Baskin '10, Jānis Lībeks '10. Jan 27, 2010.

What is it? Currently the fastest supercomputer in the world, at up to 2.33 PFLOPS, located at Oak Ridge National Laboratory (ORNL). A leader in "petascale scientific supercomputing", it runs massively parallel simulations, modeling climate, supernovas, volcanoes, and cellulose. (Image: http://www.nccs.gov/wp-content/themes/nightfall/img/jaguarXT5/gallery/jaguar-1.jpg)

Overview: processor specifics, network architecture, programming models, NCCS networking, the Spider file system, and scalability.

The guts: 84 XT4 and 200 XT5 cabinets. The XT5 partition has 18,688 compute nodes and 256 service and I/O nodes; the XT4 partition has 7,832 compute nodes and 116 service and I/O nodes.

Compute nodes (XT5): two Opteron 2435 "Istanbul" (6-core) processors per node; 64 KB L1 instruction cache and 64 KB L1 data cache per core; 512 KB L2 cache per core; 6 MB L3 cache per processor (shared); 8 GB of DDR2-800 RAM directly attached to each processor by an integrated memory controller. (Source: http://www.cray.com/Assets/PDF/products/xt/CrayXT5Brochure.pdf)

How are they organized? A 3-D torus topology. The XT5 and XT4 segments are connected by an InfiniBand DDR network, with 889 GB/sec bisectional bandwidth.

Programming models: Jaguar supports MPI (Message Passing Interface), OpenMP (Open Multi-Processing), SHMEM (SHared MEMory access library), and PGAS (Partitioned Global Address Space). A minimal message-passing illustration follows below.

NCCS networking: Jaguar usually performs computations on large datasets, which have to be transferred to ORNL. Jaguar is connected to ESnet (Energy Sciences Network, serving scientific institutions) and Internet2 (serving higher-education institutions). ORNL owns its own optical network that provides 10 Gb/s to various locations around the US.

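As an illustration of the message-passing model listed above (not taken from the presentation), here is a minimal sketch using mpi4py, the Python bindings for MPI. It assumes an MPI implementation and launcher such as mpirun are available; the file name in the usage note is hypothetical.

```python
from mpi4py import MPI  # Python bindings for MPI (assumed installed on the system)

comm = MPI.COMM_WORLD
rank = comm.Get_rank()      # this process's id within the communicator
size = comm.Get_size()      # total number of MPI processes

# Each rank sums a disjoint slice of 0..999, then the partial sums are combined.
total_items = 1000
local_sum = sum(i for i in range(total_items) if i % size == rank)

global_sum = comm.reduce(local_sum, op=MPI.SUM, root=0)

if rank == 0:
    print(f"sum over {size} ranks: {global_sum}")  # 499500 regardless of size
```

Run with, for example, `mpirun -np 4 python mpi_sum.py`: each rank computes a partial sum and rank 0 prints the combined result.
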
The Network Access Machine
NBS Technical Note
U.S. Department of Commerce / National Bureau of Standards
A Review of Network Access Techniques with a Case Study: The Network Access Machine

NATIONAL BUREAU OF STANDARDS
The National Bureau of Standards was established by an act of Congress March 3, 1901. The Bureau's overall goal is to strengthen and advance the Nation's science and technology and facilitate their effective application for public benefit. To this end, the Bureau conducts research and provides: (1) a basis for the Nation's physical measurement system, (2) scientific and technological services for industry and government, (3) a technical basis for equity in trade, and (4) technical services to promote public safety. The Bureau consists of the Institute for Basic Standards, the Institute for Materials Research, the Institute for Applied Technology, the Institute for Computer Sciences and Technology, and the Office for Information Programs.

THE INSTITUTE FOR BASIC STANDARDS provides the central basis within the United States of a complete and consistent system of physical measurement; coordinates that system with measurement systems of other nations; and furnishes essential services leading to accurate and uniform physical measurements throughout the Nation's scientific community, industry, and commerce. The Institute consists of the Office of Measurement Services, the Office of Radiation Measurement and the following Center and divisions: Applied Mathematics — Electricity — Mechanics — Heat — Optical Physics — Center for Radiation Research: Nuclear Sciences; Applied Radiation — Laboratory Astrophysics — Cryogenics — Electromagnetics — Time and Frequency…

Intent NBI for Software Defined Networking
1 SDN NBI Challenges

According to the architecture definition of the Open Networking Foundation (ONF), a Software Defined Network (SDN) comprises three vertically separated layers: infrastructure, control, and application. The North-Bound Interface (NBI), located between the control plane and the applications, is essential for enabling application innovation and nourishing the SDN eco-system, because it abstracts the network capabilities and information and opens the abstract, logical network to applications.

Traditional NBIs have concentrated on the function layer. A functional NBI is usually designed by network domain experts from the network-system point of view: network capabilities are abstracted and encapsulated from the bottom up. Such a design does not start from user-side requirements; instead it exposes as much network information and capability as possible, giving applications maximum programmability. Examples of functional NBIs include device and link discovery, ID assignment for network interfaces, configuration of forwarding rules, and management of tens of thousands of items of network state.

From the application developer's point of view, however, assuming the developer is not familiar with complex network concepts and configurations, the picture is different. Application developers want to focus on the functionality of the application and on friendly interaction with application users. The network service should be as easy and automatic to consume as cloud services, and the network functions need to be flexible, available, and scalable. All of this requires the network to offer a set of service-oriented, declarative, intent-based NBIs.

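To make the contrast concrete, the sketch below juxtaposes a function-level request with an intent-level request for the same connectivity goal. Both structures are invented for illustration and do not follow any actual ONF-defined schema.

```python
# Purely illustrative contrast between an imperative, function-level request and a
# declarative, intent-level request; field names are placeholders, not a real schema.

# Functional NBI style: the application spells out device-level forwarding state.
functional_request = {
    "device": "switch-17",
    "flow_rule": {"match": {"dst_ip": "10.0.0.42"}, "action": "output:port3", "priority": 100},
}

# Intent NBI style: the application states the desired outcome and constraints;
# the controller works out which forwarding rules realize it.
intent_request = {
    "intent": "connect",
    "from": "app-frontend",
    "to": "app-database",
    "constraints": {"max_latency_ms": 10, "bandwidth_mbps": 200},
}

print("functional:", functional_request)
print("intent:", intent_request)
```
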
Efficient Object Storage Journaling in a Distributed Parallel File System
Presented by Sarp Oral. Authors: Sarp Oral, Feiyi Wang, David Dillow, Galen Shipman, Ross Miller, and Oleg Drokin. FAST'10, Feb 25, 2010.

A Demanding Computational Environment
- Jaguar XT5: 18,688 nodes, 224,256 cores, 300+ TB memory, 2.3 PFlops.
- Jaguar XT4: 7,832 nodes, 31,328 cores, 63 TB memory, 263 TFlops.
- Frost (SGI Ice): 128-node institutional cluster.
- Smoky: 80-node software development cluster.
- Lens: 30-node visualization and analysis cluster.

Spider
- Fastest Lustre file system in the world: demonstrated bandwidth of 240 GB/s on the center-wide file system.
- Largest-scale Lustre file system in the world: demonstrated stability and concurrent mounts on major OLCF systems (Jaguar XT5, Jaguar XT4, the Opteron development cluster Smoky, and the visualization cluster Lens).
- Over 26,000 clients mounting the file system and performing I/O.
- General availability on Jaguar XT5, Lens, Smoky, and GridFTP servers.
- Cutting-edge resiliency at scale: demonstrated resiliency features on Jaguar XT5 (DM Multipath, Lustre router failover).

Designed to Support Peak Performance
[Figure: read and write bandwidth (GB/s), hourly maxima over January 2010; max data rates on 1/2 of the available storage controllers.]

Motivations for a Center-Wide File System
- Building dedicated file systems for platforms does not scale…

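The node and core counts above are consistent with the per-node processor configurations described elsewhere in this list: the XT5 nodes carry two six-core Opterons, and the XT4 figures imply four cores per node:

$$
18{,}688 \times 2 \times 6 = 224{,}256\ \text{cores (XT5)}, \qquad
7{,}832 \times 4 = 31{,}328\ \text{cores (XT4)}.
$$
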
Designing Cisco Network Service Architectures (ARCH)
Data Sheet. Learning Services. Cisco Training on Demand.

Designing Cisco Network Service Architectures (ARCH)

Overview
Designing Cisco® Network Service Architectures (ARCH) Version 3.0 is a Cisco Training on Demand course. It enables you to perform the conceptual, intermediate, and detailed design of a network infrastructure that supports desired network solutions over intelligent network services to achieve effective performance, scalability, and availability. It also enables you to provide viable, stable enterprise internetworking solutions by applying solid Cisco network solution models and recommended design practices. By building on the Designing for Cisco Internetwork Solutions (DESGN) Version 3.0 course, you learn additional aspects of modular campus design, advanced addressing and routing designs, WAN service designs, and enterprise data center and security designs. Interested in purchasing this course in volume at discounts for your company? Contact [email protected].

Duration
The ARCH Training on Demand course is self-paced and based on the 5-day instructor-led training version. It consists of 40 sections of instructor video and text totaling more than 28 hours of instruction, along with interactive activities, 7 hands-on lab exercises, content review questions, and challenge questions.

Target Audience
This course is designed for network engineers and those preparing for the 300-320 ARCH exam.

Objectives
After completing…