GSN the Ideal Application(S) More Virtual Applications for HEP and Others Some Thoughts About Network Storage
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
End-To-End Performance of 10-Gigabit Ethernet on Commodity Systems
END-TO-END PERFORMANCE OF 10-GIGABIT ETHERNET ON COMMODITY SYSTEMS INTEL’SNETWORK INTERFACE CARD FOR 10-GIGABIT ETHERNET (10GBE) ALLOWS INDIVIDUAL COMPUTER SYSTEMS TO CONNECT DIRECTLY TO 10GBE ETHERNET INFRASTRUCTURES. RESULTS FROM VARIOUS EVALUATIONS SUGGEST THAT 10GBE COULD SERVE IN NETWORKS FROM LANSTOWANS. From its humble beginnings as such performance to bandwidth-hungry host shared Ethernet to its current success as applications via Intel’s new 10GbE network switched Ethernet in local-area networks interface card (or adapter). We implemented (LANs) and system-area networks and its optimizations to Linux, the Transmission anticipated success in metropolitan and wide Control Protocol (TCP), and the 10GbE area networks (MANs and WANs), Ethernet adapter configurations and performed sever- continues to evolve to meet the increasing al evaluations. Results showed extraordinari- demands of packet-switched networks. It does ly higher throughput with low latency, so at low implementation cost while main- indicating that 10GbE is a viable intercon- taining high reliability and relatively simple nect for all network environments. (plug and play) installation, administration, Justin (Gus) Hurwitz and maintenance. Architecture of a 10GbE adapter Although the recently ratified 10-Gigabit The world’s first host-based 10GbE adapter, Wu-chun Feng Ethernet standard differs from earlier Ether- officially known as the Intel PRO/10GbE LR net standards, primarily in that 10GbE oper- server adapter, introduces the benefits of Los Alamos National ates only over fiber and only in full-duplex 10GbE connectivity into LAN and system- mode, the differences are largely superficial. area network environments, thereby accom- Laboratory More importantly, 10GbE does not make modating the growing number of large-scale obsolete current investments in network infra- cluster systems and bandwidth-intensive structure. -
Parallel Computing at DESY Peter Wegner Outline •Types of Parallel
Parallel Computing at DESY Peter Wegner Outline •Types of parallel computing •The APE massive parallel computer •PC Clusters at DESY •Symbolic Computing on the Tablet PC Parallel Computing at DESY, CAPP2005 1 Parallel Computing at DESY Peter Wegner Types of parallel computing : •Massive parallel computing tightly coupled large number of special purpose CPUs and special purpose interconnects in n-Dimensions (n=2,3,4,5,6) Software model – special purpose tools and compilers •Event parallelism trivial parallel processing characterized by communication independent programs which are running on large PC farms Software model – Only scheduling via a Batch System Parallel Computing at DESY, CAPP2005 2 Parallel Computing at DESY Peter Wegner Types of parallel computing cont.: •“Commodity ” parallel computing on clusters one parallel program running on a distributed PC Cluster, the cluster nodes are connected via special high speed, low latency interconnects (GBit Ethernet, Myrinet, Infiniband) Software model – MPI (Message Passing Interface) •SMP (Symmetric MultiProcessing) parallelism many CPUs are sharing a global memory, one program is running on different CPUs in parallel Software model – OpenPM and MPI Parallel Computing at DESY, CAPP2005 3 Parallel computing at DESY: Zeuthen Computer Center Massive parallel PC Farms PC Clusters computer Parallel Computing Parallel Computing at DESY, CAPP2005 4 Parallel Computing at DESY Massive parallel APE (Array Processor Experiment) - since 1994 at DESY, exclusively used for Lattice Simulations for simulations of Quantum Chromodynamics in the framework of the John von Neumann Institute of Computing (NIC, FZ Jülich, DESY) http://www-zeuthen.desy.de/ape PC Cluster with fast interconnect (Myrinet, Infiniband) – since 2001, Applications: LQCD, Parform ? Parallel Computing at DESY, CAPP2005 5 Parallel computing at DESY: APEmille Parallel Computing at DESY, CAPP2005 6 Parallel computing at DESY: apeNEXT Parallel computing at DESY: apeNEXT Parallel Computing at DESY, CAPP2005 7 Parallel computing at DESY: Motivation for PC Clusters 1. -
Data Center Architecture and Topology
CENTRAL TRAINING INSTITUTE JABALPUR Data Center Architecture and Topology Data Center Architecture Overview The data center is home to the computational power, storage, and applications necessary to support an enterprise business. The data center infrastructure is central to the IT architecture, from which all content is sourced or passes through. Proper planning of the data center infrastructure design is critical, and performance, resiliency, and scalability need to be carefully considered. Another important aspect of the data center design is flexibility in quickly deploying and supporting new services. Designing a flexible architecture that has the ability to support new applications in a short time frame can result in a significant competitive advantage. Such a design requires solid initial planning and thoughtful consideration in the areas of port density, access layer uplink bandwidth, true server capacity, and oversubscription, to name just a few. The data center network design is based on a proven layered approach, which has been tested and improved over the past several years in some of the largest data center implementations in the world. The layered approach is the basic foundation of the data center design that seeks to improve scalability, performance, flexibility, resiliency, and maintenance. Figure 1-1 shows the basic layered design. 1 CENTRAL TRAINING INSTITUTE MPPKVVCL JABALPUR Figure 1-1 Basic Layered Design Campus Core Core Aggregation 10 Gigabit Ethernet Gigabit Ethernet or Etherchannel Backup Access The layers of the data center design are the core, aggregation, and access layers. These layers are referred to extensively throughout this guide and are briefly described as follows: • Core layer—Provides the high-speed packet switching backplane for all flows going in and out of the data center. -
(12) United States Patent (10) Patent No.: US 8,862,870 B2 Reddy Et Al
USOO886287OB2 (12) United States Patent (10) Patent No.: US 8,862,870 B2 Reddy et al. (45) Date of Patent: Oct. 14, 2014 (54) SYSTEMS AND METHODS FOR USPC .......... 713/152–154, 168, 170; 709/223, 224, MULTI-LEVELTAGGING OF ENCRYPTED 709/225 ITEMIS FOR ADDITIONAL SECURITY AND See application file for complete search history. EFFICIENT ENCRYPTED ITEM (56) References Cited DETERMINATION U.S. PATENT DOCUMENTS (75) Inventors: Anoop Reddy, Santa Clara, CA (US); 5,867,494 A 2/1999 Krishnaswamy et al. Craig Anderson, Santa Clara, CA (US) 5,909,559 A 6, 1999 SO (73) Assignee: Citrix Systems, Inc., Fort Lauderdale, (Continued) FL (US) FOREIGN PATENT DOCUMENTS (*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 CN 1478348 A 2, 2004 U.S.C. 154(b) by 0 days. EP 1422.907 A2 5, 2004 (Continued) (21) Appl. No.: 13/337.735 OTHER PUBLICATIONS (22) Filed: Dec. 27, 2011 Australian Examination Report on 200728.1083 dated Nov.30, 2010. (65) Prior Publication Data (Continued) US 2012/O17387OA1 Jul. 5, 2012 Primary Examiner — Abu Sholeman (74) Attorney, Agent, or Firm — Foley & Lardner LLP: Related U.S. Application Data Christopher J. McKenna (60) Provisional application No. 61/428,138, filed on Dec. (57) ABSTRACT 29, 2010. The present disclosure is directed towards systems and meth ods for performing multi-level tagging of encrypted items for (51) Int. Cl. additional security and efficient encrypted item determina H04L 9M32 (2006.01) tion. A device intercepts a message from a server to a client, H04L 2L/00 (2006.01) parses the message and identifies a cookie. -
The Quadrics Network (Qsnet): High-Performance Clustering Technology
Proceedings of the 9th IEEE Hot Interconnects (HotI'01), Palo Alto, California, August 2001. The Quadrics Network (QsNet): High-Performance Clustering Technology Fabrizio Petrini, Wu-chun Feng, Adolfy Hoisie, Salvador Coll, and Eitan Frachtenberg Computer & Computational Sciences Division Los Alamos National Laboratory ¡ fabrizio,feng,hoisie,scoll,eitanf ¢ @lanl.gov Abstract tegration into large-scale systems. While GigE resides at the low end of the performance spectrum, it provides a low-cost The Quadrics interconnection network (QsNet) con- solution. GigaNet, GSN, Myrinet, and SCI add programma- tributes two novel innovations to the field of high- bility and performance by providing communication proces- performance interconnects: (1) integration of the virtual- sors on the network interface cards and implementing differ- address spaces of individual nodes into a single, global, ent types of user-level communication protocols. virtual-address space and (2) network fault tolerance via The Quadrics network (QsNet) surpasses the above inter- link-level and end-to-end protocols that can detect faults connects in functionality by including a novel approach to and automatically re-transmit packets. QsNet achieves these integrate the local virtual memory of a node into a globally feats by extending the native operating system in the nodes shared, virtual-memory space; a programmable processor in with a network operating system and specialized hardware the network interface that allows the implementation of intel- support in the network interface. As these and other impor- ligent communication protocols; and an integrated approach tant features of QsNet can be found in the InfiniBand speci- to network fault detection and fault tolerance. Consequently, fication, QsNet can be viewed as a precursor to InfiniBand. -
High Performance Network and Channel-Based Storage
High Performance Network and Channel-Based Storage Randy H. Katz Report No. UCB/CSD 91/650 September 1991 Computer Science Division (EECS) University of California, Berkeley Berkeley, California 94720 (NASA-CR-189965) HIGH PERFORMANCE NETWORK N92-19260 AND CHANNEL-BASED STORAGE (California Univ.) 42 p CSCL 098 Unclas G3/60 0073846 High Performance Network and Channel-Based Storage Randy H. Katz Computer Science Division Department of Electrical Engineering and Computer Sciences University of California Berkeley, California 94720 Abstract: In the traditional mainframe-centered view of a computer system, storage devices are coupled to the system through complex hardware subsystems called I/O channels. With the dramatic shift towards workstation-based com- puting, and its associated client/server model of computation, storage facilities are now found attached to file servers and distributed throughout the network. In this paper, we discuss the underlying technology trends that are leading to high performance network-based storage, namely advances in networks, storage devices, and I/O controller and server architectures. We review several commercial systems and research prototypes that are leading to a new approach to high performance computing based on network-attached storage. Key Words and Phrases: High Performance Computing, Computer Networks, File and Storage Servers, Secondary and Tertiary Storage Device 1. Introduction The traditional mainframe-centered model of computing can be characterized by small numbers of large-scale mainframe computers, with shared storage devices attached via I/O channel hard- ware. Today, we are experiencing a major paradigm shift away from centralized mainframes to a distributed model of computation based on workstations and file servers connected via high per- formance networks. -
Comparing Ethernet and Myrinet for MPI Communication
Comparing Ethernet and Myrinet for MPI Communication Supratik Majumder Scott Rixner Rice University Rice University Houston, Texas Houston, Texas [email protected] [email protected] ABSTRACT operating system. In addition, TCP is a carefully developed This paper compares the performance of Myrinet and Eth- network protocol that gracefully handles lost packets and ernet as a communication substrate for MPI libraries. MPI flow control. User-level protocols can not hope to achieve library implementations for Myrinet utilize user-level com- the efficiency of TCP when dealing with these issues. Fur- munication protocols to provide low latency and high band- thermore, there has been significant work in the network width MPI messaging. In contrast, MPI library impleme- server domain to optimize the use of TCP for efficient com- nations for Ethernet utilize the operating system network munication. protocol stack, leading to higher message latency and lower message bandwidth. However, on the NAS benchmarks, GM This paper evaluates the performance differences between messaging over Myrinet only achieves 5% higher applica- GM over 1.2 Gbps Myrinet and TCP over 1 Gbps Ethernet tion performance than TCP messaging over Ethernet. Fur- as communication substrates for the Los Alamos MPI (LA- thermore, efficient TCP messaging implmentations improve MPI) library. The raw network unidirectional ping latency communication latency tolerance, which closes the perfor- of Myrinet is lower than Ethernet by almost 40 µsec. How- mance gap between Myrinet and Ethernet to about 0.5% ever, a considerable fraction of this difference is due to in- on the NAS benchmarks. This shows that commodity net- terrupt coalescing employed by the Ethernet network inter- working, if used efficiently, can be a viable alternative to face driver. -
HIPPI Developments for CERN Experiments A
VERSION OF: 5-Feb-98 10:15 HIPPI Developments for CERN experiments A. van Praag ,T. Anguelov, R.A. McLaren, H.C. van der Bij, CERN, Geneva, Switzerland. J. Bovier, P. Cristin Creative Electronic Systems, Geneva, Switzerland. M. Haben, P. Jovanovic, I. Kenyon, R. Staley University of Birmingham, Birmingham, U.K. D. Cunningham, G. Watson Hewlett Packard Laboratories, Bristol, U.K. B. Green, J. Strong Royal Hollaway and Bedford New College, U.K. Abstract HIPPI Standard, fast, simple, inexpensive; is this not a contradiction in terms? The High-Performance Parallel We have decided to use the High Performance Parallel Interface (HIPPI) is a new proposed ANSI standard, using a Interface (HIPPI) to implement these links. The HIPPI minimal protocol and providing 100 Mbyte/sec transfers over specification was started in the Los Alamos laboratory in distances up to 25 m. Equipment using this standard is 1989 and is now a proposed ANSI standard (X3T9/88-127, offered by a growing number of computer manufacturers. A X3T9.3/88-23, HIPPI PH) [1,2]. This standard allows commercially available HIPPI chipset allows low cost 100 Mbyte/sec synchronous data transfers between a "Source" implementations. In this article a brief technical introduction and a "Destination". Seen from the lowest level upwards the to the HIPPI will be given, followed by examples of planned HIPPI specification proposes a logical framing hierarchy applications in High Energy Physics experiments including where the smallest unit of data to be transferred, called a the present developments involving CERN: a detector "burst" has a standard size of 256 words of 32 bit or optional emulator, a risc processor based VME connection, a long 64 bit (Fig. -
Designing High-Performance and Scalable Clustered Network Attached Storage with Infiniband
DESIGNING HIGH-PERFORMANCE AND SCALABLE CLUSTERED NETWORK ATTACHED STORAGE WITH INFINIBAND DISSERTATION Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the Graduate School of The Ohio State University By Ranjit Noronha, MS * * * * * The Ohio State University 2008 Dissertation Committee: Approved by Dhabaleswar K. Panda, Adviser Ponnuswammy Sadayappan Adviser Feng Qin Graduate Program in Computer Science and Engineering c Copyright by Ranjit Noronha 2008 ABSTRACT The Internet age has exponentially increased the volume of digital media that is being shared and distributed. Broadband Internet has made technologies such as high quality streaming video on demand possible. Large scale supercomputers also consume and cre- ate huge quantities of data. This media and data must be stored, cataloged and retrieved with high-performance. Researching high-performance storage subsystems to meet the I/O demands of applications in modern scenarios is crucial. Advances in microprocessor technology have given rise to relatively cheap off-the-shelf hardware that may be put together as personal computers as well as servers. The servers may be connected together by networking technology to create farms or clusters of work- stations (COW). The evolution of COWs has significantly reduced the cost of ownership of high-performance clusters and has allowed users to build fairly large scale machines based on commodity server hardware. As COWs have evolved, networking technologies like InfiniBand and 10 Gigabit Eth- ernet have also evolved. These networking technologies not only give lower end-to-end latencies, but also allow for better messaging throughput between the nodes. This allows us to connect the clusters with high-performance interconnects at a relatively lower cost. -
Inside the Lustre File System
Inside The Lustre File System Technology Paper An introduction to the inner workings of the world’s most scalable and popular open source HPC file system Torben Kling Petersen, PhD Inside The Lustre File System The Lustre High Performance Parallel File System Introduction Ever since the precursor to Lustre® (known as the Object- Based Filesystem, or ODBFS) was developed at Carnegie Mellon University in 1999, Lustre has been at the heart of high performance computing, providing the necessary throughput and scalability to many of the fastest supercomputers in the world. Lustre has experienced a number of changes and, despite the code being open source, the ownership has changed hands a number of times. From the original company started by Dr. Peter Braam (Cluster File Systems, or CFS), which was acquired by Sun Microsystems in 2008—which was in turn acquired by Oracle in 2010—to the acquisition of the Lustre assets by Xyratex in 2013, the open source community has supported the proliferation and acceptance of Lustre. In 2011, industry trade groups like OpenSFS1, together with its European sister organization, EOFS2, took a leading role in the continued development of Lustre, using member fees and donations to drive the evolution of specific projects, along with those sponsored by users3 such as Oak Ridge National Laboratory, Lawrence Livermore National Laboratory and the French Atomic Energy Commission (CEA), to mention a few. Today, in 2014, the Lustre community is stronger than ever, and seven of the top 10 high performance computing (HPC) systems on the international Top 5004 list (as well as 75+ of the top 100) are running the Lustre high performance parallel file system. -
PC Hardware Contents
PC Hardware Contents 1 Computer hardware 1 1.1 Von Neumann architecture ...................................... 1 1.2 Sales .................................................. 1 1.3 Different systems ........................................... 2 1.3.1 Personal computer ...................................... 2 1.3.2 Mainframe computer ..................................... 3 1.3.3 Departmental computing ................................... 4 1.3.4 Supercomputer ........................................ 4 1.4 See also ................................................ 4 1.5 References ............................................... 4 1.6 External links ............................................. 4 2 Central processing unit 5 2.1 History ................................................. 5 2.1.1 Transistor and integrated circuit CPUs ............................ 6 2.1.2 Microprocessors ....................................... 7 2.2 Operation ............................................... 8 2.2.1 Fetch ............................................. 8 2.2.2 Decode ............................................ 8 2.2.3 Execute ............................................ 9 2.3 Design and implementation ...................................... 9 2.3.1 Control unit .......................................... 9 2.3.2 Arithmetic logic unit ..................................... 9 2.3.3 Integer range ......................................... 10 2.3.4 Clock rate ........................................... 10 2.3.5 Parallelism ......................................... -
Lecture 12: I/O: Metrics, a Little Queuing Theory, and Busses
Lecture 12: I/O: Metrics, A Little Queuing Theory, and Busses Professor David A. Patterson Computer Science 252 Fall 1996 DAP.F96 1 Review: Disk Device Terminology Disk Latency = Queuing Time + Seek Time + Rotation Time + Xfer Time Order of magnitude times for 4K byte transfers: Seek: 12 ms or less Rotate: 4.2 ms @ 7200 rpm (8.3 ms @ 3600 rpm ) Xfer: 1 ms @ 7200 rpm (2 ms @ 3600 rpm) DAP.F96 2 Review: R-DAT Technology 2000 RPM Four Head Recording Helical Recording Scheme Tracks Recorded ±20° w/o guard band Read After Write Verify DAP.F96 3 Review: Automated Cartridge System STC 4400 8 feet 10 feet 6000 x 0.8 GB 3490 tapes = 5 TBytes in 1992 $500,000 O.E.M. Price 6000 x 20 GB D3 tapes = 120 TBytes in 1994 1 Petabyte (1024 TBytes) in 2000 DAP.F96 4 Review: Storage System Issues • Historical Context of Storage I/O • Secondary and Tertiary Storage Devices • Storage I/O Performance Measures • A Little Queuing Theory • Processor Interface Issues • I/O Buses • Redundant Arrarys of Inexpensive Disks (RAID) • ABCs of UNIX File Systems • I/O Benchmarks • Comparing UNIX File System Performance DAP.F96 5 Disk I/O Performance 300 Response Metrics: Time (ms) Response Time Throughput 200 100 0 0% 100% Throughput (% total BW) Queue Proc IOC Device Response time = Queue + Device Service time DAP.F96 6 Response Time vs. Productivity • Interactive environments: Each interaction or transaction has 3 parts: – Entry Time: time for user to enter command – System Response Time: time between user entry & system replies – Think Time: Time from response until user