Designing High-Performance and Scalable Clustered Network Attached Storage with InfiniBand


DESIGNING HIGH-PERFORMANCE AND SCALABLE CLUSTERED NETWORK ATTACHED STORAGE WITH INFINIBAND

DISSERTATION

Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the Graduate School of The Ohio State University

By Ranjit Noronha, MS

* * * * *

The Ohio State University, 2008

Dissertation Committee:
Dhabaleswar K. Panda, Adviser
Ponnuswamy Sadayappan
Feng Qin

Approved by: Adviser, Graduate Program in Computer Science and Engineering

© Copyright by Ranjit Noronha, 2008

ABSTRACT

The Internet age has exponentially increased the volume of digital media that is being shared and distributed. Broadband Internet has made technologies such as high-quality streaming video on demand possible. Large-scale supercomputers also consume and create huge quantities of data. This media and data must be stored, cataloged and retrieved with high performance. Researching high-performance storage subsystems to meet the I/O demands of applications in modern scenarios is crucial.

Advances in microprocessor technology have given rise to relatively cheap off-the-shelf hardware that may be put together as personal computers as well as servers. The servers may be connected together by networking technology to create farms or clusters of workstations (COWs). The evolution of COWs has significantly reduced the cost of ownership of high-performance clusters and has allowed users to build fairly large-scale machines based on commodity server hardware.

As COWs have evolved, networking technologies like InfiniBand and 10 Gigabit Ethernet have also evolved. These networking technologies not only give lower end-to-end latencies, but also allow for better messaging throughput between the nodes. This allows us to connect the clusters with high-performance interconnects at a relatively lower cost.

With the deployment of low-cost, high-performance hardware and networking technology, it is becoming increasingly important to design a storage system that can be shared across all the nodes in the cluster. Traditionally, the different components of the file system have been strung together using the network to connect them. The protocol generally used over the network is TCP/IP. The TCP/IP protocol stack in general has been shown to have poor performance, especially for high-performance networks like 10 Gigabit Ethernet or InfiniBand. This is largely due to the fragmentation and reassembly cost of TCP/IP. The cost of multiple copies also serves to severely degrade the performance of the stack. Also, TCP/IP has been shown to reduce the capacity of network attached storage systems because of problems like incast.

In this dissertation, we research the problem of designing high-performance communication subsystems for network attached storage (NAS) systems. Specifically, we delve into the issues and potential solutions with designing communication protocols for high-end single-server and clustered server NAS systems. Orthogonally, we also investigate how a caching architecture may potentially enhance the performance of a NAS system. Finally, we look at the potential performance implications of using some of these designs in two scenarios: over a long-haul network and when used as a basis for checkpointing parallel applications.

VITA

January 7, 1976: Born, Mumbai, India
1998: BE, Computer Engineering, Mumbai University, India
1998-2000: Associate Software Engineer, Tata Infotech, Mumbai, India
2000: Intern, Compaq Computer Corporation, Cupertino, CA, USA
2002: MS, Computer Science, Binghamton University, Binghamton, NY, USA
2002-2008: Graduate Research Associate, The Ohio State University, Columbus, OH, USA
2007: Intern, NFS Group, Sun Microsystems, Austin, TX, USA

PUBLICATIONS

Dissertation Related Research Publications

R. Noronha, X. Ouyang and D.K. Panda, "Designing a High-Performance Clustered NAS: A Case Study with pNFS over RDMA on InfiniBand", International Conference on High Performance Computing (HiPC), December 2008.

R. Noronha and D.K. Panda, "IMCa: A High-Performance Caching Front-End for InfiniBand", International Conference on Parallel Processing (ICPP), Sept 2008.

S. Narravula, H. Subramoni, P. Lai, B. Rajaraman, R. Noronha and D.K. Panda, "Performance of HPC Middleware over InfiniBand WAN", International Conference on Parallel Processing (ICPP), Sept 2008.

R. Noronha, Lei Chai, Spencer Shepler and D.K. Panda, "Enhancing the Performance of NFSv4 with RDMA", The 4th International Workshop on Storage Network Architecture and Parallel I/Os (SNAPI'07), held with the 24th IEEE Conference on Mass Storage Systems and Technologies (MSST), Sept 2007.

R. Noronha, Lei Chai, Thomas Talpey and D.K. Panda, "Designing NFS With RDMA for Security, Performance and Scalability", International Conference on Parallel Processing (ICPP), Sept 2007.

Lei Chai, X. Ouyang, R. Noronha and D.K. Panda, "pNFS/PVFS2 over InfiniBand: Early Experiences", Petascale Data Storage Workshop, held with SuperComputing, Nov 2007.

W. Yu, R. Noronha, S. Liang and D.K. Panda, "Benefits of High Speed Interconnects to Cluster File Systems: A Case Study with Lustre", Communication Architecture for Clusters (CAC) Workshop, held in conjunction with the International Parallel and Distributed Processing Symposium (IPDPS '06), April 2006.

Other Related Research Publications

R. Noronha and D.K. Panda, "Improving Scalability of OpenMP Applications on Multi-Core Systems Using Large Page Support", International Workshop on Multithreaded Architectures and Applications (MTAAP), held in conjunction with the International Parallel and Distributed Processing Symposium (IPDPS), March 2007.

L. Chai, R. Noronha and D. K. Panda, "Can Performance and Portability Exist Across Architectures?", The 6th IEEE International Symposium on Cluster Computing and the Grid (CCGrid), May 2006.

S. Liang, R. Noronha and D. K. Panda, "Swapping to Remote Memory over InfiniBand: An Approach using a High Performance Network Block Device", IEEE Cluster Computing, Sept 2005.

P. Balaji, W. Feng, Q. Gao, R. Noronha, W. Yu and D. K. Panda, "Head-to-TOE Evaluation of High-Performance Sockets over Protocol Offload Engines", IEEE Cluster Computing, Sept 2005.

L. Chai, R. Noronha, P. Gupta, R. Brown and D. K. Panda, "Designing a Portable MPI-2 over Modern Interconnects Using uDAPL Interface", EuroPVM/MPI, Sept 2005.

R. Noronha and D. K. Panda, "Performance Evaluation of MM5 on Clusters With Modern Interconnects: Scalability and Impact", Euro-Par, Aug 2005.

R. Noronha and D. K. Panda, "Can High Performance DSM Systems Designed With InfiniBand Features Benefit from PCI-Express?", Fifth International Workshop on Distributed Shared Memory (in conjunction with CCGrid 2005), May 2005.
R. Noronha and D. K. Panda, "Reducing Diff Overhead in Software DSM Systems using RDMA Operations in InfiniBand", Workshop on Remote Direct Memory Access (RDMA): Applications, Implementations, and Technologies (RAIT 2004), San Diego, CA, USA, in conjunction with the IEEE CLUSTER 2004 conference, Sept 2004.

R. Noronha and D. K. Panda, "Designing High Performance DSM Systems using InfiniBand Features", Fourth International Workshop on Distributed Shared Memory (held in conjunction with CCGrid 2004), April 2004.

R. Noronha and D. K. Panda, "Implementing TreadMarks over GM on Myrinet: Challenges, Design Experiences and Performance Evaluation", Workshop on Communication Architecture for Clusters (CAC 03), held in conjunction with IPDPS 03.

M. Otey, R. Noronha, G. Li, S. Parthasarathy and D.K. Panda, "NIC-based Intrusion Detection: A Feasibility Study", Workshop on Data Mining for Cyber Threat Analyses (DMCTA '02), held in conjunction with the IEEE International Conference on Data Mining.

Prior Unrelated Research Publications

R. Noronha and N. Abu-Ghazaleh, "Using Programmable NIC's for TimeWarp Optimization", International Parallel and Distributed Processing Symposium (IPDPS '02), April 2002.

R. Noronha and N. Abu-Ghazaleh, "Early Cancellation: An Active-NIC Optimization for TimeWarp", 16th Workshop on Parallel and Distributed Simulation (PADS'02), May 2002.

Masters Thesis

R. Noronha, "Intelligent NIC's: A Feasibility Study of Improving Performance of Distributed Applications by Programming Some of Their Components on the Network Interface Card", Masters Thesis, State University of New York at Binghamton, Binghamton, New York, June 2001.

FIELDS OF STUDY

Major Field: Computer Science and Engineering

Studies in:
Computer Architecture: Prof. D. K. Panda
Software Systems: Prof. S. Parthasarathy
Computer Networks: Prof. Dong Xuan

TABLE OF CONTENTS

Abstract
Vita
List of Tables
List of Figures

Chapters:

1. Introduction
   1.1 Overview of Storage Terminology
       1.1.1 File System Concepts
       1.1.2 UNIX Notion of Files
       1.1.3 Mapping Files to Blocks on Storage Devices
       1.1.4 Techniques to Improve File System Performance
   1.2 Overview of Storage Architectures
       1.2.1 Direct Attached Storage (DAS)
       1.2.2 Network Storage Architectures: In-Band and Out-Of-Band Access
       1.2.3 Network Attached Storage (NAS)
       1.2.4 System Area Networks (SAN)
       1.2.5 Clustered Network Attached Storage (CNAS)
       1.2.6 Object Based Storage Systems (OBSS)
   1.3 Representative File Systems
       1.3.1 Network File Systems (NFS)
       1.3.2 Lustre File System
   1.4 Overview of Networking Technologies
       1.4.1 Fibre Channel (FC)
       1.4.2 10 Gigabit Ethernet
       1.4.3 InfiniBand
2. Problem Statement
   2.1 Research Approach
   2.2 Dissertation Overview
3. Single Server NAS: NFS on InfiniBand
   3.1 Why is NFS over RDMA Important?
   3.2 Overview of the InfiniBand Communication Model
       3.2.1 Communication Primitives
   3.3 Overview of NFS/RDMA Architecture
       3.3.1 Inline Protocol for RPC Call and RPC Reply
       3.3.2 RDMA Protocol for Bulk Data Transfer
   3.4 Proposed Read-Write Design and Comparison to the Read-Read Design
       3.4.1 Limitations in the Read-Read Design
       3.4.2 Potential Advantages of the Read-Write Design
       3.4.3 Proposed Registration Strategies for the Read-Write Protocol
   3.5 Experimental Evaluation of NFSv3 over InfiniBand ...
Recommended publications
  • End-To-End Performance of 10-Gigabit Ethernet on Commodity Systems
END-TO-END PERFORMANCE OF 10-GIGABIT ETHERNET ON COMMODITY SYSTEMS

Justin (Gus) Hurwitz and Wu-chun Feng, Los Alamos National Laboratory

INTEL'S NETWORK INTERFACE CARD FOR 10-GIGABIT ETHERNET (10GBE) ALLOWS INDIVIDUAL COMPUTER SYSTEMS TO CONNECT DIRECTLY TO 10GBE ETHERNET INFRASTRUCTURES. RESULTS FROM VARIOUS EVALUATIONS SUGGEST THAT 10GBE COULD SERVE IN NETWORKS FROM LANS TO WANS.

From its humble beginnings as shared Ethernet to its current success as switched Ethernet in local-area networks (LANs) and system-area networks, and its anticipated success in metropolitan and wide area networks (MANs and WANs), Ethernet continues to evolve to meet the increasing demands of packet-switched networks. It does so at low implementation cost while maintaining high reliability and relatively simple (plug and play) installation, administration, and maintenance. Although the recently ratified 10-Gigabit Ethernet standard differs from earlier Ethernet standards, primarily in that 10GbE operates only over fiber and only in full-duplex mode, the differences are largely superficial. More importantly, 10GbE does not make obsolete current investments in network infrastructure.

... such performance to bandwidth-hungry host applications via Intel's new 10GbE network interface card (or adapter). We implemented optimizations to Linux, the Transmission Control Protocol (TCP), and the 10GbE adapter configurations and performed several evaluations. Results showed extraordinarily higher throughput with low latency, indicating that 10GbE is a viable interconnect for all network environments.

Architecture of a 10GbE adapter: The world's first host-based 10GbE adapter, officially known as the Intel PRO/10GbE LR server adapter, introduces the benefits of 10GbE connectivity into LAN and system-area network environments, thereby accommodating the growing number of large-scale cluster systems and bandwidth-intensive ...
  • Early Experiences with Storage Area Networks and CXFS, John Lynch
Early Experiences with Storage Area Networks and CXFS

John Lynch, Aerojet, 6304 Spine Road, Boulder, CO 80516

Abstract: This paper looks at the design, integration and application issues involved in deploying an early access, very large, and highly available storage area network. Covered are topics from filesystem failover, issues regarding numbers of nodes in a cluster, and using leading edge solutions to solve complex issues in a real-time data processing network.

1 Introduction

Aerojet designed and installed a highly available, large scale Storage Area Network over spring of 2000. This system, due to its size and diversity, is known to be one of a kind and is currently not offered by SGI, but would serve as a prototype system. The project's goal was to evaluate Fibre Channel and SAN technology for its benefits and applicability in a second-generation, real-time data processing network. SAN technology seemed to be the technology of the future to replace the traditional SCSI solution. The approach was to conduct an evaluation of SAN technology as a ...

SAN technology can be categorized in two distinct approaches. Both approaches use the storage area network to provide access to multiple storage devices at the same time by one or multiple hosts. The difference is how the storage devices are accessed. The most common approach allows the hosts to access the storage devices across the storage area network, but filesystems are not shared. This allows either a single host to stripe data across a greater number of storage controllers, or to share storage controllers among several systems. This essentially breaks up a large storage system into smaller distinct pieces, but allows for the cost-sharing of the most expensive component, the storage controller.
  • CXFS™ Client-Only Guide for SGI® InfiniteStorage
CXFS™ Client-Only Guide for SGI® InfiniteStorage, 007–4507–016

COPYRIGHT © 2002–2008 SGI. All rights reserved; provided portions may be copyright in third parties, as indicated elsewhere herein. No permission is granted to copy, distribute, or create derivative works from the contents of this electronic documentation in any manner, in whole or in part, without the prior written permission of SGI.

LIMITED RIGHTS LEGEND. The software described in this document is "commercial computer software" provided with restricted rights (except as to included open/free source) as specified in the FAR 52.227-19 and/or the DFAR 227.7202, or successive sections. Use beyond license provisions is a violation of worldwide intellectual property laws, treaties and conventions. This document is provided with limited rights as defined in 52.227-14.

TRADEMARKS AND ATTRIBUTIONS. SGI, Altix, the SGI cube and the SGI logo are registered trademarks and CXFS, FailSafe, IRIS FailSafe, SGI ProPack, and Trusted IRIX are trademarks of SGI in the United States and/or other countries worldwide. Active Directory, Microsoft, Windows, and Windows NT are registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries. AIX and IBM are registered trademarks of IBM Corporation. Brocade and Silkworm are trademarks of Brocade Communication Systems, Inc. AMD, AMD Athlon, AMD Duron, and AMD Opteron are trademarks of Advanced Micro Devices, Inc. Apple, Mac, Mac OS, Power Mac, and Xserve are registered trademarks of Apple Computer, Inc. Disk Manager is a registered trademark of ONTRACK Data International, Inc. Engenio, LSI Logic, and SANshare are trademarks or registered trademarks of LSI Corporation.
  • Parallel Computing at DESY, Peter Wegner
Parallel Computing at DESY, Peter Wegner (CAPP2005)

Outline:
• Types of parallel computing
• The APE massive parallel computer
• PC clusters at DESY
• Symbolic computing on the Tablet PC

Types of parallel computing:
• Massive parallel computing: a tightly coupled, large number of special purpose CPUs and special purpose interconnects in n dimensions (n = 2, 3, 4, 5, 6). Software model: special purpose tools and compilers.
• Event parallelism: trivial parallel processing characterized by communication-independent programs running on large PC farms. Software model: only scheduling via a batch system.
• "Commodity" parallel computing on clusters: one parallel program running on a distributed PC cluster, where the cluster nodes are connected via special high speed, low latency interconnects (GBit Ethernet, Myrinet, InfiniBand). Software model: MPI (Message Passing Interface).
• SMP (Symmetric MultiProcessing) parallelism: many CPUs share a global memory, and one program runs on different CPUs in parallel. Software model: OpenMP and MPI.

Parallel computing at DESY (Zeuthen Computer Center): massive parallel computer, PC farms, PC clusters.

Massive parallel APE (Array Processor Experiment): since 1994 at DESY, exclusively used for lattice simulations of Quantum Chromodynamics in the framework of the John von Neumann Institute of Computing (NIC, FZ Jülich, DESY). http://www-zeuthen.desy.de/ape

PC cluster with fast interconnect (Myrinet, Infiniband): since 2001. Applications: LQCD, Parform ?

(Further slides: APEmille; apeNEXT; Motivation for PC Clusters ...)
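The "commodity cluster" model above names MPI as its software model. As a rough illustration (mine, not from the DESY slides), a minimal MPI program in C, assuming an MPI implementation such as MPICH or Open MPI is installed, looks like this:

```c
/* Minimal MPI sketch: each rank reports itself, then rank 0 sums a value
   from all ranks with MPI_Reduce. Illustrative only, not DESY code. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    printf("Hello from rank %d of %d\n", rank, size);

    int local = rank, total = 0;
    /* Combine per-node results on rank 0: message passing, no shared memory. */
    MPI_Reduce(&local, &total, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("Sum of ranks = %d\n", total);

    MPI_Finalize();
    return 0;
}
```

Compiled with mpicc and launched with mpirun, one process runs per node or core, and all communication travels over the cluster interconnect (Gigabit Ethernet, Myrinet, or InfiniBand in the clusters described above).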
  • HP StorageWorks Clustered File System Command Line Reference
HP StorageWorks Clustered File System 3.0 Command Line reference guide

Part number: 392372–001. First edition: May 2005.

Legal and notice information. © Copyright 1999-2005 PolyServe, Inc. Portions © 2005 Hewlett-Packard Development Company, L.P. Neither PolyServe, Inc. nor Hewlett-Packard Company makes any warranty of any kind with regard to this material, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose. Neither PolyServe nor Hewlett-Packard shall be liable for errors contained herein or for incidental or consequential damages in connection with the furnishing, performance, or use of this material.

This document contains proprietary information, which is protected by copyright. No part of this document may be photocopied, reproduced, or translated into another language without the prior written consent of Hewlett-Packard. The information is provided "as is" without warranty of any kind and is subject to change without notice. The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. Neither PolyServe nor HP shall be liable for technical or editorial errors or omissions contained herein.

The software this document describes is PolyServe confidential and proprietary. PolyServe and the PolyServe logo are trademarks of PolyServe, Inc. PolyServe Matrix Server contains software covered by the following copyrights and subject to the licenses included in the file thirdpartylicense.pdf, which is included in the PolyServe Matrix Server distribution. Copyright © 1999-2004, The Apache Software Foundation. Copyright © 1992, 1993 Simmule Turner and Rich Salz.
  • Silicon Graphics, Inc. Scalable Filesystems XFS & CXFS
Silicon Graphics, Inc. Scalable Filesystems XFS & CXFS. Presented by Yingping Lu, January 31, 2007.

Outline:
• XFS overview
• XFS architecture
• XFS fundamental data structures: extent list, B+ tree, inode
• XFS filesystem on-disk layout
• XFS directory structure
• CXFS: shared file system

XFS: A World-Class File System
• Scalable: full 64-bit support, dynamic allocation of metadata space, scalable structures and algorithms
• Fast: fast metadata speeds, high bandwidths, high transaction rates
• Reliable: field proven, log/journal

Scalable: full 64-bit support
• Large filesystems: 18,446,744,073,709,551,615 bytes = 2^64 - 1 = 18 million TB (exabytes)
• Large files: 9,223,372,036,854,775,807 bytes = 2^63 - 1 = 9 million TB (exabytes)
• Dynamic allocation of metadata space: inode size configurable, inode space allocated dynamically, unlimited number of files (constrained by storage space)
• Scalable structures and algorithms (B-trees): performance is not an issue with large numbers of files and directories

Fast
• Fast metadata speeds: B-trees everywhere (nearly all lists of metadata information), including directory contents, metadata free lists, and extent lists within a file
• High bandwidths (storage: RM6700): 7.32 GB/s on one filesystem (32p Origin2000, 897 FC disks); >4 GB/s in one file (same Origin, 704 FC disks); large extents (4 KB to 4 GB); request parallelism (multiple AGs); delayed allocation, read ahead/write behind
• High transaction rates: 92,423 IOPS (storage: TP9700)
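As a quick check of the size limits quoted above (my arithmetic, not part of the slides), expressed in decimal terabytes (1 TB = 10^12 bytes):

```latex
% 64-bit byte counts converted to decimal terabytes
\[
2^{64}-1 \approx 1.8447\times 10^{19}\ \text{bytes}
        \approx 1.8\times 10^{7}\ \text{TB} \quad (\text{``18 million TB''}),
\]
\[
2^{63}-1 \approx 9.2234\times 10^{18}\ \text{bytes}
        \approx 9.2\times 10^{6}\ \text{TB} \quad (\text{``9 million TB''}).
\]
```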
  • W4118: File Systems
W4118: file systems. Instructor: Junfeng Yang. References: Modern Operating Systems (3rd edition), Operating System Concepts (8th edition), previous W4118, and OS courses at MIT, Stanford, and UWisc.

Outline
• File system concepts: What is a file? What operations can be performed on files? What is a directory and how is it organized?
• File implementation: How to allocate disk space to files?

What is a file?
• User view: named byte array (types defined by the user); persistent across reboots and power failures.
• OS view: maps bytes to a collection of blocks on physical storage; stored on a nonvolatile storage device (e.g., magnetic disks).

Role of the file system
• Naming: how to "name" files; translate "name" + offset into a logical block #.
• Reliability: must not lose file data.
• Protection: must mediate file access from different users.
• Disk management: fair, efficient use of disk space; fast access to files.

File metadata
• Name: the only information kept in human-readable form.
• Identifier: unique tag (number) that identifies the file within the file system (inode number in UNIX).
• Location: pointer to the file's location on the device.
• Size: current file size.
• Protection: controls who can do reading, writing, executing.
• Time, date, and user identification: data for protection, security, and usage monitoring.
How is metadata stored? (inode in UNIX)

File operations
int creat(const char* pathname, mode_t mode)
int unlink(const char* pathname)
int rename(const char* oldpath, const char* newpath)
int open(const char* pathname, int flags, mode_t mode)
int read(int fd, void* buf, size_t count)
int write(int fd, const void* buf, size_t count)
int lseek(int fd, off_t offset, int whence)
int truncate(const char* pathname, off_t len)
...
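To make the listed system calls concrete, here is a small, self-contained C sketch (my example, not from the lecture slides; the file name scratch.txt is arbitrary) that exercises open, write, lseek, read, rename, and unlink on a scratch file:

```c
/* Illustrative use of the POSIX file operations listed above:
   create a scratch file, write to it, seek back, read, rename, remove. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    const char msg[] = "hello, file system\n";
    char buf[64];

    int fd = open("scratch.txt", O_CREAT | O_RDWR | O_TRUNC, 0644);
    if (fd < 0) { perror("open"); return 1; }

    write(fd, msg, sizeof msg - 1);        /* write bytes at the current offset */
    lseek(fd, 0, SEEK_SET);                /* rewind to the beginning of the file */
    ssize_t n = read(fd, buf, sizeof buf - 1);
    if (n >= 0) {
        buf[n] = '\0';
        printf("read back: %s", buf);
    }
    close(fd);

    rename("scratch.txt", "scratch.old");  /* directory-level operation */
    unlink("scratch.old");                 /* remove the name (and the file) */
    return 0;
}
```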
  • Lecture 17: Files and Directories
CS 422/522 Design & Implementation of Operating Systems. Lecture 17: Files and Directories. Zhong Shao, Dept. of Computer Science, Yale University (11/1/16). Acknowledgement: some slides are taken from previous versions of the CS422/522 lectures taught by Prof. Bryan Ford and Dr. David Wolinsky, and also from the official set of slides accompanying the OSPP textbook by Anderson and Dahlin.

The big picture
• Lectures before the fall break: management of CPU & concurrency; management of main memory & virtual memory.
• Current topics ("management of I/O devices"): last week, I/O devices & device drivers and storage devices; this week, file systems (file system structure, naming and directories, efficiency and performance, reliability and protection).

This lecture
• Implementing the file system abstraction.

Physical reality vs. file system abstraction:
• block oriented vs. byte oriented
• physical sector #'s vs. named files
• no protection vs. users protected from each other
• data might be corrupted if the machine crashes vs. robust to machine failures

File system components
• Disk management: arrange the collection of disk blocks into files.
• Naming: the user gives a file name, not a track or sector number, to locate data.
• Security / protection: keep information secure.
• Reliability / durability: when the system crashes, we lose what is in memory, but we want files to be durable.
(Layers: User, file naming, file access, disk management, disk drivers.)

User vs. system view of a file
• User's view: durable data structures.
• System's view (system call interface): collection of bytes (Unix).
• System's view (inside the OS): collection of blocks; a block is a logical transfer unit, while a sector is the physical transfer unit.
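The byte-vs-block split above is just integer arithmetic. A small sketch (my illustration, with an assumed 4 KB block size) of how a byte offset in a file maps to a logical block number and an offset within that block:

```c
/* Map a byte offset (user view) to a logical block number plus an
   offset inside that block (system view). The block size is an assumption. */
#include <stdint.h>
#include <stdio.h>

#define BLOCK_SIZE 4096u   /* assumed logical block size in bytes */

int main(void)
{
    uint64_t file_offset = 10000;   /* byte position requested by the user */
    uint64_t block_no = file_offset / BLOCK_SIZE;
    unsigned in_block = (unsigned)(file_offset % BLOCK_SIZE);

    /* A file system would then translate (file, block_no) to a physical
       sector address via the file's metadata (e.g., an inode's block map). */
    printf("offset %llu -> logical block %llu, byte %u within that block\n",
           (unsigned long long)file_offset,
           (unsigned long long)block_no, in_block);
    return 0;
}
```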
  • CXFS™ Administration Guide for SGI® InfiniteStorage
CXFS™ Administration Guide for SGI® InfiniteStorage, 007–4016–025

CONTRIBUTORS. Written by Lori Johnson. Illustrated by Chrystie Danzer. Engineering contributions to the book by Vladmir Apostolov, Rich Altmaier, Neil Bannister, François Barbou des Places, Ken Beck, Felix Blyakher, Laurie Costello, Mark Cruciani, Rupak Das, Alex Elder, Dave Ellis, Brian Gaffey, Philippe Gregoire, Gary Hagensen, Ryan Hankins, George Hyman, Dean Jansa, Erik Jacobson, John Keller, Dennis Kender, Bob Kierski, Chris Kirby, Ted Kline, Dan Knappe, Kent Koeninger, Linda Lait, Bob LaPreze, Jinglei Li, Yingping Lu, Steve Lord, Aaron Mantel, Troy McCorkell, LaNet Merrill, Terry Merth, Jim Nead, Nate Pearlstein, Bryce Petty, Dave Pulido, Alain Renaud, John Relph, Elaine Robinson, Dean Roehrich, Eric Sandeen, Yui Sakazume, Wesley Smith, Kerm Steffenhagen, Paddy Sreenivasan, Roger Strassburg, Andy Tran, Rebecca Underwood, Connie Woodward, Michelle Webster, Geoffrey Wehrman, Sammy Wilborn.

COPYRIGHT © 1999–2007 SGI. All rights reserved; provided portions may be copyright in third parties, as indicated elsewhere herein. No permission is granted to copy, distribute, or create derivative works from the contents of this electronic documentation in any manner, in whole or in part, without the prior written permission of SGI.

LIMITED RIGHTS LEGEND. The software described in this document is "commercial computer software" provided with restricted rights (except as to included open/free source) as specified in the FAR 52.227-19 and/or the DFAR 227.7202, or successive sections. Use beyond ...
  • Data Center Architecture and Topology
CENTRAL TRAINING INSTITUTE, JABALPUR. Data Center Architecture and Topology.

Data Center Architecture Overview. The data center is home to the computational power, storage, and applications necessary to support an enterprise business. The data center infrastructure is central to the IT architecture, from which all content is sourced or passes through. Proper planning of the data center infrastructure design is critical, and performance, resiliency, and scalability need to be carefully considered. Another important aspect of the data center design is flexibility in quickly deploying and supporting new services. Designing a flexible architecture that has the ability to support new applications in a short time frame can result in a significant competitive advantage. Such a design requires solid initial planning and thoughtful consideration in the areas of port density, access layer uplink bandwidth, true server capacity, and oversubscription, to name just a few.

The data center network design is based on a proven layered approach, which has been tested and improved over the past several years in some of the largest data center implementations in the world. The layered approach is the basic foundation of the data center design that seeks to improve scalability, performance, flexibility, resiliency, and maintenance. Figure 1-1 shows the basic layered design.

[Figure 1-1, Basic Layered Design: campus core, core, aggregation, and access layers; links of 10 Gigabit Ethernet and Gigabit Ethernet or EtherChannel, with backup paths.]

The layers of the data center design are the core, aggregation, and access layers. These layers are referred to extensively throughout this guide and are briefly described as follows:
• Core layer: provides the high-speed packet switching backplane for all flows going in and out of the data center. ...
  • Shared File Systems: Determining the Best Choice for Your Distributed SAS® Foundation Applications Margaret Crevar, SAS Institute Inc., Cary, NC
Paper SAS569-2017. Shared File Systems: Determining the Best Choice for your Distributed SAS® Foundation Applications. Margaret Crevar, SAS Institute Inc., Cary, NC.

ABSTRACT. If you are planning on deploying SAS® Grid Manager and SAS® Enterprise BI (or other distributed SAS® Foundation applications) with load-balanced servers on multiple operating system instances, a shared file system is required. In order to determine the best shared file system choice for a given deployment, it is important to understand how the file system is used, the SAS® I/O workload characteristics performed on it, and the stressors that SAS Foundation applications produce on the file system. For the purposes of this paper, we use the term "shared file system" to mean both a clustered file system and a shared file system, even though "shared" can denote a network file system and a distributed file system, not clustered.

INTRODUCTION. This paper examines the shared file systems that are most commonly used with SAS and reviews their strengths and weaknesses.

SAS GRID COMPUTING REQUIREMENTS FOR SHARED FILE SYSTEMS. Before we get into the reasons why a shared file system is needed for SAS® Grid Computing, let's briefly discuss the SAS I/O characteristics.

GENERAL SAS I/O CHARACTERISTICS. SAS Foundation creates a high volume of predominantly large-block, sequential access I/O, generally at block sizes of 64K, 128K, or 256K, and the interactions with data storage are significantly different from typical interactive applications and RDBMSs. Here are some major points to understand (more details about the bullets below can be found in this paper): SAS tends to perform large sequential reads and writes.
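The access pattern described above, sequential I/O in large fixed-size requests, can be sketched in a few lines of C. This is my illustration under an assumed 256 KB request size (one of the sizes the paper cites), not SAS code:

```c
/* Sketch of large-block, sequential reads: one big request after another,
   front to back, so the file system sees a predictable streaming workload. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define IO_BLOCK (256 * 1024)   /* assumed 256 KB request size */

int main(int argc, char **argv)
{
    if (argc < 2) { fprintf(stderr, "usage: %s <file>\n", argv[0]); return 1; }

    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    char *buf = malloc(IO_BLOCK);
    long long total = 0;
    ssize_t n;

    while ((n = read(fd, buf, IO_BLOCK)) > 0)   /* sequential, large requests */
        total += n;

    printf("read %lld bytes in %d-byte sequential requests\n", total, IO_BLOCK);
    free(buf);
    close(fd);
    return 0;
}
```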
  • Operating Systems Lecture #5: File Management
Operating Systems, Lecture #5: File Management. Written by David Goodwin, based on the lecture series of Dr. Dayou Li and the book Understanding Operating Systems (4th ed.) by I. M. Flynn and A. McIver McHoes (2006). Department of Computer Science and Technology, University of Bedfordshire. Operating Systems, 2013. 25th February 2013.

Outline
1 Introduction
2 Interaction with the file manager
3 Files
4 Physical storage allocation
5 Directories
6 File system
7 Access
8 Data compression
9 Summary

Introduction: responsibilities of the file manager
1 Keep track of where each file is stored.
2 Use a policy that will determine where and how the files will be stored, making sure to efficiently use the available storage space and provide efficient access to the files.
3 Allocate each file when a user has been cleared for access to it, and then record its use.
4 Deallocate the file when the file is to be returned to storage, and communicate its availability to others who may be waiting for it.

Definitions: a field is a group of related bytes that can be identified by the user with a name, type, and size.
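Fields of this kind group naturally into records. A minimal C illustration (mine, not from the lecture; the employee_record layout is hypothetical) of a record built from named, typed, sized fields:

```c
/* A "record" made of fields, each with a name, a type, and a size,
   matching the definition above. Purely illustrative. */
#include <stdio.h>

struct employee_record {   /* the record */
    char name[32];         /* field: 32-byte character string */
    int id;                /* field: integer identifier */
    double salary;         /* field: floating-point value */
};

int main(void)
{
    struct employee_record r = { "Ada Lovelace", 1815, 48000.0 };
    printf("%s (#%d) earns %.2f; record size = %zu bytes\n",
           r.name, r.id, r.salary, sizeof r);
    return 0;
}
```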