An Application-Attuned Framework for Optimizing HPC Storage Systems
16 pages, PDF, 1020 KB
Recommended publications
-
Base Performance for BeeGFS with Data Protection
BeeGFS Data Integrity Improvements without Impacting Performance. Presenters: Dr. M. K. Jibbe, Technical Director, NetApp ESG; Joey Parnell, SW Architect, NetApp ESG. 2019 Storage Developer Conference.
Current problem:
• Lustre has been a common choice in the HPC market for years, but the acquisition of the Lustre intellectual property by a storage provider in 2018 has created a challenge for users who want to avoid vendor lock-in.
• Parallel file system needs are expanding beyond HPC, but most parallel file systems lack enterprise reliability.
• Every BeeGFS configuration lacks data protection at several locations: neither storage clients nor servers have cache protection in the event of kernel panics, software crashes, or power loss. Potential issues are:
• Data cached in the BeeGFS client itself
• Data cached at the underlying file system layer in the storage server (XFS, ZFS, ext4, etc.)
• Data cached by the OS kernel itself underneath the file system in the storage server
Goal of the team (remove competitor and expand the market reach of BeeGFS): improve solution resiliency to ensure data integrity by identifying potential component faults that could lead to data loss, and change the system design to handle those faults without any data integrity challenges; address the known market opportunity with a simple solution and support.
Data integrity and availability expectations in HPC (provide enterprise reliability with E-Series storage and BeeGFS): the market we serve goes beyond the HPC scratch file system use cases.
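The three cache layers listed above are exactly where unflushed writes can be lost on a crash. As a rough illustration (a minimal sketch, not NetApp's solution; the helper name and path are hypothetical), an application can force its own writes through those volatile layers with fsync(2):

```python
import os

def durable_write(path: str, data: bytes) -> None:
    # Write data, then force it past the client/kernel caches to stable storage.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        os.write(fd, data)
        os.fsync(fd)  # flush file data and metadata out of the OS page cache
    finally:
        os.close(fd)
    # fsync the parent directory so the new directory entry itself is durable.
    dfd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dfd)
    finally:
        os.close(dfd)

durable_write("/mnt/beegfs/results.dat", b"checkpoint data")
```

This covers the client- and kernel-level caches the slides enumerate, but not a storage controller that acknowledges writes from unprotected cache, which is the layer the presentation targets with enterprise storage hardware.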
-
Spring 2014 • Vol.
inside is published two times a year by the Gauss Centre for Supercomputing (HLRS, LRZ, JSC). Publishers: Prof. Dr. A. Bode | Prof. Dr. Dr. Th. Lippert | Prof. Dr.-Ing. Dr. h.c. Dr. h.c. M. M. Resch. Editor: F. Rainer Klank, HLRS, [email protected]. Design: Linda Weinmann, HLRS, [email protected]; Sebastian Zeeden, HLRS, [email protected]; Pia Heusel, HLRS, [email protected].
Editorial: Welcome to this new issue of inside, the journal on Innovative Supercomputing in Germany published by the Gauss Centre for Supercomputing (GCS). In this issue, there is a clear focus on applications. While the race for ever faster systems is accelerated by the challenge to build an Exascale system within a reasonable power budget, the increased sustained performance allows for the solution of ever more complex problems. The range of applications presented is impressive, covering a variety of different fields. Projects of an initiative of the German Research Society (DFG) start to create interesting results, as is shown in our project section. GCS has been a strong player in PRACE over the last years and in 2015 will continue to provide access to leading-edge systems for all European researchers. With respect to hardware, GCS is continuously following its roadmap. New systems will become available at HLRS in 2014 and at LRZ in 2015. System shipment for HLRS is planned to start on August 8, with a chance to start general operation as early as October 2014. We will report on further progress. As usual, this issue includes information about events in supercomputing in Germany over the last months and gives an outlook of workshops in the field.
-
Data Integration and Analysis – 10 Distributed Data Storage
706.520 Data Integration and Large-Scale Analysis – 10 Distributed Data Storage. Matthias Boehm, Graz University of Technology, Austria; Computer Science and Biomedical Engineering, Institute of Interactive Systems and Data Science; BMK endowed chair for Data Management. Last update: Dec 11, 2020.
Announcements/Org:
• #1 Video recording: link in TeachCenter & TUbe (lectures will be public); optional attendance (independent of COVID).
• #2 COVID-19 restrictions (HS i5): Corona traffic light RED → ORANGE (Dec 07); Webex lectures re-evaluated over the holidays.
• #3 Projects and exercises: 34x SystemDS projects, 11x exercise projects; final kickoff discussions last week.
• #4 Course evaluation: evaluation period Dec 15 – Jan 31; exam date TBD, but likely end of January.
Course outline, Part B: Large-Scale Data Management and Analysis: 08 Cloud Computing Fundamentals [Nov 27]; 09 Cloud Resource Management and Scheduling [Dec 04]; 10 Distributed Data Storage [Dec 11]; 11 Distributed Data-Parallel Computation [Jan 08]; 12 Distributed Stream Processing [Jan 15]; 13 Distributed Machine Learning Systems [Jan 22].
Agenda: motivation and terminology; object stores and distributed file systems; key-value stores and cloud DBMS. Overview distributed data storage: global …
-
GlobalFS: A Strongly Consistent Multi-Site File System
GlobalFS: A Strongly Consistent Multi-Site File System. Leandro Pacheco (University of Lugano), Raluca Halalai (University of Neuchâtel), Valerio Schiavoni (University of Neuchâtel), Fernando Pedone (University of Lugano), Etienne Rivière (University of Neuchâtel), Pascal Felber (University of Neuchâtel).
Abstract: This paper introduces GlobalFS, a POSIX-compliant geographically distributed file system. GlobalFS builds on two fundamental building blocks, an atomic multicast group communication abstraction and multiple instances of a single-site data store. We define four execution modes and show how all file system operations can be implemented with these modes while ensuring strong consistency and tolerating failures. We describe the GlobalFS prototype in detail and report on an extensive performance assessment. We have deployed GlobalFS across all EC2 regions and show that the system scales geographically, providing performance comparable to other state-of-the-art distributed file systems for local commands and allowing for strongly consistent operations over the whole system.
From the introduction: … consistency, availability, and tolerance to partitions. Our goal is to ensure strongly consistent file system operations despite node failures, at the price of possibly reduced availability in the event of a network partition. Weak consistency is suitable for domain-specific applications where programmers can anticipate and provide resolution methods for conflicts, or work with last-writer-wins resolution methods. Our rationale is that for general-purpose services such as a file system, strong consistency is more appropriate as it is both more intuitive for the users and does not require human intervention in case of conflicts. Strong consistency requires ordering commands across replicas, which needs coordination among nodes at geographically distributed sites (i.e., regions). Designing strongly consistent distributed systems that provide good …
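To see why the authors rule out weak consistency for a general-purpose file system, it helps to look at what last-writer-wins actually does. The sketch below is an illustration under assumed per-replica wall clocks, not code from GlobalFS: concurrent updates are resolved silently by timestamp, so one writer's data is discarded without any error.

```python
import time

class LWWRegister:
    # Last-writer-wins register: each write carries a timestamp and, on
    # conflict, the highest timestamp silently wins. Synchronized per-replica
    # wall clocks are an assumption of this sketch.
    def __init__(self):
        self.value = None
        self.ts = 0.0

    def write(self, value, ts=None):
        ts = time.time() if ts is None else ts
        if ts >= self.ts:  # a newer (or equally timed) write overwrites
            self.value, self.ts = value, ts

    def merge(self, other: "LWWRegister"):
        # Reconcile two replicas after a partition heals: one update is lost.
        self.write(other.value, other.ts)

a, b = LWWRegister(), LWWRegister()
a.write("edit from region A", ts=100.0)
b.write("edit from region B", ts=101.0)
a.merge(b)
print(a.value)  # "edit from region B" -- region A's edit is silently dropped
```

Acceptable for some domain-specific stores, but surprising in a POSIX file system, which is the paper's rationale for paying the coordination cost of strong consistency instead.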
-
Towards Transparent Data Access with Context Awareness
Computer Science • 19(2) 2018, https://doi.org/10.7494/csci.2018.19.2.2844
Michał Wrzeszcz, Renata G. Słota, Jacek Kitowski: TOWARDS TRANSPARENT DATA ACCESS WITH CONTEXT AWARENESS
Abstract: Open-data research is an important factor accelerating the production and analysis of scientific results as well as worldwide collaboration; still, very little data is being shared at scale. The aim of this article is to analyze existing data-access solutions along with their usage limitations. After analyzing the existing solutions and data-access stakeholder needs, the authors propose their own vision of a data-access model.
Keywords: distributed, transparent data access, distributed data storage, context awareness
Citation: Computer Science 19(2) 2018: 201–221
1. Introduction: Data access and management can be analyzed on several levels, from personal data to globally distributed shared data, with different problems to be overcome by data-access and management tools. The simplest case is access to local data, i.e., data stored on direct-attached storage (DAS), where the focus is on providing the device drivers and solutions that use the hardware optimally. The next level is to provide data access for a group of users working for a single organization, e.g., provision of network-attached storage (NAS). The problems encountered in this case pertain to possible network failures, higher latencies [36], or the simultaneous work of many users influencing the quality of service [48, 59]. Balancing the use of the available storage systems to maintain QoS and cost-effectiveness in an organization is also expected, along with request scheduling and resource allocation [28, 56].
-
The Leading Parallel Cluster File System, Developed with a Strong Focus on Performance and Designed for Very Easy Installation and Management
The Leading Parallel Cluster File System — www.thinkparq.com, www.beegfs.io
ABOUT BEEGFS
What is BeeGFS: BeeGFS (formerly FhGFS) is the leading parallel cluster file system, developed with a strong focus on performance and designed for very easy installation and management. If I/O-intensive workloads are your problem, BeeGFS is the solution.
Why use BeeGFS: BeeGFS transparently spreads user data across multiple servers. By increasing the number of servers and disks in the system, you can simply scale the performance and capacity of the file system to the level that you need, seamlessly from small clusters up to enterprise-class systems with thousands of nodes.
Get the most out of your data: The flexibility, robustness, and outstanding performance of BeeGFS help our customers around the globe to increase productivity by delivering results faster and by enabling new data analysis methods that were not possible without the advantages of BeeGFS.
KEY ASPECTS
Maximum scalability: BeeGFS offers maximum performance and scalability on various levels. It supports distributed file contents with flexible striping across storage servers on a per-file or per-directory basis, as well as distributed metadata.
Maximum flexibility: BeeGFS supports a wide range of Linux distributions such as RHEL/Fedora, SLES/OpenSUSE, or Debian/Ubuntu, as well as a wide range of Linux kernels, from ancient 2.6.18 up to the …
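The striping described under "Maximum scalability" is easy to picture with a toy chunk-to-target mapping. The sketch below assumes plain round-robin placement; the chunk size, target names, and helper function are illustrative, not BeeGFS internals:

```python
def chunk_target(stripe_targets, chunk_size, offset):
    # Map a byte offset in a file to the storage target holding that chunk,
    # assuming simple round-robin striping across the file's target list.
    chunk_index = offset // chunk_size
    return stripe_targets[chunk_index % len(stripe_targets)]

targets = ["storage01", "storage02", "storage03", "storage04"]
chunk = 1 << 20  # 1 MiB chunks
# A read at offset 5 MiB lands in chunk 5; round-robin places it on target 1:
print(chunk_target(targets, chunk, 5 * (1 << 20)))  # -> storage02
```

Because consecutive chunks land on different servers, a single large file can be read or written at the aggregate bandwidth of all its stripe targets.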
-
Maximizing the Performance of Scientific Data Transfer by Optimizing the Interface Between Parallel File Systems and Advanced Research Networks
Maximizing the Performance of Scientific Data Transfer by Optimizing the Interface Between Parallel File Systems and Advanced Research Networks. Nicholas Mills (Holcombe Department of Electrical and Computer Engineering, Clemson University), F. Alex Feltus (Department of Genetics and Biochemistry, Clemson University), Walter B. Ligon III (Holcombe Department of Electrical and Computer Engineering, Clemson University).
Abstract: The large amount of time spent transferring experimental data in fields such as genomics is hampering the ability of scientists to generate new knowledge. Often, computer hardware is capable of faster transfers, but sub-optimal transfer software and configurations are limiting performance. This work seeks to serve as a guide to identifying the optimal configuration for performing genomics data transfers. A wide variety of tests narrow in on the optimal data transfer parameters for parallel data streaming across Internet2 and between two CloudLab clusters loading real genomics data onto a parallel file system. The best throughput was found to occur with a configuration using GridFTP with at least 5 parallel TCP streams and a 16 MiB TCP socket buffer size, transferring to/from 4–8 BeeGFS parallel file system nodes connected by InfiniBand.
Keywords: Software Defined Networking, High Throughput Computing, DNA sequence, Parallel Data Transfer, Parallel File System, Data Intensive Science
1. Introduction: Solving scientific problems on a high-performance computing (HPC) cluster will happen faster by taking full advantage of specialized infrastructure such as parallel file systems and advanced software-defined networks. … of over 2,500 human genomes from 26 populations have been determined (The 1000 Genomes Project [3]); genome sequences have been produced for 3,000 rice varieties from 89 countries (The 3000 Rice Genomes Project [4]). These raw datasets are but a few examples that aggregate into …
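The 16 MiB socket-buffer finding translates directly into socket configuration. As a minimal sketch (generic Python, not the paper's GridFTP setup), this is how a transfer tool would request buffers of that size; note the kernel may clamp the values to its configured maxima:

```python
import socket

BUF_SIZE = 16 * 1024 * 1024  # 16 MiB, the buffer size the study found optimal

def make_tuned_socket() -> socket.socket:
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Ask the kernel for large send/receive buffers. On Linux the effective
    # size is capped by net.core.wmem_max / net.core.rmem_max, so always
    # verify the result with getsockopt() afterwards.
    s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, BUF_SIZE)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, BUF_SIZE)
    return s

s = make_tuned_socket()
print("SO_SNDBUF:", s.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF))
```

A tool such as GridFTP then opens several such connections in parallel (at least five in the study's best configuration) and stripes the file across them to keep a high bandwidth-delay-product path full.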
-
CS162 Lecture 19: File Systems (Con't), MMAP, Buffer Cache
CS162 Operating Systems and Systems Programming, Lecture 19: File Systems (Con't), MMAP, Buffer Cache. April 11th, 2019, Prof. Ion Stoica, http://cs162.eecs.Berkeley.edu
Recall: A Little Queuing Theory: Some Results
• Assumptions: system in equilibrium; no limit to the queue; time between successive arrivals is random and memoryless. A queue with arrival rate λ feeds a server with service rate μ = 1/Tser.
• Parameters that describe our system: λ: mean number of arriving customers/second; Tser: mean time to service a customer ("m1"); C: squared coefficient of variance = σ²/m1²; μ: service rate = 1/Tser; u: server utilization (0 ≤ u ≤ 1): u = λ/μ = λ × Tser.
• Parameters we wish to compute: Tq: time spent in queue; Lq: length of queue = λ × Tq (by Little's law).
• Results: memoryless service distribution (C = 1), called an M/M/1 queue: Tq = Tser × u/(1 – u); general service distribution (no restrictions), 1 server, called an M/G/1 queue: Tq = Tser × ½(1 + C) × u/(1 – u).
Recall: FAT Properties
• A file is a collection of disk blocks (Microsoft calls them "clusters").
• The FAT is an array of integers mapped 1-1 with disk blocks. Each integer is either: a pointer to the next block in the file; an end-of-file flag; or a free-block flag.
• File number …
FAT Assessment – Issues
• Time to find a block (large files)?
• Block layout for a file?
• Sequential access?
• Random access?
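The two queueing results are easy to sanity-check numerically. A small sketch (plain Python, not from the lecture materials; time units of seconds assumed) evaluates both formulas:

```python
def tq_mm1(t_ser, lam):
    # M/M/1: Tq = Tser * u / (1 - u), with utilization u = lambda * Tser
    u = lam * t_ser
    assert 0 <= u < 1, "queue is unstable once utilization reaches 1"
    return t_ser * u / (1 - u)

def tq_mg1(t_ser, lam, c):
    # M/G/1: Tq = Tser * (1 + C)/2 * u / (1 - u)
    u = lam * t_ser
    assert 0 <= u < 1, "queue is unstable once utilization reaches 1"
    return t_ser * 0.5 * (1 + c) * u / (1 - u)

# Example: 10 ms mean service time, 80 requests/s => u = 0.8
print(tq_mm1(0.010, 80))     # 0.04 s spent queueing
print(tq_mg1(0.010, 80, 1))  # C = 1 reduces M/G/1 to the M/M/1 result
```

Note the nonlinearity: pushing utilization from 0.8 to 0.9 more than doubles the queueing delay (0.04 s to 0.09 s), which is why disks and servers are rarely run near saturation.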
-
855400cd46b6c9e7683b5ace58a55234.pdf
Ars Technica UK. Encryption options are great, but Apple's attitude on checksums is still funky. By Adam H. Leventhal - Jun 26, 2016 1:00pm UTC. [Photo caption: Two hours or so of WWDC keynoting and Tim Cook didn't mention a new file system once?] This article was originally published on Adam Leventhal's blog in multiple parts.
Apple announced a new file system that will make its way into all of its OS variants (macOS, tvOS, iOS, watchOS) in the coming years. Media coverage to this point has been mostly breathless elongations of Apple's developer documentation. With a dearth of detail, I decided to attend the presentation and Q&A with the APFS team at WWDC. Dominic Giampaolo and Eric Tamura, two members of the APFS team, gave an overview to a packed room; along with other members of the team, they patiently answered questions later in the day. With those data points and some first-hand usage, I wanted to provide an overview and analysis both as a user of Apple-ecosystem products and as a long-time operating system and file system developer. The overview is divided into several sections. I'd encourage you to jump around to topics of interest or skip right to the conclusion (or to the tweet summary). Highest praise goes to encryption; ire to data integrity.
-
Recent Filesystem Optimisations in FreeBSD
Recent Filesystem Optimisations in FreeBSD. Ian Dowse <[email protected]>, Corvil Networks; David Malone <[email protected]>, CNRI, Dublin Institute of Technology.
Abstract: In this paper we summarise four recent optimisations to the FFS implementation in FreeBSD: soft updates, dirpref, vmiodir and dirhash. We then give a detailed exposition of dirhash's implementation. Finally we study these optimisations under a variety of benchmarks and look at their interactions. Under micro-benchmarks, combinations of these optimisations can offer improvements of over two orders of magnitude. Even real-world workloads see improvements by a factor of 2–10.
1 Introduction: Over the last few years a number of interesting filesystem optimisations have become available under FreeBSD. In this paper we have three goals.
2.1 Soft Updates: Soft updates is one solution to the problem of keeping on-disk filesystem metadata recoverably consistent. Traditionally, this has been achieved by using synchronous writes to order metadata updates. However, the performance penalty of synchronous writes is high. Various schemes, such as journaling or the use of NVRAM, have been devised to avoid them [14]. Soft updates, proposed by Ganger and Patt [4], allows the reordering and coalescing of writes while maintaining consistency. Consequently, some operations which have traditionally been durable on system call return are no longer so. However, any applications requiring synchronous updates can still use fsync(2) to force specific changes to be fully committed to disk. The implementation of soft updates is relatively complicated, involving tracking of dependencies and the roll forward/back of …
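Of the four optimisations, dirhash is the easiest to picture: instead of scanning a directory's blocks linearly on every lookup, an in-memory hash table over the directory's entries is built on first access. The sketch below is a toy illustration of that idea, with a hypothetical (offset, name) entry layout rather than FreeBSD's actual data structures:

```python
def build_dirhash(entries):
    # Build an in-memory hash table mapping each name in a directory to the
    # byte offset of its on-disk entry, so lookups avoid a linear scan.
    return {name: offset for offset, name in entries}

# A directory as a list of (offset, name) pairs, as if read from its blocks:
entries = [(0, "."), (12, ".."), (24, "README"), (40, "src"), (52, "Makefile")]
index = build_dirhash(entries)
print(index["Makefile"])  # -> 52: jump straight to the entry, O(1) per lookup
```

The payoff grows with directory size: a name lookup in a directory of n entries drops from O(n) block scanning to a single hash probe, which is where the paper's micro-benchmark gains on large directories come from.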
-
Toward Automatic Context-Based Attribute Assignment for Semantic File Systems
Toward automatic context-based attribute assignment for semantic file systems. Craig A. N. Soules, Gregory R. Ganger. CMU-PDL-04-105, June 2004. Parallel Data Laboratory, Carnegie Mellon University, Pittsburgh, PA 15213-3890.
Abstract: Semantic file systems enable users to search for files based on attributes rather than just pre-assigned names. This paper develops and evaluates several new approaches to automatically generating file attributes based on context, complementing existing approaches based on content analysis. Context captures broader system state that can be used to provide new attributes for files, and to propagate attributes among related files; context is also how humans often remember previous items [2], and so should fit the primary role of semantic file systems well. Based on our study of ten systems over four months, the addition of context-based mechanisms, on average, reduces the number of files with zero attributes by 73%. This increases the total number of classifiable files by over 25% in most cases, as is shown in Figure 1. Also, on average, 71% of the content-analyzable files also gain additional valuable attributes.
Acknowledgements: We thank the members and companies of the PDL Consortium (including EMC, Hewlett-Packard, HGST, Hitachi, IBM, Intel, LSI Logic, Microsoft, Network Appliance, Panasas, Seagate, Sun, and Veritas) for their interest, insights, feedback, and support. We thank IBM and Intel for hardware grants supporting our research efforts.
Keywords: semantic filesystems, context, attributes, classification
1 Introduction: As storage capacity continues to increase, the number of files belonging to an individual user has increased accordingly. Already, storage capacity has reached the point where there is little reason for a user to delete old content; in fact, the time required to do so would be wasted.
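The propagation idea in the abstract can be made concrete with a toy example: if the system observes that one file was derived from another (say, read and written by the same process), the derived file inherits the source's attributes. The sketch below illustrates this; the relation and the attribute sets are hypothetical, not the paper's mechanism:

```python
def propagate_attributes(attrs, derived_from):
    # Copy attributes from each source file to the files derived from it.
    # `derived_from` maps a file to the files it was observed to depend on
    # (e.g. read by the process that wrote it) -- an assumed relation here.
    result = {f: set(a) for f, a in attrs.items()}
    for target, sources in derived_from.items():
        for src in sources:
            result.setdefault(target, set()).update(result.get(src, set()))
    return result

attrs = {"proposal.tex": {"storage", "nsf"}, "figure1.eps": set()}
links = {"figure1.eps": ["proposal.tex"]}
print(propagate_attributes(attrs, links))
# figure1.eps gains {"storage", "nsf"} though content analysis found nothing
```

This is how context reduces the zero-attribute file count in the study: files that content analysis cannot classify (images, binaries) pick up attributes from the classifiable files they are related to.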
-
Dominic Giampaolo Interview
frontwheeldrive.com: Dominic Giampaolo interview. Anno Dominic: Dominic Giampaolo [by Tom Georgoulias].
Dominic Giampaolo isn't exactly a household name, but chances are high you've actually seen the benefits of his work. Dominic was the guy who tracked down a bug in the graphics driver during his days at SGI that was inhibiting a digital effects studio from getting their work completed on the movie Speed. He was also responsible for tracking down a gnarly bug in an eight-processor system that only occurred within a window of 12 nanoseconds. Dominic worked on various portions of the RealityEngine before moving to Be Inc. as a kernel engineer and file system architect for the BeOS. I met Dominic several years ago through a mutual friend and we've been corresponding ever since, discussing topics like music, traveling, and software on a weekly basis. Dominic recently left Be Inc. for a senior project management position at Google, so I felt it was time to put him under the frontwheeldrive spotlight and get his take on various events such as the rise of Linux and the Microsoft anti-trust trial.
frontwheeldrive: As I understand it, after your graduation you basically drove from Worcester, MA to San Jose, CA and landed a job at SGI. How did you like it there?
Dominic Giampaolo: I loved working at SGI.