The Leading Parallel Cluster File System, Developed with a Strong Focus on Performance and Designed for Very Easy Installation and Management

www.thinkparq.com
www.beegfs.io

ABOUT BEEGFS

What is BeeGFS
BeeGFS (formerly FhGFS) is the leading parallel cluster file system, developed with a strong focus on performance and designed for very easy installation and management. If I/O intensive workloads are your problem, BeeGFS is the solution.

Why use BeeGFS
BeeGFS transparently spreads user data across multiple servers. By increasing the number of servers and disks in the system, you can simply scale performance and capacity of the file system to the level that you need, seamlessly from small clusters up to enterprise-class systems with thousands of nodes.

Get The Most Out Of Your Data
The flexibility, robustness, and outstanding performance of BeeGFS help our customers around the globe to increase productivity by delivering results faster and by enabling new data analysis methods that were not possible without the advantages of BeeGFS.

[Figure: System Architecture]

KEY ASPECTS

Maximum Scalability
BeeGFS offers maximum performance and scalability on various levels. It supports distributed file contents with flexible striping across storage servers on a per-file or per-directory basis as well as distributed metadata.

BeeGFS is optimized especially for use in environments where performance matters to provide:
• Best in class client throughput: 8 GB/s with only a single process streaming on a 100GBit network, while a few streams can fully saturate the network.
• Best in class metadata performance: Linear scalability through dynamic metadata namespace partitioning.
• Best in class storage throughput: BeeGFS servers allow flexible choice of underlying file system to perfectly fit the given storage hardware.

Maximum Flexibility
BeeGFS supports a wide range of Linux distributions such as RHEL/Fedora, SLES/OpenSuse or Debian/Ubuntu as well as a wide range of Linux kernels from ancient 2.6.18 up to the latest vanilla.

The storage services run on top of an existing local filesystem (such as xfs, zfs or others) using the normal POSIX interface and clients and servers can be added to an existing system without downtime.

BeeGFS supports multiple networks and dynamic failover in case one of the network connections is down.

BeeGFS Storage Pools make different types of storage devices available within the same namespace. By having SSDs and HDDs in different pools, pinning of a user project to the flash pool enables all-flash storage performance for the current project while still providing the advantage of the cost-efficient high capacity of spinning disks for other data.

BeeGFS client and server components can also run on the same physical machines. Thus, BeeGFS can turn a compute rack into a cost-efficient converged data processing and shared storage unit, eliminating the need for external storage resources and providing simplified management.

Maximum Usability
The BeeGFS server components are userspace daemons, while the client is a native kernel module that does not require any patches to the kernel itself. All BeeGFS components can be installed and updated without even rebooting the machine.

For installation and updates there are rpm/deb package repositories available; for the startup mechanism, easy-to-use system service scripts are provided.

BeeGFS was designed with easy administration in mind. The graphical administration and monitoring system enables dealing with typical management tasks in a simple and intuitive way, while everything is of course also available from a command line interface:
• Live load statistics, even for individual users
• Cluster installation
• Storage service management
• Health monitoring
• And more...

Excellent documentation helps to have the whole system up and running in one hour.

BEEGFS ON DEMAND

BeeOND
BeeOND (BeeGFS on demand) allows on the fly creation of a complete parallel file system instance on a given set of hosts with just one single command.

BeeOND was designed to integrate with cluster batch systems to create temporary parallel file system instances on a per-job basis on the internal SSDs of compute nodes, which are part of a compute job. Such BeeOND instances do not only provide a very fast and easy to use temporary buffer, but also can keep a lot of I/O load for temporary or random access files away from the global cluster storage.

BUDDY MIRRORING

Fault tolerance
BeeGFS storage servers are typically used with an underlying RAID to transparently handle disk errors. Using BeeGFS with shared storage is also possible to handle server failures. The built-in BeeGFS Buddy Mirroring approach goes even one step further by tolerating the loss of complete servers including all data on their RAID volumes - and that with commodity servers and shared-nothing hardware.

The built-in BeeGFS Buddy Mirroring automatically replicates data, handles storage server failures transparently for running applications and provides automatic self-healing when a server comes back online, efficiently resyncing only the files that have changed while the machine was offline.

BeeGFSv7: Newest features
• Storage pools
• Free space balancing when adding new hardware
• Metadata event logging enhancement
• fsck improvements
• Kernel 4.14, 4.15 support
• NIC handling during startup

More Features
We already talked about the BeeGFS key aspects scalability, flexibility and usability and what’s behind them. But there are way more features in BeeGFS:
• Runs on various platforms, such as x86, OpenPOWER, ARM, Xeon Phi and more
• Re-export through Samba or NFSv4 possible
• Support for user/group quota and ACLs
• Fair I/O option on user level to prevent a single user with multiple requests from stalling requests of other users
• Automatic network failover, e.g. if InfiniBand is down, BeeGFS automatically switches to Ethernet and back later
• Online file system sanity check that can analyze and repair while the system is in use
• Built-in benchmarking tools to help with optimal tuning for specific hardware and evaluate hardware capabilities
• Support for cloud deployment on e.g. Amazon EC2 or Microsoft Azure

Not in the list? Just get in touch and we’re happy to discuss all your questions.

BENCHMARKS

Metadata Operations
BeeGFS was designed for extreme scalability. In a testbed [1] with 20 servers and up to 640 client processes (32x the number of metadata servers), BeeGFS delivers a sustained file creation rate of more than 500,000 creates per second, making it possible to create one billion files in as little time as about 30 minutes.

Throughput Scalability
In the same testbed system with 20 servers, each equipped with a single node local performance of 1332 MB/s (write) and 1317 MB/s (read), and 160 client processes, BeeGFS demonstrates linear scaling to a sustained throughput of 25 GB/s - which is 94.7% of the maximum theoretical local write and 94.1% of the maximum theoretical local read throughput.
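The benchmark figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are hypothetical, and it assumes that the "maximum theoretical" throughput is simply the per-server local rate multiplied by the 20 servers, and that 1 GB/s equals 1000 MB/s.

```python
# Figures taken from the benchmark text; everything else is an assumption.
SERVERS = 20
LOCAL_WRITE_MB_S = 1332      # single-node local write performance
LOCAL_READ_MB_S = 1317       # single-node local read performance
CREATES_PER_SECOND = 500_000  # lower bound quoted for file creation


def aggregate_local_mb_s(per_server_mb_s, servers=SERVERS):
    """Assumed theoretical maximum: per-server rate times server count."""
    return per_server_mb_s * servers


def sustained_mb_s(per_server_mb_s, efficiency_percent, servers=SERVERS):
    """Throughput implied by a stated percentage of the theoretical maximum."""
    return aggregate_local_mb_s(per_server_mb_s, servers) * efficiency_percent / 100


def minutes_to_create(files, creates_per_second=CREATES_PER_SECOND):
    """Time needed to create `files` files at a constant creation rate."""
    return files / creates_per_second / 60


if __name__ == "__main__":
    print("write ceiling (MB/s):", aggregate_local_mb_s(LOCAL_WRITE_MB_S))
    print("read ceiling (MB/s):", aggregate_local_mb_s(LOCAL_READ_MB_S))
    print("94.7% of write ceiling (MB/s):", sustained_mb_s(LOCAL_WRITE_MB_S, 94.7))
    print("94.1% of read ceiling (MB/s):", sustained_mb_s(LOCAL_READ_MB_S, 94.1))
    print("minutes for one billion files:", minutes_to_create(10**9))
```

Under these assumptions the stated percentages correspond to roughly 25.2 GB/s and 24.8 GB/s, in line with the rounded 25 GB/s, and a rate of exactly 500,000 creates per second needs about 33 minutes for one billion files, so "about 30 minutes" fits a rate somewhat above that lower bound.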
[Figures: file creation rate in thousand creates per second versus number of metadata servers (1 to 20), and read and write throughput in GB/s versus number of storage servers (1 to 20). Labeled values: 539.7 thousand creates per second; 25.2 and 24.8 GB/s.]

[1] Benchmark System: 20 servers with 2x Intel Xeon X5660 @ 2.8 GHz, 48GB RAM running Scientific Linux 6.3, Kernel 2.6.32-279. Each server is equipped with 4x Intel 510 Series SSD (RAID 0) running ext4 as well as QDR Infiniband. Tests performed using BeeGFS version 2012.10.

LICENSING MODEL

Free to use & Open source
BeeGFS is free to use for self-supporting end users and can be downloaded directly from www.beegfs.io

Professional Support
If you are already using BeeGFS but you don’t have professional support yet, here are a few reasons to rethink this.

User Comments
“After many unplanned downtimes with our previous parallel filesystem, we moved to BeeGFS more than two years ago. Since then we didn't have any unplanned downtime for the filesystem anymore.”
Michael Hengst, University of Halle, Germany

“We are extremely happy with our 3.1PB BeeGFS installation on 30 servers. It is rock-solid.
