Buffered FUSE: Optimising the Android IO Stack for User-Level Filesystem

Total Page:16

File Type:pdf, Size:1020Kb

Buffered FUSE: Optimising the Android IO Stack for User-Level Filesystem Int. J. Embedded Systems, Vol. 6, Nos. 2/3, 2014 95 Buffered FUSE: optimising the Android IO stack for user-level filesystem Sooman Jeong and Youjip Won* Hanyang University, #507-2, Annex of Engineering Center, 17 Haengdang-dong, Sungdong-gu, Seoul, 133-791, South Korea E-mail: [email protected] E-mail: [email protected] *Corresponding author Abstract: In this work, we optimise the Android IO stack for user-level filesystem. Android imposes user-level filesystem over native filesystem partition to provide flexibility in managing the internal storage space and to maintain host compatibility. The overhead of user-level filesystem is prohibitively large and the native storage bandwidth is significantly under-utilised. We overhauled the FUSE layer in the Android platform and propose buffered FUSE (bFUSE) to address the overhead of user-level filesystem. The key technical ingredients of buffered FUSE are: 1) extended FUSE IO size; 2) internal user-level write buffer; 3) independent management thread which performs time-driven FUSE buffer synchronisation. With buffered FUSE, we examined the performances of five different filesystems and three disk scheduling algorithms in a combinatorial manner. With bFUSE on XFS filesystem using the deadline scheduling, we achieved the IO performance improvements of 470% and 419% in Android ICS and JB, respectively, over the existing smartphone device. Keywords: Android; storage; user-level filesystem; FUSE; write buffer; embedded systems. Reference to this paper should be made as follows: Jeong, S. and Won, Y. (2014) ‘Buffered FUSE: optimising the Android IO stack for user-level filesystem’, Int. J. Embedded Systems, Vol. 6, Nos. 2/3, pp.95–107. Biographical notes: Sooman Jeong received his BS degrees in Computer Science from Hanyang University, Seoul, Korea, in 2003. He worked for Samsung Electronics as a Software Engineer. He is currently working toward his MS degree in the Department of Computer Science, Hanyang University, Seoul, Korea. His research interests include optimising I/O stack of smartphones. Youjip Won received his BS and MS degrees in Computer Science from the Seoul National University, Korea, in 1990 and 1992, respectively. He received his PhD in Computer Science from the University of Minnesota, Minneapolis, in 1997. After receiving his PhD degree, he joined Intel as a Server Performance Analyst. Since 1999, he has been with the Department of Computer Science, Hanyang University, Seoul, Korea, as a Professor. His research interests include operating systems, file and storage subsystems, multimedia networking, and network traffic modelling and analysis. 1 Introduction An Android-based device is no longer a simple media player but has become a professional content creator. Smart devices have rapidly spread to the public and One piece of clear evidence for this trend is a recent expanded their territory into all home electronics including introduction of an Android-based digital camera with a high phones, TVs, cameras, and game consoles. As they have resolution sensor (up to 17 mega-pixels) (Galaxy Camera, become so closely related to our everyday lives, improving http://www.samsung.com/us/photography/galaxy-camera). the performance of these smart devices will have a In the future, smart devices including phones, TVs, and significant impact on society. A recent study on Android cameras are projected to acquire and play movies at smartphones suggests that storage is the biggest ultra-HD (3,840 × 2,160 @ 60 frames/sec) resolution. performance bottleneck (Kim et al., 2012a). Copyright © 2014 Inderscience Enterprises Ltd. 96 S. Jeong and Y. Won Android apps will require more IO bandwidth to handle We examined the behaviour of FUSE, native filesystem, such high definition multimedia contents. buffer cache, and IO scheduler for user partition accesses The Android platform partitions its storage into multiple and found that FUSE layer is the dominant source of logical partitions. There are two main storage partitions in overhead. FUSE entails significant overhead mostly due to the Android: data and sdcard. Data partition (/data) is used the following three reasons: to harbour text files, e.g., SQLite database tables, and sdcard 1 command fragmentation partition (/sdcard) harbours multimedia files, e.g., pictures, mp3 files and video clips. In recent smartphones, 2 excessive context switch both of these partitions share the same storage device. 3 loss of access correlation. Android storage has gone through several generations of evolution. In the early models, as the partition names We propose buffered FUSE (bFUSE) framework, which suggest, data partition was at internal NAND storage device effectively addresses the above mentioned three technical (eMMC) and sdcard partition was at external sdcard (HTC issues. bFUSE framework consists of: Dream, http://en.wikipedia.org/wiki/HTC_Dream). In the 1 extended FUSE IO unit later generation of Android-based smartphones (Samsung Galaxy S2, http://www.samsung.com/global/microsite/ 2 FUSE buffer managed by independent thread galaxys2/html/), data partition and sdcard partition resided 3 time driven FUSE buffer synchronisation. at the same physical device, internal NAND-based storage and each was allocated a ‘fixed’ size logical partition. Until The Android IO stack consists of FUSE, native filesystem this generation, the Android IO stack imposed EXT4 (EXT4), IO scheduler for block device (CFQ), and (Mathur et al., 2007) over data partition and VFAT underlying eMMC storage. We focus on finding the optimal (Microsoft Corporation, 2000) filesystem over sdcard combination in the IO stack design choices. We examined partition. VFAT filesystem is de facto standard in embedded the behaviour of five filesystems (EXT4, XFS, F2FS, multimedia devices, e.g., cameras, camcorders, TVs, etc. BTRFS, and NILFS2) and three IO schedulers provided by Therefore, for PC compatibility, it is mandatory that sdcard standard Linux kernel (CFQ, deadline, and NOOP) in a partition adopts VFAT filesystem despite its technical combinatorial manner. By using bFUSE and deadline-based deficiencies (Lim et al., 2013). However, this fixed IO scheduler and changing the filesystem to XFS, the IO partition-based storage configuration causes a significant performance (storage throughput) increases by 470% over underutilisation of the storage space. the baseline (existing smartphone). With this optimisation, Recent Android platform imposes user-level filesystem we are able to achieve 38 Mbyte/sec sequential write (FUSE) for sdcard partition instead of VFAT. This allows bandwidth and utilise 99% of the raw device bandwidth in sdcard to freely manoeuvre between the two respective the Android storage stack. partitions. The FUSE filesystem exists as a virtual The remainder of this paper is organised as follows. filesystem layer on top of EXT4, the native filesystem. Section 2 presents the background. The behaviour of the While this configuration provides flexibility in managing Android IO stack is analysed in Section 3. bFUSE is the internal storage space, it causes significant overhead on introduced in Section 4. Section 5 provides optimisations the FUSE, decreasing the performance of IO accesses to for the Android IO stack. Section 6 presents the results of user partition. For example, in Galaxy S3, sequential write our integration of the proposed schemes. Section 7 describes performance in FUSE imposed filesystem is one fourths of other works related to the study, and Section 8 concludes that in native EXT4 filesystem partition. this paper. The IO access to the Android storage can be categorised into two types: data partition accesses and sdcard partition accesses. While the IO accesses to both partitions suffer 2 Background from excessive overhead, the source of inefficiency differs vastly. For data partition accesses, majority of IOs are 2.1 Storage partitions for android-based created by SQLite operation and the inefficiency lies in the smartphones excessive filesystem journaling (Lee and Won, 2012; Jeong Android manages several filesystem partitions: /recovery, et al., 2013b). In this work, we focus on optimising the /boot, /cache, /system, /data and /sdcard. The Android IO stack for handling multimedia files. Sdcard /system partition contains Android-executable files and partition is the default storage partition for mp3, video files, pre-installed applications. The /cache partition is used for photographs, and user-downloaded documents, e.g., pdf updating firmware. The /boot and /recovery partitions files. We found that the IO accesses to sdcard partition are used for maintaining boot image. There are two types of suffer from significant overhead and the apps utilise only main partition: ‘/data’ and ‘/sdcard’. /data partition 25% of the raw NAND storage bandwidth. Buffered FUSE: optimising the Android IO stack for user-level filesystem 97 harbours SQLite tables, SQLite journals, and apps. /sdcard and user partitions are formatted with different filesystems, partition usually harbours pictures, video clips, and mp3 EXT4 and VFAT, respectively. Recently, the Android files, which are mostly user created contents. We call adopted FUSE to manage both data and user partitions /sdcard an ‘sdcard partition’ or a ‘user partition’. at the same physical partition (Google Galaxy Nexus, Typical accesses to the data partition (/data) and the http://www.google.com/nexus/4; Samsung Galaxy S3, user partition (/sdcard) are small (4 Kbyte) synchronous http://www.samsung.com/ae/microsite/galaxys3/en/
Recommended publications
  • Linux on the Road
    Linux on the Road Linux with Laptops, Notebooks, PDAs, Mobile Phones and Other Portable Devices Werner Heuser <wehe[AT]tuxmobil.org> Linux Mobile Edition Edition Version 3.22 TuxMobil Berlin Copyright © 2000-2011 Werner Heuser 2011-12-12 Revision History Revision 3.22 2011-12-12 Revised by: wh The address of the opensuse-mobile mailing list has been added, a section power management for graphics cards has been added, a short description of Intel's LinuxPowerTop project has been added, all references to Suspend2 have been changed to TuxOnIce, links to OpenSync and Funambol syncronization packages have been added, some notes about SSDs have been added, many URLs have been checked and some minor improvements have been made. Revision 3.21 2005-11-14 Revised by: wh Some more typos have been fixed. Revision 3.20 2005-11-14 Revised by: wh Some typos have been fixed. Revision 3.19 2005-11-14 Revised by: wh A link to keytouch has been added, minor changes have been made. Revision 3.18 2005-10-10 Revised by: wh Some URLs have been updated, spelling has been corrected, minor changes have been made. Revision 3.17.1 2005-09-28 Revised by: sh A technical and a language review have been performed by Sebastian Henschel. Numerous bugs have been fixed and many URLs have been updated. Revision 3.17 2005-08-28 Revised by: wh Some more tools added to external monitor/projector section, link to Zaurus Development with Damn Small Linux added to cross-compile section, some additions about acoustic management for hard disks added, references to X.org added to X11 sections, link to laptop-mode-tools added, some URLs updated, spelling cleaned, minor changes.
    [Show full text]
  • Dm-X: Protecting Volume-Level Integrity for Cloud Volumes and Local
    dm-x: Protecting Volume-level Integrity for Cloud Volumes and Local Block Devices Anrin Chakraborti Bhushan Jain Jan Kasiak Stony Brook University Stony Brook University & Stony Brook University UNC, Chapel Hill Tao Zhang Donald Porter Radu Sion Stony Brook University & Stony Brook University & Stony Brook University UNC, Chapel Hill UNC, Chapel Hill ABSTRACT on Amazon’s Virtual Private Cloud (VPC) [2] (which provides The verified boot feature in recent Android devices, which strong security guarantees through network isolation) store deploys dm-verity, has been overwhelmingly successful in elim- data on untrusted storage devices residing in the public cloud. inating the extremely popular Android smart phone rooting For example, Amazon VPC file systems, object stores, and movement [25]. Unfortunately, dm-verity integrity guarantees databases reside on virtual block devices in the Amazon are read-only and do not extend to writable volumes. Elastic Block Storage (EBS). This paper introduces a new device mapper, dm-x, that This allows numerous attack vectors for a malicious cloud efficiently (fast) and reliably (metadata journaling) assures provider/untrusted software running on the server. For in- volume-level integrity for entire, writable volumes. In a direct stance, to ensure SEC-mandated assurances of end-to-end disk setup, dm-x overheads are around 6-35% over ext4 on the integrity, banks need to guarantee that the external cloud raw volume while offering additional integrity guarantees. For storage service is unable to remove log entries documenting cloud storage (Amazon EBS), dm-x overheads are negligible. financial transactions. Yet, without integrity of the storage device, a malicious cloud storage service could remove logs of a coordinated attack.
    [Show full text]
  • SUSE BU Presentation Template 2014
    TUT7317 A Practical Deep Dive for Running High-End, Enterprise Applications on SUSE Linux Holger Zecha Senior Architect REALTECH AG [email protected] Table of Content • About REALTECH • About this session • Design principles • Different layers which need to be considered 2 Table of Content • About REALTECH • About this session • Design principles • Different layers which need to be considered 3 About REALTECH 1/2 REALTECH Software REALTECH Consulting . Business Service Management . SAP Mobile . Service Operations Management . Cloud Computing . Configuration Management and CMDB . SAP HANA . IT Infrastructure Management . SAP Solution Manager . Change Management for SAP . IT Technology . Virtualization . IT Infrastructure 4 About REALTECH 2/2 5 Our Customers Manufacturing IT services Healthcare Media Utilities Consumer Automotive Logistics products Finance Retail REALTECH Consulting GmbH 6 Table of Content • About REALTECH • About this session • Design principles • Different layers which need to be considered 7 The Inspiration for this Session • Several performance workshops at customers • Performance escalations at customer who migrated from UNIX (AIX, Solaris, HP-UX) to Linux • Presenting the experiences made at these customers in this session • Preventing the audience from performance degradation caused from: – Significant design mistakes – Wrong architecture assumptions – Having no architecture at all 8 Performance Optimization The False Estimation Upgrading server with CPUs that are 12.5% faster does not improve application
    [Show full text]
  • On the Performance Variation in Modern Storage Stacks
    On the Performance Variation in Modern Storage Stacks Zhen Cao1, Vasily Tarasov2, Hari Prasath Raman1, Dean Hildebrand2, and Erez Zadok1 1Stony Brook University and 2IBM Research—Almaden Appears in the proceedings of the 15th USENIX Conference on File and Storage Technologies (FAST’17) Abstract tions on different machines have to compete for heavily shared resources, such as network switches [9]. Ensuring stable performance for storage stacks is im- In this paper we focus on characterizing and analyz- portant, especially with the growth in popularity of ing performance variations arising from benchmarking hosted services where customers expect QoS guaran- a typical modern storage stack that consists of a file tees. The same requirement arises from benchmarking system, a block layer, and storage hardware. Storage settings as well. One would expect that repeated, care- stacks have been proven to be a critical contributor to fully controlled experiments might yield nearly identi- performance variation [18, 33, 40]. Furthermore, among cal performance results—but we found otherwise. We all system components, the storage stack is the corner- therefore undertook a study to characterize the amount stone of data-intensive applications, which become in- of variability in benchmarking modern storage stacks. In creasingly more important in the big data era [8, 21]. this paper we report on the techniques used and the re- Although our main focus here is reporting and analyz- sults of this study. We conducted many experiments us- ing the variations in benchmarking processes, we believe ing several popular workloads, file systems, and storage that our observations pave the way for understanding sta- devices—and varied many parameters across the entire bility issues in production systems.
    [Show full text]
  • Vytvoření Softwareově Definovaného Úložiště Pro Potřeby Ukládání a Sdílení Dat V Rámci Instituce a Jeho Zálohování Do Datových Úložišť Cesnet
    Slezská univerzita v Opavě Centrum informačních technologií Vysoká škola báňská – Technická univerzita Ostrava Fakulta elektrotechniky a informatiky Technická zpráva k projektu 529R1/2014 Vytvoření softwareově definovaného úložiště pro potřeby ukládání a sdílení dat v rámci instituce a jeho zálohování do datových úložišť Cesnet Řešitel: Ing. Jiří Sléžka Spoluřešitelé: Mgr. Jan Nosek, Ing. Pavel Nevlud, Ing. Marek Dvorský, Ph.D., Ing. Jiří Vychodil, Ing. Lukáš Kapičák Leden 2016 Obsah 1 Popis projektu...................................................................................................................................1 2 Cíle projektu.....................................................................................................................................1 3 Volba HW a SW komponent.............................................................................................................1 3.1 Hardware...................................................................................................................................1 3.2 GlusterFS..................................................................................................................................2 3.2.1 Replicated GlusterFS Volume...........................................................................................2 3.2.2 Distributed GlusterFS Volume..........................................................................................3 3.2.3 Striped Glusterfs Volume..................................................................................................3
    [Show full text]
  • SGI™ Propack 1.3 for Linux™ Start Here
    SGI™ ProPack 1.3 for Linux™ Start Here Document Number 007-4062-005 © 1999—2000 Silicon Graphics, Inc.— All Rights Reserved The contents of this document may not be copied or duplicated in any form, in whole or in part, without the prior written permission of Silicon Graphics, Inc. LIMITED AND RESTRICTED RIGHTS LEGEND Use, duplication, or disclosure by the Government is subject to restrictions as set forth in the Rights in Data clause at FAR 52.227-14 and/or in similar or successor clauses in the FAR, or in the DOD, DOE or NASA FAR Supplements. Unpublished rights reserved under the Copyright Laws of the United States. Contractor/ manufacturer is SGI, 1600 Amphitheatre Pkwy., Mountain View, CA 94043-1351. Silicon Graphics is a registered trademark and SGI and SGI ProPack for Linux are trademarks of Silicon Graphics, Inc. Intel is a trademark of Intel Corporation. Linux is a trademark of Linus Torvalds. NCR is a trademark of NCR Corporation. NFS is a trademark of Sun Microsystems, Inc. Oracle is a trademark of Oracle Corporation. Red Hat is a registered trademark and RPM is a trademark of Red Hat, Inc. SuSE is a trademark of SuSE Inc. TurboLinux is a trademark of TurboLinux, Inc. UNIX is a registered trademark in the United States and other countries, licensed exclusively through X/Open Company, Ltd. SGI™ ProPack 1.3 for Linux™ Start Here Document Number 007-4062-005 Contents List of Tables v About This Guide vii Reader Comments vii 1. Release Features 1 Feature Overview 2 Qualified Drivers 3 Patches and Changes to Base Linux Distributions 3 2.
    [Show full text]
  • Mobile Video Ecosystem and Geofencing for Licensed Content Delivery
    Mobile Video Ecosystem and Geofencing for Licensed Content Delivery TABLE OF CONTENTS Executive Summary ...................................................................................................................................... 2 1. Introduction ............................................................................................................................................... 4 2. Video Delivery Challenges and Optimizations .......................................................................................... 6 2.1 Delivery technology ............................................................................................................................. 7 2.2 Video Codecs ...................................................................................................................................... 9 2.3 Codec comparison ............................................................................................................................... 9 2.4 Congestion Control Algorithm Enhancements .................................................................................. 11 3. Managing Video Based on Network Load............................................................................................... 12 3.1 Radio Congestion Aware Function (RCAF) ...................................................................................... 12 3.2 Mobile Throughput Guidance ............................................................................................................ 13 3.3 Self Organizing
    [Show full text]
  • Performance Characteristics of VMFS and RDM Vmware ESX Server 3.0.1
    Performance Study Performance Characteristics of VMFS and RDM VMware ESX Server 3.0.1 VMware ESX Server offers three choices for managing disk access in a virtual machine—VMware Virtual Machine File System (VMFS), virtual raw device mapping (RDM), and physical raw device mapping. It is very important to understand the I/O characteristics of these disk access management systems in order to choose the right access type for a particular application. Choosing the right disk access management method can be a key factor in achieving high system performance for enterprise‐class applications. This study provides performance characterization of the various disk access management methods supported by VMware ESX Server. The goal is to provide data on performance and system resource utilization at various load levels for different types of work loads. This information offers you an idea of relative throughput, I/O rate, and CPU efficiency for each of the options so you can select the appropriate disk access method for your application. This study covers the following topics: “Technology Overview” on page 2 “Executive Summary” on page 2 “Test Environment” on page 2 “Performance Results” on page 5 “Conclusion” on page 10 “Configuration” on page 10 “Resources” on page 11 Copyright © 2007 VMware, Inc. All rights reserved. 1 Performance Characteristics of VMFS and RDM Technology Overview VMFS is a special high‐performance file system offered by VMware to store ESX Server virtual machines. Available as part of ESX Server, it is a distributed file system that allows concurrent access by multiple hosts to files on a shared VMFS volume.
    [Show full text]
  • Most Common Events and Errors in Actifio VDP 9.0.X
    Tech Brief Most Common Events and Errors in Actifio VDP 9.0.x The number of errors and warnings encountered by an Actifio appliance are displayed in the upper right-hand corner of the Actifio Desktop Dashboard. Click on the number to display a list of the events in the System Monitor service. This is a list of the most common Event IDs that you may see. You can find detailed solutions for these error codes are in ActifioNOW at: https://actifio.force.com/community/articles/Top_Solution/EventID-TopSolution Job failures can be caused by many errors. Each 43901 event message includes an error code and an error message. See the corresponding Error Code in Most Common Errors that Cause Events, Actifio VDP 9.0.x on page 9. Most Common Error and Warning Events, Actifio VDP 9.0.x Event ID Event Message What To Do 10019 System resource low The target Actifio pool (snapshot or dedup) is running out of space. See the AGM online help for a wealth of information on optimizing your storage pool usage and on optimizing image capture and dedup. 10026 A suitable MDisk or drive A quorum disk is a managed drive that contains a reserved area for for use as a quorum disk data, used exclusively for system management. Both Actifio nodes was not found. A FC path communicate on a constant basis with the three quorum disks from CDS nodes to a configured in the appliance. When one of the quorum disks is not target device has been available to any node, an error message is generated.
    [Show full text]
  • Opennebula 5.7 Deployment Guide Release 5.7.80
    OpenNebula 5.7 Deployment guide Release 5.7.80 OpenNebula Systems Jan 16, 2019 This document is being provided by OpenNebula Systems under the Creative Commons Attribution-NonCommercial- Share Alike License. THE DOCUMENT IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IM- PLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE DOCUMENT. i CONTENTS 1 Cloud Design 1 1.1 Overview.................................................1 1.2 Open Cloud Architecture.........................................2 1.3 VMware Cloud Architecture.......................................7 1.4 OpenNebula Provisioning Model.................................... 13 2 OpenNebula Installation 19 2.1 Overview................................................. 19 2.2 Front-end Installation.......................................... 19 2.3 MySQL Setup.............................................. 25 3 Node Installation 27 3.1 Overview................................................. 27 3.2 KVM Node Installation......................................... 28 3.3 LXD Node Installation.......................................... 35 3.4 vCenter Node Installation........................................ 37 3.5 Verify your Installation.......................................... 45 4 Authentication Setup 52
    [Show full text]
  • Network Interface Controller Drivers Release 20.08.0
    Network Interface Controller Drivers Release 20.08.0 Aug 08, 2020 CONTENTS 1 Overview of Networking Drivers1 2 Features Overview4 2.1 Speed capabilities.....................................4 2.2 Link status.........................................4 2.3 Link status event......................................4 2.4 Removal event.......................................5 2.5 Queue status event.....................................5 2.6 Rx interrupt.........................................5 2.7 Lock-free Tx queue....................................5 2.8 Fast mbuf free.......................................5 2.9 Free Tx mbuf on demand..................................6 2.10 Queue start/stop......................................6 2.11 MTU update........................................6 2.12 Jumbo frame........................................6 2.13 Scattered Rx........................................6 2.14 LRO............................................7 2.15 TSO.............................................7 2.16 Promiscuous mode.....................................7 2.17 Allmulticast mode.....................................8 2.18 Unicast MAC filter.....................................8 2.19 Multicast MAC filter....................................8 2.20 RSS hash..........................................8 2.21 Inner RSS..........................................8 2.22 RSS key update.......................................9 2.23 RSS reta update......................................9 2.24 VMDq...........................................9 2.25 SR-IOV...........................................9
    [Show full text]
  • OSD-Btrfs, a Novel Lustre OSD Based on Btrfs
    2015/04/15 OSD-Btrfs, A Novel Lustre OSD based on Btrfs Shuichi Ihara Li Xi DataDirect Networks, Inc DataDirect Networks, Inc ©2015 DataDirect Networks. All Rights Reserved. ddn.com Why Btrfs for Lustre? ► Lustre today uses ldiskfs(ext4) or, more recently, ZFS as backend file system ► Ldiskfs(ext4) • ldiskfs(ext4) is used widely and exhibits well-tuned performance and proven reliability • But, its technical limitations are increasingly apparent, it is missing important novel features, lacks scalability, while offline fsck is extremely time-consuming ► ZFS • ZFS is an extremely feature-rich file system • but, for use as Lustre OSD further performance optimization is required • Licensing and potential legal issues have prevented a broader distribution ► Btrfs • Btrfs shares many features and goals with ZFS • Btrfs is becoming sufficiently stable for production usage • Btrfs is available through the major mainstream Linux distributions • Pachless server support. Simple Lustre setup and easy maintenance 2 ©2015 DataDirect Networks. All Rights Reserved. ddn.com Attractive Btrfs Features for Lustre (1) ► Features that improve OSD internally • Integrated multiple device support: RAID 0/1/10/5/6 o RAID 0 results in less OSTs per OSS and simplifies management, while improving the performance characteristics of a single OSD o RAID 1/10/5/6 improves reliability of OSDs • Dynamic inode allocation o No need to worry about formatting the MDT with wrong option to cause insufficient inode number • crc32c checksums on data and metadata o Improves OSD reliability • Online filesystem check and very fast offline filesystem check (planed) o Drives are getting bigger, but not faster, which leads to increasing fsck times 3 ©2015 DataDirect Networks.
    [Show full text]