Zoned Namespaces Use Cases, Standard and Linux Ecosystem

SDC SNIA EMEA 20 | Javier González – Principal Software Engineer / R&D Center Lead

Why do we need a new interface?

SSDs are already mainstream:
• Great performance (bandwidth / latency) – combination of NAND + NVMe
• Easy to deploy – direct replacement for HDDs
• Acceptable price ($/GB)

But we have three recurrent problems, each with a mitigation path:
1. Log-on-log problem (WAF + OP) – remove redundancies, leverage log data structures, reduce OP and WAF
2. Cost gap with HDDs – higher bit count (QLC), reduce DRAM
3. Multi-tenancy everywhere – address the noisy-neighbor problem, provide QoS

[Figure: the log-on-log stack – log-structured applications (e.g., RocksDB) and log-structured file systems (e.g., F2FS, XFS, ZFS) each perform their own address mapping (L2L), data placement, garbage collection and metadata management, on top of a traditional SSD FTL that repeats the same work (L2P mapping, media management, I/O scheduling, data placement, GC, metadata management) behind a Read / Write / Trim / Hints interface.]

1. J. Yang, N. Plasson, G. Gillis, N. Talagala, and S. Sundararaman. Don't stack your log on my log. 2nd Workshop on Interactions of NVM/Flash with Operating Systems and Workloads (INFLOW), 2014.
2. https://blocksandfiles.com/2019/08/28/nearline-disk-drives-ssd-attack/

ZNS: Basic Concepts

Divide the LBA address space into fixed-size zones.
• Write constraints
  - Write strictly sequentially within a zone (write pointer)
  - Explicitly reset the write pointer
• No zone mapping constraints
  - Zone physical location is transparent to the host
  - Open design space for different device configurations
• Zone state machine (Empty, Implicitly Open, Explicitly Open, Closed, Full, Read Only, Offline), bounded by active and open resources
  - Zone transitions managed by the host, through direct commands or boundary crossings
  - Zone transitions triggered by the device, reported to the host through an exception

[Figure: zone mapping in the device – zones laid out across dies in different configurations – and LBA mapping within a zone, covering ZSLBA … ZSLBA + ZCAP - 1: strict sequential writes, write after reset.]

ZNS: Advanced Concepts – Append Command

• Implementation of nameless writes [1]
• Benefit: increase the queue depth within a single zone to improve parallelism
• Host changes
  - Changes needed to the block allocator in the application / file system
  - Full reimplementation of LBA-based features in the submission path (e.g., snapshotting)
• Write path: requires QD=1 per zone (reminder: NVMe gives no ordering guarantees)
• Append path: QD <= X, where X = #LBAs in the zone – the device places the data at the write pointer and the completion reports the assigned LBA (see the sketch after this section)

[Figure: write path vs. append path – the write path issues one command at a time at the write pointer; the append path keeps several append commands in the submission queue and the completions report the LBAs the device assigned.]

[1] Y. Zhang, L. P. Arulraj, A. C. Arpaci-Dusseau, and R. H. Arpaci-Dusseau. De-indirection for flash-based SSDs with nameless writes. FAST, 2012.
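To make the write-pointer constraint and the append semantics concrete, here is a minimal host-side model in C. It is only a sketch: the zone struct and the zone_write / zone_append helpers are illustrative names, not the NVMe data structures or any driver API. A regular write must target the write pointer exactly, which is what forces QD=1 per zone, while a zone append lets the device choose the placement and report the assigned LBA back, so many appends can be in flight at once.

    /* Toy model of the two I/O paths on a zone (illustrative only). */
    #include <stdint.h>
    #include <stdio.h>

    struct zone {
        uint64_t zslba;   /* first LBA of the zone */
        uint64_t zcap;    /* writable capacity in LBAs */
        uint64_t wp;      /* write pointer offset relative to zslba */
    };

    /* Regular write: rejected unless it lands exactly at the write pointer. */
    static int zone_write(struct zone *z, uint64_t slba, uint64_t nlb)
    {
        if (slba != z->zslba + z->wp || z->wp + nlb > z->zcap)
            return -1;                    /* out of order, or past capacity */
        z->wp += nlb;
        return 0;
    }

    /* Zone append: the host never names an LBA; the device (modeled here)
     * places the data at the write pointer and returns the assigned LBA. */
    static int zone_append(struct zone *z, uint64_t nlb, uint64_t *assigned_lba)
    {
        if (z->wp + nlb > z->zcap)
            return -1;                    /* zone would go past Full */
        *assigned_lba = z->zslba + z->wp;
        z->wp += nlb;
        return 0;
    }

    int main(void)
    {
        struct zone z = { .zslba = 0, .zcap = 128, .wp = 0 };
        uint64_t lba;

        zone_write(&z, 0, 8);             /* ok: targets the write pointer */
        printf("write at wp ok, wp=%llu\n", (unsigned long long)z.wp);
        if (zone_write(&z, 32, 8) < 0)    /* not at the write pointer */
            printf("out-of-order write rejected\n");
        zone_append(&z, 8, &lba);         /* device assigns the LBA */
        printf("append placed at LBA %llu, wp=%llu\n",
               (unsigned long long)lba, (unsigned long long)z.wp);
        return 0;
    }

This is also why the slides list host changes for append: the block allocator can no longer assume it knows the LBA at submission time, and any LBA-based feature in the submission path has to be reworked around the address returned in the completion.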
ZNS: Host Responsibilities

ZNS requires changes to the host.
• First step: changes to file systems
  - Advantages in log-structured file systems (e.g., F2FS, XFS, ZFS)
  - Still best-effort from the application's perspective
  - First important step for adoption
• Second step: changes to applications
  - Only makes sense for log-structured applications (e.g., LSM databases such as RocksDB)
  - Real benefits in terms of WAF, OP and data movement (GC)
  - Need a way to minimize application changes (spoiler alert: xNVMe)

[Figure: division of responsibilities – a traditional SSD FTL keeps sector mapping (L2P), media management, exception management, over-provisioning, I/O scheduling, garbage collection and metadata management behind a Read / Write / Trim / Hints interface. With a ZNS SSD the host takes over data placement, zone address mapping (L2L), exception handling, garbage collection and metadata management, while the device keeps a thin zone FTL (zone mapping (L2P), media management, exception management, I/O scheduling, zone metadata management) behind Read / Write / Append / Reset / Commit I/O and zone-management admin commands.]

ZNS Extensions: Zone Random Write Area (ZRWA)

Concept
• Expose the write buffer in front of a zone to the host

Benefits
• Enable in-place updates within the ZRWA
  - Reduces WAF due to metadata updates
  - Aligns with the metadata layout of many file systems
  - Simplifies recovery (no changes to the metadata layout)
• Allows writes at QD > 1
  - No changes needed to the host stack (as opposed to append)
  - ZRWA size is balanced against the expected performance

Operation
• Writes are placed into the ZRWA – no write pointer constraint
• In-place updates are allowed within the ZRWA window
• The ZRWA can be committed explicitly, through a dedicated command
• The ZRWA can be committed implicitly, when writing beyond sizeof(ZRWA)

[Figure: a ZRWA window sits in front of each zone's write pointer; reads, writes and in-place updates hit the window, and a commit (explicit or implicit) advances the write pointer.]

ZNS Extensions: Simple Copy

Concept
• Offload the copy operation to the SSD
• "Simple" copy, because SCSI XCOPY became too complex

Benefits
• Data is moved by the SSD controller
  - No data movement through the interconnect (e.g., PCIe)
  - No CPU cycles used for data movement
• Especially relevant for ZNS host-side GC (see the GC sketch after this section)

Operation
• The host forms and submits a Simple Copy command
  - Select a list of source LBA ranges
  - Select a destination as a single LBA + length
• The SSD moves the data from the sources to the destination
  - Error model complexity depends on the scope

Scope
• Under discussion in NVMe

[Figure: valid extents from Zone 1 and Zone 2 gathered into Zone 3 by a single Simple Copy command.]

ZNS: Archival Use Case

Goal: reduce TCO
• Reduce WAF
• Reduce OP
• Reduce DRAM usage
• Facilitate consumption of QLC

Configuration optimized for cold data
• Large zones
• Semi-immutable per-zone data (all or nothing)
• Minimize metadata and mapping tables
• Need for large QDs (Append or ZRWA)

Architecture properties
• Host GC → less WAF and OP
• Zone mapping table → less DRAM (break the 1 GB DRAM per 1 TB rule of thumb)
• 1 zone : 1 application, targeting low GC
• Large zones: large erase blocks, large I/O sizes, few "streams"; internal data placement and scheduling stay in the device

Adoption
• Tiering of cold storage
  - Denmark cold: ZNS SSDs
  - Finland cold: high-capacity SMR HDDs
  - Mars cold: tape
• Leverage the SMR ecosystem
  - Build on top of the same abstractions
  - Expand the use cases

ZNS: I/O Predictability Use Case

Goal: support multi-tenant workloads
• Inherit the archival benefits: ↓WAF, ↓OP and ↓DRAM
• Support QoS policies
  - Provide isolation properties across zones
  - Enable host I/O scheduling
  - Allow more control over I/O submission / completion

Configuration optimized for multi-tenancy
• Zones grouped in "isolation domains"
• Host data placement / striping
• Host I/O scheduling

Architecture properties (inherited from the archival case)
• Host GC → less WAF and OP
• Zone mapping table → less DRAM (break the 1 GB DRAM per 1 TB rule of thumb)
• 1 isolation domain : 1 application, with the host managing striping and targeting low GC
• Small zones: small erase blocks, variable I/O sizes, tens to hundreds of "streams"; data placement and I/O scheduling move to the host
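Both use cases hinge on host-side GC, and Simple Copy is what keeps that GC cheap. The following C sketch shows the shape of one GC pass under assumed, hypothetical structures (zone_info, extent, and the stubbed move_extents back-end are not any real API): pick the full zone with the least valid data, relocate its live extents into the currently open zone (one Simple Copy command if the device supports it, host read+append otherwise), then reset the victim.

    /* Host-side GC pass (sketch; structures and back-ends are illustrative). */
    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <stdio.h>

    #define MAX_EXTENTS 64

    struct extent { uint64_t slba, nlb; };        /* one valid source range */

    struct zone_info {
        uint64_t zslba, zcap;
        uint64_t valid;                           /* valid LBAs left in zone */
        bool     full;
    };

    /* Validity tracking lives in application metadata (e.g., an LSM manifest);
     * this stub pretends the whole valid region is one extent. */
    static size_t collect_valid_extents(const struct zone_info *v,
                                        struct extent *out, size_t max)
    {
        (void)max;
        out[0].slba = v->zslba;
        out[0].nlb  = v->valid;
        return v->valid ? 1 : 0;
    }

    /* Data movement back-end. With Simple Copy this would be one command and
     * the data never crosses the interconnect; the fallback is read+append.
     * Both are stubbed here; a real implementation issues NVMe commands. */
    static int move_extents(const struct extent *src, size_t n,
                            uint64_t dst_slba, bool have_simple_copy)
    {
        printf("%s: %zu extent(s) -> LBA %llu\n",
               have_simple_copy ? "simple-copy" : "read+append",
               n, (unsigned long long)dst_slba);
        return 0;
    }

    /* Zone Reset returns the victim to Empty and rewinds its write pointer. */
    static int reset_zone(struct zone_info *z)
    {
        z->full = false;
        z->valid = 0;
        return 0;
    }

    /* One GC pass: greedily pick the full zone with the least valid data
     * (cheapest to reclaim), relocate its live extents, reset it. */
    static int gc_one_zone(struct zone_info *zones, size_t nr,
                           uint64_t open_zone_slba, bool have_simple_copy)
    {
        struct zone_info *victim = NULL;
        struct extent ext[MAX_EXTENTS];

        for (size_t i = 0; i < nr; i++)
            if (zones[i].full && (!victim || zones[i].valid < victim->valid))
                victim = &zones[i];
        if (!victim)
            return 0;                             /* nothing to reclaim */

        size_t n = collect_valid_extents(victim, ext, MAX_EXTENTS);
        if (move_extents(ext, n, open_zone_slba, have_simple_copy))
            return -1;
        return reset_zone(victim);
    }

    int main(void)
    {
        struct zone_info zones[2] = {
            { .zslba = 0,   .zcap = 128, .valid = 16, .full = true },
            { .zslba = 128, .zcap = 128, .valid = 96, .full = true },
        };
        return gc_one_zone(zones, 2, 256, true);
    }

Because validity tracking and victim selection live in the application, the device no longer needs over-provisioning or a full per-LBA mapping table for this data, which is where the WAF, OP and DRAM savings in the two use cases come from.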
Linux Stack: Zoned Block Framework

Purpose
• Built for SMR drives
• Add support for the write constraints
• Expose zoned devices to applications through the block layer (see the ioctl sketch after this section)

Status
• Merged in Linux 4.9
• Block layer / drivers
  - Support in the block layer
  - Support in SCSI / ATA
  - Support in null_blk
• File systems
  - F2FS
  - Btrfs (patches posted)
  - ZFS / XFS (ongoing)
• Libraries
  - libzbc
• Tools
  - util-linux, fio

[Figure: the zoned block stack – applications and libzbc in user space; zoned-aware file systems (e.g., F2FS, Btrfs, ZFS) and dm-zoned on top of the zoned block device abstraction in the kernel; the block layer and SCSI / ATA drivers talking to SAS ZBC / SATA ZAC HDDs.]

Linux Stack: Adding ZNS Support

ZNS support in the zoned block framework
• Add support for ZNS-specific features (kernel & tools)
  - Append, ZRWA and Simple Copy
  - Async path for the new admin commands
• Relax assumptions inherited from SMR
  - Conventional zones
  - Zone sizes, state transitions, etc.

I/O paths
• Using a zoned file system (e.g., F2FS)
  - No changes to the application
• Using a zoned application
  - Support in the application backend
  - Explicit zone management
  - Data placement, scheduling and zone management in the application

[Figure: ZNS I/O paths – legacy applications go through a zoned file system (e.g., F2FS, XFS) and the zoned block layer (mq-deadline, NVMe driver); zoned applications (e.g., RocksDB with a ZNS environment) use the blkzoned IOCTLs, io_uring / libaio, or xNVMe / SPDK for reads, writes, appends, zone management and admin commands, with ZNS extensions split between kernel space and user space; tooling support in blktests, fio and nvme-cli.]
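On the kernel side, the zoned block framework exposes zones to user space through the ioctls in <linux/blkzoned.h>; util-linux's blkzone tool and libzbc wrap the same interface. The sketch below reports the first few zones of a device and resets any full sequential zone. It is a minimal example, not a tool: /dev/nvme0n1 is a placeholder path, and running it needs a zoned block device and sufficient privileges.

    /* Minimal zone report / reset through the kernel's zoned block ioctls. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/ioctl.h>
    #include <unistd.h>
    #include <linux/blkzoned.h>

    #define NR_ZONES 16   /* zones to request per BLKREPORTZONE call */

    int main(void)
    {
        int fd = open("/dev/nvme0n1", O_RDWR);   /* placeholder device path */
        if (fd < 0) { perror("open"); return 1; }

        /* The report buffer is a header followed by an array of zone entries. */
        size_t sz = sizeof(struct blk_zone_report) +
                    NR_ZONES * sizeof(struct blk_zone);
        struct blk_zone_report *rep = calloc(1, sz);
        if (!rep) { close(fd); return 1; }

        rep->sector   = 0;          /* start reporting from the first zone */
        rep->nr_zones = NR_ZONES;
        if (ioctl(fd, BLKREPORTZONE, rep) < 0) {
            perror("BLKREPORTZONE");
            free(rep);
            close(fd);
            return 1;
        }

        for (unsigned i = 0; i < rep->nr_zones; i++) {
            struct blk_zone *z = &rep->zones[i];
            printf("zone @%llu len %llu wp %llu cond %u\n",
                   (unsigned long long)z->start, (unsigned long long)z->len,
                   (unsigned long long)z->wp, z->cond);

            /* Reset any full sequential zone: rewinds its write pointer. */
            if (z->type != BLK_ZONE_TYPE_CONVENTIONAL &&
                z->cond == BLK_ZONE_COND_FULL) {
                struct blk_zone_range range = { .sector = z->start,
                                                .nr_sectors = z->len };
                if (ioctl(fd, BLKRESETZONE, &range) < 0)
                    perror("BLKRESETZONE");
            }
        }

        free(rep);
        close(fd);
        return 0;
    }

Zoned file systems such as F2FS use the in-kernel counterparts of these operations, which is why applications on top of them need no changes, while zoned-aware applications (or xNVMe backends) drive zone reporting, reset and placement themselves.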