Zoned Linux Ecosystem Overview


Linux Zoned Block Device Ecosystem: No Longer Exotic
Dmitry Fomichev, Western Digital Research, System Software Group
October 2019
© 2019 Western Digital Corporation or its affiliates. All rights reserved.

Outline
• Why zoned block devices (ZBD)?
  – SMR recording and zoned models
• Support in Linux: status overview
  – Standards and kernel
  – Application support
• Kernel support details
  – Overview, block layer, file systems and device-mapper, known problems
• Application support details
  – SG Tools, libzbc, fio, etc.
• ZNS
  – Why zones for flash? ZNS vs. ZBD, ZNS use cases
• Ongoing work and next steps

What are Zoned Block Devices? Zoned device access model
• The logical block addresses of the storage device are divided into ranges called zones
• The zone size is fixed and much larger than the LBA size
  – E.g. 256 MB on today's SMR disks
• Reads can be done in the usual manner
• Writes within a zone must be sequential: write commands advance the zone's write pointer position
• A zone must be reset before it can be rewritten: reset write pointer commands rewind the write pointer
• Zones are identified by their start LBA

What are Zoned Block Devices? Accommodating advanced recording technology
• Shingled Magnetic Recording (SMR) disks overlap tracks, unlike the discrete tracks of conventional PMR HDDs
  – Enables higher areal density
  – A wider write head produces a stronger field, enabling smaller grains and lower noise
  – Better sector erasure coding, more powerful data detection and recovery
• Zoned access
  – Random reads to the device are allowed
  – But writes within zones must be sequential
  – Zones can be rewritten from the beginning after a reset
  – Additional commands are needed
• Some zones of the device can still be PMR: conventional zones

What are Zoned Block Devices? Standardized zoned device models
• Drive firmware can be designed to alleviate and conceal zone write restrictions, but this comes at a cost
  – Garbage collection is necessary, resulting in lower performance

What are Zoned Block Devices? Mainstream technology?
• WDC capacity enterprise market exabyte growth expectation for FY 2019: meaningfully exceed 30% y/y
  – Competitors expect similar growth
• WDC: by 2023, half of the HDD capacity produced will be Host Managed SMR
• The T10 (SCSI), T13 (ATA) and SAT standards are now stable
  – The SCSI standard is called ZBC, the ATA standard ZAC; SAT is SCSI to ATA translation
  – Standards timeline (2013 to 2017): ZBC r00, then ZBC and SAT, then ZAC forwarded to INCITS
• Disk isn't dead, it has gone to the cloud

Host Managed Zoned Block Devices: What is needed for zoned operation?
• Functionality to read the current zone state (see the sketch below)
  – REPORT ZONES command for SCSI
  – The Zoned Device Information page of the IDENTIFY DEVICE log is read for ATA
• Functionality for resource management
  – Zone resources are limited (MaxOpen and MaxActive), so operations like open and close are needed to manage them
• Zone operations: new commands
  – OPEN ZONE (a zone can also be opened implicitly by a write)
  – CLOSE ZONE
  – FINISH ZONE (the write pointer becomes invalid!)
  – RESET ZONE
  – (The slide shows the zone state machine with zone conditions ZC1 through ZC7)
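Reading the zone state is exposed to Linux user space through the BLKREPORTZONE ioctl covered in the block layer API section later in this deck. The following is a minimal sketch of such a report; the device path /dev/sdX and the count of 16 zones are placeholders for illustration, not values taken from the presentation.

    /* Minimal sketch: read the state of the first zones of a host managed
     * device via the BLKREPORTZONE ioctl (kernel 4.10+). The device path
     * is a placeholder and error handling is kept to a minimum. */
    #include <fcntl.h>
    #include <linux/blkzoned.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/ioctl.h>
    #include <unistd.h>

    #define NR_ZONES 16   /* arbitrary number of zone descriptors to request */

    int main(void)
    {
            int fd = open("/dev/sdX", O_RDONLY);   /* hypothetical ZBC/ZAC disk */
            if (fd < 0) { perror("open"); return 1; }

            struct blk_zone_report *rep =
                    calloc(1, sizeof(*rep) + NR_ZONES * sizeof(struct blk_zone));
            rep->sector = 0;           /* start reporting from the zone at LBA 0 */
            rep->nr_zones = NR_ZONES;  /* room for this many zone descriptors */

            if (ioctl(fd, BLKREPORTZONE, rep) < 0) {
                    perror("BLKREPORTZONE");
                    return 1;
            }

            /* The kernel updates nr_zones to the number actually reported. */
            for (unsigned int i = 0; i < rep->nr_zones; i++) {
                    struct blk_zone *z = &rep->zones[i];
                    printf("zone %u: start %llu, len %llu, wp %llu, cond 0x%x\n",
                           i, (unsigned long long)z->start,
                           (unsigned long long)z->len,
                           (unsigned long long)z->wp, z->cond);
            }
            free(rep);
            close(fd);
            return 0;
    }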
I/O Path Support for Zoned Block Devices: The big picture
• User space: applications (blkzone, libzbc, fio, sg tools) reach the device through file access, block access, zoned block access or direct device access
• File systems: any file system on top of dm-zoned, or file systems with native zoned support (f2fs, zonefs)
• Device mapper: dm-linear, dm-flakey, dm-zoned
• Kernel block I/O layer and block I/O scheduler (deadline and mq-deadline zone write-locking since 4.16.0)
• SCSI generic and SCSI mid layer (sd driver scan code), SCSI low level drivers, HBA hardware, ZBC/ZAC disk

Application vs Kernel Support: Dependent on kernel version and HBA compliance
• Kernels pre-v4.10 ("vanilla" kernels from www.kernel.org; enterprise distribution kernels may backport features to a lower kernel version): no zoned block device support in the kernel
  – If the HBA exposes the device and has a functional ZBC SAT, application direct management is possible with SG_IO based, application specific support (sg3utils, libzbc); this is the minimum support space
• Kernels v4.10 onward (some current enterprise distributions may use older kernels that have out-of-date ZBD support): the kernel has zoned block device support, and SAT is optional as long as the HBA exposes the device
  – Application direct access (application support required): ioctl and regular system calls, libzbc, fio
  – Kernel based management, i.e. application indirect access (also legacy applications): file system, device mapper; this is the ideal support space
• In either case, if the HBA does not expose the device, the device is not usable

Linux Kernel Support Overview: ZBC and ZAC support timeline
• Initial work started with kernel version 3.18
• Full ZBC and ZAC command support was implemented in kernel version 4.10
  – Exposes Host Managed disks as block devices
  – API for REPORT ZONES and RESET WRITE POINTER
  – No kernel internal support for the OPEN, CLOSE and FINISH ZONE commands
• libATA supports ZBC to ZAC command translation of all commands
• From 5.0: scsi-mq only
• Timeline: 3.18 SG nodes support (TYPE_ZBC SCSI, TYPE_ZAC ATA); 4.10 zoned block device support; 4.13 dm-zoned device mapper; 5.0 scsi-mq support; 5.2 latest stable release at the time

Linux Kernel Support Overview: Kernel zoned write constraints for Host Managed devices
• The sequential write constraint is exposed to the drive user
  – File systems and applications MUST write sequentially to sequential zones (sketched below)
• Write ordering guarantees are implemented by limiting the per-zone write queue depth to 1
  – This also solves many HBA level command ordering problems, including AHCI
  – Implemented in the SCSI disk driver for kernels 4.10.0 to 4.15.x
  – Implemented with the "deadline" and "mq-deadline" schedulers since kernel 4.16.0
• mq-deadline is mandatory with kernels 5.0 and above (the legacy single queue I/O path was removed)
• Since 4.16, the sequential write constraint is enforced in the deadline scheduler; now it is the mq-deadline scheduler only
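In practice, honoring the sequential write constraint means the application issues each write at the zone's current write pointer; the per-zone queue depth of 1 preserves ordering once requests are dispatched, but computing the offset remains the caller's job. A minimal sketch, assuming the struct blk_zone came from a prior BLKREPORTZONE call and that the device was opened with O_DIRECT (these are assumptions for illustration, not details from the slides):

    /* Sketch: append one buffer at a zone's current write pointer.
     * 'z' is assumed to come from a BLKREPORTZONE call on the same fd
     * (see the earlier report sketch). wp is expressed in 512 B sectors.
     * fd is assumed to be open with O_DIRECT and buf/len aligned to the
     * device block size, so the page cache cannot reorder the writes. */
    #include <linux/blkzoned.h>
    #include <sys/types.h>
    #include <unistd.h>

    static ssize_t append_to_zone(int fd, const struct blk_zone *z,
                                  const void *buf, size_t len)
    {
            if (z->cond == BLK_ZONE_COND_FULL)
                    return -1;                     /* no room left in this zone */

            off_t off = (off_t)z->wp * 512;        /* next writable byte offset */
            return pwrite(fd, buf, len, off);      /* must land exactly at the wp */
    }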
Linux Kernel Support Overview: Why the deadline scheduler?
• It maintains shared state for all hardware queues
  – This simplifies the zone locking mechanism
  – Support for zoned block devices is only ~30 lines of code out of ~800
• Read and write queues are sorted by LBA
  – Creates favorable patterns for zone I/O
• It never reorders I/O requests
  – Merges are allowed as long as they don't cause a chunk boundary crossing
• Atomic bit operations are used to lock zones
  – Small memory footprint, since the number of zones can be very large
• Maintaining QD=1 with the zone lock may cause contention
  – It only comes into play with certain I/O patterns
  – An example appears later in the presentation

Linux Kernel Support - Core Functionalities: Block layer API
• A set of ioctls defined in include/uapi/linux/blkzoned.h (a combined usage sketch appears at the end of this section)
• A zone report is available via an ioctl call:

    /**
     * @BLKREPORTZONE: Get zone information. Takes a zone report as argument.
     *                 The zone report will start from the zone containing the
     *                 sector specified in the report request structure.
     */
    #define BLKREPORTZONE  _IOWR(0x12, 130, struct blk_zone_report)

• A zone reset command is also available as an ioctl call:

    /**
     * @BLKRESETZONE: Reset the write pointer of the zones in the specified
     *                sector range. The sector range must be zone aligned.
     */
    #define BLKRESETZONE   _IOW(0x12, 131, struct blk_zone_range)

• Kernel 4.19 introduced two additional ioctl calls:

    /**
     * @BLKGETZONESZ:  Get the device zone size in number of 512 B sectors.
     * @BLKGETNRZONES: Get the total number of zones of the device.
     */
    #define BLKGETZONESZ   _IOR(0x12, 132, __u32)
    #define BLKGETNRZONES  _IOR(0x12, 133, __u32)

Linux Kernel Support - Core Functionalities: SCSI layer
• Zone configuration information is printed as part of the disk scan process; kernel log messages:

    [    3.687797] scsi 5:0:0:0: Direct-Access-ZBC ATA HGST HSH721414AL TE8C PQ: 0 ANSI: 7
    [    3.696359] sd 5:0:0:0: Attached scsi generic sg4 type 20
    [    3.696485] sd 5:0:0:0: [sdd] Host-managed zoned block device
    [    3.865072] sd 5:0:0:0: [sdd] 27344764928 512-byte logical blocks: (14.0 TB/12.7 TiB)
    [    3.873046] sd 5:0:0:0: [sdd] 4096-byte physical blocks
    [    3.878343] sd 5:0:0:0: [sdd]
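Tying the block layer ioctls above together, here is a minimal sketch that queries the zone geometry with the 4.19 ioctls and then resets the write pointer of a single zone with BLKRESETZONE; the device path and the zone index are placeholders chosen for illustration.

    /* Sketch: query zone geometry (4.19+ ioctls) and reset one zone.
     * The device path and the zone index to reset are placeholders. */
    #include <fcntl.h>
    #include <linux/blkzoned.h>
    #include <stdio.h>
    #include <sys/ioctl.h>
    #include <unistd.h>

    int main(void)
    {
            int fd = open("/dev/sdX", O_RDWR);     /* hypothetical ZBC/ZAC disk */
            if (fd < 0) { perror("open"); return 1; }

            __u32 zone_sectors = 0, nr_zones = 0;
            if (ioctl(fd, BLKGETZONESZ, &zone_sectors) < 0 ||
                ioctl(fd, BLKGETNRZONES, &nr_zones) < 0) {
                    perror("zone geometry ioctl");
                    return 1;
            }
            printf("%u zones of %u sectors (512 B) each\n", nr_zones, zone_sectors);

            /* Reset the write pointer of zone 5 (arbitrary example). The range
             * must be zone aligned, so it is expressed in whole zones. */
            struct blk_zone_range range = {
                    .sector     = 5ULL * zone_sectors,
                    .nr_sectors = zone_sectors,
            };
            if (ioctl(fd, BLKRESETZONE, &range) < 0) {
                    perror("BLKRESETZONE");
                    return 1;
            }

            close(fd);
            return 0;
    }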