HP-Mapper: A High Performance Storage Driver for Docker Containers


Fan Guo(1), Yongkun Li(1), Min Lv(1), Yinlong Xu(1), John C.S. Lui(2)
(1) University of Science and Technology of China
(2) The Chinese University of Hong Kong

Outline
  1. Background & Motivation
  2. HP-Mapper Design
  3. Evaluation
  4. Conclusion

Container and Docker
  - Container
    - Process-level virtualization
      - Shares the host kernel
      - Namespaces, cgroups
    - Better performance
      - Fast deployment
      - Low resource usage
      - Near bare-metal performance
  - Docker
    - Most popular container engine
    - Extensively used in production
  [Figure: VM vs. container.]

Storage Management of Containers
  - Container images
    - Store all requirements for running the containers
    - Hierarchical, read-only, sharable
  - Storage drivers
    - Support cross-level lookup and copy-on-write (COW)
  [Figure: containers see a unified view through the Docker storage driver (cross-level lookup, copy-on-write). Each container has its own writable layer on top of shared images (Apache, nginx, emacs), which sit on a Debian base image.]

Docker Storage Drivers
  - File-based drivers
    - File-level COW
    - Share cached data
    - Overlay2, AUFS
  - Block-based drivers
    - Block-level COW (≥64KB)
    - Cannot share cached data
    - DeviceMapper, ZFS, BtrFS
  [Figure: file-based drivers map path→file across Layer3 (writable) and Layer2, Layer1 (read-only); block-based drivers map vblk→blk across the same layers.]

High COW Latency
  - File-based storage drivers incur large COW latency (especially for large files)
    - Incur large write overhead
    - Degrade write performance

  Copy-on-write latency (ms)
  Driver        Type         4KB   64KB  1MB   16MB
  DeviceMapper  Block-based  0.12  0.74  0.96  1.39
  BtrFS         Block-based  0.09  0.09  0.09  0.10
  Overlay2      File-based   1.99  2.49  7.14  61.7

Disk I/O Redundancy
  - Block-based drivers introduce many redundant I/Os when reading data from a shared file
    - Degrade I/O performance
    - Waste I/O bandwidth
  [Figure: total amount of data read during the startup of 64 containers, block-based vs. file-based drivers.]

Cache Redundancy
  - Both kinds of storage drivers generate a lot of redundant cached data
    - Block-based: multiple copies of the same data are read (the cache cannot be shared)
    - File-based: unchanged data in a file is also copied when performing copy-on-write

Motivation
  - Limitations of current storage drivers
    - Tradeoff between write and read performance
    - Low cache efficiency
  - Our goal: develop a new storage driver for Docker containers
    - Low COW overhead (i.e., high write performance)
    - High read I/O performance
    - High cache efficiency

HP-Mapper: A High Performance Storage Driver
  - Key idea: HP-Mapper works at the block level and manages physical blocks
    - Why block level: low COW overhead
    - Why physical blocks: able to detect redundant I/Os and redundant cached data
  - Three modules
    - Address mapper
    - I/O interceptor
    - Cache manager

Address Mapper
  - A tradeoff exists in block-based management
    - Large blocks: high COW latency
    - Small blocks: high lookup and storage overhead due to large metadata size
  - HP-Mapper uses a two-level mapping tree
    - Supports two different block sizes
    - Differentiates requests with on-demand allocation
      - New write: large sequential I/O (large block size)
      - COW: block size depends on the request size

Address Mapper (cont.)
  - Two-level mapping tree design
  [Figure: two-level mapping tree. Labels: Root of Level-1; VBN1 (index1, offset); VBN2 (index1, index2, offset); leaf values marked 1xx / 0xx (PBN or root); Root of Level-2; PBN.]
  - Storage efficiency: separated block placement + defragmentation

I/O Interceptor: Metadata Management
  - How to detect redundant I/O
    - Indexing with the physical block number (PBN)
  - PBN-indexing hash table
    - Entries are linked in a two-dimensional list (LRU)
      - Head entry (latest copy) + tail entries (other copies)
  [Figure: hash table buckets (001, 002, ...) pointing to head entries, each followed by its tail entries. Entry fields shown: u64 PBN (key), u16 copies_num, u64 Super_Block, u64 Inode_ID, void *next, void *tail.]

I/O Interceptor: Workflow
  - Detect redundant I/O with the PBN index
  - Avoid unnecessary checks (write I/Os to the writable layer)

Cache Manager
  - How to remove redundant copies in the cache
    - Periodically scan the hash table to locate cached pages
    - Maintain the hotness of each page (multiple LRU lists)
  - Page eviction: cache hit ratio vs. cache usage
    - Limit the number of copies: utilization-aware adjustment
    - Hotness-aware eviction
  [Figure: the cache manager locates and monitors the cached copies of each PBN through the hash table. With copies_limit = 3, PBN1 has Page1 (cold) and Page2 to Page4 (hot), and Page1 is evicted; PBN2 has Page5 (cold) and Page6 to Page9 (hot), and Page5 and Page9 are evicted. The scan then moves on to the next block.]

Experiment Setup
  - Prototype
    - Acts as a plugin module in Linux kernel 3.10.0
    - Backing file system: Ext4
  - Workloads
    - Container images: Tomcat, Nextcloud, Vault
  - Overhead of HP-Mapper
  [Figure: overhead of HP-Mapper (MT represents Mapping Tree, HT represents Hash Table).]

Reduction of COW Latency
  - Copy-on-write (COW) latency

  Copy-on-write latency (ms)
  Driver        4KB   64KB  1MB   16MB
  DeviceMapper  0.13  0.74  0.96  1.39
  BtrFS         0.09  0.09  0.09  0.10
  Overlay2      1.99  2.49  7.14  61.7
  HP-Mapper     0.07  0.12  0.55  0.57

  HP-Mapper reduces COW latency by up to more than 90% compared with DeviceMapper and Overlay2.

Reduction of Redundant I/Os
  - Total amount of data read/written when launching 64 containers from a single image
  HP-Mapper efficiently removes redundant read I/Os, and also reduces the amount of written data by more than 50% on average.

Improvement of Cache Efficiency
  - Cache usage when starting 64 containers from a single image
  HP-Mapper reduces cache usage by more than 65% on average.

Improvement of Startup Time
  - Total startup time when launching 64 containers from a single image on SSD/HDD
  HP-Mapper achieves up to 2.0× to 7.2× faster startup than the other three storage drivers.

Improvement of Startup Time (cont.)
  - Total startup time when launching 64 containers in memory-scarce systems
  [Figure: total startup time (s) vs. total available memory for DM, BtrFS, Overlay2 and HP-Mapper. Left panel: launch 64 Nextcloud containers (12GB, 8GB, 4GB). Right panel: launch 64 Vault containers (6GB, 4GB, 2GB).]
  HP-Mapper achieves a larger improvement as the memory size decreases.

Conclusion
  - Tradeoffs exist for Docker storage drivers
    - File-based: high COW overhead
    - Block-based: low cache efficiency and redundant I/O
  - We develop HP-Mapper, which achieves
    - Low COW overhead by following a block-based design with differentiated block sizes
    - High I/O efficiency by intercepting redundant I/Os
    - High cache efficiency by enabling cache sharing and hotness-aware management

Thanks! Q&A
Yongkun Li [email protected] http://staff.ustc.edu.cn/~ykli
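
Illustrative sketch (not from the slides): the I/O Interceptor and Cache Manager slides describe indexing cached pages by physical block number (PBN) so that a repeated read of the same block can be detected, and then limiting the number of cached copies per block with hotness-aware eviction. The Python below is only a minimal model of that idea under stated assumptions. The names PbnIndex, PbnEntry, CachedCopy, on_read and scan are hypothetical, the boolean hot flag stands in for the multiple LRU lists, and the actual prototype is a Linux kernel module whose code is not shown in the talk.

```python
from collections import OrderedDict
from dataclasses import dataclass, field
from typing import List


@dataclass
class CachedCopy:
    """One cached page holding a copy of a physical block."""
    super_block: int   # hypothetical id of the owning file system instance
    inode_id: int      # hypothetical id of the file the page belongs to
    hot: bool = True   # simplified hotness flag (assumption, not the real LRU lists)


@dataclass
class PbnEntry:
    """All cached copies currently known for one physical block number."""
    pbn: int
    copies: List[CachedCopy] = field(default_factory=list)


class PbnIndex:
    """Toy PBN-indexed table: detects redundant reads and trims extra copies."""

    def __init__(self, copies_limit=3):
        self.copies_limit = copies_limit
        self.table = OrderedDict()  # pbn -> PbnEntry, least recently used first

    def on_read(self, pbn, super_block, inode_id):
        """Record a read; return "cache" if a copy already exists, else "disk"."""
        entry = self.table.get(pbn)
        source = "cache" if entry is not None and entry.copies else "disk"
        if entry is None:
            entry = PbnEntry(pbn)
            self.table[pbn] = entry
        # The reader ends up with its own cached page in either case.
        entry.copies.append(CachedCopy(super_block, inode_id))
        self.table.move_to_end(pbn)  # mark this PBN as most recently used
        return source

    def scan(self):
        """Trim every PBN to copies_limit copies, evicting cold copies first.

        Returns the number of evicted copies.
        """
        evicted = 0
        for entry in self.table.values():
            excess = len(entry.copies) - self.copies_limit
            if excess > 0:
                # Stable sort: hot copies keep their order, cold copies go last.
                entry.copies.sort(key=lambda copy: not copy.hot)
                del entry.copies[-excess:]
                evicted += excess
        return evicted
```

In this model, five reads of the same PBN from different files reach the disk only once. A later scan with copies_limit = 3 drops the cold copy first and then the newest extra hot copy, which mirrors the eviction example in the Cache Manager figure.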