Betrfs: a Right-Optimized Write-Optimized File System

Total Page:16

File Type:pdf, Size:1020Kb

Betrfs: a Right-Optimized Write-Optimized File System BetrFS: A Right-Optimized Write-Optimized File System William Jannen, Jun Yuan, Yang Zhan, Amogh Akshintala, Stony Brook University; John Esmet, Tokutek Inc.; Yizheng Jiao, Ankur Mittal, Prashant Pandey, and Phaneendra Reddy, Stony Brook University; Leif Walsh, Tokutek Inc.; Michael Bender, Stony Brook University; Martin Farach-Colton, Rutgers University; Rob Johnson, Stony Brook University; Bradley C. Kuszmaul, Massachusetts Institute of Technology; Donald E. Porter, Stony Brook University https://www.usenix.org/conference/fast15/technical-sessions/presentation/jannen This paper is included in the Proceedings of the 13th USENIX Conference on File and Storage Technologies (FAST ’15). February 16–19, 2015 • Santa Clara, CA, USA ISBN 978-1-931971-201 Open access to the Proceedings of the 13th USENIX Conference on File and Storage Technologies is sponsored by USENIX BetrFS: A Right-Optimized Write-Optimized File System William Jannen, Jun Yuan, Yang Zhan, Amogh Akshintala, John Esmet∗, Yizheng Jiao, Ankur Mittal, Prashant Pandey, Phaneendra Reddy, Leif Walsh∗, Michael Bender, Martin Farach-Colton†, Rob Johnson, Bradley C. Kuszmaul‡, and Donald E. Porter Stony Brook University, ∗Tokutek Inc., †Rutgers University, and ‡Massachusetts Institute of Technology Abstract (microwrites). Examples include email delivery, creat- The Bε -tree File System, or BetrFS, (pronounced ing lock files for an editing application, making small “better eff ess”) is the first in-kernel file system to use a updates to a large file, or updating a file’s atime. The un- write-optimized index. Write optimized indexes (WOIs) derlying problem is that many standard data structures in are promising building blocks for storage systems be- the file-system designer’s toolbox optimize for one case cause of their potential to implement both microwrites at the expense of another. and large scans efficiently. Recent advances in write-optimized indexes (WOI) [4, Previous work on WOI-based file systems has shown 8–10, 23, 27, 28] are exciting because they have the po- promise but has also been hampered by several open tential to implement both efficient microwrites and large problems, which this paper addresses. For example, scans. The key strength of the best WOIs is that they can FUSE issues many queries into the file system, su- ingest data up to two orders of magnitude faster than B- perimposing read-intensive workloads on top of write- trees while matching or improving on the B-tree’s point- intensive ones, thereby reducing the effectiveness of and range-query performance [4, 9]. WOIs. Moving to an in-kernel implementation can ad- WOIs have been successful in commercial key-value dress this problem by providing finer control of reads. stores and databases [2,3,12,17,20,33,34], and previous This paper also contributes several implementation tech- research on WOIs in file systems has shown promise [15, niques to leverage kernel infrastructure without throttling 25, 31]. However, progress towards a production-quality write performance. write-optimized file system has been hampered by sev- Our results show that BetrFS provides good perfor- eral open challenges, which we address in this paper: mance for both arbitrary microdata operations, which in- • Code complexity. A production-quality WOI can eas- clude creating small files, updating metadata, and small ily be 50,000 lines of complex code, which is difficult writes into large or small files, and for large sequen- to shoehorn into an OS kernel. Previous WOI file sys- tial I/O. On one microdata benchmark, BetrFS pro- tems have been implemented in user space. vides more than 4 the performance of ext4 or XFS. × • FUSE squanders microwrite performance. BetrFS is an ongoing prototype effort, and requires ad- FUSE [16] issues a query to the underlying file ditional data-structure tuning to match current general- system before almost every update, superimposing purpose file systems on some operations such as deletes, search-intensive workloads on top of write-intensive directory renames, and large sequential writes. Nonethe- workloads. Although WOIs are no worse for point less, many applications realize significant performance queries than any other sensible data structure, writes improvements. For instance, an in-place rsync of the are much faster than reads, and injecting needless Linux kernel source realizes roughly 1.6–22 speedup × point queries can nullify the advantages of write over other commodity file systems. optimization. • Mapping file system abstractions onto a WOI. We 1 Introduction cannot realize the full potential performance benefits of write-optimization by simply dropping in a WOI Today’s applications exhibit widely varying I/O patterns, as a replacement for a B-tree. The schema and use making performance tuning of a general-purpose file sys- of kernel infrastructure must exploit the performance tem a frustrating balancing act. Some software, such advantages of the new data structure. as virus scans and backups, demand large, sequential This paper describes the Bε -tree File System, or scans of data. Other software requires many small writes BetrFS, the first in-kernel file system designed to take USENIX Association 13th USENIX Conference on File and Storage Technologies (FAST ’15) 301 full advantage of write optimization. Specifically, Update-in-place file systems [11, 32] optimize for BetrFS is built using the mature, well-engineered Bε - scans by keeping related items, such as entries in a di- tree implementation from Tokutek’s Fractal Tree index, rectory or consecutive blocks in a file, near each other. called TokuDB [33]. Our design is tailored to the perfor- However, since items are updated in place, update per- mance characteristics of Fractal Tree indexes, but other- formance is often limited by the random-write latency of wise uses them as a black-box key-value store, so many the underlying disk. of our design decisions may be applicable to other write- optimized file systems. B-tree-based file systems store related items logically Experiments show that BetrFS can give up to an order- adjacent in the B-tree, but B-trees do not guarantee that of-magnitude improvement to the performance of file logically-adjacent items will be physically adjacent. As creation, random writes to large files, recursive directory a B-tree ages, leaves become scattered across the disk traversals (such as occur in find, recursive greps, back- due to node splits from insertions and node merges from ups, and virus scans), and meta-data updates (such as up- deletions. In an aged B-tree, there is little correlation be- dating file atime each time a file is read). tween the logical and physical order of the leaves, and the The contributions of this paper are: cost of reading a new leaf involves both the data-transfer • The design and implementation of an in-kernel, write- cost and the seek cost. If leaves are too small to amor- optimized file system. tize the seek costs, then range queries can be slow. The • A schema, which ensures both locality and fast writes, seek costs can be amortized by using larger leaves, but for mapping VFS operations to a write-optimized in- this further throttles update performance. dex. At the other extreme, logging file systems [5,7,26,29, • A design that uses unmodified OS kernel infrastruc- 30] optimize for writes. Logging ensures that files are ture, designed for traditional file systems, yet mini- created and updated rapidly, but the resulting data and mizes the impact on write optimization. For instance, metadata can be spread throughout the log, leading to BetrFS uses the OS kernel cache to accelerate reads poor performance when reading data or metadata from without throttling writes smaller than a disk sector. disk. These performance problems are particularly no- • A thorough evaluation of the performance of the ticeable in large scans (recursive directory traversals and BetrFS prototype. For instance, our results show backups) that cannot be accelerated by caching. almost two orders of magnitude improvement on a small-write microbenchmark and a 1.5 speedup on × The microwrite bottleneck creates problems for a applications such as rsync range of applications. HPC checkpointing systems gen- ε Our results suggest that a well-designed B -tree-based erate so many microwrites that a custom file system, file system can match or outperform traditional file sys- PLFS, was designed to efficiently handle them [5] by tems on almost every operation, some by an order of exploiting the specifics of the checkpointing workload. magnitude. Comparisons with state-of-the-art file sys- Email servers often struggle to manage large sets of tems, such as ext4, XFS, zfs, and btrfs support this small messages and metadata about those messages, such claim. We believe that the few slower operations are not as the read flag. Desktop environments store prefer- fundamental to WOIs, but can be addressed with a com- ences and active state in a key-value store (i.e., a reg- bination of algorithmic advances and engineering effort istry) so that accessing and updating keys will not re- in future work. quire file-system-level microdata operations. Unix and Linux system administrators commonly report 10–20% performance improvements by disabling the atime op- 2 Motivation and Background tion [14]; maintaining the correct atime behavior in- duces a heavy microwrite load, but some applications re- This section summarizes background on the increasing quire accurate atime values. importance of microwrites, explains how WOIs work, and summarizes previous WOI-based file systems. Microwrites cause performance problems even when the storage system uses SSDs. In a B-tree-based file sys- 2.1 The microwrite problem tem, small writes trigger larger writes of entire B-tree nodes, which can further be amplified to an entire erase A microwrite is a write operation where the setup time block on the SSD. In a log-structured file system, mi- (i.e.
Recommended publications
  • FUJITSU AI Zinrai Deep Learning System 200M/210H Software License Terms
    FUJITSU AI Zinrai Deep Learning System 200M/210H Software License Terms Copyright 2021 FUJITSU LIMITED P3KD-1962-01EN 1. NVIDIA Product License Terms This product uses the following NVIDIA products. Name Version Tesla Driver for Linux x64 460.73.01 cuDNN 8.2.2 These NVIDIA products can be downloaded from the NVIDIA home page. http://www.nvidia.com/page/home.html Regarding the terms and conditions of the license for Tesla Driver for Linux x64 460.73.01, see "END USER LICENSE AGREEMENTS (EULA)" in the Appendix. Regarding the terms and conditions of the license for cuDNN 8.2.2, see "NVIDIA SOFTWARE LICENSE AGREEMENT" in the Appendix. 2. NGC Software This product contains the open source software ("OSS") as described in this document. The license terms for each OSS are as described below. • PyTorch 1.9.0 From PyTorch: Copyright (c) 2016- Facebook, Inc (Adam Paszke) Copyright (c) 2014- Facebook, Inc (Soumith Chintala) Copyright (c) 2011-2014 Idiap Research Institute (Ronan Collobert) Copyright (c) 2012-2014 Deepmind Technologies (Koray Kavukcuoglu) Copyright (c) 2011-2012 NEC Laboratories America (Koray Kavukcuoglu) Copyright (c) 2011-2013 NYU (Clement Farabet) Copyright (c) 2006-2010 NEC Laboratories America (Ronan Collobert, Leon Bottou, Iain Melvin, Jason Weston) Copyright (c) 2006 Idiap Research Institute (Samy Bengio) Copyright (c) 2001-2004 Idiap Research Institute (Ronan Collobert, Samy Bengio, Johnny Mariethoz) From Caffe2: Copyright (c) 2016-present, Facebook Inc. All rights reserved. All contributions by Facebook: Copyright (c) 2016 Facebook Inc. All contributions by Google: Copyright (c) 2015 Google Inc. All rights reserved. All contributions by Yangqing Jia: Copyright (c) 2015 Yangqing Jia All rights reserved.
    [Show full text]
  • Embedded Linux Systems with the Yocto Project™
    OPEN SOURCE SOFTWARE DEVELOPMENT SERIES Embedded Linux Systems with the Yocto Project" FREE SAMPLE CHAPTER SHARE WITH OTHERS �f, � � � � Embedded Linux Systems with the Yocto ProjectTM This page intentionally left blank Embedded Linux Systems with the Yocto ProjectTM Rudolf J. Streif Boston • Columbus • Indianapolis • New York • San Francisco • Amsterdam • Cape Town Dubai • London • Madrid • Milan • Munich • Paris • Montreal • Toronto • Delhi • Mexico City São Paulo • Sidney • Hong Kong • Seoul • Singapore • Taipei • Tokyo Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and the publisher was aware of a trademark claim, the designations have been printed with initial capital letters or in all capitals. The author and publisher have taken care in the preparation of this book, but make no expressed or implied warranty of any kind and assume no responsibility for errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of the use of the information or programs contained herein. For information about buying this title in bulk quantities, or for special sales opportunities (which may include electronic versions; custom cover designs; and content particular to your business, training goals, marketing focus, or branding interests), please contact our corporate sales depart- ment at [email protected] or (800) 382-3419. For government sales inquiries, please contact [email protected]. For questions about sales outside the U.S., please contact [email protected]. Visit us on the Web: informit.com Cataloging-in-Publication Data is on file with the Library of Congress.
    [Show full text]
  • MC-1200 Series Linux Software User's Manual
    MC-1200 Series Linux Software User’s Manual Version 1.0, November 2020 www.moxa.com/product © 2020 Moxa Inc. All rights reserved. MC-1200 Series Linux Software User’s Manual The software described in this manual is furnished under a license agreement and may be used only in accordance with the terms of that agreement. Copyright Notice © 2020 Moxa Inc. All rights reserved. Trademarks The MOXA logo is a registered trademark of Moxa Inc. All other trademarks or registered marks in this manual belong to their respective manufacturers. Disclaimer Information in this document is subject to change without notice and does not represent a commitment on the part of Moxa. Moxa provides this document as is, without warranty of any kind, either expressed or implied, including, but not limited to, its particular purpose. Moxa reserves the right to make improvements and/or changes to this manual, or to the products and/or the programs described in this manual, at any time. Information provided in this manual is intended to be accurate and reliable. However, Moxa assumes no responsibility for its use, or for any infringements on the rights of third parties that may result from its use. This product might include unintentional technical or typographical errors. Changes are periodically made to the information herein to correct such errors, and these changes are incorporated into new editions of the publication. Technical Support Contact Information www.moxa.com/support Moxa Americas Moxa China (Shanghai office) Toll-free: 1-888-669-2872 Toll-free: 800-820-5036 Tel: +1-714-528-6777 Tel: +86-21-5258-9955 Fax: +1-714-528-6778 Fax: +86-21-5258-5505 Moxa Europe Moxa Asia-Pacific Tel: +49-89-3 70 03 99-0 Tel: +886-2-8919-1230 Fax: +49-89-3 70 03 99-99 Fax: +886-2-8919-1231 Moxa India Tel: +91-80-4172-9088 Fax: +91-80-4132-1045 Table of Contents 1.
    [Show full text]
  • Algorithms and Complexity (AL)
    Algorithms and Complexity (AL) Algorithms are fundamental to computer science and software engineering. The real-world performance of any software system depends on the algorithms chosen and the suitability of the various layers of implementation. Good algorithm design is therefore crucial for the performance of all software systems. Moreover, the study of algorithms provides insight into the intrinsic nature of the problem as well as possible solution techniques independent of programming language, programming paradigm, computer hardware, or any other implementation aspect. An important part of computing is the ability to select algorithms appropriate to particular purposes and to apply them, recognizing the possibility that no suitable algorithm may exist. This facility relies on understanding the range of algorithms that address an important set of well-defined problems, recognizing their strengths and weaknesses, and their suitability in particular contexts. Efficiency is a pervasive theme throughout this area. This knowledge area defines the central concepts and skills required to design, implement, and analyze algorithms for solving problems. Algorithms are essential in all advanced areas of computer science: artificial intelligence, databases, distributed computing, graphics, networking, operating systems, programming languages, security, and so on. Algorithms that have specific utility in each of these are listed in the relevant knowledge areas. Cryptography, for example, appears in the new Knowledge Area on Information Assurance and Security (IAS), while parallel and distributed algorithms appear in the Knowledge Area in Parallel and Distributed Computing (PD). As with all knowledge areas, the order of topics and their groupings do not necessarily correlate to a specific order of presentation. Different programs will teach the topics in different courses and should do so in the order they believe is most appropriate for their students.
    [Show full text]
  • Embedded Linux Training
    Free Electrons Embedded Linux training Gregory Clement Thomas Petazzoni Michael Opdenacker Free Electrons. Kernel, drivers and embedded Linux development, consulting, training and support. http//free-electrons.com Rights to copy © Copyright 2004-2011, Free Electrons [email protected] Electronic version of this document available on http://free-electrons.com/doc/training/embedded-linux Updates will be available on http://free-electrons.com/doc/training/embedded-linux/ Attribution ± ShareAlike 3.0 Corrections, suggestions, You are free contributions and translations are welcome! to copy, distribute, display, and perform the work to make derivative works Latest update: Feb 14, 2011 to make commercial use of the work Under the following conditions Attribution. You must give the original author credit. Share Alike. If you alter, transform, or build upon this work, you may distribute the resulting work only under a license identical to this one. For any reuse or distribution, you must make clear to others the license terms of this work. Any of these conditions can be waived if you get permission from the copyright holder. Your fair use and other rights are in no way affected by the above. License text: http://creativecommons.org/licenses/by-sa/3.0/legalcode Free Electrons. Kernel, drivers and embedded Linux development, consulting, training and support. http//free-electrons.com Linux kernel Linux device drivers Free Electrons Board support code Our services Mainstreaming kernel code Kernel debugging Custom Development System integration
    [Show full text]
  • Assignment of Master's Thesis
    CZECH TECHNICAL UNIVERSITY IN PRAGUE FACULTY OF INFORMATION TECHNOLOGY ASSIGNMENT OF MASTER’S THESIS Title: Approximate Pattern Matching In Sparse Multidimensional Arrays Using Machine Learning Based Methods Student: Bc. Anna Kučerová Supervisor: Ing. Luboš Krčál Study Programme: Informatics Study Branch: Knowledge Engineering Department: Department of Theoretical Computer Science Validity: Until the end of winter semester 2018/19 Instructions Sparse multidimensional arrays are a common data structure for effective storage, analysis, and visualization of scientific datasets. Approximate pattern matching and processing is essential in many scientific domains. Previous algorithms focused on deterministic filtering and aggregate matching using synopsis style indexing. However, little work has been done on application of heuristic based machine learning methods for these approximate array pattern matching tasks. Research current methods for multidimensional array pattern matching, discovery, and processing. Propose a method for array pattern matching and processing tasks utilizing machine learning methods, such as kernels, clustering, or PSO in conjunction with inverted indexing. Implement the proposed method and demonstrate its efficiency on both artificial and real world datasets. Compare the algorithm with deterministic solutions in terms of time and memory complexities and pattern occurrence miss rates. References Will be provided by the supervisor. doc. Ing. Jan Janoušek, Ph.D. prof. Ing. Pavel Tvrdík, CSc. Head of Department Dean Prague February 28, 2017 Czech Technical University in Prague Faculty of Information Technology Department of Knowledge Engineering Master’s thesis Approximate Pattern Matching In Sparse Multidimensional Arrays Using Machine Learning Based Methods Bc. Anna Kuˇcerov´a Supervisor: Ing. LuboˇsKrˇc´al 9th May 2017 Acknowledgements Main credit goes to my supervisor Ing.
    [Show full text]
  • UNIVERSIDAD SAN FRANCISCO DE QUITO USFQ Optimizing Large
    UNIVERSIDAD SAN FRANCISCO DE QUITO USFQ Colegio de Ciencias e Ingenierías Optimizing Large Databases: A Study on Index Structures Proyecto de Investigación . Ricardo Andres Leon Ruiz Ingeniería en Sistemas Trabajo de titulación presentado como requisito para la obtención del título de Ingeniero en Sistemas Quito, 22 de Diciembre de 2017 2 UNIVERSIDAD SAN FRANCISCO DE QUITO USFQ COLEGIO DE CIENCIAS E INGENIERÍAS HOJA DE CALIFICACIÓN DE TRABAJO DE TITULACIÓN Optimizing Large Databases: A Study on Index Structures Ricardo Andres Leon Ruiz Calificación: Nombre del profesor, Título académico Aldo Cassola, Ph.D. Firma del profesor Quito, Diciembre de 2017 3 Derechos de Autor Por medio del presente documento certifico que he leído todas las Políticas y Manuales de la Universidad San Francisco de Quito USFQ, incluyendo la Política de Propiedad Intelectual USFQ, y estoy de acuerdo con su contenido, por lo que los derechos de propiedad intelectual del presente trabajo quedan sujetos a lo dispuesto en esas Políticas. Asimismo, autorizo a la USFQ para que realice la digitalización y publicación de este trabajo en el repositorio virtual, de conformidad a lo dispuesto en el Art. 144 de la Ley Orgánica de Educación Superior. Firma del estudiante: Nombres y Apellidos: Ricardo Andres Leon Ruiz Código de estudiante: 110346 C. I.: 1714286265 Lugar, Fecha Quito, 22 de Diciembre de 2017 4 RESUMEN La siguiente investigación trata sobre comparar estructuras de índice para grandes bases de datos, tanto analíticamente, como experimentalmente. El estudio se encuentra dividido en dos partes principales. La primera parte se centra en índices de hash y B-trees. Ambas estructuras son estudiadas en el contexto del modelo de acceso de disco tradicional.
    [Show full text]
  • Anatomy of Cross-Compilation Toolchains
    Embedded Linux Conference Europe 2016 Anatomy of cross-compilation toolchains Thomas Petazzoni free electrons [email protected] Artwork and Photography by Jason Freeny free electrons - Embedded Linux, kernel, drivers - Development, consulting, training and support. http://free-electrons.com 1/1 Thomas Petazzoni I CTO and Embedded Linux engineer at Free Electrons I Embedded Linux specialists. I Development, consulting and training. I http://free-electrons.com I Contributions I Kernel support for the Marvell Armada ARM SoCs from Marvell I Major contributor to Buildroot, an open-source, simple and fast embedded Linux build system I Living in Toulouse, south west of France Drawing from Frank Tizzoni, at Kernel Recipes 2016 free electrons - Embedded Linux, kernel, drivers - Development, consulting, training and support. http://free-electrons.com 2/1 Disclaimer I I am not a toolchain developer. Not pretending to know everything about toolchains. I Experience gained from building simple toolchains in the context of Buildroot I Purpose of the talk is to give an introduction, not in-depth information. I Focused on simple gcc-based toolchains, and for a number of examples, on ARM specific details. I Will not cover advanced use cases, such as LTO, GRAPHITE optimizations, etc. I Will not cover LLVM free electrons - Embedded Linux, kernel, drivers - Development, consulting, training and support. http://free-electrons.com 3/1 What is a cross-compiling toolchain? I A set of tools that allows to build source code into binary code for
    [Show full text]
  • Conception Et Exploitation D'une Base De Modèles: Application Aux Data
    Conception et exploitation d’une base de modèles : application aux data sciences Cyrille Ponchateau To cite this version: Cyrille Ponchateau. Conception et exploitation d’une base de modèles : application aux data sciences. Autre [cs.OH]. ISAE-ENSMA Ecole Nationale Supérieure de Mécanique et d’Aérotechique - Poitiers, 2018. Français. NNT : 2018ESMA0005. tel-01939430 HAL Id: tel-01939430 https://tel.archives-ouvertes.fr/tel-01939430 Submitted on 29 Nov 2018 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. THESE pour l’obtention du Grade de DOCTEUR DE L'ÉCOLE NATIONALE SUPÉRIEURE DE MÉCANIQUE ET D'AÉROTECHNIQUE (Diplôme National — Arrêté du 25 mai 2016) Ecole Doctorale : Sciences et Ingénierie des Systèmes, Mathématiques, Informatique (SISMI) Secteur de Recherche : INFORMATIQUE ET APPLICATIONS Présentée par : Cyrille PONCHATEAU ******************************************************** Conception et exploitation d’une base de modèles : application aux data sciences ******************************************************** Directeur de thèse : Ladjel BELLATRECHE
    [Show full text]
  • Choosing System C Library
    Choosing System C library Khem Raj Comcast Embedded Linux Conference Europe 2014 Düsseldorf Germany Introduction “God defined C standard library everything else is creation of man” Introduction • Standard library for C language • Provides primitives for OS service • Hosted/freestanding • String manipulations • Types • I/O • Memory • APIs Linux Implementations • GNU C library (glibc) • uClibc • eglibc – Now merged into glibc • Dietlibc • Klibc • Musl • bionic Multiple C library FAQs • Can I have multiple C libraries side by side ? • Can programs compiled with glibc run on uclibc or vice versa ? • Are they functional compatible ? • Do I need to choose one over other if I am doing real time Linux? • I have a baremetal application what libc options do I have ? Posix Compliance • Posix specifies more than ISO C • Varying degree of compliance What matters to you ? • Code Size • Functionality • Interoperability • Licensing • Backward Compatibility • Variety of architecture support • Dynamic Linking • Build system Codesize • Dietlibc/klibc – Used in really small setup e.g. initramfs • Bionic – Small linked into every process • uClibc – Configurable • Size can be really small at the expense of functionality • Eglibc – Has option groups can be ( < 1M ) License • Bionic – BSD/Apache-2.0 • Musl - MIT • Uclibc – LGPL-2.1 • Eglibc/Glibc – LGPL-2.1 Assigned to FSF • Dietlibc – GPLv2 • Klibc – GPLv2 • Newlib – some parts are GPLv3 Compliance • Musl strives for ISO/C and POSIX compliance No-mmu • uClibc supported No-mmu Distributions • Glibc is used in
    [Show full text]
  • Big Data Solutions and Applications
    P. Bellini, M. Di Claudio, P. Nesi, N. Rauch Slides from: M. Di Claudio, N. Rauch Big Data Solutions and applications Slide del corso: Sistemi Collaborativi e di Protezione Corso di Laurea Magistrale in Ingegneria 2013/2014 http://www.disit.dinfo.unifi.it Distributed Data Intelligence and Technology Lab 1 Related to the following chapter in the book: • P. Bellini, M. Di Claudio, P. Nesi, N. Rauch, "Tassonomy and Review of Big Data Solutions Navigation", in "Big Data Computing", Ed. Rajendra Akerkar, Western Norway Research Institute, Norway, Chapman and Hall/CRC press, ISBN 978‐1‐ 46‐657837‐1, eBook: 978‐1‐ 46‐657838‐8, july 2013, in press. http://www.tmrfindia.org/b igdata.html 2 Index 1. The Big Data Problem • 5V’s of Big Data • Big Data Problem • CAP Principle • Big Data Analysis Pipeline 2. Big Data Application Fields 3. Overview of Big Data Solutions 4. Data Management Aspects • NoSQL Databases • MongoDB 5. Architectural aspects • Riak 3 Index 6. Access/Data Rendering aspects 7. Data Analysis and Mining/Ingestion Aspects 8. Comparison and Analysis of Architectural Features • Data Management • Data Access and Visual Rendering • Mining/Ingestion • Data Analysis • Architecture 4 Part 1 The Big Data Problem. 5 Index 1. The Big Data Problem • 5V’s of Big Data • Big Data Problem • CAP Principle • Big Data Analysis Pipeline 2. Big Data Application Fields 3. Overview of Big Data Solutions 4. Data Management Aspects • NoSQL Databases • MongoDB 5. Architectural aspects • Riak 6 What is Big Data? Variety Volume Value 5V’s of Big Data Velocity Variability 7 5V’s of Big Data (1/2) • Volume: Companies amassing terabytes/petabytes of information, and they always look for faster, more efficient, and lower‐ cost solutions for data management.
    [Show full text]
  • Embedded Linux Size Reduction Techniques
    Embedded Linux Conference 2017 Embedded Linux size reduction techniques Michael Opdenacker free electrons [email protected] free electrons - Embedded Linux, kernel, drivers - Development, consulting, training and support. http://free-electrons.com 1/1 Michael Opdenacker I Michael Opdenacker I Founder and Embedded Linux engineer at free electrons I Embedded Linux expertise I Development, consulting and training I Strong open-source focus I Long time interest in embedded Linux boot time, and one of its prerequisites: small system size. I From Orange, France Penguin from Justin Ternet (https://openclipart.org/detail/182875/pinguin) free electrons - Embedded Linux, kernel, drivers - Development, consulting, training and support. http://free-electrons.com 2/1 Why reduce size? There are multiple reasons for having a small kernel and system I Run on very small systems (IoT) I Run Linux as a bootloader I Boot faster (for example on FPGAs) I Reduce power consumption Even conceivable to run the whole system in CPU internal RAM or cache (DRAM is power hungry and needs refreshing) I Security: reduce the attack surface I Cloud workloads: optimize instances for size and boot time. I Spare as much RAM as possible for applications and maximizing performance. See https://tiny.wiki.kernel.org/use_cases free electrons - Embedded Linux, kernel, drivers - Development, consulting, training and support. http://free-electrons.com 3/1 Reasons for this talk I No talk about size since ELCE 2015 I Some projects stalled (Linux tinification, LLVM Linux...) I Opportunity to have a look at solutions I didn’t try: musl library, Toybox, gcc LTO, new gcc versions, compiling with Clang..
    [Show full text]