Linux Clusters Institute: Lustre Hands On Exercise
University of Oklahoma, May 13th – 17th 2019
J.D. Maloney | Storage Engineer
National Center for Supercomputing Applications (NCSA)

This document is a result of work by volunteer LCI instructors and is licensed under CC BY-NC 4.0 (https://creativecommons.org/licenses/by-nc/4.0/).

Goal of Hands on Exercise
• Create Lustre File System with 1 MDT & 6 OSTs
• Bring in Sample Data
• Configure Quotas
• Test how stripe width works

Lay of the Land
• You should have 4 storage servers: 1 for metadata, 3 for data
• Each has two 5 GB volumes
• Root SSH between all nodes in cluster
• Chronyd keeping time in sync

Creating Lustre File System
• Add Lustre Repos to Servers & Compute Nodes (create /etc/yum.repos.d/lustre.repo)

    [lustre-server]
    name=CentOS-$releasever - Lustre
    baseurl=https://downloads.hpdd.intel.com/public/lustre/latest-feature-release/el7/server/
    gpgcheck=0

    [e2fsprogs]
    name=CentOS-$releasever - Ldiskfs
    baseurl=https://downloads.hpdd.intel.com/public/e2fsprogs/latest/el7/
    gpgcheck=0

    [lustre-client]
    name=CentOS-$releasever - Lustre
    baseurl=https://downloads.hpdd.intel.com/public/lustre/latest-feature-release/el7/client/
    gpgcheck=0

• Update/Install these packages on the four storage nodes

    yum upgrade -y e2fsprogs
    yum install -y lustre-tests

• Create /etc/modprobe.d/lnet.conf and add the following on all four storage nodes

    options lnet networks=tcp0(eth0)

• Reboot all four storage VMs

    shutdown -r now

• Format the MDT on storage-0

    mkfs.lustre --fsname=lci --mgs --mdt --index=0 /dev/sdc

• Mount the MDT on storage-0

    mkdir /mnt/mdt && mount -t lustre /dev/sdc /mnt/mdt

• On storage-1, storage-2, storage-3 format the OSTs; make sure to increment the index number for each disk in the cluster (0-5)

    mkfs.lustre --ost --fsname=lci --mgsnode=storage-0@tcp0 --index=0 /dev/sdc
    mkfs.lustre --ost --fsname=lci --mgsnode=storage-0@tcp0 --index=1 /dev/sdd
    …continue to do the remaining four

• Mount the OSTs on the OSS nodes (storage-1,2,3)

    mkdir /mnt/ost0 && mount -t lustre /dev/sdc /mnt/ost0
    mkdir /mnt/ost1 && mount -t lustre /dev/sdd /mnt/ost1
    …continue to do the remaining four

Installing Client Nodes
• On compute-0 and compute-1 run:

    yum -y install lustre-client

• Load the module:

    modprobe lustre

• Create the mountpoint for the file system

    mkdir /mnt/lci

Mount the File System
• On compute-0 and compute-1 run:

    mount -t lustre storage-0@tcp0:/lci /mnt/lci

• You should have a fully mounted Lustre file system

    [root@Compute-0 ~]# df -h
    Filesystem            Size  Used Avail Use% Mounted on
    /dev/sda2              30G  2.5G   28G   9% /
    devtmpfs              3.9G     0  3.9G   0% /dev
    tmpfs                 3.9G     0  3.9G   0% /dev/shm
    tmpfs                 3.9G   17M  3.9G   1% /run
    tmpfs                 3.9G     0  3.9G   0% /sys/fs/cgroup
    /dev/sda1             497M   90M  408M  18% /boot
    /dev/sdb1              16G   45M   15G   1% /mnt/resource
    Login:/share/home      30G  2.2G   28G   8% /share/home
    192.168.0.7@tcp:/lci   28G  199M   26G   1% /mnt/lci
    tmpfs                 797M     0  797M   0% /run/user/0

Bring In Test Data
• Create /mnt/lci/home, /mnt/lci/projects, /mnt/lci/scratch directories just for posterity
• Bring in folder 1 of test data
• Same data and source as the Spectrum Scale Hands On

Setting Quotas
• Set a quota for the user tim & verify it worked:

    lfs setquota -u tim -b 1G -B 2G -i 10000 -I 15000 /mnt/lci
    lfs quota -hu tim /mnt/lci
    Disk quotas for usr tim (uid 7008):
         Filesystem    used   quota   limit   grace   files   quota   limit   grace
           /mnt/lci      0k      1G      2G       -       0   10000   15000       -

• Set a quota for the group hpc & verify it worked:

    lfs setquota -g hpc -b 3G -B 4G -i 20000 -I 25000 /mnt/lci
    lfs quota -hg hpc /mnt/lci
    Disk quotas for grp hpc (gid 7007):
         Filesystem    used   quota   limit   grace   files   quota   limit   grace
           /mnt/lci      0k      3G      4G       -       0   20000   25000       -

• Feel free to play around with more quotas

Playing with Stripe Width
• Set the default stripe width (stripe count) and stripe size to 2 and 4M respectively:

    [root@Storage-0 lci]# lfs getstripe -d /mnt/lci
    stripe_count: 1 stripe_size: 1048576 pattern: 0 stripe_offset: -1
    [root@Storage-0 lci]# lfs setstripe -S 4M -c 2 /mnt/lci
    [root@Storage-0 lci]# lfs getstripe -d /mnt/lci
    stripe_count: 2 stripe_size: 4194304 pattern: raid0 stripe_offset: -1

• Look at a twitter stream file that was brought in at width 1

    [root@Storage-0 twitter]# lfs getstripe 2012_12_07_stream.txt
    2012_12_07_stream.txt
    lmm_stripe_count:  1
    lmm_stripe_size:   1048576
    lmm_pattern:       raid0
    lmm_layout_gen:    0
    lmm_stripe_offset: 0
        obdidx    objid    objid    group
             0        5      0x5        0

• Remove the sample data and sync it over again, then check that it picks up the new striping:

    [root@Storage-0 twitter]# lfs getstripe 2012_12_07_stream.txt
    2012_12_07_stream.txt
    lmm_stripe_count:  2
    lmm_stripe_size:   4194304
    lmm_pattern:       raid0
    lmm_layout_gen:    0
    lmm_stripe_offset: 4
        obdidx    objid    objid    group
             4        5      0x5        0
             1        7      0x7        0

Wrap Up
• Further Exploration
    • Other Lustre commands or potentially set up Robinhood
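The OST formatting step leaves "the remaining four" to the reader, and the part that is easy to get wrong is keeping the index unique across all three OSS nodes. A minimal dry-run sketch of the full set of six commands follows. It only prints the commands and formats nothing. It assumes (the slides do not say so explicitly) that every OSS exposes its two volumes as /dev/sdc and /dev/sdd and that indices are handed out node by node; the function name ost_commands is made up for this illustration.

```shell
#!/usr/bin/env bash
# Dry run: print the mkfs.lustre command for each OST; nothing is formatted.
# Assumptions (not stated in the slides): every OSS exposes its two volumes
# as /dev/sdc and /dev/sdd, and indices are handed out node by node.

ost_commands() {
    local index=0 node dev
    for node in storage-1 storage-2 storage-3; do
        for dev in /dev/sdc /dev/sdd; do
            # Prefix each command with the node it should be run on.
            echo "$node: mkfs.lustre --ost --fsname=lci --mgsnode=storage-0@tcp0 --index=$index $dev"
            index=$((index + 1))
        done
    done
}

ost_commands
```

Each printed line names the node the command belongs to, so the list can be checked against the actual device names on each OSS before anything is run by hand.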
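To make the stripe width exercise more concrete, the sketch below works through the round-robin arithmetic behind a raid0 layout: a byte offset falls into chunk number offset / stripe_size, and chunks are dealt out in turn over the file's objects. It does not query Lustre. The stripe size and the OST order (4, then 1) are copied from the second lfs getstripe output above, the function name ost_for_offset is hypothetical, and the sketch assumes the objects are listed in stripe order.

```shell
#!/usr/bin/env bash
# Sketch of raid0 round-robin striping arithmetic; it does not query Lustre.
# Values copied from the lfs getstripe output: stripe size 4 MiB, two
# objects, on OST index 4 and OST index 1 (assumed to be in stripe order).

stripe_size=4194304
osts="4 1"

# ost_for_offset BYTE_OFFSET -> prints the OST index holding that byte
ost_for_offset() {
    local offset=$1
    set -- $osts                              # positional params = OST list
    local chunk=$((offset / stripe_size))     # which stripe-sized chunk
    shift $((chunk % $#))                     # round-robin over the objects
    echo "$1"
}

for offset in 0 4194303 4194304 8388608; do
    echo "offset $offset -> OST $(ost_for_offset "$offset")"
done
```

With these values, the first 4 MiB of the file lands on OST 4, the next 4 MiB on OST 1, and the third chunk wraps back to OST 4, which is why a wider stripe count spreads a large file's I/O across more OSTs.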