File Systems in Linux
Lukáš Czerner, Red Hat
May 16, 2016

Copyright © 2016 Lukáš Czerner, Red Hat. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the COPYING file.

Agenda
  1. What is a file system
  2. Basic concepts
  3. File system interfaces
  4. Internal structures
  5. Consistency after a failure
  6. Advanced features
  7. New types of devices
  8. How to get involved
  9. Questions

Part I: What is a file system?

What is a file system?
  A way of organizing data on a storage medium in the form of files and directories.
  - Easy access
    - User data is easily accessible in named files
    - Files are grouped in named directories
    - Easily understood tree structure
  - Virtualization of the medium's address space
    - Address space of a file vs. logical space of the medium
    - The address spaces of individual files are independent of one another
  - Access control
    - Read and write permissions
    - Quotas to limit the amount of data

Types of file systems
  - File systems in user space
    - Built on the FUSE kernel module
    - GlusterFS, sshfs
  - File systems in kernel space
    - Distributed, network, local
    - Special file systems
    - Pseudo file systems

User-space utilities
  - Tools for creating a file system
    - mkfs.ext4, mkfs.xfs, ...
    - Create a file system with the given parameters
  - Tools for checking a file system
    - fsck.ext4, xfs_repair, ...
    - Checking, repair, optimization
  - Tools for managing a file system
    - btrfs, resize2fs, tune2fs, xfs_growfs, debugfs
    - Everything from resizing, through exporting metadata, to detailed modification of internal structures

Part II: Concepts

Important structures
  - Inode - Index node
    - Structure representing all types of files - in memory
    - i_mode - file type
    - i_ino - inode number
    - i_nlink - number of links to the inode
    - i_size - file size in bytes
    - for more, see include/linux/fs.h:528
  - Dentry - Directory entry
    - Structure mapping a file name to an inode number - in memory
    - d_parent - pointer to the parent dentry
    - d_name - structure containing the name of the entry
    - d_inode - reference to the inode - may be NULL
    - for more, see include/linux/dcache.h:108

Important structures (continued)
  - File
    - Represents an open file
    - f_path - structure representing the path to the file
    - f_inode - reference to the corresponding inode
    - f_mode - mode in which the file was opened
    - f_pos - current position in the file
    - for more, see include/linux/fs.h:776
  - Superblock
    - Identifies a given file system on the medium - in memory
    - s_dev - number identifying the device
    - s_blocksize - block size
    - s_type - structure describing the file system type
    - s_magic - magic number identifying the file system type
    - for more, see include/linux/fs.h:1821

Other concepts
  - Block
    - The smallest allocatable unit of a file system
  - Page
    - Used to transfer data between memory and the storage medium

Part III: File system interfaces

[Figure: The Linux I/O Stack Diagram, version 1.0, 2012-06-20. Outlines the Linux I/O stack as of kernel version 3.3, from top to bottom: applications (processes), VFS, file systems (block based, network, pseudo, special purpose), page cache, block I/O layer with optional stackable devices, I/O scheduler, device drivers, physical devices.]
The Linux I/O Stack Diagram (version 1.0, 2012-06-20) http://www.thomas-krenn.com/en/oss/linux-io-stack-diagram.html
Created by Werner Fischer and Georg Schönberger
License: CC-BY-SA 3.0, see http://creativecommons.org/licenses/by-sa/3.0/

File system interfaces
  - System calls
    - The standard interface for communicating with the kernel
    - read, write, stat, open, close, unlink, fallocate, ...
  - Input/output control - ioctl
    - Originally used for communicating with hardware devices
    - Today it is abused as a "cheap" interface for anything
    - FIFREEZE, FITRIM, EXT4_IOC_RESIZE_FS, XFS_IOC_ZERO_RANGE, ...
  - procfs, sysfs
    - Special files that mostly export information to user space
      /sys/fs/ext4/features/lazy_itable_init
    - Sometimes also used for setting parameters
      /sys/fs/ext4/sda1/extent_max_zeroout_kb
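As an illustration of the system-call interface, the following user-space sketch uses stat(2) and statfs(2) to read back some of the information that the kernel keeps in the inode and superblock structures. The helper names print_inode_info and print_fs_info are hypothetical, and the correspondence with the in-kernel fields (st_ino to i_ino, st_nlink to i_nlink, st_size to i_size, f_type to s_magic, f_bsize to s_blocksize) is only approximate.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/vfs.h>

/* Hypothetical helper: print inode-level metadata of `path`.
 * Returns 0 on success, -1 if stat(2) fails. */
int print_inode_info(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0) {
        perror("stat");
        return -1;
    }
    printf("inode number : %llu\n", (unsigned long long)st.st_ino);
    printf("link count   : %llu\n", (unsigned long long)st.st_nlink);
    printf("size (bytes) : %lld\n", (long long)st.st_size);
    printf("is directory : %s\n", S_ISDIR(st.st_mode) ? "yes" : "no");
    return 0;
}

/* Hypothetical helper: print file-system-level metadata for the
 * file system that contains `path`.
 * Returns 0 on success, -1 if statfs(2) fails. */
int print_fs_info(const char *path)
{
    struct statfs sfs;

    if (statfs(path, &sfs) != 0) {
        perror("statfs");
        return -1;
    }
    printf("fs magic     : 0x%lx\n", (unsigned long)sfs.f_type);
    printf("block size   : %ld\n", (long)sfs.f_bsize);
    return 0;
}
```

Calling print_inode_info("/") followed by print_fs_info("/") prints the metadata of the root directory and of the file system it lives on; the values depend on the machine.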
VFS - Virtual File System Switch
  - User-space view
    - An abstraction layer providing a uniform interface between user applications and the various file systems
    - An application does not need to know whether it is accessing a local or a network file system
  - Kernel view
    - An abstraction layer providing a uniform interface between the kernel and the file systems
    - Provides functions common to all file systems
    - Makes it easier to develop a new file system
  - Object-oriented approach
    - file_operations
    - inode_operations
    - dentry_operations
    - super_operations
    - address_space_operations

File operations

    struct file_operations {
        ....
        /* change the current file position */
        loff_t (*llseek) (struct file *, loff_t, int);
        /* read from / write to the file at the given offset */
        ssize_t (*read) (struct file *, char __user *, size_t, loff_t *);
        ssize_t (*write) (struct file *, const char __user *, size_t, loff_t *);
        /* map the file into a process address space */
        int (*mmap) (struct file *, struct vm_area_struct *);
        /* called when the file is opened */
        int (*open) (struct inode *, struct file *);
        /* flush the given byte range to stable storage */
        int (*fsync) (struct file *, loff_t, loff_t, int datasync);
        /* enable or disable asynchronous notification */
        int (*fasync) (int, struct file *, int);
        /* preallocate or otherwise manipulate the space of the file */
        long (*fallocate)(struct file *file, int mode, loff_t offset, loff_t len);
        ...
    };

VFS - Virtual File System Switch
  - Dentry cache
    - Manages a hash table of directory entries
    - When the path to an inode is resolved, all components of the path are cached
    - When an entry is created in the dentry cache, the corresponding entry in the inode cache is created as well
  - Inode cache
    - Manages a hash table of inodes
    - Speeds up access to inodes
    - A new, empty inode is filled in by the file system itself
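The object-oriented approach of the VFS, where each file system supplies its own table of operations such as file_operations, can be illustrated with a small user-space analogy. This is only a sketch and not kernel code: every name in it (toy_file, toy_file_operations, memfs, zerofs, toy_vfs_read) is hypothetical and chosen for illustration.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

struct toy_file;

/* Toy counterpart of an operations table: one function pointer
 * per operation that an implementation may provide. */
struct toy_file_operations {
    ssize_t (*read)(struct toy_file *f, char *buf, size_t len);
};

/* Toy counterpart of an open file. */
struct toy_file {
    const struct toy_file_operations *f_op; /* chosen by the "file system" */
    size_t f_pos;                           /* current position */
    const char *data;                       /* backing data (memfs only) */
    size_t size;                            /* size of backing data */
};

/* "memfs": reads come from an in-memory buffer. */
static ssize_t memfs_read(struct toy_file *f, char *buf, size_t len)
{
    size_t left = f->size - f->f_pos;

    if (len > left)
        len = left;
    memcpy(buf, f->data + f->f_pos, len);
    f->f_pos += len;
    return (ssize_t)len;
}

/* "zerofs": every read yields zero bytes, with no backing data. */
static ssize_t zerofs_read(struct toy_file *f, char *buf, size_t len)
{
    memset(buf, 0, len);
    f->f_pos += len;
    return (ssize_t)len;
}

const struct toy_file_operations memfs_fops = { .read = memfs_read };
const struct toy_file_operations zerofs_fops = { .read = zerofs_read };

/* Generic layer: dispatches through the table, so the caller does not
 * need to know which implementation is behind the file. */
ssize_t toy_vfs_read(struct toy_file *f, char *buf, size_t len)
{
    if (f->f_op == NULL || f->f_op->read == NULL)
        return -1; /* operation not supported */
    return f->f_op->read(f, buf, len);
}
```

A caller sets f_op to &memfs_fops or &zerofs_fops when "opening" a toy_file and afterwards only uses toy_vfs_read. Supporting another implementation means adding one more table, which mirrors the claim that the VFS makes a new file system easier to develop.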
Page Cache
  - Disk cache - speeds up repeated access to data on the medium
  - Used for all transfer operations (except direct I/O)
  - Pages are again managed with an LRU algorithm
  - Inode address space
    - Contains, among other things, a reference to the tree of pages belonging to the given file
    - address_space_operations for manipulating pages
      - readpage, writepage, invalidatepage
    - Eases the work of the file system - handling of pages
  - Page flags
    - PG_dirty, PG_uptodate, PG_locked, PG_active, ..
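From user space, one way to observe the page cache is to map a file and ask mincore(2) which of its pages are currently resident in memory. The sketch below assumes a Linux system; the helper name count_cached_pages is hypothetical, and depending on the kernel version and file permissions the reported residency may be conservative, so the result should be read as a hint rather than an exact figure.

```c
#define _DEFAULT_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: count how many pages of `path` are resident
 * in memory. Stores the total number of pages in *total.
 * Returns the resident count, or -1 on error or for an empty file. */
long count_cached_pages(const char *path, long *total)
{
    long resident = -1;
    struct stat st;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return -1;
    }

    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = (size_t)st.st_size;
    size_t npages = (len + page - 1) / page;

    /* Mapping the file does not read it; it only makes the file's
     * pages addressable so that mincore() can report on them. */
    void *map = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    unsigned char *vec = malloc(npages);

    if (map != MAP_FAILED && vec != NULL && mincore(map, len, vec) == 0) {
        resident = 0;
        for (size_t i = 0; i < npages; i++)
            resident += vec[i] & 1; /* bit 0 set: page is resident */
        *total = (long)npages;
    }

    free(vec);
    if (map != MAP_FAILED)
        munmap(map, len);
    close(fd);
    return resident;
}
```

Calling the helper on a file that was just written or read will typically report most of its pages as resident, because buffered I/O goes through the page cache; a file opened with O_DIRECT bypasses it.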