
FreeBSD ZFS - English

ZFS: State of the art in FreeBSD

In OpenSolaris a new file system was introduced that was quickly ported to FreeBSD: ZFS (Zettabyte File System). An implementation for Linux was realized via the Linux FUSE framework. This 128-bit file system offers very high reliability, speed, and data integrity, and it also makes it possible to administer very large amounts of data.

Internal Structure
Internally ZFS consists of three layers. The lowest layer is the ZPOOL administration; it communicates directly with the hardware via device drivers. A separate volume manager (LVM) is no longer needed, because ZFS takes care of disk drive administration here as well. The second layer is the actual core of ZFS: here transactions and checksums are processed and the balancing of the Merkle tree is controlled. The highest layer communicates with userland and serves as the interface for other services.

Internal structure of ZFS

Virtualization and ZFS
Because of its internal structure ZFS is designed for the Solaris concept of virtualization called zones, and this applies to FreeBSD jails too. It is therefore possible to attach a part of a ZPOOL to a zone, where it can be administered independently. Inside a zone the hardware is inaccessible, which is a great benefit for security.

Compression and Encryption
Compression is fully transparent for the user; each block is compressed separately. File encryption, which processes the data blockwise, is currently under construction. It, too, is fully transparent for the user. At this time only the encryption algorithm AES_CBC is implemented; other algorithms are currently being developed.

Connectivity
When it comes to connectivity ZFS has some interesting things to offer. Shares for SMB/CIFS or NFS are created without any effort, and a volume can be prepared as an iSCSI target or as swap space. ZFS also offers interfaces to other file systems, so volumes can be formatted with popular file systems like UFS or FAT, although this is rarely necessary.
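The volume features just mentioned can be tried out directly. The following is a minimal sketch, assuming a pool named pool and FreeBSD's /dev/zvol device naming; the dataset names and sizes are examples and not part of the original text.
• Creating a 4 GB volume and formatting it with UFS:
# zfs create -V 4g pool/ufsvol
# newfs /dev/zvol/pool/ufsvol
# mount /dev/zvol/pool/ufsvol /mnt
• Creating a second volume and activating it as swap space:
# zfs create -V 2g pool/swapvol
# swapon /dev/zvol/pool/swapvol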
Short description of commands
The following commands show how easy it is to administer a ZPOOL. Keep in mind that the identifiers of the data carriers differ between Solaris and FreeBSD.
• Creating a mirrored ZPOOL:
# zpool create pool mirror da1 da2
• Creating the file system for the home directories and mounting it to /export/home:
# zfs create pool/home
# zfs set mountpoint=/export/home pool/home
• Creating the home directory for the user user:
# zfs create pool/home/user
• Adding two more hard disks:
# zpool add pool mirror da3 da4
• Shares can be created easily. Sharing all home directories via NFS in the network:
# zfs set sharenfs=rw pool/home
• Turning on compression and selecting the algorithm LZJB:
# zfs set compression=lzjb pool
• Quotas and reservations can be applied either to the whole pool or to a single mount point:
# zfs set quota=10g daten/home/user1
# zfs set reservation=20g daten/home/user2
• A snapshot is created for the user user on Thursday:
# zfs snapshot pool/home/user@thursday
• The user user has deleted important files; a simple rollback to Monday's snapshot restores the lost data:
# zfs rollback pool/home/user@monday
• If snapshots are created at regular intervals you can take a look at older versions of files:
$ cat ~user/.zfs/snapshot/sunday/image.odg

Technical data of ZFS

Attributes
• POSIX & NFSv4 file access controls
• Transparent compression: LZJB (lossless data compression), GZIP (GNU zip); additional compression methods will follow
• Transparent encryption: AES_CBC (beta phase!)

Maximum values for files
• length of file names: 255 bytes
• length of path names: no limit
• size of a file: 2^64 bytes
• number of files in a folder: 2^56 (limited to 2^48)

Maximum values for attributes
• size of an attribute: 2^64 bytes
• number of attributes: 2^56 (limited to 2^48)

Other maximum values
• size of a file system: 2^64 bytes
• number of snapshots: 2^64
• size of a ZPOOL: 2^78 bytes
• number of devices in a ZPOOL: 2^64
• number of ZPOOLs per system: 2^64
• number of file systems per ZPOOL: 2^64
• time stamp resolution: 1 ns

Further Information
• ZFS project at OpenSolaris:
http://opensolaris.org/os/community/zfs/
• ZFS project at FreeBSD:
http://wiki.freebsd.org/ZFS
http://wiki.FreeBSD.org/ZFSKnownProblems
http://www.freebsd.org/doc/en/books/handbook/filesystems-zfs.html

Storage Pool
The file system ZFS combines physical storage devices into a logical unit. This is called a storage pool. Within this pool any number of logical partitions (each with a file system) can be created; hardware RAID controllers come to mind as a comparison. In contrast to those, the logical partitions in ZFS can shrink or grow as far as the size of the pool allows. Interestingly, the full bandwidth is always usable.

Reliability with Checksums
A checksum is stored in every superordinate node. Errors that occur while reading data can be corrected with these checksums, so silent errors do not go unnoticed. Hardware RAID controllers offer no way of detecting silent errors.

End-to-End Data Integrity
Transactions and checksums guarantee data integrity from the disk node to RAM. Every step is secured by calculating and comparing checksums. For users and applications this is transparent.

Data Operations with Transactions
This technology, known from databases, was fully implemented in ZFS. In case of an error the state before the error is preserved and data consistency is ensured. To achieve this, the copy-on-write method was integrated: first a copy is made of the node to be written to, the new data and checksums are stored in this copy, and after all data has been written the links between the nodes are updated.

Workflow of „Copy on Write“

Snapshots
The copy-on-write method offers a very efficient and cheap way of creating snapshots: the original node is simply not released. And since all timestamps are stored in the nodes, there is an easy way to create and maintain a journal. For GNOME-Nautilus there is now also a plugin that enables snapshot administration with simple mouse clicks via a graphical interface. If snapshots are created regularly you have the means of seeing older versions of your data by simply changing into the snapshot directory (${HOME}/.zfs/...).

Self-healing
ZFS offers self-healing, which provides an option to repair corrupt data. Currently no hardware RAID has this feature implemented!

A Short Description of a ZFS-Mirror
The first disk has a defective sector. ZFS reads data from this disk and detects the corrupt data via its checksum. The correct information is then read from the second disk, and ZFS writes the correct data to a working area of the first disk. After this procedure the ZFS mirror is in a consistent state again. All of this is fully transparent for users. Self-healing works with ZFS-MIRROR as well as with ZFS-RAIDZ or ZFS-RAIDZ2.

Self-healing procedure for ZFS-MIRROR
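Self-healing can also be triggered for a whole pool from the command line. A minimal sketch, assuming the mirrored pool named pool created above: zpool scrub reads every block, verifies it against its checksum, and repairs defective copies; zpool status then reports the result.
• Verifying all checksums and repairing corrupt blocks:
# zpool scrub pool
• Checking the progress and the repair statistics:
# zpool status pool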

Snapshot administration with GNOME-Nautilus
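Regular snapshots do not require the graphical interface; they can also be automated from the shell. A minimal sketch of a weekday snapshot scheme matching the names used above (the script name, dataset name, and schedule are assumptions):

#!/bin/sh
# rotate-snapshots.sh - hypothetical helper that keeps one snapshot
# per weekday (pool/home@monday ... pool/home@sunday)
day=$(date +%A | tr '[:upper:]' '[:lower:]')
# remove last week's snapshot of the same name, then take a new one
zfs destroy -r pool/home@"$day" 2>/dev/null
zfs snapshot -r pool/home@"$day"

Run daily from cron, this keeps the last seven days available under ${HOME}/.zfs/snapshot/.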

ZFS RAID Types and Pools
Like hardware RAID systems, ZFS offers different kinds of disk arrays without neglecting data integrity. These arrays are called pools. ZFS also offers something special here: it automatically calculates the correct stripe size and changes the parameters dynamically, without the file system having to be reformatted. This way small stripes are used for small files and large stripes for large files, so you always get maximum speed with minimal waste of space.

ZFS pools are classified as follows:

ZFS-MIRROR: An even number of disks is arranged as in RAID-1, so the data on each disk is identical.

ZFS-RAIDZ: This array is similar to RAID-5 and needs at least two disks. Data blocks and parity blocks are distributed over all disks. The total usable capacity is (N-P)*X bytes, where N is the number of disks, P the number of parity disks, and X the capacity of the smallest disk.

ZFS-RAIDZ1 (single parity): Only one parity block per stripe is used here, which is like hardware RAID-5 with the same properties.

ZFS-RAIDZ2 (double parity): Two parity blocks per stripe are used here, which is like hardware RAID-6 with the same properties.

The write hole (data inconsistency due to loss of power) known from hardware RAID is not a problem, because ZFS offers end-to-end data integrity. Long consistency checks of the file system (fsck) are also unnecessary.
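The capacity formula can be checked in practice. A minimal sketch, assuming six disks da1 to da6 of 1 TB each (the pool name tank and the disk names are examples): with N=6 disks, P=2 parity disks, and X=1 TB, a RAIDZ2 pool offers (N-P)*X = (6-2)*1 TB = 4 TB of usable capacity.
• Creating a double-parity pool from six disks:
# zpool create tank raidz2 da1 da2 da3 da4 da5 da6
• Displaying the sizes:
# zpool list tank
# zfs list tank
Note that zpool list reports the raw size of the pool, while zfs list shows the capacity that remains usable after parity.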

© 2005-2010 allBSD.de Projekt – written by Jürgen Dankoweit, translated by Lars Cleary. The FreeBSD mark and the FreeBSD logo are registered trademarks of The FreeBSD Foundation and are used by allBSD with the permission of The FreeBSD Foundation. Valid as of 01.02.2010