
IBM and Linux: Community Innovation for your Business

Disk Storage Setup with Linux on System z

Susanne Wintenberger ([email protected]) IBM Lab Boeblingen, Germany

© 2009 IBM Corporation IBM and Linux: Community Innovation for your Business

Trademarks

The following are trademarks of the International Business Machines Corporation in the United States, other countries, or both.

Not all common law marks used by IBM are listed on this page. Failure of a mark to appear does not mean that IBM does not use the mark nor does it mean that the product is not actively marketed or is not significant within its relevant market. Those trademarks followed by ® are registered trademarks of IBM in the United States; all others are trademarks or common law marks of IBM in the United States.

For a complete list of IBM Trademarks, see www.ibm.com/legal/copytrade.shtml:

*, AS/400®, e business(logo)®, DBE, ESCO, eServer, FICON, IBM®, IBM (logo)®, iSeries®, MVS, OS/390®, pSeries®, RS/6000®, S/30, VM/ESA®, VSE/ESA, WebSphere®, xSeries®, z/OS®, zSeries®, z/VM®, System i, System i5, System p, System p5, System x, System z, System z9®, BladeCenter®

The following are trademarks or registered trademarks of other companies.

Adobe, the Adobe logo, PostScript, and the PostScript logo are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States, and/or other countries. Cell Broadband Engine is a trademark of Sony Computer Entertainment, Inc. in the United States, other countries, or both and is used under license therefrom. Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United States, other countries, or both. Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both. Intel, Intel logo, Intel Inside, Intel Inside logo, Intel Centrino, Intel Centrino logo, Celeron, Intel Xeon, Intel SpeedStep, Itanium, and Pentium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. UNIX is a registered trademark of The Open Group in the United States and other countries. Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both. ITIL is a registered trademark, and a registered community trademark of the Office of Government Commerce, and is registered in the U.S. Patent and Trademark Office. IT Infrastructure Library is a registered trademark of the Central Computer and Telecommunications Agency, which is now part of the Office of Government Commerce.

* All other products may be trademarks or registered trademarks of their respective companies.

Notes: Performance is in Internal Throughput Rate (ITR) ratio based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput that any user will experience will vary depending upon considerations such as the amount of multiprogramming in the user's job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve throughput improvements equivalent to the performance ratios stated here. IBM hardware products are manufactured from new parts, or new and serviceable used parts. Regardless, our warranty terms apply. All customer examples cited or described in this presentation are presented as illustrations of the manner in which some customers have used IBM products and the results they may have achieved. Actual environmental costs and performance characteristics will vary depending on individual customer configurations and conditions. This publication was produced in the United States. IBM may not offer the products, services or features discussed in this document in other countries, and the information may be subject to change without notice. Consult your local IBM business contact for information on the product or services available in your area. All statements regarding IBM's future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only. Information about non-IBM products is obtained from the manufacturers of those products or their published announcements. IBM has not tested those products and cannot confirm the performance, compatibility, or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products. Prices subject to change without notice. Contact your IBM representative or Business Partner for the most current pricing in your geography.


Agenda

✱ Storage
  – File Systems
  – Partitions
✱ DASD
✱ LVM
  – Physical Volumes (PVs)
  – Volume Groups (VGs)
  – Logical Volumes (LVs)
  – Advanced LVM Topics
✱ LVM Architecture
✱ dmsetup
✱ Summary


Why is disk storage critical?

✱ Disk storage usage
  – application data but used at runtime
  – Swap space: user heap and stack data
  – program text loaded on demand
  – shared memory segments paged out on memory shortage

[Figure: host system with application, utilities, binaries and runtime environment on top of the kernel (page cache, memory management), backed by file system, swap space and application data on disk]


Linux File Systems

✱ Traditional file systems
  – ext2
  – minix
  – MS-DOS/VFAT
✱ Journaling file systems
  –
  – reiserFS
  – NTFS (New Technology File System)


Linux File Systems (cont'd)

[Figure: several processes issue file operations to the Virtual File System (VFS), which passes them on to the individual file systems: ext2, ext3, reiserfs, VFAT, ...]


Linux Partition

✱ Partitioning is a means to divide a single hard drive into many logical drives.
✱ A partition is a contiguous set of blocks on a drive that is treated as an independent disk.
✱ Why have multiple partitions?
  – Encapsulate your data. Since file system corruption is local to a partition, you stand to lose only some of your data if an accident occurs.
  – Increase disk space efficiency.
  – Limit data growth. Runaway processes or maniacal users can consume so much disk space that the operating system no longer has room on the hard drive for its bookkeeping operations.

hans@larsson:~> df -h
Filesystem                  Size  Used Avail Use% Mounted on
/dev/mapper/123[...]-part1  247G   22G  212G  10% /
/dev/sda1                  1004M   31M  923M   4% /boot
/dev/mapper/home            247G   87G  157G  36% /home0
[...]


Linux on System z Partitioning

disk type   driver      format with   partition with
ECKD        dasd        dasdfmt       fdasd
FBA         dasd        -
SCSI        zfcp+scsi   -             fdisk
EDEV        dasd        -             fdisk


Querying information about the current DASD Setup

Printing a list of active DASD devices:

hans@larsson:~> lsdasd
Bus-ID    Status  Name   Device  Type  BlkSz  Size    Blocks
==============================================================
0.0.ec24  active  dasda  94:0    ECKD  4096   7043MB  1803060

The same information can also be obtained from the file /proc/dasd/devices.

How to check which DASDs are currently configured for your guest:

root@larsson:~> vmcp q dasd |grep -i `hostname`
DASD EC24 ATTACHED TO LARSSON EC24 R/W 0XEC24
DASD EC25 ATTACHED TO LARSSON EC25 R/W 0XEC25
DASD EC26 ATTACHED TO LARSSON EC26 R/W 0XEC26
DASD EC27 ATTACHED TO LARSSON EC27 R/W FREE
DASD EC28 ATTACHED TO LARSSON EC28 R/W 0XEC28
DASD EC29 ATTACHED TO LARSSON EC29 R/W 0XEC29

If the name of your guest is different from your hostname, use vmcp q dasd only.


Adding a DASD

root@larsson:~> modprobe dasd_mod dasd=ec27
root@larsson:~> chccwdev -e ec27
Setting device 0.0.ec27 online
Done
root@larsson:~> lsdasd
Bus-ID    Status  Name   Device  Type  BlkSz  Size    Blocks
==============================================================
0.0.ec24  active  dasda  94:0    ECKD  4096   7043MB  1803060
0.0.ec27  n/f     dasdb  94:4    ECKD
root@larsson:~> dmesg|tail|grep dasd
dasd(eckd): 0.0.ec27: 3390/0C(CU:3990/01) Cyl:10017 Head:15 Sec:224
dasd(eckd): 0.0.ec27: volume analysis returned unformatted disk


DASD low level format: dasdfmt

dasdfmt formats a DASD (ECKD) disk to prepare it for usage with Linux on System z.

root@larsson:~> dasdfmt -d cdl -b 4096 -f /dev/dasdb -p
Drive Geometry: 10017 Cylinders * 15 Heads = 150255 Tracks

I am going to format the device /dev/dasdb in the following way:
   Device number of device : 0xec27
   Labelling device        : yes
   Disk                    : VOL1
   Disk identifier         : 0XEC27
   Extent start (trk no)   : 0
   Extent end (trk no)     : 150254
   Compatible Disk Layout  : yes
   Blocksize               : 4096

--->> ATTENTION! <<---
All data of that device will be lost.
Type "yes" to continue, no will leave the disk untouched: yes
Formatting the device. This may take a while (get yourself a coffee).
cyl 385 of 3339 |#####------| 11%


DASD: Partitioning

Unlike other architectures, Linux on System z uses its own partitioning tool, fdasd, for DASD devices. The common Linux tool fdisk cannot be used in this environment! Nevertheless, the handling is similar. The system is limited to 3 partitions per disk when using DASD.

root@larsson:~> fdasd /dev/dasdb
reading volume label ..: VOL1
reading vtoc ..........: ok

Command action
   m   print this menu
   p   print the partition table
   n   add a new partition
   d   delete a partition
   v   change volume serial
   t   change partition type
   r   re-create VTOC and delete all partitions
   u   re-create VTOC re-using existing partition sizes
   s   show mapping (partition number - data set name)
   q   quit without saving changes
   w   write table to disk and exit

Command (m for help):


DASD: Partitioning (cont'd)

To create a partition:

root@larsson:~> fdasd /dev/dasdb
[...]
Command (m for help): n
First track (1 track = 48 KByte) ([2]-150254):
Using default value 2
Last track or +size[c|k|M] (2-[150254]):
Using default value 150254

Command (m for help): p

Disk /dev/dasdb:
  cylinders ............: 10017
  tracks per cylinder ..: 15
  blocks per track .....: 12
  bytes per block ......: 4096
  volume label .........: VOL1
  volume serial ........: 0XEC27
  max partitions .......: 3

 ------------------------------- tracks -------------------------------
       Device      start      end   length   Id  System
  /dev/dasdb1          2   150254   150253    1  Linux native
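The numbers in the fdasd and lsdasd listings can be cross-checked with a little shell arithmetic. This is only a sketch that reuses the values printed above (10017 cylinders, 15 tracks per cylinder, 12 blocks per track, 4096 bytes per block); the variable names are illustrative.

```shell
#!/bin/sh
# Geometry values copied from the fdasd "p" output above.
cylinders=10017
tracks_per_cyl=15
blocks_per_track=12
bytes_per_block=4096

tracks=$((cylinders * tracks_per_cyl))
blocks=$((tracks * blocks_per_track))
track_kb=$((blocks_per_track * bytes_per_block / 1024))
size_mb=$((blocks * bytes_per_block / 1024 / 1024))

echo "tracks:     $tracks"
echo "blocks:     $blocks"
echo "track size: $track_kb KByte"
echo "capacity:   $size_mb MB"
```

The results agree with the listings: 150255 tracks (dasdfmt), 48 KByte per track (fdasd), and 1803060 blocks or 7043 MB for a disk of this geometry (lsdasd).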


DASD: Partitioning (cont'd)

Your configuration is not complete until you write the changes to the disk:

root@larsson:~> fdasd /dev/dasdb
[...]
Command (m for help): w
writing VTOC...
rereading partition table...

Now we have a new partition device (e.g. /dev/dasdb1) which can be used like any other Linux disk:

root@larsson:~> mke2fs -j /dev/dasdb1
mke2fs 1.41.4 (27-Jan-2009)
[...]
Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 28 mounts or 180 days, whichever comes first. Use tune2fs -c or -i to override.
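The steps from the preceding slides can be collected into one sequence. The sketch below is a dry run: it only prints the commands for one device and executes nothing. The helper name dasd_setup_plan is hypothetical, and the sketch assumes the first partition is named by appending 1 to the device node, as in the examples. Note that fdasd and dasdfmt are interactive as shown in this presentation.

```shell
#!/bin/sh
# Dry run: print the command sequence from the slides for one DASD.
# dasd_setup_plan is a hypothetical helper name; nothing is executed.
dasd_setup_plan() {
    busid=$1      # device bus ID, e.g. ec27
    node=$2       # device node, e.g. /dev/dasdb
    echo "chccwdev -e $busid"                    # set the device online
    echo "dasdfmt -d cdl -b 4096 -f $node -p"    # low level format
    echo "fdasd $node"                           # create partition(s)
    echo "mke2fs -j ${node}1"                    # file system on partition 1
}

plan=$(dasd_setup_plan ec27 /dev/dasdb)
echo "$plan"
```

Printing the plan first makes it easy to review the device names before any destructive command such as dasdfmt is run by hand.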


Summary: Useful Commands

chccwdev -e/-d   enable/disable ccw device
dasdview         display extended DASD information
dasdfmt          low level format for DASD (ECKD) devices
fdasd            partitioning tool for DASD
lsdasd           list DASD related device information


Logical Volume Management (LVM)

✱ Linux file system is basically inflexible
✱ It is difficult to modify partitions on a running system
✱ LVM provides a higher-level view of the disk storage
✱ Gives you much more flexibility in allocating storage to applications and users
✱ You can resize and move logical volumes while partitions are still mounted and running
✱ Use LVM to manage logical volumes with names that make sense

[Figure: Logical Volumes /dev/vg1/lv1 and /dev/vg1/lv2 on top of a Volume Group, which is built from Physical Volumes]


LVM Features

General:
✱ You can combine several hard disks or partitions
✱ You can enlarge a logical volume when free space is exhausted
✱ You can add hard disks to the volume group in a running system
✱ You can add logical volumes in a running system
✱ You can use several hard disks with improved performance in the RAID 0 (striping) mode
✱ You can add up to 256 logical volumes
✱ The Snapshot feature enables consistent backups

Benefits for Linux on System z:
✱ Minidisks on z/VM cannot span more than one physical DASD volume.
✱ Without a volume management system like LVM, the size of a file system is limited to the size of a DASD volume.


Physical Volume Setup

Define each Physical Volume (PV):

root@larsson:~> pvcreate /dev/dasdb1
pvcreate -- physical volume "/dev/dasdb1" successfully created
root@larsson:~> pvcreate /dev/dasdc1
pvcreate -- physical volume "/dev/dasdc1" successfully created

You can only add partitions, not a complete DASD (e.g. /dev/dasdd).

Check your setup:

root@larsson:~> pvdisplay
"/dev/dasdb1" is a new physical volume of "2.29 GB"
--- NEW Physical volume ---
PV Name               /dev/dasdb1
VG Name
PV Size               2.29 GB
Allocatable           NO
PE Size (KByte)       0
Total PE              0
Free PE               0
Allocated PE          0
PV UUID               vwVK7v-WjGR-fdFh-GKOZ-WAtO-2cTy-r1gVs1

"/dev/dasdb1" is a new physical volume of "2.29 GB"
[...]


Volume Group Setup

Create a Volume Group (VG). This will be the "device" which will contain the physical volume(s).

root@larsson:~> vgcreate vg1 /dev/dasdb1 /dev/dasdc1
Volume group "vg1" successfully created
root@larsson:~> vgdisplay
--- Volume group ---
VG Name               vg1
System ID
Format                lvm2
Metadata Areas        2
Metadata Sequence No  4
VG Access             read/write
VG Status             resizable
[...]
VG Size               4.58 GB
PE Size               4.00 MB
Total PE              1172
Alloc PE / Size       0 / 0
Free PE / Size        1172 / 4.58 GB
VG UUID               mQYKAf-i51q-N67F-jK2h-WL3C-hO7N-5kUWsG


Creating a Logical Volume

The following example creates a linear logical volume named lv1 with a size of 4.5 GB in the volume group vg1:

root@larsson:~> lvcreate -L 4.5G -n lv1 vg1
Logical volume "lv1" created
root@larsson:~> lvdisplay
--- Logical volume ---
LV Name                /dev/vg1/lv1
VG Name                vg1
LV UUID                7Dso3A-ESEJ-mFEq-0TKu-cJcz-irox-XPsZeN
LV Write Access        read/write
LV Status              available
# open                 0
LV Size                4.50 GB
Current LE             1152
Segments               2
Allocation             inherit
Read ahead sectors     auto
- currently set to     256
Block device           253:0

To create a striped logical volume, use the -i parameter:

lvcreate -L 4.5G -i 2 -n lv1 vg1
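The extent counts in the vgdisplay and lvdisplay listings follow from the extent size. The sketch below redoes that arithmetic with the values shown (PE size 4 MB, 1172 extents in vg1, a 4.5 GB logical volume); the helper name extents_for_mb is illustrative, and rounding up to whole extents is an assumption about how a size request is turned into extents.

```shell
#!/bin/sh
pe_size_mb=4        # "PE Size 4.00 MB" from vgdisplay
total_pe=1172       # "Total PE 1172" from vgdisplay

# Round a size in MB up to a whole number of extents.
extents_for_mb() {
    echo $(( ($1 + pe_size_mb - 1) / pe_size_mb ))
}

lv_mb=4608          # 4.5 GB = 4.5 * 1024 MB
lv_le=$(extents_for_mb "$lv_mb")
free_pe=$((total_pe - lv_le))

echo "LV needs $lv_le extents"
echo "VG keeps $free_pe free extents ($((free_pe * pe_size_mb)) MB)"
```

The 1152 extents match the "Current LE" value reported by lvdisplay for lv1.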


LVM – Simple Example (linear device)

Logical Volumes (LV):
  LV1: LE 0-5, 6*4 = 24 MB
  LV2: LE 0-7, 8*4 = 32 MB
  LV3: LE 0-9, 10*4 = 40 MB

Mapping LE -> PV/PE, e.g. LV3:LE7 -> PV2:PE9

Volume Group: 96 MB

Physical Volumes (PV):
  PV1: PE 0-11, 12*4 = 48 MB
  PV2: PE 0-11, 12*4 = 48 MB

1 PE = 1 LE = 4 MB (default size)
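The example mapping LV3:LE7 -> PV2:PE9 can be reproduced with integer division. This sketch assumes that the logical volumes were allocated one after another in the order LV1, LV2, LV3 and that PV1 is filled before PV2; real allocations can look different. The function name map_le is illustrative.

```shell
#!/bin/sh
pe_per_pv=12    # each PV in the example holds PE 0-11

# Map a logical extent to "PV<n>:PE<m>", given the number of extents
# allocated to the logical volumes created before this one.
map_le() {
    extents_before=$1
    le=$2
    global=$((extents_before + le))
    echo "PV$((global / pe_per_pv + 1)):PE$((global % pe_per_pv))"
}

# LV1 has 6 extents and LV2 has 8, so LV3 starts at global extent 14.
lv3_le7=$(map_le 14 7)
echo "LV3:LE7 -> $lv3_le7"
```

Under the same assumption, LV2 crosses the boundary between the two physical volumes: its LE 6 is the first extent on PV2.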


LVM environment for striping

✱ Data evenly spread across disks
✱ The I/O capacity of the disk drives can be used in parallel to access data on the logical volume
✱ Performance improvement
✱ No fault-tolerance

[Figure: a Logical Volume in a Volume Group with three Physical Volumes; chunks 1, 2, 3 and 4, 5, 6 are placed on the three Physical Volumes in turn]
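Striping distributes consecutive chunks of the logical volume over the physical volumes in turn. The sketch below shows that round-robin placement for three physical volumes, assuming the numbers 1 to 6 in the figure stand for consecutive chunks; the function name stripe_pv is illustrative.

```shell
#!/bin/sh
# Which physical volume holds a given stripe chunk (chunks numbered from 1)?
stripe_pv() {
    chunk=$1
    pv_count=$2
    echo $(( (chunk - 1) % pv_count + 1 ))
}

for chunk in 1 2 3 4 5 6; do
    echo "chunk $chunk -> PV $(stripe_pv "$chunk" 3)"
done
```

Because neighbouring chunks sit on different disks, a large sequential access keeps all three disks busy at the same time.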


LVM components at a glance

✱ A DASD partition is called a physical volume (PV), because that's the volume where the data is physically stored.
✱ The PV is divided into several physical extents (PE) of the same size.
✱ The PEs are like blocks on the PV.
✱ Several PVs make up a volume group (VG), which becomes a pool of PEs available for the logical volume (LV).
✱ The LVs appear as normal devices in /dev/.
✱ You can add or delete PVs to/from a VG, and increase/decrease your LVs.


How to create a filesystem on top of a Logical Volume

Use the same commands as for any other regular Linux storage device:

root@larsson:~> mke2fs -j /dev/vg1/lv1
mke2fs 1.41.4 (27-Jan-2009)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
65536 inodes, 262144 blocks
13107 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=268435456
8 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376

Writing inode tables: done
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 38 mounts or 180 days, whichever comes first. Use tune2fs -c or -i to override.


Some more advanced LVM topics

✱ Removing a volume group
  – Deactivate the volume group: vgchange -a n vg1
  – Then remove the volume group: vgremove vg1
✱ Adding physical volumes to a volume group
  – Use 'vgextend' to add an initialized physical volume to an existing volume group: vgextend vg1 /dev/dasdd1
✱ Removing physical volumes from a volume group
  – Make sure the physical volume is no longer used by any logical volume, then use 'vgreduce' to remove it: vgreduce vg1 /dev/dasdd1


Some more advanced LVM topics (cont.)

✱ Removing a logical volume
A logical volume must be closed before it can be removed:

root@larsson:~> umount /dev/vg1/lv1
root@larsson:~> lvremove /dev/vg1/lv1
lvremove -- do you really want to remove "/dev/vg1/lv1"? [y/n]: y
lvremove -- doing automatic backup of volume group "vg1"
lvremove -- logical volume "/dev/vg1/lv1" successfully removed

✱ Reducing a logical volume (only ext2, ext3 and reiserfs)
Logical volumes can be reduced in size as well as increased. However, it is very important to reduce the size of the file system (or whatever else resides in the volume) before shrinking the volume itself; otherwise you risk losing data.

root@larsson:~> umount /lv1
root@larsson:~> resize2fs /dev/vg1/lv1 524288
root@larsson:~> lvreduce -L-1G /dev/vg1/lv1
root@larsson:~> mount /dev/vg1/lv1 /lv1
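Before shrinking the volume it is worth checking that the reduced file system still fits into the reduced logical volume. The sketch below does that comparison in bytes. It assumes the 4096-byte block size reported by mke2fs earlier and that a resize2fs size without a unit counts file system blocks; the 2 GiB and 1 GiB volume sizes are hypothetical values for illustration, and fs_fits_lv is an illustrative helper name.

```shell
#!/bin/sh
# Return success if a file system of fs_blocks blocks fits into lv_bytes.
fs_fits_lv() {
    fs_blocks=$1
    block_size=$2
    lv_bytes=$3
    [ $((fs_blocks * block_size)) -le "$lv_bytes" ]
}

gib=$((1024 * 1024 * 1024))
fs_bytes=$((524288 * 4096))     # resize2fs target from the example
echo "file system after resize2fs: $((fs_bytes / gib)) GiB"

# Hypothetical target sizes for the logical volume after lvreduce.
if fs_fits_lv 524288 4096 $((2 * gib)); then
    echo "a 2 GiB logical volume is large enough"
fi
if ! fs_fits_lv 524288 4096 "$gib"; then
    echo "a 1 GiB logical volume would cut off file system data"
fi
```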


Some more advanced LVM topics (cont.)

✱ Extending a logical volume
To extend a logical volume, tell the lvextend command either the new total size or the amount to add. To grow the volume to a given size:

root@larsson:~> lvextend -L6.7G /dev/vg1/lv1
lvextend -- extending logical volume "/dev/vg1/lv1" to 6.7 GB
lvextend -- doing automatic backup of volume group "vg1"
lvextend -- logical volume "/dev/vg1/lv1" successfully extended

To grow the volume by a given amount:

lvextend -L+2.2G /dev/vg1/lv1

After you have extended the logical volume, it is necessary to increase the file system size to match. How you do this depends on the file system you are using.

root@larsson:~> umount /dev/vg1/lv1
root@larsson:~> resize2fs /dev/vg1/lv1
root@larsson:~> mount /dev/vg1/lv1 /lv1

Reiserfs file systems can be resized using the following command: resize_reiserfs


Some more advanced LVM topics (cont.)

✱ Activating logical volumes after reboot
Unless the devices are enabled at boot time, the logical volumes must be activated later, when the system is up and running. You do this with the following sequence of commands:

root@larsson:~> chccwdev -e
root@larsson:~> pvscan
pvscan -- PV /dev/dasdb1 VG vg1 lvm2 [2.29 GB / 88.00 MB free]
[...]
root@larsson:~> vgscan
vgscan -- Reading all physical volumes. This may take a while...
vgscan -- Found volume group "vg1" using metadata type lvm2
root@larsson:~> lvscan
lvscan -- inactive '/dev/vg1/lv1' [4.41 GB] inherit
root@larsson:~> vgchange -ay
vgchange -- 1 logical volume(s) in volume group "vg1" now active

✱ For more information about LVM, please take a look at the HOWTO at http://tldp.org/HOWTO/LVM-HOWTO/


LVM command overview

✱ Create
  – {pv|vg|lv}create
✱ Display
  – {pv|vg|lv}display
  – {pv|vg|lv}s
✱ Remove
  – {pv|vg|lv}remove
✱ Scan
  – {pv|vg|lv}scan
✱ Attributes / Status
  – {pv|vg|lv}change
✱ Organize
  – pvmove
  – {vg|lv}extend / {vg|lv}reduce
  – {vg|lv}rename
  – vgcfgbackup / vgcfgrestore
  – lvconvert


Linux LVM Architecture

✱ Logical Volume Management applications
  – dmsetup: low level logical volume management
  – LVM2: latest version of Logical Volume Manager
  – Multipath: multipath configuration tool

[Figure: in user space, dmsetup, LVM2 and Multipath sit on top of libdevmapper; in the kernel, the Device Mapper]


Linux LVM Architecture (cont.)

✱ Libdevmapper
  – Library for interaction between user space and the kernel device mapper
  – Responsibilities
    • Discover set of associated devices
    • Create mapped devices (e.g. logical volume)
    • Create mapping table containing configuration information
    • Pass mapping table into kernel Device Mapper
    • Possibly save mapping information

✱ Device Mapper
  – Device Mapper is a kernel driver that provides a generic framework for volume management.
  – It provides a modular framework for constructing virtual block devices that map I/O to and from existing devices according to a table of rules.
  – Target drivers like
    • Linear/striped target
    • Mirror target
    • Multipath target

[Figure: dmsetup, LVM2 and Multipath on top of libdevmapper]

dmsetup options overview

✱ Manage
  – create
  – remove
  – remove_all
  – rename
  – suspend / resume
  – load / reload
✱ Info
  – ls
  – deps (ls --tree)
  – table
  – status
  – info (table -v / status -v)


Useful dmsetup commands

✱ List the current table for the device

hans@larsson:~> dmsetup table
vg1-lv1: 0 4800512 linear 94:5 384
vg1-lv1: 4800512 4636672 linear 94:9 384

✱ Give a list of (major, minor) pairs for devices referenced by the table

hans@larsson:~> dmsetup deps
vg1-lv1: 2 dependencies : (94, 9) (94, 5)

✱ List device names; --tree displays dependencies between devices as a tree

hans@larsson:~> dmsetup ls --tree
vg1-lv1 (253:0)
 ├─ (94:9)
 └─ (94:5)

  – Major 253 is the major number of device-mapper
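Each line of the dmsetup table output describes one segment as start sector, length in sectors, target type and target arguments, with sectors of 512 bytes. As a small sketch, the following adds up the two segment lengths from the example to get the size of the logical volume; the table text is copied from the listing and the variable names are illustrative.

```shell
#!/bin/sh
# The two table lines from the dmsetup example above.
table='vg1-lv1: 0 4800512 linear 94:5 384
vg1-lv1: 4800512 4636672 linear 94:9 384'

# Field 3 of each line is the segment length in 512-byte sectors.
sectors=$(printf '%s\n' "$table" | awk '{ sum += $3 } END { print sum }')
mb=$((sectors * 512 / 1024 / 1024))

echo "total: $sectors sectors = $mb MB"
```

The total of 4608 MB is the 4.5 GB requested with lvcreate, split over the two DASD partitions 94:5 and 94:9.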


Outlook

Additional storage topics:
✱ PAV
✱ HyperPAV


Questions?
