Proceedings of the Symposium

Volume Two

June 27th–30th, 2007, Ottawa, Ontario, Canada

Contents

Unifying Virtual Drivers – J. Mason, D. Shows, and D. Olien

The new ext4 filesystem: current status and future plans – A. Mathur, M. Cao, S. Bhattacharya, A. Dilger, A. Tomas, & L. Vivier

The 7 dwarves: debugging information beyond gdb – Arnaldo Carvalho de Melo

Adding Generic Process Containers to the Linux Kernel – P.B. Menage

KvmFS: Virtual Machine Partitioning For Clusters and Grids – Andrey Mirtchovski

Linux-based Ultra Mobile PCs – Rajeev Muralidhar, Hari Seshadri, Krishna Paul, & Srividya Karumuri

Where is your application stuck? – S. Nagar, B. Singh, V. Kashyap, . Seethraman, N. Sharoff, & P. Banerjee

Trusted Secure Embedded Linux – Hadi Nahari

Hybrid-Virtualization—Enhanced Virtualization for Linux – J. Nakajima and A.K. Mallick

Readahead: time-travel techniques for desktop and embedded systems – Michael Opdenacker

Semantic Patches – Y. Padioleau & J.L. Lawall

cpuidle—Do nothing, efficiently... – V. Pallipadi, A. Belay, & S. Li

My bandwidth is wider than yours – Iñaky Pérez-González

Zumastor Linux Storage Server – Daniel Phillips

Cleaning up the Linux Desktop Audio Mess – Lennart Poettering

Linux-VServer – H. Pötzl

Internals of the RT Patch – Steven Rostedt

lguest: Implementing the little Linux hypervisor – Rusty Russell

ext4 online defragmentation – Takashi Sato

The Hiker Project: An Application Framework for Mobile Linux Devices – David Schlesinger

Getting maximum mileage out of tickless – Suresh Siddha

Containers: Challenges with the memory resource controller and its performance – Balbir Singh and Vaidyanathan Srinivasan

Kernel Support for Stackable File Systems – Sipek, Pericleous & Zadok

Linux Rollout at Nortel – Ernest Szeideman

Request-based Device-mapper multipath and Dynamic load balancing – Kiyoshi Ueda

Short-term solution for 3G networks in Linux: umtsmon – Klaas van Gend

The GFS2 Filesystem – Steven Whitehouse

Driver Tracing Interface – David J. Wilder, Michael Holzheu, & Thomas R. Zanussi

Linux readahead: less tricks for more – Fengguang Wu, Hongsheng Xi, Jun Li, & Nanhai Zou

Regression Test Framework and Kernel Execution Coverage – Hiro Yoshioka

Enable PCI Express Advanced Error Reporting in the Kernel – Y.M. Zhang and T. Long Nguyen

Enabling Linux Network Support of Hardware Multiqueue Devices – Zhu Yi and P.J. Waskiewicz

Concurrent Pagecache – Peter Zijlstra

Conference Organizers

Andrew J. Hutton, Steamballoon, Inc., Linux Symposium, Thin Lines Mountaineering
C. Craig Ross, Linux Symposium

Review Committee

Andrew J. Hutton, Steamballoon, Inc., Linux Symposium, Thin Lines Mountaineering
Dirk Hohndel, Intel
Martin Bligh, Google
Gerrit Huizenga, IBM
Dave Jones, Red Hat, Inc.
C. Craig Ross, Linux Symposium

Proceedings Formatting Team

John W. Lockhart, Red Hat, Inc.
Gurhan Ozen, Red Hat, Inc.
John Feeney, Red Hat, Inc.
Len DiMaggio, Red Hat, Inc.
John Poelstra, Red Hat, Inc.

Authors retain copyright to all submitted papers, but have granted unlimited redistribution rights to all as a condition of submission.

Unifying Virtual Drivers

Jon Mason, 3Leaf Systems, [email protected]
Dwayne Shows, 3Leaf Systems, [email protected]
Dave Olien, 3Leaf Systems, [email protected]

Abstract

Para-virtualization presents a wide variety of issues to Operating Systems. One of these is presenting virtual devices to the para-virtualized OS, as well as the device drivers which handle these devices. With the increase of virtualization in the Linux kernel, there has been an influx of unique drivers to handle all of these new virtual devices, and there are more devices on the way. The current state of Linux has four in-tree versions of a virtual network device (IBM pSeries Virtual Ethernet, IBM iSeries Virtual Ethernet, UML Virtual Network Device, and TUN/TAP) and numerous out-of-tree versions (one for Xen, VMware, 3leaf, and many others). Also, there are a similar number of virtual block device drivers.

This paper will go into why there are so many, and the differences and commonalities between them. It will go into the benefits and drawbacks of combining them, their requirements, and any design issues. It will discuss the changes to the Linux kernel to combine the virtual network and virtual block devices into two common devices. We will discuss how to adapt the existing virtual devices and drivers to take advantage of this new interface.

1 Introduction to I/O Virtualization in Linux

Virtualization is the ability of a system, through hardware and/or software, to run multiple instances of Operating Systems simultaneously. This is done by abstracting the physical hardware layer through a software layer known as a hypervisor. Virtualization is primarily implemented in two distinct ways: Full Virtualization and Para-virtualization.

Full Virtualization is a method that fully simulates hardware devices by emulating physical hardware components in software. This simulation of hardware allows the OS and other components to run their software unmodified. In this technique, all instructions are translated through the hypervisor into hardware system calls. The hypervisor controls the devices and I/O, and simulated hardware devices are exported to the virtual machine. This hardware simulation does have a significant performance penalty when compared to running the same software on native hardware, but allows the user to run multiple instances of a VM (virtual machine) simultaneously (thus allowing a higher utilization of the hardware and I/O). Examples of this type of virtualization are QEMU and VMware.

Para-virtualization is a method that modifies the OS running inside the VM to run under the hypervisor. It is modified to support the hypervisor and avoid unnecessary use of privileged instructions. These modifications allow the performance of the system to be near native. However, this type of virtualization requires that virtual devices be exposed for access to the underlying I/O hardware. UML and Xen are examples of such virtualization implementations. The scope of this paper considers only para-virtualized virtual machine abstractions; this is where virtual device implementations have proliferated.

To have better I/O performance in a virtual machine, the VMs have virtual devices exported to them and have the native device located in another VM or in the hypervisor. These virtual devices have virtual device drivers for them to appear in the VM as if they were native, physical devices. These virtual device drivers allow the data to be passed from the VM to the physical hardware. This hardware then does the required work, and may or may not return information to the virtual machine. This level of abstraction allows for higher performance I/O, but the underlying procedure to perform the requested operation differs on each para-virtualization implementation.

2 Linux Virtualization support

There are many Linux-centered virtualization implementations, both in the kernel and outside the kernel. The implementations that are currently in the Linux kernel have been there for some time and have matured over that time to be robust. Those implementations are UML, IBM's POWER virtualization, and the tun/tap device driver. Those implementations that are currently living outside the kernel tree are gaining popularity, and may one day be included in the mainline kernel. The implementations of note are Xen and 3leaf Systems virtualization (though many others exist).

2.1 In-tree Linux Virtualization support

2.1.1 User Mode Linux

User Mode Linux (UML) provides virtual block and network interfaces with a couple of predominant drivers.

Briefly, the network drivers recommended for UML are TUN/TAP; these are described in a later section. The virtual devices created can be used with other virtual devices on other VMs, or with systems on the external network. To configure for forwarding to other VMs, UML comes with a switch daemon that allows user-level forwarding of packets. UML supports hot plug devices: new virtual devices can be configured dynamically while the system is up. As long as a multicast-capable NIC is available on the host, UML supports multicast devices and it is possible to multicast to all the VMs.

UML's virtual storage solution consists of exporting a file from a filesystem on storage known by the host to the guest. That storage can be used directly through reads and writes, or with a feature called IO memory emulation that allows a file to be mapped as an IO region; from the kernel address space, the driver can then make the region available for user processes to mmap into their own address spaces.

All the guest reads and writes go through an API to the host and are translated to host reads and writes. As such, these operations use the buffer cache on the host. This causes a disproportionate amount of memory to be consumed by the UMLs, without limits as to how much they can utilize. As is characteristic of Linux, flushing modified buffers back to disk is done when appropriate, based upon memory pressure in the host.

Another feature of the UML virtual block device is support for "copy on write" (COW) partitions. COW partitions allow a common image of a root device to be shared among many instances of UML; writes to blocks on the COW partition are kept in a unique device for that UML instance. This is more convenient than the Xen recommendation of using the LVM to provide COW functionality.

UML provides a number of APIs for file IO. These hook into native, largely unmodified Linux code, and provide the virtual interface to the physical device on the host. Examples of these interfaces are os_open_file, os_read_file, and os_write_file.
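Such host-side helpers are essentially thin wrappers around ordinary system calls. The sketch below is ours, not UML's actual code; the real prototypes (under arch/um/os-Linux/ in the kernel tree) differ in detail, e.g. os_open_file takes a struct openflags:

    /* Hedged sketch of UML-style host file IO wrappers. */
    #include <unistd.h>
    #include <fcntl.h>
    #include <errno.h>

    int os_open_file(const char *file, int flags, int mode)
    {
            int fd = open(file, flags, mode);
            return fd < 0 ? -errno : fd;   /* negative errno on failure */
    }

    int os_read_file(int fd, void *buf, int len)
    {
            int n = read(fd, buf, len);
            return n < 0 ? -errno : n;
    }

    int os_write_file(int fd, const void *buf, int len)
    {
            int n = write(fd, buf, len);
            return n < 0 ? -errno : n;
    }

The point of the layering is that everything above these wrappers is ordinary guest kernel code; only this bottom layer knows the "device" is really a host file.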

2.1.2 IBM POWER virtualization

IBM's PowerPC-based iSeries and pSeries hardware provides a native hypervisor in system firmware, allowing the system to be easily virtualized. The user-level tools allow the underlying hardware to be assigned to specific virtual machines, and, in certain cases, no I/O hardware at all. In the latter case, one virtual machine is assigned the physical adapters and manages all I/O to and from those adapters. The hypervisor then exposes virtual devices to the other virtual machines. The virtual devices are presented to the OS via the system's firmware device tree, and from the OS's perspective appear to be on a system bus [4]. Currently, the hypervisor supports virtual SCSI, ethernet, and TTY devices.

For the virtual ethernet device, the hypervisor implements a system-wide, VLAN-capable switch [4]. This enables a virtual ethernet device in one VM to have a connection to another VM via its virtual ethernet device. This virtual switch can be connected to the external network only by connecting a virtual ethernet device to a physical device in one of the VMs. In Linux, this would be done via the kernel-level ethernet bridging code.

For the virtual SCSI device, the connection is based on a client-server model [4]. The SCSI virtual client device works like most SCSI host controllers: it handles the SCSI commands via the SCSI mid-layer and issues SCSI commands to those devices. The SCSI virtual server device receives all SCSI commands and is responsible for handling them. The virtual SCSI inter-partition communication protocol is the SCSI RDMA Protocol.

2.1.3 TUN/TAP driver

The tun/tap driver has two basic functions, a network tap and a network tunnel. The network tap simulates an ethernet device, and encapsulates incoming data inside an ethernet frame. The network tunnel simulates an IP layer device, and encapsulates the incoming data in an IP packet. It can be viewed as a simple Point-to-Point or Ethernet device which, instead of sending and receiving packets from a physical media, sends and receives them from user-space programs [5].

This enables virtualization programs running in user space (for example UML) access to the network without needing exclusive access to a network device, or any additional network configuration or OS kernel changes.
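Attaching to the driver from user space follows the pattern documented in [5]: open /dev/net/tun and bind it to an interface with the TUNSETIFF ioctl. The interface name below is only an example:

    #include <string.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/ioctl.h>
    #include <net/if.h>
    #include <linux/if_tun.h>

    /* Returns an fd; read() yields frames, write() injects them. */
    int tun_alloc(char *dev, short flags)   /* IFF_TUN or IFF_TAP */
    {
            struct ifreq ifr;
            int fd;

            if ((fd = open("/dev/net/tun", O_RDWR)) < 0)
                    return fd;

            memset(&ifr, 0, sizeof(ifr));
            ifr.ifr_flags = flags | IFF_NO_PI;  /* raw frames, no extra header */
            if (*dev)
                    strncpy(ifr.ifr_name, dev, IFNAMSIZ - 1);

            if (ioctl(fd, TUNSETIFF, &ifr) < 0) {
                    close(fd);
                    return -1;
            }
            strcpy(dev, ifr.ifr_name);  /* kernel may have picked the name */
            return fd;
    }

A user-space VM then simply reads guest-bound frames from this descriptor and writes guest-originated frames back to it.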
2.2 Out-of-tree Linux Virtualization support

2.2.1 Xen

Xen provides an architecture for device driver isolation using a split driver architecture. The virtual device functionality is provided by front-end and back-end device drivers. The front-end driver runs in the unprivileged "guest" domain, and the back-end runs in a "privileged" domain with access to the real device hardware. For block devices, these drivers are called blkfront and blkback; for the network devices, netfront and netback. On the front end, the virtual device appears as a physical device and receives IO requests from the guest kernel. The front-end driver must issue requests to the back-end driver since it doesn't have access to physical hardware. The back-end verifies that the request is safe and issues it to the real device. The back-end appears to the hypervisor as a normal user of in-kernel IO functionality. When the IO completes, the back-end notifies the front-end that its data is ready. The front-end driver reports IO completions to its own kernel. The back-end is responsible for translating device addresses and verifying that requests are correct and do not violate isolation guarantees.

Xen accomplishes device virtualization through a set of clean and simple device abstractions. IO data is transferred to and from each domain via grant tables, using shared-memory, asynchronous buffer-descriptor rings. These are said to provide a high-performance communication mechanism for passing buffer information vertically through the system, while allowing Xen to efficiently perform validation checks—for example, of a domain's credits. This shared-memory interface is the fundamental mechanism supporting the split device drivers for network and block IO.

Each domain has its own grant table. This data structure is shared with the hypervisor, allowing the domain to tell Xen what kinds of permissions other domains have on its pages. Entries in the grant table are identified by a grant reference, an integer which indexes into the grant table. It acts as a capability which the grantee can use to perform operations on the granter's memory. This mechanism allows shared-memory communications between unprivileged domains. A grant reference also encapsulates the details of a shared page, removing the need for a domain to know the real machine address of a page it is sharing. This makes it possible to share memory correctly among domains running in fully virtualized memory.

Grant table manipulation, the creation and destruction of grant references, is done by direct access to the grant table. This removes the need to involve the hypervisor when creating grant references, changing access permissions, etc. The grantee domain invokes hypercalls to use the grant reference.

Xen uses an event-delivery mechanism for sending asynchronous notifications to a domain, similar to a hardware interrupt. These notifications are made by updating a bitmap of pending event types, and optionally calling an event handler specified by the guest. The events can be held off at the discretion of the guest.
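Returning to the grant table for a moment: an entry is small. The sketch below approximates the version-1 layout from Xen's public grant_table.h header, with types simplified to keep the example self-contained:

    #include <stdint.h>

    typedef uint16_t domid_t;
    typedef uint32_t grant_ref_t;   /* an index into the grant table */

    struct grant_entry {
            uint16_t flags;  /* e.g. GTF_permit_access, GTF_readonly */
            domid_t  domid;  /* domain being granted access */
            uint32_t frame;  /* machine frame number being shared */
    };

A domain shares a page by filling in one of these entries in its own table; the peer then refers to the page purely by the grant reference.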

Xenstore is the mechanism by which these event channels are set up, along with the shared memory frame. It is used for setting up shared memory regions and event channels for use with the split device drivers. The store is arranged as a hierarchical collection of key-value pairs. Each domain has a directory structure containing data related to its configuration.
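As a concrete (illustrative, not authoritative) example, the store entries describing a guest's first virtual network interface look roughly like this, with the values wired up by the control tools at device-creation time:

    /local/domain/3/device/vif/0/backend       = "/local/domain/0/backend/vif/3/0"
    /local/domain/3/device/vif/0/backend-id    = "0"
    /local/domain/3/device/vif/0/mac           = "00:16:3e:12:34:56"
    /local/domain/3/device/vif/0/tx-ring-ref   = "8"
    /local/domain/3/device/vif/0/rx-ring-ref   = "9"
    /local/domain/3/device/vif/0/event-channel = "17"
    /local/domain/3/device/vif/0/state         = "4"

The ring references are grant references for the shared ring pages, and the state node lets each end watch the other walk through the connection handshake.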

The setup protocol for a device channel should consist of entering the configuration data into the Xenstore area. The store allows device discovery without requiring the relevant device structure to be loaded. The probing code in the guest should see the Xen "bus."

Once communication is established between a pair of front- and back-end drivers, the two can communicate by directly placing requests/responses into shared memory and then signaling on the event channel. This separation allows for message batching, making for efficient device access.

Xen Network IO

As mentioned, the shared memory communication area is shared between front-end and back-end domains. From the point of view of other domains, the back end is viewed as a virtual ethernet switch, with each domain having one or more virtual network interfaces connected to it.

From the reference point of the back-end domain, the network driver on the back end consists of a number of ethernet devices. Each of these has a connection to a virtual network device in another domain. This allows the back-end domain to route, bridge, firewall, etc. all traffic from and to the other domains using the usual Linux mechanisms.

The back end is responsible for:

• Validation of data. The back end ensures the front ends do not attempt to generate invalid traffic. The back end may look at headers to validate MAC or IP addresses, making sure they match the interface they have been sent from.

• Scheduling. Since a number of domains can share the same physical NIC, the back end must schedule between domains that have packets queued for transmission, or that may have ingress traffic. The back end is capable of traffic-shaping or rate-limiting schemes. Logging/Accounting on the back end can be configured to track/record events.

• Demultiplexing. Ingress packets from the network are received by the back end. The back end simply acts as a demultiplexer, forwarding incoming packets to the correct front end via the appropriate virtual interface.

The asynchronous shared buffer rings described earlier are used for the network interface to implement transmit and receive rings. Each descriptor ring identifies a block of contiguous machine memory allocated to the domain. The transmit ring carries packets to transmit from the guest to the back-end domain. The return path of this ring carries messages indicating the contents have been transmitted. This signals that the back-end driver does not need the pages of memory associated with that request.

To receive packets, the guest puts descriptors for unused pages of memory on the receive ring. The back end exchanges these pages in the domain's memory with new pages containing the received packet, passing back descriptors regarding the new packets in the ring. This is a zero-copy approach, allowing the back end to maintain a pool of free pages to receive packets into, delivering them to the associated domains after reading their headers. This is known as page flipping.

A domain that doesn't keep its receive ring filled with empty buffers will have dropped packets. This is seen as an advantage by Xen because it limits livelock problems: the overloaded domain will stop receiving further data. Similarly, on the transmit path, it provides the application feedback on the rate at which packets can leave the system.

Flow control on the rings is managed by a mechanism independent of the flow of data on the transmit/receive rings. In this way the ring is divided into two message queues, one in each direction.
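The descriptor rings have a simple shape. The sketch below is a simplified, self-contained rendering of the idea (the real definitions are the ring macros in Xen's io/ring.h and the netif request/response structures):

    #include <stdint.h>

    #define RING_SIZE 32                /* power of two, so masking works */

    struct netif_tx_request {
            uint32_t gref;              /* grant reference for the packet page */
            uint16_t offset;            /* data offset within that page */
            uint16_t flags;
            uint16_t id;                /* echoed back in the response */
            uint16_t size;              /* packet length in bytes */
    };

    struct netif_tx_response {
            uint16_t id;
            int16_t  status;
    };

    struct shared_ring {
            uint32_t req_prod;          /* advanced by the front end */
            uint32_t rsp_prod;          /* advanced by the back end */
            union {
                    struct netif_tx_request  req;
                    struct netif_tx_response rsp;
            } ring[RING_SIZE];
    };

    /* Front end: publish one transmit request. A real implementation
     * issues a write memory barrier before advancing req_prod, then
     * notifies the back end over the event channel. */
    static void ring_put_request(struct shared_ring *r,
                                 const struct netif_tx_request *req)
    {
            r->ring[r->req_prod & (RING_SIZE - 1)].req = *req;
            r->req_prod++;
    }

Responses travel back on the same page in the opposite direction, which is exactly the two-message-queue structure described above.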
Xen Block IO

All disk access uses the virtual block device interface, which allows domains access to block storage devices visible to the block back-end device. The virtual block device is a split driver, like the network interface. A single shared memory ring is used between the front and back end for each virtual device; this ring handles all IO requests and responses for that virtual device.

Many storage types can be exported by the back-end domain for use by the front end—various network-based block devices such as iSCSI or NBD, as well as loopback and multipath devices. These devices get mapped to a device node on the front end in a static way defined by the guest's startup configuration.

The ring used by block IO supports two message types, read and write. For a read, the front end identifies the device and location to read from, and attaches pages for the data to be copied into. The back end acknowledges completed reads after the data is transferred from the device into the buffer, typically by the underlying physical device's DMA engine. Writes are analogous to reads, except data moves from the front end to the back end.

Xen IO Configuration

Domains with physical device access (i.e., driver domains) receive access to certain PCI devices on a limited basis, acquiring access to interrupts and bus address space. Many guests attempt to determine the PCI configuration by accessing the PCI BIOS. Xen forbids such access and instead provides a hypercall, physdev_op, to set/query configuration details.

2.2.2 3leaf Virtualization

3leaf Systems provides virtual storage and network interfaces built on top of typical Linux APIs. Their devices span physical machine boundaries, so that front-end drivers can be on one system and the back-end drivers hosted where the physical devices reside.

Their front-end drivers communicate with the back-end drivers through a transport-agnostic interface that lends itself to running on any number of transports that meet some minimal set of requirements.

Front-end network devices look like a normal ethernet interface for the purposes of the front-end application/user. Such devices have their own MAC addresses, randomly generated. Egress packets get wrapped with header information before being passed to the lower layers of the transport; this header is used for demultiplexing. The corresponding virtual NIC on the back end passes the packet to the bridge, where the packet gets forwarded to its recipient.

The storage front end looks like a SCSI device, usually with a SCSI device backing the back-end driver. The front end registers with the block device layer so the SCSI mid-layer can pass off requests with scatterlists. These get wrapped with header information before being passed to the lower layers of the transport; this header is used for demultiplexing on the back end, where the request is forwarded to the SCSI device. Completions come back to the back end, where they are wrapped with the header information for the return trip through the transport to the front end. The front-end driver forwards the completion to the SCSI mid-layer.

The 3leaf stack can manage multiple devices and hotplug events, but can entail software queuing at different levels, especially in the networking stack. Interactions between front and back ends are mitigated through thoughtful use of scatter-gather lists to chain requests. 3leaf services do not rely on an operating system being fully or para-virtualized; it is more of a distributed IO services mechanism. In some ways the hardware where the back end runs is analogous to a hypervisor, whereas the front-end systems can be many, and serve as the guests in the model of the virtual IO services surveyed here.

3leaf virtual storage supports a number of useful features for providing diskless front-end clients with controlled, high-availability, high-performance access to storage. It also includes tools for assisting centralized provisioning of these distributed clients.

Following is a list of capabilities supported by this storage virtualization implementation, each with a brief description of its implementation.

• Multiple redundant access paths to storage.
A typi- itself to running on any number of transports that meet cal configuration has two or more back-end servers, some minimal set of requirements. each with redundant paths to storage on a fibre channel SAN. The fail-over and fail-back between Front-end network devices look like a normal ether- redundant fibre channel paths on the back-end re- net interface for the purposes of the front-end applica- dundant paths is managed by the Linux device tion/user. Such devices have their own MAC addresses mapper multipathing. A front end then has paths to randomly generated. Egress packets get wrapped with storage through two or more back-end servers. The header information before being passed to the lower lay- front-end and back-end storage software manages ers of the transport. This header is used for demulti- fail over and fail back between back-end servers. plexing. The corresponding virtual NIC on the back end passes the packet to the bridge, where the packet gets • Block or SCSI storage devices. The virtual storage forwarded to its recipient. devices can appear on the front-end clients as either block or SCSI devices. The storage front end looks like a SCSI device, usually having a SCSI device backing the back-end driver. The • Name persistence. The Linux hotplug software front-end registers with the block device layer so the also includes mechanisms that can be used give 14 • Unifying Virtual Drivers

• Name persistence. The Linux hotplug software also includes mechanisms that can be used to give storage devices persistent names; for example, these names can be based on the UUID of the physical storage device. This mechanism requires some intervention on the front-end client to establish these mappings. The hotplug mechanism is still available for use on the 3leaf front-end clients, but the 3leaf virtual storage software also maintains its own stable naming mechanism, which can be administered centrally from the back-end storage server.

• Boot support for diskless front-end clients. The front-end client is able to load its operating system from the back-end server and then use a virtual disk as its root device. Access to this root device will also be highly available, using the redundant paths provided by the storage virtualization.

• COW virtual devices. As with UML, the 3leaf COW devices are especially useful for root disks. They allow several client front ends to share a common root file system, allocating additional storage only for each client's modifications to that shared image.

• NPIV. N-port interface virtualization is another mechanism which allows the owner of the SAN to regulate access to devices on that SAN by individual front-end clients. This is based upon a virtual host bus adapter model, where the SAN administrator can associate sets of storage devices with virtual HBAs and then associate individual VHBAs with different client machines.

• Centralized provisioning. The back-end servers interact with a distributed set of tools to specify the virtual storage environment for the storage client. Storage devices can be added or removed from the client's environment, generating hotplug events to update the client's storage name space.

The 3leaf virtual storage implementation is conceptually similar to the Xen storage device implementation. Both are based on an efficient and reliable means of passing messages and DMA data transfers between the "guest" or front-end clients, and the "host" or back-end servers.

The messaging mechanism is used to implement rpc-like communication between the back-end servers and the front-end clients. This rpc mechanism is used to simulate SCSI or block device behavior when needed, and to support the creation/deletion of virtual disks, the construction of COW devices, and VHBAs.

When the underlying transport supports it (e.g., infiniband RDMA), disk data is transferred using RDMA read and RDMA write operations, initiated on the client. The RDMA operations use "opaque" handles to identify the memory on the back-end servers that is the source or target of RDMA reads and writes, respectively. The client memory to be used is identified by a scatter/gather list.

These opaque RDMA handles are generated on the back-end servers as part of registering portions of the server's physical memory to be used for RDMA operations. In the case of infiniband, these handles are encoded in such a way that they cannot be forged. This provides some isolation between client front ends, making it difficult for clients to maliciously generate RDMA handles to memory they should not have access to. The opaque memory handles are transmitted from the back-end server nodes to the front ends using the RPC mechanism. Each client is given a set of RDMA handles for segments of server memory. These memory segment sets are not shared between clients, so there is no chance of misdirected RDMA operations. Each client has ownership and control of its set of RDMA handles until it releases them. To perform a disk IO transfer, a client allocates an RDMA handle from the set given to it by the destination server. In the case of a write, it first transfers data using RDMA into the server's memory, then issues an rpc call to the server instructing it to write that memory to a disk device. In the case of a disk read, the client first sends an rpc to the server instructing it to read data from disk into the server's memory.
It then uses its RDMA handle for that memory to transfer the data from the server's memory to the client's.

The rest of the virtual disk implementation is built upon these messaging and RDMA primitives. On the client, the virtual disk implementation appears to the Linux operating system's block layer and SCSI mid-layer as just another disk driver. The driver accepts either SCSI requests or block request structures, and translates them into rpc messages and RDMA operations targeted towards one of the servers providing access to disk storage. If the targeted server fails, these operations will be re-directed to one of the redundant servers for that storage.
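The write path just described can be summarized in code. 3leaf's actual interfaces are not public, so every name in this sketch (rdma_handle_alloc, rdma_write, rpc_call, and friends) is hypothetical; only the sequence of steps comes from the text above:

    #include <stdint.h>
    #include <errno.h>

    struct server;                      /* back-end server connection */
    struct scatterlist;                 /* client-side buffer list */
    struct rdma_handle;                 /* opaque server-memory token */
    enum { RPC_DISK_WRITE = 1 };

    struct rdma_handle *rdma_handle_alloc(struct server *srv);
    void rdma_handle_free(struct server *srv, struct rdma_handle *h);
    int  rdma_write(struct server *srv, struct rdma_handle *h,
                    struct scatterlist *sgl, int nents);
    int  rpc_call(struct server *srv, int op, struct rdma_handle *h,
                  uint64_t arg);

    int virtual_disk_write(struct server *srv, struct scatterlist *sgl,
                           int nents, uint64_t lba)
    {
            struct rdma_handle *h;
            int err;

            h = rdma_handle_alloc(srv);            /* from our granted set */
            if (!h)
                    return -EBUSY;

            err = rdma_write(srv, h, sgl, nents);  /* data into server RAM */
            if (!err)
                    err = rpc_call(srv, RPC_DISK_WRITE, h, lba); /* commit */

            rdma_handle_free(srv, h);              /* release back to the set */
            return err;
    }

A read runs the two steps in the opposite order: the rpc first fills server memory from disk, and the client then RDMAs the data out of it.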

On the server side, the disk virtualization is logically at the same level in the Linux disk software stack as the Linux page cache. The client and server virtualization software manages its own pool of memory for RDMA operations. RPC requests from the client cause block requests to be submitted to the server's block layer, causing disk read and write operations. In this way, disk data transfers are performed without any copying of disk data by the CPUs on either the clients or the servers.

2.2.3 Non-local memory transport of data via IB

OpenFabrics.org is another player in the fabric of virtual IO solutions. Their solutions use infiniband as the transport to provide network and storage solutions to low-cost, front-end machines that run with their drivers and commodity HCA cards.

In this paradigm, virtual network interfaces are provided over the HCA ports with a protocol built on the verbs/access layer, called the IPoIB module. This uses the two physical interfaces for each HCA; however, the MAC used is the GUID, which means it becomes difficult to put these packets on ipv4 networks. The ethernet header then needs to be massaged before transmitted packets can go out. When the virtual device's MAC is tied to the hardware, it becomes difficult to migrate virtual devices to other ports. The packets have to be bridged from the IB network to the ethernet network, much as packets from other solutions are.

The storage solution provided is a module based on a SCSI RDMA Protocol (SRP) initiator. It provides access to IB-based SRP storage targets.

3 Virtualized I/O

3.1 Virtualized Networking

The needs for a unified virtual ethernet driver are many. First, there is the multitude of pre-existing code, and the potential for more in the future, along with the partial addition of features in an uncoordinated effort when many of the rest could also benefit from those features. UML provides switching from a user-level daemon, which has lower performance than a kernel module. Not all implementations support hot-plug additions of virtual devices to the VM. Self-virtualizing devices are soon to be on the market, and a common architecture should be adapted to take advantage of that functionality in the NICs.

3.2 Virtualized Storage

The need for unification of virtualized storage drivers is severalfold. First, file-based partitions are slow. While convenient for desktop users, they do not meet the needs of an enterprise-scaled organization. It is well documented that read/write performance through the interfaces UML uses is slow. Additionally, file-based partitions suffer from unchecked buffer cache growth on the host system. That tends to help performance, keeping much of the disk/partition resident in memory; however, as the number of disks grows, it can cause inequities as to which VM has the lion's share of memory tied up. Attempted solutions to resolve that have been controversial, and seem too specialized for the particular hypervisor running. mmap performance is not much better as an alternative, as some experiences indicate.

Second, CoW implementations are inconsistent. The market has shown the need for a single master root drive that is read-only, backed by a writable device to manage per-VM configuration differences, which CoW provides a per-block mechanism for satisfying. Using the Linux Volume Manager as a solution goes beyond its original design; likewise, there are limitations in using a bitmap implementation exclusively.

As for disk-based partitions vs. file-based partitions, a disk gives a better unit of granularity, one that SAN storage providers have deployed with. It makes concepts like zoning and n-port virtualization much more achievable.

SCSI-based devices seem a familiar and dependable mechanism to build a framework under. This mechanism is used by many other drivers and is itself a dependable framework supporting multiple device types. SCSI request blocks are already a well-defined way to format requests and completions.
A flexible back end should support all device types; however, for the scope of this document the discussion will be focused on SCSI disks. Such disks could be SAS (serial attached SCSI) or storage that is allocated from a SAN fabric through an HBA (host bus adapter).

Lastly, a single flexible implementation will be better supported by the Linux community.

4 Design requirements and open issues

Abstracting the existing virtual drivers into a generic implementation raises interesting design requirements and issues. Regardless of how the underlying virtualization actually works, there are a few basic interfaces for the network transport layer to interlock with the virtual ethernet driver, and for the storage transport layer to interlock with the virtual SCSI driver.

4.1 Net

All the virtual ethernet device drivers described above contain certain basic features which are common, though the underlying methods of transferring the data from one point to another differ. All virtual network device drivers have functions which create and delete the device, start and stop the network stack, and transmit and receive data. These basic functions can be created in a generic virtual ethernet driver, which would then be supplemented by a specific module that handles any transport-specific calculations and transports the data from one point to another.

By providing a veth_ops data structure, with pointers to the transport layer "helper" functions, this division of labor can greatly increase the ability to separate the existing virtual ethernet driver into a generic layer and a transport layer.

For example, in Xen the packet to be transmitted has a few transport-specific calculations that need to be done (like insertion into the grant table, and calculation of the number of pages the skb fits into). Those can be pushed down into a transmit transport layer, to which the skb is handed for transmission. In contrast, the tun/tap driver simply queues the packet onto a skb queue.

veth_ops->tx(struct sk_buff *skb, struct net_device *netdev);

This enables the generic error checking and setup that all data transmission routes need in the generic tx stub, while the transport-specific work is done in the function pointed to by the veth_ops.
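Collected into one structure (the membership here is our illustration of the interface this paper describes, not an existing kernel API), the operations look like this:

    #include <linux/netdevice.h>
    #include <linux/skbuff.h>

    /* Transport-layer helpers backing the generic virtual ethernet
     * driver; one instance per transport (Xen, UML, 3leaf, ...). */
    struct veth_ops {
            int (*open)(struct net_device *netdev);   /* set up transport resources */
            int (*close)(struct net_device *netdev);  /* tear them down */
            int (*tx)(struct sk_buff *skb, struct net_device *netdev);
    };

The generic driver's entry points perform the common validation and accounting, then call through these pointers for the transport-specific step.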

For receiving, things can be significantly simpler using NAPI. Registering a generic poll routine on device open, and providing a skb queue to pull from, makes this very generic. The transport layer populates the queue as packets are pulled off its transport (via interrupt, etc.), as sketched below.
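The following sketch of such a generic poll routine is written against today's napi_struct interface rather than the 2.6.21-era one this paper was developed on; the structure of the loop is the same either way:

    #include <linux/netdevice.h>
    #include <linux/skbuff.h>

    struct veth_priv {
            struct napi_struct  napi;
            struct sk_buff_head rx_queue;   /* filled by the transport layer */
    };

    static int veth_generic_poll(struct napi_struct *napi, int budget)
    {
            struct veth_priv *priv = container_of(napi, struct veth_priv, napi);
            struct sk_buff *skb;
            int work = 0;

            while (work < budget &&
                   (skb = skb_dequeue(&priv->rx_queue)) != NULL) {
                    netif_receive_skb(skb);  /* hand the packet to the stack */
                    work++;
            }
            if (work < budget)
                    napi_complete(napi);     /* drained; re-arm transport events */
            return work;
    }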

Open and close will need generic and transport-layer constructs to set up the virtual device to transmit and receive data, as well as to destroy any resources allocated. While some of these functions are specific to the transport layer (like creating buffer pools), most of the drivers require only the very basic setup of zeroing statistics and starting the transmit queue.

veth_ops->open(struct net_device *netdev);
veth_ops->close(struct net_device *netdev);

Unfortunately, driver probe and create are very dependent on how the attributes are passed to the driver. Since some are passed via hcall and others are provided by user-space tools, there really is no abstract way to do this.

There are other functions that can easily be made generic. Specifically, functions to change the MTU, handle transmit timeouts, and get statistics are all generic, and really do not need any hooks in them to work for all virtualization implementations.

There are currently some offloading technologies which have been implemented in some of the drivers via software, and which have not been proliferated into all of the existing drivers; for example, GSO and checksum offloads. Integrating these functions into the existing implementations might be easy; they are very implementation-dependent and should only be abstracted after further investigation.

Our current implementation uses the API defined above to communicate between the generic virtual ethernet driver and a few generic transport layers. Specifically, we generalized the Xen network to behave in the above way, as well as UML and 3leaf.

4.2 Disk

We propose virtualization-aware devices using a unified generic stub, which allows for a transport/virtualization-specific layer. If a transport were to be connected through the generic stubs, the front end and back end would not have to be co-located on the same piece of hardware.

By providing a vstore_ops structure, with the structure elements being pointers to transport-layer "helper" functions, one can separate existing virtual storage drivers into a generic layer and a transport layer.
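Gathering the helper prototypes quoted in this section into one structure gives the shape below; as with veth_ops, the exact membership is our illustration:

    #include <linux/fs.h>
    #include <linux/scatterlist.h>

    struct vstore_ops {
            int (*open)(struct inode *inode, struct file *filp);
            int (*release)(struct inode *inode, struct file *filp);
            int (*dma_read)(struct scatterlist local_buf_list[],
                            int nbuf, int size);
            int (*dma_write)(struct scatterlist local_buf_list[],
                             int nbuf, int size);
    };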

In Xen, the IO request results in pages being inserted into the grant table for the blkback driver for reads and writes. They can be pushed down to the transport layer in a way that shields the upper layers from the transport. In contrast, UML with its UBD driver does reads and writes to the host operating system, largely driven through the os_* interfaces. These are block device interfaces, but the pertinent information is available from the ioreq struct. The 3leaf front-end driver can accommodate both. Writes and reads have the same prototypes:

vstore_ops->dma_write(struct scatterlist local_buf_list[], int nbuf, int size);
vstore_ops->dma_read(struct scatterlist local_buf_list[], int nbuf, int size);

Open and release of block devices are generic enough that the transport-specific portion is easily isolated. The open causes caches of buffers to be allocated and data structures to be initialized; the release causes those resources to be freed.

vstore_ops->open(struct inode *inode, struct file *filp);
vstore_ops->release(struct inode *inode, struct file *filp);

The auto-provisioning in the 3leaf model is preferable for many reasons. There needs to be a way for a VM to probe for devices; this callback fills that requirement, not unlike Xen—when it reads its configuration, it's technically accessing the Xen bus.

5 Roadmap/Future work

We plan on extending this work into all virtualization implementations mentioned in this paper. However, due to certain logistical limitations, we have not been able to do this. For example, there was no access to IBM POWER-enabled hardware.

Self-virtualizing devices, which can be offloaded with certain virtualization functionality, are becoming more prevalent. These devices require that one physical device appear as multiple devices, each VM having a semi-programmable device exclusively for its own use. Exporting the generic driver to these devices could be quite beneficial.

A performance analysis of the merged drivers, with a contrast to the existing drivers, should be done prior to any merging into mainline.

The driver should be expanded to accommodate block devices as well as SCSI, using the 3leaf model for supporting both types.

Features in the 3leaf solution should be addressed, including multiple redundant paths to storage, name persistence, and centralized provisioning.

6 Conclusion

We implemented a generic virtual device driver with an implementation-specific transportation layer for UML, Xen, and 3leaf Systems. We have shown a breadth of virtualization technologies that would benefit from using this, and the generic API which can be used to do this.

We will continue to clean and extend the usage of this implementation in Linux. Hopefully, by the time this is read, it will be submitted for inclusion into the Linux kernel.

7 Terms

front end, back end
One class of distributed computer system in which the computers are divided into two types: back-end computers and front-end computers. Front-end computers typically have minimal peripheral hardware (e.g., storage and ethernet) and interact with users and their applications. Back-end computers provide the front ends with access to expensive peripheral devices or services (e.g., a database), so as to share the cost of those peripherals across the front ends. Also: client, server.

bridge
A mechanism to forward network packets between ports or interfaces.

VM (virtual machine)
Software that provides a virtual environment on top of the hardware platform.
This virtual environment pro- NIC (Network Interface Controller) vides services for creating and managing virtual IO de- A hardware device that allows computers to communi- vices such as disks and network interfaces. cate over a network.

CoW (Copy on Write)
A technique for efficiently sharing the unmodified contents of a disk or file system whose contents are "read mostly"—for example, a root file system. Writes to this shared content are redirected to a writable device that is unique to each user.

DMA (Direct Memory Access)
A hardware mechanism used by peripheral devices to transfer data between pre-determined locations in computer memory and the peripheral device, without using the computer's CPU to copy that data.

RDMA (Remote Direct Memory Access)
An extension of DMA where data is transferred between pre-determined memory locations on two computer systems over a network connection, without utilizing CPU cycles to copy data.

page cache, buffer cache
A cache of disk-backed memory pages. The Linux operating system uses a page cache for holding process memory pages as well as file data.

scatterlist
A list (typically an array) of physical memory addresses and lengths used to specify the source or destination for a DMA transfer.

RPC (Remote Procedure Call)
A protocol where software on one computer can invoke a function on a remote computer.

SCSI (Small Computer System Interconnect)
A standard for physically connecting and transferring data between computer systems and peripheral storage devices.

MAC (Media Access Control) address
A unique identifier associated with a network adapter.

API (Application Programming Interface)
A software interface definition for providing services.

GUID (Globally Unique Identifier)
A 128-bit unique identifier that is associated with an infiniband HCA.

HCA (Host Channel Adapter)
A hardware device for connecting a computer system to an infiniband communications link.

IPoIB (Internet Protocol over InfiniBand)
An implementation of the internet protocol on the Infiniband network fabric.

HBA (Host Bus Adapter)
A hardware device that connects a computer system to a SCSI or Fibre Channel link.

SRP (SCSI RDMA Protocol)
A combination of the SCSI protocol with Infiniband RDMA, providing SAN storage.

hotplug (hot plugging)
A method for adding or removing devices from a computer system while that computer system is operating.

UML (User Mode Linux)
A virtual machine implementation where one or more guest Linux operating systems run in user-mode Linux processes.

N-Port (Node Port)
A Fibre Channel node connection.

hypervisor (also, virtual machine monitor)
The software that provides the virtual machine mechanisms to support guest operating systems.

Fibre Channel
A network implementation that is used mostly for accessing storage.

infiniband
A point-to-point communications link used to provide high performance data and message transfer between computer nodes.

SAN (Storage Area Network)
A network architecture for attaching remote storage to a server computer.

Volume Manager
Software for managing and allocating storage space.

8 Legal Statement

All statements regarding future direction and intent are sub- ject to change or withdrawal without notice, and represent goals and objectives only. Information is provided “AS IS” without warranty of any kind. This article could include tech- nical inaccuracies and typographical errors. Improvements and/or changes in the product(s) and/or the program(s) de- scribed in this publication may be made at any time without notice. This paper represents the view of its authors and not necessarily the view of 3leaf Systems. Other company, prod- uct, or service names may be the trademarks of others. Void where prohibited.

References

[1] M.D. Day, R. Harper, M. Hohnbaum, A. Liguori, & A. Theurer. “Using the Xen Hypervisor to Supercharge OS Deployment,” Proceedings of the 2005 Linux Symposium, Ottawa, Vol. 1, pp. 97–108.

[2] K. Fraser, S. Hand, C. Limpach, and I. Pratt. “Xen and the Art of Open Source Virtualization,” Proceedings of the 2004 Linux Symposium, Ottawa, Vol. 2, p. 329.

[3] Xen 3.0 Virtualization Interface Guide, http://www.linuxtopia.org/online_books/linux_virtualization/xen_3.0_interface_guide/linux_virualization_xen_interface_25.html

[4] Dave Boutcher, Dave Engebretsen. “Linux Virtualization on IBM POWER5 Systems,” Proceedings of the 2004 Linux Symposium, Ottawa, Vol. 1, p. 113.

[5] http://www.kernel.org/pub/linux/kernel/people/marcelo/linux-2.4/Documentation/networking/tuntap.txt

[6] Raj, Ganev, Schwan, and Xenidis. Scalable IO Virtualization via Self-Virtualizing Devices, http://www-static.cc.gatech.edu/~rhim/Self-VirtTR-v1.pdf

[7] Jeff Dike, User Mode Linux, Prentice Hall, 2006.

[8] http://wiki.openvz.org

The new ext4 filesystem: current status and future plans

Avantika Mathur, Mingming Cao, Suparna Bhattacharya, IBM Linux Technology Center
mathur@us..com, [email protected], [email protected]

Andreas Dilger, Alex Tomas, Cluster Filesystem Inc.
[email protected], [email protected]

Laurent Vivier, Bull S.A.S.
[email protected]

Abstract

Ext3 has been the most widely used general Linux® filesystem for many years. In keeping with increasing disk capacities and state-of-the-art feature requirements, the next generation of the ext3 filesystem, ext4, was created last year. This new filesystem incorporates scalability and performance enhancements for supporting large filesystems, while maintaining reliability and stability. Ext4 will be suitable for a larger variety of workloads and is expected to replace ext3 as the "Linux filesystem."

In this paper we will first discuss the reasons for starting the ext4 filesystem, then explore the enhanced capabilities currently available and planned for ext4, discuss methods for migrating between ext3 and ext4, and finally compare ext4 and other filesystem performance on three classic filesystem benchmarks.

1 Introduction

Ext3 has been a very popular Linux filesystem due to its reliability, rich feature set, relatively good performance, and strong compatibility between versions. The conservative design of ext3 has given it the reputation of being stable and robust, but has also limited its ability to scale and perform well on large configurations.

With the pressure of increasing capabilities of new hardware and online resizing support in ext3, the requirement to address ext3 scalability and performance is more urgent than ever. One of the outstanding limits faced by ext3 today is the 16 TB maximum filesystem size. Enterprise workloads are already approaching this limit, and with disk capacities doubling every year and 1 TB hard disks easily available in stores, it will soon be hit by desktop users as well.

To address this limit, in August 2006, we posted a series of patches introducing two key features to ext3: larger filesystem capacity and extents mapping. The patches unavoidably change the on-disk format and break forwards compatibility. In order to maintain the stability of ext3 for its massive user base, we decided to branch to ext4 from ext3.

The primary goal of this new filesystem is to address the scalability, performance, and reliability issues faced by ext3. A common question is why not use XFS, or start an entirely new filesystem from scratch? We want to give the large number of ext3 users the opportunity to easily upgrade their filesystem, as was done from ext2 to ext3. Also, there has been considerable investment in the capabilities, robustness, and reliability of ext3 and e2fsck. Ext4 developers can take advantage of this previous work, and focus on adding advanced features and delivering a new scalable enterprise-ready filesystem in a short time frame.

Thus, ext4 was born. The new filesystem has been in mainline Linux since version 2.6.19. As of the writing of this paper, the filesystem is marked as developmental, titled ext4dev, explicitly warning users that it is not ready for production use. Currently, extents and 48-bit block numbers are included in ext4, but there are many new filesystem features in the roadmap that will be discussed throughout this paper. The current ext4 development git tree is hosted at git://git.kernel.org/

• 21 • 22 • The new ext4 filesystem: current status and future plans pub/scm/linux/kernel/git/tytso/ext4. Up- the end of the block group descriptor structure to store to-date ext4 patches and feature discussions can be the most significant bits of 64-bit values for bitmaps and found at the ext4 wiki page, http://ext4.wiki. inode table pointers. kernel.org. Since the addresses of modified blocks in the filesys- Some of the features in progress could possibly continue tem are logged in the journal, the journaling block layer to change the on-disk layout. Ext4 will be converted (JBD) is also required to support at least 48-bit block ad- from development mode to stable mode once the layout dresses. Therefore, JBD was branched to JBD2 to sup- has been finalized. At that time, ext4 will be available port more than 32-bit block numbers at the same time for general use by all users in need of a more scalable ext4 was forked. Although currently only ext4 is using and modern version of ext3. In the following three sec- JBD2, it can provide general journaling support for both tions we will discuss new capabilities currently included 32-bit and 64-bit filesystems. in or planned for ext4 in the areas of scalability, frag- One may question why we chose 48-bit rather than full mentation, and reliability. 64-bit support. The 1 EB limit will be sufficient for many years. Long before this limit is hit there will be 2 Scalability enhancements reliability issues that need to be addressed. At current speeds, a 1 EB filesystem would take 119 years to finish 64 The first goal of ext4 was to become a more scalable one full e2fsck, and 65536 times that for a 2 blocks (64 filesystem. In this section we will discuss the scalability ZB) filesystem. Overcoming these kind of reliability is- features that will be available in ext4. sues is the priority of ext4 developers before addressing full 64-bit support and is discussed later in the paper.

2.1 Large filesystem

The current 16 TB filesystem size limit is caused by the 32-bit block number in ext3. To enlarge the filesystem limit, the straightforward method is to increase the number of bits used to represent block numbers, and then fix all references to data and metadata blocks.

Previously, there was an extents [3] patch for ext3 with the capacity to support 48-bit physical block numbers. In ext4, instead of just extending the block numbers to 64 bits based on the current ext3 indirect block mapping, the ext4 developers decided to use extents mapping with 48-bit block numbers. This both increases filesystem capacity and improves large file efficiency. With 48-bit block numbers, ext4 can support a maximum filesystem size of up to 2^(48+12) = 2^60 bytes (1 EB) with a 4 KB block size.

After changing the data block numbers to 48-bit, the next step was to correct the references to metadata blocks correspondingly. Metadata is present in the superblock, the group descriptors, and the journal. New fields have been added at the end of the superblock structure to store the most significant 32 bits of the block-counter variables s_free_blocks_count, s_blocks_count, and s_r_blocks_count, extending them to 64 bits. Similarly, we introduced new 32-bit fields at the end of the block group descriptor structure to store the most significant bits of the 64-bit values for bitmaps and inode table pointers.

Since the addresses of modified blocks in the filesystem are logged in the journal, the journaling block layer (JBD) is also required to support at least 48-bit block addresses. Therefore, JBD was branched to JBD2 to support more than 32-bit block numbers at the same time ext4 was forked. Although currently only ext4 is using JBD2, it can provide general journaling support for both 32-bit and 64-bit filesystems.

One may question why we chose 48-bit rather than full 64-bit support. The 1 EB limit will be sufficient for many years. Long before this limit is hit there will be reliability issues that need to be addressed. At current speeds, a 1 EB filesystem would take 119 years to finish one full e2fsck, and 65536 times that for a 2^64 block (64 ZB) filesystem. Overcoming these kinds of reliability issues is the priority of ext4 developers before addressing full 64-bit support, and is discussed later in the paper.

2.1.1 Future work

After extending the limit created by 32-bit block numbers, the filesystem capacity is still restricted by the number of block groups in the filesystem. In ext3, for safety concerns, all block group descriptor copies are kept in the first block group. With the new uninitialized block group feature discussed in Section 4.1, the new block group descriptor size is 64 bytes. Given the default 128 MB (2^27 bytes) block group size, ext4 can have at most 2^27/64 = 2^21 block groups. This limits the entire filesystem size to 2^21 * 2^27 = 2^48 bytes, or 256 TB.

The solution to this problem is to use the metablock group feature (META_BG), which is already in ext3 for all 2.6 releases. With the META_BG feature, ext4 filesystems are partitioned into many metablock groups. Each metablock group is a cluster of block groups whose group descriptor structures can be stored in a single disk block. For ext4 filesystems with a 4 KB block size, a single metablock group partition includes 64 block groups, or 8 GB of disk space. The metablock group feature moves the location of the group descriptors from the congested first block group of the whole filesystem into the first group of each metablock group itself. The backups are in the second and last group of each metablock group. This increases the 2^21 maximum block groups limit to the hard limit of 2^32, allowing support for the full 1 EB filesystem.
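Throughout these changes, the split hi/lo fields are recombined in the obvious way. A sketch of the pattern used for the 64-bit counters (a generic rendering, not the kernel's exact accessor names):

    #include <stdint.h>

    /* Combine an original 32-bit on-disk field with the new
     * most-significant bits stored at the end of the structure. */
    static inline uint64_t combine_hi_lo(uint32_t hi, uint32_t lo)
    {
            return ((uint64_t)hi << 32) | lo;
    }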

[Figure 1: Ext4 extents, header and index structures]

[Figure 2: Ext4 extent tree layout]

2.2 Extents

The ext3 filesystem uses an indirect block mapping scheme, providing a one-to-one mapping from logical blocks to disk blocks. This scheme is very efficient for sparse or small files, but has high overhead for larger files, performing poorly especially on large file delete and truncate operations [3].

As mentioned earlier, extents mapping is included in ext4. This approach efficiently maps logical to physical blocks for large contiguous files. An extent is a single descriptor which represents a range of contiguous physical blocks. Figure 1 shows the extents structure. As discussed previously, the physical block field in an extent structure takes 48 bits. A single extent can represent 2^15 contiguous blocks, or 128 MB with a 4 KB block size. The MSB of the extent length is used to flag uninitialized extents, used for the preallocation feature discussed in Section 3.1.

Four extents can be stored in the ext4 inode structure directly. This is generally sufficient to represent small or contiguous files. For very large, highly fragmented, or sparse files, more extents are needed. In this case a constant-depth extent tree is used to store the extents map of a file. Figure 2 shows the layout of the extent tree. The root of this tree is stored in the ext4 inode structure, and extents are stored in the leaf nodes of the tree.

Each node in the tree starts with an extent header (Figure 1), which contains the number of valid entries in the node, the capacity of entries the node can store, the depth of the tree, and a magic number. The magic number can be used to differentiate between different versions of extents, as new enhancements are made to the feature, such as increasing to 64-bit block numbers.

The extent header and magic number also add much-needed robustness to the on-disk structure of the data files. For very small filesystems, the block-mapped files implicitly depended on the fact that random corruption of an indirect block would be easily detectable, because the number of valid filesystem blocks is a small subset of a random 32-bit integer. With growing filesystem sizes, random corruption in an indirect block is by itself indistinguishable from valid block numbers.

In addition to the simple magic number stored in the extent header, the tree structure of the extent tree can be verified at runtime or by e2fsck in several ways. The ext4_extent_header has some internal consistency (eh_entries and eh_max) that also depends on the filesystem block size. eh_depth decreases from the root toward the leaves. The ext4_extent entries in a leaf block must have increasing ee_block numbers, and must not overlap their neighbors with ee_len. Similarly, the ext4_extent_idx also needs increasing ei_block values, and the range of blocks that an index covers can be verified against the actual range of blocks in the extent leaf.
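For reference, the on-disk structures from Figure 1 look roughly as follows. Field names are taken from the figure; the kernel uses little-endian __le types where this self-contained sketch uses plain integers:

    #include <stdint.h>

    struct ext4_extent_header {
            uint16_t eh_magic;      /* identifies the extent format */
            uint16_t eh_entries;    /* number of valid entries */
            uint16_t eh_max;        /* capacity of entries in the node */
            uint16_t eh_depth;      /* 0 means this node is a leaf */
            uint32_t eh_generation;
    };

    struct ext4_extent {            /* leaf node entry */
            uint32_t ee_block;      /* first logical block covered */
            uint16_t ee_len;        /* length; MSB = uninitialized flag */
            uint16_t ee_start_hi;   /* high 16 bits of physical block */
            uint32_t ee_start;      /* low 32 bits of physical block */
    };

    struct ext4_extent_idx {        /* index node entry */
            uint32_t ei_block;      /* covers logical blocks from here on */
            uint32_t ei_leaf;       /* low 32 bits of next-level block */
            uint16_t ei_leaf_hi;    /* high 16 bits of next-level block */
            uint16_t ei_unused;
    };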

2.2.1 Future work

Extents are not very efficient for representing sparse or highly fragmented files. For highly fragmented files, we could introduce a new type of extent, a block-mapped extent. A different magic number, stored in the extent header, distinguishes the new type of leaf block, which contains a list of allocated block numbers similar to an ext3 indirect block. This would give us the increased robustness of the extent format, with the block allocation flexibility of the block-mapped format.

In order to improve the robustness of the on-disk data, there is a proposal to create an "extent tail" in the extent blocks, in addition to the extent header. The extent tail would contain the inode number and generation of the inode that has allocated the block, and a checksum of the extent block itself (though not the data). The checksum would detect internal corruption, and could also detect misplaced writes if the block number is included therein. The inode number could be used to detect corruption that causes the tree to reference the wrong block (whether by higher-level corruption, or misplaced writes). The inode number could also be used to reconstruct the data of a corrupted inode or assemble a deleted file, and also help in doing reverse-mapping of blocks for defragmentation, among other things.

2.3 Large files

In Linux, file size is calculated based on the i_blocks counter value. However, the unit is in sectors (512 bytes), rather than in the filesystem block size (4096 bytes by default). Since ext4's i_blocks is a 32-bit variable in the inode structure, this limits the maximum file size in ext4 to 2^32 * 512 bytes = 2^41 bytes = 2 TB. This is a scalability limit that ext3 has planned to break for a while.

The solution for ext4 is quite straightforward. The first part is simply changing the i_blocks units in the ext4 inode to filesystem blocks. An ROCOMPAT feature flag HUGE_FILE is added in ext4 to signify that the i_blocks field in some inodes is in units of filesystem block size. Those inodes are marked with a flag EXT4_HUGE_FILE_FL, to allow existing inodes to keep i_blocks in 512-byte units without requiring a full filesystem conversion. In addition, the i_blocks variable is extended to 48 bits by using some of the reserved inode fields. We still have the limitation of 32-bit logical block numbers with the current extent format, which limits the file size to 16 TB. With the flexible extents format in the future (see Section 2.2.1), we may remove that limit and fully use the 48-bit i_blocks to enlarge the file size even more.

2.4 Large number of files

Some applications already create billions of files today, and even ask for support for trillions of files. In theory, the ext4 filesystem can support billions of files with 32-bit inode numbers. However, in practice, it cannot scale to this limit. This is because ext4, following ext3, still allocates inode tables statically. Thus, the maximum number of inodes has to be fixed at filesystem creation time. To avoid running out of inodes later, users often choose a very large number of inodes up-front. The consequence is that unnecessary disk space has to be allocated to store unused inode structures. The wasted space becomes more of an issue in ext4 with the larger default inode. This also makes the management and repair of large filesystems more difficult than it should be. The uninitialized group feature (Section 4.1) addresses this issue to some extent, but the problem still exists with aged filesystems in which the used and unused inodes can be mixed and spread across the whole filesystem.

Ext3 and ext4 developers have been thinking about supporting dynamic inode allocation for a while [9, 3]. There are three general considerations about dynamic inode table allocation:

• Performance: We need an efficient way to translate an inode number to the block where the inode structure is stored.

• Robustness: e2fsck should be able to locate inode table blocks scattered across the filesystem, in case the filesystem is corrupted.

• Compatibility: We need to handle the possible inode number collision issue with 64-bit inode numbers on 32-bit systems, due to overflow.

These three requirements make the design challenging.

[Figure 3: 64-bit inode layout (a 32-bit block group number, a 15-bit relative block number within the group, and a 4-bit offset within the inode table block)]

With dynamic inode tables, the blocks storing the inode structure are no longer at a fixed location. One way to efficiently map the inode number to the block storing the corresponding inode structure is to encode the block number into the inode number directly, similar to what is done in XFS. This implies the use of 64-bit inode numbers. The low four to five bits of the inode number store the offset bits within the inode table block. The rest store the 32-bit block group number as well as a 15-bit relative block number within the group, as shown in Figure 3. Then, a cluster of contiguous inode table blocks (ITBC) can be allocated on demand. A bitmap at the head of the ITBC would be used to keep track of the free and used inodes, allowing fast inode allocation and deallocation.
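To make the bit layout of Figure 3 concrete, here is a minimal sketch of how such an inode number could be packed and unpacked. The helper names are hypothetical and the bit positions are read off the figure; this illustrates the proposal, not actual ext4 code:

/* Illustrative packing for the proposed 64-bit inode numbers (Figure 3):
 * bits 0-3 offset, bits 4-18 relative block, bits 19-50 block group. */
typedef unsigned long long u64;

static inline u64 ino_make(u64 group, u64 rel_block, u64 offset)
{
        return (group << 19) | ((rel_block & 0x7fff) << 4) | (offset & 0xf);
}

static inline void ino_split(u64 ino, u64 *group, u64 *rel_block, u64 *offset)
{
        *offset    = ino & 0xf;           /* inode slot within the table block */
        *rel_block = (ino >> 4) & 0x7fff; /* block within the block group */
        *group     = ino >> 19;           /* 32-bit block group number */
}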

In the case where the filesystem is corrupted, the majority of inode tables could be located by checking the directory entries. To further address the reliability concern, a magic number could be stored at the head of the ITBC, to help e2fsck recognize this metadata block.

Relocating inodes becomes tricky with this block-number-in-inode-number proposal. If the filesystem is resized or defragmented, we may have to change the location of the inode blocks, which would require changing all references to that inode number. The proposal to address this concern is to have a per-group "inode exception map" that translates an old block/inode number into a new block number where the relocated inode structure is actually stored. The map will usually be empty, unless the inode was moved.

One concern with the 64-bit inode number is the possible inode number collision with 32-bit applications, as applications might still be using the 32-bit stat() to access inode numbers and could break. Investigation is underway to see how common this case is, and whether most applications are currently fixed to use the 64-bit stat64(). One way to address this concern is to generate 32-bit inode numbers on 32-bit platforms. Seventeen bits is enough to represent block group numbers on 32-bit architectures, and we could limit the inode table blocks to the first 2^10 blocks of a block group to construct the 32-bit inode number. This way user applications will be ensured of getting unique inode numbers on 32-bit platforms. For 32-bit applications running on 64-bit platforms, we hope they are fixed by the time ext4 is in production, and this only starts to be an issue for filesystems over 1 TB in size.

In summary, dynamic inode allocation and 64-bit inode numbers are needed to support large numbers of files in ext4. The benefits are obvious, but the changes to the on-disk format may be intrusive. The design details are still under discussion.

2.5 Directory scalability

The maximum number of subdirectories contained in a single directory in ext3 is 32,000. To address directory scalability, this limit will be eliminated in ext4, providing unlimited sub-directory support.

In order to better support large directories with many entries, the directory indexing feature [6] will be turned on by default in ext4. By default in ext3, directory entries are still stored in a linked list, which is very inefficient for directories with large numbers of entries. The directory indexing feature addresses this scalability issue by storing directory entries in a constant depth HTree data structure, which is a specialized BTree-like structure using 32-bit hashes. The fast lookup time of the HTree significantly improves performance on large directories. For directories with more than 10,000 files, improvements were often by a factor of 50 to 100 [3].

2.5.1 Future work

While the HTree implementation allowed the ext2 directory format to be improved from linear to a tree search compatibly, there are also limitations to this approach. The HTree implementation has a limit of 510 * 511 4 KB directory leaf blocks (approximately 25M 24-byte filenames) that can be indexed with a 2-level tree. It would be possible to change the code to allow a 3-level HTree. There is also currently a 2 GB file size limit on directories, because the code for using the high 32 bits of i_size on directories was not implemented when the 2 GB limit was fixed for regular files.

Because the hashing used to find filenames in indexed directories is essentially random compared to the linear order in which inodes are allocated, we end up doing random seeks around the disk when accessing many inodes in a large directory. We need to do readdir in hash-index order because directory entries might be moved during the split of a directory leaf block, so to satisfy POSIX requirements we can only safely walk the directory in hash order.

To address this problem, there is a proposal to put the whole inode into the directory instead of just a directory entry that references a separate inode. This avoids the need to seek to the inode when doing a readdir, because the whole inode has been read into memory already in the readdir step. If the blocks that make up the directory are efficiently allocated, then reading the directory also does not require any seeking.

This would also allow dynamic inode allocation, with the directory as the "container" of the inode table. The inode numbers would be generated in a similar manner as previously discussed (Section 2.4), so that the block that an inode resides in can be located directly from the inode number itself. Hard linked files imply that the same block is allocated to multiple directories at the same time, but this can be reconciled by the link count in the inode itself.

We also need to store one or more file names in the inode itself, and this can be done by means of an extended attribute that uses the directory inode number as the EA name. We can then return the name(s) associated with that inode for a single directory immediately when doing readdir, and skip any other name(s) for the inode that belong to hard links in another directory. For efficient name-to-inode lookup in the directory, we would still use a secondary tree similar to the current ext3 HTree (though it would need an entry per name instead of per directory block). But because the directory entries (the inodes themselves) do not get moved as the directory grows, we can just use disk block or directory offset order for readdir.

[Figure 4: Layout of the large inode (the original 128-byte inode, followed by dynamically sized fixed fields: i_extra_isize, i_pad1, i_ctime_extra, i_mtime_extra, i_atime_extra, i_crtime, i_crtime_extra, i_version_hi; then fast extended attributes up to byte 255)]

2.6 Large inode and fast extended attributes

Ext3 supports different inode sizes. The inode size can be set to any power of two larger than 128 bytes, up to the filesystem block size, by using the mke2fs -I [inode size] option at format time. The default inode structure size is 128 bytes, which is already crowded with data and has little space for new fields. In ext4, the default inode structure size will be 256 bytes.

In order to avoid duplicating a lot of code in the kernel and e2fsck, the large inodes keep the same fixed layout for the first 128 bytes, as shown in Figure 4. The rest of the inode is split into two parts: a fixed-field section that allows the addition of fields common to all inodes, such as nanosecond timestamps (Section 5), and a section for fast extended attributes (EAs) that consumes the rest of the inode.

The fixed-field part of the inode is dynamically sized, based on what fields the current kernel knows about. The size of this area is stored in each inode in the i_extra_isize field, which is the first field beyond the original 128-byte inode. The superblock also contains two fields, s_min_extra_isize and s_want_extra_isize, which allow down-level kernels to allocate a larger i_extra_isize than they would otherwise do.
(though it would need an entry per name instead of per The s_min_extra_isize is the guaranteed mini- directory block). But because the directory entries (the mum amount of fixed-field space in each inode. inodes themselves) do not get moved as the directory s_want_extra_isize is the desired amount of fixed-field grows, we can just use disk block or directory offset or- space for new inode, but there is no guarantee that der for readdir. this much space will be available in every inode. A ROCOMPAT feature flag EXTRA_ISIZE indicates 2.6 Large inode and fast extended attributes whether these superblock fields are valid. The ext4 code will soon also be able to expand i_extra_isize dynamically as needed to cover the fixed fields, so Ext3 supports different inode sizes. The inode size can long as there is space available to store the fast EAs or be set to any power-of-two larger than 128 bytes size up migrate them to an external EA block. to the filesystem block size by using the mke2fs -I [inode size] option at format time. The default inode structure The remaining large inode space may be used for storing size is 128 bytes, which is already crowded with data EA data inside the inode. Since the EAs are already in and has little space for new fields. In ext4, the default memory after the inode is read from disk, this avoids inode structure size will be 256 bytes. a costly seek to external EA block. This can greatly 2007 Linux Symposium, Volume Two • 27 improve the performance of applications that are using block until it is explicitly initialized through a subse- EAs, sometimes by a factor of 3–7 [4]. An external EA quent write. Preallocation must be persistent across re- block is still available in addition to the fast EA space, boots, unlike ext3 and ext4 block reservations [3]. which allows storing up to 4 KB of EAs for each file. For applications involving purely sequential writes, it is The support for fast EAs in large inodes has been avail- possible to distinguish between initialized and uninitial- able in Linux kernels since 2.6.12, though it is rarely ized portions of the file. This can be done by maintain- used because many people do not know of this capabil- ing a single high water mark value representing the size ity at mke2fs time. Since ext4 will have larger inodes, of the initialized portion. However, for databases and this feature will be enabled by default. other applications where random writes into the preal- located blocks can occur in any order, this is not suffi- There have also been discussions about breaking the 4 cient. The filesystem needs to be able to identify ranges KB EA limit, in order to store larger or more EAs. It is of uninitialized blocks in the middle of the file. There- likely that larger single EAs will be stored in their own fore, some extent based filesystems, like XFS, and now inode (to allow arbitrary-sized EAs) and it may also be ext4, provide support for marking allocated but unini- that for many EAs they will be stored in a directory-like tialized extents associated with a given file. structure, possibly leveraging the same code as regular ext4 directories and storing small values inline. Ext4 implements this by using the MSB of the extent length field to indicate whether a given extent contains 3 Block allocation enhancements uninitialized data, as shown in Figure 1. During reads, an uninitialized extent is treated just like a hole, so that the VFS returns zero-filled blocks. 
Upon writes, the ex- Increased filesystem throughput is the premier goal for tent must be split into initialized and uninitialized ex- all modern filesystems. In order to meet this goal, de- tents, merging the initialized portion with an adjacent velopers are constantly attempting to reduce filesystem initialized extent if contiguous. fragmentation. High fragmentation rates cause greater disk access time affecting overall throughput, and in- Until now, XFS, the other Linux filesystem that imple- creased metadata overhead causing less efficient map- ments preallocation, provided an interface to ap- ping. plications. With more filesystems, including ext4, now providing this feature, a common system-call interface There is an array of new features in line for ext4, which for fallocate and an associated inode operation have take advantage of the existing extents mapping and are been introduced. This allows filesystem-specific imple- aimed at reducing filesystem fragmentation by improv- mentations of preallocation to be exploited by applica- ing block allocation techniques. tions using the posix_fallocate API.
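A minimal user-space sketch of the API mentioned above follows. The file name is hypothetical; on a kernel and C library where the fallocate path is wired up, posix_fallocate can be satisfied by the filesystem's persistent preallocation rather than by writing zeros:

#define _XOPEN_SOURCE 600   /* for posix_fallocate on glibc */
#include <fcntl.h>
#include <stdio.h>

int main(void)
{
        int fd = open("prealloc.db", O_CREAT | O_WRONLY, 0644);
        if (fd < 0) {
                perror("open");
                return 1;
        }
        /* ask for 1 GB of persistently preallocated space at offset 0 */
        int err = posix_fallocate(fd, 0, 1024 * 1024 * 1024);
        if (err)
                fprintf(stderr, "posix_fallocate: error %d\n", err);
        return err ? 1 : 0;
}

Note that posix_fallocate reports failure through its return value rather than errno, which the sketch reflects.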

3.2 Delayed and multiple block allocation

The block allocator in ext3 allocates one block at a time during the write operation, which is inefficient for larger I/O. Since block allocation requests are passed through the VFS layer one at a time, the underlying ext3 filesystem cannot foresee and cluster future requests. This also increases the possibility of file fragmentation.

Delayed allocation is a well-known technique in which block allocations are postponed to page flush time, rather than during the write() operation [3]. This method provides the opportunity to combine many block allocation requests into a single request, reducing possible fragmentation and saving CPU cycles. Delayed allocation also avoids unnecessary block allocation for short-lived files.

Ext4 delayed allocation patches have been implemented, but there is work underway to move this support to the VFS layer, so multiple filesystems can benefit from the feature.

With delayed allocation support, multiple block allocation for buffered I/O is now possible. An entire extent, containing multiple contiguous blocks, is allocated at once rather than one block at a time. This eliminates multiple calls to ext4_get_blocks and ext4_new_blocks and reduces CPU utilization.

Ext4 multiple block allocation builds per-block-group free extent information based on the on-disk block bitmap. It uses this information to guide the search for free extents to satisfy an allocation request. This free extent information is generated at filesystem mount time and stored in memory using a buddy structure.

The performance benefits of delayed allocation alone are very obvious, and can be seen in Section 7. In a previous study [3], we have seen about 30% improved throughput and a 50% reduction in CPU usage with the two features combined. Overall, delayed and multiple block allocation can significantly improve filesystem performance on large I/O.

There are two other features in progress that are built on top of delayed and multiple block allocation, trying to further reduce fragmentation:

• In-core Preallocation: Using the in-core free extent information, a more powerful in-core block preallocation/reservation can be built. This further improves block placement and reduces fragmentation with concurrent write workloads. An inode can have a number of preallocated chunks, indexed by the logical blocks. This improvement can help HPC applications when a number of nodes write to one huge file at very different offsets.

• Locality Groups: Currently, allocation policy decisions for individual files are made independently. If the allocator had knowledge of file relationships, it could intelligently place related files close together, greatly benefiting read performance. The locality groups feature clusters related files together by a given attribute, such as SID or a combination of SID and parent directory. At the deferred page-flush time, dirty pages are written out by groups, instead of by individual files. The number of non-allocated blocks is tracked at the group level, and at flush time the allocator can try to preallocate enough space for the entire group. This space is shared by the files in the group for their individual block allocation. In this way, related files are placed tightly together.

In summary, ext4 will have a powerful block allocation scheme that can efficiently handle large block I/O and reduce filesystem fragmentation with small files under multi-threaded workloads.

3.3 Online defragmentation

Though the features discussed in this section improve block allocation to avoid fragmentation in the first place, with age the filesystem can still become quite fragmented. The ext4 online defragmentation tool, e4defrag, has been developed to address this. This tool can defragment individual files or the entire filesystem. For each file, the tool creates a temporary inode and allocates contiguous extents to the temporary inode using multiple block allocation. It then copies the original file data to the page cache and flushes the dirty pages to the temporary inode's blocks. Finally, it migrates the block pointers from the temporary inode to the original inode.
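In the command-line style used elsewhere in this paper, usage would look roughly like the following; since the tool was still under development at the time, the exact invocation is an assumption and the paths are hypothetical:

$ e4defrag /mnt/data/bigfile
$ e4defrag /mnt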

• Locality Groups: Currently, allocation policy deci- Despite the use of journaling and RAID, there are invari- sions for individual file are made independently. If ably corruptions to the disk filesystem. The first line of the allocator had knowledge of file relationship, it defense is detecting and avoiding problems proactively could intelligently place related files close together, by a combination of robust metadata design, internal re- greatly benefiting read performance. The locality dundancy at various levels, and built-in integrity check- groups feature clusters related files together by a ing using checksums. The fallback will always be doing 2007 Linux Symposium, Volume Two • 29 integrity checking (fsck) to both detect and correct prob- fsck time vs. Inode Count lems that will happen anyway. 4500 4000

One of the primary concerns with all filesystems is the 3500

speed at which a filesystem can be validated and recov- )

c 3000 e

s ext3: 0 files ered after corruption. With reasonably high-end RAID ( 2500

e ext3 100k files

storage, a full fsck of a 2TB ext3 filesystem can take m

i ext3 2.1M files

t 2000

between 2 to 4 hours for a relatively “clean” filesystem. k ext4 100kfiles c

s 1500 This process can degrade sharply to many days if there f ext4 2.1M files are large numbers of shared filesystem blocks that need 1000 expensive extra passes to correct. 500 0 Some features, like extents, have already added to the 0 0.5 10 15 20 25 30 35 Total Inode Count (millions) robustness of the ext4 metadata as previously described. Many more related changes are either complete, in progress, or being designed in order to ensure that ext4 Figure 5: e2fsck performance improvement with unini- will be usable at scales that will become practical in the tialized block groups future. many inodes deleted, e2fsck will be more efficient in the 4.1 Unused inode count and fast e2fsck next run.

4.1 Unused inode count and fast e2fsck

In e2fsck, the checking of inodes in pass 1 is by far the most time consuming part of the operation. This requires reading all of the large inode tables from disk, scanning them for valid, invalid, or unused inodes, and then verifying and updating the block and inode allocation bitmaps. The uninitialized groups and inode table high watermark feature allows much of the lengthy pass 1 scanning to be safely skipped. This can dramatically reduce the total time taken by e2fsck, by 2 to 20 times, depending on how full the filesystem is. This feature can be enabled at mke2fs time or using tune2fs via the -O uninit_groups option.

With this feature, the kernel stores the number of unused inodes at the end of each block group's inode table. As a result, e2fsck can skip both reading these blocks from disk, and scanning them for in-use inodes. In order to ensure that the unused inode count is safe to use by e2fsck, the group descriptor has a CRC16 checksum added to it that allows validation of all fields therein.

Since typical ext3 filesystems use only in the neighborhood of 1% to 10% of their inodes, and the inode allocation policy keeps a majority of those inodes at the start of the inode table, this can avoid processing a large majority of the inodes and speed up the pass 1 processing. The kernel does not currently increase the unused inodes count when files are deleted. This counter is updated on every e2fsck run, so in the case where a block group had many inodes deleted, e2fsck will be more efficient in the next run.

Figure 5 shows that e2fsck time on ext3 grows linearly with the total number of inodes in the filesystem, regardless of how many are used. On ext3, e2fsck takes the same amount of time with zero used files as with 2.1 million used files. In ext4, with the unused inode high watermark feature, the e2fsck time is only dependent on the number of used inodes. As we can see, fsck of an ext4 filesystem with 100,000 used files takes a fraction of the time ext3 takes.

In addition to the unused inodes count, it is possible for mke2fs and e2fsck to mark a group's block or inode bitmap as uninitialized, so that the kernel does not need to read them from disk when first allocating from the group. Similarly, e2fsck does not need to read these bitmaps from disk, though this does not play a major role in performance improvements. What is more significant is that mke2fs will not write out the bitmaps or inode tables at format time if the mke2fs -O lazy_bg feature is given. Writing out the inode tables can take a significant amount of time, and has been known to cause problems for large filesystems due to the amount of dirty pages this generates in a short time.
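To illustrate the kind of check described above, here is a hedged sketch of validating a group descriptor with a CRC16. The descriptor layout is abbreviated and the polynomial and seed are illustrative; they are not the values ext4 actually committed to on disk:

#include <stddef.h>

/* reflected CRC-16 (polynomial 0x8005), bitwise for clarity */
static unsigned short crc16(unsigned short crc, const unsigned char *p, size_t len)
{
        while (len--) {
                crc ^= *p++;
                for (int i = 0; i < 8; i++)
                        crc = (crc >> 1) ^ ((crc & 1) ? 0xA001 : 0);
        }
        return crc;
}

struct group_desc {                   /* abbreviated, illustrative layout */
        unsigned int   free_blocks;
        unsigned int   free_inodes;
        unsigned short unused_inodes; /* the high watermark count */
        unsigned short checksum;
};

static int desc_valid(const struct group_desc *gd)
{
        struct group_desc tmp = *gd;

        tmp.checksum = 0;             /* checksum covers every field but itself */
        return crc16(0xffff, (const unsigned char *)&tmp, sizeof(tmp)) == gd->checksum;
}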

4.2 Checksumming

Adding metadata checksumming into ext4 will allow it to more easily detect corruption, and behave appropriately instead of blindly trusting the data it gets from disk. The group descriptors already have a checksum added, per the previous section. The next immediate target for checksumming is the journal, because it has such a high density of important metadata and is constantly being written to, so it has a higher chance of wearing out the platters or seeing other random corruption.

Adding checksumming to the ext4 journal is nearly complete [7]. In ext3 and ext4, each journal transaction has a header block and a commit block. During normal journal operation the commit block is not sent to the disk until the transaction header and all metadata blocks which make up that transaction have been written to disk [8]. The next transaction needs to wait for the previous commit block to hit disk before it can start to modify the filesystem.

With this two-phase commit, if the commit block has the same transaction number as the header block, it should indicate that the transaction can be replayed at recovery time. If they don't match, the journal recovery is ended. However, there are several scenarios where this can go wrong and lead to filesystem corruption.

With journal checksumming, the journal code computes a CRC32 over all of the blocks in the transaction (including the header), and the checksum is written to the commit block of the transaction. If the checksum does not match at journal recovery time, it indicates that one or more metadata blocks in the transaction are corrupted or were not written to disk. Then the transaction (along with later ones) is discarded, as if the computer had crashed slightly earlier and not written a commit block at all.

Since the journal checksum in the commit block allows detection of blocks that were not written into the journal, as an added bonus there is no longer a need for a two-phase commit for each transaction. The commit block can be written at the same time as the rest of the blocks in the transaction. This can actually speed up filesystem operation noticeably (as much as 20% [7]), instead of the journal checksum being an overhead.

There are also some long-term plans to add checksumming to the extent tail, the allocation bitmaps, the inodes, and possibly also directories. This can be done efficiently once we have journal checksumming in place. Rather than computing the checksum of filesystem metadata each time it is changed (which has high overhead for often-modified structures), we can write the metadata to the checksummed journal and still be confident that it is valid and correct at recovery time. The blocks can have metadata-specific checksums computed a single time when they are written into the filesystem.

5 Other new features

New features are continuously being added to ext4. Two features expected to be seen in ext4 are nanosecond timestamps and inode versioning. These two features provide precision when dealing with file access times and tracking changes to files.

Ext3 has second-resolution timestamps, but with today's high-speed processors, this is not sufficient to record multiple changes to a file within a second. In ext4, since we use a larger inode, there is room to support nanosecond resolution timestamps. High 32-bit fields for the atime, mtime and ctime timestamps, and also a new crtime timestamp documenting file creation time, will be added to the ext4 inode (Figure 4). 30 bits are sufficient to represent the nanosecond field, and the remaining 2 bits are used to extend the epoch by 272 years.

The NFSv4 clients need the ability to detect updates to a file made at the server end, in order to keep the client-side cache up to date. Even with nanosecond support for ctime, the timestamp is not necessarily updated at the nanosecond level. The ext4 inode versioning feature addresses this issue by providing a global 64-bit counter in each inode. This counter is incremented whenever the file is changed. By comparing values of the counter, one can see whether the file has been updated. The counter is reset on file creation, and overflows are unimportant, because only equality is being tested. The i_version field already present in the 128-byte inode is used for the low 32 bits, and a high 32-bit field is added to the large ext4 inode.
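A small sketch of how one of the high 32-bit timestamp fields described above could be unpacked, assuming the split of 30 bits of nanoseconds and 2 bits of epoch extension; the exact bit positions within the *_extra fields are an assumption here, not taken from the on-disk specification:

/* Unpack an *_extra timestamp word into its two parts (illustrative). */
static void decode_time_extra(unsigned int extra,
                              unsigned int *epoch_bits,
                              unsigned int *nsec)
{
        *epoch_bits = extra & 0x3;   /* 2 bits: extends the 32-bit seconds epoch */
        *nsec       = extra >> 2;    /* 30 bits: 0..999,999,999 nanoseconds */
}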

6 Migration tool

Ext3 developers worked to maintain backwards compatibility between ext2 and ext3, a characteristic users appreciate and depend on. While ext4 attempts to retain compatibility with ext3 as much as possible, some of the incompatible on-disk layout changes are unavoidable. Even with these changes, users can still easily upgrade their ext3 filesystem to ext4, as is possible from ext2 to ext3. There are methods available for users to try new ext4 features immediately, or migrate their entire filesystem to ext4 without requiring a back-up and restore.

6.1 Upgrading from ext3 to ext4

There is a simple upgrade solution for ext3 users to start using extents and some ext4 features without requiring a full backup or migration. By mounting an existing ext3 filesystem as ext4 (with extents enabled), any new files are created using extents, while old files are still indirect block mapped and interpreted as such. A flag in the inode differentiates between the two formats, allowing both to coexist in one ext4 filesystem. All new ext4 features based on extents, such as preallocation and multiple block allocation, are available to the new extents files immediately.

A tool will also be available to perform a system-wide filesystem migration from ext3 to ext4. This migration tool performs two functions: migrating from indirect to extents mapping, and enlarging the inode to 256 bytes.

• Extents migration: The first step can be performed online and uses the defragmentation tool. During the defragmentation process, files are changed to extents mapping. In this way, the files are being converted to extents and defragmented at the same time.

• Inode migration: Enlarging the inode structure size must be done offline. In this case, data is backed up, and the entire filesystem is scanned and converted to extents mapping and large inodes.

For users who are not yet ready to move to ext4, but may want to in the future, it is possible to prepare their ext3 filesystem to avoid offline migration later. If an ext3 filesystem is formatted with a larger inode structure, 256 bytes or more, the fast extended attribute feature (Section 2.6), which is the default in ext4, can be used instantly. When the user later wants to upgrade to ext4, other ext4 features using the larger inode size, such as nanosecond timestamps, can also be used without requiring any offline migration.
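In command form, the preparation and upgrade path sketched above might look as follows. These invocations are illustrative of the development snapshots of the time, when ext4 was staged under the "ext4dev" name; the device names are hypothetical and the exact option names varied between snapshots:

$ mke2fs -j -I 256 /dev/sdb1
$ mount -t ext4dev -o extents /dev/sdb1 /mnt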
6.2 Downgrading from ext4 to ext3

Though not as straightforward as ext3 to ext4, there is a path for any user who may want to downgrade from ext4 back to ext3. In this case the user would remount the filesystem with the noextents mount option, copy all files to temporary files, and rename those files over the original files. After all files have been converted back to the indirect block mapping format, the INCOMPAT_EXTENTS flag must be cleared using tune2fs, and the filesystem can be re-mounted as ext3.

7 Performance evaluation

We have conducted a performance evaluation of ext4, as compared to ext3 and XFS, on three well-known filesystem benchmarks. Ext4 was tested with extents and delayed allocation enabled. The benchmarks in this analysis were chosen to show the impact of the new changes in ext4. The three benchmarks chosen were: Flexible Filesystem Benchmark (FFSB) [1], Postmark [5], and IOzone [2]. FFSB, configured with a large file workload, was used to test the extents feature in ext4. Postmark was chosen to see the performance of ext4 on small file workloads. Finally, we used IOzone to evaluate overall ext4 filesystem performance.

The tests were all run on the 2.6.21-rc4 kernel with delayed allocation patches. For the ext3 and ext4 tests, the filesystem was mounted in writeback mode, and the appropriate extents and delayed allocation mount options were set for ext4. Default mount options were used for XFS testing.

FFSB and IOzone benchmarks were run on the same 4-CPU 2.8 GHz Intel(R) Xeon(tm) system with 2 GB of RAM, on a 68 GB Ultra320 SCSI disk (10,000 RPM). Postmark was run on a 4-CPU 700 MHz Pentium(R) III system with 4 GB of RAM on a 9 GB SCSI disk (7,200 RPM). Full test results including raw data are available at the ext4 wiki page, http://ext4.wiki.kernel.org.

7.1 FFSB comparison

FFSB is a powerful filesystem benchmarking tool that can be tuned to simulate very specific workloads. We have tested multithreaded creation of large files. The test runs 4 threads, which combined create 24 1-GB files, and stresses the sequential write operation.

The results, shown in Figure 6, indicate about a 35% improvement in throughput and a 40% decrease in CPU utilization in ext4 as compared to ext3. This performance improvement shows a diminishing gap between ext4 and XFS on sequential writes. As expected, the results verify that extents and delayed allocation improve performance on large contiguous file creation.

[Figure 6: FFSB sequential write comparison]
7.2 Postmark comparison

Postmark is a well-known benchmark simulating a mail server performing many single-threaded transactions on small to medium files. The graph in Figure 7 shows about a 30% throughput gain with ext4. Similar percentage improvements in CPU utilization are seen, because metadata is much more compact with extents. The write throughput is higher than the read throughput because everything is being written to memory.

[Figure 7: Postmark read write comparison]

These results show that, aside from the obvious performance gain on large contiguous files, ext4 is also a good choice on smaller file workloads.
7.3 IOzone comparison

For the IOzone benchmark testing, the system was booted with only 64 MB of memory to really stress disk I/O. The tests were performed with 8 MB record sizes on various file sizes. Write, rewrite, read, reread, random write, and random read operations were tested. Figure 8 shows throughput results for 512 MB sized files. Overall, there is great improvement between ext3 and ext4, especially on rewrite, random-write and reread operations. In this test, XFS still has better read performance, while ext4 has shown higher throughput on write operations.

[Figure 8: IOzone results: throughput of transactions on 512 MB files]

8 Conclusion

As we have discussed, the new ext4 filesystem brings many new features and enhancements to ext3, making it a good choice for a variety of workloads. A tremendous amount of work has gone into bringing ext4 to Linux, with a busy roadmap ahead to finalize ext4 for production use. What was once essentially a simple filesystem has become an enterprise-ready solution, with a good balance of scalability, reliability, performance and stability. Soon, the ext3 user community will have the option to upgrade their filesystem and take advantage of the newest generation of the ext family.

Acknowledgements

The authors would like to extend their thanks to Jean-Noël Cordenner and Valérie Clément, for their help on performance testing and analysis, and development and support of ext4.

We would also like to give special thanks to Andrew Morton for supporting ext4, and helping to bring ext4 to mainline Linux. We also owe thanks to all ext4 developers who work hard to make the filesystem better, especially: Ted Ts'o, Stephen Tweedie, Badari Pulavarty, Dave Kleikamp, Eric Sandeen, Amit Arora, Aneesh Veetil, and Takashi Sato.

Finally, thank you to all ext3 users who have put their faith in the filesystem, and inspire us to strive to make ext4 better.

Legal Statement

Copyright © 2007 IBM.

This work represents the view of the authors and does not necessarily represent the view of IBM.

IBM and the IBM logo are trademarks or registered trademarks of International Business Machines Corporation in the United States and/or other countries.

Lustre is a trademark of Cluster File Systems, Inc.

Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.

Other company, product, and service names may be trademarks or service marks of others.

References in this publication to IBM products or services do not imply that IBM intends to make them available in all countries in which IBM operates.

This document is provided "AS IS," with no express or implied warranties. Use the information in this document at your own risk.

References

[1] FFSB project on SourceForge. http://sourceforge.net/projects/ffsb.

[2] IOzone. http://www.iozone.org.

[3] Mingming Cao, Theodore Y. Ts'o, Badari Pulavarty, Suparna Bhattacharya, Andreas Dilger, and Alex Tomas. State of the art: Where we are with the ext3 filesystem. In Ottawa Linux Symposium, 2005.

[4] Jonathan Corbet. Which filesystem for Samba4? http://lwn.net/Articles/112566/.

[5] Jeffrey Katcher. Postmark: A new filesystem benchmark. Technical report, Network Appliance, 2002.

[6] Daniel Phillips. A directory index for ext2. In 5th Annual Linux Showcase and Conference, pages 173–182, 2001.

[7] Vijayan Prabhakaran, Lakshmi N. Bairavasundaram, Nitin Agrawal, Haryadi S. Gunawi, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. IRON file systems. In SOSP '05, pages 206–220, 2005.

[8] Stephen Tweedie. Ext3 journalling filesystem. In Ottawa Linux Symposium, 2000.

[9] Stephen Tweedie and Theodore Y. Ts'o. Planned extensions to the Linux ext2/3 filesystem. In USENIX Annual Technical Conference, pages 235–244, 2002.

The 7 dwarves: debugging information beyond gdb

Arnaldo Carvalho de Melo
Red Hat, Inc.
[email protected]
[email protected]

Abstract

The DWARF debugging information format has been so far used in debuggers such as gdb, and more recently in tools such as systemtap and frysk.

In this paper the author will show additional scenarios where such information can be useful, such as:

• Showing the layout of data structures;

• Reorganizing such data structures to remove alignment holes;

• Improving CPU cache utilization;

• Displaying statistics about inlining of functions;

• Re-creating structs and functions from the debugging information;

• Showing binary diffs to help understand the effects of any code change.

And much more.

1 Introduction

This paper talks about new ways to use the DWARF debugging information inserted into binaries by compilers such as gcc.

The author developed several tools that allow:

• Extracting useful information about data structure layout;

• Finding holes and padding inserted by the compiler to follow alignment constraints in processor architectures;

• Finding out possibilities for reduction of such data structures;

• Use of information about function parameters and return types to generate Linux kernel modules for obtaining data needed for generation of callgraphs and values set to fields at runtime;

• A tool that, given two object files, shows a binary diff to help understand the effects of source code changes on the size of functions and data structures.

Some use cases will be presented, showing how the tools can be used to solve real world problems.

Ideas discussed with some fellow developers but not yet tried will also be presented, with the intent of hopefully having them finally tested in practice by interested readers.

2 DWARF Debugging Format

DWARF [3] is a debugging file format used by many compilers to store information about data structures, variables, functions and other language aspects needed by high level debuggers.

It has had three major revisions, with the second being incompatible with the first; the third is an expansion of the second, adding support for more C++ concepts, providing ways to eliminate data duplication, and support for debugging data in shared libraries and in files larger than 4 GB.

The DWARF debugging information is organized in several ELF sections in object files, some of which will be mentioned here. Please refer to the DWARF [3] specification for a complete list. Recent developments in tools such as elfutils [2] allow for separate files with the debugging information, but the common case is for the information to be packaged together with the respective object file.

Debugging data is organized in tags with attributes. Tags can be nested to represent, for instance, variables inside lexical blocks, parameters for a function, and other hierarchical concepts.

As an example let us look at how the ubiquitous hello world example is represented in the ".debug_info" section, the one section that is most relevant to the subject of this paper:

$ cat hello.c
int main(void)
{
        printf("hello, world!\n");
}

Using gcc with the -g flag to insert the debugging information:

$ gcc -g hello.c -o hello

Now let us see the output, slightly edited for brevity, from eu-readelf, a tool present in the elfutils package:

$ eu-readelf -winfo hello
DWARF section '.debug_info' at offset 0x6b3:
[Offset]
Compilation unit at offset 0:
Version: 2, Abbrev section offset: 0,
Addr size: 4, Offset size: 4
[b]  compile_unit
     stmt_list 0
     high_pc 0x0804837a
     low_pc 0x08048354
     producer "GNU C 4.1.1"
     language ISO C89 (1)
     name "hello.c"
     comp_dir "~/examples"
[68] subprogram
     external
     name "main"
     decl_file 1
     decl_line 2
     prototyped
     type [82]
     low_pc 0x08048354
     high_pc 0x0804837a
     frame_base location list [0]
[82] base_type
     name "int"
     byte_size 4
     encoding signed (5)

Entries starting with [number] are the DWARF tags, which are represented in the tool's source code as DW_TAG_tag_name. In the above output we can see some: DW_TAG_compile_unit, with information about the object file being analyzed; DW_TAG_subprogram, emitted for each function, such as "main"; and DW_TAG_base_type, emitted for the language basic types, such as int.

Each tag has a number of attributes, represented in source code as DW_AT_attribute_name. In the DW_TAG_subprogram for the "main" function we have some: DW_AT_name ("main"); DW_AT_decl_file, which is an index into another DWARF section with the names of the source code files; DW_AT_decl_line, the line in the source code where this function was defined; and DW_AT_type, the return type for the "main" routine. Its value is a tag index, which in this case refers to the [82] tag, the DW_TAG_base_type for "int." There is also the address of this function, DW_AT_low_pc.

The following example exposes some additional DWARF tags and attributes used in the seven dwarves. The following struct:

struct swiss_cheese {
        char a;
        int b;
};

is represented as:

[68] structure_type
     name "swiss_cheese"
     byte_size 8
[7d] member
     name "a"
     type [96]
     data_member_location 0
[89] member
     name "b"
     type [9e]
     data_member_location 4
[96] base_type
     name "char"
     byte_size 1
[9e] base_type
     name "int"
     byte_size 4

In addition to the tags already described, we now have DW_TAG_structure_type to start the representation of a struct, which has the DW_AT_byte_size attribute stating how many bytes the struct takes (8 bytes in this case). There is also another tag, DW_TAG_member, that represents each struct member. It has the DW_AT_byte_size and DW_AT_data_member_location attributes, the latter being the offset of this member in the struct. There are more attributes, but for brevity and for the purposes of this paper, the above are enough to describe.
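The data_member_location values above can be cross-checked from C itself; a small (assumed) test program using the standard offsetof macro prints the same offsets the compiler recorded in the DWARF output:

#include <stdio.h>
#include <stddef.h>

struct swiss_cheese {
        char a;
        int b;
};

int main(void)
{
        /* matches data_member_location 0 and 4, and byte_size 8, above */
        printf("a=%zu b=%zu size=%zu\n",
               offsetof(struct swiss_cheese, a),
               offsetof(struct swiss_cheese, b),
               sizeof(struct swiss_cheese));
        return 0;
}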

3 The 7 dwarves

The seven dwarves are tools that use the DWARF debugging information to examine data struct layout (pahole), examine executable code characteristics (pfunct), compare executables (codiff), trace execution of functions associated with a struct (ctracer), pretty-print DWARF information (pdwtags), list global symbols (pglobal), and count the number of times each set of tags is used (prefcnt).

Some are very simple and still require work, while others, such as pahole and pfunct, are already being helpful in open source projects such as the Linux kernel, xine-lib and perfmon2. One possible use is to pretty-print DWARF information accidentally left in binary-only kernel modules released publicly.

All of these tools use a library called libdwarves, which is packaged with the tools and uses the DWARF libraries found in elfutils [2]. By using elfutils, many of its features, such as relocation, reading of object files for many architectures, and use of separate files with debugging information, were leveraged, allowing the author to concentrate on the features that will be presented in the next sections.

Unless otherwise stated, the examples in the following sections use a Linux kernel image built for the x86-64 architecture from recent source code (circa 2.6.21-rc5).

The Linux kernel configuration option CONFIG_DEBUG_INFO has to be selected to instruct the compiler to insert the DWARF information. This will make the image much bigger, but poses no performance impact on the resulting binary, just like when building programs with debug information.

3.1 pahole

Poke-a-hole, the first dwarf, is used to find alignment holes in structs. It is the most advanced of all the tools in this project so far.

Architectures have alignment constraints, requiring data types to be aligned in memory in multiples of their word-size. While compilers do automatically align data structures, careful planning by the developer is essential to minimize the paddings ("holes") required for correct alignment of the data structure members.

An example of a bad struct layout better illustrates this situation:

struct cheese {
        char name[17];
        short age;
        char type;
        int calories;
        short price;
        int barcode[4];
};

Adding up the sizes of the members, one could expect the size of struct cheese to be 17 + 2 + 1 + 4 + 2 + 16 = 42 bytes. But due to alignment constraints the real size ends up being 48 bytes.

Using pahole to pretty-print the DWARF tags will show where the 6 extra bytes are:

/* <11b> ~/examples/swiss_cheese.c:3 */
struct cheese {
        char name[17];   /*  0 17 */

        /* XXX 1 byte hole, try to pack */

        short age;       /* 18  2 */
        char type;       /* 20  1 */

        /* XXX 3 bytes hole, try to pack */

        int calories;    /* 24  4 */
        short price;     /* 28  2 */

        /* XXX 2 bytes hole, try to pack */

        int barcode[4];  /* 32 16 */
}; /* size: 48, cachelines: 1 */
   /* sum members: 42, holes: 3 */
   /* sum holes: 6 */
   /* last cacheline: 48 bytes */

This shows that on this architecture the alignment rules state that a short has to be aligned at a multiple-of-2 offset from the start of the struct, and an int has to be aligned at a multiple of 4, the word-size of the example architecture.

Another alignment rule aspect is that a struct perfectly arranged for a 32-bit architecture, such as:

$ pahole long
/* <67> ~/examples/long.c:1 */
struct foo {
        int a;      /*  0 4 */
        void *b;    /*  4 4 */
        char c[4];  /*  8 4 */
        long g;     /* 12 4 */
}; /* size: 16, cachelines: 1 */
   /* last cacheline: 16 bytes */

has holes when built on an architecture with a different word-size:

$ pahole long
/* <6f> ~/examples/long.c:1 */
struct foo {
        int a;      /*  0 4 */

        /* XXX 4 bytes hole, try to pack */

        void *b;    /*  8 8 */
        char c[4];  /* 16 4 */

        /* XXX 4 bytes hole, try to pack */

        long g;     /* 24 8 */
}; /* size: 32, cachelines: 1 */
   /* sum members: 24, holes: 2 */
   /* sum holes: 8 */
   /* last cacheline: 32 bytes */

This is because on x86-64 the size of pointers and long integers is 8 bytes, with the alignment rules requiring these basic types to be aligned at multiples of 8 bytes from the start of the struct.

To help in these cases, pahole provides the --reorganize option, where it will reorganize the struct trying to achieve optimum placement regarding memory consumption, while following the alignment rules.

Running it on the x86-64 platform we get:

$ pahole --reorganize -C foo long
struct foo {
        int a;      /*  0 4 */
        char c[4];  /*  4 4 */
        void *b;    /*  8 8 */
        long g;     /* 16 8 */
}; /* size: 24, cachelines: 1 */
   /* last cacheline: 24 bytes */
   /* saved 8 bytes! */

There is another option, --show_reorg_steps, that sheds light on what was done:

$ pahole --show_reorg_steps --reorganize -C foo long
/* Moving 'c' from after 'b' to after 'a' */
struct foo {
        int a;      /*  0 4 */
        char c[4];  /*  4 4 */
        void *b;    /*  8 8 */
        long g;     /* 16 8 */
}; /* size: 24, cachelines: 1 */
   /* last cacheline: 24 bytes */

While in this case there was just one step done, using this option on more complex structs can involve many steps, which would have been shown to help in understanding the changes performed. Other steps in the --reorganize algorithm include:

• Combining separate bit fields

• Demoting bit fields to a smaller basic type when the type being used has more bits than required by the members in the bit field (e.g. int a:1, b:2; being demoted to char a:1, b:2;)

• Moving members from the end of the struct to fill holes

• Combining the padding at the end of a struct with a hole

Several modes to summarize information about all the structs in object files were also implemented. They will be presented in the following examples.

The top ten structs by size are:

$ pahole --sizes vmlinux | sort -k2 -nr | head
hid_parser: 65784 0
hid_local: 65552 0
kernel_stat: 33728 0
module: 16960 8
proto: 16640 2
pglist_data: 14272 2
avc_cache: 10256 0
inflate_state: 9544 0
ext2_sb_info: 8448 2
tss_struct: 8320 0

The second number represents the number of alignment holes in the structs.

Yes, some are quite big, and even the author got impressed with the size of the first few ones, which is one of the common ways of using this tool to find areas that could get some help in reducing data structure sizes. So the next step would be to pretty-print this specific struct, hid_local:

$ pahole -C hid_local vmlinux
/* <175c261> ~/net-2.6.22/include/linux/hid.h:300 */
struct hid_local {
        uint usage[8192];      //     0 32768
        // cacheline 512 boundary (32768 bytes)
        uint cindex[8192];     // 32768 32768
        // cacheline 1024 boundary (65536 bytes)
        uint usage_index;      // 65536 4
        uint usage_minimum;    // 65540 4
        uint delimiter_depth;  // 65544 4
        uint delimiter_branch; // 65548 4
}; /* size: 65552, cachelines: 1025 */
   /* last cacheline: 16 bytes */

So, this is indeed something to be investigated, not a bug in pahole.

As mentioned, the second column is the number of alignment holes. Sorting by this column provides another picture of the project being analyzed that could help in finding areas for further work:

$ pahole --sizes | sort -k3 -nr | head
net_device: 1664 14
vc_data: 432 11
tty_struct: 1312 10
task_struct: 1856 10
request_queue: 1496 8
module: 16960 8
mddev_s: 672 8
usbhid_device: 6400 6
device: 680 6
zone: 2752 5

There are lots of opportunities to use the --reorganize results, but in some cases this is not true, because the holes are due to member alignment constraints specified by the programmers.

Alignment hints are needed, for example, when a set of fields in a structure is "read mostly," while others are regularly written to. So, to make it more likely that the "read mostly" cachelines are not invalidated by writes on SMP machines, attributes are used on the struct members instructing the compiler to align some members at cacheline boundaries.

Here is one example, in the Linux kernel, of an alignment hint on struct net_device, which appeared in the above output:

/*
 * Cache line mostly used on receive
 * path (including eth_type_trans())
 */
struct list_head poll_list
        ____cacheline_aligned_in_smp;

If we look at the excerpt of the pahole output for this struct where poll_list is located, we will see one of the holes:

/* cacheline 4 boundary (256 bytes) */
void *dn_ptr;    /* 256 8 */
void *ip6_ptr;   /* 264 8 */
void *ec_ptr;    /* 272 8 */
void *ax25_ptr;  /* 280 8 */

/* XXX 32 bytes hole, try to pack */

/* cacheline 5 boundary (320 bytes) */
struct list_head poll_list; /* 320 16 */

These kinds of annotations are not represented in the DWARF information, so the current --reorganize algorithm can not be precise. One idea is to use the DWARF tags with the file and line location of each member to parse the source code looking for alignment annotation patterns, but this has not been tried.

Having stated the previous possible inaccuracies in the --reorganize algorithm, it is still interesting to use it on all the structs in an object file to present a list of structs where the algorithm was successful in finding a new layout that saves bytes.

Using the above kernel image, the author found 165 structs where holes can be combined to save some bytes. The biggest savings found are:

$ pahole --packable vmlinux | sort -k4 -nr | head
vc_data 432 176 256
net_device 1664 1448 216
module 16960 16848 112
hh_cache 192 80 112
zone 2752 2672 80
softnet_data 1792 1728 64
rcu_ctrlblk 128 64 64
inet_hashinfo 384 320 64
entropy_store 128 64 64
task_struct 1856 1800 56

The columns are: struct name, current size, reorganized size, and bytes saved. In the above list of structs, only a few clearly, from source code inspection, do not have any explicit alignment constraint. Further analysis is required to verify whether the explicit constraints are still needed after the evolution of the subsystems that use such structs, and whether the holes are really needed to isolate groups of members or could be reused.

The --expand option is useful in analyzing crash dumps, where the available clue was an offset from a complex struct, requiring tedious manual calculation to find out exactly what field was involved. It works by "unfolding" structs, as will be shown in the following example.

In a program with the following structs:

struct spinlock {
        int magic;
        int counter;
};

struct sock {
        int protocol;
        struct spinlock lock;
};

struct inet_sock {
        struct sock sk;
        long daddr;
};

struct tcp_sock {
        struct inet_sock inet;
        long cwnd;
        long ssthresh;
};

the --expand option, applied to the tcp_sock struct, produces:

struct tcp_sock {
        struct inet_sock {
                struct sock {
                        int protocol;           /*  0 4 */
                        struct spinlock {
                                int magic;      /*  4 4 */
                                int counter;    /*  8 4 */
                        } lock;                 /*  4 8 */
                } sk;                           /*  0 12 */
                long daddr;                     /* 12 4 */
        } inet;                                 /*  0 16 */
        long cwnd;                              /* 16 4 */
        long ssthresh;                          /* 20 4 */
}; /* size: 24 */

The offsets are relative to the start of the top level struct (tcp_sock in the above example).

In a program with the following structs: • function name length struct spinlock { • number of parameters int magic; • size of functions int counter; }; • number of variables 2007 Linux Symposium, Volume Two • 41
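As a hypothetical worked example of the crash-dump use case: if an oops report pointed at an access 20 bytes into a struct tcp_sock, the expanded output above resolves the offset directly, with no manual summing of member sizes:

offset 20 -> long ssthresh;       /* 20 4 */
offset  4 -> inet.sk.lock.magic   /* the spinlock starts at offset 4,
                                     and magic is its first member */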

3.2 pfunct

While pahole specializes on data structures, pfunct concentrates on aspects of functions, such as:

• number of goto labels
• function name length
• number of parameters
• size of functions
• number of variables
• size of inline expansions

It also has filters to show functions that meet several criteria, including:

• functions that have as parameters pointers to a struct
• external functions
• declared inline, un-inlined by compiler
• not declared inline, inlined by compiler

Also, a set of statistics is available, such as the number of times an inline function was expanded and the sum of these expansions, to help finding candidates for un-inlining, thus reducing the size of the binary.

The top ten functions by size:

$ pfunct --sizes vmlinux | sort -k2 -nr | head
hidinput_connect: 9910
load_elf32_binary: 6793
load_elf_binary: 6489
tcp_ack: 6081
sys_init_module: 6075
do_con_write: 5972
zlib_inflate: 5852
vt_ioctl: 5587
copy_process: 5169
usbdev_ioctl: 4934

One of the attributes of the DW_AT_subprogram DWARF tag, that represents functions, is DW_AT_inline, which can have one of the following values:

• DW_INL_not_inlined – Neither declared inline nor inlined by the compiler

• DW_INL_inlined – Not declared inline but inlined by the compiler

• DW_INL_declared_not_inlined – Declared inline but not inlined by the compiler

• DW_INL_declared_inlined – Declared inline and inlined by the compiler

The --cc_inlined and --cc_uninlined options in pfunct use this information. Here are some examples of functions that were not explicitly marked as inline by the programmers but were inlined by gcc:

$ pfunct --cc_inlined vmlinux | tail
do_initcalls
do_basic_setup
smp_init
do_pre_smp_initcalls
check_bugs
setup_command_line
boot_cpu_init
obsolete_checksetup
copy_bootdata
clear_bss

For completeness, the number of inlined functions was 2526.
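For instance, a small static helper such as the following (a hypothetical example, not from the kernel) is a typical candidate to show up under --cc_inlined, since gcc at -O2 will usually inline it even though it is not declared inline:

static int sq(int x)
{
        /* small enough that the compiler inlines it on its own,
         * so its DW_AT_inline value becomes DW_INL_inlined */
        return x * x;
}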

3.3 codiff

An object file diff tool, codiff, takes two versions of a binary, loads from both files the debugging information, compares them and shows the differences in structs and functions, producing output similar to the well known diff tool.

Consider a program that has a print_tag function, handling the following struct:

struct tag {
        int type;
        int decl_file;
        char *decl_line;
};

and in a newer version the struct was changed to this new layout, while the print_tag function remained unchanged:

struct tag {
        char type;
        int decl_file;
        char *decl_line;
        int refcnt;
};

The output produced by codiff would be:

$ codiff tag-v1 tag-v2
tag.c:
  struct tag  |  +4
 1 struct changed
  print_tag   |  +4
 1 function changed, 4 bytes added

It is similar to the diff tool, showing how many bytes were added to the modified struct and the effect of this change in a routine that handles instances of this struct.

The --verbose option tells us the details:

$ codiff -V tag-v1 tag-v2
tag.c:
  struct tag  |  +4
    nr_members: +1
    +int refcnt      /* 12 4 */
    type
     from: int       /*  0 4 */
     to:   char      /*  0 1 */
 1 struct changed
  print_tag   |  +4 # 29 → 33
 1 function changed, 4 bytes added

The extra information on modified structs includes:

• Number of members added and/or removed
• List of new and/or removed members
• Offset of members from start of struct
• Type and size of members
• Members that had their type changed

And for functions:

• Size difference
• Previous size → New size
• Names of new and/or removed functions

3.4 ctracer

A class tracer, ctracer, is an experiment in creating valid source code from the DWARF information.

For ctracer a method is any function that receives as one of its parameters a pointer to a specified struct. It looks for all such methods and generates kprobes entry and exit functions. At these probe points it collects information about the data structure internal state, saving the values of its members at that point in time, and records it in a relay buffer. The data is later collected in userspace and post-processed, generating html + CSS callgraphs.

One of the techniques used in ctracer involves creating subsets of data structures based on some criteria, such as member name or type. This tool so far just filters out any non-integer type members and applies the --reorganize code² on the resulting mini struct to possibly reduce the memory space needed for relaying this information to userspace.

One idea that probably will be pursued is to generate SystemTap [1] scripts instead of C language source files using kprobes, taking advantage of the infrastructure and safety guards in place in SystemTap.

²Discussed in the pahole section.

3.5 pdwtags

A simple tool, pdwtags, is used to pretty-print DWARF tags for data structures (struct, union), enumerations and functions, in an object file. It is useful as an example of how to use libdwarves.

Here is an example on the hello world program:

$ pdwtags hello
/* <68> /home/acme/examples/hello.c:2 */
int main(void)
{
}

This shows the "main" DW_TAG_subprogram tag, with its return type.

In the previous sections other examples of DWARF tag formatting were presented; tags for variables, function parameters, and goto labels would also appear if pdwtags was used on the same object file.

3.6 pglobal

pglobal is an experimentation to print global variables and functions, written by a contributor, Davi Arnault.

This example:

$ cat global.c
int variable = 2;

int main(void)
{
        printf("variable=%d\n", variable);
}

would present this output with pglobal:

$ pglobal -ve hello
/* <89> /home/acme/examples/global.c:1 */
int variable;

which shows a list of global variables with their types, source code file, and line where they were defined.

3.7 prefcnt

prefcnt is an attempt to do reference counting on tags, trying to find some that are not referenced anywhere and could be removed from the source files.

4 Availability

The tools are maintained in a git repository that can be browsed at http://git.kernel.org/?p=linux/kernel/git/acme/pahole.git, and rpm packages for several architectures are available at http://oops.ghostprotocols.net:81/acme/dwarves/rpm/.

Acknowledgments

The author would like to thank Davi Arnault for pglobal, proving that libdwarves was not so horrible for a tool writer; all the people who contributed patches, suggestions, and encouragement on writing these tools; and Ademar Reis, Aristeu Rozanski, Claudio Matsuoka, Glauber Costa, Eduardo Habkost, Eugene Teo, Leonardo Chiquitto, Randy Dunlap, Thiago Santos, and William Cohen for reviewing and suggesting improvements for several drafts of this paper.

References

[1] Systemtap. http://sourceware.org/systemtap.

[2] Ulrich Drepper. elfutils home page. http://people.redhat.com/drepper.

[3] DWARF Debugging Information Format Workgroup. DWARF debugging information format, December 2005. http://dwarfstd.org.

Adding Generic Process Containers to the Linux Kernel

Paul B. Menage∗ Google, Inc. [email protected]

*With additional contributions by Balbir Singh and Srivatsa Vaddagiri, IBM Linux Technology Center, {balbir,vatsa}@in.ibm.com

Abstract

While Linux provides copious monitoring and control options for individual processes, it has less support for applying the same operations efficiently to related groups of processes. This has led to multiple proposals for subtly different mechanisms for process aggregation for resource control and isolation. Even though some of these efforts could conceptually operate well together, merging each of them in their current states would lead to duplication in core kernel data structures/routines.

The Containers framework, based on the existing cpusets mechanism, provides the generic process grouping features required by the various different resource controllers and other process-affecting subsystems. The result is to reduce the code (and kernel impact) required for such subsystems, and provide a common interface with greater scope for co-operation.

This paper looks at the challenges in meeting the needs of all the stakeholders, which include low overhead, feature richness, completeness and flexible groupings. We demonstrate how to extend containers by writing resource control and monitoring components; we also look at how to implement namespaces and cpusets on top of the framework.

1 Introduction

Over the course of Linux history, there have been and continue to be multiple efforts to provide forms of modified behaviour across sets of processes. These efforts have tended to fall into two main camps, resource control/monitoring, and namespace isolation (although some projects contain elements of both).

Technically resource control and isolation are related; both prevent a process from having unrestricted access to even the standard abstraction of resources provided by the kernel. For the purposes of this paper, we use the following definitions:

Resource control is any mechanism which can do either or both of:

• tracking how much of a resource is being consumed by a set of processes

• imposing quantitative limits on that consumption, either absolutely, or just in times of contention.

Typically resource control is visible to the processes being controlled.

Namespace isolation¹ is a mechanism which adds an additional indirection or translation layer to the naming/visibility of some unix resource space (such as process ids, or network interfaces) for a specific set of processes. Typically the existence of isolation itself is invisible to the processes being isolated.

Resource control and isolation are both subsets of the general model of a subsystem which can apply behaviour differently depending on a process' membership of some specific group. Other examples could include:

• basic job control. A job scheduling system needs to be able to keep track of which processes are part of a given running "job," in the presence of fork() and exit() calls from processes in that job. This is the simplest example of the kind of process tracking system proposed in this paper, as the kernel need do nothing more than maintain the membership list of processes in the job.

¹We avoid using the alternative term virtualization in this paper to make clear the distinction between lightweight in-kernel virtualization/isolation, and the much more heavyweight virtual machine hypervisors such as Xen which run between the kernel and the hardware.

• tracking memory pressure for a group of processes, and being able to receive notifications when memory pressure reaches some particular level (e.g. as measured by the scanning level reached in try_to_free_pages()), or reaches the OOM stage.

Different projects have proposed various basic mechanisms for implementing such process tracking. The drawbacks of having multiple such mechanisms include:

• different and mutually incompatible user-space and kernel APIs and feature sets.

• the merits of the underlying process grouping mechanisms and the merits of the actual resource control/isolation system become intertwined.

• systems that could work together to provide synergistic control of different resources to the same sets of processes are unable to do so easily since they're based on different frameworks.

• kernel structures and code get bloated with additional framework pointers and hooks.

• writers of such systems have to duplicate large amounts of functionally-similar code.

The aim of the work described in this paper is to provide a generalized process grouping mechanism suitable for use as a base for current and future Linux process-control mechanisms such as resource controllers; the intention is that writers of such controllers need not be concerned with the details of how processes are being tracked and partitioned, and can concentrate on the particular resource controller at hand, and the resource abstraction presented to the user. (Specifically, this framework does not attempt to prescribe any particular resource abstraction.) The requirements for a process-tracking framework to meet the needs of existing and future process-management mechanisms are enumerated, and a proposal is made that aims to satisfy these requirements.

For the purposes of this paper, we refer to a tracked group of processes as a container. Whether this is a suitable final name for the concept is currently the subject of debate on various Linux mailing lists, due to its use by other development groups for a similar concept in userspace. Alternative suggestions have included partition and process set.

2 Requirements

In this section we attempt to enumerate the properties required of a generic container framework. Not all mechanisms that depend on containers will require all of these properties.

2.1 Multiple Independent Subsystems

Clients of the container framework will typically be resource accounting/control systems and isolation systems. In this paper we refer to the generic client as a subsystem. The relationship between the container framework and a subsystem is similar to that between the Linux VFS and a specific filesystem—the framework handles many of the common operations, and passes notifications/requests to the subsystem.

Different users are likely to want to make use of different subsystems; therefore it should be possible to selectively enable different subsystems both at compile time and at runtime. The main function of the container framework is to allow a subsystem to associate some kind of policy/stats (referred to as the subsystem state) with a group of processes, without the subsystem having to worry in too much detail about how this is actually accomplished.

2.2 Mobility

It should be possible for an appropriately-privileged user to move a process between containers. Some subsystems may require that processes can only move into a container at the point when it is created (e.g. a virtual server system where the process becomes the new init process for the container); therefore it should be possible for mobility to be configurable on a per-subsystem basis.

2.3 Inescapability

Once a process has been assigned to a container, it shouldn't be possible for the process (or any of its children) to move to a different container without action by a privileged (i.e. root) user, or by a user to whom that capability has been delegated, e.g. via filesystem permissions.

2.4 Extensible

Different subsystems will need to present different configuration and reporting interfaces to userspace:

• a memory resource controller might want to allow the user to specify guarantees and limits on the number of pages that processes in a container can use, and report how many pages are actually in use.

• a memory-pressure tracker might want to allow the user to specify the particular level of memory pressure which should cause user notifications.

• the cpusets system needs to allow users to specify various parameters such as the bitmasks of CPUs and memory nodes to which processes in the container have access.

From the user's point of view it is simplest if there is at least some level of commonality between the configuration of different subsystems. Therefore the container framework should provide an interface that captures the common aspects of different subsystems but which still allows subsystems sufficient flexibility. Possible candidates include:

• A filesystem interface, where each subsystem can register files that the user can write (for configuration) and/or read (for reporting), has the advantages that it can be manipulated by many standard unix tools and library routines, has built-in support for permission delegation, and can provide arbitrary input/output formats and behaviour.

• An API (possibly a new system call?) that allows the user to read/write named properties would have the advantages that it would tend to present a more uniform interface (although possibly too restrictive for some subsystems) and potentially have slightly better performance than a filesystem-based interface.

2.5 Nesting

For some subsystems it is desirable for there to be multiple nested levels of containers:

• The existing cpusets system inherently divides and subdivides sets of memory nodes and CPUs between different groups of processes on the system.

• Nesting allows some fraction of the resources available to one set of processes to be delegated to a subset of those processes.

• Nested virtual servers may also find hierarchical support useful.

Some other subsystems will either be oblivious to the concept of nesting, or will actively want to avoid it; therefore the container system should allow subsystems to selectively control whether they allow nesting.

2.6 Multiple Partitions

The container framework will allow the user to partition the set of processes. Consider a system that is configured with the cpusets and beancounters subsystems. At any one time, a process will be in one and exactly one cpuset (a cpuset A may also be a child of some other cpuset B, in which case in a sense all processes in A are indirectly members of cpuset B as well, but a process is only a direct member of one cpuset). Similarly a process is only a direct member of one beancounter. So cpusets and beancounters are each a partition function on the set of processes.

Some initial work on this project produced a system that simply used the same partition function for all subsystems, i.e. for every cpuset created there would also be a beancounter created, and all processes moved into the new cpuset would also become part of the new beancounter. However, very plausible scenarios were presented to demonstrate that this was too limiting.

A generic container framework should support some way of allowing different partitions for different subsystems. Since the requirements suggest that supporting hierarchical partitions is useful, even if not required/desired for all subsystems, we refer to these partitions, without loss of generality, as hierarchies.

For illustration, we consider the following examples of dividing processes on a system.

[Figure 1: Example container divisions with independent (single-subsystem) hierarchies. (a) University Server: separate Cpusets, Memory, and Network hierarchies; the Cpusets and Memory hierarchies are each divided into sys/staff/students, while the Network hierarchy is divided into www/NFS/other. (b) Hosting Server: separate Cpusets, Memory, Network, and Isolation hierarchies, each divided into resellers R1 and R2, with virtual servers VS1–VS4 nested beneath the resellers.]

A University Timesharing System²

A university server has various users—students, professors, and system tasks. For CPU and memory, it is desired to partition the system according to the process owner's user ID, whereas for network traffic, it is desired to partition the system according to the traffic content (e.g. WWW browser related traffic across all users shares the same limit of 20%). Any single way of partitioning the system will make this kind of resource planning hard, potentially requiring a container for each tuple of the cross-product of the various configured resource subsystems. Allowing the system to be partitioned in multiple ways, depending on resource type, is therefore a desirable feature.

A Virtual Server System

A large commercial hosting server has many NUMA nodes. Blocks of nodes are sold to resellers (R1, R2) to give guaranteed CPU/memory resources. The resellers then sell virtual servers (VS1–VS4) to end users, potentially overcommitting their resources but not affecting other resellers on the system.

The server owner would use cpusets to restrict each reseller to a set of NUMA memory nodes/CPUs; the reseller would then (via root-delegated capabilities) use a server isolation subsystem and a resource control subsystem to create virtual servers with various levels of resource guarantees/limits, within their own cpuset resources.

Two possible approaches to supporting these examples are given below; the illustrative figures represent a kernel with cpusets, memory, network, and isolation subsystems.

2.6.1 Independent hierarchies

In the simplest approach, each hierarchy provides the partition for exactly one subsystem. So to use e.g. cpusets and beancounters on the same system, you would need to create a cpuset hierarchy and a beancounters hierarchy, and assign processes to containers separately for each subsystem.

This solution can be used to implement any of the other approaches presented below. The drawback is that it imposes a good deal of extra management work on userspace in the (we believe likely) common case when the partitioning is in fact the same across multiple or even all subsystems. In particular, moving processes between containers can have races—if userspace has to move a process as two separate actions on two different hierarchies, a child process forked during this operation might end up in inconsistent containers in the two hierarchies.

Figure 1 shows how containers on the university and hosting servers might be configured when each subsystem is an independent hierarchy. The cpusets and memory subsystems have duplicate configurations for the university server (which doesn't use the isolation subsystem), and the network, memory and isolation subsystems have duplicate configurations for the hosting server since they're the same for each virtual server.

²Example contributed by Srivatsa Vaddagiri.


[Figure 2: Example container divisions with multi-subsystem hierarchies. (a) University Server: a single combined Cpusets/Memory hierarchy divided into sys/staff/students, plus a separate Network hierarchy divided into www/NFS/other. (b) Hosting Server: one combined Memory/Network/Cpusets/Isolation hierarchy divided into resellers R1 and R2, with virtual servers VS1–VS4 nested beneath them.]

2.6.2 Multi-subsystem hierarchies

An extension to the above approach allows multiple subsystems to be bound on to the same hierarchy; so e.g. if you were using cpusets and beancounters, and you wanted the cpuset and beancounter assignments to be isomorphic for all processes, you could bind cpusets and beancounters to the same hierarchy of containers, and only have to operate on a single hierarchy when creating/destroying containers, or moving processes between containers.

Figure 2 shows how the support for multiple subsystems per hierarchy can simplify the configuration for the two example servers. The university server can merge the configurations for the cpusets and memory subsystems (although not for the network subsystem, since that's using an orthogonal division of processes). The hosting server can merge all four subsystems into a single hierarchy.

2.7 Non-process references to containers

Although the process is typically the object most universally associated with a container, it should also be possible to associate other objects—such as pages, file handles or network sockets—with a container, for accounting purposes or in order to affect the kernel's behaviour with respect to those objects.

Since such associations are likely to be subsystem-specific, the container framework needs primarily to be able to provide an efficient reference-counting mechanism, which will allow references to subsystem state objects to be made in such a way that they prevent the destruction of the associated container until the reference has been released.

2.8 Low overhead

The container framework should provide minimal additional runtime overhead over a system where individual subsystems are hard-coded into the source at all appropriate points (pointers in task_struct, additional fork() and exit() handlers, etc).

3 Existing/Related Work

In this section we consider the existing mechanisms for process tracking and control in Linux, looking at both those already included in the Linux source tree, and those proposed as bases for other efforts.

3.1 Unix process grouping mechanisms

Linux has inherited several concepts from classical Unix that can be used to provide some form of association between different processes. These include, in order of increasing specificity: group id (gid), user id (uid), session id (sid) and process group (pgrp).

Theoretically the gid and/or uid could be used as the identification portion of the container framework, and a mechanism for configuring per-uid/gid state could be added. However, these have the serious drawbacks that:

• Job-control systems may well want to run multiple jobs as the same user/group on the same machine.

• Virtual server systems will want to allow processes within a virtual server to have different uids/gids.

The sid and/or the pgrp might be suitable as a tracking base for containers, except for the fact that traditional Unix semantics allow processes to change their sid/pgrp (by becoming a session/group leader); removing this ability would be possible, but could result in unexpected breakage in applications. (In particular, requiring all processes in a virtual server to have the same sid or pgrp would probably be unmanageable.)

3.2 Cpusets

The cpusets system is the only major container-like system in the mainline Linux kernel. Cpusets presents a pseudo-filesystem API to userspace with semantics including:

• Creating a directory creates a new empty cpuset (an analog of a container).

• Control files in a cpuset directory allow you to control the set of memory nodes and/or CPUs that tasks in that cpuset can use.

• A special control file, tasks, can be read to list the set of processes in the cpuset; writing a pid to the tasks file moves a process into that cpuset.

3.3 Linux/Unix container systems

The Eclipse [3] and Resource Containers [4] projects both sought to add quality of service to Unix. Both supported hierarchical systems, allowing free migration of processes and threads between containers; Eclipse used the independent hierarchies model described in Section 2.6.1; Resource Containers bound all schedulers (subsystems) into a single hierarchy.

PAGG [5] is part of the SGI Comprehensive System Accounting project, adapted for Linux. It provides a generic container mechanism that tracks process membership and allows subsystems to be notified when processes are created or exit. A crucial difference between PAGG and the design presented in this paper is that PAGG allows a free-form association between processes and arbitrary containers. This results in more expensive access to container subsystem state, and more expensive fork()/exit() processing.

Resource Groups [2] (originally named CKRM – Class-based Kernel Resource Management) and BeanCounters [1] are resource control frameworks for Linux. Both provide multiple resource controllers and support additional controllers. ResGroups' support for additional controllers is more generic than the design proposed in this paper—we feel that the additional overheads that it introduces are unnecessary; BeanCounters is less generic, in that additional controllers have to be hard-coded into the existing BeanCounters source. Both frameworks also enforce a particular resource model on their resource controllers, which may be inappropriate for resource controllers with different requirements from those envisaged—for example, implementing cpusets with its current interface (or an equivalent natural interface) on top of either the BeanCounters or ResGroups abstractions would not be possible.

3.4 Virtual server systems

There have been a variety of virtual server systems developed for Linux; early commercial systems included Ensim's Virtual Private Server [6] and SWSoft's Virtuozzo [7]; these both provided various resource controllers along with namespace isolation, but no support for generic extension with user-provided subsystems. More recent open-sourced virtualization systems have included VServer [8] and OpenVZ [9], a GPL'd subset of the functionality of Virtuozzo.

3.5 NSProxy-based approaches

Recent work on Linux virtual-server systems [10] has involved providing multiple copies of the various namespaces (such as for IPC, process ids, etc) within the kernel, and having the namespace choice be made on a per-process basis. The fact that different processes can now have different IPC namespaces results in the requirement that these namespaces be reference-counted on fork() and released on exit(). To reduce the overhead of such reference counting, and to reduce the number of per-process pointers required for all these virtualizable namespaces, the struct nsproxy was introduced. This is a reference-counted structure holding reference-counted pointers to (theoretically) all the namespaces required for a task; therefore at fork()/exit() time only the reference count on the nsproxy object must be adjusted. Since in the common case a large number of processes are expected to share the same set of namespaces, this results in a reduction in the space required for namespace pointers and in the time required for reference counting, at the cost of an additional indirection each time one of the namespaces is accessed.
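As a rough sketch of the idea (abridged; the actual kernel structure holds several more namespace pointers):

/* Sketch of the nsproxy approach described above; the field set
 * is illustrative rather than exhaustive. */
struct nsproxy {
        atomic_t count;               /* shared by all tasks using this set */
        struct uts_namespace *uts_ns;
        struct ipc_namespace *ipc_ns;
        struct pid_namespace *pid_ns;
        /* ... */
};

/* fork() in the common case: the child shares the parent's
 * namespace set, so only one atomic increment is needed. */
static inline struct nsproxy *get_nsproxy(struct nsproxy *ns)
{
        atomic_inc(&ns->count);
        return ns;
}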
4 Proposed Design

This section presents a proposed design for a container framework based on the requirements presented in Section 2 and the existing work surveyed above. A prototype implementation of this design is available at the project website [11], and has been posted to the [email protected] mailing list and other relevant lists.

4.1 Overview

The proposed container approach is an extension of the process-tracking design used for cpusets. The current design favours the multiple-hierarchy approach described in Section 2.6.2.

4.2 New structure types

The container framework adds several new structures to the kernel. Figure 3 gives an overview of the relationship between these new structures and the existing task_struct and dentry.

4.2.1 container

The container structure represents a container object as described in Section 1. It holds parent/child/sibling information, per-container state such as flags, and a set of subsystem state pointers, one for the state of each subsystem configured in the kernel. It holds no resource-specific state. It currently³ holds no reference to a list of tasks in the container; the overhead of maintaining such a list would be paid whenever tasks fork() or exit(), and the relevant information can be reconstructed via a simple walk of the tasklist.

³The possibility of maintaining such a task list just for those subsystems that really need it is being considered.

4.2.2 container_subsys

The container_subsys structure represents a single resource controller or isolation component, e.g. a memory controller or a CPU scheduler.

The most important fields in a container_subsys are the callbacks provided to the container framework; these are called at the appropriate times to allow the subsystem to learn about or influence process events and container events in the hierarchy to which this subsystem is bound, and include:

create is called when a new container is created

destroy is called when a container is destroyed

can_attach is called to determine whether the subsystem wants to allow a process to be moved into a given container

attach is called when a process moves from one container to another

fork is called when a process forks a child

exit is called when a process exits

populate is called to populate the contents of a container directory with subsystem-specific control files

bind is called when a subsystem is moved between hierarchies.

Apart from create and destroy, implementation of these callbacks is optional.

Other fields in container_subsys handle housekeeping state, and allow a subsystem to find out to which hierarchy it is attached.

Subsystem registration is done at compile time—subsystems add an entry in the header file include/linux/container_subsys.h. This is used in conjunction with pre-processor macros to statically allocate an identifier for each subsystem, and to let the container system locate the various container_subsys objects.

The compile-time registration of subsystems means that it is not possible to build a container subsystem purely as a module. Real-world subsystems are expected to require subsystem-specific hooks built into other locations in the kernel anyway; if necessary, space could be left in the relevant arrays for a compile-time configurable number of "extra" subsystems.

4.2.3 container_subsys_state

A container_subsys_state represents the base type from which subsystem state objects are derived, and would typically be embedded as the first field in the subsystem-specific state object. It holds housekeeping information that needs to be shared between the generic container system and the subsystem. In the current design this state consists of:

container – a reference to the container object with which this state is associated. This is primarily useful for subsystems which want to be able to examine the tree of containers (e.g. a hierarchical resource manager may propagate resource information between subsystem state objects up or down the hierarchy of containers).

refcnt – the reference count of external non-process objects on this subsystem state object, as described in Section 2.7. The container framework will refuse to destroy a container whose subsystems have non-zero reference counts, even if there are no processes left in the container.

To access its state for a given task, a subsystem can call task_subsys_state(task, subsys_id). This function simply dereferences the given subsystem pointer in the task's css_group (see next section).

4.2.4 css_group

For the same reasons as described in Section 3.5, maintaining large numbers of pointers (per-hierarchy or per-subsystem) within the task_struct object would result in space wastage and reference-counting overheads, particularly in the case when container systems compiled into the kernel weren't actually used.

Therefore, this design includes the css_group (container subsystem group) object, which holds one container_subsys_state pointer for each registered subsystem. A reference-counted pointer field (called containers) to a css_group is added to task_struct, so the space overhead is one pointer per task, and the time overhead is one reference count operation per fork()/exit(). All tasks with the same set of container memberships across all hierarchies will share the same css_group.

A subsystem can access the per-subsystem state for a task by looking at the slot in the task's css_group indexed by its (statically defined) subsystem id. Thus the additional indirection is the only subsystem-state access overhead introduced by the css_group; there's no overhead due to the generic nature of the container framework. The space/time tradeoff is similar to that associated with nsproxy.

It has been proposed (by Srivatsa Vaddagiri and others) that the css_group should be merged with the nsproxy to form a single per-task object containing both namespace and container information. This would be a relatively straightforward change, but the current design keeps these as separate objects until more experience has been gained with the system.

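A rough sketch of how these structures relate (field names abridged from the descriptions above; this is illustrative, not the actual patch):

struct container_subsys_state {
        struct container *container;  /* owning container */
        atomic_t refcnt;              /* non-process references, Section 2.7 */
};

struct css_group {
        struct kref ref;              /* one count per attached task */
        /* one slot per statically-registered subsystem */
        struct container_subsys_state *subsys[CONTAINER_SUBSYS_COUNT];
};

/* task_struct gains a single extra pointer: */
struct task_struct {
        /* ... existing fields ... */
        struct css_group *containers;
};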
[Figure 3: Three subsystems (A, B, C) have been compiled into the kernel; a single hierarchy has been mounted with A and B bound to it. Container directories foo and bar have been created (each with an associated dentry), associated with subsystem states A2/B2 and A3/B3 respectively. Tasks T1 and T2 are in container foo via css_group CG1; task T3 is in container bar via css_group CG2. Subsystem C is unbound, so all container groups (and hence tasks) share subsystem state C1.]

4.3 Code changes in the core kernel

The bulk of the new code required for the container framework is that implementing the container filesystem and the various tracking operations. These are driven entirely by the user-space API as described in Section 4.5.

Changes in the generic kernel code are minimal and consist of:

• A hook early in the fork() path to take an additional reference count on the task's css_group object.

• A hook late in the fork() path to invoke subsystem-specific fork callbacks, if any are required.

• A hook in the exit() path to invoke any subsystem-specific exit callbacks, if any, and to release the reference count on the task's css_group object (which will also free the css_group if this releases the last reference).

4.4 Locking Model

The current container framework locking model revolves around a global mutex (container_mutex), each task's alloc_lock (accessed via task_lock() and task_unlock()) and RCU critical sections.

The container_mutex is used to synchronize per-container operations driven by the userspace API. These include creating/destroying containers, moving processes between containers, and reading/writing subsystem control files. Performance-sensitive operations should not require this lock.

When modifying the containers pointer in a task, the container framework surrounds this operation with task_lock()/task_unlock(), and follows with a synchronize_rcu() operation before releasing the container_mutex.

Therefore, in order for a subsystem to be sure that it is accessing a valid containers pointer, it suffices for at least one of the following three conditions to be true of the current task:

• it holds container_mutex

• it holds its own alloc_lock

• it is in an RCU critical section.

The final condition, that of being in an RCU critical section, doesn't prevent the current task being concurrently moved to a different container in some hierarchy—it simply tells you that the current task was in the specified set of containers at some point in the very recent past, and that any of the subsystem state pointers in that css_group object won't be candidates for deletion until after the end of the current RCU critical section.

When an object (such as a file or a page) is accounted to the value that has been read as the current subsystem state for a task, it may actually end up being accounted to a container that the task has just been moved from; provided that a reference to the charged subsystem state is stored somewhere in the object, so that at release time the correct subsystem state can be credited, this is typically sufficient. A subsystem that wants to reliably migrate resources between containers when the process that allocated those resources moves (e.g. cpusets, when the memory_migrate mode is enabled) may need additional subsystem-specific locking, or else use one of the two other locking methods listed above, in order to ensure that resources are always accounted to the correct containers.
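For illustration, a sketch of the second option—holding the task's own alloc_lock so that it cannot be migrated mid-operation (my_subsys_id is a hypothetical, statically-assigned subsystem id, and the helper name is invented):

/* Illustrative only: pin the task's container membership
 * while charging it, per the locking rules above. */
static void my_subsys_charge(struct task_struct *p, unsigned long amount)
{
        struct container_subsys_state *css;

        task_lock(p);  /* p->containers cannot change while held */
        css = task_subsys_state(p, my_subsys_id);
        /* container_of(css, ...) would recover the subsystem-specific
         * state; account 'amount' against it here. */
        task_unlock(p);
}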

The subsystem state pointers in a given css_group are immutable once the object has been created; therefore as long as you have a pointer to a valid css_group, it is safe to access the subsystem fields without additional locking (beyond that mandated by subsystem-specific rules).

4.5 Userspace API

The userspace API is very similar to that of the existing cpusets system.

A new hierarchy is created by mounting an instance of the container pseudo-filesystem. Options passed to the mount() system call indicate which of the available subsystems should be bound to the new hierarchy. Since each subsystem can only be bound to a single hierarchy at once, this will fail with EBUSY if the subsystem is already bound to a different hierarchy.

Initially, all processes are in the root container of this new hierarchy (independently of whatever containers they might be members of in other hierarchies).

If a mount request is made for a set of subsystems that exactly match an existing active hierarchy, the same superblock is reused.

At the time when a container hierarchy is unmounted, if the hierarchy had no child containers then the hierarchy is released and all subsystems are available for reuse. If the hierarchy still has child containers, the hierarchy (and superblock) remain active even though not actively attached to a mounted filesystem.⁴

Creating a directory in a container mount creates a new child container; containers may be arbitrarily nested, within any restrictions imposed by the subsystems bound to the hierarchy being manipulated.

Each container directory has a special control file, tasks. Reading from this file returns a list of processes in the container; writing a pid to this file moves the given process into the container (subject to successful can_attach() callbacks on the subsystems bound to the hierarchy).

Other control files in the container directory may be created by subsystems bound to that hierarchy; reads and writes on these files are passed through to the relevant subsystems for processing.

Removing a directory from a container mount destroys the container represented by the directory. If tasks remain within the container, or if any subsystem has a non-zero reference count on that container, the rmdir() operation will fail with EBUSY. (The existence of subsystem control files within a directory does not keep it busy; these are cleared up automatically.)

The file /proc/PID/container lists, for each active hierarchy, the path from the root container to the container of which process PID is a member.⁵

The file /proc/containers gives information about the current set of hierarchies and subsystems in use. This is primarily useful for debugging.

⁴They remain visible via a /proc reporting interface.

⁵An alternative proposal is for this to be a directory holding container path files, one for each subsystem.
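To make the API concrete, a hypothetical session (subsystem names and paths are illustrative, assuming a kernel with the cpuset and cpuacct subsystems available):

# mount a new hierarchy with two subsystems bound to it
mount -t container -o cpuset,cpuacct none /containers

# create a child container and move the current shell into it
mkdir /containers/mygroup
echo $$ > /containers/mygroup/tasks

# inspect the shell's container memberships across hierarchies
cat /proc/$$/container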

4.6 Overhead

The container code is optimized for fast access by subsystems to the state associated with a given task, and a fast fork()/exit() path.

Depending on the synchronization requirements of a particular subsystem, the first of these can be as simple as:

struct task_struct *p = current;
struct state *st;

rcu_read_lock();
st = task_subsys_state(p, my_subsys_id);
/* ... */
rcu_read_unlock();

On most architectures, the RCU calls expand to no-ops, and the use of inline functions and compile-time defined subsystem ids results in the code being equivalent to struct state *st = p->containers->subsys[my_subsys_id], or two constant-offset pointer dereferences. This involves one additional pointer dereference (on a presumably hot cacheline) compared to having the subsystem pointer embedded directly in the task structure, but has a reduced space overhead and reduced refcounting overhead at fork()/exit() time.

Assuming none of the registered subsystems have registered fork()/exit() callbacks, the overhead at fork() (or exit()) is simply a kref_get() (or kref_put()) on current->containers->refcnt.

5 Example Subsystems

The containers patches include various examples of subsystems written over the generic containers framework. These are primarily meant as demonstrations of the way that the framework can be used, rather than as fully-fledged resource controllers in their own right.

5.1 CPU Accounting Subsystem

The cpuacct subsystem is a simple demonstration of a useful container subsystem. It allows the user to easily read the total amount of CPU time (in milliseconds) used by processes in a given container, along with an estimate of the recent CPU load for that container.

The 250-line patch consists of:

• callback hooks added to the account_*_time() functions in kernel/sched.c to inform the subsystem when a particular process is being charged for a tick.

• declaration of a subsystem in include/linux/container_subsys.h

• Kconfig/Makefile additions

• the code in kernel/cpuacct.c to implement the subsystem. This can focus on the actual details of tracking the CPU for a container, since all the common operations are handled by the container framework.

Internally, the cpuacct subsystem uses a per-container spinlock to synchronize access to the usage/load counters.
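As a sketch of what such a minimal subsystem might look like (hypothetical names and signatures, inferred from the callback descriptions in Section 4.2.2 rather than taken from the patch):

struct cpuacct {
        struct container_subsys_state css; /* embedded base, Section 4.2.3 */
        spinlock_t lock;
        u64 time;                          /* accumulated CPU time */
};

static int cpuacct_create(struct container_subsys *ss,
                          struct container *cont)
{
        struct cpuacct *ca = kzalloc(sizeof(*ca), GFP_KERNEL);

        if (!ca)
                return -ENOMEM;
        spin_lock_init(&ca->lock);
        cont->subsys[cpuacct_subsys_id] = &ca->css;
        return 0;
}

/* called from the account_*_time() hooks in kernel/sched.c */
static void cpuacct_charge(struct task_struct *p, u64 cputime)
{
        struct cpuacct *ca;

        rcu_read_lock();
        ca = container_of(task_subsys_state(p, cpuacct_subsys_id),
                          struct cpuacct, css);
        spin_lock(&ca->lock);
        ca->time += cputime;
        spin_unlock(&ca->lock);
        rcu_read_unlock();
}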

5.2 Cpusets

Cpusets is already part of the mainline kernel; as part of the container patches it is adapted to use the generic container framework (the primary change involved removal of about 30% of the code, that was previously required for process-tracking).

Cpusets is an example of a fairly complex subsystem with hooks into substantial other parts of the kernel (particularly memory management and scheduler parameters). Some of its control files represent flags, some represent bitmasks of memory nodes and cpus, and others report usage values. The interface provided by the generic container framework is sufficiently flexible to accommodate the cpusets API.

Internally, cpusets uses an additional global mutex—callback_mutex—to synchronize container-driven operations (moving a task between containers, or updating the memory nodes for a container) with callbacks from the memory allocator or OOM killer.

For backwards compatibility the existing cpuset filesystem type remains; any attempt to mount it gets redirected to a mount of the container filesystem, with a subsystem option of cpuset.

5.3 ResGroups

ResGroups (formerly CKRM [2]) is a hierarchical resource control framework that specifies the resource limits for each child in terms of a fraction of the resources available to its parent. Additionally resources may be borrowed from a parent if a child has reached its resource limits.

The abstraction provided by the generic containers framework is low-level, with free-form control files. As an example of how to provide multiple subsystems sharing a common higher-level resource abstraction, ResGroups is implemented as a container subsystem library by stripping out the group management aspects of the code and adding container subsystem callbacks. A resource controller can use the ResGroups abstraction simply by declaring a container subsystem, with the subsystem's private field pointing to a ResGroups res_controller structure with the relevant resource-related callbacks.

The ResGroups library registers control files for that subsystem, and translates the free-form read/write interface into a structured and typed set of callbacks. This has two advantages:

• it reduces the amount of parsing code required in a subsystem

• it allows multiple subsystems with similar (resource or other) abstractions to easily present the same interface to userspace, simplifying userspace code.

One of the ResGroups resource controllers (numtasks, for tracking and limiting the number of tasks created in a container) is included in the patch.

5.4 BeanCounters

BeanCounters [1] is a single-level resource accounting framework that aims to account and control consumption of kernel resources used by groups of processes. Its resource model allows the user to specify soft and hard limits on resource usage, and tracks high and low watermarks of resource usage, along with resource allocations that failed due to limits being hit.

The port to use the generic containers framework converts between the raw read/write interface and the structured get/store in a similar way to the ResGroups port, although presenting a different abstraction.

Additionally, BeanCounters allows the accounting context (known as a beancounter) to be overridden in particular situations, such as when in an interrupt handler or when performing work in the kernel on behalf of another process. This aspect of BeanCounters is maintained unchanged in the containers port—if a process has an override context set then that is used for accounting, else the context reached via the css_group pointer is used.

5.5 NSProxy / Container integration

A final patch in the series integrates the NSProxy system with the container system by making it possible to track processes that are sharing a given namespace, and (potentially in the future) create custom namespace sets for processes. This patch is a (somewhat speculative) example of a subsystem that provides an isolation system rather than a resource control system.

The namespace creation paths (via fork() or unshare()) are hooked to call container_clone(). This is a function provided by the container framework that creates a new sub-container of a task's container (in the hierarchy to which a specified subsystem is bound) and moves the task into the new container. The ns container subsystem also makes use of the can_attach container callback to prevent arbitrary manipulation of the process/container mappings.

When a task changes its namespaces via either of these two methods, it ends up in a fresh container; all of its children (that share the same set of namespaces) are in the same container. Thus the generic container framework provides a simple way to export to userspace the sets of namespaces in use, their hierarchical relationships, and the processes using each namespace set.

6 Conclusion and Future Work

In this paper we have examined some of the existing work on process partitioning/tracking in Linux and other operating systems, and enumerated the requirements for a generic framework for such tracking; a proposed design was presented, along with examples of its use.

As of early May 2007, several other groups have proposed resource controllers based on the containers framework; it is hoped that the containers patches can be trialled in Andrew Morton's -mm tree, with the aim of reaching the mainline kernel tree in time for 2.6.23.
7 Acknowledgements

Thanks go to Paul Jackson, Eric Biederman, Srivatsa Vaddagiri, Balbir Singh, Serge Hallyn, Sam Villain, Pavel Emelianov, Ethan Solomita, and Andrew Mor- ton for their feedback and suggestions that have helped guide the development of these patches and this paper.

References

[1] Kir Kolyshkin, Resource management: the Beancounters, Proceedings of the Linux Symposium, June 2007

[2] Class-based Kernel Resource Management, http://ckrm.sourceforge.net

[3] J. Blanquer, J. Bruno, E. Gabber, M. Mcshea, B. Ozden, and A. Silberschatz, Resource management for QoS in Eclipse/BSD. In Proceedings of FreeBSD’99 Conf., Berkeley, CA, USA, Oct. 1999

[4] Gaurav Banga, Peter Druschel, and Jeffrey C. Mogul. Resource containers: A new facility for resource management in server systems. In Proceedings of the Third Symposium on Operating Systems Design and Implementation (OSDI ’99), pages 45–58, 1999

[5] SGI. Linux Process Aggregates (PAGG), http://oss.sgi.com/projects/pagg/

[6] Ensim Virtual Private Server, http://www.ensim.com

[7] SWSoft Virtuozzo, http: //www.swsoft.com/en/virtuozzo/

[8] Linux VServer, http://www.linux-vserver.org/

[9] OpenVZ, http://www.openvz.org/

[10] Eric W. Biederman, Multiple Instances of the Global Linux Namespaces, Proceedings of the Linux Symposium, July 2006.

[11] Linux Generic Process Containers, http://code.google.com/p/linuxcontainers

KvmFS: Virtual Machine Partitioning For Clusters and Grids

Andrey Mirtchovski, Los Alamos National Laboratory, [email protected]
Latchesar Ionkov, Los Alamos National Laboratory, [email protected]

Abstract

This paper describes KvmFS, a synthetic file system that can be used to control one or more KVM virtual machines running on a computer. KvmFS is designed to provide its functionality via an interface that can be exported to other machines for remote configuration and control. The goal of KvmFS is to allow a multi-CPU, multi-core computer to be partitioned externally in a fashion similar to today's computational nodes on a cluster. KvmFS is implemented as a file server using the 9P protocol and its main daemon can be mounted locally via the v9fs kernel module. Communication with KvmFS occurs through standard TCP sockets. Virtual machines are controlled via commands written to KvmFS' files. Status information about KVM virtual machines is obtained by reading KvmFS. KvmFS allows us to build clusters in which more than one application can share the same SMP/Multi-core node with minimalistic full system images tailored specifically for the application.

1 Introduction

The tendency in high-performance computing is towards building processors with many computational units, or cores, with the goal of parallelizing computation so that many units are performing work at the same time. Dual and quad-core processors are already on the market, and manufacturers are hinting at 8, 32, or even 80 cores for a single CPU, with a single computational node composed of two, four, or more CPUs. This will result in applications running and contending for resources on large symmetric multiprocessor systems (SMPs) composed of hundreds of computational units.

There is a problem with this configuration, however: since clusters are currently the dominant form of node organization in the HPC world (as seen in the latest breakdown by machine type on the Top 500 list of supercomputers), most applications are designed to either run on a single 2- or 4-core machine, or so that separate parts of the program will run on separate 2- or 4-core nodes, and will communicate via some message-passing framework such as MPI. This has resulted in most of the applications running here, at Los Alamos National Laboratory, scaling to at most 8 CPUs on a single node. Furthermore, many applications assume that they are the only ones running on a single node and will not have to contend for resources. To satisfy the requirements of such applications, large SMP computers will have to be partitioned so that applications are ensured dedicated resources without contention. Fail-over and resilience, two very hot topics in High Performance Computing, also require the means to transfer an application from one machine to another in the case of hardware or software component failures on the original computer.

One solution for partitioning hardware and providing resilience has gained widespread adoption and is considered feasible for the HPC world: virtualization using hypervisors. Borrowing from the mainframe, it allows separate instances of an operating system (or indeed separate operating systems) to be run on the same hardware or parts thereof. The two major CPU manufacturers have added support for virtualization to their newest offerings, which provides even greater performance gains than previously thought.

KVM has recently emerged as a fast and reliable (with the hardware support on modern processors) subsystem for virtualizing the hardware on a computer. Our goal with KvmFS is to enable KVM to be remotely controlled by either system operators or schedulers and to allow it to be used for partitioning on clusters composed of large SMP machines, such as the ones already being proposed here at LANL.

Virtualization benefits the system administrator, as well as programmers and scientists running high-performance code on large clusters. One benefit is the full control over the operating system installation that an application requires. For example, it is not necessary to have all support libraries and software installed on all machines of a cluster; instead, the application is run in an OS instance that already contains all that is required. This greatly simplifies installations in the case where conflicting libraries and support software may be required by different applications. Another possibility is to run a completely different operating system under virtualization, something impossible in current monolithic cluster environments.

With the fast and reliable means of running applications on their own slice of an SMP, it is convenient to be able to extend the control of partitioning and virtualization across the cluster to a control or a head node. This is the niche that KvmFS fills: it provides the fast and secure means to control VMs across a cluster, or indeed a grid environment.

1.1 KVM

KVM [14] is a hypervisor support module in the Linux kernel which utilizes hardware-assisted x86 virtualization on modern Intel processors with Intel Virtualization technology or AMD's Secure Virtual Machine. By adding virtualization capabilities to a standard Linux kernel, KVM provides the benefits of the optimizations that exist in a standard kernel to virtualized programs, greatly increasing performance over "full hypervisors" such as Xen [1] or VMWare [9]. Under the KVM model, every virtual machine is a regular Linux process scheduled by the standard Linux scheduler. Its memory is allocated by the Linux memory allocator.

KVM works in conjunction with QEMU to deliver the processor's virtualization capabilities to the end user.

1.2 QEMU

2 Design

KvmFS was created to allow its users to run and control virtual machines in a heterogeneous networked environment. As such, KvmFS was designed to fulfill the following tasks:

functionality: provide an interface that allows management of VMs on a cluster

scalability: provide the ability for fast creation of multiple identical VMs on different nodes connected via a network

checkpoint and restart: provide the ability to suspend virtual machines and resume their execution, potentially on a different node

The design of KvmFS follows the well-established model of providing functionality in the form of synthetic file systems which clients operate on using standard I/O commands such as read and write. This method has proven successful in various operating systems descendant from UNIX. The /proc [7] file system is a very well established example. The "Plan 9" operating system further extends this concept. It presents the network communication subsystem as mountable files [13], or even the graphics subsystem and the window manager written on top of it, as a file system.

Implementations such as the above suggest that the concept is feasible, and that implementing interfaces to resources in the form of a file system and exporting them to other machines is a very good way to quickly allow access to them from remote machines, especially since files are the single most exported resource in a networked environment such as a cluster.

KvmFS is structured as a two-tiered file server to which clients connect either from the local machine or across the network. The file server allows them to copy image files and boot virtual machines using those image files. The file server also allows controlling running virtual machines (start, stop, freeze), as well as migrating them from one computer to another.

The top-level directory KvmFS serves contains two files providing information about the architecture of the machine as well as for starting a new session for a new VM. Each session already started is presented as a numbered subdirectory. The subdirectory itself presents files which can be used to control the execution of the VM, as well as a subdirectory which allows arbitrary image files to be copied to it and used by the VM. The KvmFS filesystem is presented in detail in Section 3.

3 The KvmFS Filesystem

KvmFS presents a synthetic file system to its clients. The file system can be used for starting and controlling all aspects of the runtime of the virtual machines running on the machine on which kvmfs is running. The file hierarchy is:

    clone
    arch
    vm#/
        ctl
        info
        id
        fs/

3.1 Top-level files

Arch is a read-only file; reading from it returns the architecture of the compute node in the format operating-system/processor-type.

Clone is a read-only file. When it is opened, KvmFS creates a new session and the corresponding session directory in the filesystem. Reading from the file returns the name of the session directory.

Vm# is a directory corresponding to a session created by a KvmFS client. Even though a session may not be running, Vm# will exist as long as that client keeps the clone file open. If the virtual machine corresponding to a session is running, the clone file may be closed without causing the Vm# directory to disappear.

3.2 Session-level files

These files are contained in the session directory, which is created when a client opens the clone file of a KvmFS server.

Ctl is used to execute and control a session's main process. Reading from the file returns the main process pid if the process is running, and –1 otherwise. The operations on the session are performed by writing to it.

Reading from info returns the current memory and device configuration of the virtual machine. The format of the information is identical to the commands written to the ctl file.

Id is used to set and get the user-specified VM identifier.

The fs directory points to the temporary storage created for the virtual machine. The user can copy disk images and saved VM state files there, to be used in the VM configuration.

4 KvmFS Commands

The following section describes the set of commands available for controlling KvmFS instances:

dev name image: Specifies the device image for a specific device. Name is one of hda, hdb, hdc, hdd. If image is not an absolute path, it should point to a file that is copied in the fs directory. An optional boot parameter can be provided to specify that the device should be used to boot from.

net id mac: Creates a network device with ID id and MAC mac.

loadvm file: Loads a saved VM state from file file. If file is not an absolute path, it should point to a file in the fs directory.

storevm file: Stores the state of the VM to file file. If file is not an absolute path, the file is created in the fs directory.

power on|off: Turns VM power on or off.

freeze: Suspends the execution of the VM.

unfreeze: Resumes the execution of the VM.

clone max-vms address-list: Creates copies of the VM on the nodes specified by address-list. Copies the content of the fs directory to the remote VMs and configures the same device configuration. If the virtual machine is already running, stores the current VM state (as in storevm) and loads it in the remote VMs. If max-vms is greater than zero, and the number of the specified sessions is bigger than max-vms, clone pushes its content to up to max-vms of them and issues clone commands to some of them to clone themselves to the remaining VMs from the list.

The format of the address-list is:

    address-list = 1*(vm-address ',')
    vm-address   = node-name ['!' port] '/' vm-id
    node-name    = ANY
    port         = NUMBER
    vm-id        = ANY
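As a concrete instance of this grammar (the same form that appears in the sample session of Section 6.3 below), the following list names three target VMs, each given as a node name, port 7777, and VM id 0; per the grammar, the port component may also be omitted, as in n2/0:

    n2!7777/0,n3!7777/0,n4!7777/0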

5 Implementation

There are two ways of implementing access to programs or system resources as files in Linux: using either FUSE [3] or the 9P [8] protocol. We chose the 9P protocol because it is better suited for communicating with file systems over networks. 9P has also been in use for the past twenty years and is sufficiently hardened to be able to handle various workloads in environments ranging from a single machine to thousands of cluster nodes [5]. Furthermore, our team is well familiarized with 9P through the implementation of V9FS, the kernel module allowing 9P servers to be mounted on a Linux filesystem [10] [4]. It is important to point out, however, that there is no significant barrier to implementing KvmFS using FUSE.

5.1 9P

Representing operating system resources as files is a relatively old concept, exploited to some extent in the original UNIX operating system, but it matured extensively with the development and release of the "Plan 9 from Bell-Labs" operating system [12].

"Plan 9 from Bell-Labs" uses a simple, yet very powerful communication protocol to facilitate communication between different parts of the system. The protocol, named "9P" [8], allows heterogeneous resource sharing by allowing servers to build a hierarchy of files corresponding to real or virtual system resources, which clients then access via common (POSIX-like) file operations by sending and receiving 9P messages. The different types of 9P messages are described in Table 1.

There are several benefits of using the 9P protocol:

Simplicity: The protocol has only a handful of messages which encompass all major file operations, yet it can be implemented (including the co-routine code explained above) in around 2,000 lines of C code.

Robustness: 9P has been in use in the Plan 9 operating system for over 15 years.

Architecture independence: 9P has been ported to and used on all major computer architectures.

Scalability: Our Xcpu [11] suite uses 9P to control and execute programs on thousands of nodes at the same time.

    9P type   Description
    version   identifies the version of the protocol and indicates the
              maximum message size the system is prepared to handle
    auth      exchanges auth messages to establish an authentication
              fid used by the attach message
    error     indicates that a request (T-message) failed and
              specifies the reason for the failure
    flush     aborts all outstanding requests
    attach    initiates a connection to the server
    walk      causes the server to change the current file associated
              with a fid
    open      opens a file
    create    creates a new file
    read      reads from a file
    write     writes to a file
    clunk     frees a fid that is no longer needed
    remove    deletes a file
    stat      retrieves information about a file
    wstat     modifies information about the file

Table 1: Message types in the 9P protocol

A 9P session between a server and its clients consists of requests by the clients to navigate the server's file and directory hierarchy, and responses from the server to those requests. The client initiates a request by issuing a T-message; the server responds with an R-message. A 9P transaction is the combined act of transmitting a request of a particular type by the client and receiving a reply from the server. There may be more than one request outstanding; however, each request requires a response to complete a transaction. There is no limit on the number of transactions in progress for a single session.

Each 9P message contains a sequence of bytes representing the size of the message, the type, the tag (transaction id), control fields depending on the message type, and a UTF-8 encoded payload. Most T-messages contain a 32-bit unsigned integer called a Fid, used by the client to identify the "current file" on the server, i.e., the last file accessed by the client. Each file in the file system served by our library has an associated element called a Qid, used to uniquely identify it in the file system.
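To make the message layout concrete, here is a minimal C sketch of the fixed header and the Qid element as just described. This is our own illustration of the 9P2000 wire encoding (integers are little-endian on the wire), not code taken from KvmFS or the SPFS library.

    #include <stdint.h>

    /* Every 9P message begins with the same three fields; type-specific
     * fields follow, and most T-messages carry a 32-bit fid identifying
     * the client's "current file" on the server. */
    struct p9_header {
        uint32_t size;  /* total message length in bytes */
        uint8_t  type;  /* message type, e.g. a Twalk request or Rwalk reply */
        uint16_t tag;   /* transaction id, echoed back in the matching reply */
    } __attribute__((packed));

    /* A qid uniquely identifies a file within the served file system. */
    struct p9_qid {
        uint8_t  type;     /* file, directory, append-only, etc. */
        uint32_t version;  /* bumped whenever the file is modified */
        uint64_t path;     /* server-unique identifier of the file */
    } __attribute__((packed));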

5.2 KvmFS

KvmFS is implemented in C using the SPFS and Spclient [6] libraries for writing 9P2000-compliant user-space file servers and accessing them over a network. It is a single-threaded code which uses standard networking via the socket() routines. Although our implementation is in C, both 9P2000 and KvmFS are language-agnostic and can be reimplemented in any other programming language that has access to networking.

OS image files used by virtual machines can grow to be quite large (sometimes up to the size of a complete system installation: several gigabytes) and can take a long time to be transferred to a remote node. Starting a single VM on all the nodes of a cluster can potentially take upwards of an hour for large clusters, with literally a hundred percent of the time being spent transferring the disk images of the VM, either from a head node or from a networked file system such as NFS. To alleviate this problem we can employ tree-based spawning of virtual machines via cloning. During tree-spawning, if an end node has received the complete image (or in some cases a partial image), that node can retransmit the image to another node, potentially located only a hop away on the network. To allow tree-spawns, each KvmFS server can also serve as a client to another server, by implementing routines which connect over 9P, create new sessions, and set up and start a new VM with the image from the local session. This reduces logarithmically the amount of fetches that need to occur from the head node and significantly increases the scale at which KvmFS can be deployed. We have tested tree-spawn algorithms for small images on several thousand nodes on LANL's clusters.

The total number of lines for KvmFS, not including the SPFS libraries, is less than two thousand lines of code. SPFS itself is 5,158 lines of code, and Spclient is another 2,381 lines of code.
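The fan-out arithmetic behind the clone command's max-vms parameter, as used by the tree-spawning scheme described above, can be sketched as follows. This is our own illustrative reconstruction of the delegation logic, not KvmFS source code.

    #include <stdio.h>

    /* Push the image to at most max_vms nodes directly; each of those is
     * then asked (via a further 'clone' command) to serve an equal share
     * of the remaining nodes, giving the logarithmic reduction in fetches
     * from the head node described above. */
    static void tree_spawn(const char **nodes, int n, int max_vms)
    {
        if (n <= 0)
            return;
        int direct = (max_vms > 0 && n > max_vms) ? max_vms : n;
        int rest = n - direct;  /* nodes reached through delegation */
        int next = direct;      /* index of the first delegated node */

        for (int i = 0; i < direct; i++) {
            int share = rest / direct + (i < rest % direct ? 1 : 0);
            printf("push image to %s", nodes[i]);
            if (share > 0)
                printf(", then clone from there to %d node(s) starting at %s",
                       share, nodes[next]);
            printf("\n");
            next += share;
        }
    }

    int main(void)
    {
        const char *nodes[] = { "n1", "n2", "n3", "n4", "n5" };
        tree_spawn(nodes, 5, 2);  /* 2 direct pushes, 3 delegated clones */
        return 0;
    }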

6 Sample Sessions

Several examples of using KvmFS follow. The examples show systems mounted remotely using the v9fs [4] kernel module and subsequently accessed via common shell commands. In the examples below, the names n1, n2, etc., are names of nodes on our cluster.

6.1 Create a virtual machine

This example creates a virtual machine using two files copied from the home directory. Disk.img is set to correspond to hard drive hda, and vmstate is used as a previously saved virtual machine state.

    mount -t 9p n1 /mnt/9
    cd /mnt/9
    tail -f clone &
    cd 0
    cp ~/disk.img fs/disk.img
    cp ~/vmstate fs/vmstate
    echo dev hda disk.img > ctl
    echo net 0 00:11:22:33:44:55 > ctl
    echo power on freeze > ctl
    echo loadvm vmstate > ctl
    echo unfreeze > ctl

6.2 Migrate a virtual machine to another node

This example shows the migration of a virtual machine from one node to another.

    mount -t 9p n1 /mnt/9/1
    mount -t 9p n2 /mnt/9/2
    tail -f /mnt/9/2/clone &
    cd /mnt/9/1/0
    echo freeze > ctl
    echo 'clone 0 n2!7777/0' > ctl
    echo power off > ctl

6.3 Create clones of a virtual machine

This example shows the cloning of a virtual machine onto new computers.

    mount -t 9p n1 /mnt/9
    cd /mnt/9/0
    echo 'clone 2 n2!7777/0,\
    n3!7777/0,\
    n4!7777/0' > ctl

7 Conclusions And Future Work

We have described the KvmFS file system, which presents an interface to virtual machines running on Linux in the form of files accessible locally or remotely. KvmFS allows us to extend the control of the partitioning and running of virtual machines on a computer beyond the system on which the virtual machines are running, and onto a networked environment such as a cluster or a computational grid. KvmFS benefits large cluster environments, such as the ones in use here at the Los Alamos National Laboratory, by enabling fine-grained control over the software running on them from a centralized location. Status information regarding the parameters of currently running VMs can also easily be obtained from computers other than the ones they are executing on. Our system also allows checkpointing and migration of VMs to be controlled from a centralized source, thus enabling partitioning schedulers to be built on top of KvmFS.

Future work we have planned for KvmFS is in the area of fine-grained control of the execution parameters of virtual machines running under KvmFS, such as their CPU affinity. Also, we plan to integrate KvmFS with existing schedulers at LANL to provide a seamless way of partitioning our clusters.

Another interesting issue we are exploring is exporting the resources of running virtual machines, such as their /proc filesystem, through the KvmFS interface, so that processes running under the VM can be controlled externally or even over a network.

References

[1] P. Barham, B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, and R. Neugebauer. Xen and the art of virtualization. 2004.

[2] F. Bellard. QEMU, a fast and portable dynamic translator. USENIX 2005 Annual Technical Conference, FREENIX Track, 2005.

[3] FUSE. Filesystems in userspace. http://fuse.sourceforge.net/.

[4] Eric Van Hensbergen and Latchesar Ionkov. The v9fs project. http://v9fs.sourceforge.net.

[5] Eric Van Hensbergen and Ron Minnich. Grave robbers from outer space: Using 9P2000 under Linux. In Freenix Annual Conference, pages 83–94, 2005.

[6] L. Ionkov. Library for writing 9P2000 compliant user-space file servers. http://sourceforge.net/projects/npfs/.

[7] T.J. Killian. Processes as files. USENIX Summer 1984 Conf. Proc., 1984.

[8] AT&T Bell Laboratories. Introduction to the 9P protocol. Plan 9 Programmer's Manual, 3, 2000.

[9] R. Meushaw and D. Simard. NetTop: Commercial technology in high-assurance applications. http://www.vmware.com, 2000.

[10] R. Minnich. V9FS: A private name space system for Unix and its uses for distributed and cluster computing.

[11] R. Minnich and A. Mirtchovski. Xcpu: a new, 9P-based, process management system for clusters and grids. In Cluster 2006, 2006.

[12] R. Pike, D. Presotto, S. Dorward, B. Flandrena, K. Thompson, H. Trickey, and P. Winterbottom. Plan 9 from Bell Labs. Computing Systems, 8(3):221–254, Summer 1995.

[13] D. Presotto and P. Winterbottom. The organization of networks in Plan 9. USENIX Winter 1993 Conf. Proc., pages 43–50, 1993.

[14] Qumranet. KVM: Kernel-based virtualization driver. http://kvm.qumranet.com/kvmwiki/Documents.

Linux-based Ultra Mobile PCs

Rajeev Muralidhar, Hari Seshadri, Krishna Paul, Srividya Karumuri
Mobility Group, Intel Corporation

Abstract

Ultra-Mobile PCs present a new class of challenges for system designers, since they are perceived to be versatile, low-power devices and yet with the full functionality of larger handhelds/laptops. The user experience of being ubiquitously connected, with access to data anytime and anywhere, is a fundamental requirement for this class of devices. However, access to data across wireless interfaces has a direct impact on the battery life of such handheld devices. This paper presents a detailed analysis of some Linux file systems and of data access across wireless. Based on our observations and analysis, we make some key design recommendations for file systems for Linux-based Ultra-Mobile PCs.

1 Introduction

The new generation of Ultra-Mobile PCs offers consumers significantly better capabilities than ever before to access the Internet, be productive, and enjoy their favorite digital content while on the go. Consumers now want and expect access to their personal data and the Internet no matter where they are around the world, and take their laptops/ultra-portable PCs with them. Sharing data and content is an expected feature of any new technology—users want to be connected with the important people in their lives anytime, everywhere they go, and want technology to make their lives easier. Some of the important technology features of such Ultra-Mobile PCs (UMPCs) are:

• Full PC and Internet capabilities: Full PC capability, capable of running mainstream OSes like Linux and Windows, allowing consumers to run familiar applications.

• Location adaptability: Personalized information and services based on location, environment recognition, and adaptability and interaction with other devices at home, office, or in an automobile while driving.

• Anytime connectivity: Connectivity in several ways such as WLAN, WWAN, WPAN, WiMAX, etc., enabling "always reachability" via email, IM, or VoIP.

• Ultra mobility: Small, thin, light form factor with high battery life.

In addition to the more generic UMPCs, which can essentially compute and communicate with rich productivity features and anytime, anywhere data access, there are other categories of ultra-mobile devices that are targeted at specific usages such as ruggedness (for application in environmental sciences, health/medicine, etc.), affordability (such as education devices), etc.

Regardless of the targeted usage or form factor of such devices, one of the fundamental visions driving pervasive computing research is access to personal and shared data anywhere and anytime. In many ways, this vision is close to being realized with multiple wireless connectivity options such as WLAN, WWAN, WPAN, and the upcoming WiMAX connectivity options on such small, mobile devices. Additionally, with several devices becoming the norm at home, sharing data between these devices in an efficient manner is an important aspect of usage, and it is important that the underlying platform and system support for such usages is done in an energy-efficient manner. For example, it is quite likely that a home is equipped with multiple devices such as a desktop, a media box (possibly with network connectivity options), a home entertainment system with connectivity to a desktop/media storage, a laptop, high-end cell phones, and also ultra-mobile PCs, all connected to the Internet through an external broadband connection and internally via high-throughput wireless, like IEEE 802.11n. This is becoming quite common in some of the economies of the world today.

In such a scenario, a UMPC can be used to play locally stored media on the home entertainment device. Alternatively, it could be used to play remote digital media from a set-top box/desktop locally on the UMPC via wireless streaming.

In such scenarios, we believe that substantial barriers remain to pervasive, energy-efficient data access across wireless. Some of the key challenges are:

• Low-power platform design: Although devices such as cell phones have matured to provide long talk-time and standby battery lives, they lack the full-featuredness expected in order to run productivity applications, mainstream OSes, etc. We believe that the platform design of UMPCs poses significant challenges in order to meet the battery life and full PC capability expectations.

• Low-power communication/connectivity options: Most connectivity options such as WLAN, WPAN, WWAN are power hungry. Usage scenarios such as media streaming over wireless connections will pose significant limitations on battery life, since power-hungry network and storage devices tax the limited battery capacity of UMPCs.

• Disconnected operation and access to data via local/network caches: Disconnected operation is a way of life in wireless networks. Some research file systems have explored the use of local and network caches for faster access to data.

• Energy-efficient data sharing and file systems: Power-efficient file systems are critical since mobile data access performance can suffer due to variable storage access times caused by dynamic power management, mobility, and use of heterogeneous storage devices/connectivity options.

This paper focuses on the last aspect above, namely, energy efficiency in file systems, and makes the following contributions:

1. We analyze the platform power consumption of a Linux-based Ultra-Mobile PC during different scenarios and quantify the impact of different platform components towards total platform power.

2. We then analyze the popular Network File System (NFS) for some of the common UMPC usage scenarios and identify the power bottlenecks.

3. Subsequently, we analyze one of the recent energy-efficient file systems, BlueFS, and understand its impact on platform power.

4. Finally, based on our experiments and analysis, we make recommendations that we believe are critical to designing low-power Linux-based ultra-mobile PCs.

This paper is organized as follows: Section 1 is this introduction. Section 2 reviews some of the related work. Section 3 describes the experimental analysis we performed with NFS and BlueFS. Subsequently, in Section 4, we make key observations and design recommendations. We finally conclude the paper with a summary and areas of future work.

2 Related Work

Energy efficiency in file systems is typically not a primary design consideration; this is true for distributed file systems as well. However, several recent projects target energy reduction in local file systems. In [5], the authors provided interfaces to applications to create energy-efficient device access patterns. Moving down the operating system stack, one of the primary components of a file system is its cache—file systems differ substantially in their choice and use of cache hierarchy. For example, NFS [2] uses only the kernel page cache and the file server. The Andrew File System (AFS) [4] adds a single local disk cache; however, it always tries to read data from the disk before going to the server. Coda [1], which is also designed for mobile computing, much like AFS uses a single disk cache on the client; this cache is always accessed in preference to the server.

BlueFS [3] is different from other file systems in that it uses an adaptive cache hierarchy to reduce energy usage and efficiently integrate portable storage. BlueFS further reduces client energy usage by dynamically monitoring device power states and fetching data from the device that will use the least energy. BlueFS borrows several techniques from Coda, including support for disconnected operation and asynchronous reintegration of modifications to the file server. BlueFS can also proactively trigger power-mode transitions of storage devices that can lead to improved performance and energy savings. As reported in [3], mechanisms like this can greatly improve energy efficiency on the platform as a whole.
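The device-selection policy described for BlueFS reduces, at its core, to an energy comparison across the available cache devices. The following C sketch is our own illustration of that idea (the structure and its fields are hypothetical; BlueFS's actual implementation differs):

    /* Pick the device from which a fetch currently costs the least
     * energy; e.g. a spun-up local disk may beat a suspended USB drive
     * that would first have to be powered up. */
    struct cache_device {
        const char *name;
        double fetch_energy_mj;  /* estimated energy for this fetch (mJ),
                                    given the device's current power state */
    };

    static const struct cache_device *
    least_energy_device(const struct cache_device *devs, int n)
    {
        const struct cache_device *best = &devs[0];
        for (int i = 1; i < n; i++)
            if (devs[i].fetch_energy_mj < best->fetch_energy_mj)
                best = &devs[i];
        return best;
    }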

There has been a lot of analysis of the power and performance of handhelds, but we believe this is the first analysis of platform power and power efficiency of full-featured ultra-mobile PCs.

3 Experimental Analysis

The experimental setup used consisted of a Linux-based Ultra-Mobile PC that was instrumented to enable power measurements using a FLUKE 2680A/2686A Data Acquisition System. The following components were measured for power in different scenarios: processor, memory, chipset (GMCH), I/O hub (ICH), hard drive (PATA), buses (PCI-E, USB), and peripherals (audio, WLAN card).

Figure 2: Performance of media playback over NFS and BlueFS over the wireless link (per-component power, in watts, for CPU, memory, PATA, ICH, USB, GMCH, Azalia audio, and WLAN in the idle, audio playback, and 1080p H.264 video playback scenarios)

Figure 1: Experimental setup (the Linux-based Ultra-Mobile PC instrumented for per-component power measurement via the data acquisition system connected to a host computer)

The experimental setup is shown in Figure 1, and the system configuration used for the UMPC is shown in Table 1.

The following experiments were performed:

• Idle state

• Audio playback over NFS

• Audio playback over BlueFS

• Rich media (high-definition movie) playback over NFS

• Rich media (high-definition movie) playback using BlueFS, with the local hard drive as cache

• Rich media (high-definition movie) playback using BlueFS, with a USB drive as cache

4 Observations and File System Design Recommendations

Based on our analysis of NFS and BlueFS, here are a few observations:

1. Idle power on the platform was surprisingly high, around 4.5W (Figure 2), as compared to other OSes, which were around 3.2W. A closer look at the data reveals that the idle power for the WLAN component is the same as the power it consumed when exercising the workload, leading us to believe that we had an inefficient driver.

2. Another interesting aspect of the data was that the CPU power drawn when the system was idle was about 0.5W; a closer look at the processor's C3 residency revealed that the CPU had a C3 residency of only about 75%. Ideally this should be around 90–95% to get maximum platform idle power savings. We believe this is a critical piece to resolve in order to get better power for Linux-based platforms.

3. In addition to the impact of CPU residency, it is very important that all the devices in the platform support a low-power idle state. As seen in our observations above, an inefficient device driver can spoil overall platform idle power, and we need to make sure we have well-optimized devices and drivers for a low idle power. In the observation above, if we consider the ideal idle and transmit power for the WLAN device, the idle power of the platform improves. This is shown in Figure 3, which uses the projected ideal WLAN power and shows the performance of media playback over NFS and BlueFS.

4. NFS power bottlenecks: The WLAN power consumption is among the top three in the list, next only to CPU and GMCH. This is as expected, since all the NFS transactions are network-centric. Assuming the network stack and the device are well optimized for power, better platform power savings can be attained through a file system specifically designed for low power.

    Component   Configuration
    CPU         Intel® Core Duo, 800 MHz
    Memory      512 MB DDR2
    ICH         ICH7
    GMCH        Intel® 945GM Express Chipset Family
    Audio       Intel® High Definition Audio
    WLAN        Intel® 3945ABG Network Connection

Table 1: Configuration of the Linux-based Ultra-Mobile PC used for experimental analysis

Figure 3: Performance of media playback over NFS and BlueFS with projected WLAN power (per-component power in watts, substituting the projected ideal WLAN power)

4.1 BlueFS impact on platform power

1. BlueFS did show more efficiency in terms of platform power for the given workload, as compared to NFS. The latency came down drastically with BlueFS, and this clearly brought down the net power consumed by the platform to complete a given workload.

2. Caching mechanisms also helped in improving the user experience and the overall usage scenario. However, the choice of the caching device does have an impact on system power. For example, from Figure 4, BlueFS running over a local disk as cache consumed less power compared to a USB drive as cache, suggesting that using an optimized local solid-state drive, like the Robson flash drive [6], as cache for BlueFS would have a significant impact on platform power.

3. We did not see any significant impact on write performance as compared to NFS, though one possible optimization could be to re-evaluate the write-to-many strategy in terms of power efficiency.

4. We also observed that while the Wolverine user daemon dynamically selects the device to read from or write to at run time, it does nothing with the devices that are not being used. One possible power-saving option would be to modify Wolverine to power down the devices that are not selected for the read or write operation.

5. Caching being a critical piece of BlueFS efficiency, proper focus needs to be placed on cache hierarchy selection. Other power-efficient caching technologies, like co-operative caching, need to be explored for more optimized power savings.

Figure 4: Performance of media playback over BlueFS using a local disk cache versus a USB cache (per-component power in watts for audio and 1080p H.264 video playback)

5 Summary and Future Work

Ultra-Mobile PCs present a new class of challenges for system designers, since they are perceived to be high-performance, low-power devices and yet with the full functionality of larger handhelds/laptops. The user experience of being ubiquitously connected, with access to data anytime and anywhere, is a fundamental requirement for this class of devices. However, access to data across wireless interfaces has a direct impact on the battery life of such handheld devices.

This paper presented a detailed analysis of the Network File System (NFS) and the Blue File System (BlueFS) for specific usage scenarios targeting network data access across wireless interfaces. Based on the power and performance analysis of these file systems, we made some key design recommendations to designers of file systems for Linux-based Ultra-Mobile PCs. We plan to extend this work further in the following areas:

1. Evaluate BlueFS for multiple storage devices—USB, and newer storage mechanisms like Robson, recently introduced on Intel's newest platforms. We believe that although BlueFS has an excellent mechanism for caching data, the choice of the device will be important in total platform power consumption.

2. Evaluate other wireless interfaces and usage models, specifically mobile digital TV and IP TV, since such scenarios would be critical for ultra-mobile PCs.

6 Acknowledgements

We would like to thank Bernie Keany for his valuable discussions and inputs, and Adarsh Jain, Arvind Singh, and Vanitha Raju for all their help with power measurements and analysis on the Ultra-Mobile PC platforms. We would also like to thank Ananth Narayan for the tools he provided for our analysis.

References

[1] J.J. Kistler and M. Satyanarayanan. Disconnected operation in the Coda file system. ACM Transactions on Computer Systems, 10(1), February 1992.

[2] Network Working Group. NFS: Network File System protocol specification. RFC 1094, March 1989.

[3] Edmund B. Nightingale and Jason Flinn. Energy-efficiency and storage flexibility in the Blue File System. In Proceedings of the 6th USENIX Symposium on Operating Systems Design and Implementation (OSDI), San Francisco, CA, December 2004.

[4] J.H. Howard, M.L. Kazar, S.G. Menees, D.A. Nichols, M. Satyanarayanan, R.N. Sidebotham, and M.J. West. Scale and performance in a distributed file system. ACM Transactions on Computer Systems, 6(1), February 1988.

[5] T. Heath, E. Pinheiro, and R. Bianchini. Application-supported device management for energy and performance. In Proceedings of the 2002 Workshop on Power-Aware Computer Systems, pages 114–123, February 2002.

[6] Michael Trainor. Overcoming disk drive access bottlenecks with Intel Robson technology. Technology at Intel Magazine, Volume 4, Issue 9, December 2006.

Where is your application stuck?

Shailabh Nagar, IBM India Research Lab
Balbir Singh, Vivek Kashyap, Chandra Seetharaman, Narasimha Sharoff, and Pradipta Banerjee, IBM Linux Technology Center

Abstract

Xen migration and container mobility promise better system utilization and system performance. But how does one know if the effort will improve the workload's progress? Resource management solutions promise optimal performance tuning. But how does one determine the resources to be reallocated and the impact of the allotment? Most customers develop their own benchmark that is used for purchasing a solution, but how does one know that the bottleneck is not in the customer benchmark?

Per-task delay accounting is a new functionality introduced for the Linux kernel which provides a direct measurement of the resource constraints delaying forward progress of applications. Time spent by Linux tasks waiting for CPU time, for completion of submitted I/O, and for resolution of page faults caused by allocated real memory delays the forward progress of the application being run within the task. Currently these wait times are either unavailable at a per-task granularity or can only be obtained indirectly, through measurement of CPU usage and number of page faults. Indirect measurements are less useful because they make it harder to decide whether low usage of a resource is due to lack of demand from the application or due to resource contention with other tasks/applications.

Direct measurement of per-task delays has several advantages. It provides feedback to resource management applications that control a task's allocation of system resources by altering its CPU priority, I/O priority, and real memory limits, enabling them to fine-tune these parameters more quickly to adapt to resource management policies and application demand. The measurements are also useful for accurate metering/billing of resource usage, which is particularly valuable for shared systems such as departmental servers or hosting platforms. For desktop users, these statistics provide a quick way of determining the resource bottleneck, if any, for applications that are not running as fast as expected.

In this paper, we describe the design, implementation, and usage of the per-task delay accounting functionality currently available in the Linux kernel tree. We demonstrate the utility of the feature by studying the delay profiles of some commonly used applications and how their resource usage can be tuned using the feedback provided. We also provide a brief description of the alternative mechanisms proposed to address similar needs.

1 Introduction

Linux is supported on a wide range of systems, from embedded to desktop to mainframe, running a diverse set of applications ranging from business and scientific to office productivity and entertainment. Linux provides multiple metrics and "knobs" for achieving optimal performance in these various deployments. There are a myriad of well-known benchmarks and tools available to help gather, analyze, and then tune the systems for desired performance.

However, even after elaborate tuning, the applications or system may still perform unsatisfactorily.

The resource bottlenecks blocking the progress must be addressed through additional provisioning, resource reallocation, or even load-balancing by migration of the application/service to another system.

There is on-going work, such as resource groups and 'containers', to re-allocate resources to desired workloads for better system utilization and performance. Xen and Linux containers support the notion of migrating live workloads to consolidate systems when load is low, and of migrating to other systems for better load-balancing. However, all these mechanisms work 'blind' without a good method to pinpoint the resources that need tuning.

The delay-accounting framework is built on the observation that, since ultimately the application's progress is what matters most, the bottlenecks impeding that progress must be easily identifiable. The framework therefore gathers the time spent on behalf of the task (or task group) in queues contending for system resources. This information may then be utilised by workload management applications to dynamically increase the desired application's access to the bottlenecked resource by raising its priority or share of the resource pool, or even by initiating service or application migration to a different system.

This paper discusses the design, implementation, and utility of the per-task delay accounting framework developed to address this issue. The following sections outline the use cases.

2 Motivation

The traditional focus of operating system accounting is the time spent by a Linux task doing a particular activity, e.g., time spent by a task executing on a cpu in system mode, user mode, in interrupts, etc. There are also measures of the system activity that happens, e.g., number of system calls, I/O blocks transferred, network packets sent/received. While these metrics are often useful for measuring overall system utilization or for high-level detection of performance anomalies, they are not directly usable for determining what is delaying forward progress of a specific application, except perhaps by skilled and experienced system administrators.

Hence there is a need for answering a simple question: what resource, managed by the operating system, is my application waiting for? To start with, the resources of interest can be high-level ones such as cpu time, block I/O bandwidth, and physical memory. These, respectively, translate to knowing how long a Linux task (or group of tasks) spends delayed in:

• waiting to run on a CPU after becoming runnable

• waiting for block I/O operations initiated by the task to complete

• waiting for page faults to complete

The information is useful in at least two distinct scenarios:

Simple hand tuning: If a favorite task seems to be spending most of its time waiting for I/O completion, raising its I/O priority (compared to other tasks) may help. Similarly, if a task is mainly waiting on page faults, increasing its RSS limit (or running other memory hogs at a much lower cpu priority) may help alleviate the resource bottleneck.

Workload management: One of the objectives of modern workload management tools is to manage the forward progress of aggregates of tasks which are involved in a given business function, the idea being that if business function A is more important than another one, B, the applications (tasks and processes, from an OS viewpoint) involved in the former should get preferential access to the OS resources compared to B.

To achieve this, workload managers need to periodically know the bottlenecked resource for all tasks running on the system and use that information, in conjunction with policies that determine prioritization, to adjust resource priorities like nice values, I/O priorities, and RSS limits. Workload managers may also decide to shift a given application off a system if priority boosting isn't helping.

Per-task delay accounting was designed and implemented to meet the above needs. We next describe the design considerations in detail.

3 Design and implementation

The design and implementation of per-task delay accounting had to take the following three major factors into consideration.

3.1 Measurement of delays

The design objective was to measure the most significant sources of delay for a process to the highest accuracy possible, preferably at nanosecond granularity. The approach taken was to take high-resolution timestamps at the appropriate kernel functions and accumulate the timestamp differences into per-process data structures. The delay sources chosen, and the manner in which they are collected, are:

1. Waiting to run on a cpu after becoming runnable: Here we needed to take timestamps when a process was added to a cpu runqueue and again when it was selected to be run on a cpu. The schedstats codebase, added to help gather cpu scheduler statistics for development and debugging purposes, came in handy since it already had code to gather these timestamps and take their differences. The schedstats functions sched_info_arrive and sched_info_depart, which gathered other statistics that were not of interest to delay accounting, had to be refactored to minimize the performance impact. Another decision made to keep the performance impact low was to stick to the jiffy-level timestamp resolution used by schedstats instead of the higher-resolution ones available in the kernel.

2. Waiting for submitted block I/O to complete: Here we had a choice of trying to accurately measure the entire delay in submitting block I/O as well as the delay incurred in the block device performing the I/O. However, it was finally decided that we would measure only the time spent by a process sleeping for submitted I/O to complete. Measuring the delays in the I/O submission path as well turned out to be quite complex, given the diversity of functions that are involved in block I/O submission, as well as the difficulty of correctly attributing the delays to the right process in all these paths. These factors would have necessitated a number of timestamps being collected all over the kernel code, which affects maintainability. Moreover, much of the delay seen in I/O submission cannot be changed by the user (other than indirectly, by affecting the cpu scheduling of the submitting process), whereas delays incurred after I/O submission can be affected by tweaking the I/O scheduling priority of the process. Hence it was decided to only measure the time spent in sched.c/io_schedule() using high-resolution timestamps.

3. Waiting for a page fault to complete: We had briefly considered measuring the delays seen in both major and minor page faults, but later concentrated only on the former to minimize the impact of delay-collection code on the virtual memory management code paths. Delays for major page faults are a subset of the block I/O delays which were already being measured, so the only change needed was to record the fact that a process was in the middle of a major page fault and to use this information at block I/O delay recording time.

Delay accounting also measures and returns an interval-based measurement of time spent running on a cpu. This is more accurate than the sampling-based cpu time measures normally available from the kernel (via a task's utime and stime fields in the task struct). Accurate cpu usage times are valuable in accurate workload management.

One factor that has to be considered for cpu time is the possibility of the kernel running as a guest OS in a virtualized environment. These "guest OSes" are scheduled by the hypervisor in accordance with its scheduling policies and priorities for the OS instance. As a result, the times spent waiting for a particular resource need to be measured in wall-clock time; in contrast, the utilization of the resource occurs only when the guest is executing. This differentiation enables an educated assessment of the resource allocation (such as additional CPU share) needed by the "guest," or of the resources within the "guest OS."

The delay accounting framework returns both the real and virtual cpu utilization on virtual systems (currently implemented for LPARs on ppc64), as well as the delays experienced by the individual task groups in the OS instances.
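The decision in item 2 above corresponds to a simple bracketing pattern in the kernel. The following is our simplified sketch of how the block I/O delay hooks wrap io_schedule(), modeled on the kernel's delay-accounting code rather than quoted from it:

    /* sched.c (simplified sketch): bracket the sleep for submitted block
     * I/O with delay-accounting hooks that timestamp entry and exit and
     * accumulate the difference into the current task's delay totals. */
    void io_schedule(void)
    {
        delayacct_blkio_start();  /* record high-resolution start time */
        schedule();               /* sleep until the submitted I/O completes */
        delayacct_blkio_end();    /* add the elapsed time to the task's
                                     block I/O delay total */
    }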

3.2 Storing the delay accounting data

Each of the delay data items recorded above needed to be stored taking into account two important constraints:

1. The data structures should be expandable, to allow other per-task delays deemed useful in future to be added. While the current per-task delay accounting only looks at wait-for-cpu, block I/O, and major page fault delays, it is quite possible that other delays associated with kernel resource allocation will be of interest and measured in future.

2. The data should be collected per-task and also aggregated per-process (i.e., per-tgid) within the kernel. This latter design constraint drew a lot of comments when the delay accounting code was proposed, since it was felt that per-process delays could just as well be accumulated outside the kernel in userspace. However, we had observed that accurate measures of per-process delays were very useful for performance analysis of bottlenecks, and also that accumulating per-task delays in userspace required far too much overhead and reduced accuracy due to the presence of short-lived tasks in a process. Hence it was necessary to have a per-process delay-aggregating data structure that would keep the delay data collected for an exiting task and make it available until all tasks of the process had exited.

The taskstats data structure, defined in include/linux/taskstats.h, took these constraints into account. It is versioned, mandates that new fields be added at the bottom, and also takes into account alignment requirements for 32- and 64-bit architectures.

To meet the second constraint, two copies of the data structure are maintained. One is per-task, maintained within the task_struct. The other is per-tgid, maintained within the signal_struct associated with a tgid. The fields of struct taskstats are large enough to serve as an accumulator for per-tgid delays.
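For reference, here is an abridged view of the kind of fields struct taskstats carries, as seen by userspace through linux/taskstats.h. This is a sketch with most fields omitted, not the full definition:

    struct taskstats {
        __u16 version;               /* structure version, for extensibility */
        /* ... exit/accounting fields omitted ... */
        __u64 cpu_count;             /* number of cpu delay intervals */
        __u64 cpu_delay_total;       /* ns spent waiting on a runqueue */
        __u64 blkio_count;
        __u64 blkio_delay_total;     /* ns spent waiting for sync block I/O */
        __u64 swapin_count;
        __u64 swapin_delay_total;    /* ns spent waiting for swap-in faults */
        __u64 cpu_run_real_total;    /* wall-clock cpu running time, ns */
        __u64 cpu_run_virtual_total; /* virtual cpu running time, ns */
        /* ... per-task IO accounting fields (Section 3.5) follow ... */
    };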

3.3 Interface to userspace to supply the data

The interface used to access delay accounting data from userspace formed a large part of the discussion preceding the acceptance of delay accounting into the kernel. The various design constraints for the interface were:

1. Efficient transfer of large volumes of delay data: Workload managers that are monitoring the entire system for performance bottlenecks need to gather delay statistics from each task periodically. In order to allow this period to be small, it is essential that the transfer of delay accounting data be efficient while handling large volumes (due to a potentially large number of tasks). A similar requirement, albeit less severe, exists even for monitoring a single application if its degree of multithreading is sufficiently high.

2. Handle a rapid exit rate without data loss: In order to do workload management at the level of user-defined groups of processes, workload managers need to get the cumulative delays seen by a task (and a group) right up to the time it exits. Hence delay accounting data needs to be available even after a task (or thread group) exits. Obviously the kernel cannot keep such data around for a long time, so the choice was made to use a "push" model (kernel->user) to send such data out to listening userspace applications, rather than requiring them to "pull" the data. In such a scenario, it is important to be able to handle a rapid rate of task/thread-group exits, if not on a sustained basis, at least for short bursts, without losing data.

3. Exporting data as text or not: This is another instance of the classic debate over whether such data should be exported as text through /proc or /sys-like interfaces, allowing it to be read directly in user space using shell utilities, or whether it should be a structured binary stream requiring special user-space utilities to parse. Given the volume of data needing to be transferred, the latter option was chosen.

4. Bidirectional reads/writes: There was a need for userspace to send commands and configuration information to the kernel delay accounting code, and hence interfaces like relayfs, which lacked a user->kernel write capability, were not usable.

Given these constraints, we decided to use the newly introduced genetlink interface. Genetlink is a generalized version of the netlink interface. Netlink, which exports a sockets API, has been used to handle large volumes of kernel<->user data, primarily for networking-related transfers. But it suffered the limitation of having a limited number of sockets available for use by different kernel subsystems, as well as an API that was too network-centric for some. Genetlink was created to address these issues. It multiplexes multiple users over a single netlink socket and simplifies the API they need to use to effect bidirectional data transfer. Delay accounting was one of the early adopters of the genetlink mechanism, and its usage provided inputs for refining and validating the genetlink interface as well.

Delay accounting handles the rapid-exit-rate constraint by splitting the delay data sent on exit into per-cpu streams. A listening userspace entity has to explicitly register interest in getting data for a given cpu in the system. Once it does so, it receives the exit delay data for any task which exits while last running on that cpu. This design allows systems with many cpus (which will typically have a correspondingly larger number of exiting tasks) to balance the exit-data bandwidth amongst multiple userspace listeners, each listening to a subset of cpus. CPUs are used as a convenient means of dividing up the exit-data requirements. They also help when the cpusets mechanism is used to physically partition machines with a very large number of cpus and some of the physical partitions have strict performance requirements that prohibit exit data from being processed. Being able to regulate, by cpu, the sending of exit delay data by the kernel allows fine-grained control over the performance impact of delay accounting.

3.4 Delay accounting

With the above elements in place, it is useful to outline how delay accounting works in practice. The description is kept generic, without referring to specifics of the interface, since those can be obtained elsewhere.

On system startup, the system administrator can optionally start userspace "listeners" which register to listen to exit delay data on one or more cpus. The typical usage is to start one listener listening to all the cpus of the system.

When a task is created (via fork), a taskstats data structure gets allocated. If the task is the first one of a thread group, a per-tgid taskstats struct is allocated as well. As the task makes system calls, the delays it encounters get measured and aggregated into both these data structures. At any point during the task's lifetime, a user can query the delay statistics for a task, or its thread group, via the genetlink interface by sending an appropriate command. The reply contains the taskstats data structure for the task or thread group.

When the task exits, its delay data is sent to any listener which has registered interest in the cpu on which the exit happens. If the task is the last one in its thread group, the accumulated delay data for the thread group is additionally sent to registered listeners.

Finally, virtual machine technology enables a single system to run multiple OS instances. These "guest OSes" are scheduled by the hypervisor in accordance with its scheduling policies and priorities for the OS instance. As a result, the times spent waiting for a particular resource need to be measured in wall-clock time; in contrast, the utilization of the resource occurs only when the guest is executing. This differentiation enables an educated assessment of the resource allocation (such as additional CPU share) needed by the "guest," or of the resources within the "guest OS." The delay accounting framework returns both the real and virtual cpu utilization on virtual systems (currently implemented for LPARs on ppc64), as well as the delays experienced by the individual task groups in the OS instances.

3.5 Per Task IO Accounting

An important piece of related work is the per-task I/O accounting statistics by Andrew Morton, which improve the accuracy of measurement of I/O resource consumption by a task. The CSA infrastructure also supports per-task IO statistics, but the data returned by it can be incorrect. CSA accounts for per-task IO statistics using data read or written through the sendfile(2), read(2), readv(2), write(2), and writev(2) system calls. It does not account for:

1. Data read through disk readahead.

2. Page-fault-initiated reads.

3. Sharing of data through the page cache. If a page is already dirty and another task, T1, writes to it, both the task that first dirtied the page and T1 will be charged for IO.

The following example illustrates the deficiencies of the current per-task IO accounting infrastructure. We wrote a sample test application that maps a 1.2GB file and writes to 1GB of the mmap'ed memory. The output below shows the contents of /proc/<pid>/io:

    rchar: 1261
    wchar: 237
    syscr: 6
    syscw: 13
    read_bytes: 1048592384
    write_bytes: 1317117952
    cancelled_write_bytes: 0

The gathered statistics indicate that the existing accounting of rchar and wchar present in CSA failed to capture the IO that had taken place during the test.

The ideal place to account for IO is the IO submission routine, submit_bio(). This works well for accounting data being read in by the task. However, for write operations we need to account somewhere else, for the following reasons. Writes are usually delayed, which means that it is hard to track the task that initiated the write; furthermore, the data might become available long after the task has exited. Write accounting is therefore done at page-dirtying time. The routines __set_page_dirty_nobuffers() and __set_page_dirty_buffers() account the data as written and charge the current task for IO.

A task can also effectively cause negative IO, by truncating pagecache data brought in by another task. Instead of accounting for possible negative write IO, the data is stored in a field called cancelled_write_bytes.

Figures 1, 2, and 3 show the call flow graphs for read, write, and truncate accounting, respectively.

Figure 1: Read IO Accounting (call flow from the application down to submit_bio(), where read accounting is done)

Communicating the per-task IO data to user space is fairly straightforward. It uses the per-task taskstats interface for this purpose. The taskstats structure has been expanded with new fields corresponding to the per-task IO accounting information. The taskstats interface automatically takes care of sending this data to user space when either a task exits or information is requested from user space.
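The test program described at the start of this subsection can be approximated in a few lines of C. This is our own minimal reconstruction (the file name and fill pattern are arbitrary), not the authors' exact test application:

    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    #define FILE_SIZE  (1200UL << 20)  /* ~1.2GB file to map */
    #define DIRTY_SIZE (1024UL << 20)  /* write to 1GB of the mapping */

    int main(void)
    {
        int fd = open("bigfile", O_RDWR);
        if (fd < 0) {
            perror("open");
            return 1;
        }
        char *p = mmap(NULL, FILE_SIZE, PROT_READ | PROT_WRITE,
                       MAP_SHARED, fd, 0);
        if (p == MAP_FAILED) {
            perror("mmap");
            return 1;
        }
        /* Dirty the pages through the mapping: no read(2)/write(2) calls
         * are made, so rchar/wchar stay tiny while real write IO occurs. */
        memset(p, 0xa5, DIRTY_SIZE);
        munmap(p, FILE_SIZE);
        close(fd);
        return 0;
    }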

Figure 2: Write IO Accounting (call flow: the application's write dirties a page; set_page_dirty dispatches to __set_page_dirty_nobuffers() or __set_page_dirty_buffers(), where write accounting is done)

Figure 3: Truncate IO Accounting (call flow: the application truncates pagecache data; cancel_dirty_page accounts it in cancelled_write_bytes)

4 Case studies of performance bottleneck analysis

The following two case studies, taken from a performance analyst who used delay accounting to debug performance issues, illustrate the utility of the mechanism.

1. Analysing a performance issue with an MPI-based parallel application: This is a brief account of how per-task delay accounting was used to find the root cause of a performance problem of an application running on a homogeneous cluster. During the user acceptance test for the cluster, this application successfully completed its run within the expected time. However, some days later the same application was taking too much time to complete. Nothing had changed in the cluster with respect to the configuration. However, one strange thing was noted: when the cluster was completely isolated from the public network, the application completed its run within the expected time. The problem happened only when the cluster was open for use by everyone concerned. This was an important lead, but we didn't have any clue how to proceed. The cluster was pretty large, and analysing each and every node manually was a tedious task.

Some of the possible things to look for were network communication delays, problems with the job scheduler's resource reservation functionality, and the system configuration of all the nodes (just to be sure that nothing had been changed in between). Even after doing all these activities we were not able to identify the root cause of the problem. We didn't have a clue whether the issue was with any particular node(s) or with the entire cluster.

Per-task delay accounting came to our rescue here. We asked the customer not to isolate the cluster and to let it be used as on any normal day. The application was run on all the nodes in the cluster, and subsequently the getdelays program was run so as to get the delay accounting statistics for this particular application. After completion of the run, the delay accounting statistics from all the nodes were compared, and it was found that there was huge IO and CPU delay on one of the nodes in the cluster. This was affecting the overall performance (execution time) of the application. We then focussed our attention on this particular node. Eventually, after monitoring this node for some time, we found that one of the users was directly running interactive jobs on this particular node. This in turn pointed to a security hole in the customer's overall cluster setup which allowed this user to bypass the access control restrictions put in place and run jobs directly on a particular node.

Per-task delay accounting made the job of isolating the problem a lot easier. An added advantage is that convincing the customer becomes a lot easier and more effective.

Current per-task delay accounting can be taken a step further by also including per-task network IO delay accounting statistics, and by writing a tool based on per-task delay accounting for identifying performance problems in a cluster.

2. There was another instance where per-task delay accounting helped us in dealing with a customer. The customer had a single-threaded program on a non-Linux platform. The application was later made multi-threaded on the same platform. The program typically read X number of records in n seconds.

The customer ported the multi-threaded version of the program to Linux (using the C programming language). The ported program was reading a significantly smaller number of records in n seconds when compared to the original program on the non-Linux platform (the hardware configuration was the same; only the OS was different). Tools like top, vmstat, and iostat were not very conclusive. Since source code access for the program was not available, we asked the customer to provide us with both the single-threaded and multi-threaded versions of the program, along with the timings, and used per-task delay accounting to get the delay stats for both versions. We found that for the multi-threaded version, IO delay was on the higher side when compared with the single-threaded version. We pointed this out to the customer (as source code access was not available) and convinced them that they needed to take another look at the application code.
Per task delay accounting helped us to get specific data pertaining to the multi-threaded program in question.

5 Summary

Detecting the bottlenecks in resource allocation that af- fect an application’s performance on a Linux system is an increasingly important goal for workload manage- ment on servers and on desktops. In this paper, we de- scribe per-task delay accounting, a new functionality we contributed to the Linux kernel, that helps identify the resource allocation bottleneck impeding an applications forward progress. We also describe other important re- lated work that improves the accuracy of CPU and I/O bandwidth consumption. Finally we demonstrate how these new mechanisms help identify where applications can get "stuck" within the kernel. Trusted Secure Embedded Linux From Hardware Root Of Trust To Mandatory Access Control

Hadi Nahari MontaVista Software, Inc. [email protected]

Abstract traditionally either not capable of, or not expected to have them (smart phones, hand-held computing devices, With the ever-increasing presence of Linux implemen- complex medical control systems, navigation gear, etc.). tations in embedded devices (mobile handsets, set- The difficulty in satisfying the security requirements of a top boxes, headless computing devices, medical equip- software package is proportional to its design complex- ments, etc.), there is a strong demand for defining ity. The default access control method in Linux, that is, the security requirements and augmenting, enhancing, DAC (Discretionary Access Control) is at best consid- and hardening the operating environment. Currently ered insufficient for any security-sensitive implementa- an estimated 70% of new semiconductor devices are tion, hence the increase in demand for MAC in recent Linux-enabled; such a high growth is accompanied years. by inevitable security risks, hence the requirement for hardware-based trusted and secure computing environ- HAS (Hardware Assisted Security) has not yet been suf- ment, enhanced with MAC (Mandatory Access Control) ficiently standardized and from the industry adoption mechanisms for such devices in order to provide appro- perspective, it is still considered to be in its infancy. Yet priate levels of protection. Due to stringent security re- (or however), HAS is one more item in the toolbox of se- quirements for resource-constrained embedded devices, curity architects who need to provide solutions for fun- establishing trust-chain on hardware root of trust, and damental problems such as establishing TCB (Trusted deploying MAC mechanisms to balance performance Computing Base) using hardware-based root of trust, and control are particularly challenging tasks. managing key material, and security governance of the system. This paper presents the status of MontaVista Software’s efforts to implement such solutions based on ARM cores In the past years, there has been an increasing growth in that provide separated computing environment, as well adoption of Linux in various environments. This impor- as SELinux (Security Enhanced Linux) to provide MAC tant phenomenon, which is unparalleled by other operat- for embedded devices. The focus will be on practical as- ing systems, has resulted in Linux becoming more and pects of hardware integration as well as porting SELinux more the de-facto operating system for embedded de- to resource-constrained devices. vices. The recent acceleration of this adoption by said- devices is partially due to the highly modular architec- 1 Introduction ture of Linux kernel, and partially due to its maturity, which results in lower COO (Cost Of Ownership.) The defining line between embedded and non- One must note, however, that this growth is accompa- embedded systems is becoming more and more blurred nied by complex and elaborate software solutions build [2]; whether the system has an elaborate UI (User Inter- atop embedded Linux devices to address the ever in- face), or if it is operating under a resource-constrained creasing demands of the market, and hence the complex environment are not sufficiently delineating identifiers security requirements of such devices. This makes im- anymore. 
plementing a holistic security strategy for Linux more The availability of more computing power at a low cost challenging, especially when one considers the highly has also resulted in developers’ and manufacturers’ in- varied selection of embedded devices adopting, or plan- terest in adding more functionality to devices that were ning to adopt Linux; from smart phones, hand-held de-

• 79 • 80 • Trusted Secure Embedded Linux vices, set top boxes, high-end televisions, medical de- 2. Performance Trade-off vices, automotive control, navigation systems, assem- bly line control devices, missile guidance systems, to Memory footprint is important because a big majority the other end of the spectrum such as CGL (Carrier of the embedded devices tend to have limited memory Grade Linux) in the telecoms industry. Such environ- available at run-time. Any security solution for such de- ments each possess a very unique set of security char- vices must therefore have an acceptable and low mem- acteristics and requirements, and therefore provide dif- ory footprint. ferent challenges. Figure 1 shows a typical Linux-based mobile phone architecture. Performance trade-off is important because the comput- ing power available in an embedded environment is typ- In this article I will start by describing the most common ically low, due to hardware architecture characteristics, security requirements in the embedded industry, and fol- as well as issues pertaining power management in bat- low it by explaining the design principles and reviewing tery operated devices. Low power consumption is one the available technologies that were initially considered of the fundamental reasons based on which ARM is the viable, both for HAS and MAC, and will propose an ar- de-facto architecture for such devices. chitecture based on the selected technologies. Where applicable, I will point out whether the said-method or 3 Design Principles technology satisfies a subset of the embedded systems (mobile handsets and CGL). 1. Simplicity

The main focus of this article is to provide: 2. Modularity

1. A high-level, architectural overview of a design We have made the design as simple as possible. This is that provides fundamental and necessary facilities to ensure no unnecessary complexity is introduced into to establish the trustworthiness of the operating the system and also the overall security analysis of the system services (via connection to a hardware- system is easier to perform. based root trust). The proposed architecture is modular. This is to en- 2. A mechanism to establish Effective Containment sure that the design could also be implemented on hard- (that is, a mechanism to prevent an exploited appli- ware architectures that lack the security capabilities in- cation from enabling attacks on another application troduced, and also environments where the types of at- possible) via the MAC offered by SELinux. tacks, or the security assets of the system would not re- quire the high degree of protection provided by this de- Security services provided by higher-level software con- sign. structs, such as middleware and frameworks alike are not the focus of this article and will not be discussed. 4 Technology Overview

It is also noteworthy to mention that, where the re- 4.1 Secure Boot quired security related components exist in the underly- ing hardware architecture, the proposed design is viable Secure Boot (a.k.a. High Assurance Boot) is a technique for non-embedded systems as well. for verifying and asserting the integrity of an executable image prior to passing the control to it. Assuming the 2 Design Constraints verification mechanism is based on the digital signature of the image being verified, then the reliability of this The following are the most commonly considered con- verification is at best as good as the reliability of the pro- straints when designing and developing software com- tection mechanism provided in the device for the public ponents for an embedded device: key of the image signer. The most important assumption here is that the code 1. Memory Footprint which performs the integrity verification process is itself 2007 Linux Symposium, Volume Two • 81

Figure 1: A Typical Linux-based Mobile Phone Architecture

trustworthy. To assert this assumption, the implementa- 4.2 Effective Containment tions typically put the public key material, as well as the verifier code into non-writable areas of memory, which The “Buffer Overflow” class of attacks is practically in turn are protected via a form of hardware protection impossible to prevent in native environments with no mechanism. Figure 2 shows a generic Secure Boot ar- type- or boundary-checking available at runtime. Exe- chitecture. cuting native code in operating environments like Linux makes it specifically susceptible to this category of at- tacks. This therefore makes exploiting a buffer over- This design enables the establishment of a chain of trust flow attack of particular interest to hackers, and success- by ensuring that the trust, on each layer of the system, is fully mounting such an attack is considered a badge of based on, and is only based on, the trust on the layer(s) honor. Effective Containment, in this context, is there- underneath it, all the way down to the hardware secu- fore referred to a class of techniques that contain (as rity component, which serves as the Root Of Trust. If opposed to prevent) such attacks for which there are no verification fails to succeed at any given stage, the sys- practical prevention mechanism available. This could tem might be put in a suspended-mode to block possible be achieved via the use of various software and security attacks. technologies. Applying a MAC mechanism is one way to implement effective containment.

One must note, however, that this architecture, though ensuring the integrity of the operating environment 4.2.1 Embedded SELinux when a hard boot occurs, does not guarantee its in- tegrity during the runtime; that is, in case of any mali- One method to achieve a MAC is via implementing cious modification to the operating environment during RBAC (Role-Based Access Control). NSA’s SELinux, the runtime, this architecture will not detect it until the among other features such as MLS (Multi Level Se- next hard boot happens. curity), provides Linux with MAC through RBAC. 82 • Trusted Secure Embedded Linux

Figure 2: A Typical Secure Boot Design

SELinux was not originally designed for the ARM ar- in the future as the above-mentioned technologies ma- chitecture, or for embedded devices. There have been, ture. however, previous and reasonably successful attempts to port SELinux or parts of it into an embedded device and on an ARM architecture; the most notable of which 4.2.2 Multi-core and Virtualization being by Russell Coker [1].

By adding a MAC mechanism such as SELinux on top In recent years, the industry has put more focus on of Secure Boot, we will be able to address one of its fun- adding more computing power to embedded devices, not damental shortcomings; providing a level of protection only in adding more processing capabilities to each core, at runtime. Figure 3 shows an architecture, deploying but also adding to the number of cores available in hard- Secure Boot and MAC mechanisms together. ware architectures. One of the objectives of this expan- sion is have a dedicated hardware resource to each high- In this design, not only have we accomplished augment- level task, and therefore achieving easier management ing the Secure Boot mechanism (by way of providing of software design and implementation through hard- runtime containment), but have also enabled a way to ware compartmentalization. This approach has resulted expose hardware-security capabilities (e.g. TPM stan- in recent growth in implementing virtualization tech- dard services) to the applications and processes during nologies in embedded space. Various types of virtual- the system runtime. ization techniques exist (hardware- and software-based) which all provide multiple guest domains, each assum- As a side note, it is important to mention that RBAC is ing total access to the underlying hardware, and concep- not the only mechanism to implement MAC. Other im- tually having no awareness of the other guest domains. plementations exist which might also become suitable A fundamental element of any virtualization implemen- for embedded devices; “” and Novell’s tation is a layer called hypervisor, which is responsi- “AppArmor” are both examples of such solutions that ble for mediating the interactions among guest domains, implement a technique called NBAC (Name-Based Ac- and providing necessary life-cycle management for each cess Control). LIDS (Linux Intrusion Detection Sys- guest domain. tem) is another example. At the time of writing this arti- cle, however, neither of these implementations seem to As is in most cases, this technique has existed in have been able to gain the traction in the industry, nor large-scale enterprise systems prior to embedded space; by manufacturers of embedded devices to be the default however, again the implications of this technology in MAC for such devices. This state, however, may change resource-constrained embedded devices are different 2007 Linux Symposium, Volume Two • 83

Figure 3: Augmenting Secure Boot with Access Control

than those of the enterprise systems. The use cases, hardware architecture; however, no parts of this design however, remain similar. rely on such capability. It is also important to mention that we only assume the TrustZone hardware availablity Any modern design that attempts to provide a security in the underlying architecture; analyzing the proprietary solution for embedded space, therefore needs to assume software stacks on top of TrustZone hardware that pro- it might be contained in a virtual, guest domain. vide additional security capabilities to the applications, is outside the scope of this article. 4.2.3 ARM TrustZone Technology 5 Proposed Architecture TrustZone is a security technology introduced by ARM Ltd. in its “ARM 1176” core. TrustZone is a technol- 5.1 High-level Analysis ogy to provide a hardware-based separation for exe- cution environment, and divide it into two halves; se- We propose the architecture shown in Figure 4 to ad- cure and normal worlds. The security-sensitive appli- dress the security requirements discussed in this article. cations are run executed in a separate memory space which is not accessible to “normal applications.” Trust- This design is based on a hardware root of trust, Zone is the first ARM architecture to provide hardware- and therefore could provide security as reliable as the based security in its core. Although and if done right method protecting this root. It implements a MAC this could potentially provide the operating environ- mechanism based on embedded SELinux, and can do ment with an additional level of security, at its core it in a virtualized environment. The proposed design this could be considered a hardware-based virtualiza- deploys the security capabilities available in hardware tion solution, and is conceptually not a new idea. Fur- (that is, TrustZone hardware) to enforce a separation thermore compartmentalized-security and separation of mechanism among guest domains, via dedicating sep- execution environments due to application’s security re- arate areas of memory to different processes that exist quirements, along with “separation of concerns,” have in each domain. The overall security management, such all been well-understood and known concepts in com- as secure IPC (Inter Process Communication) of such puter science and software engineering for decades [3]. processes, is the responsibility of the VMM (Virtual Machine Monitor) which acts as the hypervisor in this The proposed design in this article considers the avail- design. This design enables a hardware-enforced sepa- ability of ARM TrustZone technology on the underlying ration among processes running in each guest domain, 84 • Trusted Secure Embedded Linux

Figure 4: Implementing MAC and Virtualization, Based on Hardware Root Of Trust

with a fine-grained, mandatory access control mech- efficiently; an unnecessarily comprehensive and restric- anism provided by SELinux infrastructure. The ini- tive policy has a potential to hamper the overall runtime tial verification of each guest domain happens prior to performance, and increase the memory footprint of the bringing it up. After each domain is on-line, ensuring system. its health from security perspective is also provided by the SELinux infrastructure. The performance of the hypervisor is also the key in this design, as it is the layer that arbitrates the interactions It is presumed that due to differences in security re- among the processes which exist in separate guest do- quirements during the start-up and runtime, not all the mains. A fast and high-performing hypervisor is the embedded devices would require all the elements pro- quintessential key to the successful implementation of vided in this design. This, however, does not indicate this design. a weakness in the architecture, as the main security as- pects of the design could be implemented/enabled inde- pendently. 7 Legal Statement

6 Conclusion This work represents the personal views of the author We proposed a design that deploys a hardware root of and is a technical analysis of an architecture; it does not trust to provide secure execution of applications in a vir- necessarily represent the views of MontaVista Software, tualized environment. We also augmented the design by Inc. adding a MAC mechanism to provide enhanced protec- Furthermore, the author is not an attorney and makes tion to applications and processes at runtime. no judgment or recommendation on legal (and specif- On an system which requires and implements all the ca- ically GPL) ramifications of the proposed design; such pabilities provided in this architecture, a thorough anal- issues are outside the scope of this article, and should be ysis must be performed to ensure an appropriate secu- dealt with via consulting with a GPL attorney/law prac- rity policy is in place to deploy the SELinux capabilities titioner. 2007 Linux Symposium, Volume Two • 85

Linux is the registered trademark of Linus Torvalds in the United States of America, other countries, or both. Other company, product, and service names may be trademarks or service marks of others.

References

[1] Russell Coker. Porting nsa security enhanced linux to hand-held devices. In Proceedings of the Linux Symposium. Ottawa Linux Symposium, July 2003.

[2] William R. Hamburgen, Deborah A. Wallach, Marc A. Viredaz, Lawrence S. Brakmo, Carl A. Waldspurger, Joel F. Bartlett, Timothy Mann, and Keith I. Farkas. Itsy: Stretching the Bounds of Mobile Computing. IEEE Computer, 34(4):28–35, April 2001.

[3] Jorrit N. Herder, Herbert Bos, Andrew S. Tenenbaum. A Lightweight Method for Building Reliable Operating Systems Despite Unreliable Device Drivers. Technical report, Dept. of Computer Science, Vrije Universiteit, Amsterdam, The Netherlands, 2006. 86 • Trusted Secure Embedded Linux Hybrid-Virtualization—Enhanced Virtualization for Linux*

Jun Nakajima and Asit K. Mallick Intel Open Source Technology Center [email protected], [email protected]

Abstract change over the time. In fact, patches already have been sent to update them while we are writing this paper! We propose hybrid-virtualization that combines para- One of the main sources of such growth and modifica- virtualization and hardware-assisted virtualization. It tion seems to be different hypervisor implementations can achieve equivalent or better performance than in support of para-virtualization. software-only para-virtualization, taking the full advan- tage of each technology. We implemented a hybrid- virtualization Linux with para-virtualization, which Para-Virtualization required much fewer modifications to Linux, and yet achieved equivalent or better performance than software-only XenLinux. The para-virtualization is a virtualization technique that presents a software interface to virtual machines that is The hybrid-virtualization employs para-virtualization similar but not identical to that of the underlying hard- for I/O, interrupt controllers, and timer to simplify the ware. This technique is often used to obtain better per- system and optimize performance. For CPU virtualiza- formance of guest operating systems running inside a tion, it allows one to use the same code as in the original virtual machine. kernel. Para-virtualization can be applicable even to hardware- The other benefits are: One, it can run on broader ranges assisted virtualization as well. Historically, para- of VMMs, including Xen and KVM; and Two, it takes virtualization in the Linux community was used to mean full advantage of all the future extensions to hardware modifications to the guest operating system so that including virtualization technologies. it can run in a virtual machine without requiring the hardware-assisted virtualization features. In this paper, This paper assumes basic understanding of the Linux we use software-only para-virtualization to mean “para- kernel and virtualization technologies. It provides in- virtualization that obviates hardware-assisted virtualiza- sights to how the para-virtualization can be extended tion.” with hardware assists for virtualization and take advan- tage of future hardware extension. The nature of the modifications used by para- virtualization to the kernel matters. We experienced sig- nificant complexity of software-only para-virtualization 1 Introduction when we ported x86-64 Linux to Xen [4]. The root cause of such complexity is that software-only para- Today x86 Linux already has two levels of para- virtualization forces the kernel developers to handle the virtualization layers, including paravirt_t and VMI that virtual CPU that has significant limitations and different communicate with the virtual machine monitor (VMM) behaviors from the native CPU. to improve performance and efficiency. However, it is true that such (ever-changing) extra interfaces com- For example, such virtual CPU has completely differ- plicate code maintenance of the kernel and create in- ent systems state such as, it does not have GDT, IDT, evitable confusions. In addition, it can create differ- LDT, or TSS; completely new interrupt/exception mech- ent kernel binaries, potentially for each VMM, such anism; or different protection mechanism. And the vir- as Xen*, KVM, VMware, etc. Today, paravirt_ops al- tual CPU does not support any privileged instructions, ready has 76 operations for x86, and it would grow and requiring them to be executed by hypercalls.

• 87 • 88 • Hybrid-Virtualization—Enhanced Virtualization for Linux

The protection mechanism often has problems with hav- real time, not virtual time! As most operating systems ing a shared kernel address space as there is no ring- rely on timer interrupts to maintain its time, the system level protection when both user and kernel are running expects timer interrupts even when idle. If timer inter- at ring 3. This creates an additional overhead of stitch- rupts are missed, it can affect the time keeping of the op- ing address space between any transition between the erating system. Without para-virtualization, the VMM application and the kernel. needs to continue injecting timer interrupts or to inject back-to-back timer interrupts when the guest operating Additionally, system calls are first intercepted by the system is scheduled back to run. This is not a reliable or Xen (ring 0), and are injected to the guest kernel. Once scalable way of virtualization. With para-virtualization, the guest kernel completes the service, then it needs to a typical modification is to change the idle code to re- go back to the user executing a hypercall (and page table quest the VMM to notify itself in a specified time period. switch because of the reason above). Then time is re-calculated and restored in the guest.

Benefiting from Para-Virtualization SMP Guests Handling

We also found para-virtualization really simplified SMP guest handling is another example. In x86 or x86- and optimized the system. In fact there are still 64, local APIC is required to support SMP especially significant cases where the native or software-only because the operating systems need to send IPI (Inter- para-virtualization outperforms full-virtualization us- Processor Interrupt). Figure 1 shows the code for send- ing hardware-assisted virtualization especially with I/O ing IPI on the x86-64 native systems using the flat mode. or memory intensive workloads (or both). As you see, the code needs to access the APIC registers a couple of times. Each access to the APIC registers In this paper we first discuss the advantages and disad- needs to be intercepted for virtualization, causing over- vantages of para-virtualization, full-virtualization, and head (often a transition to the VMM). hardware-assisted virtualization. Note that some of these advantages and disadvantages can be combined Para-virtualization can replace such multiple implicit re- and complimentary. Second, we discuss the hybrid- quests with a single explicit hypercall, achieving faster, virtualization for Linux proposal. Unlike software- simpler, and more efficient implementations. only para-virtualization, hybrid-virtualization employs hardware-assisted virtualization, and it needs much fewer para-virtualization operations. Third, we discuss I/O Device Handling the design and implementation. Finally we present some examples of performance data. Writing software that emulates a complete computer, for example, is complex and labor-intensive because 2 Para-Virtualization various legacy and new devices need to be emulated. With para-virtualization the operating system can oper- ate without such devices, and thus the implementation 2.1 Advantages of the VMM can be simpler.

Obviously para-virtualization is employed to achieve From the guest operating system’s point view, the high performance and efficiency. Since para- impacts of such modifications would be limited be- virtualization typically uses a higher level of APIs that cause the operating system already has the infrastruc- are not available on the underlying hardware, efficiency ture that supports layers of I/O services/devices, such as is also improved. block/character device, PCI device, etc.

2.2 Disadvantages Time and Idle Handling There are certain advantages of para-virtualization as “Time” is a notable example. Even in a virtual system, mentioned above but there are certain disadvantages in the user expects that the virtual machine maintain the Linux. 2007 Linux Symposium, Volume Two • 89 static void flat_send_IPI_mask(cpumask_t cpumask, int vector) { ... /* * Wait for idle. */ apic_wait_icr_idle(); /* * prepare target chip field */ cfg = __prepare_ICR2(mask); apic_write(APIC_ICR2, cfg); /* * program the ICR */ cfg = __prepare_ICR(0, vector, APIC_DEST_LOGICAL); /* * Send the IPI. The write to APIC_ICR fires this off. */ apic_write(APIC_ICR, cfg); ... }

Figure 1: Sending IPI on the native x86-64 systems (flat mode)

Modified CPU Behaviors to detect the CPU capabilities available on the proces- sor. If a user application does so, it may not be possible One of the fundamental problems, however, is that to modify the application if provided in a binary object. software-only para-virtualization forces the kernel de- velopers to handle the virtual CPU that has significant limitations and different behaviors from the native CPU. 2.2.1 Overheads And such virtual CPU behaviors can be different on dif- ferent VMMs because the semantics of the virtual CPU Although para-virtualization is intended to achieve high is defined by the VMM. Because of that, kernel devel- performance, ironically the protection mechanism tech- opers don’t feel comfortable when the kernel code also nique used by software-only para-virtualization can of- needs to handle virtual CPUs because they may break ten cause overhead. For example, system calls are first virtual CPUs even if the code they write works fine for intercepted by the Xen (ring 0), and are injected to the native CPUs. guest kernel. Once the guest kernel completes the ser- vice, then it needs to go back to the user executing a Figure 2 shows, for example, part of paravirt_t struc- hypercall with the page tables switched to protect the ture in x86 Linux 2.6.21(-rcX as of today). As you can guest kernel from the user processes. This means that it see, it has operations on TR, GDT, IDT, LDT, the ker- is impossible to implement fast system calls. nel stack, IOPL mask, etc. It is barely possible for a kernel developer to know how those operations can be The same bouncing mechanism is employed when han- safely used without understanding the semantics of the dling exceptions, such as page faults. They are also first virtual CPU, which is defined by each VMM. It also is intercepted by Xen even if generated purely by user pro- not guaranteed that the common code between the na- cesses. tive and virtual CPU can be cleanly written. The guest kernel also loses the global pages for its ad- A notable issue is the CPUID instruction because it is dress translation to protect Xen from the guest kernel. available in user mode as well, thus software-only para- virtualization inherently requires modifications to the Note that this overhead can be eliminated in hardware- operating system. The CPUID instruction is often used assisted virtualization because hardware-assisted virtu- 90 • Hybrid-Virtualization—Enhanced Virtualization for Linux struct paravirt_ops { ... void (*load_tr_desc)(void); void (*load_gdt)(const struct Xgt_desc_struct *); void (*load_idt)(const struct Xgt_desc_struct *); void (*store_gdt)(struct Xgt_desc_struct *); void (*store_idt)(struct Xgt_desc_struct *); void (*set_ldt)(const void *desc, unsigned entries); unsigned long (*store_tr)(void); void (*load_tls)(struct thread_struct *t, unsigned int cpu); void (*write_ldt_entry)(void *dt, int entrynum, u32 low, u32 high); void (*write_gdt_entry)(void *dt, int entrynum, u32 low, u32 high); void (*write_idt_entry)(void *dt, int entrynum, u32 low, u32 high); void (*load_esp0)(struct tss_struct *tss, struct thread_struct *thread); void (*set_iopl_mask)(unsigned mask); ... };

Figure 2: The current paravirt_t structure (only virtual CPU part of 76 operations) for x86 alization can provide the same CPU behavior as the na- obtain optimized performance. Supporting SMP guest tive without modifications to the guest operating system. OSes further complicates the VMM design and imple- mentation.

3 Full-Virtualization In addition, as new technologies emerge, the VMM needs to virtualize more devices and features to min- 3.1 Advantages imize functional/performance gaps between the virtual and native systems. The full-virtualization requires no modifications to the guest operating systems. This attribute itself clearly 4 Hardware-Assisted Virtualization brings significant value and advantage.

The hardware-assisted virtualization is orthogonal to 3.2 Disadvantages para or full virtualization, and it can be used for the both.

Full-virtualization requires one to provide the guest op- 4.1 Advantages erating systems with an illusion of a complete virtual platform seen within a virtual machine behavior same The hardware-assisted virtualization provides virtual as a standard PC/server platform. Today, both Xen and machine monitors (VMM) with simpler and robust im- KVM need Qemu for PC platform emulation with the plementation. CPU being native. For example, its system address space should look like a standard PC/server system ad- Full-virtualization can be implemented by software only dress map, and it should support standard PC platform [1], as we see such products such as VMware as well devices (keyboard, mouse, real time clock, disk, floppy, as Virtual PC and Virtual Server from Microsoft to- CD-ROM drive, graphics, 8259 programmable interrupt day. However, hardware-assisted virtualization such controller, 8254 programmable interval timer, CMOS, as Intel R Virtualization Technology (simply Intel R VT etc.), guest BIOS, etc. In addition, we need to pro- hereafter) can improve the robustness, and possibly per- vide those virtual devices with the DMA capabilities to formance. 2007 Linux Symposium, Volume Two • 91

4.2 Disadvantages However, we use this terminology to avoid any con- fusions caused by the connotation from software-only Obviously hardware-assisted virtualization requires a para-virtualization. And the critical difference is that system with the feature, but it is sensible to assume that hybrid-virtualization is simply a set of optimization hardware-assisted virtualization is available on almost techniques for hardware-assisted full-virtualization. all new x86-64-based systems. 5.2 Pseudo Hardware Features Para-virtualization under hardware-assisted virtualiza- tion needs to use a certain instruction(s) (such as The hybrid-virtualization capabilities are detected and VMCALL on Intel R VT), which is more costly than enabled as by the standard Linux for the native as if the fast system call (such as SYSENTER/SYSEXIT, they were hardware features, i.e. pseudo hardware fea- SYSCALL/SYSRET) used under software-only para- ture, i.e. visible as hardware from the operating system’s virtualization. However, the cost of such instructions point of view. will be lower in the near future. We use the CPUID instruction to detect certain CPU The hardware-assisted virtualization today does not in- capabilities on the native system, and we include the clude I/O devices, thus it still needs emulation of I/O de- ones for hybrid-virtualization without breaking the na- vices and possibly para-virtualization of performance- tive systems. The CPUID instruction is intercepted by critical I/O devices to reduce frequent interceptions for hardware-assisted virtualization so that the VMM can virtualization. virtualize the CPUID instruction. In other words, the kernel does not know whether the capabilities are im- 5 Hybrid-Virtualization for Linux plemented in software (i.e., VMM) or hardware (i.e., silicon). Hardware-assisted virtualization does simplify the VMM design and allows use of the unmodified Linux In order to avoid maintenance problems caused by the as a guest. paravirt layer, the interfaces must be in the lowest level in the Linux that makes the actual operations on the There are, however, still significant cases where hardware. If a particular VMM needs a higher level or software-only para-virtualization outperforms full- new interface in the kernel, it should be detected and virtualization using hardware-assisted virtualization es- enabled as pseudo hardware feature as well, rather than pecially with I/O or memory intensive workloads (or extending or modifying the paravirt layer. both), which are common among enterprise applica- tions. Those enterprise applications matter to Linux. 5.3 Common Kernel Binary for the Native and For I/O intensive workloads, such performance gaps VMMs have been mostly closed by using para-virtualization drivers for network and disk. Those drivers are shared One of the goals of hybrid-virtualization is to have with software-only para-virtualization. the common kernel binary for the native and various VMMs, including the Xen hypervisor, KVM [6], etc. For memory intensive workloads, the current produc- For example, the kernel binary for D0, G, H can be all tion processors with hardware-assisted virtualization same in Figure 3. does not have capability to virtualize MMU, and the VMM needs to virtualize MMU in software [2]. This With the approach above and the minimal paravirt layer, causes visible performance gaps between the native or we can achieve the goal. software-only para-virtualization and hardware-assisted full-virtualization (See [7], [8]). 5.4 Related Works

5.1 Overview of Hybrid-Virtualization Ingo Molnar also added simple paravirt ops support (us- ing the shadow CR3 feature of Intel R VT) to improve Hybrid-virtualization that we propose is technically the context switch of KVM [3]. This technique can be para-virtualization for hardware-assisted virtualization. simply incorporated in our hybrid-virtualization. 92 • Hybrid-Virtualization—Enhanced Virtualization for Linux

Qemu I/O Linux Linux Guest (G) Guest (G)

Privilege Linux (D0)

Hybrid- Hypervisor Virtualization Support

Normal Normal Linux Linux User User Guest (G) Guest (G) Process Process

Qemu I/O Qemu I/O

KVM Driver Hybrid- Linux Kernel (H) Virtualization Support

Figure 3: Same Kernel Binary for a hypervisor (D0 and G), KVM (G) and the native (H)

6 Design and Implementation for which it will be idle so that the VMM can use the time for other VMs. 6.1 Areas for Para-Virtualization • Interrupt controllers

Based of the performance data available in the com- • MMU munity and our experiences with x86-64 Linux port- ing to Xen, we identified the major areas to use para- • SMP support – the hypervisor also knows which virtualization, while maintaining the same CPU behav- physical CPUs actually have (real) TLBs with in- ior as the native: formation that needs to be flushed. A guest with many virtual CPUs can send many unnecessary IPIs to virtual CPU running other guests. This area • I/O devices, such as network and disk is included by the interrupt controller. • Timer – with para-virtualization, the kernel can have better accounting because of the stolen time for other guests. In fact all the areas except MMU are straightforward to support in Linux as it already has the proper infrastruc- • Idle handling – the latest kernel has no idle ticks, tures to handle without depending on para-virtualization but the kernel could even specify the time period or virtualization-specific modifications: 2007 Linux Symposium, Volume Two • 93

• I/O devices – obviously Linux needs to han- 6.2.2 Other Optimization Techniques dle various I/O devices today, and various para- virtualization drivers are already available in com- Since we can run the kernel in ring 0, all the optimiza- mercial VMMs. tion techniques used by the native kernel have been back • Timer – Linux supports various time sources, such to the kernel in hybrid-virtualization, including fast sys- as PIT, TSC, HPET, tem call. • Idle handling – Linux already has a mechanism to 6.3 Detecting Hybrid-Virtualization select the proper idle routine.

• Interrupt controller – the genapic (in x86-64) is a The kernel needs to detect whether hybrid-virtualization good example. is available or not (i.e., on the native or unknown VMM that does not support hybrid-virtualization). As we dis- 6.2 MMU Para-Virtualization cussed, we use the CPUID (leaf 0x40000000, for exam- ple) instruction. The leaf 0x40000000 is not defined on This is the outstanding area where Linux benefits the real hardware, and the kernel can reliably detect the significantly today from para-virtualization even with presence of hybrid-virtualization only in a virtual ma- hardware-assisted virtualization. chine because the VMM can implement the capabilities For our implementation, we used the direct page tables of the leaf 0x40000000. employed by Xen (See [5], [4]). Unlike the shadow page table mode that builds additional (duplicated) page ta- 6.4 Hypercall and Setup bles for the real translation, the direct page tables are native page tables. The guest operating system use the Once the hybrid-virtualization capabilities are detected, hypercalls to request the VMM to update the page tables the kernel can inform the VMM of the address of the entries. page that requires the instruction stream for hypercalls The paravirt_t layer in x86 already has such operations, (called “hypercall page”). The request to the VMM is and we ported the subset of paravirt operations to x86- done by writing the address to an MSR returned by the 64, extending them to the 4-level page tables as shown CPUID instruction. Then the VMM actually writes the in Figure 4. instruction stream to the page, and then hypercalls will be available via jumping to the hypercall page with the arguments (indexed by the hypercall number). 6.2.1 Efficient Page Fault Handling 6.5 Booting and Initialization Although hybrid-virtualization uses the direct page table mode today, it is significantly efficient compared with the one on software-only para-virtualization because the The hybrid-virtualization Linux uses the booting code page faults can be selectively delivered to guest directly identical to the native at early boot time. It then switches in hardware-assisted virtualization. For example, VMX to the direct page table mode if hybrid-virtualization provides page-fault error-code mask and match fields in is present. Until that point, the guest kernel needs to the VMCS to filter VM exits due to page-faults based use the existing shadow page table mode, and then it on their cause (reflected in the error-code). We use this switches to the direct page table mode upon a hypercall. functionality so that the guest kernel directly get page We implemented the SWITCH_MMU hypercall for faults from user processes without causing a VM exit. this purpose. Upon that hypercall, the VMM updates the Note that the majority of page faults are from user pro- guest page tables so that they can contain host physical cesses under practical workloads. address (rather than guest physical address) and write- In addition, now the kernel runs in the ring 0 as the na- protect them. Upon completion of the hypercall, the tive, all the native protection and efficiency, including guest needs to use the set of hypercalls to update its page paging-based protection and global pages, have been re- tables, and those are seamlessly incorporated by the par- turned back to the guest kernel. avirt layer. 94 • Hybrid-Virtualization—Enhanced Virtualization for Linux struct paravirt_ops { ... unsigned long (*read_cr3)(void); void (*write_cr3)(unsigned long);

void (*flush_tlb_user)(void); void (*flush_tlb_kernel)(void); void (*flush_tlb_single)(unsigned long addr);

void (*alloc_pt)(unsigned long pfn); void (*alloc_pd)(unsigned long pfn);

void (*release_pt)(unsigned long pfn); void (*release_pd)(unsigned long pfn);

void (*set_pte)(pte_t *ptep, pte_t pteval); void (*set_pte_at)(struct mm_struct *mm, ... pte_t (*ptep_get_and_clear)(struct mm_struct *mm, ...

void (*set_pmd)(pmd_t *pmdp, pmd_t pmdval); void (*set_pud)(pud_t *pudp, pud_t pudval); void (*set_pgd)(pgd_t *pgdp, pgd_t pgdval);

void (*pte_clear)(struct mm_struct *mm, ... void (*pmd_clear)(pmd_t *pmdp); void (*pud_clear)(pud_t *pudp); void (*pgd_clear)(pgd_t *pgdp);

unsigned long (*pte_val)(pte_t); unsigned long (*pmd_val)(pmd_t); unsigned long (*pud_val)(pud_t); unsigned long (*pgd_val)(pgd_t);

pte_t (*make_pte)(unsigned long pte); pmd_t (*make_pmd)(unsigned long pmd); pud_t (*make_pud)(unsigned long pud); pgd_t (*make_pgd)(unsigned long pgd); };

Figure 4: The current paravirt_t structure for x86-64 hybrid-virtualization

6.6 Prototype • Idle handling

We implemented hybrid-virtualization x86-64 Linux • Interrupt controller in Xen, starting from full-virtualization in hardware- assisted virtualization, porting the x86 paravirt (with Since the x86-64 Linux uses 2MB pages for the ker- significant reduction). nel mapping and the current Xen does not support large We used the existing Xen services for the following: pages, we needed to add the level 1 pages (page tables) in the kernel code so that SWITCH_MMU hypercall • I/O devices – virtual block device and network de- can work. vice front-end drivers. We also reused the code for x86-64 XenLinux virtual • Timer MMU code for the paravirt MMU. 2007 Linux Symposium, Volume Two • 95

12 visible performance improvements. Figure 5 shows pre- liminary results. “para-domU” is x86-64 XenLinux with 10 software-only para-virtualization, and “hybrid” is the one with hybrid-virtualization. “KVM” is x86-64 Linux 8 running on the latest release (kvm-24, as of today). The para domU

ec 6 hybrid smaller are the better, and the absolute numbers are not us KVM so relevant. 4 As of today, we are re-measuring the performance us- 2 ing the latest processors, where we believe the cost of hypercalls are even lower. 0

ll t a O tat e s dl c I/ s os in ll ll cl hn / g ig nu nu si s 8 Conclusion pen o

The hybrid-virtualization is able to combine advantages from both the hardware assisted full virtualization and software-only para-virtualization. The initial prototype results also show performance close to software-only 7000 para-virtualization. This also provides the added ben- efit that the same kernel can run under native machines. 6000

5000 Acknowledgment

4000 para domU We would like to thank Andi Kleen and Ingo Molnar for ec hybrid us 3000 KVM reviewing this paper and providing many useful com- ments. 2000 Xin Li and Qing He from Intel also have been working 1000 on the development of hybrid-virtualization Linux.

0 fork proc exec proc sh proc References

[1] Virtual Machine Interface (VMI) Specifications. Figure 5: Preliminary Micro-benchmark Results (lm- http://www.vmware.com/interfaces/ bench) vmi_specs.html.

[2] Y. Dong, S. Li, A. Mallick, J. Nakajima, K. Tian, 7 Performance X. Xu, F. Yang, and W. Yu. Extending Xen with Intel R Virtualization Technology. Auguest 2006. http://www.intel.com/technology/ Although the cost of hypercalls are slightly higher in itj/2006/v10i3/. hybrid-virtualization, hybrid-virtualization is more effi- cient than software-only para-virtualization because of [3] Ingo Molnar. KVM paravirtualization for Linux. the retained optimization techniques in the native ker- 2007. nel. In fact, hybrid-virtualization showed the equivalent http://lkml.org/lkml/2007/1/5/205. performance with kernel build, compared with software- only para-virtualization, which has near-native perfor- [4] Jun Nakajima, Asit Mallick, Ian Pratt, and Keir mance for that workload. Fraser. X86-64 XenLinux: Architecture, Implementation, and Optimizations. In Preedings For micro-benchmarks, hybrid-virtualization showed of the Linux Symposium, July 2006. 96 • Hybrid-Virtualization—Enhanced Virtualization for Linux

[5] Ian Pratt, Keir Fraser, Steven Hand, Christian Limpach, Andrew Warfield, Dan Magenheirmer, Jun Nakajima, and Asit Mallick. Xen 3.0 and the Art of Virtualization. In Preedings of the Linux Symposium, July 2005.

[6] Qumranet. KVM: Kernel-based Virtualization Driver. 2006. http: //www.qumranet.com/wp/kvm_wp.pdf/.

[7] VMware. A Performance Comparison of Hypervisors. 2007. http://www.vmware.com/pdf/ hypervisor_performance.pdf/.

[8] XenSource. A Performance Comparison of Commercial Hypervisors. 2007. http://www.xensource.com/files/ hypervisor_performance_comparison_ 1_0_5_with_esx-data.pdf/.

This paper is copyright c 2007 by Intel. Redistribution rights are granted per submission guidelines; all other rights are re- served.

*Other names and brands may be claimed as the property of others. Readahead: time-travel techniques for desktop and embedded systems

Michael Opdenacker Free Electrons [email protected]

Abstract 1.2 Reading ahead

Readahead techniques have successfully been used to The idea of reading ahead is to speed up the access to reduce boot time in recent GNU/Linux distributions like a file by preloading at least parts of its contents in page Fedora Core or . However, in embedded sys- cache ahead of time. This can be done when spare I/O tems with scarce RAM, starting a parallel thread read- resources are available, typically when tasks keep the ing ahead all the files used in system startup is no longer processor busy. Of course, this requires the ability to appropriate. The cached pages could be reclaimed even predict the future! before accessing the corresponding files. Fortunately, the systems we are dealing with are pre- This paper will first guide you through the heuristics dictable or even totally predictable in some situations! implemented in kernelspace, as well as through the userspace interface for preloading files or just announc- ing file access patterns. Desktop implementations will • Predictions by watching file read patterns. If pages be explained and benchmarked. We will then detail Free are read from a file in a sequential manner, it makes Electrons’ attempts to implement an easy to integrate sense to go on reading the next blocks in the file, helper program reading ahead files at the most appropri- even before these blocks are actually requested. ate time in the execution flow. • System startup. The system init sequence doesn’t This paper and the corresponding presentation target change. The same executables and data files are al- desktop and embedded system developers interested in ways read in the same order. Slight variations can accelerating the course of Time. still happen after a system upgrade or when the sys- tem is booted with different external devices con- 1 Reading ahead: borrowing time from the fu- nected to it. ture • Applications startup. Every time a program is run, the same shared libraries and some parts of the 1.1 The page cache program file are always loaded. Then, many pro- grams open the same resource or data files at sys- Modern operating system kernels like Linux manage tem startup. Of course, file reading behaviour is and optimize file access through the page cache. When still subject to changes, according to how the pro- the same file is accessed again, no disk I/O is needed gram was started (calling environment, command if the file contents are still in the page cache. This dra- arguments. . . ). matically speeds up multiple executions of a program or multiple accesses to the same data files. If enough free RAM is available, reading ahead can Of course, the performance benefits depend on the bring the following benefits: amount of free RAM. When RAM gets scarce because of allocations from applications, or when the contents of more files have to be loaded in page cache, the kernel • Of course, reduced system and application startup has to reclaim the oldest pages in the page cache. time.

• 97 • 98 • Readahead: time-travel techniques for desktop and embedded systems

• Improved disk throughput. This can be true for its pages were reclaimed before being accessed by the storage devices like hard disks which incur a high process. In this case, the kernel reduces the size of time cost moving the disk heads between random the ahead window, down to VM_MIN_READAHEAD (16 sectors. Reading ahead feeds the I/O scheduler KB). Otherwise, the kernel increases this size, up to with more I/O requests to manage. This sched- VM_MAX_READAHEAD (128 KB). uler can then reorder requests in a more efficient way, grouping a greater number of contiguous disk The kernel also keeps track of page cache hits, to de- blocks, and reducing the number of disk head tect situations in which the file is partly or fully in page moves. This is much harder to do when disk blocks cache. When this happens, readahead is useless and are just read one by one. turned off. Implementation details can be found in the mm/ • Better overall utilization for both I/O and proces- readahead.c file in the kernel sources.1 sor resources. Extra file I/O is performed when the processor is busy. Context switching, which costs The initial readahead implementation in Linux 2.6 is precious CPU cycles, is also reduced when a pro- discussed in the 2004 proceedings [7] of the Ottawa gram no longer needs to sleep waiting for I/O, be- Linux Symposium. cause the data it is requesting have already been fetched. 2.2 Adaptive readahead patches

2 Kernel space readahead Many improvements to the kernel readahead mecha- nism have been proposed by Wu Fengguang through the Adaptive readahead patchset, since September 2005 (as 2.1 Implementation in stock kernels announced on this LWN article [1].

The description of the Linux kernel readahead mecha- In addition to the standard sequential reading scenario, nism is based on the latest stable version available at the this patchset also supports: time of this writing, Linux 2.6.20. • a readahead window which can grow up to 1 MB, When the kernel detects sequential reading on a file, it depending on the application behaviour and avail- starts to read the next pages in the file, hoping that the able free memory running process will go on reading sequentially. • parallel / interleaved sequential scans on one file As shown in Figure 1, the kernel implements this by managing two read windows: the current and ahead • sequential reads across file open/close one, • mixed sequential / random accesses While the application is walking the pages in the current • sparse / skimming sequential read window, I/O is underway on the ahead window. When the current window is fully traversed, it is replaced by • backward sequential reading the ahead window. A new ahead window is then created, • delaying readahead if the drive is spinned down in and the corresponding batch of I/O is submitted. laptop mode This way, if the process continues to read sequentially, and if enough free memory is available, it should never At the time of this writing the latest benchmarks [3] have to wait for I/O. show access time improvements in most cases.

Of course, any seek or random I/O turns off this reada- This patchset and its ideas will be described in detail head mode. by Wu Fengguang himself at this 2007 edition of the Ottawa Linux Symposium. The kernel actually checks how effective reading ahead 1A very convenient way of studying kernel source files is using a is to adjust the size of the new ahead window. If a Linux Cross Reference (LXR) website indexing the kernel sources, page cache miss is encountered, it means that some of such as http://lxr.free-electrons.com. 2007 Linux Symposium, Volume Two • 99

When this page is reached: - the ahead window becomes the current one Current - a new ahead window is created read offset (possibly shorter or larger)

Offset in the open file (scale: pages)

Current read window Ahead read window (already read ahead) (reading ahead in progress)

Figure 1: Stock kernel implementation

3 User-space readahead interface 3.2 The fadvise system call

We’ve seen how the kernel can do its best to predict the Several variants of this system call exist, depending future from recent and present application behaviour, to on your system or GNU/: posix_ improve performance. fadvise, fadvise64, fadvise64_64.

However, that’s about all a general purpose kernel can They all have the same prototype though: predict. Fortunately, the Linux kernel allows userspace to let it know its own predictions. Several system call interfaces are available. #define _XOPEN_SOURCE 600 #include

3.1 The readahead system call int posix_fadvise( int fd, #include off_t offset, off_t len, ssize_t readahead( int advice); int fd, off64_t *offset, size_t count); Programs can use this system call to announce an in- tention to access file data in a specific pattern in the fu- ture, thus allowing the kernel to perform appropriate op- Given an open file descriptor, this system call allows ap- timizations. plications to instruct the kernel to readahead a given seg- ment in the file. Here is how the Linux kernel interprets the possible set- tings for the advice argument: Though any offset and count parameters can be given, I/O is performed in whole pages. So offset is rounded down to a page boundary and bytes are read POSIX_FADV_NORMAL: use the default readahead up to the first page boundary greater than or equal to window size. offset+count. POSIX_FADV_SEQUENTIAL: sequential access Note that readahead blocks until all data have been with increasing file offsets. Double the readahead read. Hence, it is typically called from a parallel thread. window size.

See the manual page for the readahead system call POSIX_FADV_RANDOM: random access. Disable [6] for details. readahead. 100 • Readahead: time-travel techniques for desktop and embedded systems

POSIX_FADV_WILLNEED: the specified data will be 4 Implementations in GNU/Linux distribu- accessed in the near future. Initiate a non-blocking tions read of the specified region into the page cache.

POSIX_FADV_NOREUSE: similar, but the data will 4.1 Ubuntu just be accessed once. Readahead utilities are released through the POSIX_FADV_DONTNEED : attempts to free the readahead package. The following description cached pages corresponding to the specified region, is based on Ubuntu 6.10 (Edgy). so that more useful cached pages are not discarded instead. Reading ahead is started early in the system startup by the /etc/init.d/readahead init script. This Note that this system call is not binding: the kernel is script mainly calls the /sbin/readahead-list free to ignore the given advise. executable, taking as input the /etc/readahead/ boot file, which contains the list of files to readahead, Full details can be found on the manual page for one per line. posix_fadvise [5]. readahead-list is of course started as a daemon, 3.3 The madvise system call to proceed as a parallel thread while other init scripts run. readahead-list doesn’t just readahead each #include specified file one by one, it also orders them first. int madvise( Ordering files is an attempt to read them in the most void *start, efficient way, minimizing costly disk seeks. To order size_t length, two files, their device numbers are first compared. When int advice); their device numbers are identical, this means that they belong to the same partition. The numbers of their first block are then compared, and if they are identical, their The madvise system call is very similar to fadvise, inode numbers are eventually compared. but it applies to the address space of a process. The readahead-list package carries another util- When the specified area maps a section of a file, ity: readahead-watch. It is used to create or update madvise information can be used by the kernel to the list of files to readahead by watching which files are readahead pages from disk or to discard page cache accessed during system startup. pages which the application will not need in the near future. readahead-watch is called from /etc/init.d/ readahead profile Full details can be found on the manual page for when the parameter is given in madvise [4]. the kernel command line. It starts watching for all files that are accessed, using the [8] system call. This is a non trivial task, as inotify watches have to 3.4 Recommended usage be registered for each directory (including subdirecto- ries) in the system. As the readahead system call is binding, application developers should use it with care, and prefer fadvise readahead-watch eventually gets stopped by the and madvise instead. /etc/init.d/stop-readahead script, at the very end of system startup. It intercepts this signal and When multiple parts of a system try to be smart and creates the /etc/readahead/boot file. consume resources while being oblivious to the others, this often hurts overall performance. After all, resource For the reader’s best convenience, C source code for management is the kernel’s job. It can be best to let it these two utilities and a copy of /etc/readahead/ decide what to do with the hints it receives from multi- boot can be found on http://free-electrons. ple sources, balancing the resource needs they imply. com/pub/readahead/ubuntu/6.10/. 2007 Linux Symposium, Volume Two • 101

4.2 Fedora Core

Readahead utilities are released through the readahead package. The following description is based on Fedora Core 6.

The readahead executable is /usr/sbin/readahead. Its interface and implementation are similar. It also sorts files in order to minimize disk seeks, with more sophisticated optimizations for the ext2 and ext3 filesystems.

A difference with Ubuntu is that there are two readahead init scripts. The first one is /etc/init.d/readahead_early, which is one of the first scripts to be called. It preloads files listed in /etc/readahead.d/default.early, corresponding to libraries, executables, and files used by services started by init scripts. The second script, /etc/init.d/readahead_later, is one of the last executed scripts. It uses /etc/readahead.d/default.later, which mainly corresponds to files used by the graphical desktop and user applications in general.

Another difference with Ubuntu is that the above lists of files are constant and are not automatically generated from application behaviour. They are just shipped in the package. However, the readahead-check utility (available in the package sources) can be used to generate these files from templates and check for nonexistent files.

Once more, the readahead.c source code and a few noteworthy files can be found on http://free-electrons.com/pub/readahead/fedora-core/6/.

4.3 Benchmarks

The benchmarks below compare boot time with and without readahead on Ubuntu Edgy (Linux 2.6.17-11, with all updates as of Apr. 12, 2007), and on Fedora Core 6 (2.6.18-1.2798.fc6, without any updates).

Boot time was measured by inserting an init script which just copies /proc/uptime to a file. This script was made the very last one to be executed.

/proc/uptime contains two figures: the raw uptime in seconds, and the amount of time spent in the idle loop, meaning the CPU was waiting for I/O before being able to do anything else.

Disabling readahead was done by renaming the /sbin/readahead-list (Ubuntu) or /usr/sbin/readahead (Fedora) programs, so that the readahead init scripts couldn't find them any more and exited at the very beginning.

The results are given in Table 1. The Fedora Core 6 results are surprising. An explanation is that the readahead file lists do not only include files involved in system startup, but also files needed to start the desktop and its applications. Fedora Core readahead is thus meant to reduce the execution time of programs like Firefox or Evolution!

As a consequence, Fedora Core is reading ahead many more files than needed (even if we disable the readahead_later step), and it reaches the login screen later than if readahead was not used. The eventual benefits in the time to run applications should still be real. However, they are more difficult to measure.

4.4 Shortcomings

The readahead implementations that we have just covered are fairly simple, but not perfect.

4.4.1 Reading entire files

A first limitation is that these implementations always preload entire files, while the readahead system call allows fetching only specific sections of files.

It's true that it can make sense to assume that plain data files used in system startup are often read in their entirety. However, this is not true at all with executables and shared libraries, for which each page is loaded only when it is needed. This mechanism is called demand paging. When a program jumps to a section of its address space which is not in RAM yet, a page fault is raised by the MMU, and the kernel loads the corresponding page from disk. Using the top or ps commands, you can check that the actual RAM usage of processes (RSS or RES) is much smaller than the size of their virtual address space (VSZ or VIRT).

So, it is a waste of I/O, time, and RAM to load pages in executables and shared libraries which will not be used anyway. However, as we will see in the next section, demand paging is not trivial to trace from userspace.
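Demand paging is easy to observe with a small experiment. The sketch below (any large file passed as its argument will do) maps a whole file but touches a single byte; inspecting the process afterwards, e.g. with ps -o vsz,rss -p <pid>, shows VSZ grown by the whole file size while RSS has grown by roughly one page:

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(int argc, char **argv)
{
        if (argc < 2)
                return 1;

        int fd = open(argv[1], O_RDONLY);
        if (fd < 0)
                return 1;

        struct stat st;
        fstat(fd, &st);

        /* The mapping enlarges VSZ by the whole file size... */
        char *map = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (map == MAP_FAILED)
                return 1;

        /* ...but RSS only grows by one page: a single page fault here. */
        volatile char c = map[0];
        (void)c;

        pause(); /* keep the process alive so RSS/VSZ can be inspected */
        return 0;
}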

                                  boot time                    idle time
Ubuntu Edgy without readahead     average: 48.368 s            average: 29.070 s
                                  std deviation: 0.153         std deviation: 0.281
Ubuntu Edgy with readahead        average: 39.942 s (-17.4%)   average: 22.3 s (-23.3%)
                                  std deviation: 1.296         std deviation: 0.271
Fedora Core 6 without readahead   average: 50.422 s            average: 28.302 s
                                  std deviation: 0.496         std deviation: 0.374
Fedora Core 6 with readahead      average: 59.858 s (+18.7%)   average: 35.446 s (+20.2%)
                                  std deviation: 0.552         std deviation: 0.312

Table 1: Readahead benchmarks on Ubuntu Edgy and Fedora Core 6

4.4.2 Reading ahead too late

Another limitation comes from reading ahead all files in a row, even the ones which are needed at the very end of system startup.

We've seen that files are preloaded according to their location on the disk, and not according to when they are used in system startup. Hence, it can happen that a file needed by a startup script is accessed before it is preloaded by the readahead thread.

5 Implementing readahead in embedded systems

5.1 Embedded systems requirements

Embedded systems have specific features and requirements which make the desktop implementations not completely appropriate for systems with limited resources.

The main constraint, as explained before, is that free RAM can be scarce. It is no longer appropriate to preload all the files in a row, because some of the readahead pages are likely to be reclaimed before being used. As a consequence, a requirement is to readahead files just a little while before they are used.

Therefore, files should be preloaded according to the order in which they are accessed. Moreover, most embedded systems use flash instead of disk storage. There is no disk seek cost accessing random blocks on storage, so ordering files by disk location is futile.

Still because of the shortage of free RAM, it is also a stronger requirement to preload only the portions of the files which are actually accessed during system startup.

Last but not least, embedded systems also require simple solutions which translate into lightweight programs and low CPU usage.

5.2 Existing implementations

Of course, it is possible to reuse code from the readahead utilities found in GNU/Linux distributions, to readahead a specific list of files.

Another solution is to use the readahead applet that we added to the Busybox toolset (http://busybox.net), which is used in most embedded systems. Thanks to this applet, developers can easily add readahead commands to their startup scripts, without having to compile a standalone tool.

5.3 Implementation constraints and plans

Updates, code, benchmarks, and documentation will be available through our readahead project page [2].

5.3.1 Identifying file access patterns

It is easy to identify files which are accessed during startup, either by using inotify or by checking the atime attribute of files (last access time, when this feature is not disabled at mount time). However, it is much more difficult to trace which sections are accessed in a given file.

When the file is just accessed, not executed, it is still possible to trace the open, read, seek, and close system calls and deduce which parts of each file are accessed. However, this is difficult to implement.²

Anyway, when the file is executed (in the case of a program or a shared library), there doesn't seem to be any userspace interface to keep track of accessed file blocks. This is because demand paging is completely transparent to processes.

That's why we started to implement a kernel patch to log all file reads (at the moment by tracing calls to the vfs_read function), and demand paging activity (by getting information from the filemap_nopage function). This patch also tracks exec system calls, by watching the open_exec function, for reasons that we will explain in the next section.

This patch logs the following pieces of information for each accessed file (a possible C layout for such a record is sketched below):

• inode number,

• device major and minor numbers,

• offset,

• number of bytes read.
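For concreteness, one way to lay out such a record in C is shown here. This struct is purely illustrative: the actual format produced by the experimental patch is not specified in this paper.

#include <stdint.h>

struct file_access_record {
        uint32_t inode;   /* inode number of the accessed file */
        uint16_t major;   /* device major number */
        uint16_t minor;   /* device minor number */
        uint64_t offset;  /* offset of the read within the file */
        uint32_t bytes;   /* number of bytes read */
};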

Code and more details can be found on our project page [2].

At the time of this writing, this patch is just meant to assess the usefulness of reading ahead only the used sections of a file. If this proves to be profitable, a clean, long-term solution will be investigated with the Linux kernel development community.

5.3.2 Postprocessing the file access dump

We are developing a Python script to postprocess the file access information dumped from the kernel.

The main need is to translate inode and device numbers into file paths, as the kernel doesn't know about file names.

This is done by identifying the filesystem the inode belongs to, thanks to the major and minor number information. Then, each filesystem containing one of our files is exhaustively traversed to build a lookup table allowing to find a file path for a given inode.

Of course, this can be very costly, but neither data gathering nor this postprocessing is meant to be run on a production system. This will only be done once during development.

5.3.3 Improving readahead in GNU/Linux distributions

Our first experiment will be to make minor changes to the utilities used in GNU/Linux distributions, so that they can process file lists also specifying which parts to readahead in each file.

5.3.4 Towards a generic and efficient implementation

While preloading the right file sections is easy once file access information is available, another requirement is to perform readahead at the right time in the execution flow. As explained before, reading ahead mustn't happen too early, and mustn't happen too late either.

Once more, the challenge is to predict the future by using knowledge about the past.

A very basic approach would be to collect time information together with file access data. However, such information wouldn't be very useful to trigger readahead at the right time, as reading ahead accelerates time. Furthermore, as processes spend less time waiting for I/O, the exact ordering of process execution can be altered.

Thus, what is needed is a way to follow the progress of system startup, and to match the actual events with recorded ones.

A simple idea is to use inotify to get access notifications for executables involved in system startup, and match these notifications with recorded exec calls.

²Even tracing these system calls is difficult. System call tracing is usually done on a process and its children with the strace command or with the ptrace system call that it uses. The problem is that ptrace cannot be used for the init process, which would have allowed tracing all running processes at once. Another, probably simpler solution would be to use C library interposers, wrappers around the C library functions used to execute system calls.

[Figure 2 shows a timeline (past, present, future) of recorded events: event10, the first execution of /sbin/ifconfig; event11, the second execution of /bin/grep; and event12, the first execution of /usr/bin/dillo. The previous readahead window covers 1 MB of data access recorded after event10; the new readahead window covers 1 MB of data access recorded after event11.]

Figure 2: Proposed readahead implementation

This would be quite easy to implement, as it would just involve a list of files, without having to register recursive directory-based notifications.

As shown in the example in Figure 2, our idea is to manage readahead windows of a given data size. In this example, when event11 is recognized, we create a new readahead window starting from this event, corresponding to 1 MB of recorded disk access starting from this event.

Actually, we would only need to start new readahead I/O from the end of the previous window to the end of the new one. This assumes that the window size is large enough to extend beyond the next event. Otherwise, if readahead windows didn't overlap, there would be parts of the execution flow with no readahead at all.

Within a given window, before firing readahead I/O, we would of course need to remove any duplicate read operations, as well as to merge consecutive ones into single larger ones.
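A minimal sketch of that last step, assuming the recorded read operations within a window have already been sorted by start offset (the read_op structure is invented for the illustration):

#include <stddef.h>

struct read_op {
        long start;  /* first byte of the recorded read */
        long end;    /* one past the last byte */
};

/* Merge in place; returns the new number of operations. */
static size_t merge_reads(struct read_op *ops, size_t n)
{
        if (n == 0)
                return 0;

        size_t out = 0;
        for (size_t i = 1; i < n; i++) {
                if (ops[i].start <= ops[out].end) {
                        /* Overlapping or contiguous: extend the current read. */
                        if (ops[i].end > ops[out].end)
                                ops[out].end = ops[i].end;
                } else {
                        ops[++out] = ops[i]; /* start a new merged read */
                }
        }
        return out + 1;
}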

Here are the advantages of this approach:

• Possibility to readahead the same blocks multiple times in the execution flow. This covers the possibility that these blocks are no longer in the page cache.

• For each specific system, possibility to tune the window size according to the best achieved results.

• The window size could even be dynamically increased, to make sure it goes beyond the next recorded event.

• If the window size is large enough, we expect it to compensate for actual changes in the order of events.

5.3.5 Open issues

Several issues have not been addressed yet in this project.

In particular, we would need a methodology to support package updates in standard distributions. Would file access data harvesting be run again whenever a package involved in system startup is updated? Or should each package carry its own readahead information, requiring a more complex package development process?

5.4 Conclusion

Though the proposed ideas haven't been fully implemented and benchmarked yet, we have identified promising opportunities to reduce system startup time, in a way that meets the requirements of both desktop and embedded Linux systems.

If you are interested in this topic, stay tuned on the project page [2], join the presentation at OLS 2007, discover the first benchmarks on embedded and desktop systems, and share your experience and ideas on accelerating the course of Time.

References

[1] Jonathan Corbet. LWN article: Adaptive file readahead. http://lwn.net/Articles/155510/, October 2005.

[2] Free Electrons. Advanced readahead project. http://free-electrons.com/community/tools/readahead/.

[3] Fengguang Wu. Linux kernel mailing list: Adaptive readahead v16 benchmarks. http://lkml.org/lkml/2006/11/25/7, November 2006.

[4] Linux Manual Pages. madvise(2). http://www.die.net/doc/linux/man/man2/madvise.2.html.

[5] Linux Manual Pages. posix_fadvise(2). http://www.die.net/doc/linux/man/man2/posix_fadvise.2.html.

[6] Linux Manual Pages. readahead(2). http://www.die.net/doc/linux/man/man2/readahead.2.html.

[7] Ram Pai, Badari Pulavarty, and Mingming Cao. Linux 2.6 performance improvement through readahead optimization. In Ottawa Linux Symposium (OLS), 2004. http://www.linuxsymposium.org/proceedings/reprints/Reprint-Pai-OLS2004.pdf.

[8] Wikipedia. inotify. http://en.wikipedia.org/wiki/Inotify.

Semantic Patches

Documenting and Automating Collateral Evolutions in Linux Device Drivers

Yoann Padioleau, EMN, [email protected]
Julia L. Lawall, DIKU, [email protected]
Gilles Muller, EMN, [email protected]

1 Introduction

Device drivers form the glue code between an operating system and its devices. In Linux, device drivers are highly reliant for this on the various Linux internal libraries, which encapsulate generic functionalities related to the various busses and device types. In recent years, these libraries have been evolving rapidly, to address new requirements and improve performance. In response to each evolution, collateral evolutions are often required in driver code, to bring the drivers up to date with the new library API. Currently, collateral evolutions are mostly done manually. The large number of drivers, however, implies that this approach is time-consuming and unreliable, leading to subtle errors when modifications are not done consistently.

To address this problem, we propose a scripting language for specifying and automating collateral evolutions. This language offers a WYSIWYG approach to program transformation. In the spirit of Linux development practice, this language is based on the patch syntax. As opposed to traditional patches, our patches are not line-oriented but semantics-oriented, and hence we give them the name semantic patches.

This paper gives a tutorial on our semantic patch language, SmPL, and its associated transformation tool, spatch. We first give an idea of the kind of program transformations we target, collateral evolutions, and then present SmPL using an example based on Linux driver code. We then present some further examples of evolutions and collateral evolutions that illustrate other issues in semantic patch development. Finally, we describe the current status of our project and propose some future work. Our work is directed mainly to device driver maintainers, library developers, and kernel janitors, but anyone who has ever performed a repetitive editing task on C code can benefit from it.

2 Evolutions and Collateral Evolutions

The evolutions we consider are those that affect a library API. Elements of a library API that can be affected include functions, both those defined by the library and the callback functions that the library expects to receive from a driver, global variables and constants, types, and macros. A library may also implicitly specify rules for using these elements. Many kinds of changes in the API can result from an evolution that affects one of these elements. For example, functions or macros can change name, or gain or lose arguments. Structure types can be reorganized, and accesses to them can be encapsulated in getter and setter functions. The protocol for using a sequence of functions, such as up and down, can change, as can the protocol for when error checking is needed and what kind of error values should be returned.

Each of these changes requires corresponding collateral evolutions, in all drivers using the API. When a function or macro changes name, all callers need to be updated with the new name. When a function or macro gains or loses arguments, new argument values have to be constructed and old ones have to be dropped in the driver code, respectively. When a structure type is reorganized, all drivers accessing affected fields of that structure have to be updated, either to perform the new field references or to use any introduced getter and setter functions. Changes in protocols may require a whole sequence of modifications, to remove the old code and introduce the new. Many of these collateral evolutions can have a non-local effect: for example, changing an argument value may trigger a whole set of new computations, and changing a protocol may require substantial code restructuring. The interaction of a driver with the API may furthermore include some device-specific aspects. Thus, these changes have to be mapped onto the structure of each affected driver file.

We have characterized evolutions and collateral evolutions in more detail, including numerous examples, in a paper at EuroSys 2006 [1].
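To make the notion concrete, here is a small, hedged C sketch of the "function gains an argument" case; send_cmd and its flags value are invented for the illustration and do not correspond to any real kernel API:

#include <stdio.h>

/* Library API before the evolution:
 *     int send_cmd(int cmd);
 * Library API after the evolution: */
static int send_cmd(int cmd, unsigned int flags)
{
        printf("cmd=%d flags=%#x\n", cmd, flags);
        return 0;
}

/* Driver code after the corresponding collateral evolution: every
 * call site must now construct a value for the new argument
 * (here, a default of 0). */
int driver_reset(void)
{
        return send_cmd(42, 0);    /* was: send_cmd(42); */
}

int main(void)
{
        return driver_reset();
}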


3 Semantic Patch Tutorial

In this section, we describe SmPL (Semantic Patch Language), our language for writing semantic patches. To motivate the features of SmPL, we first consider a moderately complex collateral evolution that raises many typical issues. We then present SmPL in terms of this example.

3.1 The "proc_info" evolution

As an example, we consider an evolution and associated collateral evolutions affecting the SCSI API functions scsi_host_hn_get and scsi_host_put. These functions access and release, respectively, a structure of type Scsi_Host, and additionally increment and decrement, respectively, a reference count. In Linux 2.5.71, it was decided that, due to the criticality of the reference count, driver code could not be trusted to use these functions correctly, and they were removed from the SCSI API [2].

 0 --- a/drivers/usb/storage/scsiglue.c
 1 +++ b/drivers/usb/storage/scsiglue.c
 2 @@ -264,33 +300,21 @@
 3 -static int usb_storage_proc_info (
 4 +static int usb_storage_proc_info (struct Scsi_Host *hostptr,
 5          char *buffer, char **start, off_t offset,
 6 -        int hostno, int inout)
 7 +        int inout)
 8  {
 9      struct us_data *us;
10      char *pos = buffer;
11 -    struct Scsi_Host *hostptr;
12      unsigned long f;
13
14      /* if someone is sending us data, just throw it away */
15      if (inout)
16          return offset;
17
18 -    /* find our data from the given hostno */
19 -    hostptr = scsi_host_hn_get(hostno);
20 -    if (!hostptr) {
21 -        return -ESRCH;
22 -    }
23      us = (struct us_data*)hostptr->hostdata[0];
24
25      /* if we couldn't find it, we return an error */
26      if (!us) {
27 -        scsi_host_put(hostptr);
28          return -ESRCH;
29      }
30
31      /* print the controller name */
32 -    SPRINTF("   Host scsi%d: usb-storage\n", hostno);
33 +    SPRINTF("   Host scsi%d: usb-storage\n", hostptr->host_no);
34      /* print product, vendor, and serial number strings */
35      SPRINTF("       Vendor: %s\n", us->vendor);
36
37 @@ -318,9 +342,6 @@
38          *(pos++) = '\n';
39      }
40
41 -    /* release the reference count on this host */
42 -    scsi_host_put(hostptr);
43
44      /*
45       * Calculate start of next buffer, and return value.
46

Figure 1: Simplified excerpt of the patch file from Linux 2.5.70 to Linux 2.5.71

This evolution had collateral effects on the "proc_info" callback functions defined by SCSI drivers, which call these API functions. Figure 1 shows a slightly simplified excerpt of the traditional patch file updating the proc_info function of drivers/usb/storage/scsiglue.c. Similar collateral evolutions were performed in Linux 2.5.71 in 18 other SCSI driver files inside the kernel source tree. To compensate for the removal of scsi_host_hn_get and scsi_host_put, the SCSI library began in Linux 2.5.71 to pass to these callback functions a Scsi_Host-typed structure as an argument. Collateral evolutions were then needed in all the proc_info functions to remove the calls to scsi_host_hn_get (line 19 for the scsiglue.c driver) and scsi_host_put (lines 27 and 42), and to add the new argument (line 4). Those changes in turn entailed the removal of a local variable (line 11) and of null-checking code (lines 20-22), as the library is assumed not to call the proc_info function on a null value. Finally, one of the parameters of the proc_info function was dropped (line 6), and every use of this parameter was replaced by a field access (line 33) on the new structure argument.

Of the possible API changes identified in Section 2, this example illustrates the dropping of two library functions and changes in the parameter list of a callback function. These changes have non-local effects in the driver code, as the context of the dropped call to scsi_host_hn_get must change as well, to eliminate the storage of the result and the subsequent error check, and the value of the dropped parameter must be reconstructed wherever it is used.

3.2 A semantic patch, step by step

We now describe the semantic patch that will perform the previous collateral evolutions, on any of the 19 relevant files inside the kernel source tree, and on any relevant drivers outside the kernel source tree. We first describe step by step various excerpts of this semantic patch, and then present its complete definition in Section 3.3.

3.2.1 Modifiers

The first excerpt adds and removes the affected parameters of the proc_info callback function:

  proc_info_func (
+        struct Scsi_Host *hostptr,
         char *buffer, char **start, off_t offset,
-        int hostno,
         int inout) { ... }

Like a traditional patch, a semantic patch consists of a sequence of lines, some of which begin with the modifiers + and - in the first column. These lines are added or removed, respectively. The remaining lines serve as context, to more precisely identify the code that should be modified.

Unlike a traditional patch, a semantic patch must have the form of a complete C-language term (an expression, a statement, a function definition, etc.). Here we are modifying a function, so the semantic patch has the form of a function definition. Because our only goal at this point is to modify the parameter list, we do not care about the function body. Thus, we have represented it with "...". The meaning and use of "..." are described in more detail in Section 3.2.4.

Because of the wide range of possible collateral evolutions, as described in Section 2, collateral evolutions may affect almost any C construct, such as structures, initializers, function parameters, and if statements, in many different ways. SmPL, for flexibility, allows writing almost any C code in a semantic patch and freely annotating any part of this code with the + and - modifiers. The combination of the unannotated context code with the - code and the combination of the unannotated context code with the + code must, however, each have the form of valid C code, to ensure that the pattern described by the former can match against valid driver code and that the transformation described by the latter will produce valid C code as a result.

Another difference as compared to a traditional patch is that the meaning of a semantic patch is insensitive to newlines, spaces, comments, etc. Thus, the above semantic patch will match and transform driver code that has the parameters of the proc_info function all on the same line, spread over multiple lines as in scsiglue.c, or separated by comments. We have split the semantic patch over four lines only to better highlight what is added and removed. We could have equivalently written it as:

- proc_info_func(char *buffer, char **start, off_t offset,
-                int hostno, int inout)
+ proc_info_func(struct Scsi_Host *hostptr, char *buffer,
+                char **start, off_t offset, int inout)
  { ... }

To apply this semantic patch, it should be stored in a file, e.g., procinfo.spatch. It can then be applied, e.g., to the set of C files in the current directory using our spatch tool:

spatch *.c < procinfo.spatch

3.2.2 Metavariables

A traditional patch, like the one in Figure 1, describes a transformation of a specific set of lines in a specific driver. This specificity is due to the fact that a patch hardcodes some information, such as the name of the driver's proc_info callback function. Thus a separate patch is typically needed for every driver. The goal of SmPL, on the other hand, is to write a generic semantic patch that can transform all the relevant drivers, accommodating the variations among them. In this section and the following ones we describe the features of SmPL that make a semantic patch generic.

In the excerpt of the previous section, the reader may have wondered about the name proc_info_func, which indeed does not match the name of the scsiglue proc_info function, as shown in Figure 1, lines 3 and 4, or the names of any of the proc_info functions in the kernel source tree. Furthermore, the names of the parameters are not necessarily buffer, start, etc.; in particular, the introduced parameter hostptr is sometimes called simply host. To abstract away from these variations, SmPL provides metavariables. A metavariable is a variable that matches an arbitrary term in the driver source code. Metavariables are declared before the patch code specifying the transformation, between two @@s, borrowing the notation for delimiting line numbers in a traditional patch (Figure 1, lines 2 and 37). Metavariables are designated as matching terms of a specific kind, such as an identifier, expression, or statement, or terms of a specific type, such as int or off_t. We call the combination of the declaration of a set of metavariables and a transformation specification a rule.

Back to our running example, the previous excerpt is made into a rule as follows:

@@
identifier proc_info_func;
identifier buffer, start, offset, inout, hostno;
identifier hostptr;
@@
  proc_info_func (
+        struct Scsi_Host *hostptr,
         char *buffer, char **start, off_t offset,
-        int hostno,
         int inout) { ... }

This code now amounts to a complete, valid semantic patch, although it still only performs part of our desired collateral evolution.

3.2.3 Multiple rules and inherited metavariables

The previous excerpt matches and transforms any function with parameters of the specified types. A proc_info function, however, is one that has these properties and interacts with the SCSI library in a specific way, namely by being provided by the driver to the SCSI library on the proc_info field of a SHT structure (for SCSI Host Template), which from the point of view of the SCSI library represents the device. To specify this constraint, we define another rule that identifies any assignment to such a field in the driver file. SmPL allows a semantic patch to define multiple rules, just as a traditional patch contains multiple regions separated by @@. The rules are applied in sequence, with each of them being applied to the entire source code of the driver. In our example we thus define one rule to identify the name of the callback function and another to transform its definition, as follows:

@ rule1 @
struct SHT ops;
identifier proc_info_func;
@@
  ops.proc_info = proc_info_func;

@ rule2 @
identifier rule1.proc_info_func;
identifier buffer, start, offset, inout, hostno;
identifier hostptr;
@@
  proc_info_func (
+        struct Scsi_Host *hostptr,
         char *buffer, char **start, off_t offset,
-        int hostno,
         int inout) { ... }

In the new semantic patch, the metavariable proc_info_func is defined in the first rule and referenced in the second rule, where we expect it to have the same value, which is enforced by spatch. In general, a rule may declare new metavariables and inherit metavariables from previous rules. Inheritance is explicit, in that the inherited metavariable must be declared again in the inheriting rule, and is associated with the name of the rule from which its value should be inherited (the rule name is only used in the metavariable declaration, but not in the transformation specification, which retains the form of ordinary C code). To allow this kind of inheritance, we must have a means of naming rules. As shown in the semantic patch above, the name of a rule is placed between the two @@s at the beginning of a metavariable declaration. A name is optional, and is not needed if the rule does not export any metavariables.

Note that the first rule does not perform any transformations. Instead, its only role is to bind the proc_info_func metavariable to constrain the matching of the second rule. Once a metavariable obtains a value, it keeps this value until the end of the current rule and in any subsequent rules that inherit it. Metavariables thus not only make a semantic patch generic by abstracting away from details of the driver code, but also allow communicating information and constraints from one part of the semantic patch to another, e.g., from '-' code to '+' code, or from one rule to another.

A metavariable may take on multiple values, if the rule matches at multiple places in the driver code. If such a metavariable is inherited, the inheriting rule is applied once for each possible set of bindings of the metavariables it inherits. For example, in our case, a driver may set the proc_info field multiple times, to different functions, in which case rule2 would be applied multiple times, once for the name of each of them.

3.2.4 Sequences

So far, we have only considered the collateral evolutions on the header of the proc_info function. But collateral evolutions are needed in its body as well: deleting the calls to scsi_host_hn_get and scsi_host_put, deleting the local variable holding the result of calling scsi_host_hn_get, and deleting the error-checking code on its value. The affected code fragments are scattered throughout the body of the proc_info function and are separated from each other by arbitrary code specific to each SCSI driver. To abstract away from these irrelevant variations, SmPL provides the "..." operator, which matches any sequence of code. Refining rule2 of the semantic patch to perform these collateral evolutions gives:

@ rule2 @
identifier rule1.proc_info_func;
identifier buffer, start, offset, inout, hostno;
identifier hostptr;
@@
  proc_info_func (
+        struct Scsi_Host *hostptr,
         char *buffer, char **start, off_t offset,
-        int hostno,
         int inout) {
    ...
-   struct Scsi_Host *hostptr;
    ...
-   hostptr = scsi_host_hn_get(hostno);
    ...
-   if (!hostptr) { ... return ...; }
    ...
-   scsi_host_put(hostptr);
    ...
  }

The second rule of the semantic patch now has the 3.2.5 Nested Sequences form of the definition of a function that first contains a declaration of the hostptr variable, then a call to the function scsi_host_hn_get, then an error check, The last transformation concerning the proc_info func- and finally a call to scsi_host_put sometime be- tion is the replacement of every reference to the dropped fore the end. In practice, however, a proc_info func- hostno parameter by a field access. SmPL provides tion may e.g. contain many calls to scsi_host_put, the <...... > operator to perform such universal as illustrated by the scsiglue example (Figure 1, replacements. This operator is analogous to the /g op- lines 27 and 42). Closer inspection of the original erator of Perl. In order to avoid having to consider how scsiglue source code, however, shows that at execu- references to hostno may interleave with the calls to tion time, the driver only executes one or the other of scsi_host_hn_get and scsi_host_put, etc., we these calls to scsi_host_put, as the one on line 27 define a third rule that simply makes this transformation is only executed in an error case, and the one on line everywhere it applies: 42 is only executed in a non-error case. This is illus- trated by Figure 2, which shows part of the control- @ rule3 @ flow graph of the scsiglue proc_info function. Be- identifier rule1.proc_info_func; identifier rule2.hostno; cause the execution pattern declare/scsi_host_hn_ identifier rule2.hostptr; get/error-check/scsi_host_put is what must be fol- @@ proc_info_func(...) { lowed by every SCSI proc_info driver, it is this pattern <... that the semantic patch should match against. The oper- - hostno + hostptr->host_no ator “...” thus matches paths in the control-flow graph ...> rather than an arbitrary block of code in the driver source } 112 • Semantic Patches

[Figure 2 shows a simplified control-flow graph of the scsiglue proc_info function: after hostptr = scsi_host_hn_get(hostno), the if (!hostptr) test leads on one path to return -ESRCH; on the other path, the if (!us) test leads either to scsi_host_put(hostptr) followed by return -ESRCH, or to the SPRINTF calls followed by scsi_host_put(hostptr). The two calls to scsi_host_put thus lie on mutually exclusive paths.]

Figure 2: Simplified control-flow graph for part of Figure 1

Note that the operator "..." can be used to represent any kind of sequence. Here, in the function header, it is used to represent a sequence of parameters. It can also be used to provide flexible matching in initializers and structure definitions.

3.2.6 Isomorphisms

We have already mentioned that a semantic patch is insensitive to spacing, indentation, and comments. Moreover, by defining sequences in terms of control-flow paths, we abstract away from the various ways of sequencing instructions that exist in C code. These features help make a semantic patch generic, allowing the patch developer to specify only a few scenarios, while spatch handles other scenarios that are semantically equivalent.

Other differences that we would like to abstract away from include variations within the use of specific C constructs. For example, if x is any expression that has pointer type, then !x, x == NULL, and NULL == x are all equivalent. For this, we provide a variant of the SmPL syntax for defining isomorphisms, sets of syntactically different terms that have the same semantics. The null pointer-test isomorphism is defined in this variant of SmPL as follows:

// iso file, not a semantic patch
@@ expression *X; @@
X == NULL <=> !X <=> NULL == X

Given this specification, the pattern if(!hostptr) in the semantic patch matches a conditional in the driver code that tests the value of hostptr using any of the listed variants.

In addition to a semantic patch, spatch accepts a file of isomorphisms as an extra argument. A file of isomorphisms is provided with the spatch distribution, which contains 30 equivalences commonly found in driver code. Finally, it is possible to specify that a single rule should use only the isomorphisms in a specific file, by annotating the rule name with using file.

3.3 All Together Now

The complete semantic patch for the proc_info collateral evolutions is shown below. As compared to the rules described above, this semantic patch contains an additional rule, rule4, which adjusts any calls to the proc_info function from within the driver. Note that in this rule, the metavariables that were declared as identifiers in rule2 to represent the parameters of the proc_info function are redeclared as expressions, to represent the proc_info function's arguments.

The second rule has also been slightly modified, in that two lines have been annotated with the "?" operator, stating that those lines may or may not be present in the driver. Indeed, many drivers forget to check the return value of scsi_host_hn_get or forget to release the structure before exiting the function. As previously noted, the latter omission is indeed what motivated the proc_info evolution.

Note that there is no rule for updating the prototype of the proc_info function, if one is contained in the file. When the type of a function changes, spatch automatically updates its prototype, if any.

@ rule1 @
struct SHT ops;
identifier proc_info_func;
@@
  ops.proc_info = proc_info_func;

@ rule2 @
identifier rule1.proc_info_func;
identifier buffer, start, offset, inout, hostno;
identifier hostptr;
@@
  proc_info_func (
+        struct Scsi_Host *hostptr,
         char *buffer, char **start, off_t offset,
-        int hostno,
         int inout) {
    ...
-   struct Scsi_Host *hostptr;
    ...
-   hostptr = scsi_host_hn_get(hostno);
    ...
?-  if (!hostptr) { ... return ...; }
    ...
?-  scsi_host_put(hostptr);
    ...
  }

@ rule3 @
identifier rule1.proc_info_func;
identifier rule2.hostno;
identifier rule2.hostptr;
@@
  proc_info_func(...) {
    <...
-   hostno
+   hostptr->host_no
    ...>
  }

@ rule4 @
identifier rule1.proc_info_func;
identifier func;
expression buffer, start, offset, inout, hostno;
identifier hostptr;
@@
  func(..., struct Scsi_Host *hostptr, ...) {
    <...
    proc_info_func(
+       hostptr,
        buffer, start, offset,
-       hostno,
        inout)
    ...>
  }

On the 30 isomorphisms we have written, 3 of them "apply" to this semantic patch, accommodating many variations among the 19 drivers inside the kernel source tree. We have already mentioned the different ways to write a test such as if(!hostptr) in the previous section. There are also various ways to assign a value to a field, which can be written ops.proc_info = fn as in our semantic patch, or written ops->proc_info = fn in some drivers, or written using a global structure initializer. Indeed, the last case was used for the scsiglue.c driver, as shown by the following excerpt of this driver:

struct SHT usb_stor_host_template = {
        /* basic userland interface stuff */
        .name =      "usb-storage",
        .proc_name = "usb-storage",
        .proc_info = usb_storage_proc_info,
        .proc_dir =  NULL,

Finally, braces are not needed in C code when a branch contains only one statement. So, the pattern if (!hostptr) { ... return ...; } in rule2 also matches a branch containing only the return statement.

It takes 23 seconds for spatch, given the whole semantic patch, to correctly update the 19 relevant drivers. If run on all the 2404 driver files inside the kernel source tree, spatch takes 3 minutes to correctly update the same 19 drivers.

4 More Features, More Examples

So far we have written and tested 49 semantic patches for collateral evolutions found in Linux 2.5 and Linux 2.6. By comparing the results produced by a semantic patch to the results produced by the traditional patch, we have found that spatch updates 92% of the driver files affected by these collateral evolutions correctly. In the remaining cases, there is typically a problem parsing the driver code, or needed information is missing because spatch currently does not parse header files. Parsing the driver code is a particular problem in our case, because our goal is to perform a source-to-source transformation, which means that we have chosen not to expand macros and preprocessor directives, and instead parse them directly.

In this section, we consider some other examples from our test suite, to illustrate some typical issues in semantic patch development.

4.1 Replacing one function name by another

In Linux 2.5.22, the function end_request was given a new first argument, of type struct request *. In practice, the value of this argument should be the next request from one of the driver's queues, as represented by a reference to the macro CURRENT. This collateral evolution affected 27 files spread across the directories acorn, block, cdrom, ide, mtd, s390, and sbus.

The following semantic patch implements this collateral evolution:

@@ expression X; @@
- end_request(X)
+ end_request(CURRENT,X)

This semantic patch updates the 27 affected files in the Linux source tree correctly.

This example may seem almost too simple to be worth writing an explicit specification, as one can, e.g., write a one-line sed command that has the same effect. Nevertheless, such solutions are error-prone: we found that in the file drivers/block/swim_iop.c, the transformation was applied to the function swimiop_send_request, which has no relation to this collateral evolution. We conjecture that this is the result of applying a sed command, or some similar script, that replaces calls to end_request without checking whether this string is part of a more complicated function name. spatch enforces the syntactic structure of semantic patch code, allowing matches on identifier, expression, statement, etc. boundaries, rather than simply accepting anything that is a superstring of the given pattern.

4.2 Collecting scattered information

In Linux 2.5.7, the function video_generic_ioctl, later renamed video_usercopy, was introduced to encapsulate the copying to and from user space required by ioctl functions. Ioctl functions allow the user level to configure and control a device, as they accept commands from the user level and perform the corresponding action at the kernel level. Without video_usercopy, an ioctl function has to use functions such as copy_from_user or get_user to access data passed in with the command, and functions such as copy_to_user or put_user to return information to the user level. With video_usercopy, the ioctl function receives a pointer to a kernel-level data structure containing the user-level arguments and can modify this data structure to return any values to user level.

Making an ioctl function video_usercopy-ready involves the following steps:

• Adding some new parameters to the function.

• Eliminating calls to copy_from_user, put_user, etc.

• Changing the references to the local structure used by these functions to use the pointer prepared by video_usercopy.

The last two points are somewhat complex, because the various commands interpreted by the ioctl function may each have their own requirements with respect to the user-level data. A command may or may not have a user-level argument, and it may or may not return a result to the user level. In the case where there is no use or returned value, then no transformation should be performed; in the other cases, the structure containing the user-level argument or result should be converted to a pointer. Furthermore, there are multiple possible copying functions, and there are multiple forms that the references to the copied data can take.

Figure 3 shows a semantic patch implementing this transformation, under the simplifying assumption that the kernel-level representation of the user-level data is stored in a locally declared structure. This semantic patch consists of a single rule that changes the prototype of this function (adding some new variables, as indicated by fresh identifier), changes the types of the local structures, and removes the copy functions.

 1 @@
 2 identifier ioctl, dev, cmd, arg, v, fld;
 3 fresh identifier inode, file;
 4 expression E, E1, e1,e2,e3;
 5 type T;
 6 @@
 7   ioctl(
 8 -       struct video_device *dev,
 9 +       struct inode *inode, struct file *file,
10         unsigned int cmd, void *arg){
11 +  struct video_device *dev = video_devdata(file);
12    ...
13 (
14 -  T v;
15 +  T *v = arg;
16    ...
17 (
18 -  if (copy_from_user(&v,arg,E)) { ... return ...; }
19 |
20 -  if (get_user(v,(T *)arg)) { ... return ...; }
21 )
22    <...
23 (
24 -  v.fld
25 +  v->fld
26 |
27 -  &v
28 +  v
29 |
30 -  v
31 +  *v
32 )
33    ...>
34 (
35 -  if (copy_to_user(arg,&v,E1)) { ... return ...; }
36 |
37 -  if (put_user(v,(T *)arg)) { ... return ...; }
38 )
39    ...
40 |
41 // a copy of the above pattern with the copy_to_user/put_user
42 // pattern dropped
43 |
44 // a copy of the above pattern with the copy_from_user/get_user
45 // pattern dropped
46 |
47    ... when != \(copy_from_user(e1,e2,e3)\|copy_to_user(e1,e2,e3)
48              \|get_user(e1,e2)\|put_user(e1,e2)\)
49 )
50 }

Figure 3: Semantic patch for the video_usercopy collateral evolution

The many variations in an ioctl function noted above are visible in this rule. To express multiple possibilities, SmPL provides a disjunction operator, which begins with an open parenthesis in column 0, contains a list of possible patterns separated by a vertical bar in column 0, and then ends with a close parenthesis in column 0. Most of the body of the ioctl function pattern is represented as one large disjunction that considers the possibility of there being both a user-level argument and a user-level return value (lines 14-39), the possibility of there being a user-level argument but no user-level return value (elided in comments on lines 41-42), the possibility of there being a user-level return value but no user-level argument (elided in comments on lines 44-45), and there being neither a user-level argument nor a user-level return value (lines 47-48). These possibilities are considered from top to bottom, with only the first one that matches being applied. This strategy is convenient in this case, because, e.g., code using both a user-level argument and a user-level return value also matches all of the other patterns. The last case uses "..." with the construct when. The when construct indicates a pattern that should not be matched anywhere in the code matched by the associated "...".

Within each of the branches of the outermost disjunction, there are several nested disjunctions. First, another disjunction is used to account for the two kinds of copy functions, copy_from_user or get_user. This case does not rely on the top-to-bottom strategy, because the patterns are disjoint. Next, between any copying, there is a nest (see Section 3.2.5) replacing the different variations on how to refer to a structure by the pointer-based counterpart. Here again, the ordering of the disjunction is essential, as the final case, v, should only be used when the variable is not used in a field access or address expression. Finally, there is a third disjunction allowing either copy_to_user or put_user.

Like the proc_info semantic patch, this semantic patch relies on isomorphisms. Specifically, the calls to the copy functions may appear alone in a conditional test as shown, or may be compared to 0, and as in the proc_info case, the return pattern in each of the conditional branches can match a single return statement, without braces.

4.3 Collecting scattered information

In Linux 2.6.20, the strategy for creating work queues and setting and invoking their callback functions changed as follows:

• Previously, all work queues were declared with some variant of INIT_WORK, and could then choose between delayed or undelayed work dynamically, by using either some variant of schedule_work or some variant of schedule_delayed_work. Since the changes in Linux 2.6.20, the choice between delayed or undelayed work has to be made statically, by creating the work queue with either INIT_DELAYED_WORK or INIT_WORK, respectively.

• Previously, creation of a work queue took as arguments a queue, a callback function, and a pointer to the value to be passed as an argument to the callback function. Since 2.6.20, the third argument is dropped, and the callback function is simply passed the work queue as an argument. From this, it can access the local data structure containing the queue, which can itself store whatever information was required by the callback function.
The when construct indicates a pat- this, it can access the local data structure contain- tern that should not be matched anywhere in the code ing the queue, which can itself store whatever in- matched by the associated “...”. formation was required by the callback function.

Within each of the branches of the outermost disjunc- For simplicity, we consider only the case where the work tion, there are several nested disjunctions. First, another queue is created using INIT_WORK, where it is the field disjunction is used to account for the two kinds of copy of a local structure, and where the callback function 116 • Semantic Patches

@ is delayed @ passed to INIT_WORK expects this local structure as an type local type; local type *device; expression E,E1; identifier fld; argument. @@ ( schedule_delayed_work(&device->fld,E) | cancel_delayed_work(&device->fld) Figure 4 shows the semantic patch. In this semantic | schedule_delayed_work_on(E1,&device->fld,E) | queue_delayed_work(E1,&device->fld,E) patch, we name all of the rules, to ease the presen- ) tation, but only those with descriptive names, such as @ rule2 @ is delayed.local type *device; is_delayed, are necessary. identifier is delayed.fld; expression E1; @@ ( The semantic patch is divided into two sections, the first - schedule_work(&device->fld) + schedule_delayed_work(&device->fld,0) for the case where the work queue is somewhere used | - schedule_work_on(E1,&device->fld) with a delaying function such as schedule_delayed_ + schedule_delayed_work_on(E1,&device->fld,0) work | and the second for the case where such a func- - queue_work(E1,&device->fld) + queue_delayed_work(E1,&device->fld,0) tion is not used on the work queue. Both cases can ) occur within a single driver, for different work queues. @ delayed fn @ The choice between these two variants is made at the type T,T1; identifier is delayed.fld, fn; is delayed.local type *device; first rule, is_delayed, using a trick based on metavari- @@ - INIT_WORK(&device->fld,(T)fn,(T1)device); able binding. This rule matches all calls to schedule_ + INIT_DELAYED_WORK(&device->fld, fn); delayed_work and other functions indicating delayed @ rule4 @ type is delayed.local type; identifier is delayed.fld; work, for any work queue &device->fld and arbi- @@ local type { ... trary task E. The next five rules, up to the commented - struct work_struct fld; + struct delayed_work fld; dividing line, refer directly or indirectly to the type ... }; of the structure containing the matched work queue @ rule5 @ &device->fld, and thus these rules are only applied identifier data, delayed fn.fn, is delayed.fld; type T, is delayed.local type; fresh identifier work; to work queues for which the match in is_delayed @@ - fn(void *data){ somewhere succeeds. The remaining three rules, at the + fn(struct work_struct *work){ <... bottom of the semantic patch, do not depend on the rule -(T)data + container_of(work,local type,fld.work) is_delayed, and thus apply to work queues for which ...> there is no call to any delaying function. } @ rule5a @ identifier data, delayed fn.fn, is delayed.fld; In the first half of the semantic patch, the next task is to type is delayed.local type; fresh identifier work; @@ convert any call to a non-delaying work queue function - fn(local type *data){ + fn(struct work_struct *work){ to a delaying one, by adding a delay of 0 (rule2). The + local type *data = container_of(work,local type,fld.work); ... rule delayed_fn then changes calls to INIT_WORK } //------to calls to INIT_DELAYED_WORK and adjusts the argu- @ non delayed fn @ type local type, T,T1; local type *device; identifier fld, fn; ment lists such that the cast on the second argument (the @@ work queue callback function) is dropped and the third - INIT_WORK(&device->fld,(T)fn,(T1)device); + INIT_WORK(&device->fld, fn); argument is dropped completely. Note that the casts @ rule7 @ on the second and third arguments need not be present identifier data, non delayed fn.fn, non delayed fn.fld; type T, non delayed fn.local type; fresh identifier work; in the driver code, thanks to an isomorphism. 
Next @@ - fn(void *data){ (rule4), the work queue is changed from having type + fn(struct work_struct *work){ <... work_struct to having type delayed_work. The last -(T)data + container_of(work,local type,fld) two rules of this section, rule5 and rule5a, update the ...> callback functions identified in the call to INIT_WORK. } rule5 @ rule7a @ The first, , is for the case where the current pa- identifier data, non delayed fn.fn, non delayed fn.fld; void rule5a type non delayed fn.local type; fresh identifier work; rameter type is * and the second, , is for @@ the case where the parameter type is the type of the lo- - fn(local type *data){ + fn(struct work_struct *work){ cal structure containing the work queue. In both cases, + local type *data = container_of(work,local type,fld); ... the parameter is given the type struct work_struct, } and then code using the macro container_of is added to the body of the function to reconstruct the original ar- Figure 4: Semantic patch for the INIT_WORK collateral gument value. evolution 2007 Linux Symposium, Volume Two • 117

In the second section, calls to INIT_WORK for non-delayed work queues have their second and third arguments transformed as in the delayed case. The rules rule7 and rule7a then update the callback functions analogously to rules rule5 and rule5a.

Using the Linux 2.6 git repository [3], we have identified 245 driver files that use work queues. Of these, 45% satisfy the assumptions on which this semantic patch is based. This semantic patch applies correctly to 91% of them. The remaining cases are due to some constructs that are not treated adequately by our approach, to inter-file effects, and to some optimizations made by the programmer that are too special-purpose to be reasonable to add to a generic transformation rule.

5 Conclusion

In this paper we have presented SmPL, our scripting language to automate and document collateral evolutions in Linux device drivers. This language is based on the patch syntax, familiar to Linux developers, but accommodates many variations among drivers. As opposed to a traditional patch, a single semantic patch can update hundreds of drivers at thousands of code sites, because of the features of SmPL, including the use of metavariables, isomorphisms, and control-flow paths, which make a semantic patch generic. We hope that the use of semantic patches will make collateral evolutions in Linux less tedious and more reliable. We also hope that it will help developers with drivers outside the kernel source tree to better cope with the fast evolution of Linux.

Until now we have tried to replay what was already done [4] http://kernelnewbies.org/ by Linux programmers. We would like now to inter- KernelJanitors/Todo. act with the Linux community and really contribute to Linux by implementing or assisting library developers in performing new evolutions and collateral evolutions. As a first step we have subscribed to the janitors ker- nel mailing list and planned to contribute by automating some known janitorings [4]. We would also like to in- vestigate if SmPL could be used to perform collateral evolutions in other Linux subsystems such as filesys- tems or network protocols or to perform other kinds of program transformations.

Finally, introducing semantic patches in the develop- ment process may lead to new processes, or new tools. For instance, how can semantic patches be integrated 118 • Semantic Patches cpuidle—Do nothing, efficiently. . .

Venkatesh Pallipadi, Intel Open Source Technology Center
Adam Belay, Novell, Inc. ([email protected])
Shaohua Li, Intel Open Source Technology Center
{venkatesh.pallipadi|shaohua.li}@intel.com

Abstract

Most of the focus in Linux processor power management today has been on power managing the processor while it is active: cpufreq, which changes the processor frequency and/or voltage and manages the processor performance levels and power consumption based on processor load. Another dimension of processor power management is processor 'idling' power.

Almost all mobile processors in the marketplace today support the concept of multiple processor idle states, with varying amounts of power consumed in those idle states. Each such state will have an entry-exit latency associated with it. In general, a lot of attention is shifting towards idle platform power, and new platforms/processors are supporting multiple idle states with different power and wakeup latency characteristics. This emphasis on idle power, with different processors supporting different numbers of idle states and different ways of entering these states, necessitates a generic Linux kernel framework to manage idle processors.

This paper covers cpuidle, an effort towards a generic processor idle management framework in the Linux kernel. The goal is to have a clean interface for any processor hardware to make use of different processor idle levels, and also to provide abstraction between idle-drivers and idle-governors, allowing independent development of drivers and governors. The target audiences are developers who are keen to experiment with new idle governors on top of cpuidle, developers who want to use the cpuidle driver infrastructure on various architectures, and anyone else who is keen to know about cpuidle.

1 Introduction

Almost all the mobile processors today support multiple idle states, and the trend is spreading as processor power management and system power management gain importance for a variety of reasons.

In typical system usage models, processor(s) spend a lot of their time idling (like while you are reading this paper on your laptop, with your favorite pdf-reader). Thus any power saved when the system is idle will have big returns in terms of battery life, heat generated in the system, need for cooling, etc.

But there is a trade-off between idling power, the amount of state a processor saves, and the amount of time it takes to enter and exit from an idle state. The idle enter-exit latency, if it is too high, may be visible with media applications like a DVD player. Such usage models will limit the usage of a particular idle state on the processor running this application, even though the idle state is power efficient. Similarly, if a processor idle state does not preserve the contents of the processor's cache, some particular application which has some idle time may notice a performance degradation when this particular idle state is used.

In order to manage this trade-off effectively, the kernel needs to know the characteristics of all idle states, should understand the currently running applications, and should make a well-informed decision about what idle state it wants to enter when a processor goes idle.

To do this effectively and cleanly, there is a preliminary requirement of having clean and simple interfaces. Such an interface can provide consistent information to the user and ease the innovation and development in the area of processor idle management.

cpuidle is an effort in this direction and this paper provides insight into cpuidle. We start section 2 with a background on processor power management and idle states. Section 3 provides the design description of cpuidle. Section 4 talks about all the developments and advancements happening in cpuidle, and some conclusions follow in section 5.

2 Background

2.1 Processor Power management

Processor power management can be broadly classified into two classes.

Processor active – various states a processor can be in while actively executing and retiring instructions. Processor frequency scaling, in which a processor can run at different frequencies and/or voltages, falls under this class. So does processor thermal throttling, where the processor runs slower due to duty cycle throttling. Linux cpufreq, extensively discussed in [4], [6], and [5], is a generic infrastructure that handles CPU frequency scaling.

Processor idle – various states a processor can be in while it is idle and not retiring any instructions. The states here differ in the amount of power the processor consumes while in that state and also in the latency to enter-exit this low-power idle state. There may also be other differences, like preserving the processor state across these idle states, etc., based on a specific processor. For example, a processor may only flush the L1 cache in one idle state, but may flush L1 and L2 caches in another idle state. There can also be differences around when an idle state can be entered and what its impact will be on other logical or physical processors in the system.

2.2 Processor idle states

Currently, most of the processors in mobile and hand-held segments support multiple idle states. The prime objective here is to provide a more power-efficient system with longer battery life or fewer cooling requirements. This feature is slowly moving up the chain into desktops and servers, much like processor frequency scaling, which was mostly present in mobile processors a few years back, and today is supported by most servers. Recent EnergyStar idle power regulations [2] are tending to accelerate this, making this feature more common across a range of systems.

2.3 Current Processor idle state support

Below is a short summary of current processor idle state management in Linux 2.6.21 [3].

ACPI based idle states – For the remainder of this section we restrict our attention to idle state support in i386 (and x86-64) architectures. In i386 (and x86-64) architectures, there is support for ACPI-based [1] processor idle states. These states are referred to as C-states in ACPI terminology. Each of the ACPI C-states is characterised by its power consumption and wakeup latency, and also by preservation of the processor state while in this C-state. ACPI-based platforms will report processor idle capability to Linux using ACPI interfaces. A platform can dynamically change the number of C-states supported, based on different platform parameters such as whether it is running on battery or AC power. The current Linux support for such idle states is fully embedded in the drivers/acpi directory along with all ACPI support code. Code here detects the C-states available at boot time, handles any changes to the number of C-states during run time, and has a simplistic policy to choose a particular C-state to enter whenever a CPU goes idle. This code includes various platform-specific bits, specific workarounds for platform ACPI bugs, and also a /proc-based interface exporting the C-state-related information to userspace.

Arch specific idle—i386 and x86-64 – i386 and x86-64 (and also ia64) have some architecture-specific processor idle management that does not depend on ACPI. On i386 and x86-64, it includes support for poll_idle, halt_idle, and mwait_idle. poll_idle is a polling-based idle loop, which is not really power efficient, but will have very little wakeup overhead. halt_idle is based on the x86 hlt instruction, and mwait_idle is based on the monitor mwait pair of instructions. There are specific static rules regarding which of these idle routines will be used on any system, based on boot options and hardware capabilities. Further, boot options across x86 and x86-64 are not the same for these three idle routines.

Arch-specific idle—other architectures – There are various other architectures that have their own code for processor idle state management.

This includes ia64 with PAL halt and PAL light halt, Power with nap and doze modes, and idle support for different platforms in the ARM architecture. Each of these types of support for idle states also comes with its own set of boot parameters and/or /proc or /sys interfaces to user-space.

Bottom line – There is very little sharing of code and sharing of idle management policies across architectures. Processor idle state management, various boot options, etc., are duplicated; this results in code duplication and maintenance overhead. This, as well as the increasing focus on processor idle power in platforms, highlights the need for a generic processor idle framework in the Linux kernel.

3 Basic cpuidle infrastructure

Figure 1 gives a high-level overview of the cpuidle architecture. The basic idea behind cpuidle is to separate the idle state management policies from hardware-specific idle state drivers. At this level, the cpuidle model has similarities with cpufreq [6].

[Figure 1: cpuidle overview. The figure shows user-level interfaces at /sys/devices/system/cpu/cpuX/cpuidle and /sys/devices/system/cpu/cpuidle, on top of the governors (ladder, menu), the generic cpuidle infrastructure, and the drivers: acpi-cpuidle (via the ACPI processor driver), halt_idle, and other arch/platform specific drivers.]

3.1 cpuidle core

The cpuidle core provides a set of generic interfaces to manage processor idle features.

3.1.1 cpuidle data structures

A per-cpu cpuidle_device structure holds information about the number of idle states supported by each processor, information about each of those idle states (in an array of cpuidle_state structs), and the status of this device, among other things.

cpuidle_state is a structure that contains information about each individual state: power usage, exit latency, usage statistics of the state, etc.

The cpuidle core maintains separate linked lists of all registered drivers, all registered governors, and all detected devices. cpuidle_lock is the lone mutex that handles all SMP orderings within cpuidle.
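As a rough sketch only (the field names here are illustrative assumptions, not the kernel's exact definitions), the per-cpu device just described can be pictured as:

    /*
     * Illustrative sketch of the per-cpu device structure described
     * above; field names are assumptions, not the exact kernel layout.
     */
    struct cpuidle_device {
            int                  state_count;                /* number of idle states        */
            struct cpuidle_state states[CPUIDLE_STATE_MAX];  /* per-state info and statistics */
            unsigned int         status;                     /* device status                 */
            struct list_head     device_list;                /* link into the core's list     */
    };

One such device exists per processor, which is what lets the core keep per-CPU statistics even though, as described below, a single system-wide driver and governor service all processors.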

3.1.2 Initialization and Registration

Drivers can register and unregister with the cpuidle core using cpuidle_register_driver and cpuidle_unregister_driver. Governors can register and unregister using cpuidle_register_governor and cpuidle_unregister_governor. Each cpu device gets detected on the add_device callback of cpu_sysdev. If there is a currently active governor and active driver, then the device gets initialized with that governor and driver.

3.1.3 Idle handling

The cpuidle core has an idle handler, cpuidle_idle_call(), that gets plugged into the architecture-independent pm_idle function pointer and will be used by each individual processor when it goes idle. Just before going into idle, the governor selects the best idle state to go into, and then cpuidle invokes the entry point for that particular state in the cpuidle driver. On returning from that state, there is an optional governor callback for the governor to capture information about idle state residency.
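In outline, the idle path just described looks something like the following sketch. This is simplified and partly hypothetical: the per-cpu lookup, the cpuidle_curr_governor pointer, and the states[] indexing are assumptions, and the real handler also deals with locking and various corner cases.

    /* Simplified, partly hypothetical sketch of the idle path. */
    static void cpuidle_idle_call(void)
    {
            struct cpuidle_device *dev = __get_cpu_var(cpuidle_devices);
            struct cpuidle_state *target;
            int next;

            /* Governor picks the best state for the coming idle period. */
            next = cpuidle_curr_governor->select(dev);
            target = &dev->states[next];

            /* Enter the state through the driver-supplied callback. */
            target->enter(dev, target);

            /* Optional governor callback to record idle residency. */
            if (cpuidle_curr_governor->reflect)
                    cpuidle_curr_governor->reflect(dev);
    }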

3.1.4 Handling system state change

The number and type of idle states can vary dynamically based on a given system state, like battery- or AC-powered, etc. Such a system state change notification goes to the idle driver, which will invoke cpuidle_force_redetect() in the cpuidle core. This results in the idle handler being temporarily uninstalled and the idle states being re-detected by the driver, followed by re-initialization of the governor state to take note of this change.

3.2 Design guidelines

There were a few conscious design decisions/trade-offs in cpuidle.

3.2.1 cpu_idle_wait

To make sure we do not take a lock during the normal idle routine entry-exit, and to be able to safely change the governor/driver at run time, cpu_idle_wait was used. Note that changing of drivers/governors is an uncommon event which will not be performance-sensitive.

3.2.2 System-level governor and driver

Should cpuidle support a single driver and single governor for the whole system, or should they be per-cpu? Considering the advantage of keeping things simple with a system-level governor and driver over the usage of a per-cpu-based governor and driver, it was decided to have a single system-level governor and driver.

3.2.3 No cpu_hotplug_lock in cpuidle

Learning from the experiences of cpufreq and cpu_hotplug_lock, cpuidle avoids using cpu_hotplug_lock in the entire subsystem. This in fact resulted in a cleaner self-contained SMP and hotplug synchronization model for cpuidle.

3.2.4 Runtime governor/driver switching

Even though runtime switching of the governor and driver can result in potential wrong usages by end-users, cpuidle supports runtime switching of the governor or driver, mostly to help developers and testers of cpuidle. In the future, this switching of driver and governor can be disabled by default, in order to avoid incorrect usage.

3.3 Driver interface

cpuidle_register_driver uses a structure that defines the cpuidle driver interface:

    struct cpuidle_driver {
            char name[CPUIDLE_NAME_LEN];
            struct list_head driver_list;

            int (*init) (struct cpuidle_device *dev);
            void (*exit) (struct cpuidle_device *dev);
            int (*redetect) (struct cpuidle_device *dev);

            int (*bm_check) (void);

            struct module *owner;
    };

init() is a callback, called by cpuidle, to initialize each device in the system with this specific driver. exit() is called to exit this particular driver for each device. The redetect() callback is used to re-detect the device states on certain system state changes. bm_check() is used to note the bus mastering status on the device. In init(), the driver has to initialize all the states for the particular device and handle the total state count for that device.

    struct cpuidle_state {
            char name[CPUIDLE_NAME_LEN];
            void *driver_data;

            unsigned int flags;
            unsigned int exit_latency;     /* in US */
            unsigned int power_usage;      /* in mW */
            unsigned int target_residency; /* in US */

            unsigned int usage;
            unsigned int time;             /* in US */

            int (*enter) (struct cpuidle_device *dev,
                          struct cpuidle_state *state);

            struct kobject kobj;
    };

enter() is the callback used to actually enter this idle state. exit_latency and power_usage will be characteristic of the idle state. flags denote generic capabilities, features, and bugs of the idle state. usage is the count of times this idle state is invoked, and time is the time spent in this state.

cpuidle_register_driver() and cpuidle_unregister_driver() are used to register and unregister (respectively) a driver with cpuidle. cpuidle_force_detect() is used by the driver to force the cpuidle core to re-detect all the device states (e.g., after a system state change).

3.4 Governor interface

    struct cpuidle_governor {
            char name[CPUIDLE_NAME_LEN];
            struct list_head governor_list;

            int (*init) (struct cpuidle_device *dev);
            void (*exit) (struct cpuidle_device *dev);
            void (*scan) (struct cpuidle_device *dev);

            int (*select) (struct cpuidle_device *dev);
            void (*reflect) (struct cpuidle_device *dev);

            struct module *owner;
    };

init() is a callback, called by cpuidle, to initialize each governor with a specific device. exit() is called to exit this governor for a device.

scan() is called on a re-detect of the states in the device. This provides an opportunity for the governor to note the changes in states during a driver re-detect.

select() is called before each idle entry by a device, for the governor to make a state selection for the idle call. reflect() is called after an idle exit, for the governor to capture information about idle state residency. Note that time spent in the governor's reflect() is in the critical path (on exit from idle, before starting the work) and hence has to be fast.

cpuidle_register_governor() and cpuidle_unregister_governor() are used to register and unregister (respectively) a governor with cpuidle. cpuidle_get_bm_activity() gets the information about bm activity, which can be used by the governor during its select routine.
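As a concrete illustration of this interface, here is a deliberately naive, hypothetical governor that always chooses the deepest state. The state_count field of cpuidle_device is an assumption, and a real governor would of course weigh exit_latency, expected sleep time, and bm activity before choosing:

    /* A naive, hypothetical governor: always pick the deepest state. */
    static int deepest_select(struct cpuidle_device *dev)
    {
            return dev->state_count - 1;  /* assumed field: index of deepest state */
    }

    static struct cpuidle_governor deepest_governor = {
            .name   = "deepest",
            .select = deepest_select,
            .owner  = THIS_MODULE,
    };

    static int __init deepest_governor_init(void)
    {
            return cpuidle_register_governor(&deepest_governor);
    }

The ladder and menu governors described in section 4.1 are what such a skeleton looks like once latency and residency heuristics are added.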

3.5 Userspace interface

cpuidle userspace interfaces are split across the following two places in /sys.

3.5.1 System-generic information

This information is under /sys/devices/system/cpu/cpuidle/.

available_drivers is a read-only interface that lists all the drivers that have successfully registered with cpuidle.

current_driver is a read-write interface that contains the current active cpuidle driver. By writing a new value to this interface, the idle driver can be changed at run time.

available_governors is a read-only interface that lists all the governors that have successfully registered with cpuidle.

current_governor is a read-write interface that contains the current active cpuidle governor. By writing a new value to this interface, the idle governor can be changed at run time.

Note there can be a single governor and a single driver for all processors in the system.

3.5.2 Per-cpu information

This information is under /sys/devices/system/cpu/cpuX/cpuidle/ where X=0,1,2,.... For each idle state Y supported by the current driver, the following read-only information can be seen under sysfs.

stateY/usage: Shows the count of the number of times this idle state has been entered since the last driver init or redetect.

stateY/time: Shows the amount of time spent in this idle state, in uS.

stateY/latency: Shows the wakeup latency for this state.

stateY/power: Shows the typical power consumed when the CPU enters this state, in mW.

3.6 Configuring and using cpuidle

To configure cpuidle, select:

    Main Kernel Config
      Power management options (ACPI, APM) --->
        CPU idle PM support --->
          [ ] CPU idle PM support

Once CPU idle PM is selected, there will be further options for various governors supported in the kernel, which can then be selected.

    <*> 'ladder' governor (NEW)
    <*> 'menu' governor (NEW)

Currently cpuidle is supported only on i386 and x86-64, with an ACPI-based idle driver.

4 cpuidle advancements

The current cpuidle changes are the beginning of things to come. There are a few things under development and discussion.

4.1 New governors

The ladder governor takes a step-wise approach to selecting an idle state. Although this works fine with periodic tick-based kernels, this step-wise model will not work very well with tickless kernels. The kernel can go idle for a long time without a periodic timer tick, and it may not get a chance to step down the ladder to the deep idle state whenever it goes idle.

A new idle governor to handle this, called the menu governor, is being worked on. The menu governor looks at different parameters like what the expected sleep time is (as seen by dyntick), latency requirements, previous C-state residency, max_cstate requirement, and bm activity, etc., and then picks the deepest possible idle state straight away. This governor aims at getting the maximum possible power advantage with little impact on performance.

4.2 Power data

Power/Performance data with various idle policies will be provided at the time of presentation of this paper.

4.3 Future Work

Below are some of the items from the cpuidle to-do list. The list is not exhaustive. Specifically, if you don't find your favorite architecture mentioned here and you would like to use cpuidle on your architecture, let the authors of this paper know about it.

Today, CPU logical offline does not take the CPU to its deepest idle state. There are thoughts about using cpuidle to enter the deepest idle state when a CPU is logically offlined.

cpuidle needs to be more flexible with regard to the different non-ACPI-based idle drivers supported, and also to support run-time switching across these drivers.

Make cpuidle simple by default, and make it use the right driver and right governor for a platform by using a rating scheme for drivers and governors. This will avoid all the issues with users/distributions needing to configure cpuidle at every boot.

Experiment with different governors to find the most power/performance efficient governor for specific platforms. This will be an ongoing exercise as more platforms support multiple idle states and use the cpuidle infrastructure.

5 Conclusion

The authors hope that the cpuidle infrastructure enables Linux to have a platform-independent, generic infrastructure for processor idle management. Such an infrastructure will simplify support of idle states on specific hardware by making it possible to write a simple plug-in driver. Additionally, such an infrastructure will simplify the writing of idle governors, and hopefully will increase experimentation and innovation in idle governors—something similar to the frequency governors that resulted from the cpufreq infrastructure.

6 Acknowledgements

Thanks to the developers and testers in the community who took time to comment on, report issues with, and contribute to cpuidle in various ways. Special thanks to Len Brown for providing the feedback, directions, and constant support.

References

[1] ACPI in Linux. http://acpi.sourceforge.net.

[2] Energy Star – Office Equipment – Computers. http://www.energystar.gov.

[3] Linux 2.6.21. http://www.kernel.org.

[4] Linux kernel cpufreq subsystem. http://www.kernel.org/pub/linux/utils/kernel/cpufreq/cpufreq.html.

[5] Dominik Brodowski. Current Trend in Linux Kernel Power Management. LinuxTag 2005. http://www.free-it.de/archiv/talks_2005/paper-11017/paper-11017.pdf.

[6] Venkatesh Pallipadi and Alexey Starikovskiy. The Ondemand Governor. OLS 2006. http://www.linuxsymposium.org/2006/linuxsymposium_procv2.pdf.

This paper is © 2007 by Intel. Redistribution rights are granted per submission guidelines; all other rights reserved. *Other names and brands may be claimed as the property of others.

My bandwidth is wider than yours

Ultra Wideband, Wireless USB, and WiNET in Linux*

Iñaky Pérez-González, Open Source Technology Center, Intel Corporation ([email protected])

Abstract

Imagine a radio technology that gives you 480 Mbps at short range. Imagine that you don't have to keep all those cables around to connect your gadgets. Make it low-power, too. You even want to be able to stream content from your super-duper cell phone to your way-too-many-inches flat TV. Top it off, make it an open standard.

It's not just a dream: we have it, and it is called Ultra-Wide-Band, and it comes with many toppings of your choice (Wireless USB and WiNET; Bluetooth and 1394 following) to help remove almost every data cable that makes your day miserable. And Linux already supports it.

1 What is all this?

UWB is a high speed, short range radio technology intended to be the common backbone for higher level protocols. It aims to replace most of the data cables in desktop systems, home theaters, and other kinds of PAN-like interconnects. It has been defined by the WiMedia consortium after a long fight in IEEE over the underlying implementation; it is now ECMA-368 and is back in IEEE for further standardization.

It provides all the blocks (delivery of payloads, neighborhood and bandwidth management, encryption support, {broad,multi,uni}cast, etc.) needed by higher level protocols to build on top without any central infrastructure.

Wireless USB sits on top of UWB, where it allocates bandwidth and establishes a virtual cable; WUSB devices connect to the host in the same master-slave fashion as wired USB. The security of the cable is replaced with strong encryption and an authentication process to rule out snooping and man-in-the-middle attacks. Backwards compatibility is maintained, so we can reuse all of our drivers with slight modifications in the core host stack.

WiNET snaps an Ethernet frame over a UWB payload and adds a few protocols for bridging; devices cluster in different WiNET networks (similar to WIFI ad-hoc) and it also includes authentication and strong encryption.

Other high level protocols can build on top of UWB (Bluetooth 3.0 is planning to do so, for example).

2 Ultra Wide Band

This protocol is designed to be low-power (as in little usage and efficient), with good facilities for QoS and streaming (mainly centered on audio and video), and to provide strong cryptography on the transport to compensate for the open medium.

[Figure 1: UWB's spectrum usage]

Ultra-Wide-Band operates over the unlicensed 3.1 to 10.6 GHz band, transferring at data rates from 53 Mbps to 480 Mbps (the defined rates are 53.3, 80, 106.7, 160, 200, 320, 400, and 480 Mbps, not all mandatory); the high rates reach up to 3 meters; the lower rates, all the way to 10 meters. The spectrum is split into fourteen 528 MHz bands, which are grouped into five band groups (channels), composed of three bands each except the last one, which has two. Data is encoded MB-OFDM (Multiband Orthogonal Frequency Division Modulation) over 122 sub-carriers (100 data, 10 guard, 12 pilot).

The power emission is as low as the maximum specified in the FCC Part 15 limit for interference: −41 dBm/MHz (0.074 µW/MHz), the practical radiated power being about 100 µW/band (−10 dBm). This is more or less three thousand times less than a cell phone (roughly calculated, about five orders of magnitude less than WIFI), and allows UWB to appear as noise to other devices.

Time is divided in superframes (see Figure 2), composed of 256 media allocation slots (MAS) of 256 µs each; thus a superframe is about 65 ms (256 × 256 µs = 65536 µs). The MAS is the basic bandwidth allocation unit.

[Figure 2: Division of time in UWB (credit: ECMA-368 Fig 3)]

The superframe starts with the beacon period, which is divided into 85 µs beacon slots (96 maximum, about 32 MAS slots). The first beacon slots are used for signalling; when a new device wants to join, it senses for a slot to be empty and transmits its beacon there until it is assigned another empty slot.

It is important to note that devices don't have a common concept of the start of the superframe. There might be an offset, and devices coordinate among themselves to synchronize on a common start of superframe. That is called a beacon group: a group of devices that beacon during a shared beacon period at the beginning of the same superframe. This becomes a complication for a device B that can hear beacons from A and C without A and C being able to listen to each other.

[Figure 3: Hidden neighbours]

If A and B are beaconing in the same beacon group and C gets in range of B (Figure 3), C might be beaconing at a time different from that of A and B (especially if it has a beacon group formed with device D, for example). B recognizes C's beacon as an alien beacon and tells A about which slots C is using; thus A and B don't try to use those slots to transmit (as their transmission would get mixed with C's).

The rules for use of the media are extremely simple. A device might only transmit:

• Its beacon, during the beacon slot assigned to it.

• Data, during the MAS slots reserved by it. This is called the Distributed Reservation Protocol (DRP), basically a TDMA model. It involves a negotiation among the devices over which devices own which MAS slots. Each device can define a maximum of eight static streams (or allocations) with this method. They are fixed (as in reserved bandwidth) until dropped.

• Data, when the media is not being used. Called Prioritized Contention Access (PCA), it is a CSMA technology: sense the carrier and, if empty, transmit. Eight different priority levels are defined, for which devices contend based on the prioritization they give to their data.

UWB aims to provide a secure enough media, tamper- and snoop-proof. It is implemented using AES-128/CCM, one-time pads, and 4-way handshakes for devising pair-wise and group-wise temporal keys (secret establishment/authentication to avoid man-in-the-middle attacks is left to the higher level protocols).

Consideration is given to power saving. Devices can switch off their radio until they have to transmit/receive (beacon or data), even advertising to each other that they are suspending beaconing for an amount of time. Transmission rates and emission power can be adjusted to decrease consumption of energy, for example, for devices in close proximity.

In general, UWB becomes a flexible low-level protocol to build on top of. It offers enough flexibility for all kinds of media and data, and almost no restrictions in mobility (other than range) and usage models.

3 WiNET: IP over UWB

WiNET slaps an Ethernet payload over a UWB frame to achieve the same functionality that is possible with Ethernet: IP, bridging, etc.

In concept, it is very similar to WIFI, except for the short range (10 m max; a brick wall will stop it short, which is an advantage in many usage models) and the nonexistence of access points (all is ad-hoc). It is faster and more power-efficient for PAN usage models.

WiNET-capable devices group in WiNET Service Sets (WSSs), roughly equivalent to the ESSID in WIFI terms. Devices may belong to more than one at the same time.

Security is provided using UWB's framework (AES-128/CCM, 4-way handshakes to generate pair- and group-wise keys, one-time pads), which provides data integrity and privacy; there are association methods to avoid man-in-the-middle attacks when establishing a trust relationship (using numeric comparison or simple password comparison).

QoS is achieved by reserving fixed point-to-point streams allocated with the UWB Distributed Reservation Protocol (for example, the bandwidth required to stream audio to the living room speakers is known ahead of time) or by mapping IP traffic prioritization onto the UWB Prioritized Contention Access traffic levels.

As well, 802.1D bridging services are provided (see Figure 4) to allow different WiNET segments to be bridged. This is useful, for example, for providing wireless network connectivity in high-density urban dwellings, where the high number of apartments would make a WIFI access point in each unfeasible. Under this model, an Ethernet backbone connects WiNET access points that mobile devices connect to while roaming through the apartment without losing connectivity.

[Figure 4: Full WiNET topology]

4 Wireless USB

Wireless USB by itself is very simple: remove the cable from USB 2.0 and put in its place a UWB radio. The rest (at the high level) remains the same; with a few modifications to the USB core stack we can (in theory) reuse most of our already written drivers for mass storage, video, audio, etc.

WUSB still maintains the master/slave model of the wired version (even if UWB is peer to peer). The WUSB host creates a static bandwidth allocation with the Distributed Reservation Protocol, and in there it creates a WUSB Channel. Devices that connect to that channel define (along with the host) a WUSB Cluster.

[Figure 5: A WUSB Channel (credit: WUSB1.0 Fig 4-4)]

At the beginning of each allocated period of time, the host emits an MMC (Microscheduled Management Command). It is a data structure composed of information elements (IEs) that specifies the length of the allocated period and which devices can transmit and when, gives time for devices to query the host (device notifications, DNs; unlike wired USB, in WUSB devices can initiate transactions to the host), and provides a link to the next MMC, which prefixes another allocation period.

This model allows WUSB devices to be simplified, as they don't have to understand (unless desired) UWB, beaconing, DRP, or PCA. They just look for MMCs and follow the links, getting their I/O control information from them.

Cable-based concepts, such as reset, connect, and disconnect, are implemented via signalling in the device notification time slots (for device to host) and in the MMC information elements (host to device).

4.1 Wireless USB security

WUSB security builds on top of UWB's security framework (AES-128/CCM, 4-way handshakes, pair- and group-wise keys, one-time pads).

The trust relationships are established via a Connection Context (CC), which is composed of a Connection Host ID (CHID), Connection Device ID (CDID), and Connection Key (CK). The CHID uniquely identifies the host, the CDID the device, and the CK is the shared secret. Both host and device keep the same CC as proof of trust.

Establishing the secure connection is done via the 4-way handshake process. When a device is directed by the user to connect to the host, it looks for the CHID broadcasted by it, and then looks up in its internal CC tables the CDID it was assigned by that host. Then host and device prove to each other that they have the CK without actually exchanging it, and derive a pair-wise temporal key. The host generates a new group key and issues it to all devices, including the new one. As well, all keys expire after a certain number of messages have been encoded with them, at which time new ones are renegotiated using another 4-way handshake.

When there is no Connection Context established, WUSB specifies methods for the authentication/pairing process:

• Cable Based Association: For devices that can connect with a cable, it is used to establish trust by transmitting the connection context using a Cable-Based Association Framework [USB] interface.

• Numeric Association: Use Diffie-Hellman to create a temporary secure channel and avoid man-in-the-middle attacks by having the device and the host present a short (two to four digit) number to the user. If they match, the user confirms the pairing. Limited range and explicit user conditioning make up for the lack of strength in a four-digit decimal hash.
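Concretely, the Connection Context described above is just a triple of values. The sysfs example in section 6.2 shows the CHID as a 16-byte value; assuming the same size for the CDID and CK, a stored context can be sketched as:

    #include <stdint.h>

    /*
     * Hypothetical in-memory view of a stored Connection Context;
     * the 16-byte sizes for the CDID and CK are assumptions.
     */
    struct wusb_connection_context {
            uint8_t chid[16];  /* Connection Host ID: identifies the host   */
            uint8_t cdid[16];  /* Connection Device ID: assigned per device */
            uint8_t ck[16];    /* Connection Key: the shared secret         */
    };

A device can hold several such records, one per host it has been paired with, which is what makes the roaming described next possible.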

Devices can keep more than one Connection Context in non-volatile memory, so that there is no need to re-authenticate when moving a device from one host to another.

5 The hardware

Hardware comes in the shape of a UWB Radio Controller (RC) with support on top for the higher level protocols. There is specialization, however: a low-power device might sacrifice some functionality to save power; a PC-side controller would offer full support.

We will concentrate mostly on the host side, as we just want to use the devices.

5.1 USB dongles: HWAs or Host Wire Adapters

Defined by the Wireless USB specification, HWAs consist of a UWB radio controller and a WUSB host controller, all connected via USB to the host system. Other extra interfaces are possible (for example, for WiNET).

WUSB traffic to/from WUSB devices is piped through wired USB. Imagine a USB controller connected via USB instead of PCI.

Its main intent is to enable legacy systems to get seamless UWB and WUSB connectivity. The drawback is the high overhead of piping USB traffic over USB.

5.2 Wireless USB hubs: DWAs or Device Wire Adapters

Defined by the Wireless USB specification, a DWA is composed of a hub for USB wired devices whose upstream connection is Wireless USB. Similarly to the HWA, think of a USB host controller that is connected to the system via Wireless USB, not PCI.

This is intended to connect your wired devices to your Wireless-USB-enabled laptop; as well, legacy applications will use much of this. Take an existing USB chipset for some kind of device, put a DWA adapter in front, and suddenly your device is "wireless" (and yes, if you connect twenty of these, you'll have twenty USB hosts in your machine).

It has the same drawbacks as HWA regarding performance.

5.3 PCI (and friends) connected adapters: WHCI

Defined by the Wireless Host Controller Interface, the brother of EHCI. It is a Wireless USB host controller plugged straight into the PCI bus, which, as with HWA, might contain other interfaces (again, such as a WiNET interface).

This is intended for new systems or those where a PCI or mini-PCI card can be deployed easily. It gives the best performance, as there is no extra overhead for delivering the final data to its destination (as in HWA/DWA).

Exercise for the reader: what happens when you connect a WUSB printer to a HWA, the HWA to a DWA, and the DWA to a WHCI, finally to your PC?

6 Linux support

Full-featured Linux support for UWB, Wireless USB, and WiNET will consist of:

• A UWB stack to provide radio neighborhood and bandwidth management.

• Drivers for HWA (USB dongle) and WHCI (PCI) UWB Radio Controllers plugging into the UWB stack.

• A Wireless USB stack providing the two abstractions used by the three different kinds of host controllers (Wireless USB Host Controller and Wire Adapter). It also includes security, authorization/pairing management, and seamless integration into the main USB stack.

• Drivers for HWA (USB dongle) and WHCI (PCI) Wireless USB host controllers, as well as for the DWA (Wireless USB hub) host controller.

• A driver for WUSB Cable-Based Association, as well as support for WUSB numeric association.

• WiNET drivers: our efforts are also to implement, using the Intel® Wireless UWB Link 1480, network device drivers for the USB and PCI form factors.

This adds up to two stacks and six drivers; eight, if we count the WiNET drivers (Figure 6).

[Figure 6: Linux's support for UWB/WUSB/WiNET]

6.1 Current status

In general, the driver set is quite stable and usable with the available hardware (which is very scarce).

The Linux UWB stack has implemented most of the basic management features, allowing the discovery of remote devices, and a Distributed Reservation Protocol bandwidth negotiator. It can work with devices implementing WUSB 1.0 and WHCI 0.95 (both specs are mostly identical; however, as WHCI is developing on aspects that were not yet known when WUSB 1.0 was finalized, errata will be issued to correct them). Radio control drivers have been implemented for HWA (hwa-rc) and WHCI (whc-rc).

As of April 2007, only the HWA Wireless USB host controller is implemented, in a limited fashion; only control, bulk, and interrupt transfers work. However, it is possible to use an AL-4500 evaluation device made by Alereon. Authorization/pairing is being implemented, making the low-level Cable-Based Association driver complete (user space glue is now done manually).

We have also developed drivers for the Intel 1480 Wireless UWB Link WiNET interface, which are quite stable as of now.

In general, there is still a lot of work to be done. The UWB stack still needs to handle a lot of complex situations, like alien beacons, suspend and resume handling in the UWB media, selection of ideal rate and power transmission parameters, and fine tuning of the bandwidth allocator. The Wireless USB stack requires a complete transfer scheduler and an implementation of isochronous support. The driver for DWA needs to be put together (with pieces from HWA), and a driver for the WHCI WUSB host controller has to be created.

6.2 Sample usage scenarios

All interaction between the user and the local radio controllers happens through sysfs:

    # cd /sys
    # ls -N bus/uwb/devices/
    uwb0/

    # ls -N class/uwb_rc/
    uwb0/

• Driver for WUSB Cable-Based Association, as device made by Alereon. Authorization/pairing is being well as support for WUSB numeric association. implemented, making the low-level Cable-Based Asso- ciation driver complete (user space glue is now done • WiNET drivers: Our efforts are also to implement, manually). using the Intel R Wireless UWB Link 1480, net- work device drivers for the USB and PCI form fac- We have also developed drivers for the Intel 1480 Wire- tors. less UWB Link WiNET interface, which are quite stable as of now. This adds up to two stacks and six drivers; eight, if we In general, there is still a lot of work to be done. The count the WiNET drivers (Figure 6). UWB stack still needs to handle a lot of complex sit- uations, like alien beacons, suspend and resume han- 6.1 Current status dling in the UWB media, selection of ideal rate and power transmission parameters, and fine tuning of the In general, the driver set is quite stable and usable with bandwidth allocator. The Wireless USB stack requires the available hardware (which is very scarce). a complete transfer scheduler and implementation of isochronous support. The driver for DWA needs to be The Linux UWB stack has implemented most of the put together (with pieces from HWA) and a driver for basic management features, allowing the discovery of the WHCI WUSB host controller has to be created. remote devices and a Distributed Reservation Protocol bandwidth negotiator. It can work with devices imple- menting WUSB 1.0 and WHCI 0.95.11 Radio control 6.2 Sample usage scenarios drivers have been implemented for HWA (hwa-rc) and WHCI (whc-rc). All interaction between the user and the local radio con- trollers happens through sysfs: As of April 2007, only the HWA Wireless USB host controller is implemented, in a limited fashion; only control, bulk, and interrupt transfers work. However, # cd /sys it is possible to use an AL-4500 evaluation # ls -N bus/uwb/devices/ uwb0/ 11Both specs are mostly identical; however, as WHCI is develop- ing on aspects that were not yet known when WUSB 1.0 was final- # ls -N class/uwb_rc/ ized, errata will be issued to correct them. uwb0/ 2007 Linux Symposium, Volume Two • 133

The radio is kept off by default; we could start beacon- Connecting Wireless USB devices ing to announce ourselves to others in one of the sup- ported channels12 on station A: If we had any Wireless USB devices, we would have to tell the WUSB controller to create a WUSB Channel; assuming controller usb6 is the one corresponding to A# cd /sys/class/uwb_rc our radio controller: A# echo 13 0 > uwb0/beacon # cd /sys/class/usb_host/usb_host6 If now a user starts scanning on station B: # echo <16 byte CHID> \ 0001 \ mylinux-wusb-01 \ B# echo 13 0 > uwb0/scan > wusb_chid

It will find station A’s beacon and will be announced by With this we have given a 16-byte CHID to the driver, the kernel; entries will be created in sysfs: which has configured it into the host so it broadcasts a WUSB channel named mylinux-wusb-01. Now devices that have been paired with this CHID before will B# tail /var/log/kern.log ... request connection to the host when we ask them to (for uwb-rc uwb0: uwb device \ now we allow all of them to connect): (mac 00:14:a5:cb:6f:54 dev f8:5c)\ connected to usb 1-4.4:1.0 new variable speed Wireless USB \ ... device using hwa-hc and address 2 B# ls -N /sys/bus/uwb/devices/ uwb0/ f8:5c/ From now on, this behaves like yet another USB device. There are still no controls for selecting rate or power.

This indicates that a remote UWB device with address If no transactions are done to the device or the device f8:5c has been detected. At this point, the devices doesn’t ping back to the host’s keep-alives, then it will are not connected, but just B is listening for A’s beacon. be disconnected. The timeout is specified in a per- For them to be able to exchange information, B needs host basis in file /sys/class/usb_host/X/wusb_ to also beacon so A knows about it—they need to be trust_timeout. Most devices we’ve seen until now linked in the same beacon group. That we accomplish don’t implement keep-alives, so this value has to be set by asking B to beacon against A’s beacon (in the same high to avoid their disconnection. beacon period): 7 Conclusion B# echo f8:5c > uwb0/beacon We have done a quick description of Ultra Wideband, Wireless USB, and WiNET, a set of technologies that uwb Now we’ll see in station B a similar message ( aim at replacing all the cables that desktops and device (...) connected ) as well as an entry with living rooms. Designed with streaming and power effi- /sys/bus/uwb/devices its 16-bit address in . In ciency in mind, they also provide security comparable the case of the Intel 1480 Wireless UWB Link (USB to that of the cable. form factor), we could now load the WiNET driver (i1480u-winet) and configure network connections We have also described how Linux implements support on both sides: for it, its current status, and how to use it. There is basic support working, but a lot is still to be done.

# modprobe i1480u-winet We expect a huge increase of consumer electronics de- # ifconfig winet0 192.168.2.1 vices of all kinds (cellphones, cameras, computing, au- 12Most hardware known to our team supports channels 13, 14, dio, video. . . ) supporting this set of technologies in up- and 15. coming years. And with Linux playing an all-the-time 134 • My bandwidth is wider than yours more important role in those embedded applications, as well as in the desktop, it is key that it supports them as soon as they hit the streets en masse.

References

[linuxuwb.org] Linux UWB/WUSB/WiNET, http://linuxuwb.org

[WiMedia] WiMedia Alliance, http://wimedia.org

[ECMA368] Ecma International, ECMA-368 High Rate Ultra Wideband PHY and MAC standard, 1.0, http://www.ecma-international.org

[WiNET] WiMedia Alliance, WiNET specification (still not publicly available), http://wimedia.org

[WUSB1.0] USB Implementors Forum, Wireless USB 1.0 specification, http://usb.org

[WAM1.0] USB Implementors Forum, Association Models supplement to the Certified Wireless Universal Serial Bus Specification, 1.0, http://usb.org

[WHCI] Intel Corporation, Wireless Host Controller Interface, 0.95, http://intel.com

© 2007 by Intel Corporation.

*Linux is a registered trademark of Linus Torvalds. Other names and brands may be claimed as the property of others.

Zumastor Linux Storage Server

Daniel Phillips, Google, Inc. ([email protected])

Abstract

Zumastor provides Linux with network storage functionality suited to a medium scale enterprise storage role: live volume backup, remote volume replication, user accessible volume snapshots, and integration with Kerberized network filesystems. This paper examines the design and functionality of the major system components involved. Particular attention is paid to the subjects of optimizing server performance using NVRAM, and reducing the amount of network bandwidth required for remote replication using various forms of compression. Future optimization strategies are discussed. Benchmark results are presented for currently implemented optimizations.

1 Introduction

Linux has done quite well at the edges of corporate networks—web servers, firewalls, print servers and other services that rely on well standardized protocols—but has made scant progress towards the center, where shared file servers define the workflow of a modern organization. The barriers are largely technical, in that Linux storage capabilities, notably live filesystem snapshot and backup, have historically fallen short of the specialized proprietary offerings to which users have become accustomed.

Zumastor provides Linux with network storage functionality suited to medium scale enterprise storage roles: live volume backup, remote volume replication, user accessible volume snapshots, and integration with Kerberized network filesystems. Zumastor takes the form of a set of packages that can be added to any Linux server and will happily coexist with other roles that the server may serve. There are two major components: the ddsnap virtual block device, which provides base snapshot and replication functionality, and the Zumastor volume monitor, which presents the administrator with a simple command line interface to manage volume snapshotting and replication. We first examine the low level components in order to gain an understanding of the design approach used, its capabilities and limitations.

2 The ddsnap Virtual Block Device

The ddsnap virtual block device provides multiple read/write volume snapshots. As opposed to the incumbent lvm snapshot facility, ddsnap does not suffer from degraded write performance as the number of snapshots increases, and does not require a separate underlying physical volume for each snapshot. A ddsnap virtual device requires two underlying volumes: an origin volume and a snapshot store. The origin, as with the lvm snapshot, is a normal volume except that write access is virtualized in order to protect snapshotted data. The snapshot store contains data copied from the origin in the process of protecting snapshotted data. Metadata in the snapshot store forms a btree to map logical snapshot addresses to data chunks in the snapshot store.

The snapshot store also contains bitmap blocks that control allocation of space in the snapshot store, a list of currently held snapshots, a superblock to provide configuration information, and a journal for atomic and durable updates to the metadata. The ddsnap snapshot store resembles a simple filesystem, but with only a single directory, the btree, indexed by the logical address of a snapshot. Each leaf of the btree contains a list of logical chunk entries; each logical chunk entry in the btree contains a list of one or more physical chunk addresses; and for each physical chunk address, a bitmap indicates which snapshots currently share that physical chunk. Share bits are of a fixed, 64-bit size, hence the ddsnap limitation of 64 simultaneous snapshots. This limitation may be removed in the future with a redesign of the btree leaf format.
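To make the leaf layout concrete, here is a minimal sketch of the per-chunk record just described, with the fixed 64-bit share bitmap; the field names are illustrative, not ddsnap's actual on-disk format:

    #include <stdint.h>

    /*
     * Illustrative sketch of one physical-chunk record in a btree leaf;
     * not ddsnap's actual on-disk layout.
     */
    struct chunk_record {
            uint64_t share;  /* bit N set: snapshot N shares this chunk      */
            uint64_t chunk;  /* physical chunk address in the snapshot store */
    };

A logical chunk entry then carries one or more such records, and the 64 bits of share are exactly where the 64-snapshot limit comes from.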


2.1 Client-server design

DDsnap was conceived as a cluster snapshot—DD stands for distributed data—with a client-server architecture. Each ddsnap client (a device mapper device) provides access to either the underlying, physical origin volume or to some snapshot of the origin volume. To implement the copy-before-write snapshot strategy, a ddsnap origin client communicates with a userspace server over a socket, requesting permission from this snapshot server before writing any block so that the server may move any data shared with a snapshot to the snapshot store. Similarly, a ddsnap snapshot client requests permission from the snapshot server before writing any block so that space may be allocated in the snapshot store for the write. A snapshot client also requests permission before reading snapshot data, to learn the physical location of the data and to serialize the read against origin writes, preventing the data being read from moving to a new location during the read. On the other hand, an origin client need not consult the server before reading, yielding performance nearly identical to the underlying physical volume for origin reads.

Since all synchronization with the snapshot server is carried out via messages, the server need not reside on the same node as an origin or snapshot client, although with the current single-node storage application it always does. Some more efficient means of synchronization than messages over a socket could be adopted for a single node configuration; however, messaging overhead has not proved particularly bothersome, with a message inter-arrival time measured in microseconds due to asynchronous streaming. Some functionality required for clustering, such as failing over the server, uploading client read locks in the process, is not required for single-node use, but imposes no overhead by its presence.

Creating or deleting a snapshot is triggered by sending a message to the snapshot server, as are a number of other operations such as obtaining snapshot store usage statistics and obtaining a list of changed blocks for replication. The ddsnap command utility provides a command-line syntax for this. There is one snapshot server for each snapshot store, so to specify which snapshot server the command is for, the user gives the name of the respective server control socket. A small extra complexity imposed by the cluster design is the need for a ddsnap agent, whose purpose on a cluster is to act as a node's central cluster management communication point, but which serves no useful purpose on a single node. It is likely that the functionality of agent and snapshot server will be combined in the future, somewhat simplifying the setup of a ddsnap snapshot server. In any event, the possibility of scaling up the Zumastor design using clustering must be viewed as attractive.

2.2 Read/Write Snapshots

Like lvm snapshots, ddsnap snapshots are read/write. Sometimes the question is raised: why? Isn't it unnatural to write to a snapshot? The answer is, writable snapshots come nearly for free, and they do have their uses. For example, virtualization software such as Bochs, QEMU, UML or Xen might wish to base multiple VM images on the same hard disk image (see http://en.wikipedia.org/wiki/Copy-on-write). The copy-on-write property of a read/write snapshot gives each VM a private copy, in its own snapshot, of data that it has written. In the context of Zumastor, a root volume could be served over NFS to a number of diskless workstations, so each workstation is able to modify part of its own copy while continuing to share the unmodified part.

2.3 Snapshotted Volume IO Performance

Like the incumbent lvm snapshot, origin read performance is nearly identical to native read performance, because origin reads are simply passed through to the underlying volume.

As with the incumbent lvm snapshot, ddsnap uses a copy-before-write scheme, where snapshotted data must be copied from the origin to the snapshot store the first time the origin chunk is written to after a new snapshot. This can degrade write performance markedly under some loads. Keeping this degradation to a tolerable level has motivated considerable design effort, and work will continue in this area. With the help of several optimization techniques discussed below, a satisfactory subjective experience is attained for the current application: serving network storage.

Compared to the incumbent lvm snapshot, the ddsnap snapshot design requires more writes to update the metadata, typically five writes per newly allocated physical chunk:

1. Write allocation bitmap block to journal. 2.3.2 Effect of Chunk Size and Number of Snap- shots on Write Performance 2. Write modified btree leaf to journal. Untar time on the native (Ext3) filesystem is about 14 3. Write journal commit block. seconds. Figure 1 shows that untar time on the virtual 4. Write allocation bitmap block to store. block device with no snapshots held is about 20 sec- onds, or slower by a factor of 1.43. This represents 5. Write modified btree leaf to store. the overhead of synchronizing with the snapshot server, and should be quite tractable to optimization. Snapshot- ted untar time ranges from about 3.5 times to nearly 10 This update scheme is far from optimal and is likely times slower than native untar time. to be redesigned at some point, but for now a cruder Figure 1 also shows the effect of number of currently approach is adopted: add some nonvolatile memory held snapshots on write performance and of varying the (NVRAM) to the server. snapshot chunk size. At each step of the test, a tar archive of the kernel source is unpacked to a new direc- The presence of a relatively small amount of nonvolatile tory and a snapshot is taken. We see that (except for the RAM can accelerate write performance in a number of very first snapshot) the untar time is scarcely affected by ways. One way we use NVRAM in Zumastor is for the number of snapshots. For the smallest chunk size, snapshot metadata. By placing snapshot metadata in 4K, we see that untar time does rise very slightly with NVRAM we reduce the cost of writing to snapshotted the number of snapshots, which we may attribute to in- volume locations significantly, particularly since ddsnap creased seek time within the btree metadata. As chunk in its current incarnation is not very careful about min- size increases, so does performance. With a snapshot imizing metadata writes. Unfortunately, this also limits store chunk size of 128KB, the untar runs nearly three the maximum size of the btree, and hence the amount of times faster. snapshot data that can be stored. This limit lies roughly in the range of 150 gigabytes of 4K snapshot chunks per gigabyte of NVRAM. NVRAM is fairly costly, so 2.3.3 Effect of NVRAM on Write Performance accommodating a large snapshot store be expensive. Luckily, much can be done to improve the compactness of the btree, a subject for another paper. Figure 2 shows the effect of placing the snapshot data in NVRAM. Write performance is dramatically improved, Using NVRAM, snapshot performance is no worse than and as before, number of snapshots has little or no ef- the incumbent lvm snapshot, however the size of the fect. Interestingly, the largest chunk size tested, 128KB, btree and hence the amount of data that can be stored in is no longer the fastest; we see best performance with the snapshot store is limited by the amount of NVRAM 64K chunk size. The reason for this remains to be in- available. Future work will relax this limitation. vestigated, however this is good news because a smaller chunk size improves snapshot store utilization. Write performance has improved to about 2 to 5 times slower than native write performance, depending on chunk size. 2.3.1 Filesystem Journal in NVRAM

For filesystems that support separate journals, the jour- 2.3.4 NVRAM Journal compared to NFS Write nal may be placed in NVRAM. If the filesystem is fur- Log ther configured to journal data writes as well as meta- data, a write transaction will be signalled complete as NVRAM is sometimes used to implement a NFS write soon as it has been entered into the journal, long before log, where each incoming NFS write is copied to the being flushed to underlying storage. At least until the write log and immediately acknowledged, before being journal fills up, this entirely masks the effect of slower written to the underlying filesystem. Compared to the writes to the underlying volume. The practical effect of strategy of putting the filesystem journal in NVRAM, this has not yet been measured. performance should be almost the same: in either case, 138 • Zumastor Linux Storage Server

[Figure 1: Write performance without NVRAM. Real time (s) to untar a kernel source tree vs. number of snapshots (0 to 20), for snapshot chunk sizes 4K, 16K, 64K, and 128K.]
2.3.4 NVRAM Journal compared to NFS Write Log

NVRAM is sometimes used to implement an NFS write log, where each incoming NFS write is copied to the write log and immediately acknowledged, before being written to the underlying filesystem. Compared to the strategy of putting the filesystem journal in NVRAM, performance should be almost the same: in either case, a write is acknowledged immediately after being written to NVRAM. There may be a small difference in the overhead of executing a filesystem operation as opposed to a potentially simpler transaction log operation; however, the filesystem code involved is highly optimized and the difference is likely to be small. On the other hand, the transaction log requires an additional data copy into the log, likely negating any execution path efficiency advantage. It is clear which strategy requires less implementation effort.

2.3.5 Ongoing Optimization Efforts

A number of opportunities for further volume snapshot write optimization remain to be investigated. For example, it has been theorized that writing to a snapshot instead of the origin can improve write performance a great deal by eliminating the need to copy before writing. If read performance from a snapshot can be maintained, then perhaps it would be a better idea to serve a master volume from a snapshot than an origin volume.

3 Volume Replication

Volume replication creates a periodically updated copy of a master volume at some remote location. Replication differs from mirroring in two ways: 1) changes to the remote volume are batched into volume deltas, so that multiple changes to the same location are collapsed into a single change, and 2) a write operation on the master is not required to wait for write completion on the remote volume. Batching the changes also allows more effective compression of volume deltas, and because the changes are sorted by logical address, applying a delta to a remote volume requires less disk seeking than applying each write to a mirror member in write completion order. Replication is thus suited to situations where the master and remote volume are not on the same local network, which would exhibit intolerable remote write latency if mirrored. High latency links also tend to be relatively slow, so there is much to be gained by good compression of volume deltas.

Zumastor implements remote replication via a two-step process: 1) compute a difference list; 2) generate a delta. To generate the difference list for a given pair of snapshots, the ddsnap server scans through the btree to find all snapshot chunks that belong to one snapshot and not the other, which indicates that the data for the corresponding chunks was written at different times and is most probably different. To generate the delta, a ddsnap utility runs through the difference list, reading data from one or both of the snapshots, which is incorporated into the output delta file. To allow for streaming replication, each volume delta is composed of a number of extents, each corresponding to some number of contiguous logical chunks.
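As a rough illustration of the difference-list step (this is not ddsnap's actual btree walk), suppose each snapshot exposes, for every logical chunk, the physical chunk address that backs it. Chunks whose backing addresses differ between the two snapshots were written at different times and belong on the difference list:

    #include <stdint.h>
    #include <stdlib.h>

    /* Hypothetical per-snapshot map: phys[i] is the physical chunk that
     * backs logical chunk i; chunks still shared between the snapshots
     * have equal entries.  Returns a malloc'd list of logical chunk
     * numbers that differ, or NULL on allocation failure. */
    uint64_t *difference_list(const uint64_t *phys_a, const uint64_t *phys_b,
                              uint64_t nchunks, uint64_t *ndiff)
    {
        uint64_t *list = malloc(nchunks * sizeof *list);
        uint64_t n = 0;

        if (!list)
            return NULL;
        for (uint64_t i = 0; i < nchunks; i++)
            if (phys_a[i] != phys_b[i])   /* written at different times */
                list[n++] = i;
        *ndiff = n;
        return list;  /* delta generation then reads exactly these chunks */
    }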

[Figure 2: Write performance with NVRAM. Real time (s) to untar a kernel source tree vs. number of snapshots (0 to 20), for snapshot chunk sizes 4K, 16K, 64K, and 128K.]

A variety of compression options are available for delta generation. A "raw" delta incorporates only literal data from the destination snapshot. An "xdelta" delta computes the binary difference between source and destination snapshots. A raw delta is faster to generate and requires less disk IO, is faster to apply to the target volume, and is more robust in the sense that the code is very simple. Computing an xdelta delta requires more CPU and disk bandwidth, but should result in a smaller delta that is faster to transmit.

A volume delta may be generated either as a file or as a TCP stream. Volume replication can be carried out manually using delta files:

1. Generate a snapshot delta as a file.

2. Transmit the delta or physically transport it to the downstream host.

3. Apply the delta to the origin volume of the downstream host.

It is comforting to be able to place a snapshot delta in a file, and to know that the replication algorithm is easy enough to carry out by hand, which might be important in some special situation. For example, even if network connectivity is lost, volume replication can still be carried out by physically transporting storage media containing a delta file.

Zumastor uses ddsnap's streaming replication facility, where change lists and delta files are never actually stored, but streamed from the source to the target host and applied to the target volume as a stream. This saves a potentially large amount of disk space that would otherwise be required to store a delta file on both the source and target host.
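A minimal sketch of the receiving side of such a stream; the extent header is invented for illustration and is not ddsnap's wire format (a real implementation would also handle short reads on the socket):

    #include <stdint.h>
    #include <unistd.h>

    /* Hypothetical on-the-wire header: where the extent lands on the
     * downstream origin volume and how many bytes of raw chunk data
     * follow it. */
    struct extent_hdr {
        uint64_t offset;    /* byte offset on the target origin volume */
        uint64_t length;    /* payload length in bytes */
    };

    /* Read one extent at a time from the replication socket and apply
     * it to the downstream origin before fetching the next, so no
     * delta file is ever stored. */
    static int apply_stream(int sock, int origin_fd, char *buf, size_t bufsz)
    {
        struct extent_hdr h;

        while (read(sock, &h, sizeof h) == (ssize_t)sizeof h) {
            if (h.length > bufsz)
                return -1;
            if (read(sock, buf, h.length) != (ssize_t)h.length)
                return -1;
            if (pwrite(origin_fd, buf, h.length,
                       (off_t)h.offset) != (ssize_t)h.length)
                return -1;
        }
        return 0;
    }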

From time to time it is necessary to replicate an entire volume, for example when initializing a replication target. This ability is provided via a "full volume delta" that generates a raw, compressed delta as if every logical chunk had appeared in the difference list. Incidentally, only the method of delta generation is affected by this option, not the delta file format.

To apply a volume delta, ddsnap overwrites each chunk of the target volume with the new data encoded in the delta in the case of a raw delta, or reads the source snapshot and applies the binary difference to it in the case of xdelta. Clearly, it is required that the source snapshot exist on the downstream host and be identical to the source snapshot on the upstream host.

To create the initial conditions for replication:

1. Ensure that upstream and downstream origin volumes are identical, for example by copying one to the other.

2. Snapshot the upstream and downstream volumes.

The need to copy an entire volume over the network in the first step can be avoided in some common cases. When a filesystem is first created, it is easy to zero both upstream and downstream volumes, which sets them to an identical state. The filesystem is then created after step 2 above, so that relatively few changed blocks are transmitted in the first replication cycle. In the case where the downstream volume is known to be similar, but not identical, to the upstream volume (possibly as a result of an earlier hardware or software failure), the remote volume differencing utility rdiff may be used to transmit a minimal set of changes downstream.

Now, for each replication cycle:

1. Set a new snapshot on the upstream volume.

2. Generate the delta from the old to the new upstream snapshot.

3. Transmit the delta downstream.

4. Set a new snapshot on the downstream volume (downstream origin and new snapshot are now identical to the old upstream snapshot).

5. Apply the delta to the downstream origin (downstream origin is now identical to the new upstream snapshot).

For the streaming case, step 4 is done earlier so that the transmit and apply may iterate:

1. Set a new snapshot on upstream and downstream volumes.

2. Generate the delta from the old to the new upstream snapshot.

3. Transmit the next extent of the delta downstream.

4. Apply the delta extent to the downstream origin.

5. Repeat at 3 until done.

For streaming replication, a server is started via ddsnap on the target host to receive the snapshot delta and apply it to the downstream origin.

Fortunately for most users, all these steps are handled transparently by the Zumastor volume manager, described below.

Multi-level replication from a master volume to a chain of downstream volumes is handled by the same algorithm. We require only that two snapshots of the master volume be available on the upstream volume and that the older of the two also be present on the downstream volume. A volume may be replicated to multiple targets at any level in the chain. In general, the volume replication topology is a tree, with the master volume at the root and an arbitrary number of target volumes at interior and leaf nodes. Only the master is writable; all the target volumes are read-only.

3.1 Delta Compression

Compression of delta extents is available as an option, either using zlib (gzip) or, in the case of xdelta, an additional Huffman encoding stage. A compressed xdelta difference should normally be more compact than gzipped literal data; however, one can construct cases where the reverse is true. A further (extravagant) "best" compression option computes both the gzip and xdelta compression for a given extent and uses the smaller for the output delta (a code sketch of this selection follows below). Which combination of delta generation options is best depends largely on the amount of network bandwidth available.

Figure 3 illustrates the effect of various compression options on delta size. For this test, the following steps are performed:

1. Set snapshot 0.

2. Untar a kernel tree.

3. Set snapshot 1.

4. Apply a (large) patch yielding the next major kernel release.

5. Set snapshot 2.
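The "best" option above amounts to compressing each extent twice and keeping the smaller result. Here is a sketch of that selection using zlib's compress2(); the xdelta encoder is reduced to a hypothetical placeholder, since the real xdelta API is not shown here:

    #include <zlib.h>

    /* Hypothetical stand-in for an xdelta-style binary-difference
     * encoder; returns the encoded size, or 0 on failure. */
    extern size_t xdelta_encode(const unsigned char *src,
                                const unsigned char *dst, size_t len,
                                unsigned char *out, size_t outsz);

    /* Try both encodings of one extent and report which buffer won;
     * returns the winning size, or 0 if both failed. */
    size_t best_compress(const unsigned char *src, const unsigned char *dst,
                         size_t len, unsigned char *gz, unsigned char *xd,
                         size_t outsz, const unsigned char **winner)
    {
        uLongf gzlen = outsz;
        size_t xdlen;

        /* gzip the literal data from the destination snapshot */
        if (compress2(gz, &gzlen, dst, len, Z_BEST_COMPRESSION) != Z_OK)
            gzlen = 0;
        /* binary difference against the source snapshot */
        xdlen = xdelta_encode(src, dst, len, xd, outsz);

        if (xdlen && (!gzlen || xdlen < (size_t)gzlen)) {
            *winner = xd;     /* mark extent as xdelta-encoded */
            return xdlen;
        }
        *winner = gz;         /* mark extent as gzipped literal data */
        return (size_t)gzlen;
    }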

Figure 3: Delta Compression Effectiveness

Three delta files are generated: the two incremental deltas, from snapshot 0 to snapshot 1 and from snapshot 1 to snapshot 2, and the cumulative delta from snapshot 0 to snapshot 2. We observe a very large improvement in delta size, ranging up to a factor of 10 as compared to the uncompressed delta.

XDelta performs considerably better on the snapshot 1 to snapshot 2 delta, which is not surprising because this delta captures the effect of changing many files as opposed to adding new files to the filesystem, so it is only on this delta that there are many opportunities to take advantage of similarity between the two snapshots.

The "best" method gives significantly better compression on the snapshot 1 to snapshot 2 delta, which indicates that some delta extents compress better with gzip than they do with xdelta. (As pointed out by the author of xdelta, this may be due to suboptimal use of compression options available within xdelta.) Figure 4 re-expresses the delta sizes of Figure 3 as compression ratios, ranging from 5 to 13 for the various loads.

Delta compression directly affects the time required to transmit a delta over a network. This is particularly important when replicating large volumes over relatively low bandwidth network links, as is typically the case. The faster a delta can be transmitted, the fresher the remote copy will be. We can talk about the "churn rate" of a volume, that is, the rate at which it changes. This could easily be in the neighborhood of 10% a day, which for a 100 gigabyte disk would be 10 gigabytes. Transmitting a delta of that size over a 10 megabit link would require 10000 seconds, or about three hours. An 800 gigabyte volume with 10% churn a day would require more than a day to transmit the delta, so the following delta will incorporate even more than 10% churn, and take even longer. In other words, once replication falls behind, it rapidly falls further and further behind, until eventually nearly all the volume is being replicated on each cycle, which for our example will take a rather inconvenient number of days.

Figure 4: Delta Compression Effectiveness

In summary, good delta compression not only improves the freshness of replicated data, it delays the point at which replication lag begins to feed on itself, and enables timely replication of larger, busier volumes over lower speed links.

3.2 NFS Snapshot Rollover

Exporting a replicated volume via NFS sounds easy, except that we expect the filesystem to change state "spontaneously" each time a new volume delta arrives, without requiring current NFS clients to close their TCP connections. To create the effect of jumping the filesystem from state to state as if somebody had been editing the filesystem locally, we need to unmount the old snapshot and mount the new snapshot so that future NFS accesses will be to the new snapshot. The problem is, the Linux server will cache some elements of the client connection state, such as the file handle of the filesystem root, which pins the filesystem and prevents it from being unmounted.

Zumastor solves this problem by introducing an nfsd suspend/resume operation. This flushes all cached client state, which forces the use count of the exported filesystem snapshot to one, the mount point. This allows the old snapshot to be unmounted and the new snapshot to be mounted in its place, before resuming. Interestingly, the patch to accomplish this is only a few lines, because most of the functionality to accomplish it already existed.

3.3 Incremental Backup using Delta Files

Ddsnap delta files are not just useful for replication; they can also be used for incremental backup. From time to time, a "full volume" delta file can be written to tape, followed periodically by a number of incremental deltas. This should achieve very rapid backup and economical use of tape media, particularly if deltas are generated with more aggressive compression options. To restore, a full (compressed) volume must be retrieved from tape, along with some number of delta files, which are applied sequentially to arrive at a volume state at some particular point in time. Restoring a single file would be a very slow process; however, it is also expected to be a rare event. It is more important that backup be fast, so that it is done often.

4 Zumastor volume monitor

The Zumastor volume monitor takes care of most of the chores of setting up virtual block devices, including making socket connections between the ddsnap user space and kernel components, creating the virtual devices with dmsetup, organizing mount points, and mounting volumes. It maintains a simple database, implemented as a directory tree, that stores the configuration and operating status of each volume on a particular host, and provides the administrator with a simple set of commands for adding and removing volumes from the database and defining their operational configuration. Finally, it takes care of scheduling snapshots and initiating replication cycles.

The Zumastor volume database is organized by volumes, where each volume is completely independent from the others, not even sharing daemons. Each Zumastor volume is either a master or a target. If a master, it has a replication schedule. If a target, then it has an upstream source. In either case, it may have any number of replication targets. Each replication target has exactly one source, which prevents cycles and also allows automatic checking that the correct source is replicating to the correct target.

The replication topology for each volume is completely independent. A given host may offer read/write access to volumes that are replicated to other hosts, and read-only access to volumes replicated to it from other hosts. So for example, two servers at widely separated geographic locations might each replicate a volume to the other, which not only provides a means of sharing data, but also provides a significant degree of redundancy, particularly if each server backs up both its own read/write volume and the replicated read-only volume to tape.

The replication topology for each volume is a tree, where only the master (root of the tree) behaves differently from the other nodes. The master generates snapshots either periodically or on command. Whenever one of its target hosts is ready to receive a new snapshot delta, the master creates a new snapshot and replicates it to the target, ensuring that the downstream host receives as fresh as possible a view of the master volume. On all other hosts, snapshot deltas are received from an upstream source and simply passed along down the chain.

Zumastor replication is integrated with NFS in the sense that Zumastor knows how to suspend NFS while it re-mounts a replicated volume to the latest snapshot, effecting the snapshot rollover described above.

5 The Future

In the future, Zumastor will continue to gain new functionality and improve upon existing functionality. It would be nice to have a graphical front end to the database, and a web interface. It would be nice to see the state of a whole collection of Zumastor servers together in one place, including the state of any replication cycles in progress. It would be natural to integrate more volume management features into Zumastor, such as volume resizing. Zumastor ought to be able to mirror itself to a local machine and fail over NFS service transparently. Zumastor should offer its own incremental backup using delta files. Another creative use of delta files would be to offer access to "nearline" snapshots, where a series of archived reverse deltas are applied to go back further in time than is practical with purely online snapshots.

There is still plenty of room for performance optimization. There is a lot more that can be done with NVRAM, and things can be done to improve the performance without NVRAM, perhaps making some of Zumastor's replication and backup capabilities practical for use on normal workstations and the cheapest of the cheap servers.

All in all, there remains plenty of work to do and plenty of motivation for doing it.

Cleaning up the Linux Desktop Audio Mess

Lennart Poettering Red Hat, Inc. [email protected]

Abstract

Desktop audio on Linux is a mess. There are just too many competing, incompatible sound systems around. Most current audio applications have to support every sound system in parallel, and thus ship with sound abstraction layers with a more or less large number of back-end plug-ins. JACK clients are incompatible with ALSA clients, which in turn are incompatible with OSS clients, which in turn are incompatible with ESD clients, and so on. "Incompatible" often means "exclusive;" e.g., if an OSS application gets access to the audio hardware, all ALSA applications cannot access it.

Apple MacOS X has CoreAudio, Microsoft Windows XP has a new user-space audio layer; both manage to provide comprehensive APIs that make almost every user happy, ranging from desktop users to pro audio people. Both systems provide fairly modern, easy-to-use audio APIs, and a vast range of features including desktop audio "bling."

On Linux we should be able to provide the same: a common solution that works on the desktop, in networked thin-client setups, and in pro audio environments, scaling from mobile phones to desktop PCs and high-end audio hardware.

1 Fixing the Linux Audio Stack

In my talk, I want to discuss what we can do to clean up the mess that desktop audio on Linux is: why we need a user-space sound system, what it should look like, how we need to deal with the special requirements of networked audio and pro-audio stuff, and how we should expose the sound system to applications for allowing Compiz-style desktop "bling"—but for audio. I then will introduce the PulseAudio sound server as an attempt to fix the Linux audio mess.

PulseAudio already provides compatibility with 90% of all current Linux audio software. It features low-latency audio processing and network transparency in an extensible desktop sound server. PulseAudio is now part of many distributions, and is likely to become the default sound system on Fedora and Ubuntu desktops in the next releases of these distributions.

The talk will mostly focus on the user-space side of Linux audio, specifically on the low-level interface between hardware drivers and user-space applications.

2 Current State of Linux Audio

The current state of audio on Linux and other Free Software desktops is quite positive in some areas, but in many other areas it is unfortunately very poor. Several competing audio systems and APIs are available. However, none of them is useful in all types of applications, nor does any meet the goals of being easy-to-use, scalable, modern, clean, portable, and complete. Most of these APIs and systems conflict in one way or another.

On the other hand, we have a few components and APIs for specific purposes that are well accepted and cleanly designed (e.g., LADSPA, JACK). Also, Linux-based systems can offer a few features that are not available on competing, proprietary systems. Among them are network-transparent audio and relatively low-latency scheduling.

While competing, proprietary systems currently lack a few features the Linux audio stack can offer, they managed to provide a single (specific to the respective OS) well-accepted API that avoids the balkanisation we currently have on desktops. Most notably, Apple MacOS X has CoreAudio, which is useful for the desktop as well as for professional audio applications. Microsoft Windows Vista, on the other hand, now ships a new user-space audio layer, which also fulfills many of the above requirements for modern audio systems and APIs.


In the following sections, I will quickly introduce the systems that are currently available and used on Linux on the desktop, their specific features, and their drawbacks.

2.1 Advanced Linux Sound Architecture (ALSA)

The ALSA system [2] has become the most widely accepted audio layer for Linux. However, ALSA, both as audio system and as API, has its share of problems:

• The ALSA user-space API is relatively complicated.

• ALSA is not available on anything but Linux.

• The ALSA API makes certain assumptions about sound devices that are only true for hardware devices. Implementing an ALSA plug-in for virtual ("software") devices is not doable without nasty hacks.

• Not "high-level" enough for many situations.

• dmix is bug-ridden and incomplete.

• Resampling is very low quality.

• You need different configurations for normal desktop use (dmix) and pro audio use (no dmix).

On the other hand, it also has some real advantages over other solutions:

• It is available on virtually every modern Linux installation.

• It is very powerful.

• It is (to a certain degree) extendable.

2.2 Open Sound System (OSS)

OSS [7] has been the predecessor of ALSA in the Linux kernel. ALSA provides a certain degree of compatibility with OSS. Besides that, the API is available on several other Unixes. OSS is a relatively "old" API, and thus has a number of limitations:

• Doesn't offer all the functionality that modern sound hardware provides and that needs to be supported by the software (no surround sound, no float samples).

• Not high-level enough for almost all situations, since it doesn't provide sample format or sample rate conversions. Most software silently assumes that S16NE samples at 44100Hz are available on all systems, which is no longer the case today.

• Hardly portable to non-Unix systems.

• The ioctl()-based interface is not type-safe and not the most user-friendly.

• Incompatible with everything else; every application gets exclusive access to the sound device, thus blocking all other applications from accessing it simultaneously.

• Almost impossible to virtualize correctly and comprehensively. Hacks like esddsp, aoss, and artsdsp have proven to not work.

• Applications silently assume the availability of certain driver functionality that is not necessarily available in all setups. Most prominently, the 3D game Quake doesn't run with drivers that don't support mmap()-based access to the DMA audio buffer.

• No network transparency, no support for desktop "bling."

The good things:

• Relatively easy to use;

• Very well accepted, even beyond Linux;

• Feels very Unix-ish.

2.3 JACK

The JACK Audio Connection Kit [3] is a sound server for professional audio purposes. As such, it is well accepted in the pro-audio world. Its emphasis, besides playback of audio through a local sound card, is streaming audio data between applications.

Plusses:

• Easy to use;

• Powerful functionality;

• It is a real sound server;

• It is very well accepted in the pro-audio world.

Drawbacks:

• Only floating point samples;

• Fixed sampling rate;

• Somewhat awkward semantics which make it unusable as a desktop audio server (i.e., the server doesn't start playback automatically, but needs a manual "start" command);

• Not useful on embedded machines;

• No network transparency.

2.4 aRts

The KDE sound server aRts [5] is no longer actively developed and has been orphaned by its developer. Having a full music synthesiser as desktop sound server might not be such a good idea, anyway.

2.5 EsounD

The Enlightened Sound Daemon (EsounD or ESD) [4] has been the audio daemon of choice of the GNOME desktop environment since GNOME 1.0 times. Besides basic mixing and network transparency capabilities, it doesn't offer much. Latency querying, low-latency behaviour, and surround sound are not available at all. It is thus hardly useful for anything beyond basic music playback or playing event sounds ("bing!").

2.6 PortAudio

The PortAudio API [6] is a cross-platform abstraction layer for audio hardware access. As such it sits on top of other audio systems, like OSS, ALSA, ESD, the Windows audio stack, or MacOS X's CoreAudio. PortAudio has not been designed with networked audio devices in mind, and also doesn't provide the necessary functionality for clean integration into a desktop sound server. PortAudio has never experienced wide adoption.

3 What we Need

As shown above, none of the currently available audio systems and APIs can provide all that is necessary on a modern desktop environment. I will now define four major goals which a new desktop audio system should try to achieve.

3.1 A Widely Accepted, Modern, Portable, Easy-to-Use, Powerful, and Complete Audio API

None of the described Linux audio APIs fulfills all requirements that are expected from a modern audio API. On the other hand, Apple's CoreAudio and Microsoft's new Windows Vista user-space audio layer reach this goal, for the most part. More precisely, a modern sound API for Free Software desktops should fulfill the following requirements:

• Completeness: an audio API should be a general-purpose interface; it should be suitable for simple audio playback as well as professional audio production.

• Scalability: usable on all kinds of different systems, ranging from embedded systems to modern desktop PCs and pro audio workstations.

• Modernness: provide good integration into the Free Software desktop ecosystem.

• Proper support for networked and "software" (virtual) devices, besides traditional hardware devices.

• Portability: the audio API should be portable across different operating systems.

• Ease of use, for both the user and the programmer. This includes a certain degree of automatic fine-tuning, to provide optimal functionality with "zero configuration."

3.2 Routing and Filtering Audio in Software

Classic audio systems such as OSS are designed to provide an abstract API around hardware devices. A modern audio system should provide features beyond that:

• It needs to be possible to play back multiple audio streams simultaneously, so that they are mixed in real time (see the sketch following this list).

• Applications should be able to hook into what is currently being played back.

• Before audio is written to an output device, it might be transferred over the network to another machine.

• Before audio is played back, some kind of post-processing might take place.

• Audio streams might need to be re-routed during playback.
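The first item above, real-time mixing of multiple streams, boils down to a saturated per-sample sum. A minimal sketch for signed 16-bit PCM; real mixers (including PulseAudio's liboil-based routines) are considerably more elaborate:

    #include <stdint.h>

    /* Mix nstreams S16 sample buffers into one output buffer, clamping
     * the sum so loud streams saturate instead of wrapping around. */
    void mix_s16(int16_t *out, const int16_t *const *in,
                 unsigned nstreams, unsigned nsamples)
    {
        for (unsigned i = 0; i < nsamples; i++) {
            int32_t sum = 0;

            for (unsigned s = 0; s < nstreams; s++)
                sum += in[s][i];

            if (sum > INT16_MAX)
                sum = INT16_MAX;    /* clamp positive overflow */
            else if (sum < INT16_MIN)
                sum = INT16_MIN;    /* clamp negative overflow */
            out[i] = (int16_t)sum;
        }
    }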

3.3 Desktop "Bling"

Free Software desktops currently lack an audio counterpart for the well-known window manager Compiz. A modern desktop audio system should be able to provide:

• Separate per-application and per-window volumes.

• Soft fade-ins and fade-outs of music streams.

• Automatically increasing the volume of the application window in the foreground, and decreasing the volume of the application window in the background.

• Forwarding a stop/start request to any music-playing applications if a VoIP call takes place.

• Remembering per-application and per-window volumes and devices.

• Rerouting audio to a different audio device on-the-fly, without interruption, from within the window manager.

• Doing "hot" switching between audio devices whenever a new device becomes available. For example, when a USB headset is plugged in, automatically start using it by switching an in-progress VoIP call over to the new headset.

3.4 A Compatible Sound System

Besides providing the features mentioned above, a new sound system for Linux also needs to retain a large degree of compatibility with all the currently available systems and APIs, as much as possible. Optimally, all currently available Linux audio software should work simultaneously and without manual intervention. A major task is to marry the pro-audio and desktop audio worlds into a single audio system.

4 What PulseAudio already provides

The PulseAudio [1] sound server is our attempt to reach the four aforementioned goals. It is a user-space sound server that provides network transparency, all kinds of desktop "bling," relatively low latency, and is extensible through modules. It sits atop OSS and ALSA sound devices and routes and filters audio data, possibly over the network.

PulseAudio is intended to be a replacement for systems like ESD or aRts. The former is entirely superseded; PulseAudio may be installed as a drop-in replacement for EsounD on GNOME desktops.

PulseAudio is not intended to be a competitor to JACK, GStreamer, Helix, KDE Phonon, or Xine. Quite the opposite: we already provide good integration into JACK, GStreamer, and Xine. We have different goals.

The PulseAudio core is carefully optimised for speed and low latency. Local clients may exchange data with the PulseAudio audio server over shared memory data transfer. The PulseAudio sound server will never copy audio data blocks around in memory unless it is absolutely necessary. Most audio data operations are based on liboil's support for the extended instruction sets of modern CPUs (MMX, SSE, AltiVec).

Currently the emphasis for PulseAudio is on networked audio, where it offers the most comprehensive functionality.

PulseAudio support is already available in a large number of applications. For others, we have prepared patches. Currently we have native plug-ins, drivers, patches, and compatibility for Xine, MPlayer, GStreamer, libao, XMMS, ALSA, OSS (using $LD_PRELOAD), EsounD, Music Player Daemon (MPD), and the Adobe Flash player.

PulseAudio ships with a large set of modules (plug-ins):

• Driver modules (i.e., accessing OSS, ALSA, Win32, Solaris drivers).

• Protocol support (i.e., native TCP-based protocol, EsounD protocol, RTP).

• Integration into LIRC, support for multimedia keyboards.

• Desktop integration (i.e., hooks into the X11 system for authentication and redirecting the X11 bell).

• Integration with JACK, EsounD.

• Zeroconf support, using Avahi.

• Management: automatically restore volumes and devices of playback streams; move a stream to a different device if its original device becomes unavailable due to a hot-plug event.

• Auto-configuration: integration with HAL for automatic and dynamic configuration of the sound server based on the available hardware.

• Combination of multiple audio devices into a single audio device while synchronising audio clocks.

PulseAudio has a small number of graphical utility applications:

• Volume Control.

• Panel Applet (for quickly changing the output device, selecting it from a list of Zeroconf-announced audio devices from the network).

• Volume Meter.

• Preferences panel, for a user-friendly configuration of advanced PulseAudio functionality.

• A management console to introspect a PulseAudio server's internals.

5 PulseAudio Internals

5.1 Buffering Model

PulseAudio offers a powerful buffering model which is an extension of the model Jim Gettys pioneered in the networked audio server AF [9]. In contrast to traditional buffering models, it offers flexible buffering control, allowing large buffers—which is useful for networked audio systems—while still providing quick response to external events. It allows absolute and relative addressing of samples in the audio buffer and supports a notion of "zero latency." Samples that have already been pushed into the playback buffer may be rewritten at any time.
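A toy model of this buffering scheme, under assumed names (the real PulseAudio structures differ): samples carry an absolute stream index, and anything the device has not yet fetched may still be overwritten:

    #include <stdint.h>

    #define PB_SIZE (1u << 16)   /* buffer capacity in samples */

    /* read_index is the next sample the device will fetch; write_index
     * is the end of queued audio.  Samples in [read_index, write_index)
     * are queued but still rewritable, which is what allows quick
     * reaction to events despite a large buffer. */
    struct play_buf {
        int16_t  data[PB_SIZE];
        uint64_t read_index;     /* advanced by the playback device */
        uint64_t write_index;    /* advanced by the client */
    };

    /* Write (or rewrite) n samples at absolute position pos.  Fails if
     * the span was already played or lies beyond the buffer window. */
    int pb_rewrite(struct play_buf *b, uint64_t pos,
                   const int16_t *samples, uint64_t n)
    {
        if (pos < b->read_index)                /* already played */
            return -1;
        if (pos + n > b->read_index + PB_SIZE)  /* outside the window */
            return -1;

        for (uint64_t i = 0; i < n; i++)
            b->data[(pos + i) % PB_SIZE] = samples[i];
        if (pos + n > b->write_index)
            b->write_index = pos + n;
        return 0;
    }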

5.2 Zero-Copy Memory Management

Audio data in the PulseAudio sound server is stored in reference-counted memory blocks. Audio data queues contain only references to these memory blocks instead of the audio data itself. This provides the advantage of minimising copying of audio data in memory, and also saves memory. In fact, the PA core is written in a way that, in most cases, data arriving on a network socket is written directly into the sound card DMA hardware buffer without spending time in bounce buffers or similar. This helps to keep memory usage down and allows very low-latency audio processing.

5.3 Shared Memory Data Transfer

Local clients can exchange audio data with a local PulseAudio daemon through shared-memory IPC. Every process allocates a shared memory segment where it stores the audio data it wants to transfer. Then, when the data is sent to another process, the recipient receives only the information necessary to find the data in that segment. The recipient maps the segment of the originator in read-only mode and accesses the data.

The shared memory data transfer is the natural extension of the aforementioned zero-copy memory management to communication between processes.

5.4 Synchronisation

Multiple streams can be synchronised together on the server side. If this is done, it is guaranteed that the playback indexes of these streams never deviate. Clients can label stream channels freely (e.g., "left," "right," "rear-left," "rear-right," and so on). Together with the aforementioned buffering model, this allows implementation of flexible server-side multi-track mixing.

5.5 Buffer Underrun Handling

PulseAudio does its best to ensure that buffer underruns have no influence on the time axis. Two modes are available: in the first mode, playback pauses when a buffer underrun happens. This is the mode that is usually available in audio APIs such as OSS. In the second mode, playback never stops, and if data is available again, enough data is skipped so that the time function experiences no discontinuities.
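A sketch of the second mode's bookkeeping, under assumed names: after an underrun, newly arrived data is trimmed so the stream's time function stays continuous rather than replaying late samples:

    #include <stdint.h>

    /* now_index: where the device clock says playback should be by now.
     * stream_index: how far the stream actually got before the underrun.
     * In "never pause" mode the gap is skipped, not replayed. */
    uint64_t underrun_skip(uint64_t now_index, uint64_t stream_index,
                           const int16_t **data, uint64_t *avail)
    {
        if (now_index > stream_index) {
            uint64_t gap  = now_index - stream_index;
            uint64_t drop = gap < *avail ? gap : *avail;

            *data  += drop;   /* discard samples that should have played */
            *avail -= drop;
        }
        return now_index;     /* playback resumes on the device clock */
    }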

6 Where we are going

While the PulseAudio project in its current state fulfills a large part of the aforementioned requirements, it is not complete yet. Compatibility with many sound systems, a wide range of desktop audio "bling," networked audio, and low-latency behaviour are already available in current versions of PulseAudio. However, we are still lacking in other areas:

• Better low-latency integration into JACK.

• Some further low-latency fixes can be made.

• Better portability to systems which don't support floating-point numbers.

• The client API PulseAudio currently offers is comparatively complicated and difficult to use.

Besides these items, there are also a lot of minor issues to be solved in the PulseAudio project. In the following subsections, I quickly describe the areas we are currently working on.

6.1 Threaded Core

The current PulseAudio core is mostly single-threaded. In most situations this is not a problem, since we carefully make sure that no operation blocks for longer than necessary. However, if more than one audio device is used by a single PulseAudio instance, or when extremely low latencies must be reached, this may become a problem. Thus the PulseAudio core is currently being moved to a more threaded design: every PulseAudio module that is important in low-latency situations will run its own event loop. Communication between those separate event loops, the main event loop, and other local clients is (mostly) done in a wait-free fashion (mostly not lock-free, however). The design we are currently pursuing allows a step-by-step upgrade to this new functionality.

6.2 libsydney

During this year's Foundation of Open Media Software (FOMS) conference in January in Sydney, Australia, the most vocally expressed disappointment in the Linux audio world was the lack of a single well-defined, powerful audio API which fulfills the requirements of a modern audio API as outlined above. Since the current native PulseAudio API is powerful but unfortunately overly complex, we took the opportunity to define a new API at that conference. People from Xiph, Nokia, and I sat down to design a new API.

Of course, it might appear as a paradox to try to fix the balkanisation of Linux audio APIs by adding yet another one, but given the circumstances, and after careful consideration, we decided to pursue this path. This new API was given the name libsydney [8], named after the city we designed the first version of the API in. libsydney will become the only supported API in future PulseAudio versions. Besides working on top of PulseAudio, it will natively support ALSA, OSS, Win32, and Solaris targets. This means that developing a client for PulseAudio will offer cross-platform support for free. libsydney is currently in development; by the time of the OLS conference, an initial public version will be made available.

6.3 Multi-User

Currently the PulseAudio audio server is intended to be run as a session daemon. This becomes a problem if multiple users are logged into a single machine simultaneously. Before PulseAudio can be adopted by modern distributions, some kind of hand-over of the underlying audio devices will need to be implemented to support these multi-user setups properly.

7 Where you can get it

PulseAudio is already available in a large number of distributions, including Fedora, Debian, and Ubuntu. Since it is a drop-in replacement for EsounD, it is trivial to install it and use it as the desktop sound server in GNOME.

Alternatively, you may download a version from our web site [1].

It is planned for PulseAudio to replace EsounD in the default install in the next versions of Fedora and Ubuntu.

8 Who we are

PulseAudio has been and is being developed by Lennart Poettering (Red Hat, Inc.) and Pierre Ossman (Cendio AB).

References

[1] PulseAudio, http://pulseaudio.org/
[2] ALSA, http://alsa-project.org/
[3] JACK Audio Connection Kit, http://jackaudio.org/
[4] EsounD, http://www.tux.org/~ricdude/overview.html
[5] aRts, http://www.arts-project.org/
[6] PortAudio, http://www.portaudio.com/
[7] Open Sound System, http://www.opensound.com/oss.html
[8] libsydney, http://0pointer.de/cgi-bin/viewcvs.cgi/trunk/?root=libsydney
[9] AF, http://tns-www.lcs.mit.edu/vs/audiofile.html

Linux-VServer: Resource Efficient OS-Level Virtualization

Herbert Pötzl Marc E. Fiuczynski [email protected] [email protected]

Abstract

Linux-VServer is a lightweight virtualization system used to create many independent containers under a common Linux kernel. To applications and the user of a Linux-VServer based system, such a container appears just like a separate host.

The Linux-VServer approach to kernel subsystem containerization is based on the concept of context isolation. The kernel is modified to isolate a container into a separate, logical execution context such that it cannot see or impact processes, files, network traffic, global IPC/SHM, etc., belonging to another container.

Linux-VServer has been around for several years and its fundamental design goal is to be an extremely low overhead yet highly flexible production quality solution. It is actively used in situations requiring strong isolation where overall system efficiency is important, such as web hosting centers, server consolidation, high performance clusters, and embedded systems. Linux-VServer is an efficient and flexible solution that is broadly used for both hosting and sandboxing scenarios.

1 Introduction

This paper describes Linux-VServer, which is a virtualization approach that applies context isolation techniques to the Linux kernel in order to create lightweight container instances. Its implementation consists of a separate kernel patch set that adds approximately 17K lines of code to the Linux kernel. Due to its architecture independent nature, it has been validated to work on eight different processor architectures (x86, sparc, alpha, ppc, arm, mips, etc.). While relatively lean in terms of overall size, Linux-VServer touches roughly 460 existing kernel files—representing a non-trivial software-engineering task.

Hosting scenarios such as web hosting centers providing Virtual Private Servers (VPS) and HPC clusters need to isolate different groups of users and their applications from each other. Linux-VServer has been in production use for several years by numerous VPS hosting centers around the world. Furthermore, it has been in use since 2003 by PlanetLab (www.planet-lab.org), which is a geographically distributed research facility consisting of roughly 750 machines located in more than 30 countries. Due to its efficiency, a large number of VPSs can be robustly hosted on a single machine. For example, the average PlanetLab machine today has a 2.4Ghz x86 processor, 1GB RAM, and 100GB of disk space, and typically hosts anywhere from 40-100 live VPSes. Hosting centers typically use even more powerful servers, and it is not uncommon for them to pack 200 VPSes onto a single machine.

Server consolidation is another hosting scenario where isolation between independent application services (e.g., db, dns, web, print) improves overall system robustness. Failures or misbehavior of one service should not impact the performance of another. While hypervisors like Xen and VMware typically dominate the server consolidation space, a Linux-VServer solution may be better when resource efficiency and raw performance are required. For example, an exciting development is that the One Laptop Per Child (OLPC) project has recently decided to use Linux-VServer for their gateway servers that will provide services such as print, web, blog, voip, backup, mesh/wifi, etc., to their $100 laptops at school premises. These OLPC gateway servers will be based on low cost / low power consuming embedded systems hardware, and for this reason a resource efficient solution like Linux-VServer rather than Xen was chosen.

Sandboxing scenarios for generic application plugins are emerging on mobile terminals and web browsers to isolate arbitrary plugins—not just Java applets—downloaded by the user. For its laptops, OLPC has designed a security framework, called bitfrost [1], and has decided to utilize Linux-VServer as its sandboxing solution, isolating the various activities from each other and protecting the main system from all harm.

Of course, Linux-VServer is not the only approach to implementing containers on Linux. Alternatives include non-mainlined solutions such as OpenVZ, Virtuozzo, and Ensim; and there is an active community of developers working on kernel patches to incrementally containerize the mainline kernel. While these approaches differ at the implementation level, they all typically focus on a single point within the overall containerization design spectrum: system virtualization. Benchmarks run on these alternative approaches to Linux-VServer reveal non-trivial time and space overhead, which we believe is fundamentally due to their focus on system virtualization. In contrast, we have found with Linux-VServer that using a context isolation approach to containerize critical kernel subsystems yields negligible overhead compared to a vanilla kernel—and in most cases the performance of Linux-VServer and Linux is indistinguishable.

This paper has two goals: 1) serve as a gentle introduction to container-based systems for the general Linux community, and 2) highlight both the benefits (and drawbacks) of our context isolation approach to kernel containerization. The next section presents a high-level overview of container-based systems and then describes the Linux-VServer approach in further detail. Section 3 evaluates the efficiency of Linux-VServer. Finally, Section 4 offers some concluding remarks.

2 Linux-VServer Design Approach

This section provides an overview of container-based systems, describes the general techniques used to achieve isolation, and presents the mechanisms with which Linux-VServer implements these techniques.

2.1 Container-based System Overview

At a high level, a container-based system provides a shared, virtualized OS image, including a unique root file system, a set of system executables and libraries, and resources (cpu, memory, storage, etc.) assigned to the container when it is created. Each container can be "booted" and "shut down" just like a regular operating system, and "rebooted" in only seconds when necessary.

To applications and the user of a container-based system, the container appears just like a separate Linux system.

[Figure 1: Container-based Platform. The hosting platform comprises the shared OS image and a privileged host context; the virtual platform comprises guest containers 1..N, each with its own admin view of /dev, /usr, /home, /proc, etc., running host services such as Apache, MySQL, PostgreSQL, and PHP.]

Figure 1 depicts a container-based system, which is comprised of two basic platform groupings. The hosting platform consists essentially of the shared OS image and a privileged host context. This is the context that a system administrator uses to manage containers. The virtual platform is the view of the system as seen by the guest containers. Applications running in a guest container work just as they would on a corresponding non-container-based system. That is, they have their own root file system, IP addresses, /dev, /proc, etc.

The subsequent sections will focus on the main contributions that Linux-VServer makes, rather than exhaustively describing all required kernel modifications.

2.2 Kernel Containerization

This section describes the kernel subsystem enhancements that Linux-VServer makes to support containers. These enhancements are designed to be low overhead and flexible, as well as to enhance security in order to properly confine applications into a container. For CPU scheduling, Linux-VServer introduces a novel filtering technique in order to support fair-share, work-conserving, or hard-limit container scheduling. However, in terms of managing system resources such as storage space and I/O bandwidth, for Linux-VServer it is mostly an exercise of leveraging existing Linux resource management and accounting facilities.

2.2.1 Guest Filesystems

Guest container filesystems could either be implemented using loop-back mounted images—as is typical for qemu, xen, vmware, and uml based systems—or by simply using the native file systems and chroot. The main benefit of using a chroot-ed filesystem over loop-back mounted images is performance: guests can read/write files at native filesystem speed. However, there are two drawbacks: 1) chroot() information is volatile and therefore only provides weak confinement, and 2) chroot-ed filesystems may lead to significant duplication of common files. Linux-VServer addresses both of these problems, as one of its central objectives is to support containers in a resource efficient manner that performs as well as native Linux.

Filesystem Chroot Barrier

Because chroot() information is volatile, it is simple to escape from a chroot-ed environment, which would be bad from a security perspective when one wants to maintain the invariant that processes are confined within their container's filesystem. This invariant is nice to have when using containers for generic hosting scenarios, but is clearly required for sandboxing scenarios. To appreciate how easy it is to escape conventional chroot() confinement, consider the following three simple steps: a) create or open a file and retain the file descriptor, b) chroot into a subdirectory at an equal or lower level with regards to the file, which causes the 'root' to be moved 'down' in the filesystem, and then c) use fchdir() on the file descriptor to escape from that 'new' root, which lets the process escape from the 'old' root as well, as this was lost in the last chroot() system call.

To address this problem Linux-VServer uses a special file attribute called the chroot barrier. The above trick does not work when this barrier is set on the root directory of a chroot-ed filesystem, as it prevents unauthorized modification and escape from the chroot confinement.
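Steps a) through c) translate almost directly into code. This is the classic escape sequence (it requires CAP_SYS_CHROOT inside the container; the chroot barrier defeats exactly this trick):

    #include <fcntl.h>
    #include <sys/stat.h>
    #include <unistd.h>

    /* Classic chroot() escape, following steps a) through c) above. */
    int escape_chroot(void)
    {
        int fd = open(".", O_RDONLY);   /* a) keep a handle on the cwd */

        if (fd < 0)
            return -1;
        mkdir("subdir", 0700);
        if (chroot("subdir")) {         /* b) move the root further down */
            close(fd);
            return -1;
        }
        if (fchdir(fd)) {               /* c) cwd is now outside the root */
            close(fd);
            return -1;
        }
        close(fd);
        for (int i = 0; i < 256; i++)   /* climb to the real root */
            chdir("..");
        return chroot(".");             /* reinstall the root at the top */
    }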

Filesystem Unification

To appreciate the second drawback mentioned above, consider that systems with dozens or maybe even hundreds of containers based on the same Linux distribution will unnecessarily duplicate many common files. This duplication occurs simply to maintain a degree of separation between containers, as it would be difficult using conventional Linux filesystem techniques to ensure safe sharing of files between containers.

To address this problem Linux-VServer implements a disk space saving technique by using a simple unification technique applied to whole files. The basic approach is that files common to more than one container, which are rarely going to change (e.g., libraries and binaries from similar OS distributions), can be hard linked on a shared filesystem. This is possible because the guest containers can safely share filesystem objects (inodes).

The only drawback with hard linking files is that, without additional measures, a container could (un)intentionally destroy or modify such shared files, which in turn would harm or interfere with other containers. This can easily be addressed by adding an immutable attribute to the file, which then can be safely shared between two containers. In order to ensure that a container cannot modify such a file directly, the Linux capability to modify this attribute is removed from the set of capabilities given to a guest container.

However, removing or updating a file with the immutable attribute set from inside a guest container would be impossible. To remove the file, the additional "permission to unlink" attribute needs to be set. With this alone, an application running inside a container could manually implement a poor man's CoW system by: copying the original file, making modifications to the copy, unlinking the original file, and renaming the copy to the original filename.

This technique was actually used by older Linux-VServer based systems, but it caused some incompatibilities with programs that make in-place modifications to files. To address this problem, Linux-VServer introduced CoW link breaking, which treats shared hard-linked files as copy-on-write (CoW) candidates. When a container attempts to mutate a CoW marked file, the kernel will create a private copy of the file for the container.

Such CoW marked files belonging to more than one container are called 'unified,' and the process of finding common files and preparing them in this way is called Filesystem Unification. Unification is done as an out-of-band operation by a process run in the root container—typically a cron job that intelligently walks all of the containers' filesystems looking for identical files to unify.

The principal reason for doing filesystem unification is reduced resource consumption, not simplified administration. While a typical Linux distribution install will consume about 500MB of disk space, our experience is that after unification the incremental disk space required when creating a new container based on the same distribution is on the order of a few megabytes.

It is straightforward to see that this technique reduces required disk space, but probably more importantly it improves memory mappings for shared libraries, and reduces inode caches, slab memory for kernel structures, etc. Section 3.1 quantifies these benefits using a real world example.
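The per-file unify operation can be pictured with standard Linux interfaces: replace the duplicate with a hard link to the shared copy, then set the immutable flag through the FS_IOC_SETFLAGS ioctl. This is only a sketch run from the host context; the identical-content check, error recovery, and handling of the unlink-permission attribute are elided:

    #include <fcntl.h>
    #include <linux/fs.h>
    #include <sys/ioctl.h>
    #include <unistd.h>

    /* Unify one duplicate: 'dup' becomes a hard link to 'shared', and
     * the shared inode is made immutable.  Guests lack the capability
     * to clear the flag, so they cannot modify the file in place. */
    int unify_file(const char *shared, const char *dup)
    {
        int fd, flags;

        if (unlink(dup) || link(shared, dup))
            return -1;

        fd = open(shared, O_RDONLY);
        if (fd < 0)
            return -1;
        if (ioctl(fd, FS_IOC_GETFLAGS, &flags) == 0) {
            flags |= FS_IMMUTABLE_FL;
            ioctl(fd, FS_IOC_SETFLAGS, &flags);
        }
        close(fd);
        return 0;
    }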

2.2.2 Process Isolation

Linux-VServer uses the global PID space across all containers. Its approach is to hide all processes outside a container's scope, and prohibit any unwanted interaction between a process inside a container and a process belonging to another container. This separation requires the extension of some existing kernel data structures in order for them to: a) become aware of which container they belong to, and b) differentiate between identical UIDs used by different containers. To work around false assumptions made by some user-space tools (like pstree) that the init process has to exist and have PID 1, Linux-VServer also provides a per-container mapping from an arbitrary PID to a fake init process with PID 1.

When a Linux-VServer based system boots, by default all processes belong to the host context. To simplify system administration, this host context acts like a normal Linux system and doesn't expose any details about the guests, except for a few proc entries. However, to allow for a global process view, Linux-VServer defines a special spectator context that can peek at all processes at once. Both the host and spectator contexts are only logical containers—i.e., unlike guest containers, they are not implemented by kernel datastructures.

A side effect of this approach is that process migration from one container to another container on the same host is achieved by changing its container association and updating the corresponding per-container resource usage statistics such as NPROC, NOFILE, RSS, ANON, MEMLOCK, etc.

The benefit of this isolation approach for the process abstraction is twofold: 1) it scales well with a large number of contexts, and 2) most critical-path logic manipulating processes and PIDs remains unchanged. The drawback is that one cannot as cleanly implement container migration, checkpoint, and resume, because it may not be possible to re-instantiate processes with the same PID. To overcome this drawback, alternative container-based systems virtualize the PID space on a per-container basis.

We hope to positively influence the proposed kernel mainlining of containerized PID space support such that, depending on the usage scenario, it is possible to choose Linux-VServer style isolation, virtualization, or a hybrid thereof.

2.2.3 Network Isolation

Linux-VServer does not fully virtualize the networking subsystem. Rather, it shares the networking subsystem (route tables, IP tables, etc.) between all containers, but restricts containers to binding sockets to a subset of host IPs specified either at container creation or dynamically by the host administrator. This has the drawback that it does not let containers change their route table entries or IP tables rules. However, it was a deliberate design decision, as it inherently lets Linux-VServer containers achieve native networking performance.

For Linux-VServer's network isolation approach several issues have to be considered; for example, the fact that bindings to special addresses like IPADDR_ANY or the local host address have to be handled to avoid having one container receive or snoop traffic belonging to another container. The approach to get this right involves tagging packets with the appropriate container identifier and incorporating the appropriate filters in the networking stack to ensure only the right container can receive them. Extensive benchmarks reveal that the overhead of this approach is minimal, as high-speed networking performance is indistinguishable between a native Linux system and one enhanced with Linux-VServer, regardless of the number of concurrently active containers.

In contrast, the best network L2 or L3 virtualization approaches as implemented in alternative container-based systems impose significant CPU overhead when scaling the number of concurrent, high-performance containers on a system. While network virtualization is a highly flexible and nice feature, our experience is that it is not required for all usage scenarios. For this reason, we believe that our network isolation approach should be a feature that high-performance containers should be permitted to select at run time.
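Conceptually, the bind-time filter is just a lookup of the requested address in the container's permitted set. The structure and the IPADDR_ANY policy below are simplified assumptions for illustration, not the actual kernel patch:

    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical per-container network context: the subset of host
     * IPv4 addresses this container may bind. */
    struct net_ctx {
        uint32_t addr[4];    /* permitted addresses, network byte order */
        unsigned naddr;
    };

    /* Bind-time check.  One simple policy for IPADDR_ANY (address 0)
     * is to rewrite it to an address the container owns, so the socket
     * can never receive traffic destined for a neighbouring container. */
    bool ctx_bind_allowed(const struct net_ctx *ctx, uint32_t *bind_addr)
    {
        if (*bind_addr == 0) {           /* IPADDR_ANY */
            if (ctx->naddr == 0)
                return false;
            *bind_addr = ctx->addr[0];   /* collapse to an owned address */
            return true;
        }
        for (unsigned i = 0; i < ctx->naddr; i++)
            if (ctx->addr[i] == *bind_addr)
                return true;
        return false;                    /* not one of ours: reject */
    }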

Again, we hope to positively influence the proposed kernel mainlining of network containerization support such that, depending on the usage scenario, it is possible to choose Linux-VServer style isolation, virtualization, or a hybrid thereof.

2.2.4 CPU Isolation

Linux-VServer implements CPU isolation by overlaying a token bucket scheme on top of the standard Linux CPU scheduler. Each container has a token bucket that accumulates tokens at a specified rate; every timer tick, the container that owns the running process is charged one token. A container that runs out of tokens has its processes removed from the run-queue until its bucket accumulates a minimum amount of tokens. This token bucket scheme can be used to provide fair sharing and/or work-conserving CPU reservations. It can also enforce hard limits (i.e., an upper bound), as is popularly used by VPS hosting centers to limit the number of cycles a container can consume—even when the system has idle cycles available.

The rate at which tokens accumulate in a container's bucket depends on whether the container has a reservation and/or a share. A container with a reservation accumulates tokens at its reserved rate: for example, a container with a 10% reservation gets 100 tokens per second, since a token entitles it to run a process for one millisecond. A container with a share that has runnable processes will be scheduled before the idle task is scheduled, and only when all containers with reservations have been honored. The end result is that the CPU capacity is effectively partitioned between the two classes of containers: containers with reservations get what they've reserved, and containers with shares split the unreserved capacity of the machine proportionally. Of course, a container can have both a reservation (e.g., 10%) and a fair share (e.g., 1/10 of idle capacity).
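A sketch of the accounting the token bucket scheme describes; the field names are assumptions and the in-kernel implementation differs:

    #include <stdbool.h>

    /* Per-container CPU token bucket. */
    struct cpu_bucket {
        int tokens;       /* current fill level */
        int fill_rate;    /* tokens added per refill interval */
        int interval;     /* refill interval, in timer ticks */
        int tokens_min;   /* fill level needed to run again after empty */
        int tokens_max;   /* bucket capacity */
    };

    /* Charged once per timer tick to the container owning the running
     * process. */
    void bucket_charge(struct cpu_bucket *b)
    {
        if (b->tokens > 0)
            b->tokens--;
    }

    /* Periodic refill; returns true if the container's processes may
     * be kept on (or returned to) the run-queue. */
    bool bucket_refill(struct cpu_bucket *b, int ticks)
    {
        b->tokens += (ticks * b->fill_rate) / b->interval;
        if (b->tokens > b->tokens_max)
            b->tokens = b->tokens_max;
        return b->tokens >= b->tokens_min;
    }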
2.2.5 Network QoS

The Hierarchical Token Bucket (htb) queuing discipline of the Linux Traffic Control facility (tc) [2] can be used to provide network bandwidth reservations and fair service. For containers that have their own IP addresses, the htb kernel support just works without modifications.

However, when containers share an IP address, as is done by PlanetLab, it is necessary to track packets in order to apply a queuing discipline to a container's flow of network traffic. This is accomplished by tagging packets sent by a container with its context id in the kernel. Then, for each container, a token bucket is created with a reserved rate and a share: the former indicates the amount of outgoing bandwidth dedicated to that container, and the latter governs how the container shares bandwidth beyond its reservation. The htb queuing discipline then allows each container to send packets at the reserved rate of its token bucket, and fairly distributes the excess capacity to other containers in proportion to their shares. Therefore, a container can be given a capped reservation (by specifying a reservation but no share), "fair best effort" service (by specifying a share with no reservation), or a work-conserving reservation (by specifying both).

2.2.6 Disk QoS

Disk I/O is managed in Linux-VServer using the standard Linux CFQ ("completely fair queuing") I/O scheduler. The CFQ scheduler attempts to divide the bandwidth of each block device fairly among the containers performing I/O to that device.

2.2.7 Storage Limits

Linux-VServer provides the ability to associate limits with the amount of memory and disk storage a container can acquire. For disk storage, one can specify limits on the maximum number of disk blocks and inodes a container can allocate. For memory, a variety of different limits can be set, controlling the Resident Set Size and Virtual Memory assigned to each context.

Note that fixed upper bounds on RSS are not appropriate for usage scenarios where administrators wish to overbook containers. In this case, one option is to let containers compete for memory, and use a watchdog daemon to recover from overload cases—for example, by killing the container using the most physical memory. PlanetLab [3] is one example where memory is a particularly scarce resource, and memory limits without overbooking are impractical: given that there are up to 90 active containers on a PlanetLab server, this would imply a tiny 10MB allocation for each container on the typical PlanetLab server with 1GB of memory. Instead, PlanetLab provides basic memory isolation between containers by running a simple watchdog daemon, called pl_mom, which resets the container consuming the most physical memory when swap has almost filled. This penalizes the memory hog while keeping the system running for everyone else, and is effective for the workloads that PlanetLab supports. A similar technique is apparently used by managed web hosting companies.

3 Evaluation

In a prior publication [5], we compared in further detail the performance of Linux-VServer with both vanilla Linux and Xen 3.0, using lmbench, iperf, and dd as microbenchmarks, and kernel compile, dbench, postmark, and osdb as synthetic macrobenchmarks. Our results from that paper revealed that Linux-VServer has in the worst case a 4 percent overhead when compared to an unvirtualized, vanilla Linux kernel; however, in most cases Linux-VServer is nearly identical in performance, and in a few lucky cases—due to gratuitous cache effects—Linux-VServer outperforms the vanilla kernel. Please consult our other paper [5] for these details.

This section explores the efficiency of Linux-VServer. We refer to the combination of scale and performance as the efficiency of the system, since these metrics correspond directly to how well the virtualizing system orchestrates the available physical resources for a given workload. All experiments are run on HP Proliant servers with dual core processors, 2MB caches, 4GB RAM, and 7.2k RPM SATA-100 disks.

3.1 Filesystem Unification

As discussed in Section 2.2.1, Linux-VServer supports a unique filesystem unification model. The reason for doing filesystem unification is to reduce disk space consumption, but more importantly, it reduces system resource consumption. We use a real world example to demonstrate this benefit from filesystem unification.

We configure guest containers with Mandriva 2007. The disk footprint of a non-unified installation is 150MB per guest. An activated guest runs a complement of daemons and services such as syslog, crond, sshd, apache, postfix and postgresql—a typical configuration used in VPS hosting centers.

We evaluate the effectiveness of filesystem unification using two tests. The first consists of starting 200 separate guests one after the other, measuring memory consumption. The second test is identical to the first, except before all guests are started, their filesystems are unified. For the latter test, the disk footprint of each unified guest reduces from 150MB to 10MB, resulting in 140MB of common and thus shared data on disk.

Tables 1 and 2 summarize the results for this test, showing how much time it takes for each guest to start up and how much memory it consumes, categorized by memory type such as active, buffer, cache, slab, etc. What these columns in the two tables reveal is that the kernel inherently shares memory for shared libraries, binaries, etc., due to the unification (i.e., hard linking) of files.

Table 3 compares the rows with the 200th guest, which shows the difference and overhead percentage in consumed memory, as well as the time required to start the same number of guest containers. These differences are significant!

Of course, techniques exist to recoup redundant memory resources (e.g., VMware's content-based memory sharing used in its ESX product line [6]). However, such techniques require the system to actively seek out redundant memory by computing hash keys over page data, etc., which introduces non-trivial overhead. In contrast, with the filesystem unification approach we inherently obtain this benefit.

As a variation of our prior experiment, we have also measured starting the 200 guest containers in parallel. The results for this experiment are shown in Figure 2. The first thing to note in the figure is that the parallel startup of 200 separate guests causes the machine to spend most of the very long startup (approx. 110 min) paging data in and out (libraries, executables, shared files), which causes the CPU to hang in iowait most of the time (the light blue area in the CPU graphs), rendering the system almost unresponsive. In contrast, the same startup with 200 unified guests is rather fast (approx. 20 min), and most of the startup time is spent on actual guest processes.

3.2 Networking

We also evaluated the efficiency of network operations by comparing two 2.6.20 based kernels: one unmodified, and one patched with Linux-VServer 2.2.

Guest   Time   Active    Buffers   Cache     Anon     Mapped   Slab     Recl.    Unrecl.
001        0     16364      2600     20716     4748     3460     8164     2456     5708
002        7     30700      3816     42112     9052     8200    11056     3884     7172
003       13     44640      4872     62112    13364    12872    13248     5268     7980
...
198     1585   2093424    153400   2399560   849696   924760   414892   246572   168320
199     1593   2103368    151540   2394048   854020   929660   415300   246324   168976
200     1599   2113004    149272   2382964   858344   934336   415528   245896   169632

Table 1: Memory Consumption—Separate Guests

Guest   Time   Active    Buffers   Cache     Anon     Mapped   Slab     Recl.    Unrecl.
001        0     16576      2620     20948     4760     3444     8232     2520     5712
002       10     31368      4672     74956     9068     8140    12976     5760     7216
003       14     38888      5364    110508    13368     9696    16516     8360     8156
...
198     1304   1172124     88468   2492268   850452   307596   384560   232988   151572
199     1313   1178876     88896   2488476   854840   309092   385384   233064   152320
200     1322   1184368     88568   2483208   858988   310640   386256   233388   152868

Table 2: Memory Consumption—Unified Guests

Attribute   Difference   Overhead
Time             277 s     21.0 %
Active        928848 k     79.5 %
Buffers        60724 k     70.7 %
Cache         100012 k     -4.2 %
Anon             632 k      0.0 %
Mapped        623680 k    203.0 %
Slab           29340 k      7.8 %
Recl.          12572 k      5.4 %
Unrecl.        16768 k     11.4 %

Table 3: Overhead Unified vs. Separate

[Figure 2 here consists of RRDtool time-series graphs for the two startup runs (separate guests on the left, unified guests on the right), plotting running/sleeping task counts, CPU time (system, user, iowait), memory usage (slab, mapped, buffers, active), and paging activity (pagein, pageout, activate, deactivate).]
Figure 2: Parallel Startup of 200 Guests—Separate (left) vs. Unified (right)

Two sets of experiments were conducted: the first measuring the throughput of packets sent and received over the network for a CPU bound workload, and the second assessing the cost of using multiple heavily loaded Linux-VServer containers concurrently. While the former evaluates the ability of a single container to saturate a high-speed network link, the latter measures the efficiency of Linux-VServer's utilization of the underlying hardware. In both experiments, fixed-size UDP packets were exchanged between containers and a remote system on the local network. HP Proliant servers equipped with 2.4 GHz Xeon 3060 processors were used for both the sender and receiver, connected through a Gigabit network.

The first set of experiments demonstrated that the performance of Linux-VServer is equivalent to that of vanilla Linux. This is in line with our prior iperf-based TCP throughput results [5].

The second set of experiments involved continually increasing the number of clients confined in Linux-VServer based containers sending UDP packets as in the previous experiment. We made two observations: 1) the CPU utilization of Linux-VServer containers for packet sizes with which the network was saturated was marginally higher (77% as opposed to 72%), and 2) the increase in the CPU utilization of Linux-VServer, and the threshold beyond which it saturated the CPU, was identical to that of native Linux as containers were added.

These experiments suggest that for average workloads, the degradation of performance using the Linux-VServer network isolation approach is marginal. Furthermore, Linux-VServer can scale to multiple concurrent containers exchanging data at high rates with performance comparable to native Linux.

3.3 CPU Fair Share and Reservations

To investigate both CPU isolation of a single resource and resource guarantees, we use a combination of CPU-intensive tasks. Hourglass is a synthetic real-time application useful for investigating scheduling behavior at microsecond granularity [4]. It is CPU-bound and involves no I/O.

Eight containers are run simultaneously. Each container runs an instance of hourglass, which records the contiguous periods of time during which it is scheduled. Because hourglass uses no I/O, we may infer from the gaps in its time-line that either another container is running or the virtualized system is running on behalf of another container, in a context switch for instance. The aggregate CPU time recorded by all tests is within 1% of system capacity.

We evaluated two experiments: 1) all containers are given the same fair share of CPU time, and 2) one of the containers is given a reservation of 1/4th of overall CPU time. For the first experiment, VServer on both UP and SMP systems does a good job of scheduling the CPU among the containers, such that each receives approximately one eighth of the available time.

For the second experiment, we observe that the CPU scheduler for Linux-VServer achieves the requested reservation within 1%. Specifically, the container having requested 1/4th of overall CPU time receives 25.16% and 49.88% on UP and SMP systems, respectively. The remaining CPU time is fairly shared amongst the other seven containers.
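The hourglass technique (recording contiguous scheduled periods by watching for jumps in the clock) can be approximated in a few lines of user code. This sketch merely illustrates the idea; it is not the hourglass source:

    #include <stdio.h>
    #include <time.h>

    #define GAP_NS 100000L  /* report gaps longer than 100 us */

    static long ns_diff(struct timespec a, struct timespec b)
    {
            return (b.tv_sec - a.tv_sec) * 1000000000L
                   + (b.tv_nsec - a.tv_nsec);
    }

    int main(void)
    {
            struct timespec prev, now;

            clock_gettime(CLOCK_MONOTONIC, &prev);
            for (;;) {
                    clock_gettime(CLOCK_MONOTONIC, &now);
                    /* a large jump means we were scheduled out */
                    if (ns_diff(prev, now) > GAP_NS)
                            printf("descheduled for %ld ns\n",
                                   ns_diff(prev, now));
                    prev = now;
            }
            return 0;
    }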

4 Conclusion

Virtualization technology in general benefits a wide variety of usage scenarios. It promises such features as configuration independence, software interoperability, better overall system utilization, and resource guarantees. This paper described the Linux-VServer approach to providing these features while balancing the tension between strong isolation of co-located containers and efficient sharing of the physical resources on which the containers are hosted.

Linux-VServer maintains a small kernel footprint, but it is not yet feature complete, as it lacks support for true network virtualization and container migration. These are features that ease management and draw users to hypervisors such as Xen and VMware, particularly in the server consolidation and hosting scenarios. There is an active community of developers working towards adding these features to the mainline Linux kernel, which we expect will be straightforward to integrate with Linux-VServer.

In the meantime, for managed web hosting, PlanetLab, the OLPC laptop and gateway server, embedded systems, etc., the trade-off between isolation and efficiency is of paramount importance. We believe that Linux-VServer hits a sweet spot in the containerization design space, as it provides strong isolation and performs equally with native Linux kernels in most cases.

References

[1] Ivan Krstic. System security on the One Laptop per Child’s XO laptop: the Bitfrost security platform. http://wiki.laptop.org/go/Bitfrost.

[2] Linux Advanced Routing and Traffic Control. http://lartc.org/.

[3] Larry Peterson, Andy Bavier, Marc E. Fiuczynski, and Steve Muir. Experiences building PlanetLab. In Proceedings of the 7th USENIX Symposium on Operating System Design and Implementation (OSDI '06), Seattle, WA, November 2006.

[4] John Regehr. Inferring scheduling behavior with hourglass. In Proceedings of the Freenix Track of the 2002 USENIX Annual Technical Conference, June 2002.

[5] Stephen Soltesz, Herbert Pötzl, Marc E. Fiuczynski, Andy Bavier, and Larry Peterson. Container-based operating system virtualization: A scalable, high-performance alternative to hypervisors. In Proc. 2nd EUROSYS, Lisboa, Portugal, March 2007.

[6] Carl Waldspurger. Memory resource management in VMware ESX Server. In Proc. 5th OSDI, Boston, MA, Dec 2002.

Internals of the RT Patch

Steven Rostedt
Red Hat, Inc.
[email protected]
[email protected]

Darren V. Hart
IBM Linux Technology Center
[email protected]

Abstract

Steven Rostedt ([email protected])

Over the past few years, Ingo Molnar and others have worked diligently to turn the Linux kernel into a viable Real-Time platform. This work is kept in a patch that is held on Ingo's page of the Red Hat web site [7], and is referred to in this document as the RT patch. As the RT patch is reaching maturity, and slowly slipping into the upstream kernel, this paper takes you into the depths of the RT patch and explains exactly what is going on. It explains Priority Inheritance, the conversion of Interrupt Service Routines into threads, and the transformation of spin_locks into mutexes, and why this all matters. This paper is directed toward kernel developers who may eventually need to understand Real-Time (RT) concepts, and it aims to help them deal with design and development changes as the mainline kernel heads towards a full-fledged Real-Time Operating System (RTOS). This paper will offer some advice to help them avoid pitfalls that may come as the mainline kernel comes closer to an actual RTOS.

The RT patch has not only been beneficial to those in the Real-Time industry; many improvements to the mainline kernel have come out of the RT patch. Some of these improvements range from race conditions that were fixed to the reimplementation of major infrastructures.1 The cleaner the mainline kernel is, the easier it is to convert it to an RTOS. When a change is made to the RT patch that is also beneficial to the mainline kernel, those changes are sent as patches to be incorporated into mainline.

1 The Purpose of a Real-Time Operating System

The goal of a Real-Time Operating System is to create a predictable and deterministic environment. The primary purpose is not to increase the speed of the system, or lower the latency between an action and response, although both of these increase the quality of a Real-Time Operating System. The primary purpose is to eliminate "surprises." A Real-Time system gives control to the user such that they can develop a system in which they can calculate the actions of the system under any given load with deterministic results. Increasing performance and lowering latencies help in this regard, but they are only secondary to deterministic behavior. A common misconception is that an RTOS will improve throughput and overall performance. A quality RTOS still maintains good performance, but an RTOS will sacrifice throughput for predictability.

To illustrate this concept, let's take a look at a hypothetical algorithm that, on a non-Real-Time Operating System, can complete some calculation in 250 microseconds on average. An RTOS on the same machine may take 300 microseconds for that same calculation. The difference is that an RTOS can guarantee that the worst case time to complete the calculation is known in advance, and that the time to complete the calculation will not go above that limit.2 The non-RTOS can not guarantee a maximum upper limit on the time to complete that algorithm. The non-RTOS may perform it in 250 microseconds 99.9% of the time, but 0.1% of the time, it might take 2 milliseconds to complete. This is totally unacceptable for an RTOS, and may result in system failure. For example, that calculation may determine if a device driver needs to activate some trigger that must be set within 340 microseconds or the machine will lock up.

1 such as hrtimers and generic IRQs
2 when performed by the highest priority thread.

So we see that a non-RTOS may have a better average performance than an RTOS, but an RTOS guarantees to meet its execution time deadlines.

The above demonstrates an upper bound requirement for completing a calculation. An RTOS must also implement the requirement of response time. For example, a system may have to react to an asynchronous event. The event may be caused by an external stimulus (hitting a big red button) or something that comes from inside the system (a timer interrupt). An RTOS can guarantee a maximum response time from the time the stimulus occurs to the time the reaction takes place.

1.1 Latencies

The time between when an event is expected to occur and the time it actually does is called latency. The event may be an external stimulus that wants a response, or a thread that has just woken up and needs to be scheduled. The following are the different kinds and causes of latencies; these terms will be used later in this paper.

• Interrupt Latency — The time between an interrupt triggering and when it is actually serviced.

• Wakeup Latency — The time between the highest priority task being woken up and the time it actually starts to run. This also can be called Scheduling Latency.

• Priority Inversion — The time a high priority thread must wait for a resource owned by a lower priority thread.

• Interrupt Inversion — The time a high priority thread must wait for an interrupt to perform a task that is of lower priority.

Interrupt latency is the easiest to measure since it corresponds tightly to the time interrupts are disabled. Of course, there is also the time that it takes to make it to the actual service routine, but that is usually a constant value.3 The duration between the waking of a high priority process and it actually running is also a latency. This sometimes includes interrupt latency, since the waking of a process is usually due to some external event.

Priority inversion is not itself a latency, but the effect of priority inversion causes latency. The amount of time a thread must wait on a lower priority thread is the latency due to priority inversion. Priority inversion can not be prevented, but an RTOS must prevent unbounded priority inversion. There are several methods to address unbounded priority inversion, and Section 6 explains the method used by the RT patch.

Interrupt inversion is a type of priority inversion where a thread waits on an interrupt handler servicing a lower priority task. What makes this unique is that a thread is waiting on an interrupt context that can not be preempted, as opposed to a thread that can be preempted and scheduled out. Section 2 explains how threaded interrupts address this issue.

2 Threaded Interrupts

As mentioned in Section 1.1, one of the causes of latency involves interrupts servicing lower priority tasks. A high priority task should not be greatly affected by a low priority task, for example, one doing heavy disk IO. With the normal interrupt handling in the mainline kernel, the servicing of devices like hard-drive interrupts can cause large latencies for all tasks. The RT patch uses threaded interrupt service routines to address this issue.

When a device driver requests an IRQ, a thread is created to service this interrupt line.4 Only one thread can be created per interrupt line. Shared interrupts are still handled by a single thread. The thread basically performs the following:

    while (!kthread_should_stop()) {
            set_current_state(TASK_INTERRUPTIBLE);
            do_hardirq(desc);
            cond_resched();
            schedule();
    }

3 except with the RT kernel, see Section 2.
4 See kernel/irq/manage.c, do_irqd.
5 See arch//kernel/irq.c. (May be different in some architectures.)

Here's the flow that occurs when an interrupt is triggered: the architecture function do_IRQ()5 calls one of the following chip handlers:

• handle_simple_irq

• handle_level_irq

• handle_fasteoi_irq

• handle_edge_irq

• handle_percpu_irq

Each of these sets the IRQ descriptor's status flag IRQ_INPROGRESS, and then calls redirect_hardirq().

redirect_hardirq() checks if threaded interrupts are enabled, and if the current IRQ is threaded (the IRQ flag IRQ_NODELAY is not set), then the associated thread (do_irqd) is awakened. The interrupt line is masked and the interrupt exits. The cause of the interrupt has not been handled yet, but since the interrupt line has been masked, that interrupt will not trigger again. When the interrupt thread is scheduled, it will handle the interrupt, clear the IRQ_INPROGRESS status flag, and unmask the interrupt line.

The interrupt priority inversion latency is only the time from the triggering of the interrupt, through the masking of the interrupt line and the waking of the interrupt thread, to returning back to the interrupted code, which takes a few microseconds on a modern computer system. With the RT patch, a thread may be given a higher priority than a device handler interrupt thread, so when the device triggers an interrupt, the interrupt priority inversion latency is only the masking of the interrupt line and the waking of the interrupt thread that will handle that interrupt. Since the high priority thread may be of a higher priority than the interrupt thread, the high priority thread will not have to wait for the device handler that caused that interrupt.

2.1 Hard IRQs That Stay Hard

It is important to note that there are cases where an interrupt service routine is not converted into a thread. The most notable example of this is the timer interrupt. The timer interrupt is handled in true interrupt context, and is not serviced by a thread. This makes sense, since the timer interrupt controls the triggering of time events, such as the scheduling of most threads.

A device can also specify that its interrupt handler shall be a true interrupt by setting the interrupt descriptor flag IRQ_NODELAY. This will force the interrupt handler to run in interrupt context and not as a thread. Also note that an IRQ_NODELAY interrupt can not be shared with threaded interrupt handlers. The only time that IRQ_NODELAY should be used is if the handler does very little and does not grab any spin_locks. If the handler acquires spin_locks, it will crash the system in full CONFIG_PREEMPT_RT mode.6

It is recommended never to use the IRQ_NODELAY flag unless you fully understand the RT patch. The RT patch takes advantage of the fact that interrupt handlers run as threads, and allows code that is used by interrupt handlers, which would normally never schedule, to schedule.

2.2 Soft IRQs

Not only do hard interrupts run as threads, but all soft IRQs do as well. In the current mainline kernel,7 soft IRQs are usually handled on exit of a hard interrupt. They can happen anytime interrupts and soft IRQs are enabled. Sometimes, when a large number of soft IRQs need to be handled, they are pushed off to the ksoftirqd thread to complete them. But a soft IRQ handler can not assume that it will be running in a threaded context.

In the RT patch, the soft IRQs are only handled in a thread. Furthermore, they are split amongst several threads: each soft IRQ has its own thread to handle it. This way, the system administrator can control the priority of individual soft IRQ threads.

6 see Section 4.
7 2.6.21 as of the time of this writing.

Here's a snapshot of soft IRQ and hard IRQ threads, using ps -eo pid,pri,rtprio,cmd:

    PID PRI RTPRIO CMD
      4  90     50 [softirq-high/0]
      5  90     50 [softirq-timer/0]
      6  90     50 [softirq-net-tx/]
      7  90     50 [softirq-net-rx/]
      8  90     50 [softirq-block/0]
      9  90     50 [softirq-tasklet]
     10  90     50 [softirq-sched/0]
     11  90     50 [softirq-hrtimer]
     12  90     50 [softirq-rcu/0]
    304  90     50 [IRQ-8]
    347  90     50 [IRQ-15]
    381  90     50 [IRQ-12]
    382  90     50 [IRQ-1]
    393  90     50 [IRQ-4]
    400  90     50 [IRQ-16]
    401  90     50 [IRQ-18]
    402  90     50 [IRQ-17]
    413  90     50 [IRQ-19]

3 Kernel Preemption

Linux, prior to the 2.5 kernel, was a non-preemptive kernel. That means that whenever a thread was running in kernel context (a user application making a system call), that thread would not be scheduled out unless it volunteered to schedule (i.e., called the scheduler function). In the development of the 2.5 kernel, kernel preemption was introduced [2]. Robert Love realized that the critical sections that are protected by spin_locks for SMP systems are also the same sections that must be protected from preemption. Love modified the kernel to allow preemption even when a thread is in kernel context. Love used the spin_locks to mark the critical sections that must disable preemption.8

The 2.6 Linux kernel has an option to enable kernel preemption. Kernel preemption has improved reaction time and lowered latencies. Although kernel preemption has brought Linux one step closer to an RTOS, Love's implementation contains a large bottleneck. A high priority process must still wait on a lower priority process while it is in a critical section, even if that same high priority process did not need to access that section.
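To make Love's observation concrete, here is a minimal sketch (illustrative, not kernel source; the lock and data names are hypothetical) of how a spin_lock section doubles as a preemption-disabled region on a CONFIG_PREEMPT mainline kernel:

    #include <linux/spinlock.h>

    static DEFINE_SPINLOCK(counter_lock);  /* hypothetical lock */
    static unsigned long counter;          /* hypothetical shared data */

    /*
     * On a CONFIG_PREEMPT mainline kernel, spin_lock() implicitly
     * disables preemption, so this critical section cannot be
     * preempted (and, on SMP, cannot run concurrently on two CPUs).
     */
    static void counter_inc(void)
    {
            spin_lock(&counter_lock);
            counter++;                     /* the critical section */
            spin_unlock(&counter_lock);    /* re-enables preemption */
    }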

A critical section in the kernel is a series of operations that must be performed atomically. If a thread accesses a critical section while another thread is accessing it, data can be corrupted or the system may become unstable. Therefore, critical sections that can not be performed atomically by the hardware must provide mutual exclusion to these areas. Mutual exclusion to a critical section may be implemented on a uniprocessor (UP) system by simply preventing the thread that accesses the section from being scheduled out (disabling preemption). On a symmetric multiprocessor (SMP) system, disabling preemption is not enough: a thread on another CPU might access the critical section. On SMP systems, critical sections are also protected with locks.

An SMP system prevents concurrent access to a critical section by surrounding it with spin_locks. If one CPU has a thread accessing a critical section when another CPU's thread wants to access that same critical section, the second thread will perform a busy loop (spin) until the previous thread leaves that critical section.

A preemptive kernel must also protect those same critical sections from one thread accessing the section before another thread has left it. Preemption must be disabled while a thread is accessing a critical section, otherwise another thread may be scheduled and access that same critical section.

8 other areas must also be protected by preemption (e.g., interrupt context).

4 Sleeping Spin Locks

Spin_locks are relatively fast. The idea behind a spin_lock is to protect critical sections that are very short. A spin_lock is considered fast compared to sleeping locks because it avoids the overhead of a schedule. If the time to run the code in a critical section is shorter than the time of a context switch, it is reasonable to use a spin_lock and, on contention, spin in a busy loop while waiting for a thread on another CPU to release the spin_lock.

Since spin_locks may cause a thread on another CPU to enter a busy loop, extra care must be given to the use of spin_locks. A spin_lock that can be taken in interrupt context must always be taken with interrupts disabled. If an interrupt context handler that acquires a spin_lock is triggered while the current thread holds that same spin_lock, then the system will deadlock. The interrupt handler will spin on the spin_lock waiting for the lock to be released, but unfortunately, that same interrupt handler is preventing the thread that holds the spin_lock from releasing it.

A problem with the use of spin_locks in the Linux kernel is that they also protect large critical sections. With the use of nested spin_locks and large sections being protected by them, the latencies caused by spin_locks become a big problem for an RTOS. To address this, the RT patch converts most spin_locks into a mutex (sleeping lock).

By converting spin_locks into mutexes, the RT patch also enables preemption within these critical sections. If a thread tries to access a critical section while another thread is accessing it, it will now be scheduled out and sleep until the mutex protecting the critical section is released.

When the kernel is configured to have sleeping spin_locks, interrupt handlers must also be converted to threads, since interrupt handlers also use spin_locks. Sleeping spin_locks are not compatible with non-threaded interrupts, since only a threaded interrupt may schedule. If a device interrupt handler uses a spin_lock and also sets the interrupt flag IRQ_NODELAY, the system will crash if the interrupt handler tries to acquire the spin_lock when it is already taken.

Some spin_locks in the RT kernel must remain a busy loop, and not be converted into a sleeping spin_lock. With the use of type definitions, the RT patch can magically convert nearly all spin_locks into sleeping mutexes, and leave the other spin_locks alone.

5 The Spin Lock Maze

To avoid having to touch every spin_lock in the kernel, Ingo Molnar developed a way to use the latest gcc extensions to determine if a spin_lock should be used as a mutex, or stay as a busy loop.9 There are places in the kernel that must still keep a true spin_lock, such as the scheduler and the implementation of mutexes themselves. When a spin_lock must remain a spin_lock, the RT patch just needs to change the type of the spin_lock from spinlock_t to raw_spinlock_t. All the actual spin_lock function calls will determine at compile time which type of spin_lock should be used. If the spin_lock function's parameter is of the type spinlock_t, it will become a mutex. If a spin_lock function's parameter is of the type raw_spinlock_t, it will stay a busy loop (as well as disable preemption).

Looking into the header files of spinlock.h will drive a normal person mad. The macros defined in those headers are created to actually facilitate the code by not having to figure out whether a spin_lock function is for a mutex or a busy loop. The header files, unfortunately, are quite complex. To make it easier to understand, I will not show the actual macros that make up the spin_lock function, but, instead, I will show what it looks like slightly evaluated:

    #define spin_lock(lock)                           \
            if (TYPE_EQUAL((lock), raw_spinlock_t))   \
                    __spin_lock(lock);                \
            else if (TYPE_EQUAL((lock), spinlock_t))  \
                    _spin_lock(lock);                 \
            else                                      \
                    __bad_spinlock_type();

TYPE_EQUAL is defined as __builtin_types_compatible_p(typeof(lock), type *), which is a gcc internal command that handles the condition at compile time. The __bad_spinlock_type function is not actually defined; if something other than a spinlock_t or raw_spinlock_t is passed to a spin_lock function, the compiler will complain.

The __spin_lock()10 acts like the original spin_lock function, and the _spin_lock()11 evaluates to the mutex, rt_spin_lock(), defined in kernel/rtmutex.c.

9 also called raw_spin_lock.
10 prefixed with two underscores.
11 prefixed with one underscore.
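To make the two lock types concrete, here is a small sketch (illustrative only; initializers are elided, and the RT patch's exact declaration macros vary by version):

    #include <linux/spinlock.h>

    /* Under the RT patch, the declared type decides the behavior
     * of the very same spin_lock()/spin_unlock() calls: */
    static spinlock_t sleeping_lock;   /* becomes an rt_mutex */
    static raw_spinlock_t true_lock;   /* stays a busy-wait lock */

    static void example(void)
    {
            spin_lock(&sleeping_lock); /* may sleep under PREEMPT_RT */
            spin_unlock(&sleeping_lock);

            spin_lock(&true_lock);     /* spins, preemption disabled */
            spin_unlock(&true_lock);
    }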
6 Priority Inheritance

The most devastating latency that can occur in an RTOS is unbounded priority inversion. As mentioned earlier, priority inversion occurs when a high priority thread must wait on a lower priority thread before it can run. This usually occurs when a resource is shared between high and low priority threads, and the high priority thread needs to take the resource while the low priority thread holds it. Priority inversion is natural and can not be completely prevented. What we must prevent is unbounded priority inversion: when a high priority thread can wait an undetermined amount of time for the lower priority thread to release the resource.

The classic example of unbounded priority inversion takes place with three threads, each having a different priority. Figure 1 shows the CPU usage of three threads: A (highest priority), B (middle priority), and C (lowest priority). Thread C starts off and holds some lock, then thread A wakes up and preempts thread C. Thread A tries to take a resource that is held by thread C and is blocked. Thread C continues, but is later preempted by thread B before thread C could release the resource that thread A is blocked on. Thread B is of higher priority than thread C but lower priority than thread A. By preempting thread C, it is in essence preempting thread A. Since we have no idea how long thread B will run, thread A is now blocked for an undetermined amount of time. This is what is known as unbounded priority inversion.

There are different approaches to preventing unbounded priority inversion. One way is simply by design: carefully control what resources are shared, as well as what threads can run at certain times. This is usually only feasible for small systems that can be completely audited for misbehaving threads. The Linux kernel is far too big and complex for this approach. Another approach is priority ceiling [3], where each resource (lock) knows the highest priority thread that will acquire it. When a thread takes a resource, it is temporarily boosted to the priority of that resource while it holds the resource. This prevents any other thread that might acquire that resource from preempting this thread. Since pretty much any resource or lock in the Linux kernel may be taken by any thread, you might as well just keep preemption off while a resource is held. This would include sleeping locks (mutexes) as well.

What the RT patch implements is Priority Inheritance (PI). This approach scales well with large projects, although it is usually criticized that the algorithms to implement PI are too complex and error prone. PI algorithms have matured, and it is easier to audit the PI algorithm than the entire kernel. The basic idea of PI is that when a thread blocks on a resource that is owned by a lower priority thread, the lower priority thread inherits the priority of the blocked thread. This way, the lower priority thread can not be preempted by threads that are of lower priority than the blocked thread. Figure 2 shows the same situation as Figure 1, but this time with PI implemented.

The priority inheritance algorithm used by the RT patch is indeed complex, but it has been vigorously tested and used in production environments. For a detailed explanation of the design of the PI algorithm, used not only by the RT patch but also by the mainline kernel, see the kernel source documentation [9].

7 What's Good for RT is Good for the Kernel

The problem with prior implementations of RT getting accepted into mainline Linux was that too much was done independently from mainline development, or was focused strictly on the niche RT market. Large intrusive changes were made throughout the kernel in ways that were not acceptable to most of the kernel maintainers. Finally, one day Ingo Molnar noticed the benefits of RT and started developing a small project that would incorporate RT methods into the Linux kernel. Molnar, being a kernel maintainer, could look at RT from a more general point of view, and not just from that of a niche market. His approach was not to force the Linux kernel into the RT world, but rather to bring the beneficial parts of the RT world to Linux.

One of the largest problems back then (2004) was a nasty lock that was all over the Linux kernel. This lock is known as the Big Kernel Lock (BKL). The BKL was introduced into Linux as the first attempt to bring Linux to the multiprocessor environment. The BKL would protect concurrent accesses of critical sections from threads running on separate CPUs. The BKL was just one big lock to protect everything. Since then, spin_locks have been introduced to separate non-related critical sections, but there are still large portions of Linux code that are protected by the BKL.

The BKL was a spin_lock that was large and intrusive, and would cause large latencies. Not only was it a spinning lock, but it was also recursive.12 It also had the non-intuitive characteristic that a process holding a BKL is allowed to voluntarily sleep. Spin locks could cause systems to lock up if a thread were to sleep while holding one, but the BKL was special, in that the scheduler magically released the BKL, and would reacquire the lock when the thread resumed.

Molnar developed a way to preempt this lock [6]. He changed the BKL from a spin lock into a mutex. To preserve the same semantics, the BKL would be released if the thread voluntarily scheduled, but not when the thread was preempted. Allowing threads to be preempted while holding the BKL greatly reduced the scheduling latencies of the kernel.

12 Allowed the same owner to take it again while holding it, as long as it released the lock the same number of times.

[Timeline diagram for Figure 1: thread A blocks on a lock held by C; B preempts C, so A stays blocked indefinitely.]

Figure 1: Priority Inversion

[Timeline diagram for Figure 2: when A blocks, C inherits A's priority, runs until it releases the lock, then A runs; B runs only afterwards.]

Figure 2: Priority Inheritance

7.1 Death of the Semaphore

Semaphores are powerful primitives that allow one or more threads access to a resource. The funny thing is, they were seldom used for multiple accesses. Most of the time they are used for maintaining mutual exclusion to a resource, or to coordinate two or more threads. For coordination, one thread would down (lock) a semaphore and then wait on it.13 Some other thread, after completing some task, would up (unlock) the semaphore to let the waiting threads know the task was completed.

The latter can be replaced by a completion [4]. A completion is a primitive that has the purpose of notifying one or more threads when another thread has completed an event. The use of semaphores for this purpose is now obsolete and should be replaced with completions.

There was nothing available in the mainline kernel to replace the semaphore for a single mutual exclusion lock. The ability of a semaphore to handle the case of multiple threads accessing a resource produces an overhead when it is only acting as a mutex. Most of the time this overhead is unnecessary. Molnar implemented a new primitive for the kernel called mutex. A mutex only allows a single thread access to a critical section, so it is simpler to implement and faster than a semaphore. The simpler design of the mutex makes it much cleaner and slightly faster than a semaphore, so the addition of the mutex to the mainline kernel was a benefit for all. But Molnar had another motive for implementing the mutex (which most people could have guessed).

Semaphores, with the property of allowing more than one thread into a critical section, have no concept of an owner. Indeed, one thread may even release a semaphore while another thread acquires it.14 A semaphore is never bound to a thread. A mutex, on the other hand, always has a one-to-one relationship with a thread when locked. The mutex may be owned by one, and only one, thread at a time. This is key to the RT kernel, as it is one of the requirements of the PI algorithm. That is, a lock may have one, and only one, owner. This ensures that a priority inheritance chain stays a single path, and does not branch with multiple lock owners needing to inherit the priority of a waiting thread.

13 usually the semaphore was just created as locked.
14 in the case of completions.

7.2 The Path to RT

Every mainline release includes more code from the RT patch. Some of the changes are simply clean-ups and fixes for race conditions. An RTOS on a single CPU system exposes race conditions much more easily than an 8-way SMP system: the race windows are larger due to the preemptive scheduling model. Although race conditions are easier to expose on an RT kernel, those same race conditions still exist in the mainline kernel running on an SMP system. There have been arguments that some of the fixed race conditions have never been reported, so they most likely have never occurred. More likely, these race conditions have crashed some server somewhere, but since the race window is small, the crash is hard to reproduce. The crash would have been considered an anomaly and ignored. With the RT patch exposing rare bugs and the RT patch maintainers sending in fixes, the mainline kernel has become more stable, with fewer of these seemingly unreproducible crashes.

In addition to clean-ups and bug fixes, several major features have already been incorporated into the mainline kernel.

• gettimeofday (2.6.18) – John Stultz's redesign of the time infrastructure.

• Generic IRQs (2.6.18) – Ingo Molnar's consolidation of the IRQ code to unify all the architectures.

• High Resolution Timers Pt. 1 (2.6.16) – Thomas Gleixner's separation of timers from timeouts.

• High Resolution Timers Pt. 2 (2.6.21) – Thomas Gleixner's change to the timer resolution. Now clock event resolution is no longer bound to jiffies, but to the underlying hardware itself.

One of the key features for incorporating the RT patch into mainline is PI. Linus Torvalds had previously stated that he would never allow PI to be incorporated into Linux. The PI algorithm used by the RT patch can also be used by user applications. Linux implements a user mutex that can create, acquire, and release the mutex lock completely in user space. This type of mutex is known as a futex (fast mutex) [1]. The futex only enters the kernel on contention. RT user applications require that the futex also implement PI, and this is best done within the kernel. Torvalds allowed the PI algorithm to be incorporated into Linux (2.6.18), but only for use with futexes.

Fortunately, the core PI algorithm that made it into the mainline Linux kernel is the same algorithm that is used by the RT patch itself. This is key to getting the rest of the RT patch upstream; it also brings the RT patch closer to mainline and facilitates the RT patch maintenance.
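From user space, the PI futex support is reached through the POSIX threads API. The following sketch shows standard pthreads usage for requesting a priority-inheriting mutex; it is illustrative, not code from the paper:

    #include <pthread.h>
    #include <stdio.h>

    int main(void)
    {
            pthread_mutexattr_t attr;
            pthread_mutex_t lock;

            pthread_mutexattr_init(&attr);
            /* Request priority inheritance: a low-priority holder of
             * "lock" is boosted while a higher-priority thread waits. */
            if (pthread_mutexattr_setprotocol(&attr,
                                              PTHREAD_PRIO_INHERIT) != 0)
                    fprintf(stderr, "PI mutexes not supported here\n");

            pthread_mutex_init(&lock, &attr);

            pthread_mutex_lock(&lock);
            /* ... critical section shared with lower-priority threads ... */
            pthread_mutex_unlock(&lock);

            pthread_mutex_destroy(&lock);
            pthread_mutexattr_destroy(&attr);
            return 0;
    }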
8 RT is the Gun to Shoot Yourself With

The RT patch is all about determinism and being able to have full control of the kernel. But, like being root, the more power you give the user, the more likely they will destroy themselves with it. A common mistake for novice RT application developers is writing code like the following:

    x = 0;
    /* let another thread set x */
    while (!x)
            sched_yield();

Running the above with the highest priority on an RTOS leaves you wondering why the system suddenly freezes.
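In user space, the cure for this busy-wait is to block instead of spin. A minimal pthreads sketch using a condition variable (illustrative only; the shared flag x mirrors the example above):

    #include <pthread.h>

    /* Hypothetical shared state, protected by a mutex. */
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
    static int x;

    /* The waiter sleeps until another thread sets x... */
    void wait_for_x(void)
    {
            pthread_mutex_lock(&lock);
            while (!x)
                    pthread_cond_wait(&cond, &lock);
            pthread_mutex_unlock(&lock);
    }

    /* ...and the setter wakes it, so no CPU time is burned spinning. */
    void set_x(void)
    {
            pthread_mutex_lock(&lock);
            x = 1;
            pthread_cond_signal(&cond);
            pthread_mutex_unlock(&lock);
    }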

This paper is not about user applications and an RT kernel; rather, the focus is on developers working within the kernel who need to understand the consequences of their code when someone configures in full RT.

8.1 yield() is Deadly

As with the above user-land code, the kernel has a similar devil called yield(). Before using this, make sure you truly understand what it is that you are doing. There are only a few locations in the kernel that have legitimate uses of yield(). Remember, in the RT kernel even interrupts may be starved by some device driver thread looping on a yield(). yield() is usually related to something that can also be accomplished by implementing a completion.

Any kind of spinning loop is dangerous in an RTOS. Similar to using a busy loop spin_lock without disabling interrupts, and having that same lock used in an interrupt context handler, a spinning loop might starve the thread that will stop the loop. The following code, which tries to prevent a reverse lock deadlock, is no longer safe with the RT patch:

    retry:
            spin_lock(A);
            if (!spin_trylock(B)) {
                    spin_unlock(A);
                    goto retry;
            }

Note: The code in fs/jbd/commit.c has such a situation.

8.2 rwlocks are Evil

Rwlocks are a favorite with many kernel developers, but there are consequences to using them. The rwlocks (for those that don't already know) allow multiple readers into a critical section and only one writer. A writer is only allowed in when no readers are accessing that area. Note that rwlocks, which are also implemented as busy loops in the current mainline kernel,15 are now sleeping mutexes in the RT kernel.

Any type of read/write lock needs to be used carefully, since readers can starve out a writer, or writers can starve out the readers. But read/write locks are even more of a problem in the RT kernel. As explained in Section 6, PI is used to prevent unbounded priority inversion. Read/write locks do not have a concept of ownership. Multiple threads can read a critical section at the same time, and if a high priority writer were to need access, it would not be able to boost all the readers at that moment. The RT kernel is also about determinism and known, measurable latencies. The time a writer must wait, even if it were possible to boost all readers, would be the time the read lock is held multiplied by all the readers that currently have that lock.16

Presently, to solve this issue in the RT kernel, the rwlocks are not simply converted into a sleeping lock like spin_locks are. Read_locks are transformed into a recursive mutex, so that two different threads can not enter a read section at the same time. But read_locks still remain recursive locks, meaning that the same thread can acquire the same read_lock multiple times, as long as it releases the lock the same number of times it acquires it. So in the RT kernel, even read_locks are serialized among each other. RT is about predictable results over performance. This is one of those cases where the overall performance of the system may suffer a little to keep guaranteed performance high.

15 the mainline kernel also has sleeping rwlocks implemented with up_read and down_read.
16 Some may currently be sleeping.

8.3 Read Write seqlocks are Mischievous

As with rwlocks, read/write seqlocks can also cause a headache. These are not converted in the RT kernel, so it is even more important to understand the usage of these locks. Recently, the RT developers came across a regression with the introduction of some of the new time code that added more instances of the xtime_lock. This lock uses read/write seqlocks to protect it. The way the read/write seqlocks work is that the read side enters a loop starting with read_seqlock() and ending with read_sequnlock(). If no writes occurred between the two, the read_sequnlock() returns zero; otherwise it returns non-zero. If something other than zero is returned by the read_sequnlock(), the loop continues and the read is performed again.
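For reference, in current mainline the reader-side primitives are spelled read_seqbegin() and read_seqretry(). A sketch of the loop just described (illustrative, not the kernel's actual timekeeping code):

    #include <linux/seqlock.h>
    #include <linux/time.h>

    extern seqlock_t xtime_lock;       /* kernel timekeeping seqlock */
    extern struct timespec xtime;

    static struct timespec read_xtime(void)
    {
            struct timespec ts;
            unsigned long seq;

            do {
                    /* snapshot the writer sequence count */
                    seq = read_seqbegin(&xtime_lock);
                    ts = xtime;
                    /* non-zero means a writer ran; read again */
            } while (read_seqretry(&xtime_lock, seq));

            return ts;
    }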
The issue with the read/write seqlocks is that you can have multiple writes occur during the read. If the design of the seqlocks is not carefully thought out, you could starve the read lock. The situation with the xtime_lock was even present in the 2.6.21-rc series. The xtime_lock should only be written to on one CPU, but a change that was made in the -rc series allowed the xtime_lock to be written to on any CPU. Thus, one CPU could be reading the xtime_lock, but all the other CPUs could be queueing up to write to it. Thus the latency of the read_seqlock is not only the time the read_seqlock is held, but also the sum of all the write_seqlocks that are run on each CPU. A poorly designed read/write seqlock implementation could even repeat the write_seqlocks for the CPUs. That is to say, while CPU 1 is doing the read_seqlock loop, CPU 2 does a write_seqlock, then CPU 3 does a write_seqlock, then CPU 4 does a write_seqlock, and by this time, CPU 2 is doing another write_seqlock. All along, CPU 1 is left continually spinning on that read_seqlock.

8.4 Interrupt Handlers Are No Longer Supreme

Another gotcha for drivers that are running on the RT kernel is the assumption that the interrupt handler will occur when interrupts are enabled. As described in Section 2, the interrupt service routines are now carried out by threads. This includes handlers that are run as soft IRQs (e.g., net-tx and net-rx). If a driver for some reason needs a service to go off periodically so that the device won't lock up, it can not rely on an interrupt or soft IRQ to go off at a reasonable time. There may be cases where the RT setup will have a thread at a higher priority than all the interrupt handlers. It is likely that this thread will run for a long period of time, and thus starve out all interrupts.17 If it is necessary for a device driver to periodically tickle the device, then it must create its own kernel thread and put it at the highest priority available.

17 besides the timer interrupt.

9 Who Uses RT?

The RT patch has not only been around for development; there are also many users of it, and that number is constantly growing.

The audio folks found out that the RT patch has significantly helped them in their recordings (http://ccrma.stanford.edu/planetccrma/software).

IBM, Red Hat and Raytheon are bringing the RT patch to the Department of Defense (DoD). (http://www-03.ibm.com/press/us/en/pressrelease/21033.wss)

Financial institutions are expressing interest in using the RT kernel to ensure dependably consistent transaction times. This is increasingly important due to recently enacted trading regulations [8].

With the growing acceptance of the RT patch, it won't be long before the full patch is in the mainline kernel, and anyone can easily enjoy the enhancements that the RT patch brings to Linux.
10 RT Benchmarks

Darren V. Hart ([email protected])

Determinism and latency are the key metrics used to discuss the suitability of a real-time operating system. IBM's Linux Technology Center has contributed several test cases and benchmarks which test these metrics in a number of ways. The results that follow are a small sampling that illustrates the features of the RT patch, as well as the progress being made merging these features into the mainline Linux kernel. The tests were run on a 4 CPU Opteron system with a background load of a make -j8 2.6.16 kernel build. Source for the tests used is linked to from the RT Community Wiki.18 Full details of these results are available online [5].

18 http://rt.wiki.kernel.org/index.php/IBM_Test_Cases

10.1 gettimeofday() Latency

With their dependence on precise response times, real-time systems are prone to making numerous system calls to determine the current time. A deterministic implementation of gettimeofday() is critical. The gtod_latency test measures the difference between the time reported in pairs of consecutive gettimeofday() calls.

The scatter plots for 2.6.18 (Figure 3) and 2.6.18-rt7 (Figure 4) illustrate the reduced latency and improved determinism the RT patch brings to the gettimeofday() system call. The mainline kernel experiences a 208 us maximum latency, with a number of samples well above 20 us. Contrast that with the 17 us maximum latency of the 2.6.18-rt7 kernel (with the vast majority of samples under 10 us).
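The measurement idea is simple enough to sketch in a few lines; this illustrative stand-in is not the actual gtod_latency source:

    #include <stdio.h>
    #include <sys/time.h>

    #define SAMPLES 1000000

    int main(void)
    {
            struct timeval t1, t2;
            long delta, max = 0;
            int i;

            for (i = 0; i < SAMPLES; i++) {
                    gettimeofday(&t1, NULL);
                    gettimeofday(&t2, NULL);
                    /* latency = difference between back-to-back calls */
                    delta = (t2.tv_sec - t1.tv_sec) * 1000000L
                            + (t2.tv_usec - t1.tv_usec);
                    if (delta > max)
                            max = delta;
            }
            printf("max gettimeofday() delta: %ld us\n", max);
            return 0;
    }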

10.2 Periodic Scheduling

Real-time systems often create high priority periodic threads. These threads perform a very small amount of work that must be performed at precise intervals. The results from the sched_latency test measure the scheduling latency of a high priority periodic thread, with a 5 ms period.

Prior to the high resolution timer (hrtimers) work, timer latencies were only good to about three times the periodic timer tick period (about 3 ms with HZ=1000). This level of resolution makes accurate scheduling of periodic threads impractical, since a task needing to be scheduled even a single microsecond after the timer tick would have to wait until the next tick, as illustrated in the latency histogram for 2.6.16 (Figure 5).19 With the hrtimers patch included, the RT kernel demonstrates low microsecond accuracy (Figure 6), with a max scheduling latency of 25 us. The mainline 2.6.21 kernel has incorporated the hrtimers patch.

19 2.6.16 was used as later mainline kernels were unable to complete the test.
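A periodic thread of the kind sched_latency measures can be sketched with standard POSIX clock calls; this illustrative stand-in is not the benchmark's source:

    #include <stdio.h>
    #include <time.h>

    #define PERIOD_NS 5000000L  /* 5 ms, as in sched_latency */

    int main(void)
    {
            struct timespec next, now;
            long lat_ns;

            clock_gettime(CLOCK_MONOTONIC, &next);
            for (;;) {
                    next.tv_nsec += PERIOD_NS;
                    if (next.tv_nsec >= 1000000000L) {
                            next.tv_nsec -= 1000000000L;
                            next.tv_sec++;
                    }
                    /* sleep until the absolute wakeup time */
                    clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME,
                                    &next, NULL);
                    clock_gettime(CLOCK_MONOTONIC, &now);
                    /* scheduling latency = actual - intended wakeup */
                    lat_ns = (now.tv_sec - next.tv_sec) * 1000000000L
                             + (now.tv_nsec - next.tv_nsec);
                    printf("latency: %ld ns\n", lat_ns);
            }
            return 0;
    }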

10.3 Asynchronous Event Handling

As discussed in Section 2, real-time systems depend on deterministic response times to asynchronous events. async_handler measures the latency of waking a thread waiting on an event. The events are generated using POSIX condition variables.

Without the RT patch, the 2.6.20 kernel experiences a wide range of latencies while attempting to wake the event handler (Figure 7), with a standard deviation of 3.69 us. 2.6.20-rt8 improves on the mainline results, reducing the standard deviation to 1.16 us (Figure 8). While there is still some work to be done to reduce the maximum latency, the RT patch has greatly improved the deterministic behavior of asynchronous event handling.

References

[1] Futex. http://en.wikipedia.org/wiki/Futex.

[2] Preemptible kernel patch makes it into Linux kernel v2.5.4-pre6. http://www.linuxdevices.com/news/NS3989618385.html.

[3] Priority ceiling protocol. http://en.wikipedia.org/wiki/Priority_ceiling_protocol.

[4] Jonathan Corbet. Driver porting: completion events. http://lwn.net/Articles/23993/.

[5] Darren V. Hart. OLS 2007 – Real-time Linux latency comparisons. http://www.kernel.org/pub/linux/kernel/people/dvhart/ols2007.

[6] Ingo Molnar. http://lwn.net/Articles/102216/.

[7] Ingo Molnar. RT patch. http://people.redhat.com/mingo/realtime-preempt.

[8] "The Trade News". Reg NMS is driving broker-dealer investment in speed and storage technology. http://www.thetradenews.com/regulation-compliance/compliance/671.

[9] Steven Rostedt. RT mutex design. Documentation/rt-mutex-design.txt.

Figure 3: 2.6.18 gtod_latency scatter plot
Figure 4: 2.6.18-rt7 gtod_latency scatter plot

Figure 5: 2.6.16 sched_latency histogram
Figure 6: 2.6.20-rt8 sched_latency histogram

Figure 7: 2.6.20 async_handler scatter plot
Figure 8: 2.6.20-rt8 async_handler scatter plot

lguest: Implementing the little Linux hypervisor

Rusty Russell
IBM OzLabs
[email protected]

Abstract

Lguest is a small x86 32-bit Linux hypervisor for running Linux under Linux, and demonstrating the paravirtualization abilities in Linux since 2.6.20. At around 5,000 lines of code including utilities, it also serves as an excellent springboard for mastering the theory and practice of paravirtualization.

This talk will cover the philosophy of lguest and then dive into the implementation details as they stand at this point in time. Operating System experience is required, but x86 knowledge isn't. By the time the talk is finished, you should have a good grounding in the range of implementation issues facing all virtualization technologies on Intel, such as Xen and KVM. You should also be inspired to create your own hypervisor, using your own pets as the logo.

1 Introduction

Around last year's OLS I was having discussions with various technical people about Linux support for paravirtualization, and Xen in particular. (Paravirtualization is where the guest knows it's being run under a hypervisor, and changes its behaviour.)

I wanted the Linux kernel to support Xen, without wedding the Linux kernel to its interface: it seemed to me that now Xen had shown that Free Software virtualization wasn't hard, we'd see other virtualization technologies worth supporting. There was already VMWare's proposed VMI standard, for example, but that wasn't a proven ABI either.

The result was paravirt_ops. This is a single structure which encapsulates all the sensitive instructions which a hypervisor might want to override. This was very similar to the VMI proposal by Zach Amsden, but some of the functions which Xen or VMI wanted were non-obvious to me.
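For readers unfamiliar with the structure, the following trimmed sketch conveys the idea; the field subset and exact signatures here are from memory and illustrative only, not the full 2.6.20 definition:

    /* A trimmed, illustrative subset -- the real 2.6.20 structure has
     * dozens of hooks and slightly different signatures. */
    struct paravirt_ops {
            void (*irq_disable)(void);        /* replaces native cli */
            void (*irq_enable)(void);         /* replaces native sti */
            unsigned long (*read_cr2)(void);  /* page-fault address  */
            void (*write_cr3)(unsigned long); /* switch page tables  */
    };

A hypervisor fills in this table with its own implementations, while native hardware gets the default ones; the kernel then calls through the pointers instead of executing the sensitive instructions directly.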

So, I decided to write a trivial, self-contained Linux-on-Linux hypervisor. It would live in the Linux kernel source, run the same kernel for guests as for host, and be as simple as possible. This would serve as a third testbed for paravirt_ops.

2 Post-rationale for lguest

There turned out to be other benefits to writing such a hypervisor.

• It turned out to be around 5,000 lines, including the 1,000 lines of userspace code. This means it is small enough to be read and understood by kernel coders.

• It provides a complete in-tree example of how to use paravirt_ops.

• It provides a simple way to demonstrate the effects of a new Linux paravirtualization feature: you only need to patch one place to show the new feature and how to use it.

• It provides a testbed for new ideas. As other hypervisors rush to "productize" and nail down APIs and ABIs, lguest can be changed from kernel to kernel. Remember, lguest only promises to run the matching guests and host (i.e., no ABI).

My final point is more social than technical. I said that Xen had shown that Free Software paravirtualization was possible, but there was also some concern that its lead and "buzz" risked sucking up the groundwater from under other Free hypervisors: why start your own when Xen is so far ahead? Yet this field desperately needs more independent implementations and experimentation. Creating a tiny hackable hypervisor seemed to be the best way to encourage that.

• 173 • 174 • lguest: Implementing the little Linux hypervisor

As it turned out, I needn’t have worried too much. The 3. The switcher which flips the CPU between host and KVM project came along while I was polishing my guest, patches, and slid straight into the kernel. KVM uses a similar “linux-centric” approach to lguest. But on the 4. The host module (lg.ko) which sets up the switcher bright side, writing lguest taught me far more than I ever and handles the kernel side of things for the thought I’d know about the horribly warty x86 architec- launcher, and ture. 5. The awesome documentation which spans the code.1 3 Comparing With Other Hypervisors 4.1 Guest Code As you’d expect from its simplicity, lguest has fewer features than any other hypervisor you’re likely to have How does the kernel know it’s an lguest guest? The heard of. It doesn’t currently support SMP guests, sus- first code the x86 kernel runs are is startup_32 in pend and resume, or 64-bit. Glauber de Oliveira Costa head.S. This tests if paging is already enabled: if it is, and Steven Rostedt are hacking on lguest64 furiously, we know we’re under some kind of hypervisor. We end and suspend and resume are high on the TODO list. up trying all the registered paravirt_probe func- Lguest only runs matching host and guest kernels. Other tions, and end up in the one in drivers/lguest/ hypervisors aim to run different Operating Systems as lguest.c. Here’s the guest, file-by-file: guests. Some also do full virtualization, where unmodi- fied OSs can be guests, but both KVM and Xen require drivers/lguest/lguest.c Guests know that they can’t do newer chips with virtualization features to do this. privileged operations such as disable interrupts: Lguest is slower than other hypervisors, though not al- they have to ask the host to do such things via ways noticeably so: it depends on workload. hypercalls. This file consists of all the replace- ments for such low-level native hardware opera- On the other hand, lguest is currently 5,004 lines for a tions: we replace the struct paravirt_ops total of 2,009 semicolons. (Note that the documentation pointers with these. patch adds another 3,157 lines of comments.) This in- cludes the 983 lines (408 semicolons) of userspace code. drivers/lguest/lguest_asm.S The guest needs several assembler routines for low-level things and placing The code size of KVM and Xen are hard to compare to them all in lguest.c was a little ugly. this: both have features, such as 64-bit support. Xen includes IA-64 support, and KVM includes all of qemu drivers/lguest/lguest_bus.c Lguest guests use a very (yet doesn’t use most of it). simple bus for devices. It’s a simple array of device descriptors contained just above the top of normal Nonetheless it is instructive to note that KVM 19 is memory. The lguest bus is 80% tedious boilerplate 274,630 lines for a total of 99,595 semicolons. Xen code. unstable (14854:039daabebad5) is 753,242 lines and 233,995 semicolons (the 53,620 lines of python don’t drivers/char/hvc_lguest.c A trivial console driver: we carry their weight in semicolons properly, however). use lguest’s DMA mechanism to send bytes out, and register a DMA buffer to receive bytes in. It is assumed to be present and available from the very 4 Lguest Code: A Whirlwind Tour beginning of boot. lguest consists of five parts: drivers/block/lguest_blk.c A simple block driver which appears as /dev/lgba, lgbb, lgbc, etc. The mechanism is simple: we place the 1. The guest paravirt_ops implementation, information about the request in the device page,

2. The launcher which creates and supplies external 1Documentation was awesome at time of this writing. It may I/O for the guest, have rotted by time of reading. 2007 Linux Symposium, Volume Two • 175

then use the SEND_DMA hypercall (containing the signal is pending (-EINTR), or the guest does a data for a write, or an empty “ping” DMA for a DMA out to the launcher. Writes are also used to read). get a DMA buffer registered by the guest, and to send the guest an interrupt. drivers/net/lguest_net.c This is very simple, a virtual network driver. The only trick is that it can talk di- drivers/lguest/io.c The I/O mechanism in lguest is sim- rectly to multiple other recipients (i.e., other guests ple yet flexible, allowing the guest to talk to the on the same network). It can also be used with only launcher program or directly to another guest. It the host on the network. uses familiar concepts of DMA and interrupts, plus some neat code stolen from futexes.

4.2 Launcher Code drivers/lguest/core.c This contains run_ guest() which actually calls into the host↔guest switcher The launcher sits in the Documentation/lguest directory: and analyzes the return, such as determining if the as lguest has no ABI, it needs to live in the kernel tree guest wants the host to do something. This file with the code. It is a simple program which lays out also contains useful helper routines, and a couple the “physical” memory for the new guest by mapping of non-obvious setup and teardown pieces which the kernel image and the virtual devices, then reads re- were implemented after days of debugging pain. peatedly from /dev/lguest to run the guest. The drivers/lguest/hypercalls.c Just as userspace programs read returns when a signal is received or the guest sends request kernel operations via a system call, the DMA out to the launcher. guest requests host operations through a “hyper- The only trick: the Makefile links it statically at a high call.” As you’d expect, this code is basically one address, so it will be clear of the guest memory region. big switch statement. It means that each guest cannot have more than 2.5G of drivers/lguest/segments.c The x86 architecture has memory on a normally configured host. segments, which involve a table of descriptors which can be used to do funky things with virtual 4.3 Switcher Code address interpretation. The segment handling code consists of simple sanity checks. Compiled as part of the “lg.ko” module, this is the code drivers/lguest/page_tables.c The guest provides a which sits at 0xFFC00000 to do the low-level guest- virtual-to-physical mapping, but the host can nei- host switch. It is as simple as it can be made, but it’s ther trust it nor use it: we verify and convert it naturally very specific to x86. here to point the hardware to the actual guest pages when running the guest. This technique is referred 4.4 Host Module: lg.ko to as shadow pagetables.

It is important to that lguest be “just another” Linux ker- drivers/lguest/interrupts_and_traps.c This file deals nel module. Being able to simply insert a module and with Guest interrupts and traps. There are three start a new guest provides a “low commitment” path to classes of interrupts: virtualization. Not only is this consistent with lguest’s 1. Real hardware interrupts which occur while experimential aims, but it has potential to open new sce- we’re running the guest, narios to apply virtualization. 2. Interrupts for virtual devices attached to the guest, and drivers/lguest/lguest_user.c This contains all the 3. Traps and faults from the guest. /dev/lguest code, whereby the userspace launcher controls and communicates with the Real hardware interrupts must be delivered to the guest. For example, the first write will tell us the host, not the guest. Virtual interrupts must be de- memory size, pagetable, entry point, and kernel livered to the guest, but we make them look just address offset. A read will run the guest until a like real hardware would deliver them. Traps from 176 • lguest: Implementing the little Linux hypervisor

the guest can be set up to go directly back into the but it’s designed to guide optimization efforts for hyper- guest, but sometimes the host wants to see them visor authors. Here are the current results for a native first, so we also have a way of “reflecting” them run on a UP host with 512M of RAM and the same con- into the guest as if they had been delivered to it di- figuration running under lguest (on the same Host, with rectly. 3G of RAM). Note that these results are continually im- proving, and are obsolete by the time you read them. 4.5 The Documentation Test Name Native Lguest Factor Context switch via 2413 ns 6200 ns 2.6 The documentation is in seven parts, as outlined in pipe drivers/lguest/README. It uses a simple script in One Copy-on- 3555 ns 9822 ns 2.8 Documentation/lguest to output interwoven code Write fault and comments in literate programming style. It took Exec client once 302 us 776 us 2.6 One fork/exit/ wait me two weeks to write (although it did lead to many 120 us 407 us 3.7 One int-0x80 269 ns 266 ns 1.0 cleanups along the way). Currently the results take syscall up about 120 pages, so it is appropriately described One syscall via libc 127 ns 276 ns 2.2 throughout as a heroic journey. From the README file: Two PTE updates 1802 ns 6042 ns 3.4 256KB read from 33333 us 41725 us 1.3 Our Quest is in seven parts: disk One disk read 113 us 185 us 1.6 Inter-guest ping- 53850 ns 149795 ns 2.8 Preparation: In which our potential hero is flown pong quickly over the landscape for a taste of its scope. Inter-guest 4MB 16352 us 334437 us 20 Suitable for the armchair coders and other such per- TCP sons of faint constitution. Inter-guest 4MB 10906 us 309791 us 28 sendfile Guest: Where we encounter the first tantalising wisps Kernel Compile 10m39 13m48s 1.3 of code, and come to understand the details of the Table 1: Virtbench and kernel compile times life of a Guest kernel.

Drivers: Whereby the Guest finds its voice and become useful, and our understanding of the Guest is com- pleted. 6 Future Work

Launcher: Where we trace back to the creation of the Guest, and thus begin our understanding of the There is an infinite amount of future work to be done. It Host. includes: Host: Where we master the Host code, through a long and tortuous journey. Indeed, it is here that our 1. More work on the I/O model. hero is tested in the Bit of Despair.

Switcher: Where our understanding of the intertwined 2. More optimizations generally. nature of Guests and Hosts is completed. 3. NO_HZ support. Mastery: Where our fully fledged hero grapples with the Great Question: “What next?” 4. Framebuffer support.

5 Benchmarks 5. 64-bit support.

I wrote a simple extensible GPL’d benchmark program 6. SMP guest support. called virtbench.2 It’s a little primitive at the moment,

2http://ozlabs.org/~rusty/virtbench 7. A better web page. 2007 Linux Symposium, Volume Two • 177

7 Conclusion

Lguest has shown that writing a hypervisor for Linux isn’t difficult, and that even a minimal hypervisor can have reasonable performance. It remains to be seen how useful lguest will be, but my hope is that it will become a testing ground for Linux virtualization technologies, a useful basis for niche hypervisor applications, and an excellent way for coders to get their feet wet when start- ing to explore Linux virtualization. 178 • lguest: Implementing the little Linux hypervisor ext4 online defragmentation

Takashi Sato NEC Software Tohoku, Ltd. sho@tnes..co.jp

Abstract In this paper, an online defragmentation feature for ext4 is proposed, which can solve the fragmentation problem ext4 greatly extends the filesystem size to 1024PB com- on a mounted filesystem. pared to 16TB in ext3, and it is capable of storing many huge files. Previous study has shown that fragmenta- Section 2 describes various features in current Linux tion can cause performance degradation for such large filesystems to reduce fragmentation, and shows how filesystems, and it is important to allocate blocks as con- much fragmentation occurs when multiple processes tiguous as possible to improve I/O performance. writes simultaneously to disk. Section 3 explains the de- sign of the proposed online defragmentation method and In this paper, the design and implementation of an online Section 4 describes its implementation. Section 5 shows defragmentation extension for ext4 is proposed. It is de- the performance benchmark results. Existing work on signed to solve three types of fragmentation, namely sin- defragmentation is shown in Section 6. Section 7 con- gle file fragmentation, relevant file fragmentation, and tains the summary and future work. free space fragmentation. This paper reports its design, implementation, and performance measurement results. 2 Fragmentations in Filesystem

1 Introduction Filesystem on Linux have the following features to re- duce occurrence of fragments when writing a file. When a filesystem has been used for a long time, disk blocks used to store file data are separated into discon- • tiguous areas (fragments). This can be caused by con- Delayed allocation current writes from multiple processes or by the kernel The decision of where to allocate the block on the not being able to find a contiguous area of free blocks disk is delayed until just before when the data is large enough to store the data. stored to disk in order to allocate blocks as con- tiguously as possible (Figure 1). This feature is al- If a file is broken up into many fragments, file read per- ready implemented in XFS and is under discussion formance can suffer greatly, due to the large number of for ext4. disk seeks required to read the data. Ard Biesheuvel has shown the effect of fragmentation on file system perfor- • Block reservation mance [1]. In his paper, file read performance decreases Contiguous blocks are reserved right after the last as the amount of free disk space becomes small because allocated block, in order to use them for successive of fragmentation. This occurs regardless of the filesys- block allocation as shown in Figure 2. This feature tem type being used. is already implemented in ext4. ext3 is currently used as the standard filesystem in Linux, and ext4 is under development within the Linux Although ext4 and XFS already have these features community as the next generation filesystem. ext4 implemented to reduce fragmentation, writing multiple greatly extends the filesystem size to 1024PB compared files in parallel still causes fragmentation and decrease to 16TB in ext3, and it is capable of storing many huge in file read performance. Figure 3 shows the differ- files. Therefore it is important to allocate blocks as con- ence in file read performance between the following two tiguous as possible to improve the I/O performance. cases:

• 179 • 180 • ext4 online defragmentation

Seconds 1GB file read time(real time) Data 16.5% 22 14.8% degrade 21 degrade write EXT3 XFS 20 19 18 17 16 13 252 2 288 Fragments in 1GB file The environment of the measurement: kernel: 2.6.18-rc4 with Mingming's ext4 patch set CPU: Xeon 3.0GHx This block is freed before the data is Memory: 1GB written to disk, we can write it on the Arch: i386 contiguous blocks

Figure 3: The influence of fragments Figure 1: Delayed allocation

• Single file fragmentation Data i. Pick up the Single file fragmentation occurs when a single file reserved block is broken into multiple pieces. This decreases the write performance of accessing a single file.

• Relevant file fragmentation reserved blocks Relevant file fragmentation occurs when relevant ii. Write to disk files, which are often accessed together by ap- plications are allocated separately on the filesys- tem. This decreases the performance of applica- tions which access many small files.

Figure 2: Block reservation • Free space fragmentation Free space fragmentation occurs when the filesys- tem has many small free areas and there is no large • Create the file by writing 32 1GB files sequentially free area consisting of contiguous blocks. This will make the other two types of fragmentation more • Create the file by writing 32 1GB files from 32 likely to occur. threads concurrently Online defragmentation should be able to solve these File read performance decreases about 15% for ext3, three types of fragmentation. and 16.5% for XFS.

Currently ext4 does not have any defragmentation fea- 3.2 Single File Fragmentation ture, so fragmentation will not be resolved until the file is removed. Online defragmentation for ext4 is neces- Single file fragmentation could be solved by moving file sary to solve this problem. data to contiguous free blocks as shown in Figure 4. Defragmentation for a single file is done in the following 3 Design of Online Defragmentation steps (Figure 5).

3.1 Types of Fragmentation 1. Create a temporary inode.

Fragmentation could be classified into the following 2. Allocate contiguous blocks to temporary file as in three types [2]. Figure 5(a). 2007 Linux Symposium, Volume Two • 181

File ioctl User

Kernel Create temporary Move the data to Target Temporary inode and allocate inode inode contiguous blocks contigurous blocks

Disk

Contiguous free blocks Contiguous blocks (a) Allocate contiguous blocks

ioctl Figure 4: Defragment for a single file User

Kernel Target page cache Temporary

3. Move file data for each page as described in Figure inode write to inode 5(b). The following sequence of events should be block read from committed to the journal atomically. block

(a) Read data from original file to memory page. (b) Swap the blocks between the original inode Contiguous blocks and the temporary inode. (b) Replace data blocks (c) Write data in memory page to new block. Figure 5: Resolving single file fragmentation

3.3 Relevant File Fragmentation Block No = 100 Block No = 100 directory directory Relevant file fragmentation could be solved by moving the files under the specified directory close together with File1 File2 File1 File2 the block containing the directory data as presented in Block No Block No Figure 6. 60000 70000 101 102

Defragmentation for relevant files is done in the follow- Figure 6: Defragment for the relevant files ing steps (Figure 7).

1. The defragmentation program asks the kernel for 5. Repeat steps 2 to 4 for all files under the specified the physical block number of the first block of the directory. specified directory. The kernel retrieves the physi- cal block number of the extent which is pointed by 3.4 Free Space Fragmentation the inode of the directory and returns that value as in Figure 7(a). If the filesystem has insufficient contiguous free blocks, the other files are moved to make sufficient space to allo- 2. For each file under the directory, create a temporary cate contiguous blocks for the target file. Free space de- inode. fragmentation is done in the following steps (Figure 8). 3. The defragmentation command passes the physical block number obtained in step 1 to the kernel. The 1. Find out the block group number to which the tar- kernel searches for the nearest free block after the get file belongs. This could be calculated by using given block number, and allocates that block to the the number of inodes in a group and the inode num- ber as below. temporary inode as described in Figure 7(b). inodeNumber groupNumber = 4. Move file data for each page, by using the same iNodesPerGroup method as for single file defragmentation. 182 • ext4 online defragmentation

Specify "0" Return "104" as as the file the physical relative block block number number ioctl() ioctl extent1 extent0 User start=108 start=104 4blocks User 4blocks Kernel inode Kernel extent0 extent1 inode (inode No=201) start = 104 start = 204 5blocks 5blocks

exten0 exten1 (a) Get block mapping start=104 start=108 4blos 4blocs

ioctl ( goal = 100 ) User (a) Get extens

Kernel extent ioctl() original temporary start=104 4blocks inode inode User Allocate a block near goal Kernel Block No = 60000 Block No = 101 inode Register blocks to avoid being used by other files. (b) Specify goal as allocation hint extent start=104 4blocks Figure 7: Resolving relevant files fragmentation reservation window (b) Reserve blocks

2. Get the extent information of all the files belonging ioctl to the target group and choose the combination of User Kernel Search free blocks files to evict for allocating contiguous blocks large Target Temporary from the farthest inode inode group from the enough to hold the target file as in Figure 8(a). target inode.

3. Reserve the chosen blocks so that it will not be used group0 group1 group2 groupN by other processes. This is achieved by the kernel registering the specified blocks to the reservation (c) Move victim file window for each file as presented in Figure 8(b). extent ioctl start=104 14blocks User

Kernel Temporary Target inode 4. Move the data in the files chosen in step 2 to other inode block groups. The destination block group could extent start=104 be either specified by the defragmentation com- 14blocks mand or use the block group which is farthest away from the target group as in Figure 8(c). The file data is moved using the same method as for single Contiguous free blocks file defragmentation. (d) Insert the specified extent to temporary inode 5. Move the data in the target file to the blocks which have been just freed. The kernel allocates the freed Figure 8: Resolving free space fragmentation blocks to the temporary inode and moves the file data using the same method as for single file de- fragmentation as in Figure 8(d). 2007 Linux Symposium, Volume Two • 183

4 Implementation 4.2 Moving Target File Data (EXT4_IOC_DEFRAG)

The following five provide the kernel function for This ioctl moves file data from fragmented blocks to the online defragmentation. These ioctls use Alex Tomas’s destination. It uses the structures ext4_ext_defrag_ patch [3] to implement multi-block allocation feature data and ext4_extent_data shown in Figure 10 for to search and allocate contiguous free blocks. Alex’s input parameters. The behavior of this ioctl differs de- multi-block allocation and the proposed five ioctls are pending on the type of defragmentation it is used for. explained below. • Single file fragmentation The ioctls described in Section 4.3 to 4.6 are still under Both the start offset of defragmentation development, and have not been tested. Also, moving (start_offset) and the target size (defrag_size) file data for each page is currently not registered to the need to be specified. When both goal and ext.len journal atomically. This will be fixed by registering the are set to 0, the kernel searches for contiguous free sequence of procedures for replacing blocks to the same blocks starting from the first block of the block transaction. group which the target file’s inode belongs to, and replaces the target file’s block with the free blocks.

4.1 Multi-block allocation • Relevant file fragmentation In addition to start_offset and defrag_size, goal should be set to the physical block number of the Alex’s multi-block allocation patch [3] introduces a first block in the specified directory. When goal is bitmap called "buddy bitmap" to manage the contigu- set to a positive number, the kernel searches for free ous blocks for each group in an inode which is pointed blocks starting from the specified block, and re- from the on-memory superblock (Figure 9). places the target file’s blocks with the nearest con- tiguous free blocks.

The buddy bitmap is divided into areas to keep bitmap • Free space fragmentation of contiguous free blocks with length of powers of two, In addition to start_offset and defrag_size, ext e.g. bitmap for 2 contiguous free blocks, bitmap for 4 should be set to the information of the extent which contiguous free blocks, etc. The kernel can quickly find represents the new contiguous area for replace- contiguous free blocks of the desired size by using the ment. When ext.len is set to a positive number, buddy bitmap. For example, when the kernel requests the kernel replaces the target file’s blocks with the 14 contiguous free blocks, it searches in the area for 8, blocks in ext 4, and 2 contiguous free blocks and allocates 14 con- tiguous free blocks on disk. Since I am still designing the implementation of the fol-

Contiguous 14 blocks lowing four ioctls, I haven’t tested them yet.

ext4 superblock inode

s_buddy_cache i_mapping 4.3 Get Group Information (EXT4_IOC_GROUP_ Disk INFO) Group0 Group1 Group n

Noral Buddy bitmap bitmap This ioctl gets the group information. There is no in- Buddy bitmap manages put parameter. The kernel gets the number of blocks contiguous blocks in group. in a group (s_blocks_per_group) and the number of 1 1 1 1 1 1 0 1 1 0 0

Bits for Bits for Bits for the inodes (s_inodes_per_group) from ext4 memory su- 2 contiguous blocks 4 contiguous blocks 8 contiguous blocks perblock (ext4_sb_info) and returns them with the struc- ture (ext4_group_data) in Figure 11.

blocks_per_group is not used in the current implemen- Figure 9: Multi-block allocation tation, but it is returned in case of future use. 184 • ext4 online defragmentation

4.4 Get Extent Information (EXT4_IOC_GET_ I/O performance EXTENTS_INO) Fragments (Sec) Before defrag 12175 618.3 After defrag 800 460.6 This ioctl gets the extent information. The structure shown in Figure 12 is used for both input and output. Table 1: The measurement result of the defragmentation The command sets the first extent number of the tar- for a single file get extent to entries. The kernel sets the number of re- turned extents upon return. Since there might be very large number of extents in a file, the kernel returns ex- 5.2 Relevant File Fragmentation tents up to max_entries specified as input. If the number of extents is larger than max_entries, the command can The Linux kernel source code for 2.6.19-rc6 (20,000 get all extents by calling this ioctl multiple times with files) was extracted on disk. Time required to run find updated entries. command on the kernel source was measured before and after defragmentation. Performance measurement result 4.5 Block Reservation (EXT4_IOC_RESERVE_ is shown in Table 2. In this case, defragmentation re- BLOCK) sulted in 29% performance improvement.

This ioctl is used to reserve the blocks, so that it will I/O performance not be used by other processes. The blocks to reserve is (Sec) specified in the extent information (ext4_extent_data). Before defrag 42.2 The kernel reserves the blocks using the existing block After defrag 30.0 reservation function. Table 2: The measurement result of the defragmentation for relevant files 4.6 Move Victim File (EXT4_IOC_MOVE_ VICTIM) Defragmentation for free space fragmentation is still un- This ioctl moves a file from the block group where it be- der development. Performance will be measured once it longs to other block groups. ext4_extents_info structure has been completed. is used for input parameter. ino stores the inode number of the target file, and entries holds the number of extents 6 Related Work specified in ext. ext points to the array of extents which specify the areas which should be moved, and goal con- Jan Kara has proposed an online defragmentation en- tains the physical block number of the destination. hancement for ext3. In his patch [4], a new ioctl to exchange the blocks in the original file with newly al- The kernel searches for a contiguous free area starting located contiguous blocks is proposed. The implemen- from the block specified by goal, and replaces the tar- tation is in experimental status and still lacks features get file’s blocks with the nearest contiguous free blocks. such as searching contiguous blocks. It neither supports If goal is 0, it searches from the first block of the defragmentation for relevant files nor defragmentation block group which is farthest away from the block group for free space fragmentation. which contains the target file’s inode. During the discussion in linux-ext4 mailing list, there 5 Performance Measurement Results were many comments that there should be a common interface across all filesystems for features such as free 5.1 Single File Fragmentation block search and block exchange. And in his mail [5], a special filesystem to access filesystem meta-data was Fifty fragmented 1GB files were created. Read perfor- proposed. For instance, the extent information could mance was measured before and after defragmentation. be accessed by reading the file data/extents. Also the Performance measurement result is shown in Table 1. In blocks used to store file data could be exchanged to con- this case, defragmentation resulted in 25% improvement tiguous blocks by writing the inode number of the tem- in file read performance. porary file which holds the newly allocated contiguous 2007 Linux Symposium, Volume Two • 185

struct ext4_ext_defrag_data { // The offset of the starting point (input) ext4_fsblk_t start_offset; // The target size for the defragmentation (input) ext4_fsblk_t defrag_size; // The physical block number of the starting // point for searching contiguous free blocks (input) ext4_fsblk_t goal // The extent information of the destination. struct ext4_extent_data ext; } struct ext4_extent_data { // The first logical block number ext4_fsblk_t block; // The first physical block number ext4_fsblk_t start; // The number of blocks in this extent int len; }

Figure 10: Structures used for ioctl (EXT4_IOC_DEFRAG)

struct ext4_group_data { // The number of inodes in a group int inodes_per_group // The number of blocks in a group int blocks_per_group; }

Figure 11: Structures used for ioctl (EXT4_IOC_GROUP_INFO)

struct ext4_extents_info { //inode number (input) unsigned long long ino; //The max number of extents which can be held (input) int max_entries; //The first extent number of the target extents (input) //The number of the returned extents (output) int entries; //The array of extents (output) //NUM is the same value as max_entries struct ext4_extent_data ext[NUM]; }

Figure 12: Structures used for ioctl (EXT4_IOC_GET_EXTENTS_INO) 186 • ext4 online defragmentation blocks to data/reloc. During the discussion, two demer- Fragmentation,” in Proceedings of the Linux its were found. One is that common procedures across Symposium, Ottawa, 2006, Vol. 1, pp. 193–208, all filesystems are very few. The other is that there are a http://www.linuxsymposium.org/2006/ lot of overheads for encoding informations in both user- linuxsymposium_procv1.pdf. space and the kernel. Therefore this discussion has gone away. This discussion is for unifying interface and is not [2] “File system fragmentation.” for the implementation for the online defragmentation http://en.wikipedia.org/wiki/File_ which is explained by this paper. system_fragmentation. [3] Alex Tomas. “[RFC] delayed allocation, mballoc, 7 Conclusion etc.” http://marc.info/?l=linux-ext4&m= 116493228301966&w=2.

Online defragmentation for ext4 has been proposed and [4] Jan Kara. “[RFC] Ext3 online defrag,” implemented. Performance measurement has shown http://marc.info/?l=linux-fsdevel&m= that defragmentation for a single file can improve read 116160640814410&w=2. performance by 25% on a fragmented 1GB file. For relevant file fragmentation, defragmentation resulted in [5] Jan Kara. “[RFC] Defragmentation interface,” 29% performance improvement for accessing all file in http://marc.info/?l=linux-ext4&m= the Linux source tree. 116247851712898&w=2.

Online defragmentation is a promising feature to im- prove performance on a large filesystems such as ext4. I will continue development on the features which have not been completed yet.

There are also work to be done in the following areas:

• Decrease performance penalty on running pro- cesses Since it is possible for defragmentation to purge data on the page cache which other processes might reference later, defragmentation may decrease per- formance of running processes. Using fadvice to alleviate the performance penalty may be one idea.

• Automate defragmentation To reduce system administrator’s effort, it is nec- essary to automate defragmentation. This can be realized by the following procedure.

1. The kernel notifies the occurrence of frag- ments to user space. 2. The user space daemon catches the notifica- tion and executes the defragmentation com- mand.

References

[1] Giel de Nijs, Ard Biesheuvel, Ad Denissen, and Niek Lambert, “The Effects of Filesystem The Hiker Project: An Application Framework for Mobile Linux Devices

David “Lefty” Schlesinger , Inc. lefty@{hikerproject.org,access-company.com}

Abstract four or five years ago. Cell phones increasingly have CPUs running at close to half a gigahertz, dynamic The characteristics of mobile devices are typically an or- RAM of 64 megabytes and beyond, and storage, typi- der of magnitude different than desktop systems: CPUs cally semiconductor-based, in capacities of several gi- run at megahertz, not gigahertz; memory comes in gabytes. Current MP3 players can have disk capaci- megabytes, not gigabytes; screen sizes are small and in- ties well beyond what was typical on laptop systems put methods are constrained; however, there are billions only two or three years ago. The usage models for cell of mobile devices sold each year, as opposed to millions phones and other mobile devices, however, tend to be of desktop systems. Creating a third-party developer very different than that of desktop systems. ecosystem for such devices requires that fragmentation The bulk of applications-related development effort on be reduced, which in turn demands high-quality solu- open source-based systems has been primarily focused tions to the common problems faced by applications on on servers and desktop devices which have a distinct us- such devices. The Hiker Project’s application frame- age model which does not adapt well to smaller, more work components present such solutions in a number constrained mobile devices. In order to effectively use of key areas: application lifecycle management, task-to- open source-based systems to create a mobile device, task and task-to-user notifications for a variety of events, and particularly to enable general applications develop- handling of structured data (such as appointments or ment for such a device, a number of additional services contact information) and transfer of such data between are needed. devices, management of global preferences and settings, and general security issues. Together, these components The term application framework means different things comprise an “application framework,” allowing the de- to different people. To some, it is the GUI toolkit that velopment of applications which can seamlessly and is used. To others, it is the window manager. Still oth- transparently interoperate and share information. ers see it as the conventions for starting and running a process (e.g. Linux’s fork() and exec() calls; C ACCESS Co., Ltd., originally developed the Hiker programs have an entry point called “main” that takes Project components for use in their “ACCESS Linux a couple of parameters and returns an integer; etc.). Platform” product, but recently released them under an open source license for the benefit and use of the open When we use the term “application framework,” we in- source community. This paper will describe, in detail, tend it to mean the set of libraries and utilities that sup- the components which make up the Hiker Project, dis- port and control applications running on the platform. cuss their use in a variety of real-world contexts, ex- Why are additional libraries needed to control applica- amine the proliferation of open source-based mobile de- tions? Why not just use the same conventions as on a vices and the tremendous opportunity for applications PC: choose programs off a start menu and explicitly end developers which this growth represents. an application by choosing the “exit” menu item or clos- ing its window?

1 The Need for, and Benefits of, a Mobile The reason is the different “use model” on handheld Framework devices. PCs have large screens that can accommo- date many windows, full keyboards, and general point- The typical cell phone of today is generally equivalent ing devices. A cell phone typically has a screen fewer in power, in various dimensions, to a desktop system of than three inches diagonal (although pixel resolutions

• 187 • 188 • The Hiker Project: An Application Framework for Mobile Linux Devices can be equivalent to QVGA and higher), a complement The Hiker components then, broadly speaking, focus on of under twenty keys and a “five-way” pointing device several key areas: or a “jog wheel.” Based on experience refining Garnet (formerly known as “Palm OS”), we believe that there should be only one active window on a typical handheld • presenting a common view of applications to the device. As another example, when the user starts a new user (Application Manager and Bundle Manager), application, the previous application should automati- cally save its work and state, and then exit. • communicating time-based and asynchronous events between applications or between an appli- Similarly, the optimal conceptual organization of data cation and the user (Notification Manager, Alarm is different on mobile devices. Rather than an “ap- Manager, Attention Manager); plication/document” paradigm, using an explicit, tree- structured file system, a “task-oriented” approach, • interoperability of applications and sharing of where data is inherent to the task, is more natural on information between them (Exchange Manager, these devices. Tasks, as opposed to applications, are Global Settings), and short-lived and focused: making a call, reading an SMS message, creating a contact or appointment, etc. Occa- • performing these functions in a secure context (Se- sionally, tasks will be longer-lived: browsing the web, curity Policy Framework/Hiker Security Module) or viewing media content. Another typical attribute of such devices is regular, unscheduled interruptions: a low-battery warning, an incoming email, a stock or news The Abstract IPC service is used by these components in alert. order to simplify their implementation and allow them to generally take advantage of improved underlying mech- As well as the task of managing the lifecycle of pro- anisms through a single gating set of APIs. Taken to- grams (launching, running, stopping), the application gether, they provide mechanisms to allow seamless in- framework must also help with distributing and in- teroperability and sharing of information between suites stalling applications. The conventions are simple: an of applications in a trusted environment. application and all supporting files (images, data, local- izations, etc.) are rolled up into a single file known These components are intended to offer several concrete as a “bundle.” Bundles are convenient for users and benefits to the development community: third party developers, and allow software to be passed around and downloaded as an atomic object.

A third task of the application framework is to sup- • They provide real reference implementations that port common utility operations for applications, such as can serve as the basis for application developers communication between applications, keeping track of who want to write interoperable applications for which applications handle which kind of content, and mobile devices. dealing with unscheduled events like phone calls or in- stant messages. • They (hopefully) help to jump-start activities re- lated to mobile devices in several key areas (e.g. The final task of the application framework is to imple- security) by filling a number of current gaps in the ment a secure environment for software. That means an services available to applications in that space. environment which resists attempts by one application to interfere with another (this hardening is called “appli- • They can help to generally increase interest in ap- cation sandboxing”). The secure environment also sup- plication development for open source-based mo- ports security policies for permission-based access to re- bile devices sources. For example, part of a policy might be “only applications from the vendor are allowed network ac- • They might encourage other companies to both cess.” The security policy is implemented using a Linux participate in these projects and to contribute new security module. projects as well. 2007 Linux Symposium, Volume Two • 189

1.1 Access to the Project, Licensing Terms, etc. to processes provides a secure, stable model for appli- cation execution. To maximize utility on small screen Full source code and other reference material for Hiker form factors, the Application Manager will preserve the can be found at the Hiker Project web site, http:// standard “Garnet OS” behavior of having one main UI www.hikerproject.org. application at a time, with that application having draw- ing control of the main panel of the screen (exclusive Sources are currently available as tarballs, but we ex- of status bars, etc.) When the user runs a new appli- pect to be putting a source code repository up in the cation, the system will generally ask the current one to near future. Several mailing lists are available, and a exit (although there is a facility for applications which TRAC-based bug reporting/wiki system will be in place need to continue execution in the background). Also, as shortly. A large amount of detailed API documentation, in legacy “Garnet OS” and desktop operating systems generated by Doxygen, is also available on the site. like Mac OS X, only a single instantiation of any given application at time will be running. Hiker is dual-licensed under the , v. 1.1 and the Library General Public License, v. 2, with The Application Manager handles the high-level opera- the exception of the LSM-based portion of the Secu- tions of launching applications, and provides a number rity Framework, which, as an in-kernel component, is of APIs to applications and to the system for access to licensed under GPL v. 2. these services. It maintains a list of currently running applications, and keeps track of which one is the “pri- 2 The Application Manager mary” or “main UI” application. This special status is used to coordinate the display and hiding of UI when The Application Manager controls application lifetime the user moves between applications. Note that an ap- and execution and is the component that is responsible plication that is not the main UI application may still for maintaining the simple application task model that put up UI under exceptional circumstances, it is simply users expect on a phone. The application manager starts recommended that this be done only occasionally (for and stops applications as appropriate, ensures that the example to ask the user for a password), and that it be current running application controls the main display, in the form of a modal dialog. The user’s attention is maintains a list of running applications, places applica- generally focused on the main UI application, and so UI tions in the background, and prevents multiple launches from background applications is often an interruption. It of a single application. is expected that most applications will not run in back- ground mode. The Application Manager is initiated at system start-up and provides the following services: 3 The Bundle Manager

• Application launching mechanism and manage- The Bundle Manager combines a file format for dis- ment of application lifecycle. tributing single files that contain applications and all • Routing of launch requests to the existing instance their dependencies (application name, icon, localized (if the application is already running). resources, dependant libraries, etc.) along with func- tions for installing, removing, and enumerating applica- • Coordination of the “main UI app” – retires the cur- tion “bundles.” The Bundle Manager takes care of al- rent application when a new one is launched, and location of per-application storage locations in writable launches a default when needed. storage in the file system, and providing access to local- • May also provide default behavior for certain im- ized resources contained within the bundle. Through the portant system events (i.e., hard key handling, Bundle Manager enumeration mechanism, multiple ap- clamshell open/close). plication types are merged and can be launched through a common mechanism.

The Application Manager runs in its own process. Ap- The Bundle Manager is the system component respon- plications typically run in their own processes and con- sible for controlling how applications, and supplemen- trol their own UI. This simple mapping of applications tal data for applications (libraries, resources, etc.), are 190 • The Hiker Project: An Application Framework for Mobile Linux Devices

ACCESS Linux Platform Launcher Applications SDK Tools Entertainment Telephony Home NetFront PIM 3rd-party Java Garnet GTK+ SMS, MMS, Suite & Screen Browser Applications Native Apps Apps Apps Messaging (NFDM) IM (Audio, Video, Photo Apps Players) (SVG, SMIL)

Legacy Applications Native Applications

Launchpads (Native, Garnet, Java) ACCESS Linux Platform UI & Core Application Services Frameworks Attention/Alarm/ Exchange Application Server Bundle Manager Notification HotSync® Manager Managers Plug-Ins Mobile Messaging Bluetooth Framework Services User Space Garnet VM Multi- Exchange Services Custom Java VM, OTA OTA SyncML GTK+ (Palm OS) media Manager Widgets JSRs Device Data Emulator Services Mgmt Global Settings DRM Gstreamer Connection Telephony BlueZ Security Policy Manager X Windowing System Framework SQLite H/W Acceleration OpenSSL

Hiker Security Module

Service/ Networking Hardware Kernel Process Graphics & Memory Input File Multimedia Stack Bluetooth Management Display Management Driver System Drivers Driver Power Drivers Networking Management Drivers

Input Multimedia GSM/GPRS/UMTS/EDGE, CPU RAM LCD NAND Device Hardware WiFi, Bluetooth Technology, etc.

Open Source Software Hiker Framework ACCESS Proprietary

Figure 1: The Hiker Components in the ACCESS Linux Platform loaded onto a system using the Application Manager a device. The server component of the Bundle Manager and other framework components, manipulated, trans- is intended to be the only software on the device which mitted off of the system, and removed. has permission to access the bundle folder on the in- ternal filesystem, requiring interaction with the Bundle The Bundle Manager is a mid-level library and server Manager for installation or removal. which provides easy access to application resources for developers, as well as maintaining state about bundles The Bundle Manager provides consistent mechanisms present in the system. for retrieving resources, both localized and unlocalized The Bundle Manager is designed around the notion of (i.e., loading the files or other bundle contents, perhaps “bundles” as concrete immutable lumps of information in a localized folder-name) from bundles. which are managed on a device, where each bundle can contain an arbitrary amount of data in a format appro- 4 The Notification Manager priate to that bundle. Each bundle type is defined both in terms of how it is stored on the device, and as a “flat- Notification Manager provides a mechanism for sending tened” format suitable for transmission as a stream out programmatic notifications of unsolicited system events of the device. (These formats may be the same, differ- to applications. Applications can register to receive par- ent, or overlap at various times). ticular types of notifications that they are interested in. The Bundle Manager is the channel through which all The Notification Manager can deliver notifications not third-party applications are distributed and loaded on to only to currently running applications, but also to ap- 2007 Linux Symposium, Volume Two • 191 plications that are registered to receive them but are not 5 The Alarm Manager currently running. The Alarm Manager provides a mechanism to notify ap- Notifications are general system level or user applica- plications of real time alarm (i.e. time-based) events. tion level events like application installed/uninstalled, Both currently running and non-running applications card inserted/removed, file system mounted/unmount, can receive events from the Alarm Manager. The Alarm incoming call, clamshell opened/closed, time changed, Manager does not control presentation to the user—the locale changed, low power, or device going to action taken by an application in response to an alarm is sleep/waking up. The Notification Manager has a defined by the application. client/server architecture. The Alarm Manager: 4.1 Notification Manager server • Works with power management facilities to regis- The Notification Manager server is a persistent thread ter the first timer; in a separate system process which keeps track of all registered notifications and broadcasts notifications to • Calls the Application Manager to launch the appli- registered clients. The Notification Manager server also cations for which an alarm is due; communicates with the and Applica- • Supports multiple alarms per application; and tion Server. • Stores its alarm database in SQLite for persistence. 4.2 Notification Manager client library The Alarm Manager has no UI of its own; applica- Client processes call APIs in the Notification Manager tions that need to bring an alarm to the user’s attention client library to must do this through the Attention Manager. 
The Alarm Manager doesn’t provide reminder dialog boxes, and it doesn’t play the alarm sound. Applications should use 1. register to receive notifications, the Attention Manager to interact with the user. 2. unregister previously registered notifications, 6 The Attention Manager 3. signal the completion of a notification, and

4. broadcast notifications. The Attention Manager manages system events that need to be presented to the user, such as a low battery condition or an incoming call (rather than programmatic The Notification Manager client library uses the Ab- events delivered to other applications like the service stract IPC framework to communicate with the Notifi- provided by the Notification Manager). The Attention cation Manager server. Manager uses a priority scheme to manage presentation of items needing attention to the user. The infrastructure 4.3 What the Notification Manager is not used by applications and system services to ask for at- tentions and the storage of the currently pending list of 1. The Notification Manager should not be used for events requiring the user’s attention is separate from the application specific or directed notifications like actual presentation of the events to the user. alarms or find. The Attention Manager is a standard facility by which 2. The Notification Manager facilitates the sending applications can tell the user that something of signifi- and receiving of notifications but it does not it- cance has occurred. The Attention Manager is respon- self broadcast notifications (individual component sible only for being a nexus for such events and inter- owners are responsible for broadcasting their own acting with the user in regards to these events; it is not notifications). responsible for generating the events. 192 • The Hiker Project: An Application Framework for Mobile Linux Devices

The Attention Manager provides both a single alert di- regarding an upcoming appointment does not delete the alog and maintains a list of all “alert-like” events. To- appointment, and dismissing an SMS reminder does not gether these improve the user experience by first getting delete the SMS message from the SMS inbox. the user’s attention when needed, then allowing the user to deal with the attention or dismiss for review later. The Attention Manager is not an event logger, nor an By handling it this way, it is no longer necessary to event history mechanism. It contains only the state of click through a series of old alert dialogs. Often the active attention events. user doesn’t care about most of the missed appointments The Attention Manager is organized in order to meet or phone calls—although he might care about a few of specific design goals, which are: them. Without the Attention Manager, the user cannot selectively dismiss or follow up on the alert events but would instead have to deal with each alert dialog in turn. • To separate UI from attention event logging mech- Applications have complete control over the types and anisms; level of attention they can ask for. • To provide sufficient configurability so that a li- censee may alter both the appearance and behavior Typical flow of an attention event: of the posted events;

• An application (e.g. the calendar of “Date Book”) • To maintain persistent store of events that will sur- requests the Alarm Manager to awaken it at some vive a soft reset; time in the future. • To be responsive (need to be quick from alert to • The Alarm Manager simply sends an event to UI—prime example is incoming call); and an application when that future point in time is • To minimize memory usage (i.e. try to get as small reached. The application can then post an event of memory foot print as possible). to the Attention Manager with the appropriate pri- ority. The server model currently utilized relies on an init • The Attention Manager will present the appropriate script to start a small daemon. The daemon accepts alert dialog based on the event type and priority. IPC requests to post, update, query or delete events and • The Attention Manager is designed solely to inter- maintain such event state in a database to provide per- act with the user when an event must be brought to sistent storage for the events. The daemon launches the the user’s attention. attention UI application which will display the appro- priate alert dialog. If the snooze action is chosen for an event (provided that a snooze action is associated 6.1 When the Attention Manager isn’t appropriate with the event), the Attention UI application will call the Alarm manager to schedule a wakeup. The wakeup The Attention Manager is specifically designed for at- will be in the form of a launch or relaunch via the Appli- tempts to get attentions that can be effectively handled cation server. In this model, the attention status gadget or suspended. The Attention Manager also doesn’t at- is assumed to be polling the Attention Manager daemon tempt to replace error messages. Applications should for active event state and using that information in dis- use modal dialogs and other existing OS facilities to playing status. If the attention status gadget is “clicked” handle these cases. on, it starts the Attention UI through the Application Server. The Attention Manager is also not intended to act as a “To Do” or “Tasks” application, nor act as a “universal 6.2 Features in-box.” Applications must make it clear that an item ap- pearing in the Attention Manager is simply a reminder, and that dismissing it neither deletes nor cancels the The Attention Manager implements the following major item itself. That is, saying “OK” to an attention message components: 2007 Linux Symposium, Volume Two • 193

• attention events;

• an API library for posting, updating, deleting, and querying events;

• a server through which events are posted, retrieved, and managed;

• a UI application that displays the alert dialogs;

• an event database and associated DML API; and

• a status gadget for the status bar.

7 Abstract IPC

The Abstract IPC service provides a lightweight, robust interface to a simple message-based IPC mechanism. The actual implementation can be layered on top of other IPC mechanisms. The current implementation is based on Unix sockets, but this mechanism can be layered on other IPC mechanisms if required. The current implementation has a peer-to-peer architecture that minimizes context switches, an important feature on some popular embedded architectures.

The Abstract IPC Service comprises an API for a simple interprocess communication (IPC) mechanism used by the framework components described in this paper.

The goals of this design include:

• Independence from underlying implementation mechanisms (e.g. pipes, sockets, D-BUS, etc.);

• A simple, easy-to-use send/receive message API;

• Support for marshalling/unmarshalling message data; and

• Minimization of context switches by sending messages directly between processes without passing through an intermediary process.

The IPC mechanism is based on a single server process exchanging messages with one or more clients. The server process creates a channel; clients connect to the channel and receive a connection pointer they can use to send/receive messages to the server. The format of the messages is completely up to the server and clients of the channel. Communication can be synchronous or asynchronous. When done asynchronously, processes receive messages via a callback mechanism that works through the GLib main loop. A sketch of what such an API might look like appears below.
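To make the channel model concrete, the following is a minimal C sketch of an API in the spirit of the description above. Every name and signature here is an illustrative assumption, not the actual Hiker Abstract IPC interface.

    #include <stddef.h>

    typedef struct ipc_channel    ipc_channel_t;     /* created by the server */
    typedef struct ipc_connection ipc_connection_t;  /* one per client        */

    /* Callback used for asynchronous receive; in the implementation
     * described above it would be dispatched from the GLib main loop. */
    typedef void (*ipc_msg_cb)(const void *msg, size_t len, void *user_data);

    /* Server side: create a named channel (backed by, e.g., a Unix socket). */
    ipc_channel_t *ipc_channel_create(const char *name);

    /* Client side: connect to an existing channel and obtain a connection. */
    ipc_connection_t *ipc_connect(const char *name);

    /* The message format is entirely up to the server and its clients. */
    int ipc_send(ipc_connection_t *conn, const void *msg, size_t len);
    int ipc_recv(ipc_connection_t *conn, void *buf, size_t buflen);      /* synchronous  */
    int ipc_recv_async(ipc_connection_t *conn, ipc_msg_cb cb, void *ud); /* asynchronous */

Because messages travel directly between the two processes rather than through a broker, a send/receive exchange avoids the extra context switches of an intermediary process, which is one of the design goals listed above.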

8 Security Policy Framework and Hiker Security Module

The Security Policy Framework (SPF) is the component which controls the security policy for the device. The actual policy used by the framework is created by a licensee and can be updated. Policy is flexible and separate from the mechanisms used to enforce it. The policy can express a wide (and extensible) range of policy attributes. Typical elements of a policy address use of file system resources, network resources, password restriction policies, access to network services, etc. Each policy is a combination of these attributes and is tied to a particular digital signature.

Applications are checked for a digital signature (including no signature or a self-signature) and an appropriate security policy is applied to the application. One of the policy decisions that can be made by the framework is whether the user should be consulted—this allows end-users to control access to various types of data on the device and ensures that malicious applications will not access this data covertly. Other types of decisions are allow/deny, which may be more appropriate for a carrier to use to protect access to network resources, etc.

The Hiker Security Module is a kernel-level enforcement component that works in concert with the Security Policy Framework. The Security Module controls the actual access to files, devices, and network resources. Because it is an in-kernel component, the Security Module is released under the GPL.

There must be some user control over who is allowed to connect to the user’s device and request action from it. Security is mostly based on user control at the following levels:

• There will be default handlers (implemented, for example, by the in-the-box PIM applications) for a set of published standard services. In the event that a third-party application tries to register a duplicate handler, and the first handler declared it wanted to be unique, the user is alerted and asked to arbitrate which application should be the installed handler. The user is the authority that tells the system which handler wins in case of conflict. This protection works regardless of which handler installs first. In a model where all installed applications are signed and therefore trusted, we can rely on what applications do when they register.

• When there is a non-authenticated incoming connection, the user is asked to authorize the connection. Local is obviously initiated by the user and is always valid. IR is considered authenticated, as the user must explicitly direct his device toward the initiator. Paired Bluetooth is authenticated by definition. SMS is authenticated by the mobile network. TCP may be considered authenticated if the source IP address figures in the table of trusted sources (although this may be debated, as it is easy to spoof the source IP). TCP may also be configured to require a challenge password before it will read from the connection.

• Above the connection-level authentication, handlers may also require permission from the user before they perform their action. This is handler-specific; “get vCard” is obviously a very good candidate for user authorization.

9 The Exchange Manager

The Exchange Manager is a central broker to manage inter-application/inter-device communication. Requests to the Exchange Manager contain verbs (“get”, “store”, “play”), data (qualified by MIME type), as well as other parameters used to identify the specific item to be affected by the request. Use cases of the Exchange Manager include beaming a contact to another device, taking a picture using the camera from an MMS application, looking up a vCard based on caller ID, viewing an email attachment, etc.

The Exchange Manager is an extensible framework: new handlers can be created for new data types and actions, as well as for new transports (e.g. IR, Local, Bluetooth, SMS, TCP/IP, etc.).

There is a need for any application to be able to perform different tasks (such as play, get, store, print, etc.) on several types of data. This component offers a simple, yet expandable, API that lets any application defer the actual handling of the data to whoever declared itself the handler for this action/type-of-data pair. In addition, the destination of the request (where the action will actually be performed) can be specified as the local device or a remote location. This opens up new universal possibilities, such as directly sending a vCard to someone over the Internet while being on the phone with him.

This component also implements the legacy Palm OS “Garnet” Exchange Manager functionality (send data to a local or remote application). This simply corresponds to the “store” action.

The Java VM also implements similar functionality (JSR211—CHAPI). It is foreseen that the Java and native components will interoperate. In other words, a Java application will be able to send a vCard to a remote device through IR, as well as receive data from a local native application, for some examples.

9.1 Features

The purpose of this component is to enable an application or system component to request the system to perform some action on some data, without knowing who will carry out and fulfill the request. In addition, the client can request that the action be performed either on the local device or on some specific other destination (mobile phone or desktop PC, etc.), using any available transport. This opens up new interesting scenarios that were not possible before.

An application that implements a service it wants to make available to others registers a handler to tell the system it can perform this specific action on this specific type of data. The handler may also specify that it will accept only local requests, or that it will accept all requests. Each registered handler is valid for only one combination of action/data type. The action/data type is defined by a verb and a MIME type (e.g. “store – text/vCard”).

Transport modules are responsible for carrying the request from the source to the destination device. When the request arrives at the destination (which may be the local machine), the transport hands it to the Exchange Manager, which then invokes the corresponding action handler to do the actual work. A result may be sent back to the initiator, which means it is also possible to use this component to retrieve some data (not only send it), or to pass some data and get it back modified in some way. An obvious use case would be to use your phone to look up a contact in your desktop PC address book, or to retrieve a contact’s photo and name given its phone number.

Verbs can be accompanied by parameters. Only a few parameters are common to all verbs (e.g. a human-readable description of the data that may be used to ask the user if he accepts what he is receiving). Parameters are passed as tag/value pairs (the value being an int or a string). Some are mandatory, but most are optional and depend on the specific handler definition. Using parameters, the result of an action handler can be precisely customized to the client’s needs.

Parameters can be used by the action handler to find out how exactly it should perform its action, or by the transport module to get the destination address or other transport-specific information.

A non-exhaustive list of actions that will be defined includes store, get, print, and play.

Handlers can be registered or unregistered at any time. A board game would register the “moveplayer” action (a handler to receive other players’ moves) when it is launched, and would send its own moves by requesting this same action from the other user’s device. The game would unregister the handler when it quits. Note that this example does not mean the Exchange Manager could be used as a network medium for more than peer-to-peer exchanges.

For handlers that are essentially data consumers and don’t return anything (like “store”), it may make sense to let multiple handlers register for the same verb/data. For this reason, it is left to the handler to specify at registration time whether it must be unique. In case it must be unique and someone tries to register a second handler for the same verb/data, the user is notified and has the responsibility to arbitrate which of the two handlers should be the active one.

An application cannot verify the availability and identity of a handler, and this would not make sense in general usage (it is the goal of the Exchange Manager to have an unknown handler execute a request). If an application needs to authenticate the handler, this means it is looking for a very precise handler it knows; in this case it could use, for example, data encryption to ensure that only the person with the right key can understand the request.

An Exchange Manager daemon is started at boot time (or at any other time). The daemon listens for incoming action requests for all transports. When a transport has an incoming request ready, the daemon dispatches it to the right action handler. The action handler then performs its duty and returns the result back to the transport. The answer is then sent back to the originator.

When the originator is local, the user is never asked to accept the incoming data. When the originator is remote, whether a user confirmation is required depends on the transport being used. For IrDA, it is assumed that the user implicitly accepts, as he directs his device toward the emitter (it is still to be decided whether we ignore the possibility that someone could beam something to you without your consent while you have the device turned on). For Bluetooth, it depends on whether the connection is paired or not. For SMS, the mobile network identifies the originator (in addition, there is no “connection” with SMS). For TCP, the transport configuration will tell whether the originator IP is authorized, and it will ask if not.

Transports are independent modules that can also be added or removed at any time. Typical transports would include Local, IrDA, Bluetooth, SMS, and TCP. Non-Hiker destination systems must run at least an Exchange Manager daemon and transports as well. Except for the handler invocation part, this should be the same code for any Linux platform (the shared library, daemon, and transports use only standard Linux and GTK services). It would be easy, for example, to implement a new encrypted transport, should the need arise. The library provides a standard UI dialog to let the user select a transport and enter the relevant parameters. If the transport information is missing in the request, the Exchange Manager will itself pop this UI up at the time it needs the information. If the transport determines that the destination address is missing, it will also pop up an address selection UI. A sketch of what the registration interface might look like appears below.
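As an illustration of the registration model just described, here is a hypothetical C sketch. The names, flags, and types are assumptions made for illustration; they are not the real Exchange Manager API.

    typedef struct xm_data xm_data_t;   /* opaque request/result payload */

    /* A handler performs one verb on one MIME type (e.g. "store" on
     * "text/vCard") and may return result data to send back (or NULL). */
    typedef xm_data_t *(*xm_handler_fn)(const xm_data_t *request);

    #define XM_LOCAL_ONLY 0x1   /* accept only local requests              */
    #define XM_UNIQUE     0x2   /* ask the user to arbitrate on duplicates */

    int xm_register_handler(const char *verb,       /* "store", "get", "play" */
                            const char *mime_type,  /* "text/vCard", ...      */
                            unsigned int flags,
                            xm_handler_fn handler);

    int xm_unregister_handler(const char *verb, const char *mime_type);

A board game like the one in the example above would call xm_register_handler("moveplayer", ..., XM_UNIQUE, handler) at launch, and xm_unregister_handler() when it quits.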

To maintain transfer compatibility with phones or legacy Garnet OS devices, the initiator may set a parameter to indicate that it wants to use Obex as the transport. In this case, the only verb allowed is “store.” The transport plug-in will recognize this parameter and send the data using the OpenObex daemon. A transport plug-in can also receive data from the OpenObex daemon; it would treat that like an incoming connectionless “store” action on the received data.

For all other combinations of action/data type, the transport protocol is ours (or third party, in the case of a third-party transport). The Obex protocol is handled by the OpenObex library. SMS NBS transport is handled in the SMS component. Some specific Bluetooth profiles (e.g. the basic imaging profile) could also be handled by the Bluetooth transport in order to maximize compatibility with other kinds of devices.

An “Exchange request” is the only entity an application works with. It is an opaque structure that contains all the information characterizing a request: the verb, the parameters, the data reference, and the destination. There are APIs to set and get all of them. The data itself can be specified in multiple ways: as a file descriptor (data will be read from this fd by the transport) or as a URL (the URL is sent, and the action handler will access the data through the URL).

10 Global Settings

The Global Settings component provides a common API and storage for all applications and services to access user preferences (fonts, sizes, themes) and other application settings and configuration data. Global Settings are hierarchical and could be used, e.g., as the basis for OMA Device Management settings storage. The component is designed to support the security requirements of OMA Device Management. The storage for Global Settings uses the recently open-sourced libsqlfs project layered on SQLite. Global Settings provides generic, non-mobile-specific storage.

The Global Settings service provides functions for storing user preferences. It provides APIs for setting and getting software configurations (typically key-value pairs such as “font size: 12 points”). The settings keys may form a hierarchy like a directory tree, with each key comparable to a file or directory in a file system, e.g. applications/datemanager/fontsize. Indeed, Global Settings is actually implemented on top of an abstract POSIX file system: each key has a value and meta data such as a user id, a group id, and an access permission.

To accomplish this, Global Settings utilizes libsqlfs, a service component which creates the abstraction of a POSIX file system within an SQLite database file. An initial implementation was attempted using gconf. To support OMA Device Management requirements, we dropped the gconf design, which could not provide security over keys.

The Global Settings service has two parts: the daemon (aka server) and the client library. You could store settings by simply having a server process; however, to make it trivial for clients to talk to the server, we also provide a client library which handles the IPC to the server. The client library is linked in with the client application.

The client library and the daemon communicate via Abstract IPC. The daemon does the actual I/O for the data; it links with the SQLite library and does the key content reads and writes on SQLite databases, with the tree hierarchy actually implemented using relational tables through SQL. SQLite provides no effective access control, so the daemon uses Unix file access control on the database file to exclude everyone else. The daemon also keeps track of the users and groups that are allowed to access certain keys, and enforces access control. The SQLite database files will be readable and writable only by the daemon process.

10.1 Data Model

1. We expose key/value pairs where values amount to the “contents” of the key and can be arbitrary data of arbitrary size.

2. Key descriptions (in multiple languages) are supported only by convention (see below).

3. We support settings that are user, group, or other readable/writable (or not); the user and group identities are those of the system and are granted by the system. The Global Settings service does not give special meaning to any specific group or user id.

The possible key names or key strings form a key space, which is similar to the space of file paths. The key strings are also called key paths. Each key path can be absolute or relative; absolute key paths must start with /, and relative key paths are relative to a “current directory” or “cwd.” There are APIs to get and to set the “cwd,” as sketched below.
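Here is a hypothetical sketch of what the client-library calls might look like, given the key-path model above. The gs_* names are assumptions for illustration, not the actual Global Settings API.

    #include <stddef.h>

    int gs_set_string(const char *key_path, const char *value);
    int gs_get_string(const char *key_path, char *buf, size_t buflen);

    /* Relative key paths resolve against a per-process "cwd". */
    int         gs_set_cwd(const char *key_path);
    const char *gs_get_cwd(void);

    void example(void)
    {
            char size[16];

            /* Absolute key path, comparable to a file path: */
            gs_set_string("/applications/datemanager/fontsize", "12");

            /* The same key through a relative path: */
            gs_set_cwd("/applications/datemanager");
            gs_get_string("fontsize", size, sizeof(size));
    }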

A “directory” is a key which contains other keys. A directory is similar to the interior nodes in the OMA Device Management Tree definition. To simplify our design, we disallow a directory key from having its own content. So if you add a child key to an existing key which has no value (or which has the null value), that key becomes a directory, and subsequent attempts to set the content for that key will fail. Permissions for directories thus follow the semantics of a POSIX file system.

A directory key can be created in two ways:

• A directory is created if a request is made to create a key that would be a child or a further descendant of the directory;

• Or a key can be created with an explicit API for this purpose.

Key names follow a standard convention to allow grouping of similar attributes under the same part of the key hierarchy. We follow the GConf conventions as closely as possible for application preferences and system settings, with consideration for device-specific standards like /dm for OMA Device Management.

Some typical keys are represented below:

• /dm for OMA Device Management

• /capabilities for “is Java installed,” “is Garnet OS emulation installed,” etc.

• /packages/com.access.apps.app1 for preferences of “app1”

The Global Settings service is not meant for storage of security-sensitive data such as passwords or private keys. The data in Global Settings are only protected by POSIX file permissions and are not encrypted or otherwise protected!

Global Settings implements the POSIX file permission model (user, group, and others; readable, writable, or executable) on the keys and the key hierarchy. So a run-time process has to acquire the appropriate user or group IDs to access a key, the same way it accesses a file with the same POSIX permission bits in the host file system. Otherwise, Global Settings places no meaning on the uids and gids, except uid 0, which has permissions for everything, as root. The SUID bit is not honored and is ignored by Global Settings.

The x bits in directories determine whether the directories can be entered; the x bits in non-directory keys are ignored and currently have no meaning.

The uid and gid of a key are modifiable according to the following rules:

• The owner id cannot be changed except by root.

• The group id cannot be changed except by root or the owner.

Global Settings has reserved room for extra attributes for keys, currently not implemented; these could be used for access control lists or some other meta data, but we treat such usage as best avoided. If POSIX permissions provide all that is needed, that is best, as it is simple and efficient.

With the above convention for keys, it becomes desirable for applications or specific device drivers to create their key hierarchy at installation time (or at the point of system image creation). Once created, a hierarchy may be protected from access by applications other than the designated applications.

The initial key installation follows these steps:

• A group id is pre-assigned to a particular application (group). This assignment is to be guaranteed at run time by the Bundle Manager and system security.

• After an application is installed, it, or the Bundle Manager, will use a Global Settings tool or equivalent APIs to copy the key data from the default settings XML file into the Global Settings database and properly set the user and group ids and the permissions for these keys.

• Later, the Global Settings service ensures that the key hierarchy is accessible only to applications with the right group membership or user id. The Unix-style file permissions will be the only security mechanism protecting the keys and key values; Global Settings does not provide an encryption-based protection mechanism.

The Global Settings daemon is started at system boot time. Packages need to include their associated default preferences, and these need to be installed in the correct fashion for the underlying preferences system.

Clients that wish to be informed of changes in settings can register a callback with the Notification Manager. The Global Settings daemon will send the Notification Manager a string representing each key that it changes. The Notification Manager will notify each application that has registered to be told about changes to that key, or to a prefix of that key. For example, an application that registered for oma/apps would be notified when any of the following keys changed: oma/apps/calendar/fontsize, oma/apps/date/background-image, or oma/apps.

Following the conventions of the Notification Manager, each key change notification has the “notification type” (or “notify type”) of a string of the form /alp/globalsettings/keychange/ + the key string. For example, a change of the key oma/apps/calendar/fontsize will invoke a broadcast notification call with the notification type /alp/globalsettings/keychange/oma/apps/calendar/fontsize.

The Notification Manager is currently responsible for monitoring the key changes in a directory tree; it implements this by checking whether a changed key has, as a prefix, a substring which matches the representative key of a key subspace being monitored for changes by some application. For example, the previous key change example will invoke change notifications for applications which want to know about key changes in the /oma/apps/calendar/ key directory. The Notification Manager is the component which performs this check and invokes the notification callback in client applications; the check itself is a simple prefix test, as sketched below.
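One reasonable implementation of that prefix check, matching whole path components so that a registration for oma/apps matches oma/apps itself and anything beneath it, is sketched below. This is an assumed helper for illustration, not the actual Notification Manager code.

    #include <string.h>

    /* Does a change to 'changed_key' fall inside the subtree that a
     * client registered for?  E.g. registration "oma/apps" matches
     * "oma/apps/calendar/fontsize" as well as "oma/apps" itself. */
    static int key_change_matches(const char *registered,
                                  const char *changed_key)
    {
            size_t len = strlen(registered);

            if (strncmp(changed_key, registered, len) != 0)
                    return 0;
            /* Match whole path components only: the registered prefix
             * must end exactly at a '/' boundary or at the key's end. */
            return changed_key[len] == '\0' || changed_key[len] == '/';
    }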

11 Conclusion

Mobile devices, and the applications running on them, represent a new frontier for open source developers. Worldwide cell phone sales are approaching one billion a year, with “smart phones” (that is, phones whose capabilities can be enhanced “post-platform,” subsequent to their purchase by the consumer) showing the largest rate of growth. Open source-based operating systems have been of increasing interest as a foundation for work in the space.

By 2010, it is estimated that the number of “smart phones” sold per year will exceed the number of desktop systems. As many as half of those phones will be running a Linux-based system, according to some analysts.

Linux and other open source software have taken on increased significance in this context. While a number of Linux-based phones have been shipped (primarily in China and other Pacific Rim geographies), few of them have been truly “open” systems; that is, the ability to develop software for and incorporate it onto such devices has been terribly limited.

Further, because the usage model for these devices is so different from the usual desktop-oriented paradigms, there are significant gaps in service-level functionality, particularly in areas such as abstraction of varying execution environments for the user, packaging of applications and related resources and metadata, information interchange between applications as well as between devices, and general approaches to security.

The security component, in particular, is key to the effort to foster a “third-party developer ecosystem” for mobile devices. Because of their very nature, as well as the regulatory structure within which they exist, certain operational characteristics of devices such as cell phones must be guaranteed. This is true even in the case of errant or malicious additional software being installed on the device.

A paper of this length can only provide a high-level overview of the components involved and their potential characteristics, usage, and interrelationships. A great deal more detailed information is available on the project web site.

Hiker is an attempt to address the most significant of the gaps discussed at the outset of this paper, as a way of reducing fragmentation and encouraging participation in the mobile applications development context. The Hiker Project is intended to be an open, community effort.
While ACCESS engineers are responsible for the initial implementation, the project is intended for the use of all developers interested in mobile applications development, and their participation in improving the framework is invited.

© 2007 The Hiker Project. Permission to redistribute in accordance with Linux Symposium submission guidelines is granted; all other rights are reserved. ACCESS® is a registered trademark of ACCESS Co., Ltd. Garnet™ is a trademark of ACCESS Systems Americas, Inc. UNIX® is a registered trademark of The Open Group. Palm OS® is a registered trademark of Palm Trademark Holding Company, LLC. The registered trademark Linux® is owned by Linus Torvalds, owner of the mark in the U.S. and other countries, and licensed exclusively to the Linux Mark Institute. Mac® and OS X® are registered trademarks of Apple, Inc. All other trademarks mentioned herein are the property of their respective owners.

Getting maximum mileage out of tickless

Suresh Siddha, Venkatesh Pallipadi, Arjan Van De Ven
Intel Open Source Technology Center
{suresh.b.siddha|venkatesh.pallipadi|arjan.van.de.ven}@intel.com

Abstract

Now that the tickless (dynticks) infrastructure is integrated into the base kernel, this paper talks about various add-on changes that make tickless kernels more effective.

Tickless kernels pose some hardware challenges that were primarily exposed by the requirement of a continuously running per-CPU timer. We will discuss how this issue was resolved by using HPET in a new mode. Eliminating idle periodic ticks causes the kernel process scheduler not to do idle balancing as frequently as it would otherwise. We provide insight into how this tricky issue of saving power with minimal impact on performance is resolved in the tickless kernel.

We will also look at kernel and user level daemons and drivers that poll for things with their own timers, their side effect on overall system idle time, and suggestions on how to make these daemons and drivers tickless-friendly.

1 Introduction

Traditionally, the SMP Linux kernel (just as most other operating systems) uses a global timer (present in the chipset of the platform) to generate a periodic event for timekeeping (managing the time of day), and periodic local CPU timer events for managing per-CPU timers. Timers are basically events that the kernel wants to happen at a specified time in the future (either for its own use or on behalf of an application). An example of such a timer is the blinking cursor: the GUI wants the cursor to blink every 1.2 seconds, so it wants to schedule an operation every 600 milliseconds into the future.

In the pre-tickless kernel, per-CPU “timers” were implemented using the periodic timer interrupt on each CPU. When the timer interrupt happens, the kernel checks whether any of the scheduled events on that CPU are due and, if so, performs the operation associated with that event.

Old Linux kernels used a 100 Hz (every 10 milliseconds) timer interrupt; newer Linux uses a 1000 Hz (1 ms) or 250 Hz (4 ms) timer interrupt. The frequency of the timer interrupt determines how finely you can schedule future events. Since the kernel knows this frequency, it also knows how to keep time: every 100/250/1000 timer interrupts, another second has passed.

This periodic timer event is often called “the timer tick.” The timer tick is nice and simple, but it has two severe downsides. First, the tick happens periodically irrespective of the processor state (idle vs. busy); if the processor is idle, it has to wake up from its power-saving sleep state every 1, 4, or 10 milliseconds, which costs quite a bit of energy (and hence battery life in laptops). Second, in a virtualization environment, if a system has 50 guests that each have a 1000 Hz timer tick, the total system will have 50,000 effective ticks. This is not workable and severely limits the number of possible guests.

The tickless (dynticks) kernel is designed to solve these downsides. This paper will briefly look into the current tickless kernel implementation and then look in detail at the various add-ons the authors contributed to make the tickless kernel more effective. Section 2 looks into the current tickless kernel infrastructure. Section 3 looks at the impact of the different timers in the kernel on the tickless kernel. Section 4 addresses the hardware challenges that were exposed by the requirement of a per-CPU timer. Section 5 talks about the impact on process load balancing in tickless idle kernels and the problem resolution. Section 6 looks at the impact of the different timers (and polling) in user level daemons and drivers on the power-savings aspect of the tickless kernel.


2 Tickless Kernel

A tickless kernel, as the name suggests, is a kernel without the regular timer tick. The tickless kernel depends on the architecture-independent generic clocksource and clockevent management framework, which went into the mainline in the recent past [7]. The generic timeofday and clocksource management framework moved a lot of timekeeping code into the architecture-independent portion of the code, with the architecture portion reduced to defining and managing the low-level hardware pieces of the clocksources. While clock sources provide read access to the monotonically increasing time value, clock event devices are used to schedule the next event interrupt(s). The setup and selection of the event devices is hardwired into the architecture-dependent code. The clockevents code provides a generic framework to manage clock event devices and their usage for the various clock-event-driven kernel functionalities.

The Linux kernel typically provided timers with tick (HZ) based resolution. The clockevents framework in newer kernels enabled a smooth implementation of high resolution timers across architectures. Depending on the clock source and clock event devices available in the system, the kernel switches to hrtimer [8] mode, and the per-CPU periodic timer tick functionality is then also provided by the per-CPU hrtimers (managed by the per-CPU clockevent device).

The hrtimer-based periodic tick enabled the functionality of the tickless idle kernel. When a CPU goes into the idle state, the timer framework evaluates the next scheduled timer event and, in case the next event is further away than the next periodic tick, it reprograms the per-CPU clockevent device to this future event. This allows the idle CPU to go into longer idle sleeps without unnecessary interruption by the periodic tick.

The current solution in 2.6.21 [4] eliminates the tick during idle, and as such it is not a full tickless or dynamic tick implementation. However, eliminating the tick during idle is a great first step forward. The future trend, however, is to completely remove the tick, irrespective of the busy or idle CPU state.

2.1 Dynticks effectiveness data

The effectiveness of dynticks can be measured using different system information like power consumption, idle time, etc. The following sections of this paper look at dynticks effectiveness in terms of three measurements done on a totally idle system:

• The number of interrupts per second.

• The number of timer events per second.

• The average amount of time the CPU stays in an idle state upon each idle state entry.

These measurements, collectively, give an easy approximation of actual system power and battery consumption.

The measurements reported were taken on a totally idle system, a few minutes after reboot. The number of interrupts in the system, per second, is computed by taking the average from the output of vmstat. The value reported is the number of interrupts per second on the whole system (all CPUs). The number of events reported is from /proc/timer_stats, and the value reported is events per second. Average CPU idle time per call is the average amount of time spent in a CPU idle state per entry into idle. The time reported is in µs, and this data is obtained using /proc/acpi/processor/CPU*/power. All three measurements are at the system level and include the activity of all CPUs.

The system used for the measurements is a mobile reference system with an Intel® Core™ 2 Duo CPU (2 CPU cores), running an i386 kernel with a HZ rate of 1000.

2.1.1 Baseline measurement

Table 1 shows the data with and without dynticks enabled.

As the data indicates, dynticks eliminates the CPU timer ticks when CPUs idle, and hence reduces the number of interrupts in the system drastically. This reduction in the number of interrupts also increases the amount of time a CPU spends in an idle state once it enters that state.

Is this the best that can be done, or is there scope for more changes within the kernel to reduce the number of interrupts and/or the number of events? The following sections address this specific question.

              # interrupts   # events   Avg CPU idle residency (µs)
With ticks    2002           59.59      651
Tickless      118            60.60      10161

Table 1: System activity during idle with and without periodic ticks

3 Keeping the Kernel Quiet

One of the noticeable issues with the use of timers, inside the core kernel, in drivers, and also in userspace, is that of staggered timers. Each user of a timer does not know about other timers that may be set up on the system at a particular time, and picks a particular time based on its own usage restrictions. Most of the time, this specific usage restriction does not mandate a strict hard time value; but each timer user inadvertently ends up setting the timer on its own, resulting in a bunch of staggered timers at the system level.

To prevent this in kernel space, a new API, __round_jiffies(), was introduced [6] in the kernel. This API rounds the jiffy timeout value to the nearest second. All users of timeouts who are not really interested in a precise timeout should use this API when setting their timeout. This causes all such timers to coalesce and expire at the same jiffy, preventing staggered interrupts.

Another specific issue inside the kernel that needed special attention is timer interrupts from drivers which are only important when the CPU is busy, and not really important enough to wake the CPU from idle to service the timer. Such a timer can tolerate the latency until some other important timer or interrupt comes along, at which time this timer can be serviced as well. The classic example of this usage model is the cpufreq ondemand governor.

The ondemand governor monitors each processor’s utilization at periodic intervals (many times per second) and tries to manage the processor frequency, keeping it close to the processor utilization. When a processor is totally idle, there is no pressing need for ondemand to periodically wake up the processor just to look at its utilization. To resolve this issue, a new API of deferrable timers was introduced [1] in the recent kernel.

A deferrable timer is a timer that works as a normal timer when the processor is active, but will be deferred to a later time when the processor is idle. The timers thus deferred will be handled when the processor eventually comes out of idle due to a non-deferrable timer or any other interrupt.

The deferrable timer is implemented using a special state bit in the timer structure, overloaded over one of the fields in the current structure, which maintains the nature of the timer (deferrable or not). This state bit is preserved as the timer moves around the various per-CPU queues, and __next_timer_interrupt() skips over all the deferrable timers and picks the timer event for the next non-deferrable timer in the timer queue.

This API works very well with ondemand, reducing the number of interrupts and the number of events on an idle system, as shown in Table 2. As shown, this feature nearly doubles the average CPU idle residency time on a totally idle system. This API may also have limited uses with other timers inside the kernel, like the machine check or cache reap timers, and has the potential to reduce the number of wakeups on an idle system further.

Note that this timer is only available for in-kernel usage at this point; usage for user applications is not yet conceptualized. It is not straightforward to extend this to userspace, as user timers are not directly associated with per-CPU timer queues, and there can also be various dependencies across multiple timers, which can result in timer reordering depending on what CPU they are scheduled on and whether that CPU is idle or not. A sketch showing how a driver might combine these two in-kernel APIs appears after Table 2.

                              # interrupts   # events   Avg CPU idle residency (µs)
Ondemand                      118            60.60      10161
Ondemand + deferrable timer   89             17.17      20312

Table 2: System activity during idle with and without deferrable timer usage in ondemand
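As a concrete illustration of the two APIs discussed in Section 3, the sketch below shows how a driver’s polling timer might use both. round_jiffies() is the API introduced in [6]; init_timer_deferrable() is shown as the deferrable-timer initializer the mainline kernel adopted, and the exact spelling in the patch set of [1] may have differed.

    #include <linux/timer.h>
    #include <linux/jiffies.h>

    static struct timer_list my_poll_timer;

    static void my_poll(unsigned long data)
    {
            /* ... sample utilization, poll hardware, etc. ... */

            /* Re-arm one second out, rounded so this timer coalesces
             * with every other timer that also used round_jiffies(). */
            mod_timer(&my_poll_timer, round_jiffies(jiffies + HZ));
    }

    static void my_poll_start(void)
    {
            /* Deferrable: fires normally while the CPU is busy, but may
             * wait for the next non-deferrable wakeup when the CPU idles. */
            init_timer_deferrable(&my_poll_timer);
            my_poll_timer.function = my_poll;
            my_poll_timer.data     = 0;
            mod_timer(&my_poll_timer, round_jiffies(jiffies + HZ));
    }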

4 Platform Timer Event Sources

Dynticks depends on having a per-CPU timer event source. On x86, the LAPIC timer can be used as a dependable timer event source. But when the platform supports low-power processor idle states (ACPI C2, C3 states), on most current platforms the LAPIC timer stops ticking while the CPU is in one of the low-power idle states. Dynticks uses a nice workaround to address this issue, with the concept of a broadcast timer. A broadcast timer is an always-working timer that is shared across a pool of processors and has the responsibility of sending a local APIC timer interrupt to wake up any processor in the pool. Such broadcast timers are platform timers like the PIT or HPET [3].

PIT/8254: a platform timer that can run either in one-shot or periodic mode. It has a frequency of 1193182 Hz and can have a maximum timeout of 27462 µs.

HPET: based on a newer standard, it has a set of memory-mapped timers. These timers are programmable to work in different modes, and the frequency of this timer depends on the particular hardware. On our system under test, HPET runs at 14318179 Hz and can have a maximum timeout of more than 3 seconds (with a 32-bit timer).

HPET is superior to the PIT in terms of maximum timeout value, and thus can reduce the number of interrupts when the system is idle. Most platforms today have built-in HPET timers in the chipset. Unfortunately, very few platforms today enable the HPET timer and/or advertise its existence to the OS using the ACPI tables. As a result, the Linux kernel ends up using the PIT for the broadcast timer. This is not an issue on platforms which do not support low-power idle states (e.g., today’s servers), as they have an always-working local APIC timer. But it does cause an issue on laptops, which typically support low-power idle states.

4.1 Force detection of HPET

To resolve this issue of the BIOS not enabling/advertising HPET, there is a kernel feature to forcefully detect and enable HPET on various chipsets, using chipset-specific PCI quirks. Once that is done, HPET can be used instead of the PIT to reduce the number of interrupts on an idle system further. On our test platform, the data we got after forcefully detecting and enabling the HPET timer is in Table 3.

Note that this patch to force-enable HPET was in proposal-review state at the time of writing this paper, and was not yet available in any standard kernel.

4.2 HPET as a per-CPU timer

The Linux kernel uses the “legacy replacement” mode of the HPET timer today to generate timer events. In this mode, HPET appears like the legacy PIT and RTC to the OS, generating interrupts on IRQ0 and IRQ8 for the HPET channel 0 and channel 1 timers, respectively. There is a further optimization possible, where HPET can be programmed in “standard” interrupt delivery mode, using different channels of HPET to send per-CPU interrupts to different processors. This will help laptops that have 2 logical CPUs and at least 2 HPET timer channels available. Different channels of HPET can be used to program a timer for each CPU, thereby avoiding the need for the broadcast timer altogether and eliminating the LAPIC timers as well. This feature, which was still under development at the time of writing this paper, brings an incremental benefit on systems with more than one logical CPU that support deep idle states (most laptop systems with dual-core processors). Table 4 shows the data with and without this feature on such a system.

In comparison to the base tickless data (shown in Table 1), the features we have talked about so far (as shown in Tables 2, 3, and 4) have demonstrated an increase in idle residency time of approximately 7 times.

        # interrupts   # events   Avg CPU idle residency (µs)
PIT     89             17.17      20312
HPET    32             15.15      56451

Table 3: System activity during idle with PIT vs. HPET

              # interrupts   # events   Avg CPU idle residency (µs)
global HPET   32             15.15      56451
percpu HPET   22             15.15      73514

Table 4: System activity during idle with global vs. per-CPU HPET channels

5 Idle Process Load Balancing

In the regular kernel, one of the usages of the periodic timer tick on each processor is to perform periodic process load balancing on that particular processor. If there is a process load imbalance, the load balancer will pull process load from the busiest CPU, ensuring that the load is equally distributed among the available processors in the system. The load balancing period, and the subset of processors which will participate in the load balancing, depend on different factors, like the busy state of the processor doing the load balance, the hierarchy of scheduler domains that this processor is part of, and the process load in the system. Compared to busy CPUs, idle CPUs perform load balancing more often (mainly because an idle CPU has nothing else to do and can immediately start executing the pulled load, which otherwise was waiting for CPU cycles on another busy CPU). In addition to the fairness, this helps improve system throughput. As such, idle load balancing plays a very important role.

Because of the absence of the periodic timer tick in the tickless kernel, an idle CPU will potentially sleep for a longer time. This extended sleep delays the periodic load balancing, and as such the idle load balancing in the system doesn’t happen as often as it does in earlier kernels. This presents a throughput and latency issue, especially for server workloads.

To measure the performance impact, some experiments were conducted on an 8-CPU-core system (a dual package system with quad-core processors) using the SPECjbb2000 benchmark in a 512 MB heap and 8 warehouses configuration. The performance regression with the tickless 2.6.21-rc6-rt0 kernel [5] is shown in Table 5.

Garbage collector   Regression
parallel            6.3%
gencon              7.7%

Table 5: SPECjbb2000 performance regression with the tickless kernel. Tickless idle load balancing enhancements recovered this performance regression.

Recovering this performance regression in the tickless kernel with no impact on power savings is tricky and challenging. There were some discussions in the Linux kernel community on this topic last year, where two main approaches were discussed.

The first approach is to increase the back-off interval of periodic idle load balancing. The regular Linux kernel already does some sort of back-off (increasing the load balance period up to a maximum amount) when the CPU doing the load balance at a particular sched domain finds that the load at that level is already balanced; and if there is an imbalance, the periodic interval gets reset to the minimum level. Different sched domains in the scheduler domain hierarchy use different minimum and maximum busy/idle intervals, and this back-off period increases as one goes up in the scheduler domain hierarchy. Current back-off intervals are selected in such a fashion that there are neither too many nor too few load balancing attempts, so that there is no overdoing the work when the system is well balanced, while still reacting in a reasonable amount of time when the load changes in the system.

To fix the performance regression, this approach suggests further increasing the back-off interval for all the levels in the scheduler domain hierarchy, but still retaining periodic load balancing on each CPU (by registering a new periodic timer event which will trigger the periodic load balancing). Defining the interval increase will be tricky: if it is too large, then the response time will be high and the kernel won’t be able to respond to sudden changes in the load; if it is too small, then it won’t save power, as the periodic load balancing will wake up the idle CPU often.

The second mechanism is some sort of a watchdog mechanism, where a busy CPU triggers the load balance on an idle CPU. This mechanism would make changes to the busy load balancing (which would be doing more load balancing work while the current busy task on that CPU is eagerly waiting for the CPU cycles). Busy load balancing is quite infrequent compared to idle load balancing attempts. Similar to the first mechanism, this mechanism also won’t be able to respond quickly to changes in load. Also, figuring out that a CPU is heavily loaded, and where that extra load needs to be moved, is a somewhat difficult job, especially so in the case of hierarchical scheduler domains.

This paper proposes a third route, which nominates an owner among the idle CPUs that does the idle load balancing (ILB) on behalf of the other idle CPUs in the system. This ILB owner keeps the periodic tick active in the idle state and uses this tick to do load balancing on behalf of all the idle CPUs that it tracks, while the other idle CPUs are in long tickless sleeps. If there is an imbalance, the ILB owner wakes up the appropriate CPU from its idle sleep. Once all the CPUs in the system are in the idle state, the periodic tick on the ILB owner also stops (as there is no other CPU generating load, and hence no reason to check for load imbalance). A new idle load balancing owner is selected again as soon as there is a busy CPU in the system.

This solution is in the 2.6.21 -mm kernels, and experiments showed that this patch completely recovered the performance regression seen earlier (Table 5) in the tickless kernels with the SPECjbb workload. If the ILB owner selection is done carefully (like picking an idle core in a busy package), one can minimize the power wasted as well. A schematic of the ILB idea appears below.
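A schematic of the ILB-owner idea is shown below. Every name in it is an assumption made for illustration; the real scheduler patch is considerably more involved.

    extern int  num_cpus;                 /* assumed helpers follow */
    extern int  cpu_is_idle(int cpu);
    extern int  imbalance_exists_for(int cpu);
    extern void wake_up_and_balance(int cpu);

    static int ilb_owner_cpu = -1;        /* -1: nobody nominated */

    void idle_tick(int this_cpu)
    {
            int cpu;

            /* Only the nominated owner keeps its periodic tick while
             * idle; all other idle CPUs stay in long tickless sleeps. */
            if (this_cpu != ilb_owner_cpu)
                    return;

            for (cpu = 0; cpu < num_cpus; cpu++)
                    if (cpu_is_idle(cpu) && imbalance_exists_for(cpu))
                            wake_up_and_balance(cpu);
    }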
6 Keeping Userspace Quiet

A tickless idle kernel alone is not sufficient to enable the idle processor to go into long and deep sleeps. In addition to the kernel space, applications in user space also need to quiet down during idle periods, which will ensure that the whole system goes into long sleeps, ultimately saving power (and thus enhancing battery life in the case of laptops).

Dave Jones’ OLS 2006 talk [9], entitled “Why Userspace Sucks,” revealed to the Linux community that userspace really doesn’t quiet down as one would hope for on an otherwise idle system. A number of applications and daemons wake up at frequent intervals (even on a completely idle system) to perform a periodic activity like polling a device, blinking the cursor, querying for a status change to modify a graphical icon accordingly, and so on (for more information about the mischievous application behaviors, look into references [9, 10, 2]).

A number of fixes went into applications and libraries over the course of the last year to fix these problems [2]. Instead of polling periodically to check for status changes, applications and daemons should use some sort of event notification wherever possible and perform actions based on the triggered event. For example, the hal daemon used to poll very frequently to check for media changes, thus making the idle processor wake up often. Newer SATA hardware supports a feature called Asynchronous Notification, which notifies the host of a media change event. With the recent changes in the community, the hal daemon will avoid the polling on platforms which support this asynchronous notification.

Even in cases where the application has to rely on periodic timers, the application should use intelligent mechanisms to avoid or minimize the periodic timers where possible. Instead of having timers scattered across a period of time, it is best to group them and expire the bunch of timers at the same instant. This grouping minimizes the number of instances at which a processor is woken up, while still servicing the same number of timers. This enables the processor to sleep longer and go into the lowest power state possible.

For example, GLib-based applications use the g_timeout_add() API for their timers, which expire at scattered instances. A second API, g_timeout_add_seconds(), has now been added, which causes all recurring timers to happen at the start of the second, enabling system-wide grouping of timers. The start of the second is offset by a value which is system-wide but system-specific, to prevent all Linux machines on the Internet doing network traffic at the same time. A small example of the difference follows.
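For instance, a status applet that refreshes once a minute needs only its registration call changed; with g_timeout_add_seconds() (added in GLib 2.14) the wakeup coalesces onto the shared second boundary:

    #include <glib.h>

    static gboolean refresh_status_icon(gpointer data)
    {
            /* ... update the icon ... */
            return TRUE;   /* keep the timer recurring */
    }

    int main(void)
    {
            GMainLoop *loop = g_main_loop_new(NULL, FALSE);

            /* Old style: fires 60,000 ms from "now", wherever that lands:
             *     g_timeout_add(60 * 1000, refresh_status_icon, NULL);
             * Tickless-friendly: coalesced onto the second boundary. */
            g_timeout_add_seconds(60, refresh_status_icon, NULL);

            g_main_loop_run(loop);
            return 0;
    }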
New tools are being developed [10] for identifying the applications (and even the kernel-level components) which behave badly by waking up more often than required. These tools use the kernel interfaces (like /proc/timer_stats, /proc/acpi/processor/CPU*/power, and /proc/interrupts) and report the biggest offenders, as well as the average C-state residency information of the processor. Developers should use these tools to identify whether their application is on the hit list and, if so, fix it based on the above-mentioned guidelines.

7 Conclusions

The tickless kernel infrastructure is the first and most important step forward in making the idle system go into long and deep sleeps. Now that this is integrated into the Linux kernel, the enhancements mentioned in this paper will increase the mileage out of the tickless kernel by minimizing unnecessary wakeups in an otherwise idle system. Going forward, the responsibility for saving power lies with both system and user level software. As an evolutionary step, in the coming days we can expect the Linux kernel to be fully (both during idle and busy) dynamic tick capable.

8 Acknowledgments

Thanks to all the Linux community members who contributed, reviewed, and provided feedback on the various tickless add-on features mentioned in this paper.

References

[1] Deferrable timers. http://lwn.net/Articles/228143/.

[2] Dependency tree for bug 204948: Userspace sucks (wakeups). https://bugzilla.redhat.com/bugzilla/showdependencytree.cgi?id=204948.

[3] High Precision Event Timers specification. http://www.intel.com/technology/architecture/hpetspec.htm.

[4] Linux 2.6.21. http://www.kernel.org.

[5] Realtime preempt patches. http://people.redhat.com/mingo/realtime-preempt/.

[6] Round jiffies infrastructure. http://lkml.org/lkml/2006/10/10/189.

[7] Thomas Gleixner and Ingo Molnar. Dynamic ticks. http://lwn.net/Articles/202319/.

[8] Thomas Gleixner and Douglas Niehaus. High resolution timers, OLS 2006. https://ols2006.108.redhat.com/reprints/gleixner-reprint.pdf.

[9] Dave Jones. Why userspace sucks, OLS 2006. https://ols2006.108.redhat.com/reprints/jones-reprint.pdf.

[10] Arjan Van De Ven. Programming for low power usage. http://conferences.oreillynet.com/cs/os2007/view/e_sess/12958.

This paper is copyright © 2007 by Intel Corporation. Redistribution rights are granted per submission guidelines; all other rights are reserved. *Other names and brands may be claimed as the property of others.

Containers: Challenges with the memory resource controller and its performance

Balbir Singh, IBM, [email protected]
Vaidyanathan Srinivasan, IBM, [email protected]

Abstract

Containers in Linux are under active development and have different uses, like security, isolation, and resource guarantees. In order to provide a resource guarantee for containers, resource controllers are used as basic building blocks to monitor and control utilization of system resources like CPU time, resident memory, and I/O bandwidth, among others. While CPU time and I/O bandwidth are renewable resources, memory is a non-renewable resource in the system. Infrastructure to monitor and control the resident memory used by a container adds a new dimension to the existing page allocation and reclaim logic.

In order to assess the impact of any change in the memory management implementation, we propose adding parameters to modify VM (Linux virtual memory manager) behavior, instrumenting code paths, and collecting data against common workloads like file server, web server, database server, and developer desktop. Data of interest would be reclaim rate, page scan density, LRU (least recently used page list) quantum, page container affinity, and page generation.

This paper discusses, in detail, the design and performance issues of the RSS controller and the pagecache controller within the container framework. Some modifications to the current page reclaim logic that could help containers are also evaluated.

1 Background

Server consolidation, virtualization, and containers are buzzwords in the industry today. As enterprises consolidate applications and platforms onto a single server, either through virtualization or containers, there is a need to differentiate between them. Consider a server that hosts two platforms: an employee e-mail server and a customer call center application. The enterprise would definitely want the customer call center application to have priority over the employee e-mail server. It would be unacceptable if the employee e-mail server occupied and consumed most of the resources available in the consolidated server, thereby affecting the performance of critical applications (in this case, the call center application).

Resource management can provide service guarantees by limiting the resource consumption of the employee e-mail server. Resource controllers are the part of the container framework that monitors and controls a certain resource. Controllers generally monitor and limit one resource, like memory, CPU time, or I/O bandwidth. In order to provide isolation between containers from a resource perspective, we primarily need to control memory and CPU time. In this paper we discuss the challenges with the design and implementation of the memory controller.

2 Memory controller

A memory controller [13] allows us to limit the memory consumption of a group of applications. Several proposals for memory control have been posted to LKML (the Linux kernel mailing list); they are resource groups [11], memory container [12], beancounters [6], and the most recent, the RSS controller [2] [3]. The design and features supported by each of the proposals are discussed below.

2.1 Resource groups

The resource groups memory controller was developed by Chandra Seetharaman, Jiantao Kong, and Valerie Clement [11].


It was built on top of the resource groups resource-management infrastructure and supported both limits and guarantees. Guarantees and limits were set using the min_shares and max_shares parameters, respectively. Resource groups control only user-space pages. Various configuration parameters allowed the system administrator to control:

• The percentage of the memory usage limit at which the controller should start reclaiming pages to make room for new pages;

• The percentage to which the reclaimer should shrink pages, when it starts reclaiming;

• The number of seconds in a shrink interval;

• The number of shrink attempts in a shrink interval.

A page LRU list, broken down from the zone LRU, is maintained for every resource group, which helps minimize the number of pages to be scanned during page reclaim. Task migration is an expensive operation, as it requires the class field in each page to be updated when a task is migrated.

2.2 Memory container

The salient features of memory containers as posted by Rohit Seth [12] are as follows:

• It accounts and limits pagecache and RSS usage of the tasks in the container.

• It scans the mappings and deactivates the pages when either the pagecache or RSS limit is reached.

• When container reclaim is in progress, no new pages are added to it.

The drawbacks of this approach are:

• Task migration is an expensive operation; the container pointer of each page requires updating.

• The container that first accesses a file is charged for all pagecache usage of that file.

• There is no support for guarantees.

2.3 Beancounters

The memory controller for Beancounters was developed by Pavel and Kirill [7]. The salient features of this implementation are:

• Initial versions supported only resource limits, whereas later versions support reclaim of RSS pages as well.

• The system call interface was the only means for setting limits and obtaining resource usage, whereas newer versions have added file-system-based configuration and control.

• Kernel resources such as page tables and slab usage are accounted for and limited.

The drawbacks of this approach are:

• There is no direct support for guarantees.

• Task migration is supported; however, when a task migrates, it does not carry forward the charges of the resources used so far.

• Pagecache control is not present.

2.4 RSS controller

The RSS controller was developed by Balbir Singh [14]. The salient features of this implementation are:

• No change in the size of the page structure.

• The RSS accounting definition is the same as that presently used in the Linux™ kernel.

• The per-zone LRU list is not altered.

• Shared pages are reclaimed by unmapping the page from the container when the container is over its limit.

The drawback of this approach is the reclaim algorithm. The reclaimer needs to walk through the per-zone LRU of each zone to first find and then reclaim pages belonging to a given container.
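To illustrate the cost of this drawback, here is a hedged sketch (illustrative only; the page_container() helper, the struct container type, and the exact locking are our assumptions, not the posted patch) of what scanning a zone LRU for one container's pages looks like:

    #include <linux/mm.h>
    #include <linux/mmzone.h>

    struct container;                                   /* opaque here */
    extern struct container *page_container(struct page *page); /* hypothetical */

    /* Sketch: without a per-container page list, reclaim must scan the
     * shared per-zone LRU and filter out pages charged to this container. */
    static unsigned long
    scan_zone_for_container(struct zone *zone, struct container *cont,
                            unsigned long nr_to_free)
    {
            unsigned long freed = 0;
            struct page *page, *next;

            spin_lock_irq(&zone->lru_lock);
            list_for_each_entry_safe(page, next, &zone->inactive_list, lru) {
                    if (freed >= nr_to_free)
                            break;
                    if (page_container(page) != cont)
                            continue;       /* page belongs to someone else */
                    /* isolate the page and attempt to reclaim it ... */
                    freed++;
            }
            spin_unlock_irq(&zone->lru_lock);
            return freed;
    }

As the walk is over the entire zone LRU, its cost grows with total system memory rather than with the container's size, which is exactly the inefficiency the per-container list described next avoids.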

[Figure 1 diagram: a zone LRU list of page descriptors, with the corresponding meta pages linked together on a per-container list.]

Figure 1: RSS controller with per-container list

Pavel enhanced the RSS controller [2] and added several good features to it. The most significant was a per-container list of pages.

Figure 1 shows the organization of the RSS controller. Each page has a meta page associated with it. All the meta pages of the pages belonging to a particular container are linked together to form the per-container list. When the container is over its limit, the RSS controller scans through the per-container list and tries to free pages from that list.

The per-container list is not the most efficient implementation for memory control, because the list is not in LRU order. Balbir [3] enhanced the code to add per-container LRU lists (active and inactive) to the RSS controller.

3 Pagecache control

The Linux VM will read pages from disk files into a main memory region called pagecache.4 This acts as a buffer between the user application's pages and the actual data on disk. This approach has the following advantages:

• Disk I/O is very expensive compared to memory access; hence the use of free memory to cache disk data improves performance.

• Even though the application may read only a few bytes from a file, the kernel will have to read multiple disk blocks. This extra data needs to be stored somewhere so that future reads on the same file can be served immediately without going to disk again.

• The application may update a few bytes in a file repeatedly; it is prudent for the kernel to cache it in memory and not flush it out to disk every time.

• The application may reopen the same file often; there is a need to cache the file data in memory for future use even after the file descriptor is closed. Moreover, the file may be opened by another application for further processing.

The reader might begin to ponder why we need to control the pagecache. The problem mainly arises from backup applications and other streaming-data applications that bring large amounts of disk data into memory that is most likely not to be reused. As long as free memory is available, it is best to use it for pagecache pages, since that would potentially improve performance. However, if there is no free memory, then cold pages belonging to other applications would be swapped out to make room for new pagecache data. This behavior would work fine in most circumstances, but not all. Take the case of a database server that does not access records through pagecache, as it uses direct I/O. Applications like the database server manage their own memory usage and prefer to use their own disk caching algorithms. The OS is unlikely to predict the disk cache behavior of the application as well as the application can. The pagecache might hurt the performance of such applications. Further, if there is a backup program that moves large files on the same machine, it would end up swapping out pages belonging to the database to make room for pagecache. This helps the backup program get its job done faster, but after the backup is done, the system memory is filled with pagecache pages while database application pages are swapped out to disk. When the database application needs them back, it will have to wait for the pages to be swapped in. Hence the database application pays the price for the backup application's inappropriate use of pagecache.5

The problem becomes more visible with server consolidation and virtualization. Now there is a need to limit the usage of pagecache for a certain group of applications in order to protect the working set of other critical applications. Limiting the pagecache usage for less important tasks would impact their performance, which is acceptable, because the system performance is judged only based on the performance of the critical application, like a database or web server.

4 Also referred to as disk cache.
5 The pagecache feature is less important in the example given, since the throughput of the database is more important than the speed of the backup.

The RSS controller does control pagecache to some extent. When an application maps files, the pages are accounted in its resident set and would be reclaimed by the RSS controller if they go over the limit. However, it is possible for an application to load data into pagecache memory with read/write system calls and not map all the pages in memory. Hence there is a need to control unmapped pagecache memory as well. The pagecache controller is expected to count unmapped pagecache pages and reclaim them if found over the limit. If the pages are mapped by the application, then they are counted as RSS pages and the RSS controller will do the needful. If unmapped pagecache pages are not tracked and controlled, then the pages unmapped by the RSS controller will be marked for swap-out. The actual swap-out will not happen unless there is memory pressure. The pages reclaimed by the RSS controller will actually go into swapcache, which is part of pagecache. The pagecache controller will count these swapcache pages as well and create memory pressure to force the reclaimer to actually swap out the pages and free system memory. In order to maintain memory limits for containers, pagecache memory should also be controlled apart from the RSS limit. The pagecache controller and RSS memory controller are parts of the memory controller for containers. The initial prototype patches are independent; however, both these controllers share code paths and hopefully they will eventually be integrated as part of the container memory controller.

The Linux VM has various knobs like /proc/sys/vm/{swappiness, dirty_ratio, dirty_background_ratio} to control pagecache usage and behavior. However, they control system-wide behavior and may affect overall system performance. swappiness is a percentage ratio that controls the choice of pages to reclaim. If the percentage is greater, anonymous pages are chosen and swapped out instead of pagecache pages during page reclaim. Reducing the swappiness ratio would reduce pagecache usage.

The other two knobs, dirty_ratio and dirty_background_ratio, control write-out of pagecache pages. Dirty pagecache pages need to be written out to disk before the page can be reused. However, a clean pagecache page is as good as free memory, because it can be freed with almost zero overhead and then reused for other purposes. The kernel periodically scans for dirty pagecache pages and writes them out based on the desired dirty page ratio.

The container framework and memory controller provide infrastructure to account for and monitor system memory used by a group of applications. Extending the available infrastructure to account for and control pagecache pages would provide isolation, control, and performance enhancements for certain groups of applications. By default, the Linux kernel tries to use the memory resources in the best suitable manner to provide good overall system performance. However, if applications running in the system are assigned different priorities, then the kernel's decisions need to be made taking into account the container's limits, which indirectly imply priority.

Limiting the amount of pagecache used by a certain group of applications is the main objective of the pagecache controller under the container framework. A couple of methods to control pagecache have been discussed on LKML in the past. Some of these techniques are discussed below.

3.1 Container pagecache controller

The pagecache accounting and control subsystem under the container framework [15] works using the same principle as the memory controller. Pages brought into the pagecache are accounted against the application that brought them in. Shared pagecache pages are counted against the application that first brought them into memory. Once the pagecache limit is reached, the reclaimer is invoked; it picks unmapped pages in the inactive list and frees them.

The code reclaim path for the RSS controller and pagecache controller is common except for a few additional condition checks and different scan control fields. All reclaim issues discussed in the RSS controller section apply to the pagecache controller as well.

3.2 Allocation-based pagecache control

Roy Huang [4] posted a pagecache control technique where the existing kswapd() would be able to reclaim pages when the pagecache goes over the limit. A new routine, balance_pagecache(), is called from various file I/O paths and wakes up kswapd() in order to reclaim pagecache pages if they are over the limit. kswapd() checks to see if the pagecache is over the limit, and then it uses shrink_all_memory() to reclaim pagecache pages. The pagecache limit is set through a /proc interface.
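A minimal sketch of the check just described might look as follows (this is our reading of the idea, not the posted patch; pagecache_limit and the kswapd wakeup hook are assumed names):

    #include <linux/mm.h>
    #include <linux/vmstat.h>

    extern unsigned long pagecache_limit;           /* set via /proc (assumed) */
    extern void wakeup_kswapd_for_pagecache(void);  /* hypothetical hook */

    /* Called from file I/O paths: wake the reclaimer when the global
     * pagecache (approximated here by NR_FILE_PAGES) exceeds the limit. */
    static void balance_pagecache(void)
    {
            if (global_page_state(NR_FILE_PAGES) > pagecache_limit)
                    wakeup_kswapd_for_pagecache();
    }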

The generic reclaimer routine is used here, which prefers pagecache pages over mapped pages. However, if the pagecache limit is set to a very small percentage, then the reclaimer will be called too often and it will end up unmapping other mapped pages as well. Another drawback of this technique is that it does not distinguish mapped pagecache pages that might be in use by the application. If a mapped page is freed, then the application will most probably page-fault for it soon.

Aubrey Li [9] took a different approach by adding a new allocation flag, __GFP_PAGECACHE, to distinguish pagecache allocations. This new flag is passed during allocation of pagecache pages. The pagecache limit is set through /proc as in the previous case. If the utilization is over the limit, then code is added to flag a low zone watermark in the zone_watermark_ok() routine. The kernel will take the default action to reclaim memory until sufficient free memory is available and zone_watermark_ok() returns true. This reclaim technique has the same drawbacks cited in Roy's implementation.

Christoph Lameter [8] refined Aubrey's approach [9] and enhanced the shrink_zone() routine to use different scan control fields so that only pagecache pages are freed. He introduced a per-zone pagecache limit and turned off may_swap in scan control so that mapped pages would not be touched. However, there is a problem with not unmapping mapped pages, because pagecache stats count both mapped and unmapped pagecache pages. If the mapped part is above the limit (for example, if mmap()ing an application file causes pagecache to go over the limit), then the reclaimer will be triggered repeatedly, but it does not unmap pages and reduce the pagecache utilization. We should account only unmapped pagecache pages towards the limit in order to work around this issue. Mapped pagecache pages will be accounted by the RSS memory controller. The possibility of user space-based control of pagecache was also discussed.

3.3 Usermode pagecache control

Andrew Morton [10] posted a user-mode pagecache control technique using fadvise() calls to hint the application's pagecache usage to the kernel. The POSIX fadvise() system call can be used to indicate to the kernel how the application intends to use the contents of the open file. There are a couple of options like NORMAL, RANDOM, SEQUENTIAL, WILLNEED, NOREUSE, or DONTNEED that the application can use to alter caching and read-ahead behavior for the file.

Andrew has basically overridden read/write system calls in libc through LD_PRELOAD and inserted fadvise() and sync_file_range() calls to zero out the pagecache utilization of the application. The application under control is run using a shell script to override its file access calls, and the new user space code will insert hidden fadvise calls to flush or discard pagecache pages. This effectively makes the application not use any pagecache and thus does not alter other memory pages used in the system.

This is a very interesting approach to show that pagecache control can be done from user space. However, some of the disadvantages are:

• The application under control suffers heavy performance degradation due to almost zero pagecache usage, along with added system call overheads. The intent was to limit pagecache usage, not to avoid using it.

• A group of applications working on the same file data will have to bring in the data again from disk, which would slow it down further.

More work needs to be done to make fadvise() more flexible to optimally limit pagecache usage while still preserving reasonable performance. The containers approach is well suited to target a group of applications and control their pagecache utilization rather than per-process control measures.
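For reference, here is a small user-space example of the fadvise() technique (ours, not Andrew's LD_PRELOAD wrapper): it streams through a file and then tells the kernel the cached pages will not be reused, so they become immediate reclaim candidates.

    #define _XOPEN_SOURCE 600
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(int argc, char *argv[])
    {
            char buf[4096];
            int fd;

            if (argc < 2) {
                    fprintf(stderr, "usage: %s <file>\n", argv[0]);
                    return 1;
            }
            fd = open(argv[1], O_RDONLY);
            if (fd < 0) {
                    perror("open");
                    return 1;
            }
            while (read(fd, buf, sizeof(buf)) > 0)
                    ;       /* stream through the file */

            /* Drop this file's pagecache footprint. */
            posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
            close(fd);
            return 0;
    }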

4 Challenges

Having looked at several memory controller implementations, we now look at the challenges that memory control poses. We classify these challenges into the following categories:

1. Design challenges

2. Implementation challenges

3. Usability challenges

We will look at each challenge and our proposed solution for solving the problem.

4.1 Design challenges

The first major design challenge was to avoid extending the struct page data structure. The problem with extending struct page is that the impact can be large. Consider an 8 GB machine which uses a 4 KB page size. Such a system has 2,097,152 pages. Extending the page descriptor by even 4 bytes creates an overhead of 8 MB.

Controlling the addition using a preprocessor macro definition is not sufficient. Linux distributions typically ship with one enterprise kernel, and the decision regarding enablement of a feature will have to be made at compile time. If the container feature is enabled and end users do not use it, they incur an overhead of memory wastage.

Our Solution. At first, we implemented the RSS controller without any changes to struct page. But without a pointer from the page to the meta page, it became impossible to quickly identify all pages belonging to a container. Thus, when a container goes over its limit and tries to reclaim pages, we are required to walk through the per-zone LRU list each time. This is a time-consuming operation, and the overhead, in terms of CPU time, far outweighs the disadvantage of extending struct page.

The second major challenge was to account shared pages correctly. A shared page can be charged:

• To the first container that brings in the page. This approach can lead to unfairness, because one container could end up bearing the charge for all shared pages. If the container being charged for the shared page is not using the page actively, the scenario might be treated as an unfair implementation of memory control.

• To all containers using the shared page. This scenario would lead to duplicate accounting, where the sum of all container usage would not match the total number of pages in memory.

Our Solution. The first RSS container implementation accounted for every shared page to each container. Each container mm_struct was charged for every page it touched. In the newer implementations, with their per-container LRU list, each page can belong to only one container at a time. The unfairness issue is dealt with using the following approach: a page in the per-container LRU list is aged down to the inactive list if it is not actively used by the container that brought it in. If the page is in active use by other containers, over a period of time this page is freed from the current container, and the other container that is actively using this page will map it in. The disadvantage of this approach is that a page needs to be completely unmapped from all mappings before it can move from one container to another.

The third major challenge was to decide whether we should account per-thread memory usage or per-process memory usage. All the threads belonging to a process share the same address space. It is quite possible that two threads belonging to the same process might belong to two different containers. This might be because they belong to different container groups for some other resource, like CPU. They might have different CPU usage limits. This leads to more accounting problems:

• By default all pages in a thread group are shared. How do we account for pages in a thread group?

• We now need to account every page to the thread that brought it in, thus requiring more hooks into task_struct.

Our Solution. We decided to charge the thread group leader for all memory usage by the thread group. We group tasks virtually for memory control by thread group. Threads can belong to different containers, but their usage is charged to the container that contains the thread group leader.

4.2 Implementation challenges

The first major implementation challenge was low-cost task migration. As discussed earlier, one of the disadvantages of the memory controller implementations was the time required to migrate a task from one container to another. It typically involved finding all pages in use by the task and changing their container pointer to the new container. This can be a very expensive operation, as it involves walking through the page tables of the task being migrated.

Our Solution. In the first implementation of the RSS controller, struct page was not modified, hence there were no references from the page descriptor to the container. Task migration was handled by adding a memory usage counter for each mm_struct. When a process is moved from one container to another, the accumulated memory usage was subtracted from the source container and added to the destination container. If the newly migrated task put the destination container over its memory usage limit, page reclaim is initiated on migration. With the new RSS controller implementation that has a per-container LRU list, a member of struct page points to the meta page structure. The meta page structure, in turn, points to the container. On task migration, we do not carry forward any accounting/charges; we simply migrate the task and ensure that all new memory used by the task is charged to the new container. When the pages that were charged to the old container are freed, we uncharge the old container.

The second major implementation challenge is the implementation of the reclaim algorithm. The reclaim algorithm in Linux has been enhanced, debugged, and maintained over the last few years. It works well with a variety of workloads. Changing the reclaim algorithm for containers is not a feasible solution. Any major changes might end up impacting performance negatively or introduce new regressions or corner cases.

Our Solution. We kept the reclaim algorithm for the RSS controller very simple. Most of the existing code for the reclaim algorithm has been reused. Other functions that mimic the global reclaim methodology for containers have been added. The core logic is implemented in the following routines:

container_try_to_free_pages,
container_shrink_active_list,
container_shrink_inactive_list, and
container_isolate_lru_pages.

These are similar to their per-zone reclaim counterparts:

try_to_free_pages,
shrink_active_list,
shrink_inactive_list, and
isolate_lru_pages, respectively.

We've defined parameters for measuring reclaim performance. These are described in Section 5.

4.3 Usability challenges

Some of the challenges faced by the end-users of containers and resource controllers are described below.

Container configuration:

Containers bring in more knobs for the end-user, and the overall system performance and the ability of the system to meet its expected behavior are entirely dependent on the correct configuration of the containers. Misconfigured containers in the system would degrade the system performance to an unacceptable level. The primary challenge with the memory controller is the choice of memory size or limit for each container. The amount of memory that is allocated for each container should closely match the workload and its resident memory requirements. This involves more understanding of the workloads or user applications.

There are enough statistics, like delay accounting and container fail counts, to measure the extent to which a container is unable to meet the workload's memory requirement. Outside of containers, the kernel would try to do the best possible job, given the fixed amount of system RAM. If performance is unacceptable, the user would have to cut down the applications (workload) or buy more memory. However, with containers, we are dicing the system into smaller pieces and it becomes the system administrator's job to match the right-sized piece to the right-sized job. Any mismatch will produce less than the desired result.

There is a need for good user-space and system-management tools to automatically analyze the system behavior and suggest the right container configuration.

Impact on other resource dimensions:

There is an interesting side effect with container resource management. Resources like CPU time and memory can be considered independent while configuring the containers. However, practical case studies indicate that there is a relationship between different resources, even though they are accounted for and controlled independently. For example, reducing the working set of an application using a memory controller would indirectly reduce its CPU utilization, because the application is now made to wait for page I/O to happen. Restricting the working set or pagecache of a workload increases its I/O traffic and makes it progressively I/O bound, even though the application was originally CPU bound when running unrestricted.

Similarly, reducing the CPU resource to a workload may reduce its I/O requirement, because the application is not able to generate new data at the same rate. These kinds of interactions suggest that configuring containers may be more complex than we may have considered.

5 Reclaim parameters

The reclaim algorithm is a very critical implementation challenge. To visualize and gain insight into the reclaim algorithm of the container, a set of parameters have been defined. These parameters are discussed in the following sections.

5.1 Page reclaim rate

Page reclaim rate measures the rate at which pages are being reclaimed from the container. The number of pages reclaimed and the duration of the reclaim cycle are taken into account:

    Page reclaim rate = nr_reclaimed / (t_end - t_start)

where t_start and t_end are the time stamps at the beginning and end of one reclaim cycle (container_shrink_pages) and nr_reclaimed is the number of pages freed during this time. From a memory controller point of view, freeing a page is as good as unmapping it from the process address space. The page can still be in memory and may additionally be dirty, pending a write-out or swap operation.

A very low reclaim rate value indicates we are taking more time to free pages:

• All pages are in the active list and it takes more reclaim cycles to move them to the inactive list and then ultimately reclaim them.

• We have been searching the wrong set of pages and it took time to find the right page.

• Most candidate pages are dirty and we are blocked on write-out or swap I/O operations.

5.2 Page container affinity

The page container affinity measures the affinity of a physical page to a particular container. When the system is running multiple containers, each of the containers is expected to free pages and consume them again. If containers grab each other's pages, that means that too much concurrent reclaim and allocation is happening, whereby a page just freed by container A is immediately allocated by container B. This could also happen if container B was under the limit and A was over the limit and we are purposely taking memory pages away from A and giving them to B.

5.3 Page generation

Page generation is the number of times a page was freed by a container. A very high value for page generation indicates that:

• The container size is very low; this implies that we are actively freeing our working set, which keeps coming back in.

• The reclaim algorithm is freeing the wrong set of pages from the container.

5.4 LRU quantum

The reclaimer mainly works on the active list and inactive list of pages belonging to the container. New allocations or recently referenced allocations go to the head of the active list. The container_shrink_active_list routine picks appropriate pages from the active list and moves them to the inactive list, while container_shrink_inactive_list calls shrink_page_list to free aged pages at the tail of the inactive list.

The newest pages are supposed to be at the head of the active list, while the oldest page would be at the tail of the inactive list. LRU quantum is the time difference between these two pages. This is an important parameter because it gives an indication of how fast the active and inactive lists are churned.

A greater value of LRU quantum indicates a stable container, where the working set fits the available memory. The reclaimer is run less often and pages take a while before they fall off the end of the inactive list.

A smaller value of LRU quantum indicates churning of the list. Combined with page generation, this means there is too little memory for the container. If page generation is low while LRU quantum is high, then it could indicate a problem in the LRU aging algorithm used.

5.5 Page scan density

Page scan density is the number of times a page was scanned before it was actually freed. A lower value indicates that the reclaimer has been choosing the right pages to free. Higher values of page scan density over a wider range of pages mean the reclaimer is going through pages and is unable to free them, or perhaps the reclaimer is looking at the wrong end of the LRU list.
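As a hedged sketch of how these parameters can be collected (the container_try_to_free_pages name follows the routines listed in Section 4.2, but its signature and the tracing variable are our assumptions):

    #include <linux/jiffies.h>

    struct container;                               /* opaque here */
    extern unsigned long container_try_to_free_pages(struct container *cont);

    static unsigned long reclaim_rate;              /* pages per jiffy, traced */

    /* Time one reclaim cycle and derive the page reclaim rate of
     * Section 5.1 (pages freed per unit time). */
    static unsigned long container_reclaim_instrumented(struct container *cont)
    {
            unsigned long t_start = jiffies;
            unsigned long nr_reclaimed = container_try_to_free_pages(cont);
            unsigned long elapsed = jiffies - t_start;

            if (elapsed)
                    reclaim_rate = nr_reclaimed / elapsed;
            return nr_reclaimed;
    }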

6 Case studies

A few typical workload examples have been studied in order to understand the various parameters and their variations, depending upon workload and container configuration. The following sections describe the parameters traced during execution of simple workloads and test programs.

6.1 Sequential memory access workload

Pagetest is a simple test program that allocates memory and touches each page sequentially for n number of times. In the following experiment, the pagetest program was run with an RSS limit of 400 MB, while the program would sequentially touch 600 MB of memory five times.
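A pagetest-style workload is easy to reproduce; the following sketch (ours, not the authors' tool) allocates 600 MB and touches one byte per 4 KB page, five times:

    #include <stdlib.h>

    int main(void)
    {
            const size_t size = 600UL << 20;        /* 600 MB */
            const size_t page = 4096;
            size_t off;
            int pass;
            char *mem = malloc(size);

            if (!mem)
                    return 1;
            for (pass = 0; pass < 5; pass++)        /* five sequential passes */
                    for (off = 0; off < size; off += page)
                            mem[off] = (char)pass;  /* touch one byte per page */
            free(mem);
            return 0;
    }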

Observations:

• The memory reclaim pattern shows that once the RSS usage limit is hit, all the pages are reclaimed and the RSS usage drops to zero.

• The usage immediately shoots back to 400 MB because the plot is sampled approximately by time, and we did not have samples during the interval when the application was under the limit and slowly filled its RSS up to the limit.

• Active list and inactive list sizes are mirror images of each other, since the sum of the active and inactive sizes is constant. The variations in list size correspond to the page reclaim process.

• Free memory size dropped initially and then remained constant, while cached memory size initially increased and then remained constant. Free memory size is not affected by the reclaim process, since pages reclaimed by the RSS controller were pushed to swapcache and stay there until touched again or there is enough memory pressure to swap out to disk. Since we had enough free memory in this experiment, the swapcache grew and no swap to disk happened.

• LRU quantum was less than one second in this case. The time difference between pages at the head of the active list and the tail of the inactive list was high just before the reclaim started, and then quickly dropped as pages were reclaimed.

• Page scan density shows that we scanned pages three to four times before reclaiming them. This shows that the reclaim algorithm has maintained the active and inactive lists optimally and has been choosing the right pages. We would not see a uniform distribution if the list aging algorithm was incorrect.

• Page generation shows that most of the physical RAM was reused 5 times during the test, which corresponds to the loop iteration count of 5.

6.2 kernbench test

Kernbench compiles a Linux kernel using multiple threads, consuming both anonymous pages and pagecache pages. In the following experiment, the kernbench test was run with 100 threads with the RSS controller and a memory limit of 300 MB. The pagecache controller and pagecache limit were not enabled during this experiment.

Observations:

• kernbench runs for a long time, so the reclaim pattern is compressed. There are many cycles of reclaim; hence it is difficult to deduce the slope pattern in the memory size and LRU quantum plots.

• The page reclaim rate plot has a very wide distribution of values, and it has been difficult to make sense of the plot. The time taken to reclaim a page in each reclaim cycle is mostly under a few milliseconds. However, many times the number of pages reclaimed during a reclaim cycle goes very low, making the time shoot up.

• Page scan density and page generation show that certain regions of memory had more page recycles, and their wide distribution corresponds to the complexity of the workload and its memory access pattern.

[Plot: page reclaim rate, microseconds vs. kilo samples]

6.3 dbench test with pagecache limit

The dbench file system benchmark was used to stress the pagecache controller. In the following experiment, dbench was run for 60 seconds with 20 clients. When dbench was run unrestricted, it used around 460 MB of pagecache. In this experiment the pagecache controller limit was set to 300 MB, which would force reclaim of pagecache pages during the run. The RSS controller was not enabled during this experiment.

[Plots: free, cached, active, and inactive list sizes and RSS usage, megabytes vs. kilo samples]

Observations:

• The pagecache controller does not reclaim all pagecache pages when the limit is hit. The reclaimer frees only as many pages as needed to push the container below its limit. Hence the pagecache usage and cached memory size are almost a straight line.

• As expected, the active and inactive list variations are like mirror images of each other.

• Other parameters, like LRU quantum, page generation, and page scan density, were similar to the pagetest program and not as widely distributed as kernbench. The pagecache usage pattern of dbench is much simpler compared to the memory access pattern of kernbench.

6.4 Web server workload

The daytrader benchmark application (with a stock size of 2000 and 800 concurrent users) was run with IBM® WebSphere™ Community Edition. Only the RSS control was enabled. The figures show reclaim parameter variation for container sizes of 600 MB and 400 MB, respectively. The Web server workload involved an in-built database server called derby, which stores all the daytrader data. The daytrader database results were obtained using the following steps:

1. Reset the daytrader data.

2. The configuration parameters are selected (direct transaction, no EJB, synchronous commit).

3. The database is populated with data.

4. We use a load balancer (web stress) tool to access the /daytrader/scenario URL of the Web server. We've used the Apache HTTP server benchmarking tool ab [1] in our testing.

[Plot: page reclaim rate, microseconds vs. kilo samples]

Observations:

• Decreasing the size of the container reduced the LRU quantum value.

• Reclaim performance was poor when the number of pages reclaimed was low, which resulted in high reclaim time.

• Decreasing the size of the container increased the page scan density. Each page was scanned more often before it could be freed.

• The range of physical memory used was independent of the size of the container used.

• The page generation went up as the size of the container was decreased.

6.5 Database workload

The pgbench [5] benchmark was run with only RSS control enabled. The figures show the reclaim parameter variation for container sizes of 800 MB and 400 MB, respectively. The results were obtained using the following steps:

1. The database was initialized with a scale factor of 100.

2. The benchmark pgbench was run with a scale factor of 100, simulating ten clients, each doing 1000 transactions.

[Plots: free, cached, active, and inactive list sizes and RSS usage, megabytes vs. kilo samples]

Observations:

• Decreasing the size of the container increased the rate of change of LRU quantum.

• High LRU quantum values resulted in more pages being reclaimed.

• Decreasing the size of the container increased the page scan density. Each page was scanned more often before it could be freed.

• The range of physical memory used was independent of the size of the container used.

• The range of physical memory used is bigger than the maximum RSS of the database server.

• The page generation went up as the size of the container was decreased.

• The range of physical memory used decreased as the page generation increased.

[Plots: pages reclaimed (nr_reclaimed) vs. kilo samples]

7 Future work

We plan to extend the basic RSS controller and the pagecache controller by adding an mlock(2) controller, and support for accounting kernel memory, such as slab usage, page table usage, VMAs, and so on.

8 Conclusion

Memory control comes with the overhead of increased CPU time and lower throughput. This overhead is expected, as each time the container goes over its assigned limit, page reclaim is initiated, which might further initiate I/O. A group of processes in a container is unlikely to do useful work if it hits its limits frequently; thus it is important for the page reclaim algorithm to ensure that when a container goes over its limit, it selects the right set of pages to reclaim. In this paper, we've looked at several parameters that help us assess the performance of the workload in the container. We've also looked at the challenges in designing and implementing a memory controller.

The performance of a workload under a container deteriorates, as expected. Performance data shows that the impact of changing the container size might not be linear. This aspect requires further investigation, along with the study of the performance of pages shared across containers.

9 Open issues

The memory controller currently supports only limits. Guarantees support can be built on top of the current framework using limits. One desirable feature for controllers is excess resource distribution. Resource groups use soft limits to redistribute unutilized resources: each container would get a percentage of unutilized resources in proportion to its soft limit. We have to analyze the impact of implementing such a feature.
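As a simple illustration of the proportional rule (our own example, not taken from the resource groups proposal): if two containers have soft limits of 512 MB and 1536 MB, and 1 GB of memory is unutilized, the first would be eligible for 25% of the excess (256 MB) and the second for 75% (768 MB).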

10 Legal Statement

© International Business Machines Corporation 2007. Permission to redistribute in accordance with Linux Symposium submission guidelines is granted; all other rights reserved.

This work represents the view of the authors and does not necessarily represent the view of IBM.

IBM, the IBM logo, ibm.com, and WebSphere are trademarks of International Business Machines Corporation in the United States, other countries, or both.

Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.

Other company, product, and service names may be trademarks or service marks of others.

References in this publication to IBM products or services do not imply that IBM intends to make them available in all countries in which IBM operates.

INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions; therefore, this statement may not apply to you. This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice.

References

[1] Apache HTTP server benchmarking tool. http://httpd.apache.org/docs/2.0/programs/ab.html.

[2] Pavel Emelianov. Memory controller with per-container page list. http://lkml.org/lkml/2007/3/6/198.

[3] Pavel Emelianov and Balbir Singh. Memory controller with per-container LRU page list. http://lkml.org/lkml/2007/3/9/361.

[4] Roy Huang. Pagecache control through page allocation. http://lkml.org/lkml/2007/1/15/26.

[5] Tatsuo Ishii. pgbench PostgreSQL benchmark. http://archives.postgresql.org/pgsql-hackers/1999-09/msg00910.php.

[6] Kirill Korotaev. Beancounters v2. http://lkml.org/lkml/2006/8/23/117.

[7] Kirill Korotaev. Beancounters v6. http://lkml.org/lkml/2006/11/9/135.

[8] Christoph Lameter. Pagecache control through page allocation. http://lkml.org/lkml/2007/1/23/263.

[9] Aubrey Li. Pagecache control through page allocation. http://lkml.org/lkml/2007/1/17/202.

[10] Andrew Morton. Usermode pagecache control: fadvise(). http://lkml.org/lkml/2007/3/3/110.

[11] Chandra Seetharaman. Resource groups. http://lkml.org/lkml/2006/4/27/378.

[12] Rohit Seth. Containers. http://lkml.org/lkml/2006/9/14/370.

[13] Balbir Singh. Memory controller RFC. http://lkml.org/lkml/2006/10/30/51.

[14] Balbir Singh. Memory controller v2. http://lkml.org/lkml/2007/2/26/8.

[15] Vaidyanathan Srinivasan. Container pagecache controller. http://lkml.org/lkml/2007/3/06/51.

Kernel Support for Stackable File Systems

Josef Sipek, Yiannis Pericleous, and Erez Zadok Stony Brook University {jsipek,yiannos,ezk}@fsl.cs.sunysb.edu

Abstract

Although it is now possible to use stackable (layered) file systems in Linux, there are several issues that should be addressed to make stacking more reliable and efficient. To support stacking properly, some changes to the VFS and VM subsystems will be required. In this paper, we discuss some of the issues and solutions proposed at the Linux Storage and Filesystems workshop in February 2007, our ongoing work on stacking support for Linux, and our progress on several particular stackable file systems.

1 Introduction

A stackable (layered) file system is a file system that does not store data itself. Instead, it uses another file system for its storage. We call the stackable file system the upper file system, and the file systems it stacks on top of the lower file systems.

Although it is now possible to use stackable file systems, a number of issues should be addressed to improve file system stacking reliability and efficiency. The Linux kernel VFS was not designed with file system stacking in mind, and therefore it comes as no surprise that supporting stacking properly will require some changes to the VFS and VM subsystems.

We use eCryptfs and Unionfs as the example stackable file systems to cover both linear and fan-out stacking, respectively.

eCryptfs is a cryptographic file system for Linux that stacks on top of existing file systems. It provides functionality similar to that of GnuPG, except that encrypting and decrypting the data is transparent to the application [1, 3, 2].

Unionfs is a stackable file system that presents a series of directories (branches) from different file systems as one virtual directory, as specified by the user. This is commonly referred to as namespace unification. Previous publications [4, 5, 6] provide a detailed description and some possible use cases.

Both eCryptfs and Unionfs are based on the FiST stackable file system templates, which provide support for layering over a single directory [7]. As shown in Figures 1(a) and 1(b), the kernel's VFS is responsible for dispatching file-system-related system calls to the appropriate file system. To the VFS, a stackable file system appears as if it were a standard file system. However, instead of storing or retrieving data, a stackable file system passes calls down to lower-level file systems. In this scenario, NFS is used as a lower-level file system, but any file system can be used to store the data as well (e.g., Ext2, Ext3, Reiserfs, SQUASHFS, isofs, etc.).

To the lower-level file systems, a stackable file system appears as if it were the VFS. Stackable file system development can be difficult because the file system must adhere to the conventions of both the file systems for processing VFS calls, and of the VFS for making VFS calls.

Without kernel support, stackable file systems suffer from inherent cache coherency problems. These issues can be divided into two categories: (1) data coherency of the page cache contents, and (2) meta-data coherency of the dentry and inode caches. Changes to the VFS and the stackable file systems are required to remedy these problems.

Moreover, lockdep, the in-kernel lock validator, "observes" and maps all locking rules as they dynamically occur, as triggered by the kernel's natural use of locks (spinlocks, rwlocks, mutexes, and rwsems). Whenever the lock validator subsystem detects a new locking scenario, it validates this new rule against the existing set of rules. Unfortunately, stackable file systems need to lock many of the VFS objects in a recursive manner, triggering lockdep warnings.

(a) eCryptfs layers over a single directory. (b) Unionfs layers over multiple directories.

Figure 1: The user processes issue system calls, which the kernel's virtual file system (VFS) directs to stackable file systems. Stackable file systems in turn pass the calls down to lower-level file systems (e.g., tmpfs or NFS).

To maintain the upper-to-lower file system mapping of kernel objects (such as dentrys, inodes, etc.), many stackable file systems share much of the basic infrastructure. The 2.6.20 kernel introduced fs/stack.c, a new file that contains several helper functions useful to stackable file systems.

The rest of this paper is organized as follows. In Section 2 we discuss the cache coherency issues. In Section 3 we discuss the importance of locking order. In Section 4 we discuss fsstack, the emerging Linux kernel support for stacking. In Section 5, we discuss a persistent store prototype we developed for Unionfs, which can be of use to others. Finally, we conclude in Section 6.

2 Cache Coherency

There are two different cache coherency issues that stackable file systems must overcome: data and meta-data.

2.1 Data Coherency

Typically, the upper file system maintains its own set of pages used for the page cache. Under ideal conditions, all the changes to the data go through the upper file system. Therefore, either the upper file system's write inode operation or the writepage address space operation will have a chance to transform the data as necessary (e.g., eCryptfs needs to encrypt all writes) and write it to the lower file system's pages.

Data incoherency occurs when data is written to the lower pages directly, without the stacked file system's knowledge. There are two possible solutions to this problem:

Weak cache coherency – NFS also suffers from cache coherency issues, as the data on the server may be changed by either another client or a local server process. NFS uses a number of assertions that are checked before cached data is used. If any of these assertions fail, the cached data is invalidated. One such assertion is a comparison of the ctime with what is cached, invalidating any potentially out-of-date information.

Strong cache coherency – Another possible solution to the cache coherency problem is to modify the VFS and VM to effectively inform the stackable file systems that the data has changed on the lower file system. There are several different ways of accomplishing this, but all involve maintaining pointers from lower VFS objects to all upper ones. Regardless of how this is implemented, the VFS/VM must traverse a dependency graph of VFS objects, invalidate all pages belonging to the corresponding upper address spaces, and sync all of the pages that are part of the lower address spaces.

Both approaches have benefits and drawbacks.

The major benefit of the weak consistency approach is that the VFS does not have to be modified at all. The major downside is that every stackable file system needs to contain a number of these checks. Even if helper functions are created, calls to these functions need to be placed throughout the stackable file systems. This leads to code duplication, which we try to address with fsstack (see Section 4).

The most significant benefit of the stronger coherency approach is the fact that it guarantees that the caches are always coherent. At the same time, it requires that the file system use the page cache properly, and that the file system supply a valid address space operations vector. Some file systems do not meet these requirements. For example, GPFS (created by IBM) only has readpage and writepage, but does not have any other address space operations. If the cache coherency is maintained at the page-cache level, the semantics of using a lower file system that does not define the needed operations would be unclear.

2.2 Meta-Data Coherency

Similar to the page cache, many VFS objects, such as the cached inode and dentry objects, may become inconsistent. The meta-data contained in these caches includes the {a,c,m}times, file size, etc.

Just as with data consistency, either a strong or a weak cache coherency model may be used to prevent the upper and lower VFS objects from disagreeing on the file system state. The benefits and drawbacks stated previously apply here as well (e.g., weak coherency requires code duplication in most stackable file systems).
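A weak-coherency check in this style might look like the following sketch (the stackfs_* names and the cached ctime field are illustrative; this is not code from eCryptfs or Unionfs):

    #include <linux/fs.h>
    #include <linux/time.h>

    struct stackfs_inode_info {
            struct timespec cached_lower_ctime;     /* assumed private field */
    };

    /* Compare the lower inode's ctime against what the upper layer
     * cached; on mismatch, drop the upper page cache and resync. */
    static int stackfs_revalidate(struct inode *upper, struct inode *lower)
    {
            struct stackfs_inode_info *info = upper->i_private;

            if (!timespec_equal(&info->cached_lower_ctime, &lower->i_ctime)) {
                    invalidate_mapping_pages(upper->i_mapping, 0, -1);
                    info->cached_lower_ctime = lower->i_ctime;
                    return 0;       /* cache was stale */
            }
            return 1;               /* cache still valid */
    }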

2.3 File Revalidation The code duplication found in many stackable file sys- tems (such as eCryptfs, Unionfs, and our upcoming The VFS currently allows for dentry revalidation. ) is another problem. The 2.6.20 kernel intro- NFS and other network file system are the few users of duced fs/stack.c, a new file, which contains sev- this. A useful addition to this dentry revalidation op- eral useful helper functions. We are working on further eration would be an equivalent file operation. Given abstractions to the stacking API in Linux. 226 • Kernel Support for Stackable File Systems

Each stackable file system must maintain a set of point- generic lookup code provided by the VFS, while a file ers from the upper (stackable file system) objects to the system requiring more complex lookup code, such as lower objects. For example, each Unionfs inode main- Unionfs, can provide its own lookup helper which per- tains a series of lower inode pointers. forms the necessary operations (e.g., to perform names- pace unification). Currently, there are two ways to keep track of lower objects. Linear (one lower pointer) and fan-out (sev- eral lower pointers). Fan-out is the more interesting case, as linear stacking is just a special case of fan- 5 On Disk Format (ODF) out, with only one branch. Quite frequently, Unionfs needs to traverse all the branches. This creates the need for a for_each_branch macro (analogous to We have developed an On Disk Format (ODF) to help for_each_node), which would decide when to ter- Unionfs 2.0 persistently store any meta-data it needs, minate. such as whiteouts. ODF is a small file system on a partition or loop device. The ODF has helped us re- A “reference” stackable file system, much like NullFS in solve most of the critical issues that Unionfs 1.x had many BSDs, would allow stackable file system authors faced, such as namespace pollution, inode persistence, to easily create new stackable file systems for Linux. readdir consistency and efficiency, and more. Since, This reference file system should use as many of the all the meta-data is kept in a separate file system instead fsstack interfaces as possible. Currently, the clos- of in the branches themselves, Unionfs can be stacked est thing to this is Wrapfs [7], which can be generated on top of itself and have overlapping branches. from FiST. Unfortunately, the generated code does not follow proper coding style and general code cleanliness. Such a format can be used by any stackable file system that needs to store meta-data persistently. For example, In Section 2.1, we considered a weaker form of the a versioning file system may use it to store information cache coherency model. This model suffers from the about the current version of a file, a caching file system fact that a large number of the coherency checks (e.g., may use it to store information about the status of the checking the {a,c,m}times) will need to be dupli- cached files. Unionfs benefits in many ways as well, for cated in each stackable file system. Using fsstack example there is no namespace pollution, and Unionfs avoids this problem by making use of generic functions can be stacked on itself. to perform operations common to all stackable file sys- tems. The code necessary to invalidate and revalidate the upper file system objects could be shared by several By keeping all the meta-data together in a separate file file systems. However, each file system must call these system, simply creating an in-kernel mount can be used helper functions. If a bug is discovered in one stackable to easily hide it from the user, assuring that the user will file system (e.g., a helper function should be called but is not temper with the meta-data. Also, it becomes easier not), the fix may have to be ported to other file systems. to backup the state of the file system by simply creat- ing a backup of the ODF file system. 
If a file system Stackable file systems must behave as a file system from which statically allocates inode tables is used, the user the point of view of the VFS, yet they must behave as must estimate the number of inodes and data blocks the VFS from the point of view of the file systems it is the ODF will need before hand. Using a file system stacked on top of. Generally, the most complex code oc- which allocates inode blocks dynamically (e.g., XFS) curs in the file system lookup code. One idea, proposed fixes this problem. This is a shortcoming of the file sys- at the 2007 Linux Storage and Filesystem workshop, tem, and not ODF itself. was to divide the current VFS lookup code into two por- tions, and to allow the file system to override part of the The ODF can use another file system, such as Ext2 or functionality via a new inode operation. The default XFS, to store the meta-data. Another possibility we are operation would have functionality identical to the cur- looking at for the future is to build an ODF file sys- rent lookup code. The flexibility allowed by this code tem that will have complete control of how it stores this refactoring would simplify some of the code in more meta-data, thus allowing us to make it more efficient, complex file systems. For example, Ext2 could use the flexible and reusable. 2007 Linux Symposium, Volume Two • 227

6 Conclusion and unix semantics in namespace unification. ACM Transactions on Storage (TOS), 2(1):1–32, Stackable file systems can be used today on Linux. February 2006. There are some issues which should be addressed to in- [6] C. P. Wright and E. Zadok. Unionfs: Bringing File crease their reliability and efficiency. The major issues Systems Together. , (128):24–29, include the data and meta-data cache coherency between December 2004. the upper and lower file systems, code duplication be- tween stackable file systems, and the recursive nature of [7] E. Zadok and J. Nieh. FiST: A Language for stacking causing lockdep to warn about what it infers Stackable File Systems. In Proc. of the Annual are possible deadlocks. Addressing these issues will re- USENIX Technical Conference, pages 55–70, San quire changes to the VFS/VM. Diego, CA, June 2000. USENIX Association.

7 Acknowledgements

The ideas presented in this paper were inspired and mo- tivated by numerous discussions with the following peo- ple: Russel Catalan, Dave Chinner, Bruce Fields, Steve French, Christoph Hellwig, Eric Van Hensbergen, Val Henson, Chuck Lever, Andrew Morton, Trond Mykle- bust, Eric Sandeen, Theodore Ts’o, Al Viro, Peter Zijl- stra, and many others.

This work was partially made possibly by NSF Trusted Computing Award CCR-0310493.

References

[1] M. Halcrow. : a stacked cryptographic filesystem. Linux Journal, (156), April 2007.

[2] M. A. Halcrow. Demands, Solutions, and Improvements for Linux Filesystem Security. In Proceedings of the 2004 Linux Symposium, pages 269–286, Ottawa, Canada, July 2004. Linux Symposium.

[3] M. A. Halcrow. eCryptfs: An Enterprise-class Encrypted Filesystem for Linux. In Proceedings of the 2005 Linux Symposium, pages 201–218, Ottawa, Canada, July 2005. Linux Symposium.

[4] D. Quigley, J. Sipek, C. P. Wright, and E. Zadok. UnionFS: User- and Community-oriented Development of a Unification Filesystem. In Proceedings of the 2006 Linux Symposium, volume 2, pages 349–362, Ottawa, Canada, July 2006.

[5] C. P. Wright, J. Dave, P. Gupta, H. Krishnan, D. P. Quigley, E. Zadok, and M. N. Zubair. Versatility and unix semantics in namespace unification. ACM Transactions on Storage (TOS), 2(1):1–32, February 2006.

[6] C. P. Wright and E. Zadok. Unionfs: Bringing File Systems Together. Linux Journal, (128):24–29, December 2004.

[7] E. Zadok and J. Nieh. FiST: A Language for Stackable File Systems. In Proc. of the Annual USENIX Technical Conference, pages 55–70, San Diego, CA, June 2000. USENIX Association.

Linux Rollout at Nortel

Ernest Szeideman, Nortel Networks Ltd., [email protected]

Abstract

At Nortel, we have focused on delivering a "Standard Operating Environment" for our design systems whereby we maintain a common set of tools and processes in the rollout of Linux and other operating system images. There are a number of opportunities, challenges, and pitfalls with bringing this about at an enterprise level. This paper will discuss what an SOE is, briefly describe how we did the design and why, and most importantly speak of the lessons we have learned and how they relate to Linux and the open source model from an enterprise perspective.

1 Introduction

The Linux version of the Standard Operating Environment (SOE) was born a few years ago out of an initiative to introduce a standard image configuration that would address the needs of groups who were increasingly looking to Linux for product development and testing. Since that initial SOE release, Linux has become common for desktop and server computing solutions across the corporation. The goal of every Linux SOE release is to introduce a certified and supported Enterprise Linux distribution into Nortel.

A survey of the literature reveals many articles detailing the specific implementation details and related challenges faced in creating a standardized image. Fewer articles speak of the high-level design and engineering process driving the implementation, and fewer still speak specifically about lessons learned which would assist others in overcoming assumptions and processes counterproductive to such an endeavor. Ubiquitous throughout the IT industry is the concept of a Standard Operating Environment, referring to a standard image configuration. However, to be useful and accepted in an enterprise, an SOE requires that the design solve real business problems for a company, problems that vary over time, across different industries, business environments, and even different business cultures. As such, although the design and development of an SOE may vary, the lessons learned by one company should prove of benefit to others as well.

2 What is an SOE and Why

A Standard Operating Environment for Linux means a standard image configuration for both desktops and servers. The intention is that unless there is sufficient justification, all supported Linux installs within the company will use the SOE. This greatly simplifies management by ensuring consistency in the deployed image regardless of location, which provides high levels of reliability and supportability.

The SOE should support a limited set of hardware that has been chosen for company-wide use for both servers and workstations. Although Linux provides perhaps the most hardware support of any operating system ever, minimizing the set of hardware reduces the matrix of testing required. This reduces hardware support costs, and reduces hardware acquisition costs due to volume purchasing.

The inclusion of a common set of Linux vendor packages on all machines, a common set of third-party packages, as well as a common set of company-developed packages, ensures consistency in the deployed image regardless of location. It also provides a vehicle whereby software can be deployed company-wide to meet ever-changing business needs.

Security and network certification of the image implies security and network configuration changes (such as ensuring limited world access to init scripts, or checks to ensure that IP forwarding is turned off). This helps the company in minimizing risk, taking advantage of security and network expertise, providing confidence in the SOE, as well as ensuring that the image plays nicely within a company's environment.

Providing a standardized installation process, including appropriate storage locations for the image, reduces the net cost per Linux install by reducing complexity, ensuring standardization, and maximizing the ability to support the install via documentation and support help lines.

Lastly, having a formalized process to gather requirements, design, implement, test, and trial an SOE ensures that tasks can be adequately resourced, timelines meet business priorities, and consumers of the image can plan for the deployment and use of the image to accomplish business goals.

3 The Nortel SOE

At Nortel Networks Inc., there are over 291,000 nodes on a network with over 350 locations throughout the world containing 8,000 subnets, housing a myriad of servers and desktops running many different operating systems and providing access to a number of different network services (as of May, 2006). Any opportunities at standardization will result in substantial savings to the corporation.

Level   Layer               Explanation
5       Patching            Post-SOE maintenance
4       Group specific      UML, Clearcase
3       Location specific   Postinstall, cfengine
2       Global config       Packages, security, network
1       Vendor OS           Consistent set of packages
0       Hardware            Hardware catalogue

The table above denotes the high-level design of the Linux SOE.

Level 0, or the Hardware layer, represents all activities in achieving a standard catalogue of hardware, including hardware comparisons, benchmarking, vendor negotiation, and the like. Any SOE that is released will have, as a minimum, the requirement to support the catalogue hardware.

Level 1, or the Vendor operating system layer, is the inclusion of a consistent set of packages from the vendor that is supposed to achieve three things:

1. It must include a reasonable set of packages required to support the environment.

2. It must include packages that are deemed as required by the internal customers.

3. It must attempt to be consistent with previous SOE releases (i.e., it must attempt to match the functionality that was included in previous SOE releases at this layer).

In this layer, one should capitalize on and make use of the installation tools or mechanisms provided by the vendor (e.g., anaconda/kickstart used in RHEL (Red Hat Enterprise Linux) or YaST (Yet Another Setup Tool) used in SuSE (Software und System-Entwicklung)). As Nortel is using RHEL in its SOE, this layer is accomplished with the use of kickstart, where the required package groupings and packages are specified in the %packages section.

Level 2, or the Global configuration layer, is where other packages not necessarily provided by the vendor are installed. This includes security, network, and company-provided packages. A special design consideration for this layer is to keep absolute separation between whatever installation mechanisms the vendor provides and the one relied upon at this layer. At Nortel, the automated kickstart installation mechanism has a post-install section that is used to automatically kick off an install script, which completes all aspects of this layer. The benefits of this are threefold (a sketch of this split follows the list):

1. It allows easy determination of where problems may exist in an install.

2. It insulates the SOE engineers from changes made to the vendor's installation tools or mechanisms.

3. It allows for changes to the underlying Linux distribution without major impact to this level or the levels above it.
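To make the split between Levels 1 and 2 concrete, the fragment below sketches the general shape of such a kickstart file. The package groups and the script path are illustrative only, not Nortel's actual configuration:

    # Level 1: vendor packages, chosen in the %packages section
    %packages
    @ Base
    @ Development Tools
    openssh-server

    # Level 2: immediately hand off to a company-maintained script,
    # keeping global configuration separate from the vendor installer
    %post
    /mnt/soe/level2-install.sh

Because the %post section does nothing except launch the site script, the vendor's installer can change between releases without disturbing anything above Level 1.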
Level 3, or the location-specific layer, is designed to answer the requirement of how to maintain standardization across a multitude of locations where specific infrastructure services, service names, and configuration processes differ. This is a key challenge for any large corporation. The methodology employed requires that the target node take advantage of locally dependent services while maintaining the standardization of the SOE. For Nortel, an init (initialization) script, which allows for complete automation, handles this layer, which includes support for NIS, NTP, LDAP, and cfengine as well as other services. Automation of all of these tasks is not only desirable, but is also required if one wants to ensure consistency in deployment. An additional requirement of this layer is to be able to re-implement these services in the event that a machine changes locations (for example, if a node is redeployed to another site, the employee changes locations, etc.). Making use of an init script allows for this requirement. Configuration-specific parameters are sourced from location-named files containing all data relevant to this layer for each major location at the company.

Level 4, or the group-specific layer, contains that which does not need to be installed everywhere, yet which is required by specific groups. Examples of this are the use of virtualization such as UML (User-Mode Linux) as well as Clearcase. Interestingly, group-specific software, such as Clearcase, also has location-specific dependencies (for example, which VOB (Versioned Object Base) servers to connect to may be dependent on which site you work at). The ownership of each of the capabilities relied upon at this level is provided at Nortel by specific individuals or groups who may have formal vendor relationships as required. As these people or groups have the requirement to have their code work with the SOE, they form a special community who has access to pre-releases of all SOEs. References are made to their documentation from within that provided for each SOE. In some cases, there is an automatic reinstallation of these pieces in the event a reimage is performed.

Level 5, or the patching layer, concerns itself with post-SOE maintenance. Once an image is deployed, it must still be maintained and/or kept track of for licensing. There must also be the facility to account for changing business requirements, which may include the deployment of new or updated products (for example, DST (Daylight Saving Time) fixes). At Nortel, we are currently using RHN (Red Hat Network) Satellite. We take a snapshot of the base channel minus the kernel (third-party applications are tied to kernel versions). The snapshot of the channel is tested before the patch bundle is released, to ensure that the patches don't break anything in the Nortel environment. This patch bundle is produced quarterly. This allows time to deploy the bundle to all of the systems in an orderly and systematic way. Within Nortel, rhnsd is not used. The patch window for each system is scheduled ahead of time and controlled by configuration files on the system. A generic scheduling method is employed that can be used across all the UNIX and UNIX-like operating systems. As everything is packaged as a requirement of being included in the SOE, this enables all aspects of the standardized image, across all levels, to be patched.

4 Lessons Learned

A number of lessons have been learned since a fully supported Linux was introduced in Nortel a few years ago. These are lessons taken from an enterprise perspective and may not apply everywhere.

1. Remote cloning of machines in an enterprise is a deployment concern that must be taken into account. Hewlett Packard's iLO (Integrated Lights-Out) or other similar remote console mechanisms are highly desirable, particularly when the system administrator or installer is located remotely from the machine being imaged. One should not assume that the installer is able to sit in front of the machine being deployed.

2. Windows interoperability solutions contained within Linux have really enhanced its value in the enterprise, particularly when compared with other proprietary UNIX operating systems. However, it has caused and continues to cause many challenges. VMware with a Windows guest is currently being used in Nortel with workstations to provide standardized Windows images to those running Linux. One may wish to refresh one's Linux image to the latest General Availability (GA) release, but this does not assume that one wishes to have their Windows environment upgraded as well. To accommodate this requirement, a localdisk partition is created on all default Linux installs which holds, among other things, the VMware image files. An upgrade clone is utilized which wipes all partitions except localdisk, and thereby allows the user to get a new Linux build while keeping any VMware images they may have had.

3. One cannot assume access to an enterprise's DHCP (Dynamic Host Configuration Protocol) infrastructure to make use of PXE (Preboot eXecution Environment) installs or Red Hat Network provisioning. In a large corporation, different groups are responsible for different aspects of the infrastructure. As such, it takes time to get consensus on how best to implement change. One such example is the use of PXE as it relates to the DHCP infrastructure. If this is the case, as it is at Nortel, one may need to find alternative means to accomplish remote upgrades of machines. A script which integrates with cron (a time-based scheduling service in Linux) is currently being used to provide this functionality.

4. For security reasons, patching of one's infrastructure is necessary using internal repositories. As well, no information about the nodes being patched should leave the company network (cannot use RHN or RHN proxy; must use RHN Satellite).

5. An additional comment with regards to patching involves the potential for divergence when patching versus re-rolling an SOE using a newer update. This is particularly true if one is limited in updating kernels due to third-party reliance on the kernel (e.g., Clearcase or UML). For example, if one starts with a RHEL 4.1 machine and patches it with a patch bundle to 4.3, one may not end up with the same system as if one started with RHEL 4.2, because not all patches are included in the patch bundle. In Nortel's case, the kernel is not included in the patch bundle, meaning that although both RHEL 4.1 and 4.2 machines were patched to a RHEL 4.3 level, they are not identical.

6. Users in a corporation typically do not have root access. For example, a user without root access cannot add a printer, so printer configuration must be managed on a global basis. When users do not have root access, there are significant management and support implications. On the other hand, if the users do have root access, there are a whole different set of support and management implications.

7. Being able to identify an SOE machine remotely is not only desirable, but is also required from a systems management, licensing, lifecycle, and maintenance perspective.

8. Packaging all components of an SOE, including in-house and third-party software, in the same format as your Linux vendor's packages is important. Being able to easily upgrade if required (such as in the event of a security vulnerability), easily determining versions of software, being able to validate the authenticity of software (via digital signing), and being able to understand where files on a system came from are some of the benefits of this.

9. Communication to your deployment people as well as to those making use of the SOE is paramount to achieving success. Whether by use of WebPages, blogs, user groups, or other forms of documentation, communication gets more challenging as the size of your company grows.

10. ISV (Independent software vendor) support will probably be the single most important factor in determining which distribution your SOE is based on.

11. No matter how much you simplify an install or install process, deployment using installers without Linux experience will be a problem.

12. It is difficult to get patches from your Linux vendor fast enough (e.g., when you find a problem while producing the SOE or a patch bundle, both of which have deadlines to meet).

13. If a vendor says hardware is certified, what does that mean? Read the fine-print!

14. Despite every effort to the contrary, using third-party proprietary applications/code is a requirement that should be assumed in an enterprise (e.g., Clearcase).

15. A variety of products to choose from (KDE vs. Gnome, for example) makes standardization difficult when one must appease many palates.

16. Keep Global configuration (Level 2) and higher separate from the vendor install at all costs or you will be sorry!

17. Digitally sign all of your in-house packages. Being able to ascertain the authenticity of the packages contained within the SOE is important from a corporate and a security standpoint.

18. Have backup copies of all GA'd images. This will save future time and aggravation. At some point somebody will need to install an old image, either for testing or other purposes.

19. Change control is a critical component of SOE development (e.g., CVS, Clearcase, etc.).

20. Testing is important. Sadly, it will be the first thing to go when schedules are tight; having a testing matrix and test plan is paramount. A corollary to this: if someone tells you they have tested their product but does not have a test plan, they are not telling the truth.

21. Fixes upstream are useless unless they are backported to the current SOE environment(s). (The Fix is Upstream BOF with Matthew Tippett.)

22. One cannot move from proprietary UNIXes (Solaris/HP-UX) to Linux in one step, although new projects can start out quite well.

23. There have been and continue to be issues with Linux interoperating in a heterogeneous enterprise environment (e.g., assuming print servers are Linux as opposed to the non-CUPS-aware HP-UX). Do not assume your Linux vendor does any extensive testing using other operating systems your company uses.

24. Each package which you include in an SOE that does not come from the vendor needs to have someone who is responsible for it (a provider).

25. There is a large amount of proprietary thinking on the part of management that needs to be modified when using Linux and/or other open source software. An example is assuming that the Linux vendor that you are paying for support can fork some code to meet the corporation's requirements instead of the vendor waiting for the fix to be available from upstream. The Linux vendor's preferred method is to wait for the fixes to come from upstream (e.g., they would rather wait for the Evolution folks to fix the code than have to fix it themselves and then maintain the fix in subsequent upstream versions if the Evolution folks don't take the fix).

26. When creating an SOE, use of a vendor-supported Linux offering is recommended over an unsupported version. An example is the use of RHEL vs. Fedora. A supported Linux offering will have more and better ISV support. Updates tend to focus on customer problems, and compatibility tends to be of higher importance than new features. And lastly, in an enterprise environment where downtime may mean the loss of substantial amounts of money, the ability to get support from a company is important from a business perspective.

5 Conclusion

The engineering and design of an SOE for an enterprise requires participation from throughout the corporation. For any large corporation, developing an SOE is worth the cost due to the many benefits it offers. The design is critical to an SOE's success and must reflect and solve real-world business problems.

It is hoped that the many lessons learned in our SOE voyage at Nortel will help guide others pursuing the many benefits a Linux SOE has to offer.

Request-based Device-mapper multipath and Dynamic load balancing

Kiyoshi Ueda, NEC Corporation, [email protected]
Jun'ichi Nomura, NEC Corporation, [email protected]
Mike Christie, Red Hat, Inc., [email protected]

Abstract

Multipath I/O is the ability to provide increased performance and fault-tolerant access to a device by addressing it through more than one path. For storage devices, Linux has seen several solutions that were of two types: high-level approaches that live above the I/O scheduler (BIO mappers), and low-level subsystem-specific approaches. Each type of implementation has its advantage because of the position in the storage stack in which it has been implemented. The authors focus on a solution that attempts to reap the benefits of each type of solution by moving the kernel's current multipath layer, dm-multipath, below the I/O scheduler and above the hardware-specific subsystem. This paper describes the block, Device-mapper, and SCSI layer changes for the solution and its effect on performance.

1 Introduction

Multipath I/O provides the capability to utilize multiple paths between the host and the storage device. Multiple paths can result from host or storage controllers having more than one port, redundancy in the fabric, or having multiple controllers or buses.

As can be seen in Figure 1(a), multipath architectures like MD Multipath or the Device-mapper (DM) layer's multipath target have taken advantage of multiple paths at a high level by creating a virtual device that is comprised of block devices created by the hardware subsystem.

In this approach, the hardware subsystem is unaware of multipath, and the lower-level block devices represent paths to the storage device, over which the higher-level software routes I/Os in units of BIOs [1]. This design has the benefit that it can support any block device, and is easy to implement because it uses the same infrastructure that software RAID uses. It has the drawback that it does not have access to detailed error information, and it resides above the I/O scheduler and I/O merging, which makes it difficult to make load-balancing decisions.

Another approach to multipath design modifies the hardware subsystem or low-level driver (LLD) to be multipath-aware. For the Linux SCSI stack (Figure 1(b)), Host Bus Adapter (HBA) manufacturers, such as Qlogic, have provided LLDs for their Fibre Channel and iSCSI cards which hide the multipath details from the OS [2]. These drivers coalesce paths into a single device which is exposed to the kernel, and route SCSI commands or driver-specific structures. There have also been multipath implementations in the SCSI layer that are able to route SCSI commands over different types of HBAs. These can be implemented above the LLD, in the SCSI mid-layer, or as a specialized SCSI upper-layer driver [3]. These designs benefit from being at a lower level, because they are able to quickly distinguish transport problems from device errors, can support any SCSI device including tape, and are able to make more intelligent load-balancing decisions.

This paper describes a multipath technique, Request-based DM, which can be seen in Figure 1(c), that is located under the I/O scheduler and above the hardware subsystem, and routes struct requests over the devices created by the hardware-specific subsystem. (To distinguish it from the general meaning of request, request is used in this paper to refer to the struct request.) As a result of working with requests instead of BIOs or SCSI commands, this new technique is able to bridge the multipath layer with the lower levels, because it has access to the lower-level error information, and is able to leverage the existing block-layer statistics and queue management infrastructure to provide improved load balancing.

In Section 2, we will give an overview of the Linux block and DM layers. Then Section 3 describes in more detail the differences between routing I/O at the BIO level versus the request level, and what modifications to the block, SCSI, and DM layers are necessary to support Request-based multipath. Finally, we will detail performance results and load-balancing changes in Section 4 and Section 5.


[Figure 1: Multipath implementations. (a) BIO mapper multipath: the BIO mapper (dm-multipath) sits between the file system/direct I/O and the I/O scheduler, above the hardware subsystem (SCSI, IDE). (b) SCSI multipath: multipathing is implemented inside the SCSI stack, in the SCSI upper-level drivers (disk (sd), tape (st), and CD (sr)), the SCSI mid layer, or the low-level driver (Qlogic qla2xxx, Emulex lpfc). (c) Request-based DM multipath: Request-based DM sits below the I/O scheduler and above the hardware subsystem (SCSI, IDE).]

2 Overview of Linux Block I/O and DM

In this section, we look into the building blocks of multipath I/O; specifically, how applications' read/write operations are translated into low-level I/O requests.

2.1 Block I/O

When a process, or the kernel itself, reads or writes to a block device, the I/O operation is initially packed into a simple container called a BIO (see Figure 2). The BIO uses a vector representation pointing to an array of <page, offset, len> tuples to describe the I/O buffer, and has various other fields describing I/O parameters and state that needs to be maintained for performing the I/O [4].

[Figure 2: Relationship between page, BIO, and request. A read or write of a page from the file system (or direct I/O) enters the block layer via submit_bio() as a BIO, is merged into a request in the block layer, and is handed to the low-level device driver when the queue is unplugged.]

Upon receiving the BIO, the block layer attempts to merge it with other BIOs already packed into requests and inserted in the request queue. This allows the block layer to create larger requests and to collect enough of them in the queue to be able to take advantage of the sorting/merging logic in the elevator [4].

This approach to collecting multiple larger requests before dequeueing is called plugging. If the buffers of two BIOs are contiguous on disk, and the size of the BIO combined with the request it is to be merged with is within the LLD's and hardware's I/O size limitations, then the BIO is merged with an existing request; otherwise the BIO is packed into a new request and inserted into the block device's request queue. Requests will continue to be merged with incoming BIOs and adjacent requests until the queue is unplugged. An unplug of the queue is triggered by various events such as the plug timer firing, the number of requests exceeding a given threshold, and I/O completion.

When the queue is unplugged, the requests are passed to the low-level driver's request_fn() to be executed. And when the I/O completes, subsystems like the SCSI and IDE layers will call blk_complete_request() to schedule further processing of the request from the block layer softirq handler. From that context, the subsystem will complete processing of the request by asking the block layer to requeue it, or by calling end_request(), or end_that_request_first()/chunk() and end_that_request_last(), to pass ownership of the request back to the block layer. To notify upper layers of the completion of the I/O, the block layer will then call each BIO's bi_end_io() function and the request's end_io() function.

During this process of I/O being merged, queued, and executed, the block layer and hardware-specific subsystem collect a wide range of I/O statistics, such as the number of sectors read/written, the number of merges that occurred, and the accumulated length of time requests took to complete.
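For reference, the request_fn() handoff described above follows a well-known pattern in 2.6-era low-level drivers. The following is a minimal sketch, modeled on the simple block-driver examples of the time, with the actual data transfer elided:

    #include <linux/blkdev.h>

    /* called with the queue lock held, typically after an unplug */
    static void my_request_fn(struct request_queue *q)
    {
            struct request *req;

            while ((req = elv_next_request(q)) != NULL) {
                    if (!blk_fs_request(req)) {
                            end_request(req, 0);  /* fail non-fs requests */
                            continue;
                    }
                    /* transfer the chunk described by req->sector and
                     * req->current_nr_sectors to/from the hardware here */
                    end_request(req, 1);          /* complete successfully */
            }
    }

end_request() is one of the completion interfaces mentioned above: it completes the current chunk and, once the whole request is done, passes ownership back to the block layer.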

2.2 DM

DM is a virtual block device driver that provides a highly modular kernel framework for stacking block device filter drivers [1]. The filter drivers are called target drivers in DM terminology, and can map a BIO to multiple block devices, or can modify a BIO's data or simulate various device conditions and setups. DM performs this BIO manipulation by performing the following operations:

• Cloning

  – When a BIO is sent to the DM device, a near-identical copy of that BIO is created by DM's make_request() function. The major difference between the two BIOs is that the clone will have a completion handler that points to DM's clone_endio() function.

• Mapping

  – The cloned BIO is passed to the target driver, where the clone's fields are modified. For targets that map or route I/O, the bi_bdev field is set to the device the target wants to send the BIO to.

  – The cloned BIO is submitted to the underlying device.

• Completion

  – DM's completion handler calls the target-specific completion handler, where the BIO can be remapped or completed.

  – The original BIO is completed when all cloned BIOs are completed.

2.3 dm-multipath

dm-multipath is the DM target driver that implements multipath I/O functionality for block devices. Its map function determines which device to route the BIO to, using a path selection module and by ordering paths in priority groups (Figure 3). Currently, there is only the round-robin path selector, which sends N number of BIOs to a path before selecting a new path.

[Figure 3: current (BIO-based) dm-multipath. Original BIOs (e.g., sector=0, length=100 and sector=100, length=100) are cloned by the DM device; dm-multipath's path selector maps each clone to block device A or block device B, each of which has its own I/O scheduler and low-level driver.]
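The round-robin behavior can be sketched as follows. This is a simplification of the idea behind the kernel's round-robin selector; the structure and field names here are illustrative, not the actual kernel code:

    #include <linux/list.h>

    struct path {
            struct list_head list;   /* linkage in the valid-path list */
            /* ... underlying device, state ... */
    };

    struct rr_state {
            struct list_head *current_path; /* position in the path list */
            unsigned repeat_count;          /* rr_min_io: BIOs per path */
            unsigned count;                 /* BIOs sent on current path */
    };

    /* pick the path for the next BIO */
    static struct path *rr_select_path(struct rr_state *s,
                                       struct list_head *valid_paths)
    {
            if (++s->count >= s->repeat_count) {
                    s->count = 0;
                    /* advance to the next path, skipping the list head */
                    s->current_path = s->current_path->next;
                    if (s->current_path == valid_paths)
                            s->current_path = valid_paths->next;
            }
            return list_entry(s->current_path, struct path, list);
    }

The repeat_count value is the rr_min_io parameter that reappears in the performance measurements of Section 4.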

3 Request-based DM

3.1 Limitations of BIO-based dm-multipath

There are three primary problems which Request-based multipath attempts to overcome:

1. Path selection does not consider the I/O scheduler's behavior.

2. Block layer statistics are difficult for DM to use.

3. Error severity information is not available to DM.

dm-multipath's path selection modules must balance trying to plug the underlying request queue and creating large requests against making sure each path is fully utilized. Setting the number of BIOs that cause a path switch to a high value will assure that most BIOs are merged. However, it can cause other paths to be underused, because it concentrates I/O on a single path for too long. Setting the BIO path-switching threshold too low would cause small requests to be sent down each path and would negate any benefits that plugging the queue would bring. At its current location in the storage stack, the path selector module must guess what will be merged by duplicating the block layer tests, or duplicate the block layer statistics so it can attempt to make reasonable decisions based on past I/O trends.

Along with enhancing performance, multipath's other duty is better handling of disruptions. The LLD and the hardware subsystem have a detailed view of most problems. They can decode SCSI sense keys that may indicate a storage controller on the target is being maintained or is in trouble, and have access to lower-level error information such as whether a connection has failed. Unfortunately, the only error information DM receives is the error code value -EIO.

Given that these issues are caused by the location of the dm-multipath map and completion functions, Request-based DM adds new DM target-driver callouts, and modifies DM and the block layer to support mapping at the request level. Section 3.2 will explain the design and implementation of Request-based DM. Section 3.3 discusses the issues that are still being worked on and problems that were discovered with the Request-based approach. Finally, Section 3.4 describes user interface changes to DM.

3.2 Design and implementation

Request-based DM is based on ideas from the block layer multipath [5] patches, which were an experiment that moved the multipath layer down to the request level. The goal of Request-based DM is to integrate into the DM framework so that it is able to utilize existing applications such as multipath-tools and dmsetup.

As explained in Section 2.2, DM's key I/O operations are cloning, mapping, and completion of BIOs. To allow these operations to be executed on requests, DM was modified to take advantage of the following block layer interfaces:

             Submission         Completion
    BIO      make_request_fn    bio->bi_end_io
    Request  request_fn         request->end_io

[Figure 4: Difference of I/O submission flow. With BIO-based DM, the clone-and-map step happens in the DM device's make_request_fn(), above the underlying device's I/O scheduler queue. With Request-based DM, BIOs first pass through the DM device's own I/O scheduler queue, and the clone-and-map step happens in the DM device's request_fn(), before the cloned request reaches the underlying device's request_fn() and the hardware (SCSI command, etc.).]

As can be seen in Figure 4, with BIO-based DM, BIOs submitted by upper layers through generic_make_request() are sent to DM's make_request_fn(), where they are cloned, mapped, and then sent to a device below DM. This device could be another virtual device, or it could be the real device queue, in which case the BIO would be merged with a request or packed into a new one and queued, and later executed when the queue is unplugged.

Conversely, with Request-based DM, DM no longer uses a specialized make_request_fn() and instead uses the default make_request_fn(), __make_request(). To perform cloning and mapping, DM now implements its request_fn() callback, which is called when the queue is unplugged. From this callback, the original request is cloned. The target driver is asked to map the clone. And then the clone is inserted directly into the underlying device's request queue using __elv_add_request().

To handle I/O completion, the Request-based DM patches allow the request's end_io() callback to be used to notify upper layers when an I/O is finished; more details on this implementation are in Section 3.3. DM only needs to hook into the cloned request's completion handler—similar to what is done for BIO completions (Figure 5).
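A heavily simplified sketch of that request_fn() flow is shown below. The clone_and_map_request() helper is a stand-in name for the patch internals, which are more involved; locking across the two queues and error handling are omitted entirely:

    #include <linux/blkdev.h>

    /* the DM device's request_fn(): clone, map, dispatch (sketch) */
    static void dm_request_fn(struct request_queue *q)
    {
            struct request *rq, *clone;

            while ((rq = elv_next_request(q)) != NULL) {
                    blkdev_dequeue_request(rq);

                    /* allocate a clone (from DM's private mempool, see
                     * Section 3.3.1) and let the multipath target pick
                     * the underlying path device -- hypothetical helper */
                    clone = clone_and_map_request(q, rq);

                    /* insert the clone directly into the chosen
                     * underlying device's request queue */
                    __elv_add_request(clone->q, clone,
                                      ELEVATOR_INSERT_BACK, 1);
            }
    }

The key point the sketch illustrates is that the original request has already been through the DM device's I/O scheduler, so merging has happened before a path is chosen.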

3.3 Request-based DM road bumps

A lot of the Request-based details have been glossed over, so they could be discussed in this section. While working with requests simplifies the mapping operation, it complicates the cloning path, and requires a lot of new infrastructure for the completion path.

3.3.1 Request cloning

A clone of the original request is needed because a layer below DM may modify request fields, and the use of a clone simplifies the handling of partial completions. To allocate the clone, it would be best to use the existing request mempools [6], so DM initially attempted to use get_request(). Unfortunately, the block layer and some I/O schedulers assume that get_request() is called in the process context in which the BIO was submitted, but a request queue's request_fn() can be called from the completion callback, which can be run in a softirq context.

Removing the need for cloning, or modifying get_request() to be usable from any context, would be the preferable solutions, but both required too many changes to the block layer for the initial release. Instead, DM currently uses a private mempool for the request clones.

3.3.2 Completion handling

dm-multipath's completion handler has to do the following:

• Check completion status.

• Set up a retry if an error has occurred.

• Release its private data if a retry is not needed.

• Return a result to the upper layer.

[Figure 5: Completion handling of cloned I/O. A cloned BIO or request completes through the DM completion handler; on success, the original completes through the default completion handler and the result becomes visible to the upper layer, while on failure the clone is retried against the hardware.]

When segments of a BIO are completed, the upper layers do not begin processing the completion until the entire operation has finished. This allows BIO mappers to hook at a single point—the BIO's bi_end_io() callback. Before sending a BIO to generic_make_request(), DM will copy the mutable fields like the size, starting sector, and segment/vector index. If an error occurs, DM can wait until the entire I/O has completed, restore the fields it copied, and then retry the BIO from the beginning.

Requests, on the other hand, are completed in two parts: __end_that_request_first(), which completes BIOs in the request (sometimes partially), and end_that_request_last(), which handles statistics accounting, releases the request, and calls its end_io() function.

[Figure 6: Performance effects of I/O merging. Read and write throughput (MB/s) and the total number of dispatched requests, as max_sectors varies from 4 KB to 512 KB. Command: dd if=/dev/ of=/dev/ bs=16777216 count=8]

This separation creates a problem for error processing, because Request-based DM does not do a deep copy of the request's BIOs. As a result, if there is an error, __end_that_request_first() will end up calling the BIO's bi_end_io() callback and returning the BIO to the upper layer before DM has a chance to retry it.

A simple solution might be to add a new hook in __end_that_request_first(), where the new hook is called before a BIO is completed. DM would then be responsible for completing the BIO when it was ready. However, it just imposes additional complexity on DM, because DM needs to split its completion handler into error checking and retrying, as the latter still has to wait for end_that_request_last(). It is nothing more than a workaround for the lack of request stacking.

True request stacking

To solve the problem, a redesign of the block layer request completion interfaces, so that they function like BIO stacking, is necessary. To accomplish this, Request-based DM implements the following:

• Provide a generic completion handler for requests which are not using their end_io() in the current code. __make_request() will set the default handler.

• Modify existing end_io() users to handle the new behavior.

• Provide a new helper function for device drivers to complete a request. The function eventually calls the request's end_io() function. Device drivers have to be modified to call the new helper function for request completion (end_that_request_*() are no longer called from device drivers). If the helper function returns 0, device drivers can assume the request is completed. If the helper function returns 1, device drivers can assume the request is not completed and can take actions in response.
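From a device driver's point of view, the proposed helper would be used roughly as follows. This is a sketch: my_end_request() is a stand-in name for the new helper described above, not an existing kernel symbol:

    /* driver completion path under the proposed interface (sketch) */
    static void my_io_done(struct request *rq, int uptodate,
                           unsigned int nr_sectors)
    {
            /* the helper runs the request's end_io() hook, so a stacked
             * driver such as Request-based DM can intercept completion
             * and, for example, retry on another path */
            if (my_end_request(rq, uptodate, nr_sectors)) {
                    /* returned 1: the request is not fully completed
                     * yet; the driver keeps working on it */
                    return;
            }
            /* returned 0: the request is done and ownership has been
             * passed back to the block layer */
    }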

3.4 User-space DM interface

dm-multipath has a wide range of tools like multipath-tools and installers for major distributions. To minimize headaches caused by changing the multipath infrastructure, not only did Request-based multipath hook into DM, but it also reused the BIO-based dm-multipath target. As a result, the only change to the user-space DM interface is a new flag for the DM device creation ioctl. The existence of the flag is checked at the ioctl handler and, if it is turned on, the device will be set up for Request-based DM.

[Figure 7: Effects of frequent path changes on I/O merging. Total number of dispatched requests (smaller is better) for reads and writes as rr_min_io varies from 1 to 1000, comparing BIO-based DM with Request-based DM. Command: dd if=/dev/ of=/dev/ bs=16777216 count=8]

4 Performance testing

One of the main purposes of Request-based dm-multipath is to reduce the total number of requests dispatched to the underlying devices, even when path changes occur frequently.

In this section, the performance effects of I/O merging, and how Request-based dm-multipath reduces the number of requests under frequent path change, are shown. The test environment used for the measurements is shown in Table 1.

             CPU      Intel Xeon 1.60 GHz
    Host     Memory   2 GB
             FC HBA   Emulex LPe1150-F4 x 2
    Storage  Port     4 Gbps x 2
             Cache    4 GB
    Switch   Port     2 Gbps

Table 1: Test environment

4.1 Effects of I/O merging

Sequential I/O results are shown in Figure 6. The throughput for a fixed amount of reads and writes on a local block device was measured using the dd command while changing the queue's max_sectors parameter. In the test setup, the Emulex driver's max segment size and max segment count are set high enough that max_sectors controls the request size. It is expected that as the value of max_sectors becomes smaller, the number of dispatched requests becomes larger. The results in Figure 6 appear to confirm this, and indicate that I/O merging, or at least larger I/Os, is important to achieve higher throughput.

4.2 Effects of Request-based dm-multipath on I/O merging

While the results shown in the previous section were obtained by artificially reducing the max_sectors parameter, such situations can happen when frequent path changes occur in BIO-based dm-multipath.

Testing results for the same sequential I/O pattern on a dm-multipath device, when changing the round-robin path selector's rr_min_io parameter (which corresponds to the frequency of path change), are shown in Figure 7.

This data shows that, when path change occurs frequently, the total number of requests increases with BIO-based dm-multipath, while under the same conditions the number of requests is low and stable with Request-based dm-multipath.

4.3 Throughput of Request-based dm-multipath

At this point in time, Request-based dm-multipath still cannot supersede BIO-based dm-multipath in sequential read performance with a simple round-robin path selector. The performance problem is currently under investigation.

5 Dynamic load balancing

The benefit of moving the multipath layer below the I/O scheduler is not only the efficiency of I/O merging. This section reviews the other important feature of Request-based dm-multipath: dynamic load balancing.

5.1 Needs of dynamic load balancing

There are two load balancing types in general: static and dynamic. dm-multipath currently supports a static balancer with weighted round-robin. It may be sufficient in an ideal environment where all paths and storage controllers are symmetric and private to the host system. But in an environment where the load of each path is not the same, or changes dynamically, round-robin does not work well.

For example, suppose there is a multipath configuration as described in Figure 8, and Path0(sda) is being used heavily for multipath device0. It means Cable0 is heavily loaded. In this situation, multipath device1 should use Path3(sdd) to avoid the heavily loaded Cable0. However, the round-robin path selector does not care about that, and will select Path2(sdb) at the next path selection for multipath device1. This will cause congestion on Cable0.

[Figure 8: Examples of targets sharing cables. Host0 reaches LUN0 via multipath device0 and LUN1 via multipath device1, over four paths (Path0–Path3, devices sda–sdd) spread across two HBAs (HBA0, HBA1), which connect through Cable0 and Cable1 to storage ports Port0 and Port1.]

To get better performance even in such environments, a dynamic load balancer is needed.

5.2 Load parameters

It is important to define good metrics to model the load of a path. Below is an example of parameters which determine the load of a path:

• Number of in-flight I/Os;

• Block size of in-flight I/Os;

• Recent throughput;

• Recent latency; and

• Hardware-specific parameters.

Using information such as the number of in-flight I/Os on a path would be the simplest way to gauge traffic. For BIO-based DM, as described in Section 2.2, the unit of I/O is the BIO, but BIO counters do not take merges into account. Request-based DM, on the other hand, is better suited for this type of measurement. Its use of requests allows it to measure traffic in the same units of I/O that are used by lower layers, so it is possible for Request-based DM to take into account a lower layer's limitations like HBA, device, or transport queue depths.

Throughput or latency is another valuable metric. Many lower-layer limits are not dynamic, and even if they were, they could not completely account for every bottleneck in the topology. If we used throughput as the load of a path, path selection could be done with the following steps, illustrated by the sketch after this list:

1. Track the block size of in-flight I/Os on each path ... in-flight size.

2. At path selection time, calculate the recent throughput of each path by using generic diskstats (sectors and io_ticks) ... recent throughput.

3. For each path, calculate the time in which all in-flight I/Os will finish, by using (in-flight size) / (recent throughput).

4. Select the path for which the time calculated in Step 3 is the shortest.
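In code, steps 1 through 4 could look roughly like the sketch below; the path and statistics structures are illustrative only:

    struct path_info {
            unsigned long long inflight_sectors;   /* step 1: tracked
                                                    * per path */
            unsigned long long recent_throughput;  /* step 2: sectors per
                                                    * tick, from diskstats */
    };

    /* steps 3 and 4: pick the path whose in-flight I/O drains soonest */
    static struct path_info *select_path(struct path_info *paths, int nr)
    {
            struct path_info *best = NULL;
            unsigned long long best_eta = ~0ULL;
            int i;

            for (i = 0; i < nr; i++) {
                    /* guard against a zero throughput sample */
                    unsigned long long tput =
                            paths[i].recent_throughput ?
                            paths[i].recent_throughput : 1;
                    unsigned long long eta =
                            paths[i].inflight_sectors / tput;

                    if (eta < best_eta) {
                            best_eta = eta;
                            best = &paths[i];
                    }
            }
            return best;
    }

In the Figure 8 scenario, a path sharing Cable0 with the busy Path0 would show a long drain time and lose to Path3.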

We plan to implement such dynamic load balancers after resolving the performance problems of Section 4.3.

6 Conclusion

Request-based multipath has some potential improvements over current dm-multipath. The paper focused on I/O merging, which affects load balancing, and confirmed that the code works correctly.

There is a lot of work to be done to modify the block layer so that it can efficiently and elegantly handle routing requests. And there are a lot of interesting directions the path selection modules can take, because the multipath layer is now working in the same units of I/O that the storage device and LLD are.

References

[1] Edward Goggin, Linux Multipathing, Proceedings of the Linux Symposium, 2005.

[2] Qlogic, Qlogic QL4xxx QL2xxx Driver README, http://www.qlogic.com.

[3] Mike Anderson, SCSI Mid-Level Multipath, Proceedings of the Linux Symposium, 2003.

[4] , Notes on the Generic Block Layer Rewrite in Linux 2.5, Documentation/block/biodoc.txt, 2007.

[5] Mike Christie, [PATCH RFC] block layer (request based) multipath, http://lwn.net/Articles/156058/.

[6] Jonathan Corbet, Driver porting: low-level memory allocation, http://lwn.net/Articles/22909/.

Short-term solution for 3G networks in Linux: umtsmon

Klaas van Gend, MontaVista Software, Inc., [email protected]

Abstract

When not at home and in need of Internet access, one can search for an open Wifi access point. But if there is none available, you'll have to set up a connection through a mobile network—GPRS, EDGE, UMTS, HSDPA, or WCDMA networks. For laptops, there are special PCMCIA cards that can do that; most mid- or high-end mobile phones can do this through USB or Bluetooth as well. To manage such a connection, one needs special software—it works differently from a regular phone dial-up, Wifi, or Ethernet connection.

umtsmon was created to address this need. The author wanted to have a simple tool that works for all current Linux distributions. umtsmon is not the final mobile network manager application—at some point, the functionality of umtsmon will have to be integrated into Network Manager and its GUI apps. But umtsmon will definitely serve as a playground to find out what users really want, so only the really used features will be implemented the right way into Network Manager. Also: umtsmon is available now as a simple download-and-run—it is usable for existing Linux users. The integrated Network Manager will only help future distributions.

In this talk, Klaas will discuss mobile networks in general, how the various brands of pccards work, and what umtsmon can and cannot do yet.

1 Short history of 3G networks

For Western European consumers, mobile communication networks started in the early 80s. Back then, most countries had their own analog standards, and equipment was heavy—mainly due to the batteries and antennas. It was also rather easy to listen in on conversations, and it was expensive.

The organisation GSM, Groupe Spécial Mobile, was started in 1982 by the joint European telecom operators to address the issues. By the end of the eighties, the GSM standard was close to finished and was transferred to a standardization body, ETSI. The new standard used digital transmission, performed frequency hopping, and prevented tapping of conversations.

The first network to go live was in Finland in 1991; by the end of 1993, networks were operational in 48 countries with a total of one million subscribers.

The popularity of GSM surprised many—by now every person in the Netherlands (including newborns and the elderly) owns more than one mobile handset. This unexpected surge in users caused telecom operators to start hunting for expansion of their networks—they feared overloading of their networks.

Also, GSM features a direct tunnel from handset to the local cell antenna, and users are charged for the duration of the connection. This is less useful for data connections, so an extension to the GSM standard was made: GPRS. This allowed for a packet-switching-based connection, therefore not requiring a tunnel. Charging per amount of data was possible. However, GSM/GPRS only allows for small bandwidths—typically comparable to old analog modem speeds—at most 28.8 kbit/s if you are lucky.

UMTS was created to address these two needs—relieving the load of the GSM networks and addressing higher-bandwidth data communication. Different radio technology required new antennas, thus requiring new radio frequencies and equipment. Governments across Europe saw their chance and auctioned the new radio frequencies for astronomic sums of money. Several telecom operators nearly went bankrupt because they were forced to buy into expensive frequencies in several countries.

A deafening silence followed.

UMTS requires a lot of antennas. To get decent coverage, far more antennas are required for UMTS than for a normal GSM network. The year 2004 through the first half of 2006 saw a heated debate on the dangers of radio transmitters in cell phones and antennas to public health. All kinds of research either suggested that one would get sick from living close to an antenna, or would get a heated brain from using a cell phone. Also, some people advised against wearing a cell phone in one's trouser pockets as it might reduce virility (?!). This made several local municipalities refuse UMTS antennas within their territory. These "white spots" in UMTS coverage might endanger the timely roll-out of the UMTS networks. . .

So, telecom operators invested in frequencies and invested in equipment. And just some Internet junkies bought into it and bought PCCards for laptops or expensive phones that used the new network. At the same time, most operators added new GSM antennas to their new UMTS stations. By creating more cells, the average number of users in a cell decreased—the GSM technology was saved from overload.

PCCards and especially Blackberries were the first killer apps for UMTS—these devices use UMTS (if available) to get their users access to their e-mail. The reduction of the rates helped adoption of UMTS, but still many consider it too expensive. Nowadays, most telecom operators are at break-even for their 3G networks—when not counting in the payment of the original radio licenses.

By now, telecom operators see the danger of competing technologies like WiMax and Wifi. So they are moving their UMTS networks forward to HSPA (HSDPA and HSUPA), which requires yet new devices.

2 Analysis of a PCMCIA UMTS card

All telecom operators sell PCCards for laptops. However, all are OEM products, manufactured by small specialised companies like Option (Belgium), Novatel Wireless (USA), 3GSystems (Germany), and Sierra Wireless (USA).

[Figure 1: a PCMCIA card and its contents]

The first generation of UMTS cards, as pioneered by Option, have an interesting hardware architecture, as depicted in Figure 1. If one inserts the card into a laptop running Linux, Linux will recognize a new USB interface. Three (or four) usbserial devices are connected (via a hub) to that USB interface. Standard Linux kernels do not recognize the VendorID/ProductID for usbserial, but if the driver is forced to load by hand, three new serial ports appear as /dev/ttyUSBx.

Most other vendors have cards with a similar design—this is mainly due to the fact that there are very few manufacturers of chipsets for UMTS data. In most cases, this is QualComm.

QualComm decided to go a different route for the HSDPA cards: no longer serial-to-USB interfaces, but specialised connections that require a specialised driver. With help from other people, Paul Hardwick from Option ported the existing Windows driver to Linux. This driver is now known as the Nozomi driver. Its path to inclusion in the Linux kernel is interesting—of course it was rejected for inclusion because it was not written the "Linux way." With help from Greg Kroah-Hartman, the driver is taking shape now.
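The "forced to load by hand" step mentioned above boils down to passing the card's IDs to the generic usbserial driver, along these lines (0x0af0 is Option's USB vendor ID; the product ID shown is just an example, as each card has its own):

    modprobe usbserial vendor=0x0af0 product=0x5000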

3 Analysis of a "Zero CD" USB UMTS brick

To reduce packaging costs, some vendors now ship a technology called "ZeroCD." It was pioneered by Option in their ICON USB box (see Figure 2), but there are also PCCards available. Essential is that the device boots up as a USB Mass Storage device. For Windows users, an autorun.inf file will automatically take care of installing the software. Afterwards, the driver will send a magic string to the USB hub; the USB hub then disconnects the Mass Storage device and connects the modem ports, either through USBSerial or the above-mentioned Nozomi. For this type of device, vendors do not need to ship installation CDs. For users, the installation is simplified: just plugging it in is enough.

[Figure 2: a "ZeroCD" USB box and its contents]

The major disadvantage of the current ZeroCD chipset is the power consumption—it requires two USB plugs because one USB plug can only carry 2.5W. . . Be careful and remove the device if your laptop runs on battery power!

4 Back to the AT commands era

Whether the card supports serial or nozomi, after loading the right driver you will run into serial ports on /dev/ttyUSBx, /dev/ttyACMx, or /dev/nozomiX.

Adventurous Penguins immediately type minicom /dev/ttyUSB0 to see what's on the serial devices. Hopefully, they are old enough to remember the good old Hayes-compatible modem era, where one could play with AT commands and PPP settings for ages before the first connection to the outside world was successful. Actually, they need to be—because that's exactly what you get: all three interfaces are connected to a UMTS modem with an AT command set. However, the standardization committee 3GPP has standardized the AT commands for 3G network modems, so there are standardized commands to talk to the modem to enter PIN codes, select the network of a mobile operator, send an SMS, and such. There are multiple interfaces to the modem, to allow one interface to be used by PPP for the actual network connection, whereas one can use another interface to send AT commands to retrieve the status of the modem, network, or SIM card.

A few examples of the new AT commands:

• AT+COPS=? will (after 30 seconds) return a list of all mobile networks that are available, with info on which networks you are allowed to connect to.

• AT+CPIN="1234" enters a PIN code for the smart card.

• AT+CSQ returns the signal strength of the mobile connection.

So we're stuck with AT interfaces. Let's re-install ppp and go back to the old days. Or install umtsmon—which will interact with the modem and do all the AT commands for you.

6 umtsmon system design—dependencies

As stated before, umtsmon was designed to work on as wide a range of Linuxes as possible. This is why we attempt to make as few assumptions about the operating system as possible. Yet we have to rely on a few packages to be present:

• PPP
We need PPP to make the actual network connection and change routing and such.

• QT3
The QT library is needed for the UI. umtsmon cannot run without it. QT in itself also has a few dependencies, like X.

For versions of umtsmon beyond 0.6, we probably will add a few more requirements:

• pcmcia-utils
This one obviously is not necessary for the ICON and other USB-only devices. We want to use pcmcia-utils to enable users to reset the card (pccard eject and pccard insert) and/or to simplify the autodetection code.

• libusb
At this moment, a lot of the autodetection is done by browsing through the /sys filesystem. This is rather complex to code consistently for all Linux kernel revisions. Using libusb should solve that problem for us and again simplify the autodetection code. It might be wise to move the libusb-dependent code into a shared library that is dlopen()ed at runtime to prevent umtsmon from trying to run if libusb isn't present or if it is too old to work.

• icon_switch
This is a small utility that switches the ICON box from mass storage to modem operation. Unfortunately, it is rather unreliable and it needs extensions for the other ZeroCD devices that have different USB IDs and may require different code sequences.

umtsmon at this moment requires PPP to be SUID—umtsmon calls pppd directly with arguments controlling the connection. umtsmon will complain to the user if PPP is not set SUID and ask the user if it is allowed to fix it. This is a security hole: in theory people could start writing malicious dialers dialing expensive foreign numbers. This should be addressed in a future revision by having umtsmon create profiles in the /etc/ppp/peers/ directory.

7 umtsmon software design

Internally, umtsmon is written to follow the MVC design pattern. MVC means Model View Controller. Basically, this comes down to a separation of concerns—a class should only contain code that represents data (=model), changes data (=controller), or displays it (=view).

Central in the umtsmon 0.6 design are the classes Query, SerialPort, ConnectionInfo, and PPPConnection. Any AT command sequence that is to be sent to the card is represented as a Query instance. The Query class also contains rudimentary parsing and will strip off echoes and such. Query connects to a SerialPort instance for the actual communication.

The following paragraphs discuss some details of the design of umtsmon.

7.1 PPPConnection

PPPConnection is the beating heart of the application. As can be seen in Figure 3, the main GUI class MainWindow subscribes one of its attributes, the MainWindowPPPObserver, to receive any state changes of the PPP daemon. If someone outside umtsmon, or umtsmon itself, then starts the ppp daemon to make a connection, the PPPConnection class will call all its attached Observers to notify them of the state changes. MainWindow responds to that by enabling/disabling buttons and menu items.

At this moment, only MainWindow is subscribed to receive the PPP state changes. This will change in the near future when we start talking about the NetworkManager integration—that will require another Observer to the PPP state.

7.2 (Inhibiting) ConnectionInfo

ConnectionInfo regularly polls the card to ask for the mobile operator, signal strength, and such. On some occasions, ConnectionInfo must be prevented from sending out Queries, like during the PPP connection setup or whilst AT+COPS=? (see Section 4) is running. In such cases, the PPPConnection or NetworkChanger class just creates a ConnectionInfoInhibitor instance. Creation of the Inhibitor instance will increase a counter inside ConnectionInfo; upon destruction, the counter will automatically be decreased.
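umtsmon itself implements this in C++ with QT, but the inhibition mechanism is easy to sketch in plain C; all names below are illustrative, not umtsmon's real API.

/*
 * The inhibition idea from Section 7.2 in plain C.  Pollers check the
 * counter before sending a Query; creating an inhibitor bumps it,
 * destroying one drops it again.
 */
struct connection_info {
	int inhibition_count;
	/* ... cached operator name, signal quality, ... */
};

static void inhibitor_create(struct connection_info *ci)
{
	ci->inhibition_count++;   /* e.g. while pppd is being started */
}

static void inhibitor_destroy(struct connection_info *ci)
{
	ci->inhibition_count--;
}

static int may_poll(const struct connection_info *ci)
{
	return ci->inhibition_count == 0;
}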

Figure 3: Class Diagram of PPPConnection and interacting classes

Figure 4: Class Diagram of ConnectionInfo and interacting classes

Figure 5: Class Diagram of Profile and interacting classes

PPPConnection uses an instance of the Runner class to manage the execution of /usr/sbin/pppd. This is shown in Figure 4. In the case of single serial port cards, the AT commands and the PPP data stream need to be sent over the same serial port. In that case, ConnectionInfo never can run when a PPP connection exists.

7.3 Profile Management

Refer to Figure 5 for a class diagram. Once the user clicks on the connect button in MainWindow, PPPConnection gets called with a reference to a Profile. PPPConnection will use the info inside the Profile class to set up the connection. To change a profile, the user selects Profile Management in the menu in the GUI. The ProfileDialog will be instantiated and filled with the data from the active Profile. Users then can choose to create another profile, change the current one, etc.

The data is stored to disk in a key=data formatted ~/.umtsmon/umtsmonrc file. The QSettings class takes care of that. Because settings need to be accessed throughout the program, possibly even before main() is started, settings are always retrieved through a Singleton pattern class. This also solves the issue that a QSettings class only saves its data upon destruction. Calling makeChangesPersistent() will thus cause the QSettings instance to be destroyed.

8 NetworkManager integration and/or takeover

High on the wish list is to integrate at least a little with NetworkManager. The big annoyance is that at the moment, even if a UMTS connection is made, software like Gaim and Firefox will refuse to connect because according to them there is no connection. Apparently they use libnm to ask NetworkManager whether the system is on-line or not. As stated before, umtsmon should run regardless of NetworkManager's presence. However, this is only loose coupling—we're not discussing adding UMTS support to NetworkManager yet. In the end, UMTS connections should be just another item that is implemented in NetworkManager and its GUIs. We currently view umtsmon as a playing ground for that. What features are actually used? How to implement stuff? Where to put security constraints? It all is easier to do in a small program than in the collection of binaries that makes up NetworkManager. Yet NetworkManager is the future—it is the most convenient way for users to manage yet another networking connection. It remains to be seen if the current umtsmon team will actually do the NetworkManager implementation.

Brand             Model         Type                                  Remarks
Sony Ericsson     GC79          single-port serial                    GPRS and EDGE only
Option            GT GPRS EDGE  three or four port usb2serial         need kernel module usbserial.ko with
Option            3G Quad                                             parameters, or the specialised option
                                                                      kernel module
Huawei            E612          nozomi                                need nozomi kernel module
Option            ICON          external usb box                      ZeroCD chipset, need switching
3GSystems         XSPlug3
Sierra Wireless   7xx series    PCMCIA serial modem                   single serial port
Novatel           U630/U530
Novatel           XU870         dual serial port, only first usable   first Expresscard to be supported
Kyocera           KPC650        dual serial port                      serial ports don't communicate
various mobile    various       connect through either serial,        handled as a single serial port card
phones                          USB or Bluetooth

Table 1: Hardware support of umtsmon

9 Device, Commercial and Distribution support

At this moment, umtsmon supports a wide variety of hardware. The 0.6 release will support devices from Novatel, 3G Systems, Sony Ericsson, Option, Sierra Wireless, and Huawei. Also mobile phones that have a laptop connection through USB, Bluetooth, or serial can be supported, with a few limitations. See Table 1 for a more complete list.

None of the device vendors is cooperating with the development, however. The development team is currently investigating creating a fund to buy all available hardware and distribute it amongst the developers to ensure that all hardware is supported and remains operational.

Network operator T-Mobile Germany sponsored the development by providing a laptop and devices to one of the developers, who happened to also have done an internship on UMTS on Linux for T-Mobile. The internship resulted in a lot of improvements to umtsmon—thanks, Christofer!

At the moment of writing this paper, umtsmon is available as a standard package in OpenSuse (starting with umtsmon 0.3 in OpenSuse 10.2) and Gentoo (starting with umtsmon 0.5, currently in ~amd64 and ~x86 only).

10 Conclusions

• Support for UMTS cards is in the same position as hardware enablement projects like ALSA were a few years ago. Several devices work—mostly the devices owned by the developers. Manufacturers don't see the need to cooperate yet, nor to fund development.

• All known 3G mobile devices implement serial interfaces, either directly or through drivers.

• UMTS is just another radio technology reusing existing communication standards: AT commands.

• umtsmon was created to handle the AT command exchange for the user and start the PPP daemon.

• umtsmon is not really integrated into Linux—it's a standalone application.

• NetworkManager integration should start from scratch, possibly inheriting the GSM multiplexing daemon from OpenMoko to solve the single serial port problem correctly.

• Writing a paper for OLS is a good stimulus for coders to finally write down some parts of their software design.

11 Links

The umtsmon website: http://umtsmon.sourceforge.net/

PharScape (HOWTOs and support forum for all Option based cards and the Nozomi drivers): http://www.pharscape.org/

The 3GPP approved AT command set:
http://www.3gpp.org/ftp/Specs/latest/Rel-7/27_series/

The GFS2 Filesystem

Steven Whitehouse Red Hat, Inc. [email protected]

Abstract

The GFS2 filesystem is a symmetric cluster filesystem designed to provide a high performance means of sharing a filesystem between nodes. This paper will give an overview of GFS2's main subsystems, features and differences from GFS1 before considering more recent developments in GFS2 such as the new on-disk layout of journaled files, the GFS2 metadata filesystem, and what can be done with it, fast & fuzzy statfs, optimisations of readdir/getdents64 and optimisations of glocks (cluster locking). Finally, some possible future developments will be outlined.

To get the most from this talk you will need a good background in the basics of Linux filesystem internals and clustering concepts such as quorum and distributed locking.

1 Introduction

The GFS2 filesystem is a 64bit, symmetric cluster filesystem which is derived from the earlier GFS filesystem. It is primarily designed for Storage Area Network (SAN) applications in which each node in a GFS2 cluster has equal access to the storage. In GFS and GFS2 there is no such concept as a metadata server; all nodes run identical software and any node can potentially perform the same functions as any other node in the cluster.

In order to limit access to areas of the storage to maintain filesystem integrity, a lock manager is used. In GFS2 this is a distributed lock manager (DLM) [1] based upon the VAX DLM API. The Red Hat Cluster Suite provides the underlying cluster services (quorum, fencing) upon which the DLM and GFS2 depend.

It is also possible to use GFS2 as a local filesystem with the lock_nolock lock manager instead of the DLM. The locking subsystem is modular and is thus easily substituted in case of a future need of a more specialised lock manager.

2 Historical Detail

The original GFS [6] filesystem was developed by Matt O'Keefe's research group in the University of Minnesota. It used SCSI reservations to control access to the storage and ran on SGI's IRIX.

Later versions of GFS [5] were ported to Linux, mainly because the group found there was considerable advantage during development due to the easy availability of the source code. The locking subsystem was developed to give finer grained locking, initially by the use of special firmware in the disk drives (and eventually, also RAID controllers) which was intended to become a SCSI standard called dmep. There was also a network based version of dmep called memexp. Both of these standards worked on the basis of atomically updated areas of memory based upon a "compare and exchange" operation.

Later, when it was found that most people preferred the network based locking manager, the Grand Unified Locking Manager, gulm, was created, improving the performance over the original memexp based locking. This was the default locking manager for GFS until the DLM (see [1]) was written by Patrick Caulfield and Dave Teigland.

Sistina Software Inc. was set up by Matt O'Keefe and began to exploit GFS commercially in late 1999/early 2000. Ken Preslan was the chief architect of that version of GFS (see [5]) as well as the version which forms Red Hat's current product. Red Hat acquired Sistina Software Inc. in late 2003 and integrated the GFS filesystem into its existing product lines.

During the development and subsequent deployment of the GFS filesystem, a number of lessons were learned about where the performance and administrative problems occur.
As a result, in early 2005 the GFS2 filesystem was designed and written, initially by Ken Preslan and more recently by the author, to improve upon the original design of GFS.

The GFS2 filesystem was submitted for inclusion in Linus' kernel and after a lengthy period of code review and modification, was accepted into 2.6.16.

Bit Pattern   Block State
00            Free
01            Allocated non-inode block
10            Unlinked (still allocated) inode
11            Allocated inode

Table 1: GFS2 Resource Group bitmap states

3 The on-disk format

The on-disk format of GFS2 has, intentionally, stayed very much the same as that of GFS. The filesystem is big-endian on disk and most of the major structures have stayed compatible in terms of offsets of the fields common to both versions, which is most of them, in fact.

It is thus possible to perform an in-place upgrade of GFS to GFS2. When a few extra blocks are required for some of the per node files (see the metafs filesystem, Subsection 3.5) these can be found by shrinking the areas of the disk originally allocated to journals in GFS. As a result, even a full GFS filesystem can be upgraded to GFS2 without needing the addition of further storage.

3.1 The superblock

GFS2's superblock is offset from the start of the disk by 64k of unused space. The reason for this is entirely historical in that in the dim and distant past, Linux used to read the first few sectors of the disk in the VFS mount code before control had passed to a filesystem. As a result, this data was being cached by the Linux buffer cache without any cluster locking. More recent versions of GFS were able to get around this by invalidating these sectors at mount time, and more recently still, the need for this gap has gone away entirely. It is retained only for backward compatibility reasons.

3.2 Resource groups

Following the superblock are a number of resource groups. These are similar to ext2/3 block groups in that their intent is to divide the disk into areas which helps to group together similar allocations. Additionally in GFS2, the resource groups allow parallel allocation from different nodes simultaneously, as the locking granularity is one lock per resource group.

On-disk, each resource group consists of a header block with some summary information, followed by a number of blocks containing the allocation bitmaps. There are two bits in the bitmap for each block in the resource group. This is followed by the blocks for which the resource group controls the allocation.

The two bits are nominally allocated/free and data (non-inode)/inode, with the exception that the free inode state is used to indicate inodes which are unlinked, but still open.

In GFS2 all metadata blocks start with a common header which includes fields indicating the type of the metadata block for ease of parsing, and these are also used extensively in checking for run-time errors.

Each resource group has a set of flags associated with it which are intended to be used in the future as part of a system to allow in-place upgrade of the filesystem. It is possible to mark resource groups such that they will no longer be used for allocations. This is the first part of a plan that will allow migration of the content of a resource group to eventually allow filesystem shrink and similar features.
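As a rough illustration of the bitmap states in Table 1, the following sketch decodes the two-bit state of a given block. The packing order within a byte is an assumption made for illustration, not necessarily GFS2's actual on-disk ordering.

/*
 * Sketch: decode a two-bits-per-block allocation bitmap (Table 1).
 * Bit packing within the byte is assumed, not taken from GFS2.
 */
#define BLK_FREE     0x0  /* 00 */
#define BLK_USED     0x1  /* 01: allocated non-inode block */
#define BLK_UNLINKED 0x2  /* 10: unlinked (still allocated) inode */
#define BLK_INODE    0x3  /* 11: allocated inode */

static unsigned int block_state(const unsigned char *bitmap,
				unsigned int block)
{
	unsigned int byte = block / 4;   /* four 2-bit states per byte */
	unsigned int shift = (block % 4) * 2;

	return (bitmap[byte] >> shift) & 0x3;
}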

3.3 Inodes

GFS2's inodes have retained a very similar form to those of GFS in that each one spans an entire filesystem block, with the remainder of the block being filled either with data (a "stuffed" inode) or with the first set of pointers in the metadata tree.

GFS2 has also inherited GFS's equal height metadata tree. This was designed to provide constant time access to the different areas of the file. Filesystems such as ext3, for example, have different depths of indirect pointers according to the file offset, whereas in GFS2 the tree is constant in depth no matter what the file offset is.

Initially the tree is formed by the pointers which can be fitted into the spare space in the inode block, and it is then grown by adding another layer to the tree whenever the current tree size proves to be insufficient.

Like all the other metadata blocks in GFS2, the indirect pointer blocks also have the common metadata header. This unfortunately also means that the number of pointers they contain is no longer an integer power of two. This, again, was to keep compatibility with GFS, and in the future we eventually intend to move to an extent based system rather than change the number of pointers in the indirect blocks.

3.3.1 Attributes

GFS2 supports the standard get/change attributes ioctl() used by ext2/3 and many other Linux filesystems. This allows setting or querying the attributes listed in Table 2.

As a result GFS2 is directly supported by the lsattr(1) and chattr(1) commands. The hashed directory flag, I, indicates whether a directory is hashed or not. All directories which have grown beyond a certain size are hashed, and Section 3.4 gives further details.

3.3.2 Extended Attributes & ACLs

GFS2 supports extended attribute types user, system and security. It is therefore possible to run selinux on a GFS2 filesystem.

GFS2 also supports POSIX ACLs.

3.4 Directories

GFS2's directories are based upon the paper "Extendible Hashing" by Fagin [3]. Using this scheme GFS2 has a fast directory lookup time for individual file names which scales to very large directories. Before ext3 gained hashed directories, this was the single most common reason for using GFS as a single node filesystem.

When a new GFS2 directory is created, it is "stuffed," in other words the directory entries are pushed into the same disk block as the inode. Each entry is similar to an ext3 directory entry in that it consists of a fixed length part followed by a variable length part containing the file name. The fixed length part contains fields to indicate the total length of the entry and the offset to the next entry.

Once enough entries have been added that it's no longer possible to fit them all in the directory block itself, the directory is turned into a hashed directory. In this case, the hash table takes the place of the directory entries in the directory block and the entries are moved into a directory "leaf" block.

In the first instance, the hash table size is chosen to be half the size of the inode disk block. This allows it to coexist with the inode in that block. Each entry in the hash table is a pointer to a leaf block which contains a number of directory entries. Initially, all the pointers in the hash table point to the same leaf block. When that leaf block fills up, half the pointers are changed to point to a new block and the existing directory entries are moved to the new leaf block, or left in the existing one, according to their respective hash values.

Eventually, all the pointers will point to different blocks, assuming that the hash function (in this case a CRC-32) has resulted in a reasonably even distribution of directory entries. At this point the directory hash table is removed from the inode block and written into what would be the data blocks of a regular file. This allows the doubling in size of the hash table which then occurs each time all the pointers are exhausted.

Eventually, when the directory hash table has reached a maximum size, further entries are added by chaining leaf blocks to the existing directory leaf blocks.

As a result, for all but the largest directories, a single hash lookup results in reading the directory block which contains the required entry.

Things are a bit more complicated when it comes to the readdir function, as this requires that the entries in each hash chain are sorted according to their hash value (which is also used as the file position for lseek) in order to avoid the problem of seeing entries twice, or missing them entirely, in case a directory is expanded during a set of repeated calls to readdir. This is discussed further in the section on future developments.
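The lookup side of this extendible-hashing scheme can be sketched as follows; the structure and names are illustrative rather than GFS2's real on-disk or in-memory layout.

/*
 * Sketch of an extendible-hashing directory lookup.  The top bits of
 * the CRC-32 of the name select a slot; several slots may point at
 * the same leaf block until that leaf is split.
 */
#include <stdint.h>

struct dir_hash {
	uint64_t *leaf_blocks;  /* one entry per hash-table slot */
	unsigned int depth;     /* table has 1 << depth slots */
};

static uint64_t lookup_leaf(const struct dir_hash *dh, uint32_t name_hash)
{
	uint32_t slot = name_hash >> (32 - dh->depth);

	return dh->leaf_blocks[slot];
}

Doubling the table when all pointers are exhausted simply increments depth and duplicates each slot, which is why the table can be rebuilt in place in what would be the data blocks of a regular file.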

Attribute      Symbol   Get or Set
Append Only    a        Get and set on regular inodes
Immutable      i        Get and set on regular inodes
Journaling     j        Set on regular files, get on all inodes
No atime       A        Get and set on all inodes
Sync Updates   S        Get and set on regular files
Hashed dir     I        Get on directories only

Table 2: GFS2 Attributes

3.5 The metadata filesystem

There are a number of special files created by mkfs.gfs2 which are used to store additional metadata related to the filesystem. These are accessible by mounting the gfs2meta filesystem specifying a suitable gfs2 filesystem. Normally users would not do this operation directly since it is done by the GFS2 tools as and when required.

Under the root directory of the metadata filesystem (called the master directory in order that it is not confused with the real root directory) are a number of files and directories. The most important of these is the resource index (rindex) whose fixed-size entries list the disk locations of the resource groups.

3.5.1 Journals

Below the master directory there is a subdirectory which contains all the journals belonging to the different nodes of a GFS2 filesystem. The maximum number of nodes which can mount the filesystem simultaneously is set by the number of journals in this subdirectory. New journals can be created simply by adding a suitably initialised file to this directory. This is done (along with the other adjustments required) by the gfs2_jadd tool.

3.5.2 Quota file

The quota file contains the system wide summary of all the quota information. This information is synced periodically, and also based on how close each user is to their actual quota allocation. This means that although it is possible for a user to exceed their allocated quota (by a maximum of two times), this is in practise extremely unlikely to occur. The time period over which syncs of quota take place is adjustable via sysfs.

3.5.3 statfs

The statfs files (there is a master one, and one in each per_node subdirectory) contain the information required to give a fast (although not 100% accurate) result for the statfs system call. For large filesystems mounted on a number of nodes, the conventional approach to statfs (i.e., iterating through all the resource groups) requires a lot of CPU time and can trigger a lot of I/O, making it rather inefficient. To avoid this, GFS2 by default uses these files to keep an approximation of the true figure which is periodically synced back up to the master file.

There is a sysfs interface to allow adjustment of the sync period, or alternatively to turn off the fast & fuzzy statfs and go back to the original 100% correct, but slower, implementation.

3.5.4 inum

These files are used to allocate the no_formal_ino part of GFS2's struct gfs2_inum structure. This is effectively a version number which is mostly used by NFS, although it is also present in the directory entry structure as well. The aim is to give each inode an additional number to make it unique over time. The master inum file is used to allocate ranges to each node, which are then replenished when they've been used up.

4 Locking

Whereas most filesystems define an on-disk format which has to be largely invariant and are then free to change their internal implementation as needs arise, GFS2 also has to specify its locking with the same degree of care as for the on-disk format to ensure future compatibility.

GFS2 internally divides its cluster locks (known as glocks) into several types, and within each type a 64 bit lock number identifies individual locks. A lock name is the concatenation of the glock type and glock number, and this is converted into an ASCII string to be passed to the DLM. The DLM refers to these locks as resources. Each resource is associated with a lock value block (LVB), which is a small area of memory which may be used to hold a few bytes of data relevant to that resource. Lock requests are sent to the DLM by GFS2 for each resource upon which GFS2 wants to acquire a lock.

Lock type   Use
Non-disk    mount/umount/recovery
Meta        The superblock
Inode       Inode metadata & data
Iopen       Inode last closer detection
Rgrp        Resource group metadata
Trans       Transaction lock
Flock       flock(2) syscall
Quota       Quota operations
Journal     Journal mutex

Table 3: GFS2 lock types

All holders of DLM locks may potentially receive callbacks from other intending holders of locks should the DLM receive a request for a lock on a particular resource with a conflicting mode. This is used to trigger an action such as writing back dirty data and/or invalidating pages in the page cache when an inode's lock is being requested by another node.

GFS2 uses three lock modes internally: exclusive, shared, and deferred. The deferred lock mode is effectively another shared lock mode which is incompatible with the normal shared lock mode. It is used to ensure that direct I/O is cluster coherent by forcing any cached pages for an inode to be disposed of on all nodes in the cluster before direct I/O commences. These are mapped to the DLM's lock modes (only three of the six modes are used) as shown in Table 4.

The DLM's DLM_LOCK_NL (Null) lock mode is used as a reference count on the resource to maintain the value of the LVB for that resource. Locks for which GFS2 doesn't maintain a reference count in this way (or are unlocked) may have the content of their LVBs set to zero upon the next use of that particular lock.
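As a sketch of the lock naming described above; the exact field widths and format string are assumptions for illustration, not GFS2's actual encoding:

/*
 * Sketch: build an ASCII lock name from glock type + 64-bit number,
 * as handed to the DLM.  Format is assumed, not GFS2's real one.
 */
#include <stdio.h>
#include <stdint.h>

static void make_lock_name(char *buf, size_t len,
			   unsigned int glock_type, uint64_t glock_num)
{
	snprintf(buf, len, "%08x%016llx", glock_type,
		 (unsigned long long)glock_num);
}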

GFS2 Lock Mode     DLM Lock Mode
LM_ST_EXCLUSIVE    DLM_LOCK_EX (exclusive)
LM_ST_SHARED       DLM_LOCK_PR (protected read)
LM_ST_DEFERRED     DLM_LOCK_CW (concurrent write)

Table 4: GFS2/DLM Lock modes

5 NFS

The GFS2 interface to NFS has been carefully designed to allow failover from one GFS2/NFS server to another, even if those GFS2/NFS servers have CPUs of a different endianness. In order to allow this, the filehandles must be constructed using the fsid= method. GFS2 will automatically convert endianness during the decoding of the filehandles.

6 Application writers' notes

In order to ensure the best possible performance of an application on GFS2, there are some basic principles which need to be followed. The advice given in this section can be considered a FAQ for application writers and system administrators of GFS2 filesystems.

There are two simple rules to follow:

• Make maximum use of caching

• Watch out for lock contention

When GFS2 performs an operation on an inode, it first has to gain the necessary locks, and since this potentially requires a journal flush and/or page cache invalidate on a remote node, this can be an expensive operation. As a result, for best performance in a cluster scenario it is vitally important to ensure that applications do not contend for locks for the same set of files wherever possible.

GFS2 uses one lock per inode, so directories may become points of contention in the case of large numbers of inserts and deletes occurring in the same directory from multiple nodes. This can rapidly degrade performance.

The single most common question asked relating to GFS2 performance is how to run an smtp/imap email server in an efficient manner. Ideally the spool directory is broken up into a number of subdirectories, each of which can be cached separately, resulting in fewer locks being bounced from node to node and less data being flushed when it does happen. It is also useful if the locality of the nodes to a particular set of directories can be enhanced using other methods (e.g. DNS) in the case of an email server which serves multiple virtual hosts.
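As a sketch of the spool-splitting advice above; the directory count and the hash function are arbitrary choices for illustration, not a GFS2 requirement:

/*
 * Sketch: hash each mailbox name into one of N spool subdirectories
 * so different nodes tend to touch different directory inodes.
 */
#include <stdio.h>

#define SPOOL_DIRS 64

static void spool_path(char *buf, size_t len, const char *user)
{
	unsigned int h = 0;
	const char *p;

	for (p = user; *p; p++)
		h = h * 31 + (unsigned char)*p;

	snprintf(buf, len, "/var/spool/mail/%02u/%s",
		 h % SPOOL_DIRS, user);
}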

6.1 fcntl(2) caveat

When using the fcntl(2) command F_GETLK, note that although the PID of the process will be returned in the l_pid field of the struct flock, the process blocking the lock may not be on the local node. There is currently no way to find out which node the lock blocking process is actually running on, unless the application defines its own method.

The various fcntl(2) operations are provided via the userspace gfs2_controld which relies upon openais for its communications layer rather than using the DLM. This system keeps a complete copy of the fcntl(2) lock state on each node, with new lock requests being passed around the cluster using a token passing protocol which is part of openais. This protocol ensures that each node will see the lock requests in the same order as every other node.

It is faster (for whole file locking) for applications to use flock(2) locks, which do use the DLM. In addition it is possible to disable the cluster fcntl(2) locks and make them local to each node, even in a cluster configuration, for higher performance. This is useful if you know that the application will only need to lock against processes local to the node.
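A minimal illustration of the F_GETLK caveat follows; the key point is that on GFS2 the returned l_pid cannot safely be treated as a local PID (for example, passed to kill(2)), since the blocking process may live on another node.

/*
 * Sketch: query who blocks a whole-file write lock.  On GFS2 the
 * reported PID may belong to a process on a remote node.
 */
#include <fcntl.h>
#include <stdio.h>

static void who_blocks(int fd)
{
	struct flock fl = {
		.l_type   = F_WRLCK,
		.l_whence = SEEK_SET,
		.l_start  = 0,
		.l_len    = 0,          /* whole file */
	};

	if (fcntl(fd, F_GETLK, &fl) == 0 && fl.l_type != F_UNLCK)
		printf("blocked by pid %d (possibly on a remote node)\n",
		       (int)fl.l_pid);
}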

6.2 Using the DLM from an application

The DLM is available through a userland interface in order that applications can take advantage of its cluster locking facility. Applications can open and use lockspaces which are independent of those used by GFS2.

7 Future Development

7.1 readdir

Currently we have already completed some work relating to speeding up readdir, and have also considered the way in which readdir is used in combination with other syscalls, such as stat.

There has also been some discussion (and more recently in a thread on lkml [2]) relating to the readdir interface to userspace (currently via the getdents64 syscall) and the other two interfaces to NFS via the struct export_operations. At the time of writing, there are no firm proposals to change any of these, but there are a number of issues with the current interface which might be solved with a suitable new interface. Such things include:

• Eliminating the sorting in GFS2's readdir for the NFS getname operation, where ordering is irrelevant.

• Boosting performance by returning more entries at once.

• Optionally returning stat information at the same time as the directory entry (or at least indicating the intent to call stat soon).

• Reducing the problem of lseek in directories with insert and delete of entries (does it result in seeing entries twice or not at all?).

7.2 inotify & dnotify

GFS2 does not support inotify, nor do we have any plans to support this feature. We would like to support dnotify if we are able to design a scheme which is both scalable and cluster coherent.

7.3 Performance

There are a number of ongoing investigations into various aspects of GFS2's performance with a view to gaining greater insight into where there is scope for further improvement. Currently we are focusing upon increasing the speed of file creations via open(2).

8 Resources

GFS2 is included in the Fedora Core 6 kernel (and above). To use GFS2 in Fedora Core 6, install the gfs2-utils and cman packages. The cman package is not required to use GFS2 as a local filesystem.

There are two GFS2 git trees available at kernel.org. Generally the one to look at is the -nmw (next merge window) tree [4], as that contains all the latest developments. This tree is also included in Andrew Morton's -mm tree. The -fixes git tree is used to send occasional fixes to Linus between merge windows and may not always be up-to-date.

The user tools are available from Red Hat's CVS at:
http://sources.redhat.com/cgi-bin/cvsweb.cgi/cluster/?cvsroot=cluster

References

[1] "DLM—Kernel Distributed Lock Manager," Patrick Caulfield, Minneapolis Cluster Summit 2004, http://sources.redhat.com/cluster/events/summit2004/presentations.html#mozTocId443696

[2] Linux Kernel Mailing List. Thread “If not readdir() then what?” started by Ulrich Drepper on Sat, 7 Apr 2007.

[3] “Extendible Hashing,” Fagin, et al., ACM Transactions on Database Systems, Sept., 1979.

[4] The GFS2 git tree (next merge window): git://git.kernel.org/pub/scm/linux/git/steve/gfs2-2.6-nmw.git

[5] "A 64-bit, Shared Disk Filesystem for Linux," Kenneth W. Preslan, et al., Proceedings of the Seventh NASA Goddard Conference on Mass Storage, San Diego, CA, March, 1999.

[6] "The Global File System," S. Soltis, T. Ruwart, and M. O'Keefe, Fifth NASA Goddard Conference on Mass Storage Systems and Technologies, College Park, MD, September, 1996.

Driver Tracing Interface

David J. Wilder, IBM Linux Technology Center, [email protected]
Michael Holzheu, IBM Linux on zSeries Development, [email protected]
Thomas R. Zanussi, IBM Linux Technology Center, [email protected]

Abstract

This paper proposes a driver-tracing interface (DTI) that builds on the existing Relay tool and the proven Debug Feature model used by IBM zSeries Linux. Users of this infrastructure are provided with individual, manageable channels for capturing or passing debug data into user space. Separate channels are created by each subsystem or driver. Data is stored in kernel ring buffers providing flight recorder type functionality. Unwanted or unconsumed data is simply discarded, whereas pertinent data can be saved for future analysis. In the instance of a system crash, all unconsumed tracing data is automatically saved in crash dumps. With support from crash analysis tools like crash or lcrash, trace data can be extracted directly from a crash dump, providing an exact trace of the events leading up to the crash.

Developers of Linux device drivers will be interested in DTI as a tool to aid in the troubleshooting of their drivers. Service engineers and support personnel who are tasked with isolating driver bugs will learn how to capture DTI data on a live system and extract DTI data from a crash dump.

1 Introduction

Webster defines a trace as "the track left by the passage of a person, animal, or object." Applied to computer systems, we can adapt this definition to mean the track left by the execution of a program. A typical program doesn't normally leave tracks, other than the expected side effects of the program. To cause a program to create tracks so that its passage can meaningfully be tracked, code that explicitly leaves those tracks must be added into the execution path of the program. We call the individual tracks we've inserted tracepoints. Two types of tracepoints can be used.

• Static: Tracks that are added to the source code and compiled with it.

• Dynamic: Tracks that are added to the execution stream at run time.

Tracing is the act of causing special-purpose code associated with a program to report something specific about what the program is doing at a given point. The information can be simple or complex, high or low-frequency, binary or text-based, time-based or unsequenced, and so on. The resulting data stream can be continuously persisted to long-term storage, sent to a destination over a network connection, or it can be endlessly cycled around a constant-sized buffer. The buffer will only be read when an event of interest occurs and a user needs details about the sequence of events that led up to that event, for example a system crash or a failed assertion in the normal program flow.

1.1 Why is tracing needed

Tracing is needed because, in many situations, only a detailed, sequenced, or timestamped history of program execution can explain the behavior of a program or the pathology of a problem. In many cases, coarse-grained statistical or summary information can show the general area of a problem, but only trace data can show the true source of the problem.

The detailed data from a complete trace can be post-processed, and summary information or aggregated statistics can be calculated based on it. The converse, however, is not true. Detailed information cannot be extracted from statistics or summaries, because that information is lost in the process. Keeping in mind practical considerations such as storage costs, it is always better

to have the detailed trace information instead of only the statistics, because the appropriate trace data allows for more varied and flexible analysis.

2 Current solutions for static tracing

Currently, there are three kernel APIs available to do static tracing using kernel memory buffers:

• printk

• relay

• s390 debug feature (s390dbf)

2.1 Printk

Printk often is misused for tracing purposes, since there is no other standard way for device drivers to log debug information.¹ There are multiple printk levels defined, which indicate the importance of a kernel message. The kernel messages are written into one global printk buffer, which can be accessed from user space with the syslog() system call or via the /proc/kmsg file. The dmesg tool prints the content of the message buffer to the screen, and then the kernel log daemon klogd reads kernel messages and redirects them either to the syslogd or into a file. The printk message buffer can also be accessed from system dumps through lcrash or other crash dump analysis tools.

¹This paper is not proposing a replacement for printk. The DTI is proposed as an additional tool that should be used in place of printk only when true tracing is needed.

2.2 Relay

Relay provides a basic low-level interface that has a variety of uses, including tracing. In order to define a kernel trace buffer, the relay_open() function is used. This function creates a relay channel. Relay channels are organized as wrap around buffers in memory. There are two mechanisms to write trace data into a channel:

• relay_write(chan, data, length) is used to place data in the global buffer.

• relay_reserve(chan, length) is used to reserve a slot in a channel buffer which can be written to later.

Relay buffers are represented as files in a host file system such as debugfs; data previously written into a relay channel can be retrieved by read(2)ing or mmap(2)ing these files.

2.3 The s390 debug feature

The s390 debug feature (s390dbf) is a tracing API which is used by most of the s390 specific device drivers. Each device driver creates its own debug feature in order to log trace records into memory areas, which are organized as wrap around ring buffers. The s390dbf uses its own ring buffer implementation. The main purpose of the debug feature is to inspect the trace logs after a system crash. Dump analysis tools like crash or lcrash can be used to find out the cause of the crash. If the system still runs but only a subcomponent which uses dbf fails, it is possible to look at the debug logs on a live system via the Linux debugfs file system.

Device drivers can register themselves to the debug feature with the debug_register() function. This function initializes an s390dbf for the caller. For each s390dbf there are a number of debug areas, where exactly one is active at one time. Each debug area consists of a set of several linked pages in memory. In the debug areas there are stored debug entries (trace records), which are written by event and exception calls.

An event call writes the specified debug entry to the active debug area and updates the log pointer for the active area. If the end of the active debug area is reached, a wrap around in the ring buffer is done and the next debug entry will be written at the beginning of the active debug area.

An exception call writes the specified debug entry to the log and switches to the next debug area. This is done in order to guarantee that the records that describe the origin of the exception are not overwritten when a wrap around for the current area occurs.

The debug areas themselves are also ordered in the form of a ring buffer. When an exception is thrown in the last debug area, the next debug entries are then written again in the very first area.

Each debug entry contains the following data:

• Timestamp

• Cpu-Number of calling task

• Level of debug entry (0...6)

• Return Address to caller

• Flag that indicates whether an entry is an exception or not

The trace logs can be inspected in a live system through entries in the debugfs file system. Files in the debugfs that were created by s390dbf represent different views of the debug log. The purpose of s390dbf views is to format the trace records in a human-readable way. Predefined views for hex/ascii, sprintf and raw binary data are provided. It is also possible to define other component specific views. The content of a view can be seen by reading the corresponding debugfs file. The standard views are also available in the dump analysis tools lcrash and crash. Figure 1 shows an example of an s390dbf sprintf view.

3 Driver Tracing Interface

This section describes the proposed Driver Tracing Interface (DTI) for the Linux kernel. It starts by examining the project goals and usage models that were considered in its design. Also provided is a brief description of two existing subsystems that DTI depends on, the debugfs and relay subsystems.

3.1 Design goals

DTI will add supportability to drivers and subsystems that adopt it. However, from the support community viewpoint, the deployment of DTI must be propagated into a significant number of subsystems, drivers, and system architectures before its usefulness is proven. Developing support tools and processes around one-off solutions is costly and unproductive to support organizations; therefore a key aspect of our project goals is to provide a feature that can easily be adopted by the Linux development community, thus ensuring its wide use.

• The DTI's API should be as simple and easy to implement as possible.

• DTI should be architecture independent.

Adding code into a driver or subsystem that is not contributing to the core functionality might be seen as unnecessary. This concern must be addressed by DTI, ensuring that the added benefit is balanced with low overhead of the feature.

• DTI should have as little performance impact as possible.

• DTI should reuse existing code in the Linux kernel.

• DTI must implement per-CPU buffering.

• DTI should minimize the in-kernel processing of trace data.

The remaining goals specify functionality requirements.

• DTI's API should be usable in both user context and interrupt context.

• Trace data must be buffered so that it can be retrieved from a crash dump.

• Trace data must be viewable from a live system.

• DTI must allow for a rich set of tools to process trace data.

3.2 Usage models

Two primary usage models were examined when designing the DTI API.

Usage Model 1: Isolating a driver problem from a post-mortem crash dump analysis. In this scenario, the system has crashed and a crashed system image (crash dump) has been obtained. By analyzing the crash dump the user suspects there is a problem in the XYZ driver. Using the DTI commands integrated into the crash dump analysis tool, the user can extract the DTI trace buffer from the crash dump and examine the records. The entire trace buffer containing the last trace records recorded by the XYZ driver is available. Using this data the user can obtain a trace showing what the driver was doing when the crash dump was taken. This allows the user to learn more about the cause of the crash.

Usage Model 2: Troubleshooting a driver on a running system. In this scenario, the user has encountered

00 01173807785:527586 0 - 02 00000000001eed86  Subchannel 0.0.4e20 reports non-I/O sc type 0001
00 01173807786:095834 2 - 02 00000000001f0962  reprobe done (rc=0, need_reprobe=0)
00 01173884042:004944 2 - 03 00000000001f7344  SenseID : UC on dev 0.0.1700, lpum 80, cnt 00

Figure 1: Example of the s390dbf sprintf view

a problem that is suspected to be related to one or more specific drivers. The user would like to examine the trace data just after the problem has occurred. To do so, the following steps are taken:

1. Set the trace level to an appropriate level to produce the trace records of interest.

2. Wait for the problem to occur, or reproduce the problem if possible.

3. Switch off tracing on the affected driver. Trace records produced just before, during, and after the problem occurred will remain in the DTI buffer.

4. Collect the trace data using a user level trace formatting tool.

5. Switch tracing back on.

By integrating DTI with systemtap, additional usage models can be realized.²

²These advanced usage models will be explored in more detail in Section 9, Integration With Other Tools.

3.3 Debugfs

Debugfs is a minimalistic pseudo-filesystem existing mainly to provide a namespace that kernel facilities can hang special-purpose files off of, which in turn provides file-based access to kernel data. It provides a simple API for creating files and directories, as well as a set of ready-made file operations which make it easy to create and use files that read and write primitive data types such as integers. For more complex data, it provides a means for facilities to associate custom file operations with debugfs files; in the case of relay, the exported relay_file_operations are associated with the debugfs files created to represent relay buffers. Despite the name, debugfs is not meant to be used only for debugging applications; it's enabled by default in many Linux distributions, and its use is encouraged especially for things that don't obviously belong in other pseudo-filesystems such as procfs or sysfs.

3.4 Relay

Relay³ is a kernel facility designed for 'relaying' potentially large quantities of data from the kernel to user space. Its overriding goal is to provide the shortest and cheapest possible path for a byte of data from the point it's generated in the kernel to the point it's usable by a program in user space. To accomplish this goal, it allocates a set of pages in the kernel, strings them together into a contiguous kernel address range via vmap(), and provides a set of functions designed to efficiently write data into the resulting relay buffer.

³See Documentation/filesystems/relay.txt for complete details.

Each relay buffer is represented to user space as a file. This relay file is the abstraction used by user space programs to retrieve the data contained in the relay buffer. The standard set of file operations allows for both read(2) and mmap(2) access to the data. These relay file operations are exported by the kernel, which allows them to be created in a pseudo-filesystem such as debugfs or procfs. In fact, relay files must be created in one of these pseudo-filesystems in order for them to be accessible to user space programs (older versions of relay did actually include a file system called relayfs, but the fs portion of the code was later subsumed by almost identical code in debugfs, and thus removed).

Relay buffers are logically subdivided into a number of equally sized sub-buffers. The purpose of sub-buffers is to provide the same benefits as double-buffering, but with more granularity. As data is written, it fills up sub-buffers, which are then considered ready for consumption by user space. At the same time data is being written by the kernel into these unfinished sub-buffers, user space can be reading and releasing other finished sub-buffers. Relay channels can be configured to do double-buffering or single-buffering if desired, or they can be configured to use large numbers of sub-buffers.

A sub-buffer isn't considered readable until it's full and the next sub-buffer is entered (a sub-buffer switch). The latency between when a given event is written and the time it's available to the user increases with the sub-buffer size. If sub-buffers are small, the latency is small, and the amount of data that would be lost if the machine were to lose power is small. However, using small sub-buffers results in more time spent in the sub-buffer switching code (the slow path) instead of the main logging path (the fast path). Relay was designed with some reasonable middle ground in mind; efficiently buffering data implies some nontrivial amount of latency. If an application requires more immediacy, another mechanism should be considered. The assumption is that any mechanism that offers more immediacy by definition also creates more tracing overhead. The implied goal of relay is to cause as little disruption to a running system as possible.

By default, a relay buffer is created for each CPU; the combination of all per-CPU relay buffers along with associated meta-information is called a relay channel. Most of the relay API deals with relay channels, and almost every aspect of a relay channel is configurable through the relay API.

4 Proposal for the Driver Tracing Interface

This section describes the proposed DTI architecture. The kernel API is introduced; trace record formatting and buffering are also discussed.

4.1 The DTI kernel API

The proposed DTI API can be broken into the following four major operations:

• Creating a trace handle and binding it to a relay channel.

• Writing trace records.

• Closing the channel.

• Setting the trace level.

The prototype of the API to the DTI is shown in Figure 2.

/**
 * dti_register: create new trace
 * @name: name of trace
 * @size_in_k: size of subbuffer in KB
 *
 * returns trace handle or NULL, if register failed.
 */
struct dti_info *dti_register(const char *name, int size_in_k);

/**
 * dti_unregister: unregister trace
 * @trace: trace handle
 */
void dti_unregister(struct dti_info *trace);

/**
 * dti_printk_raw: Write formatted string to trace
 * @trace: trace handle
 * @prio: priority of event (the lower, the higher the priority)
 * @fmt: format string
 * @...: parameters
 *
 * returns 0, if event is written. Otherwise -1.
 */
int dti_printk_raw(struct dti_info *trace, int prio, const char *fmt, ...);

/**
 * dti_event_raw: Write buffer to trace
 * @trace: trace handle
 * @prio: priority of event (the lower, the higher the priority)
 * @buf: buffer to write
 * @len: length of buffer
 *
 * returns 0, if event is written. Otherwise -1.
 */
int dti_event_raw(struct dti_info *trace, int prio, char *buf, size_t len);

/**
 * dti_set_level: set trace level
 * @trace: trace handle
 * @new_level: new trace level
 */
void dti_set_level(struct dti_info *trace, int new_level);

Figure 2: Prototype of the DTI API

4.2 Creating the trace channel

The creation and binding of the trace handle are performed by calling dti_register(). When called, dti_register() creates a relay channel, associated data files, and two additional control files in the debug file system. Upon successful completion, a struct dti_info pointer is returned. The caller will pass this pointer to all subsequent calls to the API. The format of struct dti_info is:

struct dti_info {
	struct dentry *root;
	struct rchan *chan;
	int level;
	struct dentry *reset_consumed_ctrl;
	struct dentry *level_ctrl;
};

4.3 Writing trace records

Trace records are passed to the user using a variable length record as described in the struct dti_event.

struct dti_event {
	__u16 len;
	__u64 time;
	char data[0];
} __attribute__ ((packed));

Trace record suppliers only need to supply a pointer to the data buffer containing the raw data and the length of the data. DTI places no restriction on the format of the data supplied.

4.4 Trace level

The trace level is used to control which trace records should be placed in the buffer. Suppliers of trace data provide a trace level value each time a trace record is written. The trace level value is compared to the current trace level found in dti_info->level. Trace records are only placed in the buffer if the supplied trace level is less than or equal to the current trace level.

The current trace level is set using dti_set_level() or by the user writing a new trace level to the level control file. Trace levels are defined as an integer between -1 and DTI_LEVEL_MAX. A value of -1 means no tracing is to be done. When a trace channel is first registered, the current trace level is set to DTI_LEVEL_DEFAULT.
__dti_register(name, subbuf_size, • Three sub-buffers are shown. n_subbufs) • The numbers represent the trace records. Users normally use the dti_register() version • The 1st and the 2nd trace records have already been which does the calculation on behalf of the user. The overwritten. user is not required to understand buffer internals, how- ever if the user wants more control over the internal Sub-buffer 1 13 14 03 04 sub-buffer sizes, the __dti_register() version is available.4 Sub-buffer 2 05 06 07 08 4.7 Record time stamps Sub-buffer 3 09 10 11 12

When relay data is read, the following records are re- The time field in the struct dti_event is gener- turned: ated by the DTI. Its purpose is to provide both a time 05 06 07 08 09 10 11 12 13 14 reference for when the trace record was written and a tool to sequence the records chronologically. The order Figure 3: An example of reading relay buffers of the trace records in a buffer of a single CPU is guaran- teed to be in chronological order. DTI creates one relay buffer for each CPU. Therefore, the user must fully read 4.5 Data buffering each per-CPU buffer, then order the records correctly. It is possible for records read from different per-CPU DTI depends on relay to handle the data buffering. Re- buffers to contain the same time stamp. The choice of a lay arranges trace records into a fixed number of sub- sufficiently high resolution timer reduces the possibility buffers arraigned in a ring. Each sub-buffer may contain of duplicate time-stamps; if the possibility is small, it one or more trace records or may be unused. Records might be acceptable. are never split across sub-buffers. As sub-buffers are filled, new records are placed in the next sub-buffer in the ring. If no unconsumed sub-buffers are available, 5 DTI handle API the sub-buffer containing the oldest data is overwritten. When trace records are read by the user using the relay This section describes an extension to the basic DTI read interface, the oldest complete sub-buffer returned API called DTI handles. The DTI handle API simpli- first, then the second oldest and so on. Therefore, the fies writing kernel code that utilizes DTI. The features trace records are returned in the exact the same order provided by the DTI handle API are: they were written. For the current (newest) sub-buffer, the trace records up to the latest written trace record is • Auto-registration returned. We lose the rest of the trace records (the oldest ones) of the current sub-buffer. This is acceptable, if • Support for early boot time tracing enough sub-buffers are used. This process is illustrated in Figure 3. Auto-registration eliminates the need to explicitly call dti_register(). Both modules and built-in drivers 4.6 Picking a buffer size are supported. Registration of the DTI handle is auto- matically performed the first time trace data is written. When the DTI trace is registered, you supply a size-in- k value, which is the total size of each relay channel Early boot time tracing allows built-in drivers to log buffer: trace data before kmalloc memory is available. Static buffers are used to hold DTI events until it is safe to dti_register(name, size_in_k) setup the relay channels. The DTI handle code creates

DTI automatically divides size-in-k by 8 and calls the 4See Section-3.4 Relay for a summary of sub-buffer size trade- exported __dti_register() function: offs to consider when choosing buffer/sub-buffer sizes. 268 • Driver Tracing Interface

#include static struct dti_handle my_handle

#ifdef MODULE /∗ On the first event, channel will be auto-registered. ∗/ DEFINE_DTI_HANDLE(my_handle, DRV_NAME, 4096 ∗ 32, DTI_LEVEL_DEBUG,NULL); #endif

#ifdef KERNEL /∗ ∗ Built-in drivers can optionally provide a static buffer used for ∗ early tracing. ∗/ static char my_buf[4096 ∗ 4] __initdata; DEFINE_DTI_HANDLE(my_handle, DRV_NAME, 4096 ∗ 32, DTI_LEVEL_DEBUG,my_buf); #endif static int __init init_mydriver(void) { .... INIT_DTI_HANDLE(my_handle); .... } my_driver_body(..) { .... /∗ trace some events ∗/ dti_printk(my_handle, DTI_LEVEL_DEFAULT, format, _fmt, ## _args); .... } void cleanup_mydriver(void) { .... CLEANUP_DTI_HANDLE(my_handle); .... } module_init(init_testdriver); module_exit(cleanup_testdriver);

Figure 4: Example of using DTI handles 2007 Linux Symposium, Volume Two • 269 a postcore_initcall to switch tracing from static 6.4 Reset consumed buffers to relay channels. All trace records written into static buffers are made available to the user interface af- When trace records are read, the records are marked ter the postcore_initcall has run. as consumed by the relay subsystem as they are read. An example of using DTI handles is shown in Figure 4. Therefore, subsequent reads will only return unread or new records in the buffers. If the desired behavior of a trace formatting tool is to return all records in the buffer 6 User interface each time the tools executed, the tool must reset the con- sumed value after reading all records currently in the The section covers how trace data is read by the user on buffer. Resetting the consumed value is performed by a running system and how tracing is controlled by the writing any value into the reset-consumed file. user.

6.1 File structure and control files 7 Dump analysis tool support

When a trace handle is bound, the following files are Analysis of trace data from a crashed system is one of created in the traces directory of the root of the mounted the most important use-cases for DTI. As such, sup- debug file system. port will be added to the crash and lcrash dump analysis tools enabling those tools to extract and make use of the dti/ DTI trace buffers and related trace information from a driver−name/ crashed system image. data0 ... data[max−cpus] level reset_consumed 7.1 Retrieving trace data from a crash dump

6.2 Retrieving trace data When a DTI trace is registered through dti_ register(), a text name is specified as one of the pa- One data file per CPU is created for each registered DTI rameters. This string not only identifies the trace to the trace provider. Trace records (struct dti_event) user, but is also effectively concatenated with a prefix are read from the data files using a user supplied trace string, unlikely to be used by a user, in order to make the formatting tool. A trace formatting tool should read complete identifying string easily locatable in a mem- each per-CPU data file for a specified trace provider then ory dump for example if a trace name is given as “my- arrange records according to the time stamp field of the driver” by the user, the string the dump tool would use struct dti_event. to find the corresponding dti_info struct would be __DTI__mydriver. When this string is located, it’s The sequence of events normally followed when reading a straightforward exercise to locate the associated relay trace data is: channel and its buffers. For example, assuming a sim- plified dti_info struct: 1. Switch off tracing by writing a –1 into the level file. struct dti_info 2. Read each of the per-CPU data files. { 3. Switch tracing back on. struct rchan ∗chan; char dti[7] = "__DTI__"; char name[DTI_TRACENAME_LEN]; 6.3 Trace level . . The level file is used to inform the DTI provider of the . level of trace records that should be placed in the data } buffer. Reading the level file will return an ASCII value indicating the current level of tracing. The level can be The dump tool would locate the beginning of the dti[] changed by writing the ASCII value of the desired trac- array, and subtracting the size of a pointer would find ing level into the level file. the pointer to the struct rchan: 270 • Driver Tracing Interface struct rchan s390dbf API DTI API { debug_register() dti_ register() size_t subbuf_size; debug_unregister() dti_ size_t n_subbufs; unregister() struct rchan_buf[NR_CPUS]; debug_event() dti_event_ . raw() . debug_sprintf_ dti_printk_ . event() raw() } debug_set_level() dti_set_ level() From the pointer to the rchan, you can locate each of the Table 1: Mapping of the s390dbf API to the DTI API buffers that make up the channel and its size. struct rchan_buf All the users of the s390dbf API can be quite easily con- { verted to use the new DTI API. Table 1 shows functions /∗ start of buffer ∗/ that can be mapped directly. void ∗start; /∗ start of current sub-buffer ∗/ There is no DTI equivalent for the s390dbf exception void ∗data; calls. Those must be replaced by the appropriate DTI /∗ write offset into the current event functions. The exception functionality used to sub-buffer ∗/ switch debug areas is not frequently used by the s390 size_t offset; device drivers. Therefore it is acceptable to remove this /∗ number of sub-buffers functionality in order to keep the API small and simple. / consumed ∗ The functions debug_register_view() and size_t consumed; debug_unregister_view() are not needed any . more, since formatting of DTI traces is done in the user . space. . } Currently the s390 ZFCP device driver uses non-default self-defined s390dbf views. For that driver, it is nec- From this information, you can extract all the available essary to implement a user space tool with the same data from each buffer and send it to the post-processing formatting functionality as the ZFCP specific s390dbf tool to be combined and sorted as usual. view.

Some details of the port have not yet been resolved. 8 Sample implementations

• How should the number of pages value used by 8.1 Port of the S390 dbf debug_register be mapped to the buffer size used in dti_register()? Most of the s390 device drivers currently use the s390dbf. Examples are: • Is an equivalent for debug_stop_all() needed?

• ZFCP: SCSI host adapter device driver 8.2 Port of some other drivers

• QETH: Ethernet network device driver There are currently a large number of custom logging • DASD: s390 harddisk driver APIs in the kernel, each mainly restricted to logging formatted debugging string data related to a particu- • TAPE: Driver for 3480/90 and 3590/92 channel at- lar driver or subsystem. Most of them are a varia- tached tape devices tion on #define DPRINTK(x...) printk(x...). 2007 Linux Symposium, Volume Two • 271

These can easy be ported to DTI using the DTI handle be extremely useful to have such associated data avail- API. However, DTI’s strength is focused on continuous able for analysis. DTI’s strength is its data-gathering tracing to a buffer which can be retrieved when neces- functionality. Using Systemtap, DTI’s functionality can sary rather than the continuous logging implemented by be extended by adding context to the data, perform time these special-purpose facilities. analysis of the data being gathered or maintain a sum- mary information about the trace. For example, if a There are, however, a handful of continuous tracing fa- certain pattern of interest only occurs intermittently, the cilities similar to DTI in the kernel, varying from very user could detect it and either halt the trace, preserving minimalistic to fairly full-featured. Included in this cat- the events that led up to it, or alert the user of the con- egory is the current s390dbf facility, which is presently dition. In addition, SystemTap can be used to gather the most advanced. Briefly examined are some of the aggregated summaries of the data over long periods of others that tracing facilities that would be candidates for time (longer than the limited size of a relay buffer would porting to the proposed DTI API. allow) to get an overall picture of activity with respect to the driver and associated context. • drivers/scsi/mesh.c provides the dlog() The DTI tapset is being developed to allow SystemTap function for logging formatted records. It keeps to put kprobes on the high-level DTI tracing functions, data in a ring of structs. The dump function printk’s which makes all data passing through them accessible the whole buffer. to SystemTap. Note that doing this in no way affects the • drivers/net/wan/lmc/lmc_debug.h normal flow of DTI events; the only additional effect is provides LMC_EVENT_LOG() which logs two u32 the probe effect, which means that each event recorded args along with an event number and jiffies. in this way incurs a penalty equal to the time required to fire the probe and run the SystemTap handler. • drivers/char/ip2 provides ip2trace() which is used to trace any number of longs into a Systemtap can also be used to dynamically place new buffer containing 1000 longs. It provides a read(2) probe points into a driver. To do this Systemtap places interface to read the data. a kprobe in the driver where a trace point is to be added and all data available at the probe point is fed back into • drivers/isdn/hardware/eicon/ the DTI data stream. The DTI post-processing tools can debug.c provides an extensive API and set then be used to format and display this external data of functions used to log and maintain a queue of along with the normal DTI trace output. To do this, debug messages. users make use of the ‘DTI-control’ tapset, which al- • fs//support/ktrace.c provides a sim- lows SystemTap scripts to control and log data to DTI ple yet extensive API for tracing the xfs file system. via the standard DTI kernel interface. The main logging function is ktrace_enter(), which allows up to 16 values to be logged per en- 10 Conclusion try into a buffer containing 64 ktrace entries. Customers of s390 systems demand very high reliability 9 Integration of other tools and quick turnaround time for bug fixes. Since its intro- duction early in the on the s390, the s390 Debug Facility has proven itself as an invaluable DTI tracing is of course, useful in its own right. 
Com- tool for meeting customers’ reliability expectations. The bined with a trace analysis tool such as SystemTap5 it ability to analyze trace data in a crash dump and perform can become an even more powerful tool. A DTI trace first fault analysis is key to the s390dfb’s success. The can be used to see with great detail exactly what happens question has been posed, “Why reinvent functionality within a particular driver over a given time interval, but that already exists?” The answer is simple: DTI intends it doesn’t have any other context associated with it, such to exploit the s390db model and bring this technology to as system calls or interrupt activity that may have trig- all Linux platforms. Service organizations gain not only gered activity within the driver. In many cases, it would by the additions of DTI functionality but by the unifor- 5http://sourceware.org/systemtap mity of a tracing infrastructure between all platforms. In 272 • Driver Tracing Interface addition, DTI utilizes the existing relay subsystem that did not exist when s390dbf was written. Therefore, DTI can be implemented with a much smaller footprint than the s390dbf. s390 drivers provide the perfect sandbox for porting drivers to the DTI and testing its implementation. We plan to pursue this effort as well as encouraging other driver developers to adopt DTI.

Legal statement

This work represents the view of the author and does not nec- essarily represent the view of IBM.

IBM, IBM (logo), e-business (logo), pSeries, e (logo) server, and xSeries are trademarks or registered trademarks of Inter- national Business Machines Corporation in the United States and/or other countries.

Linux is a registered trademark of Linus Torvalds.

Other company, product, and service names may be trade- marks or service marks of others. Linux readahead: less tricks for more

Fengguang Wu Hongsheng Xi Jun Li University of Science and Technology of China [email protected], {xihs,Ljun}@ustc.edu.cn

Nanhai Zou Intel Corporation [email protected]

Abstract made. Finally, carefully selected data is read in before being requested by the application. The Linux 2.6 readahead has grown into an elaborate There exist APIs (posix_fadvise(2), work that is hard to understand and extend. It is con- madvise(2)) for the user-space to inform the fronted with many subtle situations. This paper high- kernel about its access pattern or more precise actions lights these situations and proposes alternative solutions about data, but few applications bother to take advan- to them. By adopting new calling conventions and data tage of them. They are mostly doing sequential or structures, we demonstrate that readahead can be made random reads, and expecting the kernel to serve these more clean, flexible and reliable, which opens the door common and simple cases right. for more opportunities. So the kernel has to guess. When done right, readahead can greatly improve I/O throughput and reduce applica- 1 Introduction tion visible I/O delays. However, a readahead miss can waste bandwidth and memory, and eventually hurt per- 1.1 Background formance.

Readahead is a widely deployed technique to bridge the 1.2 A brief history huge gap between the characteristics of disk drives and the inefficient usage by applications. At one end, disk Linux 2.6 implements a generic readahead heuristic in drives are good at large sequential accesses and bad at its VFS layer. seeks. At the other, applications tend to do a lot of tiny reads. To make the two ends meet, modern kernels and The unified readahead framework was first introduced disk drives do readahead: to bring in the data before it [1] in the early 2.5 time by Andrew Morton. It fea- is needed and try to do so in big chunks. tures the current/ahead windows data structure; read- ahead/read-around heuristics; protection against read- Readahead brings three major benefits. Firstly, I/O de- ahead thrashing, aggressive cache hits [2] and congested lays are effectively hidden from the applications. When queues [3]. The mmap read-around logic was later taken an application requests a page, it has been prefetched out by Linus Torvalds[4]. The separation of read-ahead and is ready to use. Secondly, disks are better utilized and read-around yields better solutions for both. The by large readahead requests. Lastly, it helps amortize rather chaotic mmap reads from executables can now request processing overheads. be prefetched aggressively, and the readahead logic can concentrate on detecting sequential reads from random Readahead typically involves actively detecting the ac- ones. cess pattern of all read streams and maintaining infor- mation about them. Predictions based on on where and Handling sequential and random reads right, however, how much data will be needed in the near future are turned out to be a surprisingly hard mission. One big 274 • Linux readahead: less tricks for more challenge comes from some mostly random database 2 Readahead Algorithms workloads. In three years, various efforts were made to better support random cases [5, 6, 7, 8, 9]. Finally, 2.1 Principles of 2.6 readahead Steven Pratt and Ram Pai settled it down by passing the read request size into page_cache_readahead() [10, 11]. The addition of read size also helped another Figure 1 is a brief presentation of the readahead algo- known issue: when two threads are doing simultaneous rithm in Linux 2.6.20. The heuristics can be summa- pread()s, they will be overwriting each other’s read- rized in four aspects: ahead states. The readahead logic may see lots of 1-page reads, instead of the true pread() sizes. The old solu- tion, in the day when page_cache_readahead() sequential detection If the first read at the start of a was called once per page, was to take a local copy file, or a read that continues from where the previ- of readahead state in do_generic_mapping_read ous one ends, assume a sequential access. Other- [12, 13]. wise it is taken as a random read. The interface demands that it be informed of ev- 1.3 Moving on ery read requests. prev_page is maintained to be the last page it saw and handled. An over- Linux 2.6 is now capable of serving common ran- size read (read_size > max_readahead) dom/sequential access patterns right. That may be suffi- will be broken into chunks of no more than max_ cient for the majority, but it sure can do better. readahead and fed to the readahead algorithm progressively. So comes the adaptive readahead patch [14], an effort to bring new readahead capabilities to Linux. 
It aims to readahead size There are three phases in a typical detect semi-sequential access patterns by querying the readahead sequence: page-cache context. It also measures the read speed of individual files relative to the page-cache aging speed. initial When there exists no current_window That enables it to be free from readahead thrashing, and or ahead_window, the size of initial read- to manage the readahead cache in an economical way. ahead is mainly inferred from the size of cur- readahead_ The lack of functionality is not an issue with regard to rent read request. Normally size read_size readahead, in fact the complexity of the code actually will be 4 or 2 times . prevents further innovation. It can be valuable to sum- ramp-up When there is a previous readahead, the size marize the experiences we learned in the past years, to is doubled or x4. analyze and reduce the source of the complexity is what full-up When reaching max_readahead. motivated us to do the work of required for this paper. It is possible to jump directly into full-up We propose a readahead algorithm that aims to be clean. phase, if the request size is large enough (e.g. It is called on demand, which helps free readahead sendfile(10M)). heuristics from most of the chores that the current al- gorithm suffers from. The data structure is also revised readahead pipelining To maximally overlap applica- for ease of use, and to provide exact timing information. tion processing time and disk I/O time, it maintains two readahead windows: current_window is 1.4 Overview of the rest of the paper where the application expected to be working on; ahead_window is where asynchronous I/O hap- In Section 2, we first discuss the readahead algorithm pens. ahead_window will be opened/renewed in as in Linux 2.6.20, then proceed to discuss and propose advance, whenever it sees a sequential request that new solutions to the three major aspects (data structure, call scheme, and guideline) of readahead. In Section 3, • is oversize we analyze how the old/new readahead algorithms work • has only current_window out in various situations. Finally, Section 4 gives bench- mark numbers on their overheads and performance. • crossed into ahead_window 2007 Linux Symposium, Volume Two • 275

1 do_generic_mapping_read: 2 call page_cache_readahead 3 for each page 4 if is prev_page + 1 5 call page_cache_readahead 6 if page not cached 7 report cache miss 8 leave cache hit mode 9 10 page_cache_readahead: 11 handle unaligned read 12 set prev_page to current page index 13 if in cache hit mode 14 return 15 shift prev_page to the last requested page, 16 but no more than max_readahead pages 17 if is sequential read and no current_window 18 make current_window 19 call blockable_page_cache_readahead 20 if is oversize read 21 call make_ahead_window 22 elif is random read 23 clear readahead windows 24 limit size to max_readahead 25 call blockable_page_cache_readahead 26 elif no ahead_window 27 call make_ahead_window 28 elif read request crossed into ahead_window 29 advance current_window to ahead_window 30 call make_ahead_window 31 ensure prev_page do not overrun ahead_window 32 33 make_ahead_window: 34 if have seen cache miss 35 clear cache miss status 36 decrease readahead size by 2 37 else 38 x4 or x2 readahead size 39 limit size to max_readahead 40 call blockable_page_cache_readahead 41 42 blockable_page_cache_readahead: 43 if is blockable and queue congested 44 return 45 submit readahead io 46 if too many continuous cache hits 47 clear readahead windows 48 enter cache hit mode

Figure 1: readahead in 2.6.20 276 • Linux readahead: less tricks for more cache hit/miss A generic cache hit/miss happens useful. The two windows bring three possible combina- when a page to be accessed is found to be tions of on/off states. The code has to probe for the exis- cached/missing. However, the terms have specific tence of current_window and/or ahead_window meanings in 2.6 readahead. before it can do any query or action on them. The heuris- A readahead cache hit happens when a page to tics also have to explicitly open ahead_window to be readahead is found to be cached already. A start readahead pipelining. long run of readahead cache hits indicates an al- Now let’s make a study of the information necessary for VM_MAX_ ready cached file. When the threshold a sequential readahead: CACHE_HIT(=256) is reached, readahead will be turned off to avoid unnecessary lookups of the page-cache. 1. To work out the position of next read- ahead, that of the previous one will be A readahead cache miss happens when a page that sufficient: We normally apply a simple was brought in by readahead is found to be lost on size ramp up rule: (offset, size) => time of read. The page may be relaimed prema- (offset+size, size*2). turely, which indicates readahead thrashing. The readahead size can be too large, so decrease it by 2 2. Optionally, in case the previous readahead pages for the next readahead. are lost, the timing information of their first en- queue to inactive_list would be helpful. As- sume the reader is now at offset, and was at 2.2 Readahead windows page la_index when the lost pages were first brought in, then thrashing_threshold = The 2.6 readahead adopts dual windows to achieve read- offset - la_index. ahead pipelining: while the application is walking in the current_window, I/O is underway in the ahead_ 3. To achieve pipelining, indicating a lookahead page window. Although it looks straightforward, the imple- would be sufficient: on reading of which it should mentation of this concept is not as simple as one would be invoked to do readahead in advance. think.

For the purpose of pipelining, we want to issue I/O for We revised the data structure (Figure 2) to focus on the the next readahead before the not-yet-consumed read- previous readahead, and to provide the exact timing in- ahead pages fall under a threshold, lookahead size.A formation. The changes are illustrated in Figure 3 and value of lookahead_size = 0 disables pipelining, compared in Table 1. whereas lookahead_size = readahead_size opens full pipelining. 2.3 Readahead on demand

The current/ahead windows scheme is one obvious way The 2.6 readahead works by inspecting all read re- to do readahead pipelining. It implies lookahead_ quests and trying to discover sequential patterns from 1 size to be readahead_size - read_size for them. In theory, it is a sound way. In practice, it the initial readahead. It then ranges from readahead_ makes a lot of fuss. Why should we call excessively size to readahead_size + read_size for the fol- into page_cache_readahead() only to do noth- lowing ones. It is a vague range due to the fact that ing? Why handle cache hits/misses in convoluted feed- 2.6.20 readahead pushes forward the windows as early back loops? as it sees the read request (instead of one realtime page read) crossing into the ahead_window. In fact, there are only two cases that qualify for a read- ahead: However, the scheme leads to obscure information and complicated code. The lookahead size is implicitly sync readahead on cache miss A cache miss oc- coded and cannot be freely tuned. The timing informa- curred, the application is going to do I/O anyway. tion for the previous readahead may be too vague to be So try readahead and check if some more pages 1assume read_size <= max_readahead should be piggybacked. 2007 Linux Symposium, Volume Two • 277

struct file_ra_state { - unsigned long start; /* Current window */ - unsigned long size; - unsigned long ahead_start; /* Ahead window */ - unsigned long ahead_size; - unsigned long cache_hit; /* cache hit count */ - unsigned long flags; /* RA_FLAG_MISS | RA_FLAG_INCACHE */ + pgoff_t la_index; /* enqueue time */ + pgoff_t ra_index; /* begin offset */ + pgoff_t lookahead_index; /* time to do next readahead */ + pgoff_t readahead_index; /* end offset */ unsigned long prev_page; /* Cache last read() position */ unsigned long ra_pages; /* Maximum readahead window */ };

Figure 2: revising readahead data structure

(a) 2.6.20 readahead

|----- size ---->|------ahead_size ------>| ===|======|======|===== ^start ^ahead_start

(b) on-demand readahead

|------readahead size -->| |---- reader on the way --->|-- lookahead size -->| ===|=====#======|======#======|===== ^ ^ ^ ^ la_index ra_index lookahead_index readahead_index

Figure 3: tracking readahead windows

question on-demand readahead 2.6.20 readahead when to do next read- lookahead_index around ahead_start or start ahead where to do next read- readahead_index start + size or ahead ahead_start + ahead_size the time of previous la_index maybe/roughly start readahead the size of previous readahead_index - ra_index size or ahead_size readahead

Table 1: deciding the next readahead 278 • Linux readahead: less tricks for more async readahead on lookahead page The application The new algorithm inherits many important behaviors is walking onto a readahead page with flag PG_ from the current one, such as random reads, and the size readahead, or a lookahead mark. It indicates ramp up rule on sequential readahead. There are also that the readahead pages in the front are dropping some notable changes: to lookahead_size, the threshold for pipelin- ing. So do readahead in advance to reduce applica- tion stalls. 1. A new parameter page is passed into ondemand_readahead(). It tells whether the current page is present. A value of NULL When called on demand, the readahead heuristics can indicates a synchronous readahead, otherwise an be liberated from a bunch of special cases. The details asynchronous one. about them will be covered in Section 3. 2. A new parameter begin_offset is introduced 2.4 The new algorithm to indicate where the current read request begins. prev_page now simply tracks the last accessed page of previous request. Hence the new sequen- Figure 4 shows the proposed on-demand readahead al- tial access indicator becomes: sequential = gorithm. It is composed of a list of condition-action (begin_offset - prev_page <= 1). blocks. Each condition tests for a specific case (Table 2), and most actions merely fill the readahead state with 3. I/O for overlapped random reads may not be sub- proper values (Table 3). mitted as early. Suppose a 8-page random read, whose first 4 pages are overlapped with a previous read. The 2.6 readahead will emit request for all of random A small, stand-alone read. Take it as a random the 8 pages before accessing the first page. While read, and read as is. the on-demand readahead will ask for the remain- ing 4 pages on accessing the 5th page. Hence it lookahead It is lookahead time indicated by the read- avoids some unnecessary page-cache lookups, at ahead state, so ramp up the size quickly and do the the cost of not being able to overlap transfer of next readahead. the leading cached pages with I/O for the follow- ing ones. readahead It is readahead time indicated by the read- ahead state. We can reach here if the lookahead 4. Linux 2.6.20 only does readahead for sequential mark was somehow ignored (queue congestion) or read requests. In the new design, we loosen the cri- skipped (sparse read). Do the same readahead as in teria a bit: the lookahead hit alone can trigger the lookahead time. next readahead. It enables detection of interleaved initial First read on start of file. It may be accessing the reads. whole file, so start readahead. oversize An oversize read. It cannot be submitted in It is not as safe to ignore sequentialness, but the risk one huge I/O, so do it progressively as a readahead is pretty low. Although we can create a set of random sequence. reads to trigger a long run of readahead sequences, it is very unlikely in reality. One possible candidate may miss A sequential cache miss. Start readahead. be stride reads. But it cannot even cheat the algorithm through the size ramp-up phase, where the lookahead interleaved A lookahead hit without a supporting read- pages distribute in a non-uniform way. ahead state. It can be some interleaved sequential streams that keep invalidating each other’s read- The support of interleaved reads is minimal. It makes no ahead state. The lookahead page indicates that the extra efforts to detect interleaved reads. 
So the chances new readahead will be at least the second one in the of discovering them is still low. Interleaved sequential readahead sequence. So get the initial readahead reads may or may not be readahead, or may be served size and ramp it up once. intermittently. 2007 Linux Symposium, Volume Two • 279

1 do_generic_mapping_read: 2 for each page 3 if page not cached 4 call ondemand_readahead 5 if page has lookahead mark 6 call ondemand_readahead 7 set prev_page to last accessed page 8 9 ondemand_readahead: 10 if is asynchronous readahead and queue congested 11 return 12 if at start of file 13 set initial sizes 14 elif is small random read in wild 15 read as is 16 return 17 elif at lookahead_index or readahead_index 18 ramp up sizes 19 else 20 set initial sizes 21 if has lookahead mark 22 ramp up size 23 fill readahead state 24 submit readahead io 25 set lookahead mark on the new page at new lookahead_index

Figure 4: on-demand readahead algorithm

case description condition initial read on start of file !offset oversize random oversize read !page && !sequential && size > max random random read !page && !sequential lookahead lookahead hit offset == ra->lookahead_index readahead readahead hit offset == ra->readahead_index miss sequential cache miss !page interleaved lookahead hit with no context page

Table 2: detecting access patterns

case ra_index ra_size la_size random offset size 0 lookahead,readahead ra->readahead_index get_next_ra_size(ra) initial,oversize,miss offset get_init_ra_size(size,max) 1 interleaved offset + 1 get_init_ra_size(...) * 4 1

Table 3: deciding readahead parameters 280 • Linux readahead: less tricks for more

3 Case Studies 2. Pretend that it happily enters cache-hit-no- readahead mode for a sendfile(100M) and In this section, we investigate special situations the read- avoids page-cache lookups. Now another over- ahead algorithm has to confront: head arises: page_cache_readahead() that used to be called once every max_readahead 1. Sequential reads do not necessary translate into pages will be called on each page to ensure in time incremental page indexes: multi-threaded reads, restarting of readahead after the first cache miss. retried reads, unaligned reads, and sub-page-size reads. The above issues are addressed in on-demand readahead by the following rules: 2. Readahead should not always be performed on se- quential reads: cache hits, queue congestion. 1. Call ondemand_readahead() on cache miss; 3. Readahead may not always succeed: out of mem- ory, queue full. 2. Call ondemand_readahead() on lookahead mark; and 4. Readahead pages may be reclaimed before being read: readahead thrashing. 3. Only set lookahead mark on a newly allocated readahead page. 3.1 Cache hits Table 4 compares the two algorithms’ behavior on var- Ideally, no readahead should ever be performed on ious cache hit situations. It is still possible to apply the cached files. If a readahead is done on a cached file, threshold of VM_MAX_CACHE_HIT in the new algorithm, then this can cost many pointless page cache lookups. In but we’d prefer to keep it simple. If a random cached a typical system, reads are mostly performed on cached page happens to disable one lookahead mark, let it be. pages. Cache hits can far outweigh cache misses. It would be too rare and non-destructive to ask for atten- The 2.6 readahead detects excessive cache hits via tion. As for real-time applications, they need a different cache_hit. It counts the continuous run of readahead policy—to persist on cache hits. pages that are found to be already cached. Whenever it goes up to VM_MAX_CACHE_HIT(=256), the flag 3.2 Queue Congestion RA_FLAG_INCACHE will be set. It disables further readahead, until a cache miss happens, which indicates When the I/O subsystem is loaded, it becomes question- that the application have walked out of the cached seg- able to do readahead. In Linux 2.6, the load of each disk ment. drive is indicated by its request queue. A typical re- In summary, quest queue can hold up to BLKDEV_MAX_RQ(=128) requests. When a queue is 7/8 full, it is flagged as congested; When it is completely full, arriving new re- 1. Always call page_cache_readahead(); quests are simply dropped. So in the case of a congested 2. Disable readahead after 256 cache hits; and queue, doing readahead risks wasting the CPU/memory resources: to scan through the page-cache, allocate a 3. Enable readahead on cache miss. bunch of readahead pages, get rejected by the request system, and finally free up all the pages—a lot of fuss That scheme works, but is not satisfactory. about nothing.

1. It only works for large files. If a file is fully Canceling readahead requests on high I/O pressure can cached but smaller than 1MB, it won’t be able to help a bit for the time being. However, if it’s only about see the light of RA_FLAG_INCACHE, which can breaking large requests into smaller ones, the disks will be a common case. Imagine a web server that be serving the same amount of data with much more caches a lot of small to medium html/png files seeks. In the long run, we hurt both I/O throughput and and desktop systems. latency. 2007 Linux Symposium, Volume Two • 281

ondemand_readahead() page_cache_readahead() large cached chunk called on every page to recheck full cached small files not called prefetched chunks called to do readahead small cached chunk may or may not be called cache miss called and do readahead now restart readahead after first miss

Table 4: readahead on cache hits

So the preferred way is to defer readahead on a con- The on-demand readahead takes no special action gested queue. The on-demand readahead will do so for against readahead thrashing. Once thrashed, an initial asynchronous readaheads. One deferred asynchronous readahead will be started from the current position. It readahead will return some time later as a synchronous does not cut down the number of thrashed pages, but one, which will always be served. The process helps does avoid the catastrophic seeks. Hence it performs smooth out the variation of load, and will not contribute much better on thrashing. more seeks to the already loaded disk system.

The current readahead basically employs the same pol- 3.4 Unaligned reads icy. Only that the decisions on whether to force a read- ahead are littered throughout the code, which makes it less obvious. File operations work on byte ranges, while the read- ahead routine works on page offsets. When an appli- 3.3 Readahead thrashing cation issues 10000B sized reads, which do not align perfectly to the 4K page boundary, the readahead code In a loaded server, the page-cache can rotate pages will see an o f f set + size flow of 0 + 3,2 + 2,4 + 3,7 + quickly. The readahead pages may be shifted out of the 2,9+3,12+2,14+3,17+2,19+3,.... Note that some LRU queue and reclaimed, before a slow reader is able requests overlap for one page. It’s no longer an obvious to access them in time. sequential pattern.

Readahead thrashing can be easily detected. If a cache Unaligned reads are taken care of by allowing miss occurs inside the readahead windows, readahead offset == prev_page [15] in 2.6 readahead and thrashing happened. In this case, the current readahead on-demand readahead. decreases the next readahead size by 2. By doing so it hopes to adapt to the thrashing threshold. Unfortunately, the algorithm does not remember it. Once it steps slowly 3.5 Retried reads off to the thrashing threshold, the thrashings stop. It then immediately reverts back to the normal behavior of ramping up the window size by 2 or 4. Which starts a Sometimes the readahead code will receive an interest- new round of thrashings. On average, about half of the ing series of requests[16] that looks like: 0+1000,10+ readahead pages can be thrashed. 990,20 + 980,30 + 970,.... They are one normal read followed by some retried ones. They may be issued by It would be even more destructive for disk throughput. the retry-based AIO kernel infrastructure, or retries from Suppose that current_window is thrashed when an the user space for unfinished sendfile()s. application is walking in the middle of it. The 2.6 readahead algorithm will be notified via handle_ra_ This pattern can confuse the 2.6.20 readahead. Explicit miss(). But it merely sets a flag RA_FLAG_MISS, coding is needed to ignore the return of reads that have and takes no action to recover the current_window already been served. pages to be accessed. do_generic_mapping_ read() then starts to fault in them one by one, gen- The on-demand readahead is not bothered by this issue. erating a lot of disk seeks. Overall, up to half pages may Because it is called on the page to be accessed now, in- be faulted in this crude way. stead of the read request. 282 • Linux readahead: less tricks for more

case 2.6.20 on-demand overheads reasoning sequential re-read in 4KB 20.30 20.05 −1.2% no readahead invocation sequential re-read in 1MB 37.68 36.48 −3.2% small files re-read (tar /lib) 49.13 48.47 −1.3% no page-cache lookup random reading sparse file 81.17 80.44 +0.9% one extra page-cache lookup per cache miss sequential reading sparse file 389.26 387.97 −0.3% less readahead invocations

Table 5: measuring readahead overheads

0.8 0.3 16k 32k 0.6 0.2 64k 128k 0.4 0.1 0.2 0 0 -0.1 -0.2 time (%) time (%) -0.2 -0.4 -0.3 -0.6 8k 16k -0.8 32k -0.4 64k 128k -1 -0.5 0 100 200 300 400 500 600 700 800 900 1000 0 500 1000 1500 2000 2500 3000 3500 4000 random reads (MB) random reads (MB) Figure 5: timing overlapped random reads

3.6 Interleaved reads 4 Performance

4.1 Benchmark environment

The benchmarks are performed on a Linux 2.6.20 that is When multiple threads are reading on the same file de- patched with the on-demand readahead. The basic setup scriptor, the individual sequential reads get interleaved is and look like random ones to the readahead heuristics. Multimedia files that contain separated audio/video sec- • 1MB max readahead size tions may also lead to interleaved access patterns. • 2.9GHz Intel Core 2 CPU

• 2GB memory Interleaved reads will totally confuse the current read- ahead, and are also beyond the mission of on-demand • 160G/8M Hitachi SATA II 7200 RPM disk readahead. However, it does offer minimal support that may help some interleaved cases. Take, for ex- ample, the 1-page requests consisting of two streams: 4.2 Overheads 0,100,1,101,2,102,.... The stream starting from 0 will get readahead service, while the stream from 100 will Table 5 shows the max possible overheads for both algo- still be regarded as random reads. The trick is that the rithms. Each test is repeated sufficient times to get the read on page 0 triggers a readahead (the initial case), stable result. When finished, the seconds are summed which will produce a lookahead mark. The following up and compared. reads will then hit the lookahead mark, make further readahead calls and push forward the lookahead mark Cache hot sequential reads on a huge file are now faster (the lookahead case). by 1.2% for 1-page reads and by 3.2% for 256-page 2007 Linux Symposium, Volume Two • 283 reads. Cache hot reads on small files (tar /lib) see 25 2.6.20 disk a 1.3% speed up. 2.6.20 net on-demand disk on-demand net We also measured the maximum possible overheads on 20 random/sequential reads. The scenario is to do 1-page sized reads on a huge sparse file. It is 0.9% worse for 15 random reads, and 0.3% better for sequential ones. But don’t take the two numbers seriously. They will be lost 10 in the background noise when doing large sized reads, throughput (MB/s) and doing it on snail-paced disks. 5

4.3 Overlapped random reads 0 0 50 100 150 200 time (s) We benchmarked 8/16/32/64/128KB random reads on (a) disk/net throughput on loaded disk a 500/2000MB file. The requests are aligned to small 350 110 4KB boundaries and therefore can be overlapping with 100 300 each other. On every 50/200MB read, the total seconds 90 elapsed are recorded and compared. Figure 5 demon- 2.6.20 req-size 80 250 2.6.20 disk-util strates the difference of time in a progressive way. It on-demand req-size 70 on-demand disk-util 200 shows that the 128KB case is unstable, while others con- 60 50 verge to the range (−0.2%,0.1%), which are trivial vari- 150 ations. 40 request size (KB) disk utilization (%) 100 30 20 4.4 iozone throughput 50 10 We ran the iozone benchmark with the command 0 0 0 20 40 60 80 100 120 140 160 180 200 iozone -c -t1 -s 4096m -r 64k. That’s doing time (s) 64KB non-overlapping reads on a 4GB file. The (b) average I/O size and disk utilization throughput numbers in Table 6 show that on-demand readahead keeps roughly the same performance. Figure 6: performance on readahead thrashing access pattern 2.6.20 on-demand gain Read 62085.61 62196.38 +0.2% Re-read 62253.49 62224.99 −0.0% 5 Conclusion Reverse Read 50001.21 50277.75 +0.6% Stride read 8656.21 8645.63 −0.1% Random read 13907.86 13924.07 +0.1% This work greatly simplified Linux 2.6 readahead al- Mixed workload 19055.29 19062.68 +0.0% gorithm. We successfully eliminated the complexity Pread 62217.53 62265.27 +0.1% of dual windows, cache hit/miss, unaligned reads, and retried reads. The resulting code is more clean and Table 6: iozone throughput benchmark (KB/s) should be easier to work with. It maintains roughly the same behavior and performance for common sequen- tial/random access patterns. Performance on readahead 4.5 Readahead thrashing thrashing and cache hits are improved noticeably. We boot the kernel with mem=128m single, and start a 100KB/s stream on every second. Various statis- tics are collected and showed in Figure 6. The thrashing 6 Future Work begins at 20sec. The 2.6 readahead starts to overload the disk at 40sec, and eventually achieved 5MB/s max- The algorithm is still young and imperfect. It needs imum network throughput. The on-demand readahead more benchmarks, real-world tests, and fine tuning. throughput keeps growing, and the trend is going up to Look-ahead size may be a bit smaller, especially for the 15MB/s. That’s three times better. initial readahead. It does not impose strict sequential 284 • Linux readahead: less tricks for more checks, which may or may not be good. The overlapped [10] Simplified Readahead, random reads may also be improved. http://groups.google.com/group/ linux.kernel/browse_thread/ Then we can embrace all the fancy features that were thread/e5f475d4a11759ba/ missed for so long time. To name a few: in- 697d85a0d86458d3?&hl=en# terleaved reads from multimedia/multi-threaded appli- 697d85a0d86458d3 cations; clustered random reads and chaotic semi- sequential reads from some databases; backward reads [11] Steven Pratt, Ram Pai, [PATCH] Simplified and stride reads in scientific arenas; real thrashing pre- readahead, git commit vention and efficient use of readahead cache for file e8eb956c01529eccc6d7407ab9529ccc6522600f servers. Sure there are more. They can be packed into an optional kernel module for ease of use and maintenance. [12] Linux: Random File I/O Regressions In 2.6, http://kerneltrap.org/node/3039

References [13] Andrew Morton, [PATCH] readahead: keep file->f_ra sane, git commit [1] Andrew Morton, [PATCH] readahead, git commit 2ea7dd3fc9bc35ad0c3c17485949519cb691c097 b546b96d0969de0ff55a3942c71413392cf86d2a 2

[2] Andrew Morton, [PATCH] readahead [14] Linux: Adaptive Readahead, http://kerneltrap.org/node/6642 optimisations, git commit 213f035476c932921d6281e4d5d39585f214a2eb [15] Oleg Nesterov, [PATCH] readahead: improve [3] Andrew Morton, [PATCH] rework readahead for sequential read detection, git commit congested queues, git commit 03a554d2325ef5f3160514359330965fd7640e81 10d05dd588a3879f9b40725a9073bc97fcd44776 [16] Suparna Bhattacharya, John Tran, Mike Sullivan, [4] Linus Torvalds, [PATCH] Simplify and speed up Chris Mason, Linux AIO Performance and mmap read-around handling, git commit Robustness for Enterprise Workloads, d5cfe1b35c4e81f4c4dc5139bd446f04870ebf90 http://www.linuxsymposium.org/ proceedings/reprints/ [5] Andrew Morton, [PATCH] Allow VFS readahead Reprint-Bhattacharya-OLS2004.pdf to fall to zero, git commit 71ddf2489c68cf145fb3f11cba6152de49e02793

[6] Ram Pai, [PATCH] readahead: multiple performance fixes, git commit 2bb300733b3647462bddb9b993a6f32d6cbcdbbc

[7] Ram Pai, [PATCH] speed up readahead for seeky loads, git commit ef12b3c1abce83e8e25d27bdaab6380238e792ff

[8] Suparna Bhattacharya, Ram Pai, [PATCH] adaptive lazy readahead, git commit 87698a351b86822dabbd8c1a34c8a6d3e62e5a77

[9] Ram Pai, Badari Pulavarty, Mingming Cao, Linux 2.6 performance improvement through readahead optimization, http://www. linuxsymposium.org/proceedings/ reprints/Reprint-Pai-OLS2004.pdf

2all git commits are accessible from http://git.kernel.org/?p=linux/kernel/git/ torvalds/old-2.6-bkcvs.git Regression Test Framework and Kernel Execution Coverage

Hiro Yoshioka Miracle Linux Corporation [email protected]

Abstract write “There is an incompatibility YYY because of in- troducing function XXX.” We have developed a Linux kernel regression test framework (“crackerjack”) and a branch coverage test The cost of validating such incompatibility for middle- tool (hereinafter “btrax”) to capture kernel regression. ware developers is increasing, therefore we need some Crackerjack is a harness program and a set of test pro- mechanism to find such incompatibility. grams. It runs test programs, records the results, and then compares the expected results. Therefore, if a If you find the incompatibility early, then analysing the particular system call failed in a release and was then issue and fixing it is very easy. fixed in a later release, crackerjack records this as non- favorable because of incompatibility. The btrax is an However, if you find the issue very late, some applica- integrated component of crackerjack and is a tool to as- tions may already make use of this incompatible behav- sess test programs’ effectiveness. It uses Intel proces- ior and therefore you can not change or fix the behavior, sor’s branch trace capability and records how much code even the fix is trivial. was traced by a test program. Crackerjack is initially de- signed for Linux kernel system call testing, but care has Therefore, finding incompatibility very early has practi- been taken to allow future expansion to other types of cal benefit not only for Linux kernel developers but also software. for middleware developers.

1 Introduction 1.2 Regression Testing

1.1 Test Early, Test Often 1.2.1 Necessities of Regression Testing

The Linux kernel development process does not explic- itly define an automated mechanism for maintenance of The word regression has a negative connotation; de- compatibility. Unintended introduction of incompatibil- grade also has a bad image. Incompatibility is not al- ities as the result of a bugfix and/or an upgrade do not get ways a bad thing. We have to have some incompatibility detected in an automated way. The basis of testing in the to introduce new features, bug fixes, and performance Linux kernel development community is predicated on improvements. frequent releases, with feedback and review by a large number of contributors. Therefore, end-users and mid- Although the word regression has negative connotation, dleware developers can find incompatibility problems we will use the word in this paper, because the term “re- only after the release of a kernel. gression test” is commonly used in the testing commu- nity. Developers may introduce new functionality which, in- tentionally or unintentionally, introduced incompatibil- Regression testing is a mechanism to find different (diff) ity but there are few cases which describe the incom- behavior of implementation. Maintaining compatibility patibility explicitly. For example, you may write in the is very important. But it is not always possible, we have change log, “Function XXX is added” but you may not to have some tradeoff.

• 285 • 286 • Regression Test Framework and Kernel Execution Coverage

1.2.2 Automatic Test Development of new features: Verify that the new fea- ture in question does not destroy previous features in an Testing is a boring process but we’d like to change so it incompatible way. If the feature extension was an in- is a fun and easy process. compatible extension, verify the extent of such incom- patibility. Add test program for the new feature to main- We need a systematic way to find a regression of the tain compatibility in the future. Linux Kernel and OSS. Middleware developer Regression testing is an automatic process which runs a test and compares the expected results, then validates These developers should execute a regression test the compatibility of the software. Regression testing is a against a new version of kernel, and if incompatibili- common practice in commercial software developments ties are found, specifically understand the extent of the but not well done in the open source software. incompatiblity. It will be easy to find kernel regression, if middleware tests are prepared and run. As such, we have developed a compatibility testing tool for the Linux kernel interface, the goal of which is to avoid unintended introduction of incompatibilities, and 1.2.4 Non Goals to effectively reduce application development cost by detecting such incompatibilities upfront. This tool is The following are non goals. different from standard certification tests such as , in that the tool can detect incompatibil- ities between a particular version of a kernel and its re- • Certification Tests e.g. POSIX, LSB vised versions. • Performance Tests, e.g. Benchmarks The test tool includes following features. We don’t build certification tests nor performance tests. • Automatically assess kernel behaviors and storing They are non goals. the results. 1.3 Summary of theme • Detect differences between stored results and point out incompatibilities. Develop a regression test framework, the regression • Manage (register, modify, remove) test results and tests, and the test coverage measurement tool, as shown expected test results. in figure 1. Regression test framework is a framework that executes each of the test sets, and is a basis for We will promote the development of this test tool by the framework of this Linux kernel compatibility test- communicating with the Linux kernel community, the ing tool. test community, the North-east Asia OSS Promotion Forum, etc. from an early stage, through the so-called 2 Regression Test Framework - crackerjack bazaar model. Crackerjack is a regression test framework which pro- vides 1) execution of test sets, 2) reporting based on re- 1.2.3 Expected Usage sults of test set execution, and 3) management of test programs, expected results, test sets, and test results. Linux kernel developer It is implemented using Ruby on Rails. Ruby makes it Bugfix of an existing kernel: Verify that the bugfix does easy to modify, ensuring a low maintenance cost. not destroy previous features in an incompatible way. Add a test program for the bug in question and verify Regression testing is defined as testing two builds of that the existing kernel includes the bug and the bugfix software, with emphasis on detection of regression (in- version does not. compatibility of functionality). 2007 Linux Symposium, Volume Two • 287

In Figure 1, the result (Rn) of a test executed on a cer- (btrax). It is outside the scope of crackerjack project, to tain version of software (Vn) is compared against an ex- measure performance or certify compatibility. pected result (En). The initial expectation (E1) is typi- R1 cally identical to . 2.1 Software Functionality V1----- V2----- V3----- V4 R1----- R2----- R3----- R4 The framework provides the following functions. ^ ----- ^ ----- ^ ----- ^ compare | ~~~~~ | ~~~~~ | ~~~~~ | v ----- v ----- v ----- v • Run both GUI (Graphical User Interface) mode and E1----> E2----> E3----> E4 CUI (Character User Interface) mode.

• Execute test programs. Figure 1: Comparison between test results (Rn) and ex- pected results (En) • Compare test execution results and expected re- sults. To detect a regression, the results from two separate builds have to be compared. If a result from a system • Report based on results of test set execution. call can be statically determined to be correct, such a • Manage test programs, expected results, test sets, comparison would return a valid result (OK) and or not test target, and test results. (NG). We can add a comparison routine to determine the results. • Define compare programs. pid = getpid () if (pid > 5000 && pid < 66536) { printf ("OK\n"); The expected end-users of crackerjack are kernel devel- } else { opers, middleware developers, and test program devel- printf ("NG\n"); opers. } return EXIT_SUCCESS For more information, please refer to the Appendix.

Figure 2: Example – Exact Match 3 Branch Tracer for Linux – btrax pid = get_pid printf ("%d\n", pid) 3.1 btrax return EXIT_SUCCESS

We have integrated a branch tracer for Linux (known as Figure 3: Example - Range btrax hereinafter) in the crackerjack to find the execution coverage of regression tests. In the first example above (see Figure 2), the test pro- gram statically determines whether the function call re- This program (btrax) traces the branch executions of the turned a valid result (’OK’) or not (’NG’). In the second target program, analyzes the trace log file, and displays example (see Figure 3), the test program only outputs coverage information and execution path. It’s possible the function call result, for later comparison. to trace it for an application program, a library, a kernel module, and the kernel. The comparison method may vary between test pro- grams. Therefore, it is the test program developer’s re- The btrax consists of the following commands. sponsibility to define a comparison method for that test program. 1. Collect the branch trace log via bt_collect_ log, The objective of the crackerjack system is to perform regression tests. crackerjack consists of (1) framework, 2. Report the coverage information via bt_ (2) a collection of test programs, and (3) branch tracer coverage, 288 • Regression Test Framework and Kernel Execution Coverage

3. Report the execution path via bt_execpath, and call tree would be displayed with other coverage infor- mation (such as function/branch/state coverage). Exam- 4. Split the branch trace log by each process via bt_ ple of the function call tree is shown below. (See Fig- split. ure 16.)

The btrax collects a branch trace information using In- 4 Linux Kernel System Call Test Coverage tel’s last branch record capability. bt_collect_log stores the branch trace log, bt_coverage and bt_ execpath report the branch coverage and the execu- We made test programs which use Linux system calls tion path information respectively. and used the crackerjack with the btrax. We measured the function coverage, the branch coverage, and the state coverage of the each test program execution (Table 1). 3.2 Coverage We selected 50 system call test programs of LTP and The coverage information gathered includes function, measured the execution coverages of them as our bench- branch, and state coverage. Usually, it would be dis- mark. (Table 2) played in address value order. We know the LTP is comprehensive test suite but the The branch tracer (bt_collect_log) collects the execution coverage is not large enough. last branch information using Intel’s last branch record The average function coverage, branch coverage, and capability. state coverage are 41.39%, 23.1%, and 30.94% respec- bt_coverage analyzes the ELF files such as kernel, tively. (Note: 38%, 10%, and 12% of the test programs module, etc. and gets the branch and function-call in- exceeded 50% of function, branch, and state coverage.) formation. Then, it analyzes the traced log(s) with this It is very difficult to increase the execution coverage. information, and displays the coverage information. The following are the reasons. Function coverage displays how many functions were executed among total functions. Displayed functions are • exception/error condition: Making an exception almost compatible with objdump’s output. Note that it program is not easy. Making an Error condition would not show the “function call coverage.” (See Fig- is not easy. Sometimes it is not feasible. ure 12.) • asynchronous processing: For example, making Branch coverage displays how many conditional spin/lock, spin/wait condition is not easy. branches (i.e. both branch and fall-through) were exe- cuted among total ones. (See Figure 13.) • excluding functions from coverage measures: For example, a common routine like printk has a lot of State (basic block) coverage shows how many states execution path. If a given system call uses printk, (basic block) were executed among total states. State it contains a lot of execution path which are not means straight-line piece of code without any jumps covered the test programs. or jump targets in the middle (a.k.a. basic block ). In the previous “switch case” example, if the codes of the “case 0” and “case 2” were both executed three times, We need to exclude some functions from the coverage then each state and coverage would be as follows. (See measurement but it is hard to find such functions. Prun- Figure 15.) ing unnecessary code path is difficult. If a given system call has more than three digits of function calls, it im- A chain of function calls is displayed, for example, func- plies insufficient pruning. tion FA calls function FB, and function FB calls function FC, and so on. If you would like to limit the coverage It is almost impossible to run codepaths for low memory target to the functions which are included in function situations. It is even questionable that such test could call tree, -I option can be used. In this case, the function even run. 2007 Linux Symposium, Volume Two • 289

Func branch state Func branch state System call coverage % coverage % coverage % System call coverage % coverage % coverage % 13 time 100.0 70.0 88.24 13 time 75.00 60.00 82.35 20 getpid 0.00 100.00 0.00 20 getpid 100.00 100.00 100.00 20 getpid (wb) 100.00 100.00 100.00 25 stime 61.54 35.42 47.95 25 stime 69.23 41.67 50.68 27 alarm 53.45 15.58 25.03 27 alarm 14.19 2.50 5.65 30 utime 28.85 9.83 15.27 30 utim (wb) 30.77 10.79 16.73 42 pipe 77.42 44.08 61.32 30 utime 23.56 8.81 13.38 45 brk 13.43 6.76 8.88 42 pipe 67.74 31.91 49.79 66 setsid 23.53 7.92 12.76 42 pipe (wb) 82.26 47.04 65.60 78 gettimeofday 87.50 44.00 67.19 45 brk 23.88 8.96 12.74 79 settimeofday 66.67 32.69 45.77 45 brk (wb) 23.13 9.12 12.53 82 old_select No exist No exist No exist 66 setsid 29.41 12.08 18.10 87 swapon 29.08 11.40 16.80 78 gettimeofday 87.50 52.00 70.31 90 old_mmap 17.41 11.25 13.11 79 settimeofday 77.78 45.19 57.04 91 munmap 21.15 9.45 13.24 82 old_select 3.96 1.78 2.67 92 truncate 19.77 6.21 9.73 87 swapon 19.12 9.05 12.21 93 ftruncate 15.48 4.38 7.22 90 old_mmap 7.14 2.91 3.54 101 ioperm 4.27 1.56 2.02 91 munmap 13.74 5.78 7.66 103 syslog 26.32 13.64 14.13 92 truncate 4.48 1.06 1.77 104 setitimer 48.44 11.01 21.05 93 ftruncate 0.88 0.11 0.20 105 getitimer 67.86 24.65 40.64 101 ioperm 19.78 8.84 10.91 110 iopl 100.00 100.00 84.62 103 syslog 5.45 1.67 2.44 115 swapoff 20.61 5.90 9.75 104 setitimer 3.79 1.44 2.02 116 sysinfo 86.96 40.35 59.06 105 getitimer 1.35 0.45 0.61 120 clone 27.52 11.48 16.80 110 iopl 100.00 50.00 81.82 121 setdomainname 100.00 50.00 71.43 115 swapoff 12.28 4.12 6.18 124 adjtimex 57.14 16.18 26.49 116 sysinfo 15.11 3.31 5.79 125 mprotect 11.53 5.70 7.57 120 clone 19.30 6.79 9.92 142 select 10.88 4.72 6.61 121 setdomainname 7.46 1.88 3.15 144 msync 5.50 2.45 3.25 124 adjtimex 5.67 1.05 1.79 147 getsid 72.73 26.67 48.94 125 mprotect 7.82 3.90 5.28 150 mlock 20.60 8.39 12.23 142 select 3.96 1.78 2.67 151 munlock 5.32 3.12 3.76 144 msync 1.52 0.42 0.62 152 mlockall 17.82 7.08 9.88 147 getsid 72.73 46.67 61.70 153 munlockall 2.31 0.39 0.69 150 mlock 2.08 0.37 0.72 162 nanosleep 58.82 15.16 25.32 151 munlock 1.66 0.31 0.56 163 mremap 17.57 7.68 11.04 152 mlockall 24.12 9.29 12.62 168 poll 6.12 1.86 3.13 153 munlockall 1.66 0.21 0.44 187 sendfile 12.77 3.72 6.05 162 nanosleep 31.76 13.41 18.67 193 truncate64 No exist No exist No exist 163 mremap 13.83 6.21 8.71 194 ftruncate64 No exist No exist No exist 168 poll 2.30 0.95 1.47 203 setreuid 11.76 4.08 5.69 187 sendfile 1.01 0.11 0.19 204 setregid 100.00 96.43 96.43 193 truncate64 4.48 1.06 1.77 205 getgroups 83.33 40.00 60.78 194 ftruncate64 0.88 0.11 0.2 206 setgroups 9.43 2.95 4.55 203 setreuid 1.27 0.23 0.38 208 setresuid 10.92 4.14 5.68 204 setregid 100.00 20.00 46.43 209 getresuid 66.67 50.00 80.00 205 getgroups 1.54 0.25 0.41 210 setresgid 100.00 68.42 96.88 206 setgroups 1.26 0.10 0.23 211 getresgid 66.67 50.00 80.00 208 setresuid 1.27 0.26 0.41 218 mincore 13.57 4.85 7.04 209 getresuid 66.67 33.33 60.00 219 madvise 11.43 4.03 6.15 210 setresgid 100.00 18.42 45.16 211 getresgid 66.67 33.33 60.00 218 mincore 0.62 0.06 0.10 219 madvise 1.60 0.50 0.73

Table 1: Summary of coverage of tests % Table 2: Summary of coverage of tests (LTP)% 290 • Regression Test Framework and Kernel Execution Coverage

In spin lock situations, program code normally runs 5.2 Finding Incompatibility through locked side path. It is hard to prepare condi- tions for competing locks. For portability, middleware developers should prefer- ably not be using implementation-dependent and unde- Making race condition is also hard to test. fined functions. However there are cases when such functions are used without intention. There is a common function called from gettimeofday / settimeofday. In this common function, paths are There is a large hidden cost to avoid unintended intro- uniquely determined by get / set. So other execution duction of incompatibility. It is said that the most costly paths have never been executed. activity in development of commercial software is the maintenance of backward compatibility.

5 Discussion Crackerjack detects not only unintended incompatibili- ties, but also intended incompatibilities. Both are im- portant for notification to middleware developers. It is 5.1 Kernel Tests – Writing a Good Test is Very important to notify the changes in behavior, in a timely Hard manner. We can detect behavior diffs across versions.

For example, the memory range acquired from brk() 5.1.1 Test Coverage is static but the memory range is randomized by turning on the . Crackerjack detects such a specifica- Writing good kernel tests is very difficult. Our data tion change. shows that the execution coverage is low. As discussed above, it is very hard to increase the execution coverage. We can write user-defined compare program not only simple OK/NG but allows some statistical allowance.

5.1.2 man Pages are Not Enough 6 Related Work

6.1 LTP Some man pages do not have enough detail information. For example, utime does not have a description of error return values. LTP (Linux Test Project) is a set of Linux test programs which includes the Linux kernel test, stress tests, bench- If we don’t have clear definition, we can not determine mark tests, and so on. LTP is a comprehensive test suite if the test result is OK or NG. but it is not intended to be a regression test suite. The Linux kernel test validates the POSIX definition of the kernel therefore it does not cover features like im- 5.1.3 Behavior of 2.4 vs. 2.6 plementation defined, implementation dependent, and or undefined functionality.

2.6 introduced more strict parameter checking. We dis- On the other hand, crackerjack is a test harness and tries covered a condition where 2.4 did not report errors but to capture all execution behavior and difference between 2.6 did. (We discovered implementation-dependent in- each version. Such behavior includes not only standard compatibility.) features but also implementation defined, implementa- tion dependent, and undefined by the standard. For example, settimeofday second had loose error checking, but 2.6 introduced more strict error checking. Crackerjack finds the diff of implementation behavior.

We think fixing a bug is important but changing behav- We can integrate crackerjack with the LTP. For example, ior may introduce other incompatibility. So we need to adding some regression tests to LTP and invoke them know as early as possible to assess it. from crackerjack. 2007 Linux Symposium, Volume Two • 291

6.2 gcov/lcob 7.3 Crackerjack

The gcov/lcob is a coverage tool. There is a kernel patch A richer set of default compare programs for crackerjack to measure the test coverage of the Linux kernel. The is needed. Today we have to add compare programs if gcov uses gcc to get the coverage data but you need to the default does not match your needs. patch the kernel. We need methodology to relieve the test program devel- The btrax uses Intel’s last branch record capability and opers. It might be a test pattern, convention or environ- you don’t need any patch nor rebuild kernel. So you can ment. measure your standard kernel without rebuilding kernel. Crackerjack has to record system information, environ- ment, and reproducible information. 6.3 autotest Current development focuses on regression test frame- work and test coverage tool. Extending the test to all of The autotest is a harness program of invoking several Linux kernel functions, and sustained execution of re- tests program including LTP, benchmark programs and gression tests, would wait until the next project. so on.

We can plug crackerjack into autotest. 7.4 Community Activity

Our development is supported by a working group of 7 Future Work Japan OSS promotion forum which consists of private sector and public sector. There is a collaboration with 7.1 Kernel Development Process China and Korea groups too. We’d like to expand this activity with the Linux and test We believe finding incompatibility in an early stage is community. very important and all of us can get much benefit from it. 8 Acknowledgements It is good thing to run regression tests on every kernel release. Adding this practice into the kernel develop- This project is supported by the Information Technology ment process is our big challenge. We need to show our Promotion Agency (IPA), Japan. benefit to the kernel community and convince them to use it. We would like to thank my colleagues and their con- tributions. Satoshi Fujiwara (Hitachi) implements the btrax. Takahiro Yasui (Hitachi) and Masato Taruishi 7.2 Expanding the Area (Red Hat KK) wrote kernel tests and measured the Linux kernel execution and the branch coverage. Kazuo Yagi (Miracle Linux) implements the crackerjack. The concept of regression testing is very simple, just find the diff of the implementation between releases. The project team: Hitachi team; Satoshi Fujiwara Crackerjack can be used on other types of software in- (btrax), Takahiro Yasui (kernel test programs), Hisashi terfaces, for example, the /proc file system, glibc, and Hashimoto, Yumiko Sugita, and Tomomi Suzuki. so on. Miracle Linux team; Kazuo Yagi (crackerjack and ker- The system calls are very stable and we don’t see much nel test programs), Ryo Yanagiya, and Hiro Yoshioka. incompatibility in them. However, /proc file system is much more flexible, so we may easily find some incom- Red Hat KK team; Masato Taruishi (kernel test pro- patibility. grams) and Toshiyuki Takamiya. 292 • Regression Test Framework and Kernel Execution Coverage

References $ su - Password: # cd /usr/src/crackerjack/trunk/crackerjack/ [1] Linux Test Project # ./crackerjack http://ltp.sourceforge.net/ crackerjack>h number push test program to stack [2] autotest d Delete(pop) stack http://test.kernel.org/autotest/ e register test program result on stack as Expected result h help [3] ABAT http://test.kernel.org l List the test programs p Print stack x eXecute test program on stack [4] Test Tools Wiki(OSS Testing Summit) http://developer.osdl.org/dev/test_ Figure 5: Invoke crackerjack tools/index.php/Main_Page crackerjack>l 0000) access [Eric Raymond] The Cathedral and the Bazaar 0001) adjtimex 0002) adjtimex/whitebox http://www.catb.org/~esr/writings/ 0003) alarm cathedral-bazaar/cathedral-bazaar ...

[5] Ruby on Rails http://www.rubyonrails.org/ Figure 6: List the Tests Command

A.2 Invoke crackerjack Appendix:

Become the root user and run crackerjack. A crackerjack – Getting Started Read help with ’h’ command. (See Figure 5)

A.1 Get the Code and make List the test programs with ’l’ command. (See Figure 6)

Push the test programs to the stack with “number” which The latest source code is always available at corresponds to the test program you want. You can look https://crackerjack.svn.sourceforge. the contents of stack with p command, and pop the test net/svnroot/crackerjack/ programs from stack with d command.

You can get it by Subversion. For example, see the Fig- Execute the test program on stack with the ’x’ com- ure 4 “Getting Source code and make.” mand. (See Figure 7.)

$ svn co \ https://crackerjack.svn.sourceforge.net/svnroot/crackerjack/ A crackerjack/trunk Quit the crackerjack with the ’q’ command. (See Figure A crackerjack/trunk/crackerjack (...snip...) 8.) Checked out revision 477.

$ make (...snip...) A.3 Run in Non-interactive way

Figure 4: Getting Source Code and make You can execute test programs from a file which indi- cates the order of tests (order file). (See Figure 9.) If a test program has compiler errors, then try make -k. Some system calls are not available in old Linux You can create an order-file using the ’l’ command sim- kernels, for example, 2.6.9 does not have mkdirat, mkn- ilar to the following then execute the test. (See Figure odat etc. 10.) 2007 Linux Symposium, Volume Two • 293 crackerjack>1 # ./crackerjack -l > m.order # ./crackrjack -x m.order 1 crackerjack>2 1 2 Figure 10: crackerjack Non-interactive with ’l’ and ’x’ crackerjack>3 Commands 1 2 3 # ./crackerjack-gui-server crackerjack>4 1 2 3 4 crackerjack>5 Figure 11: crackerjack GUI Mode 1 2 3 4 5 crackerjack>x Action SystemCallName Id 1. Collect the branch trace log via bt_collect_ X adjtimex 20070412133111 log, X adjtimex/whitebox 20070412133111 X alarm 20070412133111 X alarm/whitebox 20070412133111 2. Report the coverage information via X brk/basic 20070412133111 bt_coverage,

Figure 7: Execute Tests 3. Report the execution path via bt_execpath, crackerjack>q # 4. Split the branch trace log by each process via bt_ split.

Figure 8: Quit with the ’q’ Command The btrax collects a branch trace information using In- tel’s last branch record capability. bt_collect_log A.4 GUI mode stores the branch trace log, bt_coverage and bt_ execpath report the branch coverage and the execu- Invoke the GUI server similar to the following then use a tion path information respectively. web browser to access localhost:3000. (See Figure 11.)

You may need to install the Ruby on Rails. B.2 Coverage

B btrax – Getting Started The coverage information includes function coverage, branch coverage and state coverage. Usually, it would be displayed in address value order. B.1 Basic Concept

This program (btrax) traces the branch executions of the B.2.1 bt_coverage target program, analyzes the trace log file, and displays coverage information and execution path. It’s possible bt_coverage analyzes the ELF files such as kernel, to trace it for an application program, a library, a kernel module, etc. and get the branch and function-call in- module, and the kernel. formation. Then, it analyzes the traced log(s) with this The btrax consists of the following commands. information, and displays the coverage information. bt_coverage # ./crackerjack -h tries to show the source information USGE: crackerjack [option] [FILE] such as source file name and line number, but if there -l list the test programs -x FILE execute the test program on reading order from file is no debug information in the ELF file, it shows only -e FILE register the current result as the expected result -c result_kerv,result_id,expected_kerv,expected_id the address value. compare the current result to the expected result -b execute with btrax, avaiable with -e option -h show this help -v show version You can check whether the source code was executed or not by using html output (for this, it needs debug infor- mation in the ELF file and the source code files). Note Figure 9: crackerjack Non-interactive Mode that inline functions and macros would not be colorized 294 • Regression Test Framework and Kernel Execution Coverage

(example source code) correctly in the html files. It is because they were ex- Address C Assembler 1: if (xxx) jxx LABEL panded to the other functions and could not be found in 2: aaa; aaa 3: bbb; LABEL: bbb the ELF file. (coverage counting for each case)

branch fall- coverage symbols You can also compare the two log files’ kernel coverage (1->3) through counting (1->2) by generating html files. Even if the log files were gen- not executed not executed 0/2=0.00% NT erated on the different kernels, bt_coverage can still not executed executed 1/2=50.00% HT executed not executed 1/2=50.00% HT compare them. executed executed 2/2=100.00% OK

(if branch was executed 3 times, and fall-through was executed 1 time, then coverage output would

be. . . )

B.2.2 Function Coverage ---- branch coverage (OK:1,HT:0,NT:0/2=100.00% UK:0/0=100.00%)

(OK) 1 [3/1] 3:2 // 1->3 3 times, and 1->2 1 time executed It displays how many functions were executed among total functions. Displayed functions are almost compat- Figure 13: Branch Coverage ible with objdump’s output. Note that it would not show (example source code) Address C Assembler the “function call coverage.” For example, refer to the ------1: switch (xxx) { jmp *(%eax) below chart (‘//’ means comment). (See Figure 12.) 2: case 0: aaa; aaa 3: break; jmp LABEL 4: case 1: bbb; bbb (example source code) 5: break; jmp LABEL funcA(); // executed once 6: case 2: ccc; ccc funcB(); // executed once 7: break; LABEL: 8: } ... funcB(); ... funcB(); Figure 14: Branch Coverage funcC();

(then, coverage output would be...) ---- branch coverage (OK:0,HT:0,NT:0/0=100.00% UK:0/1=0.00%) ---- function coverage (2/3=66.67%) ---- (UN) 1 [0/x] ------:xxxxxxxxxxx // 1 jumps nowhere (OK) (1) // funcA executed once (OK) (1) // funcB executed once (NT) (0) // funcC not executed Or, if the codes of the “case 0” and “case 2” were both executed 3 times, then branch coverage would be as fol- Figure 12: Function Coverage lows.

---- branch coverage (OK:0,HT:0,NT:0/0=100.00% UK:1/1=100.00%) (UT) 1 [3/x] 2:xxxxxxxxxxx // 1->2 executed 3 times B.2.3 Branch Coverage (UT) 1 [3/x] 6:xxxxxxxxxxx // 1->6 executed 3 times

It displays how many conditional branches (i.e. both In the coverage output, you cannot check that how many branch and fall-through) were executed among total branches were executed for the unknown branches. ones. For example, branch coverage counting for each But each case block’s execution information would be coverage case is as follows. (See Figure 13.) showed in state coverage. Although you can check that There is a case that the branch address would be de- in html output. termined by indirect addressing such as “switch case” code. In this case, it is impossible to know the branch B.2.4 State (Basic Block) Coverage addresses and number of these by analyzing the ELF file. (See Figure 14.) It shows how many states (basic block) were executed We call this kind of branch as “unknown branch.” Un- among total states. State means straight-line piece of known branches are counted whether the branch was code without any jumps or jump targets in the middle executed or not, and are counted separate from normal a.k.a. basicblock. In the previous ’switch case’ exam- branches. In this example, if all of the switch case codes ple, if the codes of the “case 0” and “case 2” were were not executed, branch coverage would be as fol- both executed three times, then each state and coverage lows. would be as follows. (See Figure 15.) 2007 Linux Symposium, Volume Two • 295

(example source code) Address C Assembler “function coverage,” and each function’s information is ------1: switch (xxx) { jmp *(%eax) sorted by this value. ------// state border 2: case 0: aaa; aaa 3: break; jmp LABEL ------There is the function call whose address would be de- 4: case 1: bbb; bbb termined by indirect addressing such as “function call 5: break; jmp LABEL ------by using function pointer.” In this case, it is impossible 6: case 2: ccc; ccc ------to know the function address by analyzing the ELF file. 7: break; LABEL: 8: } So, this kind of functions would not be displayed in the (coverage output would be...) function tree. Example of this kind of function call is ------state coverage (4/5=80.00%) ------(OK) 1 // state from address 1 was executed shown below. (OK) 2 (NT) 4 // state from address 4 was not executed (OK) 6 (OK) 7 (example source code) Address C Assembler ------1: int call_function(void (*func),,, ) 2: { Figure 15: State (basic block) Coverage 3: func(); call *0x84(%edx) 4: }

B.2.5 Function Call Tree If you would like to include this kind of function to It means the chain of function call, for example, function the coverage target, you must check the source code FA calls function FB, and function FB calls function FC, if there is no information. To improve this situation, and so on. If you would like to limit the coverage tar- (bt_coverage) also analyzes the trace log(s) and get to the functions which are included in function call shows the functions which were executed by indirect ad- tree, -I option can be used. In this case, function call dressing. This kind of function is displayed with (UT) tree would be displayed with other coverage informa- mark (we call it “function including guidance”). Read tion (such as function/branch/state coverage). Example the man page for more information. of the function call tree is shown below. (See Figure 16.)

==== includes: sys_open ======B.3 Get the Code, Build and Install ==== excludes: schedule,printk,dump_stack,panic,show_mem ======function tree (59/498=11.85%) ======(OK) :fs/open.c,1101 (3, F:498) (OK) +-:fs/open.c,1079 (3, F:497) (OK) +-+-:fs/open.c,874 (3, F:196) The project home page is the following, Download the (OK) +-+-+-:fs/open.c,942 (3, F:4) (OK) +-+-+-+-<__dentry_open>:fs/open.c,799 (3, F:3) tarball from the project page. (NT) +-+-+-+-+-:kernel/sched.c,1521 (0, F:1) //... snip ... (UT) :fs/open.c,1216 (2) http://sourceforge.net/projects/ (UT) :security/dummy.c,324 (1) //... snip ... btrax/

Read the README and follow the instruction. (See Figure 16: Function Call Tree Figure 17.)

In this example, we can see that the sys_open was ex- B.4 Checking System Call Coverage ecuted (it is seen as ‘OK’) 3 times and it contains 498 functions including itself (it is seen as (3, F:498)). If the function was already displayed, then it is dis- In order to determine the correctness of our regression played with (--) symbol instead of (OK) or (NT). test, we measure the execution branch coverage of tests. The following subsections show the steps to take to mea- A function call tree may contain unnecessary functions sure the branch coverage using the btrax. for the user. In this case, it could be excluded by us- ing -E option. You can also check that if it would be excluded, then how many functions would be decreased B.4.1 Create a program calling system call by using “F:N” information (we call it “function exclud- ing guidance”) in the function call tree. Create a regression test which uses a system call(s). Function excluding guidance is also displayed in the (You can use LTP as an example.) 296 • Regression Test Framework and Kernel Execution Coverage

Build and Install. # bt_coverage --ker -f \ ‘echo $(ls cpu*)|sed ’s/\s\+/,/g’‘ \ $ cd $(WHERE_YOUR_WORK_DIRECTORY) -I $(syscall_name) $ tar jxvf btrax-XXX.tar.bz2 $ cd btrax-XXX

Install the command. Figure 20: Checking Call coverage

$ make # bt_coverage --ker -f \ $ su (input super-user password here) ‘echo $(ls cpu*)|sed ’s/\s\+/,/g’‘ \ # make install -I $(syscall_name) \ -E schedule,dump_stack,printk Create relayfs mount point.

# mkdir /mnt/relay Figure 21: Excluding functions

# bt_coverage --ker -f \ ‘echo $(ls cpu*)|sed ’s/\s\+/,/g’‘ \ Figure 17: Build and Install -I $(syscall_name) \ -E schedule,dump_stack,printk \ -o $(HTML_OUT_DIR) -S $(KERN_SRC_DIR) # (mozilla) file://$(HTML_OUT_DIR)/top.html B.4.2 Start Branch Trace

Figure 22: Excluding functions Start branch trace. (See Figure 18.) Note that syscall_name must be defined in the kernel’s sys_ B.4.4 Compare System Call Coverage call_table.

# bt_collect_log --syscall \ # bt_coverage --ker \ $(syscall_name) -d $(ODIR)\ -I $(syscall_name)\ -c $(program) -E schedule,dump_stack,printk \ -o $(HTML_OUT_DIR)\ -S $(KERN_SRC_DIR)\ -f ‘echo $(log1/cpu*)|sed ’s/\s\+/,/g’‘ \ Figure 18: Start branch trace --f2 ‘echo $(log2/cpu*)|sed ’s/\s\+/,/g’‘ # (mozilla) file://$(HTML_OUT_DIR)/top.html

Figure 23: Comparing the system call coverage B.4.3 Checking Coverage To compare same kernel’s system call coverage, do the next command line which uses the -S, -f and –f2 op- tions. In this example, each log directory is log1 and To check the system call coverage summary, do the next log2. (See Figure 23.) command. (See Figure 19.)

# cd $(ODIR) # bt_coverage --ker\ # bt_coverage --ker -f \ -I $(syscall_name)\ -E schedule,dump_stack,printk \ ‘echo $(ls cpu*)|sed ’s/\s\+/,/g’‘ \ -I $(syscall_name) -s -o $(HTML_OUT_DIR)\ -u uname_r1 --u2 uname_r2\ -S src1 --S2 src2 \ -f ‘echo $(log1/cpu*)|sed ’s/\s\+/,/g’‘\ Figure 19: Checking coverage --f2 ‘echo $(log2/cpu*)|sed ’s/\s\+/,/g’‘ # (mozilla) file://$(HTML_OUT_DIR)/top.html

To check the system call coverage detail, do the next Figure 24: Comparing the system call coverage command. (See Figure 20.) If you had traced different kernel’s system calls, you can If you want to exclude the function(s) from coverage also compare these logs. To compare the different ker- result, use the -E option. (See Figure 21.) To check nel’s system call coverage, use the next command line whether the code in that system call was executed or which also employs the -u option. In this example, each not, do the next. (See Figure 22.) Note that html files are log directory is log1 and log2, each kernel version is using the Javascript. If you have some trouble browsing, uname_r1 and uname_r2, and each kernel source di- check the Javascript setting. rectory is src1 and src2. (See Figure 24.) Enable PCI Express Advanced Error Reporting in the Kernel

Yanmin Zhang and T. Long Nguyen Intel Corporation [email protected], [email protected]

Abstract Express, it is far easier for individual developers to get such a machine and add error recovery code into specific PCI Express is a high-performance, general-purpose I/O device drivers. Interconnect. It introduces AER (Advanced Error Re- porting) concepts, which provide significantly higher re- liability at a lower cost than the previous PCI and PCI-X 2 PCI Express Advanced Error Reporting standards. The AER driver of the Linux kernel provides Driver a clean, generic, and architecture-independent solution. As long as a platform supports PCI Express, the AER 2.1 PCI Express Advanced Error Reporting Topol- driver shall gather and manage all occurred PCI Express ogy errors and incorporate with PCI Express device drivers to perform error-recovery actions. To understand the PCI Express Advanced Error Report- This paper is targeted toward kernel developers inter- ing Driver architecture, it helps to begin with the ba- ested in the details of enabling PCI Express device sics of PCI Express Port topology. Figure 1 illustrates drivers, and it provides insight into the scope of imple- two types of PCI Express Port devices: the Root Port menting the PCI Express AER driver and the AER con- and the Switch Port. The Root Port originates a PCI formation usage model. Express Link from a PCI Express Root Complex. The Switch Port, which has its secondary bus representing 1 Introduction switch internal routing logic, is called the Switch Up- stream Port. The Switch Port which is bridging from Current machines need higher reliability than before and switch internal routing buses to the bus representing need to recover from failure quickly. As one of failure the downstream PCI Express Link is called the Switch causes, peripheral devices might run into errors, or go Downstream Port. Each PCI Express Port device can crazy completely. If one device is crazy, device driver be implemented to support up to four distinct services: might get bad information and cause a kernel panic: the native hot plug (HP), power management event (PME), system might crash unexpectedly. advanced error reporting (AER), virtual channels (VC).

As a matter of fact, IBM engineers (Linas Vepstas and The AER driver development is based on the service others) created a framework to support PCI error re- driver framework of the PCI Express Port Bus Driver covery procedures in-kernel because IBM Power4 and design model [3]. As illustrated in Figure 2, the PCI Power5-based pSeries provide specific PCI device er- Express AER driver serves as a Root Port AER service ror recovery functions in platforms [4]. However, this driver attached to the PCI Express Port Bus driver. model lacks the ability to support platform indepen- dence and is not easy for individual developers to get a Power machine for testing these functions. The PCI 2.2 PCI Express Advanced Error Reporting Driver Express introduces the AER, which is a world standard. Architecture The PCI Express AER driver is developed to support the PCI Express AER. First, any platform which supports PCI Express error signaling can occur on the PCI Ex- the PCI Express could use the PCI Express AER driver press link itself or on behalf of transactions initiated on to process device errors and handle error recovery ac- the link. PCI Express defines the AER capability, which cordingly. Second, as lots of platforms support the PCI is implemented with the PCI Express AER Extended

• 297 • 298 • Enable PCI Express Advanced Error Reporting in the Kernel

Root Port Root Complex CPU

Root Root Interrupt Port Port Switch Upstream Root Complex Port Up Port Switch Root Downstream Port PCI Express Switch Port Down Down Port Port Up Port Switch Figure 1: PCI Express Port Topology Down Down Port Port

Error Message PBD

End Point Root Complex PMErs AERrs HPrs VCrs Root Root HPrs VCrs Port Port PMErs AERrs Figure 3: PCI Express Error Reporting procedures

Claim

When errors happen, the PCI Express AER driver could AER Port Service Driver provide such infrastructure with three basic functions:

Figure 2: AER Root Port Service Driver • Gathers the comprehensive error information if er- rors occurred.

• Performs error recovery actions. Capability Structure, to allow a PCI Express compo- nent (agent) to send an error reporting message to the • Reports error to the users. Root Port. The Root Port, a host receiver of all error messages associated with its hierarchy, decodes an er- ror message into an error type and an agent ID and then 2.2.1 PCI Express Error Introduction logs these into its PCI Express AER Extended Capabil- ity Structure. Depending on whether an error reporting Traditional PCI devices provide simple error reporting message is enabled in the Root Error Command Reg- approaches, PERR# and SERR#. PERR# is parity error, ister, the Root Port device generates an interrupt if an while SERR# is system error. All non-PERR# errors are error is detected. The PCI Express AER service driver SERR#. PCI uses two independent signal lines to rep- is implemented to service AER interrupts generated by resent PERR# and SERR#, which are platform chipset- the Root Ports. Figure 3 illustrates the error report pro- specific. As for how software is notified about the errors, cedures. it totally depends on the specific platforms.

Once the PCI Express AER service driver is loaded, it To support traditional error handling, PCI Express pro- claims all AERrs service devices in a system device hi- vides baseline error reporting, which defines the basic erarchy, as shown in Figure 2. For each AERrs service error reporting mechanism. All PCI Express devices device, the advanced error reporting service driver con- have to implement this baseline capability and must map figures its service device to generate an interrupt when required PCI Express error support to the PCI-related an error is detected [3]. error registers, which include enabling error reporting 2007 Linux Symposium, Volume Two • 299 and setting status bits that can be read by PCI-compliant trol Protocol Errors, Completion Time-out Errors, Com- software. But the baseline error reporting doesn’t define pleter Abort Errors, Unexpected Completion Errors, Re- how platforms notify system software about the errors. ceiver Overflow Errors, Malformed TLPs, ECRC Er- rors, and Unsupported Request Errors. When an un- PCI Express errors consist of two types, correctable er- correctable error occurs, the corresponding bit within rors and uncorrectable errors. Correctable errors include the Advanced Uncorrectable Error Status register is set those error conditions where the PCI Express protocol automatically by hardware and is cleared by software can recover without any loss of information. A cor- when writing a “1” to the bit position. Advanced error rectable error, if one occurs, can be corrected by the handling permits software to select the severity of each hardware without requiring any software intervention. error within the Advanced Uncorrectable Error Severity Although the hardware has an ability to correct and re- register. This gives software the opportunity to treat er- duce the correctable errors, correctable errors may have rors as fatal or non-fatal, according to the severity asso- impacts on system performance. ciated with a given application. Software could use the Uncorrectable errors are those error conditions that im- Advanced Uncorrectable Mask register to mask specific pact functionality of the interface. To provide more errors. robust error handling to system software, PCI Express further classifies uncorrectable errors as fatal and non- fatal. Fatal errors might cause corresponding PCI Ex- 2.2.2 PCI Express AER Driver Designed To Handle press links and hardware to become unreliable. System PCI Express Errors software needs to reset the links and corresponding de- vices in a hierarchy where a fatal error occurred. Non- Before kernel 2.6.18, the Linux kernel had no root port fatal errors wouldn’t cause PCI Express link to become AER service driver. Usually, the BIOS provides basic unreliable, but might cause transaction failure. System error mechanism, but it couldn’t coordinate correspond- software needs to coordinate with a device agent, which ing devices to get more detailed error information and generates a non-fatal error, to retry any failed transac- perform recovery actions. As a result, the AER driver tions. has been developed to support PCI Express AER en- abling for the Linux kernel. PCI Express AER provides more reliable error report- ing infrastructure. Besides the baseline error reporting, PCI Express AER defines more fine-grained error types and provides log capability. 
Devices have a header log 2.2.2.1 AER Initialization Procedures register to capture the header for the TLP corresponding When a machine is booting, the system allocates in- to a detected error. terrupt vector(s) for every PCI Express root port. To service the PCI Express AER interrupt at a PCI Express Correctable errors consist of receiver errors, bad TLP, root port, the PCI Express AER driver registers its in- bad DLLP, REPLAY_NUM rollover, and replay timer terrupt service handler with Linux kernel. Once a PCI time-out. When a correctable error occurs, the corre- Express root port receives an error reported from the sponding bit within the advanced correctable error status downstream device, that PCI Express root port sends an register is set. These bits are automatically set by hard- interrupt to the CPU, from which the Linux kernel will ware and are cleared by software when writing a “1” call the PCI Express AER interrupt service handler. to the bit position. In addition, through the Advanced Correctable Error Mask Register (which has the similar Most of AER processing work should be done under bitmap like advanced correctable error status register), a process context. The PCI Express AER driver cre- a specific correctable error could be masked and not be ates one worker per PCI Express AER root port virtual reported to root port. Although the errors are not re- device. Depending on where an AER interrupt occurs ported with the mask configuration, the corresponding in a system hierarchy, the corresponding worker will be bit in advanced correctable error status register will still scheduled. be set. Most BIOS vendors provide a non-standard error pro- Uncorrectable errors consist of Training Errors, Data cessing mechanism. To avoid conflict with BIOS while Link Protocol Errors, Poisoned TLP Errors, Flow Con- handling PCI Express errors, the PCI Express AER 300 • Enable PCI Express Advanced Error Reporting in the Kernel driver must request the BIOS for ownership of the PCI AER Driver Express AER via the ACPI _OSC method, as specified in PCI Express Specification and ACPI Specification. If 1) Get Source ID/ Error Type, the BIOS doesn’t support the ACPI _OSC method, or Clear Root Status the ACPI _OSC method returns errors, the PCI Express AER driver’s probe function will fail (refer to Section 3 Root Complex for a workaround if the BIOS vendor does not support Root the ACPI _OSC method). Port

Once the PCI Express AER driver takes over, the BIOS 2) Get Detailed Error Type, must stop its activities on PCI Express error processing. Clear Correctable Error Status The Express AER driver then configures PCI Express AER capability registers of the PCI Express root port and specific devices to support PCI Express native AER. End Point: E1

2.2.2.2 Handle PCI Express Correctable Errors Figure 4: Procedure to Process Correctable Errors

Because a correctable error can be corrected by the hard- ware without requiring any software intervention, if one AER Driver occurs, the PCI Express AER driver first decodes an er- 3) Error Recovery ror message received at PCI Express root port into an er- 1) Get Source ID/ End Point Error Type ror type and an agent ID. Second, the PCI Express AER Driver driver uses decoded error information to read the PCI Clear root status Express AER capability of the agent device to obtain Root Complex more details about an error. Third, the PCI Express AER driver clears the corresponding bit in the correctable er- Root ror status register of both PCI Express root port and the Port agent device. Figure 4 illustrates the procedure to pro- 2) Get Detailed Error cess correctable errors. Last but not least, the details Type and Log about an error will be formatted and output to the sys- tem console as shown below: End Point: E1

+—— PCI-Express Device Error —–+ Figure 5: Procedures to Process Non-Fatal Errors Error Severity : Corrected PCIE Bus Error type : Physical Layer Receiver Error : Multiple The first two steps are like the ones to process cor- Receiver ID : 0020 rectable errors. During Step 2, the AER driver need to VendorID=8086h, DeviceID=3597h, Bus=00h, Device=04h, retrieve the packet header log from the agent if the error Function=00h is TLP-related.

The Requester ID is the ID of the device which reports Below is an example of non-fatal error output to the sys- the error. Based on such information, an administrator tem console. could find the bad device easily. +—— PCI-Express Device Error ——+ Error Severity : Uncorrected (Non-Fatal) 2.2.2.3 Handle PCI Express Non-Fatal Errors PCIE Bus Error type : Transaction Layer If an agent device reports non-fatal errors, the PCI Completion Timeout : Multiple Express AER driver uses the same mechanism as de- Requester ID : 0018 scribed in Section 2.2.2 to obtain more details about an VendorID=8086h, DeviceID=3596h, Bus=00h, Device=03h, error from an agent device and output error information Function=00h to the system console. Figure 5 illustrates the procedure to process non-fatal errors. Unlike correctable errors, non-fatal errors might cause 2007 Linux Symposium, Volume Two • 301 some transaction failures. To help an agent device driver callbacks of the relevant drivers. In the resume func- to retry any failed transactions, the PCI Express AER tions, drivers could resume operations to the devices. driver must perform a non-fatal error recovery proce- error_detected PCI_ERS_ dure, which depends on where a non-fatal error occurs If an callback returns in a system hierarchy. As illustrated in Figure 6, for RESULT_NEED_RESET, the recovery procedure will call slot_reset example, there are two PCI Express switches. If end- all callbacks of relevant drivers. If slot_reset PCI_ERS_RESULT_ point device E2 reports a non-fatal error, the PCI Ex- all functions return press AER driver will try to perform an error recovery CAN_RECOVER, the resume callback will be called to procedure only on this device. Other devices won’t take finish the recovery. Currently, some device drivers pro- err_handler part in this error recovery procedure. If downstream port vide callbacks. For example, Intel’s P1 of switch 1 reports a non-fatal error, the PCI Express E100 and E1000 network card driver and IBM’s POWER AER driver will do error recovery procedure on all de- RAID driver. vices under port P1, including all ports of switch 2, end The PCI Express AER driver outputs some information point E1, and E2. about non-fatal error recovery steps and results. Below is an example. Root Complex

Root +—— PCI-Express Device Error —–+ Port Error Severity : Uncorrected (Non-Fatal) PCIE Bus Error type : Transaction Layer Up Unsupported Request : First Port Switch 1 Requester ID : 0500 Down VendorID=14e4h, DeviceID=1659h, Bus=05h, Device=00h, Port: P1 Function=00h TLB Header: 04000001 0020060f 05010008 00000000 Up Broadcast error_detected message Port Switch 2 Broadcast slot_reset message Down Down Broadcast resume message Port Port tg3: eth3: Link is down. AER driver successfully recovered End Point: E1 End Point: E2

Figure 6: Non-Fatal Error Recovery Example 2.2.2.4 Handle PCI Express Fatal Errors When processing fatal errors, the PCI Express AER driver also collects detailed error information from the To take part in the error recovery procedure, specific de- reporter in the same manner as described in Sections vice drivers need to implement error callbacks as de- 2.2.2.2 and 2.2.2.3. Below is an example of non-fatal scribed in Section 4.1. error output to the system console:

When an uncorrectable non-fatal error happens, the +—— PCI-Express Device Error ——+ AER error recovery procedure first calls the error_ Error Severity : Uncorrected (Fatal) detected routine of all relevant drivers to notify their PCIE Bus Error type : Transaction Layer devices run into errors by the deep-first sequence. In Unsupported Request : First the callback error_detected, the driver shouldn’t Requester ID : 0200 operate the devices, i.e., do not perform any I/O on the VendorID=8086h, DeviceID=0329h, Bus=02h, Device=00h, devices. Mostly, error_detected might cancel all Function=00h pending requests or put the requests into a queue. TLB Header: 04000001 00180003 02040000 00020400 If the return values from all relevant error_ detected routines are PCI_ERS_RESULT_CAN_ When performing the error recovery procedure, the ma- RECOVER, the AER recovery procedure calls all resume jor difference between non-fatal and fatal is whether 302 • Enable PCI Express Advanced Error Reporting in the Kernel the PCI Express link will be reset. If the return val- driver specific */ ues from all relevant error_detected routines are pci_ers_result_t (*reset_link) (struct PCI_ERS_RESULT_CAN_RECOVER, the AER recovery pci_dev *dev); procedure resets the PCI Express link based on whether ... the agent is a bridge. Figure 7 illustrates an example. };

Root Complex If a port uses a vendor-specific approach to reset link, its Root Port: P0 AER port service driver has to provide a reset_link function. If a root port driver or downstream port ser- vice driver doesn’t provide a reset_link function, Up Port: P1 the default reset_link function will be called. If Switch an upstream port service driver doesn’t implement a

Down Down reset_link function, the error recovery will fail. Port: P2 Port: P3 Below is the system console output example printed by

End Point: E1 the PCI Express AER driver when doing fatal error re- covery.

Figure 7: Reset PCI Express Link Example +—— PCI-Express Device Error —–+ Error Severity : Uncorrected (Fatal) PCIE Bus Error type : (Unaccessible) In Figure 7, if root port P0 (a kind of bridge) reports a Unaccessible Received : First fatal error to itself, the PCI Express AER driver chooses Unregistered Agent ID : 0500 to reset the upstream link between root port P0 and up- Broadcast error_detected message stream port P1. If end-point device E1 reports a fatal Complete link reset at Root[0000:00:04.0] error, the PCI Express AER driver chooses to reset the Broadcast slot_reset message upstream link of E1, i.e., the link between P2 and E1. Broadcast resume message The reset is executed by the port. If the agent is a port, tg3: eth3: Link is down. the port will execute reset. If the agent is an end-point AER driver successfully recovered device, for example, E1 in Figure 7, the port of the up- stream link of E1, i.e., port P2 will execute reset. 2.3 Including PCI Express Advanced Error Re- porting Driver Into the Kernel The reset method depends on the port type. As for root port and downstream port, the PCI Express Specifica- tion defines an approach to reset their downstream link. The PCI Express AER Root driver is a Root Port ser- In Figure 7, if port P0, P2, P3, and end point E1 report vice driver attached to the PCI Express Port Bus driver. fatal errors, the method defined in PCI Express Specifi- Its service must be registered with the PCI Express Port cation will be used. The PCI Express AER driver im- Bus driver and users are required to include the PCI Ex- plements the standard method as default reset function. press Port Bus driver in the kernel [5]. Once the ker- nel configuration option CONFIG_PCIEPORTBUS is in- There is no standard way to reset the downstream cluded, the PCI Express AER Root driver is automati- link under the upstream port because different switches cally included as a kernel driver by default (CONFIG_ might implement different reset approaches. To facili- PCIEAER = Y). tate the link reset approach, the PCI Express AER driver adds reset_link, a new function pointer, in the data 3 Impact to PCI Express BIOS Vendor structure pcie_port_service_driver. struct pcie_port_service_driver { Currently, most BIOSes don’t follow PCI FW 3.0 to ... support the ACPI _OSC handler. As a result, the PCI /* Link Reset Capability - AER service Express AER driver will fail when calling the ACPI 2007 Linux Symposium, Volume Two • 303 control method _OSC. The PCI Express AER driver 4.2 Device driver helper functions provides a current workaround for the lack of ACPI BIOS _OSC support by implementing a boot param- To communicate with device AER capabilities, drivers eter, forceload=y/n. When the kernel boots with need to access AER registers in configuration space. It’s parameter aerdriver.forceload=y, the PCI Ex- easy to write incorrect code because they must access/ press AER driver still binds to all root ports, which im- change the bits of registers. To facilitate driver program- plements the AER capability. ming and reduce coding errors, the AER driver provides a couple of helper functions which could be used by de- 4 Impact to PCI Express Device Driver vice drivers.

4.1 Device driver requirements 4.2.1 int pci_find_aer_capability To conform to AER driver infrastructure, PCI Express (struct pci_dev *dev); device drivers need support AER capability. pci_find_aer_capability locates the PCI Ex- First, when a driver initiates a device, it needs to enable press AER capability in the device configuration space. the device’s error reporting capability. By default, de- Since offset 0x100 in configuration space, PCI Express vice error reporting is turned off, so the device won’t devices could provide a couple of optional capabilities send error messages to root port when it captures an er- and they link each other in a chain. AER is one of them. ror. To locate AER registers, software needs to go through Secondly, to take part in the error recovery procedure, a the chain. This function returns the AER offset in the device driver needs to implement error callbacks as de- device configuration space. scribed in the pci_error_handlers data structure as shown below. 4.2.2 int pci_enable_pcie_error_reporting (struct struct pci_error_handlers { pci_dev *dev); /* PCI bus error detected on this device */ pci_ers_result_t (*error_detected)(struct pci_enable_pcie_error_reporting enables pci_dev *dev, enum pci_channel_state error); the device to send error messages to the root port when /* MMIO has been re-enabled, but not DMA */ an error is detected. If the device doesn’t support PCI- pci_ers_result_t (*mmio_enabled)(struct pci_dev *dev); Express capability, the function returns 0. When a de- /* PCI slot has been reset */ vice driver initiates a device (mostly, in its probe func- pci_ers_result_t (*slot_reset)(struct tion), it should call pci_enable_pcie_error_ pci_dev *dev); reporting. /* Device driver may resume normal operations */ void (*resume)(struct pci_dev *dev); 4.2.3 int pci_disable_pcie_error_reporting (struct }; pci_dev *dev); In data structure pci_driver, add err_handler pci_disable_pcie_error_reporting dis- as a new pointer to point to the pci_error_ ables the device from sending error messages to the root handlers. In kernel 2.6.14, the definition of pci_ port. Sometimes, device drivers want to process errors error_handlers had already been added to support by themselves instead of using the AER driver. It’s not PCI device error recovery [4]. To be compatible with encouraged, but we provide this capability. PCI device error recovery, PCI Express device error re- covery also uses the same definition and follows a sim- ilar rule. One of our starting points is that we try to 4.2.4 int pci_cleanup_aer_uncorrect_ keep the recovery callback interfaces as simple as we error_status (struct pci_dev *dev); can. If the interfaces are complicated, there will be no driver developers who will be happy to add error recov- pci_cleanup_aer_uncorrect_error_ ery callbacks into device drivers. status cleans up the uncorrectable error status 304 • Enable PCI Express Advanced Error Reporting in the Kernel register. The AER driver only clears correctable References error status register when processing errors. As for uncorrectable errors, specific device drivers should [1] PCI Express Base Specification Revision 1.1. do so because they might do more specific process- March 28, 2005. http://www.pcisig.com ing. Usually, a driver should call this function in its [2] PCI Firmware Specification Revision 3.0, slot_reset resume or callbacks. http://www.pcisig.com

4.3 Testing PCI Express AER On Device Driver [3] Tom Long Nguyen, Dely L. Sy, & Steven Carbonari. “PCI Express Port Bus Driver Support It’s hard to test device driver AER capabilities. By lots for Linux.” Proceedings of the Linux Symposium, of experiments, we have found that UR (Unsupported Vol. 2, Ottawa, Ontario, 2005. Request) can be used to test device drivers. We trig- http://www.linuxsymposium.org/ gered UR error messages by probing a non-existent de- 2005/linuxsymposium_procv2.pdf vice function. For example, if a PCI Express device only has one function, when kernel reads the ClassID from [4] pci-error-recovery.txt. Available from: the configuration space of the second function of the 2.6.20/Documentation. device, the device might send an Unsupported Request [5] PCIEBUS-HOWTO.txt. Available from: error message to the root port and set the bit in uncor- 2.6.20/Documentation. rectable error status register. By setting different values in the corresponding bit in uncorrectable error mask reg- [6] pcieaer-howto.txt. Available from: ister, we could test both non-fatal and fatal errors. 2.6.20/Documentation.

5 Conclusion

The PCI Express AER driver creates a generic infras- tructure to support PCI Express AER. This infrastruc- ture provides the Linux kernel with an ability to capture PCI Express device errors and perform error recovery where in a hierarchy an agent device reports. Last but not least the system administrators could get formatted, useful error information to debug device errors. Linux kernel 2.6.19 has accepted the PCI Express AER patches. Future work includes enabling PCI Express AER for every PCI Express device by default, blocking I/O when an error happens, and so on.

6 Acknowledgement

Special thanks to Steven Carbonari for his contributions to the architecture design of PCI Express AER driver, Rajesh Shah for his contributions to code review, and the Linux community for providing great input.

Legal Statement

This paper is copyright c 2007 by Intel Corporation. Per- mission to redistribute in accordance with Linux Sympo- sium submission guidelines is granted; all other rights are reserved. Enabling Linux* Network Support of Hardware Multiqueue Devices

Zhu Yi Peter P. Waskiewicz, Jr. Intel Corp. Intel Corp. [email protected] [email protected]

Abstract is a qdisc item. Qdisc stands for queuing discipline. It defines how a packet is selected on the transmission In the Linux kernel network subsystem, the Tx/Rx path. A Qdisc normally contains one or more queues SoftIRQ and Qdisc are the connectors between the net- (struct sk_buff_head) and a set of operations work stack and the net devices. A design limitation is (struct Qdisc_ops). The standard .enqueue that they assume there is only a single entry point for and .dequeue operations are used to put and get pack- each Tx and Rx in the underlying hardware. Although ets from the queues by the core network layer. When a they work well today, they won’t in the future. Modern packet first arrives to the network stack, an attempt to network devices (for example, E1000 [8] and IPW2200 enqueue occurs. The .enqueue routine can be a sim- [6]) equip two or more hardware Tx queues to enable ple FIFO, or a complex traffic classification algorithm. transmission parallelization or MAC-level QoS. These It all depends on the type of Qdisc the system adminis- hardware features cannot be supported easily with the trator has chosen and configured for the system. Once current network subsystem. the enqueue is completed, the packet scheduler is in- voked to pull an skb off a queue somewhere for trans- This paper describes the design and implementation mission. This is the .dequeue operation. The skb is for the network multiqueue patches submitted to net- returned to the stack, and is then sent to the device driver dev [2] and LKML [1] mailing lists early this year, for transmission on the wire. The network packets trans- which involved the changes for the , mission can be started by either dev_queue_xmit() Qdisc, and generic network core APIs. It will also dis- or the TX SoftIRQ (net_tx_action()) depending cuss breaking the netdev->queue_lock with fine- on whether the packet can be transmitted immediately grained per-queue locks in the future. At the end of the or not. But both routines finally call qdisc_run() to paper, it takes the IPW2200 and E1000 drivers as an dequeue an skb from the root Qdisc of the netdev and example to illustrate how the new network multiqueue send it out by calling the netdev->hard_start_ features will be used by the network drivers. xmit() method.

When the hardware Tx queue of a network device is 1 A Brief Introduction for the Linux Network full (this can be caused by various reasons—e.g., carrier Subsystem congestion, hardware errors, etc.), the driver should call netif_stop_queue() to indicate to the network When a packet is passed from the user space into the scheduler that the device is currently unusable. So the kernel space through a socket, an skb (socket kernel qdisk_restart() function of the network sched- buffer) is created to represent that packet in the kernel. uler won’t try to transmit the packet (with the device’s The skb is passed through various layers of the network .hard_start_xmit() method) until the driver ex- stack before it is handed to the device driver for trans- plicitly calls netif_start_queue() or netif_ mission. Inside the Linux kernel, each network device wake_queue() to indicate the network scheduler its is represented by a struct net_device structure. hardware queue is available again. The netdev-> All the struct net_device instances are linked hard_start_xmit() method is responsible for into a doubly linked list with a single pointer list head checking the hardware queue states. If the device (called hlist); the list head is named dev_base. hardware queue is full, it should call netif_stop_ The struct net_device contains all information queue() and returns NETDEV_TX_BUSY. On the and function pointers for the device. Among them there other side, when the network scheduler receives the

• 305 • 306 • Enabling Linux Network Support of Hardware Multiqueue Devices

NETDEV_TX_BUSY as the return value for netdev-> an example: if the high-priority queue is full while the hard_start_xmit(), it will reschedule. Note that low one is not, the Qdisc will still keep dequeueing if a driver returns NETDEV_TX_BUSY without call- the high priority skb. But it will always fail to trans- ing netif_stop_queue() in the hard_start_ mit in the high-priority queue because the correspond- xmit() method when the hardware transmit queue is ing hardware queue is full. To make the situation even full, it will chew tons of CPU. worse, netif_stop_queue() and friends are also ignorant of multiple hardware queues. There is no way For a normal network device driver, the general rules to to schedule Tx SoftIRQ based on hardware queues (vs. deal with the hardware Tx queue are: based on the global netdev). For example, if the low- priority hardware queue is full, should the driver call • Driver detects queue is full and calls netif_ netif_stop_queue() or not? If the driver does, stop_queue(); the high priority skbs will also be blocked (because the netdev Tx queue is stopped). If the driver doesn’t, the • Network scheduler will not try to send more pack- high CPU usage problem we mentioned in Section 1 will ets through the card any more; happen when low priority skbs remain in the Qdisc. • Even in some rare conditions (dev->hard_ start_xmit() is still called), calls into the 3 How to Solve the Problem? driver from top network layer always get back a NETDEV_TX_BUSY; Since the root cause for this problem is that the network scheduler and Qdisc do not expect the network devices • EOT interupt happens and driver cleans up the TX to have multiple hardware queues, the obvious fix is to hardware path to make more space so that the core add the support for multi-queue features to them. network layer can send more packets (driver calls netif_start_queue()) 4 The Design Considerations • Subsequent packets get queued to the hardware. The main goal of implementing support for multiple queues is to prevent one flow of traffic from interfer- In this way, the network drivers use netif_stop_ ing with another traffic flow. Therefore, if a hardware queue() and netif_start_queue() to provide queue is full, the driver will need to stop the queue. feedback to the network scheduler so that neither packet With multiqueue, the driver should be able to stop an in- starvation nor a CPU busy loop occurs. dividual queue, and the network stack in the OS should know how to check individual queue states. If a queue is 2 What’s the Problem if the Device has Multi- stopped, the network stack should be able to pull pack- ple Hardware Queues? ets from another queue and send them to the driver for transmission. The problem happens when the underlying hardware has multiple hardware queues. Multiple hardware queues One main consideration with this approach is how the provide QoS support from the hardware level. Wire- stack will handle traffic from multiqueue devices and less network adapters such as the Intel R PRO/Wireless non-multiqueue devices in the same system. This must 3945ABG, Intel R PRO/Wireless 2915ABG, and Intel R be transparent to devices, while maintaining little to no PRO/Wireless 2200BG Network Connections have al- additional overhead. ready provided this feature in hardware. Other high speed ethernet adapters (i.e., e1000) also provide mul- 5 The Implementation Details tiple hardware queues for better packet throughput by parallelizing the Tx and Rx paths. 
2 What's the Problem if the Device has Multiple Hardware Queues?

The problem happens when the underlying hardware has multiple hardware queues. Multiple hardware queues provide QoS support at the hardware level. Wireless network adapters such as the Intel® PRO/Wireless 3945ABG, Intel® PRO/Wireless 2915ABG, and Intel® PRO/Wireless 2200BG Network Connections already provide this feature in hardware. Other high-speed ethernet adapters (e.g., e1000) also provide multiple hardware queues for better packet throughput by parallelizing the Tx and Rx paths.

But the current Qdisc interface isn't multiple-hardware-queue aware. That is, the .dequeue method is not able to dequeue the correct skb according to the device hardware queue states. Take a device containing two hardware Tx queues as an example: if the high-priority queue is full while the low one is not, the Qdisc will still keep dequeueing the high-priority skb, but the transmit will always fail because the corresponding hardware queue is full. To make the situation even worse, netif_stop_queue() and friends are also ignorant of multiple hardware queues. There is no way to schedule the Tx SoftIRQ based on hardware queues (vs. based on the global netdev). For example, if the low-priority hardware queue is full, should the driver call netif_stop_queue() or not? If it does, the high-priority skbs will also be blocked (because the netdev Tx queue is stopped). If it doesn't, the high CPU usage problem we mentioned in Section 1 will happen when low-priority skbs remain in the Qdisc.

3 How to Solve the Problem?

Since the root cause of this problem is that the network scheduler and Qdisc do not expect network devices to have multiple hardware queues, the obvious fix is to add support for multiqueue features to them.

4 The Design Considerations

The main goal of implementing support for multiple queues is to prevent one flow of traffic from interfering with another. Therefore, if a hardware queue is full, the driver will need to stop that queue. With multiqueue, the driver should be able to stop an individual queue, and the network stack in the OS should know how to check individual queue states. If a queue is stopped, the network stack should be able to pull packets from another queue and send them to the driver for transmission.

One main consideration with this approach is how the stack will handle traffic from multiqueue devices and non-multiqueue devices in the same system. This must be transparent to devices, while adding little to no overhead.

5 The Implementation Details

The following implementation details are under discussion and consideration in the Linux community at the writing of this paper. The concepts should remain the same, even if certain implementation details are changed to meet requests and suggestions from the community.

The first part of the implementation is how to represent the queues on the device. Today, there is a single queue state and lock in struct net_device. Since we need to manage each queue's state, we will need a state per queue. The queues need to be accessible from the struct net_device so they can be visible to both the stack and the driver. This is added to include/linux/netdevice.h, inside struct net_device:

struct net_device
{
        ...
        struct net_device_subqueue      *egress_subqueue;
        unsigned long                   egress_subqueue_count;
        ...
}

Here, the netdev has knowledge of the queues, and of how many queues are supported by the device. For all non-multiqueue devices, there will be one queue allocated, with an egress_subqueue_count of 1. This is to help the stack run both non-multiqueue devices and multiqueue devices simultaneously. The details of this will be discussed later.

When a network driver is loaded, it needs to allocate a struct net_device and a structure representing itself. In the case of ethernet devices, the function alloc_etherdev() is called. A change to this API provides alloc_etherdev_mq(), which allows a driver to tell the kernel how many queues it wants to allocate for the device. alloc_etherdev() is now a macro that calls alloc_etherdev_mq() with a queue count of 1. This allows non-multiqueue drivers to operate transparently in the multiqueue stack without any changes to the driver.

Ultimately, alloc_etherdev_mq() calls the new alloc_netdev_mq(), which actually handles the kzalloc(). In there, egress_subqueue is assigned to the newly allocated memory, and egress_subqueue_count is assigned to the number of allocated queues.

On the deallocation side of things, free_netdev() now destroys the memory that was allocated for each queue.

The final part of the multiqueue solution comes with the driver being able to manipulate each queue's state. If one hardware queue runs out of descriptors for whatever reason, the driver will shut down that queue. This operation should not prevent other queues from transmitting traffic. To achieve this, the network stack should check the global queue state, as well as the individual queue states, before deciding to transmit traffic on a queue.

Queue mapping in this implementation happens in the Qdisc, namely PRIO. The mapping is a combination of which PRIO band an skb is assigned to (based on TC filters and/or the IP TOS-to-priority mapping), and which hardware queue is assigned to which band. Once the skb has been classified in prio_classify(), a lookup in the band2queue mapping is done, and the result is assigned to a new field in the skb, namely skb->queue_mapping. The calls to netif_subqueue_stopped() will pass this queue mapping to learn whether the hardware queue in question is running or not. In order to help performance and avoid unnecessary requeues, this check is done in the prio_dequeue() routine, prior to pulling an skb from the band. The check is done again before the call to hard_start_xmit(), just as it is done today on the global queue. The skb is then passed to the device driver, which will need to look at the value of skb->queue_mapping to determine which Tx ring to place the skb on in the hardware. At this point, multiqueue flows have been established.

The network driver will need to manage the queues using the new netif_{start|stop|wake}_subqueue() APIs. This way, full independence between queues can be established.
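To make this concrete, here is a hedged sketch (ours, not the patch's) of a driver Tx path using the per-queue APIs described above; the bar_* helpers are hypothetical.

static int bar_hard_start_xmit(struct sk_buff *skb,
                               struct net_device *dev)
{
        struct bar_priv *priv = netdev_priv(dev);
        int ring = skb->queue_mapping;  /* chosen by the PRIO Qdisc */

        bar_post_to_ring(priv, ring, skb);

        /* Stop only this hardware queue; the others keep going. */
        if (bar_ring_full(priv, ring))
                netif_stop_subqueue(dev, ring);

        return NETDEV_TX_OK;
}

/* Per-ring Tx completion: wake only the queue that has space again. */
static void bar_tx_clean(struct net_device *dev, int ring)
{
        struct bar_priv *priv = netdev_priv(dev);

        bar_reclaim_descriptors(priv, ring);
        if (!bar_ring_full(priv, ring))
                netif_wake_subqueue(dev, ring);
}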

6 Using the Multiqueue Features

6.1 Intel® PRO/Wireless 2200BG Network Connection Driver

This adapter supports the IEEE 802.11e standard [3] for Quality of Service (QoS) at the Medium Access Control (MAC) layer. Figure 1 shows the hardware implementation. Before an MSDU (MAC Service Data Unit) is passed to the hardware, the network stack maps the frame type or User Priority (UP) to an Access Category (AC). The NIC driver then pushes the frame to the corresponding one of the four transmit queues, according to the AC. There are four independent EDCA (enhanced distributed channel access) functions in the hardware, one for each queue. The EDCA function resolves internal collisions and determines when a frame in the transmit queue is permitted to be transmitted via the wireless medium.

[Diagram: an (MSDU, UP) pair is mapped to an AC, placed on one of the transmit queues for the ACs, and served by per-queue EDCA functions with internal collision resolution.]

Figure 1: MAC level QoS implementation

With multiqueue devices supported by the network subsystem, such hardware-level packet scheduling is easy to enable. First, a specific PRIO Qdisc queue mapping for all the IEEE 802.11 wireless devices is created. It maps the frame type or UP to an AC according to the mapping algorithm defined in [3]. With this specific queue mapping, the IEEE 802.11 frames with higher priority are always guaranteed to be scheduled before the lower-priority ones by the network scheduler when both transmit queues are active at the time. In other cases, for example, the higher-priority transmit queue is inactive while the lower-priority transmit queue is active, and the lower-priority frame is scheduled (dequeued). This is the intention, because the hardware is not going to transmit any higher-priority frames at this time. When dev->hard_start_xmit() is invoked by the network scheduler, skb->queue_mapping is already set to the corresponding transmit-queue index for the skb (by the Qdisc .dequeue method). The driver then just needs to read this value and move the skb to the target transmit queue accordingly. In the normal cases, the queue will still be active, since the network scheduler has just checked its state in the Qdisc .dequeue method. But in some rare cases a race condition can still occur in this window and make the queue state inconsistent; this is the case where the Qdisc .requeue method is invoked by the network scheduler.

6.2 Intel® PRO/1000 Adapter Driver

This adapter, with MAC types of 82571 and higher, supports multiple Tx and Rx queues in hardware. A big advantage of these multiple hardware queues is achieving packet transmission and reception parallelization. This is especially useful on SMP and multi-core systems with many processes running network loads on different CPUs or cores. Figure 2 shows an e1000 network adapter with two hardware Tx queues on a multi-core SMP system.

[Diagram: CPU 1 and CPU 2, each with Core 1 and Core 2, are bound to Tx queue 1 and Tx queue 2 of an e1000 network adapter.]

Figure 2: Mapping CPU cores to multiple Tx queues on an SMP system

With multiqueue devices supported by the network subsystem, the e1000 driver can export all its Tx queues and bind them to different CPU cores. Figure 3 illustrates how this is done in the e1000 multiqueue patch [4]. The per-CPU variable adapter->cpu_tx_ring points to the mapped Tx queue for the current CPU. After the e1000 queue mapping has been set up, access to the Tx queues should always go through adapter->cpu_tx_ring instead of manipulating the adapter->tx_ring array directly.

netdev = alloc_etherdev_mq(sizeof(struct e1000_adapter), 2);
netdev->features |= NETIF_F_MULTI_QUEUE;

...

adapter->cpu_tx_ring = alloc_percpu(struct e1000_tx_ring *);
lock_cpu_hotplug();
i = 0;
for_each_online_cpu(cpu) {
        *per_cpu_ptr(adapter->cpu_tx_ring, cpu) =
                &adapter->tx_ring[i % adapter->num_tx_queues];
        i++;
}
unlock_cpu_hotplug();

Figure 3: E1000 driver binds CPUs to multiple hardware Tx queues
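A transmit routine could then pick the per-CPU mapped ring as sketched below. This is our illustration rather than code from the e1000 patch; the function name and the omitted descriptor handling are hypothetical.

static int e1000_mq_xmit_frame(struct sk_buff *skb,
                               struct net_device *netdev)
{
        struct e1000_adapter *adapter = netdev_priv(netdev);
        struct e1000_tx_ring *tx_ring;

        /*
         * Each CPU transmits on its own ring, so different CPUs do
         * not contend for the same Tx lock.  smp_processor_id() is
         * safe here since the xmit path runs with preemption off.
         */
        tx_ring = *per_cpu_ptr(adapter->cpu_tx_ring,
                               smp_processor_id());

        /* ... post skb onto tx_ring's descriptors ... */
        return NETDEV_TX_OK;
}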

By spreading CPUs over multiple hardware Tx queues, transmission parallelization is achieved. Since the CPUs mapped to different Tx queues don't contend for the same lock for packet transmission, LLTX lock contention is also reduced. With netdev->queue_lock broken into per-queue locks in the future, this usage model will perform and scale even better. The same parallelization also holds for packet reception.

Compared with the multiqueue usage model for the hardware QoS scheduler in the wireless devices, the e1000 usage model doesn't require a special frame-type-to-queue mapping algorithm in the Qdisc. So any type of multiqueue-aware Qdisc can be configured on top of e1000 hardware by the system administrator with the command tc, which is part of the iproute2 [7] package. For example, to add the PRIO Qdisc to your network device, assuming the device is called eth0, run the following command:

# tc qdisc add dev eth0 root \
    handle 1: prio

As of the writing of this paper, there are already patches on the linux-netdev mailing list [5] to enable the multiqueue features for the pfifo_fast and PRIO Qdiscs.

7 Future Work

In order to extend the flexibility of multiqueue network device support, work on the Qdisc APIs can be done. This is needed to remove serialization of access to the Qdisc itself. Today, the Qdisc may only have one transmitter inside it, governed by the __LINK_STATE_QDISC_RUNNING bit set on the global queue state. This bit will need to be set per queue, not per device. Implications for Qdisc statistics, such as the number of packets sent by the Qdisc, will need to be resolved.

Per-queue locking may also need to be implemented in the future. This depends on whether the performance of higher-speed network adapters becomes throttled by the single device queue lock. If this is determined to be a source of contention, the stack will need to change to know how to independently lock and unlock each queue during a transmit.

8 Conclusion

The multiqueue device support for the network scheduler and Qdisc enables modern network interface controllers to provide advanced features like hardware-level packet scheduling and Tx/Rx parallelization. As processor packages ship with more and more cores nowadays, there is also a trend for network adapter hardware to be equipped with more and more Tx and Rx queues in the future. The multiqueue patch provides the fundamental features for the network core to enable these devices.

Legal

This paper is Copyright © 2007 by Intel Corporation. Redistribution rights are granted per submission guidelines; all other rights are reserved.

*Other names and brands may be claimed as the property of others.

References

[1] Linux kernel mailing list. [email protected].

[2] Linux network development mailing list. [email protected].

[3] IEEE Computer Society. LAN/MAN Committee. IEEE Standard for Information technology – Telecommunications and information exchange between systems – Local and metropolitan area networks – Specific requirements Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) specifications Amendment 8: Medium Access Control (MAC) Quality of Service Enhancements. 3 Park Avenue New York, NY 10016-5997, USA, November 2005.

[4] Peter P. Waskiewicz Jr. E1000 example implementation of multiqueue network device support. http://marc.info/?l=linux-netdev&m=117642254323203&w=2.

[5] Peter P. Waskiewicz Jr. Multiqueue network device support implementation. http://marc.info/?l=linux-netdev&m=117642230823028&w=2.

[6] James Ketrenos and the ipw2200 developers. Intel® PRO/Wireless 2200BG Driver for Linux. http://ipw2200.sf.net.

[7] Alexey Kuznetsov and Stephen Hemminger. Iproute2: a collection of utilities for controlling TCP/IP networking and Traffic Control in Linux. http://linux-net.osdl.org/index.php/Iproute2.

[8] John Ronciak, Auke Kok, and the e1000 developers. The Intel® PRO/10/100/1000/10GbE Drivers. http://e1000.sf.net.

Concurrent Pagecache

Peter Zijlstra Red Hat [email protected]

Abstract

In this paper we present a concurrent pagecache for Linux, which is a continuation of the existing lockless pagecache work [5].

Currently the pagecache is protected by a reader/writer lock. The lockless pagecache work focuses on removing the reader lock; this paper presents a method to break the write side of the lock as well. Firstly, since the pagecache is centered around the radix tree, it is necessary to alter the radix tree in order to support concurrent modifications of the data structure. Secondly, we adapt the pagecache by removing all non-radix-tree consumers of the lock's protection, and extend the page flag introduced by the lockless pagecache into a second per-page lock. Then we can fully utilize the radix tree's new functionality to obtain a concurrent pagecache. Finally, we analyze the improvements in performance and scalability.

1 Introduction

In order to provide some background for this work, we will give a quick sketch of Linux memory management. For a more in-depth treatment, see Mel Gorman's excellent book on the subject [2].

1.1 Pagecache

In Linux, the pagecache caches on-disk data in memory, per file; that is, the pagecache stores and retrieves disk pages based on the (inode, offset)-tuple. This per-inode page index is implemented with a radix tree [3].

The typical consumer of this functionality is the VFS (Virtual File System). Some system interfaces, such as read(2) and write(2), as well as mmap(2), operate on the pagecache. These instantiate pages for the pagecache when needed, after which the newly allocated pages are inserted into the pagecache and possibly filled with data read from disk.

1.1.1 Page Frames

Each physical page of memory is managed by a struct page. This structure contains the minimum required information to identify a piece of data, such as a pointer to the inode and the offset therein. It also contains some management state, for instance a reference counter to control the page's lifetime, as well as various bits to indicate status, such as PG_uptodate and PG_dirty. It is these page structures which are indexed in the radix tree.

1.1.2 Page Reclaim

When a free page of memory is requested and the free pages are exhausted, a used page needs to be reclaimed. However, pagecache memory can only be reclaimed when it is clean, that is, when the in-memory content corresponds to the on-disk version.

When a page is not clean, it is called dirty, and its data must be written back to disk. The problem of finding all the dirty pages in a file is solved by tags, a unique addition to the Linux radix tree (see Section 2.1).

1.2 Motivation

The lockless pagecache work by Nick Piggin [5] shows that the SMP scalability of the pagecache is greatly improved by reducing the dependency on high-level locks. However, his work focuses on lookups.

While lookups are the most frequent operation performed, other operations on the pagecache can still form a significant fraction of the total operations performed.


Therefore, it is necessary to investigate the other operations as well. They are:

• insert an item into the tree;

• update an existing slot to point at a new item;

• remove an item from the tree;

• set a tag on an existing item;

• clear a tag on an existing item.

2 Radix Tree

The radix tree deserves a thorough discussion, as it is the primary data structure of the pagecache.

The radix tree is a common dictionary-style data structure, also known as a Patricia trie or crit-bit tree. The Linux kernel uses a version which operates on fixed-length input, namely an unsigned long. Each level represents a fixed number of bits of this input space (usually 6 bits per tree level, which gives a maximum tree height of ⌈64/6⌉ = 11 on 64-bit machines). See Figure 1 for a representation of a radix tree with nodes of order 2, mapping an 8-bit value.

[Diagram: a four-level tree of order-2 nodes, decoding an 8-bit key from msb to lsb.]

Figure 1: 8-bit radix tree

2.1 Tags

A unique feature of the Linux radix tree is its tags extension. Each tag is basically a bitmap index on top of the radix tree. Tags can be used in conjunction with gang lookups to find all pages which have a given tag set within a given range.

The tags are maintained in per-node bitmaps, so that on each given level we can determine whether or not the next level has at least a single tag set. Figure 2 shows the same tree as before, but now with two tags (denoted by an open and a closed bullet, respectively).

[Diagram: the same tree as Figure 1, annotated with two per-node tag bitmaps.]

Figure 2: 8-bit radix tree with 2 bitmap indices

2.2 Concurrency Control

Traditionally, the radix tree is locked as a whole, and does not support concurrent operations. Contention on this high-level lock is where the scalability problems come from.

Linux currently uses a reader/writer lock, called tree_lock, to protect the tree. However, on large SMP machines it still does not scale properly due to cache-line bouncing; the lock also fully serialises all modifications.

2.3 RCU Radix Tree

The RCU radix tree enables fully concurrent lookups (cf. [4]), which is done by exploiting the Read-Copy-Update technique [1].

RCU basically requires us to atomically flip pointers from one version of a (partial) data structure to a new version, while keeping the old one around until the system passes through a quiescent state. At that point it is guaranteed that there are no more users of the old structure, and therefore it can be freed safely.

However, the radix tree structure is sufficiently static that modifications are often nothing but a single atomic change to a node. In this case we can just keep using the same node.
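To make this concrete, here is a sketch of what such a fully concurrent lookup enables: a tagged gang lookup (Section 2.1) running without the tree lock, given an address_space *mapping and a starting index. radix_tree_gang_lookup_tag() and PAGECACHE_TAG_DIRTY are the stock kernel names; this is our illustration, and the locking against page reclaim that real callers need is omitted.

        struct page *pages[16];
        unsigned int found;

        rcu_read_lock();
        /* find up to 16 dirty pages starting at 'index' */
        found = radix_tree_gang_lookup_tag(&mapping->page_tree,
                        (void **)pages, index, 16,
                        PAGECACHE_TAG_DIRTY);
        rcu_read_unlock();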

The radix-tree modifications still need to be serialised with respect to each other. The lookups, however, are no longer serialised with respect to modifications.

This allows us to replace the reader/writer lock with a regular one, and thus reduce cache-line bouncing by not requiring exclusive access to the cache line for the lookups.

2.4 Concurrent Radix Tree

With lookups fully concurrent, modifying operations become the limiting factor. The main idea is to 'break' the tree lock into many small locks. (Ideally we would reduce the locks so far that we end up with single atomic operations; however, tags and deletion, the back-tracking operations, seem to make this impossible. This is still being researched.)

The obvious next candidate for locking would be the nodes. When we study the operations in detail, we see that they fall into two categories:

• uni-directional;

• bi-directional.

The simpler of the two are the uni-directional operations; they perform only a single traversal of the tree, from the root to a leaf node. These include: insert, update, and set tag.

The bi-directional operations are more complex, since they tend to go back up the tree after reaching the leaf node. These are the remaining operations: remove and clear tag.

2.4.1 Ladder Locking aka Lock-Coupling

This technique, which is frequently used in the database world, allows us to walk a node-locked tree in a single direction (bi-directional traffic would generate deadlocks).

If all modifiers alter the tree top-to-bottom, and hold a lock on the node which is being modified, then walking down is as simple as taking a lock on a child while holding the node locked. We release the node lock as soon as the child is locked.

In this situation concurrency is possible, since another operation can start its descent as soon as the root node is unlocked. If their paths do not have another node in common, it might even finish before the operation which started earlier. The worst case, however, is pipelined operation, which is still a lot better than a fully serialised one.

2.4.2 Path Locking

Obviously, this model breaks down when we need to walk back up the tree again, for that would introduce cyclic lock dependencies. This implies that we cannot release the locks as we go down, which will seriously hinder concurrent modifications.

An inspection of the relevant operations shows that the upwards traversal has distinct termination conditions. If these conditions were reversed, so that we could positively identify the termination points during the downward traversal, then we could release all locks above those points.

In the worst case we will hold the root-node lock, yet for non-degenerate trees the average case allows for good concurrency due to the availability of termination points.

clear tag. Clearing a tag in the radix tree involves walking down the tree, locating the item and clearing its tag. Then we go back up the tree clearing tags, as long as the child node has no tagged items. Thus, the termination condition states:

    we terminate the upward traversal if we encounter a node which still has one or more entries with the tag set after clearing one.

Changing this condition, in order to identify the termination points during the downward traversal, gives:

    the upward traversal will terminate at nodes which have more tagged items than the one we are potentially clearing.

So, whenever we encounter such a node, it is clear that we will never pass it on our way back up the tree; therefore we can drop all the locks above it.

remove. Element removal is a little more involved: we need to remove all the tags for a designated item, as well as remove unused nodes. Its termination condition captures both of these aspects.

The following termination condition needs to be satisfied when walking back up the tree:

    the upward traversal is terminated when we encounter a node which is not empty, and none of the tags are unused.

The condition identifying such a point during the downward traversal is given by:

    we terminate the upward traversal when a node that has more than two children is encountered, and for each tag it has more items than the ones we are potentially clearing.

Again, this condition identifies points that will never be crossed on the traversal back up the tree.

So, with these details worked out, we see that a node-locked tree can achieve adequate concurrency for most operations.

2.4.3 API

Concurrent modifications require multiple locking contexts and a way to track them.

The operation that has the richest semantics is radix_tree_lookup_slot(). It is used for speculative lookup, since that requires rcu_dereference() to be applied to the obtained slot. The update operation is performed using the same radix tree function, but by applying rcu_assign_pointer() to the resulting slot.

When used for update, the encompassing node should still be locked after return of radix_tree_lookup_slot(). Hence, clearly, we cannot hide the locking in the radix_tree_*() function calls.

Thus we need an API which elegantly captures both cases, lookup and modification. We also prefer to retain as much of the old API as possible, in order to leave the other radix tree users undisturbed. Finally, we would like a CONFIG option to disable the per-node locking for those environments where the increased memory footprint of the nodes is prohibitive.

We take the current pattern for lookups as an example:

struct page **slot, *page;

rcu_read_lock();
slot = radix_tree_lookup_slot(
                &mapping->page_tree, index);
page = rcu_dereference(*slot);
rcu_read_unlock();

Contrary to lookups, which only have global state (the RCU quiescent state), the modifications need to keep track of which locks are held. As explained before, this state must be external to the operations; thus we need to instantiate a local context to track these locks.

struct page **slot;
DEFINE_RADIX_TREE_CONTEXT(ctx,
                &mapping->page_tree);

radix_tree_lock(&ctx);
slot = radix_tree_lookup_slot(
                ctx.tree, index);
rcu_assign_pointer(*slot, new_page);
radix_tree_unlock(&ctx);

As can be seen above, the radix_tree_lock() operation locks the root node. By giving ctx.tree as the tree root instead of &mapping->page_tree, we pass the local context on, in order to track the held locks. This is done by using the lower bit of the pointer as a type field.

Then we adapt the modifying operations, in order to move the lock downwards:

void **radix_tree_lookup_slot(
                struct radix_tree *root,
                unsigned long index)
{
        ...
        RADIX_TREE_CONTEXT(context, root);
        ...
        do {
                /* move the lock down the tree */
                radix_ladder_lock(context, node);
                ...
        } while (height > 0);
        ...
}

As can be seen above, radix_tree_lock() oper- 2.4.3 API ation locks the root node. By giving ctx.tree as the tree root instead of &mapping->page_tree, we Concurrent modifications require multiple locking con- pass the local context on, in order to track the held locks. texts and a way to track them. This is done by using the lower bit of the pointer as a type field. The operation that has the richest semantics is radix_ tree_lookup_slot(). It is used for speculative Then we adapt the modifying operations, in order to lookup, since that requires rcu_dereference() to move the lock downwards: be applied to the obtained slot. The update operation is performed using the same radix tree function, but by void **radix_tree_lookup_slot( struct radix_tree *root, applying rcu_assign_pointer() to the resulting unsigned long index) slot. { ... RADIX_TREE_CONTEXT(context, root); When used for update, the encompassing node should ... still be locked after return of radix_tree_lookup_ do { slot() ... . Hence, clearly, we cannot hide the locking in /* move the lock down the tree */ the radix_tree_*() function calls. radix_ladder_lock(context, node); ... } while (height > 0); Thus we need an API which elegantly captures both ... cases; lookup and modification. } 2007 Linux Symposium, Volume Two • 315

The RADIX_TREE_CONTEXT() macro extracts the context and the actual root pointer.
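For completeness, a sketch of how a caller would drive one of the other modifying operations through this API. This is our illustration, assuming the context API above; radix_tree_tag_set() and PAGECACHE_TAG_DIRTY are the stock kernel names.

DEFINE_RADIX_TREE_CONTEXT(ctx, &mapping->page_tree);

radix_tree_lock(&ctx);
/* set tag is uni-directional: the lock ladders down the tree */
radix_tree_tag_set(ctx.tree, index, PAGECACHE_TAG_DIRTY);
radix_tree_unlock(&ctx);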

Note that unmodified operations will be fully exclusive, because they do not move the lock downwards.

This scheme captures all the requirements mentioned at the beginning of this section. The old API is retained by making the new parts fully optional; and by moving most of the locking-specific functionality into a few macros and functions, it is possible to disable the fine-grained locking at compile time using a CONFIG option.

3 Pagecache Synchronisation

The lockless pagecache paper [5] discusses pagecache synchronisation in terms of guarantees provided by the read and write sides of the pagecache lock. The read side provides the following guarantees (by excluding modifications):

• the existence guarantee;

• the accuracy guarantee.

The write side provides one additional guarantee (by being fully exclusive), namely:

• the no-new-reference guarantee.

The existence guarantee ensures that an object will exist during a given period. That is, the struct page found must remain valid. The read lock trivially guarantees this by excluding all modifications.

The accuracy guarantee adds to this by ensuring that not only will the object stay valid, but it will also stay in the pagecache. The existence guarantee only avoids deallocation, while the accuracy guarantee ensures that the object keeps referring to the same (inode, offset)-tuple during the entire time. Once again, this guarantee is provided by the read lock by excluding all modifications.

The no-new-reference guarantee captures the fully exclusive state of the write lock. It excludes lookups from obtaining a reference. This is especially relevant for element removal.

3.1 Lockless Pagecache

The lockless pagecache focuses on providing the guarantees introduced above in view of the full concurrency of RCU lookups. That is, RCU lookups are not excluded by holding the tree lock.

3.1.1 Existence

The existence guarantee is trivially satisfied by observing that the page structures have a static relation with the actual pages to which they refer. Therefore they are never deallocated.

A free page still has an associated struct page, which is used by the page allocator to manage the free page. A free page's reference count is 0 by definition.

3.1.2 Accuracy

The accuracy guarantee is satisfied by using a speculative-get operation, which tries to get a reference on the page returned by the RCU lookup. If we did obtain a reference, we must verify that it is indeed the page requested. If either the speculative get or the verification fails (e.g., the page was freed and possibly reused already), then we retry the whole sequence.

The important detail here is the try-to-get-a-reference operation, since we need to close a race with freeing pages; i.e., we need to avoid free pages temporarily having a non-zero reference count. The reference count is modified using atomic operations, and to close the race we need an atomic_inc_not_zero() operation, which will fail to increment when the counter is zero.

3.1.3 No New Reference

The no-new-reference guarantee is met by introducing a new page flag, PG_nonewrefs, which is used to synchronise lookups with modifying operations. That is, the speculative get should not return until this flag is clear. This allows atomic removal of elements which have a non-zero reference count (e.g., the pagecache itself might still have a reference).
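The following sketch, modelled loosely on the lockless-pagecache patches and simplified by us, shows how the existence and accuracy guarantees combine in a speculative lookup. The function name is ours; atomic_inc_not_zero(), page->_count, and page_cache_release() are the stock kernel names, and the wait on PG_nonewrefs described above is elided.

static struct page *spec_lookup(struct address_space *mapping,
                                unsigned long index)
{
        struct page *page;

again:
        rcu_read_lock();
        page = radix_tree_lookup(&mapping->page_tree, index);
        /* Existence: struct page is never freed, so this is safe. */
        if (page && !atomic_inc_not_zero(&page->_count)) {
                /* The page was being freed; retry the lookup. */
                rcu_read_unlock();
                goto again;
        }
        rcu_read_unlock();

        /* Accuracy: re-check that this is still the page we wanted. */
        if (page && (page->mapping != mapping ||
                     page->index != index)) {
                page_cache_release(page);
                goto again;
        }
        return page;
}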

3.1.4 Tree Lock

When we re-implement all lookup operations to take advantage of the speculative get, and re-implement the modifying operations to use PG_nonewrefs, the read side of the tree_lock will have no users left. Hence we can change it into a regular spinlock.

3.2 Concurrent Pagecache

The lockless pagecache leaves us with a single big lock serialising all modifications to the radix tree. However, with the adaptations to the radix tree discussed in Section 2.4, the serialisation required to meet the synchronisation guarantees of Section 3 is per page.

3.2.1 PG_nonewrefs vs. PG_locked

Since we need to set/clear PG_nonewrefs around most modifying operations, we might as well do it around all modifying operations, and change PG_nonewrefs into an exclusion primitive which serialises each individual pagecache-page modification.

We can't reuse PG_locked for this, because the two have different places in the locking hierarchy.

    inode->i_mutex
    inode->i_alloc_sem
    mm->mmap_sem
    PG_locked
    mapping->i_mmap_lock
    anon_vma->lock
    mm->page_table_lock or pte_lock
    zone->lru_lock
    swap_lock
    mmlist_lock
    mapping->private_lock
    inode_lock
    sb_lock
    mapping->tree_lock

Figure 3: mm locking hierarchy

Figure 3 represents the locking hierarchy. As we can see, PG_locked is an upper-level lock, whereas the tree_lock (now to be replaced by PG_nonewrefs) is at the bottom. Also, PG_locked is a sleeping lock, whereas tree_lock must be a spinning lock.

3.2.2 Tree-Lock Users

Before we can fully remove the tree_lock, we need to make sure that there are no other users left. A close scrutiny reveals that nr_pages is also serialised by the tree_lock. This counter needs to provide its own serialisation, for we take no lock covering the whole inode. Changing it to an atomic_long_t is the easiest way to achieve this.

Another, unrelated user of the tree lock is architecture-specific dcache flushing. However, since its use of the tree lock is a pure lock overload, it does not depend on any other uses of the lock. We preserve this usage and rename the lock to priv_lock.

3.2.3 No New Reference

The lockless pagecache sets and clears PG_nonewrefs around insertion operations, in order to avoid half-inserted pages being exposed to readers. However, the insertion could be done without PG_nonewrefs by properly ordering the operations.

On the other hand, the deletion fully relies on PG_nonewrefs. It is used to hold off the return of the speculative get until the page is fully removed. Then the accuracy check, after obtaining the speculative reference, will find that the page is not the one requested, and will release the reference and retry the operation. We cannot rely on atomic_inc_not_zero() failing in this case, because the pagecache itself still has a reference on the page.

By changing PG_nonewrefs into a bit-spinlock and using it around all modifying operations, thus serialising the pagecache at page level, we satisfy the requirements of both the lockless and the concurrent pagecache.

4 Performance

In order to benchmark the concurrent radix tree, a new kernel module was made. This kernel module exercises the modifying operations concurrently.

The module spawns a number of kernel threads, each of which applies a radix tree operation to a number of indices. Two range modes were tested: interleaved and sequential. The interleaved mode makes each thread iterate over the whole range, and pick only those elements which match i mod nr_threads = nr_thread. The sequential mode divides the full range into nr_thread separate sub-ranges.

These two patterns should be able to highlight the impact of cache-line bouncing. The interleaved pattern has a maximal cache-line overlap, whereas the sequential pattern has a minimal cache-line overlap.

4.1 Results

Here we present results obtained by running the new kernel module, mentioned above, on a 2-way x86-64 machine, over a range of 16777216 items. The numbers represent the runtime (in seconds).

The interleaved mode gives:

    operation   serial   concurrent   gain
    insert      16.006   19.485       -22%
    tag         14.989   15.538       -4%
    untag       17.515   16.982       3%
    remove      14.213   16.506       -16%

The sequential mode gives:

    operation   serial   concurrent   gain
    insert      15.768   14.792       6%
    tag         15.110   14.581       4%
    untag       18.138   15.027       17%
    remove      14.607   16.250       -11%

As we see from the results, the lock-induced cache-line bouncing is a real problem, even on small SMP systems. The locking overhead is not prohibitive, however.

5 Optimistic Locking

Now that the pagecache is page-locked, and the basic concurrency-control algorithms are in place, effort can be put into investigating more optimistic locking rules for the radix tree.

For example, for insertion we can do an RCU lookup of the lowest possible matching node, then take its lock and verify that the node is still valid. After this, we continue the operation in a regular locked fashion. By doing this we avoid locking the upper nodes in many cases, and thereby significantly reduce cache-line bouncing.

Something similar can be done for item removal: find the lowest termination point during an RCU traversal, lock it, and verify its validity. Then continue as a regular path-locked operation.

In each case, when the validation fails, the operation restarts as a fully locked operation.

Since these two examples cover both the ladder-locking model of Section 2.4.1 and the path-locking model of Section 2.4.2, they can be generalised to cover all other modifying operations.

5.1 Results

Rerunning the kernel module, in order to test the concurrent radix tree's performance with this new optimistic locking model, yields much better results.

The interleaved mode gives:

    operation   serial   optimistic   gain
    insert      16.006   12.034       25%
    tag         14.989   7.417        51%
    untag       17.515   4.135        76%
    remove      14.213   6.529        54%

The sequential mode gives:

    operation   serial   optimistic   gain
    insert      15.768   3.446        78%
    tag         15.110   5.359        65%
    untag       18.138   4.126        77%
    remove      14.607   6.488        56%

These results are quite promising for larger SMP machines.

Now we see that, during the interleaved test, the threads slowly drifted apart, thus naturally avoiding cache-line bouncing.

6 Availability

This work resulted in a patch-set for the Linux kernel, and is available at: http://programming.kicks-ass.net/kernel-patches/concurrent-pagecache/

References

[1] Wikipedia, Read-Copy-Update, http://en.wikipedia.org/wiki/RCU

[2] M. Gorman, Understanding the Linux Virtual Memory Manager, 2004.

[3] Wikipedia, Radix Tree, http://en.wikipedia.org/wiki/Radix_tree

[4] N. Piggin, RCU Radix Tree, http://www.kernel.org/pub/linux/kernel/people/npiggin/patches/lockless/2.6.16-rc5/radix-intro.pdf

[5] N. Piggin, A Lockless Pagecache in Linux - Introduction, Progress, Performance, Proceedings of the Ottawa Linux Symposium 2006, pp. 241-254.