Linux Symposium 2007 As Shown by the Many Related Presentations at This Confer - Ence (E.G., KVM, Lguest, Vserver)
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Study of File System Evolution
Study of File System Evolution Swaminathan Sundararaman, Sriram Subramanian Department of Computer Science University of Wisconsin {swami, srirams} @cs.wisc.edu Abstract File systems have traditionally been a major area of file systems are typically developed and maintained by research and development. This is evident from the several programmer across the globe. At any point in existence of over 50 file systems of varying popularity time, for a file system, there are three to six active in the current version of the Linux kernel. They developers, ten to fifteen patch contributors but a single represent a complex subsystem of the kernel, with each maintainer. These people communicate through file system employing different strategies for tackling individual file system mailing lists [14, 16, 18] various issues. Although there are many file systems in submitting proposals for new features, enhancements, Linux, there has been no prior work (to the best of our reporting bugs, submitting and reviewing patches for knowledge) on understanding how file systems evolve. known bugs. The problems with the open source We believe that such information would be useful to the development approach is that all communication is file system community allowing developers to learn buried in the mailing list archives and aren’t easily from previous experiences. accessible to others. As a result when new file systems are developed they do not leverage past experience and This paper looks at six file systems (Ext2, Ext3, Ext4, could end up re-inventing the wheel. To make things JFS, ReiserFS, and XFS) from a historical perspective worse, people could typically end up doing the same (between kernel versions 1.0 to 2.6) to get an insight on mistakes as done in other file systems. -
Lguest a Journey of Learning the Linux Kernel Internals
Lguest A journey of learning the Linux kernel internals Daniel Baluta <[email protected]> What is lguest? ● “Lguest is an adventure, with you, the reader, as a Hero” ● Minimal 32-bit x86 hypervisor ● Introduced in 2.6.23 (October, 2007) by Rusty Russel & co ● Around 6000 lines of code ● “Relatively” “easy” to understand and hack on ● Support to learn linux kernel internals ● Porting to other architectures is a challenging task Hypervisor types make Preparation ● Guest ○ Life of a guest ● Drivers ○ virtio devices: console, block, net ● Launcher ○ User space program that sets up and configures Guests ● Host ○ Normal linux kernel + lg.ko kernel module ● Switcher ○ Low level Host <-> Guest switch ● Mastery ○ What’s next? Lguest overview make Guest ● “Simple creature, identical to the Host [..] behaving in simplified ways” ○ Same kernel as Host (or at least, built from the same source) ● booting trace (when using bzImage) ○ arch/x86/boot/compressed/head_32.S: startup_32 ■ uncompresses the kernel ○ arch/x86/kernel/head_32.S : startup_32 ■ initialization , checks hardware_subarch field ○ arch/x86/lguest/head_32.S ■ LGUEST_INIT hypercall, jumps to lguest_init ○ arch/x86/lguest/boot.c ■ once we are here we know we are a guest ■ override privileged instructions ■ boot as normal kernel, calling i386_start_kernel Hypercalls struct lguest_data ● second communication method between Host and Guest ● LHCALL_GUEST_INIT ○ hypercall to tell the Host where lguest_data struct is ● both Guest and Host publish information here ● optimize hypercalls ○ irq_enabled -
Membrane: Operating System Support for Restartable File Systems Swaminathan Sundararaman, Sriram Subramanian, Abhishek Rajimwale, Andrea C
Membrane: Operating System Support for Restartable File Systems Swaminathan Sundararaman, Sriram Subramanian, Abhishek Rajimwale, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, Michael M. Swift Computer Sciences Department, University of Wisconsin, Madison Abstract and most complex code bases in the kernel. Further, We introduce Membrane, a set of changes to the oper- file systems are still under active development, and new ating system to support restartable file systems. Mem- ones are introduced quite frequently. For example, Linux brane allows an operating system to tolerate a broad has many established file systems, including ext2 [34], class of file system failures and does so while remain- ext3 [35], reiserfs [27], and still there is great interest in ing transparent to running applications; upon failure, the next-generation file systems such as Linux ext4 and btrfs. file system restarts, its state is restored, and pending ap- Thus, file systems are large, complex, and under develop- plication requests are serviced as if no failure had oc- ment, the perfect storm for numerous bugs to arise. curred. Membrane provides transparent recovery through Because of the likely presence of flaws in their imple- a lightweight logging and checkpoint infrastructure, and mentation, it is critical to consider how to recover from includes novel techniques to improve performance and file system crashes as well. Unfortunately, we cannot di- correctness of its fault-anticipation and recovery machin- rectly apply previous work from the device-driver litera- ery. We tested Membrane with ext2, ext3, and VFAT. ture to improving file-system fault recovery. File systems, Through experimentation, we show that Membrane in- unlike device drivers, are extremely stateful, as they man- duces little performance overhead and can tolerate a wide age vast amounts of both in-memory and persistent data; range of file system crashes. -
Lguest the Little Hypervisor
Lguest the little hypervisor Matías Zabaljáuregui [email protected] based on lguest documentation written by Rusty Russell note: this is a draft and incomplete version lguest introduction launcher guest kernel host kernel switcher don't have time for: virtio virtual devices patching technique async hypercalls rewriting technique guest virtual interrupt controller guest virtual clock hw assisted vs paravirtualization hybrid virtualization benchmarks introduction A hypervisor allows multiple Operating Systems to run on a single machine. lguest paravirtualization within linux kernel proof of concept (paravirt_ops, virtio...) teaching tool guests and hosts A module (lg.ko) allows us to run other Linux kernels the same way we'd run processes. We only run specially modified Guests. Setting CONFIG_LGUEST_GUEST, compiles [arch/x86/lguest/boot.c] into the kernel so it knows how to be a Guest at boot time. This means that you can use the same kernel you boot normally (ie. as a Host) as a Guest. These Guests know that they cannot do privileged operations, such as disable interrupts, and that they have to ask the Host to do such things explicitly. Some of the replacements for such low-level native hardware operations call the Host through a "hypercall". [ drivers/lguest/hypercalls.c ] high level overview launcher /dev/lguest guest kernel host kernel switcher Launcher main() some implementations /dev/lguest write → initialize read → run guest Launcher main [ Documentation/lguest/lguest.c ] The Launcher is the Host userspace program which sets up, runs and services the Guest. Just to confuse you: to the Host kernel, the Launcher *is* the Guest and we shall see more of that later. -
Proceedings of the Linux Symposium
Proceedings of the Linux Symposium Volume One June 27th–30th, 2007 Ottawa, Ontario Canada Contents The Price of Safety: Evaluating IOMMU Performance 9 Ben-Yehuda, Xenidis, Mostrows, Rister, Bruemmer, Van Doorn Linux on Cell Broadband Engine status update 21 Arnd Bergmann Linux Kernel Debugging on Google-sized clusters 29 M. Bligh, M. Desnoyers, & R. Schultz Ltrace Internals 41 Rodrigo Rubira Branco Evaluating effects of cache memory compression on embedded systems 53 Anderson Briglia, Allan Bezerra, Leonid Moiseichuk, & Nitin Gupta ACPI in Linux – Myths vs. Reality 65 Len Brown Cool Hand Linux – Handheld Thermal Extensions 75 Len Brown Asynchronous System Calls 81 Zach Brown Frysk 1, Kernel 0? 87 Andrew Cagney Keeping Kernel Performance from Regressions 93 T. Chen, L. Ananiev, and A. Tikhonov Breaking the Chains—Using LinuxBIOS to Liberate Embedded x86 Processors 103 J. Crouse, M. Jones, & R. Minnich GANESHA, a multi-usage with large cache NFSv4 server 113 P. Deniel, T. Leibovici, & J.-C. Lafoucrière Why Virtualization Fragmentation Sucks 125 Justin M. Forbes A New Network File System is Born: Comparison of SMB2, CIFS, and NFS 131 Steven French Supporting the Allocation of Large Contiguous Regions of Memory 141 Mel Gorman Kernel Scalability—Expanding the Horizon Beyond Fine Grain Locks 153 Corey Gough, Suresh Siddha, & Ken Chen Kdump: Smarter, Easier, Trustier 167 Vivek Goyal Using KVM to run Xen guests without Xen 179 R.A. Harper, A.N. Aliguori & M.D. Day Djprobe—Kernel probing with the smallest overhead 189 M. Hiramatsu and S. Oshima Desktop integration of Bluetooth 201 Marcel Holtmann How virtualization makes power management different 205 Yu Ke Ptrace, Utrace, Uprobes: Lightweight, Dynamic Tracing of User Apps 215 J. -
Zack's Kernel News
KERNEL NEWS ZACK’S KERNEL NEWS ReiserFS Turmoil Their longer term plan, Alexander Multiport Card driver, again naming In light of recent events surrounding says, depends on what happens with himself the maintainer. Hans Reiser (http:// www. linux-maga- Hans. If Hans is released, the developers Jiri’s been submitting a number of zine. com/ issue/ 73/ Linux_World_News. intend to proceed as before. If he is not patches for these drivers, so it makes pdf), the question of how to continue released, Alexander’s best guess is that sense he would maintain them if he ReiserFS development came up on the the developers will try to appoint a wished; in any event, no other kernel linux-kernel mailing list. Alexander proxy to run Namesys. hacker has spoken up to claim the role. Lyamin from Hans’s Namesys company offered his take on the situation. He said Status of sysctl Filesystem Benchmarks that ReiserFS 3 has pretty much stabi- In keeping with Linus Torvalds’ recent Some early tests have indicated that ext4 lized into bugfix mode, though Suse assertions that it is never acceptable to is faster with disk writes than either ext3 folks had been adding new features like break user-space, Albert Cahalan volun- or Reiser4. There was general interest in ACL support. So ReiserFS 3 would go on teered to maintain the sysctl code if it these results, though the tests had some as before. couldn’t be removed. But Linus pointed problems (the tester thought delayed In terms of Reiser4, however, Alexan- out that really nothing actually used allocation was part of ext4, when that der said that he and the other Reiser de- sysctl (the implication being that it feature has not yet been merged into velopers were still addressing the techni- wouldn’t actually break anything to get Andrew Morton’s tree). -
Building Embedded Linux Systems ,Roadmap.18084 Page Ii Wednesday, August 6, 2008 9:05 AM
Building Embedded Linux Systems ,roadmap.18084 Page ii Wednesday, August 6, 2008 9:05 AM Other Linux resources from O’Reilly Related titles Designing Embedded Programming Embedded Hardware Systems Linux Device Drivers Running Linux Linux in a Nutshell Understanding the Linux Linux Network Adminis- Kernel trator’s Guide Linux Books linux.oreilly.com is a complete catalog of O’Reilly’s books on Resource Center Linux and Unix and related technologies, including sample chapters and code examples. ONLamp.com is the premier site for the open source web plat- form: Linux, Apache, MySQL, and either Perl, Python, or PHP. Conferences O’Reilly brings diverse innovators together to nurture the ideas that spark revolutionary industries. We specialize in document- ing the latest tools and systems, translating the innovator’s knowledge into useful skills for those in the trenches. Visit con- ferences.oreilly.com for our upcoming events. Safari Bookshelf (safari.oreilly.com) is the premier online refer- ence library for programmers and IT professionals. Conduct searches across more than 1,000 books. Subscribers can zero in on answers to time-critical questions in a matter of seconds. Read the books on your Bookshelf from cover to cover or sim- ply flip to the page you need. Try it today for free. main.title Page iii Monday, May 19, 2008 11:21 AM SECOND EDITION Building Embedded Linux SystemsTomcat ™ The Definitive Guide Karim Yaghmour, JonJason Masters, Brittain Gilad and Ben-Yossef, Ian F. Darwin and Philippe Gerum Beijing • Cambridge • Farnham • Köln • Sebastopol • Taipei • Tokyo Building Embedded Linux Systems, Second Edition by Karim Yaghmour, Jon Masters, Gilad Ben-Yossef, and Philippe Gerum Copyright © 2008 Karim Yaghmour and Jon Masters. -
Bunker: a Privacy-Oriented Platform for Network Tracing Andrew G
Bunker: A Privacy-Oriented Platform for Network Tracing Andrew G. Miklas†, Stefan Saroiu‡, Alec Wolman‡, and Angela Demke Brown† †University of Toronto and ‡Microsoft Research Abstract: ISPs are increasingly reluctant to collect closing social security numbers, names, addresses, or and store raw network traces because they can be used telephone numbers [5, 54]. to compromise their customers’ privacy. Anonymization Trace anonymization is the most common technique techniques mitigate this concern by protecting sensitive information. Trace anonymization can be performed of- for addressing these privacy concerns. A typical imple- fline (at a later time) or online (at collection time). Of- mentation uses a keyed one-way secure hash function to fline anonymization suffers from privacy problems be- obfuscate sensitive information contained in the trace. cause raw traces must be stored on disk – until the traces This could be as simple as transforming a few fields in are deleted, there is the potential for accidental leaks or the IP headers, or as complex as performing TCP connec- exposure by subpoenas. Online anonymization drasti- tion reconstruction and then obfuscating data (e.g., email cally reduces privacy risks but complicates software en- addresses) deep within the payload. There are two cur- gineering efforts because trace processing and anony- mization must be performed at line speed. This paper rent approaches to anonymizing network traces: offline presents Bunker, a network tracing system that combines and online. Offline anonymization collects and stores the software development benefits of offline anonymiz- the entire raw trace and then performs anonymization ation with the privacy benefits of online anonymization. as a post-processing step. -
Proceedings of the Linux Symposium Volume
Proceedings of the Linux Symposium Volume Two July 19th–22nd, 2006 Ottawa, Ontario Canada Contents Evolution in Kernel Debugging using Hardware Virtualization With Xen 1 Nitin A. Kamble Improving Linux Startup Time Using Software Resume (and other techniques) 17 Hiroki Kaminaga Automated Regression Hunting 27 A. Bowen, P. Fox, J. Kenefick, A. Romney, J. Ruesch, J. Wilde, & J. Wilson Hacking the Linux Automounter—Current Limitations and Future Directions 37 Ian Maxwell Kent & Jeff Moyer Why NFS Sucks 51 Olaf Kirch Efficient Use of the Page Cache with 64 KB Pages 65 Dave Kleikamp and Badari Pulavarty Startup Time in the 21st Century: Filesystem Hacks and Assorted Tweaks 71 Benjamin C.R. LaHaise Using Hugetlbfs for Mapping Application Text Regions 75 H.J. Lu, K. Doshi, R. Seth, & J. Tran Towards a Better SCM: Revlog and Mercurial 83 Matt Mackall Roadmap to a GL-based composited desktop for Linux 91 K.E. Martin and K. Packard Probing the Guts of Kprobes 101 A. Mavinakayanahalli, P. Panchamukhi, J. Keniston, A. Keshavamurthy, & M. Hiramatsu Shared Page Tables Redux 117 Dave McCracken Extending RCU for Realtime and Embedded Workloads 123 Paul E. McKenney OSTRA: Experiments With on-the-fly Source Patching 139 Arnaldo Carvalho de Melo Design and Implementation to Support Multiple Key Exchange Protocols for IPsec 143 K. Miyazawa, S. Sakane, K. Kamada, M. Kanda, & A. Fukumoto The State of Linux Power Management 2006 151 Patrick Mochel I/O Workload Fingerprinting in the Genetic-Library 165 Jake Moilanen X86-64 XenLinux: Architecture, Implementation, and Optimizations 173 Jun Nakajima, Asit Mallick GCC—An Architectural Overview, Current Status, and Future Directions 185 Diego Novillo Shared-Subtree Concept, Implementation, and Applications in Linux 201 Al Viro & Ram Pai The Ondemand Governor 215 Venkatesh Pallipadi & Alexey Starikovskiy Linux Bootup Time Reduction for Digital Still Camera 231 Chan-Ju Park A Lockless Pagecache in Linux—Introduction, Progress, Performance 241 Nick Piggin The Ongoing Evolution of Xen 255 I. -
Virtual Servers and Checkpoint/Restart in Mainstream Linux
Virtual Servers and Checkpoint/Restart in Mainstream Linux Sukadev Bhattiprolu Eric W. Biederman Serge Hallyn IBM Arastra IBM [email protected] [email protected] [email protected] Daniel Lezcano IBM [email protected] ABSTRACT one task will use one table to look for a task corresponding Virtual private servers and application checkpoint and restart to a specific pid, in other word this table is relative to the are two advanced operating system features which place dif- task. ferent but related requirements on the way kernel-provided For the past several years, we have been converting kernel- resources are accessed by userspace. In Linux, kernel re- provided resource namespaces in Linux from a single, global sources, such as process IDs and SYSV shared messages, table per resource, to Plan9-esque [31] per-process names- have traditionally been identified using global tables. Since paces for each resource. This conversion has enjoyed the for- 2005, these tables have gradually been transformed into per- tune of being deemed by many in the Linux kernel develop- process namespaces in order to support both resource avail- ment community as a general code improvement; a cleanup ability on application restart and virtual private server func- worth doing for its own sake. This perception was a tremen- tionality. Due to inherent differences in the resources them- dous help in pushing for patches to be accepted. But there selves, the semantics of namespace cloning differ for many are two additional desirable features which motivated the of the resources. This paper describes the existing and pro- authors. -
Using KVM to Run Xen Guests Without Xen
Using KVM to run Xen guests without Xen Ryan A. Harper Michael D. Day Anthony N. Liguori IBM IBM IBM [email protected] [email protected] [email protected] Abstract on providing future extensions to improve MMU and hardware virtualization. The inclusion of the Kernel Virtual Machine (KVM) Linux has also recently gained a set of x86 virtual- driver in Linux 2.6.20 has dramatically improved ization enhancements. For the 2.6.20 kernel release, Linux’s ability to act as a hypervisor. Previously, Linux the paravirt_ops interface and KVM driver were was only capable of running UML guests and contain- added. paravirt_ops provides a common infras- ers. KVM adds support for running unmodified x86 tructure for hooking portions of the Linux kernel that operating systems using hardware-based virtualization. would traditionally prevent virtualization. The KVM The current KVM user community is limited by the driver adds the ability for Linux to run unmodified availability of said hardware. guests, provided that the underlying processor supports The Xen hypervisor, which can also run unmodified op- either AMD or Intel’s CPU virtualization extensions. erating systems with hardware virtualization, introduced Currently, there are three consumers of the a paravirtual ABI for running modified operating sys- paravirt_ops interface: Lguest [Lguest], a tems such as Linux, Netware, FreeBSD, and OpenSo- “toy” hypervisor designed to provide an example laris. This ABI not only provides a performance advan- implementation; VMI [VMI], which provides support tage over running unmodified guests, it does not require for guests running under VMware; and XenoLinux any hardware support, increasing the number of users [Xen], which provides support for guests running under that can utilize the technology. -
Weekend Chat
COMMUNITY FOSDEM 2004 FOSDEM 2004 Weekend Chat FOSDEM is the Free and Open Source Developers Meeting that takes place each year in Belgium’s capital city of Brussels.The meeting can be split into two halves. BY JOHN SOUTHERN Figure 2: Jon “Maddog” Hall explaining history. he first half is the organized talks, Love, who talked about the kernel and Talking lectures and tutorials which take its affects on the desktop. Followed by The second half of FOSDEM is just as Tplace over a weekend, This year 2.6 Internals and then Hans Reiser took important. It is the community activities. started with an opening speech by Tim listeners on a tour of Reiser 4. Those that can spare the Friday evening, O’Reilly of O’Reilly books. Everyone lis- In the tutorials, scripting languages meet in the Roy d’Espagne for an tens to the opening speech and the included Ruby and NetBeans. Everyone evening of Belgian beer. traditional talk by Richard Stallman. then met for the Free Software Award, Although the beer is wonderful, many RMS gave his traditional speech, which was won by Alan Cox, and the do not drink and still go because this is including his Saint IGNUcius sketch, singing of the Free Software song. where you meet up others and get a where he blesses the crowd and indoc- Sunday started with streams on X and chance to chat about what everyone torates them in the Church of Emacs. Java. In the tutorial room, rather than thinks are the important developments. hour long talks, we were treated to sev- This chatting to like minded strangers, Talks enteen short fifteen minute talks.