Proceedings of the 4th Annual Linux Showcase & Conference, Atlanta

USENIX Association

Proceedings of the 4th Annual Linux Showcase & Conference, Atlanta
Atlanta, Georgia, USA
October 10–14, 2000

THE ADVANCED COMPUTING SYSTEMS ASSOCIATION

© 2000 by The USENIX Association. All Rights Reserved. For more information about the USENIX Association: Phone: 1 510 528 8649 FAX: 1 510 548 5738 Email: [email protected] WWW: http://www.usenix.org

Rights to individual papers remain with the author or the author's employer. Permission is granted for noncommercial reproduction of the work for educational or research purposes. This copyright notice must be included in the reproduced paper. USENIX acknowledges all trademarks herein.

The Linux BIOS

Ron Minnich, James Hendricks, Dale Webster
Advanced Computing Lab, Los Alamos National Labs
Los Alamos, New Mexico

August 15, 2000

Abstract

The Linux BIOS replaces the normal BIOS found on PCs, Alphas, and other machines. The BIOS boot and setup is eliminated and replaced by a very simple initialization phase, followed by a gunzip of a Linux kernel. The Linux kernel is then started and from there on the boot proceeds as normal. Current measurements on two mainboards show we can go from a machine power-off state to the “mount root” step in under a second, depending on the type of hardware in the machine. The actual boot time is difficult to measure accurately at present because it is so small.

As the name implies, the LinuxBIOS is primarily Linux. Linux needs a small number of patches to handle uninitialized hardware: about 10 lines of patches so far. Other than that it is an off-the-shelf 2.3.99-pre5 kernel. The LinuxBIOS startup code is about 500 lines of assembly and 1500 lines of C.

The Linux BIOS can boot other kernels; it can use the LOBOS (ref) or bootimg (ref) tools for this purpose. Because we are using Linux, the boot mechanism can be very flexible. We can boot over standard Ethernet, or over other interconnects such as Myrinet, Quadrics, or Scalable Coherent Interface. We can use SSH connections to load the kernel, or use InterMezzo or NFS. Using a real operating system to boot another operating system provides much greater flexibility than using a simple netboot program or BIOS such as PXE.

LinuxBIOS currently boots from power-off to multi-user login on two mainboards, the Intel L440GX+ and the Procomm PSBT1. We are currently working with industrial partners (Dell, Compaq, SiS, and VIA) to port the LinuxBIOS to other machines. According to one vendor, we should be able to purchase their LinuxBIOS-based mainboards by the end of this year.

1 Introduction

Current PC and Alpha cluster nodes, as delivered, are dependent on a vendor-supplied BIOS for booting. The BIOS in turn relies on inherently unreliable devices (floppy disks and hard disks) to boot the operating system. While there has been some movement toward a net booting standard, the basic design of the netboot as defined by PXE (ref) is inherently flawed, and not usable in a large-scale cluster environment.

Further, while the BIOS is only required to do a few very simple things, it usually does not even do those well. We have found BIOSes that configure all mainboard devices to a single interrupt (from Compaq); BIOSes that do not configure memory-addressable cards to page-aligned addresses (Gateway and others); BIOSes that will not boot without a keyboard attached to the machine (some versions of the Alpha BIOSes); and BIOSes that will not boot if the real-time clock is invalid (Alpha BIOSes). One engineer has reported to us that he had to write his own PCI configuration code because no BIOS extant could handle his machine's bus structure. BIOSes also routinely zero all main memory when they start, which makes finding lockups in operating systems very hard, since on restart the message buffer and the OS image are wiped out.

Another problem with BIOSes is their inability to accommodate non-standard hardware. For example, the Information Sciences Institute (ISI) SLAAC1 reconfigurable computing board will not work with some BIOSes because it does not configure its PCI interface quickly enough. In some cases, on some mainboards, by the time the BIOS is done probing the PCI bus the SLAAC1 card is not even ready to be probed. As a result the BIOS never finds the SLAAC1. Changing the SLAAC1 hardware is not an option, because the speed problem is an artifact of the FPGA that is used for the PCI interface. If we had control over the BIOS, however, we could change it to accommodate experimental boards which do not always work with standard BIOSes.

Finally, BIOS maintenance is a nightmare. All Pentium BIOSes are maintained via DOS programs, and configuration settings for most Pentium BIOSes and all Alpha BIOSes are made via keyboard and display (and, in some truly deranged cases, a mouse). It is completely impractical to walk up to 1024 racked PC nodes and boot DOS on each one in turn to change one BIOS setting. Yet this impractical method is the only one supported by vendors.

This situation is completely unacceptable for clusters of 64, 128, or more nodes. The only practical way to maintain a BIOS in a cluster is via the network, under control of an operating system: BIOS configuration and upgrade should be possible from user programs running under an operating system. BIOS parameters should be available via /proc.

2 Related Work

There are a number of efforts ongoing to replace the BIOS. For the most part these efforts are aimed at developing an open-source replacement for the BIOS that supports all BIOS functions; in other words, plug-compatible BIOS software. An end goal of most of these efforts is to be able to boot DOS and have it work correctly. Example projects are the OpenBIOS (www.openbios.org) and the FreeBIOS (which can be found at www.sourceforge.net).

Other efforts aim to build a completely open source version of the IEEE Open Boot standard. This work will require the construction of a Forth interpreter; code to support protocol stacks such as TCP/IP; other code to support disk I/O and file systems on those disks; in short, many pieces that are already available in real operating systems. If the experience of other similar projects is any indication, this BIOS will take almost as much NVRAM as a gzipped Linux kernel, while having much less capability. As an example, the Intel BIOS for the L440GX+ mainboard takes 50% more memory (750 KB) than the LinuxBIOS, and is far less capable.

Finally there is one recent effort which is aimed at providing a commercial replacement for the BIOS, called SmartFirmware (http://www.codegen.com/SmartFirmware/index.html). SmartFirmware is also IEEE Open Boot compliant.

While these efforts are interesting they do not address our needs for clustering. The problem with the BIOS is not only that it is closed-source; the problem is that it is performing a function we no longer need (supporting DOS), rather than functions we do need. The limited nature of the BIOS was originally imposed in the days of 8 KB EPROMs; now, given that the BIOS is using most of a 1 MB NVRAM, we ought to be able to do better. The next generation of mainboard NVRAMs will be 2 MB: there is lots of room for a real operating system on the mainboard. Finally, we need to have the option of doing things the BIOS writers cannot imagine, such as booting over Scalable Coherent Interface or managing the BIOS via SSH connections.

3 Structure of the Linux BIOS

The structure of the LinuxBIOS is driven by our desire to avoid writing new code. We want to build the absolute minimum amount of code needed to get Linux up, and then let Linux do the rest of the work. Linux has shown its ability to handle quirky hardware; it is doubtful we can do a better job than it can.

One initial concern for any PC BIOS is what mode to run the processor in. All x86 family processors boot up into an 8086 emulation mode, running with 16-bit addresses, operators, and operands. In other words, 1000 MHz Pentiums on startup are emulating a 16-bit, 6 MHz, nearly 25-year-old processor. BIOSes even now have to take into account such unpleasant details as near and far pointers (see the PXE standard for some examples).

Some of the new open source BIOS efforts have taken the approach of doing a fair amount of startup in 16-bit assembly, and at some point making a transition to protected mode. There are three serious problems with this approach. First, the 8086 emulation is guaranteed to work for DOS, but we have seen cases where certain instruction sequences don't seem to work right. Emulation has its limitations, and depending on it seems very risky. Second, some hardware initialization absolutely requires having access to 32-bit addresses. For example, setting up PC-100 SDRAM requires at some point a write [...]

[...] it. On uninitialized hardware, however, the IDE controller is not enabled because there is no BIOS to enable it in the first place. Thus, "IDE controller not enabled" has a diametrically opposite meaning in BIOS and non-BIOS environments. We have chosen to make the one-line change to the Linux IDE driver to enable IDE controllers when they are found.

