<<

RevIews 3

Minix 3 and the experience SMART KERNEL covado, Fotolia

Minix is often viewed as the spiritual predecessor of , but these two cousins could never agree on the kernel design. Now a new Minix with a BSD-style is poised to attract a new generation of users. BY RÜDIGER WEIS

inux has a long and stormy rela- documented system soon became popu- ponents incorporated into the kernel. In tionship [1] with another Unix- lar with OS enthusiasts. In a post to the a famous post to the Minix group, the Llike known as Minix newsgroup, upstart Finnish under- Minix creator referred to Linux as “… a Minix [2]. Noted author and graduate announced his giant step back to the 1970s,” and a con- scientist Andrew S. Tanenbaum released own experimental system in 1991, and fident reply from young Torvalds to this the first version of Minix in 1987 as a many early Linux contributors came leading expert in the field of operating tool for teaching students about operat- from the ranks of the Minix community. systems is early evidence of his now-leg- ing systems, and this small and well- But Tanenbaum and Torvalds clashed endary directness. Still, Linus has ac- early over issues of design. Tanenbaum knowledged the importance of Tanen- Rüdiger Weis is a professor of system has always favored the microkernel ar- baum’s work to the formation of his own programming at TFH Berlin. Between chitecture, a distinguishing feature of ideas. In his autobiography Just for Fun 2002 and 2005, he was involved in Minix to this day (see the box titled [3], Linus refers to Tanenbaum’s Operat- post-doctorate research on secure op- “Why Can’t Just Work All ing Systems: Design and Implementation erating systems under Professor An- the Time?”). Linus, on the other hand, as the book that changed his life. drew S. Tanenbaum at the Vrije uni- built Linux with a monolithic kernel, The debate about micro- versus mono-

THE AUTHOR THE versiteit, Amsterdam. with filesystems, drivers, and other com- lithic kernels goes on to this day, and

48 ISSUE 99 FEBRuARy 2009 NETWAYS OSDC

OPEN SOURCE DATA just as Linux didn’t Program/ Program/ fade away, neither CENTER CONFERENCE Server Server Filesystem did Minix. Version 3 of the Minix oper- 29 & 30 April 2009 | Nuremberg Reincarnation Server ating system is de-

User Space Network Stack signed with the ob- Driver Driver Driver jective of creating a system that is more secure and reliable Microkernel than comparable POSIX systems, I/O access Scheduler MMU and a BSD-style

Kernel Space open source license makes the latest Figure 1: The Minix microkernel encapsulates many subsys- Minix a strong can- tems in , including drivers, the filesystem, and the didate for produc- network stack. The kernel just runs critical functions, such tion as well as edu- as underlying I/​O, schedulers, and . cational uses. Minix is even at- tracting the attention of some major sponsors. The EU is now sponsoring the project with several million Euros of funding, and Google has a number of Minix projects in its “Summer of Code” program. runs on 32-bit CPUs, as well as on a number of virtual machines including Qemu, , and VMware. The operating system includes an X Win- dow System (X11), a number of editors ( and Vi), shells (including and Zsh), GCC, languages such as Python and Perl, and network tools such as SSH. A small footprint and crash-resistant design Minix a good http://www.netways.de/osdc candidate for embedded systems, and its superior stability has led to a promis- ing new role as a firewall system. EARLY BIRD SPECIALS - BOOK NOW Insecure by Design The security problems facing the current crop of operating systems, including Windows, but also including Linux, are the result of design errors. The errors were inherited for the most part from their predecessors of the 1960s. Most of Top IT professionals will meet, discuss, these problems can be attributed to the fact that developers aren’t perfect. Hu- learn and share their expertise on mans make mistakes. Of course, it would be nice to reduce the numbers and mitigate the effects; however, designers have frequently been far too willing to Open Source solutions for large IT compromise security and a clean design for speed. Tanenbaum refers to this as infrastructures. a “Faustian pact.” In addition to the issues related to sheer size, monolithic designs are also prone to inherent structural problems: Any error is capable of endangering the Break new ground on: whole system. A fundamental design error is that current operating systems do • High Availability The Question of Extension • Clustering Many developers and users disagree with Tanenbaum’s doctrine, he has • Load Balancing maintained for over a decade, of being very cautious about introducing exten- sions to the kernel. Tanenbaum’s measure of reasonable operating system com- • Security Management | Firewalling plexity is a system that can be taught in a single term. Modularity makes it possi- ble to complete the development of a practically deployable solution in the • Large Scale Databases of a thesis. Examples of this are ports for various processor architectures, modifi- • Configuration Management cations to Minix for Xen , and security applications. In his memoir [3], Linus Torvalds states his reason for rejecting the microkernel architecture for Linux. “The theory behind the microkernel was that you split the kernel into fifty independent parts, and each of the parts is a fiftieth of the com- OPEN SOURCE DATA plexity. But everybody ignores the fact that the communication among the parts CENTER CONFERENCE is actually more complicated than the original system was – never mind the fact 29. & 30. April | Nürnberg that the parts are still not trivial.” A messy monolithic system can thus offer some performance and scalability benefits, even if it lacks the stability of a micro- kernel. presented by supported by ® NETWAYS MAGAZIN

AnzeigeV1.1_Englisch.indd 1 20.11.2008 16:35:28 Reviews Minix 3

not follow the Principle Of Least Author- Continued operating system growth package from a stranger and bringing it ity (POLA). To put this simply, POLA comes with the integration of new driv- into the cockpit of a plane. states that developers should distribute ers. Monolithic systems build device systems over a number of modules so an drivers into the kernel, which means Transparent Architecture error in one module will not compromise that a driver error can compromise the Minix is probably the most fully docu- the security and stability of other mod- stability of the whole system. Closed mented operating system around. The ules. They should also make sure that source drivers in particular endanger Minix Book by Tanenbaum and Wood- each module only has the rights that it system security. According to Tanen- hull [4] is the primary reference. Numer- actually needs to complete its assigned baum, building a closed source driver ous publications on new features and tasks. into the kernel is like accepting a sealed ongoing research are found on the Minix

Why Can’t Computers Just Work All the Time? By Andrew S. Tanenbaum Computer users are changing. Ten years the more bugs there are. Various studies They have to use kernel services to read ago, most computer users were young have shown that the number of bugs per and write to the hardware. The layer of people or professionals with lots of techni- thousand lines of code (KLoC) varies from processes running in user-mode directly cal expertise. When things went wrong – 1 to 10 in large production systems. A re- above the kernel consists of device drivers, which they often did – they knew how to ally well-written piece of software might with the disk driver, the Ethernet driver, fix things. Nowadays, the average user is have 2 bugs per KLoC over time, but not and all the other drivers running as sepa- far less sophisticated, perhaps a 12-year- fewer. An operating system with, say, 4 rate processes protected by the MMU old girl or a grandfather. Most of them million lines of code is thus likely to have hardware so they cannot execute any privi- know about as much about fixing com- at least 8000 bugs. Not all are fatal, but leged instructions and cannot read or write puter problems as the average computer some will be. A study at Stanford Univer- any memory except their own. nerd knows about repairing his car. What sity showed that device drivers – which Above the driver layer comes the server they want more than anything else is a make up 70% of the code base of a typical layer, with a file server, a server, computer that works all the time, with no operating system – have bug rates 3x to 7x and other servers. The servers make use of glitches and no failures. higher than the rest of the system. Device the drivers as well as kernel services. For drivers have higher bug rates because (1) Many users automatically compare their example, to read from a file, a user process they are more complicated and (2) they are computer to their television . Both are sends a message to the file server, which inspected less. While many people study full of magical electronics and have big then sends a message to the disk driver to the scheduler, few look at printer drivers. screens. Most users have an implicit model fetch the blocks needed. When the file sys- of a television set: (1) you buy the set; (2) The Solution: Smaller Kernels tem has them in its buffer cache, it calls the you plug it in; (3) it works perfectly without The solution to this problem is to move kernel to move them to the user’s address any failures of any kind for the next 10 code out of the kernel, where it can do space. years. They that from the computer, maximal damage, and put it into user- In addition to these servers, there is an- and when they do not get it, they get frus- space processes, where bugs cannot cause other server called the reincarnation trated. When computer experts tell them: system crashes. This is how Minix 3 is de- server. The reincarnation server is the par- “If God had wanted computers to work all signed. The current Minix system is the ent of all the driver and server processes the time, He wouldn’t have invented (second) successor to the original Minix, and monitors their behavior. If it discovers RESET buttons” they are not impressed. which was originally launched in 1987 as one process that is not responding to its For lack of a better definition of dependabil- an educational operating system but has pings, it starts a fresh copy from disk (ex- ity, let us adopt this one: A device is said to since been radically revised into a highly cept for the disk driver, which is shadowed be dependable if 99% of the users never ex- dependable, self-healing system. What fol- in RAM). The system has been designed so perience any failures during the entire pe- lows is a brief description of the Minix ar- that many (but not all) of the critical drivers riod they own the device. By this definition, chitecture; you can more at http://​ and servers can be replaced automatically, virtually no computers are dependable, ­www.minix3.​­ org​­ . while the system is operating, without dis- whereas most TVs, iPods, digital cameras, Minix 3 is designed to run as little code as turbing running user processes or even camcorders, etc. are. Techies are willing to possible in kernel mode, where bugs can notifying the user. In this way, the system forgive a computer that crashes once or easily be fatal. Instead of 3-4 million lines is self healing. twice a year; ordinary users are not. of kernel code, Minix 3 has about 5000 To whether these ideas work in prac- Home users aren’t the only ones annoyed lines of kernel code. Sometimes kernels tice, we ran the following experiment. We by the poor dependability of computers. this small are called . They started a fault-injection process that over- Even in highly technical settings, the low handle low-level process management, wrote 100 machine instructions in the run- dependability of computers is a problem. scheduling, , and the clock, and ning binary of the Ethernet driver to see Companies like Google and Amazon, with they provide some low-level services to what would happen if one of them were hundreds of thousands of servers, experi- user-space components. executed. If nothing happened for a few ence many failures every day. They have The bulk of the operating system runs as a seconds, another 100 were injected, and so learned to live with this, but they would re- collection of device drivers and servers, on. In all, we injected 800,000 faults into ally prefer systems that just worked all the each running as an ordinary user-space each of three different Ethernet drivers and time. Unfortunately, current software fails process with restricted privileges. None of caused 18,000 driver crashes. In all cases, them. these drivers and servers run as superuser the driver was replaced automatically by The problem is that software con- or equivalent. They cannot even access I/​O the reincarnation server. Despite injecting tains bugs, and the more software there is, devices or the MMU hardware directly. 2.4 million faults into the system, not once

50 ISSUE 99 FEBRuary 2009 Minix 3 Reviews

3 homepage [2]. Minix is compliant with rors. The microkernel architecture imple- INFO the POSIX standard IEEE 1003.2-1996, ments drivers as separate user mode pro- and developers have already ported cesses that are not permitted to execute [1] Tanenbaum, Andrew S., “Some many Unix programs to Minix. privileged commands or I/​O operations Notes on the ‘Who wrote Linux’ Ker- or write directly to memory. Instead, fuffle, Release 1.5,” 2004, The Minix Difference these operations are performed by audit- http://www.​­ cs.​­ vu.​­ nl/​­ ~ast/​­ brown/​­ Minix 3 is on the syllabus of many uni- able kernel calls (see Figure 1). [2] Minix Project: versities, and many generations of stu- Minix 3 uses fixed-length messages http://www.​­ minix3.​­ org​­ dents have scrutinized the few thousand for process communications. This de- [3] Torvalds, Linus, and David Diamond. lines of Minix code and fixed most er- sign simplifies the code structure and Just for Fun. HarperBusiness, 2001 helps mitigate the danger of buffer over- [4] Tanenbaum, Andrew S., and Albert Why Can’t Computers Just Work All the Time? By Andrew S. Tanenbaum flows. The Minix filesystem runs as a S. Woodhull. The Minix Book: Oper‑ ating System Design and Implemen‑ did the operating system crash. Needless simple user process. Because it is made tation. Prentice-Hall, 2006 to say, if a fatal error occurs in a Linux or up of around 8,200 lines of userspace Windows driver running in the kernel, the code, but no kernel code, it is easy to [5] Weis, Rüdiger, “Linux is obsolete entire operating system will crash in- debug. 2.0,” presented at Chaos Communi- stantly. An innovative component, the reincar- cation Camp 2007, http://​­public.​ ­tfh‑berlin.​­de/​­~rweis/​­vorlesungen/​ Is there a downside to this approach? Yes. nation server, enhances the reliability of ­ComputerSicherheit/​ There is a performance hit. We have not the Minix system by serving as the par- ­WeisLinuxIsObsolete2.pdf​­ measured it extensively, but the research ent of all servers and drivers. It detects [6] : group at Karlsruhe [University of crashes very quickly and continually Karlsruhe, Germany], which has devel- http://www.​­ codeplex.​­ com/​­ ​ monitors the function of critical pro- oped its own microkernel, L4, and then ­singularity cesses, re-starting fallen processes as run Linux on top of it as a single user pro- [7] Minixwall: http://wiki.tfh-berlin. necessary to keep the system running. cess, has been able to get the perfor- de/~minixwall mance hit down to 5%. We believe that, if we put some effort into it, we could get Minix Firewall Project the overhead down to the 5–10% range, Packet filters are an endangered system Linux firewall to its knees. In Minix 3, a too. Performance has not been a priority component. Despite the excellent quality single user process could crash without for us, as most users reading their e-mail of the Linux implementation, a compromising system security. The rein- or looking at Facebook pages are not lim- number of security issues have surfaced carnation server would simply restart ited by CPU performance. What they do in the past. If a subsystem of this kind is the process. want, however, is a system that just works all the time. running on the , it will en- The differences become even more ap- danger system security. Building on parent if an attacker succeeds in execut- If Microkernels Are So Dependable, Why Doesn’t Anyone Use Them? work by the Tanenbaum group, the ing code. In Minix, a hijacked user pro- Technical University of Applied Science cess is still a problem, but the effect is far Actually, they do. Probably you run many Berlin ported the widespread Netfilter less serious thanks to isolation. of them. Your mobile phone, for example, is a small but otherwise normal computer, framework to Minix 3 [5]. Even Microsoft is exploring their own and there is a good chance it runs L4 or Here again, the stability of the micro- microkernel system, named Singularity Symbian, another microkernel. Cisco’s kernel architecture delivers additional [6]. Although Minix has played the mi- high-performance router uses one. In the benefits. In Linux, an attacker who suc- crokernel game for many years now, its military and aerospace markets, where ceeds in provoking a crash – for exam- biggest obstacle to becoming more wide- dependability is paramount, Green Hills ple, by exploiting a buffer overflow in spread has always been its non-free li- Integrity, another microkernel, is widely the do_replace() function – can bring a cense. Now that Minix 3 is released used. PikeOS and QNX are also microker- under the BSD open nels widely used in industrial and embed- source license and the ded systems. In other words, when it re- ally matters that the system “just works all firewall extensions are the time” people turn to microkernels. For available under the more on this topic, see www.cs.​­ vu.​­ nl/​­ ~ast/​­ ​ GPL [7]. Researchers ­reliable‑os/. at the TFH Berlin are In conclusion, it is our belief, based on also working on ex- many conversations with nontechnical ploring Minix’s poten- users, that what they want above all else tial as a virtualized is a system that works perfectly all the firewall. Stability, a time. They have a very low tolerance for small footprint, and a unreliable systems but currently have no new licensing model . We believe that microkernel-based give Minix 3 a strong systems can lead the way to more de- pendable systems. potential for growth, Figure 2: Minix is very spartan on launching. This said, version especially in embed- 3.1.2a does include an X11 interface and developer tools. ded systems. n

FEBRuary 2009 ISSUE 99 51