Inside the Mac OS X Kernel Debunking Mac OS Myths 24Th Chaos Communication Congress 24C3, Berlin 2007
Total Page:16
File Type:pdf, Size:1020Kb
Lucy <whoislucy(at)gmail.com> Inside the Mac OS X Kernel Debunking Mac OS Myths 24th Chaos Communication Congress 24C3, Berlin 2007 Many buzzwords are associated with Mac OS Mac OS Successor X: Mach kernel, microkernel, FreeBSD kernel, As Apple was in bitter need of a successor for C++, 64 bit, UNIX... and while all of these ap- Mac OS, they decided to buy an operating sys- ply in some way, “XNU”, the Mac OS X ker- tem and build Mac OS compatibility into it. nel is neither Mach, nor FreeBSD-based, it's Despite negotiations with the company behind not a microkernel, it's not written in C++ and BeOS, Apple finally decided to buy NEXT, the it's not 64 bit - but it is Open Source (with res- company Steve Jobs had founded just after ervations) and it's UNIX... but just since re- having left Apple in 1985, and to convert cently. NEXTSTEP/OpenStep into the next Mac OS: This paper intends to clear up the confusion Mac OS X. by presenting details of the Mac OS X kernel Mach architecture, its components Mach, BSD and I/ The NEXTSTEP operating system was heav- O-Kit, what's so different and special about this ily based on Mach. Mach was an operating sys- design, and what the special strengths of it are. tem project at the Carnegie Mellon University that was started in 1985 in response to the ever- History increasing complexity of the UNIX and BSD Unlike many other operating systems, the de- kernels. As one of the first microkernels, it only sign of Mac OS X has never been strictly included code for memory management (ad- planned and implemented from scratch, in- dress spaces, tasks), scheduling (threads; a stead, it is the result of code from very differ- concept unknown to UNIX at that time) and ent sources put together over the last decades. inter-process communication (IPC) - all other Mac OS functionality typically found in an operating Mac OS started its life in 1984 on the original system kernel, like filesystems, networking, 128KB Macintosh as a mouse-operated graphi- security and device drivers, had to be imple- cal operating system that, due to memory con- mented in so-called “servers” in user space. straints, did not support multitasking. It wasn't This could be a very big plus for reliability, until 1988 that Mac OS supported a very sim- since a crash in a driver didn't necessarily bring ple form of cooperative multitasking (“Multi- the system down, as well as maintainability, Finder”). In the mid-90s, Apple ended up hav- since it imposed strict rules on the interface ing a ten year old code base designed for a between the core kernel functionality and the single-tasking system on a Motorola 68000 that userland servers. Unlike in UNIX, operating now ran on PowerPC CPUs. Parts of the kernel system components couldn't just call each code ran in a 68K emulator, and it still did not other arbitrarily (“The big mess” - Tanen- support memory protection. There was no way baum). Another advantage of a microkernel to compete even with Windows 95, which is like Mach is the possibility to have several per- why Apple started the Copland project in 1994 sonalities, each of which is a set of userspace in order to design and implement a new and servers. This way, a Mach-based system could, modern operating system that would have the for example, run UNIX and Windows applica- Mac OS API and user interface - much like tions at the same time. Having a minimal piece Microsoft did with Windows NT. But although of code running in privileged mode that ab- Copland had been heavily advertised with de- stracts the hardware and allows different oper- velopers, programming books had been pub- ating systems to run on top of it is basically the lished and Betas had been given out, the pieces same approach implemented by virtualization of Copland never fit together, and the unbeara- today. But the typical configuration of a Mach bly unstable operating system was scrapped in operating system was to have a single BSD 1996. server in user mode, i.e. the majority of the BSD kernel with memory management and API renamed to Cocoa, the Mac OS 9 API scheduling stripped out, and process manage- “Toolbox” ported as a compatibility API (now ment built on top of Mach tasks. named “Carbon”), “carbonized” versions of the The problem with the Mach design was that OS 9 Finder and QuickTime technologies, plus the kernel was slower than a traditional mono- a VMware-like Virtual Machine called Blue- lithic kernel because of the extra kernel/user Box (“Classic”) that runs OS 9 and its applica- context switches when a server communicated tions unmodified. with the kernel or servers communicated with each other. On a monolithic kernel, these were Architecture just simple function calls. The simplest solu- The Mac OS X kernel, named “XNU” (“X is tion for this problem is “co-location”: The per- not UNIX”) consists of three main compo- sonality servers run in kernel mode, and com- nents: Mach, BSD and I/O-Kit. munication is fast again. While it somewhat Mach defeats the original idea of a microkernel, it Being the only operating system that still uses still has the advantage of well-partitioned ker- Mach code (not counting GNU/HURD), Mac nel components and a more modern core ker- OS X has evolved from the original code base nel: The Mach memory management code was quite a bit, but the architecture is basically un- later integrated into BSD. changed. Mach (“osfmk” in the kernel source NEXTSTEP tree, which stands for “OSF microkernel”) NEXTSTEP, which was released in a 1.0 ver- calls address spaces “tasks”, and one task can sion in 1989, chose to go with this design. contain zero or more threads. Being policy- NEXT had removed the core kernel parts from free, there is little information associated with the 4.3BSD kernel and layered it on top of a task, so, for example, there is no UNIX-style Mach, in kernel mode. This way, NEXT was current working directory or environment as- many years ahead of the competition with sociated with it. While there are few surprises NEXTSTEP being the first desktop/GUI oper- in the memory management code compared to ating system that supported preemptive multi- other modern operating systems, the key dis- tasking, memory protection and UNIX com- tinctive feature of Mach is Mach Messaging. A patibility. At first NEXTSTEP only ran on their task can have any number of “ports”, which are own Motorola 68K-based machines, but was interprocess communication (IPC) endpoints. later ported to SPARC, PA-RISC and i386, One task can subsequently send a message when NEXT started licensing it under the name from its originating port to its peer port, and “OpenStep” to other hardware manufacturers, Mach will take care of security, enqueueing, so it was highly portable. When Apple acquired dequeueing, network opacity (ports can be on NEXT in 1997, they added PowerPC support different machines) and, if necessary, byte and removed support for all architectures other swapping. For programming convenience, the than i386; the latter would serve as the fallback Mach Interface Generator (“MIG”) can gener- solution when Apple switched from PowerPC ate stub code from interface definitions, so that to i386 in 2005/2006. two processes can talk to each other using sim- Rhapsody and OS X ple function calls, but internally, this will be With Apple’s acquisition of OpenStep, many translated into Mach messages. more changes were made to the operating sys- BSD tem which now had the interim name “Rhap- The BSD part of the kernel implements sody”: They replaced the “DriverKit” driver UNIX processes on top of Mach tasks, and model with the new “I/O-Kit” system, updated UNIX signals on top of Mach exceptions and Mach 2.5 with the Mach 3.0 codebase, updated Mach IPC. UNIX filesystem semantics are im- the BSD part with 4.4BSD and FreeBSD code plemented here just like TCP/IP networking. and added support for the HFS filesystem and And while the VFS (virtual filesystem) compo- Apple networking protocols to the kernel. In nent allows plugging in BSD-style filesystems, userland, Mac OS X is pretty much the /dev infrastructure plugs right into I/O-Kit. NEXTSTEP/OpenStep, with the native “NS” BSD exports all the semantics that an applica- tion expects from a UNIX/BSD/POSIX com- XML description of dependencies and the parts patible operating system, like “open()” and of the kernel it links against. “fork()”, through the syscall interface. Since there are basically two kernels in XNU Other interesting details - Mach with its message passing API and BSD The following sections describe some other with the POSIX API - there are two kinds of interesting details of or around the Mac OS X syscalls. While both use a single int 0x80/ kernel. sysenter/sc entry point, negative syscall num- Booting bers will be routed to Mach, while positive While PowerPC-based Macs use OpenFirm- ones go to BSD. Note that, just like on Win- ware, Intel-based machines use EFI (“Extensi- dows NT, applications may not use int 0x80/ ble Firmware Interface”). Both kinds of firm- sysenter/sc directly, as this is a private inter- ware are a lot more powerful than the 16 bit face. Instead, applications must call through BIOS still shipping on PCs. While EFI can libSystem, which is the equivalent of libc on boot off USB and supports GPT partitioning OS X.