USOO83416O2B2
(12) United States Patent (10) Patent No.: US 8,341,602 B2 Hawblitzel et al. (45) Date of Patent: Dec. 25, 2012
(54) AUTOMATED VERIFICATION OF A 7.65 E: $39. St...... 727 TYPE-SAFE OPERATING SYSTEM 7,818,729- W - B1 * 10/2010 PlumOTley, et JT.al...... T17,140. . 8, 104,021 B2* 1/2012 Erli tal. T17,126 (75) Inventors: Chris Hawblitzel, Redmond, WA (US); 8,127,280 B2 * 2/2012 E. al T17.136 Jean Yang, Cambridge, MA (US) 8,132,002 B2 * 3/2012 Lo et al...... 713, 164 8, 181,021 B2 * 5/2012 Ginter et al...... T13, 164 (73) Assignee: Microsoft Corporation, Redmond, WA 2004/0172638 A1 9, 2004 Larus (US) 2005, OO65973 A1 3/2005 Steensgaard et al. 2005, 0071387 A1 3, 2005 Mitchell (*) Notice: Subject to any disclaimer, the term of this (Continued) patent is extended or adjusted under 35 U.S.C. 154(b) by 442 days. OTHER PUBLICATIONS Klein et al., seL4: formal verification of an OS kernel, Oct. 2009, 14 (21) Appl. No.: 12/714,468 pages,
KERNE (TAL) APPLICATION 24 (TAL)
GarbageCollect NUCLEUS GetStackState AllocObject (BoogiepL) Allacy actor e Rese:Stack hrow g g YieldTo s s A. 2 4 s s s adfield wgaTextwrite FaultHandlar riteField Tryreadkeyboard ErrorHandlar O readStack Startimer Interrupthandler writeStack SendEoi Fatalandler 22 5 v. v.2 s
x86 HARDWARE 220 "SafeoS"STRUCTURE SHOWINGXEMPLARY FUNCTIONSEXPORTED BY NUCLEUS US 8,341.602 B2 Page 2
U.S. PATENT DOCUMENTS 10th ACM SIGPLAN Int’l Conf. on Functional Programming, ICFP 2005/0114842 A1* 5/2005 Fleehart et al...... 717/126 2005, pp. 116-128, Sep. 26-28, 2005, Tallinn, Estonia. 2006/0085433 A1 4/2006 Bacon Klein, G. K. Elphinstone, G. Heiser, J. Andronick, D. Cock, P. 2006/0265438 Al 11/2006 Shankar Derrin, D. Elkaduwe, K. Engelhardt, R. Kolanski, M. Norrish, T. 2006/0277539 A1* 12/2006 Amarasinghe et al...... 717/168 Sewell. H. Tuch, S. Winwood, seA: Formal verification of an OS 2007/0174370 A1 7/2007 Detlefs kernel, Proc. of the 22nd ACM Symposium on Operating Systems 2007/0285271 A1 12/2007 Erlingsson Principles 2009, SOSP 2009, Oct. 11-14, 2009, pp. 207-220, Big Sky, 2008/0010327 A1 1/2008 Steensgaard Montana, USA. 2008/0148244 A1* 6/2008 Tannenbaum et al...... 717/136 2012/0084759 A1* 4/2012 Candlea et al...... 717/126 Lin, C., A. McCreight, Z. Shao, Y. Chen, Y. Guo, Foundational typed assembly language with certified collecton, First Joint IEEE/IFIP OTHER PUBLICATIONS Symposium on Theoretical Aspects of Software Engg. TASE 2007, Jun. 5-8, 2007, pp. 326-338, Shanghai, China. Heiser et al., Towards trustworthy computing systems: taking Maeda, T., Safe execution of user programs in Kernel mode using microkernels to the next level, Jul. 2007, 9 pages,
-----'sixt------ANNOTATED SEE 3. ANNOTATIONMODULE SOURCE CODE as a as a a FOR OS NUCLEUS EVALUATIONMODULE
VERIFED WERIFED SOURCE CODE OS NUCLEUS COMPLATIONMODULE SOURCE CODE
COMPLED OS NUCLEUS SUPPLYING WELL-FORMED STACKS AND OBJECTS
145
KERNEL SOURCE CODE KERNEL source cooEH SOURCEINTERACTION CODE OTHERAPPLICATION PERSEEME's VERIFICATIONMODULE SOURCE CODE 150
WERIFED PROGRAMELEMENT WERIFED SOURCE CODE PROGRAMELEMENT COMPLATIONMODULE SOURCE CODE
COMPLED PROGRAM ELEMENT(S)
OS CONSTRUCTION VERIFIABLY SAFE OS MODULE (e.g., "Safe08')
"AUTOMATED, STATIC SAFETY VERIFIER" FIG. 1 U.S. Patent Dec. 25, 2012 Sheet 2 of 5 US 8,341,602 B2
KERNEL (TAL) RAS 255 APPLICATION (TAL) 245...A.------M.------KERNEL ENTRY POINT NewThread, Yield, ...
GarbageCollect AllocObject NUCLEUS GetStackState (BoogiePL) AllocVector ResetStack Throw 240 YieldTO readfield VgaTextWrite FaultHandler WriteField TryReadkeyboard ErrorHandler readStack StartTimer Interrupthandler WriteStack SendEoi Fatalandler 225 Vo V
x86 HARDWARE
220
"SafeOS" STRUCTURE SHOWING EXEMPLARY FUNCTIONS EXPORTED BY NUCLEUS FIG 2 U.S. Patent Dec. 25, 2012 Sheet 3 of 5 US 8,341,602 B2
BUILDING "SafeOS” USING KERNEL TRUSTED AND UNTRUSTED SOURCE (C) SOURCE (C) COMPONENTS y-E.---
NUCLEUS SOURCE ) 305 310
BEAT COMPLER BARTOK COMPLER
315 365 NUCLEUS SOURCE (BoogiePL x86) KERNAL...OBJ (x86)
325 32O 370 SPECIFICATIONS Boogie/Z3 (BoogiePL) TAL CHECKER 330 335 375 gif380 BoogieaSM ASSEMBLER LINKER EXECUTABLE W a 340 NUCLEUS.OBJ (X86) SafeOS GENERATOR P P E. E. E. E. E. E. E. E. E. E. E. E. E. E. E. E. E. E. E. E. E. E. E. E. E. E. E. E. E. E. E. E. E. E. E. E. f. E. E. E. E. E. E. E. E.
SafeC)S. ISO (BOOTABLE CD ROM IMAGE) ...... U.S. Patent Dec. 25, 2012 Sheet 4 of 5 US 8,341,602 B2
PROVIDE ANNOTATED SOURCE CODE COMPRISING ANUCLEUS OF AN OPERATING SYSTEM, WHEREIN THE ANNOTATIONS CORRESPOND TO ALLOWED BEHAVIORS (I.E., "SPECIFICATIONS") FOREACH INSTRUCTION OF THE NUCLEUS SOURCE CODE
AUTOMATICALLY EVALUATE ALLOWED BEHAVIORS OF THE ANNOTATED SOURCE CODE USINGAVERIFICATION TOOL TO ENSURE THAT ALLOWED BEHAVIORS ARE CONSISTENT WITH THE CORRESPONDING INSTRUCTIONS OF THE NUCLEUS SOURCE CODE
IF THE NUCLEUS SOURCE CODE HAS CONSISTENT ALLOWED BEHAVIORS, COMPLE THE NUCLEUS SOURCE CODE TO PRODUCE AN OPERATING SYSTEMNUCLEUS SUPPLYING WELL-FORMED STACKS AND OBJECTS
PROVIDE SOURCE CODE FOR ONE OR MORE HIGHER LEVEL PROGRAMELEMENTS (SUCH AS THE KERNEL OR OTHERAPPLICATIONS) THAT ARE INTENDED TO RUN ON TOP OF THE OPERATING SYSTEM NUCLEUS
AUTOMATICALLY VERIFY THAT THE SOURCE CODE OF THE HIGHER LEVEL PROGRAMELEMENTS SAFELY INTERACTS WITH THE WELL-FORMED STACKS AND OBJECTS OF THE COMPLED OPERATING SYSTEM NUCLEUS
IF THE SOURCE CODE OF THE HIGHER LEVELPROGRAMELEMENTS IS WERIFED TO SAFELY INTERACT, COMPLE THE VERIFIED SOURCE CODE
COMBINE (I.E., LINK) THE OPERATING SYSTEMNUCLEUS WITH THE COMPLED SOURCE CODE OF THE HIGHER LEVELPROGRAMELEMENTS TO CONSTRUCT AVERIFIABLY SAFE OPERATING SYSTEM (E.G., "SafeoS") THAT CORRECTLY SUPPLIES WELL-FORMED STACKS AND OBJECTS
CONSTRUCTION OF AVERIFIABLY SAFE OPERATING SYSTEM
FIG. 4 U.S. Patent Dec. 25, 2012 Sheet 5 of 5 US 8,341,602 B2
GENERAL PURPOSE COMPUTER
DISPLAY DEVICE
PROCESSING UNIT(S)
SYSTEM MEMORY (RAM/FLASH/ETC.)
OUTPUT COMMUNICATIONS DEVICE(S) INTERFACE US 8,341,602 B2 1. 2 AUTOMATED VERIFICATION OFA objects that are protected by type safety. Further, sel4 pro TYPE-SAFE OPERATING SYSTEM cesses communicate through shared memory or message passing. Therefore, unfortunately, sel. A does not have a com CROSS-REFERENCE TO RELATED munication mechanism that is statically checked for safety. In APPLICATIONS addition, sel4 verifies its C code; however, its assembly language (~600 lines of ARM assembler) is currently unveri This application is a Continuation-in-Part (CIP) of, and fied. These 4 microkernel contained ~8700 lines of C code, claims the benefit under Title 35, U.S. Code, Section 120 of, Substantially larger than earlier verified operating systems previously filed U.S. patent application Ser. No. 12/362.444, like Kit. On the other hand, the effort required was also large. filed on Jan. 29, 2009, by Chris Hawblitzel, and entitled 10 In fact, it was reported that the sel. A project required 20 “Automatic Region-Based Verification of Garbage Collec person-years of research devoted to developing their proofs, tors', the subject matter of which is incorporated herein by including 11 person-years specifically for these 4 code base. this reference. The proof required 200,000 lines of Isabelle scripts—a 20-to-1 script-to-code ratio. Thus, while sel. A demonstrates BACKGROUND 15 that microkernels are within the reach of interactive theorem proving, these 4 project fails to provide an acceptably effi 1. Technical Field cient solution to the problem of realistic systems software An Automated, Static Safety Verifier, as described verification. herein, uses typed assembly language (TAL) and Hoare logic The well-known “FLINT’ project sets an ambitious goal to to achieve highly automated, static verification of both type build foundational certified machine code, where certifica and memory safety for constructing a 'safe' operating sys tion produces aproof about an executable program, and a very tem. small proof checker can verify the proof. The FLINT project 2. Related Art uses powerful higher-order logic to verify properties of low Low-level software, such as operating systems and lan level code, but fails to achieve significant automation. For guage run-time systems, should be reliable and secure. With 25 example, the preemptive thread library of the FLINT project out reliability, users endure frustration and potential data loss requires -35,000 lines of script for about 300 lines of assem when the system software crashes. Without security, users are bly language (plus many tens of thousands of lines of script to Vulnerable to attacks from the network, which often exploit build up foundational lemmas about the program logic). low-level bugs such as buffer overflows to take over a user's Separately, the FLINT project also created a certified system computer. Unfortunately, today's low-level software still suf 30 combining TAL with certified garbage collection (GC). fers from a steady stream of bugs, often leaving computers Unfortunately, the TAL/GC system was rather limited in that vulnerable to attack until the bugs are patched. it supported only objects containing exactly two words. Fur Many projects have proposed using safe languages to ther, it performed only conservative collection, and did not increase the reliability and security of low-level systems. Safe Support multiple threads. languages ensure type safety and memory safety: accesses to 35 Finally, as is well-known to those skilled in the art, H data are guaranteed well-typed and guaranteed not to over monads have been used to export a set of operating system flow memory boundaries or dereference dangling pointers. primitives usable to develop an operating system in a high This safety rules out many common bugs. Such as buffer level language. For example, H was used to build the well overflow vulnerabilities. Unfortunately, even if a language is known “House' operating system. However, H itself is not safe, implementations of the language’s run-time system 40 formally verified, and relies on a large Haskell run-time sys might have bugs that undermine the safety. For example, Such tem for correctness. bugs have left many conventional web browsers open to attack. SUMMARY The desire to formally verify low-level operating systems (OS) and run-time system code is not new. Nevertheless, very 45 This Summary is provided to introduce a selection of con little mechanically verified low-level OS and run-time system cepts in a simplified form that are further described below in code exists, and, as is known to those skilled in the art, that the Detailed Description. This Summary is not intended to code still requires many person-years of effort to Verify. identify key features or essential features of the claimed sub For example, with respect to OS and run-time system veri ject matter, nor is it intended to be used as an aid in determin fication efforts, more than 20 years ago, the well-known 50 ing the scope of the claimed Subject matter. Boyer-Moore mechanical theorem prover verified a small OS In general, an “Automated, Static Safety Verifier” as and a Small high-level language implementation. These were described herein, uses typed assembly language (TAL) and not integrated into a single system, though, and each piece in Hoare logic to achieve highly automated, static verification of isolation was quite limited. The OS, referred to as “Kit', was both type and memory safety of an operating system. An quite Small, containing just a few hundred assembly language 55 example of a safe operating system constructed using the instructions. It supported a fixed number of preemptive techniques described herein is referred to hereinas “SafeOS'. threads, but did not implement dynamic memory allocation or The Automated, Static Safety Verifier” uses various tech thread allocation. Further, it ran on an artificial machine niques and tools to mechanically verify the safety of every rather than on standard hardware. The language used, “micro assembly language instruction in the operating system, run Gypsy', was Small enough that it did not require a significant 60 time system, drivers, and applications, with the exception of run-time system, as it had no heap allocation or threads. an operating system bootloader. More recently, the well-known “sel 4' project verified all SafeCS, or a similar operating system constructed using of the C code for an entire microkernel. se4 works on uni the Automated, Static Safety Verifier, includes a “Nucleus’ processors, providing preemption via interrupts, and provid that provides access to hardware and memory, a “kernel” that ing non-blocking system calls. SeL4 also provides memory 65 builds services on top of the Nucleus, and applications that protection by providing address spaces of pages protected by run on top of the kernel. The Nucleus, written in verified hardware page tables. Unfortunately, sel. A does not provide assembly language, implements allocation, garbage collec US 8,341,602 B2 3 4 tion, multiple stacks, interrupt handling, and device access. FIG. 3 provides an exemplary flow diagram illustrating The kernel, written in C# (or other language) and compiled to how a verifiably safe operating system such as “SafeGS is TAL, builds higher-level services, such as preemptive constructed from untrusted and trusted components, as threads, on top of the Nucleus. ATAL checker then verifies described herein the safety of the kernel and applications. Finally, a Hoare FIG. 4 illustrates a general system flow diagram that illus style verifier with an automated theorem prover verifies both trates exemplary methods for implementing various embodi the safety and correctness of the Nucleus. ments of the “Automated, Static Safety Verifier, as described As noted above, SafeGS is a simple example of a safe herein. operating system constructed using the techniques described FIG. 5 is a general system diagram depicting a simplified herein. It should be understood that the discussion of SafeGS 10 general-purpose computing device having simplified com is provided only for purposes of explanation with respect to puting and I/O capabilities for use in implementing various the construction of a safe OS using the “Automated Static embodiments of the Automated, Static Safety Verifier, as Safety Verifier. As such, it should be understood that while described herein. feature rich safe operating systems can be constructed using 15 DETAILED DESCRIPTION OF THE the Automated, Static Safety Verifier, in its current imple EMBODIMENTS mentation, SafeCS is a small safe operating system that has a number of programmatic limitations. For example, it lacks In the following description of the embodiments of the Support for many common C# features: exception handling, claimed Subject matter, reference is made to the accompany for example, is implemented by killing a thread entirely, ing drawings, which form a parthereof, and in which is shown rather than with try/catch. It lacks the standard .NET class by way of illustration specific embodiments in which the library, since the library's implementation currently contains claimed subject matter may be practiced. It should be under much unsafe code. It lacks dynamic loading of code. stood that other embodiments may be utilized and structural Although it protects applications from each other using type changes may be made without departing from the scope of the safety, it lacks a more comprehensive isolation mechanism 25 presently claimed Subject matter. between applications. Such as Java Isolates, Ch AppDomains, 1.0 Introduction: or Singularity SIPs. The verification does not guarantee ter In general, an “Automated, Static Safety Verifier” as mination. Finally, SafeCS uses verified garbage collectors (as described herein, uses typed assembly language (TAL) and described in the aforementioned copending U.S. patent appli Hoare logic to achieve highly automated, static verification of cation) that are stop-the-world rather than incremental or 30 both type and memory safety of an operating system. The real-time, and SafeCS keeps interrupts disabled throughout “Automated, Static Safety Verifier uses various techniques the collection. and tools to mechanically verify the safety of every assembly However, it should be understood that none of the limita language instruction in the operating system, run-time sys tions in SafeGS’s present implementation, as described above tem, drivers, and applications, with the exception of an oper are fundamental. In fact, it must be understood that the imple 35 ating system boot loader. An example of a safe operating mentation of SafeGS, as described herein, is simply a basic constructed using the techniques described herein is referred example of the more general capabilities of the “Automated, to herein as “SafeCS”. However, it should be understood that Static Safety Verifier for constructing verifiably safe operat SafeCS is a simple example of a safe operating system con ing systems. Specifically, the high degree of automation in structed using the techniques described herein, that is SafeGS’s verification, provided by the “Automated, Static 40 described only for purposes of explanation. As such, it should Safety Verifier, will allow SafeOS, or any other OS con be understood that safe OS's with expanded feature sets can structed using the “Automated, Static Safety Verifier” to scale be constructed using those same techniques. Consequently, to a more realistic feature set, such as a large library of safe any limitations in the feature set of the exemplary SafeOS code and a verified incremental garbage collector. should not be interpreted as limiting the capabilities of the In view of the above summary, it is clear that the “Auto 45 “Automated, Static Safety Verifier to create a safe OS having mated, Static Safety Verifier” described herein provides vari any desired features or capabilities. ous unique techniques for achieving highly automated, static More specifically, the “Automated Static Safety Verifier” Verification of both type and memory safety for constructing is used to generate an operating system, Such as the SafeCS a “safe' operating system. In addition to the just described example discussed herein, and run-time system that is veri benefits, other advantages of the “Automated Static Safety 50 fied to ensure both type and memory safety. Such safety is Verifier will become apparent from the detailed description ensured by mandating that every assembly language instruc that follows hereinafter when taken in conjunction with the tion in the running system is mechanically verified for safety. accompanying drawing figures. This includes every instruction of every piece of software except the bootloader. In other words, all applications, device DESCRIPTION OF THE DRAWINGS 55 drivers, thread schedulers, interrupt handlers, allocators, gar bage collectors, etc., are verified safe. The Automated, Static The specific features, aspects, and advantages of the Safety Verifier leverages two technologies to verify OS claimed subject matter will become better understood with safety: TAL (typed assembly language) and automated SMT regard to the following description, appended claims, and (satisfiability modulo theories) theorem provers. As accompanying drawings where: 60 described in further detail below, these technologies are FIG. 1 provides an exemplary architectural flow diagram adapted to automate most of the verification. that illustrates program modules for implementing various A safe operating system such as SafeCS consists of at least embodiments of an “Automated, Static Safety Verifier, as three layers, including: a “Nucleus’ that provides primitive described herein. access to hardware and memory, a “kernel’ that builds system FIG. 2 provides an exemplary a structure for verifiably safe 65 services on top of the Nucleus, and applications that run on operating system (i.e. “SafeCS) constructed using the top of the kernel. In a tested embodiment of the “Automated, “Automated, Static Safety Verifier, as described herein. Static Safety Verifier, the kernel and applications were writ US 8,341,602 B2 5 6 ten in safe C# (though it should be clear that other languages several of which are specifically defined below for purposes may also be used, if desired), which is automatically com of explanation and understanding. piled to TAL. An existing TAL checker was then used to Memory Safe Program: A program is said to be “memory verify this TAL (again, automatically). The Nucleus was writ safe' if the program can never access memory that is not ten directly in assembly language, and was annotated with currently allocated for use by the program. specifications (including preconditions, postconditions, and Type Safe Program: A program is said to be “type safe' if loop invariants). Note that these “specifications” are also types are associated with pointers to memory, and the pro referred to herein as “allowed behaviors' or simply as “anno gram can only access memory Subject to rules associated with tations'. these types 10 Safe Operating System: An operating system is said to be A Hoare-style program verifier called “Boogie' was then “safe' if the operating system is both “memory safe' and used verify the assembly language against a specification “type safe'. (i.e., allowed behaviors) of safety and correctness. This Verifiably Safe Operating System: An operating system is ensures the safety and correctness of the Nucleus’s imple said to be “verifiably safe' if a computer verification program mentation, including safe interaction with the TAL code and 15 can analyze the source code of the operating system to verify safe interaction with hardware (including memory, interrupts, that the operating system is “safe' (see above definition of a timer, keyboard, and screen). Boogie relies on Z3, an auto 'safe' operating system). mated SMT theorem prover, to check that the specifications Well-Formed Stack: A stack is said to be “well-formed” if (i.e., the allowed behaviors) are satisfied. Writing the speci it consists of Zero or more stack frames, where both of the fications requires human effort, but once they are written, following conditions hold: Boogie and Z3 verify them completely automatically. As a 1. Each frame's memory does not overlap with the memory result, the Nucleus of a safe operating system, such as the of any other stack frame or object; and “SafeOS Nucleus, requires only 2-3 lines of proof annota 2. Each frame contains Zero or more pointers, each point tion per executable statement, which, as is known to those ing to some object, where: skilled in the art, is an order of magnitude less than similar 25 a. If the pointer has type t, then the pointed-to object has projects based on interactive theorem provers. a type that is assignable to t. Note that some of the stack-based concepts described Well-Formed Object: An object is said to be “well-formed” herein build on techniques described in the aforementioned if both of the following conditions hold: co-pending U.S. patent application Ser. No. 12/362.444, 1. The objects memory does not overlap with the memory entitled “Automatic Region-Based Verification of Garbage 30 of any other stack frame or object; and Collectors', that guarantee the safety of higher-level lan 2. The object contains Zero or more pointers, each pointing guage code, such as C#. In particular, as described in further to some object, where: detail below, the stack-based solutions described herein a. If the pointer has type t, then the pointed-to object has enforce the guarantee that whenever the assembly-language a type that is assignable to t. code transfers control to higher-level code, the higher-level 35 1.2 System Overview: code's stack and heap are “well-formed (see Section 1.1 for As noted above, the “Automated, Static Safety Verifier.” a definition of “well-formed and other terms used through provides various techniques for using typed assembly lan out this document). This solution, however, extends the stack guage (TAL) and Hoare logic to achieve highly automated, based solutions of the aforementioned co-pending U.S. patent static verification of both type and memory safety of an oper application (which allowed only one stack) to handle multiple 40 ating system. The processes Summarized above are illustrated stacks. Because the assembly language code is free to modify by the general system diagram of FIG. 1. In particular, the the stack pointer, as described in further detail below, it can system diagram of FIG. 1 illustrates the interrelationships easily Switch to any higher-level language stack. This enables between program modules for implementing various embodi the assembly language code to implement multiple threads. ments of the “Automated, Static Safety Verifier, as described Although the assembly language code runs with interrupts 45 herein. Furthermore, while the system diagram of FIG. 1 disabled, the higher-level language code may run with inter illustrates a high-level view of various embodiments of the rupts enabled, so that the higher-level language code may be “Automated, Static Safety Verifier. FIG. 1 is not intended to preempted by an interrupt. This allows the assembly language provide an exhaustive or complete illustration of every pos code to implement preemptive threads. sible embodiment of the Automated, Static Safety Verifier In addition, various techniques described in the aforemen 50 as described throughout this document. tioned co-pending U.S. patent application express memory In addition, it should be noted that any boxes and intercon management (including garbage collection) as assembly lan nections between boxes that may be represented by broken or guage code, Verifiable by Boogie/Z3. These techniques use dashed lines in FIG. 1 represent alternate embodiments of the regions to ensure that the garbage collector scans the higher “Automated, Static Safety Verifier described herein. Fur level language Stack before completing a garbage collection: 55 thermore, any or all of these alternate embodiments, as the stack and the heap are associated with “regions, and the described below, may be used in combination with other stack's region must match the heap's region at the end of alternate embodiments that are described throughout this garbage collection. These concepts from the aforementioned document. co-pending U.S. patent application are extended to multiple In general, as illustrated by FIG. 1, the processes enabled stacks, as described herein, where all the stacks regions must 60 by the “Automated, Static Safety Verifier begin the process match the heap's region at the end of garbage collection. This of constructing a verifiably safe operating system 100 by ensures that the verified assembly language code correctly using a source code “specification” evaluation module 120 to finds and scans all stacks for threads that may run in the evaluate annotated source code 105 that defines an OS future. nucleus. This evaluation is intended to verify that allowed 1.1 General Definitions and Considerations: 65 behaviors (i.e., the “specifications”) associated with each The “Automated, Static Safety Verifier, as described instruction are consistent with those instructions (See Section herein will be discussed using various terms and definitions, 2.4). In general, the annotations for each instruction in the US 8,341,602 B2 7 8 source code 105 are manually added to the source code via a 2.1 Operational Overview and Advantages: manual annotation module 110. As discussed in further detail As noted above, the “Automated, Static Safety Verifier'- in Section 2.4, the purpose of the annotations on the Source based processes described herein provide various techniques code is to provide “specifications” that define “allowed for using typed assembly language (TAL) and Hoare logic to behaviors' for each instruction of the source code. achieve highly automated, static verification of both type and Assuming that the source code 'specification” evaluation memory safety of an operating system. In providing these module 120 determines that all instructions in the OS nucleus techniques, the “Automated, Static Safety Verifier described are verified with respect to their allowed behaviors, the source herein provides a number of advantages conventional verifi code “specification” evaluation module 120 passes the veri cation schemes. 10 First, the “Automated Static Safety Verifier is capable of fied OS nucleus source code 125 to a verified source code producing an operating system (i.e., SafeCS) that is believed compilation module 130. The verified source code compila to be the first OS that is mechanically verified to ensure type tion module 130 then compiles the OS nucleus source code to safety. Furthermore, every assembly language instruction produce an operating system nucleus 135 Supplying “well that runs after booting is statically verified without the need to formed stacks and objects (see prior definition of “well 15 trust a high-level-language compiler or any unverified library formed stacks and objects in Section 1.1). code. Further, an OS such as SafeCS is a real system tested Once the OS nucleus has been compiled, the next step is to embodiments of SafeCS has been observed to boot and run on construct one or more higher level program elements (includ real, off-the-shelf x86 hardware while supporting a variety of ing at least an OS kernel) that are to be run on top of the OS language features, including classes, virtual methods, arrays, Nucleus. The source code 140 for the higher level elements preemptive threads, etc. Note that while the exemplary includes kernel source code 145 and/or other application SafeCS is described herein as being operable on x86 hard source code 150. In either case, the source code 140 for the ware, the use of x86 hardware is described only for purposes higher level program elements is provided to a source code of explanation. In fact, it should be clear that, as with many interaction verification module 155. In general, the source other OS's or applications, the Automated, Static Safety code interaction verification module 155 operates by auto 25 Verifier can be used to construct safe a safe OS for any other matically verifying that the source code 140 of the higher desired CPU architecture by ensuring that the underlying level program elements safely interacts with the well-formed code includes the proper instructions for operating on what stacks and objects of the compiled operating system nucleus ever CPU architecture is desired. 135. Another advantage of the “Automated, Static Safety Veri Assuming that the Source code interaction verification 30 fier is that it is efficient. As illustrated by FIG.3, it supports efficient kernel TAL code 345 (written in Cit) or other appli module 155 successfully verifies each instruction in the cation source code 350 generated by an optimizing C#-to Source code 140 for a particular higher level program element TAL compiler, i.e., “Bartok'360 (or similar complier), using (such as the kernel), then that verified program element Bartok’s native layouts for objects, method tables, and arrays, Source code 160 is passed to a verified program element 35 following compilation to C# using a conventional C# com source code compilation module 165. In general, the verified piler 355. Further, it incorporates the code from the verified program element Source code compilation module 165 sim garbage collectors described in the aforementioned co-pend ply compiles the verified program element source code 160 to ing U.S. patent application Ser. No. 12/362.444, entitled construct corresponding compiled program elements 170. “Automatic Region-Based Verification of Garbage Collec An OS construction module 175 then uses a conventional 40 tors', which can run realistic macro-benchmarks at near the linker to combine the compiled OS Nucleus and the 135 and performance of Bartok’s native garbage collectors. the compiled program elements 170 to construct the verifi These and other examples provided throughout this docu ably safe OS 100. As described in specific detail throughout ment demonstrate that automated techniques (e.g., TAL and Section 2, and with respect to FIG. 2 and FIG.3, the verifiably automated theorem proving) are powerful enough to verify safe OS 100 correctly supplies well-formed stacks and 45 the safety of the complex, low-level code that makes up an objects when that OS is executed on compatible CPU type operating system and run-time system. Furthermore, it dem hardware. onstrates that a small amount of code verified with automated 2.0 Operational Details of the “Automated, Static Safety theorem proving can Support an arbitrarily large amount of Verifier’’: TAL code. The above-described program modules are employed for 50 2.2 Tools for Constructing a “Safe Operating System': implementing various embodiments of the “Automated, Two verification technologies, TAL and automated theo Static Safety Verifier. As summarized above, the “Auto rem proving, drive the design of a safe OS Such as the exem mated, Static Safety Verifier” provides various techniques for plary SafeCS described herein. These technologies are using typed assembly language (TAL) and Hoare logic to complementary. On one hand, TAL is relatively easy to gen achieve highly automated, static verification of both type and 55 erate, since the compiler automatically turns C# code (again, memory safety of an operating system. The following sec other coding languages can be used, if desired) into TAL tions provide a detailed discussion of the operation of various code, relying only on lightweight type annotations already embodiments of the “Automated, Static Safety Verifier, and present in the C# code. This enables TAL to scale easily to of exemplary methods for implementing the program mod large amounts of code. For the exemplary OS described ules described in Section 1 with respect to FIG.1. In particu 60 herein, SafeGS, the Bartok compiler was used to generate lar, the following sections provide examples and operational TAL code from C# code, as illustrated by FIG. 3, though details of various embodiments of the “Automated Static again, other compilers providing similar functionality may Safety Verifier, including: an operational overview of the also be used. “Automated, Static Safety Verifier: tools for constructing a On the other hand, automated theorem provers can verify “safe' OS, the Nucleus interface; the Nucleus source code 65 deeper logical properties about the code than a typical TAL instruction specifications; and Nucleus implementation and type system can express. Leveraging this power requires verification. more effort, though, in the form of heavyweight programmer US 8,341,602 B2 10 Supplied preconditions, postconditions, and loop invariants. FIG. 2) ensures that the Nucleus 200 sets the stack pointer to To exploit the tradeoff between TAL and automated theorem the top of the correct stack during a yield. The specification of proving, the operating system code (e.g., SafeCS) was split correctness also includes specifications for assembly lan code into two parts, as illustrated by FIG. 2: a Nucleus 200, guage instructions and for interaction with hardware devices verified with automated theorem proving, and a kernel 205, and memory. Note that all Boogie specifications are written as verified with TAL. The difficulty of theorem proving moti first-order logic formulas in the BoogiePL language. vated the balance between the two parts: only the functional By expressing and checking properties at a low level (as ity that TAL could not verify as safe went into the Nucleus; all sembly language), non-trivial properties can be ensured with other code went into the kernel, which is then separately high confidence. The remainder of the discussion of the over verified. Note that FIG. 2 is discussed in further detail below. 10 all “Automated, Static Safety Verifier” focuses on these prop In a tested embodiment of the “Automated, Static Safety erties, with an emphasis on the specification and Verification Verifier, the Nucleus’s source code is not expressed in TAL, of the Nucleus’s correctness properties. In particular Section but rather in Boogie's programming language, called Boo 2.3 discusses the Nucleus's design, while Section 2.4 and 2.5 giePL (or just “Boogie' as discussed in the aforementioned discuss specification and verification of the Nucleus. The copending U.S. patent application), so that the Boogie verifier 15 kernel is discussed in detail in Section 2.6. can check it. Since the Nucleus code consists of assembly 2.3 “Nucleus’ Interface: language instructions, these assembly language instructions The Nucleus consists of a set of functions necessary to must appearina form that the Boogie verifier can understand. support the TAL code that runs above it. Note that a minimal As described in detail below, in constructing a safe OS such as set of functions is discussed with respect to SafeOS since the SafeCS, assembly language instructions were encoded as construction of SafeGS is provided only for purposes of sequences of calls to BoogiePL procedures (e.g. an "Add' explanation. Again, a more feature rich set of functions can be procedure, a "Load” procedure, etc.), so that the Boogie veri coded into the Nucleus, if desired. However, it should be fier could check that each instruction’s preconditions were understood that even with an automated theorem prover, satisfied. After Boogie verification, a separate tool, described Hoare-style verification is still hard work; less code in the in the aforementioned copending U.S. patent application, 25 Nucleus means less code to verify. At the same time, the set extracts standard assembly language instructions from the has to guarantee safety in the presence of arbitrary TAL code: BoogiePL code. A standard assembler then turns these it canassume that the TAL code is well typed, but can make no instructions into an object file (i.e., the code is compiled via further assumptions about the well-formedness of the TAL the assembler) for later linking with the kernel and/or other code. Verified higher level program elements to construct the over 30 One optional design decision used in constructing SafeCS, all verifiably safe OS. which greatly simplified the Nucleus, is that, similar to Rather than hand-code the entire Nucleus in assembly lan designs used in recent micro-kernels, no Nucleus function guage, Some of the less performance-critical parts were writ ever blocks. In other words, every Nucleus function performs ten in a high-level extension of BoogiePL referred to as a finite (and usually small) amount of work and then returns. “Beat’. The (non-optimizing) Beat compiler transforms veri 35 However, the Nucleus may return to a different thread than the fiable Beat expressions and statements into verifiable Boo thread that invoked the function. This allows the kernel built giePL assembly language instructions. Note however, that if on top of the Nucleus to implement blocking thread opera desired, the entire Nucleus can be hand-coded in assembly tions, such as waiting on a semaphore. language, or generated using other tools. In other words, the The tested embodiment of the Automated, Static Safety basic idea is to provide assembly language for the entire 40 Verifier used to construct SafeOS resulted in a Nucleus API Nucleus, regardless of how that code is generated. However, consisting of just 20 functions, all shown in FIG. 2. These 20 for purposes of explanation, the remainder of this discussion functions (see elements 225, 230, 235 and 240 of FIG. 2), will assume the use of BoogiePL code. implemented with a total of about 1500x86 instructions, Note, however, that none of the compilers used are consid include 3 memory management functions (AllocObject, ered a part of the trusted computing base. In fact, it is not 45 AllocVector, GarbageCollect), one very limited exception necessary to trust the compilers to ensure the correctness of handling function (Throw), 3 stack management functions the Nucleus and safety of the operating system, such as (GetStackState, ResetStack, YieldTo), and 4 device access SafeCS, as a whole. As shown in FIG.3, the TAL checker 370 functions (VgaTextWrite, TryReadKeyboard, StartTimer, and Boogie/Z3 verifier 325 check that the output (e.g., SendEoi). Most of the functions are intended for use only by nucleus source 315 and compiled kernel 365 or other com 50 the kernel 205 that feeds any necessary data to the application piled application) of the compilers (360 and 310, respec 210 via a main function 255 or the like. Most of these func tively) conforms to the TAL type system and the Nucleus tions are not intended for use by the applications 210. How specification320, respectively, so it is only necessary to trust ever, various applications 210 may call functions such as, for these checkers. example, AllocObject and AllocVector directly, as illustrated Beyond the TAL checker 370 and Boogie/Z3 verifiers 325, 55 by FIG. 2, depending upon how those applications are coded. FIG. 3 shows additional components in SafeCS’s trusted Again, note that additional functions can be coded, if desired, computing base: the assembler 335, the linker 375, the ISO to construct a more feature rich function set. CD-ROM image generator 385, and the boot loader 380. In this exemplary SafeCS, the Nucleus 200 exports four Clearly, some of these components are optional (such as, for pseudo-functions, including: readField, writeField, read example, the ISO CD-ROM image generator 385) and are 60 Stack, and writeStack, to the kernel 205 and applications 210. provided only for ease of use. In addition, the trusted com These functions, described further in Section 2.4, contain no puting base includes the specification 320 of correctness for executable code, but their verification guarantees that the the Nucleus's BoogiePL code 315. This includes specifica kernel 205 and applications 210 will find the values that they tions 320 of the behavior (also referred to herein as “allowed expect when they try to access fields or objects or slots on the behaviors’ or “annotations”) of functions exported by the 65 current stack. The Nucleus 200 exports another four functions Nucleus 200, shown in FIG. 2. For example, referring back to to the x86 hardware 220 for handling faults and interrupts: FIG. 2, the specification of “YieldTo” (within element 230 of FatalHandler halts the system, while FaultHandler, US 8,341,602 B2 11 12 ErrorHandler, and Interrupthandler yield execution to kernel turn, the Nucleus exports functionality to the bootloader (the TAL code running on a designated interrupt-handling stack. Nucleus entry point), the interrupt handling (the Nucleus’s Finally, the Nucleus 200 exports an entry point, NucleusEn interrupt handlers), and the TAL code (AllocObject, YieldTo, tryPoint 225 to the boot loader 215. etc.). As more devices are added to the system, more Nucleus SafeGS expresses the specification for interacting with functions may be required. However, a minimal Nucleus only these components as first-order logical formulas in Boo needs to include the portion of the device interfaces critical giePL. These formulas follow C/Java/CH syntax and consist the rest of the system's safety. For example, if desired, of: SafeGS could use an I/O MMU to protect the system from devices, minimizing the device code that needs to reside in the 10 Nucleus. Similar to the conventional verified L4 micro-kernel, se4, 1. Arithmetic Operators: +,-, *, , SafeGS keeps interrupts disabled throughout the execution of 2. Boolean Operators: , &&., ||, == 3. Variables: foo, Bar, old (foo), ... any single Nucleus function. However, in various embodi 4. Boolean Constants: true, false ments, interrupts can be enabled during the TAL kernels 15 5. Integer Constants: 5, ... execution, with no loss of safety. Since Nucleus functions do 6. Bit Vector Contants: 5bv16, 5bv32,... 7. Function Application: Factorial.(5), Max(3,7), IsOdd(9), ... not block, SafeGS still guarantees that eventually, interrupts 8. Array Indexing: foo3, ... will always be re-enabled, and usually will be re-enabled very 9. Array Update: foo3 := Bar, ... quickly. In a tested embodiment, one implementation of 10.Quantified Formulas: (Wi:int::fooi==Factorial(i)), ... SafeGS sacrifices real-time interrupt handling because of one 11. Etc. particularly long function: “GarbageCollect', which per forms an entire stop-the-world garbage collection across the BoogiePL bit vectors correspond to integers in C/Java/Chi, entire heap. However, it should be understood that such limi which are limited to numbers that fit inside a fixed number of tations are not fundamental to an OS constructed using the bits. BoogiePL integers, on the other hand, are unbounded “Automated, Static Safety Verifier. For example, given a 25 verified incremental garbage collector, SafeCS, or a similar mathematical integers. BoogiePL arrays are unbounded operating system, could reduce the execution time of "Gar mathematical maps from some type (usually integers) to bageCollect' or other garbage collector to, say, just the time some other type. Unlike arrays in C/Java/Chi, BoogiePL required to scan the stacks (or even a single stack), rather than arrays are immutable values (there are no references to arrays, the time required to garbage collect the whole heap. Alterna 30 and arrays are not updated in place). An array update expres tively, SafeGS, or a similar operating system, could poll for sion Such as ax: y creates a new array, which is equal to the interrupts periodically, as in selA. However, it should be old arraya at all locations except X, where it contains a new understood that actually delivering these interrupts to the value, y. For example, (ax:=y)x = y and (ax:=y) kernel TAL code would still require that the garbage collector x+1=ax+1). reach a state that is safe for the kernel. 35 BoogiePL procedures have preconditions and postcondi The next two sections describe how the “Automated, Static tions, written as BoogiePL logical formulas Such as the exem Safety Verifier specifies, implements, and verifies the plary procedure illustrated below: Nucleus functions listed above for creation of a verifiably safe OS Such as SafeCS. 2.4"Nucleus' Source Code Instruction Specifications: 40 vara:int, b:int; To verify that the Nucleus behaves correctly, it is necessary procedure P(x:int, y:int) to specify what that correct behavior is (i.e., the “allowed requires a KbdAvailable==KbdDone; implements "stacks” and the kernel implements “threads'. ensures and(eax,1) =O ==> KbdAvailable > KbdDone; The specification uses two variables, S and StackState, to procedure KbdDataIn8(); 50 requires KbdAvailable > KbdDone: specify the current state of the SafeCS stacks. Stacks are modifies Eip, eax, KbdDone; numbered starting from 0, and S contains the current running ensures KbdDone == old (KbdDone) + 1: stack. The Nucleus entry point specification S=0 indicates ensures and(eax,255) == KbdEventsold.(KbdDOne): that the nucleus should run the TAL kernel entry point in stack 0. At any time, each stack S is in one of four states, specified 55 by StackStates: empty, running, yielded, or interrupted. Ini 2.4.3 Boot Loader and Nucleus Initialization: tially, stack 0 is set to running (StackRunning), and all other Interaction with the boot loader, interrupt handling, and stacks are empty. When TAL code is interrupted, its current TAL code is specified by preconditions and postconditions on stack's state changes to interrupted. When TAL code volun Nucleus-implemented procedures. The first such procedure tarily yields, its current stack's state changes to yielded. to execute is the Nucleus entry point, NucleusEntry Point. 60 NucleusEntry Points final postcondition sets up the Nucle When booting completes, the bootloader transfers control to us's private invariant, “NucleusInv'. This invariant is a NucleusEntry Point. Note that after this control transfer, no Nucleus-specified function that takes as arguments the bootloader code ever runs again during the current OS execu Nucleus’s private global variables, along with Some specifi tion. The Nucleus entry point implementation (i.e., “Nucle cation variables like Mem, S, and StackState. The Nucleus usEntryPoint') must obey the following specification (note 65 may define NucleusInv any way it wants. If it defines too that for brevity, Some of the requires, modifies, and ensures weak an invariant (e.g. NucleusInV(...)=true), though, then clauses are omitted): the Nucleus will be not have enough information to imple US 8,341,602 B2 17 18 ment other Nucleus functions, such as allocation. On the the target stack; the kernel uses this to start new threads 250. other hand, if the invariant is too strong (e.g. Nucleus If the target state is in the running state, then the stack is Inv(...)=false), the Nucleus entry point will not be able to switching to itself; in this case, YieldTo simply returns. For ensure it in the first place. However, for successful verifica brevity, most of the specification forYieldTo is omitted in the tion, the Nucleus must define an invariant strong enough to ensure the well-formedness of the stacks and heaps, as well as example below, however, it looks much like Inter the well-formedness of the Nucleus’s own internal data struc ruptHandler's specification from section 2.4.4, but with cases tures. for all four stack states, rather than just for the yielded state: 2.4.4 Interrupt Handling: After the TAL kernel begins execution in stack 0, it volun 10 tarily yields control to another stack and enables interrupts. procedure YieldTo(...); When a stack Voluntarily yields, its state changes from requires NucleusInv(S, ...); StackRunning to StackYielded.( ebp, esp. eip), requires ScanStackInv(S,..., esp. ebp); where ebp, esp. and eip indicate the values that the stack requires pointer, frame pointer, and instruction pointer, respectively, (StackStates==StackRunning &&s==S must be restored with in order to resume the stack's execu 15 && RET==ReturnToAddr(Memesp)) | (StackStates==StackYielded ( ebp, esp, eip) tion. && RET==ReturnToAddr( eip)) Upon receiving an interrupt or fault, the Nucleus’s inter | (StackStates==StackInterrupted.( eax,..., efl) rupt and fault handlers (InterruptHandler, FaultHandler, and && RET==ReturnToInterrupted ( eip, cs, efl)) ErrorHandler) transfer control back to the TAL code in stack | (StackStates==StackEmpty 0, which then services the interrupt or fault. To specify this && RET==ReturnToAddr(KernelEntry Point) behavior, the handler procedure's preconditions and postcon &&...); ditions force the interrupt and fault handlers to restore stack O's stack and frame pointers and return to stack O’s instruction pointer, as illustrated below: YieldTo differs from Interrupthandler in one crucial 25 respect. In particular, when TAL code Voluntarily calls a Nucleus function, the TAL leaves its stack contents in a state procedure Interrupthandler(...); that the verified garbage collector can Scan: the garbage col requires NucleusInv(S, ...); lector (GC) follows the chain of saved frame pointers until it requires (Tag(StackState|O)==STACK YIELDED ==> reaches a saved frame pointer equal to 0. For each frame, the RET == ReturnToAddr(eip) 30 collector uses the frame's return address as an index into && StackState O==StackYielded.( ebp, esp, eip)); tables of GC information. In contrast, an interrupted ensures (Tag (StackState O))==STACK YIELDED ==> stack is not necessarily scannable. The requirement Scan NucleusInv(0,...) StackInv(...) expresses the layout of the stack that allows && ebp == i ebp && esp == i esp); scanning by the garbage collector. 35 In a tested embodiment of SafeCS, the Nucleus exports The interrupted Stack changes from State StackRunning to several run-time system functions to both the TAL kernel and state, as illustrated below: TAL application code: AllocObject, AllocVector, GarbageC StackInterrupted( eax, ebX. ecX. edX, esi, edi, ollect, and Throw. Again, it must be understood that SafeOS ebp, esp. eip. CS, efl) was constructed using the “Automated, Static Safety Verifier The values eax ... efl indicate the x86 registers that must 40 for purposes of explanation and to demonstrate operation of be restored to resume the interrupted stack's execution. Note the Automated, Static Safety Verifier. As such, it should be that in the case that a safe operating system constructed using clear that the Nucleus an arbitrary safe OS constructed using the “Automated, Static Safety Verifier, such as SafeGS, is the techniques and procedures enabled by the “Automated, designed to handle floating point code, floating point states Static Safety Verifier can be designed to export additional would also need to be restored. 45 functions. SafeGS verifies the correctness of the Nucleus, but only Given this understanding, the exemplary SafeOS imple verifies the safety of the kernel. As a result, a buggy TAL ments only trivial exception handling: when TAL tries to kernel might leave stack 0 in some state besides yielded. For throw an exception, Throw simply terminates the current example, a buggy kernel might enable interrupts while run stack by setting its state to empty and transferring control to ning in Stack 0, which could cause Stack 0 to be in the running 50 stack O. Note that SafeGS takes its allocation and garbage state when an interrupt occurs. Therefore, to ensure safety collection implementations (both mark-Sweep and copying even in the presence of a buggy kernel, the Nucleus checks collection) from the verified garbage collection procedures stack O’s state at run-time; if it not in the yielded state, the described in the aforementioned copending U.S. patent appli Nucleus halts the system to ensure safety. cation. However, SafeGS makes minor changes to the alloca 2.4.5 TAL Kernel and Applications: 55 tion and GC implementation described in the aforementioned As illustrated by FIG. 2, the Nucleus 200 exports various copending U.S. patent application. First, the allocators return functions directly to TAL code. First, it exports three stack null when the heap is full, rather than invoking garbage col manipulation functions to the TAL kernel 205: GetStack lection directly. This allows the TAL kernel to schedule the State, ResetStack, and YieldTo. GetStackState simply returns remaining threads to prepare for garbage collection, as the state of any stack. ResetStack changes the State of a stack 60 described in section 2.6. Second, the original verified collec to empty; the kernel 205 may use this to terminate an unused tors described in the aforementioned copending U.S. patent thread. Finally, YieldTo transfers control to another stack. The application scanned only a single stack. In contrast, SafeCS exact behavior of YieldTo depends on the state of the target adds a loop to the original scanning technique so that SafeCS stack that is being yielded to. If the target stack is in the is capable of scanning all of the stacks. yielded or interrupted state. YieldTo restores the target stacks 65 The verified garbage collectors export functions readField state and resumes its execution. If the target state is in the and writeField that read and write heap object fields on behalf empty state. YieldTo runs the TAL kernel's entry point 245 in of the TAL code. More precisely, these functions grant the US 8,341,602 B2 19 20 TAL code permission to read and write fields of objects, by ensuring that the fields reside at valid memory locations and, implementation Try ReadKeyboard() { for reads, that the fields contain the correct value. The “correct call KeyboardStatusIn3(); value' is defined by an abstract heap that the specification call eax := And(eax, 1); maintains. The key invariant in the garbage collector verifi call Go (); if (eax = 0) {goto skip: call eax:=Mov(256): cation is that the concrete values stored in memory accurately call Ret(old (RET)); return; reflect the abstract heap whenever the TAL code is running. skip: call KeyboardDataIn8(); SafeGS extends the abstract heap with multiple abstract call eax := And(eax, 255): stacks, each consisting of Zero or more abstract stack frames. 10 call Ret(old(RET)); return; The Nucleus exports functions readStack and writeStack that guarantee that the concrete stack contents in Mem match the abstract contents. As with conventional systems, this match As illustrated by FIG. 3, to verify this, the Automated, ing is loose enough to allow the garbage collector to update Static Safety Verifier simply runs the Boogie tool 325 on the concrete pointers as objects move in memory (since the copy 15 source code, which queries the Z3 theorem prover to check ing collector moves objects). In the specification for read that the procedure satisfies its postconditions, and that all Stack, the InteriorValue predicate matches a concrete value calls inside the procedure satisfy the necessary preconditions “val at memory address “ptr. to the abstract value at offset (i.e., the specifications 320). Given the BoogiePL source code '' in the “abstract frame' frame of the current stack S, as 315, this process is entirely automatic, requiring no scripts or illustrated below: human interactive assistance to guide the theorem prover. Each statement in the verified BoogiePL code 315 corre sponds to 0, 1, or 2 assembly language instructions. “If statements require 2 instructions, a compare and branch. procedure readStack(ptrint, frame:int, :int) returns(val:int); Dynamically checked arithmetic statements also require 2 requires StackStateS == StackRunning: 25 requires NucleusInv(S,...); instructions, an arithmetic operation followed by a jump-on requires 0 <= frame < FrameCountsS); overflow. Boogie.Asm 330, a tool developed for use with the requires ... ideas described in the aforementioned copending U.S. patent (SCS ... application, then transforms this verified BoogiePL source ensures vall== Memptr; ensures InteriorValue(val.....FrameAbsSframe...); 30 code 315 into valid assembly code, as illustrated below:
The name “InteriorValue' reflects the fact that a stack value 2Try Read Keyboard proc might be an “interior pointer to the inside of an object, which inal, O64h the garbage collector must properly track, as discussed in the 35 and eax, 1 cmp eax, O aforementioned copending U.S. patent application. Note that jne TryRead KeyboardSskip readStack does not modify the instruction pointer, Eip, mov eax, 256 ret because it contains no instructions (the same is true for read Try ReadKeyboardSskip: Field, writeField, and writeStack). Because of this, the TAL inal, O60h code need not generate any code to call readStack at run-time. 40 and eax, 255 Instead, it simply reads the data at Memptr directly. ret Finally, the Nucleus exports device management functions to the TAL kernel: VgaTextWrite, TryReadKeyboard, Start Boogie.Asm 330 checks that the source code 315 contains Timer, and SendEoi (send end-of-interrupt). The specifica no circular definitions of constants or functions (which would tion for TryReadKeyboard, for example, requires that the 45 cause the verification to be unsound), and no recursive defi Nucleus return a keystroke (in the range 0-255) if one is nitions of procedures (which the translator currently cannot available, and otherwise return the value 256: generate code for as with the aforementioned copending U.S. patent application. Boogie.Asm 330 also checks that the veri fied source code 315 conforms to a restricted subset of the 50 BoogiePL syntax. For example, before performing an “if” or procedure Try Read Keyboard(); “return' statement, the code must perform call Go() or call ensures KbdAvailable==old (KbdDone) => eax==256; Ret(old(RET)) operations to update the global variables that ensures KbdAvailable> old.(KbdDone) ==> reflect the machine state, as illustrated below: eax==KbdEventsold (KbdDone): 55
procedure Go(); 2.5 Nucleus Implementation and Verification: modifies Eip: SafeGS expands on the ideas described in the aforemen procedure Ret(oldRET:ReturnTo); requires oldRET == ReturnToAddr(Memesp); tioned copending U.S. patent application by using BoogiePL 60 requires Aligned (esp); to express verified assembly language instructions, but modifies Eip, esp; improves on the earlier work by generating much of the ensures esp == old (esp) + 4: Verified assembly language automatically from higher-level ensures Aligned (esp); Source code. The verified assembly language code is illus trated below with a small, but complete, example represent 65 Writing Such detailed, annotated assembly language by ing the verified source code implementing TryReadKey hand is hard work, and the resulting code is often difficult to board: read. To relieve some of this burden and clarify the code, a US 8,341,602 B2 21 22 small extension of BoogiePL called Beat 310 was developed board events and signals a semaphore upon receiving an for use by the “Automated, Static Safety Verifier”. Beat 310 event. Each semaphore Supports two operations, Wait and provides some conveniences, such as defining named aliases Signal. If a running thread requests to wait on a semaphore, for x86 assembly language registers, and very simple high the kernel scheduler moves the thread into the semaphore's level-to-assembly-level compilation of Statements and private wait queue, so that the threadblocks waiting for some one to signal the thread. A subsequent request to signal the expressions. This aids readability, since named variables and semaphore may release the thread from the wait queue into structured control constructs (such as “if/else' and “while') the ready queue. are easier to read than unstructured assembly language The kernels scheduler also coordinates garbage collec branches, without straying too far from the low-level assem 10 tion. The well-known Bartok compiler was modified so that at bly language model. The close correspondence between the each C# allocation site, the compiler generates TAL code to Beat code and the generated assembly language ensures that check the allocator's return value. If the value is null, the TAL verifiable Beat compiles to verifiable BoogiePL assembly code calls the kernel to block awaiting garbage collection, language. For non-performance-critical code, Beat's 310 then jumps back to retry the allocation after the collection. high-level statements were used rather than writing assembly The kernel maintains a collection queue of threads waiting for language instructions directly. 15 garbage collection. Similar to conventional Bartok and Sin 2.6 Kernel: gularity designs, before performing a collection, SafeCS As illustrated by FIG.2, on top of the Nucleus 200, Safe0S waits for each thread in the system to block on a semaphore or provides a simple kernel 205, written in C# and compiled to at an allocation site. This ensures that the collector is able to TAL. This kernel follows closely in the footsteps of other scan the stack of every thread. This does raise the possibility operating systems developed in safe languages, so this sec that one thread in an infinite loop could starve the other tion focuses on the interaction of the kernel with the Nucleus. threads of garbage collections. However, if this is a concern, Again, it must be understood that SafeGS was constructed Bartok can be modified to poll for collection requests at using the Automated, Static Safety Verifier” for purposes of backwards branches, at a small run-time cost and with larger explanation and to demonstrate operation of the “Automated, GC tables. Alternatively, if Bartok were able to generate GC Static Safety Verifier”. As such, it should be clear that the information for all program counter values in the program, a kernel of an arbitrary safe OS constructed using the tech 25 precondition could be added to InterruptHandler saying that niques and procedures enabled by the “Automated Static an interrupted Stack is scannable, so that the kernel need not Safety Verifier can have significantly more functionality wait for threads to block before calling the garbage collector. than the exemplary kernel described below. 3.0 Summary of the “Automated, Static Safety Verifier: In general, the kernel constructed for the SafeOS imple The processes described above with respect to FIG. 1 ments round-robin preemptive threading on top of Nucleus 30 through FIG.3 and in further view of the detailed description stacks, allowing threads to block on semaphores. The kernel provided above in Sections 1 and 2 are illustrated by the scheduler maintains two queues: a ready queue of threads that general operational flow diagram of FIG. 4. In particular, FIG. are ready to run, and a collection queue of threads waiting for 4 provides an exemplary operational flow diagram that Sum the next garbage collection. A running thread may Voluntarily marizes the operation of an exemplary embodiment of the ask the kernel to yield. In this case, the thread goes to the back 35 “Automated, Static Safety Verifier”. Note that FIG. 4 is not of the ready queue. The scheduler then selects another thread intended to be an exhaustive representation of all of the vari from the front of the ready queue and calls the Nucleus ous embodiments of the Automated, Static Safety Verifier” YieldTo function to transfer control to the newly running described herein, and that the embodiments represented in thread. FIG. 4 are provided only for purposes of explanation. The kernel TAL code may execute the x86 disable-inter In general, as illustrated by FIG. 4, the “Automated Static rupt and enable-interrupt instructions whenever it wishes. 40 Safety Verifier begins operation by providing 400 annotated While performing scheduling operations, the kernel keeps Source code comprising a Nucleus of an operating system. In interrupts disabled. It enables interrupts before transferring general as discussed in detail above, these annotations corre control to application TAL code. Thus, a thread running appli spond to allowed behaviors (i.e., “specifications”) for each cation code may be interrupted by a timer interrupt. When this instruction of the Nucleus source code. happens, the Nucleus transfers control back to the kernel TAL 45 Next, the Automated, Static Safety Verifier automati code running on stack O. This code uses the Nucleus Get cally evaluates 410 allowed behaviors of the annotated source StackState function to discover that the previously running code using a verification tool (e.g., Boogie, as discussed thread was interrupted. It then moves the interrupted thread to above) to ensure that allowed behaviors are consistent with the back of the scheduler queue, and calls YieldTo to yield the corresponding instructions of the Nucleus source code. control to a thread from the front of the ready queue. Following this evaluation 410, if the nucleus source code is When an application asks the kernel to spawn a new thread, 50 determined to have consistent allowed behaviors, the “Auto the kernel allocates an empty Nucleus stack, if available, and mated Static Safety Verifier compiles 420 the Nucleus places the new thread in the ready queue. When the empty Source code to produce an operating system nucleus Supply thread runs (via a scheduler call to YieldTo), it enters the ing well-formed Stacks and objects. kernel's entry point, KernelEntry Point, which calls the appli Once the operating system Nucleus has been compiled cation code. When the application code is finished running, it 55 420, the Automated Static Safety Verifier” constructs the may return back to KernelEntry Point, which marks the thread kernel and/or other higher level program elements. In particu as “exited' and yields control back to the kernel's stack 0 lar, the next step is to provide 430 source code for one or more code. The stack O code then calls ResetThread to mark the higher level program elements (such as the kernel or other exited threads stack as empty (and thus available for future applications) that are intended to run on top of the operating threads to use). Note that KernelEntry Point is not allowed to system nucleus. The “Automated, Static Safety Verifier then return, since it sits in the bottom-most frame of its stack and 60 automatically verifies 440 (e.g., using a tool Such as the afore has no return address to return to. The TAL checker verifies mentioned TAL checker) that the source code of the higher that KernelEntry Point never tries to return, simply by assign level program elements safely interacts with the well-formed ing KernelEntry Point a TAL type that lacks a return address to stacks and objects of the compiled operating system nucleus. return to. If the source code of the higher level program elements is Applications may allocate and share semaphore objects to 65 verified to safely interact, then the “Automated, Static Safety coordinate execution among threads. The SafeGS keyboard Verifier compiles 450 the verified source code. Finally, the driver, for example, consists of a thread that polls for key “Automated, Static Safety Verifier uses a conventional linker US 8,341,602 B2 23 24 (trusted) to combine 460 (i.e., link) the compiled operating ware and memory access for the operating system, and system nucleus with the compiled Source code of the higher wherein the annotations correspond to allowed behav level program elements to construct a verifiably safe operat iors for each instruction of the source code of the ing system (e.g., “SafeGS) that correctly supplies well nucleus; formed Stacks and objects. evaluating the allowed behaviors of the annotated source 4.0 Exemplary Operating Environments: code using a verification tool to ensure that the allowed The “Automated, Static Safety Verifier described herein is operational within numerous types of general purpose or behaviors are consistent with the corresponding instruc special purpose computing system environments or configu tions; rations. FIG. 5 illustrates a simplified example of a general wherein source code having consistent allowed behaviors purpose computer system on which various embodiments and 10 is compiled to produce an operating system nucleus elements of the Automated, Static Safety Verifier, as Supplying “well-formed Stacks and objects; described herein, may be implemented. It should be noted providing Source code for one or more higher level pro that any boxes that are represented by broken or dashed lines gram elements intended to be run on top of the operating in FIG. 5 represent alternate embodiments of the simplified system nucleus; computing device, and that any or all of these alternate 15 verifying that the Source code of the higher level program embodiments, as described below, may be used in combina elements safely interacts with the “well-formed stacks tion with other alternate embodiments that are described and objects of the compiled operating system nucleus; throughout this document. compiling verified source code of the higher level program For example, FIG. 5 shows a general system diagram elements; and showing a simplified computing device. Such computing combining the operating system nucleus with the compiled devices can be typically be found in devices having at least Source code of the higher level program elements to Some minimum computational capability, including, but not construct a verifiably safe operating system. limited to, personal computers, server computers, hand-held 2. The method of claim 1 wherein the verification tool for computing devices, laptop or mobile computers, communi ensuring that the allowed behaviors are consistent with the cations devices such as cellphones and PDAs, multiproces corresponding instructions further comprises a Hoare-style sor Systems, microprocessor-based systems, network PCs, 25 minicomputers, mainframe computers, etc. program verifier that verifies assembly language instructions To allow a device to implement the “Automated, Static comprising the annotated Source code against the allowed Safety Verifier, the device should have a sufficient compu behaviors. tational capability to perform the various evaluation, verifi 3. The method of claim 1 wherein the source code of the cation, and compilation processes and techniques described higher level program elements is compiled to typed assembly herein. In particular, as illustrated by FIG. 5, the computa 30 language (TAL) prior to being verified, and wherein Verifying tional capability is generally illustrated by one or more pro that Source code comprises evaluating that source code using cessing unit(s) 510, and may include one or more GPUs 515. a TAL checker to verify safe interaction with the “well Note that that the processing unit(s) 510 of the general com formed Stacks and objects of the compiled operating system puting device of may be specialized microprocessors, such as nucleus. a DSP, a VLIW, or other micro-controller, or can be conven 35 4. The method of claim 3 wherein the TAL checker tional CPUs having one or more processing cores, including enforces a guarantee that whenever any assembly language specialized GPU-based cores in a multi-core CPU. code of the nucleus transfers control to higher-level code, the In addition, the simplified computing device of FIG.5 may higher-level code's stack and heap are both “well-formed’. also include other components, such as, for example, a com 5. The method of claim 1 wherein stacks are “well-formed' munications interface 530. The simplified computing device 40 when they consist of Zero or more stack frames, where: of FIG. 5 may also include one or more conventional com each frame's memory does not overlap with the memory of puter input devices 540. The simplified computing device of any other stack frame or object; and FIG.5 may also include other optional components, such as, each frame contains Zero or more pointers, each pointing to for example one or more conventional computer output Some object, where, if the pointer has a specific type, devices 550. Finally, the simplified computing device of FIG. then the pointed-to object has a type that is assignable to 5 may also include storage 560 that is either removable 570 45 and/or non-removable 580. Note that typical communications that specific type. interfaces 530, input devices 540, output devices 550, and 6. The method of claim 1 wherein objects are “well storage devices 560 for general-purpose computers are well formed where: known to those skilled in the art, and will not be described in the object's memory does not overlap with the memory of detail herein. any other stack frame or object; and The foregoing description of the Automated, Static Safety 50 the object contains Zero or more pointers, each pointing to Verifier has been presented for the purposes of illustration Some object, where, if the pointer has a specific type, and description. It is not intended to be exhaustive or to limit then the pointed-to object has a type that is assignable to the claimed subject matter to the precise form disclosed. that specific type. Many modifications and variations are possible in light of the 7. The method of claim 1 wherein the verifiably safe oper above teaching. Further, it should be noted that any or all of SS ating system is further combined with an executable boot the aforementioned alternate embodiments may be used in loader to construct a bootable verifiably safe operating sys any combination desired to form additional hybrid embodi tem. ments of the “Automated, Static Safety Verifier. It is 8. A system having a processor for generating a verifiably intended that the scope of the invention be limited not by this safe operating system, comprising: detailed description, but rather by the claims appended 60 an automated verification tool for evaluating annotated hereto. Source code comprising a nucleus of an operating sys What is claimed is: tem, wherein the nucleus provides hardware and 1. A method for verification of a safe operating system that memory access for the operating system, and wherein correctly supplies “well-formed stacks and objects, com the annotations correspond to allowed behaviors for prising steps for: 65 each instruction of the Source code, to ensure that the providing annotated source code comprising a nucleus of allowed behaviors are consistent with the corresponding an operating system, wherein the nucleus provides hard instructions; US 8,341,602 B2 25 26 a compiler for compiling source code having consistent receive a first set of source code, said first set of source code allowed behaviors to produce a compiled operating sys defining an operating system nucleus that provides hard tem nucleus supplying “well-formed stacks and ware and memory access for the operating system; objects; compile the first set of source code to a first set of assembly language instructions: a checker tool for verifying that source code of one or more annotate each instruction of the first set of assembly lan higher level program elements intended to be run on top guage instructions to define allowed behaviors for each of the operating system nucleus safely interacts with the of those assembly language instructions; “well-formed” stacks and objects of the compiled oper evaluate the allowed behaviors of the annotated source ating system nucleus; code using a Hoare-style program verification tool to compiling verified source code of the higher level program 10 ensure that the allowed behaviors are consistent with elements; and each corresponding assembly language instruction; linking the compiled operating system nucleus with the if each assembly language instruction in the first set of compiled source code of the higher level program ele assembly language instructions are verified to be con ments to construct a verifiably safe operating system that sistent with their corresponding allowed behaviors, use correctly supplies “well-formed” stacks and objects. an assembler to compile the first set of assembly lan 9. The system of claim 8 wherein the verification tool 15 guage instructions to produce an operating system further comprises a Hoare-style program verifier that verifies nucleus object supplying “well-formed” stacks and assembly language instructions comprising the annotated “well-formed' objects; source code against the allowed behaviors. receiving a second set of source code, said second set of 10. The system of claim 8 wherein the source code of the source code defining a kernel; higher level program elements is compiled to typed assembly compile the second set of source code to a set of typed language (TAL) prior to being verified, and wherein Verifying assembly language (TAL) instructions: that source code comprises evaluating that source code using use a TAL checker to verify safe interaction of the TAL a TAL checker to verify safe interaction with the “well instructions defining the kernel with the “well-formed formed' stacks and objects of the compiled operating system stacks and “well-formed objects of the compiled oper 25 ating system nucleus; nucleus. if the safe interaction of the TAL instructions is verified, 11. The system of claim 10 wherein the TAL checker compiling the verified TAL instructions to produce a enforces a guarantee that whenever control is transferred to kernel object; and any higher-level code, the higher-level code's stack and heap use a linker to link the operating system nucleus object and are both “well-formed'. the kernel object to construct a verifiably safe operating 12. The system of claim 8 wherein stacks are “well 30 system that correctly supplies “well-formed” stacks and formed” when they consist of Zero or more stack frames, “well-formed” objects. where: 17. The computer-readable memory of claim 16 wherein each frame's memory does not overlap with the memory of stacks are “well-formed' when they consist of Zero or more any other stack frame or object; and stack frames, where: each frame contains Zero or more pointers, each pointing to 35 each frame's memory does not overlap with the memory of some object, where, if the pointer has a specific type, any other stack frame or object; and then the pointed-to object has a type that is assignable to each frame contains Zero or more pointers, each pointing to that specific type. some object, where, if the pointer has a specific type. 13. The system of claim 8 wherein objects are “well then the pointed-to object has a type that is assignable to formed where: that specific type. the object's memory does not overlap with the memory of 40 18. The computer-readable memory of claim 16 wherein any other stack frame or object; and objects are “well-formed” where: the object contains zero or more pointers, each pointing to the object's memory does not overlap with the memory of some object, where, if the pointer has a specific type, any other stack frame or object; and then the pointed-to object has a type that is assignable to the object contains zero or more pointers, each pointing to that specific type. 45 14. The system of claim 8 wherein the operating system some object, where, if the pointer has a specific type, nucleus includes a plurality of functions, and wherein none of then the pointed-to object has a type that is assignable to those functions ever blocks, such that each such function that specific type. performs a finite amount of work then returns to a thread that 19. The computer-readable memory of claim 16 wherein is not required to be the same thread that invoked that func the verifiably safe operating system is further combined with tion. 50 an executable boot loader to construct a bootable verifiably 15. The system of claim 8 wherein the verifiably safe safe operating system. operating system is further combined with an executable boot 20. The computer-readable memory of claim 16 wherein loader to construct a bootable verifiably safe operating sys the operating system nucleus includes a plurality of functions, tem. and wherein none of those functions ever blocks, such that 16. A computer-readable memory having computer 55 each such function performs a finite amount of work then executable instructions stored therein for generating a verifi returns to a thread that is not required to be the same thread ably safe operating system, said instructions comprising that invoked that function. causing a computing device to: