
Computer Networks 31 (1999) 831–860

Building a high-performance, programmable secure coprocessor

Sean W. Smith *, Steve Weingart 1
Secure Systems and Smart Cards, IBM T.J. Watson Research Center, P.O. Box 704, Yorktown Heights, NY 10598, USA
* Corresponding author. Tel.: +1-914-7847131; Fax: +1-914-7846225; E-mail: [email protected].
1 E-mail: [email protected].

Abstract

Secure coprocessors enable secure distributed applications by providing safe havens where an application program can execute (and accumulate state), free of observation and interference by an adversary with direct physical access to the device. However, for these coprocessors to be effective, participants in such applications must be able to verify that they are interacting with an authentic program on an authentic, untampered device. Furthermore, secure coprocessors that support general-purpose computation and will be manufactured and distributed as commercial products must provide these core sanctuary and authentication properties while also meeting many additional challenges, including:
• the applications, operating system, and underlying security management may all come from different, mutually suspicious authorities;
• configuration and maintenance must occur in a hostile environment, while minimizing disruption of operations;
• the device must be able to recover from the vulnerabilities that inevitably emerge in complex software;
• physical security dictates that the device itself can never be opened and examined; and
• ever-evolving cryptographic requirements dictate that hardware accelerators be supported by reloadable on-card software.
This paper summarizes the hardware, software, and cryptographic architecture we developed to address these problems. Furthermore, with our colleagues, we have implemented this solution in a commercially available product. © 1999 Elsevier Science B.V. All rights reserved.

Keywords: Secure coprocessor; Computer security; Privacy; FIPS 140-1; Smart card; Token; Cryptography; Tamper-detection; Code-signing

1. Introduction

1.1. Secure coprocessors

Many current and proposed distributed applications face a fundamental security contradiction:
• computation must occur in remote devices,
• but these devices are vulnerable to physical attack by adversaries who would benefit from subverting this computation.

If an adversary can attack a device by altering or copying its algorithms or stored data, he or she often can subvert an entire application. The mere potential of such attack may suffice to make a new application too risky to consider.

Secure coprocessors – computational devices that can be trusted to execute their software correctly, despite physical attack – address these threats. Distributing such trusted sanctuaries throughout a hostile environment enables secure distributed applications.


Higher-end examples of secure coprocessing technology usually incorporate support for high-performance cryptography (and, indeed, the need to physically protect the secrets used in a cryptographic module motivated FIPS 140-1, the US Government standard [11,14] used for secure coprocessors). For over 15 years, our team has explored building high-end devices: robust, general-purpose computational environments inside secure tamper-responsive physical packages [15,22–24]. This work led to the Abyss, mAbyss, and Citadel prototypes, and contributed to the physical security design for some of the earlier IBM cryptographic accelerators.

However, although our efforts have focused on high-end coprocessors, devices that accept much more limited computational power and physical security in exchange for a vast decrease in cost – such as IC chip cards, PCMCIA tokens, and 'smart buttons' – might also be considered part of the secure coprocessing family. (As of this writing, no device has been certified to the tamper-response criteria of Level 4 of the FIPS 140-1 standard; even the touted Fortezza achieved only Level 2, tamper evident. 2)

Even though this technology is closely associated with cryptographic accelerators, much of the exciting potential of the secure coprocessing model arises from the notion of putting computation as well as cryptographic secrets inside the secure box. Yee's seminal examination of this model [26] built on our Citadel prototype. Follow-up research by Tygar and Yee [21,27] and others (e.g., Refs. [9,13,17]) further explores the potential applications and limits of the secure coprocessing model.

1.2. Challenge

This research introduced the challenge: how do we make this vision real? Widespread development and practical deployment of secure coprocessing applications requires an infrastructure of secure devices, not just a handful of laboratory prototypes. Recognizing this need, our team has recently completed a several-year research and development project to design, develop, and distribute such a device – both as a research tool and as a commercial product, which reached market in August 1997.

This project challenged us with several constraints [20]:
• the device must offer high-performance computational and cryptographic resources;
• the device must be easily programmable by IBM and non-IBM developers, even in small quantities;
• the device must exist within the manufacturing, distribution, maintenance, and trust confines of a commercial product (as opposed to an academic prototype) from a private vendor.

However, the projected lifecycle of a high-end secure coprocessor challenged us with security issues:
• How does a generic commercial device end up in a hostile environment, with the proper software and secrets?
• How do participants in distributed applications distinguish between a properly configured, untampered device, and a maliciously modified one or a clone?

1.3. Overview of our technology

1.3.1. Hardware design

For our product [12], we answered these questions by building on the design philosophy that evolved in our previous prototypes:
• maximize computational power (e.g., use as big a CPU as is reasonable, and use good cryptographic accelerators 3);
• support it with ample dynamic RAM (DRAM);
• use a smaller amount of battery-backed RAM (BBRAM) as the non-volatile, secure memory; and
• assemble this on a circuit board with technology to actively sense tamper and near-instantly zeroize the BBRAM.
Fig. 1 sketches this design.

2 In November 1998, as this paper went to press, our device earned the world's first FIPS 140-1 Level 4 certificate.
3 Our current hardware features a 66 MHz 486 CPU, and accelerators for DES, modular math (hence RSA and DSA), and noise-based random number generation.

Fig. 1. Hardware architecture of our high-end secure coprocessor.

1.3.2. Security design

Active tamper response gives a device a lifecycle shown in Fig. 2: tamper destroys the contents of secure memory – in our case, the BBRAM and DRAM. However, one can logically extend the secure storage area beyond the BBRAM devices themselves by storing keys in BBRAM and ciphertext in FLASH, 4 or even cryptopaging [26] it onto the host file system.

1.3.3. Application design

This design leads to a notion of a high-end secure coprocessor that is substantially more powerful and secure – albeit larger 5 and more expensive – than the family's weaker members, such as chip cards. (The larger physical package and higher cost permit more extensive protections.) This approach shapes the design for application software:
• protect the critical portion of the application software by having it execute inside the secure coprocessor;
• allow this critical portion to be fairly complex;
• then structure this critical software to exploit the tamper protections: tamper destroys only the contents of volatile DRAM and the smaller BBRAM – but not, for example, the contents of FLASH or ROM.

1.3.4. Software design

Making a commercial product support this application design requires giving the device a robust programming environment, and making it easy for developers to use this environment – even if they do not necessarily trust IBM or each other. These goals led to a multi-layer software architecture:
• a foundational Miniboot layer manages security and configuration;
• an operating system or control program layer manages computational, storage, and cryptographic resources; and
• an unprivileged application layer uses these resources to provide services.
Fig. 3 sketches this architecture.

Currently, Miniboot consists of two components: Miniboot 0, residing in ROM, and Miniboot 1, which resides, like the OS and the application, in rewritable non-volatile FLASH memory. However, we are also considering adding support for various multi-application scenarios, including the simultaneous existence of two or more potentially malicious applications in the same device, as well as one 'master' application that dynamically loads additional applications at run-time.

4 FLASH is a non-volatile memory technology similar to EEPROM. FLASH differs significantly from the more familiar RAM model in several ways: FLASH can only be reprogrammed by erasing and then writing an entire sector first; the erase and rewrite cycles take significantly longer than RAM; and FLASH imposes a finite lifetime (currently, usually 10^4 or 10^5) on the maximum number of erase/rewrite cycles for any one sector.
5 For example, our product is a PCI card, although we see no substantial engineering barriers to repackaging this technology as a PCMCIA card.

Fig. 2. Sample lifecycle of a high-end secure coprocessor with active tamper response.

Fig. 3. Software architecture for our high-end secure coprocessor. Our current software only supports one application, not dynamically loaded.

1.4. This paper

Building a high-performance, programmable secure coprocessor as a mass-produced product – and not just as a laboratory prototype – requires identifying, articulating, and addressing a host of research issues regarding security and trust. This paper discusses the security architecture we designed and (with our colleagues) implemented.
• Section 2 presents the security goals and commercial constraints we faced.
• Section 3 introduces our approach to solving them.
• Sections 4–8 present the different interlocking pieces of our solution.
• Sections 9 and 10 summarize how these pieces work together to satisfy the security goals.
Section 11 presents some thoughts for future directions.

2. Requirements

In order to be effective, our solution must simultaneously fulfil two different sets of requirements. The device must provide the core security properties necessary for secure coprocessing applications. But the device must also be a practical, commercial product; this goal gives rise to many additional constraints, which can interact with the security properties in subtle ways.

2.1. Commercial requirements

Our device must exist as a programmable, general-purpose computer – because the fundamental secure coprocessing model (e.g., Refs. [26,27]) requires that computation, not just cryptography, reside inside the secure box. This notion – and previous experience with commercial security hardware (e.g., Ref. [1]) – gives rise to many constraints.

2.1.1. Development

To begin with, the goal of supporting the widespread development and deployment of applications implies:
• The device must be easily programmable.
• The device must have a general-purpose operating system, with debugging support (when appropriate).
• The device must support a large population of authorities developing and releasing application and OS code, deployed in various combinations on different instantiations of the same basic device.
• The device must support vertical partitioning: an application from one vendor, an OS from another, bootstrap code from a third.
• These vendors may not necessarily trust each other – hence, the architecture should permit no 'backdoors.'

2.1.2. Manufacturing

The process of manufacturing and distribution must be as simple as possible:
• We need to minimize the number of variations of the device, as manufactured or shipped (since each new variation dramatically increases administrative overhead).

• It must be possible to configure the software on the device after shipment, in what we must regard as a hostile environment.
• We must reduce or eliminate the need to store a large database of records (secret or otherwise) pertaining to individual devices.
• As an international corporation based in the United States, we must abide by US export regulations.

2.1.3. Maintenance

The complexity of the proposed software – and the cost of a high-end device – mean that it must be possible to update the software already installed in a device.
• These updates should be safe, easy, and minimize disruption of device operation.
  - When possible, the updates should be performed remotely, in the 'hostile' field, without requiring the presence of a trusted security officer.
  - When reasonable, internal application state should persist across updates.
• Particular versions of software may be so defective as to be non-functional or downright malicious. Safe, easy updates must be possible even then.
• Due to its complexity and ever-evolving nature, the code supporting high-end cryptography (including public-key, 6 hashing, and randomness) must itself be updatable. But repair should be possible even if this software is non-functional.

6 Our hardware accelerator for RSA and DSA merely does modular arithmetic; hence, additional software support is necessary.

2.2. Security requirements

The primary value of a secure coprocessor is its ability to provide a trusted sanctuary in a hostile environment. This goal leads to two core security requirements:
• The device must really provide a safe haven for application software to execute and accumulate secrets.
• It must be possible to remotely distinguish between a message from a genuine application on an untampered device, and a message from a clever adversary.
We consider these requirements in turn.

2.2.1. Safe execution

It must be possible for the card, placed in a hostile environment, to distinguish between genuine software updates from the appropriate trusted sources, and attacks from a clever adversary.

The foundation of secure coprocessing applications is that the coprocessor really provides a safe haven. For example, suppose that we are implementing decentralized electronic cash by having two secure devices shake hands and then transactionally exchange money (e.g., Ref. [27]). Such a cash program may store two critical parameters in BBRAM: the private key of this wallet, and the current balance of this wallet. Minimally, it must be the case that physical attack really destroys the private key. However, it must also be the case that the stored balance never change except through appropriate action of the cash program. (For example, the balance should not change due to defective memory management or lack of fault-tolerance in updates.)

Formalizing this requirement brings out many subtleties, especially in light of the flexible shipment, loading, and update scenarios required by Section 2.1 above. For example:
• What if an adversary physically modifies the device before the cash program was installed?
• What if an adversary 'updates' the cash program with a malicious version?
• What if an adversary updates the operating system underneath the cash program with a malicious version?
• What if the adversary already updated the operating system with a malicious version before the cash program was installed?
• What if the adversary replaced the public-key cryptography code with one that provides backdoors?
• What if a sibling application finds and exploits a flaw in the protections provided by the underlying operating system?

After much consideration, we developed safe execution criteria that address the authority in charge of a particular software layer, and the execution environment – the code and hardware – that has access to the secrets belonging to that layer.
• Control of software. If Authority N has ownership of a particular software layer in a particular device, then only Authority N, or a designated superior, can load code into that layer in that device.
• Access to secrets. The secrets belonging to this layer are accessible only by code that Authority N trusts, executing on hardware that Authority N trusts, in the appropriate context.

2.2.2. Authenticated execution

Providing a safe haven for code to run does not do much good, if it is not possible to distinguish this safe haven from an impostor. It must thus be possible to:
• authenticate an untampered device;
• authenticate its software configuration; and
• do this remotely, via computational means.

The first requirement is the most natural. Consider again the example of decentralized cash. An adversary who runs this application on an exposed computer but convinces the world it is really running on a secure device has compromised the entire cash system – since he or she can freely counterfeit money by incrementing the stored balance.

The second requirement – authenticating the software configuration – is often overlooked but equally important. In the cash example, running a maliciously modified wallet application on a secure device also gives an adversary the ability to counterfeit money. For another example, running a Certificate Authority on a physically secure machine without knowing for certain what key generation software is really installed leaves one open to attack [28].

The third requirement – remote verification – is driven by two main concerns. First, in the most general distributed application scenarios, participants may be separated by great physical distance, and have no trusted witnesses at each other's site. Physical inspection is not possible, and even the strongest tamper-evidence technology is not effective without a good audit procedure.

Furthermore, we are reluctant to trust the effectiveness of commercially feasible tamper-evidence technology against the dedicated adversaries that might target a high-end device. (Tamper-evidence technology only attempts to ensure that tampering leaves clear visual signs.) We are afraid that a device that is opened, modified and reassembled may appear perfect enough to fool even trained analysts.

This potential for perfect reassembly raises the serious possibility of attack during distribution and configuration. In many deployment scenarios, no one will have both the skills and the motivation to detect physical tamper. The user who takes the device out of its shipping carton will probably not have the ability to carry out the forensic physical analysis necessary to detect a sophisticated attack with high assurance. Furthermore, the user may be the adversary – who probably should not be trusted to report whether or not his or her device shows signs of the physical attack he or she just attempted. Those parties (such as, perhaps, the manufacturer) with both the skills and the motivation to detect tamper may be reluctant to accept the potential liability of a 'false negative' tamper evaluation.

For all these reasons, our tamper-protection approach does not rely on tamper-evidence alone (see Section 4).

3. Overview of our architecture

In order to meet the requirements of Section 2, our architecture must ensure secure loading and execution of code, while also accommodating the flexibility and trust scenarios dictated by commercial constraints.

3.1. Secrets

Discussions of secure coprocessor technology usually begin with 'physical attack zeroizes secrets'. Our security architecture must begin by ensuring that tamper actually destroys secrets that actually meant something. We do this with three main techniques:
• The secrets go away with physical attack. Section 4 presents our tamper-detection circuitry and protocol techniques. These ensure that physical attack results in the actual zeroization of sensitive memory.

• The secrets started out secret. Section 5 presents our factory initialization and regeneration/recertification protocols. These ensure that the secrets, when first established, were neither known nor predictable outside the card, and do not require assumptions of indefinite security of any given keypair.
• The secrets stayed secret despite software attack. Section 6 presents our hardware ratchet lock techniques. These techniques ensure that, despite arbitrarily bad compromise of rewritable software, sufficiently many secrets remain to enable recovery of the device.

3.2. Code

Second, we must ensure that code is loaded and updated in a safe way. Discussions of code-downloading usually begin with 'just sign the code'. However, focusing on code-signing alone neglects several additional subtleties that this security architecture must address. Further complications arise from the commercial requirement that this architecture accommodate a pool of mutually suspicious developers, who produce code that is loaded and updated in the hostile field, with no trusted couriers.
• Code loads and updates. We must have techniques that address questions such as:
  - What about updates to the code that checks the signatures for updates?
  - Against whose public key should we check the signature?
  - Does code end up installed in the correct place?
  - What happens when another authority updates a layer on which one's code depends?
• Code integrity. For the code loading techniques to be effective, we must also address issues such as:
  - What about the integrity of the code that checks the signature?
  - Can adversarial code rewrite other layers?
Section 7 presents our techniques for code integrity, and Section 8 presents our protocols for code loading. Together, these ensure that the code in a layer is changed and executed only in an environment trusted by the appropriate code authority.

3.3. Achieving the security requirements

Our full architecture carefully combines the building blocks described in Sections 4–8 to achieve the required security properties:
• Code executes in a secure environment. Section 9 presents how our secrecy management and code integrity techniques interact to achieve the requirements of Section 2.2.1: software loaded onto the card can execute and accumulate state in a continuously trusted environment, despite the risks introduced by dependency on underlying software controlled by a potentially hostile authority.
• Participants can remotely authenticate real code on a real device. Section 10 presents how our secrecy management and code integrity techniques interact to achieve the requirement of Section 2.2.2: any third party can distinguish between a message from a particular program in a particular configuration of an untampered device, and a message from a clever adversary.

4. Defending against physical threats

The main goal of physical security is to ensure that the hardware can know if it remains in an unmolested state – and if so, that it continues to work in the way it was intended to work. To achieve physical security, we start with our basic computational/crypto device and add additional circuitry and components to detect tampering by direct physical penetration or by unusual operating conditions. If the circuit detects a condition that would compromise correct operation, the circuit responds in a manner to prevent theft of secrets or misuse of the secure coprocessor.

4.1. Overview

Traditionally, physical security design has taken several approaches:
• tamper evidence, where packaging forces tamper to leave indelible physical changes;

• tamper resistance, where the device packaging makes tamper difficult;
• tamper detection, where the device actually is aware of tamper;
• tamper response, where the device actively takes countermeasures upon tamper.

We feel that commercially feasible tamper-evidence technology and tamper-resistance technology cannot withstand the dedicated attacks that a high-performance, multi-chip coprocessor might face. Consequently, our design incorporates an interleaving of resistance and detection/response techniques, so that penetrations are sufficiently difficult to trigger device response.
• Section 4.2 will discuss our techniques to detect penetration attacks.
• Section 4.3 will discuss how our device responds once tamper is detected.
• Section 4.4 will discuss the additional steps we take to ensure that tamper response is effective and meaningful – particularly against physical attacks other than penetration.

Historically, work in this area placed the largest effort on physical penetration [8,22,23]. Preventing an adversary from penetrating the secure coprocessor and probing the circuit to discover the contained secrets is still the first step in a physical security design. Although some standards have emerged as groundwork and guidelines [14,24,25], exact techniques are still evolving.

A significant amount of recent work examines efforts to cause incorrect device operation, and thus allow bypass of security functions [2,3]. Other recent work capitalizes on small induced failures in cryptographic algorithms to make discovery of keys easier [6,7]. Consequently, as feasible tampering attacks have become more sophisticated through time and practice, it has become necessary to improve all aspects of a physical security system. Over the years many techniques have been developed, but they all face the same problem: no provably tamper-proof system exists. Designs get better and better, but so do the adversary's skill and tools. As a result, physical security is, and will remain, a race between the defender and the attacker. (To date, we have not been able to compromise our own security, which is also under evaluation by an independent laboratory against the FIPS 140-1 Level 4 criteria [11,14].) The economic challenge of producing a physically secure but usable system at a reasonable cost is difficult.

4.2. Detecting penetration

We have taken the approach of making incremental improvements on well-known technology, and layering these techniques. This way, the attacker has to repeat, at each layer, work that has a low probability of success; furthermore, the attacker must work through the layers that have already been passed (and may still be active). The basic element is a grid of conductors which is monitored by circuitry that can detect changes in the properties (opens, shorts, changes in conductivity) of the conductors. The conductors themselves are non-metallic and closely resemble the material in which they are embedded, making discovery, isolation, and manipulation more difficult. These grids are arranged in several layers, and the sensing circuitry can detect accidental connection between layers as well as changes in an individual layer.

The sensing grids are made of flexible material and are wrapped around and attached to the secure coprocessor package as if it were being gift-wrapped. Connections to and from the secure coprocessor are made via a thin flexible cable which is brought out between the folds in the sensing grids so that no openings are left in the package. (Using a standard connector would leave such openings.)

After we wrap the package, we embed it in a potting material. As mentioned above, this material closely resembles the material of the conductors in the sensing grids. Besides making it harder to find the conductors, this physical and chemical resemblance makes it nearly impossible for an attacker to penetrate the potting without also affecting the conductors. Then we enclose the entire package in a grounded shield to reduce susceptibility to electromagnetic interference, and to reduce detectable electromagnetic emanations.

4.3. Responding to tamper

The most natural tamper response in a secure coprocessor is to erase secrets that are contained in the unit, usually by erasing (zeroizing) a Static Random Access Memory (SRAM) that contains the secrets, then erasing the operating memory and ceasing operation. An SRAM can be made persistent with a small battery, and can, under many conditions, be easily erased.

This is what we do in our device: battery-backed SRAM (BBRAM) exists as storage for secrets. Upon detection of tamper, we zeroize the BBRAM and disable the rest of the device by holding it in reset. The tamper detection/response circuitry is active at all times, whether the coprocessor is powered or not – the detection/response circuitry runs on the same battery that maintains the BBRAM when the unit is unpowered.

Tamper can happen quickly. In order to erase quickly, we crowbar the SRAM by switching its power connection to ground. At the same time, we force all data, address and control lines to a high impedance state, in order to prevent back-powering of the SRAM via those lines. This technique is employed because it is simple, effective, and it does not depend on the CPU being sufficiently operational for sufficiently long to overwrite the contents of the SRAM on tamper.

4.4. Detecting other physical attacks

To prevent attacks based on manipulating the operating conditions, including those that would make it difficult to respond to tamper and erase the secrets in SRAM, several additional sensors have been added to the security circuitry to detect and respond to changes in operating conditions.

4.4.1. Attacks on zeroization

For zeroization to be effective, certain environmental conditions must be met. For example, low temperatures will allow an SRAM to retain its data even with the power connection shorted to ground. To prevent this, a temperature sensor in our device will cause the protection circuit to erase the SRAM if the temperature goes below a preset level.

Ionizing radiation will also cause an SRAM to retain its data, and may disrupt circuit operation. For this reason, our device also detects significant amounts of ionizing radiation and triggers the tamper response if detected.

Storing the same value in a bit in SRAM over long periods can also cause that value to imprint. Our software protocols take this threat into account, by periodically inverting this data.

4.4.2. Other attacks

An adversary might also compromise security by causing incorrect operation through careful manipulation of various environmental parameters. As a consequence, a device needs to detect and defend against such attacks.

One such environmental parameter is supply voltage, which has to be monitored for several thresholds. For example, at each power-down, the voltage will go from an acceptable level to a low voltage, then to no supply voltage. But the detection and response circuitry needs to be always active – so at some point, it has to switch over to battery operation. A symmetric transition occurs at power-up.

Whenever the voltage goes below the acceptable operating level of the CPU and its associated circuitry, these components are all held in a reset state until the voltage reaches the operating point. When the voltage reaches the operating point, the circuitry is allowed to run. If the voltage exceeds the specified upper limit for guaranteed correct operation, it is considered a tamper, and the tamper circuitry is activated.

Another method by which correct operation can be compromised is by manipulating the clock signals that go to the coprocessor. To defend against these sorts of problems, we use phase locked loops and independently generated internal clocks to prevent clock signals with missing or extra pulses, or ones that are either too fast or slow.

High temperatures can cause improper operation of the device CPU, and even damage it. So, high temperatures cause the device to be held in reset from the operational limit to the storage limit. Detection of temperature above the storage limit is treated as a tamper event.
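To make the response policy of Sections 4.4.1 and 4.4.2 concrete, the sketch below restates that decision logic as ordinary software. It is purely illustrative: the real protections are implemented in always-powered circuitry, and the sensor interface, threshold names, and threshold values here are our own assumptions, not the product's specifications.

    # Illustrative sketch (not the production circuit) of the tamper-response
    # decision logic described in Sections 4.4.1 and 4.4.2.
    from dataclasses import dataclass

    @dataclass
    class Sensors:
        temperature_c: float      # current device temperature
        voltage_v: float          # supply voltage
        ionizing_radiation: bool  # radiation detector tripped
        penetration: bool         # sensing-grid detector tripped

    # Hypothetical limits, for illustration only.
    TEMP_ZEROIZE_LOW = -25.0      # below this, SRAM could retain data: zeroize
    TEMP_OPERATIONAL_HIGH = 70.0  # above this, hold the CPU in reset
    TEMP_STORAGE_HIGH = 125.0     # above this, treat as tamper
    VOLTAGE_MIN_OPERATING = 4.5
    VOLTAGE_MAX_OPERATING = 5.5

    def evaluate(s: Sensors) -> str:
        """Return the response the protection circuitry would take."""
        if s.penetration or s.ionizing_radiation:
            return "ZEROIZE"                     # crowbar BBRAM, hold CPU in reset
        if s.temperature_c < TEMP_ZEROIZE_LOW or s.temperature_c > TEMP_STORAGE_HIGH:
            return "ZEROIZE"                     # conditions threaten or defeat zeroization
        if s.voltage_v > VOLTAGE_MAX_OPERATING:
            return "ZEROIZE"                     # over-voltage treated as tamper
        if s.voltage_v < VOLTAGE_MIN_OPERATING or s.temperature_c > TEMP_OPERATIONAL_HIGH:
            return "HOLD_IN_RESET"               # not tamper, but unsafe to run
        return "RUN"

    def anti_imprint(bbram: bytearray, inverted: bool) -> bool:
        """Periodically invert BBRAM contents so long-lived values do not imprint
        (Section 4.4.1); the returned flag records the current polarity."""
        for i in range(len(bbram)):
            bbram[i] ^= 0xFF
        return not inverted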

5. Device initialization

Section 4 discussed how we erase device secrets upon tamper. One might deduce that a natural consequence would be that 'knowledge of secrets' implies 'device is real and untampered'. But for this conclusion to hold, we need more premises:
• the secrets were secret when they were first established;
• the device was real and untampered when its secrets were established;
• weakening of cryptography does not compromise the secrets;
• operation of the device has not caused the secrets to be exposed.
This section discusses how we provide the first three properties. Section 6 will discuss how we provide the fourth.

5.1. Factory initialization

As one might naturally suspect, an untampered device authenticates itself as such using cryptographic secrets stored in secure memory. The primary secret is the private half of an RSA or DSA keypair. (Section 10 elaborates on the use of this private key.) Some symmetric-key secrets are also necessary for some special cases (as Sections 5.2.3 and 8.3 will discuss).

The device keypair is generated at device initialization. To minimize risk of exposure, a device generates its own keypair internally, within the tamper-protected environment and using seeds produced from the internal hardware random number generator. The device holds its private key in secure BBRAM, but exports its public key. An external Certification Authority (CA) adds identifying information about the device and its software configuration, signs a certificate for this device, and returns the certificate to the device. (The device-specific symmetric keys are also generated internally at initialization; see Section 8.3.)

Clearly, the CA must have some reason to believe that the device in question really is an authentic, untampered device. To address this question – and avoid the risks of undetectable physical modification (Section 4.1) – we initialize the cards in the factory, as the last step of manufacturing.

Although factory initialization removes the risks associated with insecure shipping and storage, it does introduce one substantial drawback: the device must remain within the safe storage temperature range (Section 4.4). But when considering the point of initialization, a manufacturer faces a tradeoff between ease of distribution and security: we have chosen security.

5.2. Field operations

5.2.1. Regeneration

An initialized device has the ability to regenerate its keypair. Regeneration frees a device from depending forever on one keypair, or key length, or even cryptosystem. Performing regeneration atomically 7 with other actions, such as reloading the crypto code (Section 8), also proves useful (as Section 10 will discuss). For stronger forward integrity, implementations could combine this technique with expiration dates.

To regenerate its keypair, a device does the following:
• create a new keypair from internal randomness,
• use the old private key to sign a transition certificate for the new public key, including data such as the reason for the change, and
• atomically complete the change, by deleting the old private key and making the new pair and certificate 'official'.
The current list of transition certificates, combined with the initial device certificate, certifies the current device private key. Fig. 4 illustrates this process.

5.2.2. Recertification

The CA for devices can also recertify 8 the device, by atomically replacing the old certificate and (possibly empty) chain of transition certificates with a single new certificate. Fig. 5 illustrates this process. (Clearly, it would be a good idea for the CA to verify that the claimed public key really is the current public key of an untampered device in the appropriate family.)

This technique also frees the CA from depending forever on a single keypair, key length, or even cryptosystem. Fig. 6 illustrates this variation. Again, for stronger forward integrity, implementations could combine this technique with expiration dates.

7 An atomic change is one that happens entirely, or not at all, despite failures and interruptions. Atomicity for complex configuration-changing operations is an important property.
8 To avoid denial-of-service attacks, Miniboot treats recertification as a privileged command requiring authentication, like code loading.
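The regeneration step of Section 5.2.1, and the way transition certificates chain back to the initial device certificate, can be sketched as follows. This is a minimal illustration under stated assumptions, not the device's implementation: it uses Ed25519 from the third-party Python 'cryptography' package as a stand-in for the device's RSA/DSA keypair, and the TransitionCert fields and function names are hypothetical.

    # Minimal sketch of keypair regeneration and transition-certificate chaining.
    from dataclasses import dataclass
    from cryptography.hazmat.primitives.asymmetric.ed25519 import (
        Ed25519PrivateKey, Ed25519PublicKey)
    from cryptography.hazmat.primitives import serialization

    def pub_bytes(pub: Ed25519PublicKey) -> bytes:
        return pub.public_bytes(serialization.Encoding.Raw,
                                serialization.PublicFormat.Raw)

    @dataclass
    class TransitionCert:
        new_public: bytes   # the new device public key
        reason: bytes       # e.g. b"scheduled regeneration"
        signature: bytes    # signed by the OLD device private key

    class Device:
        def __init__(self):
            self.private = Ed25519PrivateKey.generate()   # generated inside the card
            self.transitions: list[TransitionCert] = []   # chain since initialization

        def regenerate(self, reason: bytes) -> None:
            new_private = Ed25519PrivateKey.generate()
            new_pub = pub_bytes(new_private.public_key())
            cert = TransitionCert(new_pub, reason,
                                  self.private.sign(new_pub + reason))
            # Atomically: delete the old private key, adopt the new pair and cert.
            self.private = new_private
            self.transitions.append(cert)

    def current_public_key(initial_public: bytes,
                           transitions: list[TransitionCert]) -> bytes:
        """Walk the transition chain from the CA-certified initial public key
        to the current device public key, verifying each signature."""
        pub = initial_public
        for t in transitions:
            Ed25519PublicKey.from_public_bytes(pub).verify(
                t.signature, t.new_public + t.reason)   # raises on failure
            pub = t.new_public
        return pub

Recertification (Section 5.2.2) would then correspond to the CA issuing a fresh certificate for the value returned by current_public_key and discarding the transition list.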

Fig. 4. The device may regenerate its internal keypair, and atomically create a transition certificate for the new public key signed with the old private key.

5.2.3. Revival

Scenarios arise where the tamper detection circuitry in a device has zeroized its secrets, but the device is otherwise untampered. As Section 4 discusses, certain environmental changes – such as cold storage or bungled battery removal – trigger tamper response in our design, since otherwise these changes would provide an avenue for undetected tamper. Such scenarios are arguably inevitable in many tamper-response designs – since a device cannot easily wait to see if a tamper attempt is successful before responding.

Satisfying an initial commercial constraint of 'save hardware whenever possible' requires a way of reviving such a zeroized but otherwise untampered device. However, such a revival procedure introduces a significant vulnerability: how do we distinguish between a zeroized but untampered device, and a tampered device? Fig. 7 illustrates this problem. How do we perform this authentication?

Fig. 5. The CA can recertify a device, by replacing its current device certificate and transition certificate sequence with a new device certificate, certifying the latest public key.

Fig. 6. The CA can use device recertification in order to avoid depending forever on the same keypair.

As discussed earlier, we cannot rely on physical evidence to determine whether a given card is untampered – since we fear that a dedicated, well-funded adversary could modify a device (e.g., by changing the internal FLASH components) and then re-assemble it sufficiently well that it passes direct physical inspection. Indeed, the need for factory-initialization was driven by this concern:

We can only rely on secrets in tamper-protected secure memory to distinguish a real device from a tampered device.

Indeed, the problem is basically unsolvable – how can we distinguish an untampered but zeroized card from a tampered reconstruction, when, by definition, every aspect of the untampered card is visible to a dedicated adversary?

To accommodate both the commercial and security constraints, our architecture compromises:
• Revival is possible. We provide a way for a trusted authority to revive an allegedly untampered but zeroized card, based on authentication via non-volatile, non-zeroizable 'secrets' stored inside a particular device component. Clearly, this technique is risky, since a dedicated adversary can obtain a device's revival secrets via destructive analysis of the device, and then build a fake device that can spoof the revival authority.
• Revival is safe. To accommodate this risk, we force revival to atomically destroy all secrets within a device, and to leave it without a certified private key. A trusted CA must then re-initialize the device, before the device can 'prove' itself genuine. This initialization requires the creation of a new device certificate, which provides the CA with an avenue to explicitly indicate the card has been revived (e.g., "if it produces signatures that verify against Device Public Key N, then it is allegedly a real, untampered device that has undergone revival – so beware"). Thus, we prevent a device that has undergone this risky procedure from impersonating an untampered device that has never been zeroized and revived.

Fig. 7. Tamper response zeroizes the secrets in an initialized device, and leaves either an untampered but zeroized device, or a tampered device. A procedure to revive a zeroized device must be able to distinguish between the two, or else risk introducing tampered devices back into the pool of allegedly untampered ones.

Furthermore, given the difficulty of effectively authenticating an untampered but zeroized card, and the potential risks of a mistake, the support team for the commercial product has decided not to support this option in practice.

5.3. Trusting the manufacturer

A discussion of untamperedness leads to the question: why should the user trust the manufacturer of the device? Considering this question gives rise to three sets of issues.
• Contents. Does the black box really contain the advertised circuits and firmware? The paranoid user can verify this probabilistically by physically opening and examining a number of devices. (The necessary design criteria and object code listings could be made available to customers under special contract.)
• CA Private Key. Does the factory CA ever certify bogus devices? Such abuse is a risk with any public-key hierarchy. But, the paranoid user can always establish their own key hierarchy, and then design applications that accept as genuine only those devices with a secondary certificate from this alternate authority.
• Initialization. Was the device actually initialized in the advertised manner? Given the control a manufacturer might have, it is hard to see how we can conclusively establish that the initialization secrets in a card are indeed relics of the execution of the correct code. However, the cut-and-examine approach above can convince a paranoid user that the key creation and management software in an already initialized device is genuine. This assurance, coupled with the regeneration technique of Section 5.2.1 above, provides a solution for the paranoid user: causing their device to regenerate after shipment gives it a new private key that must have been produced in the advertised safe fashion.

6. Defending against software threats

6.1. Motivation

Section 4 discussed how we ensure that the core secrets are zeroized upon physical attack, and Section 5 discussed how we ensure that they were secret to begin with. However, these techniques still leave an exposure: did the device secrets remain secret throughout operation?

For example, suppose a few months after release, some penetration specialists discover a hole in the OS that allows untrusted user code to execute with full supervisor privilege. Our code loading protocol (Section 8) allows us to ship out a patch, and a device installing this patch can sign a receipt with its private key.

One might suspect verifying this signature would imply the hole has been patched in that device. Unfortunately, this conclusion would be wrong: a hole that allows untrusted code full privileges would also grant it access to the private key – that is, without additional hardware countermeasures. This section discusses the countermeasures we use.

6.2. Software threat model

This risk is particularly dire in light of the commercial constraints of multiple layers of complex software, from multiple authorities, remotely installed and updated in hostile environments. History shows that complex systems are, quite often, permeable. Consequently, we address this risk by assuming that all rewritable software in the device may behave arbitrarily badly.

Drawing our defense boundary here frees us from the quagmire 9 of having low-level miniboot code evaluate incoming code for safety. It also accommodates the wishes of system software designers who want full access to 'Ring 0' in the underlying CPU architecture.

Declaring this assumption often raises objections from systems programmers. We pro-actively raise some counterarguments. First, although all code loaded into the device is somehow 'controlled,' we need to accommodate the pessimistic view that 'controlled software' means, at best, good intentions. Second, although an OS might provide two levels of privilege, history 10 is full of examples where low-level programs usurp higher-level privileges. Finally, as implementers ourselves, we need to acknowledge the very real possibility of error by accommodating mistakes as well as malice.

9 With the advent of Java, preventing hostile downloaded code from damaging a system has (again) become a popular topic. Our architecture responds to this challenge by allowing 'applets' to do whatever they want – except they can neither access critical authentication secrets, nor alter critical code (which includes the code that can access these secrets). Furthermore, these restrictions are enforced by hardware, independent of the OS and CPU.
10 For examples, consult the on-line archives of the Computer Emergency Response Team at Carnegie Mellon University.

6.3. Hardware access locks

In order to limit the abilities of rogue but privileged software, we use hardware locks: independent circuitry that restricts the activities of code executing on the main CPU. We chose to use a simple hardware approach for several reasons, including:
• We cannot rely on the device operating system, since we do not know what it will be – and a corrupt or faulty OS might be what we need to defend against.
• We cannot rely on the protection rings of the device CPU, because the rewritable OS and Miniboot layers require maximal CPU privilege.

Fig. 1 shows how the hardware locks fit into the overall design: the locks are independent devices that can interact with the main CPU, but control access to the FLASH and to BBRAM.

However, this approach raises a problem. Critical memory needs protection from bad code. How can our simple hardware distinguish between good code and bad code? We considered and discarded two options:
• False Start: Good code could write a password to the lock. Although this approach simplifies the necessary circuitry, we had doubts about effectively hiding the passwords from rogue software.
• False Start: The lock determines when good code is executing by monitoring the address during instruction fetches. This approach greatly complicates the circuitry. We felt that correct implementation would be difficult, given the complexities of instruction fetching in modern CPUs, and the subtleties involved in detecting not just the address of an instruction, but the context in which it is executed. For example, it is not sufficient merely to recognize that a sequence of instructions came from the address range for privileged code; the locks would have to further distinguish between:
  - these instructions, executing as privileged code;
  - these instructions, executing as a subroutine, called by unprivileged code;
  - these instructions, executing as privileged code, but with a sabotaged interrupt table.

6.3.1. Solution: time-based ratchet

We finally developed a lock approach based on the observation that reset (a hardware signal that causes all device circuitry to return to a known state) forces the device CPU to begin execution from a fixed address in ROM: known, trusted, permanent code. As execution proceeds, it passes through a non-repeating sequence of code blocks with different levels of trust, permanence, and privilege requirements. Fig. 8 illustrates this sequence:
• Reset starts Miniboot 0, from ROM;
• Miniboot 0 passes control to Miniboot 1, and never executes again.
• Miniboot 1 passes control to the OS, and never executes again.
• The OS may perform some start-up code.
• While retaining supervisor control, the OS may then execute application code.
• The application (executing under control of the OS) may itself do some start-up work, then (potentially) incur dependence on less trusted code or input.

Our lock design models this sequence with a trust ratchet, currently represented as a nonnegative integer. A small microcontroller stores the ratchet value in a register. Upon hardware reset, the microcontroller resets the ratchet to 0; through interaction with the device CPU, the microcontroller can advance the ratchet – but can never turn it back. As each block finishes its execution, it advances the ratchet to the next appropriate value. (Our implementation also enforces a maximum ratchet value, and ensures that the ratchet cannot be advanced beyond this value.) Fig. 8 also illustrates how this trust ratchet models the execution sequence.
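A minimal software model of this ratchet behavior may help: the value starts at 0 on hardware reset and can only move forward. The class below and its maximum value are illustrative assumptions, not the actual lock firmware.

    # Sketch of the trust ratchet of Section 6.3.1: advance-only, reset only by
    # a full hardware reset (modeled here by constructing a new object).
    class TrustRatchet:
        MAX_RATCHET = 4   # assumed maximum; the real limit is set by the firmware

        def __init__(self):
            self.value = 0                      # hardware reset forces ratchet to 0

        def advance(self, new_value: int) -> None:
            if new_value <= self.value:
                raise ValueError("ratchet can never be turned back")
            if new_value > self.MAX_RATCHET:
                raise ValueError("ratchet cannot advance beyond its maximum")
            self.value = new_value

    # Each code block advances the ratchet before passing control onward:
    ratchet = TrustRatchet()
    ratchet.advance(1)   # Miniboot 0 hands off to Miniboot 1
    ratchet.advance(2)   # Miniboot 1 hands off to the OS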

Fig. 8. Hardware reset forces the CPU to begin executing Miniboot 0 out of ROM; execution then proceeds through a non-repeating sequence of phases, determined by code and context. Hardware reset also forces the trust ratchet to zero; each block of code advances the ratchet before passing control to the next block in the sequence. However, no code block can decrement the ratchet.

The microcontroller then grants or refuses memory accesses, depending on the current ratchet value.

6.3.2. Decreasing trust

The effectiveness of this trust ratchet critically depends on two facts:
• The code blocks can be organized into a hierarchy of decreasing privilege levels (e.g., like classical work in protection rings [16] or lattice models of information flow [5,10]).
• In our software architecture, these privilege levels strictly decrease in real time!

This time sequencing, coupled with the independence of the lock hardware from the CPU and the fact that the hardware design (and its physical encapsulation) forces any reset of the locks to also reset the CPU, give the ratchet its power:
• The only way to get the most-privileged level ('Ratchet 0') is to force a hardware reset of the entire system, and begin executing Miniboot 0 from a hardwired address in ROM, in a known state.
• The only way to get a non-maximal privilege level ('Ratchet N,' for N > 0) is to be passed control by code executing at an earlier, higher-privileged ratchet level.
• Neither rogue software (nor any other software) can turn the ratchet back to an earlier, higher-privileged level – short of resetting the entire system.

The only avenue for rogue software at Ratchet N to steal the privileges of ratchet K < N would be to somehow alter the software that executes at ratchet K or earlier. (However, as Section 7.2 will show, we use the ratchet to prevent these attacks as well.)

6.3.3. Generalizations

Although this discussion used a simple total order on ratchet values, nothing prevents using a partial order. Indeed, as Section 7.2 discusses, our initial implementation of the microcontroller firmware does just that, in order to allow for some avenues for future expansion.

6.4. Privacy and integrity of secrets

The hardware locks enable us to address the challenge of Section 6.1: how do we keep rogue software from stealing or modifying critical authentication secrets? We do this by establishing protected pages: 11 regions of battery-backed RAM which are locked once the ratchet advances beyond a certain level. The hardware locks can then permit or deny write access to each of these pages – rogue code might still issue a read or write to that address, but the memory device itself will never see it.

Table 1 illustrates the access policy we chose: each Ratchet level R (for 0 ≤ R ≤ 3) has its own protected page, with the property that Page P can only be read or written in ratchet level R ≤ P.

11 The term 'page' here refers solely to a particular region of BBRAM – and not to special components of any particular CPU memory architecture.

Table 1
Hardware locks protect the privacy and integrity of critical secrets (READ, WRITE = read and write allowed)

                    Ratchet 0      Ratchet 1      Ratchet 2       Ratchet 3                Ratchet 4
                    (Miniboot 0)   (Miniboot 1)   (OS start-up)   (Application start-up)   (Application)
Protected page 0    READ, WRITE    NO ACCESS      NO ACCESS       NO ACCESS                NO ACCESS
Protected page 1    READ, WRITE    READ, WRITE    NO ACCESS       NO ACCESS                NO ACCESS
Protected page 2    READ, WRITE    READ, WRITE    READ, WRITE     NO ACCESS                NO ACCESS
Protected page 3    READ, WRITE    READ, WRITE    READ, WRITE     READ, WRITE              NO ACCESS
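The policy of Table 1 reduces to a single comparison, stated explicitly in the sketch below; the function name and examples are ours, for illustration only.

    # Illustrative check of the BBRAM access policy in Table 1: protected page P
    # is readable and writable only while the ratchet value R satisfies R <= P.
    def bbram_page_access_allowed(ratchet: int, page: int) -> bool:
        """True if code running at this ratchet level may read or write this page."""
        return 0 <= ratchet <= page <= 3

    # Example: Miniboot 1 runs at ratchet 1, so it can touch pages 1-3 but not page 0.
    assert bbram_page_access_allowed(ratchet=1, page=2)
    assert not bbram_page_access_allowed(ratchet=1, page=0)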

We use lockable BBRAM (LBBRAM) to refer to the portion of BBRAM consisting of the protected pages. (As with all BBRAM in the device, these regions preserve their contents across periods of no power, but zeroize their contents upon tamper.) Currently, these pages are used for outgoing authentication (Section 10); Page 0 also holds some secrets used for ROM-based loading (Section 8).

We partition the remainder of BBRAM into two regions: one belonging to the OS exclusively, and one belonging to the application. Within this non-lockable BBRAM, we expect the OS to protect its own data from the application's.

7. Code integrity

The previous sections presented how our architecture ensures that secrets remain accessible only to allegedly trusted code, executing on an untampered device. To be effective, our architecture must integrate these defenses with techniques to ensure that this executing code really is trusted.

This section presents how we address the problem of code integrity:
• Sections 7.1 and 7.2 describe how we defend code against being formally modified, except through the official code loading procedure.
• Sections 7.3 and 7.4 describe how we defend against modifications due to other types of failures.
• Section 7.5 summarizes how we knit these techniques together to ensure the device securely boots.

Note that although our long-term vision of the software architecture (Fig. 3) includes simultaneously resident sibling applications and dynamically-loaded applications, this section confines itself to our current implementation, of one application, resident in FLASH.

7.1. Loading and cryptography

We confine to Miniboot the tasks of deciding and carrying out alteration of code layers. Although previous work considered a hierarchical approach to loading, our commercial requirements (multiple-layer software, controlled by mutually suspicious authorities, updated in the hostile field, while sometimes preserving state) led to trust scenarios that were simplified by centralizing trust management.

Miniboot 1 (in rewritable FLASH) contains code to support public-key cryptography and hashing, and carries out the primary code installation and update tasks – which include updating itself.

Miniboot 0 (in boot-block ROM) contains primitive code to perform DES using the DES-support hardware, and uses secret-key authentication [19] to perform the emergency operations necessary to repair a device whose Miniboot 1 does not function. (Section 8 will discuss the protocols Miniboot uses.)

7.2. Protection against malice

As experience in vulnerability analysis often reveals, practice often deviates from policy.

Table 2
The hardware locks protect the integrity of critical FLASH segments (READ, WRITE = read and write allowed; READ ONLY = read allowed, write prohibited)

                                                   Ratchet 0      Ratchet 1      Ratchet 2       Ratchet 3                Ratchet 4
                                                   (Miniboot 0)   (Miniboot 1)   (OS start-up)   (Application start-up)   (Application)
Protected segment 1 (Miniboot 1)                   READ, WRITE    READ, WRITE    READ ONLY       READ ONLY                READ ONLY
Protected segment 2 (Operating System/Control Program)  READ, WRITE    READ, WRITE    READ ONLY       READ ONLY                READ ONLY
Protected segment 3 (Application)                  READ, WRITE    READ, WRITE    READ ONLY       READ ONLY                READ ONLY

layers' does not necessarily imply that `the contents unless it modifies Miniboot 1. But this is not possi- of code layers are always changed in accordance ble ± in order to modify Miniboot 1, an adversary with the design of Miniboot, as installed'. For exam- has to either alter ROM, or already have altered ple: Miniboot 1.Ž Note these safeguards apply only in the Ø Without sufficient countermeasures, malicious realm of attacks that do not result in zeroizing the code might itself rewrite code layers. device. An attacker could bypass all these defenses Ø Without sufficient countermeasures, malicious by opening the device and replacing the FLASH code might rewrite the Miniboot 1 code layer, and components ± but we assume that the defenses of cause Miniboot to incorrectly `maintain' other Section 4 would ensure that such an attack would layers. trigger tamper detection and response.. To ensure that practice meets policy, we use the In order to permit changing to a hierarchical trust ratchetŽ. Section 6 to guard rewriting of the approach without changing the hardware design, the code layers in rewritable FLASH. We group sets of currently implemented lock firmware permits Ratchet X FLASH sectors into protected segments, one 12 for 1 to advance instead to a Ratchet 2 , that acts like each rewritable layer of code. The hardware locks Ratchet 2, but permits rewriting of Segment 3. Es- can then permit or deny write access to each of these sentially, our trust ratchet, as implemented, is al- segments ± rogue code might still issue a write to ready ranging over a non-total partial order. that address, but the memory device itself will never see it. 7.3. Protection against reburn failure Table 2 illustrates the write policy we chose for protected FLASH. We could have limited Ratchet 0 In our current hardware implementation, multiple Ž write-access to Segment 1 alone since in practice, FLASH sectors make up one protected segment. . Miniboot 0 only writes Miniboot 1 . However, it Nevertheless, we erase and rewrite each segment as a makes little security sense to withhold privileges whole, in order to simplify data structures and to from earlier, higher-trust ratchet levels ± since the accommodate future hardware with larger sectors. earlier-level code could always usurp these privi- This decision leaves us open to a significant risk: leges by advancing the ratchet without passing con- a failure or power-down might occur during the trol. non-zero time interval between the time Miniboot As a consequence of applying hardware locks to starts erasing a code layer to be rewritten, and the FLASH, malicious code cannot rewrite code layers time that the rewrite successfully completes. This risk gets even more interesting, in light of the fact that rewrite of a code layer may also involve changes 12 Again, the term `segment' is used here solely to denote to to other state variables and LBBRAM fields. these sets of FLASH sectors ± and not to special components of a When crafting the design and implementation, we CPU memory architecture. followed the rule that the system must remain in a 848 S.W. Smith, S. WeingartrComputer Networks 31() 1999 831±860 safe state no matter what interruptions occur during executables, the adversary must get write-access to operations. This principle is especially relevant to the that FLASH segment ± in which case, the adversary process of erasing and reburning software resident in also has write-access to the checksum, so his exhaus- FLASH. tive search was unnecessary. 
7.3. Protection against reburn failure

In our current hardware implementation, multiple FLASH sectors make up one protected segment. Nevertheless, we erase and rewrite each segment as a whole, in order to simplify data structures and to accommodate future hardware with larger sectors.
This decision leaves us open to a significant risk: a failure or power-down might occur during the non-zero time interval between the time Miniboot starts erasing a code layer to be rewritten, and the time that the rewrite successfully completes. This risk gets even more interesting in light of the fact that rewrite of a code layer may also involve changes to other state variables and LBBRAM fields.
When crafting the design and implementation, we followed the rule that the system must remain in a safe state no matter what interruptions occur during operations. This principle is especially relevant to the process of erasing and reburning software resident in FLASH.
• Since Miniboot 1 carries out loading and contains the public-key crypto support, we allocate two regions for it in FLASH Segment 1, so that the old copy exists and is usable up until the new copy has been successfully installed. This approach permits using Miniboot 1 for public-key-based recovery from failures during Miniboot 1 updates.
• When reburning the OS or an application, we temporarily demote its state, so that on the next reset after a failed reburn, Miniboot recognizes that the FLASH layer is now unreliable, and cleans up appropriately.
For more complex transitions, we extend this approach: all changes atomically succeed together, or fail either back to the original state, or to a safe intermediate failure state.
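A minimal sketch of the "demote before erasing" discipline for the OS and application layers follows. The Layer class, the state names, and the simulated power failure are our own illustrative assumptions, not the product firmware.

```python
class Layer:
    def __init__(self, image):
        self.image = image
        self.state = "RUNNABLE"          # simplified: RUNNABLE or UNRELIABLE

def reburn(layer, new_image, fail_midway=False):
    # Demote the layer *before* touching FLASH: if power fails during the erase or
    # rewrite, the next boot finds an UNRELIABLE layer and cleans up appropriately.
    layer.state = "UNRELIABLE"
    layer.image = None                   # erase the whole protected segment
    if fail_midway:
        return                           # interruption: the layer stays demoted
    layer.image = new_image              # rewrite completes...
    layer.state = "RUNNABLE"             # ...and only then is the layer promoted again

os_layer = Layer("os-v1")
reburn(os_layer, "os-v2", fail_midway=True)
assert os_layer.state == "UNRELIABLE"    # Miniboot would detect this on the next reset
reburn(os_layer, "os-v2")
assert os_layer.state == "RUNNABLE"
```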
7.4. Protection against storage errors

Hardware locks on FLASH protect the code layers from being rewritten maliciously. However, bits in FLASH devices (even in boot block ROM) can change without being formally rewritten - due to the effects of random hardware errors in these bits themselves.
To protect against spurious errors, we include a 64-bit DES-based MAC with each code layer. Miniboot 0 checks itself before proceeding; Miniboot 0 checks Miniboot 1 before passing control; Miniboot 1 checks the remaining segments. The use of a 64-bit MAC from CBC-DES was chosen purely for engineering reasons: it gave a better chance at detecting errors over datasets the size of the protected segments than a single 32-bit CRC, and was easier to implement (even in ROM, given the presence of DES hardware) than more complex CRC schemes.
We reiterate that we do not rely solely on single-DES to protect code integrity. Rather, our use of DES as a checksum is solely to protect against random storage errors in a write-protected FLASH segment. An adversary might exhaustively find other executables that also match the DES MAC of the correct code; but in order to do anything with these executables, the adversary must get write-access to that FLASH segment - in which case, the adversary also has write-access to the checksum, so his exhaustive search was unnecessary.

7.5. Secure bootstrapping

To ensure secure bootstrapping, we use several techniques together:
• The hardware locks on FLASH keep rogue code from altering Miniboot or other code layers.
• The loading protocols (Section 8) keep Miniboot from burning adversary code into FLASH.
• The checksums keep the device from executing code that has randomly changed.
If an adversary can cause (e.g., through radiation) extensive, deliberate changes to a FLASH layer so that it still satisfies the checksum it stores, then he can defeat these countermeasures. However, we believe that the physical defenses of Section 4 would keep such an attack from being successful:
• The physical shielding in the device would make it nearly impossible to produce such carefully focused radiation.
• Radiation sufficiently strong to alter bits should also trigger tamper response.
Consequently, securely bootstrapping a custom-designed, tamper-protected device is easier than the general problem of securely bootstrapping a general-purpose, exposed machine (e.g., Refs. [4,9,26]).

7.5.1. Execution sequence

Our boot sequence follows from a common-sense assembly of our basic techniques. Hardware reset forces execution to begin in Miniboot 0 in ROM.
Miniboot 0 begins with Power-on Self Test 0 (POST0), which evaluates the hardware required for the rest of Miniboot 0 to execute. Miniboot 0 verifies the MACs for itself and Miniboot 1. If an external party presents an alleged command for Miniboot 0 (e.g., to repair Miniboot 1, Section 8), Miniboot 0 will evaluate and respond to the request, then halt. Otherwise Miniboot 0 advances the trust ratchet to 1, and (if Layer 1 is reliable) jumps to Miniboot 1.
Except for some minor, non-secret device-driver parameters, no DRAM state is saved across the Miniboot 0 to Miniboot 1 transition. (In either Miniboot, any error or stateful change causes it to halt, in order to simplify analysis. Interrupts are disabled.)
Miniboot 1 begins with POST1, which evaluates the remainder of the hardware. Miniboot 1 also verifies MACs for Layers 2 and 3. If an external party presents an alleged command for Miniboot 1 (e.g., to reload Layer 2), Miniboot 1 will evaluate and respond to the request, then halt. Otherwise Miniboot 1 advances the trust ratchet to 2, and (if Layer 2 is reliable) jumps to Layer 2, the OS.
The OS then proceeds with its bootstrap. If the OS needs to protect data from an application that may find holes in the OS, the OS can advance the trust ratchet to 3 before invoking Layer 3 code. Similarly, the application can advance the ratchet further, if it needs to protect its private data.
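The execution sequence just described can be condensed into the following sketch. It is a simplification under our own naming assumptions - the dict-based layer records, the mac_ok stub standing in for the CBC-DES check, and treating an external command simply as "addressed to Miniboot 0 or 1" - and is not the shipped firmware.

```python
def mac_ok(layer):
    return layer["mac_valid"]            # stand-in for the 64-bit CBC-DES MAC check

def boot(layers, command_for=None):
    """layers: {0: Miniboot 0 (ROM), 1: Miniboot 1, 2: OS, 3: application}."""
    ratchet = 0                          # hardware reset begins in Miniboot 0, Ratchet 0
    if not (mac_ok(layers[0]) and mac_ok(layers[1])):    # POST0 + MAC checks
        return "halt: Miniboot integrity failure"
    if command_for == 0:                 # e.g., a request to repair Miniboot 1
        return "halt after Miniboot 0 evaluates and responds to the command"
    ratchet = 1                          # advance the trust ratchet
    if not layers[1]["reliable"]:
        return "halt: Layer 1 unreliable"
    # Miniboot 1: POST1, then MAC checks on the remaining segments.
    layers[2]["reliable"] = layers[2]["reliable"] and mac_ok(layers[2])
    layers[3]["reliable"] = layers[3]["reliable"] and mac_ok(layers[3])
    if command_for == 1:                 # e.g., a request to reload Layer 2
        return "halt after Miniboot 1 evaluates and responds to the command"
    ratchet = 2
    if not layers[2]["reliable"]:
        return "halt: Layer 2 unreliable"
    return f"jump to the OS at ratchet {ratchet}; OS and application may advance it further"

layers = {i: {"mac_valid": True, "reliable": True} for i in range(4)}
print(boot(layers))
```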
8. Code loading

8.1. Overview

One of the last remaining pieces of our architecture is the secure installation and update of trusted code.
In order to accommodate our overall goal of enabling widespread development and deployment of secure coprocessor applications, we need to consider the practical aspects of this process. We review the principal constraints:
• Shipped empty. In order to minimize variations of the hardware and to accommodate US export regulations, it was decided that all devices would leave the factory with only the minimal software configuration (Miniboot only). The manufacturer does not know at ship time (and may perhaps never know later) where a particular device is going, and what OS and application software will be installed on it. (Our design and implementation actually accommodate any level of pre-shipment configuration, should this decision change.)
• Impersonal broadcast. To simplify the process of distributing code, the code-loading protocol should permit the process to be one-round (from authority to device), be impersonal (the authority does not need to customize the load for each device), and have the ability to be carried out on a public network.
• Updatable. As discussed in Section 2.1, we need to be able to update code already installed in devices.
• Minimal disruption. An emphatic customer requirement was that, whenever reasonable and desired, application state be preserved across updates.
• Recoverable. We need to be able to recover an untampered device from failures in its rewritable software - which may include malicious or accidental bugs in the code, as well as failures in the FLASH storage of the code, or interruption of an update.
• Loss of cryptography. The complexity of public-key cryptography and hashing code forced it to reside in a rewritable FLASH layer - so the recoverability constraint also implies secure recoverability without these abilities.
• Mutually suspicious, independent authorities. In any particular device, the software layers may be controlled by different authorities who may not trust each other, and may have different opinions and strategies for software update.
• Hostile environments. We can make no assumptions about the user machine itself, or the existence of trusted couriers or trusted security officers.
To address these constraints, we developed and followed some guidelines:
• We make sure that Miniboot keeps its integrity, and that only Miniboot can change the other layers.
• We ensure that the appropriate authorities can obtain and retain control over their layers - despite changes to underlying, higher-trust layers.
• We use public-key cryptography whenever possible.
Section 8.2 below outlines who can be in charge of installing and changing code. Section 8.3 discusses how a device can authenticate them. Section 8.4 discusses how an 'empty' card in the hostile field can learn who is in charge of its code layers. Sections 8.5 and 8.6 discuss how the appropriate authorities can authorize code installations and updates. Section 8.7 summarizes software configuration management for devices. Section 8.8 illustrates the development process with a simple example.

Fig. 9. Authorities over software segments are organized into a tree.

8.2. Authorities

As Fig. 9 illustrates, we organize software authorities - parties who might authorize the loading of new software - into a tree. The root is the sole owner of Miniboot; the next generation are the authorities of different operating systems; the next are the authorities over the various applications that run on top of these operating systems. We stress that these parties are external entities, and apply to the entire family of devices, not just one.
Hierarchy in software architecture implies dependence of software. The correctness and security of the application layer depends on the correctness and security of the operating system, which in turn depends on Miniboot 1, which in turn depends on Miniboot 0. (This relation was implied by the decreasing privileges of the trust ratchet.)
Similarly, hierarchy in the authority tree implies dominance: the authority over Miniboot dominates all operating system authorities; the authority over a particular operating system dominates the authorities over all applications for that operating system.

8.3. Authenticating the authorities

Public-key authentication. Wherever possible, a device uses a public-key signature to authenticate a message allegedly from one of its code authorities. The public key against which this message is verified is stored in the FLASH segment for that code layer, along with the code and other parameters (see Fig. 10).
Using public-key signatures makes it possible to accommodate the 'impersonal broadcast' constraint. Storing an authority's public key along with the code, in the FLASH layer owned by that authority, enables the authority to change its keypair over time, at its own discretion. (Adding expiration dates and revocation lists would provide greater forward integrity.)
However, effectively verifying such a signature requires two things:
• the code layer is already loaded and still has integrity (so the device actually knows the public key to use); and
• Miniboot 1 still functions (so the device knows what to do with this public key).
These facts create the need for two styles of loading:
• ordinary loading, when these conditions both hold; and
• emergency loading, when at least one fails.

Fig. 10. Sketch of the contents of a code layer.
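As a rough illustration of what Fig. 10 implies a rewritable code layer carries, the record below is a sketch only; the field names and types are our assumptions, not the product's FLASH layout (for instance, the ownerID of Section 8.6.2 lives outside the layer itself).

```python
from dataclasses import dataclass, field

@dataclass
class CodeLayerContents:
    code: bytes                   # the executable image for this layer
    public_key: bytes             # current public key of the authority over the layer
    trust_params: dict = field(default_factory=dict)  # e.g. {"layer_1": "always"}
    mac: bytes = b""              # 64-bit CBC-DES MAC guarding against storage errors
```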

Secret-key authentication. The lack of public-key cryptography forces the device to use a secret-key handshake to authenticate communications from the Miniboot 0 authority. The shared secrets are stored in Protected Page 0, in LBBRAM.
Such a scheme requires that the authority share these secrets. Our scheme [19] reconciles this need with the no-databases requirement by having the device itself store a signed, encrypted message from the authority to itself. During factory initialization, the device itself generates the secrets and encrypts this message; the authority signs the message and returns it to the device for safekeeping. During authentication, the device returns the message to the authority.
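The flow of that escrowed message might look roughly like the sketch below. It is only an outline under our own assumptions: the asymmetric operations and the handshake are reduced to caller-supplied stubs, and HMAC-SHA256 is an arbitrary stand-in for whatever MAC the real handshake uses.

```python
import hashlib
import hmac
import os

def factory_init(encrypt_for_authority, authority_sign):
    secret = os.urandom(16)                  # the device itself generates the shared secret
    escrow = encrypt_for_authority(secret)   # readable only by the Miniboot 0 authority
    signed_escrow = authority_sign(escrow)   # the authority signs it and hands it back
    return secret, signed_escrow             # both kept on the device (Protected Page 0)

def authenticate(device_secret, signed_escrow, authority_recover_secret, challenge):
    # The device returns the stored message; the authority recovers the secret from it,
    # and both sides prove knowledge of that secret over a fresh challenge.
    recovered = authority_recover_secret(signed_escrow)
    device_tag = hmac.new(device_secret, challenge, hashlib.sha256).digest()
    authority_tag = hmac.new(recovered, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(device_tag, authority_tag)
```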

8.4. Ownership

Clearly, our architecture has to accommodate the fact that each rewritable code layer may have contents that are either reliable or unreliable. However, in order to provide the necessary configuration flexibility, the OS and application layers each have additional parameters, reflecting which external authority is in charge of them.
Our architecture addresses this need by giving each of these layers the state space sketched in Fig. 11:
• The code layer may be owned or unowned.
• The contents of an owned code layer may be reliable. However, some owned layers - and all unowned ones - are unreliable.
• A reliable code layer may actually be runnable. However, some reliable layers - and all unreliable ones - may be unrunnable.
This code is stored in EEPROM fields in the hardware lock, write-protected beyond Ratchet 1.
For 0 < N < 3, the authority over Layer N in a device can issue a Miniboot command giving an unowned Layer N+1 to a particular authority. For 2 ≤ N ≤ 3, the authority over Layer N can issue a command surrendering ownership - but the device can evaluate this command only if Layer N is currently reliable. (Otherwise, the device does not know the necessary public key.)

Fig. 11. State space of the OS and application code layers.

8.5. Ordinary loading

8.5.1. General scheme

Code Layer N, for 1 ≤ N ≤ 3, is rewritable. Under ordinary circumstances, the authority over layer N can update the code in that layer by issuing an update command signed by that authority's private key. This command includes the new code, a new public key for that authority (which could be the same as the old one, per that authority's key policy), and target information to identify the devices for which this command is valid. The device (using Miniboot 1) then verifies this signature directly against the public key currently stored in that layer. Fig. 12 sketches this structure.

Fig. 12. An ordinary load command for layer N consists of the new code, new public key, and trust parameters, signed by the authority over that layer; this signature is evaluated against the public key currently stored in that layer.

8.5.2. Target

The target data included with all command signatures allows an authority to ensure that their command applies only in an appropriate trusted environment. An untampered device will accept the signature as valid only if the device is a member of this set. (The authority can verify that the load 'took' via a signed receipt from Miniboot; see Section 10.)
For example, suppose an application developer determines that version 2 of a particular OS has a serious security vulnerability. Target data permits this developer to ensure that their application is loadable only on devices with version 3 or greater of that operating system.
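A hedged sketch of the ordinary-load check of Fig. 12 and the target test of Section 8.5.2 follows. Here verify() stands in for real signature verification, the record layout is our assumption, and the target test is reduced to a simple membership check.

```python
def ordinary_load(device, layer_n, command, verify):
    layer = device["layers"][layer_n]
    if not layer["reliable"]:
        return "reject: no trusted public key here - an emergency load is needed instead"
    signed_body = (command["new_code"], command["new_public_key"],
                   command["trust_params"], command["target"])
    if not verify(layer["public_key"], signed_body, command["signature"]):
        return "reject: signature does not verify against the key stored in this layer"
    if device["identity"] not in command["target"]:
        return "reject: this device is not in the command's target set"
    layer.update(code=command["new_code"],
                 public_key=command["new_public_key"],   # the authority may roll its keypair
                 trust_params=command["trust_params"],
                 reliable=True, runnable=True)
    return "accept: layer reloaded; its own secrets are preserved"
```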

8.5.3. Underlying updates

The OS has complete control over the application, and complete access to its secrets; Miniboot has complete control over both the OS and the application. This control creates the potential for serious backdoors. For example, can the OS authority trust that the Miniboot authority will always ship updates that are both secure and compatible? Can the application authority trust that the OS authority uses appropriate safeguards and policy to protect the private key he or she uses to authorize software upgrades?
To address these risks, we permit Authority N to include, when loading its code, trust parameters expressing how it feels about future changes to each rewritable layer K < N. For now, these parameters have three values: always trust, never trust, or trust only if the update command for K is countersigned by N.
As a consequence, an ordinary load of Layer N can be accompanied by, for N < M ≤ 3, a countersignature from Authority M, expressing compatibility. Fig. 13 sketches this structure.

Fig. 13. An ordinary load command for layer N can include an optional countersignature by the authority over a dependent layer M. This countersignature is evaluated against the public key currently stored in layer M.

8.5.4. Update policy

Trust parameters and countersignatures help us balance the requirements to support hot updates against the risks of dominant authorities replacing underlying code.
An ordinary reload of Layer N, if successful, preserves the current secrets of Layer N, and leaves Layer N runnable.
For N < M ≤ 3, an ordinary reload of Layer N, if successful, preserves the current secrets of Layer M if and only if Layer M had been reliable, and either:
• its trust parameter for N was always, or
• its trust parameter for N was countersigned, and a valid countersignature from M was included.
Otherwise, the secrets of M are atomically destroyed with the update.
An ordinary load of a layer always preserves that layer's secrets, because presumably an authority can trust their own private key.
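The rule just stated for preserving Layer M's secrets across an ordinary reload of an underlying Layer N can be captured in a few lines. This is a sketch under our own naming assumptions:

```python
def preserve_secrets_of_m(layer_m_reliable, trust_param_for_n, countersignature_valid):
    """trust_param_for_n is Layer M's recorded opinion of updates to Layer N:
    'always', 'never', or 'countersigned'."""
    if not layer_m_reliable:
        return False
    if trust_param_for_n == "always":
        return True
    if trust_param_for_n == "countersigned" and countersignature_valid:
        return True
    return False   # otherwise Layer M's secrets are atomically destroyed with the update

assert preserve_secrets_of_m(True, "always", False)
assert preserve_secrets_of_m(True, "countersigned", True)
assert not preserve_secrets_of_m(True, "never", True)
```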

8.6. Emergency loading

As Section 8.4 observes, evaluating Authority N's signature on a command to update Layer N requires that Layer N have reliable contents. Many scenarios arise where Layer N will not be reliable - including the initial load of the OS and application in newly shipped cards, and repair of these layers after an interruption during reburn.
Consequently, we require an emergency method to load code into a layer without using the contents of that layer. As Fig. 14 shows, an emergency load command for Layer N must be authenticated by Layer N-1. (As discussed below, our architecture includes countermeasures to eliminate the potential backdoors this indirection introduces.)

Fig. 14. Ordinary loading of code into a layer is directly authenticated by the authority over that layer (dashed arrows); emergency loading is directly authenticated by the authority underlying that layer (solid arrows).

8.6.1. OS, application layers

To emergency load the OS or Application layers, the authority signs a command similar to the ordinary load, but the authority underneath them signs a statement attesting to the public key. Fig. 15 illustrates this. The device evaluates the signature on this emergency certificate against the public key in the underlying segment, then evaluates the main signature against the public key in the certificate.
This two-step process facilitates software distribution: the emergency authority can sign such a certificate once, when the next-level authority first joins the tree. This process also isolates the code and activities of the next-level authority from the underlying authority.

Fig. 15. An emergency load command (for N = 2, 3) consists of the new code, new public key, and trust parameters, signed by the authority over that layer; and an emergency certificate signed by the authority over the underlying layer. The main signature is evaluated against the public key in the certificate; the certificate signature is evaluated against the public key stored in the underlying layer.

8.6.2. Risks of siblings

Burning a segment without using the contents of that segment introduces a problem: keeping an emergency load of one authority's software from overwriting installed software from a sibling authority. We address this risk by giving each authority an ownerID, assigned by the N-1 authority when establishing ownership for N (Section 8.4), and stored outside the code layer. The public-key certificate later used in the emergency load of N specifies the particular ownerID, which the device checks.

8.6.3. Emergency reloading of Miniboot

Even though we mirror Miniboot 1, recoverability still required that we have a way of burning it without using it, in order to recover from emergencies when the Miniboot 1 code layer does not function. Since we must use ROM only (and not Miniboot 1), we cannot use public-key cryptography, but instead use mutual authentication between the device and the Miniboot 0 authority, based on device-specific secret keys (see Section 8.3).

8.6.4. Closing the backdoors

Emergency loading introduces the potential for backdoors, since reloading Layer N does not require the participation of the authority over that segment. For example, an OS authority could, by malice or error, put anyone's public key in the emergency certificate for a particular application authority.
Since the device cannot really be sure that an emergency load for Layer N really came from the genuine Authority N, Miniboot enforces two precautions:
• It erases the current Layer N secrets but leaves the segment runnable from this clean start (since the alleged owner trusts it).
• It erases all secrets belonging to later layers, and leaves them unrunnable (since their owners cannot directly express trust of this new load; see Section 9).
These actions take place atomically, as part of a successful emergency load.
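Putting the two-step check of Fig. 15, the ownerID test of Section 8.6.2, and the precautions of Section 8.6.4 together, an emergency load might be sketched as follows; verify() and the record layout are illustrative assumptions, not the device's actual code.

```python
def emergency_load(device, layer_n, command, certificate, verify):
    underlying = device["layers"][layer_n - 1]
    # Step 1: the emergency certificate is checked against the *underlying* layer's key,
    # and must name the ownerID recorded when ownership of layer N was established.
    cert_body = (certificate["owner_id"], certificate["owner_public_key"])
    if not verify(underlying["public_key"], cert_body, certificate["signature"]):
        return "reject: certificate not signed by the underlying authority"
    if certificate["owner_id"] != device["owner_ids"][layer_n]:
        return "reject: certificate names a different (sibling) authority's ownerID"
    # Step 2: the main load command is checked against the key carried in the certificate.
    body = (command["new_code"], command["new_public_key"], command["trust_params"])
    if not verify(certificate["owner_public_key"], body, command["signature"]):
        return "reject: load command not signed by the certified owner key"
    # Precautions: clean start for layer N, and no surviving secrets in later layers.
    device["secrets"][layer_n] = None
    device["layers"][layer_n].update(code=command["new_code"],
                                     public_key=command["new_public_key"],
                                     trust_params=command["trust_params"],
                                     reliable=True, runnable=True)
    for later in range(layer_n + 1, 4):
        device["secrets"][later] = None
        device["layers"][later]["runnable"] = False
    return "accept: emergency load applied atomically"
```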

8.7. Summary

This architecture establishes individual commands for Authority N to:
• establish owner of Layer N+1,
• attest to the public key of Layer N+1,
• install and update code in Layer N,
• express opinions about the trustworthiness of future changes to Layer K < N.
Except for emergency repairs to Miniboot 1, all these commands are authenticated via public-key signatures, can occur over a public network, and can be restricted to particular devices in particular configurations.
Depending on how an authority chooses to control its keypairs and target its commands, these commands can be assembled into sequences that meet the criteria of Section 2.1. A separate report [20] explores some of the scenarios this flexibility enables.

8.8. Example code development scenario

We illustrate how this architecture supports flexible code development with a simple example.
Suppose Alice is in charge of Miniboot 1, and Bob wants to become a Layer 2 owner, in order to develop and release Layer 2 software on some cards. Bob generates his keypair, and gives a copy of his public key to Alice. Alice then does three things for Bob:
• She assigns Bob a 2-byte ownerID value that distinguishes him among all the other children of Alice. (Recall Fig. 9.)
• She signs an 'Establish Owner 2' command for Bob.
• She signs an 'Emergency Signature' for an 'Emergency Burn 2' saying that Owner 2 Bob has that public key. (Recall Fig. 15.)
Bob then goes away, writes his code, prepares the remainder of his 'Emergency Burn 2' command, and attaches the signature from Alice.
Now, suppose customer Carol wants to load Bob's program into Layer 2 on her card. She first buys a virgin device (which has an unowned Layer 2, but has Miniboot 1 and Alice's public key in Layer 1). Carol gets from Bob his 'Establish Owner 2' and 'Emergency Burn 2' commands, and plays them into her virgin card via Miniboot 1. It verifies Alice's signatures and accepts them. Layer 2 in Carol's card is now owned by Bob, and contains Bob's program and Bob's public key.
If Bob wants to update his code and/or keypair, he simply prepares an 'Ordinary Burn 2' command, and transmits it to Carol's card. Carol's card checks his signature on the update against the public key it has already stored for him.
Note that Bob never exposes to Alice his private key, his code, his pattern of updates, or the identity of his customers. Furthermore, if Bonnie is another Layer 2 developer, she shares no secrets with Bob, and updates for Bonnie's software will not be accepted by cards owned by Bob. The architecture also supports other variations in the installation/development process; for example, maybe Bob buys the cards himself, configures them, then ships them to Carol. (The case for Layer 3 developers is similar.)

9. Securing the execution

This section summarizes how our architecture builds on the above techniques to satisfy the security requirements of Section 2.2.1. (Although a formal proof is beyond the scope of this paper, we have completed a side-project to formally specify these properties, formally model the behavior of the system, and mechanically verify that the system preserves these properties [18].)

9.1. Control of software

Loading software in code Layer N in a particular device requires the cooperation of at least one current authority, over some 0 ≤ K ≤ N.
• From the code integrity protections of Section 8, the only way to change the software is through Miniboot.
• From the authentication requirements for software loading and installation (which Table 3 summarizes), any path to changing Layer N in the future requires an authenticated command from some K ≤ N (now).
• From the hardware locks protecting Page 0 (and the intractability assumptions underlying cryptography), the only way to produce this command is to access the private key store of that authority.

Table 3
Summary of authentication requirements for Miniboot commands affecting layer N

Miniboot command                  Authentication required
Establish owner of layer N        Authority N-1
Surrender owner of layer N        Authority N
Emergency load of layer N         Authority N-1
Emergency load of layer K < N     Authority K-1
Ordinary load of layer N          Authority N
Ordinary load of layer K < N      Authority K (plus trust from Authority N)
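Table 3 can be restated as a small lookup. The function below is purely illustrative (the names are ours), and the "plus trust from Authority N" qualifier on the last row is noted only as a comment.

```python
def required_authority(command, n, k=None):
    """Which authority must sign a Miniboot command affecting Layer n; for loads of an
    underlying layer, k < n names that layer."""
    if command == "establish_owner":
        return n - 1
    if command == "surrender_owner":
        return n
    if command == "emergency_load":
        return n - 1 if k is None else k - 1
    if command == "ordinary_load":
        return n if k is None else k   # plus trust parameters from Authority n (Table 3)
    raise ValueError("unknown Miniboot command")

assert required_authority("emergency_load", 3) == 2
assert required_authority("ordinary_load", 3, k=2) == 2
```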

9.2. Access to secrets

9.2.1. Policy

The multiple levels of software in the device are hierarchically dependent: the correct execution of the application depends on the correct execution of the operating system, which in turn depends on the correct execution of Miniboot. However, when considered along with the fact that these levels of software might be independently configured and updated by authorities who may not necessarily trust each other, this dependence gives rise to many risks.
We addressed these risks by formulating and enforcing a policy for secure execution:

    A program can run and accumulate state only while the device can continuously maintain a trusted execution environment for that program.

The execution environment includes both the underlying untampered device, as well as the code in this and underlying layers. The secrets of a code layer are the contents of its portion of BBRAM.
The authority responsible for a layer must do the trusting of that layer's environment - but the device itself has to verify that trust. To simplify implementation, we decided that changes to a layer's environment must be verified as trusted before the change takes effect, and that the device must be able to verify the expression of trust directly against that authority's public key.

9.2.2. Correctness

Induction establishes that our architecture meets the policy. Let us consider Layer N; the inductive assumption is that the device can directly verify that Authority N trusts the execution environment for Layer N.

9.2.3. Initial state

A successful emergency load of layer N leaves N in a runnable state, with cleared secrets. This load establishes a relationship between the device and a particular Authority N. The device can subsequently directly authenticate commands from this authority, since it now knows the public key.
This load can only succeed if the execution environment is deemed trustworthy, as expressed by the target information in Authority N's signature.

9.2.4. Run-time

During ordinary execution, secure bootstrapping (Section 7) and the hardware locks on LBBRAM (Section 6) ensure that only code currently in the execution environment can directly access Layer N's secrets - and by inductive assumption, Authority N trusts this software not to compromise these secrets.

9.2.5. Changes

The execution environment for Layer N can change due to reloads, to tamper, and to other failure scenarios. Our architecture preserves the Layer N secrets if and only if the change preserves the trust invariant. Table 4 summarizes how these changes affect the state of Layer N; Table 5 summarizes how the new state of Layer N affects the secrets of Layer N.
A runnable Layer N stops being runnable if the change in execution environment causes the inductive assumption to fail - unless this change was an emergency load of Layer N, in which case the Layer N secrets are cleared back to an initial state.
• Layer N becomes unowned if the environment changes in a way that makes it impossible for Authority N to express trust again: the device is tampered, or if Layer 1 (the public key code) becomes untrusted, or if Layer N-1 becomes unowned (so the ownerID is no longer uniquely defined).
• Layer N also becomes unowned if Authority N has explicitly surrendered ownership.
• Layer N becomes unreliable if its integrity fails. (Authority N can still express trust, but only indirectly, with the assistance of Authority N-1.)

Table 4
Summary of how the state of Layer N changes with changes to its execution environment

Action                                                           Transformation of Layer N state
Layer N fails checksum                                           RELIABLE becomes NOT RELIABLE
Layer 2-N is OWNED but NOT RUNNABLE                              NOT RUNNABLE
Layer 2-N is UNOWNED                                             UNOWNED
Layer 1 is NOT RELIABLE                                          UNOWNED
Device is ZEROIZED                                               UNOWNED
Establish owner of layer N                                       OWNED
Surrender owner of layer N                                       UNOWNED
Emergency load of layer N                                        RUNNABLE
Emergency load of layer K < N: K = 2                             NOT RUNNABLE
Emergency load of layer K < N: K = 1                             UNOWNED
Ordinary load of layer N                                         RUNNABLE
Ordinary load of layer K < N, trusted by Authority N             no change
Ordinary load of layer K < N, untrusted by Authority N: K = 2    NOT RUNNABLE
Ordinary load of layer K < N, untrusted by Authority N: K = 1    UNOWNED

• Otherwise, Layer N stops being runnable if an untrusted change occurred.
Layer N stays runnable only for three changes:
• An emergency load of Layer N.
• An ordinary reload of Layer N.
• An ordinary reload of Layer K < N, for which Authority N directly expressed trust by either signing an 'always trust K' trust parameter at last load of Layer N, or by signing a 'trust K if countersigned' parameter at last load of N, and signing a countersignature now.
Only the latter two changes preserve the trust invariant - and, as Table 5 shows, only these preserve the Layer N secrets.

Table 5
Summary of how changes to the state of Layer N change its secrets

Action                                         Transformation of Layer N secrets
Layer N is NOT RUNNABLE                        ZEROIZED
Layer N is RUNNABLE, via emergency load of N   Cleared to initial state
Layer N is RUNNABLE, otherwise                 PRESERVED

9.2.6. Implementation

Code that is already part of the trusted environment carries out the erasure of secrets and other state changes. In particular, the combined efforts of Miniboot 0 (permanently in ROM) and the Miniboot 1 currently in Layer 1 (hence already trusted) take care of the clean-up required by an authority that does not trust a new Miniboot 1 - despite failures during the load process.
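A compact, illustrative encoding of Tables 4 and 5 as lookups follows; the event names are our paraphrases of the table rows, not an exhaustive model of the firmware.

```python
def layer_n_state(event, k=None, current="RUNNABLE"):
    """New state of Layer N after a change to its execution environment (Table 4)."""
    if event in ("device_zeroized", "layer_1_not_reliable",
                 "underlying_layer_unowned", "surrender_owner_of_N"):
        return "UNOWNED"
    if event == "layer_N_fails_checksum":
        return "NOT RELIABLE"
    if event in ("emergency_load_of_K", "untrusted_ordinary_load_of_K"):
        return "UNOWNED" if k == 1 else "NOT RUNNABLE"
    if event == "establish_owner_of_N":
        return "OWNED"
    if event in ("emergency_load_of_N", "ordinary_load_of_N"):
        return "RUNNABLE"
    if event == "trusted_ordinary_load_of_K":
        return current                    # "no change"
    raise ValueError("unknown event")

def layer_n_secrets(new_state, event):
    """What happens to Layer N's BBRAM secrets (Table 5)."""
    if new_state != "RUNNABLE":
        return "ZEROIZED"
    return "cleared to initial state" if event == "emergency_load_of_N" else "PRESERVED"

assert layer_n_secrets(layer_n_state("ordinary_load_of_N"), "ordinary_load_of_N") == "PRESERVED"
```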

10. Authenticating the execution

10.1. The problem

The final piece of our security strategy involves the requirement of Section 2.2.2: how to authenticate computation allegedly occurring on an untampered device with a particular software configuration. (Section 8.3 explained how the device can authenticate the external world; this section explains how the external world can authenticate the device.)
It must be possible for a remote participant to distinguish between a message from the real thing, and a message from a clever adversary. This authentication is clearly required for distributed applications using coprocessors. As noted earlier, the e-wallet example of Yee [26] only works if it's the real wallet on a real device. But this authentication is also required even for more pedestrian coprocessor applications, such as physically secure high-end cryptographic modules. For example, a sloppy definition of 'secure' software update on crypto modules may require only that the appropriate authority be able to update the code in an untampered device. If a security officer has two devices, one genuine and one evilly modified, but can never distinguish between them, then it does not matter if the genuine one can be genuinely updated. This problem gets even worse if updates all occur remotely, on devices deployed in hostile environments.

10.2. Risks

Perhaps the most natural solution to authentication is to sign messages with the device private key that is established in initialization (Section 5) and erased upon tamper. However, this approach, on its own, does not address the threats introduced by the multi-level, updatable software structure. For example:
• Application threats. What prevents one application from signing messages claiming to be from a different application, or from the operating system or Miniboot? What prevents an application from requesting sufficiently many 'legitimate' signatures to enable cryptanalysis? What if an Internet-connected application has been compromised by a remote adversary?
• OS threats. If use of the device private key is to be available to applications in real-time, then (given the infeasibility of address-based hardware access control) protection of the key depends entirely on the operating system. What if the operating system has holes? We are back to the scenario of Section 6.1.
• Miniboot threats. An often-overlooked aspect of security in real distributed systems is the integrity of the cryptographic code itself. How can one distinguish between a good and corrupted version of Miniboot 1? Not only could a corrupt version misuse the device private key - it can also lie about who it is.
This last item is an instance of the more general versioning problem. As the software configuration supporting a particular segment changes over time, its trustworthiness in the eyes of a remote participant may change. If one does not consider the old version of the OS or the new version of an application to be trustworthy, then one must be able to verify that one is not talking to them. The authentication scheme must accommodate these scenarios.

10.3. Our solution

These risks suggest the need for decoupling between software levels, and between software versions. Our architecture carries out this strategy (although currently, we have only implemented the bottom level, for Layer 1).
As Section 5 explained, we build an internal key hierarchy, starting with the keypair certified for Miniboot 1 in a device at device initialization. This private key is stored in Page 1 in LBBRAM - so it is visible only to Miniboot 1. Our architecture has Miniboot 1 regenerate its keypair as an atomic part of each ordinary reload of Miniboot 1. The transition certificate includes identification of the versions of Miniboot involved. (As Section 8 discusses, each emergency reload of Miniboot 1 erases its private key - the authority who just carried out mutual secret-key authentication must then re-initialize the device.)
Similarly, as an atomic part of loading any higher Layer N (for N > 1), our architecture has the underlying Layer N-1 generate a new keypair for Layer N, and then certify the new public key and delete the old private key. This certification includes identification of the version of the code. Although Miniboot could handle the keys for everyone, our current plan is for Miniboot to certify the outgoing keypair for the operating system, and for the operating system to certify the keypair for the application - because this scheme more easily accommodates customer requirements for application options. The OS private key will be stored in Page 2 in LBBRAM.
Our approach thus uses two factors:
• Certification binds a keypair to the layers and versions of code that could have had access to the private key.
• The loading protocol along with the hardware-protected memory structure confines the private key to exactly those versions.
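The regeneration-and-certification chain of Section 10.3 is sketched below. This is an illustration only: key generation and signing are reduced to string placeholders (our assumption), and the chain dictionary is not the device's actual key store.

```python
import os

def new_keypair():
    seed = os.urandom(8).hex()
    return "pub-" + seed, "priv-" + seed       # placeholder for real key generation

def certify(signer_private_key, statement):
    return {"statement": statement, "signed_by": signer_private_key}  # placeholder signature

def reload_layer(chain, layer, version):
    """On any (re)load of `layer`, the layer below generates and certifies a fresh keypair
    that names the new code version; the old private key is deleted, so old code can
    never speak for new code."""
    _below_pub, below_priv, _cert = chain[layer - 1]
    pub, priv = new_keypair()
    cert = certify(below_priv, {"layer": layer, "version": version, "public_key": pub})
    chain[layer] = (pub, priv, cert)

# Miniboot 1's keypair, certified at device initialization, anchors the chain:
chain = {1: ("pub-mb1", "priv-mb1", "factory certificate")}
reload_layer(chain, 2, "os-v2")    # Miniboot certifies the OS keypair
reload_layer(chain, 3, "app-v7")   # the OS certifies the application keypair
```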

Fig. 16. Our outgoing authentication strategy requires that, in order to authenticate message M, Program F trust only what's inside the dotted line - which it would have to trust anyway.

This approach provides recoverability from compromise. Code deemed untrustworthy cannot spoof without the assistance of code deemed trustworthy. An untampered device with a trusted Miniboot 1 can always authenticate and repair itself with public-key techniques; an untampered device with trusted ROM can always authenticate itself and repair Miniboot 1 with secret-key techniques.
This approach also arguably minimizes necessary trust. For example, in Fig. 16, if Program F is going to believe in the authenticity of the mystery message, then it arguably must trust everything inside the dotted line - because if any of those items leaked secrets, then the message could not be authenticated anyway. But our scheme does not force Program F to trust anything outside the dotted line (except the integrity of the original CA).

11. Conclusions and future work

We plan immediate work on extending the device. The reloadability of Miniboot 1 and the operating system allows exploration of upgrading the cryptographic algorithms (e.g., perhaps to elliptic curves, as well as certificate blacklists and expiration), as well as additional trust parameters for policy enforcement. Hardware work also remains. In the short run, we plan to finish addressing the engineering challenges in moving this technology into PCMCIA format.
However, the main avenue for future work is to develop applications for this technology, and to enable others to develop applications for it. We view this project not as an end-result, but rather as a tool, to finally make possible widespread development and deployment of secure coprocessor solutions.

Acknowledgements

The authors gratefully acknowledge the contributions of the entire Watson development team, including Vernon Austel, Dave Baukus, Suresh Chari, Joan Dyer, Gideon Eisenstadter, Bob Gezelter, Juan Gonzalez, Jeff Kravitz, Mark Lindemann, Joe McArthur, Dennis Nagel, Elaine Palmer, Ron Perez, Pankaj Rohatgi, David Toll, and Bennet Yee; the IBM Global Security Analysis Lab at Watson; and the IBM development teams in Vimercate, Charlotte, and Poughkeepsie. We also wish to thank Ran Canetti, Michel Hack, and Mike Matyas for their helpful advice; Bill Arnold, Liam Comerford, Doug Tygar, Steve White, and Bennet Yee for their inspirational pioneering work; and the anonymous referees for their helpful comments.

References

[1] D.G. Abraham, G.M. Dolan, G.P. Double, J.V. Stevens, Transaction security systems, IBM Systems Journal 30 (1991) 206-229.
[2] R. Anderson, M. Kuhn, Tamper resistance - a cautionary note, 2nd USENIX Workshop on Electronic Commerce, November 1996.
[3] R. Anderson, M. Kuhn, Low cost attacks on tamper resistant devices, Preprint, 1997.
[4] W.A. Arbaugh, D.J. Farber, J.M. Smith, A secure and reliable bootstrap architecture, IEEE Computer Society Conf. on Security and Privacy, 1997.
[5] D.E. Bell, L.J. LaPadula, Secure computer systems: mathematical foundations and model, Technical Report M74-244, MITRE, May 1973.
[6] E. Biham, A. Shamir, Differential fault analysis: a new cryptanalytic attack on secret key cryptosystems, Preprint, 1997.
[7] D. Boneh, R.A. DeMillo, R.J. Lipton, On the importance of checking computations, Preprint, 1996.
[8] D. Chaum, Design concepts for tamper responding systems, CRYPTO 83.
[9] P.C. Clark, L.J. Hoffmann, BITS: a smartcard protected operating system, Communications of the ACM 37 (1994) 66-70.
[10] D.E. Denning, A lattice model of secure information flow, Communications of the ACM 19 (1976) 236-243.
[11] W. Havener, R. Medlock, R. Mitchell, R. Walcott, Derived test requirements for FIPS PUB 140-1, National Institute of Standards and Technology, March 1995.
[12] IBM PCI Cryptographic Coprocessor, Product Brochure G325-1118, August 1997.
[13] M.F. Jones, B. Schneier, Securing the World Wide Web: smart tokens and their implementation, 4th Int. World Wide Web Conf., December 1995.
[14] National Institute of Standards and Technology, Security Requirements for Cryptographic Modules, Federal Information Processing Standards Publication 140-1, 1994.
[15] E.R. Palmer, An introduction to Citadel - a secure crypto coprocessor for workstations, Computer Science Research Report RC 18373, IBM T.J. Watson Research Center, September 1992.
[16] M.D. Schroeder, J.H. Saltzer, A hardware architecture for implementing protection rings, Communications of the ACM 15 (1972) 157-170.
[17] S.W. Smith, Secure coprocessing applications and research issues, Los Alamos Unclassified Release LA-UR-96-2805, Los Alamos National Laboratory, August 1996.
[18] S.W. Smith, V. Austel, Trusting trusted hardware: towards a formal model for programmable secure coprocessors, 3rd USENIX Workshop on Electronic Commerce, September 1998.
[19] S.W. Smith, S.M. Matyas, Authentication for secure devices with limited cryptography, IBM T.J. Watson Research Center, Design notes, August 1997.
[20] S.W. Smith, E.R. Palmer, S.H. Weingart, Using a high-performance, programmable secure coprocessor, Proc. 2nd Int. Conf. on Financial Cryptography, Springer, Berlin, 1998.
[21] J.D. Tygar, B.S. Yee, Dyad: a system for using physically secure coprocessors, Proc. Joint Harvard-MIT Workshop on Technological Strategies for the Protection of Intellectual Property in the Network Multimedia Environment, April 1993.
[22] S.H. Weingart, Physical security for the µABYSS system, IEEE Computer Society Conf. on Security and Privacy, 1987.
[23] S.R. White, L.D. Comerford, ABYSS: a trusted architecture for software protection, IEEE Computer Society Conf. on Security and Privacy, 1987.
[24] S.R. White, S.H. Weingart, W.C. Arnold, E.R. Palmer, Introduction to the Citadel architecture: security in physically exposed environments, Technical Report RC 16672, Distributed Systems Security Group, IBM T.J. Watson Research Center, March 1991.
[25] S.H. Weingart, S.R. White, W.C. Arnold, G.P. Double, An evaluation system for the physical security of computing systems, 6th Annual Computer Security Applications Conf., 1990.
[26] B.S. Yee, Using secure coprocessors, Ph.D. Thesis, Computer Science Technical Report CMU-CS-94-149, Carnegie Mellon University, May 1994.
[27] B.S. Yee, J.D. Tygar, Secure coprocessors in electronic commerce applications, 1st USENIX Workshop on Electronic Commerce, July 1995.
[28] A. Young, M. Yung, The dark side of black-box cryptography - or - should we trust Capstone? CRYPTO 1996, LNCS 1109.

Sean W. Smith received a B.A. in Mathematics from Princeton University in 1987, and an M.S. (1988) and Ph.D. (1994) in Computer Science from Carnegie Mellon University (Pittsburgh, Pennsylvania, USA). He worked as a post-doctoral fellow and then as a member of the technical staff in the Computer Research and Applications Group at Los Alamos National Laboratory, performing vulnerability analyses and other security research for the United States government. He joined the IBM Thomas J. Watson Research Center in 1996, where he has served as security co-architect and as part of the development team for the IBM 4758 secure coprocessor. His research interests focus on practical and theoretical aspects of security and reliability in distributed computation.

Steve H. Weingart received a BSEE from the University of Miami (Coral Gables, FL, USA) in 1978. From 1975 until 1980 he worked in biomedical research and instrument development. In 1980 he joined Philips Laboratories as an electrical engineer, and in 1982 he joined IBM at the Thomas J. Watson Research Center. Most of his work since 1983 has been centered on secure coprocessors, cryptographic hardware architecture and design, and the design and implementation of tamper detecting and responding packaging for secure coprocessors. He participated in the NIST panel convened to assist in the development of the FIPS 140-1 standard.