Building the IBM 4758 Secure Coprocessor

COVER FEATURE Building the IBM 4758 Secure Coprocessor Meeting the challenge of building a user-configurable secure coprocessor provided several lessons in hardware and software development and continues to spur further research. Joan G. Dyer ork leading toward the IBM 4758 long as it could be transformed into a CCA follow-on started, arguably, in the 1980s when the and meet the appropriate deadlines. Mark Abyss project began exploring tech- Seeking to provide an environment where applica- Lindemann niques to build tamper-responsive hard- tions could run securely forced us to focus not only on Ronald Perez ware and use that technology to protect security mechanisms and their implementation and W 1,2 against software piracy. Influenced by this earlier management, but also on various flavors of security Reiner Sailer work, we sought to build a secure coprocessor, policies they must support. Clearly, the hardware on Leendert defined as a tamper-responding device derived from which applications run must be secure, as must the van Doorn the Abyss, Citadel, and 4755 work. We wanted to operating system and runtime environment in IBM T.J. Watson provide a single multipurpose platform that third par- between, while offering a reasonable API for applica- Research Center ties could use to develop and deploy secure coproces- tions developers. To fix problems in the field and sor applications, with minimal IBM participation. To enable fast and inexpensive reaction to changing cus- Sean W. Smith accomplish this, we sought to achieve several goals: tomer needs, we implemented part of the code as Dartmouth firmware, rather than read-only memory. Figure 1 College • Ensure that the device can be identified externally shows the 4758’s three major components and their as to contents, using some form of outbound interrelationships. authentication and public-key interface (PKI). Subdividing the software into different layers raises Steve • Design the device and its software to be securely issues of trust because upper components rely on the Weingart configurable and updatable in the field. security that lower layers offer. Applications cannot Cryptographic • Construct the software architecture to accom- be more secure than the kernel functions they call, and Appliances modate layers of code from different parties, who the operating system cannot be more secure than the may or may not trust each other. hardware that executes its commands. • Avoid letting the compromise of one device Thus, if the lower layers are robust, higher layers breach the security of another. can choose whether to relinquish some security. We • Validate all these assertions through an external designed the lower layers to be relatively permanent party. firmware, or static hardware, thereby offering fewer points of attack throughout operation. Decreasing IBM’s Common Cryptographic Architecture prod- complexity, careful evaluation, and more effective uct group realized that its next-generation product security mechanisms justify the increasing trust along required properties possessed by the secure coproces- the vertical arrow. Management of the different lay- sor that IBM Research advocated. This knowledge ers must reflect the demand for increasing trust in gave the research team a unique and perhaps nonre- lower layers, and be flexible enough to support mul- peatable opportunity: funding and authority to design tiple security policies in various organizations. and produce the product we thought should exist, as As the “Design Decisions” sidebar notes, we 0018-9162/01/$10.00 © 2001 IEEE October 2001 57 Figure 1. Overview of Manufacturing considerations a secure coproces- Because we sought to provide a third-party pro- sor’s three major Software: grammable device, the manufacturing process could • Applications design components: Officers not incorporate anything related to the eventual soft- • Environment/OS ware. Programming, loading of code, or both are com- hardware and firm- • Kernel ware, which only IBM • Loaders pleted outside the factory at the customer’s discretion. could alter, and soft- Consequently, when a 4758 leaves the factory, it must be fully armed against tampering. ware, which custo- Firmware: mers could configure IBM • Post Trust in the field to meet • Miniboot Initialization and the trusted root their specific require- Each 4758 undergoes the same initialization proce- ments. Hardware: dure, eliminating the need for individual personaliza- • Processor tion—which is expensive, error prone, generally IBM • Flash, RAM, ROM requires a large quantity, and doesn’t fit the general- • Locks purpose secure-coprocessor model. An authentication • Tamper-responding unit protocol is performed in the factory to establish an ini- • Crypto functions tial secret using a valid source of randomness and minimal software within the 4758’s ROM. When shipped, each 4758 can authenticate requests to download code achieved our major goals, but learned through hind- and respond to tampering. Having a meaningful out- sight several techniques for improving the secure- bound authentication requires a securely generated root coprocessor design process. certificate, which is established in the factory environment. SECURITY AND TRUST We based the 4758’s security on an architecture and Layers and codeload boundaries implementation validated from manufacture to Layer refers to the division of software into inde- deployment. pendent pieces. The initial layer is inserted into ROM Design Decisions We found the following to be the crucial ingredients needed Hence, for the Federal Information Processing Standards (FIPS) to build a programmable, secure coprocessor. validation procedure, we could only use the hardware to seed a pseudorandom-number generator (PRNG), per the FIPS 186-1 Hardware tamper response standard. Second, the mandatory statistical tests to ensure that Relying upon software to react to tampering is too slow. How- the RNG works correctly—performed on both the hardware ever, this observation implies a well-sealed package and careful source and the PRNG output—slow the boot process by more design of the communication paths between the 4758 and the than 20 seconds. outside world. Dissipating heat becomes a problem, which affects the speed at which components internal to the 4758 can be run. Layered design The ratchet hardware provides the mechanism that assures A programmable device must have a layer responsible for prior firmware or software layers that subsequent layers cannot implementing code load security policy. The layer should be val- alter their secrets. Hardware enforces this assurance best, ren- idated, as its correctness forms the basis upon which subsequent dering areas of storage invisible, protecting areas from tamper, or layers depend for their own security. both. The number of ratchet lock settings, and what they protect, Subsequent layers personalize the device. We initially selected requires careful iterative design—something that deadlines and two layers: one for an operating system and device drivers, the other constraints do not always allow for. Ideally, the number of second for application code. This division made it possible for locks and associated regions would be configured by firmware. third parties to either write applications to an operating system we supplied, or write everything themselves. Randomness Finally, minimal software in ROM provides for repairs or Generating initial secrets requires a validated source of ran- extensions of initial facilities using the secure field update archi- domness, which ensures that possible destruction of a few cards tecture. cannot compromise any others. We encountered two problems with this requirement, neither of them fatal. First, we could find Self-initialization no standard for hardware random-number generators (RNGs). A programmable device must leave the factory ready to 58 Computer during the 4758’s manufacturing phase, and the Secure boot remaining layers are downloaded into flash memory A 4758 provides a secure boot of itself. The hardware using a public-key authentication process associated and firmware guarantee that the coprocessor stores only with a layer’s owner. A codeload boundary denotes a authorized code in flash memory. Authorization is defined point at which the system accepts a signed file accomplished at codeload time. Checks of flash for authenticating a load-this-layer command. integrity are made at each boot before passing control to the next layer. The checks run against hardware- Officers and signing keys enforced flash contents that are locked read-only by the At each codeload boundary, an officer and key pair time the operating system and application run. authenticate loading. The previous layer’s officer The validity of these statements follows, in part, from establishes ownership and provides a certificate attest- the secure packaging. An attempt to replace the ROM ing to the new owner’s initial public key. The officer or flash contents destroys the 4758’s secrets. The dam- owning the layer has complete control over subse- aged coprocessor might still run the same application quent loads, including changing the key pair and software, but it could not authenticate itself when establishing various trust targeting options. booted, and the application software’s battery-backed RAM and encrypted flash information would be un- Retaining state available. A secure coprocessor provides the ideal place to keep secrets, provided they disappear upon tamper- HARDWARE ing and that any compromise of a single device does

Load more