GBDE - GEOM Based Disk Encryption

Poul-Henning Kamp The FreeBSD Project [email protected]

Abstract The everincreasing mobility of computers has made protection of data on digital storage media an important requirement in a number of applications and situations. GBDE is a strong cryptographic facility for denying unau- thorised access to data stored on a ``cold''disk for decades and longer.GBDE operates on the disk(-partition) level allowing anytype of or database to be protected. Asignificant focus has been put on the practical aspects in order to makeitpossible to deployGBDE in the real world. 1

1. Losing data left and right mistaken for the plot from a classic Buster Keaton In the last couple of years, gentlemen of the press movie: have repeatedly been able to expose howlaptop com- First a laptop was forgotten and lost in a taxi-cab. puters containing highly sensitive orvery valuable Newpolicy: always drive your own car if you information have been lost to carelessness, theft and in bring your laptop. Then a car was stolen, includ- some cases espionage. [THEREG] ing the laptop in the trunk. Newpolicy: always bring your laptop with you. The next laptop was The scope of the problem is very hard to gauge, since it stolen from a pub while the owner was bowing to is not a subject which the involved persons and, in par- the pressures of nature. Newpolicy: employees ticular,institutions are at all keen on having exposed. are not to carry their own laptops outside the office However, a few data points have been uncovered, at anytime. Laptops will be transported from and revealing that the U.S. Federal Bureau of Investigation to the employees home address by the agencyse- curity force and will be chained and locked to a loses, on average, one laptop every three days. ring in the wall installed by the companyjanitors. [DOJ0227] All requests must be filed 3 days in advance on When a computer is lost, stolen or misplaced, it is very form ##-#. [PRIV] often the case that the computer hardware represents a value which is insignificant compared to the value of the disk contents. More often than not, the only reason 2. Protecting disk contents the press heard about it was that the material on the Protecting the contents of a computer'sdisk can disk was ``hot''enough to makethe loss of control rat- in practice be done in twoways: by physically securing tle people at government level. the disk or by encrypting its contents. While it is easy to blame these incidents on ``user Physical protection is increasingly impossible to imple- error'', as is generally done, doing so makes it a very ment. It used to be that disk drivescould only be hard problem to fix. Human nature being what it is, movedbyforklift, but these days a gigabyte disk is the seems to remain just that. size, but not quite yet the thickness, of a postage stamp. In the absence of technical counter measures, adminis- While computers can be tied down with wires and bars trative measures have been applied, generally with can be put in front of windows, such measures are gen- abysmal results. In one case, a bureaucracyhas han- erally not acceptable, or at least not judged economi- dled the problem according to what could easily be cally justified in anybut the most sensitive operations. That leavesencryption of the disk contents as the only 1 This software was developed for the FreeBSD Project by practical and viable mode of protection, and both the Poul-Henning Kamp and NAI Labs, the Security Research Division practicality and the viability has been somewhat in of Network Associates, Inc. under DARPA/SPAWAR contract N66001-01-C-8035 (``CBOSS''), as part of the DARPACHATS re- doubt. search program. Until recently,nearly all aspects of cryptographywere a highly political issue, this has eased a lot in the last couple of years and there now``only''remain a number of rather fundamental questions in the area of law has been subject to both version skew and compatibility enforcement and human rights, which are still unset- problems. tled. With the political issues mostly out of the way,the next 2.3. Disk levelencryption roadblock is practical: While use of cryptographycan Encryption at the disk levelcan protect all data, neverbeentirely transparent, the overhead and work- no matter howtheyare stored, file system, database or load it brings must be reasonable. otherwise. To a user,encryption at the disk levelwould require 2.1. Application levelencryption authentication before the computer can be used, every- Encryption at the application levelhas been thing functioning transparently thereafter,with all disk available for a number of years, primarily in the form content automatically protected. of the PGP [PGP] program. This is about as intrusive Giventhat the programming interface for a disk device and demanding as things can get: the user is explicitly is very simple and practically identical between operat- responsible for doing both encryption and decryption ing systems, there are no technical reasons whythe 2 and must enter the pass-phrase for every operation. same implementation could not be used across several Apart from the inconvenience of this extra workload, operating systems. manyorg anisations would trust their users neither to All in all this is a close to ideal solution from an opera- get this right nor eventowant to get it right. From an tional point of view. institutional point of viewitisimportant that crypto- There are significant implementation issues however. graphic data protection can be made mandatory. In difference from the higher levels, encryption at the disk levelhas no way of knowing a priori which sectors 2.2. Filesystem levelencryption contain data and which sectors do not; neither is knowl- Encryption at the file system levelisatried and edge available about access patterns or relationships acknowledged method of providing protection, but it between individual sectors. suffers from a number of drawbacks, mainly because Where application levelorfile system based encryption no mainstream file systems offer encryption. schemes can key each file individually,adisk based Encrypting file systems are speciality items, which encryption must key each and every sector individually, means increased cost and system administration prob- ev enifitisnot currently used to hold data. lems of all sorts. It has been argued that the encryption ideally should And since practically all operating systems use their happen in the disk-drive,and while there are steps in ownfile system format, cross platform fully functional this direction, theydounfortunately seem to have been file systems are very rare. This means that a typical made for the wrong reasons by the wrong people organisation will have tooperate with a handful of dif- [CPRM], and have consequently not gained acceptance. ferent methods of encryption, which translates to sys- Provided the owner of the computer remains in control tem administration overhead, user confusion and extra of the encryption, I see no reason whyencryption in the effort to pass security and ISO9000 audits. disk drivesshould not gain acceptance in the future. Asecondary,but increasingly important issue is that data which are stored in databases on rawdisk, operat- 3. Whythis is not quite simple ing system paging areas and other such data are not Several implementations have been produced protected by a cryptographic file system. To protect which implement a disk encryption feature by running these would mean adding yet another set of encryption the user provided passphrase through a good quality methods, which leads to a situation which is very hard one-way hash function and used the output as a key to to handle practically and administratively. encrypt all the sectors using a standard block cipher in Finally,file systems have a complexprogramming CBC mode. Aper sector IV for the encryption is typi- interface to the , which traditionally cally derivedfrom the passphrase and sector address using a one-way hash function. Tw o typical examples 2 Interestingly,this is so impractical in real world use that vari- are [CGD] and [LOOPAES]. ous applications with PGP support resort to caching the pass-phrase at the application level, thereby weakening the protection a fair bit. Unfortunately this approach suffers from a number of significant drawbacks, both in terms of cryptographic strength and deployability. available hardware, and is consequently out of the Fordata to stay protected for decades or evenlifetimes, question. sufficient margin must exist not only for technological The third design criterion came from the fact that peo- advances in brute force technology,but also for theoret- ple forget passphrases, and while loosing the entire ical advances in cryptoanalytical attacks on the algo- content of the disk as punishment could be seen as a rithms used. large-calibered educational device, it is not acceptable Protecting a modern disk, typically having a fewhun- from a real world perspective:there must be some kind dred millions of sectors, with the same single 128 or of multiple access paths. 256 bits of key material offers an incredibly large Giventhat GBDE is open source software, there is little amount of data for statistical, differential or probabilis- more than symbolic value in a hierarchical set of tic attacks in the future. passphrases: changing the source code to bypass the Worse, because the sectors contain file system or data- hierarchycannot trivially be prevented. It is also not base data and meta data which are optimised for speed, clear what the hierarchywould control in the first the plaintext sector data typically have both a high place. Since all sectors are treated the same, it cannot degree of structure and a high predictability,offering be used to implement hierarchical access levels, and ample opportunities for statistical and known plaintext implementing a hierarchywhich only affects the key attacks. hierarchywould be silly.Itwas therefore decided that GBDE would support a number of parallel key paths, This author would certainly not trust data so protected and four was chosen as the default number which can to be kept secret for more than maybe fiv e or ten years be changed at compile time. against a determined attacker. The fifth usability criterion was that GBDE should be But far more damning to this method is that there can able to use anybyte string as passphrase, and not be only be one single passphrase for the disk. This effec- limited to NUL terminated text strings. In manyset- tively rules out the ability for an organisation to imple- tings, the necessary key strength cannot be derived ment anykind of per-user or multilevelkey manage- from a keyboard entered text string, and therefore it ment scheme: the only possible scheme is ‘‘one key per should be possible to use storage devices likeUSB- disk’’. keys or smart cards to contain or contribute to the Add to this that to change the passphrase the entire disk passphrase. would have tobedecrypted and re-encrypted, and we The final usability criterion was that GBDE should have a model which may work in theory,and can be offer the best possible protection also for the user.This made to work in practice for a determined individual, might require some explanation: butwhich would fast become an operational liability for anyorg anisation. 4.1. Protecting the user 4. Designing GBDE The weakest link in anycryptographic deploy- ment is almost always the users and their handling of The initial design phase of GBDE focused on keymaterial. determining a set of features which would makeitboth possible and practical for people and organisations to As an example of this, most banks use two-man proce- deployGBDE in routine use. dures to protect their vaults. Not because this prevents their employees from embezzling money—there are The first decision has already been described and plenty of ways theycan do that — but because it pro- argued for: Encryption must happen at the sector or tects the employee from being a weak link which could ‘‘rawdisk’’lev el, in order to be comprehensive,univer- be broken by means of physical violence, blackmail or sal and portable. hostage taking. The second decision was dictated by the fact that all These kinds of situations mandate that analysis treat the security policies we have everseen, contain a rule user as a component, something which have become which says ‘‘passwords must be changed every N routine by now. {days,weeks,months}’’. This is sound thinking, and GBDE should support it. While changing the Unfortunately it is still a relatively common mistaketo passphrase does not necessarily have tobeinstanta- either leave the attacker out of the analysis, or to neous, decrypting and re-encrypting an entire disk assume a very weak or evenstupid attacker. would likely takemore than a day with currently An example of this, is the ``The Steganographic File will describe a couple of major issues which GBDE System''[STEGFS], which provides a facility where a leavestothe user to solve. hierarchyofpassphrases protect data at different levels. The argumentation more or less goes ``protect a couple 5.1. Key management of unimportant files at the lowest level, and your impor- It has been said that there is only one really hard tant files at higher levels. When captured, give them problem in cryptography: the problem of key manage- the lowest levelkey and denythat anymore keys ment. GBDE does not try to address that. exist.'' In all of the following we will assume that the If we include the attacker in the analysis, she will soon passphrases and other key material have been handled knowthat the facility used is STEGFS, and conse- in a sensible manner,fitting the importance of the data quently that multiple levels of keysare not only a possi- theyare used to protect. If you put a yellowstickynote bility but to be expected: Whyelse would people use with the passphrase on a disk, no amount of crypto- STEGFS in the first place ? graphic code will offer anyprotection. Auser caught with a STEGFS encrypted set of data, will therefore likely be subject to pressure to release keys until the attacker is satisfied that there are no more 5.2. No protection for ``hot''disks keys. If the attacker is the police, this can nowland the GBDE is only designed to protect data on a user in jail for up to fiv e years in certain countries. ``cold disk''. A cold disk is defined as a disk disassoci- 3 But evenworse: if the attacker has her own ideas of ated from the keyswhich provide access to the disk. what is to be found in the protected data, for instance If the GBDE protected device has been ``attached''ona information on weapons of mass destruction, but no running system, parts of the key material will be such information is on the protected set, there is no way present in memory and therefore GBDE'scrypto- the user can ``get offthe hook''and prove his inno- graphic protection is no stronger than the protection the cence: STEGFS provides no way of proving the nonex- operating system givestothose data. istence of anyfurther keys. If the protected device is not ``attached'', there is no If one studies the evidence of a real-world scenario: the key-material present in RAM and the disk is fully pro- plight of the junior CIA operative resident and later tected by GBDE. hostage at the Tehran embassy [IRAN], a couple of Adisk on a shelf or in a courier bag is obviously also a interesting insights emerge. cold disk. Very often, the user will have a tinywindowintime It is important to note that disks in laptop computers during which it may be possible to manually activate which are only ``suspended''can not be be considered data destruction mechanisms, or alternatively,beable cold if theyare attached in the kernel, because the key to out-wait a specific timeout after which automatic material is present in RAM or possibly in the ``suspend mechanisms will have destroyed the data if theyare not 4 rearmed correctly. to disk''partition on the disk. Forsuch a feature to offer the user effective protection While most computers offer facilities for password pro- it must provide a tangible feedback to the attacker that tection when leaving suspend mode, it should be noted the user has destroyed the data, and can not bring it that modern hardware facilities may contain wide open back. Effective feedback has been smoking piles of back doors around that protection. ash, but not, as related in the narrative from Tehran, Forinstance, FireWire/IEEE1394 capable devices are piles of shredded documents. mandated to have hav e built in ``OHCI''facilities The final GBDE design criterion was therefore that which allowdirect access to the entire memory space GBDE should be able to rapidly destroykey material in without the CPU being involved or aware of this. We such a way that it can be proventhat there is no hope to 3 recoverthe data short of very expensive brute force Under carefully controlled circumstances GBDE can also be used as a component to build protection for ``hot disks''; that is dis- methods. cusse later. 4 Interestingly,ifGBDE was implemented in the disk drive and 5. What GBDE does not do the BIOS handled the pass-phrase entry,the suspend-to-disk partition would also be protected. With anycryptographic facility,itisasimportant to knowwhat it does as what it does not, this section estimate that using this method it would be possible to mapping the locations on the disk, these sectors offer undetectably grab a snapshot of all RAM on a typical no leverage for attacks, until their encrypted location notebook computer in a fewminutes. A great tool for has been found by some other means. debugging, a nightmare tool for security. The final cryptographic criterion was to use only stan- dard cryptographic algorithms in good standing: Rijn- 6. Cryptographic design dael/AES as block cipher,RSA2/512 as strong one way hash and MD5 as ``bit blender''. The usability requirements produces the overall layout of the GBDE cryptographic design which the cryptographic design must respect: A passphrase gives 7. The GBDE algorithm access to a copyofthe master-key material, which As can be seen from the above,the apriori again givesaccess to the protected sector contents. requirements have closely circumscribed the solution From a cryptographic point of view, the requirements space for GBDE, and the algorithms four steps follows are very simple: makeitasstrong as possible using as more or less directly from the design. fewresources as possible. While we currently have great faith in algorithms like 7.1. From passphrase to master key the Rijndael/AES, and it is generally accepted that It is generally accepted that a passphrase consist- 128bit symmetric keysoffer practically impenetrable ing of one or more sentences in a human language only security,the history of the cryptographyhas shown that contain about one bit of effective entropyper character, significant weaknesses takelong time to uncover. and obviously theyare variable length. Therefore the Since the aim is that GBDE be able to protect a cold passphrase is passed through the SHA2/512 one-way disk for a decade or more, it would be unwise to rely on hash algorithm. This transforms the variable length our current understanding of the subject and the algo- passphrase to 512 bits which contain as much as possi- rithms to remain unchallenged for the lifetime of the ble of the entropyinthe passphrase. data. These 512 bits, called ``the key material'', are used to Anumber of defensive measures have therefore been locate and decrypt the so called ``lock sector''which been included in the design criteria, in order to put suf- contains the master-key and various parameters. ficient margin into the cryptographic strength of Acompile time constant, sets the number of lock sector GBDE. copies supported, the default number is four.These The first criterion is that plaintext sector data should be copies are located in four sectors chosen randomly at encrypted with one-time-use (pseudo-)random keys. the time when the device is initialised for GBDE usage. This measure, while relatively expensive,effectively The first 128 bits of the key material are used to stops differential plain-text attacks. decrypt the sector offset of the encrypted and encoded This forces yet a design decision since the PRN sector- lock sector.The encrypted version of this offset can keys obviously must be encrypted and stored on the either be stored in the first sector of the device or out- disk with the encrypted plain-text. The second criterion side the device in a file. is that these sector-keysare encrypted with a per sector Once read from disk, the next 256 bits of the key mate- ``key-key'' derivedfrom only a fraction of a very large rial is used to decrypt the encoded lock sector using master key. AES/CBC/256. By only allowing a fraction of the master-key into the The final 128 bits of the key material define the order calculation, a penetration which manages to find the in which the fields of the lock sector are encoded. key-key for a givensector will not expose the master Numeric fields are encoded in little-endian byte order key, and since the sector key which the key-key so that the on-disk format is portable between different encrypts is (pseudo-)random, no differential leverage architectures. Once decoded, one of the fields offer a exists at this leveleither. MD5 hash checksum which can be used to prove that The third cryptographic criterion was that a remapping this is actually a valid lock sector. of sectors should happen, so that the location of a par- ticular plain-text sector on the encrypted volume is not predictable. Manyfile systems have well defined and 7.2. Sector mapping easily predictable ``super blocks''and allocation bit- It follows from the above that a number of trans- maps which have well defined sector addresses. By formations are necessary on the sector location, in addition to the cryptographic transformation of the sec- 7.3. Plausible denial tor contents. This could be used to implement ``plausible The first transformation, splits the plain-text sectors denial''byembedding the GBDE partition inside some into a number of zones. The size of a zone is chosen so data which credibly can be claimed to be something that one sector exactly contains the encrypted sector else. Since high entropydata are rare in the wild, spe- keys for all the payload sectors in the zone. With a sec- cial care is needed to build a convincing coverstory to tor size of 512 bytes and a sector key size of 128 bits, explain the existence of otherwise unexplained random the results in a zone size of looking bits. 512bytes/sector *8bits/byte Most computers today come pre taxed with a Microsoft = 32sectors 128bits operating system, and this could be used for the cover The extra sector with the encrypted sector keysis story: Flush all free space in the Windows partition appended after the last of the 32 data sectors, making with random bits, locate a long stretch of free space in the zone 33 sectors long in the encrypted image. the partition and use that to contain the GBDE encrypted data. If no other evidence betrays the exis- It should be noted that the sector size for the encrypted tence of the GBDE partition, it should be possible to device and consequently the zone size is a parameter denyits existence. set at device initialisation time. Atruly paranoid setup would leave the computer con- The second transformation is a rotation by an integral figured to boot the Windows system by default, and number of sectors. The number is randomly chosen at locate the GBDE data in such a way that it would be device initialisation time and recorded in a field in the destroyed by the act of doing so. lock sector.This step addresses the design requirement that the location of the encrypted sector should not be trivial to determine. 7.4. ``Key-key''derivation The third transformation inserts the four lock sectors Foreach sector to be processed, we need to into their locations. The locations of these are chosen derive the key-key which the sector key isencrypted at random at device initialisation time, and stored in with. For this purpose the lock sector contains two fields in the lock sector.Itfollows quite obviously fields, the ``salt''which is a 128 bit (pseudo-)random from this, that each lock sector must contain informa- number,and the master key which is a 2048 bit tion about the location of all lock sectors, in order to (pseudo-)random number.Both of these are generated map around them. when the device is initialised for GBDE usage and not Finally an offset is added which determines where the subsequently changed. encrypted image starts in the underlying device. In order to makethe format architecture invariant, the the sector address is encoded as a little-endian 64 bit integer Zoning sect = LE64(sectoraddress)

and then, surrounded by the salt, run through the MD5 Rotation algorithm which acts as a ``bit-blender''. The salt is necessary to prevent pre-computation of a dictionary of Insert all possibly sector addresses as an attack vector. Masterkeys index[16] = MD5(salt[0. . . 7] + sect + salt[8. . . 15])

The resulting 16 bytes of indexare used to ``cherry- Offset pick''16bytes from the master key,which are run through MD5 with the sector address inserted in the fig1:mapping operations. middle: 5 The offset mapping serves twopurposes, it allows the first sector of the underlying device to be reserved to 5 In both cases the encoded sector address were put in the mid- contain the encrypted locations of the lock sectors, but dle of the MD5 input data only for reasons of symmetry,cryptograph- it also allows the administrator to use only a part of the ically it makes no difference. underlying device for GBDE encrypted data. keykey = MD5( To read and decrypt a sector,the key sector and the masterkey[index[0]] + ...+masterkey[index[7]] + encrypted sector are read into memory,the key-key sect + generated and the sector key decrypted with it. The masterkey[index[8]] + ...+masterkey[index[15]]) sector key isused to decrypt the encrypted sector data, the temporary buffers erased and the transaction com- Some research have cast doubt about the cryptographic pleted. qualities of the MD5 algorithm [RSAMD5]. The ques- Since the sector data is encrypted with a (pseudo-)ran- tionable property,how hard it is to generate collisions, dom key,there is no need to use anyparticular initiali- is of no concern here: MD5 is merely used as a ``bit sation vector for the encryption process and therefore a blender''and unlikewhen MD5 is used for authentica- constant IV of zero bits is used 6. tion purposes, there is no value to the attacker in pro- ducing a collision with a different input. 7.5. Sector data encryption 7.6. Performance optimisations If implemented strictly according to the above To encrypt and write a sector of plain-text data, description, GBDE would takea drastic performance the key-key isderivedasabove,and the sector contain- hit because all I/O requests, no matter their size would ing the encrypted sector keysisread into memory. explode into tworequests per sector on the underlying device for each sector in the request. Even with the Salt MD5 Sector Address caches built into modern disk drives, this would lead to Sector Data adrastic performance hit. Instead the request is split into a number of ``work Master Cherry packets''which are characterised by the fact that all Key picker data sectors in the work packet are consecutively laid out on the device, this also implies that theyare in the same zone and therefore need the same key sector. MD5 Furthermore to avoid re-reading a key sector which has just been written to disk, a small cache is used to keep PRNG copies of the encrypted key sectors which have recently key−key been used. The size of the cache grows linearly with sector−key usage of key sectors, and decays with 10% every sec- ond 7 Write to AES Furthermore, the logical sector size of the encrypted key−sector 128 CBC device can be set to a power of twomultiple of the underlying devices sector size. Using larger sectors means fewer keys, larger zones and consequently larger work packets. Forpaging spaces, the VM page size should be used as Write to AES logical sector size, for file systems the smallest alloca- data sector 128 tion unit should be used. ForUFS/FFS this is the so CBC called fragment size, and it is typically 2048 bytes. fig2:write operation Anew sector key isgenerated with the standard In retrospect zero bits were a bad choice since anysector on the encrypted device which contains all zero bits showupasadis- (P)RNG facilities in the kernel and the sector data tinct signature on the encrypted side. This does not introduce a vul- encrypted using AES/CBC/128. nerability inside the design envelope, but it will nonetheless be ad- dressed in the next revision of GBDE since a fix will not affect the in- The sector key isencrypted with AES/CBC/128 using stalled user base. the key-key and inserted at the correct place in the key 7 It is not possible to use the regular kernel buffer cache be- sector. cause GBDE operates belowthe SPECFS layer.Furthermore the Both the key sector and the encrypted data sector are buffer cache offers no means to clear the buffers contents before it is recycled. written to disk after which the temporary storage is cleared and the transaction is complete. 8. Administrative operations possible to tell them apart from random bits, so the In order to use a device with GBDE it must be only way to locate the lock sectors is by doing a brute initialised first. The initialisation will generate the con- force search. tents of one lock sector and encode, encrypt and write it If the encrypted location of the lock sectors is available, to the device. it may be faster to brute-force the 128 bits which While this is technically enough to get started, it is encrypt the location than to search all possible byte off- highly recommended that the entire device be filled sets on the disk. with (pseudo-)random bits before writing the initial Attacking the lock sectors means finding the 256 bit lock sector,since this will prevent an attacker from encryption key for the encoded data, and the permuta- using the previous possibly less random sector contents tion of the 12 fields in the lock sector. to eliminate a large fraction of the sectors from an Giventhat one of the fields is a MD5 hash which will attack. can be used to test for a hit, the worst case work to Unfortunately,ittakes considerable amount of time to brute force the lock sector is therefore somewhere in generate (pseudo-)random bits for modern sized disk the neighbourhood of: devices, and this process can takehours or evendays to W ⋅ 2128 + (W + W ) ⋅ 2256 complete. AES/CBC/128 AES/CBC/256 MD5 or To use a GBDE device, it needs to be ``attached''. This ⋅ + ⋅ 256 involves presenting the passphrase, and using that Nbytes_on_device (W AES/CBC/256 WMD5) 2 locate and decrypt, decode and validate the lock-sector. or,ifthe attacker by some other means have located the The opposite process, detaching the GBDE device is lock sector: + ⋅ 256 mainly a question of erasing all traces of the lock sector (W AES/CBC/256 WMD5) 2 and key sector copies from memory. All of which are practically infinity by todays stan- Since the location of all lock sectors is recorded in each dards. lock sector,itispossible for anyuser with a valid passphrase to overwrite anylock sector. 9.2. Bottom up attack One operation is writing a newlock sector protected by anew passphrase. If done for the lock sector which the Attacking from the other end, so to speak, we user has a passphrase for,this amounts to changing the will assume that the attacker either knows the plain text users passphrase. of a number of sectors or is able to recognise it by some algorithm, and let us give him the benefit of knowing Another option is to overwrite the lock sector(s) with all the parameters which goes into the mapping of sec- zero bits. This disables the passphrase(s), a condition tors. which GBDE will report specifically on, in order to provide the tangible feedback of destroyed data dis- Doing a brute force on one of the sectors yields the sec- cussed earlier. tor key which can be used to brute-force the encrypted copyofthe sector key,which again yields the key-key for this particular sector. 9. Attacking GBDE devices Having obtained the key-key for one specific sector,the Givenamedia containing data protected with next step is to find the 16 unknown bytes of the 24 GBDE, there are twoavenues for attacking the algo- bytes input sequence to MD5 which generate this key- rithm: From the top, trying to get hold of the master- keyasoutput. (The middle 8 bytes is the little-endian keyand salt from one of the lock sectors, from the bot- encoded sector number,which we assume the attacker tom, trying to derive them from decrypted sectors, or already knows). by trying to guess the passphrase. In the worst case, the workload so far is: 2 ⋅ W ⋅ 2128 + W ⋅ 2128 9.1. Top down attack AES/CBC/128 MD5 The attacker nowhas the value of up to 16 bytes (there The first challenge is to identify which sectors may be duplicates) of the 256 bytes in the master key, contain what. buthedoes not knowwhere those 16 values are located There is currently no known algorithms or statistical in it. methods which are able to tell bits produced with If the attacker brute-forces multiple sectors, he can try AES/CBC/256 and AES/CBC/128 apart, nor is it to brute-force the 128 bit salt using the known master keybytes as truth detector.The exact number of sec- The LOOP-AES [LOOPAES] facility in Linux offers tors he needs to brute force is very hard to predict but an ``iteration count''which adds N thousand iterations let us assume the worst case value of two. The worst of AES/256 to the pass-phrase preprocessing path in case work necessary to do this is: order to makesuch an attack more expensive. ⋅ 128 WMD5 2 With the increasing speeds of hardware and the avail- Knowing the salt the attacker can nowpredict which ability of cryptographic co-processors, protecting a bytes in the master-key are involved in the generation weak pass-phrase for a decade or more in this fashion is of the key-key for each sector on the disk, and may be neither realistic nor prudent, and GBDE therefore does able to decrypt a number of sectors based on the subset not implement a similar option. of the master key heknows so far. Abetter way to frustrate a dictionary attack is to com- 8 To extend his knowledge of the master key hewill have bine the pass-phrase with a high-entropytoken, for to attack more sectors using brute force, but the search instance 1024 bits generated in a suitable random way. space is reduced by the already known bytes of the This token could be stored on a authentication device or master-key soeach newsector may be obtainable along removable storage device such as an USB-key. with a byte of master key byaworst case effort as low as: 10. Known weaknesses ⋅ ⋅ 8 + ⋅ 2 W AES/CBC/128 2 2 WMD5 No armour is impenetrable, and no dragon with- at which point the door is wide open. out a soft spot. Three issues have been identified where GBDE has less than perfect performance. Summing up, the total lowest worst case effort, given known plaintext is in the neighbourhood of 129 ⋅ + 128 ⋅ 10.1. The static master key 2 W AES/CBC/128 2 WMD5 which is also practically infinity by todays standards. The fact that the lock-sector contents does not change with a change of passphrase means that once a person has once had access to a device, it is not possi- 9.3. Attacking the pass-phrase ble to reliably revoke that access by changing the pass- As already mentioned, using human language phrase: He would be able to restore a copyofthe lock- text for pass-phrases yields only about one bit of sector for which he knows the pass-phrase, and thus entropyper word, so a well written 64 bit entropypass- gain access to the device. All things being equal, this is phrase could be: only marginally more problematic than the fact that the user may also have kept copies of the plain-text data. If Blow, winds, and crack your cheeks! rage! blow! Youcataracts and hurricanoes, spout this is a real concern, the proper mitigation is to copy Till you have drench'dour steeples, drown'dthe cocks! the data offthe device, reinitialise the GBDE lock sec- Yousulphurous and thought-executing fires, tors, and copythe data back. In the future an ``fast re- Vaunt-couriers to oak-cleaving thunderbolts, encrypt''operation would help makethis a less painful Singe my white head! And thou, all-shaking thunder, procedure. Smite flat the thick rotundity o' the world! Crack nature'smoulds, and germens spill at once, That makeingrateful man! 10.2. Huge appetite for random bits [LEAR] Because sector keysare single use (pseudo-)ran- Giventhat fewactors can deliverthat correctly overthe dom bits, GBDE can consume up to sixteen PRNG lime-lights, we can expect that the average pass-phrases bytes for every 512 byte sector written to the disk, or will offer less entropy, and dictionary attacks are conse- 32 kilobytes per megabyte of plaintext. It is obviously quently not only feasible, but to be expected. important that these PRNG bits are high quality and The worst case work required to test one passphrase that the source cannot be manipulated or tampered candidate amounts to: with, in particular when the device is first initialised + + where the master-key and salt bit-strings are chosen. WSHA2/512 W AES/128 /CBC + + Wdisk−read W AES/256 /CBC WMD5 while the best case work to reject it (because the 8 This principle is often referred to as ``something you have + decrypted location of the lock-sector would be found to something you know'' be outside the physical media) is only: + WSHA2/512 W AES/128 /CBC If the attacker can predict or manipulate the bits To further improve security,the security department assumed by GBDE to have random properties, attack- could copythe first twoencrypted lock sectors to a ing the protected data could become trivial. floppydisk and replace them with random bits. It follows from this that a PRNG which is not regularly Should the user loose his passphrase or if he decides to perturbed by outside entropysources would be a very activate the lock sector destruction function in some sit- bad source for GBDE, because it would be possible to uation, the floppydisk can be used to restore the lock predict all future bits once the state of the generator sectors and the contents of the device can be recovered wasdetermined once. from there. In FreeBSD the kernel random facility is based on the Needless to say,destroying the lock sectors adds little industry standard Yarrowalgorithm which uses crypto- security if the adversary has access to a backup copy, so graphic functions to churn out high entropybits based while the floppydisk does not need by definition need on entropycollected from hardware sources to be stored in a safe, it should obviously not be stored [YARROW]. next to the encrypted device either.

10.3. The cleaning lady copy attack 11.2. Serverdeployment An attacker in position to makeabit-wise copya Forserver use the issue is slightly more complex. GBDE protected media on a regular basis would be It is usually desirable to be able to keep the passphrase able to gain a head-start on attacking the device by stored on the server so that it will be able to boot auto- being able to monitor which sectors change and which matically,but at the same time makeitimpossible for do not from one snapshot to the next. an attacker to get hold of the passphrase and the pro- Givenjust a fewcopies spaced a day apart, an attacker tected disk at the same time. could likely pinpoint the location of the UFS/FFS super One way to implement this, is by using a ``weak blocks because of their strict geometric distances, and link/strong link''method similar to that use in ``permis- the lock sectors may stand out after a longer time sive action links''inatomic weapons [PAL]: period. In the analysis of attacks on cold media, we The server is installed in an small enclosure with a very more or less assumed that the attacker had this head- sensitive and very fast intrusion detection. If an intru- start, but the full impact of this kind of attack can only sion is detected (breach of the weak link), the computer be judged for a specific file system or content type. will destroythe passphrase (disabling the strong link), Alog structured file system would probably be harder thereby protecting the encrypted data. to unravelthis way because data and metadata are all Needless to say,there needs to be an internal power intermingled in sequential write operations. source which can provide sufficient power to reliably destroythe passphrase, in case the attacker starts out by 11. GBDE as a component cutting the power supply to the enclosure. Afacility likeGBDE is obviously only a compo- nent in anyreal security implementation. The main 12. Performance impact interface to GBDE as a component is the passphrase With an emphasis on strong crypto, performance and the administrative operations. will always suffer,and GBDE is no exception.

11.1. Typical laptop deployment 12.1. Throughput In an organisation, one possible way to deploy The most expensive part of the GBDE design is GBDE on laptop computers could be the following: the stored encrypted sector key,which worst case The laptop is installed with operating system and disk forces twodisk operations per sector operation. space is set aside for the GBDE partition(s). The secu- The first set of measurements try to gauge howmuch rity department initialises the GBDE partition with a the logical sector size affects the performance. To top-secret companypassphrase. The second lock sector eliminate as much of the physical aspect, a test of 100 is initialised using a passphrase which the branch-office megabytes sequential read and write using 1 megabyte manager has access to, and the third lock sector is ini- operations were run with varying GBDE logical sector tialise with a newpassphrase which the user is given sizes and without GBDE. access to. The hardware used was an AMD Athlon 700MHz CPU 12.3. Other indications using an IBM DTLA-307015 disk. Forworkstation or No attempt to quantify CPU usage more pre- server use, this is somewhat behind the state of the art, cisely than ‘‘will eat a lot of CPU cycles’’hav e been butits performance is pretty close to that of a modern attempted, the expectation is that this will improve laptop computer. when hardware assisted cryptographic functions Operations seq. read seq. write become available. Mode kB/s stddevkB/s stddev The author’spersonal home directory has been GBDE unencrypted 36141 28 27915 738 encrypted for more than 6 months at this point in time, 512 bytes 7326 12 3447 30 and he has neither found the performance annoyingly 1024 bytes 8088 41 6023 50 slownor lost anyfiles in this period. 2048 bytes 7880 32 5082 25 4096 bytes 8140 176 6061 51 13. Futureimprovements 8192 bytes 8849 37 6597 8 The current implementation of GBDE does not As can be see from the numbers, GBDE runs at approx takeadvantage of hardware assisted cryptographic pro- 20-25% of the unencrypted speed if the logical sector cessing. Once AES capable hardware accelerators size is one kilobyte or above. become available the GBDE workflowengine should be changed to takeadvantage of them. Whyusing 2048 byte logical sectors is slower than both 1024 and 4096 is somewhat of a mystery.Wespeculate Anumber of additional administrative operations can that it may be related to memory management arena be imagined, for instance a mode where a device is ini- fragmentation, or possibly a cache interaction with the tialised and attached in one operation which does not UMA memory allocator. write the lock sectors to the media thereby preventing later re-attachment. Such a mode would be useful for devices used for paging or temporary file systems. A 12.2. Latency command line option to save and restore lock sectors to Giventhat the encryption operations are very fast afile would be convenient too. compared to the seek times of contemporary disk hard- Fast re-encryption operation: it is possible to write ware, the expected behaviour with respect to I/O code to change the master key and salt which only latencyisthat operations with a high degree of locality decrypt and re-encrypt the sector-keys. This would be will taketwice as long, and for operations where the twotothree orders of magnitude less work than re- disk mechanism incurs arm settling time, the increase encrypting the entire volume. in I/O latencywill be lost in the noise. An attempt was made to measure GBDE’simpact on 14. Availability I/O latencyusing a modified version of the diskinfo -t tests. GBDE is distributed under the ‘‘two-clause BSD license’’and will be maintained in the publicly avail- The numbers collected confirmed that highly local able FreeBSD CVS repository,from where we encour- operations roughly double their I/O latency, typically age other operating systems to adopt it. from 150µs to 300µs. But for less local operations the measurements were so noisy that statistically they prove nothing useful, one way or the other.A‘‘hairy- 15. Conclusion eyeball’’judgement of the numbers found nothing GBDE was designed and implemented as a cryp- which would disprove our hypothesis and found them tographic facility which operates on the rawdisk level generally compatible with it. with real-world compatible semantics. We believe that the noise was due to the complexity of The operational experience so far is good, and initial the situation, where GBDE’ssector cache, the disk user feedback has been very positive. drivers use of the disksort() function and the disk drives internal optimisations result in very significant changes The reviews have sofar agreed that the cryptographic in the order of things for evenvery minor changes to design is sound and very strong approching overkill for the driving program. most ‘‘normal’’applications. It is impossible to set a guaranteed protection time on anycryptographic algorithm, but we belive that if AES/128 retains at least 80 bits of effective strength, GBDE will protect its payload against anyterrestial Over Weapons and Laptop Computers’’ attacker for at least 10 years and probably also 25 years Report No. 02-27, August 2002, Office of the ... provided the attacker can not guess that the pass- Inspector General reports 317 laptops lost over phrase was: ‘‘Det er svært at spå, især om fremtiden.’’ 9 28 months. This is about 2% of the FBI’sinv en- tory,and a rate of one every three days. 16. Acknowledgements [IRAN] ‘‘Held Hostage in Iran - A First Tour LikeNo The author would liketothank: Other’’William J. Daugherty,1996 Robert Watson for organising funding and subsequently http://www.cia.gov/csi/studies/spring98/iran.html translating danglish to the proper red-tape protocol for [LEAR] the involved paper tigers. Shakespeare: King Lear,Act III, scene 2. Lucky Green for,inaddition to being incredibly patient [LOOPAES] and helpful, trusting his data to GBDE from a very The ‘‘loop AES’’facility in Linux: early stage. http://loop-aes.sourceforge.net/loop- David Wagner for his insight, reviewand in particular AES.README for reiterating the obvious until it finally registered. [PAL] ‘‘Permissive Action Links’’StevenM.Bellovin Bruce D. Evans for his enthusiastic resistance and com- http://www.research.att.com/˜smb/nsam-160/pal.html petent technical hindrance. It would have been much to [PGP] easy to cut corners if it were not for people likeBruce, ‘‘Pretty Good Privacy’’,originally written by Phil much appreciated! Zimmerman, subsequently giventhe dot-com Also thanks to all the people who have listened atten- runaround. A good place to start: tively while I have bored them with incoherent explana- http://www.pgpi.com tions and fuzzy drawings, and people who have helped [PRIV] stamp out ‘‘Danglish’’from this paper, Flemming Private communication. This policyisconfiden- Jacobsen and Gregory Sutter in particular. tial, so the source cannot be revealed. And finally a big thanks to the Fr eeBSD crew for [RSAMD5] putting up with me and my crazy ideas. RSA Laboratories’ bulletin Number 4 - Novem- ber 12, 1996 ‘‘Recent Results for MD2, MD4, 17. References and MD5’’ ftp://ftp.rsasecurity.com/pub/pdfs/bulletn4.pdf [CGD] The ‘‘cgd’’facility in NetBSD: [STEGFS] http://netbsd.gw.com/cgi-bin/man- ‘‘The Steganographic File System’’Paper by cgi?cgd+4+NetBSD-current Ross Anderson, Roger Needham and Adi Shamir. http://www.mcdonald.org.uk/StegFS/ [CPRM] CPRM was an attempt to mandate Digital [THEREG] Restriction Management features be imple- See for instance, these articles on ‘‘The Register’’ mented in all read-write storage devices, such as 24-03-2000: ‘‘Sneak thief steals state secrets in AT A hard disks. MI5 laptop’’06-04-2000: ‘‘Third secret-packed official notebook nicked’’18-04-2000: ‘‘FBI Forafull time line of its emergence and defeat admits loss of ’top secret’ laptop’’and on a see this summary article on ‘‘The Register’’ slightly different theme: 23-08-2001: ‘‘$15k 18-02-2001: ‘‘CPRM on ATA - Full Coverage’’ reward offered for lost laptop’’ By AndrewOrlowski (http://www.theregis- (http://www.theregister.co.uk): ter.co.uk): [YARROW] [DOJ0227] Yarrowisasecure pseudorandom number gener- ‘‘The Federal Bureau of Investigation’sControl ator designed by Bruce Schneier and John Kelsey. 9 ‘‘It is difficult to predict, in particular the future’’ —Robert Storm Petersen (1882 - 1949) http://www.counterpane.com/yarrow.html