A Study of Disk Sanitization Practices
Data Forensics

Remembrance of Data Passed: A Study of Disk Sanitization Practices

Simson L. Garfinkel and Abhi Shelat, Massachusetts Institute of Technology

Published by the IEEE Computer Society, 1540-7993/03/$17.00 © 2003 IEEE. IEEE Security & Privacy, January/February 2003, http://computer.org/security/

Many discarded hard drives contain information that is both confidential and recoverable, as the authors’ own experiment shows. The availability of this information is little publicized, but awareness of it will surely spread.

A fundamental goal of information security is to design computer systems that prevent the unauthorized disclosure of confidential information. There are many ways to assure this information privacy. One of the oldest and most common techniques is physical isolation: keeping confidential data on computers that only authorized individuals can access. Most single-user personal computers, for example, contain information that is confidential to that user.

Computer systems used by people with varying authorization levels typically employ authentication, access control lists, and a privileged operating system to maintain information privacy. Much of information security research over the past 30 years has centered on improving authentication techniques and developing methods to assure that computer systems properly implement these access control rules.

Cryptography is another tool that can assure information privacy. Users can encrypt data as it is sent and decrypt it at the intended destination, using, for example, the secure sockets layer (SSL) encryption protocol. They can also encrypt information stored on a computer’s disk so that the information is accessible only to those with the appropriate decryption key. Cryptographic file systems¹⁻³ ask for a password or key on startup, after which they automatically encrypt data as it’s written to a disk and decrypt the data as it’s read; if the disk is stolen, the data will be inaccessible to the thief. Yet despite the availability of cryptographic file systems, the general public rarely seems to use them.

Absent a cryptographic file system, confidential information is readily accessible when owners improperly retire their disk drives. In August 2002, for example, the United States Veterans Administration Medical Center in Indianapolis retired 139 computers. Some of these systems were donated to schools, while others were sold on the open market, and at least three ended up in a thrift shop where a journalist purchased them. Unfortunately, the VA neglected to sanitize the computers’ hard drives; that is, it failed to remove the drives’ confidential information. Many of the computers were later found to contain sensitive medical information, including the names of veterans with AIDS and mental health problems. The new owners also found 44 credit card numbers that the Indianapolis facility used.⁴

The VA fiasco is just one of many celebrated cases in which an organization entrusted with confidential information neglected to properly sanitize hard disks before disposing of computers. Other cases include:

• In the spring of 2002, the Pennsylvania Department of Labor and Industry sold a collection of computers to local resellers. The computers contained “thousands of files of information about state employees” that the department had failed to remove.⁵
• In August 2001, Dovebid auctioned off more than 100 computers from the San Francisco office of the Viant consulting firm. The hard drives contained confidential client information that Viant had failed to remove.⁶
• A Purdue University student purchased a used Macintosh computer at the school’s surplus equipment exchange facility, only to discover that the computer’s hard drive contained a FileMaker database containing the names and demographic information for more than 100 applicants to the school’s Entomology Department.
• In August 1998, one of the authors purchased 10 used computer systems from a local computer store. The computers, most of which were three to five years old, contained all of their former owners’ data. One computer had been a law firm’s file server and contained privileged client–attorney information. Another computer had a database used by a community organization that provided mental health services. Other disks contained numerous personal files.
• In April 1997, a woman in Pahrump, Nevada, purchased a used IBM computer for $159 and discovered that it contained the prescription records of 2,000 patients who filled their prescriptions at Smitty’s Supermarket pharmacy in Tempe, Arizona. Included were the patients’ names, addresses, and Social Security numbers and a list of all the medicines they’d purchased. The records included people with AIDS, alcoholism, and depression.⁷

These anecdotal reports are interesting because of their similarity and their relative scarcity. Clearly, confidential information has been disclosed through computers sold on the secondary market more than a few times. Why, then, have there been so few reports of unintended disclosure? We propose three hypotheses:

• Disclosures of this type are exceedingly rare
• Confidential information is disclosed so often on retired systems that such events are simply not newsworthy
• Used equipment is awash with confidential information, but nobody is looking for it, or else there are people looking, but they are not publicizing that fact

To further investigate the problem, we purchased more than 150 hard drives on the secondary market. Our goal was to determine what information they contained and what means, if any, the former owners had used to clean the drives before they discarded them. Here, we present our findings, along with our taxonomy for describing information recovered or recoverable from salvaged drives.

The hard drive market

Everyone knows that there has been a dramatic increase in disk-drive capacity and a corresponding decrease in mass-storage costs in recent years. Still, few people realize how truly staggering the numbers actually are. According to the market research firm Dataquest, nearly 150 million disk drives will be retired in 2002, up from 130 million in 2001. Although many such drives are destroyed, a significant number are repurposed to the secondary market. (This market is rapidly growing as a supply source for even mainstream businesses, as evidenced by the 15 October cover story in CIO Magazine, “Good Stuff Cheap: How to Use the Secondary Market to Your Enterprise’s Advantage.”⁸)

According to the market research firm IDC, the worldwide disk-drive industry will ship between 210 and 215 million disk drives in 2002; the total storage of those disk drives will be 8.5 million terabytes (8,500 petabytes, or 8.5 × 10¹⁸ bytes). While Moore’s Law dictates a doubling of integrated circuit transistors every 18 months, hard-disk storage capacity and the total number of bytes shipped are doubling at an even faster rate. Table 1 shows the terabytes shipped in the global hard-disk market over the past decade.

Table 1. Tbytes shipped per year on the global hard-disk market. (Courtesy of IDC research)

YEAR    TBYTES SHIPPED
1992             7,900
1993            16,900
1994            33,000
1995            77,800
1996           155,900
1997           344,700
1998           698,600
1999         1,500,000
2000         3,200,000
2001         5,200,000
2002         8,500,000

It’s impossible to know how long any disk drive will remain in service; IDC estimates the typical drive’s lifespan at five years. As Table 2 shows, Dataquest estimates that people will retire seven disk drives for every 10 that ship in the year 2002; this is up from a retirement rate of three for 10 in 1997 (see Figure 1). As the VA Hospital’s experience demonstrates, many disk drives that are “retired” by one organization can appear elsewhere. Unless retired drives are physically destroyed, poor information security practices can jeopardize information privacy.

Table 2. Global hard-disk market. (Courtesy of Dataquest)

YEAR    UNITS SHIPPED     COST PER MEGABYTE    RETIREMENTS       RETIREMENT RATE*
        (IN THOUSANDS)    TO END USER          (IN THOUSANDS)    (IN PERCENT)
1997    128,331           0.1060                40,151           31.2
1998    143,927           0.0483                59,131           41.0
1999    174,455           0.0236                75,412           43.2
2000    199,590           0.0111               109,852           55.0
2001    195,601           0.0052               130,013           66.4
2002    212,507           0.0025               149,313           70.2

* ratio of drives retired to those shipped each year

Figure 1. Worldwide hard-disk market in units shipped versus retired, by year (1997 to 2002; vertical axis in thousands of units).

The ubiquity of hard disks

Compared with other mass-storage media, hard disks pose special and significant problems in assuring long-term data confidentiality. One reason is that physical and electronic standards for other mass-storage devices have evolved rapidly and incompatibly over the years, while the Integrated Drive Electronics/Advanced Technology Attachment (IDE/ATA) and Small Computer System Interface (SCSI) interfaces have maintained both forward and backward compatibility. People use hard drives that are 10 years old with modern consumer computers by simply plugging them in: the physical, electrical, and logical standards have been remarkably stable.

This unprecedented level of compatibility has sustained both formal and informal secondary markets for used hard drives. This is not true of magnetic tapes, optical disks, flash memory, and other forms of mass storage, where there is considerably more diversity. With current devices, people typically cannot use older media due to format changes (a digital audio tape IV drive, for example, cannot read a DAT I tape, nor can a 3.5-inch disk drive read an 8-inch floppy).

The sanitization problem

Most techniques that people use to assure information privacy fail when data storage equipment is sold on the secondary market. For example, any protection that the computer’s operating system offers is lost when someone removes the hard drive from the computer and installs it in a second system that can read the on-disk formats but doesn’t honor the access control lists.
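
The point that operating-system protections do not travel with the disk can be illustrated with a small sketch. It is not taken from the study: the two-sector image layout, the file name, the user names, and the helper functions build_toy_image, os_read, and raw_scan are all hypothetical, chosen only to show that an access check lives in the software that reads the disk rather than in the bytes stored on it.

```python
import re

SECTOR = 512  # bytes per sector in this toy image (an assumption)


def build_toy_image() -> bytes:
    """Build a hypothetical two-sector 'disk image' (not a real file system)."""
    # Sector 0: a made-up directory entry recording owner-only access.
    directory = b"name=payroll.txt;owner=alice;mode=0600;sector=1"
    # Sector 1: the file's contents, stored as plain, unencrypted bytes.
    data = b"Employee: J. Doe, salary 52000"
    return directory.ljust(SECTOR, b"\x00") + data.ljust(SECTOR, b"\x00")


def os_read(image: bytes, user: str) -> bytes:
    """Read the file as a cooperating OS would: consult the entry first."""
    raw_entry = image[:SECTOR].rstrip(b"\x00")
    entry = dict(field.split(b"=") for field in raw_entry.split(b";"))
    if entry[b"owner"] != user.encode():
        raise PermissionError(f"{user} may not read {entry[b'name'].decode()}")
    start = int(entry[b"sector"]) * SECTOR
    return image[start:start + SECTOR].rstrip(b"\x00")


def raw_scan(image: bytes) -> list:
    """Ignore the directory entirely and return every printable run of
    six or more bytes, much as a second system could do with raw sectors."""
    return re.findall(rb"[\x20-\x7e]{6,}", image)


if __name__ == "__main__":
    image = build_toy_image()
    try:
        os_read(image, "mallory")
    except PermissionError as err:
        print("OS-mediated read refused:", err)
    for run in raw_scan(image):
        print("raw scan found:", run.decode())
```

In this sketch the OS-mediated read refuses the user who does not own the file, while the raw scan returns both the directory text and the file contents, because nothing on the disk itself enforces the recorded permission.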