DRAFT DRAFT FOR COMMENTS DRAFT

1 January 19, 2005 2 3 Deleted File Recovery Tool Specification 4 5 6 Draft for SC Review of Version 1.0 7 8 9 10 11 12 13 14 15 16

DFR-Specification-SC.doc Page 1 of 131/19/2005 1:29 PM DRAFT DRAFT FOR COMMENTS DRAFT DRAFT DRAFT FOR COMMENTS DRAFT

16 17 CONTENTS 18 19 20 1 Preface...... 3 21 2 Introduction...... 3 22 3 Purpose...... 5 23 4 Scope...... 6 24 5 Background...... 7 25 5.1 Abstract Model of a ...... 7 26 5.2 File System Properties ...... 7 27 5.3 References (Informative) ...... 9 28 6 Definitions...... 10 29 7 Requirements ...... 12 30 7.1 Requirements for Mandatory Features...... 12 31 7.2 Requirements for Optional Features ...... 13 32 33 34 35 36

DFR-Specification-SC.doc Page 2 of 131/19/2005 1:29 PM DRAFT DRAFT FOR COMMENTS DRAFT DRAFT DRAFT FOR COMMENTS DRAFT

36

37 1 Preface 38 39 There is a critical need in the law enforcement community to ensure the reliability of 40 computer forensic tools. A capability is required to ensure that forensic software tools 41 consistently produce accurate and objective results. The goal of the Computer Forensic 42 Tool Testing (CFTT) project at the National Institute of Standards and Technology 43 (NIST) is to establish a methodology for testing computer forensic software tools by 44 development of general tool specifications, test procedures, test criteria, test sets, and test 45 hardware. The results provide the information necessary for toolmakers to improve tools, 46 for users to make informed choices about acquiring and using computer forensics tools, 47 and for interested parties to understand the tools capabilities. Our approach for testing 48 computer forensic tools is based on well-recognized international methodologies for 49 conformance testing and quality testing. This project is further described at 50 http://www.cftt.nist.gov/. 51 52 The CFTT is a joint project of the National Institute of Justice, the research and 53 development organization of the U.S. Department of Justice; the National Institute of 54 Standards and Technology Office of Law Enforcement Standards and Information 55 Technology Laboratory, and other agencies, such as the Technical Support Working 56 Group and the IRS. The entire computer forensics community participates in the 57 development of the specifications and test methods by commenting on drafts as they are 58 published on the CFTT website. 59

60 2 Introduction 61 62 Frequently during a forensic examination, data is discovered on the target media that is 63 not part of any active or visible file. Although this data can still be examined (e.g. string 64 searching), as would be done for unallocated space, if the data associated with a 65 particular file could be identified and recovered in its original form, this could provide 66 additional useful information. An example of this would be where a graphics file, if 67 undeleted and recovered, could be viewed—potentially providing more information than 68 a simple string search. Many of the forensic tools used by investigators identify files that 69 have been deleted, and allow the operator to undelete them. This may allow the 70 investigator to examine the file in the original format (e.g. a graphics file viewer), or 71 identify when a particular file was deleted and its original location. 72 73 To reconstruct files that have been deleted within a forensic setting, three fundamental 74 problems have to be addressed by a deleted file recovery (DFR) tool. First, the files that 75 have been deleted have to be identified and located. Although this could be as simple as 76 scanning directory entries for a particular key (e.g. ‘0xE5’ in Fat 32) it may be a more 77 complex process. This process is paramount for any recovery tool to work correctly, for 78 if files are not correctly identified and located, they will not be part of the recovery 79 process. DFR-Specification-SC.doc Page 3 of 131/19/2005 1:29 PM DRAFT DRAFT FOR COMMENTS DRAFT DRAFT DRAFT FOR COMMENTS DRAFT

80 81 The second problem, from a file system perspective, is that the data to be recovered is 82 latent, and needs the assistance of a tool to the data in a form that is usable by the 83 investigator. As with most other latent , since the results depend on the 84 output of a particular tool, it must be shown to operate correctly (i.e. undelete files 85 correctly). 86 87 The third and final fundamental problem that needs to be addressed by deleted file 88 recovery tools is the potential uncertainty in any recovery effort. A common problem 89 with all residual data recovery is that since residual data is no longer maintained by the 90 file system, there is a reduced level of confidence in the information recovered. 91 Specifically with deleted file recovery, the data recovered may be commingled with data 92 from other deleted files, allocated files, or even from non allocated space. 93

DFR-Specification-SC.doc Page 4 of 131/19/2005 1:29 PM DRAFT DRAFT FOR COMMENTS DRAFT DRAFT DRAFT FOR COMMENTS DRAFT

93 3 Purpose 94 This document defines the functional requirements and specifications for deleted file 95 recovery tools used within forensics investigations. 96 97 These requirements were developed through a combination of processes including but not 98 limited to deleted file recovery research, personal interviews with forensic investigators, 99 and working with a focus group of individuals who are experts in the field of forensic 100 investigation and depend on the results of deleted file recovery tools. Additionally, as 101 this document evolves, feedback will be incorporated from a variety of sources, and will 102 be posted to our web site at http://www.cftt.nist.gov for comments. 103 104 It is important to note that this document is limited to the defining of functional 105 specifications and requirements for deleted file recovery tools and processes. Additional 106 documents used in testing such as the assertions, test cases, and actual testing results will 107 be developed. 108 109

DFR-Specification-SC.doc Page 5 of 131/19/2005 1:29 PM DRAFT DRAFT FOR COMMENTS DRAFT DRAFT DRAFT FOR COMMENTS DRAFT

109 4 Scope 110 The scope of this specification and requirements document is limited to software that 111 identifies and recovers deleted files. The tools examined will be limited to those that are 112 currently used within the general computer forensic community, as indicated from 113 research and feedback from various focus groups. The proper or improper use of a tool is 114 not within the scope of this specification. 115 116 The specifications and requirements for deleted file recovery are high-level, and are 117 based on the following assumptions. 118 119 General: 120 • The deleted file recovery tools are used in a forensically sound environment. 121 • The individuals using these tools adhere to forensic principles, and have control 122 over the environment in which the tools are used. 123 124 Tool Functions: 125 • Only deleted file recovery tools and functions are examined. 126 • Other types of latent data recovery are not part of this specification. 127 128 Tool Environment: 129 • Only the file systems supported by a given tool are tested. 130 • Each file system tested is correctly configured, and is accessible if mounted on the 131 appropriate and receptive . 132 • Only commonly used file systems will be part of the testing parameters. 133 Encrypted, compressed, and distributed file systems are outside the scope of this 134 document. 135 136 Deleted File State: 137 • It is assumed that the files used to test the deleted file recovery process were 138 created and deleted in a process similar to how an end-user would create and 139 delete files. 140 • Files that were specifically corrupted, modified, or otherwise manipulated to 141 appear deleted are outside of the scope of this document. 142

DFR-Specification-SC.doc Page 6 of 131/19/2005 1:29 PM DRAFT DRAFT FOR COMMENTS DRAFT DRAFT DRAFT FOR COMMENTS DRAFT

142 5 Background 143 This section provides the technical background needed to discuss deleted file recovery 144 tools and functions. The first section outlines a brief high-level model of a file system. 145 Section two covers the two most common properties of file systems, which are the basis 146 for most deleted file recovery efforts. Section three outlines some of the reference 147 material used to understand file systems, and how various implementations may affect 148 deleted file recovery. 149

150 5.1 Abstract Model of a File System 151 A file system is used to store data for access by a computer. The data is normally stored 152 within a tree-like structured hierarchy of directories and files. File system metadata 153 contains information to describe and locate every file within a given file system. Some 154 metadata resides in directory entries, but additional metadata may reside in special files 155 (e.g. NTFS $MFT) or other locations (e.g. i-nodes). 156 157 When a file or directory is deleted, normally the associated metadata entry is flagged as 158 being no longer active. However, in most file systems, neither the metadata associated 159 with the file nor the actual content is completely removed. This creates a situation where 160 there is residual metadata (metadata remaining after a delete has occurred) that may still 161 be accessible. However, depending on the original format and structure of the metadata, 162 not all of it may be reachable. This would be the case for a fragmented directory, where 163 the first data block of directory entries would be reachable even after deletion, but the 164 remaining data blocks of directory entries are not. 165

166 5.2 File System Properties 167 File systems are designed to allow an operating system to have access to secondary 168 storage in a manner that is both efficient and timely, as in the past, storage devices have 169 been expensive, and slow (when compared to Random Access Memory). Accessing the 170 hard drive efficiently, although implemented differently in each file system, tends to have 171 some side effects that can be exploited to recover deleted files. Two of the most key 172 properties are contiguous writes, and the conservative nature of file system activity. 173 174 File systems use contiguous writes if possible: Most operating systems write data to the 175 drive in a contiguous set of data blocks or sectors if available. A given data file, provided 176 it is not modified after being written to the disk, tends to have all the data in sequentially 177 accessible sectors. This speeds up both the write and read processes, since the heads on 178 the drive do not need to move to different areas on the disk to write or read data. This 179 plays a role in data recovery, in that data from a given file, even deleted, has a high 180 likelihood of being grouped together on the disk in contiguous data blocks. 181 182 File systems are conservative: This characteristic implies that, in order to be as fast and 183 efficient as possible, file systems perform many activities with a minimum of changes or 184 overhead. In the case of , in most situations, only a logical deletion is DFR-Specification-SC.doc Page 7 of 131/19/2005 1:29 PM DRAFT DRAFT FOR COMMENTS DRAFT DRAFT DRAFT FOR COMMENTS DRAFT

185 performed—meaning that the actual data is not erased, but the metadata that indexes the 186 information is changed, flagged or removed. By using this technique, a file’s content, no 187 matter how large, can be “deleted” by simply modifying or removing entries in a disk 188 index. The simplest example of this is how a windows FAT 32 file system deletes files. 189 It locates the directory entry of the file to be deleted, and changes the beginning character 190 in the file name to a ‘0xE5’ hex value, and then zeros the . This 191 indicates to the file system that a file has been deleted, and is no longer accessible (or 192 maintained) by the file system—yet most of the metadata and the entire file content 193 remain. 194 195 For the most part, these common attributes assist in the recovery of data on the drive, 196 regardless of the type of file system the data resides on. Many tools leverage the residual 197 metadata in locating the potential file system objects, and then recover the largest amount 198 of contiguous data. 199 200 201

DFR-Specification-SC.doc Page 8 of 131/19/2005 1:29 PM DRAFT DRAFT FOR COMMENTS DRAFT DRAFT DRAFT FOR COMMENTS DRAFT

201 5.3 References (Informative) 202 Documents and research that were of particular relevance to the deleted file recovery, 203 background information, and the specifications and requirements document. It is 204 important to note that these references were primarily informative. 205 206 Carrier, (2003). “File System Analysis Techniques: Sleuth Kit Reference Document.” 207 Available at http://www.sleuthkit.org/sleuthkit/docs/ref_fs.html. 208 209 Crane, (1999). “Linux Ext2fs mini-HOWTO.” Available at 210 http://www.tldp.org/HOWTO/Ext2fs-Undeletion.html. 211 212 Erdelsky, (1993). “A Description of the DOS File System.” Available at 213 http://www.alumni.caltech.edu/~pje/dosfiles.html. 214 215 Himmer, (2000). “File Systems HOWTO.” Available at 216 http://www.faqs.org/docs/Linux-HOWTO/Filesystems-HOWTO.html. 217 218 Microsoft, (2004). “Description of the FAT32 File System.” Available at 219 http://support.microsoft.com/default.aspx?scid=http://support.microsoft.com:80/support/k 220 b/articles/q154/9/97.asp&NoWebContent=1. 221 222 NIST, (2004). “General Test Methodology for Computer Forensic Tools,” Available at 223 http://www.cftt.nist.gov/ 224

DFR-Specification-SC.doc Page 9 of 131/19/2005 1:29 PM DRAFT DRAFT FOR COMMENTS DRAFT DRAFT DRAFT FOR COMMENTS DRAFT

224 6 Definitions 225 Included here are definitions of terms used in this specification document. Although 226 there may be commonly accepted definitions for some of the terms, the context in which 227 they are applied may change their meaning. 228 229 Data Block: File system specific data allocation unit (block), usually 512 bytes or a 230 multiple of it. Some file systems may use other terms to describe a data block 231 such as Sector (in FAT file systems). 232 233 Deleted Block Pool (DBP): A conceptual collection of data blocks that were originally 234 part of an FS-Object, subsequently deleted, and have not been reallocated or 235 reused. 236 237 Documentation: The collection of materials available to the operator of a given 238 undeletion tool or function that describes its usage, purpose, operation, or system 239 requirements. 240 241 Estimated Content: A tool Estimates Content if it attempts to recover the content of a 242 deleted file, beyond what is explicitly identified in the residual metadata. 243 244 File System Object (FS-Object): The fundament objects to store and organize 245 information within a file system. The most common examples of FS-Objects 246 would be Files and Directories. 247 248 Logical Order: The content of a FS-Object as it would be sequentially accessed. 249 250 Logical Deletion: When an FS-Object is deleted through metadata manipulation, 251 without the actual object data being erased. For example, in FAT32, when an 252 object is deleted, the directory entry is flagged, and the file allocation entries are 253 cleared—the actual file data is not removed or erased. 254 255 Metadata: The associated periphery information or attributes that describe a FS-Object 256 such as name, time-based metadata (creation, modification, and last accessed 257 times), access rights, ownership, and location. 258 259 Recovered Object (RO): The object constructed by a Deleted File Recovery Tool 260 through examining residual metadata. Due to the potential for corruption inherent 261 with data that is no longer maintained by a file system, the RO and associated 262 attributes may not completely match the original FS-Object. However, the RO is 263 a sequence of data blocks with the following properties: 264 1. Each RO shall contain all data blocks identified from the residual metadata. 265 2. Each RO shall consist of only data blocks from the Deleted Block Pool. 266 267 Residual Metadata: The metadata that remains after a FS-Object has been deleted. In 268 some cases there may exist more residual metadata than can be accessed. For example, if

DFR-Specification-SC.doc Page 10 of 131/19/2005 1:29 PM DRAFT DRAFT FOR COMMENTS DRAFT DRAFT DRAFT FOR COMMENTS DRAFT

269 a directory is fragmented, when it is deleted, usually only the first data block of metadata 270 is accessible, while the remaining fragmented directory information is not. 271 272

DFR-Specification-SC.doc Page 11 of 131/19/2005 1:29 PM DRAFT DRAFT FOR COMMENTS DRAFT DRAFT DRAFT FOR COMMENTS DRAFT

272 7 Requirements 273 The requirements section is divided into two parts. The first, Requirements for 274 Mandatory Features, are those features that are critical to the operation of the given tool, 275 and must be present. The second part is the Requirements for Optional Features. These 276 features, on the condition they are present, will be used to report on the tool capabilities. 277 If a feature is not present, then requirements for those features will not be tested.

278 7.1 Requirements for Mandatory Features 279 All deleted file recovery tools must support the following requirements. 280 281 DFR--01 The tool shall support recovery efforts on file systems identified by the 282 Documentation. 283 284 DFR-RM-02 The tool shall identify all deleted File System-Object entries in residual 285 metadata. 286 287 DFR-RM-03 The tool shall report errors in constructing a Recovered Object. 288 289 DFR-RM-04 The tool shall construct a Recovered Object for each deleted File System- 290 Object entry in residual metadata. 291 292 DFR-RM-05 Each Recovered Object shall include all non-allocated data blocks 293 identified in a residual metadata entry. 294 295 DFR-RM-06 Each Recovered Object shall consist only of data blocks from the Deleted 296 Block Pool. 297 298 299 300

DFR-Specification-SC.doc Page 12 of 131/19/2005 1:29 PM DRAFT DRAFT FOR COMMENTS DRAFT DRAFT DRAFT FOR COMMENTS DRAFT

301 302

303 7.2 Requirements for Optional Features 304 The following define conditional requirements for optional features. The requirements 305 below are used to report on the tool capabilities. If the tool does not provide the 306 capability defined, then the requirement will not apply. 307 308 If the residual metadata for deleted files in a given file system does not identify all file 309 allocation units in the deleted file, the DRF tool may optionally create a recovered object 310 that estimates the likely content of an original file identified in the residual metadata by 311 extrapolation from drive content. This is referred to as Estimates Content. 312 313 DFR-RO-01: The tool shall report Recovered Object attributes that are recoverable from 314 residual metadata. 315 316 DFR-RO-02: If the tool Estimates Content then each recovered data block shall be 317 assigned to a Recovered Object no more than once. 318 319 DFR-RO-03: If the tool Estimates Content then the Recovered Object shall consist only 320 of data blocks from the original File System-Object identified in the residual 321 metadata. 322 323 DFR-RO-04: If the tool Estimates Content then any data blocks in the Recovered Object 324 shall be in the same logical order as in the original File System-Object identified in 325 the residual metadata. 326 327 DFR-RO-05: If the tool Estimates Content then the Recovered Object shall consist of 328 the some number of blocks as the original File System-Object. 329 330 331

DFR-Specification-SC.doc Page 13 of 131/19/2005 1:29 PM DRAFT DRAFT FOR COMMENTS DRAFT