Forensic Timeline Analysis of the Zettabyte File System

Forensic Timeline Analysis of the Zettabyte File System

1 Forensic Timeline Analysis of the Zettabyte File System Dylan Leigh <[email protected]> Supervisor: A.Prof. Hao Shi <[email protected]> Centre for Applied Informatics, College of Engineering and Science, Victoria University, Melbourne FS FS FS Abstract—During forensic analysis of computer FS FS FS systems, it is often necessary to construct a FS FS Stripe or Mirror chronological account of events, including when Volume Storage Pool RAID files were created, modified, accessed and deleted. Volume Timeline analysis is the process of collating and analysing this data, using timestamps from the filesystem and other sources such as log files and internal file metadata. (a) Traditional FS/Volume (b) ZFS Pool The Zettabyte File System (ZFS) uses a novel and complex structure to store file data and Figure 1. Structure of Traditional Filesystems and Volumes metadata across multiple devices. Due to the compared to ZFS Pools unusual structure and operation of ZFS, many existing forensic tools and techniques cannot be used to analyse ZFS filesystems. scalability, high availability and guaranteed data In this project, it has been demonstrated that integrity [2]. four of the internal structures of ZFS can be used as effective sources of timeline information. Methods to extract these structures and use them A. Pooled Storage for timeline analysis are provided, including al- gorithms to detect falsified file timestamps and Traditional volume management involves to determine when individual blocks of file data mapping individual filesystems to volumes and were last modified. disks;. filesystems are fixed in location on the disk or volume. I. OBJECTIVES ZFS uses a different approach: all disks are allocated to a “pool”; and filesystems are cre- The objectives of this project are to identify ated and mounted as required administratively. internal ZFS metadata which can be used for Data is stored within different devices by ZFS to timeline analysis, and provide methods to col- optimise performance and maintain redundant lect this metadata. It must also be determined copies to prevent loss of data from device how this metadata is related to the creation and failure. modification time of files, and how long any Creating a new ZFS filesystem on a pool is a useful metadata will be retained. Finally, this lightweight process similar to creating a direc- project should provide at least one effective tory on other filesystems. Figure1 shows this method which uses ZFS metadata to detect pooled structure of the Zettabyte File System falsified file timestamps. compared to “traditional” file systems. The knowledge obtained from the study should be used to provide practical guidelines and procedures for forensic investigators, and B. Guaranteed Data Integrity shared amongst the community to provide the Data integrity and prevention of corruption basis for future forensic research and develop- was a key design consideration of ZFS. ZFS ment of ZFS forensic software. It is intended checksums all data and metadata and transpar- that at least one publication is produced from ently detects and recovers from corruption using this research in the area of timeline forensics, the redundant copies. Even with a “pool” of a plus one possible publication related to the single device, the checksums detect corruption study of the ZFS on-disk format. which would occur without warning on other filesystems [2]. II. BACKGROUND: ZFS ZFS uses a copy-on-write transaction opera- ZFS was designed[1] to meet the needs of tion for all writes, which ensures that no data is enterprise server storage and surpass many lim- ever overwritten in place and the disk is never itations of existing filesystems. ZFS is a popu- left in an inconsistent state. All disk writes by lar filesystem for many applications due to its ZFS will either complete atomically or not at 2 Table I ZFS STRUCTURES AND COMPONENTS USED FOR TIMELINE ANALYSIS Structure Component Effective Uberblocks TXG and Timestamp Sometimes Object Number Sometimes File Objects Generation TXG Always File Data Birth TXG Always Block Pointers Vdev Number Never Figure 2. Simplified ZFS Object Tree Spacemaps Spacemap TXG Never all. This prevents corruption in the event of power failure or a hardware fault. ZFS can also PUBLICATIONS recover faster from such faults as no file system This research has directly produced one publi- check is required when the system is restarted. cation [9] discussing timeline analysis of ZFS; it has also been used in a related project to C. Internal Structure extend existing super-timeline software to to process ZFS events which has produced another ZFS stores all data and metadata within a publication by the author [10]. tree of objects, stored within a tree of blocks. Additional publications are planned for release Each “Block Pointer” reference to another block in 2015 using thesis material which has not can point to up to three identical copies of the yet been published, as well as additional case target block, which ZFS spreads across multiple studies based on more detailed simulations. devices in the pool in case of disk failure. Each physical device contains a “device la- bel” with an array of 128 “Uberblocks”, the root IV. EXPERIMENT METHODOLOGY of the block and object trees. Each transaction A. Primary Experiments is committed to a new Uberblock (overwriting the oldest one in the array). ZFS metadata was collected from test sys- Each Uberblock contains a Block Pointer tems which were configured to use ZFS pools pointing to the root of the object tree, the “Meta with 1,2,3,4,5,7 and 9 devices, in a variety of Object Set”. Apart from the metadata contained pool configurations. Disk activity was simulated within the label, all data and metadata is stored using data obtained from surveys of activity on within the object tree, including internal meta- corporate network file storage [11], [12], [13]. data such the free space maps and delete queues. Three experiments were conducted for each pool configuration: a control with no tampering, III. ZFS FORENSICS LITERATURE one where timestamps were altered by changing the system clock backwards briefly by one hour, Digital Forensic Implications Beebe et al’s and one where the Unix “touch” command was of ZFS [3] is an overview of the forensic differ- used to alter the modification and access times ences between ZFS and more “traditional” file of one file backwards by one hour. Tampering systems; in particular it outlines many forensic was conducted 5 hours into each simulation. advantages and challenges of ZFS. However, Each simulation was conducted for 24 hours. it is based on theoretical examination of the Metadata was collected every 30 minutes, using documentation and source code, noting that “we the ZDB (ZFS debugger) tool. After each simu- have yet to verify all statements through our lation had concluded, a final set of metadata was own direct observation and reverse engineering collected, which was examined to determine if of on-disk behavior”. any timestamp falsification could be detected. Data recovery on ZFS has been examined TableI shows the structures and the components extensively by other researchers. Max Bruning’s of those structures which were used for timeline On Disk Data Walk [4] demonstrates how to analysis after each simulation. access ZFS data on disk from first principles; Experiments were repeated as time permitted; Bruning has also demonstrated techniques for in total 85 simulations were completed (not recovering deleted files[5]. including the “Large File” experiments) and Li [6], [7] presents an enhancement to the over 110GB of ZFS metadata was collected. ZDB filesystem “debugger” which allows it to trace target data within ZFS media without mounting the file system. Finally, some ZFS B. Large File Experiments anti-forensics techniques are examined by Ci- A second set of 22 experiments were con- fuentes and Cano [8]. ducted to verify behaviour of files larger than 3 128KB observed during the primary experi- copy-on-write mechanism of ZFS and the ments. These were conducted in a similar man- high rate at which Spacemaps are con- ner to the primary experiments, with the addi- densed. tion of a “large file” which was created at the start of the simulation and appended to every A. Other Contributions second. The potential for false positives (from times- No tampering was used; instead the analysis tamps which are physically false but logically after the simulation compared internal metadata true, such as from copying old files) are dis- from the large file with other files. By matching cussed, as well as the effects of file attribute the “birth” Transaction Group ID (TXG) from changes and tampering with internal ZFS meta- individual blocks of the large file with birth data. TXG from highest level block pointers in other This project also contributes to forensic prac- files, it is possible to determine the time at tice by providing methods and commands for which the individual blocks of the large file are analysis of ZFS devices, including general com- written, which cannot generally be determined puter forensic procedures for ZFS as well as via other means. those specific to timeline analysis. In particular, techniques to safely import a ZFS pool from V. RESULTS AND KEY FINDINGS forensic disk images without corrupting the host TableII shows the results of performing system or the system under examination are timeline analysis on all experiments involving provided. tampering. In experiments without tampering; VI. FUTURE WORK all methods were negative and there were no false positives. Although this is the first time internal ZFS The most significant findings for ZFS time- metadata has been examined for timeline foren- line analysis are listed below: sics, the scope of this study is extremely limited and much more work needs to be done to 1) Comparing the birth TXG ID from the validate the techniques presented here under highest level file data Block Pointers is an varied conditions.

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    4 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us