Forensic Analysis of the Resilient File System (ReFS) Version 3.4

Paul Prade (a), Tobias Groß (a,∗), Andreas Dewald (b,a,∗)

(a) Friedrich-Alexander University Erlangen-Nürnberg, Germany
(b) ERNW Research GmbH, Heidelberg, Germany

Abstract

ReFS is a modern file system that is developed by Microsoft, and its internal structures and behavior are not officially documented. Even though there exist some analysis efforts in deciphering its structures, some of these findings have already become deprecated and cannot be applied to current ReFS versions anymore. In this work, general concepts and internal structures found in ReFS are examined and documented. Based on the structures and the processes by which they are modified, approaches to recover (deleted) files from ReFS formatted file systems are shown. We also evaluated our implementation and the allocation strategy of ReFS with respect to accuracy, runtime and the ability to recover older file states.

Keywords: Digital forensics, File systems, ReFS

1. Introduction

Storage media analysis is a common task in the field of digital forensics when PCs or mobile devices are analyzed. Investigators have to rely on the proper functioning of their tools to provide them with a correct interpretation of traces. File systems have to be interpreted and presented when analyzing storage media; doing this manually is unfeasible. From this situation emerges the need for digital forensic tools to ideally support all of the file systems that are currently in use and may be encountered in a forensic analysis.

Limitations of classical file systems such as low performance, limited capacity or unsuitability for SSD drives led to the development of new file systems like APFS or ReFS. These new file systems have to be supported in open-source forensic tools and documentation. Transparency in forensic processes and tools is important when digital evidence is used in severe cases. That is even more important when file systems are proprietary, as the above-mentioned ones are. With this work, we want to provide the forensic community with tools and information to properly analyze ReFS partitions.

1.1. Related Work

There exists related work, presented in the following, that analyzed ReFS before, but it looked into versions 1.1 and 1.2. Some core concepts still exist in the latest version 3.4, but major changes were also applied between these versions, which renders older work incompatible with newer versions of ReFS.

Sinofsky (2012) explains key goals as well as features like checksums for meta- and content-data and the Copy-On-Write (COW) mechanism that are provided by ReFS. He also mentions a generic key-value interface which is used by ReFS to manage on-disk structures.

In a working report, Green (2013) analyzed the internal structures of ReFS and created an overview of identified structures. This report skipped how the internal key-value structure of ReFS works. He also looked into the recovery of deleted items, but only describes the recycle bin mechanism of the Windows file explorer, which is not specific to ReFS.

Another unofficial draft of structures found under ReFS is provided by Metz (2013). This work strongly focuses on the low-level presentation of how data structures in ReFS are composed and was the first to vaguely describe the key-value store. The author implemented a library which is based on his findings.

Georges (2018) aimed to develop a tool which outputs results comparable to EnCase. He also described how the allocation status of clusters is managed. With the developed tool, file extraction is possible for ReFS v1.2.

Nordvik et al. (2019) also examined data structures of ReFS. They mainly focused on v1.2 and v3.2 of ReFS. At the end they tested their findings on v3.4, too. For v3.2 they came to the conclusion that the versions did not differ much from v1.2 and that "the structures are almost identical". In our work we come to the conclusion that between v3.4 and v1.2 many data structures were added, deprecated or changed. We also found new functionality like virtual addresses which have to be taken into account when extracting data from the disk. In contrast to their work, we discovered more details and also investigate how the Copy-On-Write mechanism of ReFS is implemented and propose strategies to recover old versions of files.

∗Corresponding author
Email addresses: [email protected] (Tobias Groß), [email protected] (Andreas Dewald)

Preprint submitted to DFRWS EU, December 13, 2019

1.2. Our Contribution

In this work, we analyzed the proprietary Microsoft Windows Resilient File System (ReFS) version 3.4 and contribute the following:

• We analyzed the internal structures and mechanics of ReFS v3.4 and documented them in a technical report (Prade et al. (2019)).
• We extended The Sleuth Kit from Carrier to support ReFS.
• We propose strategies for recovering deleted files.
• We implemented a page carver which allows the reconstruction of deleted files and older file states.
• We evaluated the correctness of our implementation with different ReFS partitions.
• We analyzed the allocation strategy of ReFS v3.4, which affects the recoverability of older file states, and compared the findings with our carver.

Table Identifier   Table Name
0x2                Object ID Table
0x21               Medium Allocator Table
0x20               Container Allocator Table
0x1                Schema Table
0x3                Parent Child Table
0x4                Object ID Table, duplicate
0x5                Reference Count Table
0xb                Container Table
0xc                Container Table, duplicate
0x6                Schema Table, duplicate
0xe                Container Index Table
0xf                Integrity State Table
0x22               Small Allocator Table

Table 1: Tables referenced by the checkpoint structure

2. Background

The Sleuth Kit. The Sleuth Kit (TSK) is an open-source file system forensic tool. It was developed with high portability and extensibility in mind (Altheide and Carvey, 2011, p. 42). The highly extensible design of TSK allows developers to extend its functionality as well as the types of file systems it supports. TSK is structured into multiple layers of abstraction that map how data is stored on storage media. The File System Layer is of most importance within this work, as this is the abstraction where file systems such as FAT, NTFS and also ReFS are settled.

Copy-On-Write. Copy-On-Write (COW) is an update policy that may be used to alter data on a storage medium. A COW update policy makes sure that data is never updated in place. Whenever the content of a block should be altered, the block is read into memory, modified, and its content is written to an alternate location on the storage medium. When the system crashes while a block is updated through a COW policy, the old state of the data remains untouched and still persists. COW offers a simple strategy to enforce atomicity in write operations and to ensure the integrity of data structures (Rodeh et al., 2013, p. 15).

3. ReFS Internal Structure

We analyzed the structure of ReFS and classified identified structures into the categories File System, Content, Metadata, File Name, and Application introduced by Carrier (2005). Detailed structure descriptions of ReFS which are important for forensic analyses can be found in our technical report (Prade et al. (2019)). In this work we describe only the structures important for enumerating and recovering files and folders. The ordering corresponds to the order in which the structures are encountered when starting to interpret a ReFS partition.

3.1. Checkpoint

ReFS partitions have 2 checkpoint structures which are written alternately, to have at least one valid checkpoint in case of a crash. The checkpoint of a file system is the first structure that holds the current clock values, which indicate the latest checkpoint structure. Additionally, the checkpoint also contains the current log sequence number (LSN) which identifies the last entry that was written into the redo log. Since ReFS stores changes to the file system in memory and writes them at a later point as a batch, the system may crash in this time window. A system crash reverts the system to its last valid state and would discard all changes executed in the meantime. However, every transaction that was successfully performed and is part of the batch is also written to a sequential log file that is discussed in our report. To know from where to start with the redo operations, it is essential to save an identifier for the last transaction that was part of the current checkpoint.

The most important structures found in the checkpoint are the table references. For the sake of thoroughness, table 1 presents an overview of all tables referenced by the checkpoint along with their table identifiers.
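The selection of the most recent valid checkpoint can be expressed compactly. The following Python sketch is illustrative only: the field offset CLOCK_OFFSET and the validity predicate are hypothetical placeholders, not documented ReFS constants; the sketch merely shows the pick-the-higher-clock logic described above.

```python
import struct

CLOCK_OFFSET = 0x08  # hypothetical offset of the clock value inside a checkpoint page

def read_clock(page: bytes) -> int:
    """Extract the (hypothetical) 64-bit clock value from a checkpoint page."""
    return struct.unpack_from("<Q", page, CLOCK_OFFSET)[0]

def pick_latest_checkpoint(cp_a: bytes, cp_b: bytes, is_valid) -> bytes:
    """Return the checkpoint with the higher clock value.

    `is_valid` is a caller-supplied predicate (e.g. a self-checksum test);
    if only one copy verifies, that copy must be used regardless of clock.
    """
    candidates = [cp for cp in (cp_a, cp_b) if is_valid(cp)]
    if not candidates:
        raise ValueError("no valid checkpoint found")
    return max(candidates, key=read_clock)
```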
3.2. Container Table

Internally, a ReFS volume is separated into multiple equally sized bands or containers. The size of a single band is relatively large at around 64 MiB or 256 MiB. The Container Table provides detailed information on these managed memory regions. It tracks how many clusters within a band are used and additionally collects statistics about the usage of data and access times within that band. Tipton (2015) argues that the introduction of Container Tables was important to treat multi-tiered storage systems more efficiently. In a multi-tiered storage system, it is common to have different types of storage media that offer different reading and writing characteristics. Some storage media such as flash memory allow performing random access faster than traditional hard disks, which are more suitable to perform sequential operations.

In a multi-tiered storage system, data needs to be reorganized according to its characteristics. To perform this reorganization, it is possible to swap bands, and thus their contents, between different storage tiers. The additional tracking of metadata information within a band allows to monitor its heat, so to say how often data in it is accessed. This performance metric may be used to decide when to shuffle two bands.

This concept as it is described by Tipton (2015) and Das (2017) can be found in current ReFS versions. It is important to note that shuffling two different bands also changes the respective physical addresses of the data found in the bands. Before such a shuffle operation it would be necessary to adjust all pointers that reference data in the affected bands. To prevent the necessity of updating any pointers, ReFS v2 started implementing virtual addresses as a mechanism to leverage shuffle operations within containers. Nearly all addresses used under ReFS have to be translated into real addresses before they may be used. After two containers were shuffled, it is only necessary to update their respective address mappings.

The Superblock, the Checkpoint, the Container Table, and the Container Allocator Table all use real addresses that do not need to be translated first, because they are needed to bootstrap address translation. We did not examine the inner structure of the rows of the Container Table in more detail. As of now, only two fields in these rows are known to us. One of these fields is required to perform the address translation.

Figure 1 portrays the practical usage of the Container Table in the address translation process. Every virtual address used under ReFS may be separated into a container component and an offset component. The container component determines which row of the Container Table needs to be queried to perform an address translation. The offset component provides an offset that must be added to the translated address.
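The translation scheme of figure 1 can be summarized in a few lines. The following Python sketch is a minimal illustration under the assumptions visible in the figure (the container index is obtained by dividing the virtual cluster number by twice the container size, and the Container Table row yields the start LCN of the container); it is not a byte-accurate reimplementation of the driver.

```python
def translate_lcn(virtual_lcn: int, container_size: int, container_table: dict) -> int:
    """Translate a virtual Logical Cluster Number into a real one.

    `container_table` maps a container index to the start LCN of that
    container, mirroring the key/value pairs shown in figure 1.
    """
    # Split the virtual address into its container and offset components
    # (divisor per figure 1: container size times two).
    index = virtual_lcn // (container_size * 2)
    offset = virtual_lcn % (container_size * 2)
    start_lcn = container_table[index]  # real start LCN of the container
    return start_lcn + offset
```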
3.3. Object ID Table

When conducting a forensic analysis of a ReFS formatted file system, the table of most importance is the so-called Object ID Table. This table references the root nodes of a variety of other tables and associates an identifier with them. Alongside a table that contains general information about the volume, a table storing an Upcase Table, and a table containing an entry point to the redo log, the Object ID Table also references the root nodes of all Directory Tables. When searching for any directory, the Object ID Table must be queried. After a directory has been deleted, it is not referenced by the Object ID Table anymore. The Object ID Table is the only place where the actual addresses of Directory Tables are stored. Because of these circumstances, recovery techniques that attempt to restore directories under ReFS should focus on recovering rows found in the Object ID Table.

The Object ID Table additionally stores meta-information about the tables that it references. It stores the addresses and the checksums of the tables it references, as well as their last persisted log sequence number in the file system journal. Additionally, entries in the Object ID Table may also store a buffer with variable data. For links to Directory Tables, this buffer is filled with the next file identifier/metadata address to be used in the directory. The root directory table is stored with the ID 0x600. Any other regular Directory Table gets an ID assigned which is greater than 0x700.

As the Object ID Table takes a superior role to the tables which it references, more importance is attached to guaranteeing its integrity. A so-called Duplicate Object ID Table exists which contains the same entries as well as the same slack space as the regular Object ID Table. While Copy-On-Write is used to leverage atomic write operations, duplicate table structures seem to be used to battle bit rot. If one variant of the Object ID Table becomes corrupted, the ReFS driver may fall back to using the other variant of it.

3.4. Directory Tables

Directory Tables implement the logic of directories. Every Directory Table contains a single row that stores metadata of the directory that the table represents. We refer to this row as the directory descriptor. The other rows found in a Directory Table mostly represent directory entries that are contained in the directory. There exist two different types of directory entries: file entries and directory link entries. A directory link mainly maps a name to a directory ID, which can be looked up in the Object ID Table. For all files in a Directory Table, an additional entry type exists, called ID2, that provides a mapping between the metadata address of a file and its current handle.

Metadata addresses are used to uniquely address files and directories. They offer a shorter notation than the path of a file. A metadata address in ReFS consists of two components, the directory ID and the file ID, which get concatenated to a 128-bit metadata address. The directory ID part identifies the Directory Table and the file ID the entry in this table. The directory identifier of a directory is equal to its table identifier. A table with the ID 0x800 represents the directory with the metadata address 0x800|0. The files inside this directory are referred to as 0x800|i, where i > 0 and i is smaller than the maximum file ID of table 0x800. This choice of addressing files induces a tight coupling between metadata addresses and the actual paths in which files reside.

If a file is moved from one directory to another directory, its metadata address is altered, as a new directory identifier and a new file identifier are assigned to it. However, it is still possible to refer to the file by its original file- and directory-identifier. The original identifiers are still saved in a field of the metadata entry of the file. Additionally, an ID2 row in its original Directory Table gets created. This row is used to offer a mapping between the original and the current metadata address of the file.
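The composition of metadata addresses from the two 64-bit identifiers can be illustrated directly. The helper below is a sketch of this addressing scheme, not driver code; the packing order (directory ID in the upper half) is our reading of the structure described above.

```python
def pack_metadata_address(directory_id: int, file_id: int) -> int:
    """Concatenate the two 64-bit identifiers into one 128-bit address."""
    return (directory_id << 64) | file_id

def unpack_metadata_address(address: int) -> tuple:
    """Split a 128-bit metadata address back into (directory ID, file ID)."""
    return address >> 64, address & ((1 << 64) - 1)

# The directory descriptor of table 0x600 (the root directory) is 0x600|0:
assert unpack_metadata_address(pack_metadata_address(0x600, 0)) == (0x600, 0)
```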

[Figure 1: Exemplary address translation process. A logical cluster number of the form Container:Offset is translated by dividing it by twice the container size to obtain the index of a container; this index is looked up in the Container Table (key: index of a container, value: start LCN of the container, e.g. keys 0x2, 0x3, ..., n), and the start LCN of the container plus the offset yields the real cluster number.]

This measure allows metadata addresses of files to be stable even if a file is moved into a new directory.

There exist two special Directory Tables: the root directory (ID 0x600) and the File System Metadata directory (ID 0x520), which fulfills a similar purpose as the $Extend directory known from NTFS. To look up an entry by its metadata address, one has to search the Directory Table in the Object ID Table with the directory ID part. After that, one has to locate the ID2 entry with the file ID, which links to the file entry.

3.5. File Table

Root nodes of File Tables are embedded within Directory Tables. Like all other tables, they start with a fixed-sized chunk of data that is used to describe table-specific information. The entries stored in a File Table are the properties that a file possesses. This concept strongly reminds of attributes that are used in NTFS. In this manner, MFT entries that are known from NTFS expose a similar behavior as file-descriptor-tables in ReFS. Both consist of a small fixed component as well as a variable-sized list of attributes.

We found the following attributes to be practically used by the ReFS driver: $DIR_LINK, $INDEX_ROOT, $DATA, $NAMED_DATA, and $REPARSE_POINT. Most of the information that was previously stored in the $STANDARD_INFORMATION attribute of NTFS has now become a part of the fixed data found in file descriptors. Since the form in which contents of a directory are represented was utterly changed and shifted into the responsibility of rows in Directory Tables, both the $INDEX_ROOT and the $INDEX_ALLOCATION attribute known from NTFS seem to have become obsolete. Still, we found all directories to use equally filled dummy $INDEX_ROOT attributes.
Aside from the $DATA attribute, the contents of all attributes under ReFS seem to be stored resident, meaning in place. In NTFS it was possible for the $DATA attribute to either be stored resident or non-resident. In ReFS, the $DATA attribute now seems always to be stored non-resident. Even if a file is only a few bytes large, ReFS allocates an own cluster for it and saves its data non-resident. Furthermore, the $DATA attribute seems to be the only attribute that spans an embedded table, which stores rows of cluster runs. This approach also makes it easy to search for an offset relative to the start of the file, as the tree used to store data runs is sorted by the relative start of a data run.

4. ReFS Data Recovery

In ReFS, most data is organized in key-value stores, so-called tables. Internally, these tables are implemented as a B+-tree. Microsoft calls their implementation Minstore B+. When data gets written in ReFS, the updates are not performed in place; instead, a COW strategy is used. In the B+-tree, the payload is always held in the leaf nodes. When data gets altered in these table structures, a new leaf node is created which takes over the old data and applies the modifications. Afterwards the pointer in the parent node has to be adjusted, which is also done with the COW strategy. This process bubbles up to the root node. This principle is shown in figure 2, where pointer adjustments bubble up and create a new root node after inserting 19 into a leaf node.

When an entry is removed from a node of the B+-tree, the entry is not wiped. Instead, only a link to a data chunk is removed and the data chunk is released. Figure 3 shows a node with deleted entries e1, e2 and e4. They stay untouched; only the key indexes are made invalid. On the right-hand side the entry e5 gets inserted, which overwrites big parts of e2. Every time an entry is inserted or deleted, the key index gets reordered.

We experienced that this non-referenced data is also copied by the COW strategy. We call this slack space propagation and verified this in an experiment where we filled slack space of a leaf node with markers. During usage of the filesystem (FS), these markers get propagated to new copies of this node when data is written in a COW manner.

We can use the concept of non-referenced data in nodes as well as old non-referenced nodes to recover data in ReFS partitions.
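The recovery idea behind figure 3 can be prototyped as follows. This Python sketch assumes a deliberately simplified, hypothetical node layout (a list of (offset, length) records in the key index and a flat data area); real Minstore nodes are more involved, so this only demonstrates how unreferenced byte ranges inside a node can be isolated as candidates for deleted entries.

```python
def unreferenced_ranges(data_area_size: int, key_index: list) -> list:
    """Return byte ranges of the data area not covered by any valid key.

    `key_index` holds (offset, length) tuples of the entries that are
    still referenced; everything in between is slack that may contain
    remnants of deleted entries (e1, e2, e4 in figure 3).
    """
    covered = sorted(key_index)
    gaps, cursor = [], 0
    for offset, length in covered:
        if offset > cursor:
            gaps.append((cursor, offset - cursor))      # hole before this entry
        cursor = max(cursor, offset + length)
    if cursor < data_area_size:
        gaps.append((cursor, data_area_size - cursor))  # tail slack
    return gaps

# Example: only one entry still referenced; two slack ranges remain.
print(unreferenced_ranges(0x100, [(0x40, 0x20)]))  # -> [(0, 64), (96, 160)]
```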


Figure 2: (a) A B-tree. (b) Inserting key 19 and creating a path of modified pages (based on Rodeh et al. (2013)).
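To make the path-copying behavior of figure 2 concrete, the following sketch implements a toy persistent (COW) tree in Python. It is a didactic model only, not Minstore: nodes are immutable, and an insert allocates new nodes along the root-to-leaf path while all untouched subtrees remain shared, which is exactly why stale but intact node copies accumulate on disk.

```python
class Node:
    """Immutable toy B+-tree-like node: leaves carry keys, inner nodes children."""
    def __init__(self, keys=None, children=None):
        self.keys = tuple(keys or ())
        self.children = tuple(children or ())

def cow_insert(node: Node, key: int) -> Node:
    """Insert `key`, copying only the nodes on the affected path."""
    if not node.children:                       # leaf: copy it with the new key
        return Node(keys=sorted(node.keys + (key,)))
    # Pick the child whose key range should receive the key (rightmost fit).
    idx = sum(1 for k in node.keys if key >= k)
    idx = min(idx, len(node.children) - 1)
    children = list(node.children)
    children[idx] = cow_insert(node.children[idx], key)  # only this pointer changes
    return Node(keys=node.keys, children=children)

old_root = Node(keys=(10,), children=(Node(keys=(1, 2, 5)), Node(keys=(10, 11))))
new_root = cow_insert(old_root, 19)
# The left subtree is shared; the old root and the old right leaf still exist.
assert new_root.children[0] is old_root.children[0]
```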

[Figure 3: Data organization in leaf nodes. A node consists of a data area holding the entries (e.g. e1 to e4) and a key index; deleting an entry only invalidates its key index slot, and a newly inserted entry e5 may overwrite parts of a deleted entry e2.]

For recovering entries from nodes, we have to scan the entire data area for indicators. An entry starts with a header which we can check for plausibility.

As a second method, we can search the entire partition for tree node pages. Every node, and even every page in ReFS, starts with a specific header. This header offers a great opportunity for identifying and locating all nodes in ReFS.

5. Implementation

5.1. TSK Extension

With the knowledge we gained through analyzing ReFS we implemented an extension for TSK to support interpreting ReFS. ReFS partitions use the same DOS partition type as exFAT and NTFS, so we modified the mmls tool, which analyses partition tables, to output "ReFS" in addition to exFAT and NTFS as the output of filesystem names.

The other implementations focus on the file system layer. TSK uses a list of filesystem openers which can be used to try opening filesystems without knowing the type beforehand. The opener returns a context object which stores basic information about a filesystem such as block size, first/last block of the FS and first/last metadata addresses. Additionally, it provides the TSK API as function pointers which allow to interact with the FS in a generic way.

Our opener implementation first parses and checks (with included signatures and checksums) the boot sector by reading the first sector of the FS. Additionally, we get the cluster and sector size from the boot sector. After identifying a valid ReFS boot sector, we parse the superblock which resides in cluster 30 as well as the superblock at the end of the FS and pick the first valid one. We check the superblock with its self-checksum. From the superblock we get the volume signature and the cluster addresses of the checkpoint structures.

Our implementation picks the most recent valid checkpoint after parsing both checkpoints. So far, we refrain from combining both checkpoints because it is difficult to combine these tree structures, although one checkpoint references older root nodes left behind through the COW process. 13 root node references get extracted from the chosen checkpoint, including e.g. the Container, Object ID and Volume Information Tables. Next the Container Table gets enumerated and address translation data is extracted, which is later used to translate page references to physical page addresses.

One of the main parts is the enumeration of the Object ID Table as it references directories among others. It should not matter whether the regular Object ID Table or its identical copy, the duplicate Object ID Table, is processed, since both should store the same contents and presumably also contain the same slack space. Many tables referenced by the Object ID Table represent directories. Because of that, it is not only important to read all regular entries found in the Object ID Table but also to recover deleted ones, which can be identified by signatures in their key.

When recovering links to Directory Tables, it is possible to find multiple links with the same table identifier. This mainly occurs as a result of the COW process, which continually writes the root nodes of tables to new pages. Duplicate page references that store the same table identifier as others may reference different root nodes. As a consequence of that, it is possible for a directory to be formed by multiple tree structures which store older states of that folder. An example of this state of affairs is shown in figure 4. The grey marked rows in the figure represent rows that were recovered in the Object ID Table.

When combining the keys found in the different restored tree structures, it is important that all keys are only interpreted once. The current state of the table stores the keys 2, 4, 6, and 9. These are interpreted as regular entries. The recovered trees must be sorted in descending order of their recency, represented by the virtual allocator clock. After a tree has been enumerated completely, all new keys are added to the set of existent keys. Keys that are not found in the regularly referenced tree structure (3, 7) but are located in different tables that were recovered are considered to be deleted entries. It is only attempted to recover deleted entries which reference Directory Tables. All other table structures referenced by the Object ID Table are regularly processed by querying their index.
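The merge rule just described (newest tree first, every key interpreted only once, keys absent from the current tree marked as deleted) can be written down as a short sketch. The Python below is illustrative; tree states are reduced to (clock, {key: row}) pairs as in figure 4.

```python
def merge_table_states(current: dict, recovered: list) -> dict:
    """Combine the current table state with recovered older tree states.

    `current`   maps key -> row for the regularly referenced tree.
    `recovered` is a list of (clock, {key: row}) tuples for restored trees.
    Returns key -> (row, deleted flag); each key is taken from the most
    recent state that contains it.
    """
    merged = {key: (row, False) for key, row in current.items()}
    # Newest recovered state first, so older rows never shadow newer ones.
    for _clock, rows in sorted(recovered, key=lambda t: t[0], reverse=True):
        for key, row in rows.items():
            if key not in merged:               # interpret every key only once
                merged[key] = (row, True)       # unseen key -> deleted entry
    return merged

# Figure 4: current keys {2, 4, 6, 9}; recovered trees with clocks 0x2 and 0x1.
state = merge_table_states(
    {2: "r", 4: "r", 6: "r", 9: "r"},
    [(0x2, {2: "o", 4: "o", 6: "o", 7: "o"}), (0x1, {2: "o", 4: "o", 3: "o", 7: "o"})],
)
assert sorted(k for k, (_, deleted) in state.items() if deleted) == [3, 7]
```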
The TSK tool fsstat outputs general filesystem information. It depends on an fsstat function provided by the context object. We implemented an fsstat function which outputs information from the Volume Information Table. A brief overview of the other context function implementations is given next.

lookup. This function is used to get general metadata for a file or directory specified by the provided metadata address. Every metadata address under ReFS is formed by a file identifier and a directory identifier. To find the location of a file, one must query the directory table which is identified by the directory identifier. With the file identifier as a key, the directory table gets queried and the found directory descriptor- or file table can then be parsed. The data attribute is non-resident, as the data of files in ReFS resides in external clusters. The implementation enumerates the data table to receive all data runs. An enumeration strategy function attempts to recover data runs that have been removed from the index. Afterwards, the data is transformed to the TSK metadata representation.

open_meta. This function associates the names of files and folders to their metadata addresses. This function needs the metadata address of a folder to process. It populates a directory data structure with mappings of file names and associated metadata addresses. For this, the whole directory table gets enumerated and the file and directory links are transformed and inserted into the directory data structure.

file_get_sidstr. For our implementation we reused large parts of the NTFS implementation. For a given metadata entry it extracts its security identifier (SID) string.

usnjls. ReFS allows for creating an Update Sequence Number (USN) journal similar to NTFS. The implementation of this tool can display the content of a USN journal which is stored as a common file. We extended the NTFS implementation with a new USN record type present in ReFS USN journals.
One major drawback in our implementation is that TSK supports only 64-bit metadata addresses to reference files and directories. ReFS, however, uses two 64-bit identifiers to refer to a metadata entry. One encodes the directory table in which the metadata entry resides. The other is used to address the entry within the directory table. Therefore, we changed the definition of the TSK INUM_T type from uint64_t to a struct that holds a low and a high 64-bit value. This change, however, makes all applications using TSK incompatible with our implementation; they need to get revised in the future.

Another drawback is that the COW creates multiple copies of files which represent their state at different times. TSK does not cover the idea of multiple copies of a file with a shared metadata address. Major changes have to be made in TSK to support this concept in the future.

5.2. Page Carver

Even though many deleted entries may be restored by analyzing remnants in tree structures, still many nodes of previously allocated pages remain unseen. The reason for this is that some nodes are not referenced by any accessible page anymore and there exists no way to locate them. It is possible that at an arbitrary position on the disk, important data is located that belongs to the metadata of a deleted file or a folder. If we, however, have no way to find this structure, it remains unseen and potentially substantial evidence is suppressed. The implementations for the file systems in TSK are purely based on parsing existing structures and thus react highly responsive. It would, however, be necessary to scan an entire volume if all deleted pages have to be recovered. Therefore, a carver independent of the actual ReFS implementation was developed.

The carver is based on the tsk_img_open function provided by TSK. A user may provide an arbitrary image and an offset to the carver, which then tries to recover deleted files and directories from the image. The carver works in two steps.

Collection phase. This is the phase in which the carver starts. The carver scans the entire provided image for blocks that look like pages of tree nodes. The step size of the carver is 4 KiB. In every step, it verifies the page header within the first 80 bytes of a read block. If these bytes fulfill the characteristic properties of a page header, the page is kept for further analysis. If the table identifier of the page is equal to the table identifier of a container table or any directory table, the page is retained; otherwise it is discarded.

The carver maintains a structure that maps volume signatures to an object that represents the state of a ReFS file system. This structure is used to distinguish different instances of ReFS file systems that may exist if the underlying disk contains multiple ReFS partitions or was formatted multiple times with the ReFS file system. Within the state of each volume, a map of directory identifiers is maintained. This map assigns table identifiers to a list of pages. When looking up the key of a directory table in this map, the caller may obtain a list of all pages that store information relevant to this table, so to say all nodes that once belonged and nodes that still belong to a table with a given identifier.

The carver also derives further information from pages that were read. The number of addresses stored in the page header is used to decide whether the page belongs to a volume with a cluster size of 4 KiB or 64 KiB. It is also important for the carver to know at which disk offset the volume started. At a later time, the application must interpret cluster runs. For this, it is necessary to know to which starting address they refer. Figure 6 illustrates this issue. The red marked file system starts at the beginning of the disk image. Address references used in this file system may be interpreted without adding any offsets.
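The collection phase boils down to a fixed-stride scan with a cheap header test. The sketch below mirrors that loop in Python; `looks_like_page_header` stands in for the plausibility checks on the first 80 bytes, and `table_id_of` for extracting the table identifier, since the concrete field layout is beyond this sketch.

```python
STEP = 4096          # scan stride: 4 KiB, the smallest ReFS page/cluster size
HEADER_LEN = 80      # number of leading bytes the plausibility check inspects

def collect_pages(image_path, looks_like_page_header, table_id_of, is_wanted_table):
    """Scan a raw image in 4 KiB steps and keep candidate tree-node pages.

    A block is kept if its first 80 bytes pass the header plausibility
    check and its table identifier denotes a container table or a
    directory table (decided by the caller-supplied predicate).
    """
    hits = []
    with open(image_path, "rb") as img:
        offset = 0
        while True:
            block = img.read(STEP)
            if len(block) < HEADER_LEN:
                break
            if looks_like_page_header(block[:HEADER_LEN]):
                table_id = table_id_of(block)
                if is_wanted_table(table_id):
                    hits.append((offset, table_id))
            offset += STEP
    return hits
```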
[Figure 4: Combining entries of existing and deleted tables that are referenced by the Object ID Table. The Object ID Table holds several rows with the key 0x600 but different table references r1, r2 and r3, whose root nodes carry the clock values 0x3, 0x2 and 0x1. The current table state contains the keys 2, 4, 6, 9; the recovered table references contain the keys 2, 4, 6, 7 and 2, 4, 3, 7.]

[Figure 5: Classification process in the collection phase of the carver. Found pages are grouped by volume signature (e.g. 0x6100b379); each volume state stores directory information mapping table identifiers (e.g. 0x600, 0x701) to page addresses, together with the volume offset, container mapping, page size and cluster size.]

[Figure 6: Multiple file systems (ReFS file system #1 and #2) at different locations within a disk image.]

The blue marked file system starts somewhere in the middle of the disk image. Therefore, its starting offset relative to the start of the disk image needs to be added to all cluster references that are interpreted in it. Luckily, the cluster numbers found in the page header of the Container Table are equal to their physical addresses. Thus, the first found page that belongs to a container table may be used to determine the starting location of a ReFS file system.

The collection process may also be halted at any time. Whenever the carver has read and classified 256 MiB of data, it writes its progress state as well as the list of found pages into a file. When restarting the application at a later time, the collection process may continue at the last written state. The idea of this feature, as well as lots of other concepts, were extracted from the refsutil.exe utility. refsutil.exe was introduced in early 2019 by Microsoft. It allows to verify or recreate a boot sector, to reclaim leaked clusters, to remove corrupt files, or to recover files from a ReFS file system. The recovery operation is also referred to as salvage. We analyzed the implementation of this operation to get valuable insights for recovering files.

Reconstruction phase. After all pages have been read and classified according to figure 5, the carver goes into the reconstruction phase. The reconstruction phase loops over all found volume states and performs identical operations on them. First, the latest state of the Container Table is restored so that it is possible to translate virtual addresses in a volume to real addresses. Next, the carver loops over all directory tables found in a volume.

Metadata reconstruction. Directory tables are stored as a flat sequence of pages. The carver searches these pages for signatures that may be used to identify files, directory links and directory descriptors. When a corresponding signature is found, the carver executes various sanity checks on the found data and tries to interpret it as the structure it believes it to be. The carver must also restore the subtree formed by directory descriptor tables and file tables to access their attributes and their data runs. If the carver was successful in restoring an entry, the entry is saved in the class format file_information or dirref_information. Every directory stores these entries as a set.

If a file_information entry or a dirref_information entry is equal to an already existing entry, it is discarded. To determine whether two entries are identical, multiple properties of them are compared. It would not be sufficient to merely check whether the file identifier already exists in the set of found entries, as this would discard potential older copies of files. Instead, the file identifier, the last modification time, and the file size are used in conjunction to check whether two file entries are equal. If they are not equal, both may be contained in the same result set.
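This equality rule is easy to misread, so a small sketch may help: two restored entries are treated as the same file state only if file identifier, last modification time and size all match. The code below is an illustration of that comparison, not the carver's actual implementation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FileInformation:
    file_id: int   # file identifier within the directory table
    mtime: int     # last modification timestamp
    size: int      # file size in bytes

def add_entry(result_set: set, entry: FileInformation) -> bool:
    """Insert a restored entry into the per-directory result set.

    Two entries are considered the same file *state* only if file
    identifier, modification time and size all match; older COW copies
    of a file differ in mtime or size and therefore survive.
    """
    key = (entry.file_id, entry.mtime, entry.size)
    if key in result_set:
        return False          # duplicate of an already restored state
    result_set.add(key)
    return True
```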

Extraction phase. After all directory pages of a volume have been analyzed and their entries have been transformed into the internal representation of files and directories, the extraction phase begins. First, all directory tables found in the volume are created as folders of the form <volume>/<directory>. Next, a volume report with the file name <volume>/report.txt is created. The report describes the properties of the corresponding volume such as its offset in the disk image, its cluster size and the number of directories that it contains. The report also describes the address translation mapping that was extracted from the container table of the volume.

Next, the application creates a file called <volume>/structures.txt that is filled with the reconstructed directory hierarchy formed by the file system. Since the application stores the extracted directories in a flat form and only refers to them through their directory identifiers, this file may be used to re-establish directory structures. Finally, the application iterates over all directory structures and dumps the files found in them. Into every directory a file called <volume>/<directory>/directory_report.txt is written which contains a list of all files found in the directory. The list also contains the metadata of duplicate files that may have been created through the Copy-On-Write mechanism. The contents of all files are finally dumped into the corresponding directory. To prevent naming conflicts among restored files that utilize identical file names, files are written as <volume>/<directory>/<file>_<id>. Their names, however, may easily be looked up by examining the <volume>/structures.txt or the <volume>/<directory>/directory_report.txt file.

As of now, the carver is only able to extract files and directories of a file system and, contrary to the ReFS implementation in the file system layer of TSK, does not show additional information about a volume. It might be best to use both of those tools in conjunction. The file system implementation of TSK provides valuable information such as attributes of files as well as file system metadata that has not yet been added to the analysis process of the carver. A crucial advantage of the implementation in the file system layer of TSK is that it can decide whether a file is allocated or has been removed. The carver is not aware of whether the files that are extracted were allocated or not. On the other hand, the carver provides a valuable tool for an investigator that intends to examine a corrupted ReFS file system or an investigator who wants to obtain potential older copies of files.

6. Evaluation

As of now, there exists no work that looks at recovering files and folders from ReFS formatted file systems. Thus also, no work has yet stated test cases that could be used to evaluate tools that serve such a purpose. We designed multiple test scenarios which portrayed a fictive usage of the file system. The evaluation process of the tools is strongly based on the description of the evaluation dataset used by Plum and Dewald (2018). We generated 8 ReFS images with different configurations of overall size (2 GiB, 5 GiB, 40 GiB, 100 GiB) and cluster size (4 KiB, 64 KiB). On each we performed 1000 actions from the following list with different likelihoods of occurrence (a sketch of such an action driver follows the list):

• add file, P(X) = 0.35: Add a random file from the EDRM File Formats Data Set 1.0¹ to the volume
• delete file, P(X) = 0.2: Delete a random file
• move file, P(X) = 0.05: Move a random file
• change file, P(X) = 0.1: Insert a random amount of A's into a random file
• copy file, P(X) = 0.1: Copy a random file
• add folder, P(X) = 0.15: Create a randomly named folder
• delete folder, P(X) = 0.05: Remove a randomly picked folder

If an action could not be executed in a step (e.g. removing a directory if no directory exists at all), a different action was picked randomly. This choice of the distribution was made deliberately to generate a more "realistic" reflection of the usage of a filesystem. The distribution of the actions, however, was an arbitrary guess of the authors of how often these actions occur.
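For reproducibility, the action driver can be expressed in a few lines. The following Python is a minimal sketch under the stated distribution; the individual handlers are hypothetical placeholders that would need to perform the real file system operations and return False when an action is currently not executable.

```python
import random

ACTIONS = [  # (name, probability) as used for the test images
    ("add_file", 0.35), ("delete_file", 0.20), ("move_file", 0.05),
    ("change_file", 0.10), ("copy_file", 0.10),
    ("add_folder", 0.15), ("delete_folder", 0.05),
]

def run_scenario(handlers: dict, steps: int = 1000, seed: int = 0) -> list:
    """Perform `steps` weighted random actions and log every executed one.

    `handlers` maps an action name to a callable returning True on
    success and False if the action is currently not executable (e.g.
    deleting a folder while none exists), in which case a different
    action is drawn.
    """
    rng = random.Random(seed)
    names = [a for a, _ in ACTIONS]
    weights = [p for _, p in ACTIONS]
    log = []
    for _ in range(steps):
        while True:  # redraw until an executable action was performed
            action = rng.choices(names, weights=weights, k=1)[0]
            if handlers[action]():
                log.append(action)   # document the outcome after every action
                break
    return log
```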
After every action we documented the outcome in a modified TSK body file². Both the TSK implementation as well as the carver were adjusted so that they were able to output the modified body file format to compare results. The following characteristics were stored in the modified body file for every file or directory present on the filesystem: an MD5 checksum of the file content, which is set to 0 for directories; the name of a file or a directory without its complete path; the current directory identifier of a file (for directories, their parent identifier is stored); the initial metadata address of the file or folder, which stays constant even if the file gets moved; the MAC timestamps, which track the last modification, access and creation time of a file; and the size of a file.

The Python applications that are used to acquire the ground truth of the state of the file system use the function os.stat, which was only able to obtain a 64-bit metadata address. For our evaluation this was no limitation because all created metadata addresses were small enough.

The capture of the final state of the file system includes all existent files and directories, whereas the outcomes of the single executed operations only log changes made to files and the creation of directories. Every action that creates, alters, moves or copies a file generates a log entry for the modified state of the file. We refrained from logging the changes in the metadata of directories resulting from file actions as a deliberate decision in weighting the results. In practice, it might be of more importance to restore metadata entries of files than of directories. We did not want to weight the incapability to restore an old timestamp of a directory equal to the incapability of restoring the contents of a file.

¹ https://www.edrm.net/resources/data-sets/edrm-file-format-data-set/
² https://wiki.sleuthkit.org/index.php?title=Body_file
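As an illustration, one record of the modified body file could be assembled as follows. This sketch only mirrors the characteristics listed above; the exact column order of the authors' modified format is not documented here, so the layout below is an assumption.

```python
import hashlib
import os

def body_line(path: str, name: str, dir_id: int, init_meta_addr: int,
              is_dir: bool) -> str:
    """Build one record of the modified body file (illustrative layout).

    Columns: MD5 (0 for directories), bare name, current directory
    identifier (parent identifier for directories), initial metadata
    address, size, and the MAC timestamps.
    """
    st = os.stat(path)
    if is_dir:
        md5, size = "0", 0
    else:
        with open(path, "rb") as f:
            md5 = hashlib.md5(f.read()).hexdigest()
        size = st.st_size
    fields = [md5, name, hex(dir_id), hex(init_meta_addr), str(size),
              str(int(st.st_mtime)), str(int(st.st_atime)), str(int(st.st_ctime))]
    return "|".join(fields)  # body files are pipe-separated
```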

Configuration    Interpreted entries
                 Directories          Files
2GiB 4KiB        47/46 (102.17%)      112/110 (101.82%)
2GiB 64KiB       44/43 (102.33%)      83/81 (102.47%)
5GiB 4KiB        45/44 (102.27%)      123/121 (101.65%)
5GiB 64KiB       20/19 (105.26%)      33/31 (106.45%)
40GiB 4KiB       25/24 (104.17%)      78/76 (102.63%)
40GiB 64KiB      56/55 (101.82%)      143/141 (101.42%)
100GiB 4KiB      35/34 (102.94%)      64/62 (103.23%)
100GiB 64KiB     53/52 (101.92%)      110/108 (101.85%)

Table 2: Output of the TSK extension, compared to the final state of the file system

Configuration    Interpreted entries
                 Directories          Files
2GiB 4KiB        14/154 (9.09%)       110/613 (17.94%)
2GiB 64KiB       16/159 (10.06%)      81/590 (13.73%)
5GiB 4KiB        11/144 (7.64%)       121/606 (19.97%)
5GiB 64KiB       9/159 (5.66%)        31/585 (5.30%)
40GiB 4KiB       9/129 (6.98%)        76/615 (12.36%)
40GiB 64KiB      7/159 (4.40%)        141/616 (22.89%)
100GiB 4KiB      6/147 (4.08%)        62/592 (10.47%)
100GiB 64KiB     12/152 (7.89%)       108/593 (18.21%)

Table 3: State of all allocated files (TSK extension), compared to action log

6.1. TSK Extension

It is expected that the TSK extension can produce a report equal to the actual state of the file system. We tested the TSK extension on the 8 generated images and called the tool fls with the option to only display allocated files. The resulting output should be equal to the current state of the file system that Windows reports.

As shown in table 2, for all randomly generated testing scenarios, the output of the TSK extension matches the actual state of the file system, except the folder "System Volume Information" (Table ID: 0x701) and its content "WPSettings.dat" and "IndexerVolumeGuid". The numbers x/y count the number of detected files/folders (x) and the number of files/folders present in the final state (y). These entries were not included in the extracted final state of the file system because Windows hides this folder from regular users. The same applies to the "File System Metadata" (Table ID: 0x520) folder, which we omitted proactively in this output of TSK, since it is not a regular directory. The results show that we are able to output the exact actual ReFS state with our TSK extension.

6.2. File Recovery Capabilities

With this evaluation we want to compare the capability of recovering deleted files and reconstructing previous states of files and folders. We compare the outputs of the developed tools with the action trace log which was logged during test image creation.

In the tables referenced in the following, the numbers x/y have to be interpreted as the number of exactly outputted files/folders (x) and the number of all different file or folder states once present on the filesystem (y). Table 3 compares the output of the TSK extension, only considering allocated files and folders, to all states. Noteworthy, fewer directories match than in the evaluation seen in table 2. That is because we use the action log as comparison instead of the final filesystem state. In the action trace we only log directory creation and no timestamp modifications due to changes of child files and folders. Therefore, only few folders outputted by the implemented applications which never got an update in their timestamps match exactly the logged folders. As expected, the applications output at least all files present in the final image state.

Table 4 includes the file and directory recovery capabilities of the TSK extension. Compared to table 3, more files and folders are outputted correctly. As the ReFS extension can only restore entries from referenced pages, it is likely that there remain pages that the TSK extension cannot locate. Additionally, the TSK extension only retrieves the most recent state of a file and is unable to address, and thus to retrieve, past states of a modified file.

Configuration    Restored entries
                 Directories          Files
2GiB 4KiB        15/154 (9.74%)       132/613 (21.53%)
2GiB 64KiB       19/159 (11.95%)      125/590 (21.19%)
5GiB 4KiB        14/144 (9.72%)       159/606 (26.24%)
5GiB 64KiB       28/159 (17.61%)      75/585 (12.82%)
40GiB 4KiB       12/129 (9.30%)       97/615 (15.77%)
40GiB 64KiB      11/159 (6.92%)       175/616 (28.41%)
100GiB 4KiB      10/147 (6.80%)       113/592 (19.09%)
100GiB 64KiB     13/152 (8.55%)       142/593 (23.95%)

Table 4: State of all allocated and recovered files (TSK extension), compared to action log

Configuration    Restored entries
                 Directories          Files
2GiB 4KiB        15/154 (9.74%)       139/613 (22.68%)
2GiB 64KiB       21/159 (13.21%)      133/590 (22.54%)
5GiB 4KiB        16/144 (11.11%)      169/606 (27.89%)
5GiB 64KiB       34/159 (21.38%)      88/585 (15.04%)
40GiB 4KiB       16/129 (12.40%)      102/615 (16.59%)
40GiB 64KiB      11/159 (6.92%)       183/616 (29.71%)
100GiB 4KiB      10/147 (6.80%)       124/592 (20.95%)
100GiB 64KiB     13/152 (8.55%)       150/593 (25.30%)

Table 5: State of all allocated and recovered files (carver), compared to action log


Table 6: Runtimes of the applications older pages in such a page chain that get not reallocated may still yield valuable contents. locate. Additionally, the TSK extension only retrieves the The experiment focused on writing data into a single file. most recent state of a file and is unable to address and Additionally, no other modifications on the file system were thus to retrieve past states of a modified file. made while the file was altered. While this is a lab setup, Table 5 shows all files that could be recovered by using it gives an upper boundary for the recovery capability of the carver. In all scenarios the carver is able to recover more old file states on ReFS partitions. files and directories than the TSK extension. The major The experiment was conducted with five different data drawback of the carver is, that it is unable to differentiate sets. Each was a 5 GiB large ReFS volume with between existing and removed files. a cluster size of 4 KiB. For every data set, the text file Table 6 conclusively give an overview of the runtimes was written for a different time period, ranging from 10 of the various test scenarios. To conduct all experiments, minutes to 2 hours. we used a 4 TiB large Western Digital hard drive (Prod- As seen in table 7, the number of recoverable file states uct ID: WDBHDW0040BBK). With the application dd does not differ much between the various experiments. In we estimated a sequentially read speed of 95-110 MiB/s. all scenarios, the valid files that could be recovered corre- Depending on the size of a volume, the carver is most of sponded to the last 3 states of the file, with the exception the time busy with reading and collecting pages from the of the shortest run. That outlier occurs because the carver volume. The runtime of the TSK extension also varies found an empty version of the new created file. In the strongly based on the number of pages that are read. rightmost column of table 7 you can see that even though various runtimes were used for the experiments, and for 6.3. Recovering Old File States longer running experiments more page states were created, In this evaluation we look at how the COW process the number of recoverable pages stays nearly constant. The impacts the recovery of old file states. For this experiment right number shows the total amount of allocated pages we developed a small application that writes text into a during the runtime. It seems like the allocator in the ReFS file in a ReFS file system. The text was artificially written driver reused old pages relatively fast. at the speed at which a person types (175 characters per minute). Every 2 minutes the text file was saved, and 7. Summary a checksum of its current intermediate state was logged together with its metadata. With this work we investigated the internal structures Because of the COW policy that ReFS uses, it is likely of the new Resiliant File System. We used the insights to that metadata, that describes the file and the location extend The Sleuth Kit to be able to parse and interpret of its clusters, is dispensed into multiple places of the ReFS partitions. Open that we implemented a page carver volume. We used the developed carver to find as many to recover file system data. We evaluated both tools and existent old copies of the file as possible. Additionally, we come to the conclusion that our TSK extension works as instrumented the ReFS driver to log cluster allocation and intended and reports the current state of a ReFS partition COW information, e.g. 
when data is written from one page equal to the ReFS driver of Windows. We also showed that to another. The data generated in this process allowed us with page carving we can recover more data than with only to reconstruct which intermediate states of the file were using deleted entries which are still present on the disk. generated. The continuous modification of a single page in the Acknowledgements COW process can be viewed as a chain of page states. Every page state has a physical address associated to it. This work was supported by the German Federal Min- If this physical address is reused at a later time for an istry of Education and Research (BMBF) as part of the allocation, this state cannot be recovered anymore. All FIDI project.

References

Altheide, C., Carvey, H., 2011. Digital Forensics with Open Source Tools. Elsevier.

Carrier, B., n.d. The Sleuth Kit. URL: https://www.sleuthkit.org/sleuthkit/. Last visited: 2019-10-04.

Carrier, B., 2005. File System Forensic Analysis. Addison-Wesley Professional.

Das, R., 2017. ReFS Support For SMR Drives. Presentation, SDC 2017. https://www.snia.org/sites/default/files/SDC/2017/presentations/smr/Das_Rajsekhar_ReFS_Support_For_Shingled_Magnetic_Recording_Drives.pdf.

Georges, H., 2018. Resilient Filesystem. Master's thesis. NTNU. https://ntnuopen.ntnu.no/ntnu-xmlui/handle/11250/2502565.

Green, P., 2013. Resilient File System (ReFS), Analysis of the File System found on Windows Server 2012. Technical Report. Staffordshire University.

Metz, J., 2013. Resilient File System (ReFS). URL: https://github.com/libyal/libfsrefs/blob/master/documentation/ResilientFileSystem(ReFS).pdf. Last visited: 2019-10-04.

Nordvik, R., Georges, H., Toolan, F., Axelsson, S., 2019. Reverse engineering of ReFS. Digital Investigation 30, 127-147. URL: http://www.sciencedirect.com/science/article/pii/S1742287619301252, doi:10.1016/j.diin.2019.07.004.

Plum, J., Dewald, A., 2018. Forensic APFS File Recovery, in: Proceedings of the 13th International Conference on Availability, Reliability and Security, ACM, New York, NY, USA, pp. 47:1-47:10. URL: https://dl.acm.org/citation.cfm?id=3232808, doi:10.1145/3230833.3232808.

Prade, P., Groß, T., Dewald, A., 2019. Forensic Analysis of the Resilient File System (ReFS) Version 3.4. Technical Report CS-2019-05. Department Informatik. URL: https://opus4.kobv.de/opus4-fau/frontdoor/index/index/docId/12526, doi:10.25593/issn.2191-5008/CS-2019-05.

Rodeh, O., Bacik, J., Mason, C., 2013. BTRFS: The B-tree Filesystem. ACM Transactions on Storage (TOS) 9, 9. URL: https://dl.acm.org/citation.cfm?id=2501623, doi:10.1145/2501620.2501623.

Sinofsky, S., 2012. Building the next generation file system for Windows: ReFS. URL: https://blogs.msdn.microsoft.com/b8/2012/01/16/building-the-next-generation-file-system-for-windows-refs/. Last visited: 2019-10-04.

Tipton, J., 2015. ReFS v2, Cloning, projecting, and moving data. Presentation, SDC 2015. https://www.snia.org/sites/default/files/SDC15_presentations/file_sys/JRTipton_ReFS_v2.pdf.
