
Forensically Sound Data Acquisition in the Age of Anti-Forensic Innocence

Forensisch korrekte Datensicherung im Zeitalter anti-forensischer Arglosigkeit

Submitted to the Faculty of Engineering of the Friedrich-Alexander-Universität Erlangen-Nürnberg in fulfillment of the requirements for the degree of Doktor-Ingenieur (Dr.-Ing.) by

Michael Gruhn

from Bad Windsheim. Approved as a dissertation by the Faculty of Engineering of the Friedrich-Alexander-Universität Erlangen-Nürnberg. Date of the oral examination: 2016-11-24

Chair of the doctoral committee: Prof. Dr.-Ing. Reinhard Lerch

Reviewers: Prof. Dr.-Ing. Felix Freiling and Prof. Dr. Zeno Geradts

Abstract

In this thesis, we tackle anti-forensic and rootkit problems in digital forensics. An anti-forensic technique is any measure that prevents a forensic analysis or reduces its quality. First, we investigate the anti-forensic threat of hard drive firmware rootkits, which can prevent a forensic analyst from acquiring data from the hard drive, thus jeopardizing the forensic analysis. To this end, we first outline the threat of hard drive firmware rootkits. We then provide a procedure to detect and subvert already published firmware bootkits. We further outline potential avenues to detect hard drive firmware rootkits nested deeper within the hard disk drive's so-called Service Area, a special storage area on the magnetic platter reserved for use by the firmware.

After addressing the acquisition of persistent storage in the form of hard disk drives, we shift towards the acquisition and later analysis of volatile storage, in the form of RAM. To this end, we first evaluate the atomicity and integrity as well as the anti-forensic resistance of different memory acquisition techniques with our novel black-box analysis technique. This black-box analysis technique, in which memory contents are constantly changed via our payload application with a traceable access pattern, allows us to measure to which extent current memory acquisition methods satisfy atomicity and integrity when dumping the memory of processes. We also discuss their resistance against anti-forensics. As a result, we show that cold boot attacks belong to the most favorable memory acquisition techniques. We then investigate cold boot attacks in more detail. First, we experimentally confirm that cooling the RAM modules prolongs the remanence effect considerably. We then prove, also experimentally, that transplanting RAM modules from one system to another is possible. We further address the issue of memory scrambling in modern DDR3 technology as well as other proposed countermeasures, such as BIOS passwords and temperature detection. We also show that once a system is cold-booted, malicious anti-forensic code running on the system stops running immediately and can thus no longer interfere with the memory acquisition. Therefore, we show the practical feasibility of cold boot attacks as an anti-forensic resistant memory acquisition method.

After outlining the anti-forensic resistant acquisition of evidence, we address the analysis. To this end, we first revisit the theory of data analysis, especially the concept of essential data in forensic analysis as coined by Carrier in his seminal work "File System Forensic Analysis". We first extend Carrier's concept by differentiating different levels of essentiality. We introduce the notion of strictly essential data, which refers to data that is always required to be correct and non-manipulated by all systems to provide a specific functionality, and partially essential data, which is only required to be correct and non-manipulated for some systems. We then practically verify both the original theories and our extensions in experiments. Eventually, we argue that essential data can help to build a trust hierarchy of data encountered during forensic analysis, from which we conclude that anti-forensic resistant analysis methods must only rely on what we call strictly essential, i.e., trusted, data; otherwise, the analysis is potentially impaired by anti-forensic measures, because non-essential data can be freely manipulated without impacting the correct working of a system.

Last but not least, we tackle a long unsolved problem in forensic memory analysis: currently, all state-of-the-art digital forensic analysis tools ignore unmapped memory pages, i.e., pages swapped out onto persistent storage. This can result in blind spots in which data, and thus potential evidence, is not analyzed. We fix this by analyzing the Windows NT virtual memory management via a novel gray-box analysis method. To this end, we place traceable data into virtual memory and force it into both the physical RAM as well as the pagefile stored on persistent storage. We are thus able to reverse engineer the complete virtual address mapping, including the non-mapped pagefile. We evaluate our analysis results against real world data from Windows 7, 8, and 10 systems in both the 32-bit and 64-bit variants. By shedding light on this last blind spot of virtual memory analysis, we increase its anti-forensic resistance, because we can now for the first time analyze the virtual address space in its entirety.

Zusammenfassung

This thesis addresses the anti-forensic and rootkit problem in digital forensics. An anti-forensic technique is any measure that prevents a forensic analysis or reduces its quality. First, we investigate the anti-forensic threat posed by hard drive rootkits, which can keep a forensic analyst from acquiring data and thereby jeopardize the analysis. To this end, we first outline the threat posed by hard drive rootkits. We then present a procedure with which already published rootkits can be detected and subverted. We show potential avenues for finding rootkits that are anchored deeper in the hard drive firmware, in the so-called Service Area, a special storage region reserved for use by the firmware.

After the problem of data acquisition from hard drives, we turn to the acquisition and subsequent analysis of volatile data in the form of RAM. To this end, we first evaluate the atomicity and integrity, as well as the anti-forensic resistance, of various memory acquisition techniques using our novel black-box analysis method. This analysis method, in which memory contents are permanently changed by our application with a traceable access pattern, allows us to measure to what degree current memory acquisition methods are atomic and integrity-preserving when reading the memory of processes. We further discuss their resistance against anti-forensics. The result shows that the cold boot attack is among the preferable memory acquisition techniques. We therefore examined the cold boot attack in detail. First, we experimentally confirmed that cooling the RAM considerably prolongs the remanence effect. We show, likewise experimentally, that transplanting RAM modules from one system to another is possible.

Furthermore, we address the problem of scrambling in modern DDR3 technology and other countermeasures such as BIOS passwords and temperature monitoring. We also show that a cold reboot immediately interrupts the execution of anti-forensic code running on the system, so that it can no longer interfere with the memory acquisition. We thereby demonstrate both the practical applicability and the anti-forensic resistance of the cold boot attack. After demonstrating the anti-forensically resistant acquisition of evidence, we turn to the analysis. To this end, we first extend the concept of essential data as introduced by Carrier in his seminal work "File System Forensic Analysis". We extend Carrier's concept by distinguishing between strictly essential data, which must always be correct and unmanipulated for all systems to provide a specific function, and partially essential data, which must be correct and unmanipulated only for some systems. We then verify the original theory and our extension in practical experiments. Finally, we argue that essential data helps to build a trust hierarchy of the data encountered in a forensic analysis, from which we conclude that anti-forensically resistant analysis methods may only use data that is strictly essential, since otherwise the analysis is potentially impaired by anti-forensic measures, because non-essential data can be manipulated freely without restricting the functionality of a system.

Finally, we address a long unsolved problem in forensic memory analysis. Currently, all common forensic methods for memory analysis ignore swapped-out memory pages, i.e., virtual memory pages that have been swapped out to persistent storage. This results in memory regions not visible to the analysis, in which evidence can potentially reside. We counter this by examining the virtual memory management of Windows NT with a gray-box analysis approach. To this end, we place traceable data in both the physical RAM and the pagefile. This allows us to reconstruct the complete virtual address space including the pagefile. We evaluate our approach against Windows 7, 8, and 10, in the 32-bit and 64-bit variants. By making these previously invisible memory regions available to memory analysis, we strengthen its anti-forensic resistance, since the complete address space can now be analyzed.

Acknowledgments

First and foremost, I would like to thank my doctoral advisor Felix Freiling for giving me the opportunity to work with him at his Security Research Group at the Department of Computer Science at the Friedrich-Alexander University Erlangen-Nürnberg, and for his continuous advice and support. Without him, this thesis would not exist. I would also like to thank Zeno Geradts for agreeing to be the second reviewer of this thesis, and for the stimulating discussions we had while meeting at conferences; I am looking forward to the ones to come. I also thank my colleagues at the Security Research Group for a cheerful and friendly working atmosphere.

In addition, I want to thank (in alphabetical order): Johannes Bauer for proof-reading parts of this thesis, recommending the Grammarly grammar checker and, last but not least, collaborating on the DDR3 descrambling attack; Andreas Dewald for helpful input on the formalization of the essential data theory, proof-reading publications this thesis is based on, and being my group leader for my first 1.5 years of research; Christian Moch for interesting discussions, sharing ideas, the deathmoch exploit and proof-reading this thesis; Tilo Müller for the LaTeX template this dissertation was initially based on, proof-reading, advising and collaborating on publications this thesis is based on, and being my group leader for the last year of research for this thesis (even though he is not a forensics guy himself); Andreas for proof-reading parts of this thesis; Johannes Stüttgen for the hint to the Academic--Check project; and the anonymous reviewers of the publications this thesis is based on.

Corporate acknowledgments: This work was supported by the online study program Master Digitale Forensik, the German Federal Ministry of Education and Research (BMBF) via the project Open Competence Center for Cyber Security (Open C3S), the German Research Foundation (DFG) as part of the Transregional Collaborative Research Centre “Invasive Computing” (SFB/TR 89) and the German Federal Ministry of Education and Research (BMBF) via the project Parallelized Application Detection in Overlays for Firewalls (Padiofire).

Contents

1 Introduction
  1.1 Contributions
  1.2 Related work
  1.3 Publications
  1.4 Overview

2 Background
  2.1 Forensics
    2.1.1 Acquisition
    2.1.2 Analysis
  2.2 Anti-forensics
    2.2.1 Acquisition
    2.2.2 Analysis
  2.3 Rootkits

3 Persistent data acquisition (from hard drives)
  3.1 Introduction
    3.1.1 HDD anatomy
    3.1.2 HDD firmware analysis
    3.1.3 Related work
    3.1.4 Outline
  3.2 Hard disk firmware bootkit
    3.2.1 Compromising
    3.2.2 Interfering with data acquisition
    3.2.3 Detection (in EEPROM)
    3.2.4 Subverting
    3.2.5 Investigating
  3.3 Overlay/module-based hard disk firmware rootkit
    3.3.1 Verifying overlays/modules
    3.3.2 Memory analysis
  3.4 Discussion
    3.4.1 Compromise in EEPROM
    3.4.2 Compromise in Service Area
    3.4.3 Compromised controller hardware
    3.4.4 SSDs
  3.5 Conclusion and future work


4 Memory acquisition evaluation
  4.1 Introduction
    4.1.1 Related work
    4.1.2 Contribution
    4.1.3 Outline
  4.2 Background: Criteria for forensically sound memory snapshots
    4.2.1 Atomicity of a snapshot
    4.2.2 Integrity of a snapshot
  4.3 Black-box measurement methodology
    4.3.1 Implementation
    4.3.2 Estimating atomicity and integrity
    4.3.3 Intuitive examples
    4.3.4 Practical constraints
  4.4 Experiments
    4.4.1 Setup
    4.4.2 Sequence
    4.4.3 Issues
    4.4.4 Analyzed methods and tools
  4.5 Results
    4.5.1 Measurement accuracy
    4.5.2 Individual results
    4.5.3 Atomicity and integrity comparison
    4.5.4 Anti-forensic resilience
  4.6 Conclusions and future work

5 Memory acquisition (via cold boot attacks)
  5.1 Introduction
    5.1.1 Related work
    5.1.2 Outline
  5.2 Setup
    5.2.1 Hardware
    5.2.2 Test data placement
    5.2.3 Software
    5.2.4 Experiment
  5.3 Observations
    5.3.1 Ground state patterns
    5.3.2 Cached data
  5.4 Results
    5.4.1 Remanence effect
    5.4.2 Temperature and RAM remanence
    5.4.3 RAM transplantation attacks
  5.5 Bypassing countermeasures
    5.5.1 Descrambling DDR3 memory
    5.5.2 RAM reset on boot
    5.5.3 Locking the boot
    5.5.4 Temperature detection
    5.5.5 0x7c00 defense
  5.6 Limitations
    5.6.1 CPU-bound
    5.6.2 RAM encryption
  5.7 Conclusion and future work


6 Essential data and anti-forensics
  6.1 Introduction
    6.1.1 Problem statement
    6.1.2 Related work
    6.1.3 Outline
  6.2 Definition of essential data by Carrier and its problems
    6.2.1 Problem 1: Definition depends on assumed functionality
    6.2.2 Problem 2: Definition depends on application
    6.2.3 Problem 3: Definition cannot deal with redundant information
  6.3 What is essential data?
  6.4 Evaluation
    6.4.1 DOS/MBR
    6.4.2 GPT header
  6.5 Discussion
    6.5.1 Usefulness of new definitions
    6.5.2 Trust hierarchy
    6.5.3 Evidence hierarchy
  6.6 Conclusions and future work

7 Virtual memory analysis (on Windows NT)
  7.1 Introduction
    7.1.1 Motivation
    7.1.2 Related work
    7.1.3 Outline
  7.2 Grey-box virtual address translation analysis
    7.2.1 Scheme
    7.2.2 Test data generation
    7.2.3 Inferring the virtual address translation
  7.3 Windows NT and x64 virtual memory overview
    7.3.1 Pagefile
    7.3.2 Page table entries
    7.3.3 Virtual address translation
  7.4 Acquisition
    7.4.1 Memory
    7.4.2 Pagefile
  7.5 Analysis
    7.5.1 Finding DirectoryTableBase
    7.5.2 Reconstructing the virtual address space
    7.5.3 Analyzing the virtual address space
  7.6 Evaluation
    7.6.1 Problem cases of virtual memory analysis
    7.6.2 Synthetic data
    7.6.3 Real life data
  7.7 Conclusion and future work

8 Conclusion and future work

Bibliography

List of Figures

3.1 HDD anatomy
3.2 Identifying the EEPROM of a WD3200AAKX HDD
3.3 Reading an EEPROM with an in-circuit programming clamp without desoldering
3.4 Reading the EEPROM after it has been desoldered from the HDD's PCB
3.5 Holding a board in reset by pulling the reset pin
3.6 Reset and ground pins to be connected for in-circuit reading from an ST31000340NS
3.7 Cross-comparisons of the EEPROM contents of different WD3200AAKXs
3.8 Cross-comparisons of the EEPROM contents of a bootkit infected WD3200AAKX
3.9 JTAG pin layout of the WD3200AAKX
3.10 Wires soldered to the JTAG pins of a WD3200AAKX
3.11 A MICTOR 38 pin connector soldered to the JTAG test pads of a WD3200AAKX
3.12 UART pin layout of Seagate HDDs

4.1 Space-time diagram of an imaging procedure creating a non-atomic snapshot
4.2 Integrity of a snapshot with respect to a specific point in time t
4.3 Atomicity and integrity
4.4 Acquisition plot of pmdump
4.5 Memory acquisition technique comparison (acquisition plot)
4.6 Memory acquisition technique comparison (acquisition density plot)
4.7 Each acquisition position inside an atomicity/integrity matrix

5.1 Abstract setup of our experiments
5.2 RAM module covered in cooling agent
5.3 Illustration of different ground state patterns
5.4 Observed effects due to missing cache write back
5.5 Scrambling patterns in DDR3 systems after a cold reboot
5.6 Mona Lisa picture series as recovered after a cold boot
5.7 RAM remanence of systems A to G and system J
5.8 RAM remanence of systems A, B, F, G and J over time and at different temperatures
5.9 Beginning of "Alice's Adventures in Wonderland" recovered from a cold boot attack
5.10 Scrambled storage of data and image acquisition

6.1 Typical partition entry in partition table
6.2 Example of essential and non-essential data

7.1 Grey-box virtual address translation analysis scheme
7.2 32-bit Paging PTE structures
7.3 Windows NT 32-bit Paging structures
7.4 x86 PAE Paging structures
7.5 x64 IA32e Paging structures
7.6 Windows NT x64 paging structures
7.7 Windows NT Demand Zero PTE
7.8 Windows NT Pagefile PTE
7.9 Windows NT Transition PTE
7.10 Windows NT Prototype PTE

List of Tables

3.1 Analyzed HDDs

4.1 Comparison of worst case atomicity and integrity deltas

5.1 List of tested computer systems and their corresponding RAM type and model
5.2 Ground state and bit decay relationship
5.3 List of observable RAM remanence in our test systems
5.4 Temperatures and bit errors for cold boot attacks with RAM transplantation
5.5 BIOS password circumvention methods for the various systems tested
5.6 Temperatures and bit errors for several transplantations without cooling

6.1 MBR partition table entry data fields and their type
6.2 MBR data fields and their type
6.3 GPT header

7.1 Size value for signature
7.2 Problem cases of virtual memory analysis
7.3 Results of reconstructing synthetic data
7.4 Results of reconstructing real life data

Listings

2.1 Partition table of extended partition loop exploit
2.2 Comment in foremost 1.5.7 file engine.c outlining an unfixed problem

3.1 OpenOCD command to dump memory sections of a WD3200AAKX HDD
3.2 Boot message on UART from Seagate ST2000DM001
3.3 Displaying the available commands over the ST2000DM001's UART

4.1 Command to dump the lowest 2 GiB of memory from QEMU into the file qemu.mem
4.2 Command to dump the memory from a VirtualBox virtual machine
4.3 inception indicating initialization of the IEEE 1394
4.4 Commandline used to invoke ProcDump
4.5 Invoking ProcDump leveraging process cloning for acquisition

5.1 Program to replace any character outside the ASCII printable range with a star

7.1 Pagefile registry entry
7.2 Pagefile registry entry
7.3 WinDbg displaying memory layout of _MMPAGING_FILE structure
7.4 Extracting the pagefile.sys with the Sleuthkit
7.5 _EPROCESS layout as per WinDbg
7.6 _EPROCESS signature constraints
7.7 Obtaining offset information for signature via WinDbg

1 Introduction

To new researchers in the field, digital forensics often appears to be a relatively new area of research with many unsolved problems. While the latter may be true, the former is not: As early as 1984, the FBI established its Computer Analysis and Response Team [39, p. 3], and already eight years later Collier and Spaul [24] introduced the then unknown field of computer forensics to academia. Like many areas of computer science, the area of digital forensics has developed quickly, which has caused many problems with standardization and training. Today, however, the area has a solid scientific foundation as laid out by standard works such as Casey's book "Digital Evidence and Computer Crime" [20] from 2004 or Carrier's book "File System Forensic Analysis" [18] from 2005. A worldwide active community gathers at annual conferences such as the Digital Forensics Research Conference (DFRWS), and the problem of training has been addressed by various academic degree programs [4], one being the German Master of Science Digitale Forensik [16], for which the author of this thesis has been providing tutoring during the time of preparing this work. Hence, today, digital forensics can be considered a well-established scientific research field.

Looking at the present groundwork in the area, the standard pattern for performing forensic research on some hardware or software is the following: The first goal is to understand the system by performing experiments or "tactical reverse engineering" [53]. The next step then is to characterize the traces left within the system to allow investigators to draw conclusions upon discovery of such traces. The standard assumption mostly is that the system operates under "normal" circumstances, meaning that the technology is off-the-shelf and the opponent is a standard computer user. Both assumptions are perfectly sound in the majority of scenarios of forensic practice. However, the establishment of digital forensics and the growing awareness of vendors, programmers, and criminals of the field is slowly changing the rules of the game. We have entered a time where opponents and technologies are increasingly clever, meaning that they are aware of the strengths and especially the shortcomings of digital forensic techniques.

Interestingly, the field of classical forensic science has undergone a similar evolution. For example, 100 years ago it was possible, say for an axe murderer, to cover his tracks by wiping away the victim's blood at the crime scene. If this was done well, it was very easy for the police to overlook traces of blood, and thus the case would not be investigated further due to lack of evidence. However, in 1937, Walter Specht, a German forensic scientist, established the use of luminol to detect blood [136]. The procedure uses the fact that luminol reacting with hydrogen peroxide becomes luminescent if a catalyst, in this case the hemoglobin in blood, is present [8]. Therefore, blood stains could now be detected even after they had been cleaned [27]. Investigative procedures, therefore, became much more robust against "clever" opponents. The example clearly shows the effects of ignoring the clever opponent.
Ignoring the abilities of an adversary can be characterized by a state of naïve innocence. Today, classical forensic sciences are well-prepared against adversarial manipulative procedures, i.e., they have grown up from a state of innocence to a state of self-reflected maturity. Failing to leave the literary “age of innocence” [166] can have catastrophic consequences. We, therefore, believe that it is now necessary for digital forensic science to establish the “clever” opponent as the normal investigative case.

In this thesis, we address several anti-forensic methods against digital forensic procedures. Anti-forensics is becoming the established term for the techniques used by "clever" adversaries today. As with manipulations in classical forensics, it is very simple to manipulate a digital system in order to hide evidence from an investigator. And it is again rather difficult to detect even some of the simplest manipulations. For example, on the Windows operating system, hiding a process from the Task Manager via direct kernel object manipulation (DKOM) [17] requires only a kernel driver consisting of a mere 100 lines of C code. However, its detection is vastly more complicated and has even led to a multi-million dollar business branch of anti-virus software. Like in classical forensics, the deeper the investigator's analysis goes, the harder it becomes to hide evidence. In a sense, this thesis attempts to find the digital equivalent of luminol. For example, for a rootkit to run, its code must be present on the system. This leaves "residue" on the system. Now, given an untampered view of the system, an investigator can, when provided with a complete analysis technique, detect the rootkit's code "residue". This detection procedure can be seen as the digital equivalent of applying luminol to detect blood residue. In this thesis, we present theories, techniques, and methods that extend the field of digital forensics in this regard, with the focus on providing untamperable acquisition and complete analysis methods, so all evidence "residue" can be detected even though a "clever" opponent tried to hide it from the forensic analyst.
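The DKOM technique mentioned above essentially unlinks a process's entry from the kernel's doubly linked list of active processes; the process keeps running because the scheduler does not depend on that list. As a rough illustration (a Python model of the list manipulation, not actual Windows kernel code; all names are made up), consider:

```python
class ProcessNode:
    """Simplified stand-in for a kernel process list entry (e.g. EPROCESS)."""
    def __init__(self, name):
        self.name = name
        self.flink = self  # forward link
        self.blink = self  # backward link

def insert_after(node, new):
    """Link `new` into the circular doubly linked list right after `node`."""
    new.flink = node.flink
    new.blink = node
    node.flink.blink = new
    node.flink = new

def dkom_hide(node):
    """Unlink `node` from the list, DKOM-style: the object itself survives
    (the process keeps running), but a list walk no longer reaches it."""
    node.blink.flink = node.flink
    node.flink.blink = node.blink
    node.flink = node  # let the node point at itself so it
    node.blink = node  # does not expose its former neighbors

def walk(head):
    """Naive list walk, as a Task Manager-like tool would perform."""
    names, cur = [], head.flink
    while cur is not head:
        names.append(cur.name)
        cur = cur.flink
    return names

head = ProcessNode("<head>")
procs = {name: ProcessNode(name) for name in ("init", "explorer", "malware")}
for node in procs.values():
    insert_after(head.blink, node)  # append at the tail

before = walk(head)                 # ['init', 'explorer', 'malware']
dkom_hide(procs["malware"])
after = walk(head)                  # ['init', 'explorer'] -- hidden, yet alive
print(before, "->", after)
```

A complete analysis technique, in the sense used above, would not walk the list at all but scan memory for the process object's "residue", which survives the unlinking.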

1.1 Contributions

The contributions of this doctoral thesis to the field of rootkit and anti-forensic resistant computer forensic procedures concern different areas, such as persistent data acquisition, memory acquisition, and virtual memory analysis.

In the field of hard drive firmware rootkit resistant persistent data acquisition (Chapter 3) our contributions are as follows:

• We provide, to our knowledge, the first forensic discussion regarding the acquisition of data from hard disks with manipulated firmware.

• We provide an analysis of an already published hard disk bootkit. To this end, we outline methods for detection and subversion of the bootkit located in the hard disk's EEPROM. As a practical evaluation, we provide a procedure to verify the legitimacy of firmware contained in the EEPROM of the Western Digital HDD model WD3200AAKX. To this end, we cross-compared the firmware of 16 WD3200AAKX HDDs.

• We provide a theoretical discussion on how hard disk rootkits residing in the firmware overlays and/or modules stored in the so-called Service Area can be detected.

• To facilitate the transferability of our contributions we investigate 20 different hard drive models from different vendors. Of these 20 hard drive models, 16 are HDDs while 4 are SSDs.
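The cross-comparison of EEPROM contents described above can be sketched as follows: hash fixed-size regions of each drive's firmware dump and flag any drive that deviates from the per-region majority. This is a simplified, hypothetical illustration (in-memory byte strings stand in for real dumps, the region size is arbitrary, and real firmware also contains per-drive adaptive data that would have to be excluded), not the exact procedure of Chapter 3:

```python
import hashlib
from collections import Counter

def region_hashes(dump, region_size=0x1000):
    """Split a firmware dump into fixed-size regions and hash each one."""
    return [hashlib.sha256(dump[i:i + region_size]).hexdigest()
            for i in range(0, len(dump), region_size)]

def flag_outliers(dumps):
    """Cross-compare equally sized dumps from drives of the same model.
    For every region, the hash seen on the majority of drives is taken as
    the reference; drives deviating in any region are flagged as suspect."""
    hashes = {drive: region_hashes(d) for drive, d in dumps.items()}
    n_regions = len(next(iter(hashes.values())))
    suspects = set()
    for r in range(n_regions):
        majority = Counter(h[r] for h in hashes.values()).most_common(1)[0][0]
        for drive, h in hashes.items():
            if h[r] != majority:
                suspects.add(drive)
    return suspects

# Three clean drives and one with a patched region (simulated bootkit).
clean = bytes(0x3000)                              # 3 regions of zero bytes
infected = bytearray(clean)
infected[0x1000:0x1010] = b"BOOTKIT PAYLOAD!"      # tamper with region 1
dumps = {"hdd0": clean, "hdd1": clean, "hdd2": clean, "hdd3": bytes(infected)}
print(flag_outliers(dumps))  # {'hdd3'}
```

The majority vote is what makes the comparison work without a vendor-supplied reference image: as long as most drives of a model are clean, an infected drive stands out.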

In the field of rootkit and anti-forensic memory acquisition procedures (Chapters 4 and 5) our contributions are as follows:

• We develop a novel black-box analysis method to practically evaluate forensic memory acquisition tools with regard to atomicity and integrity as defined by Vömel and Freiling [163]. Thus, we extend the insights of Vömel and Stüttgen [164].

• We evaluate 12 memory forensic acquisition tools and methods according to our method with regard to their atomicity and integrity.

• We further provide a discussion concerning the anti-forensic risks involved with each memory forensic acquisition technique and method.

• Based on the results of our evaluation, we analyzed the most favorable memory acquisition method, the cold boot attack, in more detail. To this end, we provide an independent study based on 12 computer systems with different hardware configurations and thus verify the empirical practicability of cold boot attacks against DDR1 and DDR2.

• We provide empirical measurements showing the correlation between temperature and RAM remanence. The results of these measurements demonstrate that cooling the surface temperature of a DDR1 or DDR2 module by just 10 ℃ can already prolong the remanence effect notably.

• We further enable cold boot attacks against scrambled DDR3 memory.


• Last, we argue that all software-based countermeasures to the cold boot problem published since 2008 can be circumvented by transplanting the RAM modules from the victim's running computer to another computer controlled by the attacker. We further provide an overview of circumvention methods against BIOS-based cold boot countermeasures.

Besides the above contributions to the field of anti-forensic evidence acquisition, we also contribute to the field of forensic data analysis theory (Chapter 6) with the following contributions:

• First, we revisit Carrier's definition of essential data [18] and show that there are two types of essential data: strictly and partially essential. While strictly essential corresponds to Carrier's definition, partially essential refers to application-specific interpretations.

• We use our new extended definitions to build a trust hierarchy with regard to anti-forensic resistance.

• We empirically show the amount of strictly and partially essential data in DOS/MBR and GPT partition systems, thereby complementing, extending and verifying Carrier's findings. We, therefore, also empirically show which data within the DOS/MBR and GPT partition tables is more resistant against anti-forensic manipulations given a specific system.

Last but not least, we make contributions to the field of virtual memory analysis (Chapter 7). These contributions can be summarized as follows:

• We eradicate a forensic blind spot from virtual memory analysis of the Windows NT operating system. Namely, we integrate the swapped-out pages stored on persistent storage in the pagefile.sys into the virtual memory analysis.

• To this end, we provide a gray-box analysis method to verify virtual address translation implementations. This method can be used to verify the correctness of virtual memory analysis software as well as to analyze the virtual address translation mappings of unknown systems. We evaluate the practicability of our method by analyzing the partially known Windows NT paging behavior, with a distinct focus on the pagefile.

• Our prototype tools work for virtual address space reconstruction of Windows NT versions 7, 8.1 and 10 for both x86 systems, with 32-bit or PAE paging, as well as x64 systems, with IA32e paging.

• With this work, we, last but not least, provide a reference for Windows NT virtual address translation, by summarizing our verification of previous research updated with our own findings.

Combining all our contributions, we hope to further the field of rootkit and anti-forensic resistant digital forensic procedures to a point where it becomes at least unfeasible for an opponent to hide digital evidence from forensic analysis and, therefore, to take the first steps to leave the age of anti-forensic innocence.
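The gray-box idea of placing traceable data into virtual memory can be sketched as follows: fill a buffer with one unique, searchable marker per page, so that any page later found in a RAM image or the pagefile can be mapped back to its position in the original buffer. The sketch below is a hypothetical illustration (marker format and page size are assumptions, and the "dump" is only simulated), not our actual tooling:

```python
PAGE_SIZE = 4096  # assumed x86 page size

def make_marker(page_index):
    """Build a 16-byte marker encoding the page's index in the buffer."""
    return b"GRAYBOX!" + page_index.to_bytes(8, "little")

def fill_pages(n_pages):
    """Return a buffer holding one repeated unique marker per page."""
    buf = bytearray()
    for i in range(n_pages):
        marker = make_marker(i)                  # 16 bytes
        buf += marker * (PAGE_SIZE // len(marker))
    return bytes(buf)

def find_pages(dump):
    """Scan a dump for markers; map page-aligned offsets to page indices."""
    hits, pos = {}, dump.find(b"GRAYBOX!")
    while pos != -1:
        idx = int.from_bytes(dump[pos + 8:pos + 16], "little")
        hits.setdefault(pos // PAGE_SIZE * PAGE_SIZE, idx)
        pos = dump.find(b"GRAYBOX!", pos + 1)
    return hits

buf = fill_pages(4)
# Simulate a dump in which the memory manager reordered the pages.
dump = buf[2 * PAGE_SIZE:3 * PAGE_SIZE] + buf[0:PAGE_SIZE]
print(find_pages(dump))  # {0: 2, 4096: 0}
```

Repeating the recovered positions across RAM images and pagefile extracts is what allows the address translation, including pagefile-backed pages, to be inferred and verified.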

1.2 Related work

Probably the first mention of anti-forensics was in 2002 by "the grugq" with his work "Defeating Forensic Analysis on Unix" in the hacker magazine Phrack [152]. However, the first mention of a problem with such attacks on digital forensic procedures from within the forensic community was probably not until Geiger's "Evaluating Commercial Counter-Forensic Tools" [55] in 2005. Back then, the term anti-forensics had not been established yet. Other works by Geiger et al. referring to counter-forensics followed in 2006 [56, 57]. Already in the same year, "Arriving at an anti-forensics consensus: Examining how to define and control the anti-forensics problem" by Harris [74] attempted to standardize methods of addressing the anti-forensic problem by defining the term, categorizing the anti-forensic techniques and outlining general guidelines to protect forensic integrity. Also in 2006, Sartin [126] published "ANTI-Forensics – distorting the evidence". It, like the other works at that time, outlines the anti-forensic problem. "Anti-forensics and the digital investigator" by Kessler [86] extends previous definitions of anti-forensics with the notion of time-based anti-forensics, the methods of which simply try to delay the forensic process to the point where the information gathered from it is obtained too late and thus loses its evidentiary value or stops being usable at all, e.g., because the statute of limitations has run out on a crime.

In 2007, "Anti-forensics: Techniques, detection and countermeasures" by Garfinkel [52] was published. It outlined anti-forensic techniques, most notably the 42.zip bomb, a decompression exploit. The 42.zip is a 42 KiB sized ZIP archive which extracts to almost 4 PiB of data. This is achieved by recursively compressing identical ZIP archives into another ZIP archive. The resulting large data expansion will cause the investigator to run out of storage space, especially when the analysis is automated. Most current systems are aware of such ZIP bombs and limit their level of recursive unpacking. More such exploits against forensic tools are outlined in "Anti-forensics with a small army of exploits" by Hilley [77]. It focuses on the Metasploit Anti-Forensic Investigation Arsenal (MAFIA). In the same year, "Can we trust digital image forensics?" by Gloe et al. [59] was also published. It raised the problem of anti-forensics with regard to digital image forensics, which today is known as the field of multimedia forensics. Many works about the anti-forensics problem followed [12, 49, 119, 142, 117, 28, 48, 47]. Several works even formalize the problem [120, 142]. Other works include various exploits used to impact forensic investigations in one way or another [153, 168]. Works that specifically focus on anti-forensic measures on Android smartphones [33, 1, 84] or other smartphones [6, 38] also exist. A field that really thrived with anti-forensics research was multimedia forensics.
Beginning in 2010, many works have been released, first developing methods to hide data within multimedia files, then detecting those methods [60, 140, 137, 138, 139, 141, 158, 157, 149, 91]. This research continued throughout the years 2012 to 2013 [9, 143, 15, 159, 42, 167, 41, 116]. The hype seems to have ceased after 2014, with only a few publications in this area per year [174, 43, 44]. Closely related to multimedia forensics is steganography, which is used as an anti-forensic measure by steganographically hiding evidence [147, 23].

In 2013, Stüttgen and Cohen presented "Anti-forensic resilient memory acquisition" [145], in which they detailed a software-driven memory acquisition method that remains reliable even on a compromised system. However, publications specifically focusing on the development of anti-forensic resistant acquisition methods are unfortunately still rare.

While not directly related to the anti-forensic problem, we would like to argue that Kerckhoffs' principle [85] should also be applicable to forensics, i.e., the fact that the enemy knows the system should not impact its security. Applied to digital forensics, this means that forensic methods should be anti-forensic resistant to the point where evading them becomes impracticable — even if the adversary knows the method. This may not always be possible; however, it should still be a goal to strive for. As the classic forensic example shows, it is very simple for an adversary who knows the forensic system to sabotage it, e.g., by wiping blood away. While it is often difficult to remedy such a problem, e.g., by introducing the use of luminol for detecting blood residue, in the end the effort is well worth it, as the forensic procedure thereby becomes much more robust against an adversary: cleaning blood stains so that they are no longer detectable is a lengthy procedure [27]. To this end, forensic research should not be satisfied with being able to analyze the web browser history while hoping an adversary will not figure out how to delete or disable it. Rather, forensic research must always investigate new avenues for forensic evidence, e.g., analyzing the web browser's cache, cookies, and temporary files.

This year, in 2016, two anti-forensic works were presented at the Digital Forensics Research Conference (DFRWS). The first [94] outlines how memory analysis, i.e., the identification and analysis of process structures, can be bootstrapped in an anti-forensic resistant way. In the second [25], the authors survey various anti-forensic tools and provide an extended taxonomy. These two very recent works at the leading conference in the field of digital forensics clearly show the timeliness and importance of this thesis' topic.

In this section, we listed related works that deal with the digital anti-forensic problem in various fields. More specific related work, including a demarcation from the work presented in this thesis, is cited and discussed separately in each chapter.


1.3 Publications

During the preparation of this work, partial results and other related works have been published at peer-reviewed conferences and workshops as well as in peer-reviewed journals. The previous publications by the author, who is marked via underlining, are:

[67] Michael Gruhn and Tilo Müller. On the practicability of cold boot attacks. In 2013 International Conference on Availability, Reliability and Security, ARES 2013, Regensburg, Germany, September 2-6, 2013, pages 390–397, 2013. doi: 10.1109/ARES.2013.52. URL http://dx.doi.org/10.1109/ARES.2013.52.

[65] Michael Gruhn. Windows NT pagefile.sys virtual memory analysis. In Ninth International Conference on IT Security Incident Management & IT Forensics, IMF 2015, Magdeburg, Germany, May 18-20, 2015, pages 3–18, 2015. doi: 10.1109/IMF.2015.10. URL http://dx.doi.org/10.1109/IMF.2015.10.

[50] Felix Freiling and Michael Gruhn. What is essential data in digital forensic analysis? In Ninth International Conference on IT Security Incident Management & IT Forensics, IMF 2015, Magdeburg, Germany, May 18-20, 2015, pages 40–48, 2015. doi: 10.1109/IMF.2015.20. URL http://dx.doi.org/10.1109/IMF.2015.20.

[131] Maximilian Seitzer, Michael Gruhn, and Tilo Müller. A bytecode interpreter for secure program execution in untrusted main memory. In Computer Security - ESORICS 2015 - 20th European Symposium on Research in Computer Security, Vienna, Austria, September 21-25, 2015, Proceedings, Part II, pages 376–395, 2015. doi: 10.1007/978-3-319-24177-7_19. URL http://dx.doi.org/10.1007/978-3-319-24177-7_19.

[165] Philipp Wachter and Michael Gruhn. Practicability study of android forensic research. In 2015 IEEE International Workshop on Information Forensics and Security, WIFS 2015, Roma, Italy, November 16-19, 2015, pages 1–6, 2015. doi: 10.1109/WIFS.2015.7368601. URL http://dx.doi.org/10.1109/WIFS.2015.7368601.

[51] Felix Freiling, Jan Schuhr, and Michael Gruhn. What is essential data in digital forensic analysis? it - Information Technology, 57(6):376–383, 2015. URL http://www.degruyter.com/view/j/itit.2015.57.issue-6/itit-2015-0016/itit-2015-0016.xml.

[32] Andreas Dewald, Felix Freiling, Michael Gruhn, and Christian Riess. Forensische Informatik. Books on Demand, Norderstedt, 2nd edition, 2015. ISBN 978-3-8423-7947-3.

[66] Michael Gruhn and Felix Freiling. Evaluating atomicity and integrity of correct memory acquisition methods. Digital Investigation, 16, Supplement:S1–S10, 2016. ISSN 1742-2876. doi: 10.1016/j.diin.2016.01.003. URL http://www.sciencedirect.com/science/article/pii/S1742287616000049. Proceedings of the Third Annual DFRWS Europe.

[10] Johannes Bauer, Michael Gruhn, and Felix Freiling. Lest we forget: Cold-boot attacks on scrambled DDR3 memory. Digital Investigation, 16, Supplement:S65–S74, 2016. ISSN 1742-2876. doi: 10.1016/j.diin.2016.01.009. URL http://dx.doi.org/10.1016/j.diin.2016.01.009. Proceedings of the Third Annual DFRWS Europe.


These publications are incorporated in this thesis as follows:

Chapter 3 on page 15 is based on as yet unpublished work authored solely by the author of this thesis.

Chapter 4 on page 35 is based on "Evaluating Atomicity and Integrity of Correct Memory Acquisition Methods" [66], which is a joint work with Felix Freiling. The idea, implementation, and evaluation, however, were individual work by the author of this thesis. This previous publication was amended with a discussion of anti-forensic resilience in addition to atomicity and integrity. Chapter 4 further reuses general explanations of memory acquisition techniques from "Windows NT pagefile.sys Virtual Memory Analysis" [65], which have been merged into this chapter to avoid needless repetition in this thesis.

Chapter 5 on page 51 is based around "On the Practicability of Cold Boot Attacks" [67], with Section 5.5.1 on page 65 being based on "Lest We Forget: Cold-Boot Attacks on Scrambled DDR3 Memory" [10]. The 2013 work "On the Practicability of Cold Boot Attacks" [67] revisited cold boot attacks and how all published software-based cold boot attack mitigations can be circumvented. It further identified a problem with the cold boot attack memory acquisition procedure against modern DDR3 memory. Research on the attack against DDR3 continued until 2015, when we were eventually able to formulate an attack against DDR3 memory in joint work. The results were published as a joint work [10] in 2016. Because the final solution was obtained through a joint effort, Chapter 5 focuses mainly on the results obtained in "On the Practicability of Cold Boot Attacks" [67] and only contains a condensed write-up of the joint work "Lest We Forget: Cold-Boot Attacks on Scrambled DDR3 Memory" [10] in Section 5.5.1 on page 65. Section 5.6.2 on page 70 further contains a reference to "A Bytecode Interpreter for Secure Program Execution in Untrusted Main Memory" [131]. Even though the implementation was done by Maximilian Seitzer as part of his master's thesis, the idea was solely the author's. But because the resulting publication was also based on Maximilian Seitzer's master's thesis, it is only referenced as an example of RAM encryption systems preventing cold boot attacks and not used in its entirety.

Chapter 6 on page 71 is based on the two publications both named "What is essential data in digital forensic analysis?" [50, 51], one by Freiling and Gruhn and the other by Freiling et al. While the main idea of the work, i.e., refining Carrier's definition of essential data in digital forensic analysis [18], was stipulated by Felix Freiling, the technical evaluation of the work can be attributed to the author of this thesis. The work was further extended with a discussion of the relationship between essential data and anti-forensics. Because the work with Schuhr [51] is itself based on the earlier work by Freiling and Gruhn, but focuses on the legal aspects of essential data in digital forensic analysis without any technical evaluation, it was not used in this thesis. The author's contribution to the second edition of the book "Forensische Informatik" [32] is also based on "What is essential data in digital forensic analysis?" Additional material contributed to "Forensische Informatik" has not been used in this thesis.

Chapter 7 on page 85 is based on "Windows NT pagefile.sys Virtual Memory Analysis" [65], which was an individual work by the author.


1.4 Overview

In Chapter 2 on the following page, the background chapter, we first give an overview of forensics and anti-forensics, both in the classical physical realm and in the digital realm. We define the main goal of anti-forensics. We further motivate and illustrate the main objective of this thesis with practical examples. After this introductory background chapter, this thesis is devoted to state-of-the-art anti-forensic and rootkit problems.

In Chapter 3 on page 15, we investigate how data can best be acquired from hard drives that are potentially compromised by a firmware rootkit. To this end, we first outline the threat of hard drive firmware rootkits to forensic analysis. We then provide a procedure to detect and subvert already published hard disk drive firmware bootkits. We further outline potential avenues to detect hard drive firmware rootkits nested deeper within the hard disk drive's so-called Service Area, a special storage on the magnetic platter reserved for use by the firmware.

In Chapter 4 on page 35, we shift towards the acquisition of volatile storage, i.e., RAM. To this end, we evaluate the quality, both with regard to atomicity and integrity as well as anti-forensic resistance, of different memory acquisition techniques with our newly developed black-box analysis technique. This evaluation showed the cold boot attack to be the most favorable anti-forensic resistant memory acquisition technique.

In Chapter 5 on page 51, we examine the cold boot attack in detail. First, we experimentally confirm that cooling the RAM modules prolongs the remanence effect considerably. Then we prove, experimentally, that transplanting RAM modules from one system to another is possible. We further address the issue of memory scrambling in modern DDR3 technology as well as other proposed countermeasures, such as BIOS passwords and temperature detection.

In Chapter 6 on page 71, we start addressing the analysis of evidence. To this end, we first revisit the theory of data analysis, namely the concept of essential data in forensic analysis as coined by Carrier [18]. After extending Carrier's theories, we practically verify both the original theories and our extensions in experiments. We further argue that the essential data concept can be used to build a trust hierarchy, from which we conclude that anti-forensic resistant analysis methods must only use what we call strictly essential, i.e., trusted, data.

In Chapter 7 on page 85, we tackle a long-unsolved problem in forensic memory analysis. To this end, we analyze Windows NT virtual memory paging via a newly conceived gray-box analysis method, in which we place traceable data into virtual memory and force the data both into physical RAM and into being swapped out to the pagefile.sys stored on persistent storage. We are thus able to reverse engineer the complete virtual address mapping, including the non-mapped pagefile. We present the results of this reverse engineering as well as practical evaluations to prove the correctness of our findings.

In Chapter 8 on page 105, we finally conclude this thesis.

2 Background

In this chapter, we give a brief introduction to and motivation of forensics and anti-forensics. We illustrate them with examples that motivate our work. We also give a brief outline of rootkits and an introduction to firmware rootkits, which we address later in this thesis.

2.1 Forensics

In general, forensic science is the application of scientific methods in the pursuit of answering questions of law [20]. In forensic investigations, evidence is discovered, collected, preserved, analyzed and eventually presented in a court of law. Forensic scientists should only provide objective evidence. They should not be biased. They do not decide whether a suspect is guilty or innocent. They only provide facts on which a judge or other entity can base their verdict.

There are different forensic science fields. They are usually based on well-established scientific research fields.

• Forensic anthropology uses anthropology, e.g., to analyze skeletonized human remains in criminal cases.

• Forensic botany uses botany to answer questions such as: "Where do the leaves and/or pollen found at a crime scene come from?" This could reveal information regarding the whereabouts of either the victim or the suspect before a crime.

• Forensic chemistry uses chemistry to answer questions such as: "Is there gunshot residue on the hands of a suspect?" This can determine whether the suspect has fired a weapon. "Are there residues of accelerants in the ashes of a fire?" If there are, it would indicate arson instead of an accidental fire.

• Forensic dactyloscopy uses dactyloscopy, which is the field of fingerprint identification, to match up two sets of fingerprints by comparing their minutiae and determining whether they are equal or not.

• Forensic engineering uses engineering to answer questions such as: "Did the car's brake line rupture before the crash?" or "Why did a building collapse?"

• Digital forensics, also known as computer forensics or forensic computing, uses research methods from computer science to answer questions based on digital evidence, such as: "Is a specific file stored on a specific computer system?", "How did the file get onto the computer system?", "Which user account is responsible for the download of the file?"

As can be seen from the above list, all forensic field names are a combination of the word "forensic" and the name of the underlying field. The term forensic computing was proposed in 2014 by Dewald and Freiling [31]. It was not only introduced to bring the name into line with the naming of other forensic fields, but also to distinguish so-called digital forensic tasks usually performed by "ordinary" criminal investigators, such as copying hard disks, from the computer science tasks behind complex digital forensic analyses. This can be compared to a police officer taking a DNA swab from a suspect and the forensic scientist actually performing the DNA analysis in a laboratory. In this case, the police officer is usually not considered to be doing forensic science, while the latter definitely is. In this work, however, the terms forensic computing and digital forensics are used interchangeably.


In every forensic field, certain steps exist, such as discovery, acquisition (also known as collection), preservation, analysis, and presentation. In this work, we only deal with acquisition and analysis, because these are the most relevant targets for anti-forensic measures. In the next two sections, we outline the important aspects of acquisition and analysis.

2.1.1 Acquisition

Before evidence can be analyzed, it must first be acquired. In a classical physical forensic discipline, this means collecting fingerprints, hair, fiber particles, etc. from the crime scene. Here, special care must be taken not to destroy the evidence during acquisition; i.e., if a fingerprint is not acquired correctly, e.g., it is smeared during the acquisition process, it could be lost forever. In physical forensics, the acquisition process is closely related to preserving the evidence, i.e., making sure it retains its authenticity during the investigation. This task is trivial during digital investigations, because 1:1 copies of digital data are possible. In digital forensics, acquiring the evidence thus means making such a 1:1 copy of the data to be acquired. While this may seem trivial for hard drives, which can simply be cloned, either via software or special hardware, we show throughout this thesis that this is not always the case. For example, to read a hard drive, the forensic investigator, according to the current state of the art, uses firmware running on the hard drive to do so. However, as we show, such firmware can be manipulated and interfere with the acquisition process. Such interference is called anti-forensics, and we discuss it further in Section 2.2.

2.1.2 Analysis

Next, after the evidence has been acquired, it needs to be analyzed. One classic example would be fingerprint matching. A fingerprint acquired at a crime scene is compared to the fingerprints of suspects. If there is a match, the suspect can be connected to the crime scene. Sometimes the actual evidence, or at least a part of it, gets destroyed during analysis, e.g., during chemical analysis, in which particles acquired at the crime scene must be dissolved in order to analyze them further. In digital forensics, such destruction of evidence is generally not needed, again because of the possibility of a 1:1 copy. The analysis process in digital forensics consists of interpreting the acquired data: e.g., a 512-byte block of data could either be a file system structure, a network packet, or any other digital data. If the data is not interpreted correctly, the analysis is faulty. As before, such analysis errors can be provoked via anti-forensic methods, which we discuss in the following section.

2.2 Anti-forensics

Definition 1. Anti-forensics is any measure that prevents a forensic analysis or reduces its quality.

Any measure that prevents a forensic analysis can, according to Definition 1, be considered an anti-forensic measure. This means that any measure taken to prevent the forensic acquisition must also be considered an anti-forensic measure, because, as outlined earlier, if no evidence is acquired, it cannot be analyzed. A classic example of an anti-forensic measure are gloves. A person wearing gloves generally does not leave fingerprints. Hence, a criminal wearing gloves at the crime scene prevents the analysis of his fingerprints by simply not leaving any in the first place. However, Definition 1 also specifies a reduction in quality to be sufficient for a measure to qualify as an anti-forensic measure. One example would be to shred documents before they are acquired by a forensic accountant. While ultimately the forensic accountant will be able to puzzle the documents back together, the time it takes to do so impacts the quality of the forensic analysis severely. Because time spent during a forensic investigation always costs money, the quality of the forensic analysis is further impacted. Depending on the severity of the case, such time investments may often not be justifiable. Also, if too much time passes during analysis, gathered evidence may lose its evidentiary value or stop being usable at all, e.g., because the statute of limitations has run out on a crime.


It is important to note that not only measures destroying and/or preventing evidence should be considered anti-forensic measures, but really all measures that, as per Definition 1, prevent or reduce the quality of a forensic analysis. An example would be to manipulate a crime scene after a crime, i.e., leave and/or even plant more evidence than was originally present after the crime. To this end, a criminal could scatter the contents of a public ashtray at a crime scene, rendering analysis rather difficult. Another example would be to outright frame somebody else for the crime by stealing personal items from the person to be framed and leaving them at the crime scene. In the next two sections, we outline anti-forensics during acquisition as well as analysis in the context of digital forensics.

2.2.1 Acquisition

The most efficient and safest way to impact a forensic analysis is to interfere with the acquisition process itself. If there is no acquisition, there can be no analysis. The acquisition is an especially crucial step in the digital forensic process because, unlike the analysis, as we show later, it can often not be repeated. Once digital evidence is gone, it is gone. This is especially true for volatile data, such as RAM contents, on which we therefore mainly focus in this thesis.

2.2.1.1 Technical/practical impossibility

One way to prevent the acquisition of evidence is by making it technically impossible to acquire the evidence in the first place. To this end, hardware can be used to prevent a forensic analyst from gaining access to the data. One such instance that comes to mind are smartcards, more specifically SIM (subscriber identity module) cards. These cards can be protected by a PIN. The data on them, such as stored phone numbers, can only be accessed when the correct PIN has been entered. The card can also lock itself once the PIN has been entered incorrectly too many times, effectively preventing brute force attacks.

Another example is storing data in processor registers, which, unlike RAM, can, at least according to current consensus [108], not be acquired via physical attacks. The most prominent example of this anti-forensic technique is TRESOR [108], which stores keys in CPU registers so they cannot be acquired via physical attacks.

Here, it could be argued whether or not encryption can be considered a measure that prevents acquisition. We would like to argue that it does not, but rather prevents analysis, because the data itself can be acquired and, at least in theory, the key can eventually be brute-forced. However, we will not go into this semantic debate.

2.2.1.2 Manipulations

Another way to facilitate anti-forensics is via manipulations of the system that is to be acquired. In 2013, Stüttgen and Cohen presented their work "Anti-forensic resilient memory acquisition" [145], in which they showed a simple one-byte manipulation that breaks many memory acquisition tools for Linux. To this end, the memory layout information presented by the Linux kernel was manipulated via direct kernel object manipulation (DKOM). More specifically, under Linux, /proc/iomem exposes the memory resource tree to user space. Only memory regions titled "System RAM" are guaranteed to be backed by RAM and hence accessible. Changing this to "System ROM" does not cause any problems to the system. However, many memory acquisition tools will not acquire those memory sections anymore, assuming those sections are actually ROM and not RAM. Preventing the forensic analyst from acquiring the memory logically also prevents memory analysis.
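The effect of such a relabeling can be illustrated with a short Python sketch (our own illustration, not Stüttgen and Cohen's code) that selects memory regions the way a naive acquisition tool might, i.e., by keeping only top-level /proc/iomem entries labeled "System RAM":

```python
import re

def ram_regions(iomem_text):
    """Return (start, end) ranges of top-level regions labeled 'System RAM',
    mimicking how a naive acquisition tool selects memory to dump."""
    regions = []
    for line in iomem_text.splitlines():
        # Top-level entries start at column 0; child entries are indented.
        m = re.match(r"([0-9a-f]+)-([0-9a-f]+) : (.+)$", line)
        if m and m.group(3) == "System RAM":
            regions.append((int(m.group(1), 16), int(m.group(2), 16)))
    return regions

before = ("00001000-0009ffff : System RAM\n"
          "000a0000-000fffff : Reserved\n"
          "00100000-1fffffff : System RAM")
# The DKOM-style relabeling: the second region now claims to be ROM.
after = before.replace("00100000-1fffffff : System RAM",
                       "00100000-1fffffff : System ROM")

print(ram_regions(before))  # both RAM regions would be acquired
print(ram_regions(after))   # the relabeled region is silently skipped
```

Running the sketch on the relabeled input shows the second region silently disappearing from the set of regions a tool would acquire, exactly the blind spot the attack creates.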


2.2.1.3 Tool errors

Another possibility for anti-forensics during acquisition are tool errors. While it is debatable whether virtually all anti-forensic problems can be attributed to tool problems, i.e., most tools are, at the current time, unfortunately not anti-forensic resistant, we will not enter a debate or discussion about what should and what should not be considered a tool error. We will rather give one clear example of an anti-forensic attack, to illustrate how errors in the tools used by the forensic analyst can be attacked with anti-forensic measures. To this end, we outline an error that allowed us to manipulate the partition table of a storage medium in such a way that it could not be acquired by Linux. The error was first described by Wundram et al. [168] as an anti-forensic attack against the tool mmls from Brian Carrier's Sleuthkit. However, the described extended partition loop in the DOS/MBR partition table also works to attack a Linux system, as documented by CVE-2016-5011. Basically, the attack works as follows: A DOS/MBR partition table is created with one regular partition. Then an extended partition is created. This extended partition then points to the DOS/MBR itself, thus creating a loop. The partition table as parsed by Linux's fdisk tool is shown in Listing 2.1. Note that fdisk does not recursively parse extended partitions. However, Linux's libblkid would recursively parse the loop until it ran out of memory, causing a denial of service (DoS) against the system.

$ fdisk -l deathmoch_active.dd
Bad offset in primary extended partition

Disk deathmoch_active.dd: 0 MB, 1536 bytes, 3 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0xc5952e09

              Device Boot Start End Blocks Id System
deathmoch_active.dd1          1   1     0+ 83 Linux
deathmoch_active.dd2          0   0     0+  5 Extended

Partition table entries are not in disk order

Listing 2.1: Partition table of extended partition loop exploit as displayed by fdisk
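A robust parser can defend against such loops with plain cycle detection while walking the chain of extended boot records (EBRs). The following Python sketch is our own illustration, not the code of mmls or libblkid, and the EBR chaining (which is normally relative to the first EBR) is simplified to absolute sector numbers:

```python
import struct

EXTENDED_TYPES = {0x05, 0x0f}  # DOS extended, Windows extended (LBA)

def primary_entries(sector):
    """Yield (type, start_lba) for the four 16-byte entries at offset 446."""
    for i in range(4):
        entry = sector[446 + 16 * i : 446 + 16 * (i + 1)]
        ptype = entry[4]
        start = struct.unpack_from("<I", entry, 8)[0]
        if ptype != 0:
            yield ptype, start

def walk_extended(read_sector, ebr_lba):
    """Follow the EBR chain; return True if a loop was detected."""
    seen = set()
    lba = ebr_lba
    while lba not in seen:
        seen.add(lba)
        nxt = None
        for ptype, start in primary_entries(read_sector(lba)):
            if ptype in EXTENDED_TYPES:
                # Simplification: treat the link as an absolute LBA.
                nxt = start
        if nxt is None:
            return False          # chain ends normally
        lba = nxt
    return True                   # revisited an EBR: partition loop

# Craft an MBR whose extended partition points back to sector 0 (itself).
mbr = bytearray(512)
mbr[446 + 4] = 0x83                      # entry 1: Linux partition
struct.pack_into("<I", mbr, 446 + 8, 1)
mbr[462 + 4] = 0x05                      # entry 2: extended, start LBA 0
struct.pack_into("<I", mbr, 462 + 8, 0)
mbr[510:512] = b"\x55\xaa"               # boot signature

print(walk_extended(lambda lba: mbr, 0))  # True: loop detected
```

The visited set bounds the walk by the number of distinct EBR sectors, so a self-referencing table like the one above terminates immediately instead of recursing until memory runs out.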

We were also able to find a similar anti-forensic attack against Windows 7 by setting the number of partition table entries to zero within the GPT header of a GPT-partitioned storage medium. Once such a prepared device was connected to a system running Windows 7, the system crashed with the so-called Blue Screen of Death, caused by a division by zero within the kernel.
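The same defensive-parsing idea applies here: a tool should sanity-check GPT header fields before using them in arithmetic. A minimal Python sketch (field offsets follow the GPT header layout; the rejected computation is our own illustration of the kind of arithmetic that can divide by zero, not the actual Windows code):

```python
import struct

def parse_gpt_header(header):
    """Parse the fields needed to read the partition entry array,
    rejecting values that would make later computations unsafe."""
    if header[0:8] != b"EFI PART":
        raise ValueError("not a GPT header")
    num_entries = struct.unpack_from("<I", header, 80)[0]  # entry count
    entry_size = struct.unpack_from("<I", header, 84)[0]   # bytes per entry
    if num_entries == 0 or entry_size == 0:
        # An unchecked parser might compute something like
        # array_bytes // num_entries here and die on a division by zero.
        raise ValueError("implausible GPT header: zero entries or entry size")
    return num_entries, entry_size

# Forge a header with zero partition table entries, as in the attack.
hdr = bytearray(92)
hdr[0:8] = b"EFI PART"
struct.pack_into("<I", hdr, 80, 0)    # number of partition entries: 0
struct.pack_into("<I", hdr, 84, 128)  # size of each entry

try:
    parse_gpt_header(hdr)
except ValueError as e:
    print("rejected:", e)
```

A parser with such a guard simply reports the medium as malformed instead of propagating the zero into kernel-level arithmetic.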

While both outlined attacks can be trivially prevented by updating the tools, a forensic investigator first confronted with them while using a vulnerable tool must waste time analyzing the problems caused by such manipulated partition tables instead of actually analyzing the storage medium's contents.

2.2.2 Analysis

If the acquisition cannot be prevented, the analysis must be attacked instead. However, because, at least in theory, the analysis can be repeated an infinite number of times, it can often not be prevented completely. This is due to the ability to construct 1:1 copies of digital evidence, hence granting the analyst theoretically infinite attempts. This means that should an attack be able to compromise an analysis, i.e., should the analysis fail, the analyst can restore the evidence from the 1:1 copy and try again and again until an analysis approach succeeds. Because this makes anti-forensic measures during acquisition more powerful, this thesis focuses primarily on acquisition rather than analysis. Nonetheless, we now outline several anti-forensic schemes preventing analysis.


2.2.2.1 Technical/practical impossibilities

As already mentioned, encryption could be considered to render analysis technically impossible, because, given a sufficiently large key space, a brute force attack can be physically impracticable. Besides encryption, there is technically nothing that can make analysis impossible. However, it may become impracticable, because even though an analyst can potentially make an infinite number of analysis attempts, this is not practical. At some point, either time or money will run out.

2.2.2.2 Manipulations

Even though manipulation, either of the acquired data or of the system used by the forensic analyst, should not be possible, it has been shown that JavaScript code could be injected into HTML reports produced by forensic tools [168]. Once those manipulated reports are viewed in a browser supporting JavaScript, the injected code is executed — on the analyst's computer. Again, it can be argued that such manipulation possibilities are errors in the tool producing the HTML report.

Another possibility for disturbing the forensic analysis are manipulations which, even though they have been performed on the evidence before the acquisition, manifest during the analysis process. One such example are timestamps in audio recordings. Obviously, those can be faked. However, a more anti-forensic resistant method can be used to determine when a recording was actually made rather than when its manipulable timestamp says it was. Usually, a power network provides alternating current with its typical frequency, i.e., the electric network frequency (ENF). However, due to fluctuations in the production and consumption of electricity, this frequency fluctuates over time. These fluctuations can be extracted from an audio recording and correlated to the ENF fluctuations [78]. This technique is known as the ENF criterion. Another example is the identification of the source camera of a picture. The metadata, such as EXIF data, embedded inside the picture can, again, be easily forged. However, the unique camera fingerprint embedded inside the picture, caused by photo response non-uniformity (PRNU) noise, the shape of the lens, and other uncontrollable production inaccuracies, can be used to identify the source of a picture [58], even if it is heavily compressed [3]. This method is also applicable to videos [161].
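The matching step of the ENF criterion can be sketched as a sliding-window correlation: the frequency series extracted from the recording is compared against a reference ENF log at every possible time offset, and the best-correlating offset dates the recording. A toy Python illustration with synthetic values (not an actual ENF extraction, which requires signal processing on the audio itself):

```python
def pearson(a, b):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a) ** 0.5
    vb = sum((y - mb) ** 2 for y in b) ** 0.5
    return cov / (va * vb)

def best_offset(reference, extracted):
    """Slide the extracted ENF series over the reference log and return
    the offset (in samples) with the highest correlation."""
    scores = [
        pearson(reference[i:i + len(extracted)], extracted)
        for i in range(len(reference) - len(extracted) + 1)
    ]
    return max(range(len(scores)), key=scores.__getitem__)

# Synthetic reference log of mains frequency deviations around 50 Hz ...
reference = [50.00, 49.98, 50.01, 50.03, 49.97, 49.95, 50.02, 50.04, 49.99, 50.01]
# ... and a series "extracted" from a recording made at offset 3.
extracted = [50.03, 49.97, 49.95, 50.02]

print(best_offset(reference, extracted))  # 3
```

Because the frequency deviations are effectively random over time, a faked timestamp does not change where the extracted series actually lines up with the reference log.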

2.2.2.3 Tool errors

Another possibility for anti-forensics during analysis are, again, tool errors. As before, many anti-forensic problems can be attributed to tool problems. One example of this is an error in the file carving tool foremost. It does not detect files that span a 100 MiB chunk boundary, i.e., if a file starts at 199 MiB and ends at 201 MiB, it will not be detected by foremost. This is because foremost processes its input in chunks of 100 MiB, with each of those chunks being analyzed separately. The authors of foremost are well aware of this fact, as the comment shown in Listing 2.2, taken from the source code, proves.

/* FIX ME
 * We should jump back and make sure we didn't miss any headers that are
 * bridged between chunks. What is the best way to do this?
 */

Listing 2.2: Comment in foremost 1.5.7 source code file engine.c outlining an unfixed problem

An attacker could hide evidence from the foremost tool by placing it at said 100 MiB boundaries. While this example is trivial, more complex examples exist [168]. But the foremost example shows, without having to explain too many details, what can be considered an anti-forensic measure that is enabled by a tool error.
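The boundary problem can be reproduced with a toy carver. This is hypothetical illustrative code, not foremost's actual implementation: a header that straddles a chunk boundary is invisible to a carver that searches each chunk in isolation, and carrying a small overlap into the next chunk's search fixes it.

```python
# Illustrative re-creation of the chunking problem: a carver that
# searches fixed-size chunks independently misses headers that span
# a chunk boundary.

HEADER = b"\xff\xd8\xff\xe0"  # JPEG magic bytes

def carve_chunked(data, chunk_size, overlap=0):
    """Return offsets of HEADER found while scanning fixed-size chunks.

    With overlap=0 this mimics the buggy behavior; searching with an
    overlap of len(HEADER)-1 bytes into the previous chunk fixes it.
    """
    hits = []
    for start in range(0, len(data), chunk_size):
        lo = max(0, start - overlap)
        chunk = data[lo:start + chunk_size]
        pos = chunk.find(HEADER)
        while pos != -1:
            off = lo + pos
            if off not in hits:
                hits.append(off)
            pos = chunk.find(HEADER, pos + 1)
    return hits

# Place a header so it spans the 16-byte chunk boundary at offset 16:
disk = b"\x00" * 14 + HEADER + b"\x00" * 14
print(carve_chunked(disk, 16))                        # [] -- header missed
print(carve_chunked(disk, 16, overlap=len(HEADER) - 1))  # [14] -- found
```

The same reasoning scales to foremost's 100 MiB chunks: an attacker only needs to align the evidence with a boundary.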


2.3 Rootkits

A rootkit is a special kind of malware. Its main objective is to provide sustained access to a compromised system and/or computer resource and at the same time conceal the presence of such access, its related activities and the rootkit itself. One simple example of a rootkit would be the installation of an SSH server, i.e., a remote access facility, by an attacker after gaining access to a system. This way the attacker can log in to the system via SSH at a later point in time, hence achieving sustained access. However, the installation of an SSH server is very simple to detect, because it is, among many other places, listed in the list of running processes. To combat this, rootkits generally use anti-forensic techniques to hide their presence. One example of such a hiding mechanism is direct kernel object manipulation (DKOM) [17], a technique that directly manipulates kernel memory structures. In this specific instance, a rootkit would modify the kernel's process list. This list is typically a doubly linked list containing all processes on the system. Because a process is executed in smaller subunits, so-called threads, which are organized in separate lists, the process list can be manipulated without affecting the execution of the malware. The rootkit can thus make itself invisible to the operating system by unlinking itself from the process list.

Another flavor of rootkits are so-called bootkits. They are called bootkits because, unlike regular rootkits, they infect the boot process of a system. Because they compromise the boot process very early, i.e., long before the actual operating system is loaded, they can perform very deep manipulations of the system being booted. This makes bootkits harder to detect than regular rootkits. However, their ultimate goal is identical. Because these regular kinds of rootkits and bootkits are already well known, also in the forensic community, and tools for their detection exist, we, in this thesis, tackle firmware rootkits.
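The DKOM process-list unlinking mentioned above can be modeled in a few lines. This is a toy Python simulation, not actual kernel code; all names are illustrative.

```python
# Toy model of DKOM process hiding: processes sit in a doubly linked
# list; unlinking one entry hides it from list walks while the entry
# (and its threads) continues to exist.

class Process:
    def __init__(self, name):
        self.name, self.prev, self.next = name, None, None

def link(procs):
    """Chain the processes into a doubly linked list."""
    for a, b in zip(procs, procs[1:]):
        a.next, b.prev = b, a

def walk(head):
    """Enumerate process names the way a task manager would."""
    names, p = [], head
    while p:
        names.append(p.name)
        p = p.next
    return names

def dkom_unlink(p):
    # The classic DKOM trick: make the neighbors point past the entry.
    if p.prev: p.prev.next = p.next
    if p.next: p.next.prev = p.prev

init, evil, sshd = Process("init"), Process("evil"), Process("sshd")
link([init, evil, sshd])
print(walk(init))        # ['init', 'evil', 'sshd']
dkom_unlink(evil)        # hide the malicious process
print(walk(init))        # ['init', 'sshd'] -- still running, no longer listed
```

Because the scheduler works on thread lists rather than this process list, the hidden entry keeps executing.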
Firmware rootkits are rootkits that manipulate device firmware, i.e., the software providing functionality to hardware devices. Contrary to popular belief, modern hardware does not consist of discrete logic burnt into a circuit; instead, hardware usually itself contains embedded processor units, i.e., microcontrollers. Already in 2009 researchers presented firmware rootkits. One of those early works, by Anibal L. Sacco and Alfredo A. Ortega, leverages a computer's BIOS to achieve a persistent infection [122, 123]. In this work, we will address the special case of firmware rootkits on hard drives.


3 Persistent data acquisition (from hard drives)

The main objective of a rootkit is to provide sustained access to a compromised system and at the same time conceal the presence of such access, its related activities and the rootkit itself. To this end, rootkits use anti-forensic techniques to hide their presence. In this chapter, we investigate an anti-forensic technique which prevents forensic investigators from acquiring case-relevant data in the first place. Currently, the most efficient way to remain persistent but hidden are hard disk firmware rootkits, for the detection and mitigation of which no forensic best practice currently exists. We, therefore, analyzed 20 different hard drive models, of which 4 are SSDs, from different vendors. We also analyzed one model, the Western Digital WD3200AAKX, in more depth to outline methods for the detection and subversion of bootkits located in the hard disk's EEPROM. To this end, we analyzed 16 different WD3200AAKX HDDs. In this chapter, we furthermore provide a theoretical discussion on how hard disk rootkits residing in the firmware overlays and/or modules stored in the special storage area on a HDD called the Service Area can be detected. To our knowledge, we provide the first forensic discussion regarding the acquisition of data from a hard disk compromised with a firmware boot- and/or rootkit.

3.1 Introduction

In digital forensics, data is analyzed. In order to analyze data, it must, however, first be acquired. Because forensic investigations, e.g., of industrial espionage, often involve rootkit compromises, this chapter addresses persistent data acquisition from potentially rootkit-subverted hard drives. This is a task that has not been addressed by the current state of the art in hard drive forensics. Current literature on hard drive forensics always recommends making a copy of the original source drive, while using a write blocker to prevent changes to the original source drive. The current state of the art also recommends the acquisition of the usually hidden Host Protected Area (HPA) and Device Configuration Overlay (DCO) sections. For a non-compromised hard drive, those measures work fine. However, they are not enough to ensure that evidence is not destroyed, lost or never found when the analyzed hard drive has been compromised by a firmware rootkit. Hence, in this chapter, we first give a short overview of what effect a firmware rootkit can have on an investigation. We then demonstrate how firmware bootkits can be detected and we outline several possibilities how even deep firmware rootkits can be detected. We will use a Western Digital WD3200AAKX as a running example throughout this chapter.

3.1.1 HDD anatomy

A modern HDD is not just a block device but rather a whole computer system in itself. The symbolic picture in Figure 3.1 on the following page shows the anatomy of a hard disk drive (HDD). It is loosely based on the Western Digital WD3200AAKX drive. The picture shows the HDD and selected components in the middle. The components are: the disk, also known as the platter; the read and write head; and the PCB containing the processor, RAM, EEPROM, etc. The figure further points out three topics of interest: persistent storage areas, interfacing possibilities, and anti-forensic threats. These are detailed in the next two sections. The first section outlines the persistent storage areas and their associated anti-forensic threats. The second section outlines the interfacing and verification possibilities.


[Figure 3.1 shows the HDD platter, PCB and chips in the middle, annotated with the persistent storage areas (Service Area, Boot Sector, Data Sectors, HPA/DCO, EEPROM, RAM, Mask ROM), the interfacing and verification possibilities (SATA, SPI, JTAG, UART, X-Ray) and the corresponding anti-forensic threats (compromised firmware overlays/modules, compromised partition table, general anti-forensics and code stored on the device, data hidden in the HPA/DCO, Service Area, EEPROM or sectors marked as "bad", malware code in RAM, compromised controller hardware, compromised firmware boot loader).]
Figure 3.1: HDD anatomy: storage areas, interfacing possibilities and anti-forensic threats

3.1.1.1 Persistent storage areas and anti-forensic threats

While the main purpose of a HDD is to provide persistent storage, the HDD also has storage areas for its own use, which may not be accessible by a normal user of the HDD. The following areas are pointed out in Figure 3.1:

Mask ROM The mask ROM is integrated into the processor on the HDD’s PCB. Because it is programmed by the integrated circuit manufacturer during the manufacturing process directly via the photomask in the photolithography process, it is read-only and can not be modified. It contains code that allows the processor to boot and load the firmware boot loader code from the EEPROM into RAM.

EEPROM The EEPROM contains the boot loader of the firmware. The boot loader bootstraps the system to the point where it can read from the platter. At this point, it will start loading more firmware from the platter. The EEPROM can also be read and written by software via the HDD's SATA connection. This way firmware code can be modified, e.g., to perform legitimate firmware upgrades. This results in the anti-forensic threats associated with a compromised firmware boot loader, also known as a bootkit. It is important to note that such a firmware bootkit does not interfere with the boot process of the operating system installed on the HDD, but rather with the boot process of the HDD itself. Besides compromising the firmware, the EEPROM could be used as hidden data storage.

Service Area The Service Area is a hidden area on the platter that, unlike the HPA or DCO, can not be used by a normal user. This area is used to store further firmware components called overlays or modules, which are loaded into RAM by the boot loader. On some HDDs, some overlays are only loaded on demand and swap other overlays out of RAM. While this area is reserved for usage by the firmware, there are vendor specific commands (VSCs) with which this area can be accessed via the SATA interface. This allows an attacker to modify firmware as well as hide data. After the Service Area begins the regular storage area of the HDD platter, often known as the User Area. However, in Figure 3.1 we additionally point out different aspects of the User Area, because its different sections can be used in different ways to facilitate anti-forensics. To the hard drive itself, however, the User Area is one single storage area.


Boot Sector The Boot Sector is the first thing loaded by a BIOS when booting from the HDD. Because a professional forensic investigator will not boot from the device, any malware infections, such as bootkits in the boot sector, will not immediately impact his work, unless, of course, the malware is the subject of his investigation. The Boot Sector usually also contains the partition table, which can be malicious. In CVE-2016-5011, we have shown that it is possible to cause a denial of service (DoS) against various Linux systems with a specially crafted DOS/MBR partition table that uses an extended partition loop [168], i.e., an extended partition within the DOS/MBR partition table points back to the DOS/MBR itself, causing infinite recursion in the library libblkid when parsing the partition table. Another such example uses GPT partitioning. It can cause a DoS against Windows 7 by setting the number of partition table entries to zero in the GPT header, which will result in the so-called Blue Screen of Death due to an implementation error which causes a division by zero within the kernel. So merely connecting a HDD with such a compromised partition table to a vulnerable system can already impact the forensic investigator's work negatively.

Data Sectors The Data Sectors are all sectors that a regular user of the HDD has access to. This is where, e.g., the file system resides, the files within it, etc. Regular anti-forensic measures, typically, reside here, e.g., loop attacks [168] or XSS code injection into forensic reports that are generated as HTML files [168].

HPA/DCO Last but not least, the HDD can contain the so-called Host Protected Area (HPA) and/or a Device Configuration Overlay (DCO). Both are hidden from the user. The HPA can be used to store installation files used for system recovery, while the DCO can be used to control over-provisioning of the HDD. Access to both the HPA and DCO can be acquired via ATA commands [69]. Both areas are well known to forensic investigators; hence, they have lost their value as a data hiding spot for criminals.
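The extended-partition loop described above for the Boot Sector illustrates why partition-table parsers need a loop guard. The following is a minimal Python sketch under simplifying assumptions; the hypothetical `next_ebr` lookup table stands in for actually reading and parsing extended boot record sectors, and this is not the libblkid code.

```python
# Sketch of defending against an extended-partition loop: when following
# the chain of extended boot records (EBRs), a parser must track visited
# sector offsets, or a crafted table pointing back at itself causes
# infinite recursion.

def walk_ebr_chain(start, next_ebr, limit=128):
    """Follow an EBR chain; stop on loops or after `limit` entries."""
    visited, chain = set(), []
    offset = start
    while offset is not None and len(chain) < limit:
        if offset in visited:  # loop -> malicious or corrupt table
            raise ValueError("extended partition loop at sector %d" % offset)
        visited.add(offset)
        chain.append(offset)
        offset = next_ebr.get(offset)
    return chain

# Benign chain: EBR at sector 2048 points to one at 4096, then ends.
print(walk_ebr_chain(2048, {2048: 4096, 4096: None}))   # [2048, 4096]
# Crafted loop: the EBR at 2048 points back at itself.
try:
    walk_ebr_chain(2048, {2048: 2048})
except ValueError as e:
    print("rejected:", e)
```

A parser without the `visited` check (or an equivalent iteration limit) loops forever on the crafted table, which is exactly the CVE-2016-5011 failure mode.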

3.1.1.2 Interfacing and Verification Possibilities

Figure 3.1 on the preceding page also shows the various interfacing and verification possibilities we found provided by the analyzed hard drives. We will outline most of them in more detail throughout this chapter. Hence, here we only give a brief outline of each.

Magnetization The first idea to verify the data on the HDD platter seems to be to read the magnetization directly from the platter. While this may have worked on older hard disk drives, newer disk drives typically have a very high integration density rendering this method nearly impossible.

SATA The usual way today to interface with a HDD is via its Serial AT Attachment (SATA) interface. This method uses the firmware stored on the device to read and write data from and to the platter. In the case of a firmware compromise this interface can not be trusted.

SPI To interface with the EEPROM, the Serial Peripheral Interface (SPI) can be used. Because this interface is very low level, a software compromise is not possible.

X-Ray While X-raying the device does not actually interface with it, X-raying can be used to verify the integrity of the PCB and the attached microchips.

JTAG Some hard disks have a Joint Test Action Group (JTAG) interface. A JTAG interface can be used to test and debug the processor on the PCB.

UART Some hard disks have a Universal Asynchronous Receiver Transmitter (UART) interface, which provides serial communication with the device.


3.1.2 HDD firmware analysis software

PC-3000 is a commercially available hardware and software combination which can be used to repair corrupted firmware. Unfortunately, we do not have access to a PC-3000; hence, we can only make assumptions about it. Presumably, it could be used to restore a proper firmware on a HDD. But it is a proprietary product, hence little is known about its inner workings, making it a less trustworthy solution than what we are about to present. Also, because the PC-3000's inner workings are unknown, it can not be further developed and adapted to the needs of forensic examiners, e.g., to support new hard drives.

3.1.3 Related work

In 2009 Sutherland et al. [148] outlined an anti-forensic technique in which the ability of the hard disk to mark specific tracks on the hard drive platter as bad was used to hide data from a forensic investigation by marking the corresponding physical sectors as bad, i.e., unusable. At OHM in 2013, Jeroen "Sprite_tm" Domburg gave possibly the first public demonstration of a hard disk firmware rootkit [37]. Many other works followed [173, 61, 172]. In 2012 Knauer and Baier [88] demonstrated how the diagnostic UART interface of a Samsung HDD could be used to bypass ATA passwords. They further extended their work in 2014 [7] to anti-forensic data hiding. In 2013 Read et al. [118] used firmware modifications to hide partitions from a forensic investigator. To this end, they used the HDDHackr tool, which can manipulate the firmware of ordinary Western Digital HDDs so they can be used with the Microsoft Xbox 360, instead of more expensive official OEM drives. In 2015 Laan and Dijkhuizen [90] proposed a firewall for specific ATA commands. With their work in place, the proposed hard disk firmware rootkits would not be able to infect the device, because infection is done via specific ATA commands, which their so-called firmwall would block. Further, the NSA's IRATEMONK, UNITEDRAKE, STRAITBIZARRE, SLICKERVICAR, SADDLEBACK, ALTEREDCARBON, FAKEDOUBT, PLUCKHAGEN and EASYKRAKEN projects are assumed to be in one way or another involved with compromising the firmware of HDDs. But, even though firmware rootkits and malware have been demonstrated again and again, there is no publication outlining the impacts on digital forensics and/or providing best practice recommendations on how to deal with such anti-forensic firmware infections.

3.1.4 Outline

The outline of this chapter is as follows: First, in Section 3.2, we examine hard disk firmware bootkits. Next, in Section 3.3 on page 28, we outline overlay-based hard disk firmware rootkits. This is followed by a discussion in Section 3.4 on page 31 and, last but not least, a conclusion in Section 3.5 on page 32.

3.2 Hard disk firmware bootkit

In this section, we present a survey of hard disk firmware bootkit infections using the Western Digital WD3200AAKX 320 GB HDD as our primary example. In the first Section 3.2.1 on the facing page, we show two ways in which this HDD's firmware can be compromised, requiring only software running with root privileges on the computer hosting the HDD. Next, we show how we can use anti-forensic-resistant hardware techniques to detect some of the possible firmware manipulations. We further detail how deeper firmware manipulations could be detected as well, and briefly outline how those techniques can be applied to other hard drives besides the exemplary WD3200AAKX, namely 20 hard drives by different manufacturers. In Section 3.2.4 on page 27, we show how manipulated firmware and also manipulated controller hardware can be subverted by a forensic analyst to still access the data stored on the HDD's platter in a forensically sound way. To conclude this section, we give prospects in Section 3.2.5 on page 27 on how firmware manipulations can be investigated further.


3.2.1 Compromising

The firmware of the Western Digital WD3200AAKX is not digitally signed. Neither is the firmware of any other hard drive, to our knowledge. It can be uploaded, i.e., replaced with a new firmware image, via vendor specific ATA commands. These commands are often simply called vendor specific commands (VSCs). The Linux tool hdparm provides the argument "--fwdownload", which implements a convenient interface to these vendor specific commands. It can be used to write a new firmware image to the EEPROM of the WD3200AAKX. Malware, such as a rootkit, can also issue those vendor specific commands, given sufficient privileges to do so, i.e., it needs administrative privileges. Given this ability, the possibilities for firmware manipulation are endless.

3.2.1.1 Providing persistent root access

In 2013 Jeroen "sprite_tm" Domburg demonstrated a firmware rootkit against a 500 GB Western Digital HDD that allowed for covert root access on UNIX systems, persisting even system reinstalls and complete HDD erasure [37]. To this end, he hooked the interrupt table of the Marvell Feroceon ARM processor responsible for the SATA DMA transfers. The hooked interrupt is triggered once the requested data has been loaded from the platter into the cache of the HDD by the second Feroceon ARM processor on the main controller chip. Instead of instantly issuing the SATA DMA transfer, the hook code first searches the newly loaded data in the cache for an ASCII string indicating that the /etc/shadow file of a UNIX system was read, i.e., a string of the form "root:*:*:*:*:*:*:*:*\n", with * representing any sequence of characters not containing :, possibly empty, and \n representing a newline character. Here root represents the username, in this case the root user, and : is a separator separating the individual fields. The first field is the username and the second field the password hash. The other fields are additional fields that are irrelevant for this scenario. The rootkit then exchanges the password hash for a known password hash. This way the attacker injects a known root password into any UNIX system, or any other system employing the /etc/shadow password file for that matter, installed on such a compromised HDD, even if the system is reinstalled.
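The on-the-fly substitution can be sketched as follows. This is an illustrative Python model only; the real rootkit performs the pattern matching inside the drive's firmware on cached sectors, and the hash values below are made up.

```python
# Minimal sketch of the /etc/shadow manipulation described above:
# locate the root entry in a buffered sector and replace its password
# hash field before the data leaves the disk.

import re

ATTACKER_HASH = b"$6$attacker$fakehash"  # made-up hash for illustration

def patch_shadow(buffer):
    """Replace the hash field of a root entry, if one is present."""
    # Match "root:" followed by the hash field up to the next ":".
    return re.sub(rb"(?m)^root:[^:\n]*:",
                  b"root:" + ATTACKER_HASH + b":",
                  buffer)

sector = (b"root:$6$original$realhash:17000:0:99999:7:::\n"
          b"daemon:*:17000:0:99999:7:::\n")
patched = patch_shadow(sector)
print(patched.decode().splitlines()[0])
# root:$6$attacker$fakehash:17000:0:99999:7:::
```

Note that only the in-flight copy is changed; the data on the platter stays untouched, which is why wiping the drive does not help.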

3.2.1.2 Activating and deactivating the rootkit (from remote)

Further hooks can be added to the firmware to enable and disable this root password overwrite. E.g., a second hook in the interrupt table, for the interrupt triggered once data in the cache is supposed to be written to the platter, can be used to search for commands. These commands can then be triggered simply by writing them to disk. This way the regular root user can use the system with his or her password, but once the attacker wants to access the system, all he or she needs to do is write a secret command string to disk to activate the root password overwrite. sprite_tm used the strings "HD, live" to activate and "HD, dead" to deactivate his rootkit. An attacker can inject such a command string, e.g., within the User-Agent string of a HTTP request to a server, which is then subsequently written into the HTTP access logs of that server. Another example would be an SSH login attempt with a specially crafted username, which would then be written to the security logs and thus trigger the command. An attacker may need to trigger multiple data writes to ensure that the commands are actually committed to the HDD and do not remain in the kernel's file system cache, though. To this end, an attacker can simply flood the logs by issuing bogus requests. Any such actions needed to get the command committed to disk can then be deleted afterwards once the attacker has gained root access. In fact, the firmware rootkit could replace the commands found in the cache with innocent-looking data, e.g., if the command to start the rootkit was "3e91923511ce3ad01dc6ae56d3874dc6a7bef7aec532d365a4d83037eca4f20a13374211" it could be replaced with the valid user agent string "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:40.0) Gecko/20100101 Firefox/40.1" before actually being written to disk.


3.2.1.3 Other HDD firmware rootkit examples

Zaddach et al. demonstrated an exploit similar to sprite_tm's work in their paper "Implementation and implications of a stealth hard-drive backdoor" [173]. They further detailed optimizations for controlling the rootkit in a way which makes the attack more feasible than sprite_tm's prototype. Also, according to talks by co-author Goodspeed at the 0x07 Sec-T Conference in 2014 [61], by Kaminsky1 at DEF CON in 2014 [83], and by Zaddach himself at REcon in 2014 [172], they used a 500 GB HDD, showing that such firmware rootkits do not just affect Western Digital drives.

In the following years more and more researchers recreated these HDD firmware bootkits and reported their results online. These recreations of sprite_tm's work demonstrate not only the feasibility of implementing such firmware rootkits but also their relevance in the field of anti-forensics. That multiple independent researchers have been able to recreate the rootkit and even openly posted about it online proves that criminals have access to this kind of information. Forensic scientists should, therefore, expect such firmware compromises when investigating high-profile cases of industrial espionage, etc.

Last but not least, the crucial part about this type of compromise is that there will be no evidence on the regular data storage areas of the HDD. And any attempt to clean the drive will be futile because the changed root password hash is modified on the fly and never actually written to the platter.

3.2.2 Interfering with data acquisition

Besides the above-mentioned scenario of persistent root access, trivial yet efficient anti-forensic measures can be employed. For example, the partitions of the HDD can be formatted in such a way that one sector directly after the partition table (irrelevant whether DOS/MBR or GPT) is left unallocated, so it is never accessed by the system during regular operation. This sector can then be used by manipulated firmware as a digital trap [61, https://youtu.be/8Zpb34Qf0NY?t=1095] and trigger any combination of the following anti-forensic measures, among others:

• only return zeros when reading specific sectors of the HDD (i.e., sectors containing evidence)

• only return zeros on all reads from the device

• return zeros on all read requests and overwrite the requested data with zeros on the HDD

All the above measures impact the quality of the forensic results obtained from the device. While the first two are very subtle, they should, in general, be enough to stop a forensic expert working according to the current state of the art from discovering evidence. But they still leave the evidence on the device. Measure three is not subtle, and it will destroy the evidence. Given that imaging a drive is often automated and takes time, it is unlikely that a forensic analyst realizes that the data has been overwritten after the drive was imaged. In any case, the current state of the art does not consider such attacks and hence does not protect against them. Special equipment such as a write blocker will not prevent the drive itself, but only the investigator, from (accidentally) writing to the disk. Other practices, such as hashing the image, will also not help, because the data was already manipulated on the device. Imaging the device twice (or even three times), first without HPA, then with HPA (and then DCO) enabled, could, depending on the type of anti-forensic trap used, yield discrepancies between the imaging results. For example, the first time the partition table is read, the data is read correctly. Then, once the sector after the partition table functioning as the trap is read, the partition table will suddenly be zeroed out. However, in case the data was erased during the first imaging process, noticing discrepancies at the second imaging process is too late.
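The double-imaging check described above amounts to a sector-wise comparison of the two acquisition passes. The following illustrative sketch, with a hypothetical trap that zeroes one sector between passes, shows how such discrepancies surface:

```python
# Sketch: compare two acquisition passes sector by sector. A firmware
# trap that starts zeroing data between passes shows up as differing
# sectors.

SECTOR = 512

def diff_images(img_a, img_b):
    """Return the sector numbers that differ between two equal-size images."""
    assert len(img_a) == len(img_b)
    return [n for n in range(len(img_a) // SECTOR)
            if img_a[n * SECTOR:(n + 1) * SECTOR]
               != img_b[n * SECTOR:(n + 1) * SECTOR]]

first_pass  = b"\x41" * SECTOR + b"\x42" * SECTOR
second_pass = b"\x41" * SECTOR + b"\x00" * SECTOR  # trap zeroed sector 1
print(diff_images(first_pass, second_pass))        # [1]
```

An empty result is only weak evidence of a clean drive: if the trap destroyed the data during the first pass, both passes already agree on the zeroed contents.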

1Even though not named on the publication, he admitted to being an "unindicted co-conspirator" [83, https://youtu.be/xneBjc8z0DE?t=550].


3.2.3 Detection (in EEPROM)

Detecting manipulated firmware must be done via hardware, i.e., physical access. This is because compromised firmware could simply acknowledge that new, non-compromised firmware has been written successfully when the investigator tries to upload such firmware to the device. The compromised firmware could use the Service Area to store a shadow firmware, which is returned on read requests and overwritten on write requests, all, of course, without actually touching the real firmware on the controller.

One possibility to verify firmware with software-only access would be to upload firmware with a characteristic behavior, e.g., returning a magic value when reading a specific sector. This way one could make sure that at least this functionality of the firmware is executed and then assume that the rest of the uploaded firmware is also executed. However, this gives no 100 % assurance, because compromised firmware could load the newly uploaded firmware only temporarily and switch back to the compromised firmware at a later point, e.g., via a command string written to disk as detailed in Section 3.2.1.2 on page 19. This is possible because the Service Area provides enough storage for multiple versions of the firmware. Here it is critical to acknowledge that the Service Area can only reasonably be read through the controller itself and hence the firmware. This means that without physical acquisition of the firmware binary no 100 % assurance can be given.

In this section, however, we only look at the currently easily available HDD firmware compromise methods, i.e., a bootkit residing within the EEPROM. A deeper HDD compromise down into the system and overlay modules within the Service Area is outlined in Section 3.3 on page 28. To detect a bootkit that follows the scheme outlined by sprite_tm [37], the firmware boot loader residing in the EEPROM must be verified. For this, three steps are needed: identifying the EEPROM chip, reading the EEPROM chip, and, last but not least, verifying its contents.

3.2.3.1 Identifying the EEPROM

Figure 3.2: Identifying the EEPROM of a Western Digital WD3200AAKX HDD

The EEPROM can be identified by reading the part number on the chip. Because the marking is often very small, it is advisable to read the part number from a picture. Figure 3.2 shows a picture of the Western Digital WD3200AAKX EEPROM, taken with an inexpensive off-the-shelf consumer smartphone (LG L70). The direction of light is important to maximize the readability of the markings. As can be seen from the picture, the EEPROM of this Western Digital WD3200AAKX is a Programmable Microelectronics Corporation Pm25LD020 chip. A list of chips used within different HDDs can be seen later in Table 3.1 on page 33.


3.2.3.2 Reading the EEPROM

The process of reading the firmware on our Western Digital WD3200AAKX example HDD is straightforward and can be done without desoldering the EEPROM chip. We used a commercially available SOIC-8 programming clamp and the Autoelectric MiniPRO TL866CS programmer. Once the controller PCB has been removed from the HDD the SOIC-8 programming clamp can simply be clipped to the EEPROM chip as can be seen in Figure 3.3. This process is referred to as in-circuit programming. Figure 3.4 depicts the reading of a desoldered EEPROM chip.

Figure 3.3: Reading an EEPROM with an in-circuit programming clamp without desoldering

Figure 3.4: Reading the EEPROM after it has been desoldered from the HDD's PCB

To increase transparency, we used the open source minipro software by Valentin Dudouyt instead of the official but closed-source software by Autoelectric. The MiniPRO programmer supports over 10,000 different EEPROM chips. It only supports 3.3 V SPI EEPROM chips; however, some EEPROM chips are 1.8 V. To read those 1.8 V chips, or other chips unreadable by the MiniPRO, we used a Bus Pirate v4.0 by Dangerous Prototypes in combination with the open source flashrom software. Table 3.1 on page 33 shows all HDDs we tested for compatibility with this EEPROM acquisition method. The table lists the HDD vendor, HDD model, the EEPROM chip used and whether or not we were able to read the EEPROM in-circuit. We were able to acquire the firmware portion stored in the EEPROM of all the devices containing an EEPROM. However, only on about half of them could we acquire the EEPROM in-circuit. Nevertheless, we demonstrated the practicability of our proposal: reading out the EEPROM in-circuit does not pose an unreasonable overhead in HDD data acquisition. We, therefore, propose that this procedure should be best practice when imaging HDDs of systems under suspicion of malware and/or rootkit compromises, because the potential benefits, as we will show in the next sections, outweigh the effort to read the EEPROM. In case the EEPROM can not be read in-circuit, we, however, rather recommend imaging the drive without desoldering first, unless there are strong indications that the drive is actually compromised.

Even though in-circuit reading on the WD3200AAKX was straightforward, for other hard drives it was sometimes challenging to read the EEPROM in-circuit, i.e., without desoldering. In certain board layouts, applying power to the EEPROM via the SPI interface would power up the whole board, including the processor, causing the programmer either to abort due to over-current protection, or to return all ones on reading the chip ID or data. To remedy the problem, it is sometimes possible to hold the board in reset, i.e., pull down the system reset pin of the board, as can be seen in Figure 3.5 on the facing page. Further, Figure 3.6 on the next page shows what we believe to be the reset pins and grounds of the HDDs listed in Table 3.1 on page 33 requiring this procedure for in-circuit reading. We are reasonably certain these are the correct respective reset pins; in any case, once the marked contacts were connected, we were able to read the EEPROM without any problems.


Figure 3.5: Holding a board in reset by pulling the processor's reset pin for in-circuit reading

Figure 3.6: Reset and ground pins to be connected from an ST31000340NS

Whether a board needs to be held in reset for in-circuit programming to work is indicated in Table 3.1 on page 33 with the rst footnote. On other occasions, when the error reported by the minipro software is "IO error: expected 7 bytes but 5 bytes transferred", we found multiple quick successive read attempts to make the reading work. In this case, presumably, the voltage supplied to the EEPROM charged the decoupling capacitors which are spread over the circuit board. Consequently, after these had been charged up, the EEPROM reached its required minimum supply voltage and was able to operate correctly. In Table 3.1 on page 33, we indicate this via the rep footnote.

To verify a successful read, we always performed three readouts and checked for identical bit patterns. We found this to be very important, because even slightly defective or loose contacts in either the SPI cables or the probe cables which hold the board in reset can cause read errors, which can go unnoticed unless checked for.
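This triple-readout consistency check is easy to script. The following Python sketch is our own illustration (the function name and the use of SHA-256 digests are our choices, not part of minipro or flashrom); it compares the dumps and flags any mismatch:

```python
import hashlib

def verify_readouts(dumps):
    """Check that all EEPROM readouts are bit-identical.

    `dumps` is a list of byte strings, one per readout. A single flaky
    SPI or reset-probe contact typically shows up as a mismatch here.
    """
    if len(dumps) < 3:
        raise ValueError("perform at least three readouts")
    # The readouts are consistent only if they all hash to the same digest.
    digests = {hashlib.sha256(d).hexdigest() for d in dumps}
    return len(digests) == 1
```

If the function returns False, the contacts should be re-checked and the readouts repeated.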

Not all EEPROMs could be read in-circuit. Interestingly, we were only unable to read the EEPROMs of HDDs by Toshiba and HGST. These HDDs' board layouts were basically identical, presumably because, as of 2012, HGST's 3.5” HDD division is owned by Toshiba. However, all the EEPROMs of all the HDDs listed in Table 3.1 on page 33 can be read by desoldering them. Further, the fact that a HDD is not listed as in-circuit-programmable in Table 3.1 on page 33 does not mean it is in general impossible to read its EEPROM in-circuit, but rather that we did not manage to do so with our equipment.

3.2.3.3 Verifying EEPROM contents

In order to determine a procedure for verifying different firmware obtained from the EEPROM, we used our example drive, the Western Digital WD3200AAKX. More specifically, we used 16 different ones. We extracted the EEPROM of each and then compared the retrieved contents. From this sample set we concluded that the firmware retrieved from the EEPROM can not be compared directly, i.e., via a hash or bitwise comparison. This is because the EEPROM also stores information that is different for each drive. The 16 analyzed HDDs had 5 different firmware revisions, with the two most common revisions being found on 8 and 5 drives respectively and 3 firmware revisions being found on only 1 drive each.

Figure 3.7 on the following page shows cross-comparisons, via hexcompare, of different firmware binaries from different WD3200AAKX HDDs. The blue and/or green parts are identical between the two binaries. The red areas mark differences.


(a) Same firmware revisions
(b) Different firmware revisions
(c) More drastically different firmware revisions

(d) Detailed view of Figure 3.7a showing the differences in more detail

Figure 3.7: Cross-comparisons via hexcompare of the EEPROM contents of different WD3200AAKXs (Blue and/or green output is identical in both, while red marks differences)

Figure 3.7a shows a comparison of the EEPROM contents of drives using the same firmware revision. Here we can observe that overall the contents are almost identical. Only some byte sequences at the end of the EEPROM differ. These byte sequences are within so-called ROYL ROM modules. ROYL refers to the drive and firmware architecture used by Western Digital for the WD3200AAKX hard disk. ROM refers to the type of module, i.e., in this case, a module residing in the EEPROM. Lastly, the term module refers to a part of the firmware, which is organized modularly into so-called modules. Here the difference is located in the modules with IDs 0x0a, 0x0d, 0x30, and 0x47. Module 0x0d contains identity information. Module 0x30 is the Service Area translator, i.e., it records where on the platter the Service Area is located. Module 0x47 is responsible for surface adaptives, i.e., how far the head is away from the platter, etc. All these modules contain drive-specific information and, hence, are different for each drive. We have also found module 0x0b (Module Directory) in the EEPROM. However, it was identical within each firmware revision.


As is already clear at this point, a bitwise comparison is not possible, even between the same firmware revisions, as the EEPROM already contains data individual to that specific hard disk, namely the surface adaptives and SA Translator modules. However, because the position of these modules is the same for the same firmware revision, they can simply be excluded from the comparison. Using these modules to store a rootkit or other compromising code is not possible, because the variable data is, first, too small to hold meaningful exploitation code and, second, not executed. Hence, if all other portions of the EEPROM contents are identical to known good EEPROM contents, the EEPROM can be considered good and not compromised.
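The comparison excluding the drive-individual modules can be sketched as follows. The function below is our own illustrative helper, not an existing tool; the excluded offsets and lengths would have to be taken from the module layout of the firmware revision at hand:

```python
def eeprom_equal_outside(a, b, exclude):
    """Compare two EEPROM images, ignoring drive-specific regions.

    `exclude` is a list of (offset, length) pairs covering the
    drive-individual ROYL ROM modules (e.g. 0x0a, 0x0d, 0x30, 0x47),
    whose positions are identical for a given firmware revision.
    """
    if len(a) != len(b):
        return False
    # Mark every byte position that belongs to an excluded module.
    masked = bytearray(len(a))
    for off, length in exclude:
        for i in range(off, min(off + length, len(a))):
            masked[i] = 1
    # Equal if every non-masked byte matches.
    return all(masked[i] or a[i] == b[i] for i in range(len(a)))
```

With the module regions of the revision excluded, two clean drives of the same revision should compare equal, while any change to the firmware proper is still detected.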

Figure 3.7b on the facing page depicts a comparison of two different firmware revisions. Here the same modules as before contain differences, but the actual firmware at the start of the EEPROM is also different, starting with the section header. It is different because some sections changed in size. Next comes the bootstrap code, which is responsible for decompressing the other firmware sections. As can clearly be seen from the large blue section after the small initial red section, this bootstrap code did not change between these firmware revisions. The following sections, however, are all completely red. Here it is important to note that this does not mean that the entire firmware changed between these revisions: because these sections are compressed, even small changes lead to different compression tables and a bitwise completely different data stream. The empty space after the sections containing the firmware is blue again, because, for both revisions, it is filled with 0xff, indicating empty flash blocks.

Figure 3.7c on the preceding page depicts the comparison of two further firmware revisions. Unlike in the previous example, the changes between these firmware revisions can be characterized as major. First, the position of some ROYL modules changed: the modules are now located more closely after the last firmware section. The yellow frames and the arrow indicate from where to where the modules moved within the EEPROM.

Another noteworthy change is within the bootstrap code. Unlike before, the bootstrap code actually changed. Because the bootstrap code is not compressed, the unchanged sections remain identical. However, because adding and removing code changes the offsets of all following code, the bootstrap code eventually also differs in our non-offset-compensated bitwise comparison and is marked as red. For the further discussion, it should be remembered that there are still unchanged parts of the bootstrap code showing up as blue, though.

Last but not least, it is important to note that even though the header changed, its length, as before, remained the same. This means no sections have been added or removed, which indicates that the number of sections is a rather fixed property of the Western Digital ROYL firmware.

Figure 3.8 on the following page shows comparisons of clean EEPROM firmware contents with infected EEPROM firmware contents. As before, blue output indicates that there is no difference between the clean and infected EEPROM contents, while red indicates a difference. First, in Figure 3.8a on the next page, we compare the same EEPROM contents before and after being infected with the rootkit as per sprite_tm's proof of concept rootkit code [36]. Unlike before, almost the entire EEPROM contents changed, save the empty part and the ROYL modules. This is because sprite_tm's bootkit adds itself as an additional section to the firmware, which extends the header at the beginning of the EEPROM by 32 bytes. The sections following the header are thus shifted by that 32-byte offset, changing the EEPROM content relative to the non-infected EEPROM. Further, because the sections are shifted, their beginning addresses within the EEPROM change as well. This, in turn, causes the header fields containing these addresses to change, which causes the checksums for the header sections to change as well. Hence, the EEPROM contents almost completely changed, compared to the contents before the infection.


(a) Firmware before and after bootkit infection as per sprite_tm's proof of concept rootkit code [36]
(b) Same firmware revision but one with bootkit infection as per sprite_tm [36]

Figure 3.8: Cross-comparisons via hexcompare of the EEPROM contents of a firmware bootkit infected WD3200AAKXs (Blue output is identical in both, while red marks differences)

However, because a direct comparison of the EEPROM contents before and after infection is impracticable, we compared the infected EEPROM contents with clean EEPROM contents with the same firmware revision. The results can be seen in Figure 3.8b. We see the same changes as we saw in the case of comparing the same EEPROM contents with itself before and after infection. However, as demonstrated earlier in Figure 3.7a on page 24, the ROYL modules located within the EEPROM after the firmware differ for different HDDs, even when containing the same firmware revision. Hence, we also see some changed byte sequences towards the end of the EEPROM.

We can now, with the above knowledge, formulate a procedure to verify EEPROM contents x against a set of known-good EEPROM contents G = {g1, g2, . . . , gn}:

• If x ∈ G, the EEPROM x is good. In fact, it is one of the samples.
• For each g in G check
  – whether x ≈ g with only differences in the ROYL modules after the firmware sections. If this is the case, then the EEPROM x is good.
  – whether the section header in x contains more than one section with block code number 0x5a, i.e., main loader section. If this is the case, the EEPROM x is definitively infected with a bootkit.
• Otherwise, the EEPROM x can not be verified as good.

Here it is important to understand that it is not enough to verify that the section header at the start of the EEPROM does not contain a second main loader section, i.e., a second section with block code number 0x5a. While this is enough to detect the prototype bootkit by sprite_tm, it is not enough to detect a more elaborate bootkit. In theory, a bootkit could hide in one of the compressed sections. Hence, even an EEPROM comparison looking like the top part of Figure 3.7c on page 24, with the bottom part looking like Figure 3.7b on page 24, does not indicate a clean EEPROM. The clean state can only be assessed via the above procedure.
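This decision procedure can be summarized in code. In the following sketch, `main_loader_sections` (counting sections with block code 0x5a in the section header) and `diff_only_in_royl` (the x ≈ g comparison) are hypothetical stand-ins for parsers that are not shown here:

```python
def verify_eeprom(x, known_good, main_loader_sections, diff_only_in_royl):
    """Verify EEPROM contents `x` against a known-good set.

    `known_good` is a list of known-good EEPROM images.
    `main_loader_sections(img)` counts sections with block code 0x5a
    in the section header (hypothetical parser, not shown).
    `diff_only_in_royl(a, b)` is True if a and b differ only inside
    the drive-specific ROYL modules after the firmware sections.
    Returns "good", "infected", or "unverified".
    """
    if x in known_good:
        return "good"  # x is literally one of the samples
    if main_loader_sections(x) > 1:
        # A second 0x5a section indicates a sprite_tm-style bootkit.
        return "infected"
    for g in known_good:
        if diff_only_in_royl(x, g):
            return "good"
    return "unverified"
```

Note that, as argued above, a result of "unverified" must not be read as "clean": a more elaborate bootkit could hide in a compressed section without adding a second main loader section.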

Because the firmware is not digitally signed, it is hard to be 100 % certain a firmware is not compromised, making it hard to build a set of known good EEPROM firmware samples. There are several ways to verify an EEPROM as good:

• Extract the EEPROM contents from vendor updates.
  – This seems like the most convenient way. It can also be secure in case the vendor update is digitally signed. However, HDD firmware is usually not regularly updated via vendor updates.


• Reverse engineer the EEPROM contents and verify that they do not contain any unintentional functionality.
  – This is the safest way. It protects even against compromise via vendor-supplied firmware. However, it is impracticable. Goodspeed acknowledged that it took the authors behind Implementation and implications of a stealth hard-drive backdoor [173] “ten man months” [61, https://youtu.be/8Zpb34Qf0NY?t=1396] to reverse engineer the firmware of a Seagate Barracuda HDD. He further stated that they “killed 15 hard disks” [61, https://youtu.be/8Zpb34Qf0NY?t=1442] during their research. And this was to develop a hook-based bootkit for the HDD. Verifying any and every functionality within the firmware could potentially take much longer. This makes it impracticable for a productive forensic context. It may be applicable within an audit for highest-security environments.

• Determine good EEPROMs via clustering.
  – While this approach can not provide 100 % assurance, it provides a reasonable trade-off between practicability and certainty. The basic idea to build a set of good EEPROMs for the WD3200AAKX is as follows:
    1. Collect EEPROM contents of as many WD3200AAKXs as possible into your test set T.
    2. Add each EEPROM sample to its own cluster.
    3. Compare the EEPROM contents pairwise via hexcompare.
    4. If the comparison indicates the same firmware revision, i.e., the result looks similar to Figure 3.7a on page 24, with only differences in the ROYL modules 0x0a, 0x0d, 0x30, and 0x47, join the clusters of the two compared EEPROM samples.
    5. Define a trust-threshold t.
    6. Add one EEPROM sample of each remaining cluster to your good WD3200AAKX EEPROM sample set G if the number of EEPROM samples within the cluster is above your trust-threshold t.
  This scheme only works if no more than trust-threshold t many EEPROMs within the test set T are compromised. Hence, it is advised not to include HDDs that are suspected to be compromised in test set T. Ideally, test set T should only contain samples from a good source, e.g., bought directly from the vendor. Obviously, this does not protect against a bad vendor, unlike the reverse engineering method.
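The clustering scheme amounts to a union-find over pairwise comparisons. The following is a minimal sketch of that idea, with the hexcompare-style check abstracted as a `same_revision` predicate (both names are our own illustration):

```python
def cluster_eeproms(samples, same_revision, threshold):
    """Build a known-good set by clustering EEPROM samples.

    `samples` is a list of EEPROM images, `same_revision(a, b)` is
    True if a pairwise comparison shows the same firmware revision
    with differences only in the drive-specific ROYL modules, and
    `threshold` is the trust-threshold t. Returns one representative
    per cluster whose size exceeds t.
    """
    parent = list(range(len(samples)))  # union-find forest

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Join clusters for every matching pair.
    for i in range(len(samples)):
        for j in range(i + 1, len(samples)):
            if same_revision(samples[i], samples[j]):
                parent[find(i)] = find(j)

    # Collect clusters and keep one sample of each sufficiently large one.
    clusters = {}
    for i in range(len(samples)):
        clusters.setdefault(find(i), []).append(i)
    return [samples[members[0]] for members in clusters.values()
            if len(members) > threshold]
```

As noted above, the result is trustworthy only if fewer than t samples in the test set are compromised.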

3.2.4 Subverting the bootkit

A bootkit compromise as outlined previously can be subverted by replacing the compromised EEPROM contents with legitimate EEPROM contents. This can be done by writing the EEPROM via SPI, which is the inverse of the EEPROM reading process outlined in Section 3.2.3.2 on page 22. The same rules regarding in-circuit programming as discussed for reading also apply to writing. Another possibility is to replace the EEPROM by desoldering it, or to replace the entire PCB of the HDD. This process is often done by specialists in case of a bad EEPROM or bad PCB. Here it is important to leave the drive-specific data from the ROYL modules 0x0a, 0x0d, 0x30, and 0x47 of the compromised EEPROM intact, because otherwise the correct functionality of the drive can be impacted. In case the EEPROM or PCB is swapped, this data must be copied to the replacement EEPROM.

3.2.5 Investigating the rootkit

Investigating the actual firmware rootkit is a complex task. As noted earlier, Goodspeed acknowledged that it took ten man months to reverse engineer the firmware of a Seagate Barracuda HDD, and this firmware is not obfuscated in any way. So if the task is to investigate a sophisticated firmware rootkit, this time could potentially increase dramatically.


3.3 Overlay/module-based hard disk firmware rootkit

In the previous section, we outlined how the boot loader code stored in the EEPROM can be verified to detect and combat firmware bootkits. However, as already suggested in the previous section, the EEPROM is not the only place where the HDD stores its firmware and hence where malware can hide. After the boot loader code loaded from the EEPROM has initialized the HDD enough to read from the spinning platter, further firmware code is loaded from the Service Area. Western Digital calls these additional firmware blocks modules, while Seagate calls them overlays. A firmware rootkit hiding in the Service Area is harder to detect than the previously discussed bootkit inside the EEPROM.

3.3.1 Verifying overlays/modules

In order to detect an infected overlay and/or module, the overlays and modules must be verified. The general ideas of Section 3.2.3.3 on page 24 can be reused to verify the overlays and modules. The challenging part is acquiring the overlays or modules without the risk of a rootkit interfering with the acquisition process. As outlined before, a compromised firmware could simply return the original firmware when it is read and keep it in a shadow copy for write requests to it, all while continuing to run the compromised firmware. We now outline the possibilities to read the Service Area on our WD3200AAKX example HDD. We also provide an alternative memory analysis procedure that can be used to analyze a hard disk.

3.3.1.1 Reading the Service Area

To read the Service Area of the WD3200AAKX, two possibilities exist. While the first relies on the potentially compromised firmware, the second can deliver 100 % trust but requires a large amount of engineering.

Vendor Specific Commands (VSC) The Service Area can be read via SATA by issuing vendor specific commands (VSCs). Hiding Data in Hard-Drive's Service Areas by Berkman [13] outlines how the Service Area can be read from a Western Digital WD2500KS HDD. They also published their source code, which can be adapted for use with the WD3200AAKX. The Service Area modules can further be read with the commercial PC-3000 tool suite.

The problem with this method of reading the Service Area is that the vendor specific commands sent via SATA to the HDD are interpreted by the very firmware running the HDD. If that firmware is compromised, it can simply deny the read requests or return a clean copy of the original firmware modules, etc. Hence, this method is not anti-forensically resistant enough to withstand very deep and elaborate rootkit compromises. The problem is complicated by the fact that the parts of the firmware responsible for providing the SATA interface are located in modules stored in the Service Area. This means that this firmware code can not be verified within the EEPROM. Hence, a way to access the Service Area without using the firmware located within it must be devised.

Custom Boot Loader The only way to read the Service Area without relying on the firmware located within it is to use a custom boot loader. To this end, the EEPROM can be programmed via physical access; programming via software would again invoke potentially compromised firmware on the HDD. This is not a trivial task. However, because the HDD runs any code without any signature checks whatsoever, it boils down to reverse engineering how the firmware accesses the Service Area and reimplementing this functionality in a minimal boot loader which dumps the contents of the Service Area. Many HDDs have a UART interface which can be used to load additional code in case the EEPROM provides insufficient space, and to possibly dump the Service Area contents. Table 3.1 on page 33 provides an overview of which HDDs contain such a UART interface. Many HDDs also contain the testing and debugging interface JTAG, which can aid in developing such a custom boot loader. Whether a HDD has JTAG is also documented in Table 3.1 on page 33. The table lists the IDCODE that should be expected to be found when

performing an IDCODE scan on the JTAG interface. Again, the fact that Table 3.1 on page 33 does not list UART or JTAG for a specific HDD does not mean it does not have JTAG or UART. It simply means that during our investigations we did not find such an interface or could not utilize it. We outline JTAG in more detail in Section 3.3.2.1.

3.3.2 Memory analysis

Because developing a custom boot loader to dump the Service Area without involving potentially compromised firmware requires a large amount of engineering, we propose memory analysis as an alternative. This can be compared to doing a live analysis on a computer system, in this case on the HDD's processor.

3.3.2.1 Dumping memory via JTAG

Figure 3.9: JTAG pin layout of the WD3200AAKX
Figure 3.10: Wires soldered to the JTAG pins of a WD3200AAKX

Figure 3.11: A MICTOR 38 pin connector soldered to the JTAG test pads of a WD3200AAKX
Figure 3.12: UART pin layout of Seagate HDDs

As mentioned before, many hard disks contain the testing and debugging interface JTAG. We can use the JTAGulator hardware to determine the JTAG pinout of suspected JTAG pins. Figure 3.9 shows the JTAG pin layout of our example WD3200AAKX HDD. Figure 3.10 shows wires soldered to the JTAG pins, which are then connected to the BusBlaster v4 JTAG interface. In Figure 3.11, a MICTOR connector was soldered to the WD3200AAKX instead. This not only provides more convenient access, but is also easier to solder to the pads than individual wires. This means that only moderate soldering skills are needed to achieve a JTAG connection.


With OpenOCD we can then dump the relevant firmware files. The OpenOCD commands used can be seen in Listing 3.1. The mask ROM boot loader is mapped into memory at offset 0xffff0000. It is dumped via the first command. Then the firmware sections of the EEPROM are dumped.

dump_image bootrom.bin 0xffff0000 0x10000
dump_image bootsection0.bin 0x1b000 0x1aa0
[...]

Listing 3.1: OpenOCD commands to dump memory sections of a WD3200AAKX HDD

The addresses of the firmware sections in memory can be extracted from the header in the EEPROM. This can, e.g., be done with sprite_tm's fwtool [36, http://spritesmods.com/hddhack/hddhack.tgz] via ./fwtool -i .bin. The “Block load vaddress” of each section in the output is the memory address to which that section is loaded by the boot loader. Next, the same procedure can be used to dump the entire address space and verify code integrity by reverse engineering. Because the firmware is capable of loading additional modules from the Service Area at any time, the memory should be dumped periodically and analyzed for changes.
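Given the section list derived from the fwtool output, the commands of Listing 3.1 can be generated mechanically. The following helper is our own illustration (the function name is hypothetical; the mask ROM defaults reflect the WD3200AAKX mapping described in the text):

```python
def openocd_dump_script(sections, mask_rom_base=0xffff0000,
                        mask_rom_size=0x10000):
    """Generate OpenOCD dump_image commands for a firmware layout.

    `sections` is a list of (filename, load_vaddress, size) tuples,
    as could be derived from the "Block load vaddress" fields
    reported by sprite_tm's fwtool.
    """
    # First dump the mask ROM boot loader as mapped into memory.
    lines = ["dump_image bootrom.bin 0x%08x 0x%x"
             % (mask_rom_base, mask_rom_size)]
    # Then dump each firmware section at its load address.
    for name, vaddr, size in sections:
        lines.append("dump_image %s 0x%x 0x%x" % (name, vaddr, size))
    return "\n".join(lines)
```

The resulting command list can be pasted into an OpenOCD session or repeated periodically to watch for modules loaded from the Service Area at run time.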

Even though this can also not provide 100 % assurance, it is more feasible, to begin with, than developing a boot loader. Ultimately, however, this form of memory analysis should be used to reverse engineer the hard disk under investigation to the point where developing custom boot loader code is possible. This is especially true since all HDD controller processors we tested via JTAG were MMU-less, i.e., not using a memory management unit (MMU) to restrict memory accesses. This means any process running on the controller can modify any and all code and data in RAM. Consequently, the only way to truly assess non-compromise is to verify each and every instruction executed by the processor before execution.

3.3.2.2 Dumping memory via UART

A possibility that is not anti-forensically resistant, because it involves the firmware of the HDD, is to leverage the debugging console provided by Seagate HDDs via their UART interface. Figure 3.12 on the previous page shows the UART pin layout. Table 3.1 on page 33 lists the baud rates and stop bit configurations of the tested Seagate HDDs. Booting the HDD should output a message similar to the one listed in Listing 3.2. Once this message is displayed, commands can be entered. Listing 3.3 on the facing page displays an example session. First, Ctrl+Z is pressed to activate the ASCII Diagnostic Mode. This is acknowledged by the device by displaying the F3 T> prompt. From there, further commands can be issued. In the example, the Diagnostic Command Level is changed to Level C by entering /C. Then the ASCII Command Information is displayed via Q.

Boot 0x40M
Up TCC-001C[0x000065B4][0x00006A20][0x00006E8C] Trans.
Rst 0x40M
MC Internal LPC Process
Spin Up TCC-001C
(P) SATA Reset
MCMainPOR: Start:
Check MCMT Version: Current
MCMainPOR: Non-Init Case
Reconstruction: MCMT Reconstruction Start
Max number of MC segments 0A61
Nonvolatile MCMT sequence number 000161C9
[RSRS] 07C8
Reconstruction: Completed 1: [MCMTWS]
MCMainPOR: MCTStateFlags 0000002A MCStateFlags 00005141
MCMainPOR: Feature Enabled...
[SR] 0000 PowerState = IDLE1

Listing 3.2: Boot message on UART from Seagate ST2000DM001


F3 T>/C

F3 C>Q

Online CR: Rev 0011.0000, Flash, Abort
[...]
Online ^C: Rev 0011.0000, Flash, Firmware Reset
[...]
Online ^Z: Rev 0011.0000, Flash, Enable ASCII Diagnostic Serial Port Mode
[...]
All Levels '/': Rev 0001.0000, Flash, Change Diagnostic Command Level, /[Level]
All Levels '+': Rev 0012.0000, Flash, Peek Memory Byte, +[AddrHi],[AddrLo],[NotUsed],[NumBytes]
All Levels '-': Rev 0012.0000, Flash, Peek Memory Word, -[AddrHi],[AddrLo],[NotUsed],[NumBytes]
All Levels '=': Rev 0011.0002, Flash, Poke Memory Byte, =[AddrHi],[AddrLo],[Data],[Opts]
[...]
Level 1 'S': Rev 0011.0001, Flash, Edit Processor Memory Byte, S[AddrHi],[AddrLo],[MemValue],[NumBytes],[Opts]
Level 1 'U': Rev 0011.0001, Flash, Edit Buffer Memory Byte, U[AddrHi],[AddrLo],[MemValue],[NumBytes]
Level 1 'e': Rev 0011.0000, Flash, Spin Down and Reset Drive, e[MsecDelay],[Opts]
Level 1 'm': Rev 0011.0001, Flash, Edit Processor Memory Word, m[AddrHi],[AddrLo],[MemValue],[NumBytes],[Opts]
[...]
Level C 'Q': Rev 0001.0000, Overlay, Display ASCII Command Information, Q[CmdLevel],[Cmd]
[...]
Level T '[': Rev 0011.0000, Overlay, ASCII Log Control, [[LogFunction],[Log]

Listing 3.3: Displaying the available commands over the ST2000DM001's UART

The interesting commands to peek and poke, i.e., read and write memory, are + to peek a byte and = to write a byte. Zaddach et al. [173] were able to use this simple interface to place a GDB stub into the firmware and debug it further. Seaget is a project that uses this UART memory peeking interface to dump the memory and buffers of a Seagate HDD, hence also dumping the firmware as it resides in memory.
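To automate such dumps, the peek command line has to be formatted from a target address. The sketch below is our reading of the '+[AddrHi],[AddrLo],[NotUsed],[NumBytes]' syntax from Listing 3.3, splitting the address into 16-bit halves; the exact field encoding is an assumption on our part and should be verified against the specific drive (and against tools such as Seaget) before use:

```python
def peek_byte_command(addr, num_bytes=1):
    """Format a Seagate ASCII-diagnostic peek command.

    Builds the '+[AddrHi],[AddrLo],[NotUsed],[NumBytes]' line from
    the command reference in Listing 3.3. Splitting the 32-bit
    address into two 16-bit hex halves is our interpretation of the
    listing, not vendor-documented behavior.
    """
    addr_hi = (addr >> 16) & 0xffff
    addr_lo = addr & 0xffff
    # The [NotUsed] field is left empty.
    return "+%X,%X,,%X" % (addr_hi, addr_lo, num_bytes)
```

Sending such lines over the serial console in a loop, and parsing the replies, is essentially how a full memory dump via the UART interface can be driven.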

3.4 Discussion

Before concluding this chapter we would like to discuss the various compromises and how either our work or already existing research can be used to detect the compromise.

3.4.1 Compromise in EEPROM

A compromise located within the EEPROM, such as a firmware bootkit, can be easily detected via the methods outlined in Section 3.2.3 on page 21. Because currently all publicly available source code with regard to HDD rootkits uses a bootkit within the EEPROM, verifying at least the EEPROM integrity during forensic investigations seems like a reasonable addition to HDD acquisition.

3.4.2 Compromise in Service Area

As we have also outlined in this chapter, more elaborate malware residing in the firmware overlays or modules, waiting to eventually be triggered, is also possible. We propose the development of custom boot loaders to dump potentially infected overlays and/or modules from the Service Area. However, due to the high complexity and unavailable documentation of the HDD's internal structures, such a task is not trivial. Our proposed memory analysis method can, however, be used to search for malware during firmware execution. But this is not an automated task and requires a large amount of reverse engineering.

3.4.3 Compromised controller hardware

What has not been discussed in this chapter is a hardware compromise, meaning that the controller chip itself is compromised: either the chip's mask ROM has been compromised during manufacturing, or the chip itself has been replaced with an optically identical but functionally different chip. One example of such component replacement is the NSA's FLUXBABBIT project. Compromised hardware can be detected either via differential power analysis and related techniques [135] or, depending on the type of compromise, via X-ray [45]. A destructive attestation method would be to “decap” the chip, successively remove each of the chip's layers, and use a scanning electron microscope to verify the chip's individual gates [112, 113, 128, 150].


However, a hardware compromise is very unlikely, as this would mean redeveloping the entire chip with compromising enhancements. Hence, a more likely scenario is the compromise of the boot ROM stored within the controller chip. We were not able to verify the boot ROM directly. However, it can be verified via JTAG by reading it as mapped into memory at offset 0xffff0000. But this assumes the JTAG hardware itself is not compromised. To rule out any hardware compromise in the controller, the entire controller chip can be replaced with a controller chip of a known non-compromised hard disk. In fact, for simplicity, the entire controller board can be replaced with a known non-compromised controller board. In such a case, the HDD-specific calibration data and references to the firmware overlays in the Service Area must, however, be copied from the old EEPROM onto the new EEPROM, as already outlined in Section 3.2.4 on page 27.

3.4.4 SSDs

While this chapter was demonstrated with a WD3200AAKX HDD, parts of it are also applicable to SSDs. To demonstrate this, we also tested SSDs; these are listed in Table 3.1 on the next page. While we were not able to find an SSD that contained firmware on an EEPROM, we were able to find JTAG interfaces on most of them.

3.5 Conclusion and future work

After presenting the state of the art in hard drive firmware rootkit compromise, outlining detection and remediation measures for bootkits residing in the EEPROM, and presenting possible detection methods for more sophisticated rootkits hiding within overlays or modules in the Service Area, we would like to conclude this chapter. While we do not completely solve the problem of hard drive rootkits, this is the first work to investigate such hard drive compromises from a forensic perspective. It thereby offers the forensic community a first starting point for immediate measures, such as saving and potentially verifying the EEPROM contents when acquiring hard drives. This way, EEPROM-residing bootkits can be ruled out during investigations. This is important because, as has been shown in this chapter, the knowledge on how to implement these bootkits is publicly available. Hence, this form of hard drive compromise may no longer be the reserved domain of government attackers, but may eventually end up being used by criminals. Future work in this area is the establishment of a known good firmware sample set, maybe even with support by VirusTotal (an online malware scanning service) to scan hard drive firmware samples; they already provide a possibility to scan BIOS firmware binaries. Another important future step is verifying the overlays and modules stored in the Service Area without relying on the firmware of the drive itself and also, for transparency's sake, without relying on proprietary software. Vendors are also responsible for ensuring their firmware can not be trivially manipulated anymore. To this end, digital signing, as already proposed by Seagate in a technology paper [130], must become common practice. However, to not be at the mercy of trusting the vendor, the outlined firmware verification, especially of the Service Area, must still be pursued.

Table 3.1: Analyzed HDDs and SSDs, listing their EEPROMs with the corresponding supply voltages (Vdd), in-circuit programmability (IC), the SPI reader used for in-circuit reading (if possible), the IDCODE found via JTAG, and the baud rate and stop bit configuration for UART interfacing (where applicable). Tested devices include HDDs by Western Digital, Seagate, Toshiba, HGST and Samsung as well as SSDs by Intel, Micron, Samsung and Super Talent.
Footnotes: rst — controller must be held in reset; rep — multiple quick read attempts must be issued, or an external 3.3 V supply may be used to overcome the power issues; dbg — fully fledged debugging console (note: the baud rate can be changed); xmod — Xmodem protocol for downloading code, must be enabled via shorting pins; hdr — SSD has a labeled JTAG header, but the JTAGulator's IDCODE scan fails; 0x0 — did not contain any firmware, EEPROM was zeroed.


4 Memory acquisition evaluation

With the increased use of forensic memory analysis, the soundness of memory acquisition becomes more important. Because an unsound memory acquisition can reduce the quality of the memory analysis, sound memory acquisition is also important to prevent accidental self-inflicted anti-forensics. In this chapter, we therefore present a black-box analysis technique in which memory contents are constantly changed via our payload application with a traceable access pattern. This way, given the correctness of a memory acquisition procedure, we can evaluate its atomicity and one aspect of integrity as defined by Vömel and Freiling [163]. We evaluated our approach on several memory acquisition techniques represented by 12 memory acquisition tools using a Windows 7 64-bit operating system running on an i5-2400 with 2 GiB RAM. We found user-mode memory acquisition software (ProcDump, Windows Task Manager), which suspends the process during memory acquisition, to provide perfect atomicity and integrity for snapshots of process memory. Cold boot attacks (memimage, msramdump), virtualization (VirtualBox) and emulation (QEMU) all deliver perfect atomicity and integrity of full physical system memory snapshots. Kernel-level software acquisition tools (FTK Imager, DumpIt, win64dd, WinPMEM) exhibit memory smear from concurrent system activity, reducing their atomicity. Their integrity is reduced by running within the imaged memory space, hence overwriting part of the memory contents to be acquired. The least amount of atomicity is exhibited by a DMA attack (inception using IEEE 1394). Further, even if DMA is performed completely in hardware, integrity violations with respect to the point in time of the acquisition make this method appear inferior to all other methods. Our evaluation methodology is generalizable to examine further memory acquisition procedures on other operating systems and platforms.

4.1 Introduction

Volatile memory (RAM) is an increasingly valuable source of digital evidence during a forensic investigation. Not only are cryptographic keys for full disk encryption kept in RAM, but also many other pieces of information, like the list of running processes and the details of active network connections, are kept in RAM and are lost if the computer is simply turned off during evidence collection. There are many ways to acquire volatile memory on standard desktop and server systems today [162]. The possibilities range from software-based methods with tools like DumpIt and WinPMEM, over DMA attacks [11], up to cold boot attacks [72]. All these methods have their advantages and disadvantages. On the one hand, while software-based methods are very convenient to use, they can be subverted by malware [145]. On the other hand, DMA and cold boot attacks are often defeated by unfavorable system configurations (BIOS passwords or inactive DMA ports) or technology-immanent problems. Overall, these hindrances might produce memory images that are not forensically sound. To what extent this happens is still rather unclear. To address this point, Vömel and Freiling [163] integrated the many different notions of forensic soundness in the literature into three criteria for snapshots of volatile memory: (1) correctness, (2) atomicity and (3) integrity. All three criteria focus on concrete requirements that are motivated from practice:

• A memory snapshot is correct if the image contains exactly those values that were stored in memory at the time the snapshot was taken. The degree of correctness is the percentage of memory cells that have been acquired correctly.

• The criterion of atomicity stipulates that the memory image should not be affected by signs of concurrent activity. It is well known that non-atomic snapshots become “fuzzy” [96]. The degree of atomicity is the percentage of memory regions that satisfy consistency in this respect.


• A snapshot satisfies a high degree of integrity if the impact of a given acquisition approach on a computer’s RAM is low. For instance, by loading a software-based imaging utility into memory, specific parts of memory are affected and the degree of system contamination increases (and consequently, integrity decreases).

All three criteria were formally defined and shown to be independent of each other. With these criteria, it became possible to measure, and not merely estimate, the forensic soundness of snapshot acquisition techniques. This was then done by Vömel and Stüttgen [164] for three popular memory acquisition utilities: win32dd, WinPMEM, and mdd. Their study exhibited some correctness flaws in these tools (which were later fixed), but also showed that their levels of integrity and atomicity were all quite similar. The reason why Vömel and Stüttgen [164] only evaluated three software-based acquisition methods lies in their measurement approach: they used the open-source Intel IA-32 emulator Bochs running Windows XP SP3, on which the acquisition utilities ran. The utilities were instrumented such that every relevant event was recorded using a hypercall into the emulator, thus enabling the measurement. Naturally, this white-box measurement approach was only possible for tools that were available to the authors with their source code, thus severely restricting the scope of their measurement. It is clear that approaches such as DMA and cold boot attacks can only be measured using a black-box approach. Furthermore, these measurements were performed in a situation where the Windows system was basically idle, thus giving a lower-bound measurement. The impact of system load on the quality of memory acquisition is not yet precisely known.

4.1.1 Related work

Vömel and Freiling [163] defined correctness, atomicity, and integrity as criteria for forensically sound memory acquisition and provided a comparison matrix [162, Fig. 5] with regard to the different acquisition methods. However, they also indicate that “the exact positioning of the methods within the fields of the matrix may certainly be subject to discussion” [162, p. 7]. The first to evaluate these memory acquisition criteria were Vömel and Stüttgen [164]. As already stated, they relied on a white-box methodology, restricting them to open-source tools. Other works using the notion of atomicity are BodySnatcher [127], HyperSleuth [100], and Vis [171], all of which try to increase the atomicity of forensic memory acquisition by suspending execution of the operating system, hence reducing concurrency.

4.1.2 Contribution

In this chapter, we present the first black-box methodology for measuring the quality of memory acquisition techniques. Extending the insights of Vömel and Stüttgen [164], we take correctness for granted and focus on integrity and atomicity. Our approach allows us not only to compare different software utilities with each other, but also to compare them with entirely different approaches like DMA and cold boot attacks. The idea of our approach is to apply the memory acquisition method to memory content that changes in a predictable way: put briefly, we use a program that writes logical timestamps into memory in such a way that investigating the memory snapshot yields the precise time when a certain memory region was imaged. This allows inferring an upper bound on integrity and atomicity, meaning that these criteria will be at most as bad for the respective procedures.

4.1.3 Outline

This chapter is structured as follows: First, in Section 4.2 on the facing page, we revisit the main definitions of Vömel and Freiling [163]. After this, we introduce our black-box measurement methodology in Section 4.3 on page 38. In Section 4.4 on page 40 we give an overview of our experimental setup, and in Section 4.5 on page 46 we outline our results. Finally, in Section 4.6 on page 50, we conclude our work.


4.2 Background: Criteria for forensically sound memory snapshots

We briefly revisit the main definitions of Vömel and Freiling [163]. In their model, memory consists of a set R of memory regions (pages, cells, or words), and a full snapshot covers all memory regions of the system, i.e., it stores a value for every memory region in R. However, their definitions also hold for partial snapshots, i.e., snapshots that cover subsets R′ ⊆ R of all memory regions. Our evaluation methodology makes use of partial snapshots; we therefore simplify the definitions towards this case. Here we disregard correctness and focus on atomicity and integrity.

4.2.1 Atomicity of a snapshot

Intuitively, an atomic snapshot should not show any signs of concurrent system activity. Vömel and Freiling [163] formalize this by reverting to the theory of distributed systems, where concurrent activities are depicted using space-time diagrams. Figure 4.1 shows an imaging procedure that runs in parallel to another activity on a machine using four memory regions R = {r1, r2, r3, r4}. Each horizontal line marks the evolution of one memory region over time; state changes are marked as events. The imaging procedure is shown as four events (marked as squares) that read out each memory region sequentially. Concurrently, a separate activity updates memory regions (first r1, then r4, then r2, shown as black dots).

Figure 4.1: Space-time diagram of an imaging procedure creating a non-atomic snapshot

Definition 2 (Atomicity [163]). A snapshot is atomic with respect to a subset of memory regions R′ ⊆ R if the corresponding cut through the [. . . ] space-time diagram is consistent.

The imaging process always corresponds to a cut through the space-time diagram since it necessarily has to access every memory region in R′. The cut distinguishes a “past” (before the snapshot) from a “future” (after the snapshot). Intuitively, a cut is consistent if there are no activities from the future that influence the past [101, p. 123]. Given this intuition, it is clear that the snapshot created in Figure 4.1 is not atomic.
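The consistency notion above can be sketched in code for the simple case of a single sequential concurrent activity (a hypothetical helper, not part of the thesis tooling): since the activity's writes are totally ordered, the cut is inconsistent exactly when a write that made it into the snapshot follows, in program order, a write that did not.

```python
# Sketch: atomicity check of a snapshot against one sequential concurrent
# activity. A write is captured by the snapshot iff it happens before the
# acquisition of its region; a captured write that depends on (i.e. follows)
# an uncaptured write makes the cut inconsistent.

def snapshot_is_atomic(acq_time, writes):
    """acq_time: region -> time the region was imaged.
    writes: list of (region, time) in the activity's program order."""
    missed_earlier = False
    for region, t in writes:
        captured = t < acq_time[region]
        if captured and missed_earlier:
            return False          # a "future" write influenced the "past"
        if not captured:
            missed_earlier = True
    return True

acq = {"r1": 1, "r2": 2, "r3": 3, "r4": 4}
# r1 updated after r1 was already imaged, then r4 updated before r4 was
# imaged: the captured r4 value depends on an uncaptured write.
print(snapshot_is_atomic(acq, [("r1", 1.5), ("r4", 2.5), ("r2", 4.5)]))  # False
print(snapshot_is_atomic(acq, [("r1", 0.5), ("r2", 1.5)]))               # True
```

The timing values here are illustrative and not taken from Figure 4.1.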

4.2.2 Integrity of a snapshot

Even atomic snapshots are not taken instantaneously but require a certain time period to complete. The property of integrity refers to this aspect. Intuitively, integrity ties a snapshot to a specific point in time chosen by the investigator. A high level of integrity implies that the snapshot was taken “very close” to that point in time.

Definition 3 (Integrity [163]). Let R′ ⊆ R be a set of memory regions and t be a point in time. A snapshot s satisfies integrity with respect to R′ and t if the values of the respective memory regions that are retrieved and written out by an acquisition algorithm have not been modified after t.


Figure 4.2: Integrity of a snapshot with respect to a specific point in time t

In a certain sense, integrity refers to the “stability” of a memory region's value over a certain time period. Figure 4.2 illustrates the idea: The example consists again of four memory regions, i.e., R = {r1, r2, r3, r4}. We assume that at time t, the imaging operation is initiated and leads to a change in the memory regions r3 and r4, as indicated by the black dots (e.g., by loading a software-based imaging solution into memory). Again, the snapshot events (when the respective memory region is read out by the acquisition algorithm) are visualized as black squares. With regard to t, the snapshot satisfies integrity for memory regions r1 and r2, but not for r3 and r4. By t, we refer to the point in time when an investigator decides to take an image of a computer's memory. Although highly subjective, this point in time ideally defines the very last cohesive system state before being affected (in any way whatsoever) by the imaging operation; the value of t should, therefore, mark a time very early in the investigation process. Vömel and Freiling [163, Lemma 1] show that under certain assumptions the integrity of a snapshot implies its correctness and its atomicity.

4.3 Black-box measurement methodology

As already suggested by Vömel and Stüttgen [164], we developed a black-box measurement methodology allowing us to estimate atomicity and integrity. Our black-box methodology comprises a worst-case analysis in a high-load scenario.

4.3.1 Implementation

Our technical implementation of the black-box methodology is as follows: First, we implement our payload application named RAMMANGL.EXE (the name refers to “RAM mangler”). This payload application allocates memory regions. Each memory region is marked with a counter which is constantly increased by the payload application, like a timer. Second, we implement an analysis framework that reads the counter values back from each region and runs statistics on them according to the estimation explained in the next section.
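The core idea of the payload and analysis pair can be sketched as follows. This is a simplified illustration only: the region size, counter layout, and function names are assumptions, not the actual RAMMANGL.EXE implementation, which runs natively on Windows.

```python
# Sketch of the black-box payload idea: allocate N memory regions and
# continuously stamp each one with a monotonically increasing logical clock,
# so that an acquired dump reveals *when* each region was imaged.

import struct

REGION_SIZE = 4096          # assumed page-sized regions
N_REGIONS = 8               # the real payload marks far more memory

regions = [bytearray(REGION_SIZE) for _ in range(N_REGIONS)]

def mangle(ticks):
    """One 'timer' loop: write the current counter into every region."""
    counter = 0
    for _ in range(ticks):
        counter += 1
        stamp = struct.pack("<Q", counter)   # 64-bit little-endian counter
        for r in regions:
            r[:8] = stamp

def read_counters(snapshot):
    """Analysis side: recover the counter value from each acquired region."""
    return [struct.unpack("<Q", r[:8])[0] for r in snapshot]

mangle(10)
print(read_counters(regions))   # every region carries the last tick, 10
```

In the real setup the stamping loop and the acquisition run concurrently, so the recovered counters differ per region; the statistics over those differences drive the estimation in the next section.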

4.3.2 Estimating atomicity and integrity

We now devise two simple measures by which it is possible to estimate the atomicity and integrity of a memory snapshot. We call these measures the atomicity delta and integrity delta. Intuitively, atomicity is bounded by possibilities to write memory regions “from the past”. Hence, the faster all memory regions are acquired after the first region was “moved into the past”, i.e., was acquired, the fewer regions can potentially be written to “from the past”. Atomicity can hence be approximated by the atomicity delta, i.e., the interval from the acquisition of the first memory region to the acquisition of the last memory region, as this is the window in which inconsistencies within the image can be introduced. If no memory region has been acquired yet, all memory regions can still be freely changed, because no memory region’s value has been “fixed” yet. The same is true once all memory regions have been “fixed”, i.e., acquired: any changes to the memory regions then will not introduce any more inconsistencies. Good atomicity, therefore, corresponds to the speed of taking the memory snapshot. In contrast, integrity implies that the memory snapshot is closely tied to the values that were present in memory at the point in time the investigator initiated the acquisition. Obviously, software-based methods can never achieve perfect integrity, since the acquisition tool changes memory regions when loading itself into the address space of the system. Since we focus on process memory, and because forensic memory acquisition tools should strive for a negligible footprint anyway, we consider the amount of memory regions actually occupied and thereby changed by the acquisition tool to be negligible. This assumption allows us to devise the integrity delta, i.e., the average over the times required to acquire each memory region, given that imaging was initiated at acquisition point in time t = 0.

Formally, let C = (c0, · · · , cN) be the vector of counters embedded in the N memory regions of our payload application, and let t be the time the acquisition was started. We can now define our two measures that indicate atomicity and integrity as follows.

Definition 4 (atomicity delta). The atomicity delta is the time span between the acquisition of the first memory region and the last memory region, formally:

(max_i c_i) − (min_j c_j).

Definition 5 (integrity delta). The integrity delta is the average time over all regions between starting the acquisition and acquiring that memory region, formally:

( Σ_{i=0}^{N} c_i ) / N − t

Both measures are illustrated in Figure 4.3. The lower these values the better the atomicity and/or integrity respectively.

Figure 4.3: Atomicity and integrity


4.3.3 Intuitive examples

As an intuitive example, imagine the memory of a system to consist of four memory regions, each memory region containing one counter. Initially, the counters are all set to 0. Once the memory acquisition starts, the counters are atomically increased every timer tick. So the ideal memory acquisition process should provide a memory image of C = (0, 0, 0, 0), that is, all counters of all four memory regions still being in the exact state in which the acquisition was started. An example of an acquisition with high integrity but low atomicity would be the counter values C = (0, 0, 40, 0), indicated by a high atomicity delta of 40 and an integrity delta of 10. An example of an acquisition with high atomicity but low integrity, on the other hand, would have the counter values C = (1337, 1337, 1337, 1337): the atomicity delta of 0 indicates perfect atomicity, while the integrity delta of 1337 indicates poor point-in-time integrity, i.e., the snapshot was acquired 1337 counter ticks after the investigator requested the memory snapshot to be acquired.
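Under the assumption that the recovered counters directly encode logical acquisition times with t = 0, both measures reduce to a few lines of code; the following sketch reproduces the numbers from the example above.

```python
# Computing the atomicity delta (Definition 4) and integrity delta
# (Definition 5) from the counter vector recovered from a dump.

def atomicity_delta(counters):
    # time span between acquiring the first and the last memory region
    return max(counters) - min(counters)

def integrity_delta(counters, t=0):
    # average time between requesting the snapshot and acquiring each region
    return sum(counters) / len(counters) - t

print(atomicity_delta([0, 0, 40, 0]))    # 40: low atomicity
print(integrity_delta([0, 0, 40, 0]))    # 10.0: still fairly high integrity
print(atomicity_delta([1337] * 4))       # 0: perfect atomicity
print(integrity_delta([1337] * 4))       # 1337.0: poor point-in-time integrity
```

Lower values are better for both measures, matching the remark after Definition 5.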

4.3.4 Practical constraints

However, because in practice it is not possible for an application to atomically update all counters, we have to introduce some constraints into our methodology. The reason why the counters cannot be updated atomically is the same reason why memory acquisition software cannot acquire memory atomically: modern computer systems distribute their computing and memory access resources in the form of so-called threads. Each thread is allowed to use the system only for a given amount of time before handing the system resources over to another thread. This way, multiple applications can run on the system quasi-simultaneously: while in reality they run sequentially, if execution is handed over between threads fast enough, usually in the order of milliseconds, interactivity, e.g., interactive graphical interfaces, can be achieved. Furthermore, periodic system tasks can be run this way, which is important to keep the system working correctly. Because most memory acquisition software relies on parts of the system, e.g., file system or network drivers, to exfiltrate the memory to persistent storage, it cannot monopolize the system's resources in order to achieve atomic acquisition. One approach around this is to withdraw control over the file system or network devices from the operating system and run the memory acquisition outside the realm of the operating system, either as a different OS as demonstrated by BodySnatcher [127], or in a hypervisor as demonstrated by HyperSleuth [100] and Vis [171].

4.4 Experiments

In this section, we briefly outline our experiments. We start with the setup, then elaborate on issues we encountered during the execution. We then describe our solutions to these issues and our assessment of how far they affect our results and/or forensic memory acquisition in general.

4.4.1 Setup

We conducted our experiments on a 64-bit Windows 7 Enterprise operating system identifiable by its build number 7600.16385.amd64fre.win7_rtm.090713-1255. The hardware used was an ESPRIMO P900 E90+ with an Intel i5-2400 @ 3.10 GHz and one 2 GiB RAM module. Of these 2 GiB of physical RAM, 102 MiB were mapped above 4 GiB (see the PCI hole discussion in Section 4.4.3.1 on page 42) and 10 MiB were lost to the BIOS and PCI devices, resulting in a total of 2038 MiB of physical RAM usable by the operating system. The pagefile of the system was disabled to ensure that all relevant data was residing in RAM. Due to issues of Volatility relying on kernel debugging data structures, namely the kernel debugger data block (_KDDEBUGGER_DATA64), which Volatility was unable to read in all our 64-bit RAM dumps (a known issue), we wrote our own tools, the concept behind which we will introduce in Chapter 7 on page 85.
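The memory accounting above can be checked with a quick back-of-the-envelope calculation. This is a sketch; the interpretation that the remapped 102 MiB remain usable while only the 10 MiB taken by the BIOS and PCI devices are lost is our reading of the setup.

```python
# Sanity check of the RAM accounting of the test system (all values in MiB).

TOTAL_MIB = 2 * 1024          # one 2 GiB RAM module
REMAPPED_ABOVE_4GIB = 102     # usable, but relocated in physical address space
LOST_TO_BIOS_PCI = 10         # not usable by the operating system

usable = TOTAL_MIB - LOST_TO_BIOS_PCI
below_4gib = TOTAL_MIB - REMAPPED_ABOVE_4GIB - LOST_TO_BIOS_PCI

print(usable)       # 2038 MiB, matching the text
print(below_4gib)   # usable RAM left below 4 GiB
```

The second figure matters for IEEE 1394 DMA attacks, which (as discussed in Section 4.4.3.1) can only reach memory below 4 GiB.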


The memory was acquired directly onto a removable storage device consisting of a 320 GB Western Digital WD3200AAKX-00ERMA0 hard disk inside an Inateck FD2002 hard disk docking station connected via USB 2.0. Even though the docking station supports USB 3.0, it was connected via USB 2.0 only. The disk was formatted with the Windows native file system NTFS. The disk connected via USB could sustain 90 MB/s write speeds. This was tested via dd: we copied 4 GiB test files into the NTFS file system of the disk multiple times. In no memory acquisition test was this write speed reached, making us confident that disk I/O was not the determining factor of our evaluation. The memory acquisition tools were also started from this removable storage device, which was mounted by Windows as E:. One exception to this was the process dumping facility of the Windows Task Manager, which was invoked from the system itself. For the cold boot attacks with the memimage and msramdump tools we also used a WD3200AAKX disk and the Inateck USB docking station. Inception was invoked from a Lenovo X230t, the internal hard disk of which could sustain 250 MB/s write speeds, again according to dd.

4.4.2 Sequence

We conducted our experiments as follows:

1. Startup computer.

2. Enter password to login user.

3. Wait approximately 1 min for the system to settle.

4. Open cmd.exe via the Windows Start menu.

5. Enter E: to change the working directory to our removable storage.

6. Type “RAMMANGL.EXE 512 2048” but do not start the payload yet.

7. Now the memory acquisition tool was started and prepared to the point where the least amount of user interaction was needed to start the acquisition. In most cases this consisted of starting another cmd.exe as Administrator, changing the current working directory to E:, and then invoking the tool. The exceptions are explained in Section 4.4.4 on the next page.

8. Start the payload application RAMMANGL.EXE.

9. Start the memory acquisition tool immediately afterwards. The multitude of different acquisition tools evaluated prevented us from automating this step. We, therefore, tried as hard as we could to tie the start of the memory acquisition as closely as possible to the point in time of starting the payload application.

10. Start timing measurement with a stopwatch while not touching the system.

11. Stop time after the acquisition has completed, then close RAMMANGL.EXE (Ctrl+C).

12. Save the console output of RAMMANGL.EXE and the output of the acquisition tool to disk.

13. Note down the time on the stopwatch in the experiment log.

14. Restart the system.

Occasionally the hard disk was defragmented to ensure maximum write capabilities. This was done at a maximum fragmentation rate of 3 %, as reported by Windows. The defragmentation was also performed by the Windows system.


4.4.3 Issues

In this section, we will briefly list some issues that impacted our evaluation and how we have overcome them.

4.4.3.1 PCI hole

The computer system used for our evaluation remapped all memory between 2 GiB and 3.5 GiB — the so-called PCI hole — to above 4 GiB. This remapping happens on the physical address level and not just the virtual address level, and is usually performed by the BIOS. The remapping was no problem for user-mode, kernel-level, and physical acquisition via cold boot attacks. It, however, posed a sometimes unresolvable problem for acquisition over DMA via IEEE 1394, because IEEE 1394 only provides DMA to the lower 4 GiB of memory. On some occasions, important memory structures concerning our payload process, namely the process control block, were moved to memory above 4 GiB, making it impossible to reconstruct the virtual address space of the payload process. All of the tested kernel-level acquisition tools acquire the whole address space, “including” the PCI-hole-remapped addresses from 2 GiB to 4 GiB, filling this non-existing RAM with zero bytes. While this is not an issue preventing our experiments, it impacts some of the measured values. Because we only consider the memory contents of our payload application, and these are scattered throughout the low 2 GiB, the tools need half of their total runtime to acquire this memory anyway. For further analysis, we extract all timing information from the counters in the memory regions of our payload application.

4.4.3.2 Inconsistent page tables

In about every fifth memory dump acquired via kernel-level acquisition, we were confronted with inconsistent page tables. While almost the whole virtual address space of our payload application RAMMANGL.EXE could be reconstructed, a few pages were sporadically mismapped to the virtual memory of other processes, unused physical memory, or kernel memory. The reason for this is yet unknown to us; however, because all tested kernel-level acquisition tools exhibited the same behavior, regardless of the acquisition method (whether using MmMapIoSpace(), the \Device\PhysicalMemory device, or PTE remapping), we do not consider it to be a tool error. On the other hand, we also do not consider it to be an error of our framework, because we confirmed the correct assembly of the virtual address space of our payload application with Volatility and the tools we developed and introduce in this thesis in Chapter 7 on page 85. To resolve this open issue we simply repeated measurements with inconsistent page tables until we acquired correct images for our analysis.

4.4.4 Analyzed methods and tools

In this section, we introduce the tools and methods we evaluated with our methodology. While classical memory acquisition techniques could be divided into hardware- and software-based solutions, this no longer holds true, because many acquisition techniques use both hard- and software [162, 3. Acquisition of volatile memory]. We hence divide the acquisition methods into: DMA attacks, cold boot attacks, software acquisition, and virtualization.

4.4.4.1 Cold boot attack

The physical memory of a system can be acquired via a cold boot attack, first popularized and practically demonstrated by Halderman et al. [72]. We will outline the cold boot attack in great detail in the next chapter. Briefly, in a cold boot attack an attacker leverages the RAM's remanence effect: RAM does not lose its contents instantly, rather the charges of the storage capacitors dissipate over time, making it possible to read out the stored values of the RAM even after rebooting the system. To perform the attack, an attacker reboots the hardware into a small operating system capable of acquiring the RAM contents and writing them to persistent storage [72]. Halderman et al. also made the tools used for their original publication, which we refer to as memimage, available online [71].


Because the computer can be reset at any time, the point-in-time integrity is perfect. The same is true for atomicity: once rebooted, all system activity stops, perfectly conserving the RAM contents. This makes the method 100 % atomic. However, while the point in time of the acquisition can be chosen perfectly, the minimal operating system injected into the system inevitably overwrites memory. Because the size of the memimage is only 9.9 KiB, this loss of RAM contents can, in general, be neglected when compared to the total 2 GiB memory size of our test system, making the integrity also almost 100 %. Another available tool to conduct cold boot attacks is msramdump by McGrew Security. It has a memory footprint of around 22.6 KiB, i.e., it will overwrite 22.6 KiB of RAM contents. One problem with cold boot attacks, however, is bit errors that can be introduced during hard resets of the system or when transplanting the RAM from one system to another, see Section 5.4.3 on page 63, in which case the correctness of the acquired image is considerably impacted. Even though the theory behind cold booting and/or resetting a machine at a specific point in time already intuitively suggests perfect atomicity, we verified this with our experiments. Because the bit errors introduced during a hard reset or transplantation attack would have a negative impact on our evaluation methodology, we refrained from such complex attacks and only performed a simple reset attack, as outlined in Chapter 5 on page 51.
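To put the footprint figures in perspective, a back-of-the-envelope calculation shows what fraction of the 2 GiB test system's RAM the two cold boot payloads overwrite (a sketch using only the sizes quoted above):

```python
# Fraction of the test system's RAM overwritten by the cold boot payloads.

KIB, GIB = 1024, 1024 ** 3

def overwritten_percent(footprint_bytes, ram_bytes=2 * GIB):
    return 100 * footprint_bytes / ram_bytes

print(f"{overwritten_percent(9.9 * KIB):.6f} %")    # memimage (9.9 KiB)
print(f"{overwritten_percent(22.6 * KIB):.6f} %")   # msramdump (22.6 KiB)
```

Both payloads overwrite well under a hundredth of a percent of the 2 GiB of RAM, supporting the "integrity almost 100 %" claim.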

4.4.4.2 Emulation

QEMU is an open source computer emulator initially developed by Fabrice Bellard. It allows operating systems to be run within an emulated environment. We invoked QEMU from the command line with the -monitor stdio option, which allows convenient access to the QEMU monitor on the standard terminal of the host operating system. This way, the memory of the emulator can be saved by invoking the command outlined in Listing 4.1. During memory acquisition, the emulation is paused, making the acquisition fully atomic. Even though this is a known and clear-cut feature of the emulator, we verified the perfect correctness, atomicity, and integrity of this memory acquisition method in our evaluation.

pmemsave 0 0x80000000 qemu.mem

Listing 4.1: Command to dump the lowest 2 GiB of memory from QEMU into the file qemu.mem

4.4.4.3 Virtualization

VirtualBox is an open source virtualization solution by Oracle. It employs Intel's VT-x virtualization technology, allowing a guest operating system to be run within a host operating system without the performance drawbacks incurred by pure emulation solutions. Similar to emulation, the memory can be acquired with perfect atomicity, correctness, and integrity. The command to save the entire address space of a virtual machine into an image file can be seen in Listing 4.2.

vboxmanage debugvm ${vmname} dumpguestcore --filename ram.elf64

Listing 4.2: Command to dump the memory from a VirtualBox virtual machine ${vmname} as an ELF64 dump into file ram.elf64

4.4.4.4 Kernel-level acquisition

A very popular, because simple, class of memory acquisition tools are software tools, such as WinPMEM, DumpIt, win64dd, or FTK Imager, to name a few. They allow an investigator to conveniently dump the physical memory either to a removable storage medium or by transmitting it over the network. Especially popular are so-called kernel-level acquisition tools. Because in modern operating systems applications have no access to physical memory but only to their own virtual address space, privileged system access from within the kernel is necessary. This is often done in the form of a driver running in kernel mode in conjunction with a user-level interface. Even though the kernel-level driver has access to the full physical memory, it is hard to write memory onto disk or transmit it via the network from within the kernel. Hence, most acquisition solutions have a user-mode part used to control the memory acquisition driver. The user-mode part is also in charge of writing the acquired memory to an image on disk or transmitting it over the network, which can then be done with the regular facilities of the operating system. Because kernel-level acquisition tools are essentially part of the same system they try to acquire a forensically sound memory image from, it is a very interesting question how their interaction with the system impacts atomicity, integrity, and correctness.

FTK Imager Lite — we used version 3.1.1 — by AccessData is a graphical framework for live forensics. It supports kernel-level memory acquisition. FTK Imager was one of the closed source tools mentioned by Vömel and Stüttgen as one candidate for a black-box analysis [164].

DumpIt — we used version 1.3.2.20110401 — by Matthieu Suiche and MoonSols is another kernel-level acquisition tool. Its main feature is its simple usage: the program has no options or configuration. The user starts it directly from a connected removable storage device, and the start location is also the location to which the memory image is written. Due to this ease of use and its general availability, DumpIt is rather popular.

The tool win64dd — we used version 1.3.1.20100417 (Community Edition) — is another kernel-level acquisition tool by Matthieu Suiche and MoonSols. We used the freely available Community Edition of the otherwise commercial product. Unlike DumpIt, win64dd offers several configuration options, such as the acquisition method (MmMapIoSpace(), \Device\PhysicalMemory, and PTE remapping as default) or the acquisition speed (normal, fast, sonic, and hyper sonic as default).

WinPMEM — we used version 1.6.2 — by Michael Cohen is an open source kernel-level memory acquisition tool. Like win64dd it offers an option to select between different acquisition modes (physical or iospace).

4.4.4.5 DMA

Memory can also be acquired via a DMA attack. This attack uses a system bus such as PCI [19], PCIe [46] or IEEE 1394 [11] to perform direct memory access (DMA) on a target machine. As stated earlier, IEEE 1394 is restricted to the lower 4 GiB of physical memory and requires a software driver to be present on the target system. PCI is not hot-pluggable, making it a solution that needs to be pre-installed, i.e., the target system must be made forensically ready before memory can be acquired. Because DMA has to transfer individual memory pages while the system is running, which potentially changes the memory contents, the atomicity measure, as proposed by Vömel and Freiling [163], of this method is considered to be only moderate [162, Fig. 5].

The toolset inception — we used version 0.4.0 — is a framework for DMA attacks developed by Carsten Maartmann-Moe and is available freely under the GPL license. It allows DMA attacks via IEEE 1394, also known as FireWire or i.LINK, and via PCIe.

# ./incept dump
[...]
[\] Initializing bus and enabling SBP-2, please wait or press Ctrl+C

Listing 4.3: inception indicating initialization of the IEEE 1394 bus and waiting for SBP-2 to be enabled

Initializing the IEEE 1394 bus and enabling the Serial Bus Protocol 2 (SBP-2) can, according to our experiments, sometimes take up to 10 seconds, especially when the driver is not already loaded on the victim's machine. This is indicated, first, by the output of the inception tool, as can be seen in Listing 4.3, and, second, by various pop-ups within the Windows operating system indicating new hardware, the initialization of new drivers, and eventually the message that the new hardware is ready.


4.4.4.6 User-mode

Memory can also be acquired from virtual memory. In this case, memory acquisition is restricted to user-level processes.

The Windows Task Manager — we used the one supplied by our test system — can be used to dump the memory of processes. In this case, the Windows Task Manager writes the process dump to C:\Users\user\AppData\Local\Temp\RAMMANGL.DMP. To keep the results in line with the other results, the system hard disk was also a 320 GiB Western Digital WD3200AAKX hard disk. The process is suspended while the Windows Task Manager dumps the memory, resulting in high atomicity. However, because the process must be selected within the process tab of the Windows Task Manager in order to initiate the dump, we observed a small lag between starting the payload application and starting the memory acquisition, slightly decreasing integrity.

ProcDump — we used version 7.1 — from the Sysinternals tools is another tool that can acquire the memory of a process. As can be seen from Listing 4.4, the process whose memory should be acquired can be specified by its process image name. Hence, unlike the previous user-mode dumping tools, the lag between starting RAMMANGL.EXE, acquiring its process ID, and eventually invoking the dump tool is removed, which greatly helps the point-in-time integrity of dumps.

procdump.exe -ma RAMMANGL.EXE

Listing 4.4: Command line used to invoke ProcDump

ProcDump further has various options to trigger a dump, e.g., CPU load, memory usage, or other metrics going above a certain threshold, or the process raising an exception. This gives an investigator a very high degree of control over the acquisition's point in time, hence increasing integrity. ProcDump also offers a method called clone to dump the process memory using a concept similar to copy-on-write at the operating system level. With this, it was possible to obtain a perfectly atomic memory image, something we otherwise only expected to obtain from virtualization or emulation. Listing 4.5 lists the parameters used to invoke ProcDump with process cloning and reflection, keeping the downtime of the process being imaged to a minimum.

procdump.exe -ma -a -r RAMMANGL.EXE

Listing 4.5: Invoking ProcDump leveraging process cloning for acquisition with minimal process suspension
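The copy-on-write idea behind such cloning can be illustrated with a small sketch. Note that the class, names, and page granularity here are invented for illustration; this is not ProcDump's actual implementation.

```python
class CowSnapshot:
    """Toy copy-on-write snapshot: the snapshot shares pages with the live
    'process' and copies a page only when the live side writes to it."""

    def __init__(self, pages):
        self.live = pages    # live memory: list of mutable bytearray pages
        self.frozen = {}     # pages copied aside on first write

    def write(self, page_no, data):
        if page_no not in self.frozen:               # first write: preserve
            self.frozen[page_no] = bytes(self.live[page_no])
        self.live[page_no][:len(data)] = data        # mutate the live page

    def read_snapshot(self, page_no):
        # snapshot view: frozen copy if the page changed, else the live page
        return self.frozen.get(page_no, bytes(self.live[page_no]))

mem = CowSnapshot([bytearray(b"AAAA"), bytearray(b"BBBB")])
mem.write(0, b"ZZ")   # live page 0 changes after the snapshot was taken
print(mem.read_snapshot(0), bytes(mem.live[0]))
```

The snapshot still reads the pre-write contents of page 0, while the live process continues with the modified page, which is why the imaged process needs only minimal suspension.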

Another user-level dumping tool is pmdump — we used version 1.2 — by Arne Vidstrom. Unlike the other tools, it does not suspend the process. We included it mainly to broaden the spectrum of our evaluation with yet another kind of memory acquisition technique. Because the process is not suspended, pmdump exhibits memory smear, resulting in reduced atomicity. In most scenarios, ProcDump leveraging cloning should therefore be the preferred tool for acquiring a single process's address space.


4.5 Results

We now present our results. We first outline our measurement accuracy. Then we give selected examples of our results. Finally, we give an overall comparison of the acquisition methods.

4.5.1 Measurement accuracy

We repeated all measurements until the relative standard deviation of the mean of all counter values was below 10 %. This ensures that 95 % of all possible repeated measurements end up within 20 % of the mean value of our measurements. Given that our objective is not to compare the different kernel-level acquisition tools with each other, as they are very close to each other, as already outlined by Vömel and Stüttgen [164], but rather to evaluate the overall state of memory acquisition, these figures are sufficient, because each comparison group is far enough apart not to fall within 20 % of another.
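The repeat-until-stable criterion described above can be sketched as follows; the measurement values here are invented for illustration.

```python
import statistics

def stable(measurements, threshold=0.10):
    """Stop repeating once the relative standard deviation (sample standard
    deviation divided by the mean) of the collected counter values drops
    below the threshold (10 % in our case)."""
    mean = statistics.mean(measurements)
    rsd = statistics.stdev(measurements) / mean
    return rsd < threshold

runs = [100.0, 105.0, 95.0, 102.0, 98.0]  # illustrative counter values
print(stable(runs))
```

With the example values the relative standard deviation is about 3.8 %, so no further repetition would be needed.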

4.5.2 Individual results

First, we give a brief insight into individual results.

4.5.2.1 pmdump

Because the resulting image seen in Figure 4.4 provides the expected visual, we start by introducing the results of pmdump. This also helps to explain the way in which we visualize the measurements: As can be seen from Figure 4.4, it takes some time, i.e., counter increments (x-axis), until the tool starts to acquire memory. This is, first, due to the fact that the tool needs the process ID of our payload application in order to dump its memory. Hence our payload application was running for some seconds during which its process ID was determined. Second, the pmdump tool seemed rather resource intensive, slowing the overall system down and hence also slowing its own dumping process down. It can also be seen from Figure 4.4 that the tool acquires the virtual address space, as the y-axis depicts the counters spread along the virtual addresses of the process.

Figure 4.4: Acquisition plot of pmdump

Table 4.1 on page 49 gives the worst case figures, i.e., the maximum figures obtained in all runs, for our atomicity and integrity delta.


4.5.2.2 inception

Figure 4.5 paints quite a different picture compared to pmdump's Figure 4.4 on the preceding page. Figure 4.5 shows inception in comparison to other acquisition methods. The x-axis shows time while the y-axis shows the memory regions. Each dot indicates the point in time when a specific memory region was acquired. From this, it can clearly be seen that inception is the least atomic acquisition method, and also the slowest overall. Because inception acquires the physical memory, its acquisition plot looks rather scattered and not as orderly as the plot of pmdump. This is because Windows does not map sequential virtual addresses to sequential physical addresses. It rather keeps physical memory in a heap and allocates individual pages. Due to the memory management unit (MMU) of the CPU, this translation usually happens transparently without loss of performance. Again, Table 4.1 on page 49 gives the worst case figures regarding atomicity and integrity delta for inception.

Figure 4.5: Memory acquisition technique comparison (acquisition plot)

4.5.2.3 DumpIt

As can be seen from Figure 4.5, DumpIt is less smeared than inception via IEEE 1394. However, it is still behind super atomic methods such as cold boot attacks and virtualization, represented by VirtualBox, which both appear as indistinguishable vertical lines to the far left, close to the 0 point on the x-axis. We selected DumpIt here as a representative of the kernel-level acquisition methods, as they are all very closely related and there is no point in illustrating virtually identical acquisition methods next to each other.


4.5.3 Atomicity and integrity comparison

In summary, it can be argued that there are considerable differences in atomicity and integrity between the different acquisition methods. However, as can clearly be seen from the acquisition density plot in Figure 4.6, the different methods cluster. For example, the kernel-level acquisition methods are all nearly identical with regard to atomicity and integrity, as well as acquisition speed. Another group is DMA acquisition, here represented by the inception toolkit. Then comes user-mode dumping, which should be split into methods that suspend process execution and methods that do not. Last but not least are the ultra high atomicity methods, which can not even be distinguished in Figure 4.6 because they all cluster along the y-axis. These are the virtualization and emulation methods and the physical cold boot RAM attacks.

Figure 4.6: Memory acquisition technique comparison (acquisition density plot)

Figure 4.7: Each acquisition method's position inside an atomicity/integrity matrix (x-axis: atomicity delta, y-axis: integrity delta, both in units of 10^4; plotted tools: inception, FTK Imager, DumpIt, win64dd, WinPMEM, Windows Task Manager, VirtualBox, pmdump, ProcDump, and memimage)


Table 4.1 lists the worst case atomicity and integrity deltas, as defined at the beginning of this chapter, for all evaluated methods. The table is ordered according to the above-outlined groups and ranked by the sum of atomicity and integrity delta within each group.

Method                 Atomicity Delta   Integrity Delta
                       (Worst Case)      (Worst Case)
msramdump                          1             43.84
memimage                           1             63.28
VirtualBox                         1             26.64
QEMU                               1             35.24
ProcDump -r                        0             39.75
ProcDump                           1             36.50
Windows Task Manager               1            728.54
pmdump                            37            136.62
WinPMEM                        13230           5682.24
FTK Imager                     13151           5917.24
win64dd                        15039           8077.54
win64dd /m 1                   15039           8172.28
DumpIt                         15711           8500.09
inception                      43898          22056.77

Table 4.1: Comparison of worst case atomicity and integrity deltas
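The within-group ranking by the sum of both deltas can be reproduced from the table; the snippet below uses three of the rows for illustration.

```python
# worst-case (atomicity delta, integrity delta) pairs taken from Table 4.1
tools = {
    "msramdump": (1, 43.84),
    "DumpIt": (15711, 8500.09),
    "inception": (43898, 22056.77),
}

# rank by the sum of atomicity and integrity delta (smaller is better)
ranked = sorted(tools, key=lambda t: sum(tools[t]))
print(ranked)
```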

Figure 4.7 on the facing page represents Table 4.1 as an atomicity/integrity matrix. Please note that the figures within Table 4.1 can only be compared with measurements performed with the same parameters and the same hardware, because the counter increments are highly hardware dependent. To allow comparison of other tools, we make our framework available to the public.

4.5.4 Anti-forensic resilience

In this section, we discuss the anti-forensic resilience of the investigated memory acquisition techniques.

4.5.4.1 Kernel-level acquisition

The problem of software tools is their trustworthiness with regard to anti-forensics [145], their requirement for administrative privileges on the target operating system, and their low atomicity [162, Fig. 5]. Even if special memory acquisition software such as lmap by Stüttgen and Cohen [146], which was especially crafted to resist anti-forensics, is used, the software must still run on the system. The lack of atomicity increases the risk of evidence being deleted during acquisition.

4.5.4.2 User-mode

User-mode acquisition tools can not be considered anti-forensic resistant at all, because a kernel-level rootkit can fully control the user-mode memory dumping program.

4.5.4.3 DMA

DMA attacks via FireWire, as outlined in this chapter, can be prevented by disabling the SBP-2 protocol. On Linux, this can be done by adding the line “blacklist firewire-sbp2” to any .conf file within /etc/modprobe.d/. Because the FireWire attack is already well known, modern operating systems will also disable FireWire DMA while the lock screen is active. Adding to this the unfavorably low atomicity, which means great potential for evidence to be deleted during acquisition, DMA attacks, at least the variant tested in this chapter, can not be considered very anti-forensic resistant.


4.5.4.4 Emulation and virtualization

Emulation and virtualization can be considered perfectly resistant against anti-forensics, because the system under analysis is already sandboxed within the virtual environment. However, a system that is not already in a virtualized environment can not be analyzed this way, reducing the applicability of this method.

4.5.4.5 Cold boot attacks

Cold boot attacks provide a very high degree of atomicity, and they do not depend on any administrative privileges on the target operating system. However, they may need hardware administrative privileges if only a simple reset attack, instead of a full blown memory transplantation attack as outlined in Section 5.4.3 on page 63, is to be performed. In Section 5.5.3 on page 67, we outline how such privileges can be obtained by a forensic analyst. As soon as the system is rebooted, the RAM contents are out of reach of any anti-forensic software running on the system. Because the integrity of the cold boot attack is very high, i.e., the point in time at which the memory image should be acquired can be specified very accurately, any detection attempts of the attack are too slow. The forensic analyst cuts the power to the system and acquires the RAM. There is no room for anti-forensic interference, such as deleting evidence.

4.6 Conclusions and future work

We presented a practical approach to estimating the atomicity and integrity of forensic memory acquisition tools. This is the first time that this was done with a black-box approach. In this way, we could also test closed source acquisition tools. We believe that these evaluations are important because, before conducting these experiments, we had no intuition for how completely different acquisition techniques, e.g., kernel-level acquisition and DMA attacks via IEEE 1394, would compare. Currently, our results are highly tied to the hardware and RAM size. It would be worthwhile to have an independent figure representing atomicity and integrity so different tools could be compared more easily.

Because the impact of non-atomic memory acquisition on memory analysis is not well studied, no tools consider these issues during analysis. The impact of non-atomic, concurrent, and smeared memory snapshots on forensic memory analysis is yet unknown. However, misattribution seems to be one possible issue: e.g., when a process ends while the memory of the system is captured and another process starts during this time, it is possible that the memory contents belonging to these two processes get mixed up. In the worst case, incriminating evidence might be attributed to a different process and hence possibly a different user. In a less severe case, exculpatory evidence may be missed due to insufficient atomicity or integrity. Hence the impact of low atomicity and low integrity must be researched more thoroughly.

In this chapter, we have shown that cold boot attacks compare favorably with the other memory acquisition methods, not only in atomicity and integrity but also with regard to anti-forensic resilience. Hence, in the next chapter, we will examine the cold boot attack in more detail.

5 Memory acquisition (via cold boot attacks)

Even if a target machine uses full disk encryption, cold boot attacks can retrieve unencrypted data from RAM. Cold boot attacks are based on the remanence effect of RAM, which says that memory contents do not disappear immediately after power is cut but fade gradually over time. This effect can be exploited by rebooting a running machine, or by replugging its RAM chips into an analysis machine that reads out what is left in memory. In theory, this kind of attack has been known since the 1990s. However, only in 2008 did Halderman et al. show that cold boot attacks can be deployed in practical scenarios. In this chapter, we investigate the practicability of cold boot attacks. We verify the claims by Halderman et al. independently in a systematic fashion. For DDR1 and DDR2, we provide results from our experimental measurements that in large part agree with the original results. However, we further point out that a straightforward cold boot attack against DDR3 memory could not be reproduced. This is due to the fact that on newer Intel computer systems the RAM contents are scrambled to minimize undesirable parasitic effects of semiconductors. We therefore briefly outline a descrambling attack against DDR3 memory, thereby enabling cold boot attacks on systems employing Intel's memory scrambling technology. This is important because, as outlined in Chapter 4 on page 35, cold boot attacks are unparalleled with regard to atomicity and integrity as well as anti-forensic resilience. Our test set comprises 17 systems and system configurations, of which 5 are based on DDR3.

5.1 Introduction

Contrary to widespread belief, the contents of RAM are not instantly lost after power is cut but rather fade away gradually over time. Cold temperatures slow down the decay of bits in RAM further. This effect is called the remanence effect and was first described by Link and May [98] in 1979. Since then, the remanence effect has been subject to security research several times [170, 5, 70, 134]. Theoretic attacks based on it were first proposed in the 1990s by Anderson and Kuhn [5], and were later described in detail by Gutmann [70] and Skorobogatov [134].

RAM can be classified into dynamic and static RAM (DRAM and SRAM). RAM modules in widespread PCs mostly use the DRAM technology because of its simplicity and low manufacturing cost. Gutmann explains in his work about data remanence in semiconductor devices [70] why DRAM exhibits the remanence effect. According to Gutmann, a DRAM chip consists of multiple DRAM cells, each of which stores the information of exactly one bit. Each cell consists of a capacitor. Each capacitor's voltage is compared to the load of a reference cell which stores a voltage half-way between fully charged and fully empty. If the voltage of a cell is higher than the reference voltage, the cell stores a one-bit; if it is lower, it stores a zero-bit (or vice versa if the cell is an active low design). As the voltage of capacitors does not vanish instantly but rather decays exponentially, the bits are preserved for a short amount of time without power. The memory controller refreshes each cell's voltage before it decays to the point where the bit information gets lost. Normally, the refresh rate is so high that each cell gets refreshed several times per second. However, the decay time of the capacitors is longer than the time between two refresh operations of the memory controller. This can be observed as the remanence effect, which lasts long enough that data in memory survives a system reboot. This observation prompted so-called cold boot attacks.
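A toy model of this mechanism illustrates why a bit survives for a while without refresh; the voltage levels and the time constant below are invented for illustration and do not correspond to any real DRAM chip.

```python
import math

def cell_voltage(v0, t, tau):
    """Capacitor voltage after t seconds without refresh
    (exponential decay with an assumed time constant tau)."""
    return v0 * math.exp(-t / tau)

def read_bit(voltage, v_ref):
    """Sense against a reference cell storing a voltage half-way
    between fully charged and fully empty (active-high convention)."""
    return 1 if voltage > v_ref else 0

V_FULL, V_REF, TAU = 1.5, 0.75, 2.0   # volts, volts, seconds (illustrative)
for t in (0.0, 0.5, 1.0, 2.0, 5.0):
    print(t, read_bit(cell_voltage(V_FULL, t, TAU), V_REF))
```

In this model the bit remains readable until the voltage crosses the half-way reference (after tau times ln 2, about 1.4 s here), far longer than the interval between two refresh operations; cooling corresponds to a larger tau.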


Cold boot attacks can make use of the property that the remanence effect is prolonged by cooling down RAM chips [134, 72]. Hence RAM modules are often cooled down or even frozen before such attacks. In the easiest form of a cold boot attack, the attacker reboots the system from a USB thumb drive to start malicious system code that reads out what is left in memory. In a more advanced variant, which becomes necessary when BIOS settings require a boot password or disallow booting from USB, RAM modules can even be physically extracted in order to read them out in an analysis machine. In both cases, secret information, such as cryptographic keys for full disk encryption, can be retrieved from the RAM of a computer that is running or suspended to RAM. Despite the fact that the data remanence effect has been known for years, and that it has constantly been warned about, it was not until 2008 that Halderman et al. published the first practical attack based on the remanence effect in their work “Lest We Remember: Cold-Boot Attacks on Encryption Keys” [72]. Even though Halderman et al. have been cited by further research publications on the subject of cold boot attacks in subsequent years [21, 75, 22, 2], the practicability of this attack had never been verified independently, nor reconstructed by any of these publications, especially not for the modern DDR3 technology.

5.1.1 Related work

As stated above, the first practical attack based on the remanence effect was described by Halderman et al. [72, 73]. In their renowned paper from 2008, Halderman et al. have shown that cold boot attacks can be used to extract sensitive information, in particular cryptographic keys, from RAM. From the extracted RAM, cryptographic keys were reconstructed using recovery algorithms and then used to break the full disk encryption of BitLocker, TrueCrypt, and FileVault. After this, many publications which focus on key recovery and reconstruction followed [121, 75, 155, 125, 99]. In 2013, Müller and Spreitzenbarth performed cold boot attacks against smartphones for the first time [106]. To this end, they published a recovery tool called FROST which can be used to retrieve encryption keys from Android devices, thus proving that the ARM platform is also vulnerable to cold boot attacks. Taubmann et al. [151] later extended the idea of cold boot attacks against mobile devices with lightweight tools. In 2015, Lindenlauf et al. evaluated the cold boot attack against DDR2 and DDR3 memory [97]. In their specific test setup, which consisted of only one DDR3 system, they were, unlike us, able to recreate the cold boot attack also against DDR3 memory. We contribute to this field by verifying the memory extraction process against laptops and desktop PCs. We further investigate the newer DDR3 technology in depth, uncovering and subsequently solving an inherent problem preventing straightforward cold boot attacks against DDR3 memory.

5.1.2 Outline

In Section 5.2, we outline the test setup of our experiments, including the hardware and software configurations as well as the execution process of our experiments. In Section 5.4 on page 59, we present the results of our experiments, i.e., we present details about the temperature effects on RAM remanence. In Section 5.5 on page 64, we outline how pure software-based countermeasures can be circumvented by different forms of the cold boot attack. And in Section 5.7 on page 70, we finally conclude this chapter.

5.2 Setup

In this section, we give an overview on our setup. We describe the hardware we used, the software we deployed, and our experimental setup.

5.2.1 Hardware

First, we outline the hardware used during our experiments. This includes the computer systems we tested, as well as the equipment utilized in our experiments.


5.2.1.1 Computer systems

We focused on mobile devices such as laptops because, due to their greater exposure to physical access by a forensic analyst, they represent likely targets of the cold boot attack. Laptops, due to their mobility and the associated risk of loss, are also more likely to be encrypted than, e.g., desktop or server systems, which are more focused on performance. Table 5.1 gives a list of the systems we tested. All RAM chips were non-ECC SDRAM modules. Identical RAM model numbers mean the same RAM module was used in the corresponding systems. From now on, we refer to each system configuration by its respective identifier, denoted A to Q.

ID  System or mainboard         DDR  RAM model             Size [MiB]
A   Asus Eee PC 1010H            2   HYMP512S64BP8-Y5            1024
B   Asus Eee PC 1010H            2   NT512T64UH8A1FN-3C           512
C   Asus Eee PC 1010H            2   04G00161765D (GDDR2)        1024
D   Asus Eee PC 1010H            2   KVR667D2S5/1G               1024
E   Asus Eee PC 1010H            2   CF-WMBA601G                 1024
F   HP Compaq NX6325             2   HYMP512S64BP8-Y5            1024
G   HP Compaq NX6325             2   NT512T64UH8A1FN-3C           512
H   Intel Classmate PC NL2       2   HYMP512S64BP8-Y5            1024
I   Toughbook CF-19FGJ87AG       2   HYMP512S64CP8-Y5            1024
J   ASRock K8NF4G-SATA2          1   HYS64D64320GU-6-B            512
K   Fujitsu SCENIC P300 i845E    1   HYS64D64320GU-6-B            512
L   Fujitsu SCENIC D i845G       1   KVR400X64C3A/512             512
M   ASRock H77M-ITX              3   CMX8GX3M2A1600C9            8192
N   Fujitsu ESPRIMO P900 E90+    3   M378B5773CHO-CK0            2048
O   ASRock Z77 Pro4              3   HMT351U6BFR8C-H9            4096
P   Asus P8P67LE                 3   M378B5773DHO-CH9            2048
Q   ASRock Z77 Pro3              3   KHX2133C8D3T1K2/4G          4096

Table 5.1: List of tested computer systems and their corresponding RAM type and model

5.2.1.2 Thermometer

Temperature measurements were performed with a Sinometer DT8380 contactless infrared thermometer with a laser pointer. Its measurement range is from -30 ℃ to 380 ℃. It has a distance-to-spot ratio of 12:1, meaning that, when measuring from a distance of 12 cm, the temperature within a spot of 1 cm diameter is measured [133].
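The distance-to-spot arithmetic is simply the measuring distance divided by the ratio; a trivial sketch:

```python
def spot_diameter(distance_cm, ratio=12.0):
    """Diameter of the measured spot for a thermometer with a given
    distance-to-spot ratio (12:1 for the Sinometer DT8380)."""
    return distance_cm / ratio

# at 12 cm the spot is 1 cm wide; at 8 cm it is roughly 0.67 cm (7 mm)
print(spot_diameter(12.0), round(spot_diameter(8.0), 2))
```

This is also why, in our later temperature measurements from about 8 cm, the measuring spot is roughly 7 mm, small enough to encompass exactly one RAM chip.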

5.2.1.3 Cooling agent

Cooling was provided by multiple cans of “KÄLTE 75 SUPER” spray from CRC Kontakt Chemie, a professional cooling agent especially suited for use with electronic components. It is non-flammable and provides a cooling power of 267 J/ml, and according to its specification the lowest attainable cooling temperature is -55 ℃ [26].

5.2.2 Test data placement

The test data consisted of a 2 MiB chunk of random data generated by a pseudo-random number generator (PRNG) [92] and a 687 KiB 687 × 1024 pixel Portable Graymap (PGM) formatted picture of the Mona Lisa. The random data was used as machine-comparable data for the error analysis, while the picture was used for visual inspection. The test data was written starting from the physical address 8 MiB. We verified for each tested system that this memory area was not overwritten during reboots; this held for all tested systems.
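The point of using a seeded PRNG is that a bit-identical reference copy of the test data can be regenerated for the later bit-error comparison instead of being stored alongside the dump. A sketch of this idea follows; the actual PRNG used in our boot loader differs, and the seed here is illustrative.

```python
import random

def make_test_data(size=2 * 1024 * 1024, seed=0x1234):
    """Reproducible pseudo-random test data: the same seed always yields
    the same byte sequence, so the reference can be regenerated at will."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(size))

placed = make_test_data()       # what the placer writes to RAM
reference = make_test_data()    # regenerated later for the analysis
print(placed == reference, len(placed))
```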


5.2.3 Software

We wrote two special-purpose boot loaders for our work, one for data placement and one for data extraction. Writing our own boot loaders saved us from booting into full operating systems that could potentially falsify our results. The data placement simulates content, such as full disk encryption keys, that resides in a target's RAM. Our software to place the test data was a minimal boot loader containing the PRNG for the machine-comparable data. It also contained the picture of the Mona Lisa in its static data. After the data was copied to the desired memory location, the WBINVD instruction [81] was issued to force the data to be written to RAM instead of potentially remaining in CPU caches.

The extraction boot loader extracts the remanence of the placed data, thus completing our cold boot attack. The extracted data is eventually analyzed. The software we used for extraction is based on the memimage tool by Bill Paul [71], which was also used for the experiments in the original cold boot attack publication by Halderman et al. [72]. The memimage tool provides a simple boot image — scraper.bin — that can be written to a bootable device, such as a USB stick. When booted, this USB scraper copies the entire addressable memory contents of the system to the boot device. While this provides all functionality needed to perform cold boot attacks even against systems with several GiB of RAM, it was not very comfortable to use for our repeated measurements because dumping the entire memory takes a long time. Thus, we modified the tool to dump only the previously placed test data without wasting time on extracting the rest of the memory, which is not relevant to our measurements. We did, however, also perform a multitude of full RAM dumps to evaluate the overall feasibility, applicability, and practicability of the cold boot attack.

5.2.4 Experiment

Our experiments follow the procedure outlined below.

5.2.4.1 Structure

The canonical procedure of our cold boot attack consists of the 8 steps illustrated in Figure 5.1 on the next page. Step #1 is to boot the system into the placer tool. The placer then, in step #2, copies the test data into the system's RAM. Step #3 is to cool the RAM module down to the desired temperature c for the current measurement. Then, in step #4, the system is shut down, and in step #5, we wait t seconds. Depending on the kind of experiment, the RAM module may also be removed and transplanted into a different system during step #5. Afterwards, in step #6, the system containing the RAM is powered on again. In step #7, the booted extraction program dumps the remaining test data from RAM to disk. In the final step #8, the gathered remanence of the test data is compared with the clean undecayed test data, and the changed bits, i.e., bit errors, are counted. Figure 5.1 on the facing page contains pseudo code for the test data placement, the extraction, and the analysis of how many bits have decayed. The pseudo code uses the following variables and terminology:

• TESTDATA: test data on the boot medium
• DUMP: RAM dump stored on the boot medium
• RAM: physical RAM region being analyzed
• BIT_ERROR_COUNT: number of bit errors
• POPCOUNT: Hamming weight of a byte
• WRITE_BACK_CACHES: forces cached data to be written to RAM
• N: number of bytes in the test data
• C(c, t): cooling to c ℃ and waiting t seconds without power


Steps: #1 Power up/boot → #2 Place test data → #3 Cool RAM to c ℃ → #4 Shut down → #5 Wait t → #6 Power up/boot → #7 Extract remanence → #8 Analyze remanence

Placement:
    for (i = 0; i < N; i++) {
        RAM[i] = TESTDATA[i];
    }
    WRITE_BACK_CACHES();

Cooling:
    C(c, t)

Extraction:
    for (i = 0; i < N; i++) {
        DUMP[i] = RAM[i];
    }

Analysis:
    BIT_ERROR_COUNT = 0;
    for (i = 0; i < N; i++) {
        if (DUMP[i] != TESTDATA[i]) {
            BIT_ERROR_COUNT += POPCOUNT(DUMP[i] ^ TESTDATA[i]);
        }
    }

Figure 5.1: Abstract setup of our experiments: Either a system is rebooted, or optionally its RAM modules are removed and transplanted into another system during step #5.

5.2.4.2 Execution

We executed our experiments according to the following procedures:

Cooling In step #3, the cooling agent is sprayed evenly over the entire surface of the RAM module so that all RAM chips are covered equally, as can be seen in Figure 5.2. If possible, both sides of the RAM module are sprayed. In the case of systems A to I, this was not possible due to the placement of the RAM modules. After the initial cooling, no more cooling is applied, so the RAM modules slowly warm up again during step #5. While this lack of cooling during step #5 may cause distortion in the decay curves of our measurements, applying constant cooling to maintain a constant temperature is practically not feasible. In real cold boot attacks, it is possible to reboot a machine within 5 to 10 seconds to the point where the memory controller is refreshing the RAM modules again. It is also possible to transplant RAM modules in this time. Hence measurements beyond that time are purely illustrative of the RAM remanence but irrelevant to real practical cold boot attacks anyway.

Figure 5.2: RAM module covered in cooling agent (an exaggerated example; during our experiments we never had to use this much cooling agent)


Temperature Measurements The temperature is measured right after step #3 and before step #4. Because we measure with a contactless infrared laser thermometer, all measurements reflect surface temperatures. We measure from a distance of around 8 cm, yielding a measuring spot 7 mm in diameter. This spot is small enough to encompass exactly one RAM chip. As the chip to measure, we choose the hottest chip of the RAM module. On all tested systems, without exception, this chip is in the lower row of the RAM module near the socket. The upper chips away from the socket are consistently several ℃ cooler. RAM chips are cooled down while the system is running, meaning while the RAM chips still produce heat. The plastic surface of the RAM chips gives higher temperature readings than a label on the RAM module or its circuit parts. Therefore, our measured temperatures do not reflect the real core temperature of the RAM module but rather an upper bound of the surface temperature. Our measurement procedure is probably the reason why we measured temperatures down to only 0 ℃, even though the cooling agent can cool a disconnected RAM chip down to -30 ℃ (measured using the same procedure). However, to obtain more relevant results, the RAM is cooled in a normal operational state. In case the temperature measurement indicates that cooling is insufficient for the current experiment, cooling is reapplied and the temperature is measured again.

Timing Measurements All timings mentioned are measured between step #4 and step #6. As an indicator of the events of step #4 and step #6, pressing the system's power button is used. While this does not reflect the true point in time at which a RAM module is cut off from or reconnected to power, it is an established and reproducible point in time.

Analysis The extracted test data is analyzed as follows: We first calculate the number of bit errors as outlined in the pseudo code of Figure 5.1 on the preceding page. We then divide the error count by the total number of bits, in our case 16 Mi (our 2 MiB of test data), and subtract the result from one, which gives us the percentage of correct bits. Note that, since we use random test data, it can happen with a statistical probability of 50 % that a bit of our random test data is exactly the ground state of the corresponding bit in RAM and hence “correct” by chance. That is, the minimum percentage of correct bits is 50 %. All measurements, especially the temperatures, are best efforts and not precisely accurate, as it is not possible to apply the exact same amount of cooling agent every time. If for any of our experiments there was a deviation from the above-mentioned procedure, it is indicated as such.
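The analysis step can be sketched as follows; this is an illustrative re-implementation of the bit-error count and the percentage of correct bits, and the sample bytes are invented.

```python
def bit_errors(dump, testdata):
    """Count flipped bits between the extracted dump and the reference
    test data, as in the analysis pseudo code of Figure 5.1."""
    return sum(bin(d ^ t).count("1") for d, t in zip(dump, testdata))

def percent_correct(dump, testdata):
    """Percentage of bits that survived; 50 % is the floor for random
    test data, since half the bits match the ground state by chance."""
    total_bits = len(testdata) * 8
    return 100.0 * (total_bits - bit_errors(dump, testdata)) / total_bits

testdata = bytes([0b10101010] * 4)
dump     = bytes([0b10101010, 0b10101011, 0b00000000, 0b10101010])
print(bit_errors(dump, testdata), percent_correct(dump, testdata))
```

In the example, 5 of 32 bits flipped, so 84.375 % of the bits are correct.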

5.3 Observations

Before presenting the results of the measurements, we want to discuss a few observations made during the experiments.

5.3.1 Ground state patterns

An important aspect of analyzing already partially decayed memory images is the patterns in the ground state of the RAM. The ground state is the state the RAM decays to, i.e., the state the RAM will eventually take when not being refreshed anymore. The ground state of most RAM is not uniformly zero or one, but rather forms a pattern of alternating areas of zeros and ones. The pattern on test system A is formed of alternating stripes of ones followed by stripes of zeros, with some intermittent stray zeros and ones in both kinds of stripes. Each stripe is exactly 64 KiB, presumably the size of one memory subunit of the bigger 64 MiB memory chips of which the entire 1 GiB RAM module is made. Halderman et al. [72] also observed such patterns. Figure 5.3 on the next page illustrates the different ground states of two different hardware configurations. From our understanding, the ground state depends not only on the RAM module used but also on the system. The first is obvious, because a RAM module can interpret its contents any way it wants, i.e., decide autonomously about a bit being active high or active low. The second is less obvious; however, a system's RAM controller is likewise at liberty to interpret bits differently. This is important for RAM transplantation, which we outline in Section 5.4.3 on page 63, in which we

56 5.3 Observations

(a) System C (b) System G

Figure 5.3: Illustration of different ground state patterns

could transplant the RAM of one system to another system only if the memory controllers are compatible. This can be asserted by checking the ground pattern of two systems with the same RAM module. If they are identical, the systems are compatible. A snapshot of the pattern can be obtained by powering the computer down for a long time so any remanence of previous RAM content is definitively gone, then taking a snapshot of the state of RAM. With this snapshot it can be analyzed whether a bit has not decayed. For example if the ground state of a bit is zero and a RAM dump from a cold boot attack finds the bit at that position to be a one it can safely be assumed that this bit has not decayed, because bits will decay to their ground level. Contrary to observations by Halderman et al. we never exhibited bits that did not decay to their ground state. However, even Halderman et al. acknowledge that the fraction of bits flipping in the opposite direction is negligible small [72]. Hence, this assumption can be used during analysis. Table 5.2 is showing the correlation between the ground state of bits, the observed state of bits and the possibility of decay. To determine whether a bit has not been decayed one can XOR the ground state with the observed state and whenever this XOR returns true the bit has not been decayed.

Ground state of bit   Observed state of bit   Has the bit decayed?
0                     0                       Maybe
0                     1                       No
1                     0                       No
1                     1                       Maybe

Table 5.2: Ground state and bit decay relationship


This information can be highly valuable when reconstructing the original memory state from a corrupted memory dump acquired via a cold boot attack. Hence, for an attack it is beneficial to also acquire the ground state pattern, just in case some bits are corrupted and reconstruction techniques must be used. Key reconstruction techniques can use this ground state within their reconstruction models [2, 72].

5.3.2 Cached data

While developing the data placer tool, we purposely did not force a write back of the cached data via the WBINVD instruction. This led to the very obvious observation that data in the cache is not vulnerable to this kind of cold boot attack. Figure 5.4 illustrates this cache effect with pictures of the Mona Lisa that were written to memory without a forced write back of the cached data. It should be noted that before the Mona Lisa pictures were written into RAM, the RAM was empty, i.e., it contained its natural ground state. In a real scenario, the black and white areas in the pictures would contain previous RAM contents.

(a) Only top portion of picture written to RAM (b) Bottom of picture still in cache (c) Single lines of picture still in cache

Figure 5.4: Observed cache effects due to missing cache write back

This observation, while completely obvious, validates the idea behind the anti-forensic encryption concept of FrozenCache [115], which tries to keep encryption keys entirely in the CPU’s caches so they cannot be extracted from RAM in a cold boot attack.

Cold boot attacks against CPU caches, which in themselves are simply SRAM, could also be possible, as SRAM also exhibits data remanence [134]. However, our observations have shown that, at least on the tested hardware, the cache is invalidated on boot, thus requiring additional tricks, possibly hardware, and technical knowledge to attack the caches directly, if it is possible in practice at all.


5.4 Results

We now present the results of our experiments. We first probe our test systems for RAM remanence. We then analyze the correlation between temperature and RAM remanence. Finally, we present the results of our RAM transplantation experiments.

5.4.1 Remanence effect

Not all systems we tested exhibit RAM remanence. On some machines, RAM is reset even on warm resets, even though all POST procedures are disabled and all fast and/or quick boot features are enabled in the system’s BIOS, as advised by Halderman et al. [72]. This was also observed by Chan et al. [21]. Whether this memory reset is done as part of fulfilling the TCG Platform Reset Attack Mitigation Specification [154] or, as suspected by Halderman et al., as a quirk of ECC-capable systems that always bring the RAM to a known state whether or not ECC RAM is actually installed, remains an open question. Table 5.3 provides an overview of the observable RAM remanence in the various systems we tested with different types of cold boot attacks. Note the difference between the DDR3 systems M to Q and the DDR1/DDR2 systems A to L.

System  DDR  RAM remanence observable after a
             warm reset  cold reboot without cooling  cold reboot with cooling
A       2    Yes
B       2    Yes         No                           Yes
C       2    Yes
D       2    Yes
E       2    Yes
F       2    Yes
G       2    Yes
H       2    No (reset)
I       2    No (reset)
J       1    Yes
K       1    No (reset)
L       1    No (reset)

M       3    Yes         No (scrambled with “signature” every 256 KiB)
N       3    Yes         No (scrambled)
O       3    Yes         No (scrambled)
P       3    Yes         No (scrambled)
Q       3    Yes         No (scrambled)

Table 5.3: List of observable RAM remanence in our test systems with different types of cold boot attacks.

Interestingly, all tested DDR3 systems maintain their entire RAM contents through warm resets, meaning that they are vulnerable to simple local warm reset attacks. However, after cold reboots, only noise patterns can be observed on all of them. These noise patterns are different each time and unrelated to the placed test data. See Figure 5.5 on the following page for noise patterns acquired on different systems. There exists one exception, however: On system M, the first four bytes of every 256 KiB block always equal 0x5a after a memory reset. This seemingly deliberately placed “signature” suggests explicit scrubbing of the memory. But since the memory is preserved on a warm reset, such deliberate scrubbing is unlikely.


(a) System M (b) System N (c) System O (d) System P (e) System Q

Figure 5.5: Scrambling patterns in DDR3 systems after a cold reboot

Further research indicated that DDR3 memory contents are scrambled to prevent parasitic semiconductor effects. Because toggling many bits in RAM from zero to one or vice versa can cause electromagnetic side effects, Intel patented a technique [105, 40] in which the data is XORed with a random data stream before being written into RAM, and again XORed with the same random stream when reading from RAM. This way, the fraction of zeros and ones in a stream is on average 50 %, reducing electromagnetic side effects. In Section 5.5.1 on page 65, we briefly summarize an attack against this scrambling as deployed by Intel.

(a) 0 sec / 100 % (b) 2 sec / 99.2 % (c) 3 sec / 93.4 % (d) 4 sec / 93.1 % (e) 5 sec / 61.4 % (f) 6 sec / 51.9 %

Figure 5.6: Mona Lisa picture series as recovered from system C’s RAM after a cold boot attack at normal operational temperature of 20 to 25 ℃ after different amounts of time. Each picture’s caption includes the percentage of correct bits that were recovered.

Figure 5.6 is a representation of the RAM remanence in system C, visualized as a sequence of Mona Lisa pictures. Figure 5.7 on the next page plots the time-related bit decay of the systems A to G and J, which exhibit RAM remanence even after a cold reboot at their normal operational temperatures. The measurements at time t = 0 represent a warm reset. The first measurement above t = 0 represents the fastest time a system could be power cycled, i.e., powered down and up again. Note that even though A uses the same RAM module as F, and B the same as G, their curves differ. This can be explained by systems F and G supplying the fan, and thus presumably also the memory controller, with power for about 0.5 seconds after the power button is pressed. Thus, the actual time without power is less than presented in the graph. This shows that a simple cold reboot attack, which is impossible on systems A and B, is possible on systems F and G, even though those systems use the same RAM modules.


5.4.2 Temperature and RAM remanence

[Plot: Correct bits (%) over Time (s) for systems A, B, C, D, E, F, G, and J]

Figure 5.7: RAM remanence of systems A to G and system J

We can conclude that cold boot attacks are feasible on most machines with DDR1 and DDR2, although the longevity of the RAM remanence varies considerably, as can be seen in Figure 5.7. We then analyzed the correlation of RAM remanence with the RAM temperature in more detail. To this end, we performed several cold boot attacks according to the experiment structure outlined in Section 5.2.4 on page 54, each with a different temperature c and a different time t. Again, the measurements at time t = 0 represent a warm reset, and the shortest measurement above t = 0 represents the fastest time that the systems can be power cycled.

Figure 5.8 on the following page plots bit decay over time at different temperatures for systems A, B, C, F, G, and J. These plots clearly show the correlation between lower temperatures and longer RAM remanence. Especially the legacy DDR1 system J shows remarkably long RAM remanence. The remanence of systems A and B, which at normal operational temperature barely exists, can be prolonged such that the systems can be power cycled and retain over 99 % and 96 % of their bit information, respectively. As mentioned before, the differences between systems A and F, and between B and G, which use the same RAM, can be attributed to the fact that systems F and G supply the memory controller with power for about 0.5 seconds beyond the power button being pushed, plus the fact that the RAM in systems F and G has a higher operational temperature than the RAM in systems A and B.

With these measurements, the temperature influence on RAM remanence can be clearly confirmed for our DDR1 and DDR2 test systems. Even a modest surface temperature drop of only 10 ℃ prolongs remanence.


[Six plots of Correct bits (%) over Time (s): (a) System A at 2 to 4, 8 to 12, and 22 to 25 ℃; (b) System B at 0 to 4, 8 to 11, and 24 to 26 ℃; (c) System C at 0 to 5, 10 to 12, and 20 to 25 ℃; (d) System F at −3 to 1, 8 to 11, and 27 to 29 ℃; (e) System G and (f) System J at their respective temperature ranges]

Figure 5.8: RAM remanence of systems A, B, C, F, G and J over time and at different temperatures. Note the different time scales. The highest temperature of each system’s measurement is its normal operational temperature.


5.4.3 RAM transplantation attacks

As is evident from the previous section, RAM remanence lasts long enough, or can be prolonged long enough via cooling, that a RAM module can be transplanted from one system to another without losing the majority of its content. In practice, we were able to transplant the RAM from system A to system B. For this, we cooled the RAM module in system A, quickly removed the RAM module from the running system, and then inserted it into system B. System B was subsequently booted and the RAM remanence extracted. We could, in various attempts, consistently recover over 98 % of all bits correctly. Our attempts became increasingly better, and our last attempts even surpassed 99 %. Table 5.4 lists the figures for our attempts. Even a seemingly failed attempt (the power button of system B was missed) resulted in 95 % of all bits being transferred correctly. Considering that Halderman et al. reconstructed an AES key within seconds when 7 % of the bits had decayed [72], and that Heninger and Shacham were able to efficiently reconstruct an RSA private key with small public exponent from only 27 % of the bits [75], even our ”failed” attempt can be considered a success. Cold boot key recovery for other ciphers such as Serpent and Twofish, as published by Albrecht and Cid [2], can also tolerate higher bit error rates than we observed. Hence, RAM transplantation is a feasible attack scenario. In another experiment, we successfully transplanted the RAM from system H, which according to Table 5.3 on page 59 resets its entire memory upon boot, into system A. This way, we could efficiently circumvent whatever mechanism causes the memory in system H to be reset. This can also be used to circumvent BIOS locks, as we stress in the next section. However, RAM transplantation is not always possible. It requires two systems with compatible memory controllers; for example, transplanting from system K to J, A to F, F to A, I to A, I to F, and H to F all failed.

Temperature [℃]  Errors [bits]  Correct bits [%]
5.9              77,645         99.537200
5.4              136,120        99.188662
3.5              157,994        99.058282
5.4              216,734        98.708165
5.2              268,176        98.401546

Table 5.4: Temperatures and errors for several cold boot attacks with RAM transplantation from system A to system B. The temperature was measured after the RAM was transplanted.

5.4.3.1 Recovering ASCII text

Even though encryption keys can be recovered despite bit errors, the question remains whether other RAM contents can be analyzed. To this end, we placed the text of Lewis Carroll’s “Alice’s Adventures in Wonderland” beside the Mona Lisa picture and the random data, to give another example of real-life data that a forensic analyst might be interested in. This children’s novel stands in for sensitive e-mails or other documents. The code outlined in Listing 5.1 on the next page was used to bring the text back into the range of printable ASCII characters in case it contained bit errors.


#include <stdio.h>

int main(void) {
    int c;
    while (c = getchar(), c != EOF) {
        c &= 0x7f; /* slight error correction here */
        if ((c < ' ' || c > '~') && (c != '\n' && c != '\r' && c != '\t'))
            c = '*';
        putchar(c);
    }
    return 0;
}

Listing 5.1: Program to replace any character outside the ASCII printable range with a star

Figure 5.9 shows the recovered text from a cold boot attack with temperature c ≈ 5.9 ℃ and time t ≈ 4 seconds, resulting in an overall rate of 99.5 % correct bits. While the text becomes quite mangled even at such a low error rate, it is still legible enough that the storyline can be followed. In the same way, the storyline of a sensitive e-mail conversation or the scheme of embezzlement arrangements could be followed without any advanced technical effort in reconstructing and fixing the text at all. Of course, finding the scattered pieces of text in paged memory and piecing them together takes a little more effort, but it is still possible using simple tools such as Unix’s strings utility to extract all printable characters from a RAM dump and then looking over them by hand.

CH*PTER I. Dmwn t‘e*Rabbip-Hole

Alice was Beginnin’ to get &ery tirad of sitting by*her siSter on The bank, and o& having nothing to do: once or twice she had paeped into the book her sister was reading,*but it had no p(ctures kr convErsations in it, ’and what is the use of a book,’ thought Alice ’withoet pictureS*or coNvdrsatIon?’

Figure 5.9: Beginning of “Alice’s Adventures in Wonderland” recovered from a cold boot attack. All non-printable characters have been replaced with a star.

Deeper analyses, such as listing the processes running on a system or performing a signature search for malware code, are, however, rather difficult, because the relevant structures do not contain any redundancy. Further, every bit within them matters: e.g., a process entry in the process list points to the next entry via an address pointer; if this address is off by only one bit, the list can no longer be followed. We have initiated research into fuzzy logic memory analysis; however, the results are still pending.

5.5 Bypassing countermeasures

We now take a look at various countermeasures to the cold boot attack that have been proposed since 2008 and discuss how they can be circumvented. Some of the outlined countermeasures already exist in practice; others have remained theoretical proposals, which we nevertheless show how to circumvent.


5.5.1 Descrambling DDR3 memory

As outlined earlier, modern DDR3 memory uses scrambling to prevent current spikes and electromagnetic interference. To this end, Intel performs memory scrambling in its DDR3 memory controllers. It can be argued that this scrambling is a countermeasure against cold boot attacks, because it seems to stop at least the straightforward cold boot attack proposed by Halderman et al. Hence, in this section we summarize the findings of Bauer et al. [10]. In this joint work, titled “Lest we forget: Cold-boot attacks on scrambled DDR3 memory”, we present an attack on scrambled DDR3 memory systems which requires 64 bytes of known plaintext per memory channel, i.e., at most 128 bytes for a dual-channel system, within the memory image in order to yield complete descrambling of the image. The attack is further refined by exploiting the mathematical relationships present within the key stream to reduce the number of known plaintext bytes to only 25, i.e., 50 bytes for a dual-channel system. In this section, we only briefly summarize the main idea of our joint publication.

5.5.1.1 Scrambling

When transmitted and written to a RAM module, biased data streams, i.e., streams with an overwhelming number of zeros or ones, can cause electromagnetic side effects, which can lead to problems: e.g., when all bits are toggled simultaneously, the resulting power spikes could cause neighboring cells to toggle as well. Unbiased streams, i.e., streams with an equal amount of zeros and ones, help to mitigate such problematic behavior. A process known as scrambling can be used to turn a biased stream into an unbiased stream. To this end, the data stream is XORed with a pseudo-random number (PRN) sequence before being transmitted. The PRN sequence removes the bias from the stream, because the resulting stream will contain, on average, an equal amount of zeros and ones. The inverse process, descrambling, is performed by XORing the data stream with the same PRN sequence used for scrambling.

[Schematic: the target system scrambles plaintext P with key K0 and stores M = P ⊕ K0 in RAM; after transplantation, the acquisition system descrambles M with key K1, acquiring the image I = M ⊕ K1]

Figure 5.10: Scrambled storage of data and image acquisition [10]

Figure 5.10 shows the schematic of a cold boot attack with scrambling involved. On top is the target system using key K0 to scramble the memory, and on the bottom is the acquisition system using key K1 for descrambling. The problem is that if keys K0 and K1 are different, the descrambled data I is different from the original plaintext data P. In case the target system and the acquisition system are two different computers and the RAM module is actually transferred between them, the keys are definitively different. In case the acquisition system and the target system are the same system and the system uses a static key, i.e., the key is not randomly regenerated on reboot, making K0 = K1, cold boot attacks can, despite the scrambling, be performed as usual. This special case was exhibited by the single test system used by Lindenlauf et al. [97]. However, even if the acquisition system and the target system are the same system, i.e., no RAM transplantation is performed, the keys can be different. This is the case for a cold reboot, i.e., resetting the system by cutting the power. On the majority of systems, this causes the scrambling key to be reinitialized, thus leading to wrong descrambling. In fact, in such a case the data is first scrambled with one key, then descrambled with another key, in principle scrambling it twice. This is one of the reasons why, even though Intel’s patents on the scrambler mechanism [105, 40] explain that a linear-feedback shift register (LFSR) is used to generate the scrambler data stream, which seems simple to attack, descrambling is a rather complex task in practice.

5.5.1.2 Descrambling via stencil attack

Because Intel’s scrambler is based on an LFSR, its key stream is periodic. This means the key stream Ki is a repeated concatenation of some subkey stream ki:

Ki = ki^x

with

|ki| = π

with π being the periodicity, i.e., length, of the subkey stream ki. According to Figure 5.10 on the preceding page, plaintext memory content P that was scrambled with key K0, then during acquisition descrambled, i.e., again scrambled, with key K1, is acquired as image I, with the following relationship between P, K0, K1 and I:

P = I ⊕ K0 ⊕ K1

Because both K0 and K1 are LFSR streams, they exhibit the above-mentioned periodicity, and hence their XOR can also be represented as a repeated concatenation of some subkey stream k01:

K0 ⊕ K1 = k01^x

also with

|k01| = π

The above relationship can thus be expressed as:

P = I ⊕ K0 ⊕ K1 = I ⊕ k01^x

The subkey stream k01 can then be recovered as follows, if a stream of length π is known within P :

k01^x = I ⊕ P

In case the known plaintext is a pattern of all zeros, i.e., 0x00, the subkey stream k01 is thus:

k01^x = I

Because the memory pattern 0x00 is the most likely pattern in RAM, the image I can be clustered into chunks of length π and searched for the most frequently occurring chunk contents. This is likely the subkey stream k01. To descramble the image I, this subkey stream can be repeatedly XORed with each π-length chunk of I, like a “stencil”. Hence, this attack is referred to as the stencil attack. On all machines we investigated, we found a 64-byte periodicity, i.e., π = 64. Hence, to perform the above-described stencil attack, we only need 64 bytes of known plaintext.

For a more in-depth description of the DDR3 memory descrambling attack, including details on how to deal with bit errors during acquisition, dual channel memory systems, and an improved attack leveraging mathematical properties in the key streams, we would like to refer to our joint publication “Lest we forget: Cold-boot attacks on scrambled DDR3 memory” [10].


5.5.2 RAM reset on boot

Clearing RAM on boot, as mandated in the TCG Platform Reset Attack Mitigation Specification [154], can stop a straightforward reboot attack. However, it cannot stop an advanced transplantation attack as presented in the previous section. From a forensic perspective, this mitigation is nonetheless troublesome because it indeed raises the difficulty of cold boot attacks. But even without transplantation, it can be possible to acquire the RAM contents. In 2013, Schramp presented his talk “RAM Memory acquisition using live-BIOS modification” [129] at the OHM conference. In his talk, he outlined how he was able to reflash the BIOS of a running system with CoreBoot and SerialICE. This way, it was possible to extract the RAM contents from the system via a serial interface after rebooting the system, without the original BIOS clearing the RAM contents on reboot. This way, also ECC memory, for which the BIOS zeroes the RAM to bring it into a known state for error correction to work, can be attacked. To this end, the new BIOS being flashed while the system is running must not initialize ECC memory on startup, but rather read the memory contents without error correction. We were able to partially verify the research by Schramp [129], i.e., we were able to replace the stock vendor BIOS of a Lenovo X230t with CoreBoot and SeaBIOS while the system was running, and hence could remove BIOS restrictions such as passwords. We were unfortunately only able to do so against a prepared system, from which the BIOS chip had already been removed and placed inside an SOP-8 socket, which was reattached to the motherboard via wires, and thus could easily be swapped. However, we were not able to actually desolder the BIOS EEPROM from the running X230t without causing the operating system to either crash1, presumably due to bus interference, or shut down due to overheating, caused by excess heat from soldering.
While the operating system crashing does, presumably, not interfere with the RAM modules being refreshed by the memory controller hub, the system resets shortly after the crash, which did not provide enough time to complete the desoldering of the old BIOS EEPROM and the resoldering of the newly prepared CoreBoot BIOS EEPROM. Nonetheless, we consider this a valid option for forensic investigators.

5.5.3 Locking the boot process

In case the boot device selection is locked, i.e., the system will only boot from HDD, the forensic investigator can place the RAM dumper tool on an HDD. Locking the whole boot process, e.g., with a BIOS master password, does not prevent the advanced transplantation attack either, because a forensic analyst can still transplant the RAM into a system over whose boot process they have full control. But like the RAM reset on boot, this practice also complicates cold boot attacks.

System  Method
A-E     CMOS Battery Removal
F-G     CMOS Battery Removal2
H-L     CMOS Battery Removal
M-Q     Jumper

Table 5.5: BIOS password circumvention methods for the various systems tested

However, on most systems, BIOS passwords should not be considered a security mechanism. Table 5.5 outlines on which systems we were able to circumvent the BIOS password via one of the following commonly known BIOS password circumvention techniques:

• Backdoor Password: BIOSes, especially on older desktop systems, often featured special passwords which could be used instead of the legitimately set BIOS password. These backdoor passwords could then be used by technicians to gain access to the BIOS for maintenance. Common backdoor passwords are, e.g., “_award”, “AWARD_PW”, “PHOENIX”, “CMOS”, “A.M.I.”, “589589”, etc., depending on the BIOS manufacturer [76].

1The X230t would output noise both through its speakers and the display. 2This only works if the stringent security password option is disabled.


• CMOS Battery Removal: On some systems, the BIOS password is stored in the CMOS, which is continuously powered via its own small coin cell battery. If the power to the system is cut and this coin cell is removed, the CMOS contents and hence the BIOS password are lost. To this end, a cold boot attack with a hard reboot, i.e., cutting the power, must be performed, and in addition the coin cell battery must be removed as well. Due to residual power, however, an attempt to boot the system must be made while the power is still cut; this ensures that the power to the CMOS is completely drained. While we were able to perform this BIOS password resetting cold boot attack, the results, with regard to bit loss, are comparable to the advanced transplantation attack. In some instances, however, we had to repeat the reset process, rendering the results far inferior to the transplantation attack. Hence, this method provides no additional benefit beyond making a compatible system for transplantation unnecessary.

• Jumper: Many modern desktop systems feature a jumper to reset the CMOS and/or bypass the BIOS password. To this end, the specific jumper on the motherboard of the system must be changed to the position used for CMOS reset and/or BIOS password bypass, as dictated by the vendor. Then the cold boot attack can be performed as outlined previously.

The BIOS flashing attack outlined above will, of course, also work here.

5.5.4 Temperature detection

A proposed countermeasure [102] is the use of temperature sensors which, in case a sudden temperature drop is detected, initiate a memory wiping process. However, we found that this method again only makes the cold boot attack harder but does not prevent it. To simulate an attack, we used system A as the victim and system B as the attacker. We powered system B up without RAM. This obviously causes the boot to fail and leaves the system in an unresponsive failure state, but the system’s memory controller is still fully functional and refreshes any RAM module inserted. The RAM module is very quickly, within under a second, removed from the running system A and inserted into system B. As this transfer is done without cooling, no temperature sensors are triggered. Once the RAM is in system B, it is refreshed by the memory controller immediately. To finish the attack, we perform a normal cold boot attack on system B, bringing it back to its normal operational state and booting it into our extraction program. Since the RAM is already out of the control of the temperature-triggering system, this does not pose a problem.

Temperature [℃]  Errors [bits]  Correct bits [%]
6.8              127,406        99.240601
8.0              1,217,038      92.745888
10.7             1,749,820      89.570260
9.9              4,408,509      73.723239

Table 5.6: Temperatures and bit errors for several transplantations from system A to system B without cooling. In the attempt with 73 % correct bits the RAM socket of system B was accidentally missed causing more loss.

Because the RAM transfer has to be performed without cooling, it causes considerable RAM decay, even though it can be performed in under one second. Nevertheless, we were able to extract 90 % of the bits correctly, even reaching 99 % in one attempt. However, the attack is very fragile, and the slightest mishap while performing the non-cooled transfer results in severe data loss. In one such instance, only 73 % of the bits could be captured correctly. Table 5.6 gives an overview of the results of various attempts. As detailed in Section 5.4.3 on page 63, available research on key reconstruction shows that, despite these elevated error rates, encryption keys can be reconstructed. Even though we demonstrate how temperature sensors can be circumvented, they pose a further obstacle because they make a phase of uncooled decay mandatory.


5.5.5 0x7c00 defense

The 0x7c00 method is another proposed countermeasure [102]. On the x86 architecture, 0x7c00 is the memory address to which an IBM-PC compatible BIOS loads the boot device’s master boot record (MBR), i.e., the first 512 bytes of a bootable device [144]. The theory behind the 0x7c00 defense is to place sensitive data, such as encryption keys, into these 512 bytes at 0x7c00, so that any reboot will overwrite them. However, the RAM module can be transplanted into a system with two memory slots, with the slot holding the lower address space in which 0x7c00 resides being filled with a dummy RAM module, and the upper slot with the victim’s RAM module. So again, this countermeasure complicates the attack, but it does not prevent it entirely. Besides that, it only offers protection for 512 bytes. The above-mentioned BIOS reflashing attack will also work, eliminating the need to transplant at all.

5.6 Limitations

As demonstrated, all proposed software solutions to the cold boot problem can be circumvented with an adaptation of the attack, because once RAM modules are removed from a system, there is no way for the system to react to the event of removal. Hence, although some solutions provide a certain level of mitigation, pure software solutions cannot be entirely effective. The only way to avoid cold boot attacks seems to be physical security or building upon RAM chips that are less affected by remanence. When an attacker, in this case a forensic analyst, is prevented from accessing a running and/or suspended system with sensitive data in RAM, or when RAM transplantation is prevented, cold boot attacks become impossible. However, this requires some kind of physical protection. For example, the former can be achieved by always turning the system off and never just locking it or leaving it in suspend, and the latter can be achieved by soldering RAM directly onto the system’s motherboard, as done in smartphones, tablets, and also some laptops today. Countermeasures that involve hardware or hardware features, such as using CPU registers to store encryption keys, are often not circumventable. Even though we were able to circumvent the scrambling of DDR3, we could not have done so if, instead of a simple LFSR-based scrambling algorithm, solid encryption, e.g., AES, had been used. Therefore, we now outline the currently limiting countermeasures to the cold boot attack.

5.6.1 CPU-bound encryption

Some defenses treat RAM as the insecure data storage that it is. These measures keep the assets that need to be protected against an attack out of RAM. Researchers have found multiple ways to store full disk encryption keys in a more secure place than RAM: AESSE [107] uses SSE registers, TRESOR [108] and TreVisor [110] use debug registers, LoopAmnesia [132] uses model-specific registers, and FrozenCache [115] uses the CPU's cache. PRIME [54] and Copker [68] are solutions to store private keys within registers. For ARM, a cold boot resistant implementation named ARMORED [62] has been developed. Of course, all these methods have their own disadvantages, such as limited register space, or the fact that preventing data from being written from cache to RAM is a tedious and impracticable process. However, we were not able to extract any data from the locations used as key storage by these systems. But note that even these systems do not prevent a cold boot attack; they merely prevent keys from entering RAM. All other information can still be extracted in a cold boot attack at any time, which renders these measures far from a complete solution. In 2012, Blass and Robertson [14] were able to attack TRESOR by injecting code into the kernel via a DMA attack. This code eventually gained access to the disk encryption key stored in the debug registers. Such attacks are, however, not possible if Intel's IOMMU is used. The IOMMU allows restricting the memory access of I/O devices, such as PCI cards. In this regard, the IOMMU acts like the classic MMU and protects the memory of the operating system's kernel from such DMA attacks.

69 5 Memory acquisition (via cold boot attacks)

5.6.2 RAM encryption

In 2010, Cryptkeeper, the first system to encrypt RAM, was published. However, it had one crucial flaw: the RAM encryption key resided in cleartext in RAM. We, therefore, extended the idea of TRESOR [108] with our work "A Bytecode Interpreter for Secure Program Execution in Untrusted Main Memory" [131], which, like TRESOR, uses registers for cold boot resistant storage. In this work, we present a bytecode interpreter that uses the registers to store its internal state as well as an AES encryption key. It fetches encrypted instructions and data as needed from RAM, decrypts and executes them inside registers, and, when needed, encrypts the data again before writing it from registers back to RAM. This way, we overcome the limitation of CPU-bound encryption of only being able to protect a disk encryption key: with this bytecode interpreter, we are able to perform Turing-complete computations with full cold boot attack resistance. In 2016, RamCrypt [63] was published. It also uses registers to store the RAM encryption key. However, unlike our bytecode interpreter, RamCrypt keeps a small set of memory pages in cleartext in RAM; hence, a carefully timed cold boot attack can still acquire the relevant data. Later, the problem of RAM encryption was addressed by special hardware [169]. Ultimately, the new Intel Software Guard Extensions (SGX) [103], a set of hardware extensions, allow code and data to be placed into so-called enclaves, which are encrypted in memory and can only be accessed by code running in the same enclave. Unlike the aforementioned academic solutions, the wide availability of SGX on Intel CPUs ultimately threatens cold boot attacks.

5.7 Conclusion and future work

This chapter provides an independent study on the practicability of cold boot attacks. We systematically recreated the practical RAM extraction procedure as presented by Halderman et al. Our empirical measurements on DDR1 and DDR2 showed the correlation between temperature and RAM remanence, demonstrating that even minor cooling of the surface temperature of a RAM module by just 10 ℃ prolongs the remanence effect notably. By providing profound documentation of our experiments, a detail that is missing in the publication by Halderman et al., we enable other researchers to better match and compare their findings to ours. We further elevated the attack to the currently used DDR3 RAM technology and showed that cold boot attacks are, also in 2016, a viable option to acquire the RAM of systems without any anti-forensic interference. Even though we outlined limitations of the cold boot attack, its availability, which far exceeds that of other memory acquisition methods, such as kernel-level acquisition methods that require root access to a system, combined with its resistance to anti-forensic interference, renders it the most favorable acquisition method available.

Because technology is permanently evolving, DDR3 has been superseded by DDR4. These new DDR4 modules, and also their memory controllers, need to be investigated. We also see future work in the field of data recovery from decayed memory images containing bit errors introduced during cold boot attacks. While this has already been done for encryption keys, it is still an open research question for other data, such as kernel structures, including, but not limited to, process lists, or other user data residing in RAM.

70 6 Essential data and anti-forensics

In his seminal work on file system forensic analysis, Brian Carrier defined the notion of essential data as "those that are needed to save and retrieve files." He argues that essential data is more trustworthy since it has to be correct in order for the user to use the file system. Because this essential data is more trustworthy, it is also more reliable and anti-forensic resistant. In many practical settings, however, it is unclear whether a specific piece of data is essential, because either file system specifications are ambiguous or the importance of a specific data field depends on the operating system that processes the file system data. We, therefore, revisit Carrier's definition and show that there are two types of essential data: strictly and partially essential. While strictly essential corresponds to Carrier's definition, partially essential refers to application-specific interpretations. We empirically show the amount of strictly and partially essential data in DOS/MBR and GPT partition systems, thereby complementing and extending Carrier's findings. We thereby also empirically show which data is more anti-forensic resistant on a specific system.

6.1 Introduction

Analysis is the interpretation of digital evidence. This interpretation is non-trivial because (at least theoretically) digital data can be manipulated perfectly. But the mere fact that evidence is digital does not mean that it cannot be trusted at all. Digital data can also be interpreted in different ways, e.g., data in memory can be interpreted as code or data. Consider, for example, a simplified partition table entry within the boot sector of a hard disk (see Figure 6.1). It consists of a reference to the first sector of the partition and the length of the partition in sectors. Furthermore, it provides an identifier for the "type" of the partition and a bootable flag to indicate whether the partition holds a bootable operating system. While it is possible to manipulate all parts of the entry using a hex editor, changes will have different consequences. For example, changing the bootable flag may only affect the behavior of the computer if it wants to boot from the partition, so changing the bootable flag from false to true may not affect the behavior of the computer at all. However, if the reference to the first sector of the partition is changed, the boot loader/BIOS cannot locate the beginning of the partition. In consequence, this means that booting from that partition or mounting it will necessarily fail. A user of that partition must set the field back to its original value, which is rather cumbersome. It can, therefore, be argued that the value of the starting LBA field is more trustworthy than the bootable flag.

Start LBA Length Type Bootable

Figure 6.1: Typical partition entry in partition table

However, in case a system relies on the bootable flag to contain a specific value in order to use the partition entry, then the starting LBA field and other fields can be used to store arbitrary data, as long as the bootable flag is set to an invalid value. This knowledge is important, because it can be used to hide data from a forensic analysis.
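
The simplified entry of Figure 6.1 maps onto a real 16-byte DOS/MBR partition record. The following sketch parses such a record; the offsets follow the legacy MBR layout, and the example entry itself is a hypothetical value of ours, not data from the experiments.

```python
import struct

def parse_mbr_entry(entry: bytes) -> dict:
    """Parse one 16-byte DOS/MBR partition table entry.

    Offsets follow the legacy MBR layout: byte 0 is the boot indicator,
    byte 4 the partition type, bytes 8-11 the start LBA and bytes 12-15
    the size in sectors (both little endian).
    """
    if len(entry) != 16:
        raise ValueError("MBR partition entries are exactly 16 bytes")
    start_lba, size = struct.unpack_from("<II", entry, 8)
    return {
        "bootable": entry[0] == 0x80,  # 0x80 = bootable, 0x00 = not bootable
        "type": entry[4],
        "start_lba": start_lba,
        "size_sectors": size,
    }

# Hypothetical example entry: bootable Linux (0x83) partition at LBA 2048,
# 409600 sectors long. The CHS fields are left zeroed for brevity.
entry = bytes([0x80, 0, 0, 0, 0x83, 0, 0, 0]) + struct.pack("<II", 2048, 409600)
info = parse_mbr_entry(entry)
```

Overwriting `start_lba` here breaks mounting for every consumer of the entry, while flipping the boot indicator may go unnoticed, which is exactly the trust asymmetry discussed above.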


The observation that some data fields can be trusted more than others was made by Carrier [18]. Carrier [18, p. 176] defines the term “essential file system data” as follows (emphasis maintained):

Essential file system data are those that are needed to save and retrieve files. Examples of this type of data include the addresses where the file content is stored, the name of a file, and the pointer from a name to a metadata structure. Non-essential file system data are those that are there for convenience but not needed for the basic functionality of saving and retrieving files. Access times and permissions are examples of this type of data. [18, p. 176]

Carrier argues that it is important to differentiate between these two types of data because essential data is more trustworthy than non-essential data: essential data needs to be correct because

[. . . ] otherwise the person who used the system would not have been able to read the data. [18, p. 12]

Consequently, Carrier classified every data field in file system data structures as being either essential or non-essential.

6.1.1 Problem statement

While the definition of essential data may be intuitively clear, there are many cases where the meaning of a field remains unclear despite the definition. One main problem is that it depends on the operating system how the data is interpreted. While one operating system may respect the bootable flag and only boot those partitions that officially have their bootable flag set, another operating system may offer to boot any of the partitions from within the boot loader. This being so, is the bootable flag then essential or non-essential? In this chapter, we revisit Carrier's notion of essential data and show how to fit application/operating system specific data into Carrier's terminology. We conducted a sequence of experiments regarding the data fields stored in the standard DOS/MBR and GPT partition systems: we systematically changed the value of a field and observed the subsequent behavior of different operating systems on that data. The resulting behavior patterns were not always the same across different operating systems, confirming the fact that essential data depends on the operating system. We argue that there are two types of essential data: strict and partial. While strictly essential corresponds to Carrier's definition, partially essential refers to application/OS specific interpretations. We empirically show the amount of strictly and partially essential data in DOS/MBR and GPT partition systems, thereby complementing and extending Carrier's findings.

6.1.2 Related work

In digital forensic science, there exist various classifications of evidence. Many of the categorizations of digital forensics are borrowed from physical evidence. For example, Lee and Harris divided evidence into two distinct groups, namely "" and "pattern evidence" [93]. They further defined seven criteria by which evidence can be categorized: the kind of crime the evidence indicates (murder, theft, etc.), the kind of material the evidence is made of (metal, plastic, etc.), the aggregation state of the evidence (solid, liquid, gaseous), the kind of forensic question answered by the evidence (contact, event reconstruction, exculpation of a defendant), how the evidence was generated (blood drips vs. whipped blood), and the specific form of the evidence (fingerprint, color, dust, dirt, blood, etc.). Kirk and Thornton defined the class of microscopic evidence [87]. Gross and Geerdes were the first to mention the perishableness of evidence, in 1985 [64]. These categorizations can, with some modifications, be applied to digital evidence as well.


As mentioned before, Carrier in 2005 defined essential and non-essential data, but he did this without recourse to traditional classifications. Dewald et al. later extended this definition to more general settings that include any type of digital evidence, even volatile data in RAM. Like Carrier's, their definitions remain informal but correspond to Carrier's terms more or less directly. They distinguish two distinct groups: technically avoidable and technically unavoidable evidence [32]. They further suggest categorizing evidence by the distance from the crime scene and by volatility [32]. In this chapter, we further extend and refine the definition of Carrier.

6.1.3 Outline

In this chapter, we revisit Carrier’s definition of essential data (in Section 6.2) and define the two new types of essential data in Section 6.3 on page 76. We evaluate our definition on DOS/MBR and GPT partition systems in Section 6.4 on page 77 and discuss the new terminology in Section 6.5 on page 81. We conclude in Section 6.6 on page 83.

6.2 Definition of essential data by Carrier and its problems

Brian Carrier coined the term essential data in his seminal work on file system forensic analysis [18]. Intuitively, essential data is data that is “needed to save and retrieve files.” [18, p. 176] While Carrier does not provide a more formal definition, his text contains multiple examples and statements that clarify the concept. The standard example for essential data is a pointer value from the metadata of a file to its content (see Figure 6.2): This pointer needs to be true, otherwise the user of the system would not be able to read the file. As an example for non-essential data Carrier mentions the last-accessed timestamp of the file. Carrier argues that

[. . . ] [i]f the time value is never updated, it will not affect the user when she tries to read from or write to the file. Therefore, we should trust the essential data more than the non-essential data because it is required and it is needed for the file system to save and restore files. [18, p. 176]

[Figure 6.2 depicts a directory entry (Name: miracle.txt, Cluster: 345, Size: 40, Last Accessed: October 27, 2004) pointing to cluster 345 ("Today, the Red Sox won the World Series.") rather than the adjacent cluster 344 ("Today, the Yankees won the World Series.").]

Figure 6.2: Example of essential and non-essential data [18, Fig. 1.4, p. 13]: the name of the file and the pointer to its content are essential, the last access time is non-essential.


6.2.1 Problem 1: Definition depends on assumed basic functionality

In a sense, essential data defines the "core" of the data structures necessary to use the "basic functionality" of a file system. The basic functionality according to Carrier apparently is to manage a set of files on a disk. The quotation at the beginning of this chapter gives examples of what this basic functionality is [18, p. 176]:

• saving (i.e., writing) a file,

• retrieving (i.e., reading) a file, and

• identifying a file with a name.

In contrast, Carrier explicitly states which data fields are solely “for convenience” and are thus non-essential data. For example, a timestamp or a user ID “is not essential because it does not need to be true” [18, p. 176] to read or write the file since

[. . . ] the OS may not have enforced the access control settings on a file; therefore, an investigator cannot conclude that a user did or did not have read access to a file. [18, p. 186]

Other examples of non-essential data are file attributes such as read-only in FAT. They are non-essential because

[. . . ] [t]he impact of these being set depends on how the OS wants to enforce them. The read only attribute should prevent a file from being written to, but I found that directories in Windows XP and 98 can have new files created in them when they are set to read only. [18, p. 227]

In other words, if the operating system can ignore the data field and still provide "basic functionality", then the data field is non-essential. If this functionality is merely reading and writing files, then we are close to Carrier's definition used throughout most parts of his book. However, when it comes to partition systems, the basic functionality is different: Carrier states that the "purpose of a partition system is to organize the layout of a volume" [18, p. 72]. This means that the only essential data are those that identify the start and end of each partition:

A partition system cannot serve its purpose if those values are corrupt or non-existent. All other fields, such as a type and description, are nonessential and could be false. [18, p. 72]

Interestingly, Carrier at one point switches to a totally different “basic functionality” while analyzing data from the Application category (such as file system journals):

In this section, we will refer to the data being essential if they are required for the application-level goal of the feature and not if they are required for storing data. [18, p. 339]

The application-level goal is not simply reading and writing files; rather, the data "are essential for the goal of providing a log of file changes" [18, p. 392]. In several other places, application-level features are used to vary the notion of essential data, e.g., in the context of the $STANDARD_INFORMATION attribute in NTFS [18, p. 282, p. 316]. So overall, Carrier's definition is relative to what basic functionality one assumes.


6.2.2 Problem 2: Definition depends on application

Carrier also observes that the basic functionality (and, therefore, the definition of essential data) may be dependent on the “application”. The term application refers to any software accessing the data on disk, but most often it refers to an operating system (OS):

Some OSes may require a certain value to be set, but that does not mean it is essential. For example, a very strict (and fictitious) OS might not mount a file system that has any files with a last access time that is set in the future. Another OS may not have a problem with the times and will mount the file system and save data to it. Windows requires that all FAT file systems start with a certain set of values even though they are used only when the file system is bootable. Linux, on the other hand, has no requirements for those values. [18, p. 176]

So the fact that a specific value needs to be set for a particular operating system to boot the system does not mean it is essential. Carrier argues as follows:

[. . . ] knowing the OS that wrote to the file system is just as important as knowing the type of file system. When discussing file recovery, it is not enough to ask how to recover a file from a FAT file system. Instead, ask how to recover a file that was deleted in Windows 98 on a FAT file system. Many OSes implement FAT file systems, and each can delete a file using different techniques. For example, most OSes do the minimal amount of work needed to delete a file, but there could be others that erase all data associated with the file. In both cases, the end result is a valid FAT file system. [18, p. 240]

So a data field is essential only if all operating systems have to use it in order to provide basic functionality.

6.2.3 Problem 3: Definition cannot deal with redundant information

When analyzing the flags in the $STANDARD_INFORMATION attribute in NTFS, Carrier makes an interesting observation that points to a third ambiguity of his notion of essential data:

Many of these flags [e.g., encrypted, hidden, read-only, etc.] are the same as were seen with FAT, and a description of them can be found there. The flags for encrypted and sparse attributes are also given in the attribute headers, so I consider them to not be essential in this location. This is debatable, though, because another person could claim that this flag is essential and the MFT entry header values are not essential. [18, p. 361]

In other words, if an important data field appears redundantly in multiple places and both fields can be used to provide basic functionality, then it is “debatable” which one is essential and which one is non-essential.


6.3 What is essential data?

Given the ambiguities of Carrier’s informal definition, we now revisit the concept of essential data and try to approach its different meanings by using a simple formal notation. As observed in Section 6.2 on page 73, the notion of essential data depends on the “basic functionality” and the “application”. Different basic functionalities can be characterized as a set of operations that are “basic” and assumed vital for the task at hand. For example, the “standard” basic functionality throughout most parts of Carrier’s book is the set

{store a file, retrieve a file},

but there may be different sets in different contexts. We define a set B of basic functionalities as

B = {b1, b2, b3,...}.

In addition to the basic functionality, the definition of essential data is relative to a set of applications A, which we denote as

A = {a1, a2, a3, . . .}.

The applications can be operating systems, database systems, or, in general, any computer system that accesses a data structure on disk that makes up a partition table, a file system, or a database. The data structure consists of several data fields. Examples of data fields have been mentioned above, e.g., the file name, pointers from metadata to file content, access timestamps, the start address of a partition in a partition table, or a magic value in a file header. It is on the level of data fields that we want to determine essential data. For simplicity, we abstract from the semantics of these data fields and simply denote by F the set of all data fields of interest. The combination of basic functionality and application uniquely defines the context in which we can precisely decide whether a data field is essential or not. The idea of our definition is to construct general notions of essential data from individual observations of how an application behaves when trying to provide some basic functionality using a specific data field. More precisely, we define a boolean evaluation function E with the following signature

E : B × A × F → {true, false}

which for a given basic functionality b, application a, and field f returns whether the field f is needed by a to correctly provide b. It is important to note that E can be evaluated by experiment, i.e., by choosing a concrete application a and observing how it behaves after manipulating some data field f. So, for example, if application a, e.g., Windows XP, can provide basic functionality b, e.g., access to file content, even if field f, e.g., the pointer to file content, is not present or malformed, then E(b, a, f) = false. In this case, we say that a evaluates negatively on f, i.e., the field f is non-essential for a to provide b. If, however, the application cannot correctly access the data, cannot access the data at all, or even stops working or crashes, then E(b, a, f) = true. In this case, we say that a evaluates positively on f, meaning that the field f is essential for application a to provide b. We now define what it means for a data field to be essential data. We distinguish between application dependence and basic functionality dependence and define two different notions of essential data. Let us fix some basic functionality b ∈ B and some data field f ∈ F and consider a fixed set A of applications.

Definition 6. We say that f is strictly essential w.r.t. A and b iff all applications in A evaluate positively on f, i.e., iff ∀a ∈ A : E(b, a, f).

The term strictly essential tries to precisely capture Carrier's definition of essential data. Because all applications require the data field, it seems logical that this data field is absolutely necessary for the data structure to function. However, it may also be the case that some applications evaluate positively on the data field and some negatively. This captures an application-dependent form of essential data: roughly speaking, the field is essential for some applications but not for others. This results in the notion of partially essential data.


Definition 7. We say that f is partially essential w.r.t. A and b iff some applications in A evaluate positively on f and some evaluate negatively. Formally, the following condition has to be satisfied:

∃a ∈ A : E(b, a, f) ∧ ∃a ∈ A : ¬E(b, a, f)

Intuitively, the term partially essential comprises data that only some applications require to correctly process the data structure. Often these are fields that are specified to be of a specific form by the data structure specification, but besides the specification stating so, there is no reason for these fields to contain the specified data. Examples of such partially essential data only required by the specification are checksums and informational fields. Finally, we define what it means for a data field to be neither strictly nor partially essential.

Definition 8. We say that f is non-essential w.r.t. A and b iff all applications in A evaluate negatively on f, i.e., iff ∀a ∈ A : ¬E(b, a, f).

The term non-essential includes all data fields that no application requires to function correctly. Examples of this are metadata, file contents, slack space, or padding. This is usually content that the user of the data structure is free to decide on (like file content), or parts which are not considered part of the data structure at all (such as slack space or padding). Note that a data field can be either strictly essential or partially essential, but not both. This corresponds to Carrier's concept of essential data since, in his view, data was either essential or non-essential; we now merely allow distinguishing application-specific types of essential data. Also, note that our definition is relative to the basic functionality. We believe that this is a defining aspect of essential data: as long as two basic functionalities are incomparable, so must be the notions of essential data for these functionalities. This is also reflected by Carrier's statements.
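
Definitions 6, 7, and 8 can be sketched directly in code. The following is our own illustrative transliteration: for one fixed basic functionality b, the experimentally determined values of E over a set of applications decide the classification. The function name and the example values are ours, not measurements from the evaluation below.

```python
# Sketch of Definitions 6-8: given the results of the evaluation function
# E(b, a, f) for a fixed basic functionality b, classify a data field f as
# strictly essential (SE), partially essential (PE), or non-essential (NE).

def classify(results: dict) -> str:
    """results maps application name -> E(b, a, f) for one field f."""
    votes = list(results.values())
    if all(votes):
        return "SE"  # every application needs the field (Definition 6)
    if any(votes):
        return "PE"  # only some applications need it (Definition 7)
    return "NE"      # no application needs it (Definition 8)

apps = ("Windows XP", "Windows 7", "Ubuntu 12.04")
assert classify({a: True for a in apps}) == "SE"
assert classify({"Windows XP": False, "Windows 7": False, "Ubuntu 12.04": True}) == "PE"
assert classify({a: False for a in apps}) == "NE"
```

The three outcomes are mutually exclusive and exhaustive, mirroring the observation that a field is exactly one of strictly, partially, or non-essential with respect to A and b.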

6.4 Evaluation

We now evaluate our definitions of essential data by fixing a basic functionality and a set of applications and then empirically running the applications on a file system in which different data fields were changed or manipulated. In doing so, we tried to systematically reproduce the findings of Carrier [18], which we were not always able to. The set of applications consisted of Windows XP, Windows 7, and Ubuntu 12.04, and we systematically evaluated the data fields of the DOS/MBR and GPT partition systems. For these parameters, we attempted to reliably reproduce the deterministic behavior of the operating systems when certain data fields were changed. In a sense, we tried to recreate the evaluation function E upon which the general notion of strictly/partially essential data is based. The analysis was performed by 19 students taking part in a course on digital forensics for computing professionals. The basic functionality in our experiment was the ability to access or mount a partition (in order to store and retrieve files). This is the default functionality in the following. However, in some cases, we observed that the operating systems behaved differently regarding other functionalities. In such cases, we qualify our results with a footnote specifying the specific (non-default) functionality in question. We present our results as tables listing the data fields in the MBR and GPT headers and partition entries, with each data field being marked as essential or not for each tested operating system. More formally, the tables show the boolean values of the evaluation function for the specific combination of functionality and application. For comparison, we also always note the classification by Carrier [18]. Based on the results of the evaluation, the last column denotes whether the data field is strictly (SE), partially (PE), or non-essential (NE).


6.4.1 DOS/MBR

Table 6.2 on the facing page gives an overview of the entries in the Master Boot Record (MBR) and their categorization. As can be seen, only the partition table entries are strictly essential, while the boot code is non-essential, unless the application is supposed to boot from the device. But even in that case, the boot code can be whatever the application decides, and no other application should be hindered from accessing the partitions by different boot code. The same is true for the disk signature and the "UEFI unknown value", which are also user controllable.

The boot code is non-essential for accessing/mounting the partition, but for another basic functionality, namely booting from the volume, it is even strictly essential. Since our default functionality is accessing the partition, we refer to the field as non-essential.

An interesting field is the boot signature field. Indeed, an OS can access the partition table, and from there the partitions, without this value, as shown by Carrier. However, in our evaluation, all three tested OSes required this value to be set. Otherwise, they refused to mount any partition, because they considered the MBR to be invalid, which according to the UEFI Specification [156] is correct. Windows XP and 7 even attempted to repair the volume. So given the evidence of Carrier and our own findings, we argue that it is partially essential.

Bytes    Data Field f         Essential [18]  Essential Linux  Essential Windows  Type
0 - 0    Boot-Indicator       No              Yes∗             No                 PE
1 - 3    CHS Start-Address    Yes§            No               No                 PE
4 - 4    Partition Type       No              Yes†             Yes†               PE
5 - 7    CHS End-Address      Yes§            No               No                 PE
8 - 11   LBA Start-Address    Yes§            Yes              Yes                SE
12 - 15  Size in LBA-Blocks   Yes§            Yes              Yes                SE

∗ Must be either 0x00 or 0x80, otherwise the entry is ignored.
§ Either CHS Addresses or LBA Addresses must be provided.
† Essential for automatically mounting the partition.

Table 6.1: MBR partition table entry data fields [156, Table 14. Legacy MBR Partition Record] and their type.

We now turn to the data fields of an MBR partition entry. As can be seen from Table 6.1, only the LBA addresses are strictly essential, because the CHS addresses are not used by the OSes. As Carrier already indicated, the CHS addresses are non-essential if the OS uses the LBA addresses and those are present and correct. With our new levels of essential data, we are able to adequately represent those instances where it depends on the application whether a field is essential or not.

To provide backwards compatibility, a storage medium partitioned via the GPT system contains a so-called protective MBR (PMBR) at LBA 0. This PMBR protects the GPT partitions in the sense that it marks the disk areas containing the GPT partitions as "used", so that legacy applications are prevented from accidentally overwriting them. To this end, the PMBR contains one partition of type EFI (0xee); the rest of the partition entries are zeroed. GPT will work without this PMBR being present, in which case, however, GPT-ignorant legacy systems may accidentally overwrite data in GPT partitions. The essential data of this PMBR with regard to the tested operating systems is the same as for the classical MBR and is hence not explicitly listed.
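
The PMBR layout just described can be checked mechanically. The following is a minimal sketch of ours (not a tool used in the evaluation): it tests a first sector for the boot signature and for exactly one EFI-type (0xee) partition entry with the remaining entries zeroed.

```python
def is_protective_mbr(sector0: bytes) -> bool:
    """Heuristic check for a GPT protective MBR in the first sector.

    Per the layout described above: a 0x55AA boot signature, exactly one
    partition entry of type 0xEE (EFI), and the remaining entries zeroed.
    """
    if len(sector0) < 512 or sector0[510:512] != b"\x55\xaa":
        return False
    entries = [sector0[446 + 16 * i: 446 + 16 * (i + 1)] for i in range(4)]
    if sum(1 for e in entries if e[4] == 0xEE) != 1:
        return False
    return all(e == b"\x00" * 16 for e in entries if e[4] != 0xEE)

# Minimal synthetic example: a zeroed sector with the boot signature and a
# single EFI entry (only the type byte of the first entry is set here).
pmbr = bytearray(512)
pmbr[510:512] = b"\x55\xaa"
pmbr[446 + 4] = 0xEE  # partition type of the first entry
```

A real protective MBR would additionally set the start LBA and size of the EFI entry to cover the GPT area; the sketch deliberately checks only the fields discussed in the text.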

Bytes                    Data Field f               Essential [18]  Essential Linux  Essential Windows XP and 7  Type
0 - 423                  Boot Code                  No              No               No                          NE
440 - 443                Unique MBR Disk Signature  No†             No               No                          NE
444 - 445                UEFI Unknown               No†             No               No                          NE
446 - 461                1st Partition Table Entry  Yes             Yes              Yes                         SE
462 - 477                2nd Partition Table Entry  Yes             Yes              Yes                         SE
478 - 493                3rd Partition Table Entry  Yes             Yes              Yes                         SE
494 - 509                4th Partition Table Entry  Yes             Yes              Yes                         SE
510 - 511                Boot Signature             No              Yes§             Yes§,∗                      PE
512 - Logical Blocksize  Reserved                   No              No               No                          NE

† Part of boot code and not explicitly defined by Carrier.
§ Otherwise, the partition is not recognized.
∗ Otherwise, Microsoft Windows wants to repair the drive containing the partition.

Table 6.2: MBR data fields [156, Table 13. Legacy MBR] and their type.

Table 6.3: GPT header data fields [156, Table 17. GPT Header] and their type. (Columns: Bytes, Data Field, Essential [18], Essential Windows 7, Essential Linux, Type. The listed data fields are the EFI Signature “EFI PART”, the Revision, the GPT Header Size in Bytes, the Header CRC32-Checksum, a Reserved (zeroed) field, the LBA of the Header (self-reference), the LBA of the GPT Backup Header, the LBAs of the first and last block of the Partition Area, the Disk GUID, the LBA of the first block of the Partition Table Entries, the Number and Size of Partition Table Entries, the CRC32-Checksum of the Partition Table Entries, and the reserved remainder of the block. Footnotes: without the EFI Signature the PMBR is used; the Number of Partition Table Entries must be greater than the partition number to be used; setting the Number of Partition Table Entries to zero leads to a Blue Screen of Death in Windows 7.)


6.4.2 GPT header

Table 6.3 on the facing page shows our empirical results for the GPT header. Because Windows XP supports neither EFI nor GPT, it was not tested and hence is not listed. As can be seen from the table, again all three types of essential data are present. The main reason for the amount of partially essential data is again that the specification requires a field while applications do not necessarily rely on it. According to the UEFI Specification [156, 5.3.2 GPT Header], the EFI Signature, Header CRC32-Checksum, LBA of Header and CRC32-Checksum of Partition Table Entries must all be valid. But as Carrier already indicates, a GPT-partitioned drive can be used without these fields being valid. An interesting fact is that the LBA of the GPT Backup Header is non-essential, even though the UEFI Specification [156, 5.3.2 GPT Header] clearly states that the GPT Backup Header must also be checked for validity. Obviously, this is not done by any of the tested operating systems.
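The Header CRC32-Checksum discussed above can be verified as follows (a Python sketch; per the UEFI specification the checksum is computed over the header with the CRC field itself zeroed, and the helper names are ours):

```python
import struct
import zlib

def gpt_header_crc_ok(header: bytes) -> bool:
    """Verify the GPT Header CRC32-Checksum (bytes 16-19): the CRC is
    computed over the header (whose size is stored at bytes 12-15)
    with the CRC field itself zeroed."""
    size = struct.unpack_from("<I", header, 12)[0]   # typically 92
    stored = struct.unpack_from("<I", header, 16)[0]
    zeroed = header[:16] + b"\x00\x00\x00\x00" + header[20:size]
    return (zlib.crc32(zeroed) & 0xFFFFFFFF) == stored

# Build a minimal 92-byte header with a correct checksum.
hdr = bytearray(92)
hdr[0:8] = b"EFI PART"                       # EFI Signature
struct.pack_into("<I", hdr, 8, 0x00010000)   # Revision 1.0
struct.pack_into("<I", hdr, 12, 92)          # GPT Header Size
struct.pack_into("<I", hdr, 16, zlib.crc32(bytes(hdr)) & 0xFFFFFFFF)
ok = gpt_header_crc_ok(bytes(hdr))

# Tampering with any covered field invalidates the checksum.
bad = bytearray(hdr)
bad[40] ^= 1                                 # flip a bit in a covered field
ok_tampered = gpt_header_crc_ok(bytes(bad))
```

As the experiments above show, an invalid checksum does not necessarily stop an OS from using the drive, so a forensic tool should report rather than reject such headers.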

6.5 Discussion

We now discuss the benefits and shortcomings of our definition.

6.5.1 Usefulness of new definitions

Strictly essential data must be relied upon, because by definition it is the minimum amount of data needed to correctly access a data structure according to its specification. If less data were required to access the data structure, this would violate our definition of strictly essential. In theory, with an infinite number of applications, this definition would exactly correspond to Carrier’s definition, because if a field is essential according to Carrier, there cannot exist an application that is able to correctly interpret the data structure without this field. If such an application a_w existed, the original premise of the field being absolutely necessary to access the data structure could be dismissed, because a_w would be a witness against the necessity of the data field. Hence, strictly essential data w.r.t. the set of all (thinkable) applications is exactly the minimal data necessary for a data structure to function correctly (with respect to a certain basic functionality, of course).

As stated earlier, partially essential data is often data that is required by the specification but not necessary at all to access the data structure. One example is the MBR Boot Signature. Even though in our tests all OSes required the signature, it is obviously possible to construct an OS that does not require it. If partially essential data is encountered during an analysis, care must be taken to consider the circumstances correctly; e.g., as can be seen from Table 6.2 on page 79, neither Windows nor the tested version of Linux recognized the partition table as valid, hence this fact could be exculpatory.

In theory, non-essential data should only comprise data fields that are by specification allowed to be completely controlled by the user of the data structure.
If an application existed that imposed a restriction on the data field content, e.g., assumed a particular value to be present, that application would violate the specification of the data structure and would, therefore, have to be considered broken. For Carrier, too, the fact that data is fully user-controllable is a good indicator of something being non-essential:

Technically, any file that an OS or an application creates could be designed as a feature in a file system. For example, Acme Software could decide that its OS would be faster if an area of the file system were reserved for an address book. Instead of saving names and addresses in a file, they would be saved to a special section of the volume. This might cause a performance improvement, but it is not essential for the file system. [18, p. 205]


6.5.2 Trust hierarchy

With our new notions of essential data, we are able to adequately represent those instances where it depends on the application whether a field is essential or not, which allows us to define a hierarchy of trustworthiness:

• Strictly essential data must be trusted. They are generally not manipulatable without impacting the correct functionality of the data structure.

• Partially essential data can be trusted for specific applications. Because not all applications require this data to be specifically formatted, it is possible that this data is manipulated.

• Non-essential data cannot be trusted. Non-essential data is in general either user content, that is data which can be freely chosen by the user of the data structure, or “wasted” space within the data structure, that serves no data storage purpose, such as slack space or padding. However, non-essential data can become essential if “higher level” basic functionalities (functionality referring to user applications like browsers or word processors) are considered.

This trust hierarchy gives an indication of how resistant a specific data field is to anti-forensics, because a field that is strictly essential cannot be manipulated without impacting the correct functionality of the application. So, in case a partition table of a hard drive actively being used by a system is analyzed, it can be assumed that the strictly essential fields could not have been manipulated by anti-forensic software. With this knowledge, anti-forensic-resistant tools can be built by making current tools rely only on strictly essential data, if possible. In case this is not possible, the tool should warn the forensic analyst when partially essential data was used to interpret the data.
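Such a tool could, for instance, validate an MBR in two tiers, hard-failing only on structural requirements and merely warning when partially essential data such as the Boot Signature looks wrong (a hypothetical sketch; function and message wording are ours):

```python
import struct

def check_mbr(sector: bytes):
    """Two-tier validation: hard-fail only on structural requirements,
    and warn (instead of rejecting) when partially essential data such
    as the Boot Signature looks wrong."""
    if len(sector) != 512:
        raise ValueError("not a 512-byte sector")
    warnings = []
    if struct.unpack_from("<H", sector, 510)[0] != 0xAA55:
        warnings.append("Boot Signature missing: some OSes would reject "
                        "this partition table, others would accept it")
    return warnings

unsigned = bytearray(512)                    # no 0xAA55 Boot Signature
warns_missing = check_mbr(bytes(unsigned))

signed = bytearray(512)
signed[510:512] = b"\x55\xaa"                # little-endian 0xAA55
warns_ok = check_mbr(bytes(signed))
```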

6.5.3 Evidence hierarchy

Besides a hierarchy of trustworthiness of data, a hierarchy of where evidence is most likely stored can be specified:

• While strictly essential data can be evidence in itself, it cannot “contain” evidence. This is because the content of strictly essential data is completely defined by the data structure and cannot be freely influenced by the user.

• Partially essential data can be used to store user content, given it is not used by an application which evaluates positively on that data field. So for partially essential data the case circumstances, i.e., the set of installed applications, must be considered in order to judge whether a data field contains user data that can be used as evidence.

• Non-essential data can be used freely to store any data. Hence, this data must, in any case, be analyzed. During a forensic investigation, even fields that the data structure specification does not explicitly tag as user storage must be analyzed for hidden data if the field is non-essential. Here it is also important to note that data fields can depend on other data fields, e.g., if for DOS/MBR the bootable flag in the partition table entry is not set to either 0x00 or 0x80 on Linux, the entry is discarded. This means the CHS and/or LBA addresses within the entry become non-essential, even though they are strictly essential in a valid partition entry.

Because non-essential data can be freely manipulated, it can be used as anti-forensic hidden storage, i.e., non-essential data fields of data structures can be used to either hide evidence or store components of a malware or rootkit. Hence, non-essential data must, in any case, be analyzed.
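As a sketch, an analysis tool could sweep the non-essential regions of an MBR sector for unexpected content. Here only the two-byte Unknown field at bytes 444-445 is swept; a real tool would cover every non-essential field, and the function name is ours:

```python
def scan_nonessential(sector: bytes):
    """Report non-zero bytes in a region no tested OS relies on -- a
    candidate hiding spot for anti-forensic storage.  Only the
    two-byte Unknown field (bytes 444-445) is swept in this sketch."""
    findings = []
    region = sector[444:446]
    if any(region):
        findings.append(("Unknown field, bytes 444-445", region.hex()))
    return findings

hidden = bytearray(512)
hidden[444:446] = b"\x13\x37"                # planted "hidden" data
found = scan_nonessential(bytes(hidden))
clean = scan_nonessential(bytes(bytearray(512)))
```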


6.6 Conclusions and future work

Carrier’s notion of essential data is important to understand the trustworthiness, and hence resilience against anti-forensic manipulation, of evidence found in file systems. In this chapter, we revisited Carrier’s definition and distinguished a “pure” notion (consistent across all possible contexts) and an “application-dependent” notion of his term. With our evaluation, we thereby followed a recommendation by Carrier to verify and test the specific behavior of applications, as stated in his book:

[. . . ] it is important for an investigator to verify and test the application-specific behavior in the context of a specific incident. [18, p. 176]

Because anti-forensic methods often use flaws within systems, these practical evaluations are very important. Even given such a simple target as partition tables, we were able to uncover discrepancies between systems with regard to how essential data is to them. An anti-forensic attacker familiar with a particular system could use those discrepancies to his advantage, e.g., by storing evidence in a data field that is supposed to be strictly essential in theory, but that on the particular system is non-essential and can thus be freely manipulated.

In future work, we need to extend our analysis to other and more complex data structures. Interesting structures are file systems, such as FAT, NTFS and EXT. Memory structures such as kernel process structures could also be evaluated. We would then, however, also need to take general notions like technically avoidable and technically unavoidable evidence [32] into account. Further research needs to be done on data fields which have a logical dependency on each other. One example is the GPT Backup Header: it is not used if a valid GPT Header is defined, which means the GPT Backup Header could be considered non-essential. If no valid GPT Header exists, the GPT Backup Header is used instead, which would make the GPT Header non-essential. However, both cannot be non-essential at the same time: at least one of them needs to be defined, making at least one, but not both, strictly essential. Ways to formalize this need to be found.


7 Virtual memory analysis (on Windows NT)

Once the evidence has been acquired, it needs to be analyzed. To this end, we address in this chapter the analysis of virtual memory, which can be applied to memory acquired via one of the previously mentioned methods. We further identify one blind spot of current state-of-the-art memory analysis methods, namely swapped-out memory pages. In order to provide more virtual memory than is actually physically present on a system, an operating system may transfer frames of memory to a pagefile on persistent storage. Current memory analysis software does not incorporate such pagefiles and thus misses important information. We, therefore, present a detailed analysis of Windows NT paging. We use dynamic gray-box analysis, in which we place known data into virtual memory and examine where it is mapped to, in either the physical memory or the pagefile, and cross-reference these findings with the Windows NT Research Kernel source code. We demonstrate how to decode the non-present page table entries and accurately reconstruct the complete virtual memory space, including non-present memory pages, on Windows NT systems using 32-bit, PAE or IA32e paging. Our analysis approach can be used to analyze other operating systems as well.

7.1 Introduction

With the increased usage of hard disk encryption, the employment of RAM disks, and persistent-data avoidance technologies such as private browsing modes, memory analysis becomes more and more important to the computer forensic process. In some cases, such as memory-resident malware or the already mentioned private browsing modes, the volatile memory can become the only source of information in an investigation. Because the physical memory of a computer system is limited and far smaller than the available persistent storage, modern operating systems provide virtual memory. In a virtual memory environment, virtual memory addresses are translated to physical memory addresses via an address translation process. To provide more virtual memory than is physically present in form of RAM modules, an operating system may swap parts of the virtual memory out to persistent storage. Hence, data in the virtual address space may not always be present in the physical main memory of a computer system, but only in the persistent storage of that system. The location of the virtual memory on the persistent storage is known as the pagefile.

7.1.1 Motivation

Previous research has already found the pagefile to be of forensic value. Software such as browsers can leave evidence in the pagefile [114]. However, current analysis methods are not adequate. Currently, the pagefile is treated as a miscellaneous data file, without considering each non-paged memory page’s position in the virtual address space. This approach can still yield valuable information, e.g., by running a keyword search over all non-paged memory pages in the pagefile [124, 111]. Searching the pagefile via a file carver can also successfully discover files [95]. However, without context, the forensic value of these findings is diminished. One example is a multi-user system, where finding data of interest in the pagefile may not be helpful to the case if it cannot be attributed to a particular user or process within the complete system. Further, a lot of data of interest is simply lost because it cannot be adequately reconstructed without putting the non-paged memory pages into context, e.g., process structures or picture files.


In general, complex data structures spanning multiple memory pages require the paging relation to be reconstructed accurately and with certainty. Additionally, to also reconstruct non-paged memory, data from the pagefile must be incorporated into the analysis. This is exactly what our work provides: it supplies current memory analysis methods with the paging context of the non-paged memory pages within the pagefile of Windows NT systems.

7.1.2 Related work

Memory analysis is already established, and there exists a vast body of work dealing with the acquisition and analysis of paged memory [162]. However, only one work [89] considers non-paged virtual memory. Related work exists for the extraction of the pagefile.sys [95]. Many works mention extracting data from the pagefile.sys via crude methods [124]. However, such methods are at the borderline of forensic soundness, because some rely on knowing specific keywords beforehand or extract data chunks without putting them into the context of the rest of the system’s processes. The incorporation of the pagefile into the memory analysis process has been proposed [82], and preliminary information has been published regarding the connection between the pagefile and non-present pages in the page table of Windows NT systems [89]. This knowledge was further supplemented by information about mapped files [160] and the Windows Virtual Address Descriptor (VAD) tree [34]. Volatility, the leading project in open source memory analysis, strives to add pagefile support as part of their project road map, but it is not implemented yet. Rekall, a project forked from the Volatility “scudette” branch known as the “Technology Preview” branch, currently developed by Michael Cohen, has very recently, during the preparation of this work, added preliminary code to their 1.2.1 release in preparation for pagefile analysis. However, no publication, information about this implementation, or evaluation with regard to its correctness existed during the preparation of our work.

7.1.3 Outline

This chapter is structured as follows: In Section 7.2 we present our gray-box analysis scheme and the tools we used to conduct this analysis. In Section 7.3 on page 89, we provide an overview of paging on Windows NT systems, the content of which was ascertained and verified via our gray-box analysis scheme. In Section 7.4 on page 95, we give a brief overview of current memory and pagefile acquisition techniques for Windows NT. In Section 7.5 on page 96, we give a brief overview of memory analysis that can be performed within the virtual address space as reconstructed by our method. In Section 7.6 on page 98, we evaluate the new combined virtual memory analysis technique incorporating the pagefile by comparing it to current analysis methods. Last but not least, we conclude this chapter in Section 7.7 on page 103.

7.2 Grey-box virtual address translation analysis

Even though work related to pagefile incorporation into Windows NT virtual memory analysis exists, as already outlined in Section 7.1.2, we developed a gray-box virtual address translation analysis method, which can be used to verify, disprove, and/or update current knowledge. This gray-box scheme can further be used to analyse yet unknown systems.

7.2.1 Scheme

The virtual address translation analysis scheme targets address translation via paging. However, it can be adapted to other translation methods, such as segmentation. The basic scheme can be seen in Figure 7.1 on the next page. The components are the Virtual Memory, the Physical Memory, the Pagefile, and the Virtual Address Translation. The Virtual Address Translation component is the target of the analysis; the goal is to infer knowledge about it. To this end, the Virtual Memory is filled with known pages. We use sequential numbers to fill the pages and call these known pages crib pages (cf. the usage of the word crib in cryptology).


The number pattern within each page allows us to uniquely identify each page and to find the corresponding physical frame in either the Physical Memory component or the Pagefile. The available knowledge about the basic workings of the Virtual Address Translation and the availability of at least the binary code of the operating system, residing in physical memory, makes this a gray-box analysis. The goal of this analysis is to provide, for any given virtual page address, the corresponding physical frame address, or the physical frame offset if the frame is in the pagefile.

Figure 7.1: Grey-box virtual address translation analysis scheme. (The Virtual Memory component holds the crib pages, which the Virtual Address Translation component maps either to frames in the Physical Memory component or to frames in the Pagefile.)

7.2.2 Test data generation

The data needed for our gray-box analysis is created by our ramwrite tool. It is platform-independent, portable and relies only on ISO C89 features. ramwrite allocates a specific amount of memory via malloc(). It allocates one page more than is requested so that the crib data can be aligned to a page boundary. The memory space from the address returned by malloc() to the next page boundary is filled with 0xc001babe to distinguish it from the beginning of the crib data and also potentially from any other data. The ramwrite process stays open until user input is provided; this way the memory allocation can persist as long as needed. ramwrite supports different patterns it can write. The most useful are:

• addr: Writes a sequence of 32-bit numbers into the allocated space, starting with 0 at the start of the allocation.

• zero: Overwrites the allocation with zero bits; useful for “cleaning” the memory before tests.

• one: Like zero, but writes all one bits.

• deadbeef: Fills the allocation with the 32-bit value 0xdeadbeef; useful as a second distinguishable pattern.
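The addr pattern can be illustrated as follows (a Python sketch of the C tool’s pattern logic; the function name is ours). Because the 32-bit counters run consecutively across page boundaries, every 4 KiB page starts with a value unique to that page:

```python
import struct

PAGE = 4096

def make_crib_pages(num_pages: int) -> bytes:
    """The 'addr' pattern: consecutive 32-bit numbers starting at 0,
    so every 4 KiB page begins with a value unique to that page
    (page i starts with the value i * 1024)."""
    words = num_pages * (PAGE // 4)
    return b"".join(struct.pack("<I", i) for i in range(words))

crib = make_crib_pages(4)
# The third page (index 2) starts with 2 * 1024 = 2048.
first_word_page2 = struct.unpack_from("<I", crib, 2 * PAGE)[0]
```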

ramwrite is duplicated and renamed swapforcer. swapforcer is used to force the memory allocation of ramwrite out of physical memory into the pagefile by making a large allocation. swapforcer fills its memory allocation with 0xdeadbeef. Depending on how much of the ramwrite memory content should be written to the pagefile, swapforcer can also allocate the complete physical memory. With these two programs, the needed crib data can be placed into physical memory and the pagefile as follows:

• ramwrite is executed and fills its memory allocation with sequential 32-bit numbers (addr pattern).

• swapforcer is executed and fills its memory allocation with a distinguishable pattern (0xdeadbeef pattern).

• By varying the size of the memory allocation of swapforcer, the crib data distribution between physical memory and pagefile can be controlled. The more memory swapforcer allocates, the more frames of the ramwrite process are swapped into the pagefile.

7.2.3 Inferring the virtual address translation

Once ramwrite and swapforcer are executed on a system, the physical memory and pagefile can be acquired. In Section 7.4 on page 95 we will go into detail on how this can be done on Windows NT. The analysis then consists of

1. figuring out the mapping of virtual crib page addresses to physical crib frame addresses and offsets, and

2. inferring how the operating system “remembers” this mapping.

The first step is trivial. A simple program find_addr was created that searches the input at the specified offset for a crib page starting with the 32-bit value given by the parameter. The second step is more complicated and has, in some parts, to be done via manual reverse engineering:

1. The hardware dependent address translation must be interpreted.

2. The software dependent address translation must be interpreted.
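The first step, locating the crib pages in an acquired dump, can be sketched as follows (a simplified Python stand-in for the find_addr program; the page-matching heuristic and names are ours):

```python
import struct

PAGE = 4096

def find_crib_frames(dump: bytes):
    """Map each crib page (identified by its leading 32-bit sequence
    number) to the offset of the frame holding it.  Crib page i starts
    with i * 1024; the second word is checked too, to reduce false
    positives."""
    mapping = {}
    for off in range(0, len(dump) - PAGE + 1, PAGE):
        first = struct.unpack_from("<I", dump, off)[0]
        second = struct.unpack_from("<I", dump, off + 4)[0]
        if first % 1024 == 0 and second == first + 1:
            mapping[first // 1024] = off
    return mapping

def crib_page(i: int) -> bytes:
    return b"".join(struct.pack("<I", i * 1024 + j) for j in range(PAGE // 4))

# A toy "dump" in which the two crib pages ended up in swapped order.
dump = crib_page(1) + crib_page(0)
frames = find_crib_frames(dump)
```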

The hardware dependent address translation is specified by the hardware manufacturer. Address translation is done via paging tables. For this, the operating system has to store the base address of the root table somewhere. In Section 7.5.1 on page 96, we will detail how this base address can be found on Windows NT systems. Once this base address is extracted, the paging table tree can be traversed. After the hardware dependent address translation part is implemented in the memory analysis process, it is recommended to test it with memory dumps of the ramwrite program without any crib data being forced into the pagefile. The memory analysis process should be able to perfectly reconstruct the crib data, i.e., bring each crib frame into the correct order. For the software dependent part, a trial-and-error process is used. We currently assume the correlation between virtual addresses and offsets in the pagefile can be inferred from the page table entry itself. At least this is true for Windows NT and Linux (cf. swp_entry_t and pte_to_swp_entry() in the Linux source code). If a different system does not use the page table entries to store this information, the relevant data structures must be found within the operating system data structures. Here the locations, i.e., addresses and offsets, of the crib pages and crib frames can be used to verify found data structures. Even though, theoretically, an exhaustive search for crib frame addresses and offsets could be used, the number of possible candidates for operating system data structures could be overwhelming. Our findings so far, however, indicate that the crib frame offset into the pagefile can be determined by appropriately shifting and masking the page table entry value.
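The trial-and-error search for such a shift-and-mask encoding can be sketched like this (the candidate loop and the sample PTE values are made up for illustration; real samples come from the crib experiment):

```python
def infer_offset_encoding(samples):
    """Brute-force how a pagefile offset might be packed into a PTE:
    try candidate right-shifts, recover the byte offset as
    ((pte >> shift) << 12), and keep every shift that explains all
    (pte, known_offset) pairs from the crib experiment."""
    hits = []
    for shift in range(0, 33):
        if all(((pte >> shift) << 12) == off for pte, off in samples):
            hits.append(shift)
    return hits

# Hypothetical samples with the offset stored in the upper 32 bits.
samples = [(0x0000000500000000, 0x5000), (0x0000001300000000, 0x13000)]
shifts = infer_offset_encoding(samples)
```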


7.3 Windows NT x86 and x64 virtual memory overview

In this section, we give an overview of virtual memory translation for Windows NT x86 and x64 with 32-bit, PAE and IA32e Paging. The information herein was derived via the techniques described in the previous section and was further extended and verified via cross-referencing the Windows NT Research Kernel source code [104].

7.3.1 Pagefile

Before the paging overview, we briefly explain the pagefile of Windows NT. The default pagefile on Windows NT systems is called pagefile.sys. By default it is stored in the root directory of the file system the operating system is installed on, i.e., %SystemRoot%\pagefile.sys. However, up to 16 pagefiles can exist. Their names, locations, and minimum and maximum sizes in MiB are specified in the registry strings stored in the entry shown in Listing 7.1.

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management\PagingFiles

Listing 7.1: Pagefile registry entry

Each string entry has the form outlined in Listing 7.2.

Listing 7.2: Pagefile registry entry format

The pagefiles are numbered from 0 to 15, beginning with the first entry in the PagingFiles registry entry. At runtime, the Windows NT kernel keeps a _MMPAGING_FILE structure for each pagefile. The most relevant fields are outlined in Listing 7.3, with the offsets referring to a Windows 7 x86 with kernel build version 7600.

lkd> dt nt!_MMPAGING_FILE
   +0x018 File           : Ptr32 _FILE_OBJECT
   +0x024 PageFileName   : _UNICODE_STRING
   +0x040 PageFileNumber : Pos 0, 4 Bits
   +0x044 FileHandle     : Ptr32 Void

Listing 7.3: WinDbg displaying memory layout of _MMPAGING_FILE structure

This way the pagefile number and the corresponding path and name of the file can be derived from the physical memory of a system. However, in order to be of any use, the corresponding pagefiles must have been acquired alongside the physical memory, at which point the pagefile numbers, paths, and names should already be known to the analyst who acquired the pagefiles in the first place. In any case, multiple pagefiles are rare, because they must be explicitly configured and are, to our knowledge, not generated by Windows NT under default settings. On a live system, the pagefile itself is locked by the kernel. However, in Section 7.4.2 on page 95 we will detail how the file can still be acquired on a live system. The pagefile’s content is unstructured: raw memory frames are stored in the file as if it were physical memory. Each frame is referenced by an offset. The next section explains the address translation between virtual memory page addresses and pagefile frame offsets.


7.3.2 Page table entries

Windows NT systems running on x86-based hardware will use PAE Paging [80, 4.4. PAE PAGING] if available; if it is not available, they will use 32-bit Paging [80, 4.3. 32 BIT PAGING]. Windows NT systems running on x64-based hardware will use IA32e Paging [80, 4.5. IA32e PAGING]. In the following sections, we will give an overview of the page table entries (PTEs). PTEs can be divided into two categories: hardware PTEs and software PTEs. Hardware PTEs are all PTEs which have the present bit set. If this bit is set, the interpretation of the values within the PTE is done strictly by the hardware. PTEs marked as not present are software PTEs. Software PTEs are interpreted strictly by software, in this case the Windows NT kernel. Later we will go into detail on how Windows NT interprets the PTEs and translates virtual addresses. We will now give an overview of the bit layout of possible PTEs.
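The hardware/software split hinges on a single bit; a minimal classifier (the function name and example values are ours):

```python
def classify_pte(pte: int) -> str:
    """Bit 0 is the present bit: if set, the entry is interpreted by
    the MMU (hardware PTE); if clear, its contents are left entirely
    to the operating system (software PTE)."""
    return "hardware" if pte & 1 else "software"

present_pte = classify_pte(0x12345003)   # valid mapping, flag bits set
swapped_pte = classify_pte(0x00AB4400)   # present bit clear
```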

7.3.2.1 32-bit Paging

Figure 7.2 lists the hardware PTEs that we encountered during our analysis of Windows NT running on x86-based hardware without PAE Paging. Their definition is given strictly as per the Intel specification [80, Figure 4-4.]. Ignored fields are grayed out; they are not interpreted by the hardware and can thus be considered software fields. The bits at offsets 12 to 31 give the 4 KiB address of either a further table or the start of the physical memory frame.

Figure 7.2: 32-bit Paging PTE structures. (Bit layouts of CR3, the PDE pointing to a page table, and the PTE mapping a 4 KiB frame, with flag bits such as PWT, PCD, Accessed and Global, and ignored fields grayed out.)

Windows NT does not seem to use large (4 MiB) frames with 32-bit Paging. We neither encountered such frames during our analysis, nor does the Windows Research Kernel source code make any reference to large frames for 32-bit Paging [104, mi386.h].

Figure 7.3: Windows NT 32-bit Paging structures. (Bit layouts of the MMPTE_HARDWARE entry, with fields such as Valid, Dirty, Accessed, Prototype, LargePage and PageFrameNumber, and of the MMPTE_SOFTWARE entry, with the Valid, PageFileLow, Protection, Prototype, Transition and PageFileHigh fields.)


Figure 7.3 on the preceding page lists the software PTEs of Windows NT with regard to 32-bit Paging. The definition of the MMPTE_SOFTWARE PTE was taken from the Windows Research Kernel source code [104, mi386.h:2446], as was the MMPTE_HARDWARE PTE definition [104, mi386.h:2508]. The PageFileLow field gives the pagefile number; as discussed earlier, there can be up to 16 pagefiles. The PageFileHigh field gives the offset into the pagefile. Like in the hardware PTEs, this offset is a 4 KiB offset. Hence the byte offset can be obtained by simply masking out the last 12 bits of the PTE. Not only memory frames can reside in the pagefile: PTEs themselves can reside in the pagefile as well. These findings are the same as presented in other research [89, Figure 3].
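Under the field layout just described (and assuming, per the WRK definition, that PageFileLow occupies bits 1-4), a non-present 32-bit PTE can be decoded as follows (function name and example value are ours):

```python
def decode_swap_pte32(pte: int):
    """Decode a non-present 32-bit MMPTE_SOFTWARE entry: PageFileLow
    (assumed at bits 1-4 per the WRK layout) is the pagefile number,
    and the byte offset into that pagefile is the PTE with its low
    12 bits masked off (PageFileHigh occupies bits 12-31)."""
    assert pte & 1 == 0, "present bit set: hardware PTE"
    pagefile_number = (pte >> 1) & 0xF
    byte_offset = pte & 0xFFFFF000
    return pagefile_number, byte_offset

# A hypothetical entry: pagefile 2, frame at byte offset 0x5000.
pf, off = decode_swap_pte32(0x00005000 | (2 << 1))
```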

7.3.2.2 PAE Paging

Figure 7.4 lists the hardware PTEs that we encountered during our analysis of Windows NT running on x86-based hardware with PAE Paging enabled. Their definition is given strictly as per the Intel specification [80, Figure 4-7.]. Ignored fields are again grayed out and can be considered software fields. Further, we also grayed out reserved fields, which according to Intel’s specification have to be zeroed [80, p. 4-18]. For all but the Page Directory Entry pointing to a 2 MiB frame (PDE (2 MiB)), the bits at offsets 12 to 62 give the 4 KiB address of either a further table or the start of the physical memory frame. For the PDE (2 MiB), bits 21 to 62 give the 2 MiB address of the physical 2 MiB frame.

Figure 7.4: x86 PAE Paging structures. (Bit layouts of CR3, the PDPTE pointing to a page directory, the PDE pointing to a page table, the PDE mapping a 2 MiB frame, and the PTE mapping a 4 KiB frame, with flag bits such as R/W, U/S, PWT, PCD, Accessed, Dirty, PAT, Global and XD, and ignored and reserved fields grayed out.)

Again, Windows NT does not seem to use 2 MiB frames; at least we did not encounter such frames during our analysis. However, the Windows Research Kernel source code references 2 MiB frames for PAE and IA32e Paging [104, miamd.h:2685]. The software PTEs for PAE Paging seem to be virtually identical to the IA32e ones, which we detail in the next section.


7.3.2.3 IA32e Paging

Figure 7.5 lists the hardware PTEs that we encountered during our analysis of Windows NT running on x64-based hardware using IA32e Paging. Their definition is given strictly as per the Intel specification [80, Figure 4-11.]. The IA32e PTEs are virtually identical to the PAE PTEs. The only difference is that the bits at offsets 52 to 62 are ignored.

Figure 7.5: x64 IA32e Paging structures. (Bit layouts of CR3, the PML4E, the PDPTE pointing to a page directory, the PDE pointing to a page table, the PDE mapping a 2 MiB frame, and the PTE mapping a 4 KiB frame, with flag bits such as R/W, U/S, PWT, PCD, Accessed, Dirty, PAT, Global and XD.)

Again, Windows NT does not seem to utilize the 1 GiB or the 2 MiB frames. As stated before, however, code for 2 MiB frames is defined in the Windows Research Kernel as the MMPTE_HARDWARE_LARGEPAGE software PTE structure [104, miamd.h:2685].

Figure 7.6 on the facing page lists the software PTEs of Windows NT with regard to IA32e Paging and, as already explained above, PAE Paging. The definition of the MMPTE_SOFTWARE, MMPTE_HARDWARE and MMPTE_HARDWARE_LARGEPAGE PTE was taken from the Windows Research Kernel source code [104, miamd.h].

As before, the PageFileLow field gives the pagefile number. The PageFileHigh field gives the offset into the pagefile. Like for the hardware PTEs, this offset is a 4 KiB offset. Hence, the byte offset can be obtained by shifting the value of the PTE by 21 bits and then simply masking the last 12 bits. As before, not only memory frames can reside in the pagefile, but also PTEs themselves.


[Figure 7.6 shows the bit-field layouts of the MMPTE_HARDWARE_LARGEPAGE, MMPTE_HARDWARE, and MMPTE_SOFTWARE structures, the latter with the fields Valid, PageFileLow, Protection, Prototype, Transition, UsedPageTableEntries, Reserved, and PageFileHigh.]

Figure 7.6: Windows NT x64 paging structures

7.3.3 Virtual address translation

After the overview of available PTEs we now explain how they are used in Windows NT to translate virtual to physical addresses.

7.3.3.1 Hardware

All hardware PTEs, i.e., PTEs that have the valid bit set, are interpreted by the hardware. For a reference on how this is done, we recommend the Intel specification [80] which details the 32-bit Paging [80, Figure 4-2.], PAE Paging [80, Figure 4-5.] and IA32e Paging [80, Figure 4-8.] address translation process. In any case, if the valid bit is set, the interpretation must strictly follow the hardware specification.
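To make the hardware walk concrete, the 4-level IA32e translation can be sketched as follows. This is a minimal illustration, not code from our tools: the field positions follow the Intel manual cited above, while the `read_pte` helper and the flat `mem` buffer standing in for a physical memory dump are assumptions made for the example.

```python
import struct

# Minimal sketch of the IA32e (4-level) hardware address translation.
# ADDR_MASK selects bits 12..51, the physical address field of an entry.
ADDR_MASK = 0x000FFFFFFFFFF000
VALID = 1 << 0   # present bit; if clear, the entry is a software PTE
LARGE = 1 << 7   # page-size bit; a 2 MiB frame at the PD level

def read_pte(mem, phys):
    """Read one 8-byte little-endian entry from the memory dump."""
    return struct.unpack_from("<Q", mem, phys)[0]

def translate_ia32e(mem, cr3, vaddr):
    """Walk PML4 -> PDPT -> PD -> PT and return the physical address."""
    table = cr3 & ADDR_MASK
    # Index bits of the virtual address: PML4 47:39, PDPT 38:30,
    # PD 29:21, PT 20:12.
    for level, shift in enumerate((39, 30, 21, 12)):
        pte = read_pte(mem, table + 8 * ((vaddr >> shift) & 0x1FF))
        if not pte & VALID:
            raise LookupError("non-present: interpret as software PTE")
        if level == 2 and pte & LARGE:          # PDE mapping a 2 MiB frame
            return (pte & ADDR_MASK & ~0x1FFFFF) | (vaddr & 0x1FFFFF)
        table = pte & ADDR_MASK
    return table | (vaddr & 0xFFF)              # 4 KiB frame plus offset
```

A non-present entry is deliberately not an error case in a real analysis tool; it is exactly where the software PTE interpretation of the next section takes over.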

7.3.3.2 Software

The software PTEs can be divided into different types: Demand Zero, Pagefile, Transition, and Prototype PTEs. All Software PTEs are hardware and paging mode independent. Only their corresponding fields, namely PageFileHigh, may reside at different bit offsets within the PTE. However, they are interpreted the same for every paging mode, as the memory management code of the NT kernel responsible for these software PTEs is already abstracted from the hardware PTEs [104, pagfault.c].

Demand Zero

[Figure 7.7 shows a PTE with the Valid, Transition, Prototype, PageFileLow, and PageFileHigh fields all set to zero.]

Figure 7.7: Windows NT Demand Zero PTE

A PTE with the Valid, Transition, Prototype, PageFileLow and PageFileHigh fields set to zero [104, miamd.h:2385 and mi386.h:2225] is a so-called Demand Zero PTE. It can be seen in Figure 7.7. Any request to this virtual address should be satisfied by a memory frame that is filled with zeros [104, MiResolveDemandZeroFault()].

93 7 Virtual memory analysis (on Windows NT)

Pagefile

[Figure 7.8 shows a PTE with the Valid, Prototype, and Transition bits set to zero, the PageFileLow field giving the pagefile number, and a non-zero PageFileHigh field.]

Figure 7.8: Windows NT Pagefile PTE

The Pagefile PTEs, as seen in Figure 7.8, or MMPTE_SOFTWARE PTEs, as they are referred to in the Windows Research Kernel source code, are the most interesting to this work. If the Valid, Prototype and Transition bits are zero and the PageFileHigh field is not zero the PTE references the pagefile [104, MiResolvePageFileFault()] given by PageFileLow. The offset into the pagefile is given by PageFileHigh. Pagefile PTEs can also reference other PTEs, in which case the PTE that is referenced is loaded from the pagefile. If a Pagefile PTE is encountered during analysis in a place where a PDPTE (PDE), PDE (PT) or PTE (4 KiB) page table is expected the PageFileHigh field gives the offset to the relevant paging table structure within the pagefile. The index in this paging structure is taken from the virtual address bits the same way as for the hardware address translation. Note that because Windows NT does not use 1 GiB nor 2 MiB paging structures the 7th bit of the Pagefile PTE must be ignored. If bit 7 is one this does not indicate a large page. A PTE loaded from the pagefile can then be interpreted like a PTE loaded from a physical memory frame.

Transition

[Figure 7.9 shows a PTE with the Valid and Prototype bits set to zero and the Transition bit set to one.]

Figure 7.9: Windows NT Transition PTE

A Transition PTE, as seen in Figure 7.9, is a PTE with the Valid and Prototype fields set to zero, but the Transition field set to one [104, MiResolveTransitionFault()]. This PTE is used to mark a page as being in transition, i.e., in the process of being paged out to the pagefile. The content of a page marked as being in transition, hence, still resides in the physical memory frame the PTE points to. As with Pagefile PTEs, Transition PTEs can also reference other PTEs, which can be interpreted normally.

Prototype

[Figure 7.10 shows a PTE with the Valid bit set to zero and the Prototype bit set to one.]

Figure 7.10: Windows NT Prototype PTE

Prototype PTEs, as depicted in Figure 7.10, are used to facilitate shared memory and mapped files [104, MiResolveMappedFileFault()]. They have the Valid bit set to zero and the Prototype bit set to one [104, MiResolveProtoPteFault()]. Even though these Prototype PTEs also constitute parts of the virtual address space, we currently do not resolve them, as we focus on reconstructing the pagefile.
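The decision procedure of this section can be condensed into a small classifier. This is a hedged sketch, not code from our tools: the bit positions assume the x64 MMPTE_SOFTWARE layout of the Windows Research Kernel (Valid at bit 0, PageFileLow at bits 1 to 4, Prototype at bit 10, Transition at bit 11, PageFileHigh at bits 32 to 63); other paging modes place these fields at different offsets.

```python
# Hedged sketch of the software-PTE classification of Section 7.3.3.2,
# assuming the x64 MMPTE bit layout from the Windows Research Kernel.
VALID      = 1 << 0
PROTOTYPE  = 1 << 10
TRANSITION = 1 << 11

def classify_pte(pte):
    """Classify a 64-bit PTE; returns a (kind, info) tuple."""
    if pte & VALID:
        return ("hardware", None)            # interpret per the Intel spec
    if pte & PROTOTYPE:
        return ("prototype", None)           # shared memory / mapped file
    if pte & TRANSITION:
        frame = (pte >> 12) & 0xFFFFFFFFF    # content still in this frame
        return ("transition", frame)
    page_file_high = pte >> 32               # 4 KiB offset into the pagefile
    if page_file_high == 0:
        return ("demand_zero", None)         # satisfy with a zeroed frame
    page_file_low = (pte >> 1) & 0xF         # pagefile number
    return ("pagefile", (page_file_low, page_file_high * 0x1000))
```

The returned pagefile byte offset is the 4 KiB PageFileHigh value multiplied by the page size; whether an entry found in place of a page table must then itself be loaded from the pagefile, as described above, is left to the caller.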


7.4 Acquisition

In this section, we provide a brief overview of the available acquisition methods with regard to physical memory and pagefile. We will do so by summarizing existing research. We start with physical memory acquisition.

7.4.1 Memory

Acquisition of the physical memory was already outlined in Section 4.4.4 on page 42. However, one method was not outlined before: Windows crash dumps.

7.4.1.1 Crash dumps

Crash dumps are a good way to obtain the system memory. Provided the system is configured appropriately, they can be triggered via keyboard input or by crashing the Windows NT kernel. As of writing, a bug in Windows 7 allows the system to be crashed via a GPT partitioned storage medium with the number of possible partition entries in the GPT header set to zero. This causes a division by zero in the kernel and, with the default system configuration, a crash dump is written to disk. Because the operating system is stopped during the crash dump procedure, the method has high atomicity [162, Fig. 5]. The problem with this method, however, is the location on disk the crash dump is written to. It is written to the pagefile, destroying it as a possible information source for virtual memory analysis. Another problem is caused by encryption. If the file system is encrypted, the crash dump used to be written to disk in cleartext. However, starting with Service Pack 1 (SP1) of Windows 7, disk encryption software can implement a Crash Dump Filter Driver, which allows encrypting the contents of crash dumps as well as the hibernation file. This renders the method useless if software disk encryption is used.

7.4.2 Pagefile

In this section, we outline how the pagefile of Windows NT can be acquired either on a live system or via dead analysis.

7.4.2.1 Dead acquisition

Acquiring the pagefile.sys via dead acquisition is trivial. If no disk encryption is used, we can simply shut down the computer or even just unplug the storage device the pagefile.sys is stored on from the target system. Next, we can mount the file system on the storage device and copy the pagefile.sys. In case disk encryption is used, we must first obtain the encryption keys via a cold boot attack, a DMA attack, or any other available means. Note that crash dumps can not be used for key recovery, as outlined in Section 7.4.1.1. Also, key recovery via software executed on the target system should be avoided, because in such cases the pagefile.sys can be acquired live, as explained in the next section. The recovered encryption keys can be used to circumvent the disk encryption [72, 99]. In case self-encrypting disks (SEDs) are used, a warm-replug attack [109] can be used. If disk encryption is used and a DMA attack is not available, or SEDs are used, care must be taken not to lose the pagefile.sys, because neither the warm-replug attack on SEDs nor the cold boot attack is reversible. Hence, in such cases, a live acquisition is preferable, if available.


7.4.2.2 Live acquisition

Acquiring the pagefile on a live system is more complicated because, as already outlined in Section 7.3.1 on page 89, the Windows NT pagefile is locked by the kernel against any ordinary access during runtime. However, the pagefile can be acquired from the so-called Win32 Device Namespace. Listing 7.4 outlines the extraction of the pagefile.sys onto a removable drive mounted on volume E using the Sleuthkit’s ifind and icat tools to access the C volume’s device namespace \\.\c:. In this example, 11163 is the $MFT entry number of the pagefile.sys.

C:\>ifind.exe -n /pagefile.sys \\.\c:
11163
C:\>icat.exe \\.\c: 11163 > E:\pagefile.sys

Listing 7.4: Extracting the pagefile.sys with the Sleuthkit

Tools which are able to automatically copy the pagefile of a running system include Disk Explorer, Forensic Toolkit, WinHex, the Pagefile Collection Toolkit (PCT) [95], ntfscopy, icat, and FGET (Forensic Get by HBGary Inc.), to name a few.

7.5 Analysis

Once a physical memory dump and corresponding pagefile are acquired, the process of reconstruction can begin. In order to reconstruct the virtual address space of a process, the process structures, namely the Directory Table Base, i.e., the pointer to the process’ page table root, must be found. After this, the virtual address space can be reconstructed by implementing the virtual address translations as outlined in Section 7.3.3 on page 93. To this end, we implemented two kinds of tools. First, a tool that finds the _EPROCESS structures and extracts the DirectoryTableBase from them by carving the physical memory dump. Second, a tool that, given a physical memory dump, a pagefile copy, and a DirectoryTableBase value, reconstructs the virtual address space defined by the paging structure.

7.5.1 Finding DirectoryTableBase

The address of the root table of the paging structure governing the virtual address space of a process is stored in the process’ _EPROCESS structure in a variable called DirectoryTableBase. To find this variable, the _EPROCESS structures of the various Windows NT kernel versions can be carved from physical memory via a signature. _EPROCESS structures are, to our knowledge, not swapped out to the pagefile. Listing 7.5 shows the eight fields we found to be enough to build a robust signature by employing the constraints outlined in Listing 7.6 on the next page.

lkd> dt nt!_EPROCESS
  +0x000 Pcb                : _KPROCESS
    +0x000 Header           : _DISPATCHER_HEADER
      +0x000 Type           : UChar
      +0x002 Size           : UChar
      +0x003 Reserved2      : Pos 2, 4 Bits
  +0x028 DirectoryTableBase : Uint8B
  +0x030 ThreadListHead     : _LIST_ENTRY
    +0x000 Flink            : Ptr64 _LIST_ENTRY
    +0x008 Blink            : Ptr64 _LIST_ENTRY
  +0x180 UniqueProcessId   : Ptr64 Void
  +0x2e0 ImageFileName     : [15] UChar

Listing 7.5: _EPROCESS layout as per WinDbg

In Listing 7.6 on the facing page, IS_PRINTABLE_ASCII_STRING() denotes a function that tests whether the string is composed of only printable ASCII characters, and KERNEL_ADDRESS denotes the start of the kernel space. This is 0x80000000 for 32-bit and 0x80000000000 for 64-bit systems. Further, DTB_ALIGNMENT is 0x1000, except for PAE systems, where it is 0x20. The value for SIZE is Windows NT version and build dependent. Size values for some systems are given in Table 7.1 on the next page.


ThreadListHead.Flink >= KERNEL_ADDRESS
ThreadListHead.Blink >= KERNEL_ADDRESS
DirectoryTableBase != 0
DirectoryTableBase % DTB_ALIGNMENT == 0
Type == 0x03
Size == SIZE
Flags == 0x00
Reserved2 == 0x00
IS_PRINTABLE_ASCII_STRING(ImageFileName)

Listing 7.6: _EPROCESS signature constraints

Windows NT Version      SIZE
Windows 7 x86 7600      0x26
Windows 7 x64 7600      0x58
Windows 8.1 x86 9600    0x28
Windows 8.1 x64 9600    0xb2
Windows 10 x64 9841     0xb4

Table 7.1: Size value for signature.

The offsets of the quoted fields can be obtained via WinDbg by executing the commands outlined in Listing 7.7.

.sympath srv*
.reload
dt nt!_EPROCESS
dt nt!_KPROCESS
dt nt!_DISPATCHER_HEADER
dt nt!_LIST_ENTRY

Listing 7.7: Obtaining offset information for signature via WinDbg

Please note that this signature may not be anti-forensic resistant, i.e., it may rely on values such as the Size field, which is not used by the operating system and hence can be overwritten with other values by a rootkit or malware [35]. We used this signature for our evaluation only. We refer to Dolan-Gavitt et al. [35] and Lee et al. [94] for anti-forensic resistant signature construction.
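For illustration, the signature scan can be sketched as follows. This is a hedged example, not our evaluation tool: it hardcodes the x64 field offsets from Listing 7.5 and the Windows 8.1 x64 SIZE value from Table 7.1, omits the Flags and Reserved2 checks, and the 0x10 scan step is an assumption about pool alignment.

```python
import struct

# Hedged sketch of carving _EPROCESS candidates with the signature of
# Listing 7.6, using the x64 offsets of Listing 7.5 (build dependent).
KERNEL_ADDRESS = 0x80000000000   # start of kernel space on x64
DTB_ALIGNMENT  = 0x1000
SIZE           = 0xb2            # Windows 8.1 x64 build 9600 (Table 7.1)

def is_printable_ascii(raw):
    """IS_PRINTABLE_ASCII_STRING() from Listing 7.6 for a NUL-padded field."""
    name = raw.split(b"\x00", 1)[0]
    return len(name) > 0 and all(0x20 <= c < 0x7F for c in name)

def find_eprocess(dump, step=0x10):
    """Yield (offset, DirectoryTableBase, image name) for signature matches."""
    for off in range(0, len(dump) - 0x2EF, step):
        if dump[off] != 0x03 or dump[off + 0x2] != SIZE:   # Type, Size
            continue
        # DirectoryTableBase (+0x28), ThreadListHead Flink/Blink (+0x30/+0x38)
        dtb, flink, blink = struct.unpack_from("<QQQ", dump, off + 0x28)
        if dtb == 0 or dtb % DTB_ALIGNMENT != 0:
            continue
        if flink < KERNEL_ADDRESS or blink < KERNEL_ADDRESS:
            continue
        raw = dump[off + 0x2E0 : off + 0x2E0 + 15]         # ImageFileName
        if is_printable_ascii(raw):
            yield off, dtb, raw.split(b"\x00", 1)[0].decode()
```

Each yielded DirectoryTableBase value can then be fed directly into the address space reconstruction of the next section.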

7.5.2 Reconstructing the virtual address space

Once the root of the paging table structure is found, the virtual address space spanned by that paging structure can be reconstructed. To this end, the memory range from zero to the highest memory address can be iterated and any present physical frame can be written out to a new file in order to create a dump of the process’ virtual memory space. The address translation is performed as per Section 7.3.3 on page 93.
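The loop itself can be sketched in a few lines. In this hedged sketch, `translate` stands for the combined hardware and software translation of Section 7.3.3 and is a hypothetical callback that returns the source offset of a page, or None if the page cannot be resolved; unresolvable pages are zero-filled here so that virtual offsets stay stable in the output file.

```python
PAGE = 0x1000  # 4 KiB page granularity

def dump_address_space(translate, read_page, out, highest_vaddr):
    """Write a flat dump of one process' virtual address space.

    translate(vaddr) -> source offset or None; read_page(offset) -> 4 KiB.
    """
    for vaddr in range(0, highest_vaddr, PAGE):
        location = translate(vaddr)
        out.write(read_page(location) if location is not None
                  else b"\x00" * PAGE)
```

Zero-filling is a design choice of the sketch: a sparse output would be smaller, but a flat file keeps every virtual address at a fixed file offset, which simplifies the later carving and keyword search steps.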

7.5.3 Analyzing the virtual address space

Once the virtual address space is reconstructed, current methods to analyze the now flat process memory space can be used. These methods include, but are not limited to, the following:

• File carving via foremost
• Keyword search in the process space via strings
• Encryption key search via aeskeyfind [71] or interrogate
• Disassembling the process’ code and data segments

While some of these methods, such as file carving and strings, may still not be very accurate in a forensic sense, they can now be used to extract full files and texts from editors, emails, messengers, or visited websites, even if the process memory was paged out or physically fragmented. Further, any findings by these tools can now be linked to the process they were found in, which in turn can be linked to the user owning the process. All of this was not possible previously.


7.6 Evaluation

We developed, tested and verified the correct working of our virtual address space extraction tool against the following versions of Windows NT:

• Windows 7 7600 x86 (with 32-bit Paging and/or PAE Paging)
• Windows 7 7600 x64 (with IA32e Paging)
• Windows 8.1 9600 x86 (with 32-bit Paging and/or PAE Paging)
• Windows 8.1 9600 x64 (with IA32e Paging)
• Windows 10 9841 x86 (with PAE Paging)
• Windows 10 9841 x64 (with IA32e Paging)

7.6.1 Problem cases of virtual memory analysis

For the evaluation we consider four problem cases with regard to virtual memory analysis, as depicted in Figure 7.2 on the facing page:

1. Physical memory frames are fragmented, but no frames are in the pagefile, as illustrated in Figure 7.2 on the next page:
   • Naïve carving of the physical memory may not yield any results due to the fragmentation.
   • Memory analysis only considering physical memory can completely reconstruct the virtual address space.
   • File carving applied to the pagefile can not yield any results, because no virtual page is mapped to a pagefile frame.

2. All virtual memory pages are sequentially mapped into the pagefile, as shown in Figure 7.2 on the facing page:
   • Any physical memory analysis will not yield any results, because all virtual pages are mapped to pagefile frames.
   • File carving applied to the pagefile will retrieve the content, because it is available sequentially.

3. Physical memory frames only reside in the pagefile and are fragmented, as shown in Figure 7.2 on the next page:
   • Any physical memory analysis will not yield any results, because all virtual pages are mapped to pagefile frames.
   • File carving applied to the pagefile may not yield any results due to the fragmentation.

4. Physical memory frames are scattered over physical memory and the pagefile, as illustrated in Figure 7.2 on the facing page:
   • Any method besides virtual memory analysis incorporating the pagefile may not yield results with regard to an investigation, because neither the physical memory nor the pagefile contains the complete mapping of the virtual pages. Hence, only a combination of physical memory and pagefile analysis can provide a perfect reconstruction.

Only the presented virtual memory analysis approach incorporating the pagefile is able to perfectly reconstruct the virtual address space in all of the four problem cases. In reality, we have found problem cases 1 and 4 to be most prevalent. We have only rarely encountered case 3. We have never encountered case 2. However, close matches, where almost all virtual memory pages were sequentially mapped to pagefile frames, have been observed.

Problem Case    Memory Carving    Pagefile Carving    Virtual Memory    Virtual Memory + Pagefile
1               Maybe             No                  Yes               Yes
2               No                Yes                 No                Yes
3               No                Maybe               No                Yes
4               Maybe             Maybe               Maybe             Yes

Table 7.2: Problem cases of virtual memory analysis: Only one case can be perfectly solved by current memory analysis techniques.


7.6.2 Synthetic data

First, we evaluated our virtual address reconstruction with synthetic data. This data consists of an allocation of crib pages, as outlined in Section 7.2.2 on page 87. After the ramwrite and swapforcer processes were started sync.exe from the Sysinternals Tools was started to increase atomicity with regard to the pagefile written to disk. The physical RAM and pagefile were acquired via VirtualBox as outlined in Section 4.4.4.3 on page 43, i.e., the VM was first paused, then the RAM was acquired via the debugvm dumpguestcore command, after which the VM state was saved, the hard disk cloned and the pagefile extracted via icat from the Sleuthkit.

Selected results can be seen in Table 7.3 on the next page. The table lists seven different datasets obtained from different kernels with different paging. The different kernels and paging modes demonstrate the correct functionality of our implementation for the various Windows NT systems. The table further lists the amount of crib data allocated. It then details how much sequential crib data could be reconstructed with the four methods: physical memory carving, pagefile carving, classical memory analysis, i.e., reconstructing the virtual address space without incorporating the pagefile, and last but not least our proposed virtual memory analysis method incorporating the pagefile. For each method, the length of reconstructable crib data at the start of the memory allocation and the longest overall reconstructable crib data are listed. An equal sign (“=”) before the longest reconstructable crib data listing denotes that the longest match was found at the start of the memory allocation. This is important if file carving methods are considered, as these methods rely on file headers which are usually at the beginning of a file. For a perfect reconstruction, both the reconstructable crib data at the start and the longest overall match should coincide with the allocation size, and the longest overall match must be at the start of the allocation. A dash (“-”) denotes that no specific crib frame could be found.

Dataset 1 was obtained from a system exhibiting problem case 1, i.e., all crib data was mapped into physical memory. This can be seen from the fact that no crib frame was present in the pagefile at all. All crib data can be reconstructed only from physical memory.

Dataset 2 provides the transition state, where pages have started to be swapped out to the pagefile. However, because the pages were still in transition (see Section 7.3.3.2 on page 94), their content was still present in physical memory and hence all crib data could be reconstructed using only physical memory.

Datasets 3 to 6 provide problem cases of type 4, i.e., crib data is placed in both the physical memory and the pagefile. The various techniques yield unpredictable results depending on the current fragmentation and scattering of the crib data. Only the proposed virtual memory analysis approach is able to perfectly reconstruct the crib data for all datasets. Datasets 3 to 6 provide the transition into problem cases 2 and 3, with the data that can be obtained via classical memory analysis not making use of the pagefile gradually declining from dataset 3 to 6.

Last, dataset 7 is the extreme problem case 2, in which classical memory analysis can not reconstruct any data at all, because all crib pages have been swapped out to the pagefile. The fact that the naïve memory carving method was still able to extract one 4 KiB frame, can be attributed to the fact that one particular physical memory frame, which was already allocated to a different process, was not overwritten by the other process yet. We verified this hypothesis by tracing that particular memory frame back to being part of the swapforcer process’ virtual address space. As stated earlier, we used the swapforcer process to swap as many crib pages as possible into the pagefile. This clearly shows the ill effects that incomplete analysis methods can have on the conclusions that a forensic analyst may draw. Here, the memory frame appeared to belong to the ramwrite process, while in reality it did not. Attributing evidence to a wrong process could mean attributing it to a different user. This ultimately could lead to wrong accusations against a person.

[Table 7.3: Results of reconstructing synthetic data. For each of the seven datasets, the table lists the kernel build and paging mode, the allocation size in KiB, and the Start/Longest reconstructable crib data in KiB for Memory Carving, Pagefile Carving, Virtual Memory, and Virtual Memory + Pagefile.]

[Table 7.4: Images carved out of the raw memory, pagefile, or the different memory address space reconstructions, listing Full, Viewable, and Total image counts per method.]


7.6.3 Real life data

Besides the evaluation with synthetic data using crib pages, we also evaluated our approach against real life data. To this end, we acquired the memory and pagefile of a Windows 8.1 x64 system with 1024 MiB RAM running on x64 based hardware, while an instance of Firefox in private browsing mode was running. Firefox was used to open a prepared HTML web page with 200 JPEG images embedded. The 200 JPEG images total over 1.2 GiB. Each image was a high quality 6000 × 4000 pixel image from a DSLR. The data was acquired via WinPMEM and the Sleuthkit’s fcat. We then used our virtual memory analyzer to extract the virtual address space of the firefox.exe process. We then carved this address space with the Foremost file carver. We further repeated the carving process on the raw physical memory image, the pagefile, and the virtual memory address space reconstructed only from physical memory. The number of images each procedure could recover is listed in Table 7.4 on the preceding page. The “Full” column lists the image files that could be fully recovered, i.e., 6000 × 4000 pixel images that are not corrupted in any way. The “Viewable” column, on the other hand, lists distinct images that were recognizable by visual inspection; these include corrupted and only partially recovered images, as well as embedded thumbnail images. Because the JPEGs used had two smaller thumbnail images embedded, the total number of JPEGs in the Firefox process is 600. Our method was able to recover all of them. No broken images were recovered. Because no other JPEG files were opened by Firefox, no additional images were recovered either. The other methods could only recover the smaller embedded thumbnail images but no full 6000 × 4000 pixel image. This evaluation using real life data underlines the practicability of our results.
The fact that current memory analysis techniques were only able to retrieve viewable content for 135 of the 200 images and were unable to recover the full 6000 × 4000 pixel images, while we are able to reconstruct all images, further underscores the practical importance of our results.

7.7 Conclusion and future work

In all evaluated cases, we performed better than or at least on par with current memory analysis techniques. Further, our gray-box analysis approach quickly exposed critical errors during the development of our tools. Hence we expect this technique to be able to successively improve current and future memory analysis software with regard to correctness and completeness as well. As with all research, this work only presents a small part of what can be done. Future research will be needed on the following topics. While the Windows NT family of operating systems is the most prevalent, other operating systems need to be considered as well. Our analysis approach can, as already stated in Section 7.2.3 on page 88, be used on the Linux operating system as well, but also other operating systems such as Apple’s Mac OS or the BSD family of operating systems should be evaluated. To develop a correct reconstruction, we used virtual machines to acquire the physical memory and corresponding pagefiles of systems with ultimate atomicity. While this provides, without a doubt, the best results, it is not always possible. Hence, better acquisition methods acquiring both physical memory and the pagefile at the same time with high atomicity must be investigated. Also, the effects of virtual address reconstruction using inconsistent physical memory and pagefile dumps must be researched. Our preliminary findings indicate that even though a virtual address space can be reconstructed from inconsistent physical memory and pagefile dumps, the reconstructed address space inevitably contains incorrect, i.e., outdated, memory pages. But the impact of these reconstruction errors is not known yet. The virtual memory analysis needs to be extended beyond the pagefile. To this end, Prototype PTEs of shared memory and mapped files [160] should be considered. Relevant code can be found in the Windows NT Research Kernel source code [104].
The relevant parts are the code handling software page faults due to the Prototype PTEs [104, MiResolveProtoPteFault() and MiResolveMappedFileFault()].


As shown in other research [79], pagefiles can be compressed and also encrypted. Hence, decompression and decryption procedures may need to be integrated into future virtual memory analysis procedures that want to incorporate compressed and/or encrypted pagefiles.

Last but not least, best practice approaches with regard to combined pagefile and memory acquisition must be researched. For our tests, we always synced the disks of the system, because we assume it yields higher atomicity. However, we have not evaluated what consequences this could have, e.g., old data, which may contain relevant evidence, could be overwritten by the sync operation.

Summarizing this chapter, it can be said that even though virtual memory analysis is an important field, it is, as outlined by our future work list, still an emerging field with open problems. We solved one blind spot, namely incorporating the pagefile. The generic research method we used is expected to be deployed against other operating systems as well as being used to evaluate the correctness of other memory forensic tools.

8 Conclusion and future work

In this thesis, we first outlined forensics and anti-forensics, both in the classical physical realm as well as the digital realm. We defined the main goal of anti-forensics, i.e., impairing the quality of a forensic analysis, and motivated it with practical examples. After this we tackled anti-forensic and rootkit problems. First, we investigated how data can best be acquired from hard drives that are potentially compromised by a firmware rootkit. To this end, we first outlined the threat of hard drive firmware rootkits to forensic analysis. We then provided a procedure to detect and subvert publicly published hard disk drive firmware bootkits. We further outlined potential avenues to detect hard drive firmware rootkits nested deeper within the hard disk drive’s so-called Service Area, a special storage on the magnetic platter reserved for use by the firmware. We, therefore, showed that it is possible and feasible to counter anti-forensic measures manifested as compromised hard disk drive firmware. Because those hard disk drive firmware rootkits are undetectable otherwise, we urge the forensic community to adopt the techniques we outlined. We also advise that our introduced techniques be developed further and extended to not just include SSDs but also devices such as network cards and other peripherals that could harbor rootkits. After we addressed the acquisition of persistent data storage in the form of hard disk drives, we shifted towards acquisition and later analysis of volatile storage, in the form of RAM. To this end, we first evaluated the quality, both with regard to atomicity and integrity as well as anti-forensic resistance, of different memory acquisition techniques with our newly developed black-box analysis technique. This resulted in the cold boot attack proving to be the most favorable memory acquisition technique.
Because we open sourced our black-box technique, we urge other forensic scientists in the community to use it to further test additional memory analysis tools to determine, for themselves, which acquisition tool to use. Even though we determined that the cold boot attack is the most favorable, specific circumstances may make it inapplicable, e.g., when the system must not be rebooted. Hence, it is important for forensic analysts to know their options as well as each option’s limits with regard to atomicity, integrity, and anti-forensic resilience. We, therefore, researched the cold boot attack in great detail. First, experimentally confirming that cooling the RAM modules prolongs the remanence effect considerably. Then proving experimentally that transplanting RAM modules from one system to another is possible. We further addressed the issue of scrambling in modern DDR3 technology as well as other proposed countermeasures, such as BIOS passwords and temperature detection. With this we showed that cold boot attacks are an adequate anti-forensic resistant memory acquisition technique, because they are hard to defend against, and once the system is cold-booted, i.e., rebooted, any anti-forensic code running on the system is stopped immediately and hence prevented from interfering with the memory acquisition. While we showed the cold boot attack to be feasible and practical on current RAM technology, we also showed that memory encryption techniques may render it unusable, in which case we refer to our evaluation of memory acquisition techniques, from which a different suitable acquisition technique can be selected by forensic analysts. It is also important for judges to understand the idea behind the cold boot attack, because in some circumstances a system needs to be substantially altered, e.g., RAM frozen and transplanted into a different system, which is contrary to the long-standing status quo of digital forensics to not tamper with evidence.
Here, however, the system is not tampered with per se. The modifications are rather necessary to obtain the evidence. This can be compared to an analysis in forensic chemistry in which evidence is often dissolved to obtain the information contained in it, such as its composition; this dissolving, however, inadvertently tampers with and even destroys the evidence in its material form. In classical forensic science this is acceptable; in digital forensics, however, such practice is often still frowned upon. We would like to argue that also in the digital domain it can be necessary to modify the evidence in its material form in order to obtain information from it.

8 Conclusion and future work

After we outlined the acquisition of evidence, we addressed the analysis. To this end, we first revisited the theory of data analysis, namely the concept of essential data in forensic analysis as coined by Carrier [18]. After extending Carrier's theories, we verified both the original theories and our extensions in practical experiments. We further argued that the essential data concept can be used to build a trust hierarchy, from which we concluded that anti-forensic-resistant analysis methods should only rely on what we call strictly essential, i.e., trusted, data. In case this cannot be done, the analysis tool should at least notify the investigator that the conclusion was drawn from potentially manipulated data. Here it is also important that judges understand the problem of anti-forensic manipulations, and work needs to be done to clearly assess the likelihood of such manipulations. Last but not least, we tackled a problem in forensic memory analysis that had been unsolved for a long time: a blind spot in which data, and thus potential evidence, in unmapped memory pages, i.e., pages swapped out onto persistent storage, was not examined by current state-of-the-art digital forensic virtual memory analysis tools. We fixed this by analyzing Windows NT virtual memory paging via our gray-box analysis method. To this end, we placed traceable data into virtual memory and forced it both into physical RAM and out to the pagefile.sys stored on persistent storage. We were thus able to reverse engineer the complete virtual address mapping, including the non-mapped pagefile. With our evaluation, we further showed that only the presented analysis method considers and finds all available evidence. Other analysis methods leave blind spots which could be used by anti-forensic tools to hide evidence.
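The gray-box idea of traceable data can be sketched as follows. This is a simplified illustration with hypothetical names, not our actual tooling: every page of a buffer is tagged with a unique, searchable value, so that the same tags can later be located both in a RAM image and in pagefile.sys, allowing the mapping of virtual pages to their physical or swapped locations to be reconstructed.

```python
import struct

PAGE = 4096
TAG = b"GRAYBOX!"  # hypothetical marker; any unique byte string works

def make_traceable_pages(n_pages: int) -> bytearray:
    """Build a buffer in which every page carries a unique, easily
    searchable tag. Once the OS swaps some of these pages out, the tags
    can be located both in a RAM dump and in pagefile.sys, which lets us
    reconstruct where each virtual page ended up."""
    buf = bytearray(n_pages * PAGE)
    for i in range(n_pages):
        struct.pack_into("<8sQ", buf, i * PAGE, TAG, i)
    return buf

def find_tags(image: bytes):
    """Scan an acquired image (RAM dump or pagefile) for tagged pages,
    returning a map from page index to offset within the image."""
    found = {}
    pos = image.find(TAG)
    while pos != -1:
        (idx,) = struct.unpack_from("<Q", image, pos + 8)
        found[idx] = pos
        pos = image.find(TAG, pos + 1)
    return found

buf = make_traceable_pages(8)
# In the real experiment the buffer is made large enough that the OS is
# forced to swap parts of it out; here we simply scan the buffer itself.
mapping = find_tags(bytes(buf))
print(len(mapping))  # one entry per tagged page
```

In the actual experiment, comparing where each tag surfaces in the RAM image versus the pagefile is what exposes the paging structures that describe swapped-out pages.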

Even though our contributions to the field of anti-forensic- and rootkit-resistant digital forensic procedures are plenty, they are still not enough. Many anti-forensic threats may exist that have not even been discovered yet. For example, to our knowledge we are the first within the forensic community to raise concerns about firmware rootkits on hard disk drives. However, the idea and feasibility of such firmware rootkits have been demonstrated since 2013, and their potential has existed since the inception of hard disk drives with upgradeable firmware, which to our knowledge has always been present on modern hard disk drives. Even before hard disk drives, firmware rootkits for network cards were proposed by Delugré [29]. We would like to argue that our outlined methods for hard disk rootkits, especially verifying the EEPROM contents, are also applicable to network cards. Another possibility for rootkits is within the so-called baseband processor [30], a special processor separate from the application processor on which the regular operating system resides. Here, our introduced JTAG methods are applicable.

Another concern is that existing anti-forensic-resistant methods are, apparently, not applied in practice. For example, the fact that cold boot attacks against DDR3 memory are not possible anymore due to memory scrambling did not seem to resonate in the forensic community, even though we published our findings and made them freely available online in 2013. In fact, in 2015 results were published asserting the opposite: Lindenlauf et al. [97] published a study on the feasibility of cold boot attacks even against DDR3 memory, completely ignoring the fact that DDR3 memory is scrambled. Their results were only possible because the particular system configuration tested used constant scrambling. Had the cold boot attack actually been used in practice, this fact would have spread more widely. We therefore urge forensic practitioners to at least familiarize themselves with such anti-forensic-resistant acquisition and analysis methods, so they can apply them in selected cases. Of course, we would favor anti-forensic-resistant forensic methods being applied always, but we do understand that without widespread tool support such methods place an additional burden on the forensic investigator, which often cannot be carried because of time or budget constraints. We therefore also urge tool developers to adopt and preferentially implement anti-forensic-resistant acquisition and analysis methods.
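Why constant scrambling leaves the door open can be illustrated with a toy scrambler. The real keystream is generated in hardware from a controller-specific polynomial and seed; this sketch only mirrors the XOR structure, with an arbitrary 16-bit LFSR standing in for the hardware sequence:

```python
def keystream(seed: int, n: int) -> bytes:
    """Toy LFSR-style keystream standing in for the memory controller's
    scrambler sequence (the real sequence is produced in hardware)."""
    out = bytearray()
    state = seed & 0xFFFF
    for _ in range(n):
        # 16-bit Fibonacci LFSR, taps 16, 14, 13, 11
        bit = ((state >> 0) ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
        state = (state >> 1) | (bit << 15)
        out.append(state & 0xFF)
    return bytes(out)

def scramble(data: bytes, seed: int) -> bytes:
    """Scrambling and descrambling are the same XOR operation."""
    return bytes(d ^ k for d, k in zip(data, keystream(seed, len(data))))

secret = b"AES key material"
in_dram = scramble(secret, seed=0xACE1)   # what physically sits in the DRAM cells

# Constant seed across reboots: reading the cells back through the same
# scrambler descrambles them for free, so the cold boot attack still works.
assert scramble(in_dram, seed=0xACE1) == secret

# A fresh seed on every boot would leave only scrambled noise in the dump.
assert scramble(in_dram, seed=0x1234) != secret
```

A configuration that re-seeds the scrambler on every boot therefore forces the attacker to recover the scrambler sequence first, whereas a constant seed, as in the setup tested by Lindenlauf et al. [97], descrambles the remanent data transparently.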

The question of whether we found the digital equivalent of luminol, as posed in the introduction of this thesis, cannot be answered yet. What can be said, however, is that blood detection via luminol has been and still is constantly improved [136, 8] and scientifically scrutinized [27]. The use of luminol has matured. The same constant improvement and scrutiny must be applied to the methods introduced in this thesis so that they, too, can mature. New technologies must be evaluated for their anti-forensic potential, and new ways must be sought to undermine this anti-forensic potential.

But possibly the biggest problem with rootkits and anti-forensic methods is that if they are not actively sought out, they pass undetected, like blood wiped from a crime scene. After all, this is their purpose: to manipulate and hide evidence and to present a false reality without being detected. We must, therefore, not let ourselves be stopped by deceptions, for otherwise we may be trapped in undesirable circumstances [166]. Hence, it must become best practice to expect anti-forensics and to be prepared accordingly. With this thesis, we hope to help prepare forensic investigators around the world for the specific anti-forensic threats addressed and thereby take the first step in leaving the age of anti-forensic innocence.


Bibliography

[1] Pietro Albano, Aniello Castiglione, Giuseppe Cattaneo, and Alfredo De Santis. A novel anti-forensics technique for the android OS. In 2011 International Conference on Broadband, Wireless Computing, Communication and Applications, BWCCA 2011, Barcelona, Spain, October 26-28, 2011, pages 380–385, 2011. doi: 10.1109/BWCCA.2011.62. URL http://dx.doi.org/10.1109/BWCCA.2011.62.
[2] Martin R. Albrecht and Carlos Cid. Cold boot key recovery by solving polynomial systems with noise. In Applied Cryptography and Network Security - 9th International Conference, ACNS 2011, Nerja, Spain, June 7-10, 2011. Proceedings, pages 57–72, 2011. doi: 10.1007/978-3-642-21554-4_4. URL http://dx.doi.org/10.1007/978-3-642-21554-4_4.
[3] Erwin Alles, Zeno Geradts, and Cor J. Veenman. Source camera identification for low resolution heavily compressed images. In Selected Papers of the Sixth International Conference on Computational Sciences and Its Applications, ICCSA ’08, Perugia, Italy, June 30 - July 3, 2008, pages 557–567, 2008. doi: 10.1109/ICCSA.2008.18. URL http://dx.doi.org/10.1109/ICCSA.2008.18.
[4] Philip Anderson, Maximillian Dornseif, Felix Freiling, Thorsten Holz, Alastair Irons, Christopher Laing, and Martin Mink. A comparative study of teaching forensics at a university degree level. In IT-Incidents Management & IT-Forensics - IMF 2006, Conference Proceedings, October, 18th-19th, 2006, Stuttgart, pages 116–127, 2006. URL http://subs.emis.de/LNI/Proceedings/Proceedings97/article4931.html.
[5] Ross Anderson and Markus Kuhn. Tamper resistance: A cautionary note. In Proceedings of the Second USENIX Workshop on Electronic Commerce - Volume 2, WOEC’96, pages 1–1, Berkeley, CA, USA, 1996. USENIX Association. URL http://dl.acm.org/citation.cfm?id=1267167.1267168.
[6] Shiva Azadegan, Wei Yu, Hui Liu, Ali Sistani, and Subrata Acharya. Novel anti-forensics approaches for smart phones. In 45th Hawaii International Conference on System Sciences (HICSS-45 2012), Proceedings, 4-7 January 2012, Grand Wailea, Maui, HI, USA, pages 5424–5431, 2012. doi: 10.1109/HICSS.2012.452. URL http://dx.doi.org/10.1109/HICSS.2012.452.
[7] Harald Baier and Julian Knauer. AFAUC - anti-forensics of storage devices by alternative use of communication channels. In Eighth International Conference on IT Security Incident Management & IT Forensics, IMF 2014, Münster, Germany, May 12-14, 2014, pages 14–26, 2014. doi: 10.1109/IMF.2014.11. URL http://dx.doi.org/10.1109/IMF.2014.11.
[8] Filippo Barni, Simon W. Lewis, Andrea Berti, Gordon M. Miskelly, and Giampietro Lago. Forensic application of the luminol reaction as a presumptive test for latent blood detection. Talanta, 72(3):896–913, 2007. ISSN 0039-9140. doi: 10.1016/j.talanta.2006.12.045. URL http://www.sciencedirect.com/science/article/pii/S0039914007000082.
[9] Mauro Barni and Benedetta Tondi. Optimum forensic and counter-forensic strategies for source identification with training data. In 2012 IEEE International Workshop on Information Forensics and Security, WIFS 2012, Costa Adeje, Tenerife, Spain, December 2-5, 2012, pages 199–204, 2012. doi: 10.1109/WIFS.2012.6412649. URL http://dx.doi.org/10.1109/WIFS.2012.6412649.


[10] Johannes Bauer, Michael Gruhn, and Felix Freiling. Lest we forget: Cold-boot attacks on scrambled DDR3 memory. Digital Investigation, 16, Supplement:S65–S74, 2016. ISSN 1742-2876. doi: 10.1016/j.diin.2016.01.009. URL http://dx.doi.org/10.1016/j.diin.2016.01.009. Proceedings of the Third Annual DFRWS Europe.
[11] Michael Becher, Maximillian Dornseif, and Christian Klein. FireWire: all your memory are belong to us. Talk at CanSecWest (slides: https://cansecwest.com/core05/2005-firewire-cansecwest.pdf), 2005.
[12] Hal Berghel. Hiding data, forensics, and anti-forensics. Communications of the ACM, 50(4):15–20, 2007. doi: 10.1145/1232743.1232761. URL http://doi.acm.org/10.1145/1232743.1232761.
[13] Ariel Berkman. Hiding data in hard-drive’s service areas. Paper published via the full-disclosure mailing list (paper: http://www.recover.co.il/SA-cover/SA-cover.pdf), 2013.
[14] Erik-Oliver Blass and William Robertson. TRESOR-HUNT: attacking CPU-bound encryption. In 28th Annual Computer Security Applications Conference, ACSAC 2012, Orlando, FL, USA, 3-7 December 2012, pages 71–78, 2012. doi: 10.1145/2420950.2420961. URL http://doi.acm.org/10.1145/2420950.2420961.
[15] Rainer Böhme and Matthias Kirchner. Counter-Forensics: Attacking Image Forensics, pages 327–366. Springer New York, New York, NY, 2013. ISBN 978-1-4614-0757-7. doi: 10.1007/978-1-4614-0757-7_12. URL http://dx.doi.org/10.1007/978-1-4614-0757-7_12.
[16] Dominik Brodowski, Andreas Dewald, Felix Freiling, Steve Kovács, and Martin Rieger. Drei Jahre Master Online Digitale Forensik: Ergebnisse und Erfahrungen. In Sicherheit 2014: Sicherheit, Schutz und Zuverlässigkeit, Beiträge der 7. Jahrestagung des Fachbereichs Sicherheit der Gesellschaft für Informatik e.V. (GI), 19.-21. März 2014, Wien, Österreich, pages 391–405, 2014. URL http://subs.emis.de/LNI/Proceedings/Proceedings228/article29.html.
[17] Jamie Butler. DKOM (direct kernel object manipulation). Talk at Black Hat USA (video: https://www.youtube.com/watch?v=1Ie20b5IGgY), 2004.
[18] Brian D. Carrier. File System Forensic Analysis. Addison-Wesley Professional, 2005. ISBN 0321268172.
[19] Brian D. Carrier and Joe Grand. A hardware-based memory acquisition procedure for digital investigations. Digital Investigation, 1(1):50–60, 2004. doi: 10.1016/j.diin.2003.12.001. URL http://dx.doi.org/10.1016/j.diin.2003.12.001.
[20] Eoghan Casey. Digital Evidence and Computer Crime: Forensic Science, Computers and the Internet. Academic Press, 2004. ISBN 9780121631048.
[21] Ellick Chan, Jeffrey C. Carlyle, Francis M. David, Reza Farivar, and Roy H. Campbell. Bootjacker: compromising computers using forced restarts. In Proceedings of the 2008 ACM Conference on Computer and Communications Security, CCS 2008, Alexandria, Virginia, USA, October 27-31, 2008, pages 555–564, 2008. doi: 10.1145/1455770.1455840. URL http://doi.acm.org/10.1145/1455770.1455840.
[22] Ellick Chan, Shivaram Venkataraman, Francis M. David, Amey Chaugule, and Roy H. Campbell. Forenscope: a framework for live forensics. In Twenty-Sixth Annual Computer Security Applications Conference, ACSAC 2010, Austin, Texas, USA, 6-10 December 2010, pages 307–316, 2010. doi: 10.1145/1920261.1920307. URL http://doi.acm.org/10.1145/1920261.1920307.
[23] Rong-Jian Chen, Shi-Jinn Horng, and Po-Hsian Huang. Anti-forensic steganography using multi-bit MER with flexible bit location. IJAHUC, 18(1/2):54–66, 2015. doi: 10.1504/IJAHUC.2015.067788. URL http://dx.doi.org/10.1504/IJAHUC.2015.067788.


[24] Philip A. Collier and B. J. Spaul. A forensic methodology for countering computer crime. Artificial Intelligence Review, 6(2):203–215, 1992. doi: 10.1007/BF00150234. URL http://dx.doi.org/10.1007/BF00150234.
[25] Kevin Conlan, Ibrahim Baggili, and Frank Breitinger. Anti-forensics: Furthering digital forensic science through a new extended, granular taxonomy. Digital Investigation, 18, Supplement:S66–S75, 2016. ISSN 1742-2876. doi: 10.1016/j.diin.2016.04.006. URL http://www.sciencedirect.com/science/article/pii/S1742287616300378.
[26] CRC Industries Deutschland GmbH. TECHNISCHES MERKBLATT – KÄLTE 75 SUPER – Ref.: 20848. Datasheet: http://www.crcind.com/wwwcrc/tds/TKC4%20FREEZE75S.PDF, 2003.
[27] Jonathan I. Creamer, Terence I. Quickenden, Leah B. Crichton, Patrick Robertson, and Rasha A. Ruhayel. Attempted cleaning of bloodstains and its effect on the forensic luminol test. Luminescence, 20(6):411–413, 2005. ISSN 1522-7243. doi: 10.1002/bio.865. URL http://dx.doi.org/10.1002/bio.865.
[28] Kamal Dahbur and Bassil Mohammad. The anti-forensics challenge. In Proceedings of the 2nd International Conference on Intelligent Semantic Web-Services and Applications, ISWSA 2011, Amman, Jordan, April 18-20, 2011, page 14, 2011. doi: 10.1145/1980822.1980836. URL http://doi.acm.org/10.1145/1980822.1980836.
[29] Guillaume Delugré. How to develop a rootkit for Broadcom NetExtreme network cards. Talk at REcon (slides: http://esec-lab.sogeti.com/static/publications/11-recon-nicreverse_slides.pdf), 2011.
[30] Guillaume Delugré. Reverse engineering a Qualcomm baseband. Talk at the 28th Chaos Communication Congress (28C3) (slides: https://events.ccc.de/congress/2011/Fahrplan/attachments/2022_11-ccc-qcombbdbg.pdf), 2012.
[31] Andreas Dewald and Felix Freiling. From computer forensics to forensic computing: Investigators investigate, scientists associate. Technical Report CS-2014-04, Department Informatik, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), 2014.
[32] Andreas Dewald, Felix Freiling, Michael Gruhn, and Christian Riess. Forensische Informatik. Books on Demand, Norderstedt, 2nd edition, 2015. ISBN 978-3-8423-7947-3.
[33] Alessandro Distefano, Gianluigi Me, and Francesco Pace. Android anti-forensics through a local paradigm. Digital Investigation, 7, Supplement:S83–S94, 2010. ISSN 1742-2876. doi: 10.1016/j.diin.2010.05.011. URL http://www.sciencedirect.com/science/article/pii/S1742287610000381. The Proceedings of the Tenth Annual DFRWS Conference.
[34] Brendan Dolan-Gavitt. The VAD tree: A process-eye view of physical memory. Digital Investigation, 4, Supplement:62–64, 2007. ISSN 1742-2876. doi: 10.1016/j.diin.2007.06.008. URL http://www.sciencedirect.com/science/article/pii/S1742287607000503.
[35] Brendan Dolan-Gavitt, Abhinav Srivastava, Patrick Traynor, and Jonathon Giffin. Robust Signatures for Kernel Data Structures. In Proceedings of the 16th ACM Conference on Computer and Communications Security, CCS ’09, pages 566–577, New York, NY, USA, 2009. ACM. ISBN 978-1-60558-894-0. doi: 10.1145/1653662.1653730. URL http://doi.acm.org/10.1145/1653662.1653730.
[36] Jeroen “sprite_tm” Domburg. Sprites mods – hard disk hacking. Website: https://spritesmods.com/?art=hddhack, 2013.
[37] Jeroen “sprite_tm” Domburg. Hard Disks: More than just block devices. Talk at OHM (video: https://www.youtube.com/watch?v=0Da6OARhgXk), 2013.


[38] Christian D’Orazio, Aswami Ariffin, and Kim-Kwang Raymond Choo. iOS anti-forensics: How can we securely conceal, delete and insert data? In 47th Hawaii International Conference on System Sciences, HICSS 2014, Waikoloa, HI, USA, January 6-9, 2014, pages 4838–4847, 2014. doi: 10.1109/HICSS.2014.594. URL http://dx.doi.org/10.1109/HICSS.2014.594.
[39] EC-Council. Computer Forensics: Investigation Procedures and Response (CHFI). Cengage Learning, 2nd edition, 2016. ISBN 978-1305883475.
[40] M. C. Falconer, C. P. Mozak, and A. J. Norman. Suppressing power supply noise using data scrambling in double data rate memory systems, August 6 2013. URL https://www.google.com/patents/US8503678. US Patent 8,503,678.
[41] Wei Fan, Kai Wang, François Cayre, and Zhang Xiong. JPEG anti-forensics using non-parametric DCT quantization noise estimation and natural image statistics. In ACM Information Hiding and Multimedia Security Workshop, IH&MMSec ’13, Montpellier, France, June 17-19, 2013, pages 117–122, 2013. doi: 10.1145/2482513.2482536. URL http://doi.acm.org/10.1145/2482513.2482536.
[42] Wei Fan, Kai Wang, François Cayre, and Zhang Xiong. A variational approach to JPEG anti-forensics. In IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2013, Vancouver, BC, Canada, May 26-31, 2013, pages 3058–3062, 2013. doi: 10.1109/ICASSP.2013.6638220. URL http://dx.doi.org/10.1109/ICASSP.2013.6638220.
[43] Wei Fan, Kai Wang, François Cayre, and Zhang Xiong. JPEG anti-forensics with improved tradeoff between forensic undetectability and image quality. IEEE Transactions on Information Forensics and Security, 9(8):1211–1226, 2014. doi: 10.1109/TIFS.2014.2317949. URL http://dx.doi.org/10.1109/TIFS.2014.2317949.
[44] Wei Fan, Kai Wang, François Cayre, and Zhang Xiong. Median filtered image quality enhancement and anti-forensics via variational deconvolution. IEEE Transactions on Information Forensics and Security, 10(5):1076–1091, 2015. doi: 10.1109/TIFS.2015.2398362. URL http://dx.doi.org/10.1109/TIFS.2015.2398362.
[45] Houda Ferradi, Rémi Géraud, David Naccache, and Assia Tria. When organized crime applies academic results: a forensic analysis of an in-card listening device. Journal of Cryptographic Engineering, 6(1):49–59, 2016. doi: 10.1007/s13389-015-0112-3. URL http://dx.doi.org/10.1007/s13389-015-0112-3.
[46] Joe FitzPatrick and Miles Crabill. NSA playset: PCIe. Talk at DEF CON 22 (slides: https://www.defcon.org/images/defcon-22/dc-22-presentations/Fitzpatrick-Crabill/DEFCON-22-Joe-FitzPatrick-Miles-Crabill-NSA-Playset-PCIe.pdf, video: https://www.youtube.com/watch?v=OD2Wxe4RLeU), 2014.
[47] Marco Fontani, Alessandro Bonchi, Alessandro Piva, and Mauro Barni. Countering anti-forensics by means of data fusion. In Media Watermarking, Security, and Forensics 2014, San Francisco, CA, USA, February 2, 2014, Proceedings, page 90280Z, 2014. doi: 10.1117/12.2039569. URL http://dx.doi.org/10.1117/12.2039569.
[48] Dario Forte. Dealing with forensic software vulnerabilities: is anti-forensics a real danger? Network Security, 2008(12):18–20, 2008. ISSN 1353-4858. doi: 10.1016/S1353-4858(08)70143-0. URL http://www.sciencedirect.com/science/article/pii/S1353485808701430.
[49] Dario Forte and Richard Power. A tour through the realm of anti-forensics. Computer Fraud & Security, 2007(6):18–20, 2007. ISSN 1361-3723. doi: 10.1016/S1361-3723(07)70079-9. URL http://www.sciencedirect.com/science/article/pii/S1361372307700799.


[50] Felix Freiling and Michael Gruhn. What is essential data in digital forensic analysis? In Ninth International Conference on IT Security Incident Management & IT Forensics, IMF 2015, Magdeburg, Germany, May 18-20, 2015, pages 40–48, 2015. doi: 10.1109/IMF.2015.20. URL http://dx.doi.org/10.1109/IMF.2015.20.
[51] Felix Freiling, Jan Schuhr, and Michael Gruhn. What is essential data in digital forensic analysis? it - Information Technology, 57(6):376–383, 2015. URL http://www.degruyter.com/view/j/itit.2015.57.issue-6/itit-2015-0016/itit-2015-0016.xml.
[52] Simson Garfinkel. Anti-forensics: Techniques, detection and countermeasures. In 2nd International Conference on i-Warfare and Security, page 77, 2007.
[53] Simson Garfinkel. Digital forensics research: The next 10 years. Digital Investigation, 7, Supplement:S64–S73, 2010. ISSN 1742-2876. doi: 10.1016/j.diin.2010.05.009. URL http://www.sciencedirect.com/science/article/pii/S1742287610000368. The Proceedings of the Tenth Annual DFRWS Conference.
[54] Behrad Garmany and Tilo Müller. PRIME: private RSA infrastructure for memory-less encryption. In Annual Computer Security Applications Conference, ACSAC ’13, New Orleans, LA, USA, December 9-13, 2013, pages 149–158, 2013. doi: 10.1145/2523649.2523656. URL http://doi.acm.org/10.1145/2523649.2523656.
[55] Matthew Geiger. Evaluating commercial counter-forensic tools. In Refereed Proceedings of the 5th Annual Digital Forensic Research Workshop, DFRWS 2005, Astor Crowne Plaza, New Orleans, Louisiana, USA, August 17-19, 2005, 2005. URL http://www.dfrws.org/2005/proceedings/geiger_couterforensics.pdf.
[56] Matthew Geiger. Counter-forensic tools: Analysis and data recovery. In 18th Annual FIRST Conference, Baltimore, Maryland, pages 25–30, 2006. URL https://www.first.org/conference/2006/papers/geiger-matthew-papers.pdf.
[57] Matthew Geiger and Lorrie Faith Cranor. Scrubbing stubborn data: An evaluation of counter-forensic privacy tools. IEEE Security & Privacy, 4(5):16–25, 2006. doi: 10.1109/MSP.2006.132. URL http://dx.doi.org/10.1109/MSP.2006.132.
[58] Zeno Geradts, Jurrien Bijhold, Martijn Kieft, Kenji Kurosawa, Kenro Kuroki, and Naoki Saitoh. Methods for identification of images acquired with digital cameras. Proc. SPIE, 4232:505–512, 2001. doi: 10.1117/12.417569. URL http://dx.doi.org/10.1117/12.417569.
[59] Thomas Gloe, Matthias Kirchner, Antje Winkler, and Rainer Böhme. Can we trust digital image forensics? In Proceedings of the 15th International Conference on Multimedia 2007, Augsburg, Germany, September 24-29, 2007, pages 78–86, 2007. doi: 10.1145/1291233.1291252. URL http://doi.acm.org/10.1145/1291233.1291252.
[60] Miroslav Goljan, Jessica J. Fridrich, and Mo Chen. Sensor noise camera identification: countering counter-forensics. In Media Forensics and Security II, part of the IS&T-SPIE Electronic Imaging Symposium, San Jose, CA, USA, January 18-20, 2010, Proceedings, page 75410S, 2010. doi: 10.1117/12.839055. URL http://dx.doi.org/10.1117/12.839055.
[61] Travis Goodspeed. Active disk antiforensics and hard disk backdoors. Talk at 0x07 Sec-T Conference (video: https://www.youtube.com/watch?v=8Zpb34Qf0NY), 2014.
[62] Johannes Götzfried and Tilo Müller. ARMORED: CPU-bound encryption for android-driven ARM devices. In 2013 International Conference on Availability, Reliability and Security, ARES 2013, Regensburg, Germany, September 2-6, 2013, pages 161–168, 2013. doi: 10.1109/ARES.2013.23. URL http://dx.doi.org/10.1109/ARES.2013.23.
[63] Johannes Götzfried, Tilo Müller, Gabor Drescher, Stefan Nürnberger, and Michael Backes. RamCrypt: Kernel-based address space encryption for user-mode processes. In Proceedings of the 11th ACM on Asia Conference on Computer and Communications Security, AsiaCCS 2016, Xi’an, China, May 30 - June 3, 2016, pages 919–924, 2016. doi: 10.1145/2897845.2897924. URL http://doi.acm.org/10.1145/2897845.2897924.


[64] Hans Gross and Friedrich Geerdes. Handbuch der Kriminalistik, volume 2. Pawlak, 1985. ISBN 9783881992640.

[65] Michael Gruhn. Windows NT pagefile.sys virtual memory analysis. In Ninth International Conference on IT Security Incident Management & IT Forensics, IMF 2015, Magdeburg, Germany, May 18-20, 2015, pages 3–18, 2015. doi: 10.1109/IMF.2015.10. URL http://dx.doi.org/10.1109/IMF.2015.10.
[66] Michael Gruhn and Felix Freiling. Evaluating atomicity, and integrity of correct memory acquisition methods. Digital Investigation, 16, Supplement:S1–S10, 2016. ISSN 1742-2876. doi: 10.1016/j.diin.2016.01.003. URL http://www.sciencedirect.com/science/article/pii/S1742287616000049. Proceedings of the Third Annual DFRWS Europe.
[67] Michael Gruhn and Tilo Müller. On the practicability of cold boot attacks. In 2013 International Conference on Availability, Reliability and Security, ARES 2013, Regensburg, Germany, September 2-6, 2013, pages 390–397, 2013. doi: 10.1109/ARES.2013.52. URL http://dx.doi.org/10.1109/ARES.2013.52.
[68] Le Guan, Jingqiang Lin, Bo Luo, and Jiwu Jing. Copker: Computing with private keys without RAM. In 21st Annual Network and Distributed System Security Symposium, NDSS 2014, San Diego, California, USA, February 23-26, 2014, 2014. URL http://www.internetsociety.org/doc/copker-computing-private-keys-without-ram.
[69] Mayank R. Gupta, Michael D. Hoeschele, and Marcus K. Rogers. Hidden disk areas: HPA and DCO. IJDE, 5(1), 2006. URL http://www.utica.edu/academic/institutes/ecii/publications/articles/EFE36584-D13F-2962-67BEB146864A2671.pdf.
[70] Peter Gutmann. Data remanence in semiconductor devices. In 10th USENIX Security Symposium, August 13-17, 2001, Washington, D.C., USA, 2001. URL http://www.usenix.org/publications/library/proceedings/sec01/gutmann.html.
[71] J. Alex Halderman, Seth D. Schoen, Nadia Heninger, William Clarkson, William Paul, Joseph A. Calandrino, Ariel J. Feldman, and Edward W. Felten. Memory Research Project Source Code, Center for Information Technology Policy at Princeton. Website: https://citp.princeton.edu/research/memory/code/, 2008.
[72] J. Alex Halderman, Seth D. Schoen, Nadia Heninger, William Clarkson, William Paul, Joseph A. Calandrino, Ariel J. Feldman, Jacob Appelbaum, and Edward W. Felten. Lest we remember: Cold boot attacks on encryption keys. In Proceedings of the 17th USENIX Security Symposium, July 28 - August 1, 2008, San Jose, CA, USA, pages 45–60, 2008. URL http://www.usenix.org/events/sec08/tech/full_papers/halderman/halderman.pdf.
[73] J. Alex Halderman, Seth D. Schoen, Nadia Heninger, William Clarkson, William Paul, Joseph A. Calandrino, Ariel J. Feldman, Jacob Appelbaum, and Edward W. Felten. Lest we remember: cold-boot attacks on encryption keys. Communications of the ACM, 52(5):91–98, 2009. doi: 10.1145/1506409.1506429. URL http://doi.acm.org/10.1145/1506409.1506429.
[74] Ryan Harris. Arriving at an anti-forensics consensus: Examining how to define and control the anti-forensics problem. Digital Investigation, 3, Supplement:44–49, 2006. ISSN 1742-2876. doi: 10.1016/j.diin.2006.06.005. URL http://www.sciencedirect.com/science/article/pii/S1742287606000673. The Proceedings of the 6th Annual Digital Forensic Research Workshop (DFRWS ’06).

[75] Nadia Heninger and Hovav Shacham. Reconstructing RSA private keys from random key bits. In Advances in Cryptology - CRYPTO 2009, 29th Annual International Cryptology Conference, Santa Barbara, CA, USA, August 16-20, 2009. Proceedings, pages 1–17, 2009. doi: 10.1007/978-3-642-03356-8_1. URL http://dx.doi.org/10.1007/978-3-642-03356-8_1.


[76] Frank P. Higgins. Break through the BIOS password. Leaked via AntiSec FFF-E05 dump on 2011-11-18. Available online at https://cryptome.org/isp-spy/bios-spy.pdf, 2011.
[77] S. Hilley. Anti-forensics with a small army of exploits. Digital Investigation, 4(1):13–15, 2007. ISSN 1742-2876. doi: 10.1016/j.diin.2007.01.005. URL http://www.sciencedirect.com/science/article/pii/S1742287607000072.
[78] Maarten Huijbregtse and Zeno Geradts. Using the ENF criterion for determining the time of recording of short digital audio recordings. In Computational Forensics, Third International Workshop, IWCF 2009, The Hague, The Netherlands, August 13-14, 2009. Proceedings, pages 116–124, 2009. doi: 10.1007/978-3-642-03521-0_11. URL http://dx.doi.org/10.1007/978-3-642-03521-0_11.
[79] Golden G. Richard III and Andrew Case. In lieu of swap: Analyzing compressed RAM in Mac OS X and Linux. Digital Investigation, 11, Supplement 2:S3–S12, 2014. ISSN 1742-2876. doi: 10.1016/j.diin.2014.05.011. URL http://www.sciencedirect.com/science/article/pii/S1742287614000541. Fourteenth Annual DFRWS Conference.
[80] Intel 64 and IA-32 Architectures Software Developer’s Manual Volume 3A: System Programming Guide, Part 1. Intel Corporation, 2010.
[81] Intel 64 and IA-32 Architectures Software Developer’s Manual Volume 2 (2A, 2B & 2C): Instruction Set Reference, A-Z. Intel Corporation, 2012.
[82] Nick L. Petroni Jr., AAron Walters, Timothy Fraser, and William A. Arbaugh. FATKit: a framework for the extraction and analysis of digital forensic data from volatile system memory. Digital Investigation, 3(4):197–210, 2006. ISSN 1742-2876. doi: 10.1016/j.diin.2006.10.001. URL http://www.sciencedirect.com/science/article/pii/S1742287606001228.
[83] Dan Kaminsky. Secure random by default. Talk at DEF CON 22 (video: https://youtu.be/xneBjc8z0DE?t=203), 2014.
[84] Karl-Johan Karlsson and William Bradley Glisson. Android anti-forensics: Modifying CyanogenMod. In 47th Hawaii International Conference on System Sciences, HICSS 2014, Waikoloa, HI, USA, January 6-9, 2014, pages 4828–4837, 2014. doi: 10.1109/HICSS.2014.593. URL http://dx.doi.org/10.1109/HICSS.2014.593.
[85] Auguste Kerckhoffs. La cryptographie militaire. Journal des sciences militaires, 9:538, 1883.
[86] Gary C. Kessler. Anti-forensics and the digital investigator. In Australian Digital Forensics Conference, page 1, 2007.
[87] P.L. Kirk and J.I. Thornton. Crime Investigation. Wiley, 1974. ISBN 9780471482475.
[88] Julian Knauer and Harald Baier. Zur Sicherheit von ATA-Festplattenpasswörtern. Proceedings of D-A-CH Security, pages 26–37, 2012.

[89] Jesse D. Kornblum. Using every part of the buffalo in windows memory analysis. Digital Investigation, 4(1):24–29, 2007. ISSN 1742-2876. doi: 10.1016/j.diin.2006.12.002. URL http://www.sciencedirect.com/science/article/pii/S1742287607000047.
[90] J. Laan and N. Dijkhuizen. Firmwall: Protecting hard disk firmware. Published online (paper: https://www.os3.nl/_media/2013-2014/courses/ot/jan_niels.pdf, tool: https://github.com/janlaan/firmwall), 2014.
[91] ShiYue Lai and Rainer Böhme. Countering counter-forensics: The case of JPEG compression. In Information Hiding - 13th International Conference, IH 2011, Prague, Czech Republic, May 18-20, 2011, Revised Selected Papers, pages 285–298, 2011. doi: 10.1007/978-3-642-24178-9_20. URL http://dx.doi.org/10.1007/978-3-642-24178-9_20.


[92] Pierre L’Ecuyer. Tables of maximally equidistributed combined LFSR generators. Mathematics of Computation, 68(225):261–269, 1999. doi: 10.1090/S0025-5718-99-01039-X. URL http://dx.doi.org/10.1090/S0025-5718-99-01039-X.
[93] H.C. Lee and H.A. Harris. Physical Evidence in Forensic Science. Lawyers & Judges Publishing Company, 2000. ISBN 9781930056015.
[94] Kyoungho Lee, Hyunuk Hwang, Kibom Kim, and BongNam Noh. Robust bootstrapping memory analysis against anti-forensics. Digital Investigation, 18, Supplement:S23–S32, 2016. ISSN 1742-2876. doi: 10.1016/j.diin.2016.04.009. URL http://www.sciencedirect.com/science/article/pii/S1742287616300408.
[95] Seokhee Lee, Antonio Savoldi, Sangjin Lee, and Jongin Lim. Windows pagefile collection and analysis for a live forensics context. In Future Generation Communication and Networking, FGCN 2007, Ramada Plaza Jeju, Jeju-Island, Korea, December 6-8, 2007, Proceedings, pages 97–101, 2007. doi: 10.1109/FGCN.2007.236. URL http://dx.doi.org/10.1109/FGCN.2007.236.
[96] Eugene Libster and Jesse D. Kornblum. A proposal for an integrated memory acquisition mechanism. SIGOPS Operating Systems Review, 42(3):14–20, April 2008. ISSN 0163-5980. doi: 10.1145/1368506.1368510. URL http://doi.acm.org/10.1145/1368506.1368510.
[97] Simon Lindenlauf, Hans Höfken, and Marko Schuba. Cold boot attacks on DDR2 and DDR3 SDRAM. In 10th International Conference on Availability, Reliability and Security, ARES 2015, Toulouse, France, August 24-27, 2015, pages 287–292, 2015. doi: 10.1109/ARES.2015.28. URL http://dx.doi.org/10.1109/ARES.2015.28.
[98] W. Link and H. May. Eigenschaften von MOS-Ein-Transistorspeicherzellen bei tiefen Temperaturen. Archiv für Elektronik und Übertragungstechnik, 33(6):229–235, 1979.
[99] Carsten Maartmann-Moe, Steffen E. Thorkildsen, and André Årnes. The persistence of memory: Forensic identification and extraction of cryptographic keys. Digital Investigation, 6, Supplement:S132–S140, 2009. ISSN 1742-2876. doi: 10.1016/j.diin.2009.06.002. URL http://www.sciencedirect.com/science/article/pii/S1742287609000486. The Proceedings of the Ninth Annual DFRWS Conference.
[100] Lorenzo Martignoni, Aristide Fattori, Roberto Paleari, and Lorenzo Cavallaro. Live and trustworthy forensic analysis of commodity production systems. In Recent Advances in Intrusion Detection, 13th International Symposium, RAID 2010, Ottawa, Ontario, Canada, September 15-17, 2010. Proceedings, pages 297–316, 2010. doi: 10.1007/978-3-642-15512-3_16. URL http://dx.doi.org/10.1007/978-3-642-15512-3_16.
[101] Friedemann Mattern. Virtual time and global states of distributed systems. Parallel and Distributed Algorithms, 1(23):215–226, 1989.
[102] Patrick McGregor, Tim Hollebeek, Alex Volynkin, and Matthew White. Braving the cold: New methods for preventing cold boot attacks on encryption keys. Talk at Black Hat USA (slides: https://www.blackhat.com/presentations/bh-usa-08/McGregor/BH_US_08_McGregor_Cold_Boot_Attacks.pdf), 2008.
[103] Frank McKeen, Ilya Alexandrovich, Alex Berenzon, Carlos V. Rozas, Hisham Shafi, Vedvyas Shanbhogue, and Uday R. Savagaonkar. Innovative instructions and software model for isolated execution. In HASP 2013, The Second Workshop on Hardware and Architectural Support for Security and Privacy, Tel-Aviv, Israel, June 23-24, 2013, page 10, 2013. doi: 10.1145/2487726.2488368. URL http://doi.acm.org/10.1145/2487726.2488368.
[104] Microsoft Corporation. Windows Research Kernel v1.2. Non-public, 2006.
[105] C. P. Mozak. Suppressing power supply noise using data scrambling in double data rate memory systems, May 17 2011. URL https://www.google.com.ar/patents/US7945050. US Patent 7,945,050.


[106] Tilo Müller and Michael Spreitzenbarth. FROST - forensic recovery of scrambled telephones. In Applied Cryptography and Network Security - 11th International Conference, ACNS 2013, Banff, AB, Canada, June 25-28, 2013. Proceedings, pages 373–388, 2013. doi: 10.1007/978-3-642-38980-1_23. URL http://dx.doi.org/10.1007/978-3-642-38980-1_23.
[107] Tilo Müller, Andreas Dewald, and Felix Freiling. AESSE: a cold-boot resistant implementation of AES. In Proceedings of the Third European Workshop on System Security, EUROSEC 2010, Paris, France, April 13, 2010, pages 42–47, 2010. doi: 10.1145/1752046.1752053. URL http://doi.acm.org/10.1145/1752046.1752053.
[108] Tilo Müller, Felix Freiling, and Andreas Dewald. TRESOR runs encryption securely outside RAM. In 20th USENIX Security Symposium, San Francisco, CA, USA, August 8-12, 2011, Proceedings, 2011. URL http://static.usenix.org/events/sec11/tech/full_papers/Muller.pdf.
[109] Tilo Müller, Tobias Latzo, and Felix Freiling. Hardware-based full disk encryption (in)security survey. Technical Report CS-2014-04, Department Informatik, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), 2012.
[110] Tilo Müller, Benjamin Taubmann, and Felix Freiling. TreVisor - OS-independent software-based full disk encryption secure against main memory attacks. In Applied Cryptography and Network Security - 10th International Conference, ACNS 2012, Singapore, June 26-29, 2012. Proceedings, pages 66–83, 2012. doi: 10.1007/978-3-642-31284-7_5. URL http://dx.doi.org/10.1007/978-3-642-31284-7_5.
[111] Noora Al Mutawa, Ibtesam Al Awadhi, Ibrahim M. Baggili, and Andrew Marrington. Forensic artifacts of Facebook’s instant messaging service. In 6th International Conference for Internet Technology and Secured Transactions, ICITST 2011, Abu Dhabi, UAE, December 11-14, 2011, pages 771–776, 2011. URL http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6148436.
[112] Karsten Nohl and Jan “Starbug” Krissler. Deep silicon analysis. Talk at HAR (video: https://www.youtube.com/watch?v=UoLyYZzXjQE), 2009.
[113] Karsten Nohl and Jan “Starbug” Krissler. Silicon chips: No more secrets. Talk at PacSec (slides: http://www.degate.org/Pacsec2009/091001.Pacsec.Silicon.pdf), 2009.
[114] Donny Jacob Ohana and Narasimha Shashidhar. Do private and portable web browsers leave incriminating evidence? A forensic analysis of residual artifacts from private and portable web browsing sessions. EURASIP J. Information Security, 2013:6, 2013. doi: 10.1186/1687-417X-2013-6. URL http://dx.doi.org/10.1186/1687-417X-2013-6.
[115] Jürgen Pabel. FrozenCache – Mitigating cold-boot attacks for full-disk-encryption software. Talk at the 27th Chaos Communication Congress (27C3) (slides: https://events.ccc.de/congress/2010/Fahrplan/attachments/1786_FrozenCache.pdf, video: https://www.youtube.com/watch?v=EHkUaiomxfE, blog: http://frozencache.blogspot.de/), 2010.

[116] Cecilia Pasquini and Giulia Boato. JPEG compression anti-forensics based on first significant digit distribution. In 15th IEEE International Workshop on Multimedia Signal Processing, MMSP 2013, Pula, Sardinia, Italy, September 30 - Oct. 2, 2013, pages 500–505, 2013. doi: 10.1109/MMSP.2013.6659339. URL http://dx.doi.org/10.1109/MMSP.2013.6659339.
[117] Michael Perklin. Anti-forensics and anti-anti-forensics. Talk at DEF CON 20 (slides: https://www.defcon.org/images/defcon-20/dc-20-presentations/Perklin/DEFCON-20-Perklin-AntiForensics.pdf, video: https://www.youtube.com/watch?v=1PEOCAxR5Hk), 2012.


[118] Huw Read, Konstantinos Xynos, Iain Sutherland, Gareth Davies, Tom Houiellebecq, Frode Roarson, and Andrew Blyth. Manipulation of hard drive firmware to conceal entire partitions. Digital Investigation, 10(4):281–286, 2013. ISSN 1742-2876. doi: 10.1016/j.diin.2013.10.001. URL http://www.sciencedirect.com/science/article/pii/S1742287613001072.
[119] Slim Rekhis and Noureddine Boudriga. Formal digital investigation of anti-forensic attacks. In Fifth IEEE International Workshop on Systematic Approaches to Digital Forensic Engineering, SADFE 2010, Oakland, CA, USA, May 20, 2010, pages 33–44, 2010. doi: 10.1109/SADFE.2010.9. URL http://dx.doi.org/10.1109/SADFE.2010.9.
[120] Slim Rekhis and Noureddine Boudriga. A system for formal digital forensic investigation aware of anti-forensic attacks. IEEE Trans. Information Forensics and Security, 7(2):635–650, 2012. doi: 10.1109/TIFS.2011.2176117. URL http://dx.doi.org/10.1109/TIFS.2011.2176117.
[121] Heinrich Riebler, Tobias Kenter, Christian Plessl, and Christoph Sorge. Reconstructing AES key schedules from decayed memory with FPGAs. In 22nd IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, FCCM 2014, Boston, MA, USA, May 11-13, 2014, pages 222–229, 2014. doi: 10.1109/FCCM.2014.67. URL http://dx.doi.org/10.1109/FCCM.2014.67.
[122] Anibal L. Sacco and Alfredo A. Ortega. Persistent BIOS infection – “The early bird catches the worm”. Talk at CanSecWest (slides: https://cansecwest.com/csw09/csw09-sacco-ortega.pdf), 2009.
[123] Anibal L. Sacco and Alfredo A. Ortega. Persistent BIOS infection – “The early bird catches the worm”. Phrack Magazine, Volume 0x0d, Issue 0x42, Phile #0x07 of 0x11 (http://phrack.org/issues/66/7.html), 2009.
[124] H. Said, N. Al Mutawa, I. Al Awadhi, and M. Guimaraes. Forensic analysis of private browsing artifacts. In International Conference on Innovations in Information Technology (IIT), pages 197–202, April 2011. doi: 10.1109/INNOVATIONS.2011.5893816.
[125] Santanu Sarkar, Sourav Sen Gupta, and Subhamoy Maitra. Error correction of partially exposed RSA private keys from MSB side. In Information Systems Security - 9th International Conference, ICISS 2013, Kolkata, India, December 16-20, 2013. Proceedings, pages 345–359, 2013. doi: 10.1007/978-3-642-45204-8_26. URL http://dx.doi.org/10.1007/978-3-642-45204-8_26.
[126] Bryan Sartin. Anti-forensics – distorting the evidence. Computer Fraud & Security, 2006(5):4–6, 2006. ISSN 1361-3723. doi: 10.1016/S1361-3723(06)70354-2. URL http://www.sciencedirect.com/science/article/pii/S1361372306703542.
[127] Bradley Schatz. BodySnatcher: Towards reliable volatile memory acquisition by software. Digital Investigation, 4, Supplement:126–134, 2007. ISSN 1742-2876. doi: 10.1016/j.diin.2007.06.009. URL http://www.sciencedirect.com/science/article/pii/S1742287607000497.
[128] Martin Schobert. Semiautomatisches Reverse-Engineering von Logikgattern in integrierten Schaltkreisen (speziell zur Aufklärung geheimgehaltener Verschlüsselungsverfahren). Talk at 0sec (slides: http://www.degate.org/documentation/0sec_talk_degate.pdf), 2009.
[129] Ruud Schramp. RAM memory acquisition using live-BIOS modification. Talk at OHM (video: https://www.youtube.com/watch?v=Zmo13Bd4XmU), 2013.
[130] Seagate. Maximize security, lock down hard drive firmware with Seagate Secure Download & Diagnostics. Technology Paper, 2015. URL http://www.seagate.com/files/www-content/solutions-content/security-and-encryption/en-us/docs/seagate-secure-download-diagnostics-with-maximize-sec-lock-down-hard-drive-firmware-tp684-1-1508us.pdf.


[131] Maximilian Seitzer, Michael Gruhn, and Tilo Müller. A bytecode interpreter for secure program execution in untrusted main memory. In Computer Security - ESORICS 2015 - 20th European Symposium on Research in Computer Security, Vienna, Austria, September 21-25, 2015, Proceedings, Part II, pages 376–395, 2015. doi: 10.1007/978-3-319-24177-7_19. URL http://dx.doi.org/10.1007/978-3-319-24177-7_19.
[132] Patrick Simmons. Security through amnesia: a software-based solution to the cold boot attack on disk encryption. In Proceedings of the 27th Annual Computer Security Applications Conference, ACSAC ’11, pages 73–82, New York, NY, USA, 2011. ACM. ISBN 978-1-4503-0672-0. doi: 10.1145/2076732.2076743. URL http://doi.acm.org/10.1145/2076732.2076743.
[133] Sinometer Instruments. DT8380, DT8550 - INFRARED THERMOMETERS. Datasheet: http://www.sinometer.com/pdf/DT8380,%208550.pdf, 2003.
[134] Sergei Skorobogatov. Low temperature data remanence in static RAM. Technical Report UCAM-CL-TR-536, University of Cambridge, Computer Laboratory, June 2002. URL http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-536.pdf.
[135] Sergei Skorobogatov and Christopher Woods. Breakthrough silicon scanning discovers backdoor in military chip. In Cryptographic Hardware and Embedded Systems - CHES 2012 - 14th International Workshop, Leuven, Belgium, September 9-12, 2012. Proceedings, pages 23–40, 2012. doi: 10.1007/978-3-642-33027-8_2. URL http://dx.doi.org/10.1007/978-3-642-33027-8_2.
[136] Walter Specht. Die Chemiluminescenz des Hämins, ein Hilfsmittel zur Auffindung und Erkennung forensisch wichtiger Blutspuren. Angewandte Chemie, 50(8):155–157, 1937. ISSN 1521-3757. doi: 10.1002/ange.19370500803. URL http://dx.doi.org/10.1002/ange.19370500803.
[137] Matthew C. Stamm and K. J. Ray Liu. Wavelet-based image compression anti-forensics. In Proceedings of the International Conference on Image Processing, ICIP 2010, September 26-29, Hong Kong, China, pages 1737–1740, 2010. doi: 10.1109/ICIP.2010.5652845. URL http://dx.doi.org/10.1109/ICIP.2010.5652845.
[138] Matthew C. Stamm and K. J. Ray Liu. Anti-forensics for frame deletion/addition in MPEG video. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011, May 22-27, 2011, Prague Congress Center, Prague, Czech Republic, pages 1876–1879, 2011. doi: 10.1109/ICASSP.2011.5946872. URL http://dx.doi.org/10.1109/ICASSP.2011.5946872.
[139] Matthew C. Stamm and K. J. Ray Liu. Anti-forensics of digital image compression. IEEE Trans. Information Forensics and Security, 6(3-2):1050–1065, 2011. doi: 10.1109/TIFS.2011.2119314. URL http://dx.doi.org/10.1109/TIFS.2011.2119314.
[140] Matthew C. Stamm, Steven K. Tjoa, W. Sabrina Lin, and K. J. Ray Liu. Anti-forensics of JPEG compression. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2010, 14-19 March 2010, Sheraton Dallas Hotel, Dallas, Texas, USA, pages 1694–1697, 2010. doi: 10.1109/ICASSP.2010.5495491. URL http://dx.doi.org/10.1109/ICASSP.2010.5495491.
[141] Matthew C. Stamm, Steven K. Tjoa, W. Sabrina Lin, and K. J. Ray Liu. Undetectable image tampering through JPEG compression anti-forensics. In Proceedings of the International Conference on Image Processing, ICIP 2010, September 26-29, Hong Kong, China, pages 2109–2112, 2010. doi: 10.1109/ICIP.2010.5652553. URL http://dx.doi.org/10.1109/ICIP.2010.5652553.
[142] Matthew C. Stamm, W. Sabrina Lin, and K. J. Ray Liu. Forensics vs. anti-forensics: A decision and game theoretic framework. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2012, Kyoto, Japan, March 25-30, 2012, pages 1749–1752, 2012. doi: 10.1109/ICASSP.2012.6288237. URL http://dx.doi.org/10.1109/ICASSP.2012.6288237.


[143] Matthew C. Stamm, W. Sabrina Lin, and K. J. Ray Liu. Temporal forensics and anti-forensics for motion compensated video. IEEE Trans. Information Forensics and Security, 7(4):1315–1329, 2012. doi: 10.1109/TIFS.2012.2205568. URL http://dx.doi.org/10.1109/TIFS.2012.2205568.
[144] Andrew S. Tanenbaum. Operating Systems: Design and Implementation. Prentice-Hall, 1987. ISBN 0-13-637331-3.
[145] Johannes Stüttgen and Michael Cohen. Anti-forensic resilient memory acquisition. Digital Investigation, 10, Supplement:S105–S115, 2013. ISSN 1742-2876. doi: 10.1016/j.diin.2013.06.012. URL http://www.sciencedirect.com/science/article/pii/S1742287613000583. The Proceedings of the Thirteenth Annual DFRWS Conference.
[146] Johannes Stüttgen and Michael Cohen. Robust Linux memory acquisition with minimal target impact. Digital Investigation, 11, Supplement 1:S112–S119, 2014. ISSN 1742-2876. doi: 10.1016/j.diin.2014.03.014. URL http://www.sciencedirect.com/science/article/pii/S174228761400019X. Proceedings of the First Annual DFRWS Europe.
[147] Hung-Min Sun, Chi-Yao Weng, Chin-Feng Lee, and Cheng-Hsing Yang. Anti-forensics with steganographic data embedding in digital images. IEEE Journal on Selected Areas in Communications, 29(7):1392–1403, 2011. doi: 10.1109/JSAC.2011.110806. URL http://dx.doi.org/10.1109/JSAC.2011.110806.
[148] Iain Sutherland, Gareth Davies, Nick Pringle, and Andrew Blyth. The impact of hard disk firmware steganography on computer forensics. JDFSL, 4(2):73–84, 2009. URL http://ojs.jdfsl.org/index.php/jdfsl/article/view/165.
[149] Patchara Sutthiwan and Yun Q. Shi. Anti-forensics of double JPEG compression detection. In Digital Forensics and Watermarking - 10th International Workshop, IWDW 2011, Atlantic City, NJ, USA, October 23-26, 2011, Revised Selected Papers, pages 411–424, 2011. doi: 10.1007/978-3-642-32205-1_33. URL http://dx.doi.org/10.1007/978-3-642-32205-1_33.
[150] Christopher Tarnovsky and Karsten Nohl. Reviving smart card analysis. Talk at Chaos Communication Camp (slides: https://events.ccc.de/camp/2011/Fahrplan/attachments/1888_SRLabs-Reviving_Smart_Card_Analysis.pdf, video: https://www.youtube.com/watch?v=fFx6Rn57DrY), 2011.
[151] Benjamin Taubmann, Manuel Huber, Sascha Wessel, Lukas Heim, Hans Peter Reiser, and Georg Sigl. A lightweight framework for cold boot based forensics on mobile devices. In 10th International Conference on Availability, Reliability and Security, ARES 2015, Toulouse, France, August 24-27, 2015, pages 120–128, 2015. doi: 10.1109/ARES.2015.47. URL http://dx.doi.org/10.1109/ARES.2015.47.
[152] “the grugq”. Defeating forensic analysis on unix. Phrack Magazine, Volume 0x0b, Issue 0x3b, Phile #0x06 of 0x12 (http://phrack.org/issues/59/6.html), 2002.
[153] P. Thomas and A. Morris. An investigation into the development of an anti-forensic tool to obscure USB flash drive device information on a Windows XP platform. In Digital Forensics and Incident Analysis, 2008. WDFIA ’08. Third International Annual Workshop on, pages 60–66, Oct 2008. doi: 10.1109/WDFIA.2008.13.
[154] TCG Platform Reset Attack Mitigation Specification, Version 1.00, Revision 1.00. Trusted Computing Group, Incorporated, 2008.
[155] Alex Tsow. An improved recovery algorithm for decayed AES key schedule images. In Selected Areas in Cryptography, 16th Annual International Workshop, SAC 2009, Calgary, Alberta, Canada, August 13-14, 2009, Revised Selected Papers, pages 215–230, 2009. doi: 10.1007/978-3-642-05445-7_14. URL http://dx.doi.org/10.1007/978-3-642-05445-7_14.


[156] Unified Extensible Firmware Interface Specification, Version 2.4. Unified EFI, Inc., 2013.
[157] Giuseppe Valenzise, Vitaliano Nobile, Marco Tagliasacchi, and Stefano Tubaro. Countering JPEG anti-forensics. In 18th IEEE International Conference on Image Processing, ICIP 2011, Brussels, Belgium, September 11-14, 2011, pages 1949–1952, 2011. doi: 10.1109/ICIP.2011.6115854. URL http://dx.doi.org/10.1109/ICIP.2011.6115854.
[158] Giuseppe Valenzise, Marco Tagliasacchi, and Stefano Tubaro. The cost of JPEG compression anti-forensics. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011, May 22-27, 2011, Prague Congress Center, Prague, Czech Republic, pages 1884–1887, 2011. doi: 10.1109/ICASSP.2011.5946874. URL http://dx.doi.org/10.1109/ICASSP.2011.5946874.
[159] Giuseppe Valenzise, Marco Tagliasacchi, and Stefano Tubaro. Revealing the traces of JPEG compression anti-forensics. IEEE Trans. Information Forensics and Security, 8(2):335–349, 2013. doi: 10.1109/TIFS.2012.2234117. URL http://dx.doi.org/10.1109/TIFS.2012.2234117.
[160] R. B. van Baar, W. Alink, and A. R. van Ballegooij. Forensic memory analysis: Files mapped in memory. Digital Investigation, 5, Supplement:S52–S57, 2008. ISSN 1742-2876. doi: 10.1016/j.diin.2008.05.014. URL http://www.sciencedirect.com/science/article/pii/S1742287608000327. The Proceedings of the Eighth Annual DFRWS Conference.
[161] Wiger van Houten and Zeno Geradts. Source identification for multiply compressed videos originating from YouTube. Digital Investigation, 6(1–2):48–60, 2009. ISSN 1742-2876. doi: 10.1016/j.diin.2009.05.003. URL http://www.sciencedirect.com/science/article/pii/S1742287609000310.
[162] Stefan Vömel and Felix Freiling. A survey of main memory acquisition and analysis techniques for the Windows operating system. Digital Investigation, 8(1):3–22, 2011. ISSN 1742-2876. doi: 10.1016/j.diin.2011.06.002. URL http://www.sciencedirect.com/science/article/pii/S1742287611000508.
[163] Stefan Vömel and Felix Freiling. Correctness, atomicity, and integrity: Defining criteria for forensically-sound memory acquisition. Digital Investigation, 9(2):125–137, 2012. ISSN 1742-2876. doi: 10.1016/j.diin.2012.04.005. URL http://www.sciencedirect.com/science/article/pii/S1742287612000254.
[164] Stefan Vömel and Johannes Stüttgen. An evaluation platform for forensic memory acquisition software. Digital Investigation, 10, Supplement:S30–S40, 2013. ISSN 1742-2876. doi: 10.1016/j.diin.2013.06.004. URL http://www.sciencedirect.com/science/article/pii/S1742287613000509. The Proceedings of the Thirteenth Annual DFRWS Conference.
[165] Philipp Wachter and Michael Gruhn. Practicability study of Android volatile memory forensic research. In 2015 IEEE International Workshop on Information Forensics and Security, WIFS 2015, Roma, Italy, November 16-19, 2015, pages 1–6, 2015. doi: 10.1109/WIFS.2015.7368601. URL http://dx.doi.org/10.1109/WIFS.2015.7368601.
[166] Edith Wharton. The Age of Innocence. D. Appleton and Company, New York, 1920.
[167] Zhung-Han Wu, Matthew C. Stamm, and K. J. Ray Liu. Anti-forensics of median filtering. In IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2013, Vancouver, BC, Canada, May 26-31, 2013, pages 3043–3047, 2013. doi: 10.1109/ICASSP.2013.6638217. URL http://dx.doi.org/10.1109/ICASSP.2013.6638217.
[168] Martin Wundram, Felix Freiling, and Christian Moch. Anti-forensics: The next step in digital forensics tool testing. In Seventh International Conference on IT Security Incident Management and IT Forensics, IMF 2013, Nuremberg, Germany, March 12-14, 2013, pages 83–97, 2013. doi: 10.1109/IMF.2013.17. URL http://dx.doi.org/10.1109/IMF.2013.17.


[169] Alexander Würstlein, Michael Gernoth, Johannes Götzfried, and Tilo Müller. Exzess: Hardware-based RAM encryption against physical memory disclosure. In Architecture of Computing Systems - ARCS 2016 - 29th International Conference, Nuremberg, Germany, April 4-7, 2016, Proceedings, pages 60–71, 2016. doi: 10.1007/978-3-319-30695-7_5. URL http://dx.doi.org/10.1007/978-3-319-30695-7_5.
[170] P. Wyns and Richard L. Anderson. Low-temperature operation of silicon dynamic random-access memories. Electron Devices, IEEE Transactions on, 36(8):1423–1428, Aug 1989. ISSN 0018-9383. doi: 10.1109/16.30954.
[171] Miao Yu, Zhengwei Qi, Qian Lin, Xianming Zhong, Bingyu Li, and Haibing Guan. Vis: Virtualization enhanced live forensics acquisition for native system. Digital Investigation, 9(1):22–33, 2012. ISSN 1742-2876. doi: 10.1016/j.diin.2012.04.002. URL http://www.sciencedirect.com/science/article/pii/S1742287612000229.
[172] Jonas Zaddach. Exploring the impact of a hard drive backdoor. Talk at REcon (slides: https://recon.cx/2014/slides/Recon14_HDD.pdf, video: https://www.youtube.com/watch?v=KjmsLvD76rM), 2014.
[173] Jonas Zaddach, Anil Kurmus, Davide Balzarotti, Erik-Oliver Blass, Aurélien Francillon, Travis Goodspeed, Moitrayee Gupta, and Ioannis Koltsidas. Implementation and implications of a stealth hard-drive backdoor. In Annual Computer Security Applications Conference, ACSAC ’13, New Orleans, LA, USA, December 9-13, 2013, pages 279–288, 2013. doi: 10.1145/2523649.2523661. URL http://doi.acm.org/10.1145/2523649.2523661.
[174] Hui Zeng, Tengfei Qin, Xiangui Kang, and Li Liu. Countering anti-forensics of median filtering. In IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2014, Florence, Italy, May 4-9, 2014, pages 2704–2708, 2014. doi: 10.1109/ICASSP.2014.6854091. URL http://dx.doi.org/10.1109/ICASSP.2014.6854091.

All online resources were last accessed 2016-08-10.
