A Domain Specific Language for Digital Forensics and Incident
Total Page:16
File Type:pdf, Size:1020Kb
View metadata, citation and similar papers at core.ac.uk brought to you by CORE provided by University of New Orleans University of New Orleans ScholarWorks@UNO University of New Orleans Theses and Dissertations Dissertations and Theses Fall 12-20-2019 A Domain Specific Language for Digital orF ensics and Incident Response Analysis Christopher D. Stelly [email protected] Follow this and additional works at: https://scholarworks.uno.edu/td Part of the Information Security Commons, and the Programming Languages and Compilers Commons Recommended Citation Stelly, Christopher D., "A Domain Specific Language for Digital orF ensics and Incident Response Analysis" (2019). University of New Orleans Theses and Dissertations. 2706. https://scholarworks.uno.edu/td/2706 This Dissertation is protected by copyright and/or related rights. It has been brought to you by ScholarWorks@UNO with permission from the rights-holder(s). You are free to use this Dissertation in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s) directly, unless additional rights are indicated by a Creative Commons license in the record and/ or on the work itself. This Dissertation has been accepted for inclusion in University of New Orleans Theses and Dissertations by an authorized administrator of ScholarWorks@UNO. For more information, please contact [email protected]. A Domain Specific Language for Digital Forensics and Incident Response Analysis A Dissertation Submitted to the Graduate Faculty of the University of New Orleans in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Engineering and Applied Science Computer Science by Christopher Drew Stelly B.S. University of Louisiana 2012 M.S. University of New Orleans 2014 December 2019 Declaration of Authorship I, Christopher STELLY, declare that this thesis titled, “A Domain Specific Language for Digital Forensics and Incident Response Analysis” and the work presented in it are my own. I confirm that: • This work was done wholly or mainly while in candidature for a research degree at this University. • Where any part of this thesis has previously been submitted for a degree or any other qualification at this University or any other institution, this has been clearly stated. • Where I have consulted the published work of others, this is always clearly attributed. • Where I have quoted from the work of others, the source is always given. With the exception of such quotations, this thesis is entirely my own work. • I have acknowledged all main sources of help. • Where the thesis is based on work done by myself jointly with others, I have made clear exactly what was done by others and what I have contributed myself. Signed: Date: ii UNIVERSITY OF NEW ORLEANS Abstract College of Science Department of Computer Science Doctor of Philosophy A Domain Specific Language for Digital Forensics and Incident Response Analysis by Christopher STELLY One of the longstanding conceptual problems in digital forensics is the dichotomy between the need for verifiable and reproducible forensic investigations, and the lack of practical mech- anisms to accomplish them. With nearly four decades of professional digital forensic practice, investigator notes are still the primary source of reproducibility information, and much of it is tied to the functions of specific, often proprietary, tools. The lack of a formal means of specification for digital forensic operations results in three major problems that go to the core of digital forensic science and its practical application. Specifically, there is a critical lack of : a) standardized and automated means to scientifically verify accuracy of digital forensic tools; b) automated methods to reliably reproduce forensic computations (their results); and c) common framework for inter-operability among disparate forensic tools. In addition, there is no standardized means for communicating software re- quirements between users (investigators) and researchers and developers, resulting in a mis- match in expectations, especially with respect to performance. Combined with the exponential growth in data volume and the complexity of applications and systems to be investigated, all of these concerns result in major case backlogs and inherently reduce the reliability of the dig- ital forensic analyses. This work proposes a new approach to the formal and usable specification of forensic com- putations, such that the above concerns can be addressed on a sound scientific basis via the introduction of a new domain specific language (DSL) called Nugget. DSLs are specialized lan- guages that aim to address the concerns of particular domains by providing practical abstrac- tions that allow users to directly provide an intuitive, but formal, description of their problem statements. Successful DSLs, such as SQL in relational databases, can transform an application domain by providing a standardized way for users to communicate what they need without specifying how the computation should be performed. This is the first effort to build a DSL, and a prototype support infrastructure for (digital) forensic computations with the following research goals: 1) provide an intuitive formal spec- ification language that covers core types of forensic computations and common data types; 2) provide a mechanism to easily extend the language that can incorporate arbitrary compu- tations; 3) provide a prototype execution environment that allows the fully automatic (and optimized) execution of the specified computation; 4) provide a complete, formal, and au- ditable log of computations that can be used to independently reproduce the results of an investigation; 5) demonstrate scalable, cloud-ready processing that can match the growth in data volumes and complexity with necessary hardware resources. iii Keywords. nugget,scarf, digital, forensics, cyber, security iv Contents Declaration of Authorship ii Abstract iii List of Figures vii List of Tables viii List of Listings ix 1 Introduction 1 1.1 Definitions, Usage, and Models.............................1 1.1.1 Methods, artifacts, tools, and targets......................2 1.1.2 Digital forensics usage..............................4 1.2 Brief History of Digital Forensics............................5 1.2.1 Examples and notable cases...........................6 1.3 The Problems within Digital Forensics.........................7 1.3.1 Replication and Verification...........................8 Domain Specific Languages...........................9 1.3.2 Tools........................................9 1.3.3 Tool veracity.................................... 10 1.3.4 Educational materials.............................. 11 1.3.5 Scaling....................................... 11 1.4 Brief description of research............................... 12 1.5 Contribution Claims................................... 12 2 Literature Review 14 2.1 Background........................................ 14 2.2 Modeling a Standardized Approach.......................... 14 2.2.1 Background.................................... 14 2.2.2 Low-Level and Mathematical Models..................... 15 Carrier’s history testing model......................... 15 Gladyshev’s finite state machine........................ 15 Garfinkel’s differential analysis......................... 15 2.2.3 High Level Models................................ 16 Cognitive task model............................... 16 Best Practices................................... 16 2.3 Digital Forensics Languages............................... 17 2.3.1 DERRIC - A (narrow) data language for Digital Forensics.......... 17 v 2.3.2 DEX - Digital evidence exchange........................ 17 2.3.3 Formalization of Computer I/O - Hadley Model............... 18 2.3.4 XIRAF - XML indexing and querying for DF................. 18 2.3.5 DFXML...................................... 19 2.3.6 Digital Forensics Ontologies........................... 19 2.3.7 SEPSES - logging with common vocabulary.................. 19 2.3.8 Roussev - a DSL for Digital Forensics..................... 20 2.3.9 Summary of proposed models and languages................ 20 2.4 Scaling Digital Forensics................................. 20 2.4.1 Initial discussions................................. 20 2.4.2 Hadoop and Sleuthkit integration....................... 22 2.4.3 Pig, Hadoop, and Sawzall............................ 22 2.4.4 Distributed Environment for Large-scale Investigations DELV- Distributed Forensics...................................... 22 2.4.5 GRR - Distributed Incident Response..................... 23 2.4.6 Hansken project.................................. 23 2.5 Summary.......................................... 23 3 Proposed Methodology 24 3.1 Introduction........................................ 24 3.2 Develop a language.................................... 25 3.2.1 What is a DSL?.................................. 25 Internal vs. external............................... 26 3.2.2 Developing a DSL now............................. 27 3.2.3 Necessary attributes of a Forensic DSL..................... 28 3.2.4 Describing a language.............................. 30 3.3 Develop a Distributed Architecture........................... 31 3.3.1 Containers..................................... 31 3.4 Extend the Language to Integrate Multiple Third-party Tools............ 33 3.4.1 Application Programming Interfaces (APIs) and Interoperability..... 33 3.5 Putting it all together..................................