A Methodology for Automated Digital Evidence Processing

PhD Thesis

Alleviating the Digital Forensic Backlog: A Methodology for Automated Digital Evidence Processing

Xiaoyu Du

A thesis submitted in fulfilment of the degree of PhD in Computer Science

Supervisor: Dr. Mark Scanlon
Head of School: Assoc. Prof. Chris Bleakley

UCD School of Computer Science
College of Science
University College Dublin

September 2020

Table of Contents

List of Figures
List of Tables
Abstract

1 Introduction
  1.1 Overview
    1.1.1 Challenges of Digital Forensics
    1.1.2 Approaches and Technologies for the Problem
  1.2 Research Questions
  1.3 Contribution of this Work
  1.4 Limitations of this Work
  1.5 Layout of this Thesis
  1.6 Brief Overview of the Approach

2 Literature Review
  2.1 Introduction
  2.2 Digital Forensics
    2.2.1 Source of Digital Evidence
    2.2.2 Mobile Forensics
    2.2.3 Cloud Forensics
    2.2.4 IoT Forensics
    2.2.5 Digital Forensic Artefacts Example
    2.2.6 Digital Evidence Backlogs
  2.3 Digital Forensic Test Images
    2.3.1 Current Available Disk Images
    2.3.2 Automated Disk Image Generation Approach
  2.4 Digital Forensic Process Model
    2.4.1 Digital Forensic Framework in Initial Phase
    2.4.2 Refined Digital Forensic Process Models
    2.4.3 Recent Digital Forensic Models for Handling Modern Advancements
  2.5 Triage Process Model
    2.5.1 Digital Device Triage Tools
  2.6 Cloud-based Digital Forensic Framework
    2.6.1 DFaaS Framework
    2.6.2 HANSKEN: DFaaS System Used by NFI
    2.6.3 Benefits and Advantages of DFaaS
  2.7 Data Deduplication and Data Reduction
    2.7.1 Data Deduplication Technology
    2.7.2 Data Deduplication in Digital Forensics
    2.7.3 Data Reduction Approaches
  2.8 Automated Digital Forensic Analysis
    2.8.1 Challenges of Automation in Digital Forensics
    2.8.2 Metadata and Timeline Analysis
    2.8.3 Plaso/Log2timeline
  2.9 Machine Learning and Digital Forensics
    2.9.1 Background of Machine Learning
    2.9.2 Machine Learning in Digital Forensics
    2.9.3 Background of Deep Learning
    2.9.4 Deep Learning Applications in Digital Forensics
    2.9.5 Current Challenges and Future Directions
  2.10 Correlation Analysis
    2.10.1 Cyber-investigation Analysis Standard Expression (CASE)
  2.11 File Artefact Prioritisation
  2.12 Summary
    2.12.1 Gaps in the State of the Art

3 Methodology
  3.1 Introduction
    3.1.1 A Centralised Digital Evidence Processing System
    3.1.2 An Automated File Artefact Analysis Approach
    3.1.3 Design of Experimentation and Evaluation
  3.2 Tools for Test Disk Images Generation
    3.2.1 TraceGen: Overview
    3.2.2 TraceGen: Existing Automation Options
    3.2.3 EviPlant: Image Generation Using "Evidence Packages"
  3.3 Deduplicated Data Acquisition of the System
    3.3.1 Deduplicated Acquisition Process Pipeline
    3.3.2 Client Responsibilities
    3.3.3 Server Responsibilities
    3.3.4 Summary
  3.4 Data Storage of the System
    3.4.1 Categorising Files Collected from the Disk
    3.4.2 Unallocated Space and Slack Space on the Disk
    3.4.3 Metadata and Acquisition Records
    3.4.4 Acquisition Logs for Performance Analysis
  3.5 Forensically Sound Image Reconstruction from Deduplicated Acquisition
  3.6 Data Extraction and Preparation
    3.6.1 Data Extraction: Tools and Techniques
    3.6.2 File Data Reduction/Selection
    3.6.3 File Timeline Generation and Feature Extraction
    3.6.4 Feature Extraction from File System Metadata
    3.6.5 Train/Test Data and Evaluation
  3.7 Relevancy File Artefacts Prioritisation
    3.7.1 An Overview of the Workflow
    3.7.2 Relevancy Score Determination
    3.7.3 Adding Pre-trained Models
    3.7.4 Relevancy Score Generation
  3.8 Summary

4 Test Disk Image Generation
  4.1 Overview
    4.1.1 Choice of Technologies
  4.2 File Data for Disk Image Creation
  4.3 EviPlant: Creation, Manipulation and Distribution Disk Image
    4.3.1 Overview
    4.3.2 Evidence Packages
    4.3.3 Testing of Diffing and Injection
    4.3.4 Summary
  4.4 TraceGen: Automated User Action Emulation
    4.4.1 Overview
    4.4.2 Virtual Machine Configuration
    4.4.3 File Provenance and Interactions
    4.4.4 User Action Emulation
    4.4.5 Continuous Usage Trace Generation
    4.4.6 Summary

5 Deduplicated Digital Evidence Processing
  5.1 Overview
  5.2 Prototype Setup and Test Data
  5.3 Results: Average Acquisition Speed
  5.4 Results: Storage Space Saved
  5.5 Results: Image Reconstruction
  5.6 Tests on an Improved Approach for Data Acquisition
    5.6.1 Results
  5.7 Summary
    5.7.1 Benefits of this Approach

6 File Artefact Analysis and Relevancy Prioritisation
  6.1 Overview
    6.1.1 Data Processing and Experimentation
  6.2 Metadata and Timeline from Disk Image
    6.2.1 Timeline Generation and Analysis
    6.2.2 An Example Disk Image Timeline
    6.2.3 File Timeline Generation
    6.2.4 An Example of File Timeline
    6.2.5 Feature Extraction from File Timeline
  6.3 Metadata-based File Artefact Classification
    6.3.1 Example Scenario ...

Details

  • File Type
    pdf
  • Content Languages
    English
  • File Pages
    149 pages