Identifying Factors Affecting Deleted File Persistence Through Empirical Study and Analysis

A Dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy at George Mason University

by

Tahir Mehmood Khan
Master of Science
George Washington University, 2011
Bachelor of Science
Saint Cloud State University, 2006

Director: James H. Jones, Jr., Associate Professor
Department of Electrical & Computer Engineering

Summer Semester 2017
George Mason University
Fairfax, VA

Copyright 2017 Tahir Mehmood Khan
All Rights Reserved

DEDICATION

To my parents, family members, friends, and my professors who inspired me to complete this research.

ACKNOWLEDGEMENTS

I would like to thank my advisor and dissertation Chair, Dr. James Jones, for his extraordinary support, mentorship, and expert guidance to complete this research. I would also like to thank my committee members, Dr. Duminda Wijesekera, Dr. Kathryn B. Laskey, and Dr. Paulo Costa, for their support and guidance throughout this project. My appreciation goes to my family, my parents, and my wife for standing beside me to take this journey to the end. Last but not least, special thanks go to the members of the Krypton group and my colleagues at George Mason University and Gallaudet University who supported and guided me throughout this project.

TABLE OF CONTENTS

List of Tables .... viii
List of Figures .... x
List of Abbreviations .... xii
Abstract .... xiii
Chapter One: Introduction .... 15
  Motivation .... 16
  Research Question .... 17
  Contributions .... 20
Chapter Two: Literature Review .... 21
Chapter Three: Methodology .... 27
  Experimental Design .... 27
  Tracking Deleted Files .... 30
  User Defined Parameters for Adiff.py Script .... 39
  User Defined Parameters for Trace_file.py Script .... 40
  Factors that Influence Persistence of Deleted Files .... 41
    Disk and System Parameters .... 42
    Deleted Files Parameters .... 44
    User Activity Profiles .... 47
  Deleted Files .... 57
    Deleted Files Categories .... 58
    Deleted Files in 0-5 MB Group .... 60
    Percentage of File Contents Completely Overwritten .... 61
    Percentage of File Contents Partially Overwritten .... 62
    Percentage of File Contents Completely Survived .... 63
  Distribution of New Files .... 65
    Shutdown User Activity .... 65
    Reboot-three-times User Activity .... 67
    Reboot User Activity .... 68
    Web User Activity .... 69
    One-hour-reboot User Activity .... 71
    Reboot-one-hour User Activity .... 72
    3-GB User Activity .... 74
    Experiment-data Activity .... 75
    Mix-data Activity .... 77
  File Creation and Deletion Process and Virtual Machine Configuration Settings .... 79
  File Creation Process .... 79
  Application Uninstalled List .... 80
  Virtual Machine Configuration Settings .... 81
  PassMark Fragger Utility and Disk Fragmentation Status .... 82
  Virtual Machine Suspend Procedure .... 85
  Virtual Machine Files Disk Components .... 85
  Raw Disk Image Conversion Process .... 88
Chapter Four: Results and Analysis .... 89
  Effect of User Activities on Deleted Files .... 89
  User Activities and User Actions .... 91
    Percentage of File Contents Completely Overwritten .... 93
    Percentage of File Contents Partially Overwritten .... 96
    Percentage of File Contents Completely Survived .... 99
  User Activities and Deleted File Size .... 103
  Fragmented and Non-Fragmented Files .... 107
  Disk Free Bytes .... 110
  Disk Fragmentation .... 112
  Disk Free Bytes and Disk Fragmentation .... 115
  User and System Generated Files .... 120
  File Path .... 123
Chapter Five: Conclusions .... 126
  Research Findings .... 126
  Primary Contributions .... 129
  Secondary Contributions .... 130
  Implications of the Research .... 130
  Future Directions .... 131
APPENDIX I SYSTEM CONFIGURATION SETTINGS .... 132
APPENDIX II .... 136
APPENDIX III .... 138
References .... 200

LIST OF TABLES

Table 1 Disk and system parameter names and types .... 18
Table 2 Deleted file parameter names .... 18
Table 3 User activity profile parameter names .... 18
Table 4 Disk parameter adjustments ....