Testing the National Software Reference Library
Neil C. Rowe
U.S. Naval Postgraduate School
Monterey, California, USA
[email protected]

Forensics of directory metadata

- We need tools to quickly find key information on a drive without searching file contents.
- File and directory metadata is a big help in characterizing drives (or partitions on a cloud).
- We are developing a tool, “Dirim”.
- Our testbed, the “Real Drive Corpus”, consists of drives purchased in 22 countries, mostly China, Mexico, Israel, Palestine, and India. It now holds 2420 drives and 44 million files.
- It also includes wireless and storage devices.
- For analysis, we exclude files with hashes found in the National Software Reference Library Reference Data Set (NSRL RDS). This removes 30% of the files and 5% of the hashes.
- Research question: Just how good is the NSRL?

The Dirim file-metadata analysis system

[Diagram] Components:
- Disk or flash drive
- File-directory metadata (in XML/DFXML)
- Common hash codes (from NSRL, etc.)
- Simplified and standardized metadata
- File classification mapping
- Data with deleted-file corrections
- Data excluding common files
- File classifications
- Statistical summaries
- Data clusters
- Special-feature analysis
- Suspiciousness analysis
- Graphical display of analysis results

File metadata we extract from a disk image

Ordinal features                  Nominal features      Boolean features
File size                         Drive name            Allocated?
Access minus creation time        File name             Compressed?
Access minus modification time    File extension        Encrypted?
Modification minus creation time  Top-level directory   Empty?
Depth in file hierarchy           Immediate directory
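
The feature table above, together with the exclusion of files whose hashes appear in the NSRL RDS, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Dirim's actual code: the `FileRecord` class, its field names, and the functions `extract_features` and `exclude_known` are hypothetical. It further assumes that timestamps are in seconds, that depth counts the directories above the file, that "Empty?" means a size of zero bytes, and that the known hashes have already been loaded into a set.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Iterable, Iterator, Optional


@dataclass
class FileRecord:
    """Hypothetical per-file metadata record (field names are illustrative)."""
    drive: str
    path: str                 # path within the drive, '/'-separated
    size: int                 # bytes
    created: float            # timestamps, assumed to be seconds since the epoch
    modified: float
    accessed: float
    sha1: Optional[str] = None
    allocated: bool = True
    compressed: bool = False
    encrypted: bool = False


def extract_features(rec: FileRecord) -> dict:
    """Compute the ordinal, nominal, and Boolean features for one file."""
    # Split the path and drop the root marker so only real names remain.
    parts = tuple(p for p in PurePosixPath(rec.path).parts if p != "/")
    name = parts[-1] if parts else ""
    dirs = parts[:-1]
    return {
        # Ordinal features
        "size": rec.size,
        "access_minus_creation": rec.accessed - rec.created,
        "access_minus_modification": rec.accessed - rec.modified,
        "modification_minus_creation": rec.modified - rec.created,
        "depth": len(dirs),  # assumption: number of directories above the file
        # Nominal features
        "drive": rec.drive,
        "filename": name,
        "extension": PurePosixPath(name).suffix.lstrip(".").lower(),
        "top_directory": dirs[0] if dirs else "",
        "immediate_directory": dirs[-1] if dirs else "",
        # Boolean features
        "allocated": rec.allocated,
        "compressed": rec.compressed,
        "encrypted": rec.encrypted,
        "empty": rec.size == 0,  # assumption: "empty" means zero bytes
    }


def exclude_known(records: Iterable[FileRecord],
                  known_hashes: Iterable[str]) -> Iterator[FileRecord]:
    """Yield only records whose hash is not in the known-hash set.

    Records without a hash are kept, since they cannot be matched.
    """
    known = {h.upper() for h in known_hashes}
    for rec in records:
        if rec.sha1 is not None and rec.sha1.upper() in known:
            continue
        yield rec
```

Typical use would be to run `exclude_known` over all records from a drive first, then call `extract_features` on each survivor, so that later statistics and clustering see only the less common files.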