Database Forensic Analysis with Dbcarver
Total Page:16
File Type:pdf, Size:1020Kb
Database Forensic Analysis with DBCarver James Wagner, Alexander Rasin, Tanu Malik, Karen Heart, Hugo Jehle Jonathan Grier School of Computing Grier Forensics DePaul University, Chicago, IL 60604 Pikesville, MD 21208 fjwagne32, arasin, tanu, [email protected], [email protected] [email protected] 3rd-party DB Forensic File Forensic DB ABSTRACT DBMS Scenario Query Recovery Tools Carving Tools Carving Tool The increasing use of databases in the storage of critical and all transactions YES Maybe NO YES sensitive information in many organizations has lead to an Good (if tool can (Can’t extract increase in the rate at which databases are exploited in com- •DB is OK recover logs) DB files) •RAM puter crimes. While there are several techniques and tools deleted rows Maybe NO NO YES snapshot (rarely (No deleted (Can’t extract available for database forensics, they mostly assume apriori for table available Customer available) row recovery) DB files) database preparation, such as relying on tamper-detection RAM (cached) NO NO NO YES software to be in place or use of detailed logging. Investiga- DB content (Can't handle (Can’t carve tors, alternatively, need forensic tools and techniques that DB RAM) DB RAM) work on poorly-configured databases and make no assump- Bad all transactions NO Maybe NO YES tions about the extent of damage in a database. • DB is (Database (Based on (Can’t extract (Readable In this paper, we present DBCarver, a tool for reconstruct- corrupt is dead) corruption) DB files) parts of data) ing database content from a database image without using • no RAM deleted rows NO Maybe NO YES any log or system metadata. The tool uses page carving snapshot for table (Database (No deleted (Can’t extract (Readable Customer is dead) row recovery) DB files) parts of data) to reconstruct both query-able data and non-queryable data (deleted data). We describe how the two kinds of data can be combined to enable a variety of forensic analysis questions Figure 1: State-of-the-art tools for database forensic hitherto unavailable to forensic investigators. We show the analysis. generality and efficiency of our tool across several databases cyber-crime involves databases in some manner, investiga- through a set of robust experiments. tors must have the capacity to examine and interpret the contents of database management systems (DBMSes). Many CCS Concepts databases incorporate sophisticated security and logging com- •Security and privacy ! Information accountability ponents. However, investigators often do their work in field and usage control; Database activity monitoring; conditions { the database may not provide the necessary log- ging granularity (unavailable or disabled by default). More- Keywords over, the storage image (disk and/or memory) itself might be corrupt or contain multiple (unknown) DBMSes. Database forensics; page carving; digital forensics; data re- Where built-in database logging is unable to address inves- covery tigator needs, additional forensic tools are necessary. Digi- tal forensics has addressed such field conditions especially in 1. INTRODUCTION the context of file systems and memory content. A particu- Cyber-crime (e.g., data exfiltration or computer fraud) is larly important and well-recognized technique is file carving, an increasingly significant concern in today's society. Fed- which extracts, somewhat reliably, files from a disk image, eral regulations require companies to find evidence for the even if the file was deleted or corrupted. There are, how- purposes of federal investigation (e.g., Sarbanes-Oxley Act ever, no corresponding carving tools or techniques available [3]), and to disclose to customers what information was for database analysis. compromised after a security breach (e.g., Health Insur- In this paper, we focus on the need for database carv- ance Portability and Accountability Act [2]). Because most ing techniques (the database equivalent of file carving) for database forensic investigations. Databases use an internal storage model that handles data (e.g., tables), auxiliary data (e.g., indexes) and metadata (e.g., transaction logs). All relational databases store structures in pages of fixed size through a similar storage model (similar across relational This article is published under a Creative Commons Attribution Li- databases and thus generalizable). File carvers are unable cense(http://creativecommons.org/licenses/by/3.0/), which permits distribution and reproduction in any medium as well as allowing derivative works, provided that to recover or interpret contents of database files because file you attribute the original work to the author(s) and CIDR 2017. carvers are built for certain file types (e.g., JPEG) and do 8th Biennial Conference on Innovative Data Systems Research (CIDR ’17) not understand the inherent complexity of database stor- January 8-11, 2017, Chaminade, California, USA. age. Database carving can leverage storage principles that are typically shared among DBMSes to generally define and the presence of audit logs but also requires other database reconstruct pages; hence, page carving can be accomplished logs to be configured with special settings that might be without having to reverse-engineer DBMS software. Fur- difficult to enforce in all situations. While this work is use- thermore, while forensic memory analysis is distinct from ful and complementary, in this paper we propose methods file carving, buffer cache (RAM) is also an integral part for database reconstruction for forensic analysis without any of DBMS storage management. Unlike file carving tools, assumptions about available logging, especially audit logs. database carving must also support RAM carving for com- In fact, our method is more similar to file carving [7, 13], pleteness. In practice, a DBMS does not provide users with which reconstructs files in the absence of file metadata and ready access to all of its internal storage, such as deleted accompanying operating system and file system software. rows or in-memory content. In forensic investigations the We assume the same forensic requirements as in file carv- database itself could be damaged and be unable to provide ing, namely absence of system catalog metadata and un- any useful information. Essentially, database carving tar- availability of DBMS software, and describe how carving gets the reconstruction of the data that was maintained by can be achieved generally within the context of relational the database rather than attempting to recover the original databases. In our previous paper, [15] we have described database itself. how long forensic evidence may reside within a database, We further motivate DBCarver by an overview of what cur- even after being deleted. In this paper, we delve deeper into rent tools can provide for forensic analysis in a database. Be- the process of page carving and describe a database agnostic cause investigators may have to deal with a corrupt database mechanism to carve database storage at the page level, as image, we consider two scenarios: \good" (database is ok) well as show how forensic analysis can be conducted by an and \bad" (database is damaged). As basic examples of investigator. forensic questions that can be asked, we use three simple Database carving can provide useful data for provenance queries (\find all transactions", \find all deleted rows" and auditing [8], and creation of virtualized database packages \find contents of memory"). Figure 1 summarizes what the [11], which use provenance-mechanisms underneath and are DBMS itself, 3rd party tools, file carving and database carv- useful for sharing and establishing reproducibility of database ing tools can answer under different circumstances. applications [12]. In particular, provenance of transactional or deleted data is still a work-in-progress in that provenance 1.1 Our Contributions systems must support a multi-version semi-ring model [5], In this paper, we present a guide for using database carv- which is currently known for simple delete operations and ing for forensic analysis based on the digital investigation not for delete operations with nested subqueries. Our tech- process described by the National Institute of Justice (NIJ) nique can reconstruct deleted data, regardless of the queries [1] and Carrier 2005[6]. We describe a database forensic that deleted the data. procedure that conforms to the rules of digital forensics: • We describe how \page-carving" in DBCarver can be 3. PAGE CARVING IN DBCARVER used to reconstruct active and deleted database con- tent. (Section 3) 3.1 Page Carving Requirements We assume a forensic framework for examination of digi- • We describe SQL analysis on reconstructed active and tal evidence as established by the National Institute of Jus- deleted data from disk-image and memory snapshots tice [1] and also described in detail by Carrier in Founda- to answer forensic questions regarding the evidence tions of Digital Investigations [6]. This framework identifies (Section 4). three primary tasks that are typically performed by a foren- • We evaluate the resource-consumption in DBCarver, sic investigator in case of a suspicious incident, namely (i) the amount of meaningful data it can reconstruct from evidence acquisition, (ii) evidence reconstruction, and (iii) a corrupted database, and the quality of the recon- evidence analysis. In acquisition, the primary task is to pre- structed data (Section 5). serve all forms of digital evidence. In this paper, we assume evidence acquisition corresponds to preserving disk images Section 2 summarizes related work in database forensics, and of involved systems. A forensic investigator, depending on we conclude in Section 6, also describing future work. the investigation, may also preserve memory by taking snap- shots of the process memory. Snapshots of the database pro- 2. RELATED WORK cess memory can be especially useful for forensic analysis A compromised database is one in which some of the meta- because dirty data can be examined for malicious activity. data/data or DBMS software is modified by the attacker to Once potential evidence is acquired and preserved, the in- give erroneous results while the database is still operational.