A Forensic Analysis Method for Redis Database Based on RDB and AOF File
JOURNAL OF COMPUTERS, VOL. 9, NO. 11, NOVEMBER 2014
© 2014 ACADEMY PUBLISHER doi:10.4304/jcp.9.11.2538-2544

Ming Xu*
College of Computer, Hangzhou Dianzi University, Hangzhou, China
Email: [email protected]

Xiaowei Xu, Jian Xu, Yizhi Ren, Haiping Zhang, Ning Zheng
College of Computer, Hangzhou Dianzi University, Hangzhou, China
Email: [email protected], { jian.xu, renyz, zhanghp, nzheng }@hdu.edu.cn

Abstract—Redis is a widely used non-relational and in-memory database system. It holds a large amount of information both in memory and in the file system, which is of great significance to forensic analysis. This paper mainly proposes a forensic analysis method for Redis based on the RDB and AOF files. A method of extracting useful information from the RDB backup file is proposed based on the data storage mechanism described in this paper. A method of reconstructing the write operation statements from the AOF file is also provided. Finally, the method of directly analyzing data from memory is shown. The experimental results demonstrate the effectiveness of our method. Most of the data could be extracted from the RDB and AOF files, which provides important information for forensic investigators.

Index Terms—Redis, NoSQL, database forensics, digital forensics

I. INTRODUCTION

Database systems play an important role in every aspect of our life. They typically hold massive amounts of data and form the basis for various applications [1-2]. Relational databases work quite well when the data stored in them is highly structured with strict relations between them [3]. However, a lot of applications today use data structures like lists, sets, hashes, and graphs. Storing this less structured data in traditional relational databases would require complex mapping algorithms and often leads to poor performance. In addition to this, if the data set is too large to fit into one server, the database should be partitioned into multiple servers, which is a weak point of many relational databases due to their complex deployment and bad performance. In general, if we are dealing with large amounts of data or the data is less structured, we might have better options than relational databases.

NoSQL databases provide an alternative way to store data other than relational databases [4]. NoSQL databases are most useful when working with a huge quantity of data that does not require a relational model. Aside from being non-relational, most NoSQL databases are also distributed, open-source and horizontally scalable. There are approximately 150 different NoSQL databases. They could be roughly divided into four categories: document store databases, key-value store databases, graph databases and BigTable Column Family Store databases. Among them, key-value store databases have the simplest form for storing data. Each key is mapped to a value containing arbitrary data. Redis is the most widely used key-value store database and is rapidly gaining popularity all across the globe [5].

Redis forensics is of great importance in many aspects. First, Redis is widely used in many companies to store large amounts of data and would thus be a primary target in a forensic investigation. Second, some data in Redis might be mistakenly removed by database users. Redis forensics provides a way to recover this deleted data. Third, Redis is a potential target of database intrusions that involve stealing or tampering with the database data. The data recovered in Redis could be used to prove a database security breach and determine the scope of a database intrusion. Finally, the study of Redis forensics could help the study of some other NoSQL databases with a similar key-value storage mechanism, like Riak and Cassandra. The study of Redis forensics also provides insight into the forensic techniques of some other memory databases. We could analyze the disk backup file instead of the memory to extract the database data.

Both the RDB file and the AOF file are of great forensic value in Redis forensics. First, while extracting and analyzing data from memory directly is very difficult, it is relatively easy to parse the RDB file and AOF file instead and extract the data. Second, some deleted data in memory might still be found in the RDB file. Third, the RDB file and AOF file could be used to recover important data when the Redis server crashes and the data in memory is lost. Last but not least, by examining the data extracted from the AOF file, we could learn what write operations were performed in Redis.

The goal of this paper is to show how to extract data from the Redis RDB backup file. A method to parse the AOF log file and reconstruct write operation statements is also proposed. We also briefly explain how the data is stored in memory.

In Section 2, we provide a brief overview of related work in the field of database forensics. In Section 3, we describe the structure of the Redis RDB and AOF files. Section 4 shows our algorithms to extract data from these files and Section 5 discusses the corresponding experiment. We also briefly cover the topic of Redis memory forensics in Section 6. We conclude our work in Section 7.

II. RELATED WORK

Database forensics is a very important research field that has received little research attention in recent years. Martin S Olivier [6] believed the lack of research is due to the inherent complexity of databases that is not fully understood in a forensic context. Harmeet Kaur Khanuja [7] discussed various methodologies for tamper detection in databases and outlined challenges and opportunities in database forensics. He [8] also proposed a framework that builds the expert system for database analysis in two stages. Patrick Stahlberg [9] demonstrated that existing database systems fail to securely remove deleted data and remnants of past operations, making the recovery of deleted data possible. Peter Frühwirt [10-12] described the file format of the MySQL Database with InnoDB Storage Engine and proposed methods for recovering basic SQL statements by analyzing InnoDB’s redo logs. Paul M. Wright [13] introduced advanced Oracle forensics techniques, which could ensure the safety and security of Oracle data. David Litchfield [14-19] performed forensic analysis of a compromised Oracle database server and discussed every aspect of Oracle forensics in his series of papers. Kevvie Fowler [20] defined, established, and documented SQL server forensic methods and techniques in his book.

Josiah L. Carlson [21] introduces Redis and explains how to use Redis effectively in his book. Tiago Macedo and Fred Oliveira [22] provide recipes for a variety of issues a Redis user will face in their book.

III. REDIS INTERNALS

In this section, we explain in detail the structure of the Redis RDB file and AOF file, which forms the basis of our algorithms to extract useful data from these files.

A. RDB File Format

By default, the whole Redis dataset resides in volatile memory. But Redis would also save snapshots of all the data in memory to an RDB file on disk. When a Redis server starts, the RDB file would be loaded into memory. The RDB file is useful for purposes like backup and disaster recovery. Database users can copy RDB files to other machines and data centers even when the database is still running. By analyzing the RDB backup file, forensic investigators could extract the Redis data without the need to analyze the data in memory, which is fairly

TABLE I.
HEXADECIMAL STRUCTURE OF RDB FILE

0x00000000  52 45 44 49 53 30 30 30 36 FE 00 FC 9C 45 6D 60  REDIS0006…….
0x00000010  3F 01 00 00 00 03 61 67 65 C0 14 00 04 6E 61 6D  ……age….nam
0x00000020  65 04 4A 6F 68 6E FF 43 70 8B 5D AB 68 4A 84     e.John………

TABLE II.
MEANING OF THE HEXADECIMAL VALUE OF THE RDB FILE

Offset  Length  Value                                      Meaning
0       5       52 45 44 49 53                             Magic number: REDIS
5       4       30 30 30 36                                Redis RDB Version: 6
9       2       FE 00                                      Database number: 0
11      27      FC 9C 45 6D 60 3F 01 00 00 00 03 61 67 65  Database data, stored as key-value pairs
                C0 14 00 04 6E 61 6D 65 04 4A 6F 68 6E
38      1       FF                                         End of File marker
39      8       43 70 8B 5D AB 68 4A 84                    Checksum

Figure 1. Overall Structure of an RDB File

Redis stores data in a database in the form of key-value pairs. Table 3 shows the meaning of the bytes in a key-value pair in Table 1.

TABLE III.
MEANING OF THE HEXADECIMAL VALUE OF A KEY-VALUE PAIR

Offset  Length  Value                       Meaning
11      9       FC 9C 45 6D 60 3F 01 00 00  Expire time: 30 seconds
20      1       00                          Type of value: 0
21      4       03 61 67 65                 Key: age
25      2       C0 14                       Value: 20

In Redis, each key could be associated with an expire time field, and will be removed automatically by the Redis server when the specified amount of time has elapsed. The expire time is stored as an absolute Unix timestamp in milliseconds. The type of value field tells us which encoding method Redis uses in order to store the value field. The key field is a Redis string, and the value field is stored based on the encoding method described in the type of value field.

Redis is known for its rich support of various data structures. The value field in a key-value pair could be one of five data structures in Redis, which greatly facilitates the work of programmers. In the next few subsections, we will discuss how these data structures are stored in an RDB file.
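
The byte layout listed in Tables 1 to 3 can be walked programmatically. The following Python fragment is a minimal sketch under stated assumptions, not the extraction algorithm of this paper: the names `read_string` and `parse_rdb` are hypothetical, the sample bytes are transcribed from Table 1, and only the fields that occur in that sample are handled (the FE database selector, the FC millisecond expire time, value type 0, length-prefixed strings, the C0 8-bit integer encoding, and the FF end marker). The 8-byte checksum is reported but not verified.

```python
import struct

# Sample RDB bytes transcribed from Table 1.
SAMPLE = bytes.fromhex(
    "52 45 44 49 53 30 30 30 36 FE 00 FC 9C 45 6D 60 "
    "3F 01 00 00 00 03 61 67 65 C0 14 00 04 6E 61 6D "
    "65 04 4A 6F 68 6E FF 43 70 8B 5D AB 68 4A 84"
)


def read_string(buf, pos):
    """Read one Redis string starting at pos; return (value, next_pos).

    Assumption: only two encodings are handled, which is all the sample needs.
    """
    first = buf[pos]
    if first >> 6 == 0:
        # Top two bits are 00: the low 6 bits are the length of the raw bytes.
        length = first & 0x3F
        start = pos + 1
        return buf[start:start + length].decode("ascii"), start + length
    if first == 0xC0:
        # Special encoding: the next byte is an 8-bit signed integer.
        return struct.unpack_from("<b", buf, pos + 1)[0], pos + 2
    raise ValueError(f"encoding 0x{first:02X} is not handled in this sketch")


def parse_rdb(buf):
    """Walk the file from the magic number to the end-of-file marker."""
    if buf[:5] != b"REDIS":
        raise ValueError("missing REDIS magic number")
    result = {"version": int(buf[5:9].decode("ascii")), "databases": {}}
    pos, db, expire_ms = 9, None, None
    while True:
        opcode = buf[pos]
        if opcode == 0xFF:
            # End of file: the remaining 8 bytes are the checksum.
            result["checksum"] = buf[pos + 1:pos + 9].hex()
            return result
        if opcode == 0xFE:
            # Database selector; assumes the number fits in one length byte.
            db = buf[pos + 1]
            result["databases"][db] = []
            pos += 2
        elif opcode == 0xFC:
            # Expire time: 8-byte little-endian Unix timestamp in milliseconds.
            expire_ms = struct.unpack_from("<Q", buf, pos + 1)[0]
            pos += 9
        elif opcode == 0x00:
            # Value type 0 (string): key first, then value.
            key, pos = read_string(buf, pos + 1)
            value, pos = read_string(buf, pos)
            result["databases"][db].append(
                {"key": key, "value": value, "expire_ms": expire_ms}
            )
            expire_ms = None  # an expire time applies to one key only
        else:
            raise ValueError(f"opcode 0x{opcode:02X} is not handled in this sketch")


parsed = parse_rdb(SAMPLE)
for db_number, pairs in parsed["databases"].items():
    for pair in pairs:
        print(db_number, pair)
```

Applied to the sample, the sketch recovers the key "age" with the integer value 20 and an expire time, followed by the key "name" with the value "John", which matches the offsets in Tables 2 and 3. A real RDB file would also need the longer length encodings and the remaining value types, which are deliberately left out here.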