Forensic Analysis of Communication Records of Web-Based Messaging Applications from Physical Memory
Total Page:16
File Type:pdf, Size:1020Kb
Forensic Analysis of Communication Records of Web-based Messaging Applications from Physical Memory Diogo Barradas, Tiago Brito, David Duarte, Nuno Santos, and Lu´ıs Rodrigues INESC-ID, Instituto Superior Tecnico,´ Universidade de Lisboa, Portugal fdiogo.barradas, tiago.de.oliveira.brito, david.duarte, nuno.m.santos, [email protected] Keywords: Digital Forensics, Instant-Messaging, Memory Forensics, Web-Applications Abstract: Inspection of physical memory allows digital investigators to retrieve evidence otherwise inaccessible when analyzing other storage media. In this paper, we analyze in-memory communication records produced by web-based instant messaging and email applications. Our results show that, in spite of the heterogeneity of data formats specific to each application, communication records can be represented in a common application- independent format. This format can then be used as a common representation to allow for general analysis of digital artifacts across various applications, even when executed in different browsers. Then, we introduce RAMAS, an extensible forensic tool which aims to ease the process of analyzing communication records left behind in physical memory by instant-messaging and email web clients. 1 INTRODUCTION isting tools tend to be highly application-dependent. For example, Wong et al. present techniques that al- Instant-messaging (IM) and email applications such low for the recovery of digital artifacts for the Face- as Facebook’s chat and Gmail clients, respectively, book messaging service (Wong et al., 2011). How- are widely used communication services that allow ever, the proposed techniques cannot directly be ap- individuals to exchange messages over the Internet. plied to multiple other applications due to the het- Given the nature of the exchanged data, digital arti- erogeneity of data formats implemented by each ap- facts left by such applications may hold highly rele- plication. Furthermore, the fact that web-based ap- vant forensic value. This is particularly true if, from plications run inside a browser may interfere with such artifacts, it is possible to recover communication the durability of applications’ artifacts in memory, records of past conversations providing information for example due to the memory management pol- about the content of exchanged messages, identity of icy implemented by the browser. Existing works on communicating parties, or time-related information. browser forensics concentrate only on the extraction To assist forensic analysts in recovering such ar- of browser-specific artifacts (e.g., browsing history, tifacts, we aim to develop a forensic tool for extrac- web cache) leaving aside the recovery of application- tion of conversation records left by web-based mes- specific artifacts (Ohana and Shashidhar, 2013). saging applications in physical memory dumps. As In this paper we make three key contributions. opposed to their native counterparts, web-based ap- First, we present a digital forensics study aimed plications run inside a browser and are becoming in- to systematically analyze the digital artifacts left creasingly popular because users do not need to in- in memory by several popular IM and email web- stall them on their computers; all they need to do is to applications when executed on various browsers. We open a browser and visit a URL. By focusing on phys- successfully identified and retrieved IM communica- ical memory analysis, our goal is to complement the tion records from web-applications such as Facebook, functionality of existing forensic tools which focus on Twitter and Skype, as well as email records from the analysis of persistent state, e.g., local logs (Yang Outlook and a generic Roundcube email web-client. et al., 2016; Al Mutawa et al., 2011), and to address a Our study helps to characterize how the communi- latent need for the analysis of high-level data present cation records of web-based messaging applications in such memory dumps (Simon and Slay, 2009). are typically represented and to identify how browser- Although some prior work employs memory specific mechanisms may affect the durability of such forensic techniques on messaging applications, ex- records in memory (Section 2). Second, we introduce the design and implementa- in data representation and platforms may impact the tion of a forensic tool called RAMAS, which consists way communication records are loaded into memory of a collaborative and extensible framework for anal- and contribute for the absence of a common model of ysis of communication records from volatile memory. the structure of communication records among differ- RAMAS is able to extract such records from mul- ent implementations of browsers and operating sys- tiple web-based messaging applications and display tems. This implies that a tool developed for analyz- the extracted records on a user-friendly timeline. RA- ing this kind of evidence would exhibit the additional MAS is designed in a modular fashion so as to accom- complexity of having to take into account such differ- modate an ever-growing number of applications and ences between record structures, even when analyzing allow collaborative development and maintenance of a single application. We aim to assess whether there is the system by independent forensic analysts. This a common model which allows for the interpretation goal is achieved through the implementation of ex- of textual web-application data lingering in memory. traction modules: each module contains a set of (sim- How long do messages persist in memory? The ple) rules that allow RAMAS to extract the records of persistence of in-memory data structures may be af- a specific application and represent them on a com- fected by the browser where the web-based messag- mon application-independent format (Section 3). ing application runs. First, we must ascertain whether Lastly, we present an experimental evaluation the browser runtime environment imposes limitations of our framework. To this end, we used RAMAS on the ability to recover communication records from for conducting analysis over the data extracted from physical memory. In particular, to provide the inter- memory chips with sizes typically found in commod- action with web-applications, web-browsers rely on ity hardware. Our evaluation shows that RAMAS is different layout engines (ex. Blink, Gecko) which af- efficient, e.g., it can process communication records fect the way a browser hosts, renders or executes web spawning from six different applications, in an 8 GB content. Similarly, several implementations of oper- memory dump, in roughly about three minutes. We ating systems target different platforms (ex. worksta- also enact a use case for demonstrating the usefulness tion, mobile) and are expected to apply disparate low- of our framework’s evidence presentation capabilities level mechanisms for performing memory manage- which may help digital investigators in uncovering so- ment. Furthermore, different user interactions may phisticated correlations among evidence from several trigger the erasure or replacement of potential evi- applications or across memory images (Section 4). dence in volatile memory. For instance, data pertain- ing to a given web-application may be evicted shall a user navigate to a different browser tab or terminate 2 DIGITAL FORENSICS STUDY her browsing session. Finally, modern browsers im- plement private browsing execution modes which en- This section presents the digital forensics study that ables users to browse the web while disabling both the we carried out in order to assess the existence of com- browsing history and web cache. This feature is im- munication records in physical memory produced by plemented by popular browser vendors and is known web-based messaging applications. This study lays under different aliases, such as Incognito, InPrivate or the ground for the subsequent development of a foren- PrivateBrowsing. We must evaluate whether the use sic tool for automatic extraction of such records. of private browsing may compromise the existence of digital artifacts lingering in memory. 2.1 Goals of the Forensics Study 2.2 Experimental Methodology More concretely, the goal of this study is to check whether and in which conditions communication To investigate whether and in which conditions mes- records can be obtained from memory dumps. In par- sages can be obtained from memory, we performed ticular, our research is driven by two key questions: an experimental study of several web-based messag- How are messages represented in memory? The ing applications. In particular, we analyzed digi- programmers of web-based messaging applications tal artifacts concerning IM records for Facebook’s are free to implement them using a range of differ- chat, Facebook Messenger’s chat, Skype, Twitter’s ent technologies. Some design decisions comprise the Direct Messages, Google Hangouts, WhatsApp, Tele- choice of front-end and back-end programming lan- gram, and Trillian. We also analyzed communication guages (ex. Javascript, PHP), others involve select- records of three email web-clients: Outlook, a generic ing the data representation format of communication Roundcube email client, and Gmail. records (ex. JSON, XML, binary). This heterogeneity These applications were tested on different 2 Table 1: Feasibility of recovering web-based messaging application records from different browsers. Web-Application Browser Google Chrome Mozilla Firefox Opera Microsoft Edge v55.0 v50.0.1 v42.0 v38.14393.0.0 Facebook