DESENSITIZATION: Privacy-Aware and Attack-Preserving Crash Report

Ren Ding∗, Hong Hu∗, Wen Xu, Taesoo Kim Georgia Institute of Technology {rding, hhu86, wen.xu, taesoo}@gatech.edu

Abstract—Software vendors collect crash reports from end- To protect users’ privacy while supporting timely bug users to assist in the debugging and testing of their products. fixing, we need to remove users’ sensitive data from crash However, crash reports may contain users’ private information, reports before sending them to developers. We call this process like names and passwords, rendering the user hesitant to share the DESENSITIZATION. Several proposals aim to desensitize the reports with developers. We need a mechanism to protect users’ crash report, but unfortunately, we have not seen any real-world privacy in crash reports on the client side while keeping sufficient deployment. Techniques like Scrash [13] rely on developers information to support server-side debugging and analysis. to manually annotate sensitive data to remove them from In this paper, we propose the DESENSITIZATION technique, crash reports. However, the manual annotation is usually time- which generates privacy-aware and attack-preserving crash re- consuming and error-prone, and does not work on closed- ports from crashed executions. Our tool adopts lightweight source programs. Pattern-based search is a general method methods to identify bug-related and attack-related data from the to identify particular privacy-sensitive data, like URLs and memory, and removes other data to protect users’ privacy. Since email accounts [69], but it cannot handle program-specific data. a large portion of the desensitized memory contains null bytes, we store crash reports in spare files to save the network bandwidth Another set of works aims to desensitize the crashing input, and the server-side storage. We prototype DESENSITIZATION and which should make the program crash in a similar manner as the apply it to a large number of crashes of real-world programs, original [19], [80], [24], [4]. However, these proposals rely on like browsers and the JavaScript engine. The result shows that computation-heavy techniques, such as symbolic execution [15], our DESENSITIZATION technique can eliminate 80.9% of non- [22], [20], [16] or taint analysis [76], [61], [68], causing zero bytes from coredumps, and 49.0% from minidumps. The significant burdens to users or developers. More importantly, desensitized crash report can be 50.5% smaller than the original they require a non-trivial change to the current crash-report one, which significantly saves resources for report submission systems, which mainly use crash files instead of crashing inputs and storage. Our DESENSITIZATION technique is a push-button to classify and prioritize different bugs [2], [5], [36], [58], [79]. solution for the privacy-aware crash report. Given the understanding of the current obstacles, we identify I.INTRODUCTION three desirable properties that a DESENSITIZATION technique should have for wider deployment: Software vendors collect crash reports from their end-users to improve the stability and security of their products [2], [5], • Privacy-aware: The technique removes users’ private [36], [58], [79]. The crash report is either a coredump file information from the crash report as much as possible. that captures the CPU context and the memory content of the • Attack-preserving. The desensitized crash report still crashed program [1], [51], or the program input that makes the contains sufficient information for effective bug analysis. If execution crash [19], [80], [24], [4]. Unfortunately, in either the execution crashes due to a failed attack, the technique case, the crash report may contain privacy-sensitive information should keep the attack residue in the report. of individual users. For example, a recent study on 2.5 million • Widely applicable. The technique should be lightweight crash reports of a popular web browser identifies an enormous for end-users and compatible with current crash-report amount of private user data, including user names, passwords, systems. It should work on closed-source programs, cover session tokens, and personal contact information [69]. A crash various types of sensitive data, and generate minimal crash report with private information is unwelcome to both users reports for resource-limited environments. and developers. While users are unwilling to share the crash In this paper, we propose a bug-oriented and attack-oriented reports due to the concern of leaking their private information, method to desensitize crash reports so that they capture a developers do not want to collect privacy-sensitive reports snapshot of the crashed program with minimal risk of leaking where any storage breach may lead to bad publicity, and even users’ private information. Our observation is that although financial liability. there are many bugs on the client side, only a few general

∗ techniques are used on the server side to diagnose bugs The two lead authors contributed equally to this work. and detect attacks. For example, call-stack-based analysis is widely used to classify and prioritize crash reports [57], [8]. Therefore, DESENSITIZATION keeps only the information for Network and Distributed Systems Security (NDSS) Symposium 2020 commonly used bug-analysis techniques and removes all other 23-26 February 2020, San Diego, CA, USA ISBN 1-891562-61-4 data to protect users’ privacy. We design one general and four https://dx.doi.org/10.14722/ndss.2020.24428 lightweight techniques to identify bug-related and attack-related www.ndss-symposium.org data from the crashed memory snapshot. The general technique 1 #define MAX_LEN 64 stack heap ROP chain gadgets shellcode 2 typedef struct { char *ptr; void (*print)(); } String; pop rsi xor rdx, rdx S 3 void printString() { ... } passwd … 4 void vuln(char *input) { ret G1 add rbx, 0x8 5 char passwd[MAX_LEN]; load_passwd(passwd); … S’ buf ptr … 6 char *buf=( char *) malloc(MAX_LEN); pop rdi 7 String*string= (String*) malloc( sizeof(String)); string print ret mov [rdx], rbx 8 string->ptr= buf; string->print=&printString; … G2 syscall 9 strcpy(buf, input); 10 string->print(); xchg rax,rsp mprotect 11 } ret Gpivot G3

(a) Vulnerable source code. (b) Memory layout before crash.

Fig. 1: A vulnerable code snippet and the memory layout before the execution crashes. The code has a heap-based buffer overflow bug at line 9. Attackers corrupt a function pointer to pivot the stack, launch ROP attacks, and eventually run the shellcode. However, due to the incorrect address, the program crashes and generates a coredump. The stack contains user password — a sensitive private data value.

scans the memory snapshot to identify all pointers, as most Type Semantics Examples crashes and attacks stem from corrupted pointers. For each type Cookies website tokens Geo=(4.3267961,52.096922) of prevalent bug and attack, we design one technique to identify Forms current user-input

specific information or tailor existing techniques for client-side Autofilling user-input history efficiency. Specifically, we design a heap module to identify Passwd: all heap metadata, a format string module to detect malformed TABLE I: Possible privacy leakage from minidumps of , format strings, an ROP module to search for suspicious ROP including website tokens and user inputs. See§ V-B3 for more details. gadgets, and a shellcode module to collect possible payload. The modular design makes our tool extensible to support future bug-analysis and attack-detection techniques. manual inspection on several attack-caused crashes confirms The prototype of our DESENSITIZATION technique can that DESENSITIZATION keeps all attack residues necessary for generate desensitized crash reports from either crashed exe- developers to analyze these attacks. DESENSITIZATION can cutions or existing reports. For the runtime report generation, complete its work with reasonable resources, taking less than our tool installs a signal handler to take over the execution 15 seconds process each crash. Therefore, DESENSITIZATION once the process crashes. It applies DESENSITIZATION on is a practical yet effective method to generate privacy-aware the memory snapshot and generates the desensitized crash and attack-preserving crash reports. reports that are compatible with existing formats, such as coredumps (the default format on Linux [51]) and minidumps We make the following contributions in this paper: (the default format on Windows and many large programs, • Privacy-first crash report. We design a new format of like Firefox [1]). It can also desensitize existing reports to crash report that can protect users’ privacy while retaining remove private information. Currently, our prototype supports sufficient information for server-side bug detection and both 32-bit and 64-bit Linux ELF programs. We plan to add attack analysis. The new format is compatible with most the support of Windows PE binaries in future work. mainstream crash-report systems. To understand the efficacy and efficiency of our tool on • Techniques to generate a new report. We implement protecting users’ privacy, we evaluate it on four real-world one general and four specific techniques to construct crash applications and one CTF (capture-the-flag) challenge: the reports in the new format from crashed processes. Our multimedia player ffmpeg, the PHP language interpreter php, tool is easily extensible for future analysis techniques. the JavaScript engine chakra, the web browser firefox, and the • End-to-end system. Our evaluation shows that our tool DEFCON 2015 challenge tachikoma. We collect 13,390 crash is practical for generating crash reports that protect user reports from 86 known bugs and emulated attacks, including privacy while keeping attack residues. 11,875 coredumps and 1,515 minidumps. The result shows DESENSITIZATION is a push-button solution for privacy- that on average, DESENSITIZATION can remove 80.9% of all aware crash reports; developers just have to include the library non-zero bytes from each collected coredump and remove of DESENSITIZATION in their products. We have released the 49.0% of that from each minidump. Further, DESENSITIZATION source code of our tool and the collected crash reports at helps reduce the file sizes of coredumps by half, which saves https://github.com/sslab-gatech/desensitization. network bandwidth for end-users to report the bug and reduces disk usage for developers to store the reports. Meanwhile, the desensitized crash reports still contain sufficient information to II.PROBLEM DEFINITION support common bug-analysis and attack-detection techniques. In this section, we first present one example to illustrate the Specifically, the call-stack-based analysis produces the same DESENSITIZATION problem. Then we define our threat model. result for reports before and after applying DESENSITIZATION. For example, we submit the original and desensitized reports A. Motivating Example to socorro – the crash-analysis framework of firefox. The platform groups them into the same group, indicating that our We inspect the source code of firefox and its runtime, and tool keeps the necessary information for bug classification. Our identify several scenarios in which users’ private information

2 Techniques CallStack IP Signature RevExec Schemes Heap Struct ROP Shcode

Adobe-CR [2] ✓ ✓ CAVER [47] ✓ Apport [78] ✓ ✓ DANGNULL [46] ✓ Backtrace [8] ✓ ✓ HOTracer [42] ✓ -CR [36] ✓ ✓ KOP [17] ✓ CREDAL[81] ✓ Polychronakis et al.[66] ✓ Crash Graphs [45] ✓ Polychronakis et al.[65] ✓ KLEE*[15] ✓ ✓ ROPecker [21] ✓ Liblit et al.[48] ✓ ✓ ROPMEMU [37] ✓ Mac-CR [5] ✓ ✓ ROPscan [67] ✓ Modani et al.[56] ✓ SBCFI [63] ✓ POMP[82] ✓ SHELLOS[75] ✓ Rebucket [27] ✓ SigGraph [49] ✓ REPT [25] ✓ RETracer [26] ✓ DESENSITIZATION yes yes yes yes ✓ Schroter et al.[70] TABLE III: Anomaly detection schemes and their detection targets. Socorro [57] ✓ ✓ WER [33] ✓ ✓ The last line shows whether or not the desensitized reports still contain !analyze* [55] ✓ ✓ necessary information to identify these attacks.

DESENSITIZATION yes yes yes TABLE II: Crash analysis and triage techniques, and underlying mechanisms. CallStack relies on call stacks at the crashing point to severity of the privacy leakage issue [69]. The straightforward find unique bugs; IP uses the crashed instruction address to deduplicate solution is to remove all privacy-sensitive data from the crash crashes; Signature adopts program-specific heuristics to the callstack reports before sending them to developers. However, existing techniques for more effective triage. ∗ indicates the uncertainty due to works either rely on developers to manually annotate such the lack of proper documentation found. The last line shows whether variables or depend on non-scalable inter-procedural data-flow or not the desensitized crash reports still support these techniques. analysis to find them [13]. We need an effective and efficient solution to eliminate sensitive data from crash reports. Attack/Bug Residue. To support bug analysis and attack is dumped into the crash report. Table I provides a summary detection on the server side, the crash report should contain of our findings, where website cookies, user-inputs, and input sufficient information of the bug or residue of the attack. For history could be leaked. We leave the detailed discussion of example, in Figure 1b, we can find various data pieces used these cases in§ V-B3. Instead, we use a simplified example to by the attacker, including the address of the stack-pivot gadget demonstrate the privacy issue in crash reports. Gpivot, several ROP gadget addresses (e.g., G1, G2, G3) and the malicious shellcode S. With this information, developers Figure 1 shows a vulnerable program that is exploited by may determine the reason for the crash — a failed exploit attackers, but the attack fails due to an invalid memory access. against a heap-based buffer overflow. They can also identify The code in Figure 1a first loads the user’s password to astack the exploitation methods used in the attack, like stack pivoting, buffer passwd (line 5). Then, it allocates two heap objects, buf ROP, and shellcode injection. Such information helps developers and string, and initializes their fields accordingly (line 6-8). pinpoint the bug location and enables them to correctly estimate However, the strcpy at line 9 may overwrite the heap memory the bug severity (i.e., highly risky). Finally, they can fix the beyond buf if the untrusted input has more than MAX_LEN bytes, bug by patching line 9 to avoid future crashes and attacks. leading to the corruption of the string object. In that case, the indirect function call at line 10 may divert the control-flow to an As we can see from the example, a DESENSITIZATION attacker-expected location. Figure 1b shows the memory layout technique contains two contradictory goals: one is to reduce of a failed attack, where attackers overwrite the function pointer the user information from the crash report as much as possible string->print with the address of a stack-pivot gadget. The to protect the user’s privacy; another goal is to keep as much gadget changes rsp to the attacker-controlled memory region, information as possible to afford effective bug diagnosis and which contains a sequence of code addresses of ROP gadgets. attack analysis. Our work starts from widely used bug-analysis After running several gadgets, the attack finally executes the techniques, and aims to generate a minimal crash report that shellcode. Due to the incorrect address, the execution crashes at supports developer-side bug analysis and repair. an invalid memory access. The operating system or a user-space library (e.g., breakpad [35]) will generate a crash report that B. Bug-analysis and Attack-detection Techniques captures the memory status. Once the user agrees, the crash report will be sent to developers for bug analysis. We performed a study of existing bug-analysis and attack- detection tools used on the server side, aiming to understand Privacy Concern. The crash report may contain users’ private the commonality of these techniques. Table II and Table III information. For example, the stack variable passwd in Fig- show our study results about crash-classification techniques and ure 1b contains users’ password, which should be confidential. exploit-detection tools. Generally, most classification techniques However, current crash-report systems include all memory rely on either call stacks, program counters, or customized content in the coredump or at least all stack content in the signatures at crash sites for efficient triage. For example, minidump, which will leak the users’ password to developers. In socorro [57], developed and used by the firefox team, a recent study, researchers identified 20,000 session tokens and combines call-stack information and several program-specific 600 passwords from browser crash reports, demonstrating the heuristics to group different crash reports. In addition, heavier

3 techniques, such as reverse execution [25], [82], [26], [81], are Module Collected Data Related Bugs & Attacks also proposed to examine crashes in greater detail. Among Pointer Code ptrs ROP [73], JOP [11], COP [18], GOT.PLT all anomaly-detection mechanisms in Table III, four types of corruption, vtable injection [71], etc. information are commonly used to detect attacks, i.e., the Data ptrs DOP [40], type confusion [47], use-after- heap metadata, program-specific structures, ROP gadgets, and free [46], double-free [46], out-of-bound, etc. Heap Chunk size heap overflow, overlapping chunks [74], heap shellcode. For example, several works aim to detect ROP attacks spray, etc. by identifying ROP gadgets during program execution or from PREV_IN_USE use-after-free [46], double-free [46], unsafe- the execution trace [21], [37], [67], [63], [75]. unlink, bin-dup, house-of-* series [74], etc. ROP Gadgets & args ROP [73], JOP [11], COP [18], etc. ROPscan as an example. ROPscan [67] takes two steps to Fmtstr Strings & args format string attack identify ROP attacks from network packets or memory buffers: Shcode Payloads shellcode injection it first sequentially scans every 4-byte value from the input TABLE IV: Information collected by DESENSITIZATION, and to find potential code pointers that point to executable code potentially covered vulnerabilities and attacks. Our system supports locations (assuming a 32-bit system); it then starts to execute easy extension for developers to customize DESENSITIZATION for a the pointed-to code and identifies gadgets based on the length program-specific crash format, like nullifying more pointers to save of each block and the target of the ending instruction. If the space or keeping more data for debugging purposes. number of distinct gadgets exceeds the threshold, ROPscan concludes the input contains a ROP payload. ROPscan can detect ROP gadgets from the coredump generated in Figure 1b. crashes are caused by corrupted pointers, while many attacks Specifically, given the whole memory snapshot captured in manipulate pointers to achieve final goals. Second, our tech- the coredump, ROPscan first identifies the 4-byte value G pivot nique tackles other data that can provide debugging aids to track pointing to the executable code section. However, it cannot down multiple types of vulnerabilities (§III-B). For example, concretely execute this gadget, as the value of rax is missing. heap metadata is useful to identify heap overflows and use- Then, it continues to identify the next code pointer G . After 1 after-free bugs, and therefore we follow the design of each heap that, it tries to execute the code at G1, which will reveal more manager to find all heap structures. Third, DESENSITIZATION gadget addresses, i.e., G , G . ROPscan treats these three 2 3 considers possible exploit residues in the memory and uses consecutive ROP gadgets as an indicator of the ROP attack. heuristic-based methods to identify the related data of popular We design DESENSITIZATION to retain adequate infor- attack vectors. Table IV provides a list of information collected mation in the crash report to support as many server-side by DESENSITIZATION and potential vulnerabilities and attack analysis techniques as possible, while keeping the client-side vectors that are related to the collected information. Note computation lightweight to avoid heavy burden on end-users. that some vulnerabilities or attacks require a combination Although existing techniques may have different algorithms of several modules of DESENSITIZATION to get sufficient to parse data, some types of data are widely used by most information. For example, the detection of dangling pointers techniques. Based on this observation, DESENSITIZATION requires both data pointers and heap metadata. Other than chooses to focus on collecting common data to support popular embedded techniques, our tool also provides convenient analysis techniques with minimal overhead. for developers to write customized DESENSITIZATION modules based on their domain knowledge, like removing known useless C. Threat Model pointers for analysis to avoid potential information leakage or keeping more data that are critical to detect bugs (§III-C). We assume a software developer who creates benign but potentially buggy programs, and an end-user who likes to System Architecture. Figure 2 shows the architecture of use the program but cares about her privacy – she does not our DESENSITIZATION technique and its position in current want to share any credentials with the developer. The program crash-report systems. DESENSITIZATION is deployed on the execution may trigger some program bugs, either accidentally client side, either as an exception handler of the program, by the user’s rare operation or intentionally by an attacker. Due or adopted by the operating system as a system-wide crash- to the bug, the execution terminates on the user side. The crash report generator. Once a process crashes, DESENSITIZATION report system embedded in the buggy program, or provided is invoked with the full memory space as the input. In current by the underlying operating system, generates a crash report systems, all memory content is directly dumped in the crash that describes the execution failure. The user likes to share the reports. Instead, DESENSITIZATION takes several lightweight crash report with the developer to help diagnose the bug and analyses to identify bug-related and attack-related data from fix it in time. However, she has the concern that her private the memory (shown as a small rectangle under each technique). information in the crash report may be leaked to the developer Then, our technique keeps all identified data in the memory or to others through the developer’s activity. The developer and nullifies others to remove the user’s private information. does not want the liability of protecting the user’s privacy but Optionally, we store the desensitized memory as a sparse is eager to use the crash report to debug the program. file and compress it to reduce its size (not shown inthe figure). Finally, our technique utilizes existing systems to deliver the crash report to the server through the network. III.DESIGN On the server side, developers can use any heavy technique to Our DESENSITIZATION technique identifies necessary in- analyze the crash, like taint analysis [76], [61], [68], symbolic formation that is commonly related to bugs and attacks. First, it execution [15], [22], [20], [16] or even manual inspection. scans the whole memory to identify pointers (§III-A), including Finally, they determine the reason for the crash and synthesize code pointers and data pointers. Our observation is that most a patch to fix the bug.

4 Desensitization Crash Crash Execution Pointer Heap ROP Shcode FmtStr Report Manual Reason

… SymExe

Internet

TaintTrack ……

client server Fig. 2: Overview of DESENSITIZATION and its position in current crash-report systems. Once a process is crashed on the client side, our lightweight tool will inspect the process memory to identify bug- and attack-related data, like pointers. It removes all other data from the memory to generate a privacy-aware crash report. The report is sent to servers through network and is stored on the server side. Developers will use heavy bug-diagnosis and attack-analysis techniques to find the root cause of the crash.

A. Pointer Identification one entry for each function. A GOT.PLT entry originally refers to an instruction of the PLT section and is changed to the real Pointers play important roles in the program’s normal address of the external function after its first invocation. These execution, bug diagnosis, and attack detection. Code pointers entries are common corruption targets for attackers to bypass hold the addresses of code segments, which enable efficient randomization-based defenses [30]. Our pointer identification implementations of dynamic behaviors, like callbacks and will collect all normal GOT.PLT entries as valid code pointers. polymorphism of object-oriented programming. Data pointers Even if one GOT.PLT entry gets corrupted and becomes an connect program variables together and support efficient invalid pointer, DESENSITIZATION still strives to preserve it memory accesses. Therefore, many debugging efforts start due to its sensitivity toward bug-related and attack-related by examining pointers in a crash report. For example, call- debugging. Specifically, if DESENSITIZATION can identify the stack-based analysis recovers the call stack by following stack GOT.PLT section from the dump information, it will simply pointers and base pointers [57], [8]. Meanwhile, pointers are keep all entries in the section. Otherwise, it tries to identify the common targets of attack techniques, including both traditional GOT.PLT section based on the property of GOT.PLT entries, attacks such as ROP and format string attacks, and recently i.e., they either point to instructions in the PLT section or point prevalent use-after-free attacks and type confusions. to functions in other modules. DESENSITIZATION adopts generic approaches to identify Supporting ROPscan. Pointer identification collects point- all pointers from the memory. Specifically, it examines every ers, especially code pointers, so that the crash report after e g pointer-size bytes in the crashed memory, . ., 4 bytes for x86 DESENSITIZATION still contains sufficient information to systems and 8 bytes for x86-64 systems, and checks whether support the first step of ROPscan – code pointer identification. they are potential code or data pointers. DESENSITIZATION In the case of the crash generated from Figure 1, all pointers considers a pointer-size memory as a pointer if and only if its (i.e., buf, string, ptr, Gpivot, G1, G2, G3) will be identified value falls into one valid region of the memory space with and kept in the desensitized crash report. proper access permissions. We treat a pointer as a code pointer if the pointed page has the execution permission; otherwise, we label it as a data pointer. Values pointing to any invalid B. Bug-Specific and Attack-Specific Modules memory range, or non-accessible memory region, like those Besides identifying pointers for general purposes, we reserved for dynamic allocations, will be considered as non- further investigate other potential data related to program pointers. This method will not miss any pointers in memory, faults and malicious attacks in the crash memory. In particular, but may include non-pointers in the crash report. Fortunately, DESENSITIZATION considers popular, general offensive meth- our evaluation shows that even with potential false positives, it ods that are widely used by real-world attacks, and strives still significantly reduces the size of the crash report. to preserve the necessary information in the crash report We adopt several optimizations to improve the performance correspondingly so that developers can diagnose the crash of memory scanning. First, DESENSITIZATION combines con- and reveal the attack easily. Other than the supported bugs secutive memory regions regardless of their permissions (as and attacks, we design DESENSITIZATION to be extensible to long as accessible) to reduce the number of comparisons when support future bug-analysis techniques. validating pointers. For example, crashes of firefox contain 1) Heap Structure: In recent years, heap-based vulnerabili- 340 memory regions, leading to heavy comparisons to validate ties and exploits have dominated security breaches [6], [41], each pointer-size value. Merging continuous memory reduces [10], [31], [32], [64], [43], [23], [77], [7]. These exploits utilize more than half of these regions, significantly improving the various methods to abuse heap structures that lack sufficient validation performance. Second, our scanner maintains the security checks in various implementations. [74] provides a merged memory regions as an interval tree, which searches community-wide compilation of well-known heap exploits address ranges in a logarithmic time complexity. against ptmalloc [34], where each technique corrupts specific heap metadata to achieve certain attack primitives. Capturing GOT.PLT. ELF binaries use the Global Offsets Table (GOT) to access the functions and variables of external To help debug program faults and identify failed attacks libraries. Specially, the addresses of external functions are stored involving heap objects, DESENSITIZATION tries to identify and in the GOT.PLT section (PLT for Procedure Linkage Table), save all heap metadata from the process memory space. Our

5 Used chunk Free chunk Top chunk gArenas Used Page Main arena Free Unsorted Small Large arena_t

.bins Bin1 … Bin13 … Bin71 … chunk_t chunk1 ... bin_t Heap chunk2 Page Bin1 Bin2 ... U1 F1 U2 F2 U3 T

Fig. 3: Heap structures implemented by ptmalloc in glibc. It Fig. 4: Heap structures implemented by jemalloc in firefox. It embeds metadata, like pointers, into the allocated memory region. uses dedicated memory region to store all heap metadata.

heap module is designed to be generic and thus can cover many the main arena (i.e., gArenas), DESENSITIZATION parses types of bugs and attacks. For example, from the desensitized the tree of arenas (i.e., struct arena_t) to fetch all heap crash report, developers can identify corrupted heap structures structures. While each arena maintains a dynamic list of bins due to buffer overflows, or they can find dangling pointers of (i.e., struct arena_bin_t), it also points to another tree of freed memory, which might be the root causes of use-after-free chunks (i.e., struct arena_chunk_t), which contains a large 1 bugs. Currently, our DESENSITIZATION technique supports two continuous memory space for allocation, similar to the function common heap allocators, ptmalloc, which is used by glibc of struct heap_info for ptmalloc. The sizes of the continuous on Linux distributions, and jemalloc, which is adopted by memories are fixed based on native systems (with usually FreeBSD and other popular applications, such as firefox. 1 MB = 1 page * 250), and since there is no metadata between user-requested regions anymore, DESENSITIZATION Figure 3 presents a typical view of the heap layout in simply records the data in the struct arena_chunk_t header, ptmalloc arena , which maintains multiple s for multi-threaded rather than traversing through all user memory like it does applications, one for each thread. An arena is an instance of for ptmalloc. DESENSITIZATION identifies and saves all the malloc_state structure and serves the allocation and freeing of structures mentioned above from the process memory. its associated thread. Each arena contains several large amounts of continuous areas of heap memory. Each heap chunk is in 2) ROP Gadget Chain: Return-oriented programming (ROP) the status of either allocated or freed, indicated by the chunk is an indispensable step for modern memory-error exploits [73]. metadata. The metadata of an allocated heap chunk contains Despite ROP chains tending to be self-destructive during execu- the size field describing the chunk size, the prev_size field tion, DESENSITIZATION strives to save related information providing the size of the previous chunk (if it is freed), and if necessary. Note that the pointers of ROP gadgets have several flags bits, like the PREV_IN_USE bit indicating whether been saved by the pointer identification. Therefore, our ROP the previous chunk is freed or not. A freed chunk has two module focuses on saving the data in the ROP payload, which extra metadata fields, the fd pointer pointing to the next freed can be used as operands of ROP gadgets. Specifically, if a chunk and the bk pointer pointing to the previous freed chunk. consecutive memory of K bytes contains more than N code To support efficient allocation, ptmalloc connects freed heap pointers, DESENSITIZATION considers the region as a potential chunks through the fd and bk pointers, and groups them into ROP payload and will keep all the data in between, including different bins – one bin for all chunks with the same or similar pointers and non-pointers. The numbers K and N are tunable to size. To free an allocated chunk, ptmalloc inserts it into a users. In our evaluation, we use K=48,N=4 for 32-bit systems and proper bin according to its size before returning it to the system. K=96,N=4 for 64-bit systems, based on our experience. Further, To allocate a new chunk, ptmalloc tries to find a large enough DESENSITIZATION conservatively keeps a 96-byte memory cached chunk from the bins before requesting the memory region around stack pointer esp/rsp, which can provide both from the top chunk. If the crashed program uses ptmalloc, a general debugging aid for program faults and tackle those DESENSITIZATION manages to parse the heap metadata as ROP attack-crashing programs. above, starting from the information in arenas that are indicated by the symbol main_arena. Then, it walks through all the Supporting ROPscan. ROP gadget chain identification keeps structures accordingly in an iterative manner, such as tracking the operands of ROP gadgets inside the desensitized crash each chunk in memory space by its chunk size, until it finds report, which supports the second step of ROPscan — the all heap metadata or reaches a parsing error due to corrupted speculative execution and gadget detection. In the example of heap information. Figure 1, DESENSITIZATION will keep the operands for G1, G2, and G3. Therefore, ROPscan can start the execution from Figure 4 presents a typical view of the heap layout in G1 and identify three consecutive ROP gadgets. jemalloc. jemalloc shares some structures with ptmalloc, such as arenas and bins. The main difference is that jemalloc 3) Shellcode: Code injection attacks insert malicious in- saves all metadata in one dedicated memory region, separated structions, called shellcode, into the memory space of the from the real payload. This design improves the security victim. Although NX and DEP disable direct code injection [3], of the heap manager, as the common heap overflow cannot [29], attackers can create a writable-and-executable data region reach the metadata as easily as before. It also achieves through mprotect() and place shellcode inside, commonly better performance, as heap structures are maintained as serving as the last exploit step. To search for potential shellcode efficient tree structures and traversal of these structures is payloads in a lightweight manner, DESENSITIZATION scans more convenient due to the memory locality. Starting from memory dumps for critical system call instructions outside the

6 text section, such as int 80 for x86 platform and syscall data to physical disks in the sparse file format. We implement for x86_64 platform. For each potential system call instruction, DESENSITIZATION in 4,284 lines of Python code, publicly DESENSITIZATION fetches the proceeding 200 bytes and tries available at https://github.com/sslab-gatech/desensitization. to disassemble them. As the fetch might truncate actual 1) Memory Parser: Currently, DESENSITIZATION sup- payloads, DESENSITIZATION tolerates disassembling errors within the first 16-bytes. If the disassembling process does ports generating desensitized crash reports in coredumps not find any error after the first 16-bytes, and the assembly or minidumps. Coredumps contain almost all the memory code is ended with one system call instruction, we treat the content of the crashed process, including environment variables, extracted memory region as a valid shellcode payload. To cover exception information, context of CPU, heap and stack of shellcode with more than 200 bytes, we repeatedly apply the each thread, loaded libraries, and modules. Minidumps collect method to another 200 bytes before the valid shellcode, until we much less information to reduce the size of the report. For cannot find more shellcode regions. All the identified shellcode example, by default, it does not include the heap region of the crashed process. Considering the distinction in formats payloads are kept by DESENSITIZATION in the desensitized crash report. and design concepts, we implement two different parsers to generate coredumps and minidumps. To generate core- 4) Malicious Format String: In format string exploits, dumps and parse binaries and libraries on the Linux platform, attackers manipulate printing formats to perform arbitrary read DESENSITIZATION relies on an open-source tool [9] with and write with severe consequences. DESENSITIZATION hunts python bindings, pyelftools, for parsing various headers and all the valid strings that contain format specifiers defined sections in ELF format and DWARF debugging information. in the man page of Linux [50] through regular-expression Due to the limited availability of public libraries for generating matching, including those with various length and precision minidumps in Python, we implement our own minidump parser fields. DESENSITIZATION simply fetches memory between the without relying on third-party codes. For disassembling binaries, two closest null bytes (i.e., 0x00) as an overestimation for DESENSITIZATION works with pyxed [44], the python bindings each string. Next, it tries to find arguments for printf-like for Intel X86 Encoder Decoder (XED). functions. On 32-bit systems such as x86, arguments are passed through stacks while the pointer of the format string is the 2) Report Writer: To produce minimal crash reports, the writer of DESENSITIZATION stores the desensitized memory first argument. DESENSITIZATION searches for pointers that point to any malicious format strings on stacks and saves as sparse files, as a large number of bytes are cleared tozero. the six following pointer-size bytes to conservatively keep all Specifically, it fseeks on positions of kept bytes in the processed arguments. On 64-bit systems such as x86-64, the first several data to create “holes” in newly generated dumps. function arguments are passed through registers, where the pointer of the format string will be in register %rdi. In this V. EVALUATION case, DESENSITIZATION cannot pinpoint the potential locations We perform empirical evaluations on real-world crashes to of corresponding arguments, but just saves malicious format answer the following questions of DESENSITIZATION regarding strings and all their references in memory. its effectiveness, usability, and practicality: C. Supporting Future Analysis • Privacy protection. What percentage of bytes can our DESENSITIZATION technique nullify from the process mem- The current design of DESENSITIZATION covers most of ory? Can DESENSITIZATION remove sensitive data? (§V-B) the common bugs and attacks, but it is never complete due • Attack preservation. Can DESENSITIZATION retain suf- to various bug types and quickly evolving exploit techniques. ficient information to support common server-side crash Therefore, DESENSITIZATION provides friendly interfaces for analysis? Can DESENSITIZATION preserve attack residue if researchers to write their own modules to cover more bugs and the program crashes due to failed attacks? (§V-C) to support future analysis techniques. It also allows developers • Practicality. How well can DESENSITIZATION help reduce to define customized policies based on their domain knowledge, the size of the crash report? Can DESENSITIZATION generate either to aggressively remove more pointers that are known to the new crash report in a reasonable time with proper memory be unnecessary for debugging or to conservatively keep more usage? (§V-D) non-pointers that are critical for bug analysis. For example, developers can acquire extra data in critical sections to assist A. Experiment Setup backward execution at crash sites [25], [82] or search for more attack payloads in error-prone call frames. DESENSITIZATION To compare our new format of crash reports with existing provides configurable options for non-developers, such as ones, we first generate the reports in old formats using existing further removing data pointers in crash reports to avoid tools. Then, we apply our technique on these reports to identify uncommon information leakage. By taking various levels of and remove users’ sensitive information. Specifically, we rely on DESENSITIZATION techniques for certain address spaces of the Linux kernel to create coredumps and utilize the breakpad the crashed execution, DESENSITIZATION produces new crash library to generate minidumps. To simplify our description, we reports in a flexible manner. denote the crashes triggered by failed attacks as attacker-driven crashes (reports) and denote the crashes without any attack IV. IMPLEMENTATION involved as user-driven crashes (reports).

DESENSITIZATION contains two components: the parser Generating user-driven reports. We use three different for extracting necessary information that needs to be kept in methods to generate user-driven reports: one with the PoCs the crash report and the writer that dumps the desensitized of real-world bugs, one by PoCs, and one by sending

7 100 desensitized 80 pointer metadata 60

Size (%) 40

20

0 0 500 1000 1500 0 250 500 750 1000 1250 0 250 500 750 1000 1250 1500 0 250 500 750 1000 1250 1500 0 250 500 750 1000 1250 1500 0 1000 2000 3000 4000 5000

(a) ffmpeg (b) php (c) chakra (d) ff-coredump (e) ff-minidump (f) tachikoma

Fig. 5: Distribution of non-zero bytes in the original memory space. “desensitized” shows the bytes removed from the memory by DESENSITIZATION, while “pointers” and “metadata” indicate the bytes that remain in the memory. Since the data related to heap structures, ROP gadgets, format strings, and shellcode account for a small portion of the memory, we combine them as “metadata.”

abort signals. First, we get 19 PoCs for ffmpeg, 15 for php, B. Privacy Protection by DESENSITIZATION 30 for chakra, and 15 for firefox from public resources [59], [62], [72], each against a distinct bug. These PoCs simply Since the privacy information is program-specific, it is trigger bugs in programs and immediately crash the process. challenging to quantitatively measure the privacy protection Second, we utilize the AFL crash mode [83] to keep randomly after DESENSITIZATION. Instead, we use three different meth- mutating the benign PoCs for 24 hours to generate more crashes. ods to understand and estimate the privacy protection. First, We believe the random mutation will not introduce malicious we measure the reduction of non-zero memory bytes. This actions, and therefore the generated crashes are still user-driven. reflects the privacy protection when the private information Third, we use firefox to visit the Alex Top 1500 websites is distributed evenly in the memory. Second, we retrieve and send the SIGABRT signal at a random time for the process the printable strings from the crash reports and measure to terminate abnormally. In general, we obtain 7,507 normal their reduction during DESENSITIZATION. As much of the crashes, including 1,667 crashes for ffmpeg, 1,313 for php, common private information is printable strings, this result 1,497 for chakra, and 3,030 for firefox. gives us an estimation of privacy benefits. Third, we manually inspect several cases in which firefox leaks the user’s private information and confirm that with DESENSITIZATION, such Generating attacker-driven reports. We also use three information is successfully nullified. different methods to generate attacker-driven reports. First, we find two exploits against ffmpeg, two against php, two 1) Memory Byte Reduction: Figure 5 shows the distribution against chakra, and one against firefox. The details of the of non-zero bytes in the original memory snapshot, grouped exploited CVEs are discussed in§ V-C2. Second, we also by the data types. Each unit on the x-axis represents one utilize the AFL crash mode to collect more crashes. Unlike crash report, sorted by the percentage of pointers, while the those user-driven reports generated through fuzzing upon the y-axis depicts the percentage of each data type. The pointer PoCs without attack payload, the current seed corpus contains and the metadata are kept in the memory space, while the fragile, but full chain exploits. Random mutation on the exploits desensitized data are set to zero to protect the user’s privacy. before feeding them to the vulnerable programs makes the The metadata data covers identified heap structures, ROP executions crash at different exploit stages. We conservatively gadget chains, shellcode, and malicious format strings. As treat them as attacker-driven crashes without checking each they occupy a small portion of all non-zero data, we combine input one by one. Third, we collect crashes from the DEFCON them in the figure. The numeric distribution of the remaining 2015 challenge tachikoma. We download the network traffic data is given in Table V, where the number is an average. throughout DEFCON 2015 [28], which presents exploits from top hackers striving to compromise exploitable programs. As On average, DESENSITIZATION can remove 80.9% of non- the only x86 binary, tachikoma contains 14,966 LoCs with zero bytes from the original memory snapshot; 89.5% of all seven intended bugs of various complexity, spanning buffer remaining data are pointers. If the user’s privacy is evenly overflows to logical errors. We extract exploits from 366,792 distributed in the memory, we can reduce the possibility of relevant network sessions and replay them to collect coredumps. information leakage by 80.9%. In fact, most of the user’s private Totally, we get 5,883 attacker-driven coredumps, including 400 information is allocated in concentrated locations, like a string for ffmpeg, 200 for php, 100 for chakra, 50 for firefox, and representing a website cookie. As such information is not likely 5,133 for tachikoma. More information on the PoCs, exploits, to be pointers, we believe the real result of privacy protection and crashes can be found in Table VI in §A. should be more appealing than what we reported here. Figure 5a shows the result of applying DESENSITIZATION We evaluate DESENSITIZATION on a 32-core machine on ffmpeg coredumps. On average, DESENSITIZATION can running 16.04, with Intel Xeon Gold 6136 processors nullify 93.50% of all non-zero bytes for each coredump. at 3GHz and 128GB memory. We limit DESENSITIZATION Among the remaining data, 77% of them are pointers. Heap to only utilize four cores to process each crash as normal metadata and ROP payloads account for 9.83% and 13.0%, desktops usually have a small number of cores, like four or respectively. Shellcode payloads and malicious strings take eight. DESENSITIZATION identifies necessary bug-related and negligible portions in the desensitized memory. We inspect the attack-related data from the memory and nullifies other bytes memory snapshot manually and find that a large portion of to remove potential privacy-sensitive data. the original memory contains multimedia files or part of their

8 Prog Size (K) Pointers Heap ROP Shcode FmtStr for developers to diagnose bugs or detect attacks. Nevertheless, ffmpeg 366 77.0% 9.83% 13.0% .110% .004% our DESENSITIZATION can still remove a large amount of php 974 90.2% 8.45% 1.28% .043% .016% unnecessary data from minidumps to protect the user’s privacy. chakra 1,567 97.2% 1.04% 1.74% .014% 0 ff-core 9,953 88.7% 2.95% 8.25% .017% .052% Figure 5f presents the results of DESENSITIZATION on ff-mini 6 82.4% 0 17.6% 0 0 crashes of tachikoma. In particular, DESENSITIZATION can tachikoma 12 68.6% 2.02% 27.8% 1.59% 0 nullify on average 95.6% of non-zero bytes in the coredumps. avg. - 89.5% 3.33% 7.09% 0.03% 0.04% While most are removed, DESENSITIZATION manages to keep 3.02% of the original memory as pointers, 0.09% as heap TABLE V: Average distribution of remaining data per crash report structures, and 1.23% as potential ROP residues. Since the produced by DESENSITIZATION across evaluating benchmarks. coredumps of tachikoma are produced by failed attacks against the synthesized binary, they are not comparable in terms of sizes and complexity (e.g., heap structures) to those from copies. These data usually do not affect the bug analysis but previous benchmarks. However, DESENSITIZATION is shown may reveal the user’s watching history. Removing them will to be consistent in removing data for special-purpose programs. protect user privacy. 2) Reducing Printable Strings: To further quantify the privacy benefit, we measured the reduction of printable strings php Figure 5b shows the result of desensitizing coredumps. by DESENSITIZATION from the original reports. We define a Similarly, DESENSITIZATION can nullify around 58.9% of non- printable string as a sequence of printable characters with zero data in each crashed memory and identify 37.1% of useful a minimum length N. In our evaluation, we use different data as pointers and 4.03% as other metadata. However, for php, values of N, including 1, 5, 9, and 17. Overall, the eval- the ratios of useful data vary significantly from crash to crash: uation serves as a proof-of-concept for illustration due to in the worst case, 64.9% of data is kept in the desensitized the lack of common schemes for measuring privacy leakage. memory, while in the best case, less than 10% of non-zero Figure 6 presents the evaluation results for all benchmarks with data will remain. We inspect the crashes and find that as a different definitions of printable strings. Each figure shows language interpreter, php provides many functionalities, like the percentage of printable strings left in the desensitized parsing multimedia files and maintaining network interfaces. reports. In general, DESENSITIZATION can remove more than The collected PoCs may take different execution paths and crash 95.0% of printable bytes from the coredumps of ffmpeg, php, the program in diverse locations, resulting in different ratios firefox, and tachikoma, achieving a significant privacy benefit. of useful data. For example, one PoC crashes the program by Further, if we define printable strings to have at least 9 bytes, triggering an infinite recursive call, leading to a large number DESENSITIZATION achieves almost 100% of string reduction of stack frames. As each stack frame has many data pointers, for all reports from all programs. For those bytes that cannot like return addresses and saved esp/rsp, DESENSITIZATION be effectively removed, we found that they are mainly kept by only nullifies 35.1% of all non-zero data the modules of ROP, format string, and shellcode as suspicious The evaluation result of chakra in Figure 5c suggests that payloads or arguments. DESENSITIZATION can consistently nullify about 89.1% of all For chakra coredumps, DESENSITIZATION manages to non-zero bytes in each coredump. Meanwhile, an average of remove 84.2% of printable strings with the conservative 10.6%, 0.11%, and 0.19% of the original data are identified definition, i.e., N=1. However, as shown in Figure 6c, in extreme and thus kept as pointers, heap structures, and potential ROP cases, 83.7% of such bytes are left in the reports. Through residues. Other metadata are negligible (less than 0.002%). We manual examination, we learned that those are actually false have to clarify that chakra relies on native systems for memory positives due to the special address layout in certain executions. allocation but maintains its own customized metadata. Currently, Particularly, most printable strings left by DESENSITIZATION DESENSITIZATION is only able to identify the relevant heap have a short length from 2 to 6 bytes. While these bytes structures from ptmalloc, and corruption on custom heap look like printable strings, they are in fact pointers, such as metadata will be missed during the attack analysis. We can 0x55554730 ("0GUU"). If we change the definition of printable extend our heap module to support the customized metadata string to require at least nine consecutive printable bytes (per of chakra, as we have done for ptmalloc and jemalloc. We alignment of x86-64 systems), DESENSITIZATION can remove leave this as a future work. almost all of them. The same explanation applies to certain cases of php coredumps in Figure 6b as well. Figure 5d and Figure 5e provide the results of apply- ing DESENSITIZATION on firefox crashes, the former for Figure 6e shows that DESENSITIZATION is less effective coredumps and the latter for minidumps. Figure 5d shows when handling the minidumps of firefox. With the conserva- that DESENSITIZATION can nullify 77.8% of non-zero data tive definition, it only removes 37.0% of printable byteson in coredumps and leave 19.7% as pointers, 0.65% as heap average. Even if we increase the requirement of the length, structures, and 1.83% as potential ROP residues. Figure 5e DESENSITIZATION still leaves more than half of printable indicates that DESENSITIZATION maintains 42.0% of non-zero strings in the desensitized reports. This is mainly because most bytes as pointers (with no heaps) and 9.00% as ROP residues, user data, especially printable strings, are typically stored in the and nullifies 49.0% from each original memory. Apparently, non-stack regions and thus are not saved in minidumps when DESENSITIZATION identifies more portions of pointers on the program crashes. With a limited number of printable bytes, minidumps. The reason is that a majority of non-pointer data of even a few false positives (like those we explain above) will firefox resides in the heap region, which are not captured by hurt the statistics of DESENSITIZATION. For example, when minidumps. Our result indicates that minidump is less effective handling strings with at least 9 bytes, DESENSITIZATION can

9 90 80 len(s) > 0 len(s) > 4 70 len(s) > 8 60 len(s) > 16 50 40 Size (%) 30 20 10 0 0 500 1000 1500 0 250 500 750 1000 1250 0 250 500 750 1000 1250 1500 0 250 500 750 1000 1250 1500 0 250 500 750 1000 1250 1500 0 1000 2000 3000 4000 5000

(a) ffmpeg (b) php (c) chakra (d) ff-coredump (e) ff-minidump (f) tachikoma

Fig. 6: Percentages of printable strings left in desensitized crash reports. We performed evaluation with different definitions (lengths) of strings, shown in different colors. The overall result is that long strings are less common than short strings.

remove 154 bytes on average, leaving only 52 bytes from each efficient due to the memory locality and meanwhile renders the original crash. Yet the result still gives the reduction ratio of object management simpler, like no explicit malloc or free of printable bytes as 74.8%. Though DESENSITIZATION is not the payload. The nsTAutoStringN class is inherited by many designed to be perfect in terms of privacy preserving, Figure 6e others, such as nsAutoString and nsAutoCString. While the can be misleading in this sense. former is designed for wide characters with an internal buffer of 128 bytes, the latter has 64 bytes for narrow characters. If We also performed an automated scan of private informa- a small string of these classes is allocated on the stack, its tion with simple regular expressions, followed by a manual content will be on the stack as well. Since the minidump keeps analysis of the scanning results. We found several categories all stack content of the crashing thread, short strings will be of interesting information left in the original crash reports but dumped into the minidump, leading to privacy issues. removed by DESENSITIZATION. The information includes but is not limited to local directories, local environment variables, Leaking Cookies. The design of firefox adopts the principle usernames, passwords, cookies, visiting URLs, source IPs, of privilege isolation. In particular, the parent process does network protocols, geo-location coordinates, and snippets of all the security-critical work, like file access and network con- user scripts yet to be executed.§ V-B3 provides selective case nection establishment, while child processes merely serve for studies in more detail for firefox. rendering and parsing. In function SetCookiStringInternal() 3) Case Studies on Privacy Protection: We studied the of Figure 13 in §B, a child process is in charge of source code and the runtime of firefox and identified three parsing cookies and storing cookie attributes (e.g., name, scenarios where a minidump of the program will leak the value, host and expiring time) into stack variables of class user’s privacy, such as website cookies and user inputs. We nsCookieAttributes. As class nsCookieAttributes is derived inspected the desensitized minidumps and confirmed that from class nsAutoCString, any cookie attribute shorter than 64 bytes is stored in its internal buffer, which is in turn allocated on DESENSITIZATION successfully removes all these private data. the stack. Therefore, any crashes inside this function will dump

1 // Subclass of nsTString that allocates stack-based string. the cookie information to the crash report as long as the cookie 2 class nsTAutoStringN : public nsTString{ is shorter than 64 bytes. We inject a bug in firefox to make 3 public: SetCookiStringInternal() 4 nsTAutoStringN(): string_type(mStorage,0, ...), the execution crash inside function 5 mInlineCapacity(N-1){ and use the modified firefox to visit popular websites. Most 6 mStorage[0]= char_type(0); // null-terminate of them will crash and generate a minidump file, where we 7 } ... 8 static const size_t kStorageSize=N; can find the cookie information of the corresponding website. 9 protected: DESENSITIZATION is able to nullify the cookie contents, as 10 friend class nsTSubstring; they are unlikely to be pointers, heap metadata, or any other 11 size_type mInlineCapacity; 12 private: attack residues. 13 char_type mStorage[N]; // internal buffer on stack 14 } Leaking User’s Inputs through Form Submission. Form Fig. 7: Definition of nsTAutoStringN in firefox, inherited by submission is a common way for users to interact with web nsAutoString and nsAutoCString. The internal buffer mStorage may applications, like submitting user names and passwords for leak users’ private information (shaded line). authentication. Figure 14 in §B shows the code of firefox used for preparing the form data as requested by users. In memory, each form field is always in its plaintext unless the Source of Leakage. firefox adopts many techniques to make developer specifies one of multiple enctype attributes to it, the software work efficiently. One of these techniques is the as defined in function GetFromForm(). When firefox handles implementation of auto string classes, namely, nsTAutoStringN, forms of multipart/form-data for users to upload files, func- as shown in Figure 7. By default, the payload of a string is tion FSMultipartFormData::AddNameValuePair() places user allocated dynamically from the heap region, while the class inputs in objects of class nsAutoCString, which are allocated on only keeps some structural information, including a pointer the stack in plaintext. Further, many hidden form fields collect pointing to the real payload in the heap. However, the class user information for web tracking, like geographic locations and also contains a fixed-size array, called mStorage, and uses it timestamps. All of them may leak users’ private information to store the string content if the string is shorter than the array. through minidumps. We use the same method as before to verify This design choice makes access to the string content more that the minidump contains user input if the execution crashes

10 inside function FSMultipartFormData::AddNameValuePair(). ❶ Heap overflow ❶ Heap overflow ❷ GOT overwrite Depending on the concrete website, we find extremely sensitive function pointer data pointer à realloc@GOT information from form fields on the stack, like user names, ❹ ROP chain 2 ❷ ROP chain ❸ shellcode passwords, or even credit card information. DESENSITIZATION ❸ ROP chain 1 ❺ shellcode helps eliminate these sensitive data from minidump files, as heap .bss heap .bss most of them are alphanumeric characters, which are not (a) CVE-2016-10190 (b) CVE-2016-10191 confused with other debug information in the crash report. Fig. 8: Exploit steps against ffmpeg via two CVEs. Both CVEs are Leaking Filling History through Autofilling. Autofilling is heap overflows, and both attacks utilize ROP and code injection. a convenient feature for users to fill a form quickly from historical inputs. Specifically, when an HTML textbox (e.g., 1 , , textarea) has the autocomplete attribute, the browser will provide users a list of historical inputs. Figure 15 in 2) Case Studies of Attack Preservation: We illustrate how §B shows that in function EnterMatch(), a single user-provided the information identified and saved in crash reports by character triggers the browser to find matched historical inputs DESENSITIZATION, such as pointers and heap metadata, can and shows the results in a drop-down list. When users navigate support real-world attack analyses. Particularly, we implemented the drop-down list with up/down keys, or they select one record several proof-of-concept tools to detect attack residues from from the list, the highlighted or selected record will be stored in crash reports before and after DESENSITIZATION. The first a stack variable of class nsAutoString. If the execution crashes tool detects the corruption of various heap structures to identify here, the user information will be copied into the minidump. heap-based vulnerabilities. It walks through the linked lists to Note that some websites use customized JavaScript to handle find inconsistent metadata, such as abnormal chunk size, broken the autofilling by specifying the action type of forms, where the linked bins, and dangling pointers. The second tool searches filling history may be used differently. Similarly, we verify that for malicious format strings to detect format string attacks. It the minidump contains the filling history if firefox crashes at feeds each potentially malicious string to a printf-like function particular functions. We confirm that DESENSITIZATION clears with other identified arguments and verifies that the function’s the sensitive information from the collected minidumps. side-effect to memory is consistent with the crash report. The third tool scans for ROP residues left in the crash report. For each potential ROP payload, the tool runs the gadgets through C. Supporting Bug-analysis and Attack-detection instruction emulation and verifies that the side-effects to stack and frame register (i.e., esp or rsp) match the crash report. The We evaluate desensitized crash reports to verify that they last tool validates GOT.PLT entries with the help of external still contain sufficient information for popular bug-diagnosis debug symbols. We develop these tools based on our experience and attack-analysis techniques. Our evaluation is performed of attack analysis for the purpose of verification. Developers in two ways. First, we feed the crash reports before and after can pick other techniques to analyze a crash report [21], [67], DESENSITIZATION to existing bug-diagnosis tools and expect [37], [75], [66], [65]. the same analysis results. Second, we inspect several attacker- driven crashes to make sure that all data related to the attack As we mention in§ V-A, we feed collected exploits to construction are kept in the desensitized crash reports. each vulnerable program and collect the resulting crashes. Meanwhile, we collect coredumps from tachikoma by replaying 1) Supporting Bug Analysis: We use Socorro [57] and the exploit traffic. Then, we apply our PoC tools against the Backtrace [8] as two bug-analysis tools for our evaluation. The original crash reports and the desensitized ones. Our tools former is the default crash reporting service used by report the same result about the failed attack for both versions products, while the latter is a more generic commercial software of the crash reports, showing that DESENSITIZATION keeps recommended by Mozilla for providing signature-based crash important attack residues in the desensitized report. classification. Both tools rely heavily on call stacks incrash reports along with additional heuristics to generate a signature ffmpeg. Figure 8 depicts exploits against two heap overflow for each crash. Crash reports with the same signature are bugs of ffmpeg: CVE-2016-10190 and CVE-2016-10191. Both grouped together, and the number of crash reports in each attacks corrupt pointers on heap and stitch ROP gadgets and group is used to determine the priority for detailed analysis. finally execute malicious shellcode, but they differ slightly in terms of corruption targets and gadget chaining due to the We feed all the collected coredumps before and after nature of the vulnerabilities. In general, we collect 400 core- DESENSITIZATION to Backtrace and compare their crash dumps from the failing exploits through fuzzing. On average, firefox signatures. We also submit the minidumps of to DESENSITIZATION is able to nullify 93.5% of irrelevant data Socorro through its native crash reporting service and compare as described in§ V-B and keep those of pointers and metadata. signatures per crash. The result shows that DESENSITIZATION Among all remaining bytes, 47,701 bytes are ROP payloads; has no impact on the crash signatures of any dumps, including 386 bytes are potential shellcode payload; 42,200 bytes are coredumps of ffmpeg, php, chakra, and firefox, as well as identified as GOT.PLT entries. Although format string attacks firefox minidumps of . In other words, the crash signatures are not involved, DESENSITIZATION saves another 13 bytes generated by either tool are always the same before and after regarding format strings and related arguments. our DESENSITIZATION on each crash report. This findings validate that DESENSITIZATION merely removes unnecessary Since DESENSITIZATION maintains the heap metadata, our data safely from the crash reports without altering any useful analysis tool on heap integrity is able to detect the same information, at least for debugging processes like classification. overflown heap chunks in the desensitized crash report asthose

11 printf(): printf(): code. In sum, we collect 100 coredumps from failed exploits. leak &bin &libc leak &libc &system DESENSITIZATION successfully removes the irrelevant data hack free@got &system hack stack &exit while keeping pointers and heap structures in a ratio similar &”/bin/ls” call free@got …..... return to others. Furthermore, DESENSITIZATION saves around 221 GOT stack bytes of shellcode payloads and 27,291 bytes of ROP residues, (a) CVE-2015-8617 (b) CVE-2016-4071 as well as 10,072 bytes of GOT.PLT entries, but fails to find any malicious format string in this case. Although chakra uses Fig. 9: Exploit steps against php via two CVEs. Both CVEs are a customized heap manager, DESENSITIZATION manages to format string bugs, and both attacks use the ret2libc technique. save pointers that are relevant to ROP chain residues, as well as identified GOT.PLT entries in the crash report. firefox. We generate firefox crashes by exploiting CVE-2019- in the original one. For exploits that crash in the ROP chains, 9810, which is used in Pwn2own 2019 [60]. The bug is an DESENSITIZATION maintains the memory values pointed by array out-of-bound access due to an incorrect JIT optimization. stack pointers and keeps all the suspicious gadgets in the Our exploit leverages the bug to corrupt the length field of an 1 crash report, helping to capture the residual ROP gadgets. For Uint32Array object on the1 heap, which helps to bypass ASLR those overwriting GOT.PLT entries with incorrect addresses, and leak the base address of module libxul.so. Furthermore, DESENSITIZATION identifies entries in the memory and keeps we use the corrupted Uint32Array object to modify the buffer them in the crash report for validation. When the program address of an adjacent Uint32Array object to achieve arbitrary crashes in shellcode, most payloads are still in the crash report, memory read and write. The exploit controls the PC value by as DESENSITIZATION conservatively retains all potential ones. corrupting the virtual table of a HTMLDivElement object with the address of a crafted virtual table. The stack is then pivoted php. Figure 9 shows exploits against two format string to the heap afterward, where an ROP chain is to be executed. vulnerabilities of php: CVE-2015-8617 and CVE-2016-4071. The ROP payload changes the permission of a particular page The former allows attackers to learn the randomized function as both writable and executable and executes the shellcode address, while the latter recursively modifies the ebp values there. By random mutation through a fuzzing-based approach, on the stack until it overwrites the return address outside we gather 50 coredumps of firefox from failed exploits. As the vulnerable function. We gather 200 coredumps from the usual, DESENSITIZATION collects pointers and heap structures failing exploits through fuzzing. The proportions of unnecessary while nullifying unnecessary data. In addition, it saves 1,718 data, pointers, and other metadata are similar to those of bytes as shellcode and 5,129 bytes as malicious format strings. previous ones. Particularly, DESENSITIZATION collects around Although our tool based on ptmalloc does not apply in this case, 154 bytes of malicious format strings along with their arguments. DESENSITIZATION maintains 820,904 bytes for ROP residues DESENSITIZATION additionally saves 12,494 bytes for potential and 9,352 bytes for GOT.PLT entries in the crash report, which ROP payloads and 417 bytes for shellcode, which are apparently will help other tools on ROP and shellcode detection to pinpoint false negatives. Meanwhile, DESENSITIZATION identifies 6,632 the attack details [21], [67], [37], [75], [66], [65]. bytes for GOT.PLT entries. tachikoma. The challenge tachikoma emulates an interactive DESENSITIZATION conservatively saves potential format war game and introduces seven vulnerabilities for attackers to strings and their relevant pointer arguments in crash reports, exploit it. Most of the crashes do not have attack residues as and allows our analysis tool to track down the attacks with much network traffic is just random bytes for sniffing. Ouranal- the desensitized crash reports in most cases. Note that 32-bit ysis tools consistently pick up two attack vectors against distinct systems are easier to tackle, as arguments are stored in the vulnerabilities before and after applying DESENSITIZATION. stack region. Therefore, DESENSITIZATION can find the format The first is caused by attackers overflowing a fixed-size stack string pointer and keep their pointer arguments in relative offset buffer by copying user input from the heap, but it fails due to based on the format specifiers, even if the arguments are non- miscalculated offsets. Our tools can detect the misaligned heap pointer data in randomly mutated exploits. However, since chunks and the ROP-like residues that end up crashing the 64-bit systems pass first several arguments through registers, it binary. The second is caused by attackers exploiting another is hard for DESENSITIZATION to identify possible format string heap overflow vulnerability to redirect the control flow tothe arguments. Thus, DESENSITIZATION only keeps pointers as shellcode on the heap. Accordingly, our tools can catch the heap designed and might remove some arguments when they are non- violation, stack pivoting, and prepared shellcode. Our results pointer data, resulting in false negatives. As php is compiled coincide with the statistics of the CTF, where only two to three as a 32-bit binary, DESENSITIZATION can save necessary of the seven vulnerabilities have been found and exploited by information about format strings to support our analysis. the attending teams. Meanwhile, DESENSITIZATION is able to remove the irrelevant data while keeping 8,503 bytes of chakra. We select CVE-2016-0193 and CVE-2017-0266 of pointers and 288 bytes for GOT.PLT entries. It saves 251 bytes chakra for crash generation, which are caused by heap overrun as heap structures, 3,447 bytes as ROP residues, and 197 bytes and type confusion due to incorrect JIT optimization. The as shellcode, successfully maintaining useful information for exploit of CVE-2016-0193 corrupts the length field of an array our analysis tools to pinpoint exploits. object on the heap and uses the array to overwrite a virtual table pointer to hijack the control flow for ROP attacks. CVE- D. Practicality of DESENSITIZATION 2017-0266 allows attackers to craft a fake DataView object on the user-controlled memory for arbitrary memory read and 1) Performance of DESENSITIZATION: Figure 10 shows the write, thereby corrupting GOT.PLT entries to execute arbitrary performance of DESENSITIZATION on desensitizing crashes

12 25 1e3

4 20

15 3

10 2 Second(s)

5 Memory (MB) 1

0 ffmpeg php chakra ff-coredump ff-minidump tachikoma 0 ffmpeg php chakra ff-coredump ff-minidump tachikoma Fig. 10: Time for generating crash report by DESENSITIZATION. Fig. 11: Peak memory usage for desensitizing crash reports. Mostly, it incurs reasonable time of less than 15 seconds. Mostly, it incurs reasonable memory usage of less than 2.5 gigabytes.

across evaluating benchmarks. In general, DESENSITIZATION 2) File Size Reduction: As DESENSITIZATION strives to takes less than 15 seconds to process each program crash, nullify potential sensitive data, it creates a significant number of including the steps of identifying useful data in different types, zero bytes in the desensitized memory. By treating consecutive nullifying other data bytes, and writing the memory as a sparse zero bytes as holes, we can save the crash report – in a storage- file. While coredumps from ffmpeg and firefox have a median economy and bandwidth-economy way – as a sparse file, which of 9 and 11 seconds respectively, DESENSITIZATION takes only keeps non-zero bytes. Most modern filesystems, like less than 2 seconds consistently for those of php, chakra and ext4 on Linux and NTFS on Windows, support sparse files. tachikoma. The time needed to process each minidump (i.e., DESENSITIZATION also facilitates compression algorithms, 0.5s) is significantly less than that of analyzing coredump (i.e., which adopt similar concepts. To understand the advantage of 15s) due to their intuitive size differences. DESENSITIZATION on file storage, we generate crash reports in four settings: 1) storing the original memory as a sparse file, Our optimizations on DESENSITIZATION can significantly denoted as orig-sparse; 2) storing the desensitized memory as reduce the processing time. First, we find that loading a a sparse file, denoted as desen-sparse; 3) compressing the orig- coredump file into memory can result in an order of magnitude sparse file using the 7zip program, denoted as comp-sparse; more memory usage than the file size. For example, a coredump 4) compressing the desen-sparse file using 7zip, denoted as file of ffmpeg is about 150 MB on disk, but it is bloated comp-desen. Figure 12 shows the size of different crash reports, to 1.8 GB when loaded into memory. The extra memory normalized to the size of the orig-sparse file. The crashes on space, where our analysis should ignore, contains only null the x-axis are sorted by the percentage of identified pointers. bytes. We optimize this process by only loading each program The figures in Figure 12 (except Figure 12e) show that segment using its file_size instead of memory_size. In this DESENSITIZATION can significantly reduce the file size of way, the time required by DESENSITIZATION is reduced by coredumps, regardless of whether they are compressed or not. 56%. Second, the number of mapped memory regions also On average, the sparse file of the desensitized memory is affects the overhead of DESENSITIZATION. Specifically, crashes only 49.5% of the original sparse file. With compression, a from firefox can include up to 340 mapped memory ranges, comp-desen file is only 28.7% of the comp-sparse file.For causing a heavy burden to DESENSITIZATION for validating the chakra program (Figure 12c), DESENSITIZATION achieves pointers. Therefore, we merge consecutive memory regions to the maximum reduction of crash reports, where the desen- reduce the number of comparisons. Furthermore, we maintain sparse file only takes 26.1% of the original size, andthe the merged memory regions as an interval tree, sorting and comp-desen file saves 99.3% of disk space. In the worst case, searching address ranges in logarithmic time complexity. The DESENSITIZATION can save 24.7% of storage for the php overhead is therefore reduced by another 77%. program. When using compression, it still reduces the report size to 4.48%. Figure 11 shows the peak memory usage for processing crashes across benchmarks. In general, DESENSITIZATION From these result we can see that the effectiveness of consumes less than 2.5 gigabytes at the peak to process each DESENSITIZATION on reducing report size has a strong neg- crash. While coredumps from ffmpeg and firefox incur higher ative correlation with the density of pointers (each figure is memory usage with medians of 2 gigabytes, those of php, sorted by the percentage of pointers). This result is reasonable chakra and tachikoma only takes around 154, 429 and 117 because, as shown in Table V, pointers dominate the remaining megabytes per crash, respectively. Again, DESENSITIZATION non-zero data, on average 89.5%. Therefore, a memory space needs less memory resource to process each minidump (i.e., 104 with a low ratio of pointers will be significantly desensitized MB). In general, the memory usage is consistent with the file and left with many large holes of null bytes, leading to a high sizes of processed crash reports. As we mentioned above, most reduction rate by sparse files. However, the compression ratio is coredumps from ffmpeg and firefox present physical sizes of less related to the percentage of pointers. Since the compression more than 1 gigabytes, and so simply reformatting them into mainly depends on the frequency and diversity of data, pointer their memory representation requires a considerable amount of density is less important in this case. Although the results of memory usage. Meanwhile, the complexity of structures within ffmpeg and tachikoma suggest a positive correlation between the crash reports, such as the heap regions, also contributes them, we cannot find a similar relationship from other figures. to the finding as DESENSITIZATION needs more memory to An interesting observation from Figure 12a is that when preserve those pieces of information. the percentage of pointers is small, the desen-sparse file

13 100 desensitize+compress

80 desensitize compress

60

Size (%) 40

20

0 0 500 1000 1500 0 250 500 750 1000 1250 0 250 500 750 1000 1250 1500 0 250 500 750 1000 1250 1500 0 250 500 750 1000 1250 1500 0 1000 2000 3000 4000 5000

(a) ffmpeg (b) php (c) chakra (d) ff-coredump (e) ff-minidump (f) tachikoma

Fig. 12: Crash report size with various reduction techniques, unified to the sparse file of the original memory. desensitize saves the desensitized memory as sparse file; compress compresses the sparse file of the original memory; desensitize+compress compresses the sparse file of the desensitized memory.

has a smaller size than the comp-sparse file. In these cases, VI.DISCUSSION DESENSITIZATION is more effective than compression on reducing file sizes. By manually examining the original memory, We discuss several important aspects of DESENSITIZATION, we find that those crashes take large multimedia files as inputs, including its support to sophisticated attack analysis techniques which are left in memory when the program crashes. As the and possible attacks against the new crash report format. input data vary significantly in terms of frequencies and patterns, the compression can only reduce them to around 40% of the A. Analyzability of Desensitized Crashes original file. However, DESENSITIZATION manages to remove DESENSITIZATION strives to achieve a new balance be- most of the input data so that holes of null bytes dominate tween user privacy and the analyzability of the crash reports. the desensitized memory, making the sparse file more effective Our design of DESENSITIZATION follows a common practice of than compression in reducing the file size. protecting user privacy that achieves a significant privacy benefit at the cost of sacrificing some functionalities. For example, A comparison between Figure 12d and Figure 12e reveals the widely adopted minidump aggressively drops all data of that DESENSITIZATION reduces file size more effectively for the crash outside the stack, with the observation that stack coredumps than for minidumps, while the compression works data plays an important role in triaging and analyzing bugs [1]. well for both. Specifically, the desen-sparse files are only 55.8% Meanwhile, our tool starts from known analysis techniques and of the original sparse files for coredumps, while for minidumps, aims to shrink the crash while keeping necessary information. the number is around 92.3%. Simply compressing the original spare files results in 11.2% of the original size for coredumps, To mitigate the reduction of analyzability and support and for minidumps, the number is even smaller, 9.29%. The special use cases, we design DESENSITIZATION in a modular ineffectiveness against minidumps by DESENSITIZATION is mode so that developers can customize it to compensate for its caused by the selective dumping of the crashed memory. As security guarantee. We believe any practical solutions for a large, minidumps mainly contain thread stacks where many pointers complex piece of software would require domain knowledge exist, DESENSITIZATION cannot desensitize as much as that from experts, such as the developers of real-world applications. for the coredumps. Though DESENSITIZATION manages to We provide interfaces to allow developers to specify crash nullify 49.0% of non-zero bytes in minidumps, the removed memories for complete collection or removal of data within bytes are usually inter-mixed with other untouched ones on the the ranges. For example, data near the stack pointers at crash stack, rendering the sparse file less effective in terms of size sites can be saved if needed, which might be useful for limited reduction. Since the stack pointers are similar to each other, reverse execution by developers. On the other hand, matched especially for the saved esp/rsp, the compression method is data patterns through regular expression can also be selectively still able to significantly reduce the file size. preserved when specified.

Figure 12f shows that DESENSITIZATION can reduce the B. Hybrid Bug-report Model sizes of tachikoma coredumps in a consistent ratio. The result is consistent with the findings on memory byte reduction in Crash reports from a large number of users are highly Figure 5f, where DESENSITIZATION can remove data in a redundant: many crashes are caused by the same bug. Therefore, similar percentage for all crashes. In some extreme cases, the current crash report systems widely adopt automatic bug- desen-sparse files after DESENSITIZATION are still as high as classification techniques to group crash reports based on their 67.44% of the orig-sparse files. By manually examining these root causes and prioritize bugs with higher frequencies [57], [8]. crashes, we find that attackers tried to aggressively spray the In this case, our DESENSITIZATION technique helps reduce a heap region with consecutive identical objects filled by data large amount of network bandwidth and server-side storage, pointers to increase the chance of a successful exploitation. As as it removes crash-unrelated information. However, since DESENSITIZATION keeps all pointers, the desensitized sparse developers can benefit from a detailed memory snapshot, file cannot achieve a great reduced size. However, it is necessary or even the crashing input, we can integrate our method to keep all such pointers to assist server-side attack analysis. into a hybrid bug-report system. Specifically, for most end- Meanwhile, the comp-sparse file shows a better compression users, we deploy DESENSITIZATION to protect user privacy ratio than most others for this case due to the consecutive and reduce the crash report size. Meanwhile, we can have identical objects with repeated patterns. some volunteer users submit a complete memory snapshot

14 without DESENSITIZATION to assist developers in debugging server and relies on client-server interaction to construct the the program. To protect the privacy of volunteer users, the crash new input [80]. To achieve this goal, both systems rely on report system can randomly select some from all volunteer users a logging system to record the crashing input, a tracing tool and submit their complete crash report to developers. This to collect the execution path, a taint analysis engine to track hybrid model has less bandwidth and storage while providing the user input, a symbolic execution engine to capture the sufficient information for debugging. path constraints, and an SMT solver to provide a new input. These techniques are computation-intensive and require a lot C. Attacks against DESENSITIZATION of dependencies, either on the client side or on the server side, and thus provide very limited deployability. CAMOUFLAGE Although DESENSITIZATION mainly adopts systematic embeds two optimizations to reduce the information leakage approaches for preserving necessary debugging information, from the new input, specifically, path condition relaxation, some design choices such as identifying thresholds for attacks which modifies the path constraints to allow more satisfying are considered heuristic. While most are decided based on the inputs, and breakable input conditions, which prevent the new survey of exploits publicly available, such as the parameters for input from sharing bytes with the original one [24]. RETOME identifying ROP and shellcode payloads, the numbers can be improves the efficiency of symbolic execution by focusing on ad-hoc and thus provide a limited security guarantee. A stealthy input-related branches [4]. Our system identifies and removes attacker, for example, can craft ROP gadgets with long ROP debugging-irrelevant data from the crash file locally and thus sequences to bypass the detection from DESENSITIZATION, so is self-contained and efficient. that the resulting attack residues would be partially removed. Nevertheless, all the pointers involved in the gadget chains Budi et al. propose kb-anonymity to anonymize sensitive would still be preserved in this case, leaving the possibility datasets while preserving their behaviors before sending them for developers to track them down. Meanwhile, as discussed out for program testing and debugging [14]. The work focuses above, DESENSITIZATION offers interfaces for customization, on achieving indistinguishable property between multiple ho- including the preserving rules for crashing data. Therefore, more mogeneous data items – the sensitive data is retained but cannot sophisticated identification techniques from existing works42 [ ], be linked to individuals. Our work cleans up personal sensitive [66], [65], [21], [37] can easily co-exist with the current data for each crash file, which guarantees the anonymity. prototype once specified by the developers, with a potential A different input triggering the same execution path as the trade-off between accuracy and performance. original one may still leak some private information, especially the path constraints that contain strict requirement on the VII.RELATED WORK inputs. To further protect user privacy, several works search for alternative execution paths that have relaxed constraints SCRASH removes sensitive data from crash files to protect on inputs while triggering the same crash [52], [53], [54]. users’ privacy [13]. It relies on developers to manually annotate MPP relies on symbolic execution to explore program states sensitive variables and uses data-flow analysis to identify to find all paths that induce the same52 crash[ ]. REAP and more sensitive data. It allocates sensitive data in a separated RESPA improve the practicability of MPP by limiting the memory region and quickly erases them during program scope of symbolic execution [53], [54]. Specifically, REAP crashes. However, manual annotation incurs a heavy burden for places randomized, bounded detours around the original path developers and the data-flow analysis has limited scalability; to avoid the path explosion issue. RESPA takes a principled the analysis requires program source code and thus does not way to retain necessary nodes and find the shortest path to work on legacy binaries. Our system automatically identifies reach the buggy location with the least information leakage. and removes sensitive data through crash file analysis and thus does not affect normal executions. Satvat et al. perform an empirical study on 2.5 million crash reports and find an enormous amount of sensitive information, Brickell et al. propose a solution [12] to address the privacy like token IDs, session IDs, and passwords [69]. This result issue when multiple users share their bug reports [38]. In this confirms that current common crash report systems are leaking case, a server accepts many bug reports from professional users. users’ privacy to software vendors. The authors propose a Then, the server adopts a machine-learning algorithm to train simple pattern-based method to identify and replace sensitive a model to assign the description to each new, ambiguous data in URLs and bug descriptions. Our system focuses on the bug report from normal users. During the training and testing, crash file, which may contain more sensitive information. the server does not want the user to learn the internal of the trained model, while the user does not want the server to learn her execution traces. The proposed method combines several VIII.CONCLUSION cryptographic techniques to achieve its goal. Instead of sharing the full bug report, our system removes sensitive information We propose DESENSITIZATION, an extensible framework from the crash file and minimizes the chance to leak a user’s that generates privacy-aware and attack-preserving crash reports. private information. We can use their cryptographic techniques DESENSITIZATION utilizes lightweight analysis to identify bug- to protect the data left in the crash file, but that will require a and attack-related data, and nullifies others to protect users’ significant change on the server side. privacy and reduce the report size. Our evaluation on real-world crashes shows that DESENSITIZATION can effectively remove Castro et al. try to generate a different input that forces 86.6% of all non-zero bytes in coredumps and 40.9% of those the program to execute (i.e., crash) in the same path as the in minidumps. Additionally, the desensitized crash reports can crashing input and send the new input to the server [19]. be reduced to 45.0% of their original sizes, saving significant PANALYST moves most of the work from the client to the resources for report submission and storage.

15 ACKNOWLEDGMENT [19] M. Castro, M. Costa, and J.-P. Martin, “Better Bug Reporting with Better Privacy,” in Proceedings of the 13th ACM International Conference We thank the anonymous reviewers, and our shepherd, on Architectural Support for Programming Languages and Operating Stephen McCamant, for their helpful feedback. We are grateful Systems (ASPLOS), Seattle, WA, Mar. 2008. to Brendan Saltaformaggio for his valuable feedback and [20] S. K. Cha, T. Avgerinos, A. Rebert, and D. Brumley, “Unleashing discussions on earlier versions of this paper. This research Mayhem on Binary Code,” in Proceedings of the 33rd IEEE Symposium was supported, in part, by the NSF award CNS-1563848, on Security and Privacy (Oakland), San Francisco, CA, May 2012. CNS-1704701, CRI-1629851 and CNS-1749711, ONR under [21] Y. Cheng, Z. Zhou, M. Yu, X. Ding, and R. H. Deng, “ROPecker: A Generic and Practical Approach for Defending Against ROP Attacks,” grant N00014-18-1-2662, N00014-15-1-2162, N00014-17-1- in Proceedings of the 2014 Annual Network and Distributed System 2895, DARPA AIMEE, and ETRI IITP/KEIT[2014-3-00035], Security Symposium (NDSS), San Diego, CA, Feb. 2014. and gifts from Facebook, Mozilla, Intel, VMware and . [22] V. Chipounov, V. Kuznetsov, and G. Candea, “S2E: A Platform for In Vivo Multi-Path Analysis of Software Systems,” in Proceedings of the 16th ACM International Conference on Architectural Support for REFERENCES Programming Languages and Operating Systems (ASPLOS), Newport [1] N. Adman and M. Satran, “Minidump Files,” https://docs.microsoft.com/ Beach, CA, Mar. 2011. en-us/windows/desktop/Debug/minidump-files, 2018. [23] T. O. Chris Evans, “The Poisoned NULL Byte,” https://googleprojectzero. [2] Adobe Systems Inc., “Adobe: Submitting Crash Reports,” https://helpx. blogspot.com/2014/08/the-poisoned-nul-byte-2014-edition.html, 2014. adobe.com/photoshop/kb/submit-crash-reports.html, 2018. [24] J. Clause and A. Orso, “Camouflage: Automated Anonymization of [3] S. Andersen and V. Abella, “Data Execution Prevention. Changes to Field Data,” in Proceedings of the 33th International Conference on Functionality in XP Service Pack 2, Part 3: Memory Software Engineering (ICSE), Honolulu, HI, May 2007. Protection Technologies,” 2004. [25] W. Cui, X. Ge, B. Kasikci, B. Niu, U. Sharma, R. Wang, and I. Yun, [4] S. Andrica and G. Candea, “Mitigating Anonymity Challenges in “REPT: Reverse Debugging of Failures in Deployed Software,” in Automated Testing and Debugging Systems,” in Proceedings of the Proceedings of the 13th USENIX Symposium on Operating Systems 10th International Conference on Autonomic Computing (ICAC), San Design and Implementation (OSDI), Carlsbad, CA, Oct. 2018. Jose, CA, 2013. [26] W. Cui, M. Peinado, S. K. Cha, Y. Fratantonio, and V. P. Kemerlis, [5] Apple Inc., “Technical Note TN2123: CrashReporter,” https://developer. “RETracer: Triaging Crashes by Reverse Execution from Partial Memory apple.com/library/content/technotes/tn2004/tn2123.html, 2018. Dumps,” in Proceedings of the 38th International Conference on Software [6] H. Argp, “Pseudomonarchia jemallocum,” http://www.phrack.org/issues/ Engineering (ICSE), Austin, Texas, May 2016. 68/10.html, 2012. [27] Y. Dang, R. Wu, H. Zhang, D. Zhang, and P. Nobel, “ReBucket: A [7] P. Argyroudis and C. Karamitas, “Exploiting the jemalloc memory Method for Clustering Duplicate Crash Reports Based on Call Stack allocator: Owning Firefox’s heap,” in Black Hat USA Briefings (Black Similarity,” in Proceedings of the 34th International Conference on Hat USA), Las Vegas, NV, Aug. 2012. Software Engineering (ICSE), Zurich, Switzerland, Jun. 2012. [8] Backtrace, “Crash Deduplication: Triaging Effectively,” https://backtrace. [28] DEF CON Communications, Inc., “DEF CON Media Server,” https: io/blog/engineering/crash-deduplication/, 2017. //media.defcon.org/, 2015. [9] E. Bendersky, “Parsing ELF and DWARF in Python,” https://github.com/ [29] S. Designer, “Non-executable User Stack,” 2000. eliben/pyelftools, 2011. [30] A. D. Federico, A. Cama, Y. Shoshitaishvili, C. Kruegel, and G. Vigna, [10] blackngel, “Malloc des-maleficarum,” http://phrack.org/issues/66/10.html, “How the ELF Ruined Christmas,” in Proceedings of the 24th USENIX 2009. Security Symposium (Security), Washington, DC, Aug. 2015. [11] T. Bletsch, X. Jiang, V. W. Freeh, and Z. Liang, “Jump-oriented [31] J. N. Ferguson, “Understanding the Heap by Breaking It,” in Black Hat programming: a new class of code-reuse attack.” in Proceedings of the USA Briefings (Black Hat ,USA) Las Vegas, NV, Aug. 2007. 6th ACM Symposium on Information, Computer and Communications [32] g463, “The Use of set_head to Defeat the Wilderness,” http://phrack. Security (ASIACCS), Hong Kong, China, Mar. 2011. org/issues/64/9.html, 2007. [12] J. Brickell, D. E. Porter, V. Shmatikov, and E. Witchel, “Privacy- [33] K. Glerum, K. Kinshumann, S. Greenberg, G. Aul, V. Orgovan, preserving Remote Diagnostics,” in Proceedings of the 14th ACM Con- G. Nichols, D. Grant, G. Loihle, and G. Hunt, “Debugging in the (Very) ference on Computer and Communications Security (CCS), Alexandria, Large: Ten Years of Implementation and Experience,” in Proceedings VA, Oct.–Nov. 2007. of the 22nd ACM Symposium on Operating Systems Principles (SOSP), [13] P. Broadwell, M. Harren, and N. Sastry, “Scrash: A System for Big Sky, MT, Oct. 2009. Generating Secure Crash Information,” in Proceedings of the 12th USENIX Security Symposium (Security), Washington, DC, Aug. 2003. [34] W. Gloger, “Ptmalloc2 - a Multi-thread malloc Implementation,” https://github.com/emeryberger/Malloc-Implementations/tree/master/ [14] A. Budi, D. Lo, L. Jiang, and Lucia, “kb-Anonymity: A Model allocators/ptmalloc/ptmalloc2, 2001. for Anonymized Behaviour-preserving Test and Debugging Data,” in Proceedings of the 2011 ACM SIGPLAN Conference on Programming [35] Google, “BreakPad: A Set of Client and Server Components which Language Design and Implementation (PLDI), San Jose, CA, Jun. 2011. Implement a Crash-reporting System,” https://chromium.googlesource. com/breakpad/breakpad/. [15] C. Cadar, D. Dunbar, D. R. Engler et al., “KLEE: Unassisted and Automatic Generation of High-Coverage Tests for Complex Systems [36] ——, “The Chromium Projects: Reporting a Crash Bug,” https://www. Programs,” in Proceedings of the 8th USENIX Symposium on Operating chromium.org/for-testers/bug-reporting-guidelines/reporting-crash-bug, Systems Design and Implementation (OSDI), San Diego, CA, Dec. 2008. 2018. [16] C. Cadar, V. Ganesh, P. Pawlowski, D. Dill, and D. Engler, “EXE: A [37] M. Graziano, D. Balzarotti, and A. Zidouemba, “ROPMEMU: A System for Automatically Generating Inputs of Death using Symbolic Framework for the Analysis of Complex Code-reuse Attacks,” in Execution,” in Proceedings of the 13th ACM Conference on Computer Proceedings of the 11th ACM Symposium on Information, Computer and and Communications Security (CCS), Alexandria, VA, Oct.–Nov. 2006. Communications Security (ASIACCS), Xi’an, China, May–Jun. 2016. [17] M. Carbone, W. Cui, L. Lu, W. Lee, M. Peinado, and X. Jiang, “Mapping [38] J. Ha, C. J. Rossbach, J. V. Davis, I. Roy, H. E. Ramadan, D. E. Kernel Objects to Enable Systematic Integrity Checking,” in Proceedings Porter, D. L. Chen, and E. Witchel, “Improved Error Reporting for of the 16th ACM Conference on Computer and Communications Security Software That Uses Black-box Components,” in Proceedings of the (CCS), Chicago, IL, Nov. 2009. 2007 ACM SIGPLAN Conference on Programming Language Design [18] N. Carlini and D. Wagner, “ROP is Still Dangerous: Breaking Modern and Implementation (PLDI), San Diego, CA, Jun. 2007. Defenses.” in Proceedings of the 23rd USENIX Security Symposium [39] C. Han, “A collection of JavaScript engine CVEs with PoCs,” https: (Security), San Diego, CA, Aug. 2014. //github.com/tunz/js-vuln-db.

16 [40] H. Hu, S. Shinde, S. Adrian, Z. L. Chua, P. Saxena, and Z. Liang, [65] M. Polychronakis, K. G. Anagnostakis, and E. P. Markatos, “Network- “Data-Oriented Programming: On the Expressiveness of Non-Control level Polymorphic Shellcode Detection using Emulation,” in Proceedings Data Attacks,” in Proceedings of the 37th IEEE Symposium on Security of the 3th Conference on Detection of Intrusions and Malware and and Privacy (Oakland), San Jose, CA, May 2016. Vulnerability Assessment (DIMVA), Berlin, Heidelberg, Germany, Jul. [41] Huku, “Yet Another free() Exploitation Technique,” http://phrack.org/ 2006. issues/66/6.html, 2009. [66] ——, “Emulation-based Detection of Non-self-contained Polymorphic Proceedings of the 10th International Symposium on [42] X. Jia, C. Zhang, P. Su, Y. Yang, H. Huang, and D. Feng, “Towards Shellcode,” in Research in Attacks, Intrusions and Defenses (RAID) Efficient Heap Overflow Discovery,” in Proceedings of the 26th USENIX , Berlin, Germany, Security Symposium (Security), Vancouver, BC, Canada, Aug. 2017. Sep. 2007. [67] M. Polychronakis and A. Keromytis, “ROP Payload Detection using [43] jp, “Advanced Doug lea’s malloc exploits,” http://phrack.org/issues/61/6. Speculative Code Execution,” in Proceedings of the 6th International html, 2003. Conference on Malicious and Unwanted Software (MALWARE), Oct. [44] C. Karamitas, “Python Bindings for Intel’s XED,” https://github.com/ 2011. huku-/pyxed, 2014. [68] F. Qin, C. Wang, Z. Li, H.-s. Kim, Y. Zhou, and Y. Wu, “Lift: A Low- [45] S. Kim, T. Zimmermann, and N. Nagappan, “Crash Graphs: An overhead Practical Information Flow Tracking System for Detecting Aggregated View of Multiple Crashes to Improve Crash Triage,” in Security Attacks,” in Proceedings of the 39th Annual IEEE/ACM Proceedings of the 2011 IEEE/IFIP 41st International Conference on International Symposium on Microarchitecture (MICRO), 2006. Dependable Systems and Networks (DSN), Washington, DC, 2011. [69] K. Satvat and N. Saxena, “Crashing Privacy: An Autopsy of a Web [46] B. Lee, C. Song, Y. Jang, T. Wang, T. Kim, L. Lu, and W. Lee, Browser’s Leaked Crash Reports,” arXiv preprint arXiv:1808.01718, “Preventing Use-after-free with Dangling Pointers Nullification,” in 2018. Proceedings of the 2015 Annual Network and Distributed System Security [70] A. Schroter, N. Bettenburg, and R. Premraj, “Do Stack Traces Help Symposium (NDSS), San Diego, CA, Feb. 2015. Developers Fix Bugs?” in Proceedings of the 7th IEEE Working [47] B. Lee, C. Song, T. Kim, and W. Lee, “Type casting verification: Stopping Conference on Mining Software Repositories (MSR), Big Sky, MT, an emerging attack vector.” in Proceedings of the 24th USENIX Security May 2010. Symposium (Security), Washington, DC, Aug. 2015. [71] F. Schuster, T. Tendyck, C. Liebchen, L. Dvai, A.-R. Sadeghi, and [48] B. Liblit and A. Aiken, “Building a Better Backtrace: Techniques for T. Holz, “Counterfeit object-oriented programming: On the difficulty Postmortem Program Analysis,” 2002, technical report. of preventing code reuse attacks in C++ applications.” in Proceedings of the 36th IEEE Symposium on Security and Privacy (Oakland), San [49] Z. Lin, J. Rhee, X. Zhang, D. Xu, and X. Jiang, “SigGraph: Brute Jose, CA, May 2015. Force Scanning of Kernel Data Structure Instances Using Graph-based Signatures,” in Proceedings of the 18th Annual Network and Distributed [72] SecurityFocus, “SecurityFocus,” http://www.securityfocus.com/, 2010. System Security Symposium (NDSS), San Diego, CA, Feb. 2011. [73] H. Shacham, “The Geometry of Innocent Flesh on the Bone: Return-into- [50] Linux, “printf(3) - Linus Man Page,” https://linux.die.net/man/3/printf, libc Without Function Calls (on the x86),” in Proceedings of the 14th 2018. ACM Conference on Computer and Communications Security (CCS), Alexandria, VA, Oct.–Nov. 2007. [51] Linux Programmer’s Manual, “core - core dump file,” http://man7.org/ linux/man-pages/man5/core.5.html. [74] Shellphish, “Educational Heap Exploitation,” https://github.com/ shellphish/how2heap, 2016. [52] P. Louro, J. Garcia, and P. Romano, “Multipathprivacy: Enhanced privacy [75] K. Z. Snow, S. Krishnan, F. Monrose, and N. Provos, “SHELLOS: in fault replication,” in Proceedings of the Ninth European Dependable Enabling Fast Detection and Forensic Analysis of Code Injection Attacks,” Computing Conference, 2012. in Proceedings of the 20th USENIX Security Symposium (Security), San [53] J. Matos, J. Garcia, and P. Romano, “Reap: Reporting Errors using Francisco, CA, Aug. 2011. Alternative Paths,” in Proceedings of 2014 European Symposium on [76] D. Song, D. Brumley, H. Yin, J. Caballero, I. Jager, M. G. Kang, Programming Languages and Systems, 2014. Z. Liang, J. Newsome, P. Poosankam, and P. Saxena, “BitBlaze: A New [54] ——, “Enhancing Privacy Protection in Fault Replication Systems,” in Approach to Computer Security via Binary Analysis,” in Proceedings of Proceedings of the 26th International Symposium on Software Reliability 2008 International Conference on Information Systems Security, 2008. Engineering (ISSRE), 2015. [77] st4g3r, “House of Einherjar - Yet Another Heap Exploitation [55] Microsoft, “!analyze,” https://docs.microsoft.com/en-us/windows- Technique on GLIBC,” https://github.com/st4g3r/House-of-Einherjar- hardware/drivers/debugger/-analyze, 2018. CB2016, 2016. [56] N. Modani, R. Gupta, G. Lohman, T. Syeda-Mahmood, and L. Mignet, [78] Ubuntu, “Apport Crash Duplication Detection,” https://blueprints. “Automatically Identifying Known Software Problems,” in IEEE 23rd launchpad.net/ubuntu/+spec/apport-crash-duplicates, 2007. International Conference on Data Engineering. IEEE, 2007, pp. 433– [79] ——, “Ubuntu: CrashReporting,” https://wiki.ubuntu.com/ 441. CrashReporting, 2018. [57] Mozilla, “Socorro,” 2010, https://github.com/mozilla-services/socorro. [80] R. Wang, X. Wang, and Z. Li, “Panalyst: Privacy-Aware Remote Error [58] ——, “Mozilla ,” https://support.mozilla.org/en-US/kb/ Analysis on Commodity Software,” in Proceedings of the 17th USENIX mozillacrashreporter, 2018. Security Symposium (Security), San Jose, CA, Jul.–Aug. 2008. [59] ——, “,” https://www.bugzilla.org/, 2019. [81] J. Xu, D. Mu, P. Chen, X. Xing, P. Wang, and P. Liu, “CREDAL: Towards Locating a Memory Corruption Vulnerability with Your Core [60] ——, “CVE-2019-9810,” https://www.mozilla.org/en-US/security/ Dump,” in Proceedings of the 23rd ACM Conference on Computer and advisories/mfsa2019-10/#CVE-2019-9810, 2019. Communications Security (CCS), Vienna, Austria, Oct. 2016. [61] J. Newsome and D. X. Song, “Dynamic Taint Analysis for Automatic [82] J. Xu, D. Mu, X. Xing, P. Liu, P. Chen, and B. Mao, “Postmortem Detection, Analysis, and SignatureGeneration of Exploits on Commodity Program Analysis with Hardware-Enhanced Post-Crash Artifacts,” in Software,” in Proceedings of the 12th Annual Network and Distributed Proceedings of the 26th USENIX Security Symposium (Security), System Security Symposium (NDSS), San Diego, CA, Feb. 2005. Vancouver, BC, Canada, Aug. 2017. [62] Offensive Security, “Exploits Database by Offensive Security,” http: [83] M. Zalewski, “AFL-Fuzz: Crash Exploration Mode,” https://lcamtuf. //www.exploit-db.com/, 2018. blogspot.com/2014/11/afl-fuzz-crash-exploration-mode.html, 2018. [63] N. L. Petroni Jr and M. Hicks, “Automated Detection of Persistent Kernel Control-flow Attacks,” in Proceedings of the 14th ACM Conference on Computer and Communications Security (CCS), Alexandria, VA, Oct.– Nov. 2007. [64] P. Phantasmagoria, “Exploiting The Wilderness,” http://seclists.org/vuln- dev/2004/Feb/25, 2004.

17 APPENDIX B. Cases of Privacy Leakage

A. PoCs and Crash Distribution 1 struct nsCookieAttributes { 2 nsAutoCString name, value, host, path; ... 3 } Benchmark Bugs Crashes 4 nsresult CookieServiceChild::SetCookieStringInternal(...){ ffmpeg (21) #4294, #4931, #4933, #4969, #5098, #5099, 1,667 5 // loop to parse cookie string data #5210, #5487, #5528, #5683, #5736, #5752, 6 do { 7 nsCookieAttributes cookieAttributes; #6107, #6303, #6491, #6640, #6804, #6873, 8 moreCookies= nsCookieService::CanSetCookie( #6887 9 aHostURI, key, cookieAttributes, requireHostMatch, CVE-2016-10190, CVE-2016-10191 400 10 cookieStatus, cookieString, serverTime, ...) 11 // more actions... php (17) #71311, #71436, #71449, #71450, #71637 1,313 12 } while (moreCookies); (CVE-2016-4344), #71735 (CVE-2016-3132), 13 } #72099 (CVE-2016-4539), #72403, #72595, Fig. 13: The code of handling cookie strings in firefox. The shaded #73029 (CVE-2016-7417), #73240, #74577, line may leave user’s cookies in crash reports. #74593, #74604 (CVE-2017-9118), #74977 CVE-2015-8617, CVE-2016-4071 200

chakra (32) CVE-2017-11764, CVE-2017-11799, CVE- 1,497 1 nsresult FSMultipartFormData::AddNameValuePair( 2017-11839, CVE-2017-11841, CVE-2017- 2 const nsAString& aName, const nsAString& aValue) { 11870, CVE-2017-11873, CVE-2017-11893, 3 nsAutoCString encodedVal; 4 nsresult rv= EncodeVal(aValue, encodedVal, false); CVE-2017-11911, CVE-2017-8548, CVE-2017- 5 // prepare more headers & data... 8636, CVE-2017-8640, CVE-2017-8646, CVE- 6 } 2017-8656, CVE-2017-8671, CVE-2017-8729, 7 nsresult HTMLFormSubmission::GetFromForm(HTMLFormElement* aForm, CVE-2017-8755, CVE-2018-0758, CVE-2018- 8 HTMLFormSubmission** aFormSubmission, ...) { 0767, CVE-2018-0769, CVE-2018-0776, CVE- 9 // Get encoding 2018-0776, CVE-2018-0780, CVE-2018-0834, 10 auto encoding= GetSubmitEncoding(aForm)->OutputEncoding(); CVE-2018-0835, CVE-2018-0837, CVE-2018- 11 // Choose encoder 0840, CVE-2018-0860, CVE-2018-0953, CVE- 12 if (method== NS_FORM_METHOD_POST&& 13 enctype== NS_FORM_ENCTYPE_MULTIPART) { 2018-8279, CVE-2018-8291 14 // create headers & data for FSMultipartFormData; CVE-2016-0193, CVE-2017-0266 100 15 // in turn calls AddNameValuePair() 16 *aFormSubmission= new FSMultipartFormData(actionURL, firefox (16) #1389812, #1397642, #1530958, #1532599, 30 17 target, encoding, aOriginatingElement); #1536768, #1538120, #1544386, CVE-2018- 18 // Else, other encoders ... 12386, CVE-2018-12387, CVE-2018-5091, 19 } CVE-2018-5092, CVE-2018-5095, CVE-2018- Fig. 14: The sample code of preparing form data in firefox. 5130, CVE-2019-9791, CVE-2019-9813 Alex Top 1500* 3000* CVE-2019-9810 50 1 nsresult nsAutoCompleteController::EnterMatch( 2 bool aIsPopupSelection, dom::Event* aEvent){ TABLE VI: The bugs for generating crashes of various real- 3 nsCOMPtr input(mInput); 4 nsCOMPtr popup(GetPopup()); world benchmarks, along with the number of crashes collected. 5 The rows in bold indicate bugs used for generating attacker- 6 int32_t selectedIndex; driven crashes. The row with asterisk superscript for firefox in- 7 popup->GetSelectedIndex(&selectedIndex); dicates the crashes caused by SIGABRT signals when visting 8 bool shouldComplete; 9 input->GetCompleteDefaultIndex(&shouldComplete); https://trac.ffmpeg.org/ticket/[BUGID] the websites. Visit and 10 https://bugs.php.net/bug.php?id=[BUGID] for the detailed reports 11 nsAutoString value; of the ffmpeg and php bugs, respectively. The PoCs of all the chakra 12 if (selectedIndex>=0){ and firefox bugs can be found at [39] and [59]. 13 // selected index for popup ... 14 GetResultValueAt(selectedIndex, true, value); 15 } else if (shouldComplete) { 16 // incomplete input triggering matched entry 17 nsAutoString defaultIndexValue; 18 if (GetFinalDefaultCompleteValue(defaultIndexValue)) 19 value= defaultIndexValue; 20 // else... 21 } Fig. 15: The sample code of autofilling per history inputs in firefox.

18