Jswhiz Static Analysis for Javascript Memory Leaks
Total Page:16
File Type:pdf, Size:1020Kb
JSWhiz Static Analysis for JavaScript Memory Leaks Jacques A. Pienaar Robert Hundt Purdue University Google, Inc. [email protected] [email protected] Abstract all major web browsers, such as Microsoft’s Internet Ex- JavaScript is the dominant language for implementing dy- plorer, Mozilla’s Firefox, Apple’s Safari, Google’s Chrome, namic web pages in browsers. Even though it is standard- and many others. With JavaScript’s execution speed acceler- ized, many browsers implement language and browser bind- ating, efforts such as node.js [1] promote using JavaScript ings in different and incompatible ways. As a result, a in servers as well. plethora of web development frameworks were developed to Despite the fact that JavaScript and its semantics have hide cross-browser issues and to ease development of large been standardized, the various JavaScript engines and their web applications. An unwelcome side-effect of these frame- embedding browsers implement important parts of Java- works is that they can introduce memory leaks, despite the Script in incompatible ways, a problem that has burdened fact that JavaScript is garbage collected. Memory bloat is a the web development community for years. As a result, a major issue for web applications, as it affects user perceived whole range of JavaScript development environments were latency and may even prevent large web applications from developed to ease development of larger JavaScript web ap- running on devices with limited resources. plications and to hide cross-browser differences. Prominent In this paper we present JSWhiz, an extension to the libraries such as JQuery [10] and Dojo [24], as well as frame- open-source Closure JavaScript compiler. Based on experi- works such as Google Web Toolkit [12, 15] or Google’s Clo- ences analyzing memory leaks in Gmail, JSWhiz detects five sure [8] are widely used and applications written with these identified common problem patterns. JSWhiz found a total toolsets are being used by hundreds of millions of users ev- of 89 memory leaks across Google’s Gmail, Docs, Spread- ery day. Google Closure stands out as it contains a JavaScript sheets, Books, and Closure itself. It contributed significantly compiler as well as library code. The (open-source) Closure in a recent effort to reduce Gmail memory footprint, which compiler is based on Mozilla’s Rhino [21]. It accepts legal resulted in bloat reduction of 75% at the 99th percentile, and JavaScript as input and produces JavaScript as output. It con- by roughly 50% at the median. tains a JavaScript comment-based type system to aid large- scale developments. The type system is also helpful in the Categories and Subject Descriptors D.3.4 [Processors]: context of this paper. The compiler offers many optimiza- Compilers; D.3.4 [Processors]: Optimization tions targeting performance, minification and obfuscation, General Terms Performance, Static Analysis which are all important for web applications. An interesting and unwelcome side-effect of these frame- Keywords Optimization, Performance, Static Analysis, works’ abstractions and wrappers is that they can introduce Memory Leak, JavaScript memory leaks, despite the fact that JavaScript is garbage col- lected. As with any garbage collected language, if references 1. Introduction to otherwise dead objects exist, the objects are reachable and JavaScript is the dominant language for implementing dy- cannot be reclaimed. In browsers, a simple example could namic web pages in browsers. It is the “Lingua Franca” of be that a group of DOM nodes is managed as a “view”, and the web. It has been standardized [14] and is supported by the application maintains a (badly managed) cache of these views. Since there are JavaScript objects maintaining refer- ences to DOM nodes, those can never be freed, unless the references from JavaScript are set to null as well. Permission to make digital or hard copies of all or part of this work for personal or A specifically common problem in browsers is being in- classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation troduced by the abstractions to handle the incompatible na- on the first page. To copy otherwise, to republish, to post on servers or to redistribute tive browser event systems. Typically, the frameworks pro- to lists, requires prior specific permission and/or a fee. CGO’13 23-27 February 2013, Shenzhen China. vide their own abstractions and wrap the native event sys- 978-1-4673-5525-4/13/$31.00 ©2013 IEEE. $15.00 tems to hide the browser differences. Often, global reg- istries are being used where references to events and event The rest of this paper is organized as follows. We show a targets are being maintained and have to be explicitly re- correlation between memory bloat and latency degradation leased to avoid leaks. This is similar to, e.g., the C++ new in Section 2. We describe the current manual labor intensive and delete memory allocation API requiring matching calls. process to find memory leaks in Section 3. In this section we This situation is further complicated by modern, dynami- also describe in detail the identified seven problem patterns cally compiled, browser-embedded execution engines. Com- and their resolution. We then discuss the static memory piled JavaScript living in an internal code cache can hang on leak identification tool we have developed in Section 4. We to DOM nodes which are allocated in the rendering parts of discuss results in Section 5, before we present related work the browser. Such problems are typically very hard to find, in Section 6. We conclude in Section 7. specifically because of insufficient tool support in browsers. Forgetting to manually “unlisten” to events can lead to leak- 2. Impact of Memory Bloat on Latency ing DOM nodes and unlimited memory growth. The Gmail team regularly got complaints from users about Memory bloat for web applications is big problem, it massive size Gmail processes. 2GB were not uncommon, impacts, as we shall show, user experienced latency, and some users reported memory usage of over 10GB. For mo- can prevent larger, long running web applications from even bile devices, such as Chromebooks or mobile tablets, which running on smaller devices. In the context of Gmail, we typically have less physical memory available than desktop studied the impact of memory bloat itself, as well its impact computers, Gmail would consume more memory than avail- on user experienced latency. able and be terminated on a regular basis. The Gmail team started an effort to reduce memory bloat To further study the impact of memory bloat on la- and identified and fixed several dozen problems in a labor tency, we ran a simple Gmail scenario (compose/discard) intensive, manual process. Since Gmail utilizes the Closure repeatedly for many hours. In Figure 1 the x-axis denotes compiler, the question was whether it would be possible time, 21,056 seconds for the whole experiment. The left y- to automate finding of these problems with help of static axis shows the latency in Milli-seconds, from 0ms to over analysis in the compiler. 2,000ms. The right y-axis shows the number of DOM nodes, The results presented in this paper are specific to the use which linearly increases in this experiment to up to over of the Closure library and mostly to the event system abstrac- 2,000,000 nodes. tions made in this framework. However, we want to empha- As the number of DOM nodes continues to grow into size that the problem itself is general—any framework that the millions because of Gmail memory leaks, we can see uses auxiliary data structures to maintain certain objects’ that the magnitude of the latency outliers also increases. lifetimes is susceptible to introducing memory leaks from An operation that took just about 100ms in the beginning mis-matched API calls. For example, the jQuery framework can take more than 2 seconds at the end of the experiment. has several leak patterns [3], and both the Dojo framework This can be explained by the effects of garbage collection. and the YUI framework suffer from leak patterns in the event In order to detect detached DOM trees the browser has to system [2, 4]. We focus on the Closure specific leak patterns, traverse all nodes, an operation that scales linearly to the but believe that many of these other patterns can be found number of DOM nodes. with static analysis similar to the one presented here. We The effect of this can be observed for real individual user make the following main contributions: accounts as well. Figure 2 shows how memory grows for • We show the correlation of memory bloat and latency a single user over the course of 4 days. Starting at only a degradations for web applications; few hundred MB, memory grows to roughly 1.5GB. Figure 3 shows the distribution of latency for this user. The blue • We identify seven patterns that introduce memory leaks dots represent user requests that were done during the day, in JavaScript; these examples are Closure specific, but, “empty” white space correspond to this user’s inactive night again, the conceptual problems are general. time. Latency variation increases greatly with memory bloat. • We developed compiler passes to identify five of them Latency outliers grow in size from roughly 2 seconds to over reliably, without any false positives. This implementation 12 seconds. will be released open-source with Google Closure. Using the new compiler passes, we found 89 memory 3. Memory leak patterns leaks across several large Google applications, such as Garbage collectors used by the modern browsers are able to Gmail, Docs, Spreadsheets, Presentations, Books, and the reclaim memory in a great assortment of varied and com- Closure libraries.