Spandex: Secure Password Tracking for Android Landon P
Total Page:16
File Type:pdf, Size:1020Kb
SpanDex: Secure Password Tracking for Android Landon P. Cox, Peter Gilbert, Geoffrey Lawler, Valentin Pistol, Ali Razeen, Bi Wu, and Sai Cheemalapati, Duke University https://www.usenix.org/conference/usenixsecurity14/technical-sessions/presentation/cox This paper is included in the Proceedings of the 23rd USENIX Security Symposium. August 20–22, 2014 • San Diego, CA ISBN 978-1-931971-15-7 Open access to the Proceedings of the 23rd USENIX Security Symposium is sponsored by USENIX SpanDex: Secure Password Tracking for Android Landon P. Cox Peter Gilbert Geoffrey Lawler Valentin Pistol Duke University Duke University Duke University Duke University Ali Razeen Bi Wu Sai Cheemalapati Duke University Duke University Duke University Abstract to indicate which parts of the system state hold secret information. Taint tracking has been extensively stud- This paper presents SpanDex, a set of extensions to An- ied for many decades and has practical appeal because it droid’s Dalvik virtual machine that ensures apps do not can be transparently implemented below existing inter- leak users’ passwords. The primary technical challenge faces [11, 19, 5, 14]. addressed by SpanDex is precise, sound, and efficient Most taint-tracking monitors handle only explicit handling of implicit information flows (e.g., information flows, which directly transfer secret information from an transferred by a program’s control flow). SpanDex han- operation’s source operands to its destination operands. dles implicit flows by borrowing techniques from sym- However, programs also contain implicit flows, which bolic execution to precisely quantify the amount of infor- transfer secret information to objects via a program’s mation a process’ control flow reveals about a secret. To control flow. Implicit flows are a long-standing prob- apply these techniques at runtime without sacrificing per- lem [8] that, if left untracked, can dangerously under- formance, SpanDex runs untrusted code in a data-flow state which objects contain secret information. On the sensitive sandbox, which limits the mix of operations that other hand, existing techniques for securely tracking im- an app can perform on sensitive data. Experiments with plicit flows are prone to significantly overstating which a SpanDex prototype using 50 popular Android apps and objects contain secret information. an analysis of a large list of leaked passwords predicts s that for 90% of users, an attacker would need over 80 Consider secret-holding integer variable and pseudo- s x = a y = b login attempts to guess their password. Today the same code if != 0 then : else : done. This code a x b y attacker would need only one attempt for all users. contains explicit flows from to and from to as well as implicit flows from s to x and s to y. A secure monitor must account for the information that flows from s to x 1 Introduction and s to y, regardless of which branch the program takes: y’s value will depend on s even when s is non-zero, and Today’s consumer mobile platforms such as Android and x’s value will depend on s even when s is zero. iOS manage large ecosystems of untrusted third-party Existing approaches to tracking implicit flows apply applications called “apps.” Apps are often integrated static analysis to all untaken execution paths within the with remote services such as Facebook and Twitter, and scope of a tainted conditional branch. The goal of this it is common for an app to request one or more pass- analysis is to identify all objects whose values are influ- words upon installation. Given the critical and ubiqui- enced by the condition. Strong security requires such tous role that passwords play in linking mobile apps to analysis to be applied conservatively, which can lead cloud-based platforms, it is paramount that mobile op- to prohibitively high false-positive rates due to variable erating systems prevent apps from leaking users’ pass- aliasing and context sensitivity [10, 14]. words. Unfortunately, users have no insight into how In this paper, we describe a set of extensions to An- their passwords are used, even as credential-stealing mo- droid’s Dalvik virtual machine (VM) called SpanDex bile apps grow in number and sophistication [12, 13, 24]. that provides strong security guarantees for third-party Taint tracking is an obvious starting point for securing apps’ handling of passwords. The key to our approach is passwords [11]. Under taint tracking, a monitor main- focusing on the common access patterns and semantics tains a label for each storage object. As a process ex- of the data type we are trying to protect (i.e., passwords). ecutes, the monitor dynamically updates objects’ labels SpanDex handles implicit flows by borrowing tech- 1 USENIX Association 23rd USENIX Security Symposium 481 niques from symbolic execution to precisely quantify tivation, Section 3 provides an overview of SpanDex’s the amount of information a process’ control flow re- design, Section 4 describes SpanDex’s design in detail, veals about a secret. Underlying this approach is the Section 5 describes our SpanDex prototype, Section 6 observation that as long as implicit flows transfer a safe describes our evaluation, and Section 7 provides our con- amount of information about a secret, the monitor need clusions. not worry about where this information is stored. For ex- ample, mobile apps commonly branch on a user’s pass- word to check that it contains a valid mix of characters. 2 Background and motivation As long as the implicit flows caused by these operations reveal only that the password is well formatted, the mon- Under dynamic information-flow tracking (i.e., taint itor does not need to update any object labels to indicate tracking), a monitor maintains a label for each storage which variables’ values depend on this information. object capable of holding secret information. A label To quantify implicit flows at runtime without sacrific- indicates what kind of secret information its associated ing performance, SpanDex executes untrusted code in object contains. Labels are typically represented as an a data-flow defined sandbox. The key property of the array of one-bit tags. Each tag is associated with a differ- sandbox is that it uses data-flow information to restrict ent source of secret data. A tag is set if its object’s value how untrusted code operates on secret data. In particular, depends on data from the tag’s associated source. Oper- SpanDex is the first system to use constraint-satisfaction ations change objects’ state by transferring information problems (CSPs) at runtime to naturally prevent pro- from one set of objects to another. Monitors propagate grams from certain classes of behavior. For example, tags by interposing on operations that could transfer se- SpanDex does not allow untrusted code to encrypt secret cret information, and then updating objects’ labels to re- data using its own cryptographic implementations. In- flect any data dependencies caused by an operation. We stead, SpanDex’s sandbox forces apps that require cryp- say that information derived from a secret is safe if it re- tography to call into a trusted library. veals so little about the original secret that releasing the SpanDex does not “solve” the general problem of im- information poses no threat. However, if information is plicit flows. If the amount of secret information revealed unsafe, then it should only be released to a trusted entity. through a process’ control flow exceeds a safe threshold, then a monitor must either fall back on conservative static 2.1 Related work: soundness, precision, analysis for updating individual labels or simply assume that all subsequent process outputs reveal confidential in- and efficiency formation. However, we believe that the techniques un- The three most important considerations for taint track- derlying SpanDex may be applicable to important data ing are soundness, precision, and efficiency. Tracking types besides passwords, including credit card numbers is sound if it can identify all process outputs that con- and social security numbers. Experiments with a proto- tain an unsafe amount of secret information. Soundness type implementation demonstrate that SpanDex is a prac- is necessary for security guarantees, such as preventing tical approach to securing passwords. Our experiments unauthorized accesses of secret information. Tracking is show that SpanDex generates far fewer false alarms than precise if it can identify how much secret information a the current state of the art, protects user passwords from process output contains. Precision can be tuned along a strong attacker, and is efficient. two dimensions: better storage precision associates la- This paper makes the following contributions: bels with finer-grained objects, and better tag precision SpanDex is the first runtime to securely track pass- associates finer-grained data sources with each tag. • word data on unmodified apps at runtime without Imprecise tracking leads to overtainting, in which safe overtainting or poor performance. outputs are treated as if they are unsafe. A common way SpanDex is the first runtime to use online CSP- to compensate for imprecise tracking is to require users • solving to force untrusted code to invoke trusted li- or developers to declassify tainted outputs by explicitly braries when performing certain classes of compu- clearing objects’ tags. tation on secret data. Tracking is efficient if propagating tags slows oper- Experiments with a SpanDex prototype show that • ations by a reasonable amount. The relationship be- it imposes negligible performance overhead, and a tween efficiency and precision is straightforward: in- study of 50 popular, non-malicious unmodified An- creasing storage precision causes a monitor to propagate droid apps found that all but eight executed nor- tags more frequently because it must interpose on lower- mally. level operations; increasing tag precision causes a moni- The rest of this paper is organized as follows: Sec- tor to do more work each time it propagates tags.