Strong Protection of Sensitive Textual Content of Mobile Applications

SchrodinText: Strong Protection of Sensitive Textual Content of Mobile Applications Ardalan Amiri Sani Computer Science Department University of California, Irvine [email protected] Abstract Keywords Many mobile applications deliver and show sensitive and Text protection; Mobile devices; UI safety; TrustZone; Vir- private textual content to users including messages, social tualization network posts, account information, and verification codes. All such textual content must only be displayed to the user 1. INTRODUCTION but must be strongly protected from unauthorized access There are millions of mobile applications today. The tex- in the device. Unfortunately, this is not the case in mobile tual content that many of these applications show to the devices today: malware that can compromise the operating user in their UI can contain extremely sensitive and pri- system, e.g., gain root or kernel privileges, can easily access vate information, such as social security number, bank ac- textual content of other applications. count information, private messages, passwords (in a pass- In this paper, we present SchrodinText, a system solution word vault), and verification codes used for two-factor au- for strongly protecting the confidentiality of application's se- thentication. Such content must only be displayed to the user lected UI textual content from a fully compromised operat- but must be otherwise protected against unauthorized access ing system. SchrodinText leverages a novel security monitor by malware. Indeed, Zhou et al. found 644 malware samples based on two hardware features on modern ARM processors: (in 27 families) that harvest \user's information, including virtualization hardware and TrustZone. Our key contribu- user accounts and short messages stored on the phones" [54]. tion is a set of novel techniques that allow the operating Application developers attempt to defeat this by encrypting system to perform the text rendering without needing access the raw text in their backend server and send the ciphertext to the text itself, hence minimizing the Trusted Computing to the mobile application. However, the application needs Base (TCB). These techniques, collectively called oblivious to decrypt the ciphertext before passing it to the operating rendering, enable the operating system to rasterize and lay system for rendering and showing to the user. Therefore, out all the characters without access to the text; the monitor malware that compromises the operating system (or even only resolves the right character glyphs onto the framebuffer just its graphics stack) can access the content. observed by the user and protects them from the operating This is a growing concern as mobile operating systems are system, e.g., against DMA attacks. We present our proto- increasingly large and unreliable. Malware hiding inside or- type using an ARM Juno development board and Android dinary unprivileged applications can easily compromise the operating system. We show that SchrodinText incurs no- operating system and gain root or kernel privileges [50, 53]. ticeable overhead but that its performance is usable. Such malware can then attempt to extract sensitive content \If one has left this entire system to itself for an hour, from other applications using various methods. For exam- one would say that the cat still lives if meanwhile no atom ple on Android, these methods include but are not limited has decayed. The first atomic decay would have poisoned it. The psi-function of the entire system would express to the following. Malware with root privilege can (i) take a this by having in it the living and dead cat (pardon the screenshot of the device UI, which might contain the afore- expression) mixed or smeared out in equal parts. It is mentioned sensitive text or (ii) replace Android's UI ren- typical of these cases that an indeterminacy originally dering libraries; the compromised library can then leak the restricted to the atomic domain becomes transformed into textual content of victim applications. Or malware with macroscopic indeterminacy, which can then be resolved kernel privileges can (i) access the application's memory in by direct observation." - Erwin Schrödinger1 order to read the plaintext after decryption or (ii) read the 1https://en.wikipedia.org/wiki/Schrodingers cat texture buffers allocated by the graphics stack to hold the character glyphs used in the text. Permission to make digital or hard copies of all or part of this work for personal or Unfortunately, as noted by Checkoway et al. [23], it is ex- classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation tremely difficult to fully protect an application from a com- on the first page. Copyrights for components of this work owned by others than the promised operating system. Yet, we ask ourselves: can we author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or at least protect the confidentiality of selected sensitive textual republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]. content of the application's UI in such an environment? We MobiSys’17, June 19-23, 2017, Niagara Falls, NY, USA care about protecting selected textual content since most, if not all, sensitive information delivered to the user via apps © 2017 Copyright held by the owner/author(s). Publication rights licensed to ACM. ISBN 978-1-4503-4928-4/17/06. $15.00 is in textual format. DOI: http://dx.doi.org/10.1145/3081333.3081346 In this paper, we present a system solution, called Schrod- inText, designed to meet this goal. Our solution leverages To address this challenge, we choose to limit SchrodinText novel hardware features in ARM processors to create a se- to monospaced fonts, in which all characters have a fixed curity monitor for showing and protecting the text. The width. In addition, we introduce a technique for determin- monitor leverages both virtualization and TrustZone hard- ing the line-breaks without access to the text, called oblivi- ware available on modern ARM processor. The monitor is ous line-breaking. These two techniques allow the operating more privileged than the operating system and is used to system to perform the layout in full, only needing access to protect the selected textual content shown on the display. the number of characters in the text, and not the text itself. In SchrodinText, the application's backend server encrypts Second, once character glyphs are rasterized and their lo- the text with a key only available to the monitor, hence pro- cations are determined, they must be composited on top of tecting it from the operating system. However, this raises other layers in the framebuffer and displayed to the user. an important challenge: the text needs to be rendered by Android on modern mobile devices uses the GPU for com- the graphics stack in the operating system. The final pix- positing. Exposing the resolved glyphs to the GPU makes els displayed to the user depend on the font type, size, and them vulnerable to attacks by the operating system since color chosen by the app developer, all of which are available the GPU is programmed by the operating system. We use and understandable by the operating system, and not the two techniques, namely multi-view pages and two-stage com- monitor. The graphics stack includes several libraries and positing, to securely perform the compositing while protect- frameworks as well as kernel device drivers and is one of ing the resolved glyphs. Collectively, these two techniques the largest and most complicated components of the oper- allow us to use the operating system and GPU for composit- ating system. Therefore, how can the monitor display the ing of non-protected textures and use the monitor (but not textual content to the user given that the graphics stack is the GPU) for compositing the protected text glyphs, all the implemented by the operating system? while protecting them from the operating system and other At first glance, it seems like a feasible solution to this DMA-capable devices. problem is to move the graphics stack to the monitor. How- It is important to note that SchrodinText protects the ever, such an approach would significantly bloat the Trusted output text of the applications but not the input text. That Computing Base (TCB) of the monitor. A large TCB would is, it protects the textual content that is shown to the user, make the monitor vulnerable to attacks itself hence defeating such as received messages, bank account information, health the original purpose. An alternative approach might then be records, and verification codes in two-factor authentication. to support a minimal and simplified version of the graphics It, however, does not protect the text input by the user, in- stack in the monitor, in order to keep the TCB small. Un- cluding outgoing messages and typed passwords. Protecting fortunately, with this approach, the content rendered by the the input text also requires protecting the input stack in the monitor would not be visually well-integrated with the rest operating system, which is out of the scope of this work. of the content, which are rendered by the complete graphics We have designed SchrodinText with ease of use for de- stack in the operating system. Moreover, such a solution velopers in mind. More specifically, we provide a simple would not be able to easily benefit from updates to the op- UI widget, called SchrodinTextView, which is a modified erating system graphics stack, which, for example, might version of TextView used in Android to embed text in appli- introduce new fonts and effects from untrusted sources. cation's UI. The developer can simply embed this widget in In SchrodinText, we introduce an alternative solution that the UI, very similar to existing widgets.

Load more