Your Voice is Mine: How to Abuse Speakers to Steal Information and Control Your Phone ∗ †

Wenrui Diao, Xiangyu Liu, Zhe Zhou, and Kehuan Zhang Department of Information Engineering The Chinese University of Hong Kong {dw013, lx012, zz113, khzhang}@ie.cuhk.edu.hk

ABSTRACT

Previous research about sensor based attacks on the Android platform focused mainly on accessing or controlling sensitive components, such as camera, microphone and GPS. These approaches obtain data from sensors directly and need the corresponding sensor invoking permissions.

This paper presents a novel approach (GVS-Attack) to launch permission bypassing attacks from a zero-permission Android application (VoicEmployer) through the phone speaker. The idea of GVS-Attack is to utilize an Android system built-in voice assistant module, Google Voice Search. With the Android Intent mechanism, VoicEmployer can bring Google Voice Search to the foreground and then play prepared audio files (like “call number 1234 5678”) in the background. Google Voice Search can recognize this voice command and perform the corresponding operations. With ingenious design, our GVS-Attack can forge SMS/Email, access privacy information, transmit sensitive data and achieve remote control without any permission. Moreover, we found a vulnerability of status checking in the Google Search app, which can be utilized by GVS-Attack to dial arbitrary numbers even when the phone is securely locked with a password.

A prototype of VoicEmployer has been implemented to demonstrate the feasibility of GVS-Attack. In theory, nearly all Android (4.1+) devices equipped with Google Services Framework are vulnerable to the proposed GVS-Attack. We hope that our study can raise an alarm on the security risk of passive components (like phone speakers). Although they will not generate any sensitive information, they could still be exploited for privacy attacks.

Categories and Subject Descriptors
D.4.6 [Operating Systems]: Security and Protection: Invasive software

General Terms
Security

Keywords
Android Security; Speaker; Voice Assistant; Permission Bypassing; Zero Permission Attack

∗ Responsible disclosure: We have reported the vulnerability of Google Search app and corresponding attack schemes to Google security team on May 16th 2014.
† Demo video can be found on the following website: https://sites.google.com/site/demogvs/

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
SPSM’14, November 7, 2014, Scottsdale, Arizona, USA.
Copyright is held by the owner/author(s). Publication rights licensed to ACM.
ACM 978-1-4503-2955-5/14/11 ...$15.00.
http://dx.doi.org/10.1145/2666620.2666623.

1. INTRODUCTION

In recent years, smartphones have become more and more popular, among which Android OS pushed past 80% market share [32]. One important reason for this popularity is that users can conveniently install applications (apps for short) as they wish. But this convenience has brought serious security problems, like malicious applications. According to Kaspersky’s annual security report [34], the Android platform attracted a whopping 98.05% of known malware in 2013.

Current Android phones are equipped with several kinds of sensors, such as light sensor, accelerometer, microphone, GPS, etc. They enhance the user experience and can be used to develop creative apps. However, these sensors could be utilized as powerful weapons by mobile malware to steal user privacy as well. Taking the microphone as an example, a malicious app can record phone conversations which may contain sensitive business information. With some special design, even credit card and PIN numbers can be extracted from recorded voice [46].

Nearly all previous sensor based attacks [46, 40, 13, 10, 47, 49, 30] only considered using input type of components to perform malicious actions, such as the accelerometer for device position analysis and the microphone for conversation recording. These attacks are based on accessing or controlling sensitive sensors directly, which means the corresponding permissions are required, such as CAMERA for camera, RECORD_AUDIO for microphone, and ACCESS_FINE_LOCATION for GPS. As a matter of fact, output type of permission-free components (e.g. the speaker) can also be utilized to launch attacks, namely an indirect approach.

Consider the following question:

Q: To a zero-permission Android app that only invokes permission-free sensors, what malicious behaviors can it perform?

In general, zero permission means harmlessness and extremely limited functionality. However, our research results show that:

A: Via the phone speaker, a zero-permission app can make phone calls, forge SMS/Email, steal personal schedules, obtain user location, transmit sensitive data, etc.

This paper presents a novel attack approach based on the speaker and an Android built-in voice assistant module, Google Voice Search (so we call it GVS-Attack). GVS-Attack can be launched by a zero-permission Android malware (called VoicEmployer correspondingly). This attack can achieve varied malicious goals by bypassing several sensitive permissions.

GVS-Attack utilizes system built-in voice assistant functions. Voice assistant apps are hands-free apps which can accept and execute voice commands from users. In general, typical voice commands include dialing (like “call Jack”), querying (like “what’s the weather tomorrow”) and so on. Benefiting from the development of speech recognition and other natural language processing techniques [16], voice assistant apps can facilitate users’ daily operations and have become an important selling point of smartphones. Every mainstream smartphone platform has its own built-in voice assistant app, such as Siri [8] for iOS and Cortana [36] for Windows Phone. On Android, this function is provided by Google Voice Search¹ (see Figure 1), which has been merged into the Google Search app as a sub-module starting from Android 4.1. Users can start Google Voice Search by touching the microphone icon of the Google Search app widget, or the shortcut named Voice Search, etc.

The basic idea of GVS-Attack is to utilize the capability of Google Voice Search. Through the Android Intent mechanism, VoicEmployer can bring Google Voice Search to the foreground and then play an audio file in the background. The content of this audio file could be a voice command, such as “call number 1234 5678”, which will be recognized by Google Voice Search, and the corresponding operation would be performed. By exploiting a vulnerability of status checking in the Google Search app, the zero-permission VoicEmployer can dial arbitrary malicious numbers even when the phone is securely locked. With ingenious design, GVS-Attack can forge SMS/Email, access privacy information, transmit sensitive data and achieve remote control without any permission. What is more, context-aware information can be collected and analyzed to assist GVS-Attack, which makes this attack more practical.

It is difficult to identify the malicious behaviors of VoicEmployer through current mainstream Android malware analysis techniques. From the aspect of permission checking [9, 22], VoicEmployer doesn’t need any permission. From the aspect of app dynamic behavior analysis [21, 53], VoicEmployer doesn’t perform malicious actions directly. The actual executor is Google Voice Search, a "trusted" system built-in app module. In addition, the voice commands are output from the speaker and captured by the microphone as input. Such an inter-application communication channel is beyond the control of Android OS. We tested several famous anti-virus apps (AVG, McAfee, etc.) to monitor the process of GVS-Attack. None of them can detect GVS-Attack and report VoicEmployer as malware.

Contributions. We summarize this paper’s contributions here:

• New Attack Method and Surface. To the best of our knowledge, GVS-Attack is the first attack method utilizing the speaker and voice assistant apps on mobile platforms. Besides, this attack can be launched by a totally zero-permission Android malware that can perform many malicious actions through sensitive permission bypassing, such as malicious number dialing, SMS/Email forging and privacy stealing.

• New Vulnerability. We found a vulnerability of status checking in the Google Search app (over 500 million installations), which can be utilized by GVS-Attack to dial arbitrary malicious numbers even when the phone is securely locked with a password.

• Prototype Implementation and Evaluation. We implemented a VoicEmployer prototype and carried out related experiments to demonstrate the feasibility of GVS-Attack. In addition, we designed and tested some attack assisting schemes, such as context-aware information analysis, sound volume setting, etc.

Roadmap. The remaining sections are organized as follows: Section 2 introduces Google Voice Search related contents, including backgrounds, vulnerability analysis and the adversary model. The details of GVS-Attack are described in Section 3, which contains three different levels of attacks. In Section 4, a VoicEmployer prototype is implemented and related experiments are carried out. Section 5 and Section 6 discuss corresponding defense strategies and some related in-depth topics respectively. Previous research about sensor based attacks and Android app security analysis is reviewed in Section 7. Section 8 concludes this paper.

Figure 1: Google Voice Search on Android. (a) Voice Dialer Mode; (b) Velvet Mode.

¹ Some people confuse Google Now with Google Voice Search. In fact, the Google Voice Search module can work independently when Google Now is off. See http://www.google.com/landing/now/

2. GOOGLE VOICE SEARCH ON ANDROID

2.1 Backgrounds

Google Services Framework. Google Services Framework / Google Mobile Services are pre-installed on nearly all brands of Android devices. It can be treated as a suite of pre-installed apps developed by Google, including Google Play, Gmail, Google Search, etc. [26]. These killer apps (the Google Search app has over 500 million installations just on Google Play) are so popular that even customized Android firmwares would keep them. Taking CyanogenMod² as an example, due to licensing restrictions, Google Services Framework cannot come pre-installed with CyanogenMod, but these apps can still be installed via a separate Google Apps recovery package [17].

² http://www.cyanogenmod.org/

Android Intent Mechanism. In Android OS, the Intent mechanism allows an app to start an activity / service in another app by describing a simple action, like "view map" and "take a picture" [1]. An Intent is a messaging object which declares a recipient (and contains data). VoicEmployer utilizes this mechanism to invoke the Google Voice Search module of the Google Search app. From the view of Android OS, it only executes a normal operation to invoke a system built-in app module.

An intent filter is an expression in an app’s manifest file that specifies the type of intents that the component would like to receive [1]. For example, the manifest file of the Google Search app defines that hands-free function (voice commands) related components can handle two kinds of actions, namely ACTION_VOICE_COMMAND [2] and ACTION_VOICE_SEARCH_HANDS_FREE [4].

2.2 Google Search App Vulnerability Analysis

Google Voice Search is a voice assistant module of the Google Search app. It is designed for hands-free operations and can accept several kinds of voice commands, such as “call Jack” and “what’s the weather tomorrow”. Google Voice Search runs in two modes: Voice Dialer (Figure 1(a)) and Velvet (Figure 1(b)). Voice Dialer mode only accepts voice dialing commands, and Velvet mode is the fully functional mode. Google Voice Search can be invoked through the Android Intent mechanism. First, a third-party app constructs an Intent based on ACTION_VOICE_COMMAND or ACTION_VOICE_SEARCH_HANDS_FREE, and then passes it to startActivity(). Android OS resolves this Intent and finds that the Google Search app can handle it. Then this Intent is passed to the Google Search app. According to the current phone status³, different modes will be started by the Google Search app:

1. If the phone is unlocked and the screen is on, Velvet mode will be started.

2. If the phone is insecurely locked, namely sliding to unlock the screen without authentication, Voice Dialer mode will be started.

3. If the phone is securely locked, such as using a password or pattern password, a voice warning will be played: "please unlock the device".

³ The code implementations of the corresponding status checking are: KeyguardManager.isKeyguardLocked(), KeyguardManager.isKeyguardSecure() and PowerManager.isScreenOn().

Vulnerability in Google Search App. In fact, Voice Dialer mode can be started even if the phone is securely locked. But this function is only designed for the Bluetooth headset hands-free mode. When a user presses the button on his connected headset for a certain time, this event can trigger passing an ACTION_VOICE_COMMAND based Intent to the OS if the phone is not in a call. Then Voice Dialer mode will be activated. This usage scenario is reasonable, because the user must first confirm the connection (pairing) request of the Bluetooth headset on his phone, which is a process of authorization, and only after that is this Bluetooth headset treated as a trusted device. In the Android source code, the file HeadsetStateMachine.java defines and handles related operations [6].

However, we found a vulnerability: the Google Search app doesn’t strictly check whether the phone is connected with a Bluetooth headset. As a result, a third-party app can pass an ACTION_VOICE_COMMAND based Intent to the OS and activate Voice Dialer mode of Google Voice Search, even when the phone is securely locked. Through decompiling the apk file of the Google Search app (version 3.3.11.1069658.arm⁴, released on March 14th 2014), we analyzed the related code and located this vulnerability. Figure 2 shows the simplified logical flow of the Google Search app handling voice assistant type of Intents. The area bounded by the dotted line is the location of this vulnerability.

⁴ Code obfuscation techniques have been applied to the following versions, and this vulnerability still exists.

[Figure 2 flowchart; recoverable node labels: "The system resolves the received Intent"; "Google Search receives the Intent"; "VOICE_COMMAND or VOICE_SEARCH_HANDS_FREE?"; "Device securely locked?"; "Screen off or device locked?"; "Warning: Device securely locked..."; "Google Voice Search running in Voice Dialer Mode"; "Google Voice Search running in Velvet Mode".]

Figure 2: The Vulnerability of status checking in Google Search app

2.3 Adversary Model

GVS-Attack needs to be launched by an Android malware (called VoicEmployer) which has been installed on the user’s phone. This malware could disguise itself as a normal app that contains attack modules. Since VoicEmployer doesn’t need any permission, disguising should be quite easy. After VoicEmployer is opened by the user (namely the victim) once, subsequent attack processes need no interaction with the victim. The attack scene is when the victim is not using his phone.

Assumption. The victim’s Android (4.1+) phone should contain the complete Google Services Framework.

The reason is that VoicEmployer needs to invoke Google Voice Search (and the Google speech recognition / synthesis service) to execute attacks. This assumption is quite weak, because most smartphone vendors pre-install Google Services Framework on their products. In order to make GVS-Attack work well, the victim’s phone should use Android 4.1 or higher versions. The Google Search app running on Android 4.0 or lower versions cannot be updated to the newest version and lacks many new features of voice commands. In August 2014, Android 4.1 or upper versions had more than 75% device share distribution [3].

Secure Lock. The victim’s Android phone could be securely locked or not. GVS-Attack contains three different levels: Basic Attack, Extended Attack and Remote Voice Control Attack. The secure lock status corresponds to Basic Attack, and the insecure lock status corresponds to all levels of attacks.

Basic Attack achieves arbitrary number dialing. In order to launch Extended Attack and Remote Voice Control Attack, there is an additional requirement: the victim doesn’t enable secure screen lock functions, such as a password or pattern password. Google Voice Search cannot run in Velvet mode without unlocking the screen. In fact, for convenience, many users (more than 30%) would give up secure screen lockers [41].

Zero Permission. GVS-Attack can be launched by VoicEmployer without any permission. Android OS uses a permission mechanism to enforce restrictions on specific operations. The permission abusing problem was noticed long ago. Malicious apps utilize some sensitive permissions (such as CAMERA,
RECORD_AUDIO and READ_CONTACTS) to collect user pri- Remote Data Transmission and Voice Control. Through call vacy and achieve other illegal targets. Since this problem is so channel, sensitive data can be transmitted without the INTERNET widespread, some users will pay attention to unknown or new apps permission. Also the attacker can control the victim’s Android requiring sensitive permissions [23]. phone remotely. Instead of requesting these permissions explicitly, our attacks GVS-Attack provides a new inter-application communication only use the speaker to perform sensitive actions protected by channel for mobile malware attacks. In the process of GVS- permissions. According to the current Android permission mecha- Attack, the input of microphone comes from the output of speaker nism, playing audio doesn’t require any permission. (VoicEmployer). This information transmission process is beyond Context-aware Information Collection and Analysis. Since the control of Android OS, which can be treated as an uncontrolled voice commands played may be heard and interrupted by the physical communication channel or a kind of covert channel. victim, VoicEmployer needs to analyze the current environment in Figure3 shows this communication channel. This new attack order to decide whether to launch attacks or not, that is to collect channel is totally different from previous research on security of information through sensors and legal Android API invoking. application communications [33, 15]. On mobile platforms, the feasibility of context-aware information analysis has been demonstrated in several previous research [25, 35, 46]. As a matter of fact, this target could be achieved without any permission. For example, light sensor and accelerometer can be used to analyze the surrounding environment. And the internal status of the phone can be checked by analyzing CPU workload, memory workload, etc. Typical Attack Scene. 
Due to job requirements or merely VoicEmployer personal habits, many people don’t turn off their phones when Sound Transmission sleeping [48]. In the early morning (before dawn, such as 3 AM), through the Air the victim is likely to be in deep sleep and the environment is quiet. The voice commands could be played in a very low volume and still be recognized by Google Voice Search. According to related Google Voice medical research, nocturnal awakenings usually occur with noise Search levels greater than 55 dB5 [39]. It means voice commands in low volume will not awaken the victim. This situation provides an ideal scenario for GVS-Attack launching.

3. ATTACKS Figure 3: Inter-Application Communication Channel of GVS- The basic idea behind GVS-Attack is to utilize the capability Attack of Google Voice Search and bypass Android permission checking mechanism. There are two highlights in our attacks: In the following subsections, three types of attacks are described Attack in Any Situation. By exploiting the vulnerability of status in details. Basic Attack can dial arbitrary numbers. Extended At- checking in Google Search app, zero-permission VoicEmployer tack can bypass several sensitive permissions. Combined previous can dial arbitrary malicious numbers even when the phone is two attacks, the attacker can interact with the phone, which leads securely locked. to Remote Voice Control Attack. 5Actually the noise-induced nocturnal awakening is affected by 3.1 Basic Attack: Malicious Number Dialing many factors, such as sleep stages, age, gender, smoking, etc. Also sensitivity to noise may vary greatly from one individual to another. Basic Attack achieves arbitrary malicious number dialing. That These topics are beyond the scope of this paper. More details see is to say the CALL_PHONE permission is bypassed. The vulnera- the guideline document [31] of World Health Organization (WHO). bility of Google Search app makes this attack still valid when the phone is securely locked. Without unlocking the screen keyguard, • “Email to [contacts], subject “meeting cancel”, message Google Voice Search would run in Voice Dialer mode which only “tomorrow’s meeting has been canceled”.” This command accepts dialing commands, as Figure 1(a) shows. Before launching results in sending an Email to the contacts with the above attacks, we answer two questions first: how to invoke Google Voice subject and message. It can be used to forge Emails with any Search and how to prepare voice command files. contents. Google Voice Search Invoking. 
When both the surrounding en- vironment and phone internal status reach the predefined threshold, • “Send SMS to number 1234 5678 “confirm subscribe to VoicEmployer will invoke Google Voice Search through Android weather forecast service”.” This command results in sending Intent mechanism. In fact, VoicEmployer only needs to construct an SMS to number 1234 5678. It can be used to forge SMS an Intent based on ACTION_VOICE_COMMAND, and passes it to and subscribe to premium-rate services. startActivity(). • “What is my next meeting?” ⇒“Your next calendar entry is Voice Command Files Preparation. After Google VoiceSearch tomorrow 10 AM. The tile is “Meet with boss”.” The victim’s starting, VoicEmployer plays an audio file (voice commands) as a calendar schedule is leaked through voice feedback. background service. These files could be packaged in VoicEm- ployer as a part of apk resource, but this method would increase • “What is my IP address?” ⇒ “Your public IP address is the size of installation package. A more covert solution is to utilize 111.222.111.222.” The victim’s IP address is leaked through Google Text-to-Speech (TTS) [5] service which is a system built-in voice feedback. service used for translating text contents to speech. TTS service is based on Internet, but it doesn’t require that the app invoking TTS • “Where is my location?” ⇒ “Here is a map of Brooklyn declares the INTERNET permission. Therefore VoicEmployer can District.” The victim’s location (district level) is leaked execute speech synthesis and save results as an audio file in its own through voice feedback. data folder (internal storage) dynamically. It is equivalent to that VoicEmployer utilizes the capability of Attack Launching. In previous preparations, VoicEmployer Google Voice Search and bypasses Android permission checking synthesizes two audio records like “call 1234 5678” and “OK”. mechanism. From Android OS’s view, VoicEmployer just passes 1234 5678 could be a malicious number. 
an Intent and plays some audio files. Table1 shows the permissions When this attack is launched, Voice Dialer mode of Google which can be bypassed by GVS-Attack. Some voice commands are Voice Search is activated and VoicEmployer plays the audio record not listed, because they need interaction (touching the screen) with –“call 1234 5678”. Then Voice Dialer recognizes this command the user, such as “create a calendar event ...” and “post to Google and returns a voice feedback “do you want to call 1234 5678, say plus ...”. OK or cancel”. VoicEmployer plays another audio record “OK”. After receiving this confirmation, Voice Dialer makes a call to 1234 5678. Since 1234 5678 is a malicious number, the victim will be Table 1: Permissions Bypassed by GVS-Attack charged with premium-rate service fee. What is more, his phone number is leaked as well. Voice Command Bypassed Permission(s) Call ... READ_CONTACTS, CALL_PHONE 3.2 Extended Attack: Sensitive Permission Listen to voicemail WRITE_SETTINGS, Bypassing CALL_PHONE Extended Attack needs to invoke Velvet mode of Google Voice Browse to Google dot com INTERNET Search, as Figure 1(b) shows. Therefore, the precondition is that Email to ... READ_CONTACTS, the victim doesn’t enable secure screen lock functions (password, GET_ACCOUNTS, INTERNET pattern password, etc.). Nonetheless, VoicEmployer must wake the Send SMS to ... READ_CONTACTS, phone and turn on the screen itself. WRITE_SMS, SEND_SMS In general, an app requires the WAKE_LOCK permission to wake Set alarm for ... SET_ALARM LayoutParams the phone from sleep status, and then sets flag . Note to myself ... GET_ACCOUNTS, FLAG_DISMISS_KEYGUARD to unlock the screen keyguard. RECORD_AUDIO, INTERNET Because WAKE_LOCK is not a sensitive permission and doesn’t What is my next meeting? READ_CALENDAR correlate with user data, so this requirement is not exorbitant. Show me pictures of ... INTERNET Actually there is a tricky implementation to bypass this permission. 
What is my IP address? ACCESS_WIFI_STATE, That is, VoicEmployer invokes Google Voice Search once and does INTERNET nothing until it exits for timeout. At this moment, the phone has ACCESS_COARSE_LOCATION been in waked status (the screen is on) and the keyguard has be Where is my location? , INTERNET dismissed. As a result, VoicEmployer utilizes the WAKE_LOCK permission of Google Search app to wake the phone indirectly. How far from here to ...? ACCESS_FINE_LOCATION, INTERNET User Sensitive Data Collection. Running in Velvet mode, Google VoiceSearch can accept more types of voice commands [27], not just dialing commands. Different voice commands cause Remote Data Transmission. GVS-Attack can dial a malicious different information leakages. Action commands lead to that number through playing “call ...”. When the call is answered by specific operations are performed, such as sending SMS. Other an auto recorder, the data transmission channel has been built. Any querying commands will trigger voice feedbacks, such as “what audio data can be transferred through this channel instead of using is the time?” will get “the time is 9:39 pm”. Typical information Internet connection. leakages include (sentences in italic are voice commands or voice Google Voice Search running during phone call period. In feedbacks): Android OS, the audio recording method is synchronized, which means multiple apps cannot access the microphone at the same time through network or short messages. These controls were based on on API level. However, as an essential system module, the phone corresponding communication permissions (such as INTERNET call function of Android is based on hardware-level implemen- and SEND_SMS) or even root privilege. But our remote control tations instead of invoking MediaRecorder / AudioRecord method utilizes the capability of Google Voice Search to build the class. 
Android OS simply needs to send control signals, and the related hardware (GSM module, microphone, speaker, etc.) completes the audio data processing functions directly [7]. Therefore, during a phone call, another app can still access the microphone through the Android API. This means Google Voice Search can still be brought to the foreground to accept voice commands⁶.

⁶ Google Voice Search is an Internet based service.

As a result, for querying commands, the attacker can obtain the feedback voice through this call channel and, using speech recognition techniques, recover the corresponding text. To be specific, VoicEmployer invokes Google Voice Search and then plays the voice command “call number 1234 5678”. After the call is connected, VoicEmployer invokes Google Voice Search again and plays querying commands, like “where is my location”. The feedback voice, “here is a map of Sha Tin district”, is transmitted to the attacker through the call channel.

If VoicEmployer could get more permissions, more sensitive information would be leaked. A natural idea is to play audio records stored on external storage directly. These audio records were recorded by the victim and probably contain sensitive information, like a business negotiation. This operation requires the READ_EXTERNAL_STORAGE or WRITE_EXTERNAL_STORAGE permission; both allow an app to read all data on external storage. Text files can be converted to audio data through the TTS service. For example, with the READ_CONTACTS permission, VoicEmployer can get the victim’s phone contacts and then use the TTS service to convert the text into voice, so the attacker obtains a complete copy of the victim’s contacts through the call channel. Similar risks apply to READ_CALL_LOG, READ_SMS, etc.

The voice channel is not limited to transferring text and voice data. In theory, any type of file can be translated to a hex coding format, so any file could be transmitted in the form of audio coding (or even by reading the hex codes aloud directly). When the attacker has received all hex codes of a photo through the call channel, he can restore it easily. If this photo contains sensitive information, such as a selfie, it will be quite harmful to the victim. In practical attacks, compression encoding and an error checking mechanism should be considered.

3.3 Remote Voice Control Attack: Real World Case Study

Remote Voice Control Attack combines Basic Attack and Extended Attack. This attack also needs to invoke the Velvet mode of Google Voice Search.

In Remote Voice Control Attack, after VoicEmployer triggers the dialing of a malicious number, the attacker answers this call directly. During the call, VoicEmployer invokes Google Voice Search periodically. When the attacker speaks voice commands, these commands are played by the speaker (headset) of the victim’s phone. Google Voice Search then recognizes the commands and performs the corresponding operations. This means that the attacker can interact with the victim’s phone through Google Voice Search, namely remote voice control.

Previous research [55] showed more than 90% of Android malware would turn the compromised phones into a botnet controlled [...] communication channel and to access sensitive data. These targets are completed without any permission.

With special design, Remote Voice Control Attack could become quite powerful. One example of voice interaction is the command “How far from here to Lincoln Memorial by car?”. Google Voice Search provides voice feedback like “The drive from your location to Lincoln Memorial is 17.6 kilometers”. The attacker can then use the successive approximation method, asking similar questions about different locations until he gets feedback like “The drive from your location to White House is 120 meters”. At this moment, the attacker has the accurate location of the victim, which is quite dangerous.

Another example is that the attacker can leave a note to the victim using the command “Note to self: You have been hacked”. Google Voice Search would make a note (send an Email to the victim) with the content “You have been hacked”. In the morning, after the victim gets up, he will find such a strange note on his phone. Based on this idea, some complex social engineering attacks [29] could be designed and launched.

4. IMPLEMENTATION AND ANALYSIS

4.1 Overall Structure and Attack Experiments

We implemented a VoicEmployer prototype to demonstrate our attack schemes. Our implementation contains 5 modules: MainActivity, AlarmReceivor, EnvironmentService, WakedActivity and VoiceCommandService.

• MainActivity shows the normal starting UI. Its main functions are registering an alarm and preparing audio files of voice commands.
• AlarmReceivor is used to receive alarm events and start EnvironmentService.
• EnvironmentService is used for context-aware information collection and analysis. The range of analysis includes the light sensor, accelerometer, /proc/, system workloads, etc. If the analysis results reach a predefined threshold, WakedActivity will be started. Related implementation details are described in Section 4.2.
• WakedActivity is used to unlock an insecure screen keyguard by setting LayoutParams.FLAG_DISMISS_KEYGUARD. This function can only be implemented in Activity components. Without dismissing the screen keyguard, Velvet mode cannot be started. Note: This module is not necessary for Basic Attack.
• VoiceCommandService is used to invoke Google Voice Search and play audio files. Playing the voice command is delayed a short while to wait for Google Voice Search to get ready. In Remote Voice Control Attack, Google Voice Search is invoked periodically.

Taking Remote Voice Control Attack as an example, Figure 4 shows the workflow of the VoicEmployer implementation. It utilizes the alarm mechanism to trigger attack related modules to avoid
continuous background services.

⁶ (continued) With a 3G/4G or Wi-Fi data connection, the phone can make a phone call and access the Internet at the same time. With a 2G data connection (GPRS and EDGE), the two functions conflict.

Attack Results. The experimental versions of the Google Search app were 3.3.11.1069658.arm and 3.4.16.1149292.arm, released on March 14th 2014 and May 6th 2014 respectively. We tested the VoicEmployer prototype on Samsung Galaxy S3 GT-I9300, Meizu MX2 and Motorola A953. GVS-Attack was launched on all of them successfully. On the Samsung Galaxy S3 with official firmware, the S Voice app [44] (a built-in voice assistant developed by Samsung) is started instead of Google Voice Search, but the attack processes are similar. This change arises because Samsung sets a higher priority in the manifest file of the S Voice app for responding to ACTION_VOICE_COMMAND and ACTION_VOICE_SEARCH_HANDS_FREE. Table 2 summarizes the experiment results.

[Figure 4: Workflow of VoicEmployer Implementation. Stages shown: (1) Attack Preparation: MainActivity runs once to prepare the voice command file and registers an alarm at the next t AM. (2) Context-aware Information Analysis: the alarm triggers at t AM, AlarmReceivor starts EnvironmentService, and EnvironmentService runs the context-aware analysis (branches: Not Satisfy, Satisfy). (3) Google Voice Search Invoking: WakedActivity dismisses the keyguard (optional), VoiceCommandService invokes Google Voice Search, then delays to wait for GVS to be ready. (4) Playing Voice Commands and Follow-up Attacks: VoiceCommandService plays the voice command “Call number XXXX”; after the call is answered, VoiceCommandService invokes Google Voice Search on a periodic trigger.]

Table 2: GVS-Attack Experiments
Phone Model          Android Version         Attack Result
Samsung Galaxy S3    CyanogenMod 4.4.2       success
Samsung Galaxy S3    Samsung Official 4.3    success
Meizu MX2            Meizu Official 4.2.1    success
Motorola A953        CyanogenMod 4.1.1       success

Speech Recognition Accuracy. We consider Command Success Rate (CSR) instead of Word Error Rate (WER) [45], because the execution results are all we care about. Results showed CSR was nearly 100%. The reasons are as follows: 1. the attack happens in a quiet environment without interfering noise; 2. the voice command sentences are simple and do not lead to ambiguity; 3. VoicEmployer utilizes Google TTS to synthesize the command speech, which removes accent and pronunciation problems.

Detection by Anti-Virus Apps. We tested the following anti-virus apps for the Android platform: AVG AntiVirus⁷, McAfee Antivirus & Security⁸, Avira Antivirus Security⁹, ESET Mobile Security & Antivirus¹⁰ and Norton Mobile Security¹¹. After executing threat scanning, none of them reported the VoicEmployer prototype as malware. Also, during the attack processes, none of their real-time monitoring modules detected GVS-Attack’s malicious behaviors.

⁷ https://www.avgmobilation.com/
⁸ https://www.mcafeemobilesecurity.com/
⁹ http://www.avira.com/en/free-antivirus-android
¹⁰ http://www.eset.com/us/home/products/mobile-security-android/
¹¹ https://mobilesecurity.norton.com/

4.2 Context-aware Information Collection and Analysis Experiments

The occasion for GVS-Attack is when the victim is not using his phone. Especially in the early morning (before dawn, such as 3 AM), the victim is likely to be in deep sleep. To avoid the attack being found and interrupted by the victim, VoicEmployer needs to analyze the current environment (including the surroundings and phone status) before deciding whether to launch attacks. This can be achieved through sensors and legal Android API invocations without any permission. If the analysis result shows the victim is using his phone, VoicEmployer stays inactive. Otherwise, GVS-Attack can be launched. The range of analysis and the criteria include:

• Light sensor. Get the current brightness: if the brightness is very low (such as less than 5% of max), the phone is probably in a pocket or bag, or the victim has turned off the light at night.
• Accelerometer. Get the current position information of the phone: if the result shows the phone is static (namely stable readings), the victim is probably not holding or carrying the phone.
• Android SDK class - PowerManager. Check whether the screen is on or off: if the screen is off, the victim is probably not using the phone.
• Android SDK class - SimpleDateFormat. Obtain the current system time: for example, 3 AM means the victim is likely to be in deep sleep.
• Read /proc/stat and top command. Analyze the current CPU workload: if the workload is low (such as less than 50%), the victim is probably not using his phone.
• Read /proc/meminfo and top command. Analyze the current RAM usage: if the usage is low (such as less than 50%), the victim is probably not using his phone.

Based on the above analyses, we designed and carried out context-aware information analysis experiments using the EnvironmentService module in 3 typical scenes:

1. The user is using his phone to play games (take Angry Birds Rio¹² as an example) with the light on.
2. The user is walking on the street, while the phone (screen-off) is in his trouser pocket.
3. [Target Scene] The phone (screen-off) is put on a horizontal table with the light off.

¹² http://www.rovio.com/en/our-work/games

The experiment was based on the Meizu MX2, which has current mainstream hardware configurations (ARM Cortex-A9 quad-core 1.6 GHz, 2 GB LPDDR2 RAM). Before every scene’s test, the phone was restarted to cut out distractions from irrelevant factors. Table 3 shows the test results collected over 30 seconds. Some values of Scene 3 (target scene) are significantly different from those of Scene 1 and Scene 2. This demonstrates that context-aware information analysis can be used to detect the current environment and assist GVS-Attack practically.

Table 3: Context-aware Information Analysis
Method / Tool                 Scene 1    Scene 2    Scene 3
Light Sensor    Max           304        1          0
                Min           126        0          0
                Avg           227.95     0.5        0
Accelerometer   X-axis Max    7.78       18.19      0.15
                X-axis Min    5.62       -0.06      0.01
                X-axis Avg    6.54       8.61       0.10
                Y-axis Max    -0.50      14.56      0.11
                Y-axis Min    -1.97      -2.15      -0.10
                Y-axis Avg    -1.10      4.78       0.01
                Z-axis Max    9.46       6.70       10.34
                Z-axis Min    5.88       -12.00     10.04
                Z-axis Avg    7.60       -0.18      10.22
CPU Workload    Max           59%        43%        40%
                Min           32%        22%        18%
                Avg           43%        29%        28%
Memory Usage    Max           66.2%      53.4%      52.3%
                Min           66.0%      53.2%      52.0%
                Avg           66.0%      53.3%      52.0%
Screen Status                 on         off        off

4.3 Sound Volume and Sound Pressure Level Experiments

One issue of concern is what the volume (STREAM_MUSIC) should be set to for playing voice commands. The volume level should be loud enough to be recognized by Google Voice Search, yet low enough that it may go unnoticed by the victim. We call it the minimal available sound volume (MASV for short). In this section, we only consider the volume setting of the speaker (MODE_NORMAL) in Extended Attack, not the headset (MODE_IN_CALL). In Remote Voice Control Attack, the attacker can adjust his own speaking volume directly.

The hardware design and components differ considerably between models of Android phones. Mixed with background white noise, it is impossible to give a fixed sound volume setting. One solution is to pretest using the method of successive approximation, which needs the help of Google Speech-to-Text (STT). STT is a system built-in speech recognition service, which uses the same engine as Google Voice Search and can be invoked through ACTION_RECOGNIZE_SPEECH [4]. The key point is that an app using the STT service can get the text content of voice input directly. To be specific, VoicEmployer sets the sound volume to 1 (the minimum) at first, then invokes the STT service and plays a prepared test audio file (the content could be "This is a test message"). The text result from STT is returned to VoicEmployer and compared with the correct text. If the two texts are the same, VoicEmployer sets the system volume to 1 and launches GVS-Attack. If the two texts are different, VoicEmployer adjusts the volume to 2 and carries out the same test again. This process is repeated until the MASV is found.

We carried out MASV test experiments in a quiet meeting room of about 8 m² (see Table 4 for the results). The environment background sound pressure level (SPL) [51] was about 48 dB. Another relationship of interest is SPL vs. distance. In the same experiment environment, we recorded the transient peak SPL at different distances (0.5 m, 1 m and 2 m) from the phone using the above MASV value. Test phones were put on a flat table, face up and uncovered. Table 4 shows the corresponding experiment results. Once the distance reaches 1 m or more, the SPL is quite low: no more than 8 dB higher than the background SPL.

Table 4: Quiet Meeting Room, SPL vs. Distance, unit: dB
Phone Model                        0.5 m    1 m    2 m
Samsung Galaxy S3 (volume 6/15)    57       56     54
Meizu MX2 (volume 5/15)            58       54     53
Motorola A953 (volume 6/15)        58       56     54

5. DEFENSE

There exist some possible (but not perfect) schemes to defend against GVS-Attack. The vulnerability of status checking in the Google Search app should be fixed first. When the Google Search app receives an ACTION_VOICE_COMMAND based Intent, if the phone is securely locked, it must strictly check whether a Bluetooth headset is connected to the phone. If the connection status is true, Voice Dialer can be started. Otherwise, a warning should be given, like “please unlock the device first”. Therefore, under a secure screen lock, GVS-Attack cannot be launched and the phone is safe.

From the aspect of app development, Google Voice Search must check the status of the speaker in real time, not just at the moment of initialization. If another app is accessing the speaker, Google Voice Search should suspend that app immediately. But this solution may affect other apps’ user experience. For example, an IM app may not play its notification sound when a new message arrives.

From the aspect of the Android Intent mechanism, system built-in apps / services should also add customized permissions in the manifest file. If a third-party app wants to invoke a system built-in app / service, it must declare these requirements before installation. As a result, the user needs to confirm the authorization or cancel the installation. Nonetheless, this defense method may result in abuse problems similar to those of system permissions.

From the aspect of user identification, speaker recognition (also called voiceprint recognition) techniques [11] should be deployed to verify the identity of the speaker. If voiceprint authentication fails, Google Voice Search will not accept the subsequent voice commands. Touchless Control [38] (a variant of Google Voice Search, only for Motorola phones) has added this function to authenticate the current user through speaking "OK, Google" [37]. Recently Google has announced that Android L (5.0) will use voice fingerprinting for context-based security [43]. A potential problem is how to defend against voice replay attacks.

6. DISCUSSION

Soundless Attack. One potential limitation of GVS-Attack is that the victim may notice the voice command played by VoicEmployer and interrupt it. So one direct idea is whether a soundless GVS-Attack could be launched. That is, VoicEmployer feeds an audio file to the microphone directly without playing it, like creating a loopback. But in Android OS, the audio recording method is synchronized, which means the microphone cannot be accessed by multiple processes at the same time. So when Google Voice Search is accessing the microphone to accept voice commands, VoicEmployer cannot access the microphone at the same time, at least at the API level. Using audio files as the input of the microphone needs the support of the kernel / drivers. On platforms such as Windows, there exist such drivers (such as VB-Audio Virtual Cable [50]) to simulate a virtual microphone device. But on the Android platform, an app cannot modify the kernel or install drivers directly. One feasible solution is to prepare a customized Android version with modified audio drivers. Recent research [54] has pointed out the security risks in Android device driver customizations. Android inherits the driver management methods of Linux, and devices are placed under /dev (or /sys) as files. Zhou et al. found that certain important devices become unprotected (permission setting) during a customization, so an unauthorized app can get access to sensitive devices, namely user data. Based on this method, a similar vulnerability may occur on audio drivers (/dev/snd), but we have not found such a case of unprotected writing privileges on our test devices.

Another perspective on soundless attacks is high-frequency sound (such as higher than 20 kHz), which could be played by the phone and is difficult for humans to hear [42]. Unfortunately (fortunately for security), Google Voice Search only accepts the reasonable human sound frequency range and filters out other ranges, so the idea of high-frequency sound is not practicable.

Since Google Voice Search is an Internet based service, we also tried to analyze the feasibility of connection hijacking and data package tampering. After tests, the difficulties are that the connection is TLS protected and the transmitted voice data are compressed by an unclear compression encoding algorithm.

Quiet vs. Noisy. The scene for launching GVS-Attack is a quiet environment, in which the volume of voice commands can be very low and still be recognized by Google Voice Search. We also tested the performance of attacks in noisy environments, such as on the subway and in the canteen. The expectation there is reversed: the volume of voice commands could be very loud and still be hidden in the background noise. But test results showed the background noise (especially human voice) affected the accuracy of speech recognition to some degree. In addition, context-aware analysis becomes complex and may need the RECORD_AUDIO permission.

Scope of Attack. Since Google Services Framework is pre-installed on nearly all brands of Android devices, most Android devices can be affected by GVS-Attack, especially those equipped with Android 4.1 or higher versions. Similar attacks could be launched against voice assistant apps using the Google Speech-to-Text (STT) service. Even those using independent speech recognition engines should also be reviewed carefully, such as the Samsung S Voice app. Furthermore, similar attacks may occur on iOS, Windows Phone platforms and other smart devices supporting voice control.

Limitation. We didn’t carry out a user study about the minimal available sound volume (Section 4.3), because noise-induced nocturnal awakening is affected by many other factors, such as sleep stages, age, gender, smoking, etc. [31]. Context-aware information analysis (Section 4.2) may be affected by hardware configurations. For example, high memory usage may derive from small capacity RAM. Light sensors and accelerometers from different vendors may have different precision. So it is difficult to create a general detector to handle all cases.

7. RELATED WORK

Sensor based Attacks. On mobile platforms, sensor based attacks have been designed and analyzed in several previous papers [46, 40, 13, 10, 47, 49, 30]. One typical example is Soundcomber [46]. Schlegel et al. designed a Trojan with few and innocuous permissions that can extract targeted private information (such as credit card and PIN numbers) from the audio sensor of the phone. Besides, it proved that smartphone based malware can easily achieve targeted, context-aware information discovery from sound recordings. In another research project, through completely opportunistic use of the phone’s camera and other sensors, PlaceRaider [49] can construct three dimensional models of indoor environments and steal virtual objects.

The work of [40] showed that accelerometer readings are a powerful side channel which can be used to extract entire sequences of entered text on a smartphone keyboard. The side-channel attack in [47] utilizes the video camera and microphone to infer PINs entered on a number-only soft keyboard on a smartphone. The microphone is used to detect touch events, while the camera is used to estimate the smartphone’s orientation and correlate it to the position of the digit tapped by the user. The work of [30] studied environmental sensor-based covert channels in mobile malware. Out-of-band command and control channels could be based on acoustic, light, magnetic and vibrational signaling. This research resembles our GVS-Attack in the aspect of voice command transmission. The differences are that, in GVS-Attack, the command receiver (Google Voice Search) is not a part of the malware (VoicEmployer) and the RECORD_AUDIO permission is not necessary. So our attack scheme is more insidious.

Beyond that, sensors could be used for fingerprinting devices. The work of [20] showed that smartphone accelerometers possess unique fingerprints, which can be utilized for tracking users. Similar fingerprinting methods were designed with the microphone and speaker in [18, 57]. By playing back and recording audio samples, they could uniquely identify an individual device based on sound feature analysis.

Inter-Application Communication. In [15], Chin et al. focused on Intent-based attack surfaces. They showed that unauthorized Intent receipt can leak user information: data can be stolen by eavesdroppers and permissions can be accidentally transferred between apps. Another attack type is Intent spoofing, in which a malicious app sends an Intent to an exported component. If the victim app takes some action upon receipt of such an Intent, the attack can trigger that action. In their following work [33], Kantola et al. proposed modifications to the Android platform to detect and protect inter-application messages that should have been intra-application messages.
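The Intent spoofing scenario, and the customized-permission defense proposed in Section 5, can be illustrated with a toy model. This is a minimal sketch in plain Python, not Android code: `Component`, `deliver` and the permission string `example.permission.USE_VOICE_ASSISTANT` are hypothetical names introduced here, and real Intent resolution involves far more than this check.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Component:
    """Toy stand-in for a component declared in an app manifest (hypothetical)."""
    name: str
    action: str                                # Intent action it responds to
    exported: bool                             # reachable from other apps?
    required_permission: Optional[str] = None  # permission the sender must hold


def deliver(action: str, sender_permissions: FrozenSet[str],
            components: List[Component]) -> List[str]:
    """Return the names of components that would act on the Intent."""
    triggered = []
    for c in components:
        if c.action != action or not c.exported:
            continue
        if c.required_permission and c.required_permission not in sender_permissions:
            continue  # the sender lacks the permission the receiver demands
        triggered.append(c.name)
    return triggered


# Exported component with no guard: any app can trigger it.
UNGUARDED = [Component("VoiceAssistant", "ACTION_VOICE_COMMAND", exported=True)]
# Same component guarded by a customized permission (assumed name).
GUARDED = [Component("VoiceAssistant", "ACTION_VOICE_COMMAND", exported=True,
                     required_permission="example.permission.USE_VOICE_ASSISTANT")]
ZERO_PERMISSION_APP: FrozenSet[str] = frozenset()

if __name__ == "__main__":
    print(deliver("ACTION_VOICE_COMMAND", ZERO_PERMISSION_APP, UNGUARDED))
    print(deliver("ACTION_VOICE_COMMAND", ZERO_PERMISSION_APP, GUARDED))
```

In this model the zero-permission sender reaches the unguarded component but not the guarded one, which is the effect the customized-permission defense aims for.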
The goal of the modifications proposed in [33] is to automatically reduce attack surfaces in legacy apps.

For lack of experimental devices, we have not yet tested the other platforms named under Scope of Attack in Section 6 (iOS, Windows Phone and other smart devices supporting voice control) and leave them for future research.

Application Analysis. Android permission-based security has been analyzed from many aspects. For example, permission specification and least-privilege security problems were studied in [22] and [9]. Both designed corresponding static analysis tools to detect over-privilege problems. Permission escalation and leakage problems are also hot research topics. Related research includes [24, 19, 12, 14, 28]. Permission-based behavioral footprinting was used to detect known malware in Android markets at large scale in [56]. The work of [52] noticed the problem of pre-installed apps. Wu et al. found that many of those apps were overly privileged because of vendor customizations.

For system code level analysis, dynamic analysis is an efficient solution. TaintDroid [21] is a system-wide dynamic taint tracking and analysis system capable of simultaneously tracking multiple sources of sensitive data. It automatically labels (taints) data from privacy-sensitive sources and transitively applies labels as sensitive data propagates through program variables, files, and interprocess messages. DroidScope [53] is an emulation based Android malware analysis engine that can be used to analyze the Java and native components of Android apps. Unlike current desktop malware analysis platforms, DroidScope reconstructs both the OS-level and Java-level semantics simultaneously and seamlessly.

But these analysis methods are of no use against our GVS-Attack, because VoicEmployer does not touch sensitive data or perform sensitive operations directly. The voice commands are transferred out of the phone, that is, through an uncontrolled physical communication channel. In addition, the remote data transmission function is based on the call channel and the transmission medium is sound. These brand-new features break previous checking and protection mechanisms.

8. CONCLUSION

This paper proposes a novel permission bypassing attack method based on the Android system built-in voice assistant module, Google Voice Search, and the phone speaker. The app launching the attacks does not require any permission, yet the achieved malicious targets can be quite dangerous and practical, from privacy stealing to remote voice control. By utilizing a vulnerability we found in the Google Search app, this attack can dial arbitrary malicious numbers even when the phone is securely locked. Besides, related in-depth topics are discussed, including context-aware analysis, minimal available sound volume, soundless attack, etc. The feasibility of our attack schemes has been demonstrated in the real world. This research may inspire application developers and researchers to rethink that zero permission does not mean safety and that the speaker can be treated as a new attack surface.

Acknowledgements

We thank anonymous reviewers for their valuable comments.

9. REFERENCES

[1] Android Developers. Common Intents. http://developer.android.com/guide/components/intents-common.html.
[2] Android Developers. Intent. http://developer.android.com/reference/android/content/Intent.html.
[3] Android Developers. Platform versions. https://developer.android.com/about/dashboards/index.html#Platform.
[4] Android Developers. RecognizerIntent. http://developer.android.com/reference/android/speech/RecognizerIntent.html.
[5] Android Developers. TextToSpeech. http://developer.android.com/reference/android/speech/tts/TextToSpeech.html.
[6] AOSP. HeadsetStateMachine. https://android.googlesource.com/platform/packages/apps/Bluetooth/.
[7] AOSP. Telephony. https://android.googlesource.com/platform/packages/services/Telephony/.
[8] Apple. Siri. http://www.apple.com/ios/siri/.
[9] K. W. Y. Au, Y. F. Zhou, Z. Huang, and D. Lie. Pscout: analyzing the android permission specification. In Proceedings of the 2012 ACM conference on Computer and communications security, pages 217–228. ACM, 2012.
[10] A. J. Aviv, B. Sapp, M. Blaze, and J. M. Smith. Practicality of accelerometer side channels on smartphones. In Proceedings of the 28th Annual Computer Security Applications Conference, pages 41–50. ACM, 2012.
[11] H. Beigi. Fundamentals of speaker recognition. Springer, 2011.
[12] S. Bugiel, L. Davi, A. Dmitrienko, T. Fischer, A.-R. Sadeghi, and B. Shastry. Towards taming privilege-escalation attacks on android. In 19th Annual Network & Distributed System Security Symposium (NDSS), volume 17, pages 18–25, 2012.
[13] L. Cai and H. Chen. Touchlogger: inferring keystrokes on touch screen from smartphone motion. In Proceedings of the 6th USENIX conference on Hot topics in security, pages 9–9. USENIX Association, 2011.
[14] P. P. Chan, L. C. Hui, and S.-M. Yiu. Droidchecker: analyzing android applications for capability leak. In Proceedings of the fifth ACM conference on Security and Privacy in Wireless and Mobile Networks, pages 125–136. ACM, 2012.
[15] E. Chin, A. P. Felt, K. Greenwood, and D. Wagner. Analyzing inter-application communication in android. In Proceedings of the 9th international conference on Mobile systems, applications, and services, pages 239–252. ACM, 2011.
[16] A. Clark, C. Fox, and S. Lappin. The handbook of computational linguistics and natural language processing, volume 57. John Wiley & Sons, 2010.
[17] CyanogenMod. Google Apps. http://wiki.cyanogenmod.org/w/Google_Apps.
[18] A. Das, N. Borisov, and M. Caesar. Do you hear what I hear? fingerprinting smart devices through embedded acoustic components. In (To appear) Proceedings of the 2014 ACM conference on Computer and communications security. ACM, 2014.
[19] L. Davi, A. Dmitrienko, A.-R. Sadeghi, and M. Winandy. Privilege escalation attacks on android. In Information Security, pages 346–360. Springer, 2011.
[20] S. Dey, N. Roy, W. Xu, R. R. Choudhury, and S. Nelakuditi. Accelprint: Imperfections of accelerometers make smartphones trackable. In Proceedings of the 21st Annual Network and Distributed System Security Symposium, NDSS, 2014.
[21] W. Enck, P. Gilbert, B.-G. Chun, L. P. Cox, J. Jung, P. McDaniel, and A. Sheth. Taintdroid: An information-flow tracking system for realtime privacy monitoring on smartphones. In OSDI, volume 10, pages 1–6, 2010.
[22] A. P. Felt, E. Chin, S. Hanna, D. Song, and D. Wagner. Android permissions demystified. In Proceedings of the 18th ACM conference on Computer and communications security, pages 627–638. ACM, 2011.
[23] A. P. Felt, E. Ha, S. Egelman, A. Haney, E. Chin, and D. Wagner. Android permissions: User attention, comprehension, and behavior. In Proceedings of the Eighth Symposium on Usable Privacy and Security, page 3. ACM, 2012.
[24] A. P. Felt, H. J. Wang, A. Moshchuk, S. Hanna, and E. Chin. Permission re-delegation: Attacks and defenses. In USENIX Security Symposium, 2011.
[25] H. W. Gellersen, A. Schmidt, and M. Beigl. Multi-sensor context-awareness in mobile devices and smart artifacts. Mobile Networks and Applications, 7(5):341–351, 2002.
[26] Google. Google Apps for Android. https://www.google.com/mobile/android/.
[27] Google. Use your voice on Android. https://support.google.com/websearch/answer/2940021?hl=en&ref_topic=4409793.
[28] M. Grace, Y. Zhou, Z. Wang, and X. Jiang. Systematic detection of capability leaks in stock android smartphones. In Proceedings of the 19th Annual Symposium on Network and Distributed System Security, 2012.
[29] C. Hadnagy. Social engineering: The art of human hacking. John Wiley & Sons, 2010.
[30] R. Hasan, N. Saxena, T. Haleviz, S. Zawoad, and D. Rinehart. Sensing-enabled channels for hard-to-detect command and control of mobile devices. In Proceedings of the 8th ACM SIGSAC symposium on Information, computer and communications security, pages 469–480. ACM, 2013.
[31] C. Hurtley. Night noise guidelines for Europe. WHO Regional Office Europe, 2009.
[32] IDC. Worldwide smartphone shipments edge past 300 million units in the second quarter; android and iOS devices account for 96% of the global market, according to IDC. http://www.idc.com/getdoc.jsp?containerId=prUS25037214, August, 2014.
[33] D. Kantola, E. Chin, W. He, and D. Wagner. Reducing attack surfaces for intra-application communication in android. In Proceedings of the second ACM workshop on Security and privacy in smartphones and mobile devices, pages 69–80. ACM, 2012.
[34] Kaspersky Lab. Kaspersky security bulletin 2013. http://media.kaspersky.com/pdf/KSB_2013_EN.pdf, December 2013.
[35] Y.-S. Lee and S.-B. Cho. Activity recognition using hierarchical hidden markov models on a smartphone with 3d accelerometer. In Hybrid Artificial Intelligent Systems, pages 460–467. Springer, 2011.
[36] Microsoft. Cortana. http://www.windowsphone.com/en-us/features#Cortana.
[37] Motorola. How do I setup and use touchless control? https://motorola-global-portal.custhelp.com/app/answers/prod_answer_detail/a_id/94881/p/30,6720,8696/action/auth.
[38] Motorola. Touchless Control. http://www.motorola.com/us/Moto-X-Features-Touchless-Control/motox-features-2-touchless.html.
[39] A. Muzet. Environmental noise, sleep and health. Sleep medicine reviews, 11(2):135–142, 2007.
[40] E. Owusu, J. Han, S. Das, A. Perrig, and J. Zhang. Accessory: password inference using accelerometers on smartphones. In Proceedings of the Twelfth Workshop on Mobile Computing Systems & Applications, page 9. ACM, 2012.
[41] Ponemon Institute. Smartphone security survey of U.S. consumers. http://aa-download.avg.com/filedir/other/Smartphone.pdf, March, 2011.
[42] S. Rosen and P. Howell. Signals and systems for speech and hearing, volume 29. BRILL, 2011.
[43] B. P. Rubin. Google previews “Android L” at I/O. http://www.besttechie.com/2014/06/25/google-previews-android-l-at-io/.
[44] Samsung. S Voice. http://www.samsung.com/global/galaxys3/svoice.html.
[45] J. Schalkwyk, D. Beeferman, F. Beaufays, B. Byrne, C. Chelba, M. Cohen, M. Kamvar, and B. Strope. "Your word is my command": Google search by voice: A case study. In Advances in Speech Recognition, pages 61–90. Springer, 2010.
[46] R. Schlegel, K. Zhang, X. Zhou, M. Intwala, A. Kapadia, and X. Wang. Soundcomber: A stealthy and context-aware sound trojan for smartphones. In Proceedings of the 18th Annual Network and Distributed System Security Symposium, NDSS, 2011.
[47] L. Simon and R. Anderson. Pin skimmer: inferring pins through the camera and microphone. In Proceedings of the Third ACM workshop on Security and privacy in smartphones & mobile devices, pages 67–78. ACM, 2013.
[48] A. Smith. Americans and their cell phones. Pew Internet & American Life Project, 15, 2011.
[49] R. Templeman, Z. Rahman, D. Crandall, and A. Kapadia. PlaceRaider: Virtual theft in physical spaces with smartphones. In Proceedings of the 20th Annual Network and Distributed System Security Symposium, NDSS, 2013.
[50] VB-Audio Software. VB-Audio Virtual Cable. http://vb-audio.pagesperso-orange.fr/Cable/.
[51] Wikipedia. Sound pressure. http://en.wikipedia.org/wiki/Sound_pressure.
[52] L. Wu, M. Grace, Y. Zhou, C. Wu, and X. Jiang. The impact of vendor customizations on android security. In Proceedings of the 2013 ACM SIGSAC conference on Computer & communications security, pages 623–634. ACM, 2013.
[53] L. K. Yan and H. Yin. Droidscope: seamlessly reconstructing the os and dalvik semantic views for dynamic android malware analysis. In Proceedings of the 21st USENIX Security Symposium, 2012.
[54] X. Zhou, Y. Lee, N. Zhang, M. Naveed, and X. Wang. The peril of fragmentation: Security hazards in android device driver customizations. In Security and Privacy (SP), 2014 IEEE Symposium on. IEEE, 2014.
[55] Y. Zhou and X. Jiang. Dissecting android malware: Characterization and evolution. In Security and Privacy (SP), 2012 IEEE Symposium on, pages 95–109. IEEE, 2012.
[56] Y. Zhou, Z. Wang, W. Zhou, and X. Jiang. Hey, you, get off of my market: Detecting malicious apps in official and alternative android markets. In Proceedings of the 19th Annual Network and Distributed System Security Symposium, pages 5–8, 2012.
[57] Z. Zhou, W. Diao, X. Liu, and K. Zhang. Acoustic fingerprinting revisited: Generate stable device id stealthily with inaudible sound. In (To appear) Proceedings of the 2014 ACM conference on Computer and communications security. ACM, 2014.

APPENDIX

A. PERMISSIONS STATISTICS

According to Android API Level 19, the Google Search app (version 3.4.16.1149292.arm) declares / possesses the following system defined permissions:

1. ACCESS_COARSE_LOCATION
2. ACCESS_FINE_LOCATION
3. ACCESS_NETWORK_STATE
4. ACCESS_WIFI_STATE
5. BIND_APPWIDGET
6. BLUETOOTH
7. BROADCAST_STICKY
8. CALL_PHONE
9. GET_ACCOUNTS
10. GLOBAL_SEARCH
11. INTERNET
12. MANAGE_ACCOUNTS
13. MEDIA_CONTENT_CONTROL
14. MODIFY_AUDIO_SETTINGS
15. READ_CALENDAR
16. READ_CONTACTS
17. READ_EXTERNAL_STORAGE
18. READ_HISTORY_BOOKMARKS
19. READ_PROFILE
20. READ_SMS
21. READ_SYNC_SETTINGS
22. RECEIVE_BOOT_COMPLETED
23. RECORD_AUDIO
24. SEND_SMS
25. SET_ALARM
26. SET_WALLPAPER
27. SET_WALLPAPER_HINTS
28. STATUS_BAR
29. USE_CREDENTIALS
30. VIBRATE
31. WAKE_LOCK
32. WRITE_CALENDAR
33. WRITE_EXTERNAL_STORAGE
34. WRITE_SETTINGS
35. WRITE_SMS

Non-public permissions:

1. CAPTURE_AUDIO_HOTWORD
2. STOP_APP_SWITCHES
3. PRELOAD
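The permission list above is what a zero-permission app effectively borrows when it drives Google Voice Search by voice. The sketch below illustrates that reasoning only. The mapping from spoken commands to permissions is an assumption made for illustration (the paper gives no such table), "take a photo" is a hypothetical command added to show a permission that is absent from the list, and `reachable_by_proxy` is a hypothetical helper.

```python
# Subset of the permissions listed in Appendix A for the Google Search app.
ASSISTANT_PERMISSIONS = frozenset({
    "ACCESS_FINE_LOCATION", "CALL_PHONE", "READ_CONTACTS", "SEND_SMS",
})

# A zero-permission app such as VoicEmployer holds none of them.
ATTACKER_PERMISSIONS = frozenset()

# ASSUMPTION: an illustrative guess at which permission each voice command
# exercises inside the assistant; this mapping is not taken from the paper.
COMMAND_NEEDS = {
    "call number 1234 5678": "CALL_PHONE",
    "where is my location": "ACCESS_FINE_LOCATION",
    "take a photo": "CAMERA",  # hypothetical command; CAMERA is not in Appendix A
}


def reachable_by_proxy(command_needs, attacker, assistant):
    """Commands whose permission the attacker lacks but the assistant holds."""
    return sorted(
        command for command, permission in command_needs.items()
        if permission not in attacker and permission in assistant
    )


if __name__ == "__main__":
    for command in reachable_by_proxy(
            COMMAND_NEEDS, ATTACKER_PERMISSIONS, ASSISTANT_PERMISSIONS):
        print(command)
```

Under these assumptions the two commands quoted in Section 3 are reachable by proxy, while the hypothetical camera command is not, because the assistant itself does not hold that permission.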