Master of Science in Engineering: Computer Security June 2019

Threat Analysis of Smart Home Assistants Involving Novel Acoustic Based Attack-Vectors

Adam Björkman Max Kardos

Faculty of Computing, Blekinge Institute of Technology, 371 79 Karlskrona, Sweden

This thesis is submitted to the Faculty of Computing at Blekinge Institute of Technology in partial fulfilment of the requirements for the degree of Master of Science in Engineering: Computer Security. The thesis is equivalent to 20 weeks of full-time studies.

The authors declare that they are the sole authors of this thesis and that they have not used any sources other than those listed in the bibliography and identified as references. They further declare that they have not submitted this thesis at any other institution to obtain a degree.

Contact Information: Author(s): Adam Björkman E-mail: [email protected]

Max Kardos E-mail: [email protected]

University advisers:
Assistant Professor Fredrik Erlandsson
Assistant Professor Martin Boldt
Department of Computer Science and Engineering

Faculty of Computing, Blekinge Institute of Technology, SE–371 79 Karlskrona, Sweden
Internet: www.bth.se
Phone: +46 455 38 50 00
Fax: +46 455 38 50 57

Abstract

Background. Smart home assistants are becoming more common in our homes. Often taking the form of a speaker, these devices enable communication via voice commands. Through this communication channel, users can, for example, order a pizza, check the weather, or call a taxi. When a voice command is given to the assistant, the command is sent to cloud services over the Internet, enabling a multitude of functions associated with risks regarding security and privacy. Furthermore, with an always active Internet connection, smart home assistants are a part of the Internet of Things, a type of device that has historically not been secure. Therefore, it is crucial to understand the security situation and the risks that a smart home assistant brings with it.

Objectives. This thesis aims to investigate and compile threats towards smart home assistants in a home environment. Such a compilation could be used as a foundation during the creation of a formal model for securing smart home assistants and other devices with similar properties.

Methods. Through literature studies and threat modelling, current vulnerabilities towards smart home assistants and systems with similar properties were found and compiled. A few vulnerabilities were tested against two smart home assistants through experiments to verify which vulnerabilities are present in a home environment. Finally, methods for the prevention and protection of the vulnerabilities were found and compiled.

Results. Overall, 27 vulnerabilities towards smart home assistants and 12 towards similar systems were found and identified. The majority of the found vulnerabilities focus on exploiting the voice interface. In total, 27 methods to prevent vulnerabilities in smart home assistants or similar systems were found and compiled. Eleven of the found vulnerabilities did not have any reported protection methods. Finally, we performed one experiment consisting of four attacks against two smart home assistants, with mixed results; one attack was not successful, while the others were either completely or partially successful in exploiting the target vulnerabilities.

Conclusions. We conclude that vulnerabilities exist for smart home assistants and similar systems. The vulnerabilities differ in execution difficulty and impact. However, we consider smart home assistants safe enough to use with the accompanying protection methods activated.

Keywords: Smart home assistants, threats, voice interface, vulnerability, exploit


Sammanfattning

Background. Smart home assistants are becoming increasingly common in our homes. They often take the form of a speaker and enable communication via voice commands. Through this communication channel, users can, among other things, order a pizza, check the weather, or book a taxi. Voice commands given to the device are sent to cloud services over the Internet, enabling a multitude of functions with associated security and privacy risks. Furthermore, with a constant Internet connection, smart home assistants are part of the Internet of Things, a type of device that has historically been insecure. It is therefore important to understand the security situation and the risks that accompany the use of smart home assistants in a home environment.

Objectives. The aim of the report is to provide a broad mapping of the threat landscape against smart home assistants in a home environment. In addition, the mapping can serve as a foundation for the creation of a model for securing both smart home assistants and other devices with similar properties.

Methods. Through literature studies and threat modelling, current threats against smart home assistants and systems with similar properties were found and compiled. Some of the threats were tested against two different smart home assistants through experiments to establish which threats are relevant in a home environment. Finally, methods for preventing and protecting against the vulnerabilities were also found and compiled.

Results. In total, 27 threats against smart home assistants and 12 against similar systems were found and compiled. The majority of the vulnerabilities found focus on manipulation of the voice interface through different methods. In total, 27 methods for preventing vulnerabilities in smart home assistants or similar systems were also found and compiled; eleven of the vulnerabilities are not prevented by any of these methods. Finally, an experiment was performed in which four different attacks were tested against two smart home assistants, with varying results. One attack was unsuccessful, while the remainder either completely or partially succeeded in exploiting the vulnerabilities.

Conclusions. We conclude that vulnerabilities exist for smart home assistants and for similar systems. The vulnerabilities vary in difficulty of execution and in consequence. However, we consider smart home assistants safe enough to use with the accompanying protection methods activated.

Keywords: Smart home assistants, threat landscape, voice interface, compilation, attack


Acknowledgments

We want to thank Martin Boldt and Fredrik Erlandsson for their supervision and guidance during the thesis. We also want to thank Knowit Secure, its employees, and our company supervisor Mats Persson, for their motivation and expertise. Finally, we would like to thank our families for their unrelenting support.


Contents

Abstract i

Sammanfattning iii

Acknowledgments v

1 Introduction 1
   1.1 Problem Description and Research Gap 2
   1.2 Aim and Research Questions 2
   1.3 Scope and Limitations 3
   1.4 Document Outline 3

2 Background 5
   2.1 Smart Home Assistant 5
      2.1.1 Amazon Echo 6
      2.1.2 Google Home 6
   2.2 Application Programming Interface 6
   2.3 Automatic Speech Recognition 7
   2.4 Speaker Recognition 7
   2.5 Threats Towards Smart Home Assistants 7
      2.5.1 Threat Mitigation 8
      2.5.2 Threat Classification 8
      2.5.3 Vulnerability Databases 8
   2.6 Threat Modelling 8
      2.6.1 STRIDE 8

3 Related Works 11

4 Method 13
   4.1 Systematic Literature Review 13
      4.1.1 Database Selection 13
      4.1.2 Selection Criteria 14
      4.1.3 Quality Assessment 14
      4.1.4 Data Extraction Strategy and Synthesis 15
   4.2 Threat Assessment of Smart Home Assistants 15
      4.2.1 Keywords 15
      4.2.2 Quality Assessment Criteria 16
   4.3 Threat Assessment of Similar Systems 17

      4.3.1 Keywords 17
      4.3.2 Quality Assessment Criteria 18
   4.4 Threat Modelling 18
      4.4.1 Generalised STRIDE Analysis 18
   4.5 Experiment Design 19
      4.5.1 Experiment Environment 20
      4.5.2 Functionality Test of SHA 20
      4.5.3 Chosen Attacks 20
      4.5.4 Experiment Layout 20
   4.6 Experiment Execution 21
      4.6.1 Replay Attack 22
      4.6.2 Adversarial Attack Using Psychoacoustic Hiding 23
      4.6.3 Harmful API Behaviour 24
      4.6.4 Unauthorised SHA Functionality 25

5 Results 27
   5.1 Threat Status of Smart Home Assistants 27
      5.1.1 Vulnerabilities 28
      5.1.2 Protection Methods 31
   5.2 Threat Status on Similar Systems 33
      5.2.1 Vulnerabilities 34
      5.2.2 Protection Methods 36
   5.3 Threat Modelling 37
      5.3.1 Possible Threats 37
      5.3.2 Protection Methods 39
   5.4 Threat Validation on SHA 40
      5.4.1 Replay Attack 42
      5.4.2 Harmful API Behaviour 42
      5.4.3 Unauthorised SHA Functionality 43
      5.4.4 Threat Validation Summary 44

6 Analysis and Discussion 47
   6.1 Research Implications 47
   6.2 Research Question Analysis 48
   6.3 Literature Reviews 49
   6.4 Threat Modelling 50
   6.5 Experiments 51
      6.5.1 Features Not Supported 51
      6.5.2 Vulnerability Score 52

7 Conclusions and Future Work 53
   7.1 Future Works 53

Appendices 63

A Permission Forms 65
   A.1 Permission IEEEXplore 65

B Scripts 67
   B.1 Script for Search Result Extraction 67


List of Figures

2.1 A command flow example as found in an Amazon SHA ©2018 IEEE. See Appendix A.1 for permission. 6

4.1 A generalised system targeted in the STRIDE analysis process 19

5.1 The number of protection methods addressing each vulnerability found during threat assessment of SHAs 33
5.2 The number of protection methods addressing each vulnerability found during threat assessment of similar systems 37
5.3 The number of protection methods addressing each SHA vulnerability generated during the threat modelling process 40


List of Tables

4.1 Form describing the data extracted from the literature review papers 15
4.2 Search keywords, sorted by category, used in the threat assessment of home assistants 16
4.3 Search keywords, sorted by category, used in the threat assessment of systems similar to smart home assistants 17
4.4 Attacks and their corresponding target, found during the threat assessments, chosen for the experimentation phase 21

5.1 The number of papers found through each database for the threat assessment of smart home assistants 27
5.2 Papers remaining for the threat assessment of smart home assistants, after application of selection criteria 28
5.3 Papers remaining for the threat assessment of smart home assistants, after application of quality assessment criteria 28
5.4 Vulnerabilities discovered during the threat assessment of smart home assistants 29
5.5 Protection methods discovered during the threat assessment of smart home assistants 31
5.6 The number of papers found through each database for the threat assessment of similar systems 33
5.7 Papers remaining for the threat assessment of similar systems, after application of selection criteria 34
5.8 Papers remaining for the threat assessment of similar systems, after application of quality assessment criteria 34
5.9 Vulnerabilities discovered during the threat assessment of similar systems 35
5.10 Protection methods discovered during the threat assessment of similar systems 36
5.11 Threats identified through the STRIDE process, targeting a modelled home environment 38
5.12 Smart home assistant protection methods identified via threat modelling 39
5.13 Attacks and their corresponding target, found during the threat assessments, chosen for the experimentation phase 41
5.14 Results of replay attacks towards Amazon Echo and Google Home, marking successful attacks, which return calendar information, with "X" 42
5.15 Result of privacy-infringing and harmful queries towards Amazon Echo and Google Home 43

5.16 Delta-value summary of threat validation attempts towards Google Home and Amazon Echo 44

Chapter 1 Introduction

Smart home assistants are quickly growing in the consumer market, with half of American homes expected to have one by 2022 [1]. Often taking the form of a speaker, smart home assistant devices vary in physical appearance. Through built-in microphones and Automatic Speech Recognition (ASR), a technology aiming to enable human-machine interaction [2], smart home assistants record and analyse commands using Artificial Intelligence (AI). ASR and AI ensure intelligent responses through the speakers, allowing the user to communicate and converse with their devices [3]. Through this channel, smart home assistants enable functions such as ordering a pizza or making a call. Many see this as an increase in convenience in everyday life, one of the primary drivers of smart home assistants' popularity today [4]. Another driver is the desire to talk and hold a conversation with one's devices, inspired by fantasy and science fiction characters such as HAL 9000 or KITT [5].

However, the popularity of smart home assistants comes with concerning drawbacks. Smart home assistants are Internet of Things (IoT) devices: Internet-controlled devices with a history of insecurity and weak standards [6]. A display of IoT weaknesses is the Mirai botnet, consisting of many different devices with lacking security measures. The botnet controller caused monetary losses of up to $68.2 million, mostly consisting of bandwidth costs [7].

Further insecurity concerns are found in state-sponsored hackers spying on the population through the devices or targeting vital infrastructure [8], [9]. Other concerns address privacy and user data. Law enforcement could use the data to track individuals [10], unauthorised users could gain access to sensitive data by asking the assistant to read it out loud [5], and the device could always be listening, recording everything in the home environment [4]. Vendors address these concerns in different manners. Google ensures both privacy and security through a technique called Voice Match to identify unique users, preventing the home assistant from revealing private details unless the correct person commands it to [5]. However, Voice Match could be bypassed using voice spoofing, meaning impersonation or copying of someone else's voice [11]. Consequently, it becomes difficult to know which weaknesses and resulting vulnerabilities exist in smart home assistants without analysing current research or testing the devices themselves.


1.1 Problem Description and Research Gap

While previous analysis and testing of smart home assistants have taken place, threats towards smart home assistants are currently not comprehensively addressed [10], [4], [12], [13]. The lack of a comprehensive threat analysis could not only hinder users from grasping the risks associated with the devices but also hamper the creation of a formal security model. Such a model could assist in securing the development of smart home assistants. As the popularity of smart home assistants increases, so do the risks which insecure usage and development entail. Additionally, attacks towards smart home assistants have already occurred, meaning the concerns for user privacy and security are valid and should be addressed.

This thesis aims to address the research gap in the smart home assistant domain by creating a public mapping of known and novel threats towards smart home assistants. This mapping will describe the current threats faced by such devices, simplifying secure usage and development. Furthermore, as the thesis includes protection methods, the result could function as a foundation during the future development of a formal model for securing smart home assistants.

1.2 Aim and Research Questions

This thesis aims to investigate threats against smart home assistants with a focus on their most common domain: the home environment. A mapping of threats containing vulnerabilities and protection methods is facilitated through the following: analysis of previously disclosed vulnerabilities towards smart home assistants, analysis of novel vulnerability channels such as sound, and experiments adapting said vulnerabilities to smart home assistants. To map out protection methods, an examination of existing and proposed methods, coupled with their effectiveness against the previous vulnerabilities, will be performed. Therefore, the objectives of this thesis are:

• Gain an understanding of previously discovered threats against current smart home assistants.

• Locate vulnerabilities which could affect smart home assistants based on background research in areas such as similar solutions like ASR applications, security publications, and peer-reviewed research papers.

• Attempt to adapt and reproduce, or create, attacks against smart home assistants that exploit the previously located vulnerabilities.

• Analyse whether protection methods in smart home assistants prevent the above attacks.

• If attacks are successful, then identify and discuss possible protection methods.

Based on the objectives above, the following research questions were devised.

RQ 1: What vulnerabilities targeting smart home assistants have been reported in academic sources or in vulnerability and weakness databases?

RQ 2: What additional types of vulnerabilities can be identified and to what extent could they affect smart home assistants?

RQ 3: Which protection methods are used to safeguard against vulnerabilities identified in RQ1-2, and is there a need for additional protection methods?

1.3 Scope and Limitations

The scope of this study is limited to testing two smart home assistants: the Amazon Echo Dot and the Google Home. Both devices purchased are the latest editions available on the Swedish consumer market as of 2019-02-10. The reasoning behind this choice is that their vendors hold the top shares in the worldwide home assistant market, coupled with their consumer availability [14]. Focusing on these devices increases reproducibility for other researchers, strengthening the scientific value of the thesis.

The first literature review, targeting known vulnerabilities, will acquire data up to 2018 (inclusive) with no lower bound. The limit was set in conjunction with the creation of the project plan in December 2018, when our scope was defined. Defining a hard scope limit early on in the thesis process allows us to maintain reproducibility for existing research, meaning we do not need to adhere to research published during the thesis time frame. As a manageable amount of vulnerabilities was found in a preparatory search using this scope, we deem it worthwhile to investigate them all. To ensure coverage of today's smart home assistant market, we include other devices beyond those selected for testing in the search string.

The second literature review will target vulnerabilities in systems similar to smart home assistants, such as ASR applications. Similar systems will be defined from the first literature review: if a component is the cause of vulnerabilities in the first literature review, it will be defined as a similar system. The search limit will be the last five years. Based on a preliminary search, there are too many systems and vulnerabilities if the five-year time frame is exceeded, and including them all is not feasible.

Furthermore, hardware-dependent vulnerabilities based on the physical modification of the smart home assistant device will not be attempted. This exclusion is due to such exploits being too time- and resource-consuming, with the required acquisition of vulnerable devices.

1.4 Document Outline

The rest of the paper is structured as follows. The Background chapter covers background information needed for the thesis scope. The Related Works chapter covers related studies and papers. The Method chapter covers the procedures and design of the systematic literature reviews, the threat modelling process, and the experiment. The Results chapter describes the results gathered through the method procedures. The Analysis and Discussion chapter contains answers to the presented research questions (RQ), discussion regarding the results, as well as validity threats towards the thesis. The Conclusions and Future Work chapter contains the thesis conclusion on the security of smart home assistants and a section about possible future works in the thesis domain.

Chapter 2 Background

The Background chapter presents information regarding smart home assistants and their voice interface functionality, research methodologies, and threat definitions.

2.1 Smart Home Assistant

The definition of a Smart Home Assistant (SHA) is a contentious issue; our definition is an Internet-connected device in a home environment built for voice interaction using microphones and ASR [2]. Two examples of tasks that an SHA performs are answering questions about the traffic or the weather. Tasks could also focus on controlling other smart home devices, such as light bulbs or media players [15].

When a user wants to interact with the device, the user utters a wake word. The wake word differs depending on the device, with Google Home using "Okay, Google" or "Hey Google" [16]. After the wake word, the device listens for commands issued by the user. The issued command is then processed by AI, ensuring a meaningful response and allowing further user interaction with the device [3].

There are multiple types of SHA devices, separated by shape, size, and functionality. Another differing factor is the region, with some SHAs designed for specific areas and languages. An example is the Clova Wave, designed for the Japanese market [17]. While the physical devices differ in appearance, the underlying AI is occasionally the same. For example, Amazon offers developer licenses of their Alexa AI through Amazon Voice Services. This licensing allows third-party developers to create devices with the Alexa AI built in, such as the Harman Kardon Allure [18].

The SHAs which are the focus of this thesis are the Google Home and the Amazon Echo. Both devices exist in Sweden, and their vendors hold the most significant shares of the worldwide SHA market [14]. Additionally, both Amazon and Google allow for licensed use of their respective underlying AI services [19], [20]. Therefore, any vulnerabilities found affecting these devices could directly affect those based on the same AI services, creating a more comprehensive threat overview.

In this thesis, systems similar to SHAs are also of interest. We define a similar system as one where at least one key component is the same as in an SHA. For example, ASR applications are similar as both have a voice-input channel. Other key components are voice or speech interfaces and speaker verification applications. Standard components, such as being powered by electricity, are not considered.


2.1.1 Amazon Echo

The Echo is a series of SHAs created by Amazon, with their first device released in 2014 [10]. The series contains multiple devices, such as the Echo Dot, Echo Plus, and Echo Show. Commands supported by the Echo devices are called skills, which are stored in a skill server and are provided by Amazon [15]. However, third-party skills developed by companies or users are also supported, extending the functionality beyond that provided by Amazon [15]. Examples of such skills include accessing bank accounts or interacting with non-Amazon smart lights.

Whether skills from Amazon or a third party are used, the command flow structure remains the same [15]. An example command flow is shown in Figure 2.1 below [13].


Figure 2.1: A command flow example as found in an Amazon SHA ©2018 IEEE. See Appendix A.1 for permission.

2.1.2 Google Home

The Google Home is a series of SHAs created by Google, with its first device released in 2016 [21]. Among the devices in the Google Home series are the Google Home itself, the Google Home Hub, and the Google Home Mini. Commands supported by the Google Home devices are called actions and work in a similar way to Amazon's skills [22]. Some actions are provided by Google and others by third parties, adding functionality to the Google Home devices. The command flow employed by the Google Home is similar to that used by the Amazon SHAs.

2.2 Application Programming Interface

An Application Programming Interface (API) is a software intermediary, enabling communication between applications and their users [23]. Using an API means the communication channel is abstracted; instead of talking directly to an application, a request is sent to the API, which fetches a response from the application and returns it to the requester. Furthermore, when the application is decoupled from the request in this manner, maintenance and alteration of the underlying application infrastructure are simplified. As long as the format of the requests and responses sent through the API is kept consistent, the API will support them.

While APIs are useful, they can pose a threat to their providers through issues such as API calls without authentication, non-sanitised parameters, and replay attacks [24], [25]. Having no API authentication means that anyone could access protected or sensitive resources. If API parameters are not sanitised, an attacker could inject malicious commands into the request. Finally, a replay attack could allow an adversary to repeat a valid API call.
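To make the abstraction and the authentication risk concrete, the following minimal sketch issues a request to a hypothetical device API; the host, port, path, and response fields are assumptions made only for this illustration and do not describe any real product interface.

```javascript
// Minimal sketch: querying a hypothetical, unauthenticated device API.
// The host, port, and path below are placeholders, not a documented interface.
async function readDeviceStatus(host) {
  // No credentials or tokens are attached to the request; if the API does not
  // enforce authentication, anyone on the same network can issue this call.
  const response = await fetch(`http://${host}:8080/api/status`);
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
}

readDeviceStatus('192.168.1.50')
  .then((status) => console.log(status))
  .catch((err) => console.error(err.message));
```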

2.3 Automatic Speech Recognition

ASR is the technology used to translate spoken word into written text [26]. The technology works by removing noise and distortions from recorded audio [2]. After that, the filtered audio is processed using different techniques and analysis methods, one of which is machine learning [2], [26]. Machine learning is a technique where a computer program learns to improve its own performance in solving a task based on its own experience [26]. Through statistical methods within machine learning, such as Hidden Markov Models (HMM), the SHA can determine which words were uttered by the user [2].

2.4 Speaker Recognition

Speaker recognition focuses on recognising who is speaking as opposed to ASR, which focuses on what is being spoken, as described in Section 2.3. Speaker recognition has two different categories: Automatic Speaker Verification (ASV) and Automatic Speaker Identification (ASI). ASV systems verify claimed identities [27], [28]. An example of ASV in action is a system where an employee scans a personal key tag then speaks into a microphone to verify the identity claimed by the key tag. ASI systems instead determine the identity without verifying an external claim [27], [28]. Despite the differences between speaker recognition and ASR, both use similar audio processing techniques and statistical models for determining identities [27], [28], [2].

2.5 Threats Towards Smart Home Assistants

Grasping the terms exploit, vulnerability, and attack is key to understanding threats towards SHAs. A vulnerability is a weakness in a system for which an exploit can be developed. If an exploit successfully leverages a vulnerability, it is called an attack [29]. Finally, a threat is an unwanted action that could cause damage to a system or business [29]. In the context of SHAs, the definition of a threat is any event or action which causes unwanted or malicious behaviour of an SHA. An example of a threat against SHAs is called voice squatting. By creating voice commands that sound phonetically the same as others, an attacker could redirect users to malicious web pages or applications [30]. For example, if a user says "open Capital One", an attacker could create a skill, or action, named Capital Won, relying on ambiguity to trick the user into launching their skill or action [30].

2.5.1 Threat Mitigation

The mitigation of threats against SHAs is as essential as identifying them. Mitigation techniques, or protection methods, aim to deal with the threat and impact of attacks [29]. A preemptive mitigation technique towards voice squatting, as described in Section 2.5, would be the vendor performing verification checks of commands before they are allowed for use. Furthermore, certificates could be issued to trusted developers, hindering malicious developers [31].
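As an illustration of such a verification check, the sketch below rejects a new invocation name when it normalises to the same phrase as an already registered one. The homophone table is a tiny, hand-picked assumption; a real vendor check would rely on a full phonetic model rather than this lookup.

```javascript
// Hypothetical vendor-side check against voice squatting: reject invocation
// names that normalise to the same phrase as an already registered skill.
const HOMOPHONES = { won: 'one', capitol: 'capital', four: 'for' }; // illustrative subset only

function normalise(name) {
  return name
    .toLowerCase()
    .split(/\s+/)
    .map((word) => HOMOPHONES[word] || word)
    .join(' ');
}

function collidesWithExisting(newName, registeredNames) {
  const target = normalise(newName);
  return registeredNames.some((name) => normalise(name) === target);
}

console.log(collidesWithExisting('Capital Won', ['Capital One']));   // true  -> reject
console.log(collidesWithExisting('Weather Today', ['Capital One'])); // false -> allow
```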

2.5.2 Threat Classification

Typically, threats are classified after they have been discovered. The classification process is often simplified using broad definitions, focusing on threat groups rather than granular threat classification. Using broad threat classifications allows for inclusive use of pre-existing threat data, rather than the smaller set of purely identical threat scenarios. For example, the threat modelling methodology STRIDE classifies threats through their possible attack outcomes. The categories used by STRIDE can be seen in Section 2.6.1.

2.5.3 Vulnerability Databases

Threats and corresponding vulnerabilities that are made public are stored in vulnerability databases such as Exploit Database1, Mitre CVE2, and Mitre CWE3. Vulnerability databases make it possible to search for vulnerabilities using parameters such as classification, devices, and score, with the latter being part of the classification performed by the individual vulnerability databases themselves.

2.6 Threat Modelling

Threat modelling is an approach for preemptively identifying threats in a specific system from an attacker's point of view. Once completed, threat modelling aids in categorising vulnerabilities together with determining attacker profiles, possible attack vectors, and which assets are most at risk.

2.6.1 STRIDE

STRIDE is a Microsoft-developed threat modelling methodology used for the identification of IT-security threats [32]. The list below explains the six categories into which security threats are categorised by the STRIDE model [33].

1 https://www.exploit-db.com/
2 https://cve.mitre.org/
3 https://cwe.mitre.org/

• Spoofing - An attacker uses authentication details from users to gain access to resources otherwise unavailable.

• Tampering - Malicious actions which alter data.

• Repudiation - Users denying a specific action has occurred in the system.

• Information Disclosure - Confidential information is accessible to non-authorised users.

• Denial of Service - Users are denied service through blocking access to resources.

• Elevation of Privilege - An unprivileged user gains privilege through malicious means.

The STRIDE process is applied as follows. First, split the system into components. Then, for each component, determine which of the vulnerability categories could apply to said component. After establishing the vulnerability categories, investigate threats to each component under its appropriate category. After that, determine protection methods for each threat. Finally, the process is repeated with the newly determined protection methods in mind until a comprehensive set of threats against the system is achieved [33].
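To make the output of one such iteration concrete, the following sketch shows how a single component's entry could be recorded. The component, threats, and protection methods listed are illustrative examples only and are not results produced by this thesis.

```javascript
// Illustrative record produced by one STRIDE iteration for a single component.
const voiceInterfaceEntry = {
  component: 'Voice interface',
  categories: {
    Spoofing: {
      threats: ['Replayed or synthesised voice commands accepted as a legitimate user'],
      protections: ['Voice identification', 'Liveness detection'],
    },
    ElevationOfPrivilege: {
      threats: ['Unauthorised user issues privileged commands (e.g. purchases)'],
      protections: ['Per-command confirmation', 'PIN for sensitive actions'],
    },
  },
};

console.log(JSON.stringify(voiceInterfaceEntry, null, 2));
```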

Chapter 3 Related Works

There are only a few security analyses of specific SHAs. Haack et al. [15] analyse the Amazon Echo based on a proposed security policy. Sound-, network- and API-based attacks are tested, confirming that the device security is satisfactory although it can, in exceptional cases, be exploited. Furthermore, a set of recommendations is provided, addressing the tested attacks.

There have been multiple studies on specific vulnerabilities targeting SHAs. Zhang et al. [34] designed the "DolphinAttack", an attack vector which creates executable, yet inaudible voice commands using ultrasonic sound. The paper verifies the attack against a Google Home device. Yuan et al. [35] designed the "REEVE-attack" in a paper which describes how Amazon Echo devices could be forced to obey commands injected into radio transmissions and air-broadcasted televisions via software-defined radio. Zhang et al. [30] designed and performed an experiment, showing that phonetic ambiguity could be exploited to execute custom, harmful skills instead of legitimate skills with a similar name. Additional SHA vulnerabilities are found in Table 5.4.

Vulnerabilities in components found in SHAs, especially voice-based interfaces and digital assistants, have also been studied. Lei et al. [13] detect multiple vulnerabilities in voice-based digital assistants, using Google Home and Amazon Echo as case studies, caused by weak authentication. Based on the found vulnerabilities, a proximity-based authentication solution is proposed. Piotrowski and Gajewski [11] investigate the effect of voice spoofing using an artificial voice and propose a method to protect against it. Chen et al. [36] propose a design and implementation for a voice-spoofing protection framework, designed and tested individually for use with smartphones. Additional vulnerabilities for systems similar to SHAs are found in Table 5.9.

The privacy implications of SHAs have also been studied. Ford and Palmer [37] performed an experiment coupled with a network traffic analysis of the Amazon Echo, concluding that the device does send audio recordings even when it is not actively used. Chung et al. [38] performed a forensic investigation of the Amazon Echo, finding settings, usage recordings, to-do lists, and emoji usage being stored on the device. Orr and Sanchez [10] investigate what information is stored on an Amazon Echo and what purpose it serves. Furthermore, forensic analysis is performed to determine the evidentiary value of the information stored on the device. Chung, Park, and Lee [39] developed and presented a tool for extracting digital evidence from the Amazon Echo.

Furthermore, the security and privacy implications of smart homes and IoT devices have also been studied. Bugeja, Jacobsson, and Davidsson [40] present an overview of privacy and security challenges in a smart home environment, contributing to constraints and evaluated solutions. Rafferty, Iqbal, and Hung [41] present privacy and security threats of IoT devices, specifically smart toys, within a smart home environment. Vulnerabilities and exploits are presented, together with a proposed security threat model. Vijayaraghavan and Agarwal [42] highlight novel technical and legal solutions for security and privacy issues in a connected IoT environment, such as standards and policies, device trust, encryption, and lightweight encryption. Zeng, Mare, and Roesner [43] performed interviews to study end-user usage and security-related behaviour, attitudes, and expectations towards smart home devices. The authors also devised security recommendations for smart home technology designers. Sivanathan et al. [44] perform an experimental evaluation of threats towards a selection of IoT devices, providing a security rating of each device and type of threat.

The identified related works show that there is no comprehensive mapping of threats towards SHAs. This is the research gap this thesis attempts to address.

Chapter 4 Method

The Method chapter explains and motivates our chosen research methodologies: the systematic literature review, the quasi-experiment, and the threat modelling process.

4.1 Systematic Literature Review

The aim of a systematic literature review (SLR) is to identify literature relevant to specific research questions using a well-defined methodology. The SLR allows for an unbiased and repeatable review via the identification, analysis, and interpretation of available literature [45]. The SLR consists of three phases: planning, conducting, and reporting the review. The first phase, planning, involves creating RQs and establishing a review protocol. The review protocol describes the processes used throughout the SLR, which include creating selection criteria, quality assessment checklists, and data extraction strategies, among others. During the second phase, the SLR is conducted. The processes involved are research identification, selection of primary studies, and data extraction and synthesis. The third phase involves managing and evaluating the data received [46].

4.1.1 Database Selection

The databases listed below were used for the literature review. The databases were accessed using the Blekinge Institute of Technology (BTH) library proxy, ensuring full-text access for the literature.

• BTH Summon

• Google Scholar

• IEEE Xplore

• ScienceDirect

Publications from the following security conferences were also examined.

• Black Hat

• DEF CON

Moreover, the following exploit-, CVE-, and CWE-databases were used to find reported and publicly available vulnerabilities, weaknesses, and exploits.


• Exploit Database

• Mitre CVE

• Mitre CWE

Database limitations

The Summon database does not support saving results from a search. Instead, we used JavaScript1 to collect the research results, as seen in Appendix B.1. Similarly, Google Scholar does not support search saving. Therefore, we used the tool Publish or Perish to extract the resulting papers [47].
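The script actually used is the one in Appendix B.1; the sketch below only illustrates the general idea of extracting result metadata from a search results page in the browser console. The CSS selector names are assumptions and will not match the real Summon markup as-is.

```javascript
// Hypothetical result-extraction snippet in the spirit of Appendix B.1.
// The selector names below are assumptions; the real Summon markup may differ.
const results = Array.from(document.querySelectorAll('.result-item')).map((item) => ({
  title: item.querySelector('.title')?.textContent.trim(),
  authors: item.querySelector('.authors')?.textContent.trim(),
  link: item.querySelector('a.link')?.href,
}));

// Print as JSON so the list can be copied out of the browser console.
console.log(JSON.stringify(results, null, 2));
```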

4.1.2 Selection Criteria

Inclusion criteria ensure relevant research. The chosen criteria are listed below.

• Paper is not a duplicate

• Full text available online

• Paper written in English

• Paper not published after 2018

• Paper originating from conference, journal or a peer-reviewed source

• Paper type is academically sound, meaning it is not a news article or presentation

• Relevant title

• Relevant abstract

• Relevant conclusion

• Relevant full paper

4.1.3 Quality Assessment

Ensuring the quality of papers extracted through the inclusion and exclusion criteria is considered critical. As an aid, Kitchenham's guidelines present a list of quality assessment criteria [46]. As proposed by Fink, this list was treated as a base, meaning quality assessment criteria were selected and modified to fit our thesis, as opposed to covering each separately [48]. The quality assessment criteria of each literature review are presented in their corresponding section.

1 https://developer.mozilla.org/en-US/docs/Web/JavaScript

4.1.4 Data Extraction Strategy and Synthesis

The data extraction was performed using Table 4.1 as a template, describing the extracted data. Both authors performed an individual data extraction of each paper. The authors' notes were then compared to unify the extraction. If an extraction differed between the authors, a discussion took place until both were unified. Such a difference occurred when the target paper complexity was high or its findings were not clearly presented. Furthermore, for the first literature review targeting reported vulnerabilities in SHAs, the data was synthesised into two categories: reported threats and protection methods. These categories correspond to RQ1 and RQ3. For the second literature review, the categories were threats in similar systems that could potentially affect SHAs, and protection methods for similar systems. These categories correspond to RQ2 and RQ3.

Table 4.1: Form describing the data extracted from the literature review papers

Extraction Type          Data
Article Title
Article Author
Date
Source
Publication Type
Research Method
Validity Threats
Applicable RQs
Smart device category

4.2 Threat Assessment of Smart Home Assistants

To understand the current threats against SHAs, this assessment aimed to analyse reported vulnerabilities and protection methods targeting SHAs, answering RQ1 and RQ3 respectively. The assessment used the SLR as the instrument of choice.

4.2.1 Keywords

Keywords derived from our RQs were used to find relevant research during the literature review. These keywords were categorised and given an ID and can be seen in Table 4.2. Category A contains general keywords regarding our research. Category B contains names of SHAs found after researching today's SHA market. Category C contains variants of vulnerabilities regarding our research.

Table 4.2: Search keywords, sorted by category, used in the threat assessment of home assistants

Category A: A1=smart home assistant, A2=smart speaker
Category B: B1=Amazon Echo, B2=Google Home, B3=Amazon Tap, B4=JBL Link, B5=Polk Command Bar, B6=UE Blast, B7=Harman Kardon, B8=Harman Kardon Allure, B9=Apple Homepod, B10=Lenovo SmartDisplay, B11=Yandex Station, B12=, B13=Clova Wave
Category C: C1=weakness, C2=flaw, C3=malicious, C4=threat, C5=risk, C6=vulnerability, C7=attack, C8=security, C9=exploit

The following search string was used in the scientific databases to find SHA vulnerabilities.

(A1 or A2 or B1 or B2 or B3 or B4 or B5 or B6 or B7 or B8 or B9 or B10 or B11 or B12 or B13) and (C1 or C2 or C3 or C4 or C5 or C6 or C7 or C8 or C9)

Due to the exploit-, CVE-, and CWE-databases not allowing search operators, searches were performed manually for each of the keywords in fields A and B. The archives of security conferences were searched in the same manner.
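To show how the ID-based string above expands into concrete search terms, the following sketch assembles it from the Table 4.2 keywords. The quoting and boolean syntax accepted by each database differ, so this is only an illustration of the expansion; the blank B12 entry in the table is left out.

```javascript
// Expand the ID-based search string into concrete terms from Table 4.2.
const A = ['smart home assistant', 'smart speaker'];
const B = ['Amazon Echo', 'Google Home', 'Amazon Tap', 'JBL Link', 'Polk Command Bar',
  'UE Blast', 'Harman Kardon', 'Harman Kardon Allure', 'Apple Homepod',
  'Lenovo SmartDisplay', 'Yandex Station', 'Clova Wave'];
const C = ['weakness', 'flaw', 'malicious', 'threat', 'risk', 'vulnerability',
  'attack', 'security', 'exploit'];

const group = (terms) => '(' + terms.map((t) => `"${t}"`).join(' OR ') + ')';
console.log(`${group([...A, ...B])} AND ${group(C)}`);
```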

4.2.2 Quality Assessment Criteria

For the threat assessment of SHAs, the criteria shown in the list below were applied.

• Does the paper present any vulnerabilities or protection methods in SHAs?

– This criterion is passed if the paper presents vulnerabilities or protection methods in SHAs.

• Are the methodologies and results clearly presented?

– This criterion is passed if the presented research methodology follows a clear and reproducible structure.

• Are the findings credible?

– This criterion is passed if the reported findings are backed up by credible methods and sourced.

• Are negative findings mentioned in the paper? If so, are they presented?

– This criterion is passed if the paper also clearly presents any negative findings.

4.3 Threat Assessment of Similar Systems

As similar systems are also of interest and within the scope, this assessment focused on vulnerabilities found in similar systems which could potentially affect SHAs, answering RQ2 and RQ3 respectively. The assessment used the SLR as the instrument of choice.

4.3.1 Keywords

The keywords in Table 4.3 were derived using the results of the threat assessment of SHAs as a reference. The keywords for the threat assessment of similar systems were, similarly to the SHA threat assessment, categorised and given an ID. Category A contains general terms. Category B contains different components found in similar systems according to the threat assessment of SHAs, and Category C contains the types of attacks and threats related to our research.

Table 4.3: Search keywords, sorted by category, used in the threat assessment of systems similar to smart home assistants

Category A: A1=IoT
Category B: B1=voice interface, B2=speech interface, B3=automatic speech recognition, B4=voice command interface, B5=speaker verification
Category C: C1=weakness, C2=flaw, C3=malicious, C4=threat, C5=vulnerability, C6=attack, C7=security, C8=exploit

The following search string was used in the chosen databases for finding vulnerabilities in systems similar to SHAs.

(A1) and (B1 or B2 or B3 or B4 or B5) and (C1 or C2 or C3 or C4 or C5 or C7 or C8 or C9)

4.3.2 Quality Assessment Criteria

For the threat assessment of similar systems, the criteria shown in the list below were applied.

• Does the paper present any vulnerabilities or protection methods in systems similar to SHAs?

– This criterion is passed if the paper presents vulnerabilities or protection methods in systems similar to SHAs.

• Are the methodologies and results clearly presented?

– This criterion is passed if the presented research methodology follows a clear and reproducible structure.

• Are the findings credible?

– This criterion is passed if the reported findings are backed up by credible methods and sourced.

• Are negative findings mentioned in the paper? If so, are they presented?

– This criterion is passed if the paper also clearly presents any negative findings.

4.4 Threat Modelling

The STRIDE threat model was applied to locate threats and protection methods not found during the systematic literature reviews, further answering RQ2 and RQ3. One analysis was performed, focusing on a generalised smart home environment. First, threats were identified for the target system. Second, possible protection methods for the found threats were identified.

4.4.1 Generalised STRIDE Analysis

We chose the approach of using a generalised system for the first STRIDE analysis. The reasoning behind a generalised approach is the possibility of identifying vulnerabilities and threats without relying on specific components. Furthermore, without the limitation of targeting specific components, the scope for threat identification can be made more inclusive. The analysed system is described in Figure 4.1 below.

Figure 4.1: A generalised system targeted in the STRIDE analysis process

4.5 Experiment Design

To practically verify whether vulnerabilities could affect SHAs, we performed a quasi-experiment with one test group consisting of two SHAs: the Amazon Echo and the Google Home. A quasi-experiment was chosen as it allows us to estimate the impact of each independent variable towards a target without any random selection [49]. The experiment directly answers RQ2, and due to the scope of only two SHA devices, there is no control group. The experiment was also "within-subject" [50], meaning that each SHA was the target of the same type of attacks (independent variable) and the result (dependent variable) was measured. However, the adaptation and execution of each attack were uniquely modelled for each SHA. Performing the same attacks against the two SHAs facilitated a fair comparison. During the experiments, multiple scale values determined an overall value, gauging the severity of each attack. The primary variables which influenced the experiment are as follows.

• Dependent variable: An overall scale value, gauging the severity of each attack.

• Independent variable: Multiple attacks attempting to exploit vulnerabilities in the SHA.

• Controlled variables: Ambient noise, network size/population, presence of others during experiments, and the SHA itself. However, others may be discovered and used during later stages.

To score our vulnerabilities, we employed a modified mean value consisting of the values technical difficulty and potential impact. The dependent variable δ was therefore calculated as

δ = x · (y + z) / 2

where x ∈ {0, 1}, with x = 1 and x = 0 representing the attack being successful and unsuccessful respectively. The variable y ∈ {1, ..., 5} describes the technical difficulty of implementing the attack and z ∈ {1, ..., 5} describes the potential impact of the attack. All variables were gauged by the authors, with higher values indicating a lower technical skill requirement or a higher potential impact. Finally, the scoring of the vulnerability is δ ∈ {0, ..., 5}, where a higher δ entails higher severity.
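As a worked example with illustrative numbers (not taken from the experiment results): a successful attack (x = 1) judged to require little technical skill (y = 4) and to have a moderate potential impact (z = 3) scores δ = 1 · (4 + 3) / 2 = 3.5, while any unsuccessful attack (x = 0) scores δ = 0 regardless of y and z.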

4.5.1 Experiment Environment

The experiment environment was a closed room within a large office environment. As the room was not soundproof, noise could permeate from the surrounding outside. Within the room were an access point, the target SHA, the authors, and the devices and equipment needed to facilitate the attack being tested. All devices within the room were connected to the same access point. Such attack-facilitating equipment is listed under each specific attack.

4.5.2 Functionality Test of SHA

Before performing the experiments, each SHA underwent a functionality test to ensure the SHA functioned as expected. The test was evaluated with a binary value, meaning that the SHA could either pass or fail. The functionality test entailed the following.

• Perform initial device setup using coupled device instructions.

• Query the device to report the current weather in Stockholm, Sweden.

• Query the device to perform a unit conversion of 4 feet to centimetres.

4.5.3 Chosen Attacks

The same attacks were performed on both SHA devices. Performing the same attacks allowed for comparing, discussing, and drawing conclusions regarding the security of the devices. Furthermore, as both devices are popular in a home environment, testing the same attacks clarified whether they are threats to the home environment or the SHA itself. When selecting attacks for the experiment phase, specific aspects were considered: first, whether the attack, if successful, would be detrimental to the SHA or its users; and second, whether there was a possibility to adapt and perform the attack within the thesis scope.

4.5.4 Experiment Layout

To provide a clear overview of the tested attacks, each attack is presented as a unique entity rather than as part of the experiment. Each entity also states the original attack, originating from either the threat assessments or the threat modelling process. The attacks are presented based on the following categories.

• Attack goal: The goal of an attacker performing the attack.

• Tools and equipment: Tools and equipment that are imperative to the attack.

• Adaptation steps: Steps taken to adapt the attack towards both SHAs.

• Scenario: A scenario describing how the attack could occur.

• Execution: The process that occurs as the attack takes place towards both SHAs.

The experiment results are presented separately in the Results chapter.

4.6 Experiment Execution

For the experiment, four types of independent variables were investigated. Each variable, representing one attack, and the SHA component the attack targets are presented in Table 4.4 below. The execution, origin, and result of each attack are covered in the Results chapter. In addition to the attack selection criteria given in Section 4.5.3, the chosen attacks acted as "samples" from the most explored areas of the vulnerabilities found. These areas are the voice interface, represented by attacks A1 and A2, and the network traffic to and from the SHA, represented by attack A3. Finally, privacy is addressed through attack A4.

Table 4.4: Attacks and their corresponding target, found during the threat assessments, chosen for the experimentation phase

ID Target Attack

A1  Voice interface    An adversary can record audio of a legitimate command and replay it to execute the command again.
A2  Voice interface    Voice commands can be hidden in recordings.
A3  SHA API            API functionality can be abused to reveal sensitive information to an adversary.



A4  SHA authorisation  Privacy-infringing or harmful SHA functionality can be accessed without authorisation.

In the following subsections, the methodology for testing each independent variable is described.

4.6.1 Replay Attack

This attack corresponds to attack A1 in Table 4.4.

Attack Goal

The goal of this attack was to use pre-recorded voice commands, replayed via a loudspeaker, to trigger SHA functionality.

Tools and Equipment

Samsung Galaxy S8 with Android 9. The application used for voice recording is called "Voice Recorder"2, version 21.1.01.10 (as of 2019-04-02).

Adaptation Steps

This step was not needed for this attack.

Scenario

A user talks on the phone and plans to book a meeting with a business partner. Since their phone is in their hand, they ask the Google Home device for calendar information, which is recorded by an attacker. When the user is not present, the attacker replays the command in order to compromise the user's private calendar information.

Execution

Both authors were actors. One posed as the user of the SHA, the other as the attacker, who recorded a command made by the user. The attacker then replayed the command. The attack was performed twice, once with voice identification disabled and once with voice identification enabled. The term voice identification encompasses the Voice Match feature on a Google Home device and the Voice Profile feature present on an Amazon Echo.

2 https://play.google.com/store/apps/details?id=com.sec.android.app.voicenote

4.6.2 Adversarial Attack Using Psychoacoustic Hiding

This attack corresponds to attack A2 in Table 4.4.

Attack Goal

The attack goal was to perform an in-person adversarial attack through psychoacoustic hiding to trigger the SHA. However, the adversarial attack could also be executed in a remote manner through, for example, television transmissions.

Tools and Equipment

The equipment for this attack was a virtual machine3 running Ubuntu 18.04 (16 GB RAM, 6 vCPUs) and a MacBook Pro (13-inch, mid 2014) running macOS Mojave 10.14.4 for playing the adversarial sounds over its loudspeakers. The tools used to generate the adversarial sounds were Kaldi4 and Matlab5.

Adaptation: Attack Training Phase

To generate the attack, Kaldi needs to train a speech model. This model was trained and given to us by the researchers of the original attack. After training was completed, Matlab was used to extract audio thresholds for the speech model [51]. Continuing, Kaldi generated the adversarial attack using two files, "target-utterences.txt" and "target". The first file was a library of phrases for Kaldi to use. The second file specified exactly which phrase from "target-utterences.txt" to use when generating the adversarial noise. The specification of phrases was done by matching an ID within "target" with the number of the phrase from "target-utterences.txt". The content of "target-utterences.txt" looked like the following.

OK Google what is the weather in Stockholm
Hey Alexa what is the weather in Stockholm
All your base are belong to us

The content of the file "target" looked like the following.

001 119
002 125
003 138

Scenario

A user listens to music on a streaming platform. An attacker has uploaded an album to the platform containing songs prepared with the adversarial noise. The user listens to the songs, which act as a cover for the audible adversarial noise, making it more difficult for the end user to hear. Once the songs are playing, the SHA picks up the hidden commands and extracts personal information for the attacker.

3 https://www.digitalocean.com/
4 http://kaldi-asr.org/
5 https://www.mathworks.com/products/matlab.html

Execution

We used a loudspeaker to replay the audio file with the adversarial phrase embedded. Two audio files were created, one for each SHA. The process was repeated for both assistants with their corresponding adversarial audio file.

4.6.3 Harmful API Behaviour

This attack corresponds to attack A3 in Table 4.4.

Attack Goal

The goal of this attack was to access hidden API functionality of SHAs that could provide an attacker with sensitive user information or allow for harmful device control. Furthermore, the attacker had no access to authentication data.

Tools and Equipment

The equipment used in this experiment was the target SHA and a Lenovo Thinkpad E480 laptop acting as the access point. The laptop ran Windows 10 Pro Version 10.0.17763 Build 17763. The tools used throughout the experiment were the curl utility6 and JavaScript.

Adaptation Steps

As the original paper describing the API attack targets the Google Home, the only adaptation that had to be made was that of the attack delivery [52]. We created a prepared web page which, when visited, executes code with the end result of accessing the sensitive SHA API. The page itself scanned and detected active devices on the network, sending specific API requests towards them.

Adapting the attack for the Amazon Echo required further research to pinpoint precisely which API calls were available. Therefore, basic research was used to find any unofficially documented API calls [39]. In the same manner, the unofficially documented API calls of the Google Home were found7. Otherwise, the same process and tools were used between the two SHAs.

Scenario

An attacker prepares a malicious web page and hosts it online. A user visits the web page from a device on their home network. Furthermore, the home network contains an installed SHA. Once the user visits the page, the code will execute within their browser and scan for local devices. The malicious code will then send two API requests to all devices found. If the target device is an SHA, it will reply with the results of a scan of all nearby access points. Finally, the resulting access point data is sent back to the attacker.

6 https://curl.haxx.se/
7 https://rithvikvibhu.github.io/GHLocalApi/

Execution

There is an Amazon Echo API available. However, it was locked behind authentication and authorisation. As the experiment is performed without credentials allowing authorised access to the attacker, no adaptation was made. Therefore, the rest of the section will cover the experiment methodology for the Google Home API.

The two API calls revealing access points close to the Google Home device are described as a scan initialiser and a scan result fetcher. The scan initialiser is a POST API call to x.x.x.x:8008/setup/scan_wifi and the scan result fetcher is a GET API call to x.x.x.x:8008/setup/scan_results, where x.x.x.x is the local IP address of the device. Neither of the API calls required any specific arguments, and only the latter returns any content. To confirm the validity of the API calls, we used the data-transfer utility curl to sequentially execute the following commands against the Google Home device.

• curl -X POST http://x.x.x.x:8008/setup/scan_wifi

• curl -X GET http://x.x.x.x:8008/setup/scan_results

Next, a JavaScript implementation of the API calls was created. A web page containing a JavaScript routine then performed three steps, acting as the delivery method for the API calls. First, the local IP of the web page visitor was determined through WebRTC8. Second, every address within the last octet range of the local IP was scanned. As JavaScript has no native ping functionality, an attempt to load an image from each address was made, acting as a ping substitute. Third, the two API calls were sent as POST and GET requests to every address that responded to the image load attempt. The JavaScript received the response, if any, and displayed it in the console.
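The following is a minimal sketch of what such a delivery page could look like. It is not the exact code used in the experiment: the helper names, the probe timeout, and the probed resource path are our own, the WebRTC trick assumes a 2019-era browser that still exposes the local IP in ICE candidates, the page is assumed to be served over plain HTTP, and whether the GET response can actually be read cross-origin depends on the browser and on the CORS behaviour of the device at the time.

// Sketch only: scan the visitor's /24 and send the two Google Home API calls.
function getLocalIp() {
  return new Promise((resolve) => {
    const pc = new RTCPeerConnection({ iceServers: [] });
    pc.createDataChannel("probe");
    pc.onicecandidate = (event) => {
      if (!event.candidate) return;
      const match = /(\d{1,3}(?:\.\d{1,3}){3})/.exec(event.candidate.candidate);
      if (match) { resolve(match[1]); pc.close(); }
    };
    pc.createOffer().then((offer) => pc.setLocalDescription(offer));
  });
}

// "Ping" substitute: try to load an image from the host and treat a quick
// onload/onerror as a sign that something is listening at that address.
function probeHost(ip) {
  return new Promise((resolve) => {
    const img = new Image();
    const timer = setTimeout(() => resolve(null), 1500);
    img.onload = img.onerror = () => { clearTimeout(timer); resolve(ip); };
    img.src = "http://" + ip + ":8008/favicon.ico"; // illustrative probe path
  });
}

async function scanAndQuery() {
  const localIp = await getLocalIp();
  const prefix = localIp.split(".").slice(0, 3).join(".");
  const probes = [];
  for (let host = 1; host < 255; host++) probes.push(probeHost(prefix + "." + host));
  const alive = (await Promise.all(probes)).filter(Boolean);
  for (const ip of alive) {
    const base = "http://" + ip + ":8008/setup";
    try {
      await fetch(base + "/scan_wifi", { method: "POST", mode: "no-cors" }); // start the scan
      const reply = await fetch(base + "/scan_results"); // fetch nearby access points
      console.log(ip, await reply.text());               // mirrors the curl output
    } catch (err) {
      console.log(ip, "no readable response", err);
    }
  }
}

scanAndQuery();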

4.6.4 Unauthorised SHA Functionality This attack corresponds to attack A4 in Table 4.4.

Attack Goal The goal of this attack was to perform harmful commands or reveal personal information through an unauthorised SHA user.

Tools and Equipment No equipment besides the SHA was needed.

8 https://webrtc.org/

Adaptation Steps During preparation, each SHA device went through its initial set-up stages. One user account was added and provided with a phone number and email address. As the Google Home can perform voice authorisation through its Voice Match feature, the Google Home with this feature enabled was treated as a separate device throughout the experiment. The same separation of targets was applied to the Amazon Echo and its Voice Profile service.

Scenario An attacker gains access to an Internet-connected speaker inside the home of a user. The attacker issues a voice command through the speaker, "Send a text message to number 1234", where the number 1234 corresponds to a pay-by-text service run by the attacker. As a text message is sent from the user’s phone, resulting from a command via the SHA, a payment is made from the unknowing user’s account to the payment service of the attacker.

Execution This experiment was executed through an actor querying the Amazon Echo, Amazon Echo with Voice Profile enabled, Google Home, and Google Home with Voice Match enabled. Voice Match and Voice Profile had been set up by one actor, while the querying actor had not previously used either device. The privacy-related queries were “what is my email address”, “read my email”, “when is my birthday”, “what is my home address”, “send an email”, “make a call”, “send a text message”, “visit a web page”, and “purchase toilet paper”.

Chapter 5 Results

The Results chapter displays the results of our research. The Threat Status of Smart Home Assistants section and its subsections cover vulnerabilities and protection methods towards SHAs. The Threat Status on Similar Systems section, in turn, covers vulnerabilities and protection methods towards similar systems. The Threat Modelling section covers threats and protection methods generated by the STRIDE threat modelling process. Finally, the Threat Validation on SHA section presents all results from the thesis experiment.

5.1 Threat Status of Smart Home Assistants Table 5.1 below describes the number of papers found during the systematic literature review when applying the search strings described in Table 4.2.

Table 5.1: The number of papers found through each database for the threat assessment of smart home assistants

Database | Paper Count | Search Date
BTH Summon | 271 | 2019-02-18
Google Scholar | 3168 | 2019-02-19
IEEE Xplore | 12 | 2019-02-19
ScienceDirect | 147 | 2019-02-19
CVE | 1 | 2019-02-19
Sum of Papers | 3599 |


Table 5.2 below describes the number of papers remaining after the application of the selection criteria described in Section 4.1.2.

Table 5.2: Papers remaining for the threat assessment of smart home assistants, after application of selection criteria

Criteria | Paper Count
Paper is a duplicate | -1304
Full-text not available online | -12
Paper not written in English | -2
Paper published after 2018 | -53
Paper not originating from conference, journal, or peer-reviewed source | -2
Paper not academically sound | -26
Irrelevant title | -1973
Irrelevant abstract | -105
Irrelevant conclusion | -21
Irrelevant full paper | -52
Sum of Papers | 49

After the application of the selection criteria, 49 papers remained from the original collection. Due to the size of the resulting set, an automatic duplication check was the first criterion applied. The relevancy of the title was investigated next, resulting in the removal of papers that may also have fit under other criteria. Full-text availability checks were among the last to be performed, together with the irrelevant full paper criterion. After the removal of papers not fitting our quality assessment criteria according to Section 4.2.2, the final thesis data set consists of 24 papers. This last reduction of papers is due to unclear methodologies or non-credible findings. To answer RQ1 and RQ3, data extraction, coupled with a synthesis of the papers, was performed. The synthesis allocation is shown in Table 5.3 below, where an "X" indicates which RQ a paper answers.

Table 5.3: Papers remaining for the threat assessment of smart home assistants, after application of quality assessment criteria

Number of Papers | RQ1 | RQ3
14 | X | X
8 | X |
2 | | X

5.1.1 Vulnerabilities The systematic literature review shows that there are no comprehensive mappings of threats towards SHAs. However, this does not mean no threat mappings exist.

For example, Lei et al. [13] describe threat mappings and the exploitation of two SHAs. Gong and Poellabauer [53] target a specific component in the voice interface of SHAs, while Haack et al. [15] focus entirely on a specific SHA device. Chung et al. [54] focus instead on the voice assistant found within a specific device. Table 5.4 describes all vulnerabilities found during the literature review, thereby answering RQ1. Different keywords are used to describe the target SHA(s):

• Verified on, meaning one or more papers verify the vulnerability towards an SHA.

• General, meaning the vulnerability could possibly be exploited on SHAs, as it has been verified on voice assistants found on such devices.

Table 5.4: Vulnerabilities discovered during the threat assessment of smart home assistants

ID | Description | Target SHA(s) | Reference(s)
SV1 | A malicious, overt command can be issued. This could be from a distance, such as through a window or door, or in close proximity to the SHA. | Verified on Amazon Echo and Google Home | [55], [56], [57], [13]
SV2 | A legitimate command can be recorded and repeated, creating a replay attack. | Verified on Amazon Echo and Google Home | [55], [56], [53], [57], [58]
SV3 | The SHA API allows for a forced scan of ambient WiFi, displaying sensitive information as a result. Such a scan can be performed remotely, meaning sensitive information can be leaked. | Verified on Google Home | [52]
SV4 | Authorisation codes can be compromised during the SHA authentication procedure. Using these codes, an adversary can spy on the user's voice history. | Verified on Google Home | [52]
SV5 | Through machine learning, the functionality of encrypted traffic generated by voice commands can be inferred. | Verified on Amazon Echo and Google Home | [52], [54], [59], [60]
SV6 | Traffic originating from the SHA can reveal when the device is being used or if it exists, even when encrypted. | Verified on Amazon Echo and Google Home | [61], [54], [59]
SV7 | Voice activation of the SHA can be triggered by anyone. | Verified on Amazon Echo, General | [61], [54], [13]
SV8 | Covert audio commands, inaudible or indecipherable to the user, can be issued by an adversary and picked up by the SHA. | Verified on Amazon Echo and Google Home | [61], [53], [34], [62], [63], [64], [57], [58]
SV9 | SHAs may be compromised by malware. | General | [54]
SV10 | Operating system flaws may lead to self-triggered voice command injection attacks. | General | [53], [57]
SV11 | Voice commands can be issued which are interpreted differently by the user and the SHA, relying on ambiguity or misunderstanding. | Verified on Amazon Echo and Google Home | [53], [65], [15], [31], [30]
SV12 | Flaws in the underlying machine learning algorithms of SHAs can allow for voice commands being generated, issued, and interpreted differently by the end user and the SHA. | Verified on Amazon Echo and Google Home | [57]
SV13 | Voice commands could be hidden inside other media, making them difficult to detect for the user. | General | [66]
SV14 | The pin used for securing payments through the SHA can be bruteforced via voice commands. | Verified on Amazon Echo | [15]
SV15 | The Bluetooth protocol used in SHAs can be vulnerable to general Bluetooth attacks. | Verified on Amazon Echo and Google Home | [67]
SV16 | Custom commands (actions and skills for Google Home and Amazon Echo, respectively) can pretend to hand over control to another command while collecting the data themselves. | Verified on Amazon Echo and Google Home | [30]
SV17 | Custom commands (actions and skills for Google Home and Amazon Echo, respectively) can pretend to terminate themselves while still recording conversations. | Verified on Amazon Echo and Google Home | [30]

The threat assessment shows that most vulnerabilities are found in the voice interface of the SHAs, enabling both overt and covert adversarial audio commands. Another well-researched vulnerability made possible by the voice interface, but caused by issues within the SHA software, is ambiguous commands. Furthermore, the network traffic, while encrypted, is another area of focus for multiple papers, with different methods proposed to classify the traffic generated by SHAs, often based on machine learning.

5.1.2 Protection Methods The threat assessment also partially answered RQ3, as Table 5.5 below describes.

Table 5.5: Protection methods discovered during the threat assessment of smart home assistants

ID | Description | Addresses Vulnerability | Reference(s)
SP1 | Through determining the distance between an SHA and a mobile device, malicious commands can be detected and stopped. | SV1, SV2, SV7, SV8, SV10, SV11, SV12, SV13, SV14 | [55]
SP2 | Through authenticating the speaker via a challenge, replay attacks can be prevented. | SV1 | [56], [53]
SP3 | Decoupling of voice input and output can resist self-triggered replay attacks. | SV10 | [53]
SP4 | Adversarial training can resist machine-learning based attacks. | SV12 | [53]
SP5 | Detecting whether a voice command is issued through a speaker or by a human can prevent multiple attacks. | SV2, SV8, SV10, SV11, SV12, SV13, SV14 | [53], [58]
SP6 | Encrypting DNS queries can prevent an adversary from identifying SHAs in the network. | SV5, SV6 | [59]
SP7 | Tunnelling SHA traffic through a VPN can prevent an adversary from correlating traffic to an individual device. | SV6 | [59]
SP8 | Through altering traffic from the SHA, the possibility to infer user behaviour is reduced. | SV5, SV6 | [59]
SP9 | Downsampling the input audio before handling it in the SHA can prevent hidden voice commands. | SV13 | [66]
SP10 | Set up voice recognition, preventing unauthorised personnel from using the SHA. | SV7, SV8, SV10, SV11, SV13 | [68]
SP11 | SHA malware detection can be performed through machine learning. | SV9 | [69]
SP12 | Word- and phoneme-based analysis can be performed to reduce ambiguity between voice commands. | SV11 | [31]
SP13 | Through detecting motion and proximity to the SHA, adversarial voice commands can be prevented. | SV1, SV2, SV7, SV8, SV10, SV11, SV12, SV13, SV14 | [13]
SP14 | Analysis of the response from a voice command can be used to identify illegitimate commands. | SV16, SV17 | [30]
SP15 | The user's intention can be analysed to identify whether context switches are appropriate for the current command. | SV16 | [30]

Most protection methods discovered in the systematic literature review are designed for one or two vulnerabilities, while four stand out in addressing multiple vulnerabilities: two are based on proximity to the SHA, while another is based on recognising the primary user of the SHA. The fourth focuses on differentiating whether a command was issued by a person or a speaker, a possible protection method for most audio-based attacks. Figure 5.1 below shows how many protection methods apply to each found SHA vulnerability. The vulnerabilities SV3, SV4, and SV15 stand out as having no corresponding protection methods, whereas SV13 is the most addressed vulnerability, associated with five protection methods.


Figure 5.1: The number of protection methods addressing each vulnerability found during threat assessment of SHAs

5.2 Threat Status on Similar Systems Table 5.6 below describes the number of papers found during the systematic literature review when applying the search strings described in Table 4.3.

Table 5.6: The number of papers found through each database for the threat assessment of similar systems

Database | Paper Count | Search Date
BTH Summon | 52 | 2019-02-27
Google Scholar | 904 | 2019-02-26
IEEE Xplore | 1 | 2019-02-27
ScienceDirect | 10 | 2019-02-27
CWE | 1 | 2019-02-26
Exploit Database | 1 | 2019-02-26
Sum of Papers | 969 |

A total of 969 papers were found in the databases. Whenever possible, filters were applied to only fetch papers with the full text available online, released between 2014 and 2018 in a conference, journal, or peer-reviewed source. Most papers were removed due to having irrelevant titles or being duplicates. Additionally, several findings were patents or magazine articles. These were deemed not academically sound and therefore removed.

For the number of papers removed under each criterion, see Table 5.7.

Table 5.7: Papers remaining for the threat assessment of similar systems, after application of selection criteria

Criteria | Paper Count
Paper is a duplicate | -182
Full-text not available online | -42
Paper not written in English | -34
Paper not published between 2014 and 2018 | -53
Paper not originating from conference, journal, or peer-reviewed source | -5
Paper not academically sound | -133
Irrelevant title | -400
Irrelevant abstract | -85
Irrelevant conclusion | -13
Irrelevant full paper | -1
Sum of Papers | 21

A total of 21 papers remained after our selection criteria process and were subjected to quality assessment. The quality assessment phase ensures that the remaining papers uphold the quality criteria according to Section 4.3.2. The quality assessment phase resulted in a data set of 16 papers. The reduction of papers was primarily caused by inconclusive methodologies or vulnerabilities not being presented. As the aim of the review is to answer RQ2 and RQ3, we used data extraction and synthesis to see which papers answered the thesis RQs. Table 5.8 below shows how many of the 16 papers answer each RQ, where an "X" indicates which RQ is answered.

Table 5.8: Papers remaining for the threat assessment of similar systems, after application of quality assessment criteria

Number of Papers | RQ2 | RQ3
8 | X | X
3 | X |
5 | | X

5.2.1 Vulnerabilities The review resulted in 12 different vulnerabilities in systems similar to SHAs, as seen in Table 5.9 below, answering RQ2. While the found papers focus on different systems, SHAs may be affected as well and are therefore added to the target systems when applicable.

Table 5.9: Vulnerabilities discovered during the threat assessment of similar systems

ID | Description | Target System(s) | Reference(s)
RV1 | The lack of authentication protocols allows attackers to perform potentially harmful commands. | Voice interface, SHA | [70], [58], [71]
RV2 | Speaker replay attacks allow attackers to record and replay commands. | Voice interface, SHA | [66], [35], [53], [58]
RV3 | Weaknesses in software and operating systems could be exploited for malicious purposes. | Any software system | [53], [72]
RV4 | Hardware flaws could allow for synthetic speech to be generated via hardware noise. | Voice controlled systems | [53]
RV5 | The underlying machine learning algorithm within the device could be exploited to misunderstand given commands. | Voice interface | [53]
RV6 | Non-acoustic sensors can be used to establish speech. | Systems with sensors | [73], [74], [75]
RV7 | Devices that use location-based services could be attacked. | Any device using location-based services | [58]
RV8 | Applications that allow for greater than needed privilege could be exploited. | Voice interface | [76], [71]
RV9 | Network jamming through emission of radio waves can interfere with functionality. | All devices that communicate using radio waves | [71]
RV10 | Traffic could be captured and analysed in order to retrieve valuable information. | All devices that communicate using radio waves | [71]
RV11 | Vulnerable devices with Bluetooth Low Energy (BLE) capability could be tracked. | Devices with BLE functionality | [77]
RV12 | Commands could be hidden as background noise in audio files to force voice interface systems to perform illicit tasks. | Voice interface | [66], [51]

The review shows that most vulnerabilities in systems similar to SHAs concern weak authentication protocols. A few vulnerabilities found in the review were also novel for this thesis: these papers used sensors such as accelerometers and gyroscopes to retrieve and synthetically generate speech.

5.2.2 Protection Methods Table 5.10 below describes protection methods found during the threat assessment of similar systems.

Table 5.10: Protection methods discovered during the threat assessment of similar systems

ID | Description | Addresses Vulnerability | Reference(s)
RP1 | Using session-based or two-factor challenge response to improve authentication on voice-based systems. | RV1 | [56], [66], [70]
RP2 | Using ASR, ASV and ASI to determine whether sound is coming from a speaker. | RV1, RV2, RV12 | [35], [57], [58], [71], [78], [79]
RP3 | Secure sensor data which could potentially reveal sensitive information. | RV1, RV6, RV7, RV12 | [70], [73], [80]
RP4 | Controlling the data flow on voice interface systems could maintain privacy. | None | [76]
RP5 | Using more robust machine learning methods could stop psychoacoustic attacks. | RV12 | [51]

There were five types of protection methods presented in the review, partially answering RQ3. One is speaker recognition using ASV and ASI techniques, aimed at stopping replay attacks. Another problem is weak authentication, for which certain papers propose a solution through two-factor authentication using challenge response, or through sessions with wearable devices and sensors such as an accelerometer. Sensors leaking private information is another issue, solvable with encryption. A smart home device firewall, providing granular data control, would also prevent overall data leakage. To secure voice interface systems, one paper proposes altering the underlying machine learning algorithms. Figure 5.2 below shows how many protection methods apply to each vulnerability found for similar systems. The vulnerabilities RV3, RV4, RV5, RV8, RV9, RV10, and RV11 have no corresponding protection methods, whereas RV1 and RV12 are the most addressed, with three protection methods each. Overall, few protection methods were found during the threat assessment of similar systems. Several vulnerabilities without protection methods are unspecific and large-scale. An example is the operating system vulnerability RV3, which targets a complex system, complicating the design of one specific protection method.


Figure 5.2: The number of protection methods addressing each vulnerability found during threat assessment of similar systems

5.3 Threat Modelling The following section describes threats and protection methods found during the generalised SHA threat modelling process, described in Section 2.6.

5.3.1 Possible Threats The identified threats in the modelled home environment are shown in Table 5.11 below.

Table 5.11: Threats identified through the STRIDE process, targeting a modelled home environment

ID | Target Vector | Type | Attack
T1 | Bluetooth between phone and SHA | Information Disclosure | An attacker could use Bluetooth scanning to determine whether an SHA exists on a network, using that knowledge in planning further attacks.
T2 | Bluetooth between phone and SHA | Denial of Service | The Bluetooth protocol could get overwhelmed by illegitimate requests, preventing legitimate usage.
T3 | Phone to SHA | Spoofing | Through spoofed API calls, SHA-related information could be extracted.
T4 | Phone to SHA | Tampering | Unencrypted API calls could be intercepted and modified.
T5 | Phone to SHA | Repudiation | API functionality could allow for the forgetting of paired Bluetooth devices, decreasing repudiation.
T6 | Phone to SHA | Denial of Service | WiFi deauthentication commands could be sent towards the SHA, preventing regular connectivity functionality.
T7 | Voice interface | Elevation of Privilege | API calls could be used to gain access to information which an attacker was previously unauthorised to access.
T8 | SHA to Internet (via the router) | Information Disclosure | WiFi password and personal information could be saved unencrypted in the SHA-provider cloud.
T9 | Smartphone to IoT device | Elevation of Privilege | API calls only accessible to the internal network could be exploited through an external source (e.g. JavaScript).
T10 | Internet to Router | Denial of Service | The SHA could be knocked off the Internet, reverting to a setup mode and allowing for a hostile takeover.

Due to the threat modelling structure, many of the threats described in Table 5.11 focus on the communication channels within the modelled SHA environment. This focus is shown in threats T1 and T2, which concern Bluetooth communication. Another example is found in T6 and T8, focusing instead on WiFi. Furthermore, T3, T4, T5 and T9 target the API interface of the SHA through, for example, remotely accessing local API functionality.

5.3.2 Protection Methods The following are protection methods identified through the threat analysis.

Table 5.12: Smart home assistant protection methods identified via threat modelling

ID | Description | Addresses Threat
TP1 | Mask the Bluetooth access point through encryption. | T1, T2
TP2 | Only enable Bluetooth manually when pairing is needed. | T1, T2
TP3 | Place API calls behind authentication. | T3, T4, T5, T7, T9
TP4 | Encrypt API calls. | T3, T4, T5, T7, T9
TP5 | Store passwords in an encrypted format. | T8
TP6 | Use tokens generated via the user logon account for authorisation. | T3, T4, T5, T7, T9
TP7 | Use a different subnet for IoT devices. | T6, T9

The protection methods found during the threat modelling process, shown in Table 5.12, primarily propose masking information, as in TP1, TP4, and TP5. Authentication and authorisation would also prevent several threats, as shown in methods TP3 and TP6. Figure 5.3 below shows how many protection methods apply to the SHA threats generated during the threat modelling process. T10 is the only threat not addressed, whereas every other threat except T6 and T8 is addressed through multiple protection methods.


Figure 5.3: The number of protection methods addressing each SHA threat generated during the threat modelling process

5.4 Threat Validation on SHA All SHAs passed the initial functionality tests. The chosen attacks are briefly described in Table 5.13.

Table 5.13: Attacks and their corresponding targets, found during the threat assessments, chosen for the experimentation phase

ID | Attack | Vulnerabilities | Result
A1 | We record a user uttering a command. The command is then replayed to the SHA via internal loudspeakers. | The attack exploits vulnerability SV2 in Table 5.4 and RV2 in Table 5.9, which allow for the replaying of legitimate commands. | The attack is successful against both Google Home and Amazon Echo. The vulnerability score is δ = 4 for both SHAs. For explanation, see Section 5.4.1.
A2 | By hiding voice commands in an audio recording using adversarial noise, we attempt to trigger a response from the SHA. | The attack tries to use indistinguishable adversarial noise in an audio file to trigger SHA commands. The vulnerability is RV12 in Table 5.9. | Both Google Home and Amazon Echo proved to be resistant to adversarial noise attacks.
A3 | We create a website that uses JavaScript to access the API of a Google Home device to reveal sensitive information. | The attack exploits the unauthenticated and unencrypted API on a Google Home device. The vulnerability, SV3, is found in Table 5.4. | The attack is successfully exploited using different API calls. The vulnerability score for Amazon Echo is δ = 0, whereas the vulnerability score for Google Home is δ = 3. See Section 5.4.2 for explanation.
A4 | Multiple commands aimed towards extracting privacy-infringing information were tested against both SHAs. | The attack tests the ability to extract information from SHAs using only the voice. The tests were performed twice: first with voice identification off, and then with voice identification enabled. The vulnerability is SV7 in Table 5.4 and RV1 in Table 5.9. | This vulnerability has varying results, depending on which command was uttered. The vulnerability score for Amazon Echo is δ = 4, whereas the vulnerability score for Google Home is δ = 3.5. See Section 5.4.3.

The attacks covered in the following subsections were successful; the attacks not given their own subsection were unsuccessful and received the score δ = 0.

5.4.1 Replay Attack The following explains the results of the replay attack. The attack was performed with voice identification both enabled and disabled. The results of the replay attack are found in Table 5.14, where successful attacks are marked with an "X".

Table 5.14: Results of replay attacks towards Amazon Echo and Google Home, marking successful attacks, which return calendar information, with "X"

Experiment type | Amazon Echo | Google Home
Replay of calendar information command | X | X
Replay of calendar information command (with voice identification enabled) | X | X

For this vulnerability δ = 4 for both Amazon Echo and Google Home. The reasoning is that the vulnerability was successfully exploited (x = 1), the technical difficulty is low (y = 5), and the perceived impact is moderate (z = 3).
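For reference, the δ values reported in this chapter are consistent with combining the components as δ = x · (y + z)/2; for this attack, 1 · (5 + 3)/2 = 4. This is our reading of how the reported components relate to the reported scores, not a restatement of the scoring model itself.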

5.4.2 Harmful API Behaviour None of the Amazon Echo API calls found imply a loss of privacy when issued without authentication. Therefore, the rest of this subsection only contains the results for the Google Home. Executing the API calls towards the Google Home through the curl utility resulted in the following output; once a user visits the prepared web page, the same response can be seen in the browser console.

[ { "ap_list":[ { "bssid":"xx:xx:xx:xx:xx:xx", "frequency":5240, "signal_level":-69 } ], "bssid":"xx:xx:xx:xx:xx:xx", "signal_level":-69, "ssid":"name", "wpa_auth":7, "wpa_cipher":4, "wpa_id":0 5.4. Threat Validation on SHA 43

} ]

For this vulnerability δ = 3. The reasoning is that the vulnerability in this experiment is successfully exploited (x = 1), the technical difficulty is moderate (y = 3), and the perceived impact is moderate (z = 3).

5.4.3 Unauthorised SHA Functionality The information obtained from both SHAs, without any authentication, can be seen in Table 5.15 below.

Table 5.15: Result of privacy-infringing and harmful queries towards Amazon Echo and Google Home

Query / Devices | Amazon Echo | Amazon Echo with Voice Profile | Google Home | Google Home with Voice Match
What is my email address? | Not supported. | Not supported. | Not supported. | Not supported.
Read my email. | Reads latest unread emails. | Reads latest unread emails. | Not supported. | Not supported.
When is my birthday? | Not supported. | Not supported. | Reveals birth date. | Not authorised.
What is my home address? | Not supported. | Not supported. | Reveals home address. | Not authorised.
Send an email. | Replies to an unread email. | Replies to an unread email. | Not supported. | Not supported.
Make a call. | Makes a call (USA & Mexico numbers only). | Makes a call (USA & Mexico numbers only). | Not supported. | Not supported.
Send a text message. | Not supported. | Not supported. | Not supported. | Not supported.
Visit the webpage www.bth.se. | Not supported. | Not supported. | Not supported. | Not supported.
Purchase toilet paper. | Adds to the cart, but purchase is not confirmed. | Adds to the cart, but purchase is not confirmed. | Not supported. | Not supported.

For this vulnerability δ = 4 for the Amazon Echo. The reasoning is that the vulnerability in this experiment is successfully exploited (x = 1), the technical difficulty is low (y = 5), and the perceived impact is moderate (z = 3). The Google Home has a delta value of δ = 3.5, due to the vulnerability being successfully exploited (x = 1), the technical difficulty being low (y = 5), and the perceived impact being low (z = 2).

5.4.4 Threat Validation Summary This section summarises the delta values (δ) of each threat validation attempt and are shown in Table 5.16 below. Furthermore, the standard deviation of each device’s vulnerability score is calculated as r P (x − m)2 σ = n where m is the mean value of δ, n is the amount of δ-values, and x is the δ itself [81].

Table 5.16: Delta-value summary of threat validation at- tempts towards Google Home and Amazon Echo

Attack / Devices | Amazon Echo | Google Home
A1 | 4 | 4
A2 | 0 | 0
A3 | 0 | 3
A4 | 4 | 3.5
Number of attacks with zero-value (δ) | 2 | 1
Standard deviation (σ) | 2 | 1.6
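As a worked check of Table 5.16 using the formula above: the Amazon Echo scores {4, 0, 0, 4} give a mean m = 2 and σ = √((4 + 4 + 4 + 4)/4) = 2, while the Google Home scores {4, 0, 3, 3.5} give m = 2.625 and σ = √(9.6875/4) ≈ 1.56, matching the reported value of 1.6.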

Chapter 6 Analysis and Discussion

This chapter contains a general discussion on the implications of SHA security in a home environment, answers to our RQs, and a discussion on the thesis validity threats.

6.1 Research Implications The results show that SHAs with protection methods enabled are safe to use in a home environment. These protection methods learn the user's voice, restricting access and information from other users. However, out of the box these features are disabled and opt-in, meaning users could disregard them. Another problem is that even if the protection methods stop the most rudimentary attacks, SHAs are still at risk: other, more technically advanced attacks, such as embedding indistinguishable commands in songs or movies, can still affect the device. Another interesting aspect is how an SHA fits, in terms of security, in a smart home environment. As SHAs usually have the functionality to control thermostats, lights, TVs, and speakers, to mention a few, it is critical that this hub of functionality remains intact and secure. The thesis results show that no found vulnerability allows for a complete takeover giving an attacker granular control of all SHA functionality. However, it is worth noting that other devices within the home environment could potentially be targeted and used as an attack vector against SHAs. The more Internet-connected devices there are, the larger the attack surface a potential attacker has. The privacy aspect of SHAs is a common question, as plenty of vulnerabilities allow for breaching the user's privacy. An example is the Google Home internal API: while internal, it can be accessed from the Internet to reveal data such as alarms, reminders, and nearby access points. The privacy aspect is even more important to consider when using a smart home assistant in an environment where sensitive information is shared. Such an environment may be the home of a political activist, a workplace, and so on. If this is the case, consider whether using a smart home assistant is in your best interest.


6.2 Research Question Analysis

RQ1: What smart home assistant security vulnerabilities have been reported in academic sources or in vulnerability and weakness databases? Based on the 24 extracted academic sources, 17 types of vulnerabilities towards the two SHAs investigated in this study were identified. The vulnerability and weakness databases returned two vulnerabilities, both of which were removed after failing to meet our quality assessment criteria. Twelve types of vulnerabilities are verified to exist on Amazon Echo, and twelve are verified to exist on Google Home. An overlap between vulnerabilities towards Amazon Echo and Google Home exists, as ten vulnerabilities exist on both systems. Three vulnerabilities are exclusively general, meaning they have not been verified on an SHA but instead on a smart virtual assistant. The most vulnerable component of SHAs is the voice interface, being the direct or indirect cause of 12 vulnerabilities. The most documented vulnerability is covert audio commands which are either inaudible or indecipherable to the user, as described in eight papers. Another type of vulnerability is eavesdropping, made possible only through custom commands for Amazon Echo and Google Home. For the complete list of SHA vulnerabilities, see Table 5.4.

RQ2: What additional types of vulnerabilities can be identified and to what extent could they affect smart home assistants? The second literature review, consisting of 16 different papers, reported nine additional vulnerabilities which could potentially affect SHAs. One new type of vulnerability, reported in two papers, relies on the generation of synthetic speech: one uses hardware flaws and vibrations (RV4), while another uses non-acoustic sensors such as accelerometers and gyroscopes (RV6). Another type of vulnerability is facilitated through the use of radio wave communication, resulting in four vulnerabilities as reported in three papers. One disrupts all radio wave communication (RV9), while another focuses on traffic sniffing (RV10). The two other vulnerabilities focus on user tracking (RV7, RV11). Additionally, two software vulnerabilities were reported in four papers. One exploits a weakness in a generalised operating system to retrieve information (RV3). The other exploits high-privilege applications to gain further access to a system (RV8). The last type of vulnerability is found in the machine learning algorithms present in many ASV, ASR, and ASI systems, as reported in one paper. The machine learning algorithm could misunderstand commands, resulting in the execution of unintended commands (RV5). For all vulnerabilities found in the second literature review, see Table 5.9. The types of vulnerabilities presented could affect SHAs to different degrees. The first literature review showed that the voice interface component of SHAs has multiple vulnerabilities, meaning the new synthetic speech-based vulnerabilities will most likely affect it as well. Furthermore, as SHAs communicate using WiFi, they will be affected by radio wave communication vulnerabilities, although that type of attack will presumably affect all devices using radio wave communication, meaning the vulnerability is not inherently an SHA issue. The same can be said for software vulnerabilities, as these could affect any software system. The vulnerabilities found in the machine learning algorithms of voice interface systems likely apply to SHAs. This verdict is based on the already existing, similar vulnerabilities within the SHAs' machine learning algorithms responsible for voice processing, as found in the first literature review.

RQ3: Which protection methods are used to safeguard against vulnerabilities identified in RQ1-2 and is there need for additional protection methods? Twenty protection methods were found during the two literature reviews, with the STRIDE analysis resulting in an additional seven protection methods. Three protection methods, as described in five papers, require close physical proximity. One uses glasses fitted with a gyroscope and an accelerometer to establish session-based authentication (RP1). The second detects the motion of a speaker to determine proximity (SP13), while the third uses a mobile phone instead of a pair of glasses to determine the distance between the SHA and the speaker (SP1). Securing the traffic to and from the SHA is another proposed protection approach found in both literature reviews and the STRIDE analysis, resulting in five methods. One method is the use of a VPN tunnel to hide SHA traffic (SP7). Similarly, two methods propose encrypting DNS and API calls to obscure common SHA functionality (SP6, TP4). Additionally, another method proposes the obfuscation of SHA traffic (SP8). The final method proposes an application firewall to filter traffic originating from all types of smart devices (RP4). There are three vulnerabilities from the first literature review that do not have any corresponding protection methods (SV3, SV4, SV15). Similarly, there are seven vulnerabilities in the second literature review that do not have any protection mechanism (RV3, RV4, RV5, RV8, RV9, RV10, RV11). The STRIDE analysis resulted in one threat without a protection method (T10). However, one protection method from the second literature review did not have any corresponding vulnerability (RP4). The literature reviews show that many of the vulnerabilities do affect SHAs, meaning additional protection methods are needed. The most effective protection method would be only allowing commands from a speaker near the device (SP1, SP13, RP1). Detecting whether a voice command is issued through a speaker or by a human is the second-most effective (SP5, RP2).

6.3 Literature Reviews

The literature reviews produced satisfactory results; however, there are validity threats. A possible issue for construct validity is the pre-operational explication of a similar system in the second systematic literature review. By creating the definition early on, similar systems outside the definition scope may be lost, resulting in a less comprehensive map of threats. Major components of the SHAs were extracted and used to both define similar systems and address construct validity. These components include IoT technology and voice-based interfaces. Internal validity is addressed through the use of specific inclusion and exclusion criteria. Furthermore, quality assessment criteria were used to assess the remaining papers. To ensure the validity of these criteria, Kitchenham's guidelines were used as a basis when determining said criteria. To address external validity, we chose to include multiple SHAs from different regions within the literature review scope. The SHAs chosen for the literature reviews were found in the initial stages of the thesis. Both researchers conducted basic research to find a range of SHAs existing in different markets. Even with this range of different devices, almost all studies focused on either Google Home or Amazon Echo. A reason for this focus might be that the Google Home and Amazon Echo hold the majority of market shares. Furthermore, the underlying voice assistant solutions created by Google and Amazon are often used in other SHAs. Therefore, our results, which focus on Google Home and Amazon Echo, are still applicable. Acting as both a threat and a method for addressing the reliability of the literature reviews is the two-part selection of papers. As both authors perform the quality assessment, we ensure the reliability of the final set of papers. However, it is not feasible for both authors to perform the initial exclusion process during both literature reviews due to the number of papers. Therefore, the personal interpretation of the author during such an exclusion process is a possible threat towards reliability. The number of papers may also have resulted in an incautious initial omission, meaning papers that could potentially contain new vulnerabilities while upholding our standards were excluded, affecting the thesis reliability. Furthermore, academic database issues may have impacted our results, as explicit search operators and the exporting of papers were not always supported. The lack of explicitly stated operators means that we have to trust the implicit operators in the databases themselves. The missing export functionality meant we had to create custom export scripts or rely on third-party programs to gather our literature.

6.4 Threat Modelling

We chose a generalised STRIDE approach for this thesis. Instead of performing an in-depth analysis of a system including specific smart home devices, all such devices were generalised. This approach was chosen as specific IoT devices are considered out of scope. Choosing a specific approach would generate device-specific information beyond SHAs, resulting in a loss of value as fewer resources would be allocated to detecting SHA-specific vulnerabilities and protection methods. Another alteration of the STRIDE methodology is the iteration count. In practice, the STRIDE analysis is performed in multiple iterations, whereas we chose to do only one. We deem that performing one iteration still provides the study with enough value to warrant such a decision. Furthermore, this work is meant to act as a broad foundation for the tracking of SHA vulnerabilities, meaning further iterations would consume resources while returning less value to the thesis.

6.5 Experiments The experiments have differing success rates. An example is the adversarial attack based on psychoacoustic hiding which, while unsuccessful, enables the discussion of indistinguishable sound-based attack vectors. For example, the DolphinAttack uses ultrasonic sounds outside of the human hearing spectrum to send commands to SHAs [34]. Similarly, the adversarial attack based on psychoacoustic hiding uses background noise which humans either ignore or perceive as interference [82]. The reason behind the failure of the adversarial attack is potentially the audio file being played via loudspeakers. In the attack paper, the audio file containing the adversarial commands is fed directly into the targeted system, meaning the system has access to all data in the audio file and “hears” it directly. An issue regarding construct validity may arise as we compare the two devices within the thesis scope. The devices are exposed to the same independent variables, and even though the main functionality is the same between Google Home and Amazon Echo, there are some differences. This discrepancy might affect the results, given that in the experiment we place both devices in the same test group. A threat towards the internal validity of the experiment is the selection bias of the independent variables. Since the chosen independent variables are assumed to cover a few different areas of vulnerabilities, they are influenced by bias from both the authors' knowledge and the results from the literature reviews. Furthermore, the version numbers of both SHAs could be another threat towards internal validity. The results are based on specific version numbers and may not be applicable in the future, depending on patches and software updates. This problem may, therefore, harm the robustness and reproducibility of the experiments. External validity may be affected as the experiments only test threats on two SHAs. This approach creates a bias due to the literature review consisting of a broader range of devices than the more focused scope of the experiments. It is possible that our technical skill level is not sufficient to fully understand and adapt the chosen attacks towards SHAs, threatening the experiment reliability. This lack of skill means that an experiment that yields a negative result could, with more understanding and knowledge, yield a positive result. Another issue regarding reliability is the introduction of potential data loss and interference in two stages: the loudspeakers and the microphone. If the loudspeakers are not capable of reproducing all aspects of the adversarial audio file, data loss is unavoidable. The same reasoning can be applied to the SHA microphone. If the microphone is unable to record the full spectrum of the adversarial audio file, the attack may fail.

6.5.1 Features Not Supported While testing privacy-infringing and unauthorised functionality, some chosen commands were not supported on each device. This lack of support is the case for sending text messages on the Amazon Echo. However, note that in Table 5.15, if one SHA produces "Not supported" results, it does not necessarily imply that the device is less vulnerable. The device may support the tested functionality through other steps, such as Amazon Echo supporting texting through its calling service.

Consequently, the results should not be interpreted as showing which device is the safest, but rather act as an audit of whether simple, easy-to-access, privacy-infringing or possibly harmful commands can be executed without authorisation. The cause of commands not being supported could be region locking. As our experiments were performed on devices purchased in Sweden, functionality that exists in other regions may not exist in ours. The locale of both devices was set to English and a VPN service was used to reduce the possibility of region-locked functionality. However, this did not appear to enable functionality such as sending text messages.

6.5.2 Vulnerability Score During the threat validation process we calculated the standard deviation of both devices' vulnerability scores, as seen in Table 5.16. From the results we see that the Amazon Echo deviates more regarding the vulnerability score, possibly due to having a broader threat surface. As for attack severity, with both the zero-value attacks and the small sample size in mind, the argument can be made that the attacks affecting the devices do not have a high impact. This argument strengthens our conclusion that SHAs are safe enough to use in a home environment. Finally, a larger sample size for the threat validation would provide more accurate results.

Chapter 7 Conclusions and Future Work

This thesis shows that several vulnerabilities exist in SHAs. The academic sources investigating SHAs provide the most significant number of vulnerabilities, most of which are verified by the researchers through experiments against the devices. The vulnerabilities found in the academic context of similar systems are mostly conceptual and theoretical, with few tested against the SHA devices. Vulnerability databases and security publications provide few vulnerabilities, which in turn are often quickly patched and resolved by the vendors. Many of the discovered vulnerabilities with a possible high impact regarding security or privacy, targeting either SHAs or similar systems, are difficult to exploit. Conversely, the low-impact vulnerabilities require a lower technical skill to exploit. High-impact vulnerabilities are often used in targeted attacks, meaning that the risk of a typical home end-user becoming the victim of such an attack is reduced. The protection methods discovered do not cover all types of vulnerabilities reported in this thesis. However, the existing protection methods within the tested SHAs, if activated, provide essential protection against low-impact, opportunistic vulnerabilities, which we deem more likely to be exploited. Therefore, we regard the existing security risks as not severe enough for ordinary end-users to warrant avoiding the devices altogether. However, end-users who would be particularly affected by an attack, for example due to political ties or line of work, should be aware of the risks.

7.1 Future Works

Several future studies in the thesis domain are viable. One approach would be to verify the threats found in this thesis, examining whether they are present in a home environment. Another approach could delve deeper into adversarial attacks, extending the context of the experiments. As the experiments show, sending adversarial audio files over the air could degrade the data transfer. Therefore, feeding the audio file directly into the SHAs for processing may produce different results. Direct audio input could be approached through creating an application which processes audio files on the SHAs, or through the SHA vendors allowing for direct audio file input. Another approach would be the creation of a formal model for securing SHAs. As this thesis provides both protection methods and multiple types of vulnerabilities, it could function as a cornerstone for such a model. Privacy is another possible approach for future studies.

While manufacturers claim to uphold end-user privacy, incidents have shown that this is not always the case1, meaning further investigation is justified. Possible privacy investigation cases include whether the SHA listens more than claimed by the manufacturer, what data is sent to manufacturer servers, and whether that data is anonymised. To investigate these cases, one could perform traffic analysis or gain internal access to the devices themselves.

1 https://www.bloomberg.com/news/articles/2019-04-10/is-anyone-listening-to-you-on-alexa-a-global-team-reviews-audio

Bibliography

[1] F. Bentley, C. Luvogt, M. Silverman, R. Wirasinghe, B. White, and D. Lottr- jdge, “Understanding the Long-Term Use of Assistants,” Pro- ceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Tech- nologies, vol. 2, no. 3, pp. 1–24, 2018, issn: 24749567. doi: 10.1145/3264901. [2] D. Yu and L. Deng, Automatic Speech Recognition, ser. Signals and Communi- cation Technology. London: Springer London, 2015, isbn: 978-1-4471-5778-6. doi: 10.1007/978-1-4471-5779-3. [3] W. Vogels, Bringing the Magic of Amazon AI and Alexa to Apps on AWS. 2016. [Online]. Available: https://www.allthingsdistributed.com/2016/ 11/amazon-ai-and-alexa-for-all-aws-apps.html (visited on 2019-01-31). [4] J. Lau, B. Zimmerman, and F. Schaub, “Alexa, Are You Listening?” Proceed- ings of the ACM on Human-Computer Interaction, vol. 2, no. CSCW, pp. 1–31, 2018, issn: 25730142. doi: 10.1145/3274371. [5] M. B. Hoy, “Alexa, , , and More: An Introduction to Voice Assis- tants,” Medical Reference Services Quarterly, vol. 37, no. 1, pp. 81–88, 2018, issn: 0276-3869. doi: 10.1080/02763869.2018.1404391. [6] C. Kolias, G. Kambourakis, A. Stavrou, and J. Voas, “DDoS in the IoT: Mirai and Other Botnets,” Computer, vol. 50, no. 7, pp. 80–84, 2017, issn: 0018-9162. doi: 10.1109/MC.2017.201. [7] K. Fong, K. Hepler, R. Raghavan, and P. Rowland, “rIoT : Quantify- ing Consumer Costs of Insecure Internet of Things Devices,” p. 48, 2018. [Online]. Available: https : / / pdfs . semanticscholar . org / 7396 / 8dfe4ab7c885ab5d7b51815d3b25d8d92640.pdf. [8] N. Apthorpe, D. Reisman, S. Sundaresan, A. Narayanan, and N. Feamster, “Spying on the Smart Home: Privacy Attacks and Defenses on Encrypted IoT Traffic,” IEEE Transactions on Network and Service Management, vol. 6, no. 2, pp. 110–121, 2017, issn: 1932-4537. arXiv: 1708.05044. [9] J. P. Kesan and C. M. Hayes, “Bugs in the Market: Creating a Legitimate, Transparent, and Vendor-Focused Market for Software Vulnerabilities,” SSRN Electronic Journal, 2016, issn: 1556-5068. doi: 10.2139/ssrn.2739894. arXiv: arXiv:1011.1669v3. [10] D. A. Orr and L. Sanchez, “Alexa , did you get that? Determining the evi- dentiary value of data stored by the Amazon® Echo,” Digital Investigation, vol. 24, pp. 72–78, 2018, issn: 17422876. doi: 10.1016/j.diin.2017.12.002.


[11] Z. Piotrowski and P. Gajewski, “Voice spoofing as an impersonation attack and the way of protection,” Journal of Information Assurance and Security, vol. 2, pp. 223–225, 2007. [12] S. K. Ergunay, E. Khoury, A. Lazaridis, and S. Marcel, “On the vulnerability of speaker verification to realistic voice spoofing,” in 2015 IEEE 7th International Conference on Biometrics Theory, Applications and Systems (BTAS), IEEE, 2015, pp. 1–6, isbn: 978-1-4799-8776-4. doi: 10.1109/BTAS.2015.7358783. [13] X. Lei, G.-H. Tu, A. X. Liu, C.-Y. Li, and T. Xie, “The Insecurity of Home Dig- ital Voice Assistants - Vulnerabilities, Attacks and Countermeasures,” in 2018 IEEE Conference on Communications and Network Security (CNS), IEEE, 2018, pp. 1–9, isbn: 978-1-5386-4586-4. doi: 10.1109/CNS.2018.8433167. [14] Canalys, “Google beats Amazon to first place in smart speaker market,” Canalys Newsroom, no. April, pp. 0–2, 2018. [Online]. Available: https://www. canalys.com/static/press_release/2018/Pressrelease230518GooglebeatsAmazontofirstplaceinsmartspeakermarket. pdf. [15] W. Haack, M. Severance, M. Wallace, and J. Wohlwend, “Security Analysis of the Amazon Echo,” PDF, 2017, p. 14. [Online]. Available: https://pdfs. semanticscholar . org / 35c8 / 47d63db1dd2c8cf36a3a8c3444cdeee605e4 . pdf. [16] Google, Control Google Home by voice - Google Home Help. [Online]. Available: https://support.google.com/googlehome/answer/7207759?hl=en&ref_ topic=7196346 (visited on 2019-02-18). [17] LINE, [Japan]LINE Launches Smart Speaker ‘Clova WAVE’, 2017. [Online]. Available: https://linecorp.com/th/pr/news/en/2017/1894 (visited on 2019-03-01). [18] HARMAN International Industries Incorporated, Harman Kardon Allure | Voice-activated speaker specification sheet, 2017. [Online]. Available: https:// www.harmankardon.com/on/demandware.static/-/Sites-masterCatalog_ Harman/default/dw974d74ce/pdfs/HK_Allure_Spec_Sheet_English.pdf. [19] Amazon, Alexa Voice Service, 2019. [Online]. Available: https://developer. amazon.com/alexa-voice-service (visited on 2019-03-01). [20] Google, Google Assistant SDK | Google Assistant SDK for devices | Google Developers, 2019. [Online]. Available: https : / / developers . google . com / assistant/sdk/ (visited on 2019-03-01). [21] B. Li et al., “Acoustic Modeling for Google Home,” in Interspeech 2017, ISCA: ISCA, 2017, pp. 399–403. doi: 10.21437/Interspeech.2017-234. [22] Google, Actions on Google | Actions on Google | Google Developers. [Online]. Available: https://developers.google.com/actions/ (visited on 2019-02- 20). [23] M. Massé, REST API Design Rulebook. 2012, isbn: 9781449310509. [24] P. Siriwardena, Advanced API Security. Berkeley, CA: Apress, 2014, isbn: 978- 1-4302-6818-5. doi: 10.1007/978-1-4302-6817-8. BIBLIOGRAPHY 57

[25] Z. Su and G. Wassermann, “The essence of command injection attacks in web applications,” in Conference record of the 33rd ACM SIGPLAN-SIGACT sym- posium on Principles of programming languages - POPL’06, New York, New York, USA: ACM Press, 2006, pp. 372–382, isbn: 1595930272. doi: 10.1145/ 1111037.1111070. [26] Y. Baştanlar and M. Özuysal, “Introduction to Machine Learning,” in Intro- duction to Machine Learning, 2014, pp. 105–128, isbn: 9780262012430. doi: 10.1007/978-1-62703-748-8_7. arXiv: 0904.3664. [27] J. P. Campbell, “Speaker recognition: a tutorial,” Proceedings of the IEEE, vol. 85, no. 9, pp. 1437–1462, 1997, issn: 00189219. doi: 10.1109/5.628714. [28] D. A. Reynolds, “An overview of automatic speaker recognition technology,” in IEEE International Conference on Acoustics Speech and Signal Process- ing, IEEE, 2002, pp. IV–4072–IV–4075, isbn: 0-7803-7402-9. doi: 10.1109/ ICASSP.2002.5745552. [29] ISO/IEC, “INTERNATIONAL STANDARD ISO / IEC Information technol- ogy — Security techniques — Information security management systems — Overview and,” vol. 2018, p. 38, 2018. [Online]. Available: http://k504.khai. edu/attachments/article/819/ISO_27000_2014.pdf. [30] N. Zhang, X. Mi, X. Feng, X. Wang, Y. Tian, and F. Qian, “Understanding and Mitigating the Security Risks of Voice-Controlled Third-Party Skills on Amazon Alexa and Google Home,” Tech. Rep. 2, 2018. arXiv: 1805.01525. [31] D. Kumar et al., “Skill squatting attacks on amazon alexa,” 27th USENIX Se- curity Symposium, pp. 33–47, 2018. [Online]. Available: https://www.usenix. org/system/files/conference/usenixsecurity18/sec18-kumar.pdf. [32] S. Hernan, S. Lambert, T. Ostwald, and A. Shostack, “Threat modeling-uncover security design flaws using the stride approach,” MSDN Magazine, 2006, issn: 1528-4859. [33] Microsoft, The STRIDE Threat Model | Microsoft Docs. [Online]. Available: https://docs.microsoft.com/en- us/previous- versions/commerce- server/ee823878(v=cs.20) (visited on 2019-02-19). [34] G. Zhang, C. Yan, X. Ji, T. Zhang, T. Zhang, and W. Xu, “DolphinAttack,” in Proceedings of the 2017 ACM SIGSAC Conference on Computer and Commu- nications Security - CCS ’17, ACM, New York, New York, USA: ACM Press, 2017, pp. 103–117, isbn: 9781450349468. doi: 10.1145/3133956.3134052. [35] X. Yuan et al., “All Your Alexa Are Belong to Us: A Remote Voice Con- trol Attack against Echo,” in 2018 IEEE Global Communications Conference (GLOBECOM), IEEE, 2018, pp. 1–6, isbn: 978-1-5386-4727-1. doi: 10.1109/ GLOCOM.2018.8647762. [36] S. Chen et al., “You Can Hear But You Cannot Steal: Defending Against Voice Impersonation Attacks on Smartphones,” in 2017 IEEE 37th Interna- tional Conference on Distributed Computing Systems (ICDCS), IEEE, 2017, pp. 183–195, isbn: 978-1-5386-1792-2. doi: 10.1109/ICDCS.2017.133. 58 BIBLIOGRAPHY

[37] M. Ford and W. Palmer, “Alexa, are you listening to me? An analysis of Alexa voice service network traffic,” Personal and Ubiquitous Computing, vol. 23, no. 1, pp. 67–79, 2019, issn: 1617-4909. doi: 10.1007/s00779-018-1174-x.
[38] H. Chung et al., “Leadership and strategy in the news,” New Scientist, Chandos Information Professional Series, vol. 28, no. 1, M. Levin-Epstein, Ed., p. 1, 2018, issn: 0262-4079. doi: 10.1007/s00146-017-0733-4. arXiv: 1707.08696.
[39] H. Chung, J. Park, and S. Lee, “Digital forensic approaches for Amazon Alexa ecosystem,” Digital Investigation, vol. 22, pp. S15–S25, 2017, issn: 17422876. doi: 10.1016/j.diin.2017.06.010. arXiv: 1707.08696.
[40] J. Bugeja, A. Jacobsson, and P. Davidsson, “On Privacy and Security Challenges in Smart Connected Homes,” in 2016 European Intelligence and Security Informatics Conference (EISIC), IEEE, 2016, pp. 172–175, isbn: 978-1-5090-2857-3. doi: 10.1109/EISIC.2016.044.
[41] L. Rafferty, F. Iqbal, and P. C. K. Hung, “A Security Threat Analysis of Smart Home Network with Vulnerable Dynamic Agents,” in Computing in Smart Toys, 2017, pp. 127–147. doi: 10.1007/978-3-319-62072-5_8.
[42] V. Vijayaraghavan and R. Agarwal, “Security and Privacy Across Connected Environments,” in Security and Privacy Across Connected Environments, 2017, pp. 19–39. doi: 10.1007/978-3-319-70102-8_2.
[43] E. Zeng, S. Mare, and F. Roesner, “End User Security and Privacy Concerns with Smart Homes,” in Proceedings of the 13th Symposium on Usable Privacy and Security, 2017, isbn: 9781931971393.
[44] A. Sivanathan, F. Loi, H. H. Gharakheili, and V. Sivaraman, “Experimental evaluation of cybersecurity threats to the smart-home,” in 2017 IEEE International Conference on Advanced Networks and Telecommunications Systems (ANTS), IEEE, 2017, pp. 1–6, isbn: 978-1-5386-2347-3. doi: 10.1109/ANTS.2017.8384143.
[45] B. Kitchenham et al., “Preliminary guidelines for empirical research in software engineering,” IEEE Transactions on Software Engineering, vol. 28, no. 8, pp. 721–734, 2002, issn: 0098-5589. doi: 10.1109/TSE.2002.1027796.
[46] B. Kitchenham and S. Charters, “Guidelines for performing Systematic Literature Reviews in Software Engineering, Version 2.3,” Engineering, 2007.
[47] A.-W. Harzing, Publish or Perish, 2007. [Online]. Available: https://harzing.com/resources/publish-or-perish.
[48] A. Fink, Conducting Research Literature Reviews, 4th ed. SAGE Publications, Inc., 2013, isbn: 9781452259499.
[49] B. D. Meyer, “Natural and quasi-experiments in economics,” Journal of Business and Economic Statistics, 1995, issn: 15372707. doi: 10.1080/07350015.1995.10524589.
[50] G. Charness, U. Gneezy, and M. A. Kuhn, “Experimental methods: Between-subject and within-subject design,” Journal of Economic Behavior & Organization, vol. 81, no. 1, pp. 1–8, 2012, issn: 01672681. doi: 10.1016/j.jebo.2011.08.009.

[51] L. Schönherr, K. Kohls, S. Zeiler, T. Holz, and D. Kolossa, “Adversarial Attacks Against Automatic Speech Recognition Systems via Psychoacoustic Hiding,” vol. 54, no. 1, 2018, issn: 00295515. arXiv: 1808.05665.
[52] Y. Jia, Y. Xiao, J. Yu, X. Cheng, Z. Liang, and Z. Wan, “A Novel Graph-based Mechanism for Identifying Traffic Vulnerabilities in Smart Home IoT,” in IEEE INFOCOM 2018 - IEEE Conference on Computer Communications, IEEE, 2018, pp. 1493–1501, isbn: 978-1-5386-4128-6. doi: 10.1109/INFOCOM.2018.8486369.
[53] Y. Gong and C. Poellabauer, “An Overview of Vulnerabilities of Voice Controlled Systems,” arXiv preprint arXiv:1803.09156, 2018. arXiv: 1803.09156.
[54] H. Chung, M. Iorga, J. Voas, and S. Lee, “Alexa, Can I Trust You?” Computer, vol. 50, no. 9, pp. 100–104, 2017, issn: 0018-9162. doi: 10.1109/MC.2017.3571053.
[55] L. Blue, H. Abdullah, L. Vargas, and P. Traynor, “2MA: Verifying Voice Commands via Two Microphone Authentication,” Proceedings of the 2018 ACM ASIA Conference on Computer and Communications Security, pp. 89–100, 2018. doi: 10.1145/3196494.3196545.
[56] Y.-T. Chang, A Two-layer Authentication Using Voiceprint for Voice Assistants. digital.lib.washington.edu, 2018. [Online]. Available: https://digital.lib.washington.edu/researchworks/handle/1773/42134.
[57] Y. Gong and C. Poellabauer, “Protecting Voice Controlled Systems Using Sound Source Identification Based on Acoustic Cues,” in 27th International Conference on Computer Communication and Networks (ICCCN), IEEE, 2018, pp. 1–9, isbn: 978-1-5386-5156-8. doi: 10.1109/ICCCN.2018.8487334.
[58] L. Blue, L. Vargas, and P. Traynor, “Hello, Is It Me You’re Looking For?” in Proceedings of the 11th ACM Conference on Security & Privacy in Wireless and Mobile Networks - WiSec ’18, New York, New York, USA: ACM Press, 2018, pp. 123–133, isbn: 9781450357319. doi: 10.1145/3212480.3212505.
[59] N. Apthorpe, D. Reisman, and N. Feamster, “Closing the Blinds: Four Strategies for Protecting Smart Home Privacy from Network Observers,” arXiv preprint arXiv:1705.06809, 2017. arXiv: 1705.06809.
[60] R. B. Jackson and T. Camp, “Machine learning for encrypted Amazon Echo traffic classification,” PhD thesis, Colorado School of Mines, 2018. [Online]. Available: https://dspace.library.colostate.edu/handle/11124/172223.
[61] C. Jackson and A. Orebaugh, “A study of security and privacy issues associated with the Amazon Echo,” International Journal of Internet of Things and Cyber-Assurance, vol. 1, no. 1, p. 91, 2018, issn: 2059-7967. doi: 10.1504/IJITCA.2018.090172.
[62] N. Roy, “Inaudible acoustics: Techniques and applications,” PhD thesis, University of Illinois at Urbana-Champaign, 2018. [Online]. Available: https://www.ideals.illinois.edu/handle/2142/102475.

[63] L. Song and P. Mittal, “Inaudible Voice Commands,” arXiv preprint arXiv:1708.07238, 2017. arXiv: 1708.07238.
[64] N. Roy, S. Shen, H. Hassanieh, and R. R. Choudhury, “Inaudible Voice Commands: The Long-Range Attack and Defense,” 15th USENIX Symposium on Networked Systems Design and Implementation (NSDI 18), pp. 547–560, 2018. [Online]. Available: https://www.usenix.org/system/files/conference/nsdi18/nsdi18-roy.pdf.
[65] T. Vaidya, Y. Zhang, M. Sherr, and C. Shields, “Cocaine Noodles: Exploiting the Gap between Human and Machine Speech Recognition,” 9th USENIX Workshop on Offensive Technologies (WOOT 15), 2015. [Online]. Available: https://www.usenix.org/system/files/conference/woot15/woot15-paper-vaidya.pdf.
[66] X. Yuan et al., “CommanderSong: A Systematic Approach for Practical Adversarial Voice Recognition,” 27th USENIX Security Symposium (USENIX Security 18), pp. 49–64, 2018. arXiv: 1801.08535.
[67] A. M. Lonzetta, P. Cope, J. Campbell, B. J. Mohd, and T. Hayajneh, “Security vulnerabilities in Bluetooth technology as used in IoT,” Journal of Sensor and Actuator Networks, vol. 7, no. 3, p. 28, 2018. doi: 10.3390/jsan7030028.
[68] T. Anscombe et al., “IoT and Privacy By Design in the Smart Home,” PDF, 2017. [Online]. Available: https://www.welivesecurity.com/wp-content/uploads/2018/02/ESET_MWC2018_IoT_SmartHome.pdf.
[69] N. An, A. M. Duff, M. R. S. Noorani, S. Weber, and S. Mancoridis, “Malware Anomaly Detection on Virtual Assistants,” Proceedings of the 2018 International Conference on Malicious and Unwanted Software (MALWARE ’18), 2018. [Online]. Available: https://www.cs.drexel.edu/~spiros/papers/malcon2018b.pdf.
[70] H. Feng, K. Fawaz, and K. G. Shin, “Continuous Authentication for Voice Assistants,” in Proceedings of the 23rd Annual International Conference on Mobile Computing and Networking - MobiCom ’17, New York, New York, USA: ACM Press, 2017, pp. 343–355, isbn: 9781450349161. doi: 10.1145/3117811.3117823.
[71] Y. Meng, W. Zhang, H. Zhu, and X. S. Shen, “Securing Consumer IoT in the Smart Home: Architecture, Challenges, and Countermeasures,” IEEE Wireless Communications, vol. 25, no. 6, pp. 53–59, 2018, issn: 1536-1284. doi: 10.1109/MWC.2017.1800100.
[72] N. Asrir, Windows Speech Recognition - Buffer Overflow, 2018. [Online]. Available: https://www.exploit-db.com/exploits/45077.
[73] M. Yang, T. Zhu, Y. Xiang, and W. Zhou, “Density-Based Location Preservation for Mobile Crowdsensing With Differential Privacy,” IEEE Access, vol. 6, pp. 14779–14789, 2018, issn: 2169-3536. doi: 10.1109/ACCESS.2018.2816918.

[74] J. Han, A. J. Chung, and P. Tague, “PitchIn: Eavesdropping via intelligible speech reconstruction using non-acoustic sensor fusion,” in Proceedings of the 16th ACM/IEEE International Conference on Information Processing in Sensor Networks - IPSN ’17, New York, New York, USA: ACM Press, 2017, pp. 181–192, isbn: 9781450348904. doi: 10.1145/3055031.3055088.
[75] S. A. Anand and N. Saxena, “Speechless: Analyzing the Threat to Speech Privacy from Smartphone Motion Sensors,” in 2018 IEEE Symposium on Security and Privacy (SP), IEEE, 2018, pp. 1000–1017, isbn: 978-1-5386-4353-2. doi: 10.1109/SP.2018.00004.
[76] I. Zavalyshyn, N. O. Duarte, and N. Santos, “HomePad: A Privacy-Aware Smart Hub for Home Environments,” in 2018 IEEE/ACM Symposium on Edge Computing (SEC), IEEE, 2018, pp. 58–73, isbn: 978-1-5386-9445-9. doi: 10.1109/SEC.2018.00012.
[77] G. Solmaz and F.-J. Wu, “Together or alone: Detecting group mobility with wireless fingerprints,” in 2017 IEEE International Conference on Communications (ICC), IEEE, 2017, pp. 1–7, isbn: 978-1-4673-8999-0. doi: 10.1109/ICC.2017.7997426.
[78] V. Tiwari, M. F. Hashmi, A. Keskar, and N. C. Shivaprakash, “Virtual home assistant for voice based controlling and scheduling with short speech speaker identification,” Multimedia Tools and Applications, pp. 1–26, 2018. doi: 10.1007/s11042-018-6358-x.
[79] Z. Wu, Z. Peng, and T. Yu, “Application Research of Voiceprint Recognition Technology in Mobile E-commerce Security,” in Proceedings of the 2018 2nd International Conference on Economic Development and Education Management (ICEDEM 2018), Paris, France: Atlantis Press, 2018, isbn: 978-94-6252-642-6. doi: 10.2991/icedem-18.2018.119.
[80] F. A. Mansoori and C. Y. Yeun, “Emerging new trends of location based systems security,” in 2015 10th International Conference for Internet Technology and Secured Transactions (ICITST), IEEE, 2015, pp. 158–163, isbn: 978-1-9083-2052-0. doi: 10.1109/ICITST.2015.7412078.
[81] C. Jogréus, Matematisk statistik med tillämpningar, 2nd ed. Studentlitteratur, 2009, p. 380, isbn: 978-91-44-05449-0.
[82] L. Schönherr, K. Kohls, S. Zeiler, T. Holz, and D. Kolossa, “Adversarial Attacks Against Automatic Speech Recognition Systems via Psychoacoustic Hiding,” NDSS Symposium 2019, vol. 54, no. 1, 2018, issn: 00295515. arXiv: 1808.05665.

Appendices


Appendix A Permission Forms

This appendix contains permission forms for reusing parts of publications, such as figures or tables.

A.1 Permission IEEE Xplore

The following is a permission statement from IEEE Xplore stating that Figure 2.1 may be reused under the rules below. The figure is originally from the paper by Lei et al. [13]. Copyright 2018, IEEE.

Thesis / Dissertation Reuse

The IEEE does not require individuals working on a thesis to obtain a formal reuse license, however, you may print out this statement to be used as a permission grant:

Requirements to be followed when using any portion (e.g., figure, graph, table, or textual material) of an IEEE copyrighted paper in a thesis:

1) In the case of textual material (e.g., using short quotes or referring to the work within these papers) users must give full credit to the original source (author, paper, publication) followed by the IEEE copyright line © 2011 IEEE.
2) In the case of illustrations or tabular material, we require that the copyright line © [Year of original publication] IEEE appear prominently with each reprinted figure and/or table.
3) If a substantial portion of the original paper is to be used, and if you are not the senior author, also obtain the senior author's approval.

Requirements to be followed when using an entire IEEE copyrighted paper in a thesis:

1) The following IEEE copyright/credit notice should be placed prominently in the references: © [year of original publication] IEEE. Reprinted, with permission, from [author names, paper title, IEEE publication title, and month/year of publication].
2) Only the accepted version of an IEEE copyrighted paper can be used when posting the paper or your thesis on-line.


3) In placing the thesis on the author's university website, please display the following message in a prominent place on the website: In reference to IEEE copyrighted material which is used with permission in this thesis, the IEEE does not endorse any of [university/educational entity's name goes here]'s products or services. Internal or personal use of this material is permitted. If interested in reprinting/republishing IEEE copyrighted material for advertising or promotional purposes or for creating new collective works for resale or redistribution, please go to http://www.ieee.org/publications_standards/publications/rights/rights_link.html to learn how to obtain a License from RightsLink.

If applicable, University Microfilms and/or ProQuest Library, or the Archives of Canada may supply single copies of the dissertation.

Appendix B Scripts

B.1 Script for Search Result Extraction

// Collect every element on the search results page that carries the
// "actionBtn ng-scope" class.
var titles = document.getElementsByClassName("actionBtn ng-scope");
// Log how many matching buttons were found.
console.log(titles.length);
// Click each matching button in turn.
for (var i = 0; i < titles.length; i++) {
    titles[i].click();
}
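The script is presumably pasted into the browser's JavaScript console while a page of search results is open: it selects every element with the class actionBtn ng-scope and clicks it, so the action behind each result is triggered for the whole page in one pass. The class name is specific to the search interface used during the literature study and would likely need to be adjusted for other result pages.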

