2020 International Conference on Computational Science and Computational Intelligence (CSCI)

Customized Services Using Voice Assistants

Kori Painchaud and Leonidas Deligiannidis
Computer Science & Networking, Wentworth Institute of Technology, Boston, MA 02115, USA
{painchaudk | deligiannidisl}@wit.edu

978-1-7281-7624-6/20/$31.00 ©2020 IEEE. DOI 10.1109/CSCI51800.2020.00197

Abstract - Internet of Things (IoT) devices can be defined as a collection of computing devices that communicate and transfer data with one another. As the popularity of IoT devices increases, users want to maximize the functionality of their devices. Voice-enabled IoT devices serve and assist the user by performing functions like playing music, controlling lights, setting alarms and reminders, and much more. The popularity of these devices has grown, and they appeal to consumers because of the available accessories that can be purchased and connected to them, such as smart lights and smart shades. These smart accessories can connect to devices such as the Google Home or the Amazon Echo, allowing the user to control multiple common house functions with their voice. This paper demonstrates how to control non-smart LED lights using voice commands with smart home devices. A service known as IFTTT, “If This, Then That”, is utilized to add custom commands to two smart speakers. This paper shows how accessible and simple it is for an individual to control a non-smart device using voice commands, while shedding light on how the use of VPAs can aid people with disabilities. The security risks and threats of using IFTTT are addressed.

Keywords: [...], Voice Assistants, IFTTT

I. INTRODUCTION

In 1962 at the Seattle World’s Fair, IBM debuted the first voice assistant. The assistant was called Shoebox, and it could “perform mathematical functions and recognize 16 spoken words as well as digits 0-9” [2]. Since then, Virtual Personal Assistant (VPA) technology has grown and integrated itself into our daily lives. It is said that the primary way for humans to get information quickly is the visual channel, with the auditory channel second [3]. Now, Apple, Google, Microsoft, Samsung, and Amazon all have their own adaptations of VPAs. However, Samsung, Google, and Apple integrate their own VPAs into smartphones, which can perform advanced tasks to aid the user. Some common tasks include telling the user the weather forecast, looking up information on the Internet, controlling music, opening applications, and controlling smart home devices.

Google Assistant does not simply integrate itself in smartphones. There is a product, Google Home, that is solely controlled using voice commands. Amazon, which uses Alexa as the voice assistant, has a similar product called the Amazon Echo [1]. Both products have smaller versions, called the Google Home Mini and the Amazon Echo Dot. The main difference is the size, which allows the bigger products (Google Home and Amazon Echo) to have better sound, due to their speakers. The Google Home has one two-inch driver and dual two-inch passive radiators, while the Mini only has a 40 mm driver [4]. Drivers push out sound waves, while passive radiators produce more power and resonance, equating to better sound quality. Whether it be the Google Home Mini or the Amazon Echo Dot, the function of the devices stays the same. Their purpose is to be a speaker which can provide users with answers, play music, and control other smart accessories. Users can have an array of smart devices and products all controlled seamlessly by a Google Home or an Amazon Echo Dot.

Many individuals prefer an Amazon Echo Dot over a Google Home Mini, due to its capabilities regarding Amazon integration. Ordering products from Amazon using an Amazon Echo Dot is very easy to do, especially with the Amazon Dash Button. The Amazon Dash Button is a device that orders certain products with just the click of a button. Other users prefer a Google Home because it has a better natural language processor [5]. Natural language processing (NLP) “has emerged from fields such as artificial intelligence, linguistics, formal languages, and compilers” [6]. This is an advantage because NLP helps the device interpret human language and respond in a comprehensible manner. Deep Neural Networks, which highlight the main components of dialogue systems, helped Google improve [...]. Amazon uses automatic speech recognition (ASR), which converts speech to text, together with natural language understanding (NLU) [7]. Both devices, the Amazon Echo Dot and the Google Home Mini, are popular and competitive. Since each device has its strong suit, it is conceivable that an individual would purchase both devices and intend on using them together. However, this creates issues of compatibility between the two devices.

The purpose of the IFTTT platform/service is to help users’ products and services smoothly work together [8]. The use of this service solves the problem of compatibility between the Amazon Echo Dot and Google Home Mini. The platform allows a user to use both devices to control the LED light strip, without having to pick one or the other, meaning the Echo Dot and Home Mini will both have the ability to turn on and off the light strip. Using a platform like IFTTT will also benefit the environment. If consumers use this service to make non-smart products into smart products, then it saves them from purchasing new smart products and disposing of the old non-smart products. The IFTTT service allows consumers to “repurpose” non-smart products into smart products that can be voice controlled.

In this paper we will demonstrate and discuss how to control LED strip lights, or any other electrical accessory, using voice commands on both the Amazon Echo Dot and Google Home Mini, and the security risks of using IFTTT or any other third-party application. The IFTTT platform allows users to create and connect services to both the Amazon Echo Dot and the Google Home Mini. A few examples of IFTTT services include Google Assistant, Amazon Alexa, Philips Hue, Ring, and many more. Philips Hue is a company that specializes in making smart bulbs and lights. Ring is a smart home security company that manufactures products like doorbells, indoor/outdoor security cameras, and smart lighting. The company is owned by Amazon and its smart products can be controlled by smart home devices, such as the Google Home or the Amazon Echo. Ring is an IFTTT service that allows for many different projects, an example being a project where your lights (Philips Hue) blink when the doorbell rings. This visually alerts the people in the home that someone is at the door. The project uses the Ring and Philips Hue services. The two services that are used in this paper are Google Assistant and Amazon Alexa. Webhooks, which connect data to the specified Uniform Resource Locator (URL), are used so the devices can interact with a Raspberry Pi, which in turn can be connected to other environmental sensors or relay controllers. For this specific project, there will be multiple webhooks for the Google Home Mini and the Amazon Echo Dot, all using the same URL. Other than an Amazon Echo Dot and a Google Home Mini, the hardware includes a Raspberry Pi model 4B, a SunFounder two-channel relay, and white LED strip lights. The white LED strip lights will be wired to a channel/input of the two-channel relay, and the relay will be wired to the Raspberry Pi using female-to-female general-purpose input/output (GPIO) cables. The outcome of this project is to use voice commands on both the Amazon Echo Dot and Google Home to control the relay, which will then control the lights.

II. RELATED WORK

There are many practical projects and systems designed using voice assistant technology. Individuals from the School of Software Engineering in Beijing, China, have prototyped a life assistant system for people who are visually impaired. This life assistant is voice driven, and the functions include: “messaging, describing the street view, navigating to certain place, etc.” [9]. This service can be run on an Android device, and requires a GPS system, a three-axis gyroscope, and an internet connection. The service takes a user’s input and responds back with a human voice. Users can use this service for navigation because the voice assistant will tell them how to get to their destination based on their current location. The system also can describe the foreground of where they are walking. The safety of the user is also considered: the system uses an algorithm to detect any unexpected falls. If there is an unexpected fall, the system will notify the user’s guardian by sending them the location where the fall occurred [9].

On the same note, individuals from the University of Indonesia designed an IoT Smart Home system using Google Assistant. The system was designed specifically to aid physically disabled people, who struggle with turning on switches and other mechanisms in their homes. They believe that a smart home “provides comfort, safety, energy-saving and potential for the house at any time, which gives a better quality of life for the people” [10]. Using only voice commands, the smart home system was designed to control electronic equipment. These devices could be a television, lights, or fans [10]. The smart home system would give individuals with physical disabilities the ability to turn lights and other devices on and off using voice commands. The notion that VPAs can help people with disabilities gives meaning and purpose to VPA projects.

There is a wide array of projects available online relating to the Google Home and Amazon Alexa. One project that gave inspiration to this paper was a project called “Google Home + Raspberry Pi Power Strip” [11]. This specific project used the IFTTT platform to create a smart power strip. The author of the project achieved this using a Google Home, Raspberry Pi, GPIO cables, a power strip, and three two-channel relays. The objective of the project was to create a smart power strip for non-smart devices to plug in to. Smart accessories like a light bulb are more expensive than a non-smart bulb. Making a smart power strip allows users to voice control non-smart devices with a Google Home. Lamps, fans, and anything else with two states, on or off, can be plugged into the power strip and turned on and off by the Google Home using voice commands.

For the “Google Home + Raspberry Pi Power Strip” project, any model of a Raspberry Pi would work. The Raspberry Pi was used to run a Node + Express server, which handled all the POST requests. The code for the server was on the project’s GitHub page, which was then implemented and modified for this paper’s project. To accomplish the power strip project, you must remove all the physical

switches from the power strip. This makes the project complex because it requires wiring and splicing of the positive wire to the three relays.

The “Google Home + Raspberry Pi Power Strip” project was designed solely for the Google Home without any thought of using any Amazon devices, such as the Amazon Echo or the Amazon Echo Dot. Fortunately, on IFTTT there is a service called “Alexa” which is similar to the Google Assistant service. The only difference is that the Alexa service serves Amazon devices, such as the Amazon Echo Dot. Implementing both the Alexa and Google Assistant services in this project added functionality, because now a consumer does not have to pick which device to use and own.

The paper’s project differs from the “Google Home + Raspberry Pi Power Strip” project in many aspects. Instead of only using one smart home device (Google Home), both the Google Home Mini and Amazon Echo Dot will be used to turn on and off the LED strip lights. An important aspect of this project is security. As previously mentioned, a URL is used to connect with the Raspberry Pi (using the Webhook service). The URL in the power strip project uses Hypertext Transfer Protocol (HTTP), which is a protocol that allows communication between web servers and web browsers. This protocol lacks security, which is why Hypertext Transfer Protocol Secure (HTTPS) is implemented in this project. HTTPS is like HTTP, except it encrypts data using Transport Layer Security (TLS). This means that all of the data exchanged between the server and the web browser will be encrypted. To allow the URL to use HTTPS, a Secure Socket Layer (SSL) certificate must be created; SSL is the deprecated predecessor of TLS. Adding this layer of security is important because it ensures that the data transfer between the server and web browser is secure. Instead of using Node + Express to serve the application, Apache is used. Apache allows a free SSL certificate to be made easily. However, this solution does not prevent someone who “knows” the exact URL and parameters from sending POST requests to our Apache server and thus controlling the LED strip lights.

III. IFTTT AND THIRD-PARTY SERVICES

A. IFTTT

IFTTT is a web-based platform that connects different services and devices together. These devices can be IoT devices or non-IoT devices, depending on the intended use of the applet. IFTTT uses applets, which can be defined as a simple chain of conditional statements. An example of this is an applet which puts your phone on complete silent when you say, “study mode”. This applet uses the Google Assistant service to listen for the voice command. Another example is an applet that turns Philips Hue bulbs on when the sun starts to set. This applet uses the Philips Hue and the Weather Underground services. In this paper, IFTTT will be used to connect the Amazon Echo Dot and the Google Home Mini to the Google Assistant and Alexa services. Once the service is chosen, there are multiple different triggers to choose from. “Triggers” are the words or phrases that will set off or activate the action of the lights being turned on. Both the Google Assistant and Alexa services use the “Say a simple phrase” trigger. Other triggers include “Say a phrase with a number”, or “Say a phrase with a text ingredient”. The simple phrase trigger is chosen because it is less complex than saying a number or a symbol with a phrase.

Creating a service using Google Assistant has more options for the trigger fields compared to the Alexa service. For example, an individual can input three different ways to say one phrase. If the phrase is “Turn on the lights”, a user can then add two more ways to say it. A user might add “Turn the lights on”, or “Please turn on the lights”. These extra fields add more flexibility to the trigger.

The Google Assistant service also allows a user to add a response, meaning the Google Assistant will respond after you tell it a command. Using the same example above, if a user told the Google Home to “turn on the lights”, the lights would turn on, and the Google Assistant would then reply with “Ok, the lights are on”. This response is customizable and assures the user that the Google Assistant acknowledged and understood the command. Unfortunately, the Alexa service is simpler and does not have these extra features. While creating the applet, there is no field to add an Alexa response. There is also only one way to say a phrase. When a user tells Alexa a command using the IFTTT platform, Alexa replies saying “Ok, connected to IFTTT”. This Alexa response is for all phrases that are used with the IFTTT platform. Figure 1 shows the Google Assistant applet; the Alexa applet is similar to this.

When using the Google Assistant there are three different ways to trigger, or say, the command. There is also a field to add a response, which is what the Assistant will say in return to acknowledge the command. Looking at the Alexa trigger, there is only one field for a phrase and no field for a response. The wording of the Alexa phrase is also different. The user must say “Alexa trigger turn on the lights”. This sounds slightly odd, because for the Google Assistant, we can use the normal “Ok Google” phrase to trigger the command.

The Google Assistant and Amazon Alexa applets are set to “Make a web request”, meaning that the action will make a web request to a public URL, also called a Webhook. Any service that makes a web request needs to have a webhook. Therefore, the Alexa and Google Assistant services need a webhook for the commands to be directed to the Raspberry Pi. For this project, the URL follows this format:

https://ip_address/API/switches/sw1?password=password

Figure 1. Google Assistant applet.

The Internet Protocol (IP) address in the URL is the external IP address of the Raspberry Pi. The same URL is added for both Alexa and Google Assistant services because both devices need to be directed to the Raspberry Pi, which will then perform the action of toggling the lights on or off.

B. Security of VPA using Third Party Services

There have been two new security risks identified in using virtual personal assistants, both being authentication issues. The fault goes to third-party platforms that allow users to add new actions or skills to the Google Home Mini and the Amazon Echo Dot. Google allows users to create “Actions”, while Alexa allows individuals to create “Skills”. These companies allow users to create actions and skills to increase the functionality of the devices. The notion that these devices are customizable piques many people’s interest, but users should be wary of potential security and privacy threats and risks.

One of the authentication challenges is called a Voice Masquerading Attack (VMA). The problem is whether the user knows if they are talking to the correct VPA [12]. This can occur “through the skill market, an adversary can publish malicious third-party skills designed to be invoked by the user’s voice commands… in a misleading way, due to the ambiguity of the voice commands and the user’s misconception about the service” [12]. This is a problem for new or inexperienced users, who do not fully understand the capabilities of the voice assistant. The attacker could be asking the user for private information, while the user is unaware that the attack is even happening.

A Voice Squatting Attack (VSA) works when a user thinks they are saying a command using the invocation name, but the Google Home or Amazon Echo Dot interprets the words slightly differently. An example of this attack is when an “adversary who aims at Capital One could register a skill Capital Won, Capitol One, or Captain One.” [12]. The invocation name is “Capital One”, but the devices could interpret the user’s command as something else. In this case, attackers would create malicious skills for “Capital Won”, “Capitol One,” and “Captain One”.

C. Mitigation of VMA and VSA

After understanding VSA, a product was developed to mitigate the risks and threats that come from using Google Actions and Alexa Skills. The product that was developed was a “skill name scanner that converts the invocation name string of a skill into a phonetic expression specified by ARPABET” [13]. ARPABET was developed by the Advanced Research Projects Agency, and “represents phonemes and allophones of the General American English language” [13]. The purpose of the skill name scanner is to measure the phonetic distance between the different skill names, which will determine skills that could be harmful. If the phonetic distance is small, then it is possible that the skill is malicious, because it phonetically sounds like a real skill. After scanning 19,670 custom skills on the Amazon Market, the researchers found 4,718 skills that have squatting risks [12]. This skill name scanner can help mitigate the threats of a VSA but cannot mitigate a VMA.

To mitigate the threats of a VMA, the utterance of the VPA must be identified. This means that for every skill or action that the VPA performs, we must be knowledgeable about what the VPA will say back to the user. How the VPA responds to the user is important in identifying skills that try to impersonate real skills. A technique was developed that “automatically identifies those similar to system utterances, even in the presence of obfuscation attempts (e.g., changes to the wording), and also captures the user’s skill switching intention from the context of her conversation with the running skill” [12]. This is accomplished by using NLP and [...]. With a precision of over 95%, it is effective in detecting VMAs.

Any individual can create an applet and share it on the IFTTT website for other users to implement on their own device. Since anyone can create and share applets, security and privacy are a major concern. For example, an applet is

written to upload all attachments from their email to a OneDrive folder. The issue here is that an email could contain malicious attachments, which will now upload to the OneDrive folder, possibly on multiple devices. This “increases the likelihood that the user will mistakenly execute the malicious program” [14]. Certain applets can unintentionally cause a breach in privacy or cause the user harm. A set of 19,232 unique published applets was analyzed, with 50% of the applets being potentially harmful. While most of the applets correctly perform the tasks that the user programmed them to, “they (the applets) in general do have the potential to cause or increase the risk of harms such as embarrassment, leaking behavioral data, or even physical harm” [14]. An example of an applet that could cause physical harm is one that “opens the window if the temperature rises above a certain threshold” [14]. This applet opens the window depending on the interior temperature of the room, which can be tampered with by an attacker. There are two ways in which the attacker could change the interior temperature of the room. One way is flipping the breaker to turn off the air conditioning; another is covering the vent for the air conditioner [14]. Users should be careful while choosing to implement certain applets, as they may have the potential to cause harm or leak personal data.

IV. IMPLEMENTATION

In this section we describe how we set up a Google Home Mini to turn on and off 120 VAC outlets, to which we can connect any two 120 VAC devices that each draw up to 2 A. We first set up the Google Home Mini to function as a regular voice assistant. We then implemented a Python web server on a Raspberry Pi Zero. The web server runs HTTPS on port 443; we refer to this server as the “IFTTT web server”. We also configured our home router to “port forward” port 443 to the IP address of our web server. We then built another web server using a Raspberry Pi Zero that controls a relay board; this enables us to have several Pis connected to relay boards to control multiple devices located in different rooms.

As shown in Figure 2, the user issues a voice command via the Google Assistant. Using the IFTTT protocol, a response comes back to the Google Assistant if it is a normal voice command. However, if the user issues a voice command configured to trigger an action (configured as shown in Figure 1), then a voice response comes back and is played at the speaker, in addition to a POST request that is sent to our IFTTT web server (item 2 in Figure 2). After our IFTTT web server parses the request, it makes another POST request to one of the web servers that are attached to a relay board. This way, only a single server is exposed to the Internet, but it can control several devices in different places in the house. Figure 3 shows a picture of the enclosure of the relay board and all the wiring to the outlets.

Figure 2. Implementation of our project. (1) A voice command is sent to the cloud. (2) A voice response comes to the speaker, as well as a POST request to our IFTTT web server. The POST request looks like: https://IP/scriptName?lights=on. (3) A POST request to control a relay (or relays) is sent to a server that is connected to a relay board. The router needs to be configured to forward packets with TCP port number 443 to the IFTTT web server.

Figure 3. A picture of the enclosure of the relay board and all the wiring to the outlets.

V. CONCLUSION

Despite the security risks of using IFTTT and third-party functions, the service that IFTTT provides is functional and simple. Creating an SSL certificate for the specific IP address ensured that the data being transmitted was encrypted. The use of the HTTPS and TLS protocols added security to this project. The services used in this paper (Google Assistant and Alexa) created applets which responded quickly to voice input and performed tasks almost instantaneously. IFTTT already includes hundreds of different applets that can connect to devices, which translates well for those who want to create multiple applets.

Voice assistants and VPA technology have the potential to aid those with disabilities. As we mentioned earlier, it is possible for individuals with visual impairments or physical disabilities to benefit from the use of voice assistants. For many consumers and smart home enthusiasts, using IFTTT is a cheaper alternative to buying smart accessories. Turning non-smart accessories/devices into smart ones is also better for the environment, because it brings use and purpose to accessories/devices that might otherwise have been disposed of. Overall, IFTTT is a simple way to customize and control voice assistants, while being cost-effective and environmentally conscious.

REFERENCES

[1]. 18 Most Popular IoT Devices in 2020 (Only Noteworthy IoT Products). (2020, June 30). Retrieved July 16, 2020, from https://www.softwaretestinghelp.com/iot-devices/

[2]. “How Voice Assistants Are Changing Our Lives.” Smartsheet, www.smartsheet.com/voice-assistants-artificial-intelligence#:~:text=The%20History%20of%20Voice%20Assistants,-Voice%20recognition%20technology&text=At%20the%20Seattle%20World's%20Fair,well%20as%20digits%200%2D9.

[3]. Afanasev, M. Y., Fedosov, Y. V., Andreev, Y. S., Krylova, A. A., Shorokhov, S. A., Zimenko, K. V., & Kolesnikov, M. V. (2019). A Concept for Integration of Voice Assistant and Modular Cyber-Physical Production System. 2019 IEEE 17th International Conference on Industrial Informatics (INDIN), 1, 27–32. https://doi.org/10.1109/INDIN41052.2019.8972015

[4]. Prospero, M. (2018, January 30). Google Home vs Mini vs Max: Which Should You Buy? Retrieved July 28, 2020. https://www.tomsguide.com/us/google-home-vs-google-home-mini-vs-google-home-max,review-4731.html

[5]. Segan, Sascha. “Amazon echo dot vs. Google Home: Which Smart Speaker Is Best?” PCMAG, 19 July 2019, www.pcmag.com/news/amazon-echo-vs-google-home-which-smart-speaker-is-best.

[6]. Ghosh, S., & Gunning, D. (2019). Natural Language Processing Fundamentals. [electronic resource] (1st edition). Packt Publishing.

[7]. Kepuska, V., & Bohouta, G. (2018). Next-generation of virtual personal assistants (Microsoft Cortana, Apple Siri, Amazon Alexa and Google Home). 2018 IEEE 8th Annual Computing and Communication Workshop and Conference (CCWC). doi:10.1109/ccwc.2018.8301638

[8]. IFTTT. “About.” IFTTT, ifttt.com/about.

[9]. Chen, R., Tian, Z., Liu, H., Zhao, F., Zhang, S., & Liu, H. (2018). Construction of a voice driven life assistant system for visually impaired people. 2018 International Conference on Artificial Intelligence and Big Data (ICAIBD), 87–92. https://doi.org/10.1109/ICAIBD.2018.8396172

[10]. Isyanto, H., Arifin, A. S., & Suryanegara, M. Design and Implementation of IoT-Based Smart Home Voice Commands for disabled people using Google Assistant. 2020 International Conference on Smart Technology and Applications (ICoSTA). https://doi.org/10.1109/ICoSTA48221.2020.1570613925

[11]. Peacock, K. (2019, August 27). Google Home + Raspberry Pi Power Strip. Retrieved July 10, 2020, from https://www.instructables.com/id/Google-Home-Raspberry-Pi-Power-Strip/

[12]. Zhang, N., Mi, X., Feng, X., Wang, X., Tian, Y., & Qian, F. (2019). Dangerous Skills: Understanding and Mitigating Security Risks of Voice-Controlled Third-Party Functions on Virtual Personal Assistant Systems. 2019 IEEE Symposium on Security and Privacy (SP), 1381–1396. https://doi.org/10.1109/SP.2019.00016

[13]. ARPABET. (2020, June 28). Retrieved July 14, 2020, from https://en.wikipedia.org/wiki/ARPABET

[14]. Surbatovich, M., Aljuraidan, J., Bauer, L., Das, A., & Jia, L. (2017). Some Recipes Can Do More Than Spoil Your Appetite. Proceedings of the 26th International Conference on World Wide Web. doi:10.1145/3038912.3052709
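
As an illustration of the webhook handling described in Sections III-A and IV, the following is a minimal Python sketch of how a request to a URL of the form /API/switches/sw1?password=... might be parsed, checked, and turned into a relay toggle. It is not the authors' server code, which the paper does not list. The names handle_switch_request, SwitchHandler, RELAY_STATE (an in-memory stand-in for GPIO-driven relay channels), and the EXPECTED_PASSWORD value are assumptions made for this example; only the URL shape comes from the paper.

```python
import hmac
from http.server import BaseHTTPRequestHandler
from urllib.parse import parse_qs, urlsplit

# Hypothetical shared secret; the paper only shows "password" as a placeholder.
EXPECTED_PASSWORD = "change-me"

# In-memory stand-in for relay channels; a real build would drive GPIO pins here.
RELAY_STATE = {"sw1": False, "sw2": False}


def handle_switch_request(url, relays, expected_password):
    """Return (status_code, message) for a webhook URL and toggle the named relay."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    # Expect a path of the form /API/switches/<switch name>.
    if len(segments) != 3 or segments[0] != "API" or segments[1] != "switches":
        return 404, "unknown path"
    name = segments[2]
    if name not in relays:
        return 404, "unknown switch"
    supplied = parse_qs(parts.query).get("password", [""])[0]
    # Constant-time comparison avoids leaking the secret through timing.
    if not hmac.compare_digest(supplied.encode(), expected_password.encode()):
        return 403, "forbidden"
    relays[name] = not relays[name]
    return 200, f"{name} is now {'on' if relays[name] else 'off'}"


class SwitchHandler(BaseHTTPRequestHandler):
    """POST handler that delegates to handle_switch_request."""

    def do_POST(self):
        status, message = handle_switch_request(
            self.path, RELAY_STATE, EXPECTED_PASSWORD
        )
        body = message.encode()
        self.send_response(status)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
```

To serve this over HTTPS on port 443, SwitchHandler could be passed to http.server.HTTPServer and the listening socket wrapped with an ssl.SSLContext that has the server certificate loaded. Keeping the request logic in a plain function makes it possible to exercise the password check and toggling without opening a network port.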

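
The phonetic-distance idea behind the skill name scanner in Section III-C can be sketched in a few lines of Python. This is a simplified illustration, not the scanner from [12]: it uses plain edit distance over phoneme sequences, whereas the published tool may weight phonemes differently. The ARPABET dictionary below contains approximate transcriptions entered by hand for this example (stress markers omitted), and the function names are hypothetical.

```python
# Approximate, hand-entered ARPABET transcriptions (stress markers omitted).
# They are illustrative assumptions, not output of the scanner described in [12].
ARPABET = {
    "capital": ["K", "AE", "P", "AH", "T", "AH", "L"],
    "capitol": ["K", "AE", "P", "IH", "T", "AH", "L"],
    "captain": ["K", "AE", "P", "T", "AH", "N"],
    "one": ["W", "AH", "N"],
    "won": ["W", "AH", "N"],
}


def to_phonemes(name):
    """Convert an invocation name into a flat list of phonemes."""
    phonemes = []
    for word in name.lower().split():
        phonemes.extend(ARPABET[word])
    return phonemes


def edit_distance(a, b):
    """Levenshtein distance between two sequences, one row at a time."""
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # substitution or match
                )
            )
        previous = current
    return previous[-1]


def phonetic_distance(name_a, name_b):
    """Edit distance between the phoneme sequences of two invocation names."""
    return edit_distance(to_phonemes(name_a), to_phonemes(name_b))
```

With these transcriptions, "Capital Won" is at distance 0 from "Capital One", "Capitol One" at distance 1, and "Captain One" at distance 2, which matches the intuition that a small distance flags a name that could be confused with the real skill.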