CHI 2019 Paper
CHI 2019, May 4–9, 2019, Glasgow, Scotland, UK

Voice Presentation Attack Detection through Text-Converted Voice Command Analysis

Il-Youp Kwak, Samsung Research, Seoul, South Korea, [email protected]
Jun Ho Huh, Samsung Research, Seoul, South Korea, [email protected]
Seung Taek Han, Samsung Research, Seoul, South Korea, [email protected]
Iljoo Kim, Samsung Research, Seoul, South Korea, [email protected]
Jiwon Yoon, Korea University, Seoul, South Korea, [email protected]

ABSTRACT
Voice assistants are quickly being upgraded to support advanced, security-critical commands such as unlocking devices, checking emails, and making payments. In this paper, we explore the feasibility of using users’ text-converted voice command utterances as classification features to help identify users’ genuine commands, and detect suspicious commands. To maintain high detection accuracy, our approach starts with a globally trained attack detection model (immediately available for new users), and gradually switches to a user-specific model tailored to the utterance patterns of a target user. To evaluate accuracy, we used a real-world voice assistant dataset consisting of about 34.6 million voice commands collected from 2.6 million users.

KEYWORDS
Voice Command Analysis; Attack Detection; Voice Assistant Security

ACM Reference Format:
Il-Youp Kwak, Jun Ho Huh, Seung Taek Han, Iljoo Kim, and Jiwon Yoon. 2019. Voice Presentation Attack Detection through Text-Converted Voice Command Analysis. In CHI Conference on Human Factors in Computing Systems Proceedings (CHI 2019), May 4–9, 2019, Glasgow, Scotland, UK. ACM, New York, NY, USA, 12 pages. https://doi.org/10.1145/3290605.3300828

1 INTRODUCTION
Voice assistant vendors (e.g., Apple’s Siri, Amazon’s Alexa,