Authloop: End-To-End Cryptographic Authentication for Telephony Over Voice Channels

AuthLoop: End-to-End Cryptographic Authentication for Telephony over Voice Channels Bradley Reaves, Logan Blue, and Patrick Traynor, University of Florida https://www.usenix.org/conference/usenixsecurity16/technical-sessions/presentation/reaves This paper is included in the Proceedings of the 25th USENIX Security Symposium August 10–12, 2016 • Austin, TX ISBN 978-1-931971-32-4 Open access to the Proceedings of the 25th USENIX Security Symposium is sponsored by USENIX AuthLoop: Practical End-to-End Cryptographic Authentication for Telephony over Voice Channels Bradley Reaves Logan Blue Patrick Traynor University of Florida University of Florida University of Florida reaves@ufl.edu bluel@ufl.edu [email protected]fl.edu Abstract asserting an identity (e.g., a bank, law enforcement, etc.), taking advantage of a lack of reliable cues and mecha- Telephones remain a trusted platform for conducting nisms to dispute such claims. Addressing these prob- some of our most sensitive exchanges. From banking to lems will require the application of lessons from a related taxes, wide swathes of industry and government rely on space. The Web experienced very similar problems in the telephony as a secure fall-back when attempting to con- 1990s, and developed and deployed the Transport Layer firm the veracity of a transaction. In spite of this, authen- Security (TLS) protocol suite and necessary support in- tication is poorly managed between these systems, and in frastructure to assist with the integration of more veri- the general case it is impossible to be certain of the iden- fiable identity in communications. While by no means tity (i.e., Caller ID) of the entity at the other end of a call. perfect and still an area of active research, this infrastruc- We address this problem with AuthLoop, the first system ture helps to make a huge range of attacks substantially to provide cryptographic authentication solely within the more difficult. Unfortunately, the lack of similarly strong voice channel. We design, implement and characterize mechanisms in telephony means that not even trained se- the performance of an in-band modem for executing a curity experts can currently reason about the identity of TLS-inspired authentication protocol, and demonstrate other callers. its abilities to ensure that the explicit single-sided authen- In this paper, we address this problem with tication procedures pervading the web are also possible AuthLoop.1 AuthLoop provides a strong cryptographic on all phones. We show experimentally that this protocol authentication protocol inspired by TLS 1.2. However, can be executed with minimal computational overhead unlike other related solutions that assume Internet access and only a few seconds of user time ( 9 instead of 97 ≈ ≈ (e.g., Silent Circle, RedPhone, etc [24, 73, 25, 5, 3, 6, seconds for a na¨ıve implementation of TLS 1.2) over het- 1, 74, 7]), accessibility to a secondary and concurrent erogeneous networks. In so doing, we demonstrate that data channel is not a guarantee in many locations (e.g., strong end-to-end validation of Caller ID is indeed prac- high density cities, rural areas) nor for all devices, man- tical for all telephony networks. dating that a solution to this problem be network agnostic. Accordingly, AuthLoop is designed for and trans- 1 Introduction mitted over the only channel certain to be available to all phone systems — audio. The advantage to this approach Modern telephony systems include a wide array of end- is that it requires no changes to any network core, which user devices. From traditional rotary PSTN phones to would likely see limited adoption at best. Through the modern cellular and VoIP capable systems, these devices use of AuthLoop, users can quickly and strongly iden- remain the de facto trusted platform for conducting many tify callers who may fraudulently be claiming to be orga- of our most sensitive operations. Even more critically, nizations including their financial institutions and their these systems offer the sole reliable connection for the government [28]. majority of people in the world today. We make the following contributions: Such trust is not necessarily well placed. Caller ID is known to be a poor authenticator [59, 18, 67], and 1A name reminiscent of the “Local Loop” used to tie traditional yet is successfully exploited to enable over US$2 Bil- phone systems into the larger network, we seek to tie modern telephony systems into the global authentication infrastructure that has dramati- lion in fraud every year [28]. Many scammers simply cally improved transaction security over the web during the past two block their phone number and exploit trusting users by decades. 1 USENIX Association 25th USENIX Security Symposium 963 Intermediary Telco IP Networks Networks VOIP Cell Network Carrier Web Gateway Internet Services VOIP Proxy Gateway PSTN Figure 1: A high-level representation of modern telephony systems. In addition to voice being transcoded at each gateway, all identity mechanisms become asserted rather than attested as calls cross network borders. A strong end- to-end authentication must be designed aware of all such limitations. Design a Complete Transmission Layer: We de- work; Section 3 presents the details of our system includ- • sign the first codec-agnostic modem that allows for ing lower-layer considerations; Section 4 discusses our the transmission of data across audio channels. We security model; Section 5 formally defines the AuthLoop then create a supporting link layer protocol to en- protocol and parameterizes our system based on the mo- able the reliable delivery of data across the hetero- dem; Section 6 discusses our prototype and experimental geneous landscape of telephony networks. results; Section 7 provides additional discussion about our system; and Section 8 provides concluding remarks. Design AuthLoop Authentication Protocol: After • characterizing the bandwidth limitations of our data channel, we specify our security goals and design 2 Background and Related Work the AuthLoop protocol to provide explicit authentication of one party (i.e., the “Prover”) and option- In this section, we provide an overview of modern tele- ally weak authentication of the second party (i.e., phony networks and review current and proposed prac- the “Verifier”). tices of authentication in those networks. Evaluate Performance of a Reference Implemen- • 2.1 Modern Telephony Networks tation: We implement AuthLoop and test it using three representative codecs — G.711 (for PSTN The landscape of modern telephony is complex and het- networks), AMR (for cellular networks) and Speex erogeneous. Subscribers can receive service from mo- (for VoIP networks). We demonstrate the ability bile, PSTN and VoIP networks, and calls to those sub- to create a data channel with a goodput of 500 bps scribers may similarly originate from networks imple- and bit error rates averaging below 0.5%. We then menting any of the above technologies. Figure 1 pro- demonstrate that AuthLoop can be run over this vides a high-level overview of this ecosystem. channel in an average of 9 seconds (which can be While performing similar high-level functionality played below speaker audio), compared to running (i.e., enabling voice calls), each of these networks is a direct port of TLS 1.2 in an average of 97 seconds built on a range of often incompatible technologies. (a 90% reduction in running time). From circuit-switched intelligent network cores to packet switching over the public Internet, very little information The remainder of this paper is organized as follows: beyond the voice signal actually propagates across the Section 2 provides background information and related borders of these systems. In fact, because many of these 2 964 25th USENIX Security Symposium USENIX Association a) 1-second chirp sweep from 300 - 3300 Hz before AMR-NB encoding b) 1-second chirp sweep from 300 - 3300 Hz after AMR-NB encoding Figure 2: A comparison of a signal (a) before and (b) after being encoded with the AMR codec. Note that while the entirety of the signal is within the range of allowable frequencies for call audio, the received signal differs significantly from its original form. It is therefore critical that a high-fidelity mechanism for delivering data over a mobile audio channel be designed. networks rely on different codecs for encoding voice, user nature of wireless spectrum. Unfortunately, 1G au- one of the major duties of gateways between these sys- thentication relied solely on the plaintext assertion of tems is the transcoding of audio. Accordingly, voice en- each user’s identity and was therefore subject to signifi- coded at one end of a phone call is unlikely to have the cant fraud [53]. Second generation (2G) networks (e.g., same (or even similar) bitwise representation when it ar- GSM) designed cryptographic mechanisms for authen- rives at the client side of the call. As evidence, the top ticating users to the network. These protocols failed to plot of Figure 2 shows a sweep of an audio signal from authenticate the network to the user and lead to a range 300 to 3300 Hz (all within the acceptable band) across of attacks against subscribers [44, 26, 19, 68]. Third 1 second. The bottom plot shows the same signal af- and fourth generation (3G and 4G) systems correctly ter is has been encoded using the Adaptive Multi-Rate implement mutual authentication between the users and (AMR) audio codec used in cellular networks, resulting providers [11, 12, 13]. Unfortunately, all such mecha- in a dramatically different message. This massive differ- nisms are designed to allow accurate billing, and do little ence is a result of the voice-optimized audio codecs used to help users identify other callers. in different telephony networks. Accordingly, success- While a number of seemingly-cellular mechanisms fully performing end-to-end authentication will require have emerged to provide authentication between end careful design for this non-traditional data channel.

Load more