NORTHWESTERN UNIVERSITY Adopting a Gateway Centric View

NORTHWESTERN UNIVERSITY Adopting a Gateway Centric View for Cellular Network Content Delivery A DISSERTATION SUBMITTED TO THE GRADUATE SCHOOL IN PARTIAL FULFILLMENT OF THE REQUIREMENTS for the degree DOCTOR OF PHILOSOPHY Field of Computer Science By John Paul Rula EVANSTON, ILLINOIS June 2017 2 c Copyright by John Paul Rula 2017 All Rights Reserved 3 Committee FabiánBustamante (Chair), Northwestern University Aleksandar Kuzmanovic, Northwestern University Peter Dinda, Northwestern University Jon Crowcroft, University of Cambridge 4 ABSTRACT Adopting a Gateway Centric View for Cellular Network Content Delivery John Paul Rula Mobile traffic is expected to grow tenfold by 2019, topping 24 exabytes of monthly traffic and accounting for nearly half of all Internet traffic. This growth is driven by the increasing number of smart phones and tablets, and the data demands of high bandwidth services enabled by next-generation cellular networks such as LTE/5G. As in the wired Internet, network usage is dominated by content consumption, with the vast majority served through content delivery networks (CDNs). CDNs host and replicate popular content across thousands of servers worldwide, directing users to \nearby" servers. This replica selection is a key determinant of client performance, yet replica selection for cellular clients has previously been overlooked, due to high radio latency, inconsistent throughput, and a limited number of ingress locations which dominated end-to-end latency. NGCNs and their improved performance place a renewed emphasis on replica selection policies for cellular clients. We find that the performance of existing replica selection systems in cellular networks is hindered by their opacity, the dynamic assignment of clients to infrastructure components, the emergence of centralized DNS within cellular networks, and the growth of public DNS in global mobile operators. This opacity prohibits network probes from entering these networks, rendering existing monitoring and measurement systems ineffective. In this dissertation, I argue for the centrality of cellular network packet gateways (PGWs), and that this centrality has critical implications on the architecture, characterization, and performance of cellular networks. PGWs separate the interior mobile network from external data networks, and define the independent network partitions which compose modern cellular networks. I posit that understanding the locations of PGWs and their allocation of clients constitutes sufficient network topology coverage. The presence of PGW's on all routes to 5 and from cellular clients make them ideal proxies of client latency for network services. I demonstrate techniques for characterizing cellular networks which allow both the discovery of PGW locations and their assignments of mobile clients. I designed and implemented two live systems which utilize these techniques to characterize cellular infrastructure: tiller which uses instrumented mobile devices to characterize cellular networks, and machete which uses traces from external vantage points to accomplish this characterization at a global scale. I introduce a novel method of content replica selection which chooses cellular client servers based on the location of a client's PGW, called Gateway-Based Replica Selection (GBRS), and show this achieves near optimal replica selection for cellular clients. 6 Acknowledgements I would like to thank all of those who have believed in, inspired, and supported me throughout this journey. I would not have made it this far without the gracious love and encouragement of my friends and family. To my parents, thank you for for all of your love and unending support. Even at my lowest points, you never gave up on me. To my sister, you've always had my back and helped me to see beyond life's trivialities. I am eternally grateful to my advisor, Fabian Bustamante, for first giving me the opportunity to begin this journey as an undergraduate, and for taking the chance on an unqualified misfit with little computer science experience. I am forever grateful for you seeing the potential in me, and for tirelessly helping me to fully develop it. I want to thank my committee for their guidance throughout this process. To Aleksander Kuzmanovic, thank you for his straightforward feedback and bluntness, Peter Dinda for his thoroughness, and Jon Crowcroft for his insights. I have had the opportunity to work with great researchers through my summer internships. I would like to thank my collaborators at Microsoft Research India, Vishnu Navda, Ranjita Bhagwan and Saikat Guha, for granting me a unique and amazing experience, and showing me how fulfilling pure research can be. I would like to thank those I worked with at Akamai, Marcelo Torres and Pablo Alvarez, for trusting me with the freedom to pursue my work. To my lab mates for keeping me sane throughout this process: Maciej Swiech, Mario Sanchez, James Newman, and Zach Bischoff. Your continued antics kept my spirits high and got me through many hard times. To Kyle Hale, for being an incredible friend and roommate, for always being supportive, and for the many late night discussions exploring the universe. To John Otto for being a great mentor and close friend; and to Marcel Flores, for showing me what hard work looks like. 7 Last, I want to thank Chenault Taylor for being my guiding light through this journey. Your unending love and support have allowed me to grow beyond what I ever thought was possible. You have my eternal love. 8 List of Abbreviations CDN Content Delivery Network GGSN Gateway GPRS Serving Node MNO Mobile Network Operator PGW Packet Gateway SGSN Serving Gateway Support Node SGW Service Gateway UE User Equipment 9 Contents List of Figures 13 List of Tables 22 1 Introduction 24 1.1 Thesis Statement . 25 1.2 Summary of Major Contributions . 28 1.3 Roadmap . 29 2 Background and Related Work 30 2.1 Next-Generation Cellular Networks . 30 2.2 Mobile Network Performance . 32 2.2.1 Components of Cellular Network Latency . 32 2.2.2 Transport Performance Over Wireless Links . 34 2.3 Network Topology Discovery . 35 2.3.1 Wired Network Exploration . 35 2.3.2 Measuring Cellular Networks . 36 2.3.3 Prior Cellular Network Characterization Efforts . 37 2.4 Content Delivery Networks . 38 3 Motivation 41 3.1 Overview . 41 10 3.2 Advancements of NGCNs . 42 3.3 Ineffectiveness of Existing Replica Selection . 43 3.4 Problems Locating Cellular Clients . 44 3.4.1 Cellular Network Opacity . 45 3.4.2 Public DNS Usage for Cellular Clients . 46 3.4.3 Centralized Resolver Structure . 48 3.5 Rise of Mobile Traffic . 50 3.6 A Look to the Future . 51 4 Approach 52 4.1 A Case for PGW Representation of Cellular Clients . 52 4.2 Gateway Representation of Clients . 55 4.2.1 Gateway Clustering Methodology . 55 4.2.2 Clustering Sensitivity . 59 4.2.3 Clustering Accuracy . 60 4.3 Gateway-Based Replica Selection . 61 4.4 Summary and Contributions . 63 5 Cellular Infrastructure Characterization 66 5.1 Overview . 66 5.2 Cellular DNS . 67 5.2.1 Operator DNS Characterization . 68 5.2.2 Cellular Resolver Distance . 71 5.2.3 Cellular DNS Performance . 71 5.2.4 Cellular Resolver Opacity . 72 5.2.5 Client resolver inconsistency . 74 5.2.6 Impact on CDN Replica Selection . 75 11 5.2.7 Public DNS in Mobile Networks . 77 5.3 Cellular Gateways . 81 5.3.1 Identifying Cellular Gateways . 82 5.3.2 Validation of PGW Localization . 88 5.4 Mobile Client Dynamics . 89 5.5 Summary and Contributions . 93 6 A Network-Level View of Mobile Networks 95 6.1 Overview . 95 6.2 MNO Organization . 96 6.2.1 Logical MNO Domains . 97 6.2.2 MNO Motifs . 98 6.3 Data Collection . 99 6.3.1 Data Sources . 99 6.3.2 Mobile Traceroute Processing . 100 6.3.3 Logical Domain Discovery . 102 6.4 An AS-level look at MNOs . 103 6.4.1 Cellular Core ASes . 103 6.4.2 Intra-Network Connectivity . 105 6.5 Cellular Internet Connectivity . 108 6.5.1 Traceroute AS-Connectivity . 108 6.5.2 Path Symmetry to Content . 109 6.6 Related Work . 112 6.7 Summary and Conclusions . 113 7 tiller: An End-Host Solution for Cellular Exploration 115 7.1 Overview . 115 12 7.2 Cellular Network Coverage . 118 7.2.1 Cellular Gateway Clusters . 118 7.2.2 Baseline IP Coverage for Cellular Networks . 120 7.3 Mobile Vantage Point Coverage . 123 7.3.1 Coverage of Individual Vantage Points . 124 7.3.2 Vantage Point Temporal Dynamics . 125 7.3.3 How Many Vantage Points? . 127 7.4 Tiller { An End-Host System for GBRS . 130 7.4.1 Tiller Architecture . 130 7.4.2 Tiller Implementation . 133 7.5 Summary and Contributions . 134 8 Trace-based Clustering of Cellular End-Users 136 8.1 Scaling Cellular Network Exploration . 136 8.2 Data Collection . 138 8.3 Path Characterization of Cellular Networks . 139 8.3.1 Traceroute Characterization . 140 8.3.2 Representing Traces through Sink-Vectors . 143 8.4 Clustering Approach . 147 8.4.1 Similarity of Trace Vectors . 147 8.4.2 Clustering Methods . 149 8.4.3 Ground-Truth Evaluation . 150 8.5 MACHETE . 152 8.6 System Implementation . 155 8.6.1 Deployment Experiences . 156 8.7 Summary and Contributions . 158 13 9 Contributions and Conclusion 159 9.1 Summary and Contributions . 159 9.2 Future Work . 160 A Flexible Experimentation for Mobile Networks 162 A.1 A Case for More Robust Mobile Experimentation . ..

NORTHWESTERN UNIVERSITY Adopting a Gateway Centric View

A DATA-ORIENTED NETWORK ARCHITECTURE Doctoral Dissertation

Communications of the Acm

3 Is Not a Crowd, It's an Anecdote

Privacy Engineering for Social Networks

NARSEO VALLINA-RODRIGUEZ Curriculum Vitae

Norah Boyce Science Lectures Archives (In Academic Year Order – Most Recent First)

NARSEO VALLINA-RODRIGUEZ Curriculum Vitae

Peer Review College Newsletter

Publications

10 Networking Papers: Recommended Reading Jon Crowcroft University of Cambridge Computer Laboratory William Gates Building 15

IC2E Conference Booklet

Social Networks Can Be Used to Tailor Content Staging Decisions to the User Base and Thereby Build Better Data Delivery Infrastructures