
REQUEST ROUTING IN CONTENT DELIVERY NETWORKS

by HUSSEIN A. ALZOUBI

Submitted in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY

Dissertation Advisor: Prof. Michael Rabinovich

Department of Electrical Engineering and Computer Science
CASE WESTERN RESERVE UNIVERSITY

January 2015

The Dissertation Committee for Hussein A. Alzoubi certifies that this is the approved version of the following dissertation:

Request Routing in Content Delivery Networks

Committee:

Michael Rabinovich, Supervisor

Christos Papachristou

Daniel Saab

Francis Merat

Date: 12 / 03 / 2014

To my parents for all their effort that led me to reach this point,
To my beloved wife Enaas for all her help, patience, and support,
To my kids Leen and Jawad with sincere love.

Contents

List of Tables
List of Figures
Acknowledgments
Abstract

Chapter 1  Introduction

Chapter 2  The Anatomy of LDNS Clusters: Findings and Implications for Web Content Delivery
    2.1  Introduction
    2.2  Related Work
    2.3  System Instrumentation
    2.4  The Dataset
    2.5  Cluster Size
        2.5.1  Number of Clients
        2.5.2  Cluster Activity
    2.6  TTL Effects
    2.7  Client-to-LDNS Proximity
        2.7.1  Air-Miles Between Client and LDNS
        2.7.2  Geographical Span
        2.7.3  AS Sharing
    2.8  Top-10 LDNSs and Clients
    2.9  Client Site Configurations
        2.9.1  Clients OR LDNSs?!
        2.9.2  LDNS Pools
    2.10  Implications for Web Content Delivery
    2.11  Summary

Chapter 3  A Practical Architecture for an Anycast CDN
    3.1  Introduction
    3.2  Related Work
    3.3  Architecture
        3.3.1  Load-aware Anycast CDN
        3.3.2  Objectives and Benefits
        3.3.3  Dealing with Long-Lived Sessions
        3.3.4  Dealing with Network Congestion
    3.4  Remapping Algorithm
        3.4.1  Problem Formulation
        3.4.2  Minimizing Cost
        3.4.3  Minimizing Connection Disruption
    3.5  Evaluation Methodology
        3.5.1  Data Set
        3.5.2  Simulation Environment
        3.5.3  Schemes and Metrics for Comparison
    3.6  Experimental Results
        3.6.1  Server Load Distribution
        3.6.2  Disrupted and Over-Capacity Requests
        3.6.3  Request Air Miles
        3.6.4  Computational Cost of Remapping
        3.6.5  The Effect of Remapping Interval
    3.7  Summary

Chapter 4  Performance Implications of Unilateral Enabling of IPv6
    4.1  Introduction
    4.2  Background
    4.3  Related Work
    4.4  Methodology
    4.5  The Dataset
    4.6  The Results
        4.6.1  DNS Resolution Penalty
        4.6.2  End-to-End Penalty
    4.7  Summary

Chapter 5  IPv6 Anycast CDNs
    5.1  Introduction
    5.2  Background
        5.2.1  IPv6
        5.2.2  TCP
        5.2.3  IPv6 Mobility Overview
    5.3  Related Work
    5.4  Lightweight IPv6 Anycast for Connection-Oriented Communication
    5.5  IPv6 Anycast CDN Architecture
    5.6  Summary

Chapter 6  Conclusion

Bibliography

List of Tables

2.1  High-level dataset characterization
2.2  Clients OS breakdown
2.3  Clients browsers breakdown
2.4  Activity of client-LDNS associations sharing the same AS
4.1  The basic IPv6 statistics

List of Figures

1.1  Basic architecture of CDNs
1.2  Anycast Based Redirection
2.1  Measurement Setup
2.2  Distribution of LDNS cluster sizes
2.3  Distribution of sub1 requests and client/LDNS pairs attributed to LDNS clusters of different sizes
2.4  LDNSs Activity in terms of DNS and HTTP requests
2.5  LDNS cluster sizes within TTL windows (all windows)
2.6  Average LDNS cluster sizes within a TTL window (averaged over all windows for a given LDNS)
2.7  HTTP requests within TTL windows (all windows)
2.8  Average number of HTTP requests per LDNS within a TTL window (averaged over all windows for a given LDNS)
2.9  Air miles for all client/LDNS pairs
2.10  Avg client/LDNS distance in top LDNS clusters
2.11  Avg client/LDNS distance for all LDNS clusters
2.12  CDF of LDNS clusters with a given % of clients/LDNSs outside their LDNS's/Client's autonomous system
2.13  AS sharing of top-10 LDNSs and their clients
2.14  Air miles between top-10 LDNSs and their clients
2.15  AS sharing of top-10 clients and their LDNSs
2.16  Air-miles for top 10 LDNSs and top-10 clients
2.17  Distribution of LDNS types
2.18  Cluster size distribution of LDNS groups
2.19  The number of sub1 requests issued by LDNSs of different types
2.20  Number of sub1 requests issued by One2One LDNSs
2.21  LDNS Pool
3.1  Load-aware Anycast CDN Architecture
3.2  Application level redirection for long-lived sessions
3.3  Number of concurrent requests for each scheme (Large files group)
3.4  Number of concurrent requests for each scheme (Small objects group)
3.5  Service data rate for each scheme (Large files group)
3.6  Service data rate for each scheme (Small objects group)
3.7  Disrupted and over-capacity requests for each scheme (Y-axis in log scale)
3.8  Average miles for requests calculated every 120 seconds
3.9  99th percentile of request miles calculated every 120 seconds
3.10  Execution time of the alb-a and alb-o algorithms in the trace environment
3.11  Total offered load pattern (synthetic environment)
3.12  Scalability of the alb-a and alb-o algorithms in a synthetic environment
3.13  The effect of remapping interval on disrupted connections
3.14  The effect of remapping interval on cost (common 6-hour trace period)
3.15  The effect of remapping interval on dropped requests (common 6-hour trace period)
3.16  The effect of over-provisioning on over-capacity requests (common 6-hour trace period)
4.1  Measurement Setup. Presumed interactions are marked in blue font
4.2  Time difference between A and AAAA "sub" requests
4.3  Comparison of all IPv6 and IPv4 delays
4.4  IPv4 and IPv6 delays per client
5.1  IPv6 Packet Header Format
5.2  IPv6 Destination Option Header Format
5.3  Typical TCP Connection
5.4  TCP Interaction For an IPv6 Anycast Server
5.5  TCP Interaction For an IPv6 Anycast Established Connection
5.6  IPv6 Anycast CDN
5.7  Redirection in IPv6 Anycast CDN

Acknowledgments

First and foremost I would like to take this opportunity to express my thanks and gratitude to God for all his blessings and bounties that he has bestowed upon me. I would like to thank my advisor, Professor Michael Rabinovich, for his great efforts, patience, insights, and endless guidance. I would also like to thank my dissertation committee, Professors Christos Papachristou, Daniel Saab, and Francis Merat, for their time and valuable comments. Many thanks to my friends and companions during my journey at Case. Thanks to Osama Al-khaleel, Zakaria Al-Qudah, Ahmad Al-Hammouri, Mohammad Darawad, Mohammad Al-Oqlah, Saleem Bani Hani, Khalid Al-Adeem and all my friends here in the US. Special thanks to Huthaifa Al-Omari and Abdullah Jordan; you have always been supportive and you have made my journey an enjoyable one. My parents and siblings, Mohammad Rabee, Ahmad, Ali, Sajidah and Sojood: thank you all for all your help, all your support, and all your encouragement. Last but not least, Enaas, my beloved wife, and Leen and Jawad, the joy of my life: I wish I had the words to express my thanks to you for your endless support, great patience and encouragement. Thank you from the bottom of my heart.

Hussein A. Alzoubi

Case Western Reserve University

January 2015

Request Routing in Content Delivery Networks

HUSSEIN A. ALZOUBI

The Internet has become, and continues to grow as, the main distributor of digital media content. The media content in question runs the gamut from operating system patches and gaming software to more traditional Web objects and streaming events, and more recently user-generated video content. Content Delivery Networks (CDNs) (e.g., Akamai, Limelight, AT&T ICDS) have emerged over the last decade to help content providers deliver their digital content to end users in a timely and efficient manner. The key challenge to the effective operation of any CDN is to redirect clients to the "best" service server from which to retrieve the content, a process normally referred to as "redirection" or "request routing". Most commercial CDNs use a DNS-based request routing mechanism to perform redirection. In this mechanism, the DNS system operated by the CDN receives queries, via clients' Local DNS (LDNS) servers, for hostnames of the accelerated URLs and resolves these queries into the IP address of a CDN server that the DNS system selects for a given query. DNS-based request routing, however, exhibits several well-known limitations. First, DNS-based request routing operates at the granularity of LDNS servers, and what might be a good choice for an LDNS is not necessarily a good choice for all its clients. Second, redirecting a single LDNS might cause a large number of clients

behind that LDNS to be redirected to the same CDN node, causing potential load balancing problems. In addition, DNS-based CDNs suffer from the limitation that the DNS system was not designed for very dynamic changes in the mapping between hostnames and IP addresses. Another problem facing not only CDNs but also the entire Internet apparatus is the scarcity of available IPv4 addresses. IPv4 only supports 4 billion globally routed IP addresses. Even though IPv6 was developed to deal with this long-anticipated IPv4 address exhaustion, the overall Internet transition to IPv6 is still lagging. Further, network paths between clients and Web sites commonly do not support IPv6 even if the two end-hosts are both IPv6-enabled. This dissertation quantifies the effect of the above limitations of DNS-based request routing in CDNs, and offers a practical mechanism for replacing DNS-based with anycast-based request routing. Our proposed CDN architecture effectively addresses the long-known drawbacks of anycast request routing, allowing us to reconsider the practicality of this mechanism. Further, this dissertation addresses the issue of transitioning to IPv6, by first showing that there is virtually no performance penalty for a Web site to unilaterally enable IPv6 support, and then proposing a lightweight architecture for implementing IPv6 anycast for connection-oriented transport. The proposed architecture preserves security and privacy, and facilitates anycast's inherent proximal routing. In addition, this dissertation presents an architecture of an anycast IPv6 CDN that utilizes the proposed IPv6 anycast architecture as the redirection mechanism.

Chapter 1

Introduction

As the Internet continues to grow and to become an essential utility in this Internet age, Web content providers are expected, if not required, to have their digital content delivered to end users in a timely and efficient manner. However, accomplishing this goal is challenging because of the often bursty nature of demand for such content [52], and also because content owners require their content to be highly available and delivered in a timely manner without impacting presentation quality [72]. Content delivery networks (CDNs) (e.g., Akamai, Limelight) have emerged over the last decade as an answer to this challenge and have become an essential part of the current Internet apparatus. In fact, Akamai alone claims to deliver between 15% and 30% of all Web traffic [6]. The basic architecture of most CDNs consists of a set of CDN nodes distributed across the Internet [20]. These CDN nodes serve as the content servers' surrogates, from which clients retrieve content using a number of standard protocols. The key to the effective operation of any CDN is to direct users to the "best" CDN node, a process normally referred to as "redirection" or "request routing" [17]. Throughout this dissertation, the terms "request routing" and "redirection" are used interchangeably. Request routing is challenging because not all content is available from all service servers, not all servers are operational at all times, servers can become overloaded, and a client should be directed to a server that is in close proximity to ensure satisfactory user experience.

A keystone component of not only CDNs but also today's Internet apparatus is the Domain Name System (DNS). Its primary goal is to resolve human-readable host names, such as "cnn.com", to hosts' IP addresses. Virtually all Internet interactions start with a DNS query to resolve a hostname into IP addresses. In particular, suppose a user wants to retrieve some Web content from a Web site on the Internet. The first step here is for the Web browser to send a DNS query to the user's Local DNS Server (LDNS) to resolve the hostname of the requested URL. Unless the hostname is stored in its local cache, the LDNS in turn sends the request (by navigating through the DNS infrastructure) to the Authoritative DNS Server (ADNS)

that is responsible for the requested name. The ADNS server maintains the mapping information of hostnames to IP addresses and returns the corresponding IP addresses back to the LDNS. The LDNS saves the resolution in its local cache and forwards that resolution to the client. The Web browser then stores the resolution in its own cache and proceeds with the HTTP interactions by establishing a session using the

provided IP address.

Content delivery networks fundamentally rely on DNS to re-route user communication from the servers to the CDN infrastructure. A typical technique to achieve this goal leverages the DNS protocol's name aliasing, which is done through a special response type, a CNAME record. As part of service provisioning, the origin site's ADNS is configured to respond to DNS queries for hostnames that are outsourced to the CDN not with the IP address of the origin server but with a CNAME record specifying a hostname from the CDN's domain. For instance, consider a web site foo.com that wants to outsource the delivery of an object with URL http://images.foo.com/pic.jpg to a content delivery network cdn-x.com, as shown in

Figure 1.1. When a client (say, Client 1 in the figure) tries to access the above object, it sends a DNS query for images.foo.com to its local DNS server (step 1 in the figure), which, after traversing the DNS infrastructure, ultimately forwards it to foo.com's ADNS (step 2). Upon receiving this query, the ADNS responds with a

CNAME record listing hostname images.foo.com.cdn-x.com (step 3). The requester (the LDNS that had sent the original query) will now attempt to resolve this new name using another query, which will now arrive at the ADNS for cdn-x.com, as this is the domain to which the new name belongs (step 4). The CDN’s ADNS now can resolve this query to any IP address within its platform (135.207.24.11 in the figure, step 5). The LDNS returns this response to the client (step 6), which then proceeds with the actual HTTP download from the prescribed IP address (step 7). The end result is that the HTTP download has been redirected from foo.com’s content server to the CDN platform, which will provide the desired content, either from its cache or, if the object is not locally available, first obtaining it from the origin server (and in this case storing it in its cache for future requests).

Figure 1.1: Basic architecture of CDNs
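To make the resolution chain above concrete, the following toy sketch mimics the CNAME-based redirection with an in-memory record set. It is only an illustration under assumptions: the record table, the pick_cdn_server() policy, the LDNS address 192.0.2.53, and the second server IP are invented for the example; only 135.207.24.11 comes from the text.

    # Toy model of DNS-based CDN redirection via CNAME aliasing (illustrative only).
    RECORDS = {
        # foo.com's ADNS aliases the outsourced hostname into the CDN's domain (step 3).
        "images.foo.com": ("CNAME", "images.foo.com.cdn-x.com"),
    }

    CDN_SERVERS = ["135.207.24.11", "135.207.24.12"]  # second address is made up

    def pick_cdn_server(ldns_ip):
        # Stand-in for the CDN's server-selection policy (proximity, load, ...).
        return CDN_SERVERS[hash(ldns_ip) % len(CDN_SERVERS)]

    def resolve(hostname, ldns_ip):
        """Follow CNAMEs until an IP address is produced for the querying LDNS."""
        while True:
            rtype, value = RECORDS.get(hostname, (None, None))
            if rtype == "CNAME":
                hostname = value                      # LDNS re-queries the new name (step 4)
            elif hostname.endswith(".cdn-x.com"):
                return pick_cdn_server(ldns_ip)       # CDN's ADNS answers (step 5)
            else:
                raise LookupError(hostname)

    if __name__ == "__main__":
        # A hypothetical LDNS resolving on behalf of its clients (steps 1-6).
        print(resolve("images.foo.com", "192.0.2.53"))

The point of the sketch is that the origin's ADNS never hands out a server address itself; the CNAME hands control of the final answer to the CDN's ADNS, which can return a different IP per query.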

By returning different IP addresses to different queries, the ADNS can direct different HTTP requests to different servers. This commonly forms the basis for transparent client request routing in replicated web sites, content delivery systems, and – more recently – platforms. In this mechanism, the DNS system operated by the CDN receives queries for hostnames of the accelerated URLs and resolves them into the IP address of a CDN node that the DNS system selects for a given query. While the mechanism for using DNS for these purposes is well understood, the algorithms and policies involved in request routing and load balancing are still a subject of active research. In fact, these algorithms and policies represent the "secret sauce" of various CDN providers [6, 58, 15]. DNS-based redirection, however, exhibits several limitations, which we will discuss in Chapter 3. In this dissertation, we have constructed a Web-based measurement to study the properties of the groups of clients "hiding" behind their LDNSs (referred to as LDNS clusters) from the perspective of their effect on client request routing. In this measurement we have highlighted the limitations of DNS-based redirection and investigated the challenges in reconstructing LDNS clusters from the perspective of an external observer.

As an alternative to the DNS-based redirection mechanism, this dissertation revisits IP anycast as a redirection technique. IP anycast, although examined early in the CDN evolution process [17], was considered infeasible at the time. IP anycast refers to the ability of the IP routing and forwarding architecture to allow the same IP address to be assigned to multiple endpoints, and to rely on Internet routing to select between these different endpoints. Endpoints with the same IP address are then typically configured to provide the same service. For example, IP anycast is commonly used to provide redundancy in the DNS root-server deployment [42]. Similarly, in the case of a CDN, all endpoints with the same IP anycast address can be configured to be capable of serving the same content.

Figure 1.2: Anycast Based Redirection

Figure 1.2 shows an example of a server deployment utilizing anycast as a redirection mechanism. Servers A and B in our example are deployed in the Internet with the same (anycast) IP address and are responsible for providing the same content for foo.com. Server A is connected to access routers R1 and R2, while Server B is connected to access routers R4 and R5. When a client tries to fetch the content from foo.com, the first step is to send a DNS request through its LDNS. Eventually, the DNS request reaches the ADNS of foo.com. The ADNS replies back with a single

IP regardless of the requester. Upon receiving the DNS response, the client sends its HTTP request directly to the anycast address. (All DNS interactions are omitted from Figure 1.2 for clarity.) By virtue of anycast, clients' requests will follow the most proximal path towards the anycast destination. In our example, for simplicity,

we are assuming proximity is solely based on the number of hops traversed to reach the destination. So from the perspective of client 1, the anycast destination is either a single hop away using access router R1 or two hops away using access router R6. Thus the shortest path will be through R1, and client 1 will be connected to server A.

The same applies to the other clients, in which case client 2 will choose the direct connection through R2, and so on. IP anycast fits seamlessly into the existing Internet routing mechanisms. As Figure 1.2 shows, IP anycast packets are routed "optimally" from an IP forwarding perspective. That is, for a set of anycast destinations within a particular network, packets are routed along the shortest path and thus follow a proximally optimal path from the network perspective. Packets traveling towards a network advertising an IP anycast prefix will similarly follow the most proximal path towards that destination within the constraints of the inter-provider peering agreements along the way.

Another problem facing not only CDNs but also the entire Internet apparatus is the scarcity of available IPv4 addresses. The Internet Protocol (IPv4), which is responsible for routing packets, is the principal protocol that establishes the Internet. IPv4 is currently used for routing almost all Internet traffic. However, IPv4 only supports 4 billion globally routed IP addresses, inherently limiting the number of globally routable addresses on the Internet. The Internet has grown to the extent that we are witnessing the exhaustion of IPv4 addresses. Address exhaustion is a major factor for the transition to the next major version of the IP protocol, that is, IPv6. IPv6 was developed to solve the issue of IPv4 address exhaustion. In addition to providing the world with a larger address space (3.4 × 10^38 IP addresses), IPv6 also provides classes of IP addresses that are of interest to this dissertation. One of these classes is anycast addresses.

Unfortunately, IPv6 anycast has been kept in the shadows in terms of specifications, implementation and deployment. In fact, in the past, the designers of IPv6

imposed restrictions on IPv6 anycast to prevent any packet from having an anycast address in the source field.

The main contributions of this dissertation can be summarized as follows:

This dissertation quantifies the effect of the above limitations of DNS-based request routing in CDNs by showing that the currently prevalent DNS-based request routing is fundamentally affected by the sets of hosts sharing LDNS servers (which we call LDNS clusters). Chapter 2 presents a measurement study [12] that was carried out on a busy consumer-oriented Web site. We report on a large-scale measurement of clusters of hosts sharing the same LDNS servers. Our analysis is based on a measurement that was carried out for 28 days, during which over 21 million unique client-LDNS associations were collected.

This dissertation presents a practical mechanism for replacing DNS-based with anycast-based request routing [9, 10]. In Chapter 3, we show that our proposed CDN architecture effectively addresses the long-known drawbacks of anycast request routing, allowing us to reconsider the practicality of this mechanism. We also present practical load balancing algorithms that take into consideration the practical constraints of a CDN. We use server logs from an operational production CDN to evaluate our algorithms by trace-driven simulation and illustrate their benefit by comparing with native IP anycast and an idealized load-balancing algorithm, as well as with the current DNS-based approach.

Finally, this dissertation addresses the issue of transitioning to IPv6 in Chapters 4 and 5. In Chapter 4, this dissertation shows that there is virtually no performance penalty for a Web site to unilaterally enable IPv6 support [13]. In Chapter 5, we propose a lightweight architecture for implementing IPv6 anycast for connection-oriented transport [11]. The proposed architecture preserves security and privacy, and utilizes the best of anycasting. The proposed architecture leverages IP-based binding while maintaining transport-layer awareness. We also present an architecture

of an anycast IPv6 CDN that utilizes the proposed IPv6 anycast architecture as the redirection mechanism. Our proposed CDN architecture achieves the finest-granularity control in terms of request routing. The proposed IPv6 anycast CDN can redirect clients at the granularity of a single session. This capability is of benefit for both the

CDNs and clients. And since this architecture is based on anycast, it inherently routes clients to the "best" anycast server.

Chapter 2

The Anatomy of LDNS Clusters: Findings and Implications for Web Content Delivery

2.1 Introduction

In DNS-based redirection systems, the ADNS is the entity that is responsible for selecting the service server for a given request. However, the ADNS only knows the identity of the requesting LDNS and not the client that originated the query. In other

words, the LDNS acts as the proxy for all its clients. We call the group of clients "hiding" behind a common LDNS server an "LDNS cluster". DNS-based request routing systems can only distribute client demand among data centers or servers at the granularity of entire LDNS clusters, leading to two fundamental problems:

the hidden load problem [29], which is that a single load balancing decision may lead to an unforeseen amount of load shift, and the originator problem [70], which is that when the request routing apparatus attempts to route clients to the nearest server, the apparatus only considers the location of the LDNS and not the clients behind it.

Various proposals were put forward to include the identity of the client in the LDNS queries, but they have not been adopted, presumably because these proposals are incompatible with shared LDNS caching, where the same response can be reused by multiple clients. There is another such proposal (the Faster Internet initiative) currently underway, spearheaded by [38], in which a truncated version of the client IP address is passed as an extension to the DNS request and forwarded to the CDN's DNS servers to help them better route clients. However, most CDNs, including Akamai, do not support the use of this extension, as it might allow external users to scan the CDN's infrastructure [80, 22]. In this dissertation we begin by studying properties of LDNS clusters from the perspective of their effect on client request routing. We have instrumented a measurement to characterize such LDNS clusters from the vantage point of a busy

Web site, i.e., based on the activity seen by this site. For instance, when we consider an LDNS cluster size, this size reflects the clients that visited the studied Web site over the duration of the experiment. There may be other hosts behind this LDNS that we would not have seen. Our vantage point, however, is an example of what a busy Web site may face when performing request routing.

2.2 Related Work

Among the previous client-side DNS studies, Liston et al. [59] and Ager et al. [4] measured LDNS by resolving a large number of hostnames from a limited set of client vantage points (60 in one case and 75 in the other), Pang et al. [68] used access logs from Akamai as well as active probes, and [32] based their studies on large-scale scanning for open resolvers. Our goal in this dissertation is a broad characterization of clients’ LDNS clusters from the perspective of a busy Web site. Both Ager et al. [4] and Huang et al. [47] compared the performance implica-

10 tions of using public DNS resolvers, such as Google DNS, with ISP-deployed resolvers and found the former to be at significantly greater distances from clients. Further, Huang et al. considered the geographical distance distribution between clients and their LDNSs (Figure 5 in [47]). Our measurement technique is an extension of [62],

which we augmented to allow measurement of LDNS pools. Bermudez et al. proposed a tool that combines a packet sniffer and analyzer to associate content flows with DNS queries [18]. This tool is targeted to operators of client-side access networks, in particular to help them understand which content

comes from third-party platforms, while our approach is website-centric, with the goal of characterizing LDNS clusters to inform DNS-based request routing. As an alternative to the Faster Internet initiative mentioned earlier [38], Huang et al. [46] recently proposed a different method to inform Web sites about clients behind the LDNS. This proposal does not require changes to DNS, instead it makes modification to hostnames (URLs) so that they carry augmented information about clients to facilitate server selection.

2.3 System Instrumentation

To characterize LDNS clusters (i.e., sets of hosts behind a given LDNS), we need to associate hosts with their LDNSs. We used an enhanced approach from our prior work [62] to gather our measurements. As shown in Figure 2.1, we deploy a measurement machine that runs both a custom authoritative DNS server for a domain we registered for the purpose of this experiment (dns-research.com) and a custom Web server. The Web server hosts a special image URL, dns-research.com/special.jpg.

When a user accesses this URL, the following steps occur:

Figure 2.1: Measurement Setup.

• The user's browser sends a DNS query to its local DNS server to resolve dns-research.com. We call dns-research.com a "base domain" and a query for it a "base query" (step 1 in the figure).

• The LDNS recursively resolves this query, ultimately reaching our DNS server (steps 2 and 3), and returns the result (the IP address of our measurement machine) to the client (step 4).

• The client sends the HTTP request for special.jpg to our Web server (step 5). Our server responds with an HTTP redirect (“302 Moved”) specifying another URL in the dns-research.com domain (step 6). Our server constructs this new URL dynamically by embedding the client’s IP address into the hostname of the

URL. For example, when our Web server receives a request for special.jpg from client 206.196.164.138, the Web server replies to the client with the following redirection link: 206_196_164_138.sub1.dns-research.com/special.jpg.

• Following the redirection, the client issues another DNS request to its LDNS for

a hostname that embeds its own IP address; in the example above (step 7) the requested hostname is 206_196_164_138.sub1.dns-research.com. The LDNS eventually sends this request to our DNS server (step 8), which can now record both the IP address of the LDNS that sent the query and the IP address of its associated client that had been embedded in the hostname. Thus, the association of the client and its LDNS is accomplished.
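As an illustration of this association step, the short sketch below shows how a log processor on the DNS side could recover the client IP that the Web server embedded in the sub1 hostname. The function names and the sample LDNS address are invented for the example; they are not taken from the dissertation's actual tooling.

    # Recover the client/LDNS association from a logged sub1 query (illustrative sketch).
    # Example qname format (from the text): 206_196_164_138.sub1.dns-research.com
    def client_from_qname(qname):
        """Return the client IP embedded in a *.sub1.dns-research.com query name."""
        label = qname.split(".", 1)[0]          # "206_196_164_138"
        return label.replace("_", ".")          # "206.196.164.138"

    def record_association(qname, ldns_ip, associations):
        """Store one (client IP, LDNS IP) pair, as done when the sub1 query arrives."""
        associations.add((client_from_qname(qname), ldns_ip))

    if __name__ == "__main__":
        pairs = set()
        # Hypothetical LDNS source address for the example query.
        record_association("206_196_164_138.sub1.dns-research.com", "198.51.100.7", pairs)
        print(pairs)   # {('206.196.164.138', '198.51.100.7')}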

In the original approach of [62], the ADNS server would now complete the interaction by resolving the query to the IP of our Web server, which would respond to the client's HTTP request with a 1-pixel image. We augmented this approach as follows.

Our ADNS responds to a query for *.sub1.dns-research.com with the corresponding CNAME *.sub2.dns-research.com, where * denotes the same string representing the client's IP address (step 9), forcing the client to perform another DNS resolution for the latter name. Moreover, our DNS server includes its own IP address in the authority section of its reply to the "sub1" query, which ensures that the LDNS sends the second request (the "sub2" request) directly to our DNS server. We added this functionality to discover LDNS pools (Section 2.9.2). Upon receiving the "sub2" query (step 10), our ADNS returns its own IP address (steps 11-12), which is also the IP address of our Web server, and the client performs the final HTTP download of our special image (steps 13-14). We have partnered with a high-volume consumer-oriented Web site1, which embedded the base URL for our special image into their home page. This allowed us to collect a large amount of measurement data, as discussed below. To obtain repeated measurements from a given client, we used a low TTL of 10 seconds for our DNS records – lower than any CDN we are aware of – and added a "cache-control: no-cache" HTTP header field to our HTTP responses.

1Part of the conditions for this collaboration is that we are unable to name the site.
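For concreteness, here is a minimal sketch of the redirect step (steps 5-6 above), written with Python's standard http.server module. The dissertation does not give the actual server implementation, so the handler below is an assumed illustration of the client-IP-embedding trick rather than the real measurement code.

    # Minimal sketch of the measurement Web server's redirect step (steps 5-6).
    # Assumed illustration only; the real server is not described in the text.
    from http.server import BaseHTTPRequestHandler, HTTPServer

    class RedirectHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path == "/special.jpg":
                # Embed the requesting client's IP into the sub1 hostname.
                embedded = self.client_address[0].replace(".", "_")
                target = "http://%s.sub1.dns-research.com/special.jpg" % embedded
                self.send_response(302)                        # "302 Moved"
                self.send_header("Location", target)
                self.send_header("Cache-Control", "no-cache")  # defeat HTTP caching
                self.end_headers()
            else:
                self.send_response(404)
                self.end_headers()

    if __name__ == "__main__":
        HTTPServer(("0.0.0.0", 8080), RedirectHandler).serve_forever()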

2.4 The Dataset

The measurement data included the DNS and HTTP logs collected at our measure- ment host. The DNS logs contained the timestamp of the query, the IP address of the requesting LDNS, query type, and query string, and the HTTP logs contained the request time and User-Agent and Host headers.

We conducted our measurements over 28 days, from Jan 5th, 2011 to Feb 1st. During this period, we collected a total of over 67.7 million sub1 and sub2 DNS requests and around 56 million HTTP requests for the final image (steps 13/14 in Figure 2.1; we refer to these final HTTP requests as simply HTTP requests in the rest of the chapter, but stress that we do not include the initial redirected HTTP requests in steps 5-6 of the setup in any of the results). The higher number of HTTP requests compared to DNS queries (indeed, as Figure 2.1 shows, a client access should generate a sub1 and a sub2 DNS request for a final HTTP request) is due to the well-known fact that clients and LDNSs reuse DNS responses much longer than the TTL values assigned by the ADNS [66]. We verified that some HTTP accesses occur long past the 10s TTL after the preceding sub1 and sub2 queries.

Table 2.1: High-level dataset characterization

    Unique LDNS IPs                 278,559
    Unique Client IPs            11,378,020
    Unique Client/LDNS IP Pairs  21,432,181

Table 2.1 shows the overall statistics of our dataset. Our measurements include over 11.3M clients and almost 280K LDNS resolvers representing, respectively, 17,778 and 14,627 autonomous systems (ASs). We have obtained over 21M unique associations between these clients and LDNSs, where an association (or pair) connects a client and the LDNS used by this client for a DNS resolution. Tables 2.2 and 2.3 summarize the client breakdown

as observed from the User-Agent HTTP request header. (Related entries are grouped together, e.g., iOS devices are included in the Mac OS category.)

Table 2.2: Clients OS breakdown

    Operating System   # Connections   % Connections
    Windows              169,199,808           93.54
    Mac OS                10,866,529            6.01
    Linux                    635,981            0.35
    PS                        76,633            0.04
    Android                   48,847            0.03
    BlackBerry                23,618            0.01
    Google                    14,407            0.01
    Others                    14,618            0.01

The client breakdown data is generally as expected, except for a greater domination of the Microsoft Internet Explorer (IE) browser than reported by some commercial measurement firms such as [63].

We refer to all clients that used a given LDNS as the LDNS cluster. Thus, an LDNS cluster contains one LDNS IP and all clients that used that LDNS in our experiment. Note that the same client can belong to multiple LDNS clusters if it used more than one LDNS during our experiment.

Table 2.3: Clients browsers breakdown

    Browser            # Connections   % Connections
    Microsoft IE         127,294,806           70.38
    Mozilla Firefox       37,361,521           20.66
    Google Chrome          8,058,623            4.46
    Safari                 7,982,430            4.41
    Rim                       20,813            0.01
    Opera                      9,148            0.01
    Others                   153,100            0.08

2.5 Cluster Size

We begin by characterizing LDNS clusters in terms of their size. This is important to DNS-based server selection because of the hidden load problem [29]: a single DNS response to an LDNS will direct HTTP load to the selected server from all clients behind this LDNS for the TTL duration. Uneven hidden loads may lead to unexpected results from the load balancing perspective. On the other hand, knowing the activity characteristics of different clusters would allow one to take hidden loads into account during the server selection process. For example, dynamic adjustments of the TTL in DNS responses to different LDNSs can be used to compensate for different hidden loads [28, 29]. We characterize LDNS cluster sizes from two perspectives: the number of clients behind a given LDNS and the amount of activity originating from all clients in the cluster. We should stress that the former is done purely based on IP addresses, and our use of the term "client" is simply a shorthand for "client IP address". It has been shown that IP addresses may not be a good representation of individual hosts due to the presence of network address translation boxes and dynamic IP addresses [60, 25].
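As a rough illustration of how the two cluster views above can be derived from the collected logs, the following sketch groups (client IP, LDNS IP) associations into clusters and counts per-LDNS sub1 activity. The tuple layout of the log entries is an assumption made for the example, not the format of the actual dataset.

    # Sketch: derive LDNS cluster sizes and activity from sub1 association logs.
    # The tuple format below is assumed for illustration only.
    from collections import defaultdict

    def cluster_stats(dns_log):
        """dns_log: iterable of (timestamp, ldns_ip, client_ip) sub1 events."""
        clients_per_ldns = defaultdict(set)    # cluster size (unique client IPs)
        sub1_per_ldns = defaultdict(int)       # cluster activity (sub1 requests)
        for _ts, ldns_ip, client_ip in dns_log:
            clients_per_ldns[ldns_ip].add(client_ip)
            sub1_per_ldns[ldns_ip] += 1
        sizes = {ldns: len(clients) for ldns, clients in clients_per_ldns.items()}
        return sizes, dict(sub1_per_ldns)

    if __name__ == "__main__":
        log = [(0, "ldns-A", "1.2.3.4"), (5, "ldns-A", "1.2.3.5"), (9, "ldns-B", "5.6.7.8")]
        sizes, activity = cluster_stats(log)
        print(sizes)     # {'ldns-A': 2, 'ldns-B': 1}
        print(activity)  # {'ldns-A': 2, 'ldns-B': 1}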

2.5.1 Number of Clients

Figure 2.2 shows the CDF of LDNS cluster sizes, while the inset subfigure shows the sizes of the 1000 largest LDNS clusters (in increasing order of cluster size). We found that a vast majority of LDNS clusters are small: over 90% of LDNS clusters contain fewer than 10 clients. This means that most clusters do not provide much benefit of a shared DNS cache to their clients when they access our partner Web site.

Figure 2.2: Distribution of LDNS cluster sizes.

To see the potential impact on clients, Figure 2.3 shows the cumulative percentage of sub1 requests issued by LDNSs representing clusters of different sizes as well as the cumulative percentage of their client/LDNS associations. More precisely, for a given cluster size X, the corresponding points on the curves show the percentage of sub1 requests issued by LDNS clusters of size up to X, and the percentage of all client/LDNS associations belonging to these clusters. As seen in the figure, small

clusters, with fewer than 10 clients, only contribute less than 10% of all sub1 requests and comprise less than 1% of all client/LDNS associations. Thus, even though these small clusters represent over 90% of all LDNS clusters, an overwhelming majority of clients belong to larger clusters, which are also responsible for most of the activity. As a result, most clients are not affected by limited shared DNS caching in small clusters.

Moreover, despite the prevalence of DHCP-driven DNS configuration of end-hosts and – more recently – anycasted resolvers, both facilitating distributed resolver infrastructures, we still observed a few "elephant" clusters. The largest cluster (with LDNS IP 167.206.254.14) comprised 129,720 clients and it alone was responsible for

almost 1% of all sub1 requests. Elephant clusters may dramatically affect load distribution, and their small number suggests that it might be warranted and feasible to identify and handle them separately from the rest of the LDNS population. Overall, the size of LDNS clusters ranged from 1 to 129,720 clients, with the average size being 76.94 clients. We further consider the top-10 elephant LDNS clusters in Section 2.7.3.

Figure 2.3: Distribution of sub1 requests and client/LDNS pairs attributed to LDNS clusters of different sizes

2.5.2 Cluster Activity

We now turn to characterizing the activity of LDNS clusters. We characterize it by the number of their sub1 requests as well as by the number of the final HTTP requests. Since a client may belong to multiple LDNS clusters (e.g., when it used a different LDNS at different times), we associate an HTTP request with the last LDNS that was used by the client prior to the HTTP request in question.

Figure 2.4: LDNSs Activity in terms of DNS and HTTP requests.

Figure 2.4 shows the CDF of the number of sub1 queries issued by LDNSs, as well as the CDF of the number of HTTP requests issued by clients behind each LDNS during our experiment. Again, both curves in the figure indicate that there are only a small number of high-activity clusters. Indeed, 35% of LDNSs issued only one sub1 request, and 96% of all LDNSs issued fewer than 100 sub1 requests over the entire experiment duration. Yet the most active LDNS sent 303,042 sub1 requests. The HTTP activity presents similar trends, although we do observe some hidden load effects even among low-activity clusters: whereas 35% of LDNSs issued a single DNS query, fewer than 20% of their clusters issued a single HTTP request. This is due to DNS caching, which often extends beyond our low TTL of 10s. Overall, our observations of LDNS cluster sizes, both from the number-of-clients and activity perspectives, confirm that platforms using DNS-based server selection may benefit from treating different LDNSs differently. At the same time, they may

only need to concentrate on a relatively small number of “elephant” LDNSs for such special treatment.

2.6 TTL Effects

The above analysis considered the LDNS cluster activity over the duration of the experiment. However, platforms that use DNS-based server selection, such as CDNs, usually assign relatively small TTL to their DNS responses to retain an opportunity for further network control. In this section, we investigate the hidden loads of LDNS

clusters observed within typical TTL windows utilized by CDNs, specifically 20s (used by Akamai), 120s (AT&T's ICDS content delivery network) and 350s (Limelight). To obtain these numbers, we use our DNS and HTTP traces to emulate the clients' activity under a given TTL. The starting idea behind this simulation

is simple: the initial sub1 query from an LDNS starts a TTL window, and then all subsequent HTTP activity associated with this LDNS (using the procedure described in Section 2.5.2) is "charged" to this window; the next sub1 request beyond the current window starts a new window. However, two subtle points complicate this procedure. First, if, after the initial sub1 query to one LDNS, the same client sends another DNS query through a different LDNS within the emulated TTL window (which can happen since the actual TTL in our experiments was only 10s), we "charge" these subsequent queries and their associated HTTP activity to the TTL window of the first LDNS. This is because, with the longer TTL, these subsequent queries would not have occurred, since the client would have reused the initial DNS response from its cache. Second, confirming the phenomenon previously measured in [66], we have encountered a considerable number of requests that violated TTL values, with violations sometimes exceeding the largest TTL values we simulated (350s). Consequently, in reporting the hidden loads per TTL, we use two lines for each TTL value. The lines labeled "strict" reflect only the HTTP requests that actually fell into the TTL window.2

2 For simplicity of implementation, we also counted HTTP requests whose corresponding sub* queries were within the window but the HTTP requests themselves were within our real TTL of 10s past the window. There were a very small number of such requests (a few thousand out of 56M total), thus this does not materially affect our results.
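The following is a schematic sketch of the window-charging emulation described above, under simplifying assumptions: it ignores the cross-LDNS caveat and the TTL-violation ("non-strict") bookkeeping discussed in the text, and the event-tuple format is invented for the example.

    # Sketch of the TTL-window emulation: charge each HTTP request to the TTL
    # window opened by the client's most recent sub1 query (simplified).
    from collections import defaultdict

    def hidden_loads(dns_events, http_events, ttl):
        """dns_events: (time, client, ldns) sub1 queries; http_events: (time, client).
           Returns {(ldns, window_start): number of HTTP requests charged to it}."""
        windows = {}                      # client -> (ldns, window_start)
        loads = defaultdict(int)
        events = [(t, "dns", c, l) for t, c, l in dns_events] + \
                 [(t, "http", c, None) for t, c in http_events]
        for t, kind, client, ldns in sorted(events):
            if kind == "dns":
                current = windows.get(client)
                # A new window starts only if no window is still open for this client.
                if current is None or t >= current[1] + ttl:
                    windows[client] = (ldns, t)
            else:
                current = windows.get(client)
                if current is not None and t < current[1] + ttl:
                    loads[current] += 1   # "strict": request falls inside the window
        return dict(loads)

    if __name__ == "__main__":
        dns = [(0, "c1", "ldnsA"), (50, "c1", "ldnsA")]
        http = [(5, "c1"), (15, "c1"), (55, "c1")]
        print(hidden_loads(dns, http, ttl=20))
        # {('ldnsA', 0): 2, ('ldnsA', 50): 1}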

20 1

0.8

0.6 CDF 0.4

0.2 Clients in 20s TTL Clients in 120s TTL Clients in 350s TTL 0 1 10 100 Client IPs per LDNS per TTL

Figure 2.5: LDNS cluster sizes within TTL windows (all windows).

Thus, these results ignore requests that violate the TTL value. The "non-strict" lines include these violating HTTP requests and count them towards the hidden load of the TTL window to which the associated DNS query was assigned. Figure 2.5 shows the CDF of the LDNS cluster sizes observed for each LDNS in each TTL window, i.e., each LDNS contributed a separate data point to the CDF for each window. The majority of windows, across all LDNSs, contained only one client. As expected, the larger the TTL window, the larger the number of clients an LDNS serves. Still, only around 10% of TTL intervals had more than 10 clients under a TTL of 350s, and less than 2% of the intervals had more than 10 clients with

a TTL of 120s. Figure 2.6 shows the CDF of the average in-TTL cluster sizes for LDNSs across all their TTL intervals. That is, each LDNS contributes only one data point to the CDF, reflecting its average cluster size over all its TTL intervals.3

3 The average in-TTL cluster sizes per LDNS may better reflect the kind of input data available to a request routing algorithm.

Figure 2.6: Average LDNS cluster sizes within a TTL window (averaged over all windows for a given LDNS).

The average in-TTL cluster sizes are even smaller, with virtually all LDNSs exhibiting an average in-TTL cluster size below 10 clients under all TTL values. The difference between the two figures is explained by the fact that busier LDNSs (i.e., those showing up with more clients within a TTL) tend to appear more frequently in the trace, thus contributing more data points in Figure 2.5. To assess how the hidden load of LDNSs depends on TTL, Figures 2.7 and 2.8 show, respectively, the CDFs of the number of HTTP requests in all TTL windows and the average in-TTL number of HTTP requests for all LDNSs across their TTL intervals.

Figure 2.7: HTTP requests within TTL windows (all windows).

A few observations are noteworthy. First, the difference between the strict and non-strict lines in Figure 2.7 indicates violations of the TTL we considered; as expected, these violations decrease for larger TTL and, importantly, all but disappear for a TTL of 350 sec. This shows that at these TTL levels, a CDN might not need to be concerned about the unforeseen effect of these violations on hidden load.

Second, while the sizable differences in hidden loads among some LDNSs for some TTL values are significant, their absolute values are small overall: virtually all windows contain fewer than 100 requests even for the largest TTL of 350s (Figure 2.7). Thus, low TTL values are important not for proper load-balancing granularity in routine operations but mostly to react quickly to unforeseen flash crowds. It is obviously undesirable to have to pay overhead on routine operation while using it only for extraordinary scenarios. A better knob would be desirable and should be given consideration in future Internet architectures.

2.7 Client-to-LDNS Proximity

We consider the proximity of clients to their LDNS servers, which determines the severity of the originator problem and can have other implications for proximity-based request routing. Prior studies [76, 62] looked at several proximity metrics, including TCP traceroute divergence, network delay difference as seen from a given external vantage point, and autonomous system sharing (how many clients reside in the same AS as their LDNS servers). We revisit the AS-sharing metric, but instead of the other metrics, which are vantage-point dependent, we consider the air-mile distance between clients and their LDNSs. Ideally we would also have liked to know the network delay between these parties, but we have no visibility into this metric from our vantage point.

Figure 2.8: Average number of HTTP requests per LDNS within a TTL window (averaged over all windows for a given LDNS).

2.7.1 Air-Miles Between Client and LDNS

We utilized the GeoIP city database from MaxMind [64], which provides geographic location information for IP addresses, to study geographical properties of LDNS clusters. Using the database dated February 1, 2011 (so that our analysis would reflect the GeoIP map at the time of the experiment), we mapped the IP addresses of the clients and their associated LDNSs and calculated the geographical distance ("air-miles") between them.
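The "air-miles" between a client and its LDNS reduce to a great-circle distance between the two geolocated coordinates. A small sketch is shown below; the latitude/longitude values are illustrative stand-ins for what a GeoIP lookup would return, not data from the study.

    # Great-circle ("air-miles") distance between two geolocated IPs.
    # Coordinates below are illustrative; in the study they come from GeoIP lookups.
    from math import radians, sin, cos, asin, sqrt

    EARTH_RADIUS_MILES = 3958.8

    def air_miles(lat1, lon1, lat2, lon2):
        """Haversine distance in miles between two (lat, lon) points in degrees."""
        lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
        dlat, dlon = lat2 - lat1, lon2 - lon1
        h = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
        return 2 * EARTH_RADIUS_MILES * asin(sqrt(h))

    if __name__ == "__main__":
        # e.g., a client geolocated near Cleveland, OH and an LDNS near New York, NY
        print(round(air_miles(41.50, -81.69, 40.71, -74.01)))   # roughly 400 miles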

24 1

0.9

0.8

0.7

0.6

0.5 CDF 0.4

0.3

0.2

0.1

0 0 2000 4000 6000 8000 10000 12000 AirMiles between Clients and their LDNSs

Figure 2.9: Air miles for all client/LDNS pairs

Figure 2.9 shows the cumulative distribution function (CDF) of air-miles of all client/LDNS pairs. The figure shows that clients are sometimes situated surprisingly far from their LDNS servers. Only around 25% of all client/LDNS pairs were less than 100 miles apart while 30% were over 1000 miles apart. This suggests an inherent limitation to how accurate, in terms of proximity, DNS-based server selection can be. We note that our measurements show significantly greater distances than previously measured in [47] (see Section 2.2 for more details).

2.7.2 Geographical Span

We are also interested in the geographical span of LDNS clusters. Geographically compact clusters are more amenable to proximity routing than spread-out ones. If a content platform can distinguish between these kinds of clusters, it could treat them differently: requests from an LDNS representing a concentrated cluster could be preferentially resolved to a highly proximal content server, while requests from LDNSs

25 representing spread-out clusters could be used to even out load with less regard for proximity. This would result in more requests resolving to proximal content servers when it actually counts.


Figure 2.10: Avg client/LDNS distance in top LDNS clusters

In Figure 2.10 we focus on the LDNSs with more than 10 clients, which represent almost 10% of all LDNSs in the data set (results on small clusters are reported in Figure 2.11). Figure 2.10 plots, for each such LDNS server, the average air-miles from this server to all its clients and the number of clients for the same LDNS. The X-axis shows LDNSs sorted by the size of their client cluster and, within LDNSs of equal cluster size, by the average air-miles distance. The Y-axis shows the average air-miles and the number of clients for these top LDNS clusters. The "teeth" in the graph are due to the above sorting of LDNSs: each "tooth" represents a set of all LDNS clusters with the same number of clients, and the tooth-like shape reflects the fact that each such set, except for the sets comprising the largest clusters, contains clusters with an average geographical span ranging from 0 up to 10,000 miles. Even for LDNS clusters of size 1, Figure 2.11 shows that there exist LDNS clusters with more than 10,000 air miles between the LDNS and its client. The majority of 1-client clusters have an air-miles distance of 0 between the LDNS and its client. In Section 2.9 we investigate these clusters and report on the reason

for such an observation. As the number of clients increases, the variation of geographical span among clusters narrows but still remains significant, with order-of-magnitude differences between clusters. This provides evidence in support of differential treatment of LDNSs, not just with respect to their differences in size and activity, as we saw in Sections 2.5 and 2.5.2, but also with respect to proximity-based server selection.

Figure 2.11: Avg client/LDNS distance for all LDNS clusters

2.7.3 AS Sharing

Another measure of proximity is the degree of AS sharing between clients and their LDNSs. Figure 2.12 shows this information from, respectively, the LDNS and the client perspective. The LDNS perspective reflects, for a given LDNS, the percentage of its associated clients that are in the same AS as the LDNS itself. The clients' perspective considers, for a given client, the percentage of its associated LDNSs that are in the same AS as the client itself.

Figure 2.12: CDF of LDNS clusters with a given % of clients/LDNSs outside their LDNS's/Client's autonomous system.

While almost 77% of LDNSs have all their clients in the same AS as they are, 15% of LDNSs have over half of their clients outside their AS and 10% have all their clients in a different AS. From the clients' perspective, we found that more than 9 million clients have all their LDNSs in their AS, while nearly 2 million (almost 17%) have all their LDNSs in a different AS. Only a small number of clients – over 180K – had a mix of some LDNSs within and some LDNSs outside the client's AS. Such a strong dichotomy (i.e., that clients either had all or none of their LDNSs in their own

AS) is explained by the fact that most clients associate with only a small number of LDNSs.
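The per-LDNS AS-sharing percentage above can be computed directly from the association pairs and an IP-to-AS mapping; the sketch below assumes both input structures, including the private-use AS numbers and addresses, purely for illustration.

    # Sketch: fraction of each LDNS's associated clients that share the LDNS's AS.
    # ip_to_as is an assumed lookup table (in practice, an IP-to-AS database).
    from collections import defaultdict

    def as_sharing(pairs, ip_to_as):
        """pairs: iterable of unique (client_ip, ldns_ip) associations.
           Returns {ldns_ip: fraction of its clients in the same AS}."""
        same = defaultdict(int)
        total = defaultdict(int)
        for client_ip, ldns_ip in pairs:
            total[ldns_ip] += 1
            if ip_to_as.get(client_ip) == ip_to_as.get(ldns_ip):
                same[ldns_ip] += 1
        return {ldns: same[ldns] / total[ldns] for ldns in total}

    if __name__ == "__main__":
        ip_to_as = {"1.1.1.1": 64500, "1.1.1.2": 64500, "2.2.2.2": 64501, "9.9.9.9": 64500}
        pairs = [("1.1.1.1", "9.9.9.9"), ("1.1.1.2", "9.9.9.9"), ("2.2.2.2", "9.9.9.9")]
        print(as_sharing(pairs, ip_to_as))   # {'9.9.9.9': 0.666...}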

Table 2.4: Activity of client-LDNS associations sharing the same AS

    DNS requests   HTTP requests
          73.79%          81.97%

Our discussion so far concerned the prevalence of AS sharing in terms of client population. In other words, each client-LDNS association is counted once in those

statistics. However, different clients may have different activity levels, and we now consider the prevalence of AS sharing from the perspective of clients' accesses to the Web site. Table 2.4 shows the fraction of client activity stemming from client-LDNS associations that share the same AS. The first column reports the fraction of all sub1

and sub2 request pairs with both the client and LDNS belonging to the same AS. This metric reflects definitive information but it only approximates the level of client activity because of the 10s TTL we use for sub1 and sub2 responses: we do not expect the same client to issue another DNS query for 10 seconds (or longer, if the client

violates TTLs) no matter how many HTTP requests it issues within this time. The second column shows the fraction of all HTTP requests such that the preceding DNS query that originated from the same client used an LDNS in the same AS as the client. This metric reflects definitive activity levels but is not iron-clad in attributing

the activity to a given client/LDNS association. First, we note that the prevalence of AS sharing measured based on activity is somewhat lower than the prevalence of AS sharing measured based on client populations. Second, these levels of AS sharing are still significantly higher than those reported in the 10-year-old study [62] (see Table 5 there). This is an encouraging development for DNS-based request distribution. Overall, while the prevalence of AS sharing increased from 10 years ago, we found a sizable fraction (15 - 17%) of client/LDNS associations where clients and

LDNSs reside in different ASs. One of the goals in server selection by CDNs, especially those with a large number of locations such as Akamai, is to find an edge server sharing the AS with the originator of the request [69]. Our data shows fundamental limits to the benefits from this approach.

2.8 Top-10 LDNSs and Clients

We investigated the top 10 LDNSs manually through reverse DNS lookups, "whois" records, and MaxMind ISP records for their IP addresses. The top-10 LDNSs in fact all belong to just two ISPs, which we refer to as ISP1 (LDNSs ranked 10-4) and ISP2 (ranked 3-1). The top three clusters of ISP2 contributed 1.6% of all unique client-LDNS associations in our traces and 2.33% of all sub1 requests.


Figure 2.13: AS sharing of top-10 LDNSs and their clients

The extent of the AS sharing for these clusters is shown in Figure 2.13. In the

figure, the bars for each cluster represent (from left to right) the number of clients sharing the AS with the cluster's LDNS, the number of clients in other ASs, and, for the latter clients, the number of different ASs they represent. The figure shows a very low degree of AS sharing in the top clusters. Virtually all clients belong to a different AS from the one where their LDNS resides, and each cluster spans dozens of different ASs. We further verified that these ASs belong to different organizations from those owning the corresponding LDNSs. Interestingly, the AS sharing is very similar between ISP1 and ISP2.

30 ;::::"

;:::"

;::"

;:"

;"

"!$&;" "!$&<" "!$&=" "!$&>" "!$&?" "!$&@" "!$&A" "!$&B" "!$&C" "!$&;:" 05+7"0-0#-.*1" #*)-'/"0-0#-.*1" C?2,"%*0(*/3.*"0-0#-.*1" CC2,"%*0(*/3.*"0-0#-.*1" F" .-*/21"G";::"#-.*1"

Figure 2.14: Air miles between top-10 LDNSs and their clients.

We also consider the geographical span of the top 10 clusters using MaxMind

GeoIP city database. Figure 2.14 shows the average and median air-miles distance between the LDNS and its clients, as well as the 95th and 99th percentiles of these distances, for each LDNS cluster. The last bar shows the percentage of clients in that cluster that are less than 100 miles away from the LDNS.

While Figure 2.13 shows that the top-10 LDNS clusters span dozens of ASs, which suggests topologically distant pairs, Figure 2.14 shows that these clusters are very compact, with most clients less than 100 miles away from their LDNSs. Further, although both ISPs exhibit similar trends, ISP2 clearly has a more ubiquitous LDNS infrastructure, and more of its customers can expect better service from CDN-accelerated Web sites. Indeed, ISP2's LDNSs have more than 99.9% of their clients within a 100-air-mile radius, with averages ranging between 20 and 43 air miles. Turning to the top 10 clients, they represent eight different firms and the top four appear to be firewalls or proxies (as they include "firewall", "WebDefense", and

"proxy" in their reverse DNS names). Figure 2.15 shows the extent to which these clients share their AS with their LDNSs.

Figure 2.15: AS sharing of top-10 clients and their LDNSs

The bars for each client represent (from left to right) the number of LDNSs sharing the AS with the client, the number of

LDNSs in other ASs, and for the latter LDNSs, the number of different ASs they represent. Reflecting more diverse ownership of these clients, their AS sharing behavior is also more diverse than that of top LDNSs. Still, 9 out of 10 of these clients had the majority of their LDNSs outside their AS, including one client with all its LDNSs

outside its AS. This was true even for clients representing major ISP proxies (clients ranked 6 and 4) or otherwise belonging to major ISPs (clients 10 and 7). At the same time, for 7 out of 10 clients, most of these external LDNSs were concentrated in just one (or a small number of) other ASs. Again, we verified that these external ASs do

in fact belong to different organizations. From this, we speculate that these clients, or their autonomous systems, subscribe to an external DNS service that employs a large farm of LDNS servers. Finally, Figure 2.16 examines the geographical distance of top-10 clients from

their LDNS servers. Two of the curves in the figure give a CDF of these distances

for every unique association of these clients and their LDNSs, as well as for each request coming from these top clients. For comparison, the figure provides similar CDFs for the top-10 LDNSs considered earlier. We see that the top clients are often situated much farther away from their LDNSs than the clients of the top LDNSs; in particular, only around 12% of the top-10 clients are within 100 miles of their LDNSs, as compared to 80% of the top-10 LDNSs being within 100 miles of their clients. Another observation is that top clients are either co-located with their LDNSs or use LDNSs that are hundreds of miles away. Furthermore, the volume of requests arriving from the top-10 clients that used co-located LDNSs contributed around 54% of all the activity despite constituting only 12% of all top-client/LDNS associations.

Figure 2.16: Air-miles for top-10 LDNSs and top-10 clients.

2.9 Client Site Configurations

We now discuss some noteworthy client and LDNS behaviors we observed in our experiments.

2.9.1 Clients OR LDNSs?!

Our first observation is a wide-spread sharing of DNS and HTTP behavior among

clients. Out of 278,559 LDNS servers in our trace, 170,137 (61.08%) also show up among the HTTP clients. We refer to these LDNSs as the "Act-Like-Clients" group. A vast majority of these LDNSs – 166,859, or 98% – have themselves among their own associated clients. We will call these LDNSs the Self-Served group. The other 3278 LDNS IP addresses always used different LDNSs when acting as HTTP

clients. Within the Self-Served group, we found that 149,013 of these LDNSs, or 53% of all the LDNSs in our dataset, had themselves as their only client during our experiment (“self-served-one-client”) while the remaining 17,846 LDNSs had other clients as well (“Self-Served-2+Clients”). Moreover, 105,367 of the self-served-one-

client client/LDNS IP addresses never used any other LDNS. We call them the "Self-Served-One2One" group. This leaves us with 43,646 LDNSs that had themselves as their only client but, in their client role, also utilized other LDNSs. This group will be called the "Self-Served-1Client2+LDNSs". Figure 2.17 summarizes the

distribution of these types of LDNSs.
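To make this taxonomy concrete, the following minimal Python sketch derives the groups from two hypothetical precomputed maps – the set of client IPs observed behind each LDNS and the set of LDNSs each client IP was observed using. It illustrates the definitions above and is not the actual analysis code.

def classify_ldns(ldns_clients, client_ldnss):
    """ldns_clients: dict mapping LDNS IP -> set of client IPs seen behind it.
    client_ldnss:  dict mapping client IP -> set of LDNS IPs it was seen using.
    Returns a dict mapping each LDNS IP to one of the group labels used above."""
    http_clients = set(client_ldnss)
    groups = {}
    for ldns, clients in ldns_clients.items():
        if ldns not in http_clients:
            groups[ldns] = "Not-Act-Like-Clients"
        elif ldns not in clients:
            groups[ldns] = "Act-Like-Clients, not Self-Served"
        elif clients != {ldns}:
            groups[ldns] = "Self-Served-2+Clients"
        elif client_ldnss[ldns] == {ldns}:
            groups[ldns] = "Self-Served-One2One"
        else:
            groups[ldns] = "Self-Served-1Client2+LDNSs"
    return groups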


Figure 2.17: Distribution of LDNS types

While a likely explanation for the Act-Like-Clients Not-Self-Served group is the reuse of dynamic IP addresses (so that the same IP address is assigned to an HTTP client at some point and to an LDNS host at another time), the Self-Served behavior could be caused by a common middle-box shared between the LDNS and its clients. In particular, the following two cases are plausible.

• Both clients and their LDNS are behind a NAT or firewall, which exposes a common IP address to the public Internet. A particular case of this configuration is when a home network configures its wireless router to act as an LDNS. Such a configuration is easily enabled on popular wireless routers (e.g., Linksys), although these routers often resolve their DNS queries through ISP LDNS servers [74].

• Clients are behind a proxy that acts as both HTTP proxy/cache and its own LDNS resolver.

We find support for this explanation using an approach similar to [60]. We utilized the User-Agent headers to identify hosts sharing the same middle-box based on their operating system and browser footprints. We considered an IP address a possible middle-box if it showed two or more operating systems or operating system versions, or three or more different browsers or browser versions. Out of the total 11.7M clients, we flagged only 686,651 (5.87%) clients that fall into this category.4 However, 51,864 clients among them were from the Self-Served

LDNS group, out of the total of 166K such LDNSs. Thus, multi-host behavior is much more prevalent among self-serving LDNSs than among the general client population, even though our technique misses single-host NAT'ed networks (which constitute a majority of NAT networks according to [25], although not according to [60]) and NATs all of whose hosts have the same platform.

4Compared to previous studies of NAT usage, notably [60] and [25], this finding is more in line with the latter. Note that our vantage point – from the perspective of a Web site – is also closer to [25] than to [60].
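As an illustration of the flagging heuristic above, the sketch below assumes the User-Agent headers have already been parsed offline into (client IP, OS string, browser string) tuples; this record layout and the sample values are assumptions for the example, not the actual instrumentation.

from collections import defaultdict

def flag_possible_middleboxes(records):
    """records: iterable of (client_ip, os_string, browser_string) tuples.
    An IP is flagged as a possible middle-box if it exhibits >= 2 distinct
    OS/OS-version strings or >= 3 distinct browser/browser-version strings,
    mirroring the heuristic described in the text."""
    os_seen = defaultdict(set)
    browsers_seen = defaultdict(set)
    for ip, os_string, browser_string in records:
        os_seen[ip].add(os_string)
        browsers_seen[ip].add(browser_string)
    return {ip for ip in os_seen
            if len(os_seen[ip]) >= 2 or len(browsers_seen[ip]) >= 3}

# Hypothetical log tuples:
sample = [("10.0.0.1", "Windows 7", "IE 9"),
          ("10.0.0.1", "Mac OS X 10.6", "Safari 5"),
          ("10.0.0.2", "Windows XP", "Firefox 3.6")]
print(flag_possible_middleboxes(sample))   # {'10.0.0.1'}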

An important question from a CDN perspective is whether these configurations deviate from the "regular" LDNS cluster behavior, in which case they might need special treatment in DNS-based demand distribution. For example, a proxy acting as its own LDNS might show up as a small single-client cluster yet impose a disproportionately high load on a CDN node as a result of a single server-selection decision by the CDN.


Figure 2.18: Cluster size distribution of LDNS groups.

Figure 2.18 compares the cluster sizes of self-served and other LDNSs. It shows that self-served LDNS clusters are much smaller than other clusters; in fact, they overwhelmingly contain only one client IP address. This finding is consistent with the behavior expected from a middle-box-fronted network. A more revealing finding is displayed in Figure 2.19, which compares the activity (in terms of sub1 requests) of the self-served LDNSs with other groups. The figure shows that the self-served LDNS clusters exhibit lower activity levels than the not-act-like-clients clusters. Thus, while middleboxes aggregate demand from several hosts behind a single IP address, these middleboxes seem to predominantly front small networks – smaller than other LDNS clusters.


Figure 2.19: The number of sub1 requests issued by LDNSs of different types.

To confirm the presence of demand aggregation in self-served LDNS clusters,

Figure 2.20 factors out the difference in cluster sizes and compares the activity of self-served and not-self-served LDNSs only for One2One clusters. There were 105,367 LDNSs/clients in the One2One self-served group and 27,640 in the One2One not-self-served group. Figure 2.20 shows that the One2One self-served LDNSs are in general indeed more active than the not-self-served LDNSs. For instance, 66% of the not-self-served LDNSs issued a single request, while only 46% of the self-served LDNSs did so. This increased activity of the self-served LDNSs is consistent with moderate aggregation of hosts behind a middle-box.

In summary, we found a large number of LDNSs operating from within middle-box-fronted networks – they are either behind the middleboxes or operated by the middleboxes themselves. However, while these LDNSs exhibit distinct demand aggregation, their clusters are, if anything, less active than other clusters. Thus, a middle-box-fronted LDNS in itself does not seem to be an indication for separate treatment

in DNS-based request routing.

Figure 2.20: Number of sub1 requests issued by One2One LDNSs.

2.9.2 LDNS Pools

We now consider another interesting behavior. As a reminder, our sub* DNS interactions start with a sub1 request issued by the LDNS to our setup, to which we reply with a sub2 CNAME, forcing the LDNS to send another query, this time for sub2. However, we observed occurrences in our traces where these two consecutive queries

(which we can attribute to the same interaction because both embed the same client IP address) came from different LDNS servers. In other words, even though we sent our CNAME response to one LDNS, we got the subsequent sub2 query from a different LDNS. Note that this phenomenon is distinct from the resolver clusters mentioned in [55]. Indeed, those resolver sets arise when clients (or ISPs on their behalf) load-balance their original DNS queries among multiple resolvers – the measurements mentioned do not consider which resolvers might handle CNAME redirections. In



contrast, in the behavior discussed here, CNAME redirections arrive from different IP addresses. Such behavior could be caused by an LDNS server with multiple Ethernet ports (in which case the server might select different ports for different queries), or by a load-balancing LDNS server farm with shared state. An example of such a configuration, hinted at by Google in [40], is shown in Figure 2.21, where two distinct layers of LDNS servers face, respectively, clients and ADNSs, and the ADNS-facing LDNSs are not recursive. Here, client-facing servers load-balance their queries among ADNS-facing servers based on a hash of the queried hostname; ADNS-facing servers send CNAME responses back to the client-facing server, which forwards the subsequent query to a different ADNS-facing server because the hostname differs. In this dissertation we will call such behavior – for the lack of a better term – the multiport behavior, and we will refer to LDNS IP addresses that appear together within the same interaction as an LDNS pool, to indicate that they belong to the same multiport host or server farm.

Figure 2.21: LDNS Pool

In an attempt to remove fringe scenarios involving rare timeout combinations,

we only considered LDNSs L1 and L2 to be part of a pool if (1) the sub1 request for

a given client came from L1 while sub2 request for the same client came from L2; and

(2) the sub2 request from L2 came within one second of the sub1 request from L1.
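To make the filter concrete, here is a minimal sketch of the pairing logic under the one-second rule; the record format (client identifier embedded in the queried hostname, LDNS IP, timestamp) is an assumed representation of the DNS log, not the actual measurement code.

def find_pool_pairs(sub1_events, sub2_events, window=1.0):
    """sub1_events / sub2_events: lists of (client_ip, ldns_ip, timestamp) tuples,
    where client_ip is the client identifier embedded in the sub1/sub2 hostnames.
    Returns the set of (L1, L2) LDNS IP pairs satisfying the pool filter: the sub2
    query came from a different LDNS than the sub1 query for the same client, and
    arrived within `window` seconds of it."""
    pairs = set()
    # Index sub1 events by client for quick lookup.
    sub1_by_client = {}
    for client, ldns, ts in sub1_events:
        sub1_by_client.setdefault(client, []).append((ldns, ts))
    for client, l2, ts2 in sub2_events:
        for l1, ts1 in sub1_by_client.get(client, []):
            if l1 != l2 and 0 <= ts2 - ts1 <= window:
                pairs.add((l1, l2))
    return pairs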

Using the above filter, we consider the prevalence of multiport behavior. We found 5,105,467 cases of such behavior, representing 407,303 unique LDNS multiport IP address pairs and involving 36,485 unique LDNS IP addresses, or 13% of all LDNSs in our trace. Furthermore, 1,924,359 clients (17% of all clients) were found to be directly involved in LDNS multiport behavior (i.e., observed to have sub1 and sub2 requests within the same interaction coming from different LDNS IP addresses), and over 10M clients – 90% of all the clients in our trace – were associated at some point with an LDNS belonging to a pool. Overall, the 13% of LDNSs with multiport behavior were the busiest – they were responsible for over 90% of both sub* queries and subsequent HTTP requests. We conclude that multiport behavior is rather common in today's Internet. Such a significant occurrence of LDNS pools warrants a closer look at this phenomenon, as it may have important implications for DNS-based request routing. Indeed, if LDNS pools always pick a random LDNS server to forward a given query, the entire pool and all clients associated with any of its member LDNSs should be treated as a single cluster. If, however, the LDNS pools attempt to preserve client affinity when selecting LDNS servers (i.e., if the same client tends to be assigned the same LDNS for the same hostname resolution, as would be the case with hash-based assignment), then individual LDNSs in the pool and the clients associated with them could be treated as separate LDNS clusters. A careful investigation of LDNS pools is an open issue for future work.

2.10 Implications for Web Content Delivery

This section summarizes the implications of our findings for Web platforms that employ DNS-based CDNs. Obviously, these lessons were derived from the study of one busy consumer-oriented Web site. While we believe this Web site is typical of similar informational sites, sites of different nature may need to re-evaluate these lessons, in which case our study can serve as a blueprint for such assessment. The implications discussed here are necessarily qualitative; they follow logically from our findings but each would have to be carefully evaluated in a separate study in the specific target environment.

First, despite a long-held concern about the hidden load problem of DNS-based CDNs, this is not a serious issue in practice for all but a small fraction of local DNS servers. For most LDNSs, the amount of hidden load – while different from one LDNS to the next – appears small enough to provide sufficiently fine granularity for load distribution. Thus, proper request routing could achieve a desired load distribution without elaborate specialized mechanisms for dealing with hidden load such as [28, 29]. Second, due to their relatively small number, the exceptions to the above finding ("elephant" LDNS clusters) can be identified, tracked, and treated separately, perhaps even by semi-automated policy configuration. This is especially true for the very largest elephants, as they appear to be geographically compact: even though these clusters contain tens of thousands of clients, their clients are mostly situated within a hundred miles of their LDNS. Thus, these clusters both benefit significantly from being served from a proximally optimal location in the platform and are not amenable to being shifted between locations using DNS resolution, due to their large hidden load. More fine-grained demand distribution techniques, such as L4–L7 load balancers or HTTP or RTP redirection, might be needed. Third, there is a large variation in the compactness of LDNS clusters, both

in terms of the geographical distribution of their clients and the autonomous system sharing between the clients and the LDNSs. This provides rich opportunities for improved request routing policies. For instance, the ADNS of the platform can try to "pin" compact clusters to be served from the respective optimal locations in the platform, while resolving any load imbalances within the global platform by re-routing requests from non-compact clusters to the extent possible. The specific policies must be worked out; however, the amount of diversity in cluster compactness across a large range of cluster sizes makes this a promising avenue for improving the efficiency of a Web platform.

Finally, there has been a shift in client-side DNS setup. The traditional model of a stub resolver at a client host talking to a local recursive DNS server, which interacts with the rest of the DNS infrastructure, no longer applies to vast numbers of clients. Many clients appear to be behind middleboxes, which masquerade as both a client and its LDNS to the rest of the Internet. Also common are complex setups involving layers of resolvers with shared state, which we called "LDNS pools". While we find no evidence that the former setup requires special treatment from a Web platform, the implications of the wide deployment of LDNS pools are another direction for further investigation.

2.11 Summary

In this chapter, we present a study of the properties of LDNS clusters from the perspective of their effect on client request routing. In terms of the LDNS cluster size, the study shows that an overwhelming majority of the LDNS clusters, even as seen from a very high-volume web site, are very small, posing no issue with respect to hidden load. However, despite recent trends transforming busy resolvers into complex distributed infrastructures (e.g., anycast-based platforms such as [65, 40]), there remain

a few "elephant" LDNS clusters.5 Thus, a DNS-based request routing system may benefit by tracking the elephant LDNS clusters and treating them differently. We also report on the geographical and autonomous system (AS) span of LDNS clusters and show that they differ widely in this respect. Furthermore, the extent of this span does not correlate with cluster size: the busiest clusters are very compact geographically but not in terms of AS sharing. Thus, a DNS-based request routing system can benefit by treating LDNS clusters differently depending on a combination of their size and compactness: when there is a need to rebalance server load, the system may re-route requests from non-compact clusters first because they benefit less from proximity-sensitive routing anyway. In this study we also show that a large number of IPs act as both Web clients

and their own LDNSs. We find evidence that much of this phenomenon is explained by the presence of middle-boxes (NATs, firewalls, and web proxies). However, although they aggregate traffic from multiple hosts, these clusters exhibit, if anything, lower activity. Hence this aspect by itself does not appear to warrant special treatment from the request routing system.

Finally, this study provides strong evidence of LDNS pools with shared cache, where a set of “worker servers” shares work for resolving clients’ queries. While the implications of this behavior for network control remain unclear, the prevalence of this LDNS behavior warrants a careful future study.

5Note that our measurement setup is able to distinguish clients behind individual resolver nodes in these platforms as distinct clusters, so we do not conflate these platforms into an elephant cluster.

Chapter 3

A Practical Architecture for an Anycast CDN

3.1 Introduction

As we mentioned earlier, most commercial CDNs make use of a DNS-based redirection mechanism to perform server selection among their nodes. However, as Chapter 2 showed, DNS-based redirection exhibits several well-known limitations, among which are the originator problem and the hidden load problem.

Another problem that DNS-based request routing systems encounter is that the DNS system was not designed for very dynamic changes in the mapping between hostnames and IP addresses. As a consequence, the LDNS server can cache and reuse its DNS query responses for a certain period of time and for multiple clients. This complicates load distribution decisions for the CDN by limiting the granularity of its control over load balancing. This problem can be mitigated significantly by having the DNS system make use of very short time-to-live (TTL) values, which control the extent of the DNS response reuse. However, a rather common practice of caching

DNS responses by local DNS servers, and especially by certain browsers, beyond the specified

TTL means that this remains an issue [67]. Furthermore, DNS-based redirection assumes that the CDN explicitly selects a nearby CDN node for the originator of a given DNS query; knowing the distance between any two IP addresses on the Internet requires a complex measurement infrastructure.

In the previous chapter of this dissertation we also investigated clusters of hosts sharing the same LDNS server ("LDNS clusters"). We found that, of the two fundamental issues in DNS-based request routing systems – hidden load and

client-LDNS distance – hidden load plays an appreciable role only for a small number of "elephant" LDNS servers, while the client-LDNS distance is significant in many cases. We also found that LDNS clusters vary widely in both characteristics and size.

Thus, a request routing system such as a content delivery network can attempt to balance load by reassigning non-compact LDNSs first, as their clients benefit less from proximity-sensitive routing. In this chapter we revisit IP anycast as a redirection technique. Anycast request routing seems to fit seamlessly into the existing Internet routing mechanisms. From a CDN point of view, however, there are a number of problems with IP anycast redirection. First, because it is tightly coupled with the IP routing apparatus, any routing change that causes anycast traffic to be re-routed to an alternative instance of the destination IP anycast address may cause a session reset for any session-based traffic such as TCP. Second, because the IP routing infrastructure only deals with connectivity, and not with the quality of service achieved along those routes, IP anycast likewise is unaware of and cannot react to network conditions. Third, IP anycast is similarly not aware of any server (CDN node) load, and therefore cannot react to node overload conditions. For these reasons, IP anycast was originally not considered a viable approach as a CDN redirection mechanism.

Our revisiting of IP anycast as a redirection mechanism for CDNs was prompted by two recent developments. First, route control mechanisms have recently been developed that allow route selection within a given autonomous system to be informed by external intelligence [87, 36, 88]. Second, recent anycast-based measurement work [16] shed light on the behavior of IP anycast, as well as the appropriate way to deploy IP anycast to facilitate proximal routing. Based on these developments, we present a design of a practical load-aware IP anycast CDN architecture for the case when the CDN is deployed within a single global network, such as AT&T's ICDS content delivery service [15]. When anycast end-points are within the same network provider, the route control mechanism can install a route from a given network entry point to the anycast end-point deemed the most appropriate for this entry point; in particular, both CDN node load and internal network conditions can be taken into account. This addresses the load-awareness concern and, in part, the route quality-of-service concern – although the latter only within the provider domain. Route control also deals with the concern about resetting sessions because of route changes. We note that in practice there are two aspects of this problem: (i) route changes within a network that deploys

IP anycast addresses and (ii) route changes outside of the network that deploys anycast IP addresses. Route control mechanisms can easily deal with the first aspect, preventing unnecessary switching between anycast addresses within the network. As for route changes outside of the IP anycast network, the study in [16] has shown that most IP prefixes exhibit very good affinity, i.e., they would be routed along the same path towards the anycast-enabled network. An anycast CDN is free of these limitations of DNS-based CDNs: it redirects actual client demand rather than local DNS servers and thus is not affected by the distance between eyeballs and their local DNS servers; it is not impacted by DNS caching; and it obviates the need for determining proximity between CDN nodes and

external destinations. Moreover, the approach presented here addresses the flaws of using anycast in the CDN context discussed earlier. However, there are further potential limitations. First, anycast delivers client requests to the nearest entry point of the CDN network with regard to the forward path from the client to the CDN network. However, due to route asymmetry, this may not produce the optimal reverse path used by response packets. Second, while route control can effectively account for network conditions inside the CDN's autonomous system, external parts of the routes are purely the result of BGP routing and thus cannot be controlled, e.g., to avoid congestion in the external network. An obvious question, which we could not answer with the data available, is the end-to-end performance comparison between the anycast and DNS CDNs. Our contribution is rather to make the case that, contrary to the common view, an anycast CDN is a viable approach to building a content delivery platform and that it improves the operation of the CDN in comparison with an existing DNS-based approach within the CDN's network. The key aspects of this contribution are as follows:

• We present a practical anycast CDN architecture that utilizes server and network load feedback to drive route control mechanisms to realize CDN redirection (Section 3.3).

• We formulate the required load-balancing algorithm as a Generalized Assignment Problem and present practical algorithms for this NP-hard problem that take into consideration the practical constraints of a CDN (Section 3.4).

• Using server logs from an operational production CDN (Section 3.5), we evaluate our algorithms by trace-driven simulation and illustrate their benefit by comparing with native IP anycast and an idealized load-balancing algorithm, as well as with the current DNS-based approach (Section 3.6).

3.2 Related Work

IP anycast has been used in some components of content delivery. In particular, the Limelight CDN [58] utilizes anycast to route DNS queries to its DNS servers: each data center has its own DNS server, with all DNS servers sharing the same address. Whenever a DNS query arrives at a given DNS server, the server resolves it to an

edge server co-located in the same data center. Thus, even though edge servers use unicast addresses, Limelight sidesteps the need to determine the nearest data center for a client, leveraging the underlying BGP routing fabric for data center selection. However, similar to the DNS-based CDNs, clients are still directed to the data center

that is nearest to the client's DNS server and not to the client itself. CacheFly [21] is, to our knowledge, the first CDN utilizing anycast technology for the content download itself. Our approach targets a different CDN environment: while CacheFly follows the co-location approach with edge servers obtaining connectivity from multiple ISPs, we assume a single-AS CDN where the operator has control over intra-platform routing. No information on which (if any) load reassignment mechanisms CacheFly uses is available. Utilizing IPv6 for anycast request routing in CDNs has been independently proposed in [83, 3]. Our work shows that anycast can be used for CDN content delivery even in current IPv4 networks. We formulate our load balancing problem as a generalized assignment problem (GAP). One related problem is the multiple knapsack problem (MKP), where we are given a set of items with different sizes and profits, and the objective is to find a subset of items that allows a feasible packing into the bins without violating bin capacities while maximizing the total profit. Chekuri and Khanna [26] present a polynomial-time approximation scheme (PTAS) for this problem. MKP is a special case in which the profit of an item is the same for all bins; it cannot capture our setting, since the cost of serving a request in a CDN varies with the server. Aggarwal,

Motwani, and Zhu [5] consider the problem of load rebalancing. Given the current assignment of request-server pairs, they focus on minimizing the completion time of queued requests by moving up to k requests to different servers and present a linear-time 1.5-approximation algorithm for this NP-hard problem. While the limited amount of rebalancing is relevant to our case to reduce ongoing session disruptions, our work has a different objective of maintaining server load under capacity. Another related problem is the facility location problem, where the goal is to select a subset of potential sites at which to open facilities and to minimize the sum of the request service costs and the facility opening costs [79]. This problem is more relevant at the provisioning time scale, when we can determine where to place CDN servers for a content group. In our setting, we are given a set of CDN servers and load-balance between them without violating the capacity constraints.

Current CDNs predominantly use DNS-based load balancing, and a number of load-balancing algorithms for this environment have been proposed in research [56, 29, 23, 71, 19] and made available in commercial products [75, 27, 2]. Since load-balancing is done at the application layer, these algorithms are able to make load balancing decisions at the granularity of individual DNS requests. For example,

[19] uses a simple algorithm of resolving each request to the nearest non-overloaded server, while [29] proposes intricate variations in DNS response TTL to control the amount of load directed to the server. These algorithms are not applicable in our environment where load-balancing decisions are at the drastically coarser granularity of the entire PEs. Content delivery networks can benefit from peer-to-peer content sharing, which can be used to share cached content either among CDN servers (and thus reduce the need to forward requests to the origin server) [37] or among users’ local caches directly

[51]. These approaches are complementary to and can be used in conjunction with our architecture. There has also been a rise in the use of peer-to-peer content delivery

as an alternative to traditional content delivery networks, with various P2P platforms providing examples of this approach. The general scalability of this style of content delivery is considered in [81]. Our work targets traditional CDNs, which offer their subscribers content delivery from a dedicated, commercially operated platform with tight control and usage reporting. Our architecture uses IP anycast to route HTTP requests to edge servers, with a subsequent HTTP redirection of requests for particularly large downloads. Our parallel work addresses the increased penalty of disrupted connections in CDNs that deliver streaming content and very large objects [8]. That work proposes to induce connection disruption as a way to reassign a client to a different edge server if load conditions change during a long-running download. That work is complementary to our present approach: the latter can use this technique instead of HTTP redirects to deal with long-running downloads.

3.3 Architecture

In this section we first describe the workings of a load-aware anycast CDN and briefly discuss the pros and cons of this approach vis-a-vis more conventional CDN architectures. We also give an informal description of the load balancing algorithm required for our approach before describing it more formally in later sections.

3.3.1 Load-aware Anycast CDN

Figure 3.1 shows a simplified view of a load-aware anycast CDN. We assume a single autonomous system (AS) in which IP anycast is used to reach a set of CDN nodes distributed within the AS. For simplicity we show two such CDN nodes, A and B in

Figure 3.1. In the rest of the chapter, we use the terms "CDN node" and "content server" interchangeably. We further assume that the AS in question has a large footprint in the country or region in which it will be providing CDN service; for example, in the US, Tier-1 ISPs have this kind of footprint1. This dissertation investigates the synergistic benefits of having control over the PEs of a CDN. We note that these assumptions are both practical and, more importantly, supported by a recent study of IP anycast [16], which has shown this to be the ideal type of deployment to ensure good proximity properties2.

Figure 3.1: Load-aware Anycast CDN Architecture

Figure 3.1 also shows the route controller component that is central to our

approach [87, 88]. The route controller activates routes with provider edge (PE) routers in the CDN provider network. As described in [87], this mechanism involves pre-installed MPLS tunnels for a destination IP address (an anycast address in our case) from each PE to every other PE. Thus, to activate a route from a given PE, PEi, to another PE, PEj, the controller only needs to signal PEi to start using the appropriate MPLS label. In particular, a route change does not involve any other routers and in this sense is an atomic operation.

1http://www.business.att.com, http://www.level3.com
2Note that while our focus in this work is on anycast CDNs, we recognize that these conditions cannot always be met in all regions where a CDN provider might provide services, which suggests that a combination of redirection approaches might be appropriate.

The route controller can use this mechanism to influence the anycast routes selected by the ingress PEs. For example, in Figure 3.1, to direct packets entering

through PE PE1 to CDN node B, the route controller would signal PE1 to activate

the MPLS tunnel from PE1 to PE5; to send these packets to node A instead, the route

controller would similarly activate the tunnel from PE1 to PE0. For our purposes, the route controller takes as inputs the ingress load from the PEs at the edge of the network, the server load from the CDN nodes for which it is performing redirection, and the cost matrix of reaching a given CDN server from a given PE, and it computes the routes according to the algorithms described in Section 3.4. The load-aware anycast CDN then functions as follows (with reference to Figure 3.1): all CDN nodes that are configured to serve the same content (A and B) advertise the same IP anycast address into the network via BGP (respectively through

PE0 and PE5). PE0 and PE5 in turn advertise the anycast address to the route controller, which is responsible for advertising the (appropriate) route to all other PEs in the network (PE1 to PE4). These PEs in turn advertise the route via eBGP sessions with peering routers (PEa to PEd) in neighboring networks so that the anycast address becomes reachable throughout the Internet (represented in the figure by access networks I and II). Request traffic for content on a CDN node will follow the reverse path. Thus, a request will come from an access network and enter the CDN provider network via one of the ingress routers PE1 to PE4. In the simple setup depicted in Figure 3.1, such request traffic will then be forwarded to either PE0 or PE5 en route to one of the CDN nodes. Based on the two load feeds (ingress PE load and server load) provided to the route controller, it can decide which ingress PE (PE1 to PE4) to direct to which egress

PE (PE0 or PE5). By assigning different PEs to appropriate CDN nodes, the route controller can minimize the network costs of processing the demand and distribute

the load among the CDN nodes. In summary, our approach utilizes the BGP-based proximity property of IP anycast to deliver clients' packets to the nearest ingress PEs. The external portions of the paths of anycast packets are determined purely by inter-AS BGP routes. Once packets enter the provider network, it is the route controller that decides where these packets will be delivered, by mapping ingress PEs to content servers. The route controller makes these decisions taking into account both the network proximity of the internal routes and the server loads.
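As a toy illustration of the route controller's decision (and not the algorithms of Section 3.4, which replace this greedy choice), the sketch below maps each ingress PE to the closest CDN node that still has spare capacity; the function name and the dictionary-based data layout are assumptions for the example.

def assign_ingress_pes(pe_load, node_capacity, cost):
    """pe_load: dict ingress PE -> offered load (concurrent requests).
    node_capacity: dict CDN node -> capacity.
    cost: dict (node, pe) -> cost of serving that PE from that node
          (e.g., proportional to internal distance).
    Returns dict ingress PE -> CDN node; the route controller would then
    activate the corresponding MPLS tunnel for each ingress PE."""
    residual = dict(node_capacity)
    mapping = {}
    # Serve the heaviest ingress PEs first so they get their closest node.
    for pe in sorted(pe_load, key=pe_load.get, reverse=True):
        candidates = [n for n in residual if residual[n] >= pe_load[pe]]
        if not candidates:                 # no node has room: fall back to all nodes
            candidates = list(residual)
        node = min(candidates, key=lambda n: cost[(n, pe)])
        mapping[pe] = node
        residual[node] -= pe_load[pe]
    return mapping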

3.3.2 Objectives and Benefits

We can summarize the goals of the architecture described above as follows: (i) to utilize the natural IP anycast proximity properties to reduce the distance traffic is carried towards the CDN's ISP; (ii) to react to overload conditions on CDN servers by steering traffic to alternative CDN servers; and (iii) to minimize the disruption of traffic that results when ongoing sessions are re-mapped to alternative CDN servers. Note that this means that per-server "load balancing" is not a specific goal of the algorithm: while CDN servers are operating within acceptable engineering loads, the algorithm should not attempt to balance the load. On the other hand, when overload conditions are reached, the system should react to deal with them without compromising proximity. A major advantage of our approach over DNS-based redirection systems is that the actual eyeball request is redirected, as opposed to the local-DNS request. Further, with load-aware anycast, any redirection changes take effect very quickly, because PEs immediately start to route packets based on their updated routing tables. In contrast, DNS caching by clients (despite short TTLs) typically results in some delay before redirection changes have an effect. The granularity of load distribution offered by our route control approach is

at the PE level. For large tier-1 ISPs, the number of PEs is typically in the high hundreds to low thousands. A possible concern for our approach is whether PE granularity will be sufficiently fine-grained to adjust load in cases of congestion. Our results in Section 3.6 indicate that even with PE-level granularity we can achieve significant performance benefits in practice.

Figure 3.2: Application-level redirection for long-lived sessions

Obviously, with enough capacity, no load balancing would ever be required. However, a practical platform needs to have load-balancing ability to cope with un-

expected events such as flash crowds and node failures, and to flexibly react to even more gradual demand changes, because building up the physical capacity of the platform is a very coarse-grained procedure. Our experiments will show that our architecture can achieve effective load balancing even under constrained resource provisioning. Before we describe and evaluate redirection algorithms that fulfill these goals,

we briefly describe two other CDN-related functions enabled by our architecture that are not further elaborated upon in this dissertation.

3.3.3 Dealing with Long-Lived Sessions

Despite increased distribution of rich media content via the Internet, the average Web

object size remains relatively small [54]. This means that download sessions for such Web objects will be relatively short-lived, with little chance of being impacted by any anycast re-mappings in our architecture. The same is, however, not true for long-lived sessions, e.g., streaming or large file downloads [86]. (Both of these expectations are validated by our analysis of connection disruption counts in Section 3.6.)

In our architecture we deal with this by making use of an additional application-level redirection mechanism after a particular CDN node has been selected via our load-aware IP anycast redirection. This interaction is depicted in Figure 3.2. As before, an eyeball will perform a DNS request, which will be resolved to an IP anycast

address (i and ii). The eyeball will attempt to request the content using this address (iii); however, the CDN node will respond with an application-level redirect (iv) [85] containing a unicast IP address associated with this CDN node, which the eyeball will use to retrieve the content (v). This unicast address is associated only with this

CDN node, and the eyeball will therefore continue to be serviced by the same node regardless of routing changes along the way. While the additional overhead associated with application-level redirection is clearly unacceptable when downloading small Web objects, it is less of a concern for long-lived sessions where the startup overhead is

amortized. In parallel work, an alternative approach was proposed to handle extremely large downloads using anycast without relying on HTTP redirection [8]. Instead, the approach in [8] recovers from any disruption by reissuing the HTTP request for the remainder of the object as an HTTP range request. The CDN can then trigger these disruptions intentionally to switch the user to a different server mid-stream if conditions change. However, that approach requires a browser extension. Recently, some CDNs have started moving into the utility (also known as cloud)

computing arena by deploying applications at the CDN nodes. In this environment, applications often form long-lived sessions that encompass multiple HTTP requests, with individual requests requiring the entire session state to execute correctly. Commercial application servers, including both WebLogic and WebSphere, allow servers to

form a wide-area cluster where each server in the cluster can obtain the session state after successfully receiving any HTTP request in a session. Based on this feature, our approach for using anycast for request redirection can apply to this emerging CDN environment.
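Returning to the application-level redirect step (iv) above, the following is a minimal, illustrative sketch of a handler on the anycast address that answers requests for long-lived downloads with an HTTP 302 pointing at a unicast name of the same node; the hostname and path prefix are made up for the example and are not part of the actual system.

from http.server import BaseHTTPRequestHandler, HTTPServer

UNICAST_HOST = "node-a.cdn.example.com"   # hypothetical unicast name of this CDN node
LARGE_PATH_PREFIX = "/downloads/"          # hypothetical marker for long-lived sessions

class RedirectingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path.startswith(LARGE_PATH_PREFIX):
            # Long-lived download: pin the client to this node via its unicast address.
            self.send_response(302)
            self.send_header("Location", "http://%s%s" % (UNICAST_HOST, self.path))
            self.end_headers()
        else:
            # Small object: serve directly over the anycast address.
            body = b"small object served over anycast\n"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("", 8080), RedirectingHandler).serve_forever()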

3.3.4 Dealing with Network Congestion

As described above, the load-aware CDN architecture only takes server load into account in terms of being "load-aware". (In other words, the approach uses network load information to manage server load, but does not attempt to steer traffic away from network hotspots.) The Route Control architecture, however, does

allow for such traffic steering [87]. For example, outgoing congested peering links can be avoided by redirecting response traffic on the PE connecting to the CDN node

(e.g., PE0 in Figure 3.1), while congested incoming peering links can be avoided by exchanging BGP Multi-Exit Discriminator (MED) attributes with appropriate peers [87]. We leave the full development of these mechanisms for future work.

3.4 Remapping Algorithm

The algorithm for assigning PEs to CDN nodes has two main objectives. First, we want to minimize the service disruption due to load balancing. Second, we want to minimize the network cost of serving requests without violating server capacity constraints. In this section, after presenting an algorithm that minimizes the network cost, we describe how we use the algorithm to minimize service disruption.

3.4.1 Problem Formulation

Our system has m servers, where each server i can serve up to Si concurrent requests. A request enters the system through one of n ingress PEs, and each ingress PE j contributes rj concurrent requests. We consider a cost matrix cij for serving PE j at server i. Since cij is typically proportional to the distance between server i and PE j

as well as the traffic volume rj, the cost of serving PE j typically varies with different servers.

The first objective we consider is to minimize the overall cost without violating the capacity constraint at each server. The problem is called Generalized Assignment Problem (GAP) and can be formulated as the following integer linear optimization problem [78].

\[
\begin{aligned}
\text{minimize}\quad & \sum_{i=1}^{m}\sum_{j=1}^{n} c_{ij}\,x_{ij} \\
\text{subject to}\quad & \sum_{i=1}^{m} x_{ij} = 1, \qquad \forall j \\
& \sum_{j=1}^{n} r_j\,x_{ij} \le S_i, \qquad \forall i \\
& x_{ij} \in \{0,1\}, \qquad \forall i,\, j
\end{aligned}
\]

where the indicator variable xij = 1 iff server i serves PE j, and xij = 0 otherwise. Note that the above formulation reflects our "provider-centric" perspective, with the focus on minimizing the costs borne by the network operator. In particular, the model favors

overall cost reduction even if this means redirecting some load to a far-away server. In principle, one could bound the proximity degradation for any request by adding a constraint that no PE be assigned to a content server more than k times farther away than the closest server. In practice, however, as we will see later (Figure 3.9), the penalty for a vast majority of requests is very small relative to the current system.
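To make the formulation concrete, the following small sketch evaluates a candidate integer assignment against the GAP objective and the capacity constraints; it assumes dictionary-based inputs and is not part of the st-algorithm itself.

def evaluate_assignment(assign, cost, load, capacity):
    """assign: dict PE j -> server i (an integer solution x).
    cost:     dict (i, j) -> c_ij.
    load:     dict j -> r_j (concurrent requests entering through PE j).
    capacity: dict i -> S_i.
    Returns (total_cost, list of servers whose capacity is violated)."""
    total_cost = sum(cost[(i, j)] for j, i in assign.items())
    server_load = {i: 0 for i in capacity}
    for j, i in assign.items():
        server_load[i] += load[j]
    overloaded = [i for i, l in server_load.items() if l > capacity[i]]
    return total_cost, overloaded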

When xij is an integer, finding an optimal solution to GAP is NP-hard, and

even when Si is the same for all servers, no polynomial algorithm can achieve an approximation ratio better than 1.5 unless P=NP [78]. Recall that an α-approximation algorithm always finds a solution that is guaranteed to be at most α times the optimum.

Shmoys and Tardos [78] present an approximation algorithm (called st-algorithm in this dissertation) for GAP, which involves a relaxation of the integrality constraint and a rounding based on a fractional solution to the LP relaxation. It first obtains the initial total cost value C using linear programming optimization, by removing the

restriction that xij be integer (in which case the above problem formulation becomes an LP optimization problem). Then, using a rounding scheme based on the fractional solution, the algorithm finds an integer solution whose total cost is at most C and

the load on each server is at most Si + max_j rj. The st-algorithm forms the basis for the traffic control decisions in our approach, as discussed in the rest of this section. In our approach, the route controller periodically re-examines the PE-to-server assignments and computes a new assignment if necessary using a remapping algorithm. We call the period between consecutive runs of the mapping algorithm the remapping interval. We explore two remapping algorithms: one that attempts to minimize the

cost of processing the demand from the clients (thus always giving cost reduction a priority over connection disruption), and the other that attempts to minimize the connection disruptions even if this leads to cost increase.

3.4.2 Minimizing Cost

The remapping algorithm for minimizing costs is shown in pseudocode as Al-

gorithm 1. It begins by running what we refer to as the expanded st-algorithm to try to find a feasible solution as follows. We first run the st-algorithm with the given server capacity constraints, and if it cannot find a solution (which means the load is too high to be satisfied within the capacity constraints at any cost), we increase the capacity

Algorithm 1 Minimum Cost Algorithm
INPUT: CurrentLoad[i], OfferedLoad[j], Cost[i][j] for each server i and PE j
  Run expanded st-algorithm
  {Post processing}
  repeat
    Find the most overloaded server i; let Pi be the set of PEs served by i
    Map PEs from Pi to i, starting from the largest OfferedLoad, until i reaches its capacity
    Remap(i, {the set of still-unmapped PEs from Pi})
  until none of the servers is overloaded OR no further remapping would help
  return

Subroutine Remap(Server i, PE set F):
  for all j in F, in descending order of OfferedLoad do
    Find server q = argmin_k Cost[k][j] with enough residual capacity for OfferedLoad[j]
    Find server t with the highest residual capacity
    if q exists and q != i then
      Remap j to q
    else
      Map j to t    {t is less overloaded than i}
    end if
  end for

of each server by 10% and try to find a solution again.3 In our experiments, we set the maximum number of tries at 15, after which we give up on computing a new remapping and retain the existing mapping scheme for the next remapping interval. However, in our experiments, the algorithm found a solution in all cases and never skipped a remapping cycle. Note that the st-algorithm can lead to server overload (even relative to the increased server capacities), although the overload amount is bounded by max rj. In practice, the overload volume can be significant, since a single PE can contribute a large request load (e.g., 20% of server capacity). Thus, we use the following post-processing on the solution of the st-algorithm to find a feasible solution without violating the (possibly increased) server capacities. We first identify the most overloaded server i, and then among all the PEs

3Note that the capacity constraints are just parameters in the algorithm and in practice assigned to be less than the physical capacity of the servers.

served by i, find the set of PEs F (starting from the least-loaded PE) such that server i's load falls below the capacity Si after off-loading F. Then, starting with the highest-load PEs among F, we offload each PE j to a server with enough residual

capacity q, as long as the load on server i is above Si. (If there are multiple such servers for j, we choose the one with minimum cost to j, although other strategies such as best-fit are possible.) If there is no server with enough capacity, we find server t with the highest residual capacity and see if the load on t after acquiring j is lower than the current load on i. If so, we off-load PE j to server t even when the load on

t goes beyond St, which will be fixed in a later iteration. Once the overload of server i is resolved, we repeat the whole process with the next most overloaded server. Note that the overload comparison between i and t ensures the monotonic decrease of the maximum overload in the system and therefore

termination of the algorithm – either because there are no more overloaded servers in the system or because the "repeat" post-processing loop could not further offload any of the overloaded servers.
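The post-processing step can be rendered as the following simplified sketch, which assumes dictionary-based loads, capacities, and costs and omits the st-algorithm call that produces the initial mapping; it approximates Algorithm 1's overload-resolution loop and is not the production implementation.

def resolve_overloads(mapping, load, capacity, cost):
    """mapping: dict PE j -> server i (initial assignment from the st-algorithm).
    load: dict j -> offered load; capacity: dict i -> S_i; cost: dict (i, j) -> c_ij.
    Repeatedly off-loads the most overloaded server, moving its highest-load PEs
    either to the cheapest server with enough residual capacity or, failing that,
    to the server with the most residual capacity if that strictly reduces overload."""
    def server_load(i):
        return sum(load[j] for j, s in mapping.items() if s == i)

    while True:
        over = {i: server_load(i) - capacity[i]
                for i in capacity if server_load(i) > capacity[i]}
        if not over:
            break                                  # no overloaded servers remain
        i = max(over, key=over.get)                # most overloaded server
        others = [k for k in capacity if k != i]
        pes = sorted((j for j, s in mapping.items() if s == i),
                     key=load.get, reverse=True)
        moved_any = False
        for j in pes:
            if server_load(i) <= capacity[i]:
                break
            fits = [k for k in others if capacity[k] - server_load(k) >= load[j]]
            if fits:                               # cheapest server with room
                mapping[j] = min(fits, key=lambda k: cost[(k, j)])
                moved_any = True
            elif others:
                t = max(others, key=lambda k: capacity[k] - server_load(k))
                if server_load(t) + load[j] < server_load(i):
                    mapping[j] = t                 # may overload t; fixed in a later pass
                    moved_any = True
        if not moved_any:
            break                                  # no further remapping would help
    return mapping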

3.4.3 Minimizing Connection Disruption

Algorithm 2 Minimum Disruption Algorithm
INPUT: CurrentLoad[i], OfferedLoad[j], Cost[i][j] for each server i and PE j
  Let FP be the set of PEs mapped to non-overloaded servers    {these are excluded from re-mapping}
  For every non-overloaded server i, set server capacity Si ← Si − CurrentLoad[i]
  For every overloaded server i and all PEs j currently mapped to i, set Cost[i][j] ← 0
  Run expanded st-algorithm to find a server for the PEs ∉ FP
    {this remaps only PEs currently mapped to overloaded servers, and those PEs prefer their current server}
  {Post processing}
  repeat
    Find the most overloaded server i
    Map PEs ∈ Pi \ FP to i, starting from the largest OfferedLoad, until i reaches its capacity
    Remap(i, {the set of still-unmapped PEs from Pi})
  until none of the servers is overloaded OR no further remapping would help

While the above algorithm attempts to minimize the cost, it does not take the current mapping into account and can potentially lead to a large number of connection disruptions. To address this issue, we present another algorithm, which gives connection disruption a certain priority over cost. For clarity, we start by

describing an algorithm that attempts a remapping only when there is a need to off-load one or more overloaded servers. The pseudo-code of the algorithm is shown as Algorithm 2. The algorithm divides all the servers into two groups based on load: overloaded servers and non-overloaded servers. The algorithm keeps the current mapping of the non-overloaded servers and only attempts to remap the PEs assigned to the overloaded servers. Furthermore, even for the overloaded servers, we try to retain the current mappings as much as possible. Yet for the PEs that do have to be remapped due to overloads, we would like to use the st-algorithm to minimize the costs. We manipulate the input to the st-algorithm in two ways to achieve these goals. First, for each non-overloaded server i, we consider only its residual capacity as the capacity

Si in the st-algorithm. This allows us to retain the server's current PEs while optimizing costs for newly assigned PEs. Second, for each overloaded server, we set the cost

of servicing its currently assigned PEs to zero. Thus, current PEs will be reassigned only to the extent necessary to remove the overload. As described, this algorithm reassigns PEs to different servers only in overloaded scenarios. It can lead to sub-optimal operation even when the request volume has gone down significantly and simple proximity-based routing would yield a feasible solution with a lower cost. One way to address this is to exploit the typical diurnal pattern and perform a full remapping once a day at a time of low activity (e.g., 4am every day). Another possibility is to compare the current mapping and the potential lowest-cost mapping at that point, and initiate the reassignment if the cost difference is beyond a certain threshold (e.g., 70%). Our experiments do not account for these

optimizations. To summarize, in our system, we mainly use the algorithm in Section 3.4.3 to minimize the connection disruption, while we infrequently use the algorithm in Section 3.4.2 to find an (approximate) minimum-cost solution for particular operational scenarios.
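A minimal sketch of the two input manipulations used by the disruption-minimizing algorithm (residual capacities for non-overloaded servers and zero cost for PEs already on overloaded servers), again under the assumption of dictionary-based inputs; the call into the expanded st-algorithm itself is elided.

def prepare_min_disruption_inputs(mapping, load, capacity, cost):
    """Returns (pes_to_remap, adjusted_capacity, adjusted_cost) for the
    disruption-minimizing run of the expanded st-algorithm."""
    server_load = {i: 0 for i in capacity}
    for j, i in mapping.items():
        server_load[i] += load[j]
    overloaded = {i for i in capacity if server_load[i] > capacity[i]}

    # PEs on non-overloaded servers keep their mapping and are excluded (the set FP).
    pes_to_remap = [j for j, i in mapping.items() if i in overloaded]

    # Non-overloaded servers only expose their residual capacity.
    adjusted_capacity = {
        i: capacity[i] if i in overloaded else capacity[i] - server_load[i]
        for i in capacity
    }

    # An overloaded server serves its current PEs "for free", so they are
    # reassigned only to the extent needed to remove the overload.
    adjusted_cost = dict(cost)
    for j in pes_to_remap:
        adjusted_cost[(mapping[j], j)] = 0
    return pes_to_remap, adjusted_capacity, adjusted_cost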

3.5 Evaluation Methodology

This section describes the methodology of our experimental study.

3.5.1 Data Set

We obtained two types of data sets from a production single-AS CDN: netflow datasets from its ingress PEs and Web access logs from its cache servers. The access logs were collected for a weekday in July 2007. Each log entry has detailed information about an HTTP request and response, such as the client IP, cache server IP, request size, response size, and arrival time. Depending on the logging software, some servers provide the service response time for each request in the log, while others do not.

In our experiments, we first obtain sample distributions for different response size groups based on the actual data. For log entries without response time, we choose an appropriate sample distribution (based on the response size) and use a randomly generated value following the distribution.

We use the number of concurrent requests being processed by a server as the load metric that we control in our experiments. In addition, we also evaluate the data serving rate as an indication of server load. To determine the number of concurrent requests rj coming through an ingress PE j, we look at the client and server IP pair for each log entry and use netflow data to determine where the request entered the system. We then use the request time from the log and the service response time (actual or computed

as described above) to determine whether a request is currently being served. One of our objectives is to maximize network proximity in processing client requests. In particular, because we focus on reducing the costs of the CDN's network provider, our immediate goal is to maximize network proximity and minimize network delays inside the CDN's autonomous system. Since the internal response path is always degenerate regardless of our remapping (it uses hot-potato routing to leave the AS as quickly as possible), the network proximity between the client's ingress PE and the server is determined by the request path4. Thus, we use the request path as our cost

metric reflecting the proximity of request processing. Specifically, we obtained from

the CDN the distance matrix dij between every server i and every ingress PE j in terms of air miles and used it as the cost of processing a request. While we did not have access to the full topological routing distances, the latter are known to be highly

correlated with air miles within an autonomous system, since routing anomalies within an AS are avoided. Thus, using air miles does not have any significant effect on the results and at the same time makes the results independent of a particular topology and routing algorithm. Topological routing distances, if available, could equally be used in our design.

We use the product rjdij as the cost cij of serving requests from PE j at server i.

Another input required by st-algorithm is the capacity Si of each server i. To assign server capacity, we first analyze the log to determine the maximum aggregate number of concurrent requests across all servers during the entire time period in

the log. Then, we assign each server the capacity equal to the maximum aggregate

4The proximity of the request's external path (from the client to an entry point into the CDN's AS) is further provided by IP anycast. At the same time, our focus on internal proximity may result in a suboptimal external response path, since we choose the CDN node closest to the ingress PE and the reverse path could be asymmetric. In principle, the route controller could take into account the proximity of the various CDN nodes to the clients from the perspective of the overall response path. The complication, however, is that our current architecture assumes re-pinning is done at the granularity of entire ingress PEs. Thus, any server selection decision would apply to all clients that enter the network at a given PE. Whether these clients are clustered enough in the Internet to exhibit similar proximity when reached from different CDN nodes is an interesting question for future work.

63 concurrent requests divided by the number of servers. This leads to a high-load scenario for peak time, where we have sufficient aggregate server capacity to handle all the requests but only assuming ideal load distribution. Note that server capacity is simply a parameter of the load balancing algorithm, and in practice would be specified

to be below the server’s actual processing limit. We refer to the latter as the server’s physical capacity. In most of the experiments we assume the physical capacity to be 1.6 times the server capacity parameter used by the algorithms. The CDN under study classifies content into content groups and assigns each content group to a certain set of CDN nodes. We use two such content groups for our analysis: one containing Small Web Objects, assigned to 11 CDN nodes, and the other Large File Downloads, processed by 8 CDN nodes.
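The capacity parameters can be derived from the trace as sketched below; the per-second aggregate concurrency series is assumed to be precomputed from the log, and the 1.6 slack factor is the one used in most of our experiments.

def capacity_parameters(aggregate_concurrent, num_servers, slack=1.6):
    # aggregate_concurrent: per-second totals of concurrent requests across all servers.
    peak = max(aggregate_concurrent)               # maximum aggregate concurrency
    server_capacity = peak / num_servers           # parameter given to the algorithms
    physical_capacity = slack * server_capacity    # assumed actual processing limit
    return server_capacity, physical_capacity

# For the small-object group this yields the 312-request capacity parameter and
# a physical limit of roughly 500 concurrent requests (1.6 x 312).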

3.5.2 Simulation Environment

We used CSIM [31] (http://www.mesquite.com) to perform our trace-driven simulation. CSIM creates process-oriented, discrete-event simulation models. We implemented our CDN servers as a set of facilities that provide services to requests from ingress PEs, which are implemented as CSIM processes. For each request that arrives, we determine the ingress PE j, the response time t, and the response size l. We assume that the server responds to a client at a constant rate calculated as the response size divided by the response time for that request. In other words, each request causes a server to serve data at the constant rate of l/t for t seconds. Multiple requests from the same PE j can be active simultaneously on server i. Furthermore, multiple PEs can be served by the same facility at the same time.

To allow flexibility in processing arbitrary load scenarios, we configured the CSIM facilities that model servers to have infinite capacity and very large bandwidth. We then impose capacity limits at the application level in each scenario. Excessive load is handled differently in different systems. Some systems impose access control, so that servers simply return an error response to excess requests to prevent them from affecting the remaining workload. In other systems, the excessive requests are admitted and may cause overall performance degradation. Our simulation can handle both setups. In the setup with access control, each arriving request is either passed to the simulation or dropped, depending on the current load of the destination server of its connection. In the setup without access control, we admit all requests and simply count the number of over-capacity requests. An over-capacity request is a request that, at the time of its arrival, finds the number of existing concurrent requests on the server already equal to or exceeding the server's physical capacity limit. In describing our experiments below, we will specify which of the setups the various experiments follow. In general, the number of over-capacity requests in the setup without access control will exceed the number of dropped requests in the setup with access control because, as explained above, a dropped request imposes no load on the server while an over-capacity connection contributes to server load until it is processed. However, the ultimate goal in dimensioning the system is to make the number of excess requests negligible in either setup, in which case both setups will exhibit the same behavior.

The scale of our experiments required us to perform the simulation at a time granularity of one second. To ensure that each request has a non-zero duration, we round the beginning time of a request down and its ending time up to whole seconds.
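A condensed stand-in for this admission logic and the one-second rounding is sketched below; it is not the CSIM model itself, only the two decisions described above.

import math

def round_to_seconds(start, end):
    # Round the start down and the end up so every request lasts at least 1 second.
    s, e = math.floor(start), math.ceil(end)
    return s, max(e, s + 1)

def handle_arrival(server_load, physical_capacity, access_control):
    # Return (admitted, excess): `excess` marks a request arriving at or above the
    # physical limit -- dropped with access control, admitted (but counted as
    # over-capacity) without it.
    if server_load >= physical_capacity:
        return (not access_control), True
    return True, False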

3.5.3 Schemes and Metrics for Comparison

We experiment with the following schemes and compare their performance:

• Trace Playback (pb): In this scheme we replayed all requests in the trace without any modification of server mappings. In other words, pb reflects the current CDN routing configuration.

• Simple Anycast (sac): This is “native” anycast, which represents an idealized proximity routing scheme, where each request is served at the geographically closest server.

• Simple Load Balancing (slb): This scheme employs anycast to minimize the difference in load among all servers without considering the cost.

• Advanced Load Balancing, Always (alb-a): This scheme always attempts to find a minimum cost mapping as described in Section 3.4.2.

• ALB, On-overload (alb-o): This scheme aims to minimize connection disruptions as described in Section 3.4.3. Specifically, it normally only reassigns PEs currently mapped to overloaded servers and performs a full remapping only if the cost reduction from the full remapping would exceed 70%.

In sac, each PE is statically mapped to a server, and there is no change in the mappings across the entire experiment run. slb and alb-a recalculate the mappings every ∆ seconds (the remapping interval). The initial ∆ value that we used to evaluate the different algorithms is 120 seconds. Later, in Section 3.6.5, we examine various values of ∆.
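One way to read the alb-o rule above is sketched below; min_cost_mapping and remap_overloaded are placeholders for the algorithms of Sections 3.4.2 and 3.4.3 (alb-a simply invokes min_cost_mapping every interval), and the interpretation of the 70% threshold as a cost reduction relative to the incremental remapping is our own assumption.

def remap_alb_o(mapping, loads, capacities, costs,
                min_cost_mapping, remap_overloaded, full_remap_gain=0.70):
    overloaded = [s for s, load in loads.items() if load > capacities[s]]
    if not overloaded:
        return mapping                 # nothing to relieve: keep the mapping, skip the LP
    candidate = remap_overloaded(mapping, overloaded, loads, capacities, costs)
    full = min_cost_mapping(loads, capacities, costs)
    # Adopt the full remapping only if it cuts the cost by more than 70%
    # relative to the incremental remapping (our reading of the rule above).
    if total_cost(full, costs) < (1 - full_remap_gain) * total_cost(candidate, costs):
        return full
    return candidate

def total_cost(mapping, costs):
    # Sum c_ij over the current PE-to-server assignment (mapping: PE -> server).
    return sum(costs[server][pe] for pe, server in mapping.items())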

We utilize the following metrics for performance comparison.

• Server load: We use the number of concurrent requests and service data rate at each server as measures of server load. A desirable scheme should keep the number below the capacity limit all the time.

• Request air-miles: We examine the average miles a request traverses within the CDN provider network before reaching a server as a proximity metric of content delivery within the CDN's ISP. A small value for this metric denotes small network link usage in practice.

• Disrupted connections and over-capacity requests: Another metric of redirection scheme quality is the number of disrupted connections due to re-mapping. Disruption occurs when a PE is re-mapped from server A to server B: the ongoing connections arriving from that PE may be disconnected because B may not have the connection information. Finally, we use the number of over-capacity requests as a metric to compare the ability of different schemes to prevent server overloading. A request is counted as over-capacity if it arrives at a server whose number of existing concurrent requests is already at or over the physical capacity limit.
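The disruption counter can be maintained as sketched below; the bookkeeping of which PE and server each active connection is pinned to is an assumption about the simulator's internal state (the over-capacity check was already sketched in Section 3.5.2).

def count_disruptions(active_connections, new_mapping):
    # active_connections: connection id -> (ingress PE, server currently holding it).
    # A connection is disrupted when its PE is re-mapped to a different server,
    # because the new server has no state for it.
    disrupted = 0
    for conn_id, (pe, server) in list(active_connections.items()):
        if new_mapping.get(pe) != server:
            disrupted += 1
            del active_connections[conn_id]
    return disrupted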

With our redirection schemes, a request may use a server different from the one used in the trace, and its response time may change, for example, depending on the server load or capacity. In our experiments, however, we assume that the response time of each request is the same as in the trace, no matter which server our algorithms assign it to.

3.6 Experimental Results

In this section, we present our simulation results. We first consider the server load, the number of air miles for request traffic, and the number of disrupted and over-capacity requests that result for each of the redirection schemes. In all these experiments, presented in Sections 3.6.1-3.6.3, we assume all server capacities to be the same and to equal in aggregate 100% of the maximum total number of concurrent requests in the trace (as described in Section 3.5.1). Specifically, this translates to 1900 concurrent requests per server for the large-file group and 312 concurrent requests for the small-object group. The remapping interval in these experiments is fixed at 120 seconds, and we assume there is no access control in the system (i.e., excess requests are not dropped but only counted as over-capacity requests). Section 3.6.5 investigates different remapping interval values.

3.6.1 Server Load Distribution

We first present the number of concurrent requests at each server for the large files group. For clarity of presentation, we use points sampled every 60 seconds. In Figure 3.3, we plot the number of concurrent requests at each server over time. Figure 3.3(a) shows the current behavior of the CDN nodes (Trace Playback (pb)). It is clear that some servers (e.g., server 4) process a disproportionate share of the load – 4 to 5 times the load of other servers. This indicates current over-provisioning of the system and an opportunity for significant optimization.

Turning to anycast-based redirection schemes, since sac does not take load into account but always maps PEs to the closest server, we observe from Figure 3.3(b) that the load at only a few servers grows significantly, while other servers get very few requests. For example, at 8am, server 6 serves more than 55% of total requests (5845 out of 10599), while server 4 receives fewer than 10. Unless server 6 is provisioned with enough capacity to serve a significant share of the total load, it will end up dropping many requests. Thus, while reducing the peak load of the most-loaded server compared to the playback, sac still exhibits large load imbalances. At the other extreme, Figure 3.3(c) shows that slb evenly distributes the load across servers. However, slb does not take cost into account and can potentially lead to high connection cost.

In Figures 3.3(d) and 3.3(e), we present the performance of alb-a and alb-o, the two schemes that attempt to take both cost and load balancing into account in remapping decisions. According to the figures, these algorithms do not balance the load among servers as well as slb. This is expected because their main objective is to find a mapping that minimizes the cost as long as the resulting mapping does not violate the server capacity constraint. Considering alb-a (Figure 3.3(d)), in the morning (around 7am) a few servers receive only relatively few requests, while other, better-located servers run close to their capacity. As the traffic load increases (e.g., at 3pm), the load on each server becomes similar in order to serve the requests without violating the capacity constraint.

Figure 3.3: Number of concurrent requests for each scheme (Large files group). Panels: (a) pb, (b) sac, (c) slb, (d) alb-a, (e) alb-o. Axes: concurrent requests at each server vs. time (hours in GMT); curves: aggregate (pb panel) and individual servers 1-8.

alb-o initially shows a pattern similar to alb-a (Figure 3.3(e)), although the change in request count is in general more graceful. However, the difference becomes clear after the traffic peak is over (at around 4pm). This is because alb-o attempts to reassign the mapping only when there is an overloaded server. As a result, even when the peak is over and a lower-cost mapping could be found, all PEs stay with the servers they were assigned based on the peak load (e.g., at around 3pm). This property of alb-o leads to less traffic disruption at the expense of increased overall cost (as we will see later in Section 3.6.2). Overall, from the load perspective, we see that both alb-a and alb-o manage to keep the maximum server load within roughly 2000 concurrent requests, very close to the 1900-connection capacity used as a parameter for these algorithms. Within these load limits, the algorithms attempt to reduce the cost of traffic delivery.

In Figure 3.4, we present the same set of results using the logs for small object downloads. We observe a similar trend for each scheme, although the server load changes more frequently. This is because the response sizes are small, and the average service time for this content group is much shorter than that of the previous group.

We also present the serviced data rate of each server in Figure 3.5 for the large file server group and in Figure 3.6 for the small object server group. We observe a strong correlation between the number of requests (Figures 3.3 and 3.4) and the data rates (Figures 3.5 and 3.6). In particular, the data rate load metric confirms the observations we made using the concurrent requests metric.

3.6.2 Disrupted and Over-Capacity Requests

Remapping of PEs to new servers can disrupt active connections. In this subsection, we investigate the impact of each remapping scheme on connection disruption. We also study the number of over-capacity requests, assuming the physical capacity limit of the servers to be equal to 1.6 times the capacity parameter used in the remapping algorithms.

Figure 3.4: Number of concurrent requests for each scheme (Small objects group). Panels: (a) pb, (b) sac, (c) slb, (d) alb-a, (e) alb-o. Axes: concurrent requests at each server vs. time (hours in GMT); curves: aggregate (pb panel) and individual servers 1-11.

Figure 3.5: Service data rate for each scheme (Large files group). Panels: (a) pb, (b) sac, (c) slb, (d) alb-a, (e) alb-o. Axes: data rate at each server (MB/s) vs. time (hours in GMT); curves for servers 1-8.

Figure 3.6: Service data rate for each scheme (Small objects group). Panels: (a) pb, (b) sac, (c) slb, (d) alb-a, (e) alb-o. Axes: data rate at each server (MB/s) vs. time (hours in GMT); curves for servers 1-11.

Figure 3.7: Disrupted and over-capacity requests for each scheme (y-axis in log scale).

Specifically, the server physical capacity is assumed to be 2500 concurrent requests in the large file group and 500 concurrent requests in the small file group. The results are shown in Figure 3.7. Since sac only considers PE-to-server proximity in its mappings and the proximity does not change, sac mappings never change and thus connection disruption does not occur. However, by not considering load, this scheme exhibits many over-capacity requests – over 18% in the large-file group. In contrast, slb always remaps to achieve as balanced a load distribution as possible. As a result, it has no over-capacity requests but a noticeable number of connection disruptions. The overall number of negatively affected requests is much smaller than for sac, but as we will see in the next section, this comes at the cost of increased request air miles. Figure 3.7 shows a significant improvement of both alb-a and alb-o over sac and slb in the number of affected connections. Furthermore, by remapping PEs judiciously, alb-o reduces the disruptions by an order of magnitude over alb-a without affecting the number of over-capacity requests. Overall, alb-o reduces the number of negatively affected connections by two orders of magnitude over sac, by an order of magnitude over slb in the small files group, and by a factor of 5 over slb in the large file group.

Finally, Figure 3.7 shows that large file downloads are more susceptible to disruption in all the schemes performing dynamic remapping. This is because the longer service response of a large download increases its chance of being remapped during its lifetime (e.g., in the extreme, if an algorithm remapped all active connections every time, every connection lasting over 120 seconds would be disrupted). This confirms our architectural assumption concerning the need for application-level redirection for long-lived sessions. In summary, the disruption we observed in our experiments is negligible: at most 0.04% for the ALB-O algorithm (which we ultimately advocate), and even less – 0.015% – for small object downloads. Further, disruption under the ALB-O algorithm happens only when the platform is already overloaded, i.e., when the quality of service is already compromised. In fact, by pinning a client to a fixed server at the beginning of the download, DNS-based CDNs may lead to poor performance in long-running downloads (during which conditions can change). On the other hand, with a simple extension to browsers, as we show in a separate work [8], an anycast-based CDN could trigger these disruptions intentionally to switch the user to a different server on the fly.

3.6.3 Request Air Miles

This subsection considers the cost of each redirection scheme, measured as the average number of air miles a request must travel within the CDN's ISP. Figures 3.8(a) and 3.8(b) show the ratio of each scheme's average cost to the pb average cost, calculated every 120 seconds.

Figure 3.8: Average miles for requests calculated every 120 seconds. Panels: (a) Small objects group, (b) Large files group. Axes: ratio of scheme miles over playback miles vs. time (hours in GMT); curves: SLB, SAC, ALB-A, ALB-O.

In sac, a PE is always mapped to the closest server, and the average mileage for a request is always the smallest (at the cost of a high drop ratio, as previously shown). This can be viewed as the optimal cost one could achieve, and thus sac always has the lowest ratio in Figure 3.8. slb balances the load among servers without taking cost into account and leads to the highest cost. We observe in Figure 3.8(a) that alb-a is nearly optimal in cost when the load is low (e.g., at 8am) because in this case each PE can be assigned to its closest server. As the traffic load increases, however, not all PEs can be served by their closest servers without violating the capacity constraint. Then the cost grows as some PEs are re-mapped to different (farther) servers. alb-o also finds an optimal-cost mapping in the beginning, when the load is low. As the load increases, alb-o behaves differently from alb-a: alb-o attempts to maintain the current PE-server assignment as much as possible, while alb-a attempts to minimize the cost even when the resulting mapping may disrupt many connections (Figure 3.7). This restricts the solution space for alb-o compared to alb-a, which in turn increases the cost of the alb-o solution.

With our focus on optimizing the costs for the ISP, our optimization formulation does not restrict the distance for any individual request. Thus, a pertinent question is to what extent individual requests might be penalized by our schemes. Consequently, Figure 3.9 plots the ratio of the cost of the 99th-percentile requests in each scheme. Specifically, in every 120-second interval, we find the request whose cost is higher than that of 99% of all requests in a given anycast scheme and the request whose cost is higher than that of 99% of all requests in the playback, and we plot the ratio of the costs of these two requests. Note that because the possible costs for individual requests can only take discrete values, the curves are less “noisy” than those for the average costs. We can see that both adaptive anycast algorithms do not penalize individual requests excessively. The ALB-A algorithm actually reduces the cost for a 99th-percentile request compared to the playback, and ALB-O's penalty is at most 12.5% for the Large File Downloads group and 37.5% for the Small Web Objects group.

Figure 3.9: 99th percentile of request miles calculated every 120 seconds. Panels: (a) Small objects group, (b) Large files group. Axes: ratio of scheme miles over playback miles vs. time (hours in GMT); curves: SLB, SAC, ALB-A, ALB-O.

Figure 3.10: Execution time of the alb-a and alb-o algorithms in the trace environment. Panels: (a) Small objects group, (b) Large files group. Axes: execution time (seconds) vs. time (hours in GMT); curves: ALB-A, ALB-O.

3.6.4 Computational Cost of Remapping

We now consider the execution time of the remapping algorithms themselves, concentrating on alb-a and alb-o as they are the most computationally intensive and, as shown above, exhibit the best overall performance among the algorithms we considered. We first time the algorithms in the trace environment and then consider how they scale with a potential growth of the platform size. All the experiments were conducted on a single-core Intel Pentium 4 PC with a 3.2 GHz CPU and 1 GB of RAM, running the Linux 2.6.31.12 kernel.

Figure 3.10 plots the measured execution time of both remapping algorithms in our trace-driven simulation. Each data point reflects the actual measured time of an execution after a 120-second remapping interval. The figure shows that for both the small and large groups, neither algorithm ever takes more than 0.5 s to execute, and in most cases the time is much lower: the 95th percentile is 0.26 s for alb-a and 0.14 s for alb-o in the small objects group, and 0.23 s and 0.17 s, respectively, in the large files group. This is negligible compared with the expected frequency of remapping decisions.

We also observe that alb-o is more efficient than alb-a. This is because alb-o performs remapping only for overloaded servers, in effect reducing the size of the solution search space and in fact often not solving the LP problem at all (which is reflected in the seemingly zero execution times in the figure).

We now turn to the question of how our algorithms will scale with the platform size. To this end, we time the algorithms in a synthetic environment with a randomly generated workload. We consider a platform with 1000 PEs and up to 100 data centers (compared to the trace environment of around 550 PEs and 8-11 data centers). Each data center (represented as a single aggregate server) has a maximum capacity of 1000 concurrent connections. To generate the synthetic workload, we start with a given fraction of the aggregate platform capacity as the total offered load, and distribute this offered load randomly among the PEs in the following three steps (a code sketch of this procedure follows the list).

• We iterate through the PEs, and for each PE, we assign it a random load between 0 and the maximum server capacity (1000). This step results in a random load assignment, but the aggregate offered load can significantly deviate from the target level. We bring it close to the target level (within 10%) in the next two steps.

• While the total load assigned to all the PEs is below 0.9 of the target:

– Pick a random PE P and a random load value L between 0 and 250 (one-fourth of the server capacity);

– If current load(P) + L is less than the server capacity, add L to P's offered load.

• While the total load assigned to all the PEs is above 1.1 of the target:

– Pick a random PE P and a random load value L between 0 and 250 (one-fourth of the server capacity);

– If current load(P) − L > 0, subtract L from P's offered load.
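A direct transcription of the three steps above, using the parameter values from the text (1000 PEs, a per-server capacity of 1000, and adjustment chunks of up to 250):

import random

NUM_PES = 1000
SERVER_CAPACITY = 1000
CHUNK = SERVER_CAPACITY // 4                 # 250: one-fourth of the server capacity

def generate_offered_load(target_total):
    # Step 1: random initial load per PE.
    load = [random.randint(0, SERVER_CAPACITY) for _ in range(NUM_PES)]
    # Step 2: top up while the aggregate is below 90% of the target.
    while sum(load) < 0.9 * target_total:
        p, l = random.randrange(NUM_PES), random.randint(0, CHUNK)
        if load[p] + l < SERVER_CAPACITY:
            load[p] += l
    # Step 3: trim while the aggregate is above 110% of the target.
    while sum(load) > 1.1 * target_total:
        p, l = random.randrange(NUM_PES), random.randint(0, CHUNK)
        if load[p] - l > 0:
            load[p] -= l
    return load

# Example: offer 75% of the aggregate capacity of a 100-data-center platform.
offered = generate_offered_load(0.75 * 100 * SERVER_CAPACITY)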

We perform the above random load assignment every two minutes and then time our algorithms as they remap the PEs to servers. Further, to see how the running time depends on the overall load (one can expect that the higher the total load relative to the total capacity, the harder the algorithm has to work to find a solution), we increase the load target every hour. The resulting pattern of the total offered load relative to the total capacity is shown in Figure 3.11. Again, within each period of stable total load, its distribution among the PEs changes randomly every two minutes. Figure 3.12 shows the execution time of the alb-a and alb-o algorithms for different platform sizes. Understandably, larger platform sizes translate to greater execution times. However, even for 100 data centers, as long as the total load is within 75% of capacity, alb-o generally completes remapping in under 5 s, and alb-a within 10 s.

Figure 3.11: Total offered load pattern (synthetic environment). Axes: percentage of total offered load relative to system capacity vs. simulated time (hours in GMT); curves for 25, 50, 75, and 100 servers.

The more efficient execution of alb-o is again due to the fact that it performs only an incremental remapping each time5, and as we see, as the platform grows in size, the difference can be significant. This (in addition to the reduction in the number of disrupted connections) again argues in favor of alb-o. Overall, we conclude that even using our very modest machine for remapping, the execution time of our remapping algorithms, especially alb-o, is acceptable for a platform of significant scale as long as the total load does not approach the total platform capacity too closely. The fact that our algorithms slow down significantly under extreme load conditions suggests a strategy where the remapping algorithm first checks the total load and, if it is found close to (e.g., over 75% of) the platform capacity, switches to a “survival mode” whereby it no longer solves the cost optimization problem but merely redistributes excess load from overloaded to underloaded servers.
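The suggested “survival mode” amounts to a simple guard in front of the optimizer, as sketched below; solve_min_cost and shed_excess_load are placeholders for the cost-optimizing LP and for a simple pass that moves excess load off overloaded servers.

def choose_remapping(total_load, total_capacity, loads, capacities, costs,
                     solve_min_cost, shed_excess_load, threshold=0.75):
    if total_load > threshold * total_capacity:
        # Too close to saturation: skip the cost optimization entirely and just
        # redistribute excess load so that no server stays overloaded.
        return shed_excess_load(loads, capacities)
    return solve_min_cost(loads, capacities, costs)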

5 Observe that its initial remapping takes the same time as in the alb-a case.

Figure 3.12: Scalability of the alb-a and alb-o algorithms in a synthetic environment. Panels: (a) ALB-A, (b) ALB-O. Axes: execution time (seconds) vs. simulated time (hours in GMT); curves for 25, 50, 75, and 100 servers.

3.6.5 The Effect of Remapping Interval

In this section, we consider the issue of selecting the remapping interval ∆. Specifically, we consider how different values of the remapping interval affect our main performance metrics: the number of disrupted connections, the cost of operation (measured as the air miles that a request must travel within the AS), and the number of connections dropped due to server overload. Since we already showed that the alb-a and alb-o algorithms exhibit the best performance among the algorithms we considered, we concentrate on these two algorithms here.

We ran our simulations with the default server capacity (1900 and 312 concurrent requests for the large and small file groups, respectively), which we refer to as the 100% capacity scenario, and with a lower capacity equal to 75% of the default. Lowering the server capacity for the same load (which is given by the trace) allows us to investigate the behavior of the system under high-load conditions. To consider the effect of different assumptions, in this subsection we assume that over-capacity requests are dropped by the admission control mechanism upon arrival; hence they do not consume any server resources.

In the experiments of this subsection, we run our simulations for the entire trace but collect the results only for the last 6-hour trace period. This allows every scenario to experience at least one remapping (not counting the initial remapping at the end of the first second) before the results are collected. For instance, for ∆ = 6 hours, the first remapping occurs at the first second of the trace and is affected by the initially idle servers, the second remapping occurs at 6 hours, and the results are collected in the remaining 6 hours of the trace. For smaller deltas, the results are collected for the same trace period to make them comparable across different deltas. Note that this is different from the previous experiments, where the remapping interval was fixed at 120 s, which is negligible relative to the trace duration, allowing us to report the results for the entire trace.

Disruption Count

Figure 3.13: The effect of remapping interval on disrupted connections. Panels: (a) Small objects group, (b) Large files group. Axes: number of disrupted requests (log scale) vs. remapping interval (seconds); curves: ALB-Always-100, ALB-Always-75, ALB-O-100, ALB-O-75.

Figures 3.13(a) and 3.13(b) show the effect of the remapping interval on disrupted connections. As expected, the number of disrupted connections decreases as the remapping interval increases in both schemes. However, the figures confirm the superiority of alb-o in this metric: alb-o exhibits a smaller number of disrupted connections for all ∆ values. This is a consequence of the design of alb-o, which normally performs remapping only to relieve overloaded servers, resorting to a full remapping only when there is a significant potential for cost reduction. alb-a, on the other hand, performs remapping any time it can reduce the cost. Interestingly, the high-load scenario (corresponding to the 75% curves) does not significantly affect the disruptions. We speculate that this is because the trace period reported in these figures corresponds to the highest load in the trace, so even the 100% capacity scenario triggers a similar number of remappings.

Figure 3.14: The effect of remapping interval on cost (common 6-hour trace period). Panels: (a) Small objects group, (b) Large files group. Axes: average miles per request vs. remapping interval (seconds); curves: ALB-Always-100, ALB-Always-75, ALB-O-100, ALB-O-75.

Request Air Miles

We now turn to the effect of the remapping interval on the cost (in terms of average request air miles) of content delivery. Figures 3.14(a) and 3.14(b) show the results. An immediate and seemingly counter-intuitive observation is that costs generally decrease for larger remapping intervals. A closer inspection, however, reveals that this is because less frequent remappings miss overload conditions and do not rebalance the load by using suboptimal (e.g., more distant but less loaded) servers.

Indeed, the last 6-hour trace period reported in these graphs corresponds to the period of the highest load; as the load increases, higher values of ∆ retain a proximity-driven mapping from a less-loaded condition for longer. This has an especially pronounced effect for extremely large deltas, such as ∆ = 6 hours, when no remapping occurs after the load increase. Note also that these graphs reflect the costs for successful requests only. We will see how longer remapping intervals affect over-capacity requests below.

The comparison of different scenarios in Figures 3.14(a) and 3.14(b) reveals no further surprises. alb-a has a lower cost than alb-o for a given server capacity, and a lower-capacity scenario has a higher cost than the same scheme with a higher capacity. This is natural, since alb-a optimizes cost at every remapping interval while alb-o does so only when there is a compelling reason. Also, the lower the capacity of the servers, the more often the system must change mappings to relieve overloaded servers, at the expense of increased costs.

Over-Capacity Requests

Figure 3.15: The effect of remapping interval on dropped requests (common 6-hour trace period). Panels: (a) Small objects group, (b) Large files group. Axes: number of dropped requests (log scale) vs. remapping interval (seconds); curves: ALB-Always-100, ALB-Always-75, ALB-O-100, ALB-O-75.

Finally, we consider the effect of the remapping interval on over-capacity requests. Intuitively, larger remapping intervals must lead to more over-capacity requests, as the scheme would miss overload conditions between remappings. The results are shown in Figures 3.15(a) and 3.15(b). They confirm the above intuition; however, for the 100% capacity scenario, they show that no scheme exhibits any dropped requests until the remapping interval reaches 6 hours. Coupled with the previous results, this might suggest that very large values of delta, on the order of hours, should be used, as they decrease connection disruption without increasing the costs and dropped requests.

However, in setting the physical capacity limit to be 1.6 times the server capacity used by the algorithms, we provisioned a significant slack between the load level at which the algorithms attempt to rebalance load and the level at which a request is dropped.

Figure 3.16: The effect of over-provisioning on over-capacity requests (common 6-hour trace period). Panels: (a) Small objects group, (b) Large files group. Axes: percentage of over-capacity requests (log scale) vs. remapping interval (seconds); curves for over-capacity thresholds at 1.1, 1.2, 1.3, 1.4, 1.5, and 1.6 times the server capacity.

Consequently, Figures 3.16(a) and 3.16(b) show the behavior of the alb-o algorithm when the servers are provisioned with a smaller capacity slack. In these experiments, we use 100% server capacity and do not drop over-capacity requests. For the small files group, we still do not see over-capacity requests until the slack is reduced to 1.2. At 1.2x over-provisioning, over-capacity requests appear only when the remapping interval reaches 30 minutes. With an over-provisioning factor of 1.1, over-capacity requests appear at a 1-minute remapping interval and grow rapidly for larger intervals. In the large files group, reducing over-provisioning from the factor of 1.6 to 1.1 increases the over-capacity requests more smoothly. At 1.3x over-provisioning, over-capacity requests appear at remapping intervals of 5 minutes and higher. Less slack leads to over-capacity requests at deltas as small as 30 seconds. Again, once they appear, over-capacity requests increase with longer remapping intervals (with one exception in Figure 3.16(b), which we consider an aberration).

Overall, these results show the intricacies in tuning the system. Clearly, provisioning a larger capacity slack allows one to reduce the frequency of remappings: indeed, the proximity factor in remapping decisions does not change, and the load factor becomes less significant with the increased slack. This also results in a lower delivery cost, as the system rarely sends traffic to non-proximal servers because of overload. Less slack requires more frequent remappings and can result in a higher delivery cost. A proper choice of the remapping interval in this case requires careful analysis of the workload, similar to the one performed in this dissertation.

3.7 Summary

New route control mechanisms, as well as a better understanding of the behavior of IP anycast in operational settings, allowed us to revisit IP anycast as a CDN redirection mechanism.

In this chapter we present a load-aware IP anycast CDN architecture and describe algorithms that allow redirection to utilize IP anycast's inherent proximity properties without suffering the negative consequences of using IP anycast with session-based protocols.

This chapter also presents an evaluation of our algorithms using trace data from an operational CDN. We show that our algorithms perform almost as well as native IP anycast in terms of proximity. Our algorithms manage to keep server load within capacity constraints and significantly outperform other approaches in terms of the number of session disruptions. In the future we expect to gain experience with our approach in an operational deployment. We also plan to exploit the capabilities of our architecture to avoid network hotspots to further enhance our approach.

Chapter 4

Performance Implications of Unilateral Enabling of IPv6

4.1 Introduction

The address space of IPv4 is practically exhausted: the last block was allocated to the regional Internet registries in February 2011. While the registries can still distribute their allocated addresses internally, the last allocation brought the issue of IPv6 transition into stark focus. With the revived efforts for IPv6 transition, many clients are now dual-stack, that is, capable of using both the IPv4 and IPv6 protocols. Hence, another challenge for Web content providers is to properly route IPv4 vs. IPv6 clients.

High-profile Web content providers, e.g., Google, have started to deploy IPv6 platforms to serve redirected IPv6 clients [39]. However, as the overall Internet transition to IPv6 is lagging, the network paths between these clients and the content provider's servers commonly do not support IPv6, in which case the two end-hosts cannot communicate over IPv6 even if they are both IPv6-enabled. Despite a recent IETF standard1 on how end-hosts should handle this situation [89], in practice the

1 At the time of this measurement, the IETF standard [89] was in its early recommendation status.

lack of an end-to-end IPv6 path may expose the user to excessive delays or outright connectivity disruption. The possibility of such delays complicates the client redirection process and can influence the content provider's IPv6 transition strategy. For example, Google only directs clients to its IPv6 servers if they have verified end-to-end IPv6 connectivity and explicitly opted in for service over IPv6 [39].

This chapter quantifies the basis for such a conservative strategy. In other words, we try to answer an important question: what are the implications of an Internet platform, such as a Web content provider, unilaterally switching to a dual-stack mode, whereby it would simply redirect IPv6-enabled clients to an IPv6 server and IPv4 clients to an IPv4 server? In our approach, since almost every interaction on the Internet starts with a DNS request, content providers (we use a Web site as an example in this study) configure their authoritative DNS servers to resolve IPv6 requests with the IPv6 addresses of their platform servers and legacy DNS requests with the IPv4 addresses of their platform servers. Thus, clients indicating a willingness to communicate over IPv6 are allowed to do so immediately, even if an end-to-end IPv6 path between these clients and the platform servers might not exist.

We found no evidence of any performance penalty (subject to the 1-second granularity of our measurement) and only an extremely small increase in failures to download the object (from 0.0038% to 0.0064% of accesses). This suggests the feasibility of unilateral IPv6 deployment, which could in turn spur a speedier overall IPv6 transition.

4.2 Background

A user access to any Web service is usually preceded by a DNS resolution of the service domain name. An IPv6-enabled client would issue a DNS query for an IPv6 address (an AAAA-type query), while an IPv4 client would send an A-type query for an IPv4 address. Our goal is to assess the implications of unilateral enabling of dual-stack IPv4/IPv6 support by Web content providers. In this setup, the Web content provider would deploy both IPv6 and IPv4 service servers. The authoritative DNS server would then resolve AAAA DNS queries to the IPv6 address of the IPv6 server, and A-type queries to the IPv4 address of the IPv4 server. Thus, clients indicating a willingness to communicate over IPv6 are allowed to do so immediately. The danger of this approach is that, given the current state of IPv6 adoption in the core networks, a valid end-to-end IPv6 path or tunnel between a host pair may not exist, even if both end-points are IPv6-enabled.

When the IPv6 path does not exist, plausible scenarios for IPv6-enabled clients can be grouped into two categories. In the first scenario, the client follows the recent IETF recommendation [89] to avoid any delay in attempting to use an unreachable IPv6 service server. Basically, assuming our Web content is a Web site, clients would issue both AAAA and A queries to obtain both the IPv6 and IPv4 addresses of the Web site server(s), then establish IPv4 and IPv6 HTTP connections at the same time using both addresses; if the IPv6 connection advances through the TCP handshake, the IPv4 connection is abandoned through an RST segment. The other scenario is that the client attempts to use IPv6 first and then, after failing to connect, resorts to IPv4, which leads to a delay penalty. The macro-effects of a dual IPv4/IPv6 Web content server deployment are the result of complex interactions between the behaviors of user applications (browsers in the case of Web services), operating systems, and DNS resolvers, which differ widely, leading to drastically different delay penalties (see [35] for an excellent survey of different browser and OS behaviors). Consequently, to avoid the possibility of a high delay penalty, high-profile Web content providers, such as Google, only resolve AAAA DNS queries to IPv6 addresses for clients that have verified the existence of an end-to-end IPv6 path between themselves and Google and have explicitly opted in for IPv6 service.
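The first scenario can be illustrated with the following sketch of a client that opens IPv4 and IPv6 connections in parallel and keeps whichever TCP handshake completes first; the hostname, port, and timeout are illustrative, and closing the losing socket is a simplification of the RST-based abandonment described above (the sketch also waits for both attempts to finish, which a real browser would not).

import socket
from concurrent.futures import ThreadPoolExecutor, as_completed

HOST = "sub.dns-research.com"      # placeholder; any dual-stack name works
PORT = 80

def connect(family):
    # Resolve HOST for the given address family and complete a TCP handshake.
    infos = socket.getaddrinfo(HOST, PORT, family, socket.SOCK_STREAM)
    fam, socktype, proto, _, addr = infos[0]
    sock = socket.socket(fam, socktype, proto)
    sock.settimeout(5.0)           # bounded wait instead of the long OS default
    sock.connect(addr)             # raises if, e.g., no end-to-end IPv6 path exists
    return sock

def dual_stack_connect():
    # Try IPv6 and IPv4 concurrently; keep the first handshake that succeeds.
    winner = None
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(connect, fam)
                   for fam in (socket.AF_INET6, socket.AF_INET)]
        for fut in as_completed(futures):
            try:
                sock = fut.result()
            except OSError:
                continue           # this family failed; wait for the other one
            if winner is None:
                winner = sock      # first successful handshake wins
            else:
                sock.close()       # abandon the slower connection
    if winner is None:
        raise OSError("neither IPv4 nor IPv6 connection succeeded")
    return winner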

This procedure is valuable as a demonstration and testbed for IPv6 migration, but it does not scale, since making client networks duplicate this procedure for every Web content provider is infeasible. Clients typically resolve DNS queries through a client-side DNS resolver (“LDNS”), which is often shared among multiple clients. It is possible that the resolver submits AAAA queries even if some (or all) of its clients are not IPv6-enabled. A Web content provider that unilaterally deploys IPv6 as described above has no way of knowing the status of IPv6 support of the actual client when the AAAA query arrives; it simply responds with the IPv6 address. Our measurement methodology captures any possible effects of this uncertainty. Thus, unless it causes confusion, we refer to all clients behind a resolver that sends AAAA queries as IPv6-enabled or, interchangeably, dual-stack.

4.3 Related Work

Much effort has been devoted to the IPv6 transition. A number of transition technologies have been proposed that help construct end-to-end IPv6 paths without the need for ubiquitous deployment of IPv6 network infrastructure (see, e.g., [24, 84, 57, 33, 48]). We look at another aspect of IPv6 migration, namely, the penalty for unilateral IPv6 enabling when the end-to-end path does not exist.

A number of studies have reported on the extent of IPv6 penetration from a variety of vantage points. In particular, Shen et al. [77] used netflow data from a Chinese tier-1 ISP, Savola [73] and Hei and Yamazki [43] analyzed data collected on 6to4 relays, Kreibich et al. [55] employed user-launched measurements, Malone [61] and Huston [49] studied IPv6 traffic attracted to IPv6-connected Web sites, and Karpilovsky et al. [53] considered IPv6 penetration from several vantage points, including netflows in core networks, address allocations, and BGP route announcements. A general conclusion of these studies is that IPv6 deployment remains low. For example, Huston found that in 2009, end-to-end IPv6 connectivity was available to only around 1% of the clients of the two Web sites he considered. These findings motivate our study by showing that most clients receiving an IPv6 address from a unilaterally IPv6-enabled Web site would have no end-to-end IPv6 connectivity to the site.

Several studies considered the performance of the current IPv6 network infrastructure. Zhou and Van Mieghem [90] compared the end-to-end delay of IPv6 and IPv4 packets between selected end-hosts and observed that IPv6 paths had higher variation in delay. Colitti et al. [30] compared the latency experienced by clients accessing the Google platform over IPv4 and IPv6 and found little difference once the effect of processing at tunnel termination points is factored out (otherwise the IPv6 latency was slightly higher). While that study considered the performance of IPv6 clients that had an end-to-end IPv6 path to the platform, we focus on the performance implications for IPv6-enabled clients that do not have this connectivity.

4.4 Methodology

We used the following methodology to measure the performance implications of unilateral IPv6 deployment when the client cannot reach the IPv6 server due to the lack of an end-to-end IPv6 path or tunnel. We utilized the same setup used in Section 2.3 and described in Figure 2.1. In other words, we used a Web site as the Web content that clients are trying to reach. We used the same domain dns-research.com and utilized the specialized DNS server to act as its authoritative DNS server (ADNS), as well as the specialized Web server to host a single object (a one-pixel image) from the subdomain sub.dns-research.com. We configured our specialized DNS server to respond to IPv6 queries (type-AAAA requests) for any hostname from the domain sub.dns-research.com with a non-existent IPv6 address, and to any IPv4 DNS queries (type-A requests) with the valid IPv4 address of our Web server.

Figure 4.1: Measurement Setup. Presumed interactions are marked in blue font. (The diagram depicts the numbered DNS and HTTP interactions, steps 1-13, between a client at 1.2.3.4, its LDNS resolver, and our instrumented ADNS/HTTP server at 5.6.7.8, as described in the text.)

Responding to IPv6 queries with a non-existent IPv6 address mimics the situation where there is no end-to-end IPv6 path between the client and the server and thus the server is unreachable2. We then measure the delay penalty as the time between when we send the unreachable IPv6 address to the client's DNS resolver and when the client falls back to IPv4 (i.e., when the corresponding HTTP request from the client arrives over IPv4). We deployed both our ADNS and Web servers on the same host so that we could measure time intervals between DNS and HTTP events without clock skew.

An illustration of our setup with the IPv6 interactions highlighted is given in Figure 4.1. As a reminder, to associate a given DNS query with the subsequent HTTP

2Indeed, attempting to communicate with our non-existent address has the same effect as an attempt to communicate with an existing IPv6 destination over a non-existent path, which is the same as a path that is not end-to-end IPv6-enabled.

request, we first associate a DNS query with the originating client using the approach from [62]. Then, when a user tries to access the hosted image, their browser first sends a DNS query for dns-research.com (the “base query”) to the user's DNS resolver (step 1 in the figure), which then sends it to our ADNS server (step 2). An IPv6-enabled client network is likely to send both A and AAAA DNS queries. Since the base DNS queries cannot be reliably associated with the clients, our ADNS responds with NXDOMAIN (“Non-Existent Domain”) to the AAAA query and with the proper IPv4 address of our Web server to the A query (step 3). The resolver forwards the response to the client (step 4), which then sends the HTTP request for this image to our server (step 5). Our Web server returns an HTTP 302 (“Moved”) response (step 6) redirecting the client to another URL in the sub.dns-research.com domain3, with a host name that embeds the client's IP address (we refer to these queries as “sub” requests). The client needs to resolve this name through its resolver again (steps 6-9). This time the DNS query can be reliably attributed to the HTTP client through the client's IP address embedded in the hostname.

Having associated the DNS query with the originating client, we measure the delay between the arrival of the AAAA query in step 8 and the first subsequent HTTP request from the same client in step 13 as the delay penalty for unilateral IPv6 deployment. To eliminate HTTP requests that utilized previously cached DNS resolutions (as their time since the preceding DNS interaction would obviously not indicate the delay penalty), we measure an IPv6 delay incident for an HTTP request only if it was immediately preceded (i.e., without another interposed HTTP request) by a full DNS interaction, including both A and AAAA requests, for that client. We contrast these delays with the delays for non-IPv6-enabled clients, whose resolvers did not send AAAA queries. We use the same technique to associate these HTTP clients with DNS queries, and measure the delays as the time between the type-A DNS query in step 2 and the first subsequent HTTP request in step 13.

As we mentioned earlier in Section 2.3, we collaborated with a major consumer-oriented Web site to embed the starting URL of our image into their home page. Whenever a Web browser visits the home page, the browser downloads the linked image and the interactions in Figure 4.1 take place. We used a low TTL of 10 seconds for our DNS records. This allowed us to obtain repeated measurements from the same client without overwhelming our setup. Further, our Web server adds a “cache-control:no-cache” header field to its HTTP responses to make sure we receive every request to our special image. Unfortunately, the conditions of this collaboration prevent us from releasing the datasets collected in the course of our experiment.

3 In reality, our setup involved more redirections, as discussed in Section 2.3; we omit these details for clarity, as they are unrelated to the measurement study in this chapter.
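The delay-penalty computation over the collected logs can be sketched as follows; the record fields (epoch timestamps, the client IP embedded in the queried hostname) are assumptions about the log format rather than the exact schema of our logs.

def client_ip_from_hostname(hostname):
    # Recover "1.2.3.4" from a "sub" hostname such as "1_2_3_4.sub.dns-research.com".
    return hostname.split(".")[0].replace("_", ".")

def ipv6_delay_penalties(dns_records, http_records):
    # Pair each AAAA "sub" query with the first later HTTP request from the same
    # client, counting a delay only if the HTTP request was immediately preceded
    # by a full A+AAAA DNS interaction for that client (no cached resolutions).
    last_aaaa, last_a = {}, {}
    events = sorted([("dns", r["time"], r) for r in dns_records] +
                    [("http", r["time"], r) for r in http_records],
                    key=lambda e: e[1])
    delays = []
    for kind, t, rec in events:
        if kind == "dns":
            client = client_ip_from_hostname(rec["qname"])
            (last_aaaa if rec["qtype"] == "AAAA" else last_a)[client] = t
        else:
            client = rec["client_ip"]
            if client in last_aaaa and client in last_a:
                delays.append(t - last_aaaa[client])
            last_aaaa.pop(client, None)    # require a fresh DNS interaction
            last_a.pop(client, None)       # before the next measurement
    return delays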

4.5 The Dataset

We have collected the DNS logs (including the timestamp of the query, LDNS IP, query type, and query string) and HTTP logs (request time, User-Agent and Host headers) resulting from the interactions described in the previous section.

Table 4.1: The basic IPv6 statistics

                     Base DNS requests    “Sub” DNS requests
  # Requests         19,945,037           2,398,367
  LDNS IP addrs      59,978               32,291
  Client IP addrs    No data              1,134,617

As a reminder of the high-level characteristics of our data set, please refer to Table 2.1 in Section 2.4. Basically, we collected over 21M client/LDNS associations between 11.3M unique client IP addresses from 17,778 autonomous systems (ASs) and almost 280K LDNS resolvers from 14,627 ASs.

Table 4.1 summarizes the general statistics about IPv6 requests, as well as the clients and LDNSs behind them. Out of the 278,559 LDNSs we observed during our experiment, almost 22% were IPv6-enabled (i.e., sent some AAAA queries). However, only around 54% of the latter sent AAAA “sub” requests, and the number of “sub” requests was much lower than that of the base queries. This is because some LDNS servers seem to cache the NXDOMAIN response (which, as discussed earlier, our DNS server returns to IPv6 queries for the base domain) and not issue queries for subdomains of the base domain, while other LDNS servers seem not to cache NXDOMAIN responses at all and send repeated base queries even when serving subsequent “sub” requests from their cache.

4.6 The Results

We now present our measurement results. We first consider whether unilateral IPv6 enabling entails any penalty in clients' DNS resolution, and then report our measurements of the overall delays.

4.6.1 DNS Resolution Penalty

Our first experiment investigates any potential delays in obtaining the IPv4 DNS resolution given that our IPv6 Web server is unreachable. If clients fail over to IPv4 only after being unable to connect to the IPv6 Web server, then it could be that the type-A DNS query would only arrive after the corresponding timeout. To test for this behavior, we consider the time between the A and AAAA “sub” request arrivals from the same client. Our immediate observation is that almost 88% of the 2.3 million AAAA “sub” requests were received after their corresponding A request. This says that not only do these clients/LDNSs perform both resolutions in parallel but, of the two DNS requests, they most likely send the A query first. For the remaining 12% of requests, Figure 4.2 shows the CDF of the time difference between the A and AAAA “sub” requests. The figure indicates that even among these requests, most clients did not wait for a failed attempt to contact the IPv6 Web server before obtaining the IPv4 address.

Figure 4.2: Time difference between A and AAAA “sub” requests. (CDF; x-axis: number of seconds between the AAAA and A sub requests, log scale.)

Indeed, even assuming the accelerated default connection timeouts used in this case by Safari and Chrome (270 ms and 300 ms, respectively [35], as opposed to hundreds of seconds for the regular TCP timeout [7]), roughly 70% of the type-A queries in these requests arrived within this timeout value. We conclude that a vast majority (roughly 88 + 0.7 × 12 ≈ 95%) of requests do not incur an extra DNS resolution penalty due to IPv6 deployment.

4.6.2 End-to-End Penalty

Our first concern is to see whether unilateral IPv6 enabling can lead to disruption of Web accesses, that is, whether the IPv6-enabled clients successfully fail over to IPv4 for HTTP downloads. We compare the rates of interactions where the HTTP request fails to arrive following the AAAA DNS query, either until the next DNS interaction from the same client or until the end of the trace.

Figure 4.3: Comparison of all IPv6 and IPv4 delays. (CDFs; x-axis: number of seconds between the DNS and HTTP sub requests, log scale; curves: all IPv4 delays, all IPv6 delays.)

For IPv6-enabled clients, these lost HTTP requests amounted to 154 out of 2,398,367 total interactions, or 0.0064%. For IPv4-only clients, this number was 1,217 lost requests out of the total (34.4M − 2.4M), or 0.0038%. Although the rate of lost requests for IPv6-enabled clients is higher, both rates are so extremely low that they can be considered insignificant.

Turning to assessing the upper bound on the overall delay for IPv6-enabled clients, we measure the time between the arrivals of the AAAA “sub” DNS request (a conservative estimate of when the client receives the unreachable IPv6 address) and the actual subsequent HTTP request by the client. As a reminder, to eliminate HTTP requests that utilized previously cached DNS resolutions, we measure the incidents of

IPv6 delays for an HTTP request only if it was immediately preceded (i.e., without another interposed HTTP request) by a full DNS interaction, including both A and AAAA sub requests for that client. Applying this condition resulted in 1,949,231

instances of IPv6 delays from 1,086,323 unique client IP addresses. Our HTTP logs provide timestamps with granularity of one second; thus we can only report our delays at this granularity.

Figure 4.3: Comparison of all IPv6 and IPv4 delays (CDF of the number of seconds between DNS and HTTP “sub” requests).

Figure 4.4: IPv4 and IPv6 delays per client (CDFs of average and maximum delays).

Figures 4.3 and 4.4 compare delays incurred by IPv6-enabled and IPv4-only clients. Figure 4.3 shows CDFs of all delays across all clients in the respective categories (i.e., multiple delay instances from the same client are counted multiple times) and Figure 4.4 shows the CDFs of average and maximum delays observed per client. Both figures concentrate on delays within 100s. There were 0.063% of IPv6 delays and 0.076% of IPv4 delays exceeding 100s, with the maximum IPv6 delay of 1.2M sec and the maximum IPv4 delay of 1.8M sec. We attribute the exceedingly long delays to a combination of clients commonly violating DNS time-to-live (as first observed in [66]) with corner cases such as duplicate DNS requests resulting from a single client interaction (a behavior that we directly observed in a different study). For instance, one HTTP

request on January 7 was surrounded by 6 DNS queries, two of which arrived after the HTTP request; since there were no more DNS requests until the next HTTP request on January 27 (presumably due to a TTL violation), this scenario contributed a delay of 1.7M sec.

Neither figure shows significant differences in delay between the two categories of clients. In fact, where one can discern a difference, the delay distributions actually show a lower delay penalty for IPv6-enabled clients. The maximum per-client delays show the most discernible difference; this could be explained by the fact that

there are an order of magnitude more IPv4-only interactions, and thus there is a higher chance of an outlier value of maximum delay. While the one-second measurement granularity is clearly a limitation of this experiment, our study finds no evidence of delay penalty and in any case provides an upper bound of 1 second for any penalty

that could not be measured.
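The filtering condition above can be made concrete with a small sketch. The per-client event stream below (DNS A/AAAA “sub” queries and HTTP requests with timestamps) is a hypothetical abstraction of the merged logs; the function records a delay sample only when an HTTP request is immediately preceded by a full DNS interaction.

```python
# Sketch of the delay measurement: an HTTP request contributes an "IPv6 delay"
# sample only if no other HTTP request intervened since a full DNS interaction
# (both A and AAAA "sub" queries) was observed for that client.

def ipv6_delays(events):
    """events: time-ordered list of ('dns', 'A'|'AAAA', ts) and ('http', None, ts) tuples."""
    delays = []
    pending = {}                    # query types seen since the last HTTP request
    for kind, qtype, ts in events:
        if kind == "dns":
            pending[qtype] = ts
        else:                       # HTTP request
            if "A" in pending and "AAAA" in pending:
                delays.append(ts - pending["AAAA"])   # time since the AAAA "sub" query
            pending = {}            # a later HTTP request needs a fresh DNS interaction
    return delays

# Example: A and AAAA at t=10.0 and t=10.5, HTTP at t=11.0 -> one sample of 0.5 s.
print(ipv6_delays([("dns", "A", 10.0), ("dns", "AAAA", 10.5), ("http", None, 11.0)]))
```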

4.7 Summary

The transition to IPv6 is imminent as the last block of IPv4 addresses is already allocated to regional Internet registries. Many end-hosts are already IPv6 enabled. However, not all Internet content providers have enabled IPv6 access to their service

platforms. High profile Web content providers such as Google only enable access to their IPv6 platform for clients who explicitly enroll for such service and prove that a valid IPv6 end-to-end path exists for them. In this chapter, we present a measurement study to assess the performance penalty for unilateral IPv6 adoption by an Internet platform. Our results show no evidence of such performance penalty and an extremely small increase in failure to download the object (from 0.0038% to 0.0064% of accesses).

Chapter 5

IPv6 Anycast CDNs

5.1 Introduction

The transition to IPv6 has been accelerating as the supply of large unallocated IPv4 address blocks has been exhausted and allocated IP address space is rapidly running out. Once IPv6 gains wider adoption, the Internet will need content delivery networks that operate in the IPv6 environment and route clients and requests properly to IPv6 or IPv4 platforms. IPv6 increased the size of the IP address space; however, it retains the spirit of the DNS protocol and IP routing, and therefore the general mechanisms behind CDN request routing considered in the previous chapters apply equally to both IPv4 and IPv6.

However, IPv6 offers additional functionality that can be leveraged to implement request routing in a more flexible manner. The designers of IPv6 reserved an addressing class for anycast addresses in the IPv6 addressing model defined by RFC 3513 [44]. Nevertheless, the same RFC restricted using anycast IPv6 addresses in the source field of any IPv6 packet. These restrictions were lifted in RFC 4291 [45]; however, that RFC does not discuss the mechanisms and approach for IPv6 anycast

to be implemented. In fact, most current operating systems do not provide the option to create an IPv6 address of type anycast. And if they do, as in the case of FreeBSD [1], the system is forced to prevent any packet from having the anycast address as a source, as per [50], which recommends disconnecting all TCP connections toward an IPv6 anycast address. In this chapter, we present a general lightweight IPv6 anycast protocol for communication utilizing connection-oriented transport and then use this protocol to design an architecture for an IPv6 CDN based on anycast request routing. This design relies heavily on IPv6 mobility support. In this dissertation we present the design details; the evaluation will be part of future work.

5.2 Background

5.2.1 IPv6

IPv6 was designed as the successor to IPv4 [34]. In addition to expanding the addressing capabilities and introducing new addressing classes, IPv6 simplified the packet header format and provided greater flexibility for introducing new extensions.

An IPv6 packet header consists of a fixed mandatory portion required for all packets and may be followed by optional extensions to support additional features. The mandatory portion of the header occupies the first 40 octets (320 bits) of the IPv6 packet. The extension headers and the data payload are of variable data length. Figure 5.1 shows the basic format of an IPv6 packet header.

Figure 5.1: IPv6 Packet Header Format.

The mandatory (main) header is required for all IPv6 datagrams. It contains the source and destination addresses as well as control information for processing and routing of the IPv6 datagram. These control fields begin with a version field, which identifies the version of the IP protocol used to generate the datagram, similar to IPv4 except that it carries the binary value 0110 for IP version 6. After the version field is an 8-bit traffic class field for traffic classification, which replaces the Type of Service (TOS) field in IPv4 and uses the differentiated services method defined in [41]. Next is the flow label field, which consists of 20 bits; it was created to provide additional support for real-time streams. Next is the payload length field, which replaces the Total Length field in IPv4 but carries only the number of bytes of the payload (which includes the extension headers). The “Next Header” field helps the receiver interpret the packet format beyond the mandatory part: it specifies the identity of the first extension header if there are extensions in the packet, and otherwise the upper-layer protocol type (same as the Protocol field in IPv4). Finally, the hop limit field replaces the TTL field of the IPv4 header; the field is renamed to better reflect its actual usage.
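As a concrete view of the field layout just described, the sketch below packs the 40-octet mandatory header with Python’s struct module. Field widths follow RFC 2460; the addresses are placeholder 16-byte values used only for illustration.

```python
# Minimal sketch of the 40-byte fixed IPv6 header.
import struct

def build_ipv6_header(payload_len, next_header, hop_limit,
                      src16, dst16, traffic_class=0, flow_label=0):
    # First 32-bit word: version (4 bits) | traffic class (8 bits) | flow label (20 bits)
    first_word = (6 << 28) | (traffic_class << 20) | flow_label
    return struct.pack("!IHBB16s16s",
                       first_word, payload_len, next_header, hop_limit,
                       src16, dst16)

hdr = build_ipv6_header(payload_len=20, next_header=6,   # 6 = TCP, no extension headers
                        hop_limit=64,
                        src16=b"\x20\x01" + b"\x00" * 14,
                        dst16=b"\x20\x01" + b"\x00" * 13 + b"\x01")
assert len(hdr) == 40   # the mandatory portion always occupies 40 octets
```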

After the main header of an IPv6 datagram, one or more extension headers may appear. These headers were created to provide both flexibility and efficiency: they are included only when needed, which allows the main datagram header to stay small and streamlined, containing only those fields that really must be present all the time. When extension headers are included in an IPv6 datagram, they appear one after the other following the main header.

One type of IPv6 extension that is of interest to this dissertation is the Destination Options extension header (DstOpt). DstOpt is used to carry options intended for the ultimate destination (DstOpt can also be used for intermediate hops, but in this dissertation we are interested in the DstOpt for the ultimate destination). Figure 5.2 shows the format of the IPv6 DstOpt extension header. The first octet indicates the “Next Header” type of the extension header that follows this extension (or the upper-layer protocol if this is the last header). Next is the header extension length (Hdr Ext Len) field, followed by the option type and option length. If more than one option is needed, the options are stacked one after the other, each preceded by its option type and data length. The Hdr Ext Len field gives the length of the extension header in 8-octet units, not counting the first 8 octets.

Figure 5.2: IPv6 Destination Option Header Format.
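The sketch below assembles one possible DstOpt header carrying the MIPv6 Home Address option (option type 201), which the mobility and anycast mechanisms later in this chapter rely on; a 4-byte PadN option aligns the Home Address option as RFC 6275 prescribes. This illustrates the TLV layout under those RFC conventions and is not code from the dissertation.

```python
# Sketch: Destination Options extension header with a Home Address option.
import struct

def dstopt_home_address(next_header, home_addr16):
    padn = bytes([1, 2, 0, 0])                  # PadN option: 4 bytes of padding
    home_opt = bytes([201, 16]) + home_addr16   # type=201, length=16, 16-byte home address
    options = padn + home_opt
    # Hdr Ext Len is in 8-octet units, not counting the first 8 octets (24 octets -> 2)
    hdr_ext_len = (2 + len(options)) // 8 - 1
    return struct.pack("!BB", next_header, hdr_ext_len) + options

ext = dstopt_home_address(next_header=6, home_addr16=b"\x20\x01" + b"\x00" * 14)
assert len(ext) == 24
```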

5.2.2 TCP

In this chapter, we focus on connection-oriented transport to present our IPv6 anycast architecture. A connection-oriented transport is essential for our architecture as it allows our design to be secure and lightweight. TCP is the dominant connection-oriented transport protocol among all Internet protocols. Our architecture will focus on TCP, although we believe that it can be easily ported to any connection-oriented transport.

Figure 5.3 shows a summary of OS-related actions at the client and server sides

for establishing a typical TCP connection. To establish a connection, the client starts by calling the connect() function, which sends a SYN packet to the server. The client enters the SYN-Sent state and waits for the SYN-ACK from the server.

Figure 5.3: Typical TCP Connection.

On the other side of the connection, assuming the server has already called the listen() and accept() system calls, the server enters the accept loop and waits for a connection request from the client. When the SYN packet is received from the client, the server creates a new socket and adds that socket to the SYN-Received

queue. The server sends the SYN-ACK packet back to the client to acknowledge the connection. At this point the connection is not yet established at the server side. However, on the client side, upon receiving the SYN-ACK packet, the client replies with an ACK packet (which might piggyback the first portion of the client's payload data along the way), and at this point the TCP connection is fully established at the client side. When the ACK packet is received at the server side, the accept() system call returns successfully. The socket is moved out of the accept queue and given to the

server process. At this point the TCP connection is established at the server side. The server can now start sending data back to the client.
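The walk-through above maps directly onto the standard socket API. The self-contained sketch below (loopback address and ephemeral port chosen arbitrarily, assuming IPv6 loopback is available) runs both sides in one process: the server’s listen()/accept() and the client’s connect() correspond to the SYN, SYN-ACK, and ACK steps handled by the operating system underneath.

```python
# Minimal TCP establishment example mirroring Figure 5.3.
import socket, threading

srv = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
srv.bind(("::1", 0))                      # IPv6 loopback, ephemeral port
srv.listen(5)                             # backlog bounds the pending-connection queues
port = srv.getsockname()[1]

def client():
    cli = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
    cli.connect(("::1", port))            # sends SYN; returns once the SYN-ACK is ACKed
    print("client got:", cli.recv(1024))
    cli.close()

threading.Thread(target=client).start()
conn, peer = srv.accept()                 # returns once the client's ACK arrives
conn.sendall(b"hello from server")        # in some cases the server sends data first
conn.close()
srv.close()
```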

5.2.3 IPv6 Mobility Overview

IPv6 Mobility (MIPv6) allows a client (the “correspondent node”, or CN, in mobile IP parlance) to communicate with a mobile node (MN) using the mobile node's home address (the permanent address of the node in its own home network). The mechanism, at a high level, is as follows. The mobile node is represented by two addresses: the permanent home address belonging to the MN's home network and a care-of address that belongs to the currently visited (“foreign”) network and which can change as a result of node mobility. The home network maintains a special router endowed with mobility support, called the home agent. As the mobile node moves from one foreign network to another, it keeps sending updates to its home agent, keeping it informed of its current care-of address. The correspondent node starts communication by sending the first packet to the MN's home address. As this packet enters the MN's home network, it is intercepted by the home agent, which tunnels the packet to the mobile node using the MN's current care-of address. To take the home agent out of the loop for subsequent communication, MIPv6 allows the mobile node to execute route optimization with the correspondent node.

For security reasons discussed shortly, the mobile node initiates route optimization by sending two messages to the correspondent node: HoTI (“home test initiation”) and CoTI (“care-of test initiation”). HoTI is sent through the home agent and CoTI directly. The CN responds to each message with a corresponding test message: HoT

(“home test”) through the home agent and CoT (“care-of test”) directly. Each message contains a piece of crypto material, both of which are needed to construct a puzzle (“binding management key”) that the CN requires to complete the protocol. Once the MN receives both messages, it constructs the puzzle and includes it in a special binding update field in a mobility header in the next data packet to the CN. This packet also includes the MN's care-of address as its source address and the home address in its destination option (DST OPT) header. When the CN receives this packet and verifies the puzzle, it stores the binding in a special cache called the binding cache. The CN utilizes this binding cache to modify the destination address of all packets destined to the home address of the MN. For such packets, the CN changes the destination address at the IP layer from the home address of the MN to the care-of address of the MN. The CN also adds an option header that contains the home address of the MN. At the other end of this communication, a reverse transformation happens within the IP layer of the MN: upon receiving packets from the CN, the MN changes the destination address from the care-of address to the home address. Packets traveling from the MN to the CN are also manipulated at the IP layers of both

CN and MN. The MN replaces the source address of these packets from the home address to the care-of address and also includes the home address in a special option header. The CN, upon receiving such packets, replaces the care-of address in the source field with the home address. The CN also strips off the additional option header. The application code at both ends of the communication is oblivious to any mobility-related issues, including the fact that the effective address of the MN (i.e., the care-of address) changes.
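The address rewriting just described can be sketched as a toy binding cache at the correspondent node's IP layer. The packet representation (a plain dictionary) and method names are hypothetical; the sketch is meant only to show the two directions of the transformation, not an operating-system implementation.

```python
class BindingCache:
    """Toy model of the CN-side binding cache used for MIPv6 route optimization."""

    def __init__(self):
        self.home_to_careof = {}

    def add_verified_binding(self, home, careof):
        # installed only after the HoT/CoT return-routability check succeeds
        self.home_to_careof[home] = careof

    def rewrite_outgoing(self, pkt):
        # packets addressed to the MN's home address are re-addressed to its
        # care-of address, with the home address carried in a destination option
        careof = self.home_to_careof.get(pkt["dst"])
        if careof:
            pkt["home_addr_option"] = pkt["dst"]
            pkt["dst"] = careof
        return pkt

    def rewrite_incoming(self, pkt):
        # reverse transformation: restore the home address as the source so that
        # upper layers never see the care-of address
        home = pkt.get("home_addr_option")
        if home is not None and self.home_to_careof.get(home) == pkt["src"]:
            pkt["src"] = home
            del pkt["home_addr_option"]
        return pkt

cache = BindingCache()
cache.add_verified_binding(home="2001:db8::10", careof="2001:db8:1::99")
print(cache.rewrite_outgoing({"src": "2001:db8::1", "dst": "2001:db8::10"}))
print(cache.rewrite_incoming({"src": "2001:db8:1::99", "dst": "2001:db8::1",
                              "home_addr_option": "2001:db8::10"}))
```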

Observe that the correspondent node cannot simply update its binding once it receives a packet with a new home-to-care-of address mapping. Indeed, this could enable any malicious node to hijack the communication by sending to the CN a packet that maps the MN's home address to the attacker's own address as a new care-of address.

By sending HoT and CoT messages and routing one of them through the home agent, the CN verifies that the host possessing the new care-of address is properly associated with the home agent. Moreover, HoTI and CoTI messages are also important because without them, an attacker could mount a reflected denial of service attack on the MN’s home agent. The attacker could simply send packets to the CN with the MN’s home address and some fake care-of addresses, causing the CN to bombard the home agent with HoT messages. In MIPv6, the CN only sends these messages in response to HoTI messages received from the home agent, i.e., when asked by the recipient.

5.3 Related Work

Previous work proposed using IPv6 mobility support to implement CDN request routing [3] as well as general anycast [82]. For request routing, [3] uses IPv6 Mobility binding updates to direct a client to one of the CDN nodes. Specifically, in the approach sketched in [3], the client starts its Web download from the CDN platform by opening a TCP connection to the IPv6 address of a request router, which acts like a home agent. The latter tunnels this request (i.e., the TCP SYN segment) to a CDN node selected for this client, which responds to the client with the SYN-ACK TCP segment using its own IP address as the source address (which serves as the care-of address of the mobile node) but including the original address of the request router as the home agent address and also supplying a binding update (BU) in the IPv6 mobility header. The client's IP layer then remembers the binding between the two

addresses and uses this new IP address of the CDN node as the destination address for subsequent communication, while also providing the original request router address in the destination option (DST OPT) header. The request routing mechanism sketched in [3] does not address the security issues mentioned in Section 5.2.3, especially the crucial vulnerability that a malicious server can hijack a client's Web interaction. This issue has been addressed in [82], which again leveraged IPv6 mobility mechanisms, but used the full official version of MIPv6, namely the HoTI/CoTI/HoT/CoT protocol, in the context of implementing general anycast. This scheme involves a complex protocol with a two-level hand-off of communication from the home agent to the so-called contact node and then to the final anycast end-point. At each level, the full HoTI/CoTI/HoT/CoT protocol is executed for security purposes. This scheme aims at avoiding any modifications to both the correspondent node and the home agent, and also at making the anycast end-point selection oblivious to upper-layer protocols: packet delivery can switch to another anycast end-point at any time during communication. Both schemes above are free of the drawbacks of both DNS-based and IPv4 anycast-based request routing. Unlike DNS-based request routing, CDN node selection occurs for the actual client and not its LDNS, can be done individually for each request (thus there is no issue with an unpredictable amount of load being redirected), and is done at the time of the request (removing the issue of coarse granularity of control). Unlike anycast-based request routing, request routing can fully reflect both CDN node load and network path conditions, and there is no possibility for session disruption. However, the first approach does not address security issues while the second, as we will show next, is unnecessarily heavyweight for the CDN context.

108 5.4 Lightweight IPv6 Anycast for Connection-Oriented

Communication

We now describe a lightweight IPv6 anycast that derives its efficiency from leveraging connection-oriented communication, which happens to be the predominant mode of Internet communication in general and of CDN-accelerated communication in particular.

To deploy IPv6 anycast, the platform must set up a set of anycast servers that share the same anycast address and a set of unicast servers, each with its own unicast address. These sets need not be distinct. In fact, it would not be uncommon to have a single set of servers where each server has both the shared anycast address and its own unicast address. Neither would it be uncommon to have a single or a small number of anycast servers that act as request routers: they would receive the first packet from a client, select a unicast server for subsequent communication, and hand off the connection to the selected unicast server. Each anycast server has a pre-installed secure channel to each unicast server.

Figure 5.4 shows the message exchange in setting up anycast communication using TCP as a connection-oriented transport protocol, although our scheme can be adapted to any session-oriented communication that includes a session establishment phase. As in any TCP connection, the client starts by sending a SYN packet to the announced IP address of the service, in our case the anycast address. Once the anycast server receives the SYN packet, it selects the unicast server to handle the incoming connection and passes the packet to this server via a secure tunnel. The unicast server responds to the client with a SYN-ACK packet using its own unicast address as the source address and piggybacks the anycast IP address in the DST OPT header.

109 Figure 5.4: TCP Interaction For an IPv6 Anycast Server

A SYN-ACK packet with a DST OPT informs the client that it has reached an anycast service. In order to establish the connection, the client needs to verify that the source of the SYN-ACK packet (the unicast server) is an authentic representative of the originally intended service (as represented by the anycast address, to which the client addressed its initial SYN packet and which is also reported in the DST OPT

field). To this end, the client issues tests very similar to the HoT and CoT messages. It sends the CoT message to the unicast server directly and the HoT message to the anycast address, to be tunneled to the unicast server reported in the DST-OPT. Note that the connection is not yet established at the client side: a SYN-ACK packet from a server of an anycast IPv6 service does not by itself establish the session at the client side. Once the unicast server receives both HoT and CoT messages, the server combines the tests and creates a binding update (BU) message in the same way as a mobile node would in MIPv6. Once the client receives the BU message, it activates

the binding entry in the IP layer's binding cache and replies with a BA (binding acknowledgment) message along with the piggybacked first chunk of application data. At this time, the connection is established at the client side. The binding occurs fully at the IP layer, which means that it is transparent to

transport and higher layers. The connection is now established between the client and the unicast server. The upper layers at the client continue directing communication to the anycast address, which is then mapped securely to the unicast address at the client's IP layer. Figure 5.5 shows the typical TCP interactions for an established connection between a client and a server of the anycast service using its unicast IPv6 address.

Figure 5.5: TCP Interaction For an IPv6 Anycast Established Connection

The above protocol does away with HoTI and CoTI messages. The DST-OPT in the SYN-ACK packet triggers the HoT/CoT test instead. However, the denial of service attack against the home agent (the anycast server in our context) does not apply in our case. Indeed, to cause a reflected HoT message, the attacker would have

to send a well-timed SYN-ACK packet with a correct sequence number acknowledging the correspondent's SYN. Otherwise, this message would be discarded. Further, any unexpected SYN-ACKs would be discarded by the client, so the attacker can at most induce one spurious HoT message per TCP connection being opened by the client.

Even if the attacker can time its message and guess a sequence number, only a small number of such messages would be generated, which could not cause a DoS attack. The security of our approach is further enhanced by the IPv6 requirement that all addresses be provider-specific. The corollary is that the anycast address and all unicast addresses in our approach must share a common prefix. This further prevents a malicious outside node from pretending to be a member of the anycast group. By checking for the common prefix restriction, the client can often detect and discard malicious SYN-ACKs from non-members right away without issuing HoT/CoT messages (see the sketch at the end of this section). In all other aspects, our protocol provides the same protection as MIPv6. Our approach is more modest, and thus more lightweight, than the versatile anycast from [82]. We do not support in-flight TCP hand-off to another server: we believe this is overkill for the CDN case (the primary intended application of our mechanism). Indeed, one can always issue a subsequent range request using a new connection.

Thus the server can simply reset the connection to effect a handoff as proposed in [8]. We also give up another goal of versatile anycast, which is to keep the client unmodified. The reason is that, given CDN clout, requesting clients to download a patch is not unreasonable: this is already often done through an invitation to install a download manager to speed up performance. MIPv6 provides IP-layer authentication for securing control traffic between the MN and the home agent through IPsec [14] to prevent DoS, DDoS, and man-in-the-middle attacks. In our approach, a CDN client acts as a CN while a CDN server acts as an MN. The analog of a home agent can be a router or a gateway that is still part of the CDN network. Since the analogs of both the MN and the home agent are part of the

CDN network, IP-layer authentication for route optimization through IPsec is not needed, as both entities are under the control of the CDN. In our approach, the analog of route optimization occurs at the initial handshake stage of the connection. This means that whatever authentication takes place during connection establishment can still apply at this stage. Because current CDNs do not authenticate clients at the IP layer, at least in the context of CDNs, our approach to anycast does not weaken current security properties. Finally, while we overload the MIPv6 mechanism with support for lightweight anycast, it does not prevent the regular mobility support of MIPv6. Indeed, the client can always move to another network and perform regular MIPv6 route optimization via the full HoTI/CoTI/HoT/CoT protocol. The only exception is the short period while the TCP connection is being established. If the client migrates to another network within this period, it simply needs to reopen its TCP connection anew or delay the execution of route optimization until the original TCP connection is fully established. Similarly, although we do not expect the server to be mobile, it too can in principle migrate to another network and execute the full HoTI/CoTI/HoT/CoT protocol with the client, again as long as this happens outside TCP connection establishment.
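As a small illustration of the common-prefix check mentioned in this section, the function below tests whether the unicast source of a SYN-ACK falls inside the provider prefix of the anycast address the client contacted. The /48 prefix length and the example addresses are assumptions for illustration only.

```python
# Sketch of the client-side common-prefix sanity check performed before the HoT/CoT tests.
from ipaddress import IPv6Address, IPv6Network

def plausible_anycast_member(anycast_addr, unicast_src, prefix_len=48):
    provider_prefix = IPv6Network(f"{anycast_addr}/{prefix_len}", strict=False)
    return IPv6Address(unicast_src) in provider_prefix

print(plausible_anycast_member("2001:db8:100::1", "2001:db8:100:5::42"))   # True: same prefix
print(plausible_anycast_member("2001:db8:100::1", "2001:db8:999::42"))     # False: discard early
```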

5.5 IPv6 Anycast CDN Architecture

The anycast mechanism from the previous subsection can serve as the basis of a CDN with various architectural flavors. We now present one such CDN architecture as an example. This architecture assumes multiple data centers distributed across the Internet, the deployment approach exemplified by Limelight and AT&T (but not Akamai, which pursues a much more dispersed approach). A datacenter consists of an anycast gateway, which subsumes the actions of the anycast server in the anycast protocol as

described in Section 5.4, and a number of edge servers, each of which is responsible for serving content on behalf of the anycast service and which act as unicast servers in our anycast protocol. Each anycast gateway maintains information about the current load on all servers in its datacenter. The anycast gateway is also aware of and maintains secure unicast tunnels to the gateways of all other data centers as well as to every edge server in its own data center. This internal network of anycast gateways is used, as we will see shortly, for global load distribution and management. Finally, each gateway knows the address blocks used in each data center for edge server unicast addresses, so it can immediately tell which valid unicast address belongs to which data center.

Figure 5.6: IPv6 Anycast CDN

Figure 5.6 shows the basic architecture and interactions in such a platform, depicting for simplicity only two data centers, DC1 and DC2. The gateways in each data center share an anycast address A, and every host in the platform including

gateways also maintains a unicast address (U1-U8 in the figure). When a client tries to retrieve content delivered by this CDN, the client starts with a DNS request that eventually gets resolved to the anycast address A. Next, the client sends the TCP SYN packet to this address, which is delivered by virtue of anycast to the “nearest” datacenter, in our case DC2, and thus to the gateway U8. Assuming there are non-overloaded local edge servers, the gateway chooses such an edge server, say, U5, to serve the actual content to the client. The gateway then passes the SYN packet to this edge server, which responds to the client with a SYN-

ACK packet with a DST-OPT header as described in Section 5.4, thus triggering the rest of the anycast handoff.

Figure 5.7: Redirection in IPv6 Anycast CDN

If all local edge servers are overloaded, the anycast gateway will utilize the internal network of anycast gateways to tunnel the SYN to a neighbor gateway as shown in Figure 5.7. In principle, the SYN packet can be handed off from one gateway

to the next until it finds a data center with enough capacity to handle the connection, although in practice we do not expect many such handoff hops. Once the SYN packet reaches the closest datacenter with spare capacity (DC1 in the figure), its gateway (U4 in our case) forwards the SYN packet to a non-overloaded local edge server, say, U3. U3 responds to the client with a SYN-ACK carrying the anycast address A in the DST OPT header. The client will now generate HoT and CoT messages, but by virtue of anycast, the HoT message will likely be delivered to the original gateway U8 and not to the one local to edge server U3. In fact, it is possible that, due to a route change, the HoT message will be delivered to an entirely different gateway. But since the unicast address of the edge server is included in the DST-OPT field of the HoT message, the receiving gateway can map this unicast address to the data center containing it and then route the HoT message to the proper anycast gateway, in this case U4. The rest of the connection establishment with U3 has already been explained in previous sections.
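To make the gateway's role concrete, here is a schematic sketch, under assumed data structures rather than the dissertation's implementation, of its two decisions: forwarding an incoming SYN to a non-overloaded local edge server (or to a neighbor gateway when the data center is full), and routing a stray HoT message to the proper gateway via the per-datacenter unicast address blocks.

```python
# Schematic anycast-gateway decisions; all structures and addresses are illustrative.
from ipaddress import IPv6Address, IPv6Network

class AnycastGateway:
    def __init__(self, local_servers, neighbor_gateways, dc_prefixes):
        self.local_servers = local_servers          # [(unicast_addr, load, capacity), ...]
        self.neighbor_gateways = neighbor_gateways  # other gateways, nearest first
        self.dc_prefixes = dc_prefixes              # {IPv6Network(block): gateway_addr}

    def handle_syn(self, syn_pkt):
        for addr, load, capacity in self.local_servers:
            if load < capacity:                     # pick a non-overloaded local edge server
                return ("tunnel_to_edge", addr, syn_pkt)
        # all local servers overloaded: hand the SYN to a neighbor gateway
        return ("tunnel_to_gateway", self.neighbor_gateways[0], syn_pkt)

    def route_hot(self, hot_pkt):
        # the DST-OPT names the edge server's unicast address; map it to the data
        # center that owns that address block and forward to that gateway
        unicast = IPv6Address(hot_pkt["dst_opt_unicast"])
        for block, gateway in self.dc_prefixes.items():
            if unicast in block:
                return ("forward_to_gateway", gateway, hot_pkt)
        return ("drop", None, hot_pkt)

gw = AnycastGateway(local_servers=[("2001:db8:2::5", 90, 80)],           # overloaded
                    neighbor_gateways=["2001:db8:1::4"],
                    dc_prefixes={IPv6Network("2001:db8:1::/48"): "2001:db8:1::4"})
print(gw.handle_syn({"src": "2001:db8:ffff::c"}))
print(gw.route_hot({"dst_opt_unicast": "2001:db8:1::3"}))
```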

5.6 Summary

The transition to IPv6 is imminent as we are witnessing the exhaustion of IPv4 addresses. IPv6 provides, in addition to a larger addressing space, additional functionality and address types that we can leverage to implement more flexible request routing. This chapter discusses the viability of IPv6 anycast as the basis for request routing. Several mechanisms for session-stable IPv6 anycast have been previously described. In this chapter we discuss these mechanisms and also outline a new approach, which leverages the connection-oriented nature of Web traffic to make the anycast hand-off more secure than some of these mechanisms and simpler than others. We also present an architecture of an anycast IPv6 CDN that utilizes the proposed IPv6 anycast protocol as the redirection mechanism.

Chapter 6

Conclusion

In this dissertation we analyze current mechanisms for CDN request routing, we characterize and quantify their limitations, and we propose new and enhanced mechanisms to implement request routing in CDNs. In particular, in this dissertation we show through a large-scale measurement

study that the currently prevalent DNS-based request routing is fundamentally affected by the properties of the sets of hosts sharing an LDNS. We study these sets (which we call LDNS clusters) and find that, among the two fundamental issues in DNS-based request routing, hidden load and client-LDNS distance, hidden load plays

an appreciable role only for a small number of “elephant” LDNS servers, while the client-LDNS distance is significant in many cases. Further, LDNS clusters vary widely in characteristics, and the largest clusters are actually more compact than others. Thus, a request routing system such as a content delivery network can attempt to balance load by reassigning non-compact LDNSs first, as their clients benefit less from

proximity-sensitive routing anyway. In this dissertation we make the case that anycast CDNs are a practical alternative to DNS-based CDN redirection by revisiting anycast as a CDN redirection mechanism. We present a load-aware IP anycast CDN architecture and load balancing

algorithms to utilize IP anycast's inherent proximity properties without suffering the negative consequences of using IP anycast with session-based protocols. We also evaluate these algorithms using trace data from an operational CDN and show that they perform almost as well as native IP anycast in terms of proximity, manage to keep server load within capacity constraints, and significantly outperform other approaches in terms of the number of session disruptions.

Further, we consider the implications of the impending transition to IPv6 by providing a measurement study of the performance implications of a unilateral enabling of IPv6 by a web site, without requiring any verification or opt-in from the clients. The study shows no evidence of a performance penalty for such unilateral IPv6 adoption and an extremely small increase in failure to download the object (from 0.0038% to 0.0064% of accesses). While the one-second measurement granularity is clearly a limitation of this study, it in any case provides an upper bound of 1 second for any penalty that could not be measured.

Finally, this dissertation argues for the viability of IPv6 anycast as the basis for request routing. Several mechanisms for IPv6 anycast-based request routing have been described; this dissertation discusses these mechanisms and also outlines a new approach. Our proposed approach leverages the connection-oriented nature of Web traffic to make the anycast hand-off more secure than some of these mechanisms and simpler than others. We focus in our architecture on TCP as the connection-oriented transport. However, we believe that our architecture can be ported to any connection-oriented transport.

Bibliography

[1] FreeBSD project. http://www.freebsd.org.

[2] F5 Networks. http://support.f5.com/kb/en-us/archived_products/3-dns/, 2005.

[3] Arup Acharya and Anees Shaikh. Using mobility support for request routing in IPv6 CDNs. In 7th Int. Web Content Caching and Distribution Workshop (WCW), 2002.

[4] Bernhard Ager, Wolfgang Mühlbauer, Georgios Smaragdakis, and Steve Uhlig. Comparing DNS resolvers in the wild. In Proceedings of the 10th annual conference on Internet measurement, IMC ’10, pages 15–21, 2010.

[5] Gagan Aggarwal, Rajeev Motwani, and An Zhu. The load rebalancing problem. In SPAA ’03: Proceedings of the fifteenth annual ACM symposium on Parallel algorithms and architectures, pages 258–265, New York, NY, USA, 2003. ACM.

[6] Akamai. http://www.akamai.com/html/technology/index.html.

[7] Z. Al-Qudah, M. Rabinovich, and M. Allman. Web timeouts and their implications. In Passive and Active Measurement, pages 211–221. Springer, 2010.

[8] Zakaria Al-Qudah, Seungjoon Lee, Michael Rabinovich, Oliver Spatscheck, and Jacobus E. van der Merwe. Anycast-aware transport for content delivery networks. In WWW, pages 301–310, 2009.

[9] Hussein A. Alzoubi, Seungjoon Lee, Michael Rabinovich, Oliver Spatscheck, and Jacobus Van der Merwe. Anycast CDNs revisited. In Proceedings of the 17th International Conference on World Wide Web, WWW ’08, pages 277–286, New York, NY, USA, 2008. ACM.

[10] Hussein A. Alzoubi, Seungjoon Lee, Michael Rabinovich, Oliver Spatscheck, and Jacobus Van Der Merwe. A practical architecture for an anycast CDN. ACM Trans. Web, 5(4):17:1–17:29, October 2011.

[11] Hussein A Alzoubi, Michael Rabinovich, Seungjoon Lee, Oliver Spatscheck, and

Kobus Van Der Merwe. Advanced Content Delivery, Streaming, and Cloud Services, chapter Anycast Request Routing for Content Delivery Networks. Number 5. Wiley, 2014.

[12] Hussein A. Alzoubi, Michael Rabinovich, and Oliver Spatscheck. The anatomy of

LDNS clusters: Findings and implications for web content delivery. In Proceedings of the 22nd International Conference on World Wide Web, WWW ’13, pages 83–94, Republic and Canton of Geneva, Switzerland, 2013. International World Wide Web Conferences Steering Committee.

[13] Hussein A. Alzoubi, Michael Rabinovich, and Oliver Spatscheck. Performance implications of unilateral enabling of IPv6. In Proceedings of the 14th International Conference on Passive and Active Measurement, PAM’13, pages 115–124, Berlin, Heidelberg, 2013. Springer-Verlag.

[14] J. Arkko, V. Devarapalli, and F. Dupont. Using IPsec to Protect Mobile IPv6 Signaling Between Mobile Nodes and Home Agents. IETF RFC 3776, 2004.

[15] http://www.business.att.com/enterprise/Service/ digital-media-solutions-enterprise/, 2010.

[16] Hitesh Ballani, Paul Francis, and Sylvia Ratnasamy. A Measurement-based Deployment Proposal for IP Anycast. In Proc. ACM IMC, Oct 2006.

[17] A. Barbir, B. Cain, F. Douglis, M. Green, M. Hofmann, R. Nair, D. Potter, and O. Spatscheck. Known Content Network (CN) Request-Routing Mechanisms.

RFC 3568, July 2003.

[18] I. Bermudez, M. Mellia, M.M. Munafò, R. Keralapura, and A. Nucci. DNS to the rescue: Discerning content and services in a tangled web. In Proceedings of the 12th ACM SIGCOMM Conference on Internet Measurement, 2012.

[19] A. Biliris, C. Cranor, F. Douglis, M. Rabinovich, S. Sibal, O. Spatscheck, and W. Sturm. CDN brokering. In 6th Int. Workshop on Web Caching and Content Distribution, June 2001.

[20] Alex Biliris, C. Cranor, F. Douglis, M. Rabinovich, S.Sibal, O. Spatscheck, and

W. Sturm. CDN Brokering. Sixth International Workshop on Web Caching and Content Distribution, June 2001.

[21] Cachefly: Besthop global traffic management. http://www.cachefly.com/ video.html.

[22] Matt Calder, Xun Fan, Zi Hu, Ethan Katz-Bassett, John Heidemann, and Ramesh Govindan. Mapping the expansion of google’s serving infrastructure. In Proceedings of the 2013 Conference on Internet Measurement Conference, IMC ’13, pages 313–326, New York, NY, USA, 2013. ACM.

[23] Valeria Cardellini, Michele Colajanni, and Philip S. Yu. Request redirection algorithms for distributed web systems. IEEE Trans. Parallel Distrib. Syst., 14(4):355–368, 2003.

[24] B. Carpenter and K. Moore. Connection of IPv6 domains via IPv4 clouds. RFC 3056, 2001.

[25] M. Casado and M.J. Freedman. Peering through the shroud: The effect of edge opacity on IP-based client identification. In Proceedings of the 4th USENIX conference on Networked systems design & implementation, pages 13–13. USENIX Association, 2007.

[26] Chandra Chekuri and Sanjeev Khanna. A PTAS for the multiple knapsack problem. In SODA ’00: Proceedings of the eleventh annual ACM-SIAM symposium

on Discrete algorithms, pages 213–222, Philadelphia, PA, USA, 2000. Society for Industrial and Applied Mathematics.

[27] Cisco GSS 4400 series global site selector appliances. http://www.cisco.com/ en/US/products/hw/contnetw/ps4162/index.html, 2009.

[28] Michele Colajanni and Philip S. Yu. A performance study of robust load sharing strategies for distributed heterogeneous web server systems. IEEE Trans. Knowl. Data Eng., 14(2):398–414, 2002.

[29] Michele Colajanni, Philip S. Yu, and Valeria Cardellini. Dynamic load balancing

in geographically distributed heterogeneous web servers. In ICDCS, pages 295– 302, 1998.

[30] L. Colitti, S. Gunderson, E. Kline, and T. Refice. Evaluating IPv6 adoption in the Internet. In Passive and Active Measurement Conf., pages 141–150, 2010.

[31] http://www.mesquite.com, 2005.

[32] D. Dagon, N. Provos, C.P. Lee, and W. Lee. Corrupted DNS resolution paths: The rise of a malicious resolution authority. In Proceedings of Network and Distributed Security Symposium (NDSS), 2008.

[33] J. De Clercq, D. Ooms, S. Prevost, and F. Le Faucheur. Connecting IPv6 islands over IPv4 MPLS using IPv6 provider edge routers (6PE). RFC 4798, 2007.

[34] S. Deering and R. Hinden. Internet Protocol Version 6 (IPv6) Specification. IETF RFC 2460, 1998.

[35] Dual stack esotropia. http://labs.apnic.net/blabs/?p=47.

[36] Nick Duffield, Kartik Gopalan, Michael R. Hines, Aman Shaikh, and Jacobus E. Van der Merwe. Measurement informed route selection. Passive and Active Measurement Conference, April 2007. Extended abstract.

[37] Michael J. Freedman, Eric Freudenthal, and David Mazières. Democratizing content publication with Coral. In NSDI, pages 239–252, 2004.

[38] The global internet speedup. http://www.afasterinternet.com/.

[39] Google over IPv6. http://www.google.com/intl/en/ipv6/.

[40] Google Public DNS. Performance Benefits. https://develo- pers.google.com/speed/public-dns/docs/performance.

[41] D. Grossman. New Terminology and Clarifications for Diffserv. IETF RFC 3260, 2002.

[42] T. Hardie. Distributing Authoritative Name Servers via Shared Unicast Addresses. IETF RFC 3258, 2002.

[43] Y. Hei and K. Yamazaki. Traffic analysis and worldwide operation of open 6to4 relays for ipv6 deployment. In IEEE Int. Symp. on Applications and the Internet,

pages 265–268, 2004.

[44] R. Hinden and S. Deering. Internet Protocol Version 6 (IPv6) Addressing Architecture. IETF RFC 3513, 2003.

123 [45] R. Hinden and S. Deering. IP Version 6 Addressing Architecture. IETF RFC 4291, 2006.

[46] Cheng Huang, Ivan Batanov, and Jin Li. A practical solution to the client- LDNS mismatch problem. SIGCOMM Comput. Commun. Rev., 42(2):35–41,

April 2012.

[47] Cheng Huang, D.A. Maltz, Jin Li, and A. Greenberg. Public DNS system and global traffic management. In INFOCOM, 2011 Proceedings IEEE, pages 2615 –2623, 2011.

[48] C. Huitema. Teredo: Tunneling IPv6 over UDP through network address translations (NATs). RFC 4380, 2006.

[49] G. Huston. IPv6 Transition. http://www.potaroo.net/presentations/2009-09- 01-ipv6-transition.pdf, 2009. Presentation at the 3d Meeting of the Australian

Network Operators Group.

[50] Jun-ichiro Itoh. Disconnecting TCP connection toward IPv6 anycast address. IETF Internet-Draft draft-itojun-ipv6-tcp-to-anycast-01.txt, 2001.

[51] Sitaram Iyer, Antony Rowstron, and Peter Druschel. Squirrel: a decentralized peer-to-peer web cache. In PODC, pages 213–222, 2002.

[52] Jaeyeon Jung, Balachander Krishnamurthy, and Michael Rabinovich. Flash Crowds and Denial of Service Attacks: Characterization and Implications for

CDNs and Web Sites. In Proceedings of 11th WWW Conference, 2002.

[53] E. Karpilovsky, A. Gerber, D. Pei, J. Rexford, and A. Shaikh. Quantifying the extent of IPv6 deployment. Passive and Active Measurement Conf., pages 13–22, 2009.

[54] Andy King. The Average Web Page. http://www.optimizationweek.com/reviews/average-web-page/, Oct 2006.

[55] Christian Kreibich, Nicholas Weaver, Boris Nechaev, and Vern Paxson. Netalyzr: illuminating the edge network. In Proceedings of the 10th Annual Conference on

Internet Measurement, IMC ’10, pages 246–259, 2010.

[56] Thomas T. Kwan, Robert McCrath, and Daniel A. Reed. NCSA’s world wide web server: Design and performance. IEEE Computer, 28(11):68–74, 1995.

[57] Y. Lee, A. Durand, J. Woodyatt, and R. Droms. Dual-Stack Lite broadband

deployments following IPv4 exhaustion. RFC 6333, 2011.

[58] http://www.limelightnetworks.com/platform/cdn/, 2010.

[59] R. Liston, S. Srinivasan, and E. Zegura. Diversity in DNS performance measures. In Proceedings of the 2nd ACM SIGCOMM Workshop on Internet measurement,

pages 19–31, 2002.

[60] G. Maier, F. Schneider, and A. Feldmann. NAT usage in residential broadband networks. In 12th Passive and Active Measurement Conf., pages 32–41, 2011.

[61] D. Malone. Observations of IPv6 addresses. Passive and Active Measurement

Conf., pages 21–30, 2008.

[62] Zhuoqing Morley Mao, Charles D. Cranor, Fred Douglis, Michael Rabinovich, Oliver Spatscheck, and Jia Wang. A precise and efficient evaluation of the proximity between web clients and their local DNS servers. In USENIX Annual Technical

Conference, General Track, pages 229–242, 2002.

[63] http://marketshare.hitslink.com/report.aspx?qprid=3.

[64] Maxmind GeoIP city database. http://www.maxmind.com/app/city.

[65] OpenDNS - A Technical Overview. http://www.opendns.com/technology.

[66] J. Pang, A. Akella, A. Shaikh, B. Krishnamurthy, and S. Seshan. On the responsiveness of DNS-based network control. In Proceedings of the 4th ACM SIGCOMM conference on Internet measurement, pages 21–26, 2004.

[67] J. Pang, A. Akella, A. Shaikh, B. Krishnamurthy, and S. Seshan. On the Responsiveness of DNS-based Network Control. In Proceedings of Internet Measurement Conference (IMC), October 2004.

[68] Jeffrey Pang, James Hendricks, Aditya Akella, Roberto De Prisco, Bruce Maggs,

and Srinivasan Seshan. Availability, usage, and deployment characteristics of the domain name system. In Proceedings of the 4th ACM SIGCOMM Conference on Internet measurement, IMC ’04, pages 1–14, 2004.

[69] I. Poese, B. Frank, B. Ager, G. Smaragdakis, and A. Feldmann. Improving

content delivery using provider-aided distance information. In The 10th ACM Internet Measurement Conf., pages 22–34, 2010.

[70] M. Rabinovich and O. Spatscheck. Web caching and replication. Addison-Wesley, 2001.

[71] Michael Rabinovich, Zhen Xiao, and Amit Aggarwal. Computing on the edge: A platform for replicating Internet applications. In Proceedings of the 8th International Workshop on Web Content Caching and Distribution, September 2003.

[72] Amy Reibman, Subhabrata Sen, and Jacobus Van der Merwe. Network Monitoring for Video Quality over IP. Picture Coding Symposium, 2004.

[73] Pekka Savola. Observations of IPv6 traffic on a 6to4 relay. SIGCOMM Comput. Commun. Rev., 35(1):23–28, January 2005.

[74] K. Schomp, T. Callahan, M. Rabinovich, and M. Allman. Assessing the security of client-side DNS infrastructure. Submitted for publication, 2012.

[75] ServerIron DNSProxy. Foundry Networks. http://www.brocade.com/products/all/switches/index.page, 2008.

[76] Anees Shaikh, Renu Tewari, and Mukesh Agrawal. On the effectiveness of DNS-based server selection. In INFOCOM, pages 1801–1810, 2001.

[77] W. Shen, Y. Chen, Q. Zhang, Y. Chen, B. Deng, X. Li, and G. Lv. Observations of IPv6 traffic. In ISECS Int. Colloq. on Computing, Communication, Control,

and Management, volume 2, pages 278–282. IEEE, 2009.

[78] D. Shmoys and E. Tardos. An Approximation Algorithm for the Generalized Assignment Problem. Mathematical Programming, 62:461–474, 1993.

[79] David B. Shmoys, Éva Tardos, and Karen Aardal. Approximation algorithms

for facility location problems (extended abstract). In STOC ’97: Proceedings of the twenty-ninth annual ACM symposium on Theory of computing, New York, NY, USA, 1997. ACM.

[80] Florian Streibelt, Jan Böttger, Nikolaos Chatzis, Georgios Smaragdakis, and

Anja Feldmann. Exploring edns-client-subnet adopters in your free time. In Proceedings of the 2013 Conference on Internet Measurement Conference, IMC ’13, pages 305–312, New York, NY, USA, 2013. ACM.

[81] Daniel Stutzbach, Daniel Zappala, and Reza Rejaie. The scalability of swarming

peer-to-peer content delivery. In NETWORKING, pages 15–26, 2005.

[82] M. Szymaniak and G. Pierre. Enabling service adaptability with versatile anycast. Concurrency and Computation: Practice and Experience, 19(13):1837–1863, 2007.

[83] Michal Szymaniak, Guillaume Pierre, Mariana Simons-Nikolova, and Maarten van Steen. Enabling service adaptability with versatile anycast. Concurrency and Computation: Practice and Experience, 19(13):1837–1863, 2007.

[84] M. Townsley and O. Troan. IPv6 Rapid Deployment on IPv4 Infrastructures

(6rd)–Protocol Specification. RFC 5969, 2010.

[85] Jacobus Van der Merwe, Paul Gausman, Chuck Cranor, and Rustam Akhmarov. Design, Implementation and Operation of a Large Enterprise Content Distribution Network. In 8th International Workshop on Web Content Caching and

Distribution, Sept 2003.

[86] Jacobus Van der Merwe, Subhabrata Sen, and Charles Kalmanek. Streaming Video Traffic: Characterization and Network Impact. In 7th International Workshop on Web Content Caching and Distribution (WCW), Aug 2002.

[87] Jacobus E. Van der Merwe et al. Dynamic Connectivity Management with an Intelligent Route Service Control Point. Proceedings of ACM SIGCOMM INM, October 2006.

[88] Patrick Verkaik, Dan Pei, Tom Scholl, Aman Shaikh, Alex Snoeren, and Jacobus

Van der Merwe. Wresting Control from BGP: Scalable Fine-grained Route Control. In 2007 USENIX Annual Technical Conference, June 2007.

[89] D. Wing and A. Yourtchenko. Happy eyeballs: Success with dual-stack hosts. RFC 6555, April 2012.

[90] X. Zhou and P. Van Mieghem. Hopcount and E2E delay: IPv6 versus IPv4. Passive and Active Measurement Conf., pages 345–348, 2005.
