Resilient Overlay Networks David G. Andersen

Total Page:16

File Type:pdf, Size:1020Kb

Resilient Overlay Networks David G. Andersen Resilient Overlay Networks by David G. Andersen B.S., Computer Science, University of Utah (1998) B.S., Biology, University of Utah (1998) Submitted to the Department of Electrical Engineering and Computer Science in partial fulfillment of the requirements for the degree of Master of Science in Computer Science and Engineering at the MASSACHUSETTS INSTITUTE OF TECHNOLOGY May 2001 ­c Massachusetts Institute of Technology 2001. All rights reserved. Author........................................................................... Department of Electrical Engineering and Computer Science May 16, 2001 Certifiedby....................................................................... Hari Balakrishnan Assistant Professor Thesis Supervisor Accepted by ...................................................................... Arthur C. Smith Chairman, Department Committee on Graduate Students Resilient Overlay Networks by David G. Andersen Submitted to the Department of Electrical Engineering and Computer Science on May 16, 2001, in partial fulfillment of the requirements for the degree of Master of Science in Computer Science and Engineering Abstract This thesis describes the design, implementation, and evaluation of a Resilient Overlay Network (RON), an architecture that allows end-to-end communication across the wide-area Internet to detect and recover from path outages and periods of degraded performance within several seconds. A RON is an application-layer overlay on top of the existing Internet routing substrate. The overlay nodes monitor the liveness and quality of the Internet paths among themselves, and they use this information to decide whether to route packets directly over the Internet or by way of other RON nodes, optimizing application-specific routing metrics. We demonstrate the potential benefits of RON by deploying and measuring a working RON with nodes at thirteen sites scattered widely over the Internet. Over a 71-hour sampling period in March 2001, there were 32 significant outages lasting over thirty minutes each, between the 156 communicating pairs of RON nodes. RON’s routing mechanism was able to detect and recover around all of them, showing that there is, in fact, physical path redundancy in the underlying Internet in many cases. RONs are also able to improve the loss rate, latency, or throughput perceived by data transfers; for example, about 1% of the transfers doubled their TCP throughput and 5% of our transfers saw their loss rate reduced by 5% in absolute terms. These improvements, particularly in the area of fault detection and recovery, demonstrate the benefits of moving some of the control over routing into the hands of end-systems. Thesis Supervisor: Hari Balakrishnan Title: Assistant Professor 2 To my mother, whose unparalleled chicken stock expertise was invaluable. 3 Contents 1 Introduction 12 1.1TheInternetandOverlayNetworks.......................... 12 1.2 Contributions . .................................. 15 2 Background and Related Work 17 2.1InternetRouting.................................... 17 2.1.1 TheOrganizationoftheInternet....................... 17 2.1.2 Routing.................................... 19 2.2OverlayNetworks................................... 22 2.2.1 MBone.................................... 22 2.2.2 6-bone . .................................. 22 2.2.3 TheX-Bone.................................. 22 2.2.4 Yoid / Yallcast . ............................. 23 2.2.5 End System Multicast ............................ 23 2.2.6 ALMI..................................... 23 2.2.7 Overcast . .................................. 24 2.2.8 ContentDeliveryNetworks......................... 24 2.2.9 Peer-to-peerNetworks............................ 24 2.3 Performance Measuring Systems and Tools . .................. 24 2.3.1 NIMI..................................... 25 2.3.2 IDMaps.................................... 25 2.3.3 SPAND.................................... 25 2.3.4 Detour.................................... 26 2.3.5 Commercial Measurement Projects . .................. 26 4 2.3.6 LatencyandLossProbeTools........................ 27 2.3.7 AvailableBandwidthProbeTools...................... 27 2.3.8 LinkBandwidthProbeTools......................... 27 2.3.9 StandardsforTools.............................. 27 3 Problems with Internet Reliability 28 3.1Slowlinkfailurerecovery............................... 29 3.2 Inability to detect performance failures . ....................... 29 3.3 Inability to effectively multi-home . ....................... 30 3.4 Blunt policy expression . ............................. 31 3.5DefiningOutages................................... 31 3.6Summary....................................... 33 4 RON Design and Implementation 34 4.1DesignOverview................................... 34 4.2SystemModel..................................... 36 4.2.1 Implementation................................ 37 4.3BootstrapandMembershipManagement....................... 38 4.3.1 Implementation................................ 39 4.4DataForwarding................................... 40 4.4.1 Implementation................................ 42 4.5 Monitoring Virtual Links . ............................. 44 4.5.1 Implementation................................ 44 4.6Routing........................................ 45 4.7 Link-State Dissemination . ............................. 45 4.8PathEvaluationandSelection............................ 46 4.8.1 Implementation................................ 48 4.9PolicyRouting.................................... 49 4.10PerformanceDatabase................................ 50 4.10.1Implementation................................ 51 4.11Conclusion...................................... 52 5 5 Evaluation 53 5.1 RON Deployment Measurements ........................... 54 5.1.1 Methodology ................................. 54 5.1.2 OvercomingPathoutages.......................... 58 5.1.3 OutageDetectionTime............................ 62 5.1.4 OvercomingPerformanceFailures...................... 63 5.2 Handling Packet Floods . ............................. 72 5.3OverheadAnalysis.................................. 73 5.4RON-inducedFailureModes............................. 76 5.5SimulationsofRONRoutingBehavior........................ 77 5.5.1 Single-hop v. Multi-hop RONs . ....................... 77 5.5.2 RON Route Instability ............................ 77 5.6Summary....................................... 77 6 Conclusion and Future Work 79 6 List of Figures 1-1TheRONoverlayapproach.............................. 14 2-1AnexampleofInternetinterconnections....................... 18 2-2Tier1and2providers................................. 18 3-1LimitationsofBGPpolicy.............................. 31 3-2 TCP throughput vs. loss (Padhye ’98) . ....................... 32 4-1TheRONoverlayapproach.............................. 35 4-2TheRONsystemarchitecture............................. 35 4-3TheRONpacketheader................................ 40 4-4RONforwardingcontrolanddatapath........................ 41 4-5ImplementationoftheIPencapsulatingRONclient................. 43 4-6Theactiveprober................................... 44 4-7 The RON performance database. ........................... 51 5-1 The 13-node RON deployment, geographically . .................. 55 5-212RONnodes,bynetworklocation......................... 56 5-3Loss&RTTsamplingmethod............................ 57 5-4Packetlossratescatterplotcomparison........................ 59 5-5 Packet loss rate scatterplot comparison, dataset 2 .................. 60 5-6Detailedexaminationofanoutage.......................... 61 5-7CDFof30-minutelossrateimprovementwithRON................. 62 5-8 CDF of 30-minute loss rate improvement with RON, dataset 2 ........... 63 5-9Thelimitationsofbidirectionallossprobes...................... 65 5-10 5 minute average latency scatterplot, dataset 1 . .................. 66 7 5-11 5 minute average latency scatterplot, dataset 2 . .................. 67 5-12IndividuallinklatencyCDFs............................. 68 5-13Individualpathlatenciesovertime.......................... 69 5-14Individuallinklossratesovertime.......................... 70 5-15 RON throughput comparison, dataset 1 . ....................... 72 5-16 RON throughput comparison, dataset 2 . ....................... 73 5-17 TCP trace of a flooding attack mitigated by RON .................. 74 5-18Testbedconfiguration.................................. 74 5-19RONencapsulationlatencyoverhead......................... 75 8 List of Tables 3.1Internetlinkfailurerecoverytimes.......................... 30 4.1 Performance database summarization methods . .................. 52 5.1ListofdeployedRONhosts.............................. 54 5.2Ronoutageimprovements.............................. 61 5.3RONlossratechanges................................ 64 5.4 RON loss rate changes in dataset 2 . ....................... 64 5.5 Cases in which the latency and loss optimized paths differ. ............ 71 5.6 TCP throughput overhead with RON . ....................... 76 5.7RONrouteduration.................................. 78 9 These are just a few of the treasures for the stockpot — that magic source from which comes the telling character of the cuisine. - Marion Rombauer Becker, The Joy of Cooking Art is I, science is we. - Claude Bernard Acknowledgments Emerging from my first two years at MIT with my sanity, and a completed thesis, is not something I could have completed alone. I am deeply indebted to many people for their support and patience while I paced, grumbled, bounced,
Recommended publications
  • A Remote Lecturing System Using Multicast Addressing
    MTEACH: A REMOTE LECTURING SYSTEM USING MULTICAST ADDRESSING A Thesis Submitted to the Faculty of Purdue University by Vladimir Kljaji In Partial Fulfillment of the Requirements for the Degree of Master of Science in Electrical Engineering August 1997 - ii - For my family, friends and May. Thank you for your support, help and your love. - iii - ACKNOWLEDGMENTS I would like to express my gratitude to Professor Edward Delp for being my major advisor, for his guidance throughout the course of this work and especially for his time. This work could not have been done without his support. I would also like to thank Pro- fessor Edward Coyle and Professor Michael Zoltowski for their time and for being on my advisory committee. - iv - TABLE OF CONTENTS Page LIST OF TABLES .......................................................................................................viii LIST OF FIGURES........................................................................................................ix LIST OF ABBREVIATIONS.........................................................................................xi ABSTRACT.................................................................................................................xiii 1. INTRODUCTION...................................................................................................1 1.1 Review of Teleconferencing and Remote Lecturing Systems ...............................1 1.2 Motivation ...........................................................................................................2
    [Show full text]
  • What Is Routing?
    What is routing? • forwarding – moving packets between ports - Look up destination address in forwarding table - Find out-port or hout-port, MAC addri pair • Routing is process of populat- ing forwarding table - Routers exchange messages about nets they can reach - Goal: Find optimal route for ev- ery destination - . or maybe good route, or just any route (depending on scale) Routing algorithm properties • Static vs. dynamic - Static: routes change slowly over time - Dynamic: automatically adjust to quickly changing network conditions • Global vs. decentralized - Global: All routers have complete topology - Decentralized: Only know neighbors & what they tell you • Intra-domain vs. Inter-domain routing - Intra-: All routers under same administrative control - Intra-: Scale to ∼100 networks (e.g., campus like Stanford) - Inter-: Decentralized, scale to Internet Optimality A 6 1 3 2 F 1 E B 4 1 9 C D • View network as a graph • Assign cost to each edge - Can be based on latency, b/w, utilization, queue length, . • Problem: Find lowest cost path between two nodes - Must be computed in distributed way Distance Vector • Local routing algorithm • Each node maintains a set of triples - (Destination, Cost, NextHop) • Exchange updates w. directly connected neighbors - periodically (on the order of several seconds to minutes) - whenever table changes (called triggered update) • Each update is a list of pairs: - (Destination, Cost) • Update local table if receive a “better” route - smaller cost - from newly connected/available neighbor • Refresh existing
    [Show full text]
  • SIP Architecture
    383_NTRL_VoIP_08.qxd 7/31/06 4:34 PM Page 345 Chapter 8 SIP Architecture Solutions in this chapter: ■ Understanding SIP ■ SIP Functions and Features ■ SIP Architecture ■ Instant Messaging and SIMPLE Summary Solutions Fast Track Frequently Asked Questions 345 383_NTRL_VoIP_08.qxd 7/31/06 4:34 PM Page 346 346 Chapter 8 • SIP Architecture Introduction As the Internet became more popular in the 1990s, network programs that allowed communication with other Internet users also became more common. Over the years, a need was seen for a standard protocol that could allow participants in a chat, videoconference, interactive gaming, or other media to initiate user sessions with one another. In other words, a standard set of rules and services was needed that defined how computers would connect to one another so that they could share media and communicate.The Session Initiation Protocol (SIP) was developed to set up, maintain, and tear down these sessions between computers. By working in conjunction with a variety of other protocols and special- ized servers, SIP provides a number of important functions that are necessary in allowing communications between participants. SIP provides methods of sharing the location and availability of users and explains the capabilities of the software or device being used. SIP then makes it possible to set up and manage the session between the parties. Without these tasks being performed, communication over a large network like the Internet would be impossible. It would be like a message in a bottle being thrown in the ocean; you would have no way of knowing how to reach someone directly or whether the person even could receive the message.
    [Show full text]
  • The Routing Table V1.12 – Aaron Balchunas 1
    The Routing Table v1.12 – Aaron Balchunas 1 - The Routing Table - Routing Table Basics Routing is the process of sending a packet of information from one network to another network. Thus, routes are usually based on the destination network, and not the destination host (host routes can exist, but are used only in rare circumstances). To route, routers build Routing Tables that contain the following: • The destination network and subnet mask • The “next hop” router to get to the destination network • Routing metrics and Administrative Distance The routing table is concerned with two types of protocols: • A routed protocol is a layer 3 protocol that applies logical addresses to devices and routes data between networks. Examples would be IP and IPX. • A routing protocol dynamically builds the network, topology, and next hop information in routing tables. Examples would be RIP, IGRP, OSPF, etc. To determine the best route to a destination, a router considers three elements (in this order): • Prefix-Length • Metric (within a routing protocol) • Administrative Distance (between separate routing protocols) Prefix-length is the number of bits used to identify the network, and is used to determine the most specific route. A longer prefix-length indicates a more specific route. For example, assume we are trying to reach a host address of 10.1.5.2/24. If we had routes to the following networks in the routing table: 10.1.5.0/24 10.0.0.0/8 The router will do a bit-by-bit comparison to find the most specific route (i.e., longest matching prefix).
    [Show full text]
  • Replication Strategies for Streaming Media
    “replication-strategies” — 2007/4/24 — 10:56 — page 1 — #1 Research Report No. 2007:03 Replication Strategies for Streaming Media David Erman Department of Telecommunication Systems, School of Engineering, Blekinge Institute of Technology, S–371 79 Karlskrona, Sweden “replication-strategies” — 2007/4/24 — 10:56 — page 2 — #2 °c 2007 by David Erman. All rights reserved. Blekinge Institute of Technology Research Report No. 2007:03 ISSN 1103-1581 Published 2007. Printed by Kaserntryckeriet AB. Karlskrona 2007, Sweden. This publication was typeset using LATEX. “replication-strategies” — 2007/4/24 — 10:56 — page i — #3 Abstract Large-scale, real-time multimedia distribution over the Internet has been the subject of research for a substantial amount of time. A large number of mechanisms, policies, methods and schemes have been proposed for media coding, scheduling and distribution. Internet Protocol (IP) multicast was expected to be the primary transport mechanism for this, though it was never deployed to the expected extent. Recent developments in overlay networks has reactualized the research on multicast, with the consequence that many of the previous mechanisms and schemes are being re-evaluated. This report provides a brief overview of several important techniques for media broad- casting and stream merging, as well as a discussion of traditional IP multicast and overlay multicast. Additionally, we present a proposal for a new distribution system, based on the broadcast and stream merging algorithms in the BitTorrent distribution and repli- cation system. “replication-strategies” — 2007/4/24 — 10:56 — page ii — #4 ii “replication-strategies” — 2007/4/24 — 10:56 — page iii — #5 CONTENTS Contents 1 Introduction 1 1.1 Motivation .
    [Show full text]
  • P2P Resource Sharing in Wired/Wireless Mixed Networks 1
    INT J COMPUT COMMUN, ISSN 1841-9836 Vol.7 (2012), No. 4 (November), pp. 696-708 P2P Resource Sharing in Wired/Wireless Mixed Networks J. Liao Jianwei Liao College of Computer and Information Science Southwest University of China 400715, Beibei, Chongqing, China E-mail: [email protected] Abstract: This paper presents a new routing protocol called Manager-based Routing Protocol (MBRP) for sharing resources in wired/wireless mixed networks. MBRP specifies a manager node for a designated sub-network (called as a group), in which all nodes have the similar connection properties; then all manager nodes are employed to construct the backbone overlay network with ring topology. The manager nodes act as the proxies between the internal nodes in the group and the external world, that is not only for centralized management of all nodes to a certain extent, but also for avoiding the messages flooding in the whole network. The experimental results show that compared with Gnutella2, which uses super-peers to perform similar management work, the proposed MBRP has less lookup overhead including lookup latency and lookup hop count in the most of cases. Besides, the experiments also indicate that MBRP has well configurability and good scaling properties. In a word, MBRP has less transmission cost of the shared file data, and the latency for locating the sharing resources can be reduced to a great extent in the wired/wireless mixed networks. Keywords: wired/wireless mixed network, resource sharing, manager-based routing protocol, backbone overlay network, peer-to-peer. 1 Introduction Peer-to-Peer technology (P2P) is a widely used network technology, the typical P2P network relies on the computing power and bandwidth of all participant nodes, rather than a few gathered and dedicated servers for central coordination [1, 2].
    [Show full text]
  • Lab 5.5.2: Examining a Route
    Lab 5.5.2: Examining a Route Topology Diagram Addressing Table Device Interface IP Address Subnet Mask Default Gateway S0/0/0 10.10.10.6 255.255.255.252 N/A R1-ISP Fa0/0 192.168.254.253 255.255.255.0 N/A S0/0/0 10.10.10.5 255.255.255.252 10.10.10.6 R2-Central Fa0/0 172.16.255.254 255.255.0.0 N/A N/A 192.168.254.254 255.255.255.0 192.168.254.253 Eagle Server N/A 172.31.24.254 255.255.255.0 N/A host Pod# A N/A 172.16. Pod#.1 255.255.0.0 172.16.255.254 host Pod# B N/A 172.16. Pod#. 2 255.255.0.0 172.16.255.254 S1-Central N/A 172.16.254.1 255.255.0.0 172.16.255.254 All contents are Copyright © 1992–2007 Cisco Systems, Inc. All rights reserved. This document is Cisco Public Information. Page 1 of 7 CCNA Exploration Network Fundamentals: OSI Network Layer Lab 5.5.1: Examining a Route Learning Objectives Upon completion of this lab, you will be able to: • Use the route command to modify a Windows computer routing table. • Use a Windows Telnet client command telnet to connect to a Cisco router. • Examine router routes using basic Cisco IOS commands. Background For packets to travel across a network, a device must know the route to the destination network. This lab will compare how routes are used in Windows computers and the Cisco router.
    [Show full text]
  • Routing Tables
    Routing Tables A routing table is a grouping of information stored on a networked computer or network router that includes a list of routes to various network destinations. The data is normally stored in a database table and in more advanced configurations includes performance metrics associated with the routes stored in the table. Additional information stored in the table will include the network topology closest to the router. Although a routing table is routinely updated by network routing protocols, static entries can be made through manual action on the part of a network administrator. How Does a Routing Table Work? Routing tables work similar to how the post office delivers mail. When a network node on the Internet or a local network needs to send information to another node, it first requires a general idea of where to send the information. If the destination node or address is not connected directly to the network node, then the information has to be sent via other network nodes. In order to save resources, most local area network nodes will not maintain a complex routing table. Instead, they will send IP packets of information to a local network gateway. The gateway maintains the primary routing table for the network and will send the data packet to the desired location. In order to maintain a record of how to route information, the gateway will use a routing table that keeps track of the appropriate destination for outgoing data packets. All routing tables maintain routing table lists for the reachable destinations from the router’s location.
    [Show full text]
  • The Dos and Don'ts of Videoconferencing in Higher
    The Dos and Don’ts of Videoconferencing in Higher Education HUSAT Research Institute Loughborough University of Technology Lindsey Butters Anne Clarke Tim Hewson Sue Pomfrett Contents Acknowledgements .................................................................................................................1 Introduction .............................................................................................................................3 How to use this report ..............................................................................................................3 Chapter 1 Videoconferencing in Higher Education — How to get it right ...................................5 Structure of this chapter ...............................................................................................5 Part 1 — Subject sections ............................................................................................6 Uses of videoconferencing, videoconferencing systems, the environment, funding, management Part 2 — Where are you now? ......................................................................................17 Guidance to individual users or service providers Chapter 2 Videoconferencing Services — What is Available .....................................................30 Structure of this chapter ...............................................................................................30 Overview of currently available services .......................................................................30 Broadcasting
    [Show full text]
  • Introduction to the Border Gateway Protocol – Case Study Using GNS3
    Introduction to The Border Gateway Protocol – Case Study using GNS3 Sreenivasan Narasimhan1, Haniph Latchman2 Department of Electrical and Computer Engineering University of Florida, Gainesville, USA [email protected], [email protected] Abstract – As the internet evolves to become a vital resource for many organizations, configuring The Border Gateway protocol (BGP) as an exterior gateway protocol in order to connect to the Internet Service Providers (ISP) is crucial. The BGP system exchanges network reachability information with other BGP peers from which Autonomous System-level policy decisions can be made. Hence, BGP can also be described as Inter-Domain Routing (Inter-Autonomous System) Protocol. It guarantees loop-free exchange of information between BGP peers. Enterprises need to connect to two or more ISPs in order to provide redundancy as well as to improve efficiency. This is called Multihoming and is an important feature provided by BGP. In this way, organizations do not have to be constrained by the routing policy decisions of a particular ISP. BGP, unlike many of the other routing protocols is not used to learn about routes but to provide greater flow control between competitive Autonomous Systems. In this paper, we present a study on BGP, use a network simulator to configure BGP and implement its route-manipulation techniques. Index Terms – Border Gateway Protocol (BGP), Internet Service Provider (ISP), Autonomous System, Multihoming, GNS3. 1. INTRODUCTION Figure 1. Internet using BGP [2]. Routing protocols are broadly classified into two types – Link State In the figure, AS 65500 learns about the route 172.18.0.0/16 through routing (LSR) protocol and Distance Vector (DV) routing protocol.
    [Show full text]
  • The Routing Table Lookup Process
    The Routing Table Part 2 – The Routing Table Lookup Process By Rick Graziani Cisco Networking Academy [email protected] Updated: Oct. 10, 2002 This document is the second of two parts dealing with the routing table. Part I discussed the structure of the routing table, and how routes are created. Part II will discuss how the routing table lookup process finds the “best” route within the routing table. What happens when a router receives an IP packet, examines the IP destination address and looks that address up in the routing table? How does the router decide which route in the routing table is the best match? What affect does the subnet mask have on the routing table? How does the router decide whether or not to use a supernet or default route if there is not a better match? We will answer these questions and more in this document. The network we will be using is a simple three router network. RouterA and Router B share the common major network 172.16.0.0/24. RouterB and RouterC are connected by the 192.168.1.0/24 network. You will notice that RouterC also has a 172.16.4.0/24 subnet which is disconnected, or discontiguous, from the rest of the 172.16.0.0 network. 172.16.1.0/24 172.16.3.0/24 172.16.4.0/24 .1 fa0 .1 fa0 .1 fa0 .1 172.16.2.0/24 .2 .1 192.168.1.0/24 .2 s0 s0 s1 s0 Router A Router B Router C Most of this information is best explained by using examples.
    [Show full text]
  • Analyzing the Internet's BGP Routing Table
    Analyzing the Internet’s BGP Routing Table Geoff Huston January 2001 The Internet continues along a path of seeming inexorable growth, at a rate which has, at a minimum, doubled in size each year. How big it needs to be to meet future demands remains an area of somewhat vague speculation. Of more direct interest in the question of whether the basic elements of the Internet can be extended to meet such levels of future demand, whatever they may be. To rephrase this question, are there inherent limitations in the technology of the Internet, or its architecture of deployment that may impact on the continued growth of the Internet to meet ever expanding levels of demand? There are a number of potential areas to search for such limitations. These include the capacity of transmission systems, the switching capacity of routers, the continued availability of addresses and the capability of the routing system to produce a stable view of the overall topology of the network. In this article we will examine the Internet’s routing system and the longer term growth trends that are visible within this system. The structure of the global Internet can be likened to a loose coalition of semi-autonomous constituent networks. Each of these networks operates with its own policies, prices, services and customers. Each network makes independent decisions about where and how to secure supply of various components that are needed to create the network service. The cement that binds these networks into a cohesive whole is the use of a common address space and a common view of routing.
    [Show full text]