Computer Networks
Lecture 17: Inter-domain Routing and BGP

Routing on the Internet

In the beginning there was the ARPANET:
• route using GGP (Gateway-to-Gateway Protocol), a distance vector routing protocol

Problems:
• needed “flag-hour” to update routing protocol
• incompatibility across vendors

Routing on the Internet

Solution: hierarchical routing
• administrative autonomy:
  • each network admin can control routing within its own network
  • internet: network of networks
• allows the Internet to scale:
  • with 200 million hosts, each router can’t store all destinations in its routing table
  • route updates alone will swamp the links

Aggregate routers into regions of “autonomous systems” (ASs)

Hierarchical Routing

Gateway/border router
• neighboring ASs interact to coordinate routing
• direct link to router in other AS(s)
• keeps in its routing table:
  • next hop to other ASs
  • all hosts within its AS
• hosts within an AS only keep a default route to the border router

Routing table at router 3.1:

dest   next
1.*    1.1
2.*    2.1
4.*    2.1
3.2    3.2
3.3    3.3
3.4    3.4

[Figure: AS1 (router 1.1), AS2 (routers 2.1, 2.2), AS3 (routers 3.1 to 3.4), and AS4 (routers 4.1 to 4.4), connected through border routers]

NSFNet

The NSFNet backbone

Area hierarchy:
• backbone/core: NSFNet
• regional networks: MichNet, BARRNET, Los Nettos, Cerfnet, JVCNet, NEARNet, etc.
• campus networks

[Figure: 1989 NSFNet backbone with points of presence (pop), regional networks, customer networks, and users] [Walrand] [Halabi] [Merit Networks]

Hierarchical Routing

Routers in the same AS run same routing protocol
• “intra-AS” routing protocol
• each AS uses its own link metric
• routers in different ASs can run different intra-AS routing protocol
• internal topology is not shared between ASs
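The routing table of router 3.1 from the Hierarchical Routing slide can be sketched as a lookup in which an exact entry for a router inside the AS wins over a per-AS wildcard. This is a minimal sketch: the dictionary layout and the `next_hop` helper are illustrative assumptions, not something the slides define.

```python
# Hypothetical model of router 3.1's table: wildcard entries cover whole
# remote ASs, exact entries cover routers inside AS3.
ROUTES_3_1 = {
    "1.*": "1.1",
    "2.*": "2.1",
    "4.*": "2.1",
    "3.2": "3.2",
    "3.3": "3.3",
    "3.4": "3.4",
}


def next_hop(table, dest):
    """Return the next hop for a destination written as 'AS.router'.

    An exact entry is tried first; otherwise the AS-wide wildcard is used.
    Returns None when the table has no matching entry.
    """
    if dest in table:
        return table[dest]
    as_id = dest.split(".")[0]
    return table.get(as_id + ".*")
```

With this table, a destination in AS4 such as 4.3 is sent toward 2.1, so router 3.1 needs one entry per remote AS rather than one per remote host.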

Commercialization (1994)

Roughly hierarchical

At center: “Tier-1” ISPs
• Tier-1 ASs: top of the Internet hierarchy of ~10 ASs: AOL, AT&T, …, Level3, Verizon/UUNET, NTT, Qwest, … (formerly Cable & …), Sprint, etc.
• full (N²) peering relationships between Tier-1 providers
• have no upstream provider
• national/international coverage

[Figure: Tier-1 providers (AT&T, Sprint, Verizon) interconnect (peer) privately; Tier-1 providers also interconnect at public network access points (NAPs)] [Walrand] [Halabi]

AS Structure: Other ASs

Lower tier providers
• provide transit service to downstream customers
• but, need at least one provider of their own
• typically have national or regional scope
• includes several thousand ASs

Stub ASs
• do not provide transit service to others
• connect to one or more upstream providers
• includes the vast majority (e.g., 85-90%) of the ASs

[Rexford]

“Tier-2” ISPs: Smaller (Often Regional) ISPs

Connect to one or more tier-1 ISPs, possibly other tier-2 ISPs

Tier-2 ISP pays tier-1 ISP for connectivity to rest of Internet
• tier-2 ISP is customer of tier-1 provider

Tier-2 ISPs also peer privately with each other, and interconnect at NAPs

[Figure: tier-2 ISPs attached to tier-1 ISPs, which interconnect directly and at NAPs]

“Tier-3” ISPs and Local ISPs

Last hop (“access”) network (closest to end systems)

Local and tier-3 ISPs are customers of higher tier ISPs connecting them to rest of Internet

[Figure: local ISPs and a tier-3 ISP attached to tier-2 ISPs, which attach to tier-1 ISPs and a NAP]

A Packet Passes Through Many Networks

[Figure: local, tier-3, tier-2, and tier-1 ISPs and a NAP along a packet’s path]

AS Number Trivia

AS number is a 16-bit quantity
• 65,536 unique AS numbers

Some are reserved numbers (e.g., for private ASs)
• only 64,510 are available for public use

Managed by Internet Assigned Numbers Authority (IANA)
• gives blocks of 1,024 to Regional Internet Registries (RIRs)
• RIRs assign AS numbers to institutions
• 49,649 AS numbers in visible use (Feb ’15)

In 2007 started assigning 32-bit AS #s
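The size of the 16-bit and 32-bit AS number spaces follows directly from the bit widths. The sketch below also checks for private-use numbers; the range 64512 to 65534 is an assumption brought in from the IANA private-use reservation, since the slides do not list the reserved values.

```python
ASN16_COUNT = 2 ** 16  # 65,536 distinct 16-bit AS numbers
ASN32_COUNT = 2 ** 32  # size of the 32-bit space assigned since 2007

# Assumption (not stated on the slide): 16-bit private-use AS numbers.
PRIVATE_16 = range(64512, 65535)  # 64512 through 65534 inclusive


def fits_16_bit(asn):
    """True if the AS number can be encoded in 16 bits."""
    return 0 <= asn < ASN16_COUNT


def is_private_16(asn):
    """True if the AS number falls in the assumed 16-bit private-use range."""
    return asn in PRIVATE_16
```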

[Rexford]

Growth of AS numbers

To learn more about Internet AS state see:
• Geoff Huston’s CIDR Report: http://www.cidr-report.org/as2.0/
• CAIDA skitter maps: http://www.caida.org/research/topology/as_core_network/AS_Network.xml

Interdomain Routing

AS-level topology
• destinations are CIDR address prefixes (APs, e.g., 12.0.0.0/8)
• nodes are Autonomous Systems (ASs)
• edges are business relationships

[Figure: AS-level topology with ASs 1 to 7, a client, and a web server] [Rexford]

Challenges for Interdomain Routing

Scale
• address prefixes (APs): 200,000 and growing
• ASs: ~50,000 visible ones, and 60K allocated
• routers: at least in the millions

Proprietary information:
• ASs don’t want to divulge internal topologies
• nor their business relationships with neighbors

Policy
• no Internet-wide notion of a link cost metric
• need control over where you send traffic
• and who can send traffic through you

[Rexford]

Why SPF is not Suitable

Topology information is flooded
• high bandwidth and storage overhead
• nodes must divulge sensitive commercial information

Entire path computed locally per node
• high processing overhead in a large network

Route computation minimizes some notion of total distance
• all traffic must travel on shortest paths

[Rexford]

Why SPF is not Suitable

All nodes need common notion of link costs
• works only if policy is shared and uniform

Incompatible with commercial relationships

[Figure: National ISP1 and ISP2, Regional ISP1 to ISP3, and customers Cust1 to Cust3, with one path marked YES and another marked NO]

[Rexford]

Why Not Distance Vector?

Advantages
• hides details of the network topology
• nodes determine only “next hop” toward the destination

Disadvantages
• route computation still entails minimization of some notion of total distance, which is difficult in an inter-domain setting
• slow convergence due to reliance on counting-to-infinity to detect routing loops

Instead use path vector
• easier loop detection

[after Rexford]
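The counting-to-infinity problem named above can be illustrated with two routers A and B. A reaches destination d directly and B reaches it through A. When A's link to d fails, each router keeps believing the other's stale distance until both reach "infinity". This is a minimal sketch: the two-node setup, the unit link cost, and the cap of 16 (borrowed from RIP) are assumptions for illustration.

```python
INFINITY = 16  # assumption: RIP-style value meaning "unreachable"


def count_to_infinity(link_cost=1, infinity=INFINITY):
    """Simulate A and B after A loses its direct link to d.

    Before the failure A reached d at cost 1 and B reached d via A at cost 2.
    Returns the (dist_A, dist_B) pair recorded after each exchange round.
    """
    dist_a = infinity          # A's direct route is gone
    dist_b = 1 + link_cost     # B still holds its stale route via A
    history = []
    while dist_a < infinity or dist_b < infinity:
        # A's only neighbor is B, so A adopts B's advertised distance.
        dist_a = min(infinity, dist_b + link_cost)
        # B routes via A, so it follows whatever A now advertises.
        dist_b = min(infinity, dist_a + link_cost)
        history.append((dist_a, dist_b))
    return history
```

The returned history grows by two per round until both distances hit the cap, which is the slow convergence that motivates sending whole paths instead of distances.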

Path-Vector Routing

Avoid counting-to-infinity by advertising entire path
• distance vector: send distance metric per destination
• path vector: send the entire path for each destination

Loop detection:
• each node looks for its own node identifier in advertised path
• and discards paths with loops
• e.g., node 1 sees itself in the path (3, 2, 1) and discards the path

Other Advantage: Flexible Policies

Each node can apply local policies
• path selection: which path to use?
• path export: which paths to advertise?

Examples
• node 2 may prefer the path “2, 3, 1” over “2, 1”
• node 1 may not want node 3 to hear of the path “1, 2”
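The loop-detection rule can be sketched in a few lines. A node prepends its own identifier before re-advertising a path and rejects any advertisement that already contains that identifier. The function names `advertise` and `receive` are hypothetical, chosen for this sketch.

```python
def advertise(node_id, installed_path):
    """Prepend this node's identifier before passing the path to a neighbor."""
    return (node_id,) + tuple(installed_path)


def receive(node_id, advertised_path):
    """Return the path to install, or None if it already loops through node_id."""
    if node_id in advertised_path:
        return None  # loop detected: discard the advertisement
    return tuple(advertised_path)


# Node 1 originates the route to d, node 2 extends it, node 3 extends it again.
path_from_1 = advertise(1, ())
path_from_2 = advertise(2, receive(2, path_from_1))
path_from_3 = advertise(3, receive(3, path_from_2))
```

Here `path_from_3` is `(3, 2, 1)`, and `receive(1, path_from_3)` returns `None`, matching the slide's example of node 1 discarding a path in which it sees itself.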

[Figures: nodes 1, 2, and 3 exchanging the advertisements “d: path (1)”, “d: path (2,1)”, and “d: path (3,2,1)”, one of them marked ✗; data traffic paths among nodes 1, 2, and 3] [Rexford] [Rexford]

Internet inter-AS Routing: BGP

BGP (Border Gateway Protocol) is the de facto standard for inter-AS routing
• 06/89 v.1
• 06/90 v.2 EGP (Exterior Gateway Protocol) to BGP transition
• 10/91 v.3 BGP installed
• 07/94 v.4 de facto standard

Internet inter-AS Routing: BGP

BGP provides each AS a means to:
• use prefix-based path-vector protocol
• propagate AP reachability to all routers inside the AS
• obtain AP reachability from neighboring ASs
• determine “good” routes to APs based on reachability information and policy

Inter-AS routing is policy driven, not load-sensitive, generally not QoS-based

When an AS advertises an AP to another AS, it is promising to forward any packets the other AS sends to the AP
• an AS can aggregate CIDR APs in its advertisement
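Aggregation of CIDR address prefixes can be tried out with Python's standard `ipaddress` module, whose `collapse_addresses` function merges adjacent or overlapping networks. This is only a sketch: the `aggregate` helper and the example prefixes are illustrative assumptions, and a real BGP speaker applies further conditions before aggregating.

```python
import ipaddress


def aggregate(prefixes):
    """Merge a list of CIDR prefix strings into the fewest covering prefixes."""
    networks = [ipaddress.ip_network(p) for p in prefixes]
    return [str(n) for n in ipaddress.collapse_addresses(networks)]


# Two adjacent /9 halves can be advertised as one /8.
merged = aggregate(["12.0.0.0/9", "12.128.0.0/9"])

# Prefixes that are not adjacent stay separate.
unmerged = aggregate(["12.0.0.0/9", "13.0.0.0/9"])
```

Here `merged` is `["12.0.0.0/8"]`, so a single advertisement covers both halves of the address range.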

BGP runs over TCP

Pairs of BGP routers (BGP peers) establish semi-permanent TCP connections: BGP sessions
• advantage of using TCP: reliable transmission allows for incremental updates: updates only when changes occur
• disadvantage: TCP congestion control mechanism slows down route updates that could decongest link!

Failure detection:
• TCP doesn’t detect lost connectivity on its own
• instead, BGP must detect failure
• sends KEEPALIVE packets every 60 seconds
• hold timer: 180 seconds

BGP sessions do not correspond to physical links, but rather to business relationships

[after Rexford]

BGP Messages

BGP messages:
• OPEN: opens TCP connection to peer and authenticates sender
• UPDATE: advertises a new active path (or withdraws one no longer available)
• KEEPALIVE: keeps connection alive in the absence of UPDATEs; also acknowledges OPEN request
• NOTIFICATION: reports errors in previous message; also used to close connection

BGP Operations

• Establish session on TCP port 179
• Exchange all active routes
• Exchange incremental updates
• While connection is ALIVE, exchange route UPDATE messages

[Figure: BGP session between AS1 and AS2] [Rexford]

Path Attributes & BGP Routes

When advertising an AP, advertisement includes BGP attributes

Two important attributes:
• AS-PATH: the path vector of ASs through which the advertisement for an AP has passed
• NEXT-HOP: the specific internal-AS router to next-hop AS (there may be multiple exits from current AS to next-hop-AS)
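The KEEPALIVE and hold-timer failure detection from the "BGP runs over TCP" slide can be sketched as a timer that is reset whenever a message arrives from the peer. The 60-second and 180-second values come from the slide. The `HoldTimer` class, its method names, and the use of explicit timestamps instead of a real clock are assumptions made to keep the sketch deterministic.

```python
KEEPALIVE_INTERVAL = 60  # seconds between KEEPALIVE messages (from the slide)
HOLD_TIME = 180          # seconds of silence after which the peer is declared down


class HoldTimer:
    """Tracks when a peer was last heard from on one BGP session."""

    def __init__(self, now=0.0, hold_time=HOLD_TIME):
        self.hold_time = hold_time
        self.last_heard = now

    def message_received(self, now):
        """Call for every KEEPALIVE or UPDATE received from the peer."""
        self.last_heard = now

    def expired(self, now):
        """True once the peer has been silent for longer than the hold time."""
        return now - self.last_heard > self.hold_time


timer = HoldTimer(now=0)
timer.message_received(now=KEEPALIVE_INTERVAL)  # one KEEPALIVE arrives at t=60
```

With a 60-second KEEPALIVE interval, the 180-second hold time tolerates two missed KEEPALIVEs before the session is declared failed.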

Path Attributes & BGP Routes

Sample BGP entry:

destination        NEXT-HOP       AS-PATH
198.32.163.0/24    202.232.1.8    2497 2914 3582 4600

• address range 198.32.163.0/24 is in AS 4600
• to get there, send to next hop router at address 202.232.1.8
• the path there goes through ASs 2497, 2914, 3582, in order

AS path chosen may not be the shortest AS path

Router path may be longer than AS path

[Figure: two paths from s to d, one with 2 AS hops and 11 router hops, the other with 3 AS hops and 7 router hops]

[after Rexford]

Causes of BGP Routing Changes

Topology changes
• equipment going up or down
• deployment of new routers or sessions

BGP session failures
• due to equipment failures, maintenance, etc.
• or, due to congestion on the physical path

Changes in routing policy
• changes in preferences in the routes
• changes in whether the route is exported

Persistent protocol oscillation
• conflicts between policies of different ASs

[Rexford]

BGP Session Failure

Reacting to a failure
• discard all routes learned from the neighbor
• send new updates for any routes that change
• overhead increases with # of routes
• reason why many Tier-1 ASs filter out prefixes longer than /24

[Rexford]

Routing Change: Before and After

AS1
• delete the route (1, 0)
• switch to next route (1, 2, 0)
• send route (1, 2, 0) to AS3

AS3
• sees (1, 2, 0) replace (1, 0)
• compares to route (2, 0)
• switches to using AS2

[Figure: nodes 0, 1, 2, and 3 before and after the change, with routes (1, 0), (2, 0), (1, 2, 0), (3, 1, 0), and (3, 2, 0)]

[Rexford]
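The before-and-after example can be replayed with a small model of one AS's routes toward a single destination. This is a sketch under simplifying assumptions: the `Speaker` class is hypothetical, route selection uses only AS-path length (ties broken by the path itself), and withdrawals with no replacement route are not modeled.

```python
class Speaker:
    """One AS's routes toward a single destination, keyed by advertising neighbor."""

    def __init__(self, asn):
        self.asn = asn
        self.learned = {}  # neighbor -> AS path as advertised by that neighbor

    def learn(self, neighbor, path):
        """Install or replace the path advertised by a neighbor."""
        self.learned[neighbor] = tuple(path)

    def active(self):
        """Return this AS's active path: own AS number plus the shortest learned path."""
        if not self.learned:
            return None
        best = min(self.learned.values(), key=lambda p: (len(p), p))
        return (self.asn,) + best

    def session_failed(self, neighbor):
        """Discard the neighbor's routes; return the new active path if it changed."""
        before = self.active()
        self.learned.pop(neighbor, None)
        after = self.active()
        return after if after != before else None


# AS1 hears destination 0 directly and via AS2; AS3 hears it via AS1 and AS2.
as1 = Speaker(1)
as1.learn(0, (0,))
as1.learn(2, (2, 0))
as3 = Speaker(3)
as3.learn(1, (1, 0))
as3.learn(2, (2, 0))
as3_before = as3.active()

update = as1.session_failed(0)  # AS1 deletes (1, 0) and switches to (1, 2, 0)
as3.learn(1, update)            # AS3 sees (1, 2, 0) replace (1, 0)
as3_after = as3.active()
```

After the failure, `update` is `(1, 2, 0)` and AS3's active path moves from `(3, 1, 0)` to `(3, 2, 0)`, matching the slide's "switches to using AS2".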