What Is The ? (Considering Partial Connectivity) Guillermo Baltra John Heidemann ISI/USC ISI/USC [email protected] [email protected]

ABSTRACT Yet the notion of a globally unique “Internet” today faces After 50 years, the Internet is still defined as “a collection of new challenges. IPv6 was proposed as a replacement for IPv4, interconnected networks”. Yet desires of countries for “their and after 20 years of development, it is coming into its own own internet” (Internet secession?), country-level firewalling, network (as of May 2021, 33% of ’s users access it and persistent disputes all challenge the idea of a over IPv6 [30]). Others have proposed separate networks for single set of “interconnected networks”. We show that the political reasons: In 2019, Russia suggested they were going Internet today has peninsulas of persistent, partial connectiv- to demonstrate a Russian Internet that can operate indepen- ity, and that some outages cause islands where the Internet dently of outside influence [55]. has had strong views at the site is up, but partitioned from the main Internet. We on what is allowed on their domestic Internet, protected by propose a new definition of the Internet defining a single, the Great Firewall [2, 3]. In 2020, Huawei suggested a “new global network while helping us to reason about peninsulas Internet” as a set of protocols [26]. We expand about these and islands and their relationship to Internet outages. We challenges in subsection 2.1 provide algorithms to detect peninsulas and islands, find But such grand changes aside, what is the Internet to- that peninsulas are more common than outages, with thou- day? Most users access the Internet from mobile phones, sands of /24s IPv4 blocks that are part of peninsulas lasting a with most behind Network Address Translation (NAT) de- month or more. Root causes of most peninsula events (45%) vices [54], so they use IP but without a public address. Typi- are transient problems. However, a few long-lived cal security recommendations mandate firewalls, preventing peninsulas events (7%) account for 90% of all peninsula time, general addressing. And efforts such as the Interplanetary and they suggest root causes in country- or AS-level pol- Internet [9, 37] and the Internet-of-Things seek to bring “the icy choices. We also show that islands occur. Our definition Internet” into space or into embedded devices. shows that no single country can unilaterally claim to be If the Internet is networks that interconnect, then persis- “the Internet”, and helps clarify the spectrum from partial tent failures in such interconnection are a flaw. Today’s Inter- reachability to outages in prior work. net consists of many public and private companies that agree to exchange traffic through peering, either as a customer KEYWORDS buying service from a provider, or without cost through a settlement-free peering. Yet disputes over peering have some- Internet, Internet reliability, Network outages, Active mea- times resulted in serious congestion [5], and even long-term surements partial unreachability [39]. This unevenness has been recog- nized and detected experimentally [20] and other systems 1 INTRODUCTION detect and route around partial unreachability [1, 38] The primary contribution of this paper is to provide a defi- arXiv:2107.11439v1 [cs.NI] 23 Jul 2021 What is the Internet? The term was first used to specify an early version of nition of “the Internet”: the Internet is the connected component TCP, but without definition12 [ ]. The first definition was “A of public IP addresses that can reach more than 50% of the active, collection of interconnected networks is an internet” in a public IP addresses (section 2). Prior definitions [12, 27, 46] paper comparing X.25 and the ARPANet as two examples of helped distinguish the Internet from alternatives at the time, [46]. The Federal Networking Council (FNC) added but don’t address today’s challenges. We define an Inter- a common address space, with “an agreement between net- net that can answer questions about who “keeps” the name works to interconnect under an evolving set of protocols, in Internet if a nation or group secede. It also helps reason a globally unique address space to enable universal data de- about fragmented reachability (a previous subject [1, 38]) livery” [27]. More recent work considered namespace (DNS) that arises from routing trouble, and persistent peering dis- and IPv6. Many laypeople do not consider such technical de- putes, large-scale firewalls, and carrier-grade network ad- tails and blur the Internet with applications: the World-Wide dress translation. This definition captures the idea ofa single, Web or even Facebook is their Internet. global Internet implicit in some prior definitions (actually two, one per v4 and v6 address space). Although theoretical definition is intractable to fully realize, it provides anideal (one cannot observe from all addresses) it is the asymptote of to strive for. a technical realizable measurement from specific observers, and is not dependent on assertions of authority. The second contribution of this paper is to develop two new algorithms (section 3) to identify two types of network fragmentation that underlie this definition. First, Taitao de- 2.1 Why is it Hard to Define the Internet? tects peninsulas, when a network can reach some parts of The introduction gave classic but informal definitions of the the Internet directly, but not others. Peninsulas result from Internet from 1974 to 1995, definition useful when the idea peering disputes or long-term firewalls. Second, Chiloe de- of connecting different media was research. However, the tects islands, networks that have internal connectivity but economic and political importance of the Internet since then are sometimes cut off from the Internet as a whole. Oural- has stressed these definitions in several ways. gorithms do not require new active measurements, instead Questions of Internet sovereignty have brought multiple they provide a new interpretation of existing active mea- governments to assert control over the Internet, or at least surement datasets that cover millions of IPv4 /24s blocks their portion. In 2009, some suggested the U.S. should have and several interesting events. To this data we add new al- an Internet “kill switch” [51]. In 2019, Russia suggested they gorithms, and new independent traceroutes to validate our would experiment with an independent Russian Internet. results (section 4). Since the 2000s a number of countries have created and Our third contribution is to use these algorithms to un- grown nation-level firewalls and access control, either for derstand to what degree the current IPv4 Internet suffers content moderation [2, 3] or to disconnect during political reachability problems (section 5). Partial reachability is im- unrest [18]. What happens to the Internet if a major portion portant in prior work that routes through a third party [1, 38], secedes and chooses to operate independently? Are there but that work did not quantify how often these solutions are now two Internets? If not, which part is “the” Internet? required. We find that peninsulas are more common than Economic disputes between companies that operate the outages (as defined in the next paragraph), with thousands of Internet have stressed a unified Internet with long-term peer- /24s IPv4 blocks that are part of peninsulas lasting a month ing disputes. On multiple occasions, peering disputes in the or more. Root causes of most peninsula events (45%) are tran- 2000s resulted in persistent congestion between parts of the sient routing problems. However, a few long-lived peninsulas Internet [5, 20]. Peering disputes between Tier-1 ISPs some- events (7%) account for 90% of all peninsula time, and they times result in the long-term inability of parts of the Internet suggest root causes in country- or AS-level policy choices. to reach to other parts [39]. If one defines a Tier-1 ISP as Finally, this understanding and definition of the Internet, operating default-free routing, then any Tier-1 ISP must di- partial connectivity, and outages (section 2) clarifies prior rectly peer with all other Tier-1 ISPs. When two Tier-1 ISPs studies of Internet outages [31, 47, 49, 52, 53] in section 6. We cannot reach a business agreement to exchange traffic (such show that operational systems converge on the conceptual as if they cannot agree on a price), then their customers definition, given that networks are not just “up” or “down”, cannot reach each other. While both ISPs are part of “the but also partially reachable as peninsulas and sometimes Internet”, what does it mean when portions of the Internet disconnected islands. see persistent unreachability? All of the data used and created in this paper is available at Full allocation of the IPv4 address space is a third stress. no cost (subsection 3.1). Our work poses no ethical concerns: Many users access the public Internet only through net- although we use data on individual IP addresses, we have work address translation, sometimes in multiple layers. Some no way of associating them with individuals. Our work was private networks today exceed designated private address IRB reviewed and identified as non-human subjects research space [4]. Are computers that cannot reach each other part (USC IRB IIR00001648). of “the Internet”? The goal of this paper is to explore these new questions. We provide a definition of the Internet that accounts for Internet secession and Internet islands—groups of comput- 2 HOW DO WE DEFINE THE INTERNET? ers that can reach each other but not outside. We describe When we define the Internet, we differentiate between con- methods to detect long-term reachability problems, Internet ceptual and operational definitions [22]. Our conceptual def- peninsulas. We also relate these issues to localized Internet inition in subsection 2.2 articulates what we would like to outages. We consider network address translation and take observe, while our operational definition in section 3 articu- as an epistemological axiom that there should be one Internet, lates how we can estimate this value. While our conceptual or a clear statement that a single Internet no longer exists. 2 2.2 The Internet: A Conceptual Definition hierarchy [13]. Our definition should not create a hierarchy We define the Internet as the connected component of public or designate “special” addresses. This goal is consistent with IP addresses that can reach each other with at least 50% of recent goals for Internet decentralization [21]. active, public IP addresses. This conceptual definition gives Why reachability and not applications? Users care two Internets, one for the IPv4 address space and one for about applications, and a current user-centric view might IPv6. We give our reasoning for this definition below. be designed around perhaps reachability of HTTP. We focus This definition follows from the informal terms of “in- on reachability at the IP layer as a more fundamental con- terconnected networks”, “IP protocol” and “global address cept and has changed only twice since 1969 (with IPv4 and space”. Their defining commonality is the capacity of two IPv6). The dominant application on the Internet has changed computers connected to reach each other independent of multiple times, and many important applications to other their logical location— can traffic from IP address 푥 reach 푦, networks outside the Internet. (E-mail has been transpar- and can 푦 reach 푥. ently relayed to UUCP and FidoNet, and the web to pre-IP We formalize “an agreement of networks to interconnect” mobile devices with WAP.) by considering reachability over public IP addresses: ad- dresses 푥 and 푦 are interconnected if traffic from 푥 can reach 2.3 The Internet Landscape 푦 and vice versa (that is: 푥 and 푦 can reach each other). Net- Our definition of the Internet highlights the Internet’s “rough works are groups of addresses that can reach each other. edges”—places where reachability is limited by intentional Why More than 50%? We require that the Internet in- partitions, peering and policy disputes, and outages. We cludes more than 50% of active addresses to reflect the prin- next identify three types of problems: outages, islands, and ciple that there is one Internet. We believe there should be a peninsulas, and how they relate to the Internet as a whole. clearly defined Internet even if a major nation (or group of nations) chose to secede. Since only one partition can include 2.3.1 Outages. A number of groups have examined Internet the majority of active addresses, this definition provides an outages [31, 47, 49, 52]. These systems observe the IPv4 Inter- unambiguous, unique definition. net and identify networks that are no longer reachable—they This definition suggests that it is possible for the Internet have left the Internet. Often these systems define outages to fragment: if the current Internet breaks into three dis- operationally (network 푥 is out because none of our Vantage connected components when none has a majority of active Points (VPs) can reach it). addresses. Such a result would end a single, global Internet. Conceptually, an outage is when all computers in a block Addresses, Protocols, and Firewalls: This conceptual are off, such as due to a power outage. If the computers definition of the Internet is intentionally vague about the or their network are on but cannot reach the Internet, we addresses and protocols that define reachability. The Internet consider them islands. This conceptual definition differs from today uses both IPv4 and IPv6; our definition applies to both prior operational definitions where an outage exists when protocols, effectively defining two Internets. Reachability external VPs cannot see the target network. can be tested through measurements with specific protocols, Based on our definition of the Internet, an address 푎 is such as ICMP echo request (pings), or TCP or UDP queries. a host-island if it cannot reach any other address, and it Such a test will result in an operational realization of our is part of the Internet if it can reach at least half of all ac- conceptual definition. Such realizations will differ in detail tive addresses. When it is connected to fewer than half the and accuracy, but we suggest they may converge on the ideal. addresses, it is part of an island or peninsula, defined below. Firewalls complicate this matter by blocking some proto- cols or sources for policy reasons, and stateful firewalls may 2.3.2 Islands: Isolated Networks. An island is a group of admit traffic in one direction, or in both directions after itis public IP addresses that are partitioned from the Internet, opened in one direction. Thus different protocols or times but can continue to communicate among themselves. Both might give different answers, and one could define broad outages and islands are unreachable from an external VP, reachability with any protocol in a firewall-friendly manner, but with islands the computers are still active and talking or narrowly. Our measurements in subsection 3.1 use IPv4 among themselves. and ICMP echo requests; prior studies have compared IPv4 Islands occur when an organization that has a single con- and probing protocol [7, 23, 47]. Although, some routers may nection to the Internet loses its router or link to its ISP. A rate-limit ICMP, this is a rare practice [32]. single-office business may become an island after a router Why all addresses? We consider all addresses equally, failure, yet staff may use a webserver with them in theoffice rather than favor old or important locations or addresses. The even though they are unable to surf the public web. In the Internet is global, and was intentionally designed without a smallest case, an island may be a single address that can only ping itself, that is, an address island. 3 100 However, we have seen country-size islands. In 2017q3 we W 200 observed 8 events when it appears that most or all of China 100 200 up (truth) E stopped responding to external pings. Figure 2 shows the up (implied) 100 number of /24 blocks that were down over time, each spike down (truth) N 200 down (implied) more than 200k /24s, between two to eight hours long. We 18:00 20:00 22:00 Jun-04 02:00 04:00 [2017] Source Vantage Point Target Block Last Octet found no problem reports on network operator mailing lists, so we believe these outages were ICMP-specific and likely Figure 1: A block (65.123.202.0/24) showing a one-hour did not affect web traffic. In addition, we assume the millions island for this block and VP E, while the other 5 VPs of computers inside China continued to operate. We consider (2 shown) cannot reach it. Dataset: A28, 2017q2. these cases examples of China becoming an ICMP-island.

450000 400000 2.3.3 Peninsulas: Partial Connectivity. While outages hap- 350000 pen and create islands, a more pernicious problem is partial 300000 connectivity, when one can reach some destinations, but not 250000

200000 others. We call a group of public IP addresses with partial

Blocks Down 150000 connectivity with the Internet a peninsula. The presence of 100000 peninsulas have been recognized for nearly twenty years, 50000 with overlay networks demonstrated to route around them 0 in RON [1] and later LIFEGUARD [38].

2017-06-29 2017-07-13 2017-07-27 2017-08-10 2017-08-24 2017-09-07 2017-09-21 2017-10-05 Peninsulas occur when a multi-homed network has out- ages or peering disputes upstream, or due to firewalls. Figure 2: Number of blocks down in the whole respon- Examples in IPv6: An example of a persistent peninsula sive Internet. Dataset: A29, 2017q3. is the IPv6 peering dispute between Hurricane Electric and Cogent. These two Tier-1 ISPs are unwilling to peer in IPv6 (presumably they cannot agree to business terms about set- We believe that most outages are actually temporary is- tlements), and since they are both Tier-1 ISPs, they are also lands. The conditions are identical from the outside, but can unwilling to send IPv6 to each other through another party be distinguished by an observer within. (the “no valley” routing policy). This problem has existed A Small Island: Figure 1 shows an example of an island since 2009 [39] and can be observed as of June 2020 in DNS- we have observed. In this graph, each strip shows a differ- MON [50] as a small amount of persistent unreachability for ent VP’s view of the last 156 addresses from the same IPv4 IPv6 across most Root DNS letters. /24 block over 12 hours, starting at 2017-06-03t23:06Z. In We further confirm unreachability between Hurricane each strip, the darkest green dots show positive responses Electric and Cogent users in IPv6 by running traceroutes of that address to an ICMP echo request (a “ping”) from that from their own looking glasses [15, 25] to the other’s DNS observer, and medium gray dots indicate a non-response to server (HE at 2001:470:20::2, and Cogent at 2001:550:1:a::d). a ping. We show inferred state as lighter green or lighter While neither can reach their neighbor’s server, they do gray until the next probe. We show 3 of the 6 VPs, with successfully reach their own. We do not see this problem on probes intervals of about 11 minutes (for methodology, see IPv4 (HE at 74.82.42.42, and Cogent at 66.28.0.14). subsection 3.1). Other IPv6 dispute examples are Cogent with Google [48], The island is indicated by the red bar in the middle of the and Cloudfare with Hurricane Electric [28]. Peering disputes graph, where VP E continues to get positive responses from are often the result of traffic imbalance and disagreements several other addresses (the continuous green bars along about a settlement-free or customer/provider relationship. the top). By contrast, the other 5 VPs (2 VPs here, others in An Example in IPv4: We next explore a real-world exam- Appendix B) show many non-responses during this period. ple of partial reachability to several Polish ISPs that we found For this whole hour, VP E and this network are part of an with our algorithms. On 2017-10-23, for a period of 3 hours island, cut off from the rest of the Internet and the other VPs. starting at 22:02Z, five Polish Autonomous Systems (ASes) Country-size Islands: We would typically say that a had 1716 blocks that were unreachable from five VPs while company that loses Internet access experienced an Internet the same blocks remained reachable from a sixth VP. outage and not call it an island. (In fact, with many com- Figure 3 shows the AS-level relationships at the time of panies depending on cloud-hosted services, loss of Internet the peninsula. Multimedia Polska (AS21021, or MP) provides may well stop all work in the company.) service to the other 4 ISPs. MP has two Tier-1 providers: 4 We can confirm this peninsula with additional observa- tions from CAIDA’s Ark traceroutes. During the event we see 94 unique Ark VPs attempted 345 traceroutes to the affected blocks. Of the 94 VPs, 21 VPs (22%) have their last responsive traceroute hop in the same AS as the target address, and 68 probes (73%) stopped before reaching that AS. The remaining 5 VPs were able to reach the destination AS for some probes, while not for others. (Sample traceroutes are in Appendix A.) Although we do not have a root cause for this peninsula from network operators, large number of BGP Update mes- sages suggests a routing problem. In subsection 5.4 we show Figure 3: AS level topology during the Polish penin- peninsulas are mostly due to policy choices. sula.

3 METHODOLOGY 0 100 W 200 To measure partial outages (islands and peninsulas) one must 0 look for outages from multiple independent locations (see 100 E 200 up (truth) up (implied) subsection 6.2). Most prior systems have focused on data 0 down (truth) 100 N 200 down (implied) from a single site (IODA [17], Chocolatine [31], Disco [53]), Oct-23 06:00 12:00 18:00 Oct-24 06:00 12:00 [2017] Source Vantage Point

Target Block Last Octet or have merged data from multiple sites to filter out partial outages (Thunderping [52], Trinocular [47], Wan et al. [58]). Figure 4: A block (80.245.176.0/24) showing a 3-hour We now explicitly consider taking observations from multi- peninsula accessible only from VP W (top bar) and ple locations to detect partial outages and islands, following not from the other 5 VPs (only 2 shown). Dataset: A30, our definitions in section 2. We propose two new algorithms, 2017q4. Taitao to detect peninsulas, and Chiloe, to detect islands (named after a large Patagonian peninsula and island).

Cogent (AS174) and Tata (AS6453). Before the peninsula, our VPs see MP through Cogent. 3.1 Data Sources At event start, we observe many BGP updates (20,275) Our algorithms use data from Trinocular [47] because it is announcing and withdrawing routes to the affected blocks available at no cost [57], is available since 2014, and covers (see Appendix A.) These updates correspond to Tata announc- most of the IPv4 Internet [6]. Briefly, Trinocular watches ing MP’s prefixes. Perhaps MP changed its peering to prefer about 5M IPv4 /24 blocks. In each probing round of 11 min- Tata over Cogent, or the MP-Cogent link failed. utes, it sends up to 15 ICMP echo requests (pings), stopping Initially, traffic from most VPs continued through Cogent early if it proves the block is reachable. It interprets the re- and was lost; it did not shift to Tata. One VP (W) could sults using Bayesian inference, and merges the results from reach MP through Level3 for the entire event, proving MP six geographically distributed VPs. VPs are in Los Angeles was connected. After 3 hours, we see another burst of BGP (W), Colorado (C), Tokyo (J), Athens (G), Washington, DC updates (23,487 this time) and confirming traffic through (E), and Amsterdam (N). In principle our algorithms apply Tata. to other sources of active probing data as future work. To show what our VPs see, Figure 4 shows address reach- We validate our results using CAIDA’s Ark [10], and use ability for a full block affected by this peninsula, similar to AS numbers from Routeviews [43]. All data used in this paper Figure 1. We see the top VP, W, has some responsive ad- is thus publicly available. dresses the whole time (some rows remain green during the We use Trinocular measurements for 2017q4 because this red-band peninsula period). The other five VPs drop out on time period had six active VPs, allowing us to make strong midnight Oct. 24 We show 2 here (others are similar), and statements about how multiple perspectives help. (We use while a few light green addresses indicate we infer they are 2020q3 data in subsection 5.4 because Ark observed a very reachable, all addresses eventually go gray for unreachable. large number of loops in 2017q4.) Problems with different We know this example is a peninsula because VP W is out- VPs reduced coverage for 2019 and 2020, but we verify and side the block and can reach in (unlike Figure 1 where it was find quantitatively similar results for 2020 data in Appen- inside and could not see out.) dix C). 5 3.2 Taitao: a Peninsula Detector the target block and all available VPs from different countries Peninsulas occur when portions of the Internet are reach- fail. Formally, we say there is a country peninsula when the 푏 푖 able from some locations and not others. Peninsulas can set of observers claiming block is up at time is equal 푂푐 ⊂ 푂 be observed when two VPs disagree on block reachability. to 푖,푏 푖,푏 the set of all available observers with valid With multiple VPs, any state other than all-up or all-down observations at country 푐. suggests a peninsula. 푂up 푂푐 = 푖,푏 (2) There are two challenges to detecting peninsulas. First, 푖,푏 we don’t actually have VPs everywhere. If all VPs are on 3.4 Chiloe: an Island Detector the same “side” of a peninsula, their reachability agree even According to our definition in subsubsection 2.3.2, islands though other potential VPs may disagree. Second, VPs ob- occur when the Internet is partitioned, and the smaller com- servations are not synchronized and are spread over an 11- ponent (that with less than half the active addresses) is the minute probing interval, so we expect that different VPs will island. Typical islands are much, much smaller. make observations at slightly different times. Thus two cor- We can find islands by looking for networks that are only rect observations may see before and after a network change, reachable from less than half of the Internet. However, to producing a false agreement or disagreement. classify such networks as an island and not merely a penin- We identify peninsulas by detecting disagreements in sula, we need to show that it is partitioned. Without global block state by comparing valid VP observations that occur knowledge, it is difficult to prove disconnection. In addi- at about the same time. Since probing rounds occur every tion, if islands are partitioned from VPs, we cannot tell an 11 minutes, we compare measurements within an 11-minute island, where a part of the Internet is disconnected but still window. This approach will see peninsulas that last at least active inside, from an outage, where a part of the Internet is 11 minutes, but may miss briefer ones, or peninsulas where disconnected and also cannot communicate internally. VPs are not on “both sides”. For these reasons, rather than looking for islands in the Formally, 푂 is the set of observers with valid observa- 푖,푏 Internet, we instead look for islands that include VPs in their tions about block 푏 at round 푖. We look for disagreements partition. Because we know the VP is active and scanning we in 푂 . For that purpose, we define 푂up ⊂ 푂 as the set of 푖,푏 푖,푏 푖,푏 can determine how much of the Internet is in its partition, 푏 푖 observers that measure block as up at round . We detect a ruling out an outage, and we can confirm the Internet is not peninsula when: reachable to rule out a peninsula. up 퐵 0 < |푂 | < |푂 | (1) Formally, we say that is the set of all blocks on the 푖,푏 푖,푏 퐵up 퐵 Internet that responded in the last week. 푖,표 ⊆ is the set When only one VP reaches a block, that block can be either 표 푖 퐵dn 퐵 of reachable blocks from observer at round , while 푖,표 ⊆ a peninsula or an island. We cannot distinguish between is its complement. them without more information. In subsection 3.4 we add We detect that observer 표 is in an island when half or more more information from what the positive VP sees about other of the observable Internet appears to be down, that is: blocks to determine islands. 퐵up 퐵dn 0 ≤ | 푖,표 | ≤ | 푖,표 | (3) 3.3 Country-Level Peninsula Detection This method is limited to detecting islands that contain Taitao detects peninsulas based on differences in observa- VPs. Also, because observation is not instantaneous, we must tions. Long-lived peninsulas are likely intentional, from pol- avoid confusing short-lived islands with long-lived penin- icy choices. One policy is filtering based on national bound- sulas. For islands lasting longer than an observation period, 퐵up 퐵up aries, possibly to implement legal requirements about data we also require | 푖,표 | → 0. When | 푖,표 | = 0, then we have an sovereignty or economic boycotts. address island. We identify country-specific peninsulas as a special case of Taitao where a given destination block is reachable (or 3.5 Determining Who Has the Internet unreachable) from only one country, persistently for an The U.S. [51], China [2, 3], and Russia [14] have all proposed extended period of time. (In practice, the ability to detect unplugging from the Internet. Egypt disconnected during country-level peninsulas is somewhat limited because the Arab spring [16], and several countries partially disconnected only country with multiple VPs in our data is the United during exams [19, 24, 29, 34]. States. However, we augment non-U.S. observers with data When the Internet partitions, which part is still “the In- from other non-U.S. sites such as Ark or RIPE Atlas.) ternet”? When a business disconnects, or a small country A country level peninsula occurs when all available VPs leaves, the answer is clear. What if a group of countries leave, from the same country as the target block successfully reach splitting the Internet into huge islands of similar sizes? 6 Our definition of the Internet in subsection 2.2 defines the unreachable message to the target address, loop in the path, Internet as the majority of the active, public IP addresses. Ma- or gap limit exceeded. We discard traceroutes halting because jority uniquely provides an unambiguous, externally evalu- of a gap, since gaps indicate problems on the path, and may able decision—the partition exceeding 50% is the Internet. hide an endpoint that would be reachable if it were directly An implication is that if there are multiple partitions and pinged. none keeps a majority, than the global Internet has ended. To correct for differences in target addresses, we must (A plurality is insufficient.) avoid misinterpreting a block as unreachable when the block is online but Ark’s target address is not, we discard traces sent 4 VALIDATING OUR APPROACH to non-ever-active addresses (not in “퐸(푏)” observed from 16 IPv4 censuses [47]), and blocks for which Ark did not get a We next validate detection with Taitao and Chiloe. We com- single successful response. (Even with this filtering, dynamic pare Taitao peninsulas against independent data (subsec- addressing means Ark still sometimes sees unreachables.) tion 4.1) and examine persistent country level peninsulas (sub- To correct for Ark’s less frequent probing, we compare section 4.2). We then compare Chiloe’s single-observer island Trinocular down events that last 5 hours or more. Ark mea- detection against external observers (subsection 4.3). surements are much less frequent (once every 24 hours) than Trinocular’s 11-minute reporting, so short Trinocular events 4.1 Can Taitao Detect Peninsulas? often have no overlapping Ark observations. To confirm We compare Taitao detections from 6 VPs to independent agreements or conflicting reports from Ark, we require at observations taken from more than 100 VPs in CAIDA’s least 3 Ark observations within the peninsula’s span of time. Ark [10]. This comparison is challenging, because both Taitao We filter out blocks that show frequent transient changes and Ark are operational systems with imperfect results, and or show signs of network-level filtering. We define the “reli- because they differ in details such as probing frequency, tar- able” blocks that we keep as blocks that report as up at least gets, and method. It would be incorrect to declare one as 85% of the quarter from each of the 6 Trinocular VPs. We perfect ground truth; instead we note agreement as likely also discard as flaky blocks with frequently inconsistent re- truth, and discuss the likely cause of disagreements. sponses across VPs. (We consider more than 10 combinations Although Ark data to the target is much less frequent of VP as frequently inconsistent.) than Trinocular, Ark makes observations from 171 global The total number of blocks available for study matching locations, so it provides a fresh perspective, and Ark collects our criteria for both Ark and Taitao is 2M. traceroute data to the target, so it allows us to assess the Results of comparison: Table 1 shows detailed results, where peninsulas begin. We expect to see a strong correla- and Table 2 summarizes our interpretation. In this table, tion between Taitao peninsulas and Ark observations. (We dark green indicates true positives (TP): when (a) either considered RIPE Atlas as another external dataset, but use both Taitao and Ark show mixed results, both indicating a Ark because it systematically traceroutes all /24 blocks.) peninsula, or when (b) Taitao indicates a peninsula (1 to 5 Identifying comparable blocks: We study 21 days of sites up but at least one down), Ark shows all down during Ark observations from 2017-10-10 to -31. Ark provides cov- the event and up before and after. We treat Ark in case (b) as erage of all networks in two ways. With Ark team probing, a positive because Ark probing is infrequent enough (only 40 VPs traceroute to every routed /24 about once per day. one probe per team every 24 hours is guaranteed, further For Ark prefix probing, about 35 VPs send traceroutes to.1 this considers non-퐸(푏) addresses which we discard) that we addresses of all routed /24s every day. We use both types cannot guarantee VPs within the peninsula probing available of data: all three teams and all available prefix probing VPs, target addresses. Because peninsulas are fairly rare, so too and we group results by /24 block of the traceroute’s target are true positives, but we see 184 TPs. address. We show true negatives as light green and neither bold nor Ark differs from Taitao’s input (Trinocular) in three ways: italic. In almost all of these cases (1.4M) both Taitao and Ark the target is a random address or the .1 address in each block; show the block up with no conflicting observations. Because it uses traceroute, not ping; and it probes each block once of dynamic addressing [45], many Ark traceroutes end in a per day, not every 11 minutes. We must account for these failure at the last hop (even after we discard never-reachable). differences when comparing the datasets—Ark traceroutes We therefore count this second most-common result (491k will sometimes fail when a simple ping might succeed. First, cases) as a true negative. For the same reason, we include Ark targets a random address or the .1 address, while Trinoc- the small number (97) of cases where Ark reports conflicting ular targets addresses known to respond, so Trinocular has results and Taitao is all-up, assuming Ark terminates at an a higher response rate. Second, Ark’s traceroutes terminate empty address. We include in this category, the 90 events due to four reasons: success in reaching target address, ICMP 7 Ark U.S. VPs Domestic Only ≤ 5 Foreign > 5 Foreign Total Ark WCE 211 171 47 429 Sites Up Conflicting All Down All Up WCe 0 5 1 6 1 20 6 15 Ark WcE 0 1 0 1 wCE 0 0 0 0 2 13 5 11 Peninsula Non Peninsula Wce 3 40 11 54

3 13 1 5 Peninsula 184 251 (strict) 40 (loose) Trinocular wcE 0 4 5 9 4 26 4 19

Conflicting wCe 0 1 1 2 Non 5 83 13 201 12 1,976,701 Taitao Marginal distr. 214 222 65 501 Trinocular 0 6 97 6 Peninsula 6 491,120 90 1,485,394 Table 3: Trinocular U.S. only Agree Table 1: Trinocular and Ark agree- Table 2: Taitao confusion matrix. blocks validation table. Dataset ment table. Dataset A30, 2017q4. Dataset A30, 2017q4. A30, 2017q4.

where Ark is all-down and Trinocular is all-up. We attribute We validate our observations against Ark, following sub- Ark’s failure to reach its targets to infrequent probing. section 4.1, except retaining blocks with less than 85% up- We mark false negatives as red and bold. For these few time. cases (only 12), all Trinocular VPs are down, but Ark reports We only consider Ark VPs that are able to reach the des- all or some responding. We believe these cases indicate blocks tination (that halt with “success”). We note blocks that can that have chosen to drop Trinocular traffic. only be reached by Ark VPs within the same country as do- Finally, yellow italics shows cases where a Taitao penin- mestic, and blocks that can be reached from VPs located in sula is a false positive, since all Ark probes reached target other countries as foreign. block. This scenario occurs when either some Trinocular In Table 3 we show the number of blocks that uniquely VPs have their traffic filtered, or all Ark VPs are “inside” the responded to all U.S. VP combinations during the quarter. peninsula. Light yellow (strict) shows all the 251 cases that We contrast these results against Ark reachability. Taitao detects. For most of these cases (201), Trinocular has We first examine true positives, when Taitao shows a 5 responding VPs and 1 non-responsive, suggesting network peninsula responsive only to U.S. VPs and nearly all Ark problems are near one of the Trinocular VPs. Discarding VPs confirm this result. We see 211 targets are U.S.-only, and these cases we get 40 (orange), a loose estimate. another 171 are available to only a few non-U.S. countries. The strict scenario sees precision 0.42, recall 0.94, and 퐹1 The specific combinations vary: sometimes allowing access score 0.58, and in the loose scenario, precision improves to from the U.K., or Mexico and Canada. Together these make 0.82 and 퐹1 score to 0.88. We consider these results good, but 382 true positives, most of the 501 cases. with some room for improvement. In yellow italics we show 47 cases of false positives where more than five non-U.S. countries are allowed access. In many cases these include many European countries. In light green we show true negatives. Here we include 4.2 Can We Detect Country-Level blocks that filter one or more U.S. VPs, and are reachable Peninsulas? from Ark VPs in multiple countries, amounting to a total Next, we verify our ability to detect country level peninsu- of 69 blocks. There are other categories involving non-U.S. las (subsection 3.3). Our hypothesis is that national borders sites, along with other millions of true negatives, however, sometimes result in long-term network unreachability. For we only concentrate in these few. example, rather than complying with external legal require- In red and bold we show three false negatives. These three ments (such as the EU’s GDPR [56]), an organization may blocks seem to have strict filtering policies, as they were prohibit access from those countries. Alternatively, a coun- reached only by W and not C or E in the 21 days period. try’s military or government websites may limit access to foreign users. Distinguishing country-level peninsulas from simple out- ages requires multiple VPs in the same country. Unfortu- nately the source data we use only has multiple VPs for the 4.3 Can Chiloe Detect Islands? . We therefore look for U.S.-specific peninsulas Chiloe (subsection 3.4) detects islands when a VP within the where only these VPs can reach the target and the non-U.S.- island can reach less than half the rest of the world. When VPs cannot, or vice versa. We begin by considering the 501 less than 50% of the network replies, it means that the VP cases where Taitao reports that only U.S. VPs can see the is either in an island (for brief events, or when replies drop target, and compare that to how Ark VPs respond. near zero) or a peninsula (long-lived partial replies). 8 Target AS Target Prefix Trinocular Sites Up At Before At Before Blk Island Addr Island Peninsula 0 21,765 32,489 1,775 52,479 Ark 1 587 1,197 113 1,671 Island 2 19 2 Country Pen Non Cntry Pen 2 2,981 4,199 316 6,864 3 12,709 11,802 2,454 22,057 Country 382 47 0 8 566 Peninsula 4 117,377 62,881 31,211 149,047

Peninsula Chiloe 5 101,516 53,649 27,298 127,867 Non Cntry 3 98 1-5 235,170 133,728 61,392 307,506 Specific Country Peninsula Table 5: Chiloe confusion matrix, 6 967,888 812,430 238,182 1,542,136 Table 4: Country specific penin- events between 2017-01-04 and Table 6: Halt location of failed sula detection confusion matrix. 2020-03-31. Datasets A28 through traceroutes for peninsulas longer Dataset A30, 2017q4. A39. than 5 hours. Dataset A41, 2020q3.

To validate Chiloe’s correctness, we compare when a sin- 5 QUANTIFYING ISLANDS AND gle VP believes to be in an island, against what the rest of PENINSULAS the world believes about that VP. We next apply our approach to the Internet. For peninsulas: We establish ground truth at a block level granularity—if how often do they occur (subsection 5.1), how long do they VP 푥 can reach its own block when 푥 believes to be in an 푥 last (subsection 5.2), and how big are they (subsection 5.3)? island, while other external VPs can’t reach ’s block, then These evaluations characterize how effective systems using 푥’s island is confirmed. On the other hand, if an external 푥 푥 overlay routing [1, 38] are. We also look at peninsula location VP can reach ’s block, then is not in island, but in a by ISP (subsection 5.4) and country (subsection 5.5). Finally, peninsula. In subsection 6.2 we show that Trinocular VPs we look at island frequency (subsection 5.6) and the implica- are independent, and therefore no two VPs live within the tions of country-level internet secession (subsection 5.7). same island. We take 3 years worth of data from each of the six Trinoc- ular VPs. Since Trinocular measures block asynchronously 5.1 How Common are Peninsulas? every 11 minutes, we use 11 minute timebins as our snapshot Our goal is to quantify how big the problem of peninsulas is. of the Internet. We ignore cases where the VP can access Outages have brought large attention by researchers both in 95% or more of the Internet. Academia and Industry given the dire effects down time has In Table 5 we show that Chiloe detects 23 islands across over network services [59]. three years. In 2 of these events, the block is unreachable Our approach is to measure the duration of peninsulas from other VPs, confirming the island with our ground-truth across the Internet. We also vary the number of VP used methodology. Manual inspection confirms that the remain- to identify whether peninsula down time converges in a ing 19 events are islands too, but at the address level—the diminishing returns curve. VP was unable to reach anything but did not lose power, We consider three types of events: all up, all down, and and other addresses in its block were reachable from VPs at disagreement. In the first two cases, all available VPs agree other locations. Finally, for 2 events, the prober’s block was that the target block is either up or down. In the last case, reachable during the event by every site including the prober VPs disagree on block state, that is, a peninsula. itself which suggests partial connectivity (a peninsula), and We run Taitao over Trinocular data for 2017q4 [57] to therefore a false positive. determine duration in each state. Figure 5 shows the distri- The 566 non-island events (true negatives) are when more bution of peninsulas measured as a fraction of block-time than 5% but less than 50% of the Internet was inaccessible for an increasing number of sites. We consider all possible from a single VP. In each of these cases, one or more other combinations of the six sites. VPs were able to reach the affected VP’s block, showing they With more VPs we get a better view of the Internet’s over- were not an island (although perhaps a peninsula). We omit all state. As more reporting sites are added, more peninsulas the very frequent events when less than 5% of the network are discovered. That is, previously block states erroneously is unavailable from the VP from the table, although they too inferred as all up or all down, are corrected to peninsulas. are true negatives. All-down (left) decreases from an average of 0.00082 with 2 Bold red shows 8 false negatives. These are events that last VPs to 0.00074 for 6 VPs. All-up (right) goes down a relative about 2 Trinocular rounds or less (22 min), often not enough 47% from 0.9988 to 0.9984, while disagreements (center) in- time for Trinocular to change its belief on block state. crease from 0.0029 to 0.00045. Outages (left) converge after 3 sites, as shown by the fitted curve and decreasing variance. Peninsulas and all-up converge more slowly. 9 All Down Disagreement All Up 0.00200 0.00200 1.0000 0.00000

0.00175 0.00175 0.00025

0.00150 0.00150 0.9995 0.00050

0.00125 0.00125 0.00075

0.00100 0.00100 0.9990 0.00100

0.00075 0.00075 0.00125 Complement

Fraction of Block-Time 0.00050 Fraction of Block-Time 0.00050 Fraction of Block-Time 0.9985 0.00150

0.00025 0.00025 0.00175

0.00000 0.00000 0.9980 0.00200 2 3 4 5 6 2 3 4 5 6 2 3 4 5 6 Number of Reporting Sites Number of Reporting Sites Number of Reporting Sites

Figure 5: Distribution of block-time fraction over sites reporting all down (left), disagreement (center), and all up (right), for events longer than one hour. Dataset A30, 2017-10-06 to 2017-11-16.

We conclude that a few sites (3 or 4) provide a good esti- The number of day-long or multi-day peninsulas is small, mate of true outages, but peninsulas are common and previ- only 1.7M events (7%, the purple line). However, about 90% ously underreported. Six VPs see slightly more block-time of all peninsula-time is in such longer-lived events (the right, in peninsulas than outages. Other quarters (see Appendix C) dashed line), and 50% of time is in events lasting 10 days show a slightly greater difference. or more, even when longer than 5 hours events are less nu- merous (compare the middle, green line to the left, purple line). Events lasting a day are long-enough that can be de- bugged by human network operators, and events lasting 5.2 How Long Do Peninsulas Last? longer than a week are long-enough that they may represent Peninsulas have multiple root causes: some are short-lived policy disputes. Together, these long-lived events suggest routing misconfigurations while others may be long-term that there is benefit to identifying non-transient peninsulas disagreements in routing policy. In this section we determine and addressing the underlying routing problem. the distribution of peninsulas in terms of their duration to determine the prevalence of persistent peninsulas. We will show that there are millions of brief peninsulas, likely to 5.3 What Is the Size of Peninsulas? routing transients, but that 90% of peninsula-time is in long- When network issues cause connectivity problems like penin- lived events (5 h or more). sulas, the size of those problems may vary, from country-size To characterize peninsula duration we use Taitao to detect (see subsection 5.5), to AS-size, and also for routable prefixes peninsulas that occurred during 2017q4. For all peninsulas, or fractions of prefixes. We next examine peninsula sizes. we see 23.6M peninsulas affecting 3.8M unique blocks. If We begin with Taitao peninsula detection at a /24 block instead we look at long-lived peninsulas (at least 5 h), we see level. We match peninsulas across blocks within the same 4.5M peninsulas in 338k unique blocks. Figure 6a examines prefix by start time and duration, both measured inone the duration of these peninsulas in three ways: the cumula- hour timebins. This match implies that the Trinocular VPs tive distribution of the number of peninsulas of each size for observing the blocks as up are also the same. all events (left, solid, purple line), the cumulative distribu- We compare peninsulas to routable prefixes from Route- tion of the number of peninsulas of each size for VP down views [43]. We perform longest prefix match between /24 events longer than 5 hours (middle, solid green line), and blocks and prefixes. the cumulative size of peninsulas for VP down events longer Routable prefixes consist of many blocks, some of which than 5 hours (right, dashed green line). may not be measurable. We therefore define the peninsula- We see that there are many very brief peninsulas (purple prefix fraction for each routed prefix as fraction of blocks line): about 33% last from 20 to 60 minutes (about 2 to 6 in the peninsula that are Trinocular-measurable blocks. To measurement rounds). Such events are not just one-off loss, reduce noise provided by single block peninsulas, we only since they last at least two observation periods. These results consider peninsulas covering 2 or more blocks in a prefix. suggest that while the Internet is robust, there are many Figure 6b shows the number of peninsulas for different small connectivity glitches (7.8M events). prefix lengths and the fraction of the prefix affected bythe In addition, we see some events that are two rounds (20 peninsula as a heat-map, where we group them into bins. minutes) or shorter. Such events could be BGP transients or We see that about 10% of peninsulas are likely due to failures due to random packet loss. routing problems or policies, since 40k peninsulas affect the 10 50k 0.1 1 Events

Fraction 0.0

0 Duration 1.0 1.0 10 2 0.8 0.9 104 0.9 Duration (> 5hrs.) 0.8 0.8 10 3 0.6 0.7 0.7 103 0.6 Peninsulas (all) 0.6 CDF 4 0.4 0.5 0.5 10 102 0.4 0.4 Events 0.3 0.3 Prefix Fraction 5 0.2 Peninsulas (> 5hrs.) Prefix Fraction 10 0.2 101 0.2 Duration Fraction 0.1 0.1 0 0.0 10 6 1 min 10 min 1 hr 6 hrs 1 day 10 days 0.0 100 0 100k 9 11 13 15 17 19 21 23 0.0 0.1 9 11 13 15 17 19 21 23 Peninsula Duration Events Prefix Length Duration Fraction Prefix Length (a) Cumulative events (solid) and duration (b) Number of Peninsulas (c) Duration fraction (dashed)

Figure 6: Peninsulas measured with per-site down events longer than 5 hours. Dataset A30, 2017q4. whole routable prefix. However, a third of peninsulas (101k, terminate before reaching the target AS. Because peninsulas at the bottom of the plot) affect only a very small fraction of are more often at or in an AS, while outages occur in many the prefix. These low prefix-fraction peninsulas suggest that places, it suggests that peninsulas are policy choices. more than half of peninsulas happen inside an ISP and are not due to interdomain routing. 5.5 Country-Level Peninsulas Finally, we show that longer-lived peninsulas are likely Country-specific filtering is a routing policy made bynet- due to routing or policy choices. Figure 6c shows the same works to restrict traffic they receive. We next look into what data source, but weighted by fraction of time each penin- type of organizations actively block overseas traffic. For ex- sula contributes to the total peninsula time during 2017q4. ample, good candidates to restrain who can reach them for Here the larger fraction of weight are peninsulas covering security purposes are government related organizations. full routable prefixes—20% of all peninsula time during the We test for country-specific filtering (subsection 3.3) over quarter (see left margin). 2017q4 and find 429 unique U.S.-only blocks in 95 distinct ASes. We then manually verify each AS categorized by in- 5.4 Where Do Peninsulas Occur? dustry in Table 7. It is suprising how many universities filter Firewalls, link failures, and routing problems cause peninsu- by country. While not common, country specific blocks do las on the Internet. These can either occur inside a given AS, occur. or in upstream providers. To detect where the Internet breaks into peninsulas, we 5.6 How Common Are Islands? look at traceroutes that failed to reach their target address, Multiple groups have shown that there are many network either due to a loop or an ICMP unreachable message. Then, outages in the Internet [31, 47, 49, 52, 53]. We have described we find where these traces halt, and take note whether halt- (section 2) two kinds of outages: full outages where all com- ing occurs at the target AS and target prefix, or before the puters at a site are down (perhaps due loss of power), and target AS and target prefix. islands, where the site is cut off from the Internet but com- For our experiment we run Taitao to detect peninsulas puters at the site can talk between themselves. We next use at target blocks over Trinocular VPs, we use Ark’s tracer- Chiloe to determine how often islands occur. Chiloe provides outes [11] to find last IP address before halt, and we get target a very limited viewpoint—we can only confirm islands that and halting ASNs and prefixes using RouteViews. are around one of the six VPs, but examination of three years In Table 6 we show how many traces halt at or before of data allows us to confirm islands. the target network. The center, gray rows show peninsulas Figure 7 shows the fraction of the Internet that is reachable (disagreement between VPs) with their total sum in bold. from 6 VPs for each 11 minutes, over three years, from 2017- For all peninsulas (the bold row), more traceroutes halt at or 04-01 to 2020-04-01, with a dotted line at the 50% threshold inside the target AS (235k vs. 134k, the left columns), but they that Chiloe uses to detect an island (subsection 3.4). We run more often terminate before reaching the target prefix (308k Chiloe across each VP for this period. vs. 61k, the right columns). This difference suggests policy is Table 9 shows the number of islands per VP over this implemented at or inside ASes, but not at routable prefixes. period. Over the 3 years, all six VPs see from 1 to 5 islands. By contrast, outages (agreement with 0 sites up) more often In addition, we see that islands do not always cause the 11 W G J C N E 1 GC W J WW WE C E W 0.8 N N E

0.6

0.4

0.2 Fraction of InternetDown

0

2017-04-01 2017-07-01 2017-10-01 2018-01-01 2018-04-01 2018-07-01 2018-10-01 2019-01-01 2019-04-01 2019-07-01 2019-10-01 2020-01-01 2020-04-01

Figure 7: Islands detected across 3 years using six VPs. Datasets A28-A39.

Ark Traceroutes 1M 69k 76k 199k 558k 744k 250M Industry ASes Blocks 1.0 ISP 23 138 0.8 Gap Education 21 167 Loop Communications 14 44 0.6 Healthcare 8 18 Government 7 31 0.4 Unreachable Datacenter 6 11 Success IT Services 6 8 0.2 Fraction of Traceroutes

Finance 4 6 0.0 0 1 2 3 4 5 6 Building Security, Cloud, Construction, 6 (1 each type) Trinocular Sites Observing Up Grocery Stores, Insurance, Media Table 7: U.S. only blocks. Dataset A30, 2017q4 Figure 8: Ark traceroutes sent to targets under partial outages (2017-10-10 to -31). Dataset A30. entire Internet to be unreachable, and there are a number of To evaluate the power of any country or RIR to control cases where from 20% to 50% of the Internet is inaccessible. the Internet, Table 8 shows how many IPv4 hosts and IPv6 We believe these cases represent brief islands, since islands /32s are allocated to each Regional Internet Registry (RIR) shorter than an 11 minute complete scan will only be partially [35, 36]. In IPv4, some address ranges have special purposes observed. We find 12 in the 20% to 50% range, all are short, and therefore not allocatable: like Multicast, Class E blocks and 4 are less than 11 minutes. reserved for future use. Additionally, IANA reserves 0/8, 10/8, We conclude that islands do happen, but they are rare, and and 127/8 for local network usage. at irregular times. This finding is consistent with importance We see that no individual RIR or country can secede and of the Internet at the locations where we run VPs. take “the Internet”, because none controls the majority of IPv4 addresses. ARIN has the largest amount with 1673M allocated, that is, 45.2%. Inside ARIN, the United States has the majority of hosts (1617M). 5.7 Can the Internet Partition? This claim also applies to IPv6, where no RIR or country In subsection 3.5 we discussed threats to secede from the surpasses a 50% allocation. RIPE (an RIR) is close with 46.7%, Internet, and suggested that connection to more than half and China and the U.S. have high country allocations. With the addresses defines “the Internet” in the face of partition. most of IPv6 unallocated, these fractions may change. Most threats to secede have been by countries or groups We conclude that no individual country can declare itself of countries. If a country were to exert control over their “the Internet” without reallocating addresses or colluding allocated addresses would result in a country level island or with other countries. Suggesting, that the Internet today is peninsulas. We next use our reachability definition of more an international creation. than 50% to reason about control of the IP address space. We therefore next look at address allocation by country to see if any country or group has enough addresses to secede and claim to be “the Internet” with a majority of addresses. 12 6 OUTAGES REVISITED up (1) or down (0) and disagrees with the others, and for 퐷∗, 푆 6.1 Observed Outage and External Data the pair disagrees with each other. 푃 ranges from 1, where the pair always agrees, to 0, where they always disagree. To evaluate outage classification with conflicting informa- Table 10(a) shows similarity values for each pair of the 6 tion, we consider Trinocular reports and compare to external Trinocular VPs. (We show only half of the symmetric matrix.) information in traceroutes from CAIDA Ark. No two sites have a similarity more than 0.14, and most Figure 8 compares Trinocular with 21 days of Ark topology pairs are under 0.08. This result shows that no two sites are data, from 2017-10-10 to -31 from all 3 probing teams. For particularly correlated. each Trinocular outage we classify the Ark result as success or three types of failure: unreachable, loop, or gap. Trinocular’s 6-site-up case suggests a working network, 7 RELATED WORK and we consider this case as typical. However, we see that A number of works have previously tried to define the In- about 25% of Ark traceroutes are “gap”, where several hops ternet [12, 26, 27, 46]. As discussed in subsection 2.1, they fail to reply. We also see about 2% of traceroutes are un- distinguish the Internet from other networks of their time, reachable (after we discard traceroutes to never reachable but do not address today’s network disputes and secession addresses). Ark probes a random address in each block; many threats. addresses are non-responsive, explaining these. Previous work has looked into the problem of partial out- With 1 to 5 sites up, Trinocular is reporting disagreement. ages. RON provides alternate-path routing around failures We see that the number of Ark success cases (the green, for a mesh of sites [1]. LIFEGUARD, proposes a route failure lower portion of each bar) falls roughly linearly with the remediation system by generating BGP messages to reroute number of successful observers. This consistency suggests traffic through a working38 path[ ]. While both solve the that Trinocular and Ark are seeing similar behavior, and that problem of partial outages, neither quantifies the amount, there is partial reachability—these events with only partial duration, or scope of partial outages in the Internet. Trinocular positive results are peninsulas. Internet scanners have examined bias by location [33], We observe that 5 sites show the same results as all 6, more recently looking for policy-based filtering58 [ ]. We so single-VP failures likely represent problems local to that measure policies with our country specific algorithm, and VP. This suggests that all-but-one is a good algorithm to we extend those ideas to defining the Internet. determine true outages. Outage detection systems have encountered partial out- With only partial reachability, with 1 to 4 VPs (of 6), we ages. Thunderping recognizes the “hosed” state of partial see likely peninsulas. These cases confirm that partial con- replies as something that occurs, but leaves its study to future nectivity is common: while there are 1M traceroutes sent work [52]. Trinocular discards partial outages by reporting to outages where no VP can see the target (the number of the target block “up” if any VP can reach it [47]. To the best events is shown on the 0 bar), there are 1.6M traceroutes sent of our knowledge, prior outage detection systems have not to partial outages (bars 1 to 5), and 850k traceroutes sent both explained and reported partial outages as part of the to definite peninsulas (bars 1 to 4). This result is consistent Internet, nor studied their extent. with the convergence we see in Figure 5. We use the idea of majority to define the Internet in the face of secession. That idea is fundamental in many algo- 6.2 Are the sites independent? rithms for distributed consensus [41, 42, 44], with applica- Our evaluation assumes VPs do not share common network tions for example to certificate authorities [8]. paths. Two VPs in the same location would share the same local outages, but those in different physical locations will 8 CONCLUSIONS often use different network paths, particularly with a “flat- This paper provided a new definition of the Internet to rea- ter” Internet graph [40]. We next quantify this similarity to son about partial connectivity and secession. We developed validate our assumption. the algorithm Taitao, to find peninsulas of partial connec- We next measure similarity of observations between pairs tivity, and Chiloe, to find islands. We showed that partial of VPs. We examine only cases where one of the pair dis- connectivity events are more common than simple outages, agrees with some other VP, since when all agree, we have no and used these definitions to clarify outages and the Internet. new information. If the pair agrees with each other, but not with the majority, the pair shows similarity. If they disagree with each other, they are dissimilar. We quantify similarity 푆푃 ACKNOWLEDGMENTS for a pair of sites 푃 as 푆푃 = (푃1 + 푃0)/(푃1 + 푃0 + 퐷∗), where The authors would like to thank John Wroclawski for his 푃푠 indicates the pair agrees on the network having state 푠 of input on an early version of the paper. 13 RIR IPv4 hosts IPv6 /32s AFRINIC 121M 3.3% 9,661 3% APNIC 892M 24.0% 88,614 27.8% Sites Events /Year China 345M 9.3% 54,849 17.2% W 5 1.67 CJGEN ARIN 1673M 45.2% 56,172 17.6% C 2 0.67 W 0.017 0.031 0.019 0.035 0.020 U.S. 1617M 43.7% 55,026 17.3% J 1 0.33 C 0.077 0.143 0.067 0.049 LACNIC 191M 5.2% 15,298 4.8% G 1 0.33 J 0.044 0.036 0.046 RIPE NCC 826M 22.3% 148,881 46.7% E 3 1.00 G 0.050 0.100 124M 3.3% 22,075 6.9% N 2 0.67 E 0.058 Allocated 221 100% 318,626 100% All 14 4.67 Table 10: Similarities between Table 8: RIR IPv4 hosts and IPv6 Table 9: Islands detected from sites relative to all six. Dataset: /32 allocation [35, 36] 2017-04-01 to 2020-04-01 A30, 2017q4.

The work is supported in part by the National Science topology_dataset.xml. Foundation, CISE Directorate, award CNS-2007106 and NSF- [11] CAIDA. 2020. The CAIDA UCSD IPv4 Routed /24 Topology Dataset - 2028279. The U.S. Government is authorized to reproduce 2020-09-01 to -31. https://www.caida.org/data/active/ipv4_routed_24_ topology_dataset.xml. and distribute reprints for Governmental purposes notwith- [12] Vinton Cerf, Yogen Dalal, and Carl Sunshine. 1974. Specification of standing any copyright notation thereon. Internet Transmission Control Program. RFC 675. Internet Request For Comments. https://doi.org/10.17487/RFC675 [13] David D. Clark. 1988. The Design Philosophy of the DARPA Internet REFERENCES Protocols. In Proceedings of the 1988 (johnh: folder: networking). ACM, [1] David G. Andersen, Hari Balakrishnan, M. Frans Kaashoek, and Robert 106–114. Morris. 2001. Resilient Overlay Networks. In Proceedings of the Sym- [14] CNBC. 2019. Russia just brought in a law to try to disconnect its posium on Operating Systems Principles. ACM, Chateau Lake Louise, Internet from the rest of the world. https://www.cnbc.com/2019/11/01/ Alberta, Canada, 131–145. http://www-cse.ucsd.edu/sosp01/papers/ russia-controversial-sovereign-internet-law-goes-into-force.html. andersen.pdf [15] Cogent. 2021. Looking Glass. https://cogentco.com/en/looking-glass. [2] Anonymous. 2012. The collateral damage of by [16] James Cowie. 2011. Egypt Leaves the Internet. Renesys DNS injection. ACM Computer Communication Review 42, 3 (July 2012), http://www.renesys.com/blog/2011/01/egypt-leaves-the-internet. 21–27. https://doi.org/10.1145/2317307.2317311 shtml. http://www.renesys.com/blog/2011/01/egypt-leaves-the- [3] Anonymous. 2014. Towards a Comprehensive Picture of the Great internet.shtml Firewall’s DNS Censorship. In Proceedings of the USENIX Workshop on [17] Alberto Dainotti, kc claffy, Alistair King, Vasco Asturiano, Karyn Ben- Free and Open Communciations on the Internet. USENIX, San Diego, son, Marina Fomenkov, Brad Huffaker, Young Hyun, Ken Keys, Ryan CA, USA, 7. https://www.usenix.org/system/files/conference/foci14/ Koga, Alex Ma, Chiara Orsini, and Josh Polterock. 2017. IODA: In- foci14-anonymous.pdf ternet Outage Detection & Analysis. Talk at CAIDA Active Internet [4] Cathy Aronson. 2015. To Squat Or Not To Squat? blog https://teamarin. Measurement Workshop (AIMS). http://www.caida.org/publications/ net/2015/11/23/to-squat-or-not-to-squat/. https://teamarin.net/2015/ presentations/2017/ioda_aims/ioda_aims.pdf 11/23/to-squat-or-not-to-squat/ [18] Alberto Dainotti, Claudio Squarcella, Emile Aben, Marco Chiesa, Kim- [5] ArsTechnica. 2010. Peering problems: digging into the Com- berly C. Claffy, Michele Russo, and Antonio Pescapé. 2011. Analysis of cast/Level 3 grudgematch. https://arstechnica.com/tech-policy/2010/ Country-wide Internet Outages Caused by Censorship. In Proceedings 12/comcastlevel3/. of the ACM Internet Measurement Conference. ACM, Berlin, Germany, [6] Guillermo Baltra and John Heidemann. 2020. Improving Coverage 1–18. https://doi.org/10.1145/2068816.2068818 of Internet Outage Detection in Sparse Blocks. In Proceedings of the [19] Dhaka Tribune Desk. 2018. Internet services to be sus- Passive and Active Measurement Workshop. Springer, Eugene, Oregon, pended across the country. Dhaka Tribune (Feb. 11 2018). USA. https://www.isi.edu/%7ejohnh/PAPERS/Baltra20a.html http://www.dhakatribune.com/regulation/2018/02/11/internet- [7] Genevieve Bartlett, John Heidemann, and Christos Papadopoulos. 2007. services-suspended-throughout-country/ Understanding Passive and Active Service Discovery. In Proceedings of [20] Amogh Dhamdhere, David D. Clark, Alexander Gamero-Garrido, the ACM Internet Measurement Conference. ACM, San Diego, California, Matthew Luckie, Ricky K. P. Mok, Gautam Akiwate, Kabir Gogia, USA, 57–70. https://doi.org/10.1145/1298306.1298314 Vaibhav Bajpai, Alex C. Snoeren, and kc claffy. 2018. Inferring Persis- [8] Henry Birge-Lee, Yixin Sun, Anne Edmundson, Jennifer Rexford, and tent Interdomain Congestion. In Proceedings of the ACM SIGCOMM Prateek Mittal. 2018. Bamboozling certificate authorities with BGP. In Conference. ACM, Budapest, Hungary, 1–15. https://doi.org/10.1145/ 27th USENIX Security Symposium. USENIX, Baltimore, Maryland, USA, 3230543.3230549 833–849. [21] DINRG. 2021. Decentralized Internet Infrastructure Research Group. [9] Scott Burleigh, Vinton Cerf, Robert Durst, Kevin Fall, Adrian Hooke, https://irtf.org/dinrg. Keith Scott, and Howard Weiss. 2003. The InterPlaNetary Internet: a [22] Peter K. Dunn. 2021. Scientific Research Methods. https://bookdown. communications infrastructure for Mars exploration. Acta astronautica org/pkaldunn/Book/. 53, 4-10 (2003), 365–373. [10] CAIDA. 2017. The CAIDA UCSD IPv4 Routed /24 Topology Dataset - 2017-10-10 to -31. https://www.caida.org/data/active/ipv4_routed_24_ 14 [23] Zakir Durumeric, Michael Bailey, and J Alex Halderman. 2014. An [42] Leslie Lamport, Robert Shostak, and Marshall Pease. 1982. The Byzan- Internet-wide view of Internet-wide scanning. In 23rd {USENIX} Se- tine Generals Problem. ACM Transactions on Programming Languages curity Symposium ({USENIX} Security 14). 65–78. and Systems 4, 3 (July 1982), 382–401. [24] Economist Editors. 2018. Why some countries are turning [43] D. Meyer. 2018. University of Oregon Routeviews. http://www. off the internet on exam days. The Economist (July 5 2018). routeviews.org. https://www.economist.com/middle-east-and-africa/2018/07/05/ [44] Satoshi Nakamoto. 2009. Bitcoin: A Peer-to-Peer Electronic Cash why-some-countries-are-turning-off-the-internet-on-exam-days System. Released publically http://bitcoin.org/bitcoin.pdf. http:// (Appeared in the Middle East and Africa print edition). bitcoin.org/bitcoin.pdf [25] Hurricane Electric. 2021. Looking Glass. http://lg.he.net/. [45] Ramakrishna Padmanabhan, Amogh Dhamdhere, Emile Aben, kc claffy, [26] Engadget. 2020. China, Huawei propose with a built- and Neil Spring. 2016. Reasons Dynamic Addresses Change. In Proceed- in killswitch. https://www.engadget.com/2020-03-30-china-huawei- ings of the ACM Internet Measurement Conference (johnh: pafile). ACM, new-ip-proposal.html. Santa Monica, CA, USA, 183–198. https://doi.org/10.1145/2987443. [27] Federal Networking Council (FNC). 1995. Definition of “Internet”. 2987461 https://www.nitrd.gov/fnc/internet_res.pdf. [46] Jonathan B. Postel. 1980. Internetwork Protocol Approaches. IEEE [28] HE forums. 2017. Cloudflare Blocked on Free Tunnels now? https: Trans. Comput. 28, 4 (April 1980), 604–611. https://doi.org/10.1109/ //forums.he.net/index.php?topic=3805.0. TCOM.1980.1094705 [29] Samuel Gibbs. 1996. Iraq shuts down the Internet to [47] Lin Quan, John Heidemann, and Yuri Pradkin. 2013. Trinocular: Under- stop pupils cheating in exams. (18 May 1996). https: standing Internet Reliability Through Adaptive Probing. In Proceedings //www.theguardian.com/technology/2016/may/18/iraq-shuts- of the ACM SIGCOMM Conference. ACM, , China, 255–266. down-internet-to-stop-pupils-cheating-in-exams https://doi.org/10.1145/2486001.2486017 [30] Google. 2021. Google IPv6 Statistics. https://www.google.com/intl/en/ [48] Dan Rayburn. 2016. Google Blocking IPv6 Adoption With Cogent, Im- ipv6/statistics.html. pacting Transit Customers. https://seekingalpha.com/article/3948876- [31] Andreas Guillot, Romain Fontugne, Philipp Winter, Pascal Merindol, google-blocking-ipv6-adoption-cogent-impacting-transit- Alistair King, Alberto Dainotti, and Cristel Pelsser. 2019. Chocolatine: customers. Outage Detection for Internet Background Radiation. In Proceedings of [49] Philipp Richter, Ramakrishna Padmanabhan, Neil Spring, Arthur the IFIP International Workshop on Traffic Monitoring and Analysis. IFIP, Berger, and David Clark. 2018. Advancing the Art of Internet Edge Paris, , 8. https://clarinet.u-strasbg.fr/~pelsser/publications/ Outage Detection. In Proceedings of the ACM Internet Measurement Guillot-chocolatine-TMA2019.pdf Conference. ACM, Boston, Massachusetts, USA, 350–363. https: [32] Hang Guo and John Heidemann. 2018. Detecting ICMP Rate Limiting //doi.org/10.1145/3278532.3278563 in the Internet. In Proceedings of the Passive and Active Measurement [50] RIPE NCC. 2020. DNSMON. https://atlas.ripe.net/dnsmon. Workshop (johnh: pafile). Springer, Berlin, Germany, to appear. https: [51] Sen. John D. Rockefeller. 2009. Cybersecurity Act of 2010. https: //www.isi.edu/%7ejohnh/PAPERS/Guo18a.html //www.congress.gov/bill/111th-congress/senate-bill/773. [33] John Heidemann, Yuri Pradkin, Ramesh Govindan, Christos Pa- [52] Aaron Schulman and Neil Spring. 2011. Pingin’ in the Rain. In Pro- padopoulos, Genevieve Bartlett, and Joseph Bannister. 2008. Census ceedings of the ACM Internet Measurement Conference. ACM, Berlin, and Survey of the Visible Internet. In Proceedings of the ACM Inter- Germany, 19–25. https://doi.org/10.1145/2068816.2068819 net Measurement Conference. ACM, Vouliagmeni, Greece, 169–182. [53] Anant Shah, Romain Fontugne, Emile Aben, Cristel Pelsser, and Randy https://doi.org/10.1145/1452520.1452542 Bush. 2017. Disco: Fast, Good, and Cheap Outage Detection. In Pro- [34] Jon Henley. 2018. Algeria blocks internet to prevent students cheating ceedings of the IEEE International Conference on Traffic Monitoring and during exams. (22 June 2018). https://www.theguardian.com/world/ Analysis. Springer, Dublin, Ireland, 1–9. https://doi.org/10.23919/TMA. 2018/jun/21/algeria-shuts-internet-prevent-cheating-school-exams 2017.8002902 [35] IANA. 2021. IPv4 Address Space Registry. https://www.nro.net/about/ [54] Paul F. Tsuchiya and Tony Eng. 1993. Extending the IP Internet rirs/statistics/. Through Address Reuse. ACM Computer Communication Review [36] IANA. 2021. IPv6 RIR Allocation Data. https://www.iana.org/numbers/ 23, 1 (Jan. 1993), 16–33. http://www.cs.cornell.edu/People/francis/ allocations/. tsuchiya93extending.pdf [37] IPNSIG. 2020. InterPlanetary Networking Special Interest Group. http: [55] Patrick Tucker. 2019. Russia Plans to Cut Off Some Internet Access //ipnsig.org/. Next Week. website https://www.defenseone.com/technology/ [38] Ethan Katz-Bassett, Colin Scott, David R. Choffnes, Ítalo Cunha, Vytau- 2019/12/russia-plans-cut-some-internet-access-next-week/162028/. tas Valancius, Nick Feamster, Harsha V. Madhyastha, Tom Anderson, https://www.defenseone.com/technology/2019/12/russia-plans-cut- and Arvind Krishnamurthy. 2012. LIFEGUARD: Practical Repair of some-internet-access-next-week/162028/ Persistent Route Failures. In Proceedings of the ACM SIGCOMM Con- [56] European Union. 2021. Regulation (EU) 2016/679 of the European ference. ACM, Helsinki, Finland, 395–406. https://doi.org/10.1145/ Parliament and of the Council of 27 April 2016 on the protection of 2377677.2377756 natural persons with regard to the processing of personal data and [39] DataCenter Knowledge. 2009. Peering Disputes Migrate to on the free movement of such data. https://eur-lex.europa.eu/eli/reg/ IPv6. https://www.datacenterknowledge.com/archives/2009/10/22/ 2016/679/oj. peering-disputes-migrate-to-ipv6. [57] USC/LANDER Project. 2014. Internet Outage Measurements. listed [40] Craig Labovitz, Scott Iekel-Johnson, Danny McPherson, Jon Oberheide, on web page https://ant.isi.edu/datasets/outage/. and Farnam Jahanian. 2010. Internet Inter-Domain Traffic. In Proceed- [58] Gerry Wan, Liz Izhikevich, David Adrian, Katsunari Yoshioka, Ralph ings of the ACM SIGCOMM Conference. ACM, New Delhi, , 75–86. Holz, Christian Rossow, and Zakir Durumeric. 2020. On the Origin of https://doi.org/10.1145/1851182.1851194 Scanning: The Impact of Location on Internet-Wide Scans. In Proceed- [41] Leslie Lamport. 1998. The Part-Time Parliament. ACM Transactions on ings of the ACM Internet Measurement Conference. ACM, Pittsburgh, Computer Systems 16, 2 (May 1998), 133–169. https://doi.org/10.1145/ PA, USA, 662–679. https://doi.org/10.1145/3419394.3424214 279227.279229 15 0 100 W 200 0 100 C 200 20000 0 100 J 200 0 15000 100 G 200 0 10000 100 up (truth) E Source Vantage Point 200 up (implied) Last Octet of Target Block 0 down (truth) # BGP Update Msgs 100 down (implied) N 5000 200 Oct-23 06:00 12:00 18:00 Oct-24 06:00 [2017]12:00

0 Figure 10: A block (80.245.176.0/24) showing a 3-hour 20:00 21:00 22:00 23:00 Oct-24 01:00 02:00 peninsula accessible only from VP W (top bar) and not from the other five VPs. Dataset: A30. Figure 9: BGP update messages sent for affected Polish blocks starting 2017-10-23t20:00Z. Data source: Route- Views. 100 200 W

100 C [59] Samuel Woodhams and Simon Migliano. 2021. The Global Cost of Inter- 200 net Shutdowns in 2020. https://www.top10vpn.com/cost-of-internet- 100 J shutdowns/. 200 100

200 G

A VALIDATION OF THE POLISH 100 up (truth) E 200 Source Vantage Point PENINSULA up (implied) Last Octet of Target Block 100 down (truth) down (implied) In subsubsection 2.3.3 we reported a peninsula covering 5 200 N polish ASes that lasted for three hours. Figure 3 shows the AS 18:00 20:00 22:00 Jun-04 02:00 04:00 [2017] level map and the relationship between ASes at the time of the peninsula. Multimedia Polska (AS21021) is the provider Figure 11: A block showing a 1-hour island for this of the other 4 ISPs. Similarly, Multimedia Polska has two block and VP E, while other five VPs cannot reach it. main providers, Cogent (AS174) and Tata (AS6453). Before the peninsula, we find that most traffic to Multimedia Polska from our probers is routed through Cogent. traceroutes to the affected blocks. Of the 94 VPs, 21 VPs (22%) Using Taitao we detect that the event starts on 2017-10- have their last responsive traceroute hop in the same AS as 23t20:00Z. About the same time, we observe a high number the target address, and 68 probes (73%) stopped before reach- (20,275) of BGP update messages announcing and withdraw- ing that AS. Table 11 shows traceroute data from a single ing routes to the affected blocks (see Figure 9). These updates CAIDA Ark VP before and during the peninsula described correspond to Tata announcing Multimedia Polska’s prefixes. in subsubsection 2.3.3 and Figure 4. This data confirms the Probably either due to Multimedia Polska changing its peer- block was reachable from some locations and not others. ing relationships, and making Tata more prominent over During the event, this trace breaks at the last hop within the Cogent, or a failure in the Multimedia Polska - Cogent link. source AS. Initially, traffic routed through Cogent is not successful at shifting to Tata, except Level3, which keeps Multimedia Polska reachable from prober W during the whole time. The B VALIDATION OF THE SAMPLE ISLAND event ends after 3 hours, where we see a peak of 23,487 BGP In subsubsection 2.3.2 we reported an island affecting a /24 messages updating routes to the affected prefixes. (see Fig- block where VP E lives. During the time of the event, E ure 9). was able to successfully probe addresses within the same In Figure 10 we provide data from our 6 external VPs, block, however, unable to reach external addresses. This where W is uniquely capable of reaching the target block, event started at 2017-06-03t23:06Z, and can be observed thus living in the same peninsula. in Figure 7. We further verify this event by looking at traceroutes. Furthermore, no other VP was able to reach the affected During the event we see 94 unique Ark VPs attempted 345 block for the time of the island as shown in Figure 11. 16 src block dst block time traces q, 148.245.170.161, 189.209.17.197, 189.209.17.197, 38.104.245.9, 154.24.19.41, 154.54.47.33, 154.54.28.69, 154.54.7.157, 154.54.40.105, 154.54.40.61, 154.54.43.17, c85eb700 50f5b000 1508630032 154.54.44.161, 154.54.77.245, 154.54.38.206, 154.54.60.254, 154.54.59.38, 149.6.71.162, 89.228.6.33, 89.228.2.32, 176.221.98.194 c85eb700 50f5b000 1508802877 q, 148.245.170.161, 200.38.245.45, 148.240.221.29 Table 11: Traces from the same Ark VPs (mty-mx) to the same destination before and during the event block

C ADDITIONAL RESULTS C.3 Additional Confirmation of Size We use Trinocular measurements for 2017q4 because this In subsection 5.3 we discussed the size of peninsulas mea- time period had six active VPs, allowing us to make strong sured as a fraction of the affected routable prefix. In the latter statements about how multiple perspectives help. section, we use 2017q4 data. Here we use 2020q3 to confirm In this section, we verify our results using additional our results. datasets, and find quantitatively similar results. Figure 13b shows the peninsulas per prefix fraction, and Figure 13c. Similarly, we find that while small prefix fraction C.1 Additional Confirmation of the peninsulas are more in numbers, most of the peninsula time is spent in peninsulas covering the whole prefix. This result is Number of Peninsulas consistent with long lived peninsulas being caused by policy Similarly, as in subsection 5.1, we quantify how big the prob- choices. lem of peninsulas is, this time using Trinocular 2018q4 data. In Figure 12 we confirm, that with more VPs more penin- sulas are discovered, providing a better view of the Internet’s overall state. Outages (left) converge after 3 sites, as shown by the fit- ted curve and decreasing variance. Peninsulas and all-up converge more slowly. At six VPs, here we find and even higher difference be- tween all down and disagreements. Confirming that penin- sulas are a more pervasive problem than outages.

C.2 Additional Confirmation of Peninsula Duration In subsection 5.2 we characterize peninsula duration for 2017q4, to determine peninsula root causes. To confirm our results, we repeat the analysis, but with 2020q3 data. As Figure 13a shows, similarly, as in our 2017q4 results, we see that there are many very brief peninsulas (from 20 to 60 minutes). These results suggest that while the Internet is robust, there are many small connectivity glitches. Events shorter than two rounds (22 minutes), may repre- sent BGP transients or failures due to random packet loss. The number of multi-day peninsulas is small, However, these represent about 90% of all peninsula-time. Events last- ing a day are long-enough that can be debugged by human network operators, and events lasting longer than a week are long-enough that they may represent policy disputes. To- gether, these long-lived events suggest that there is benefit to identifying non-transient peninsulas and addressing the underlying routing problem. 17 All Down Disagreement All Up 0.010 0.010 1.000 0.000

0.999

0.008 0.008 0.998 0.002

0.997

0.006 0.006 0.996 0.004

0.995

0.004 0.004 0.994 0.006 Complement

0.993 Fraction of Block-Time Fraction of Block-Time Fraction of Block-Time

0.002 0.002 0.992 0.008

0.991

0.000 0.000 0.990 0.010 2 3 4 5 6 2 3 4 5 6 2 3 4 5 6 Number of Reporting Sites Number of Reporting Sites Number of Reporting Sites

Figure 12: Distribution of block-time fraction over sites reporting all down (left), disagreement (center), and all up (right), for events longer than five hour. Dataset A34, 2018q4.

25k 0.1 1 Events

Fraction 0.0

0 Duration 1.0 1.0 2 0.8 10 0.9 104 0.9 0.8 0.8 Peninsulas (> 5hrs.) 0.7 10 3 0.6 0.7 3 10 0.6 Peninsulas (all) 0.6 CDF 0.5 0.4 0.5 10 4 102 0.4 0.4 Events 0.3

0.3 Prefix Fraction

0.2 Duration (> 5hrs.) Prefix Fraction 5 10 Duration Fraction 0.2 101 0.2 0.1 0.1 0 0.0 10 6 1 min 10 min 1 hr 6 hrs 1 day 10 days 90 days 0.0 100 0 100k 9 11 13 15 17 19 21 23 0.0 0.2 9 11 13 15 17 19 21 23 Peninsula Duration Events Prefix Length Duration Fraction Prefix Length (a) Cumulative events (solid) and duration (b) Number of Peninsulas (c) Duration fraction (dashed)

Figure 13: Peninsulas measured with per-site down events longer than 5 hours during 2020q3. Dataset A41.

18