Connectivity in the LAC Region in 2020
Author: Agustín Formoso
Coordination/Revision: Guillermo Cicileo
Edition and Design: Maria Gayo, Carolina Badano, Martín Mañana
Project: Strengthening Regional Internet Infrastructure
Department: Internet Infrastructure R&D

We often talk about connectivity, but what is it that we are actually talking about? Since the moment networks were first connected to each other, operators have been working to improve connectivity between them. In this article, we present a connectivity study explaining how we measure connectivity in the Latin America and Caribbean region and how it has evolved in recent years.

Introduction

The term connectivity is often used in the Internet industry, yet its meaning may vary depending on the context. Connectivity can be measured based on bandwidth capacity, number of hops, or, as in this article, latency. In this sense, when we say that two locations are well connected, we mean that the latency between them is low, i.e., the time it takes for a message to travel from source to destination is short.

At LACNIC we wish to understand in greater detail the characteristics of network interconnection in Latin America and the Caribbean so that operators will have access to information they can leverage when designing their growth strategies. With this in mind, it is very useful to perform connectivity measurements that cover the whole region, including the entire Caribbean and not just the LACNIC service region.

Connectivity measurements are typically performed between one origin and one destination, or between a few origins and a few destinations. Measurements are generally initiated from nodes in our own infrastructure, or towards our own infrastructure. However, in order to obtain connectivity measurements that cover the entire region, it is necessary to initiate measurements from third-party networks, a challenge that requires the collaboration of multiple actors. Below is a list of platforms that allow measurements to be initiated from third-party networks, along with a brief explanation of their characteristics.

- RIPE Atlas is a well-known example that encourages users who wish to collaborate to host a probe (hardware or software) and keep it connected to the Internet. This allows other users of the platform to use these probes to initiate measurements. While the number of RIPE Atlas probes in LAC has been increasing over time, their presence is still insufficient to conduct studies covering the entire region.
- CAIDA Ark (Archipelago) is a platform similar to RIPE Atlas whose probes are based on second-generation Raspberry Pi devices. Unfortunately, there are only 11 CAIDA Ark probes in 10 autonomous systems in the LAC region.
- M-Lab (Measurement Lab) is a platform that offers a series of tools for measuring various network parameters. Unfortunately, these tools do not focus on latency measurements between third-party networks, but on bandwidth measurements using the Network Diagnostic Tool (NDT) between the client and M-Lab servers (few of which are located in the LAC region).
- Speedchecker, on the other hand, is a platform with similar functionality that offers better coverage in the LAC region. Unlike RIPE Atlas, this platform uses only software probes, which makes it less stable than the others. However, given that we had used this platform in prior studies, we decided to use it once again so that we could compare results with greater confidence.
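The latency discussed above is what an ICMP echo (ping) round trip measures. As a minimal sketch of the idea, not the platforms' actual tooling, the following Python snippet calls the system ping utility and reports the average round-trip time to a destination; the host name is only a placeholder, and the output parsing assumes a typical Linux iputils-style ping summary line.

```python
import re
import subprocess
from typing import Optional

def ping_rtt_ms(host: str, count: int = 10) -> Optional[float]:
    """Send `count` ICMP echo requests and return the average RTT in ms.

    Assumes a Linux-style `ping` whose summary line looks like:
    rtt min/avg/max/mdev = 12.3/15.6/20.1/2.2 ms
    """
    result = subprocess.run(
        ["ping", "-c", str(count), host],
        capture_output=True, text=True,
    )
    match = re.search(r"= [\d.]+/([\d.]+)/", result.stdout)
    return float(match.group(1)) if match else None

# Placeholder destination: a low RTT means the two locations are
# "well connected" in the sense used in this article.
print(ping_rtt_ms("example.net"))
```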
Methodology and Prior Studies

LACNIC has conducted similar studies in previous years [click here to read the studies conducted in 2016 and 2017 (in Spanish)]. Generally speaking, these studies provide an overview of the region each year and show significant improvements in the measurements year after year. The measurements were performed using the following methodology:

1. In every country of the LAC region, measurements were scheduled to run every hour.
2. These measurements were ICMP pings, with 10 packets sent for each measurement.
3. The packets' destination was a random IP address selected from a pool of known IP addresses.
   a. This pool of IP addresses comprised all the Speedtest servers in the region. This provided a large number of vantage points located in multiple networks across the region, with reasonably good uptime (it was highly probable that the IP address would be reachable at the time of the measurement).
4. The results were collected and later processed.
   a. Geolocation information was obtained from MaxMind.
   b. Routing information (autonomous system, announced prefix) was obtained from the RIPE Routing Information Service (RIPE RIS).

Results of the 2020 Study

For this edition of the study, we ran a measurement campaign from early September to early November. During this period, 13,000 measurements were initiated from 26 countries, with destinations in 25 countries. These 13,000 measurements originated in 332 different autonomous systems.

Results

Once the data was obtained, the measurements were grouped into three categories:

1. Country of origin: This category contained all the measurements obtained with probes located in the country, excluding those pointing to the same country. In other words, these were only outgoing measurements.
2. Destination country: This category contained all the measurements obtained with vantage points (servers) located in the country, excluding those originating in the same country. In other words, these were only incoming measurements.
3. Both country of origin and destination: This category contained all the measurements for which the country of origin was the same as the destination country, i.e., measurements internal to the country, performed with probes and vantage points located in the same country.

The results can be seen in the three charts below, which show the median of the results obtained for each country.

[Charts: Measurements originating in the country; Measurements with their destination in the country; Measurements with their origin and destination in the country]

A first glance shows that both the outgoing latency (1) and the incoming latency (2) are between 150 and 200 milliseconds. The charts also show that, naturally, the internal latency (3) is lower than the outgoing and incoming latencies. Another aspect evidenced by this campaign is that the number of countries with active probes is much higher than the number of countries with test servers (37 and 26, respectively). This difference is the reason why the first chart has more entries than charts 2 and 3. In addition, not all <country_of_origin, destination_country> combinations have been covered by this measurement campaign, at least not at this time.
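As an illustration of the grouping step described above, the following Python sketch classifies a set of measurement records into the three categories (outgoing, incoming, internal) and computes the per-country median latency. The record fields and sample values are hypothetical and merely stand in for the actual campaign data.

```python
from collections import defaultdict
from statistics import median

# Hypothetical measurement records: (origin country, destination country, RTT in ms).
measurements = [
    ("UY", "AR", 48.0),
    ("UY", "UY", 12.5),
    ("BR", "CL", 95.2),
    ("CL", "BR", 101.7),
    ("BR", "BR", 20.1),
]

outgoing = defaultdict(list)   # category 1: probes in the country, destination elsewhere
incoming = defaultdict(list)   # category 2: servers in the country, origin elsewhere
internal = defaultdict(list)   # category 3: origin and destination in the same country

for origin, destination, rtt in measurements:
    if origin == destination:
        internal[origin].append(rtt)
    else:
        outgoing[origin].append(rtt)
        incoming[destination].append(rtt)

# Median latency per country, as plotted in the three charts.
for name, group in [("outgoing", outgoing), ("incoming", incoming), ("internal", internal)]:
    summary = {country: round(median(rtts), 1) for country, rtts in sorted(group.items())}
    print(name, summary)
```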
Some countries show poor latencies in the previous charts; this is worth noting, as it is mainly the result of a bias in the measurement methodology. As mentioned earlier, the destination of the measurements is a pool of servers in the LAC region. Because these servers are not equally distributed across the region (larger countries tend to have more servers), when a new measurement is scheduled there is a higher probability that it will be scheduled towards these larger countries. In fact, up to the moment of writing this report, the countries that appear in figure 1 with the highest latencies — Cuba (CU), Turks and Caicos Islands (TC), French Guiana (GF), Suriname (SR), Guyana (GY), and Venezuela (VE) — have results exclusively towards those countries. As time goes by and more measurements are scheduled towards more countries, this bias will decrease and the measurements will better reflect the reality of regional connectivity.

The information in the charts above can also be represented on a map:

[Maps: latency of measurements originating in the country; latency of measurements with their destination in the country; latency within the country]

2020 versus 2017

We already mentioned that we conducted a similar study in 2017. So, how do the results obtained in 2020 compare with those obtained in 2017? Just as with the results presented in the previous section, it is possible to group and compare the measurements.

[Chart: Measurements originating in the country, 2017 campaign]

This chart represents the measurements that originated in each country. The 2017 measurement campaign shows that latencies in certain countries were much higher than those measured in 2020, particularly in Uruguay (UY), Paraguay (PY), Chile (CL) and Bolivia (BO), as can be seen in the chart. We were able to verify that this was not due to a small number of samples, as these countries had between 100 and 1,000 samples each. The only country with a measurement bias is Cuba (CU), which had only one sample during 2017.

Nevertheless, a more detailed analysis of the data shows that most countries had worse latency values towards the region in 2020 than in 2017. Of the 37 countries included in the chart, 28 (75%) had higher values. But what is the reason for this increase? To answer this question, we had to perform a more in-depth analysis of the data. We asked ourselves the following question: is it possible that connectivity is now worse in these countries? We analyzed this possibility and proposed the following hypothesis: could it be that the measurements we performed in 2020 covered longer distances than those covered in 2017? This is a possibility, particularly considering that:

1. The vantage points of 2020 were not the same as those of 2017: Speedtest nodes are enabled and disabled daily.
2. Every measurement we schedule randomly selects a vantage point, so there is a strong random element at the beginning of the experiment, when we do not yet have a statistically significant number of samples. Could it be that the points selected at the start of this experiment were farther away?

To answer this question, we used a geolocation database to determine the country of origin and the destination city of each measurement.
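To illustrate how such a distance check could be carried out, here is a rough sketch, not the study's actual code, that geolocates the two endpoints of a measurement with a MaxMind city-level database (the database file name and the IP addresses are placeholders) and computes the great-circle (haversine) distance between them. Comparing the distribution of these distances for the 2017 and 2020 campaigns would show whether the 2020 measurements covered longer paths.

```python
import math
import maxminddb

# Placeholder path: any MaxMind city-level database (e.g. GeoLite2-City) would do.
reader = maxminddb.open_database("GeoLite2-City.mmdb")

def coordinates(ip: str):
    """Return (latitude, longitude) for an IP address, or None if not found."""
    record = reader.get(ip)
    if not record or "location" not in record:
        return None
    loc = record["location"]
    return loc["latitude"], loc["longitude"]

def haversine_km(a, b):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * 6371 * math.asin(math.sqrt(h))

def measurement_distance_km(src_ip: str, dst_ip: str):
    """Distance covered by a measurement, if both endpoints can be geolocated."""
    src, dst = coordinates(src_ip), coordinates(dst_ip)
    return haversine_km(src, dst) if src and dst else None

# Hypothetical endpoints standing in for a probe and a Speedtest server.
print(measurement_distance_km("203.0.113.10", "198.51.100.20"))
```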