Skype Relay Calls: Measurements and Experiments
Total Page:16
File Type:pdf, Size:1020Kb
Skype Relay Calls: Measurements and Experiments Wookyun Kho, Salman Abdul Baset, Henning Schulzrinne Department of Computer Science Columbia University [email protected], {salman,hgs}@cs.columbia.edu Abstract—Skype, a popular peer-to-peer VoIP applica- Our work makes two contributions. First, we have tion, works around NAT and firewall issues by routing determined that although Skype exhibits geographical calls through the machine of another Skype user with locality in relay selection, the median one way call unrestricted connectivity to the Internet. We describe an latency is not insignificant, suggesting that further tuning experimental study of Skype video and voice relay calls of Skype relay selection mechanism could be beneficial. conducted over a three-month period, where approxi- mately 18 thousand successful calls were made and over Second, we discuss factors impacting the success rate of nine thousand Skype relay nodes were found. We have Skype calls. This result can be used for improved future determined that the relay call success rate depends on protocol design. the network conditions, the presence of a host cache, The remainder of the paper is organized as follows. and caching of the callee’s reachable address. From our Section II describes the experimental setup and Sec- experiments, we have found that Skype relay selection tion III-A discusses factors affecting success rate of relay mechanism can be further improved, especially when both calls. In Section III-B, we discuss the geographical, ISP, caller and callee are behind a NAT. and autonomous system distribution of relay nodes and Index Terms—Skype, relay. relay calls and online presence of relay nodes. Finally, we conclude and discuss future work in Section IV. I. INTRODUCTION II. EXPERIMENTAL SETUP Skype is a popular peer-to-peer Internet telephony application developed by the founders of the well-known We give a brief description of the Skype software and file sharing application KaZaA [1]. One of the mecha- network architecture and then describe the experimental nisms Skype uses to address network connectivity issues setup. For a detailed description of Skype architecture, in the presence of NATs and firewalls is by relaying calls we refer the reader to [4]. through machines of other Skype users with unrestricted Skype software, thereafter referred to as Skype client connectivity. This mechanism, however, is relatively little (SC), maintains a list of other Skype peers called a host- known. The Skype users allow the relaying of calls from cache (HC) [4]. This list is empty when a SC is run for their machines by agreeing to the Skype end user license the first time after installation, is built during the lifetime agreement (EULA) [2]. The Skype guide for network of a Skype client, and survives the SC termination. administrators suggests that relaying of voice and video During subsequent runs, a SC can contact nodes in the calls can consume network bandwidth up to 4 KB/s and host cache (HC). Additionally, a SC has a built-in list of 10 KB/s for voice and video calls, respectively [3]. Skype about seven default Skype nodes, called bootstrap nodes, does not provide users with any mechanism to disable which it uses to join the Skype overlay for the first time the forced use of this network bandwidth. after installation. In this paper, we attempt to gain insights into the Upon a successful join, a SC establishes a TCP Skype relay selection algorithm by determining the char- connection with the machine of another Skype user or acteristics of relay calls and machines relaying Skype node, referred to as a super node (SN) in this paper. calls in a black-box manner. During a four month period, There are two types of nodes in the Skype overlay; super we made over 38 thousand Skype calls under three differ- node (SN) and ordinary node. Both SN and ordinary ent network setups. The experimental setup forced Skype node run the same Skype software. SNs are responsible to use a relay. Our results show that 18 thousand calls for detecting online SCs, and transmitting signaling were successful and were routed through 9,584 unique messages between SCs [3]. To establish a call, a SC relays. 89.3% of these calls were routed through a relay searches for the callee machine and upon successful in the US. The call success rate depends on the network contact, either directly exchanges media traffic with the setup. For 17.5% of successful Skype relay calls, Skype callee machine or through another Skype node. We refer used a different relay from caller to callee and from a to the node relaying a Skype call as a relay node (RN) callee to caller. and the call being routed as a relay call (RC). A RN is Fig. 1. Unrestricted Fig. 2. NAT: caller and callee behind Fig. 3. Direct-blocked: packets between address and port dependent NAT. caller and callee are dropped. a SN that has sufficient bandwidth to relay a voice or a We have determined the Skype success rate for the video call. above three network setups. We also determined how A key aspect of Skype’s robust connectivity is its the presence of the host cache (a list of online Skype ability to traverse NAT and firewalls by routing a video nodes) affects Skype relay success call rate. Throughout or a voice call through one or more Skype relays. the experiment, we collect IP address and port number To study Skype’s robust connectivity, we devised an of Skype relay nodes and track their online status by experiment in which a call was established between sending a specially crafted Skype message to them. two Skype clients running on machines in our lab at We have also determined the geographical, ISP, and Columbia University. The RTT between two machines autonomous-system (AS) distribution of RNs and RCs was less than one ms. The network conditions between using MaxMind [8] and a AS number lookup utility [9] the caller and callee machines were configured so that and calculated the RTT for RNs. We use this data to they were forced to use a RN. Specifically, the ex- gain insights into the efficacy of Skype relay selection periment was performed under three different network algorithm. setups: unrestricted connectivity (Figure 1), caller and The closest study to ours is an experimental study of callee behind different address and port dependent NAT1 Skype SNs conducted by Guha et al. in 2005 [10]. Guha (Figure 2), and direct-blocked setup, in which caller and monitored the HC of a SC and tracked the population callee, having otherwise unrestricted connectivity, could and presence of SNs. It was not certain if these SNs not send packets to each other (Figure 3). The setups were also acting as relay nodes. We, however, discovered are referred to as unrestricted, NAT, and direct-blocked the RNs by establishing and monitoring calls that were in the paper. For the NAT and direct-blocked setup, the relayed through these Skype nodes. Further, we study the experiment was performed with and without deleting the factors impacting Skype RCs success rate, the presence host cache. The experiments were performed from March information of RNs and their geographical distribution. 27 to July 3, 2007. We used Skype v3.2 for Windows XP in our experiments. III. RESULTS AND DISCUSSION We wrote a script using AutoIt [5] that uses the Skype In this section, we discuss the results of our experi- API [6] to automate the establishment of Skype calls, ments: factors impacting sucess rate of Skype RCs and checking of call status, and gathering and collection of characterization of Skype RCs and RNs. relay data at caller and callee. The duration of a call A. FACTORS IMPACTING SKYPE RELAY CALLS was configured to be one minute and during this time, the script gathers the number of packets sent by the We established over 38,000 calls over a three month caller and callee machines to unique IP address and port period for the three different network setups described in number pairs using WinDump [7]. With a packetization Section II. For 37,761 calls, the network was configured interval of 30 ms (as used by Skype), the caller machine such that Skype was forced to select a relay. Out of should send approximately 1,800 packets to the RN and these 37,761 calls, approximately 18,000 were successful vice versa. After terminating each call, we parse the and the success percentage depended on the network WinDump data and label the IP address and port number setup. Seventeen percent (3,146) of successful RCs used that most frequently appeared in the WinDump data as a different relay from caller to callee and from callee to a relay node, i.e., the IP address and port number with caller. There was almost no difference between the call which a caller or callee exchanged the maximum number success rates of video and voice calls. Table I shows the of packets. detailed call statistics for the three network setups. We pose the following two questions: first, how does network connectivity affects Skype RC success rate and 1Consider an internal IP address and port (X:x) that initiates a connection to an external (Y1:y1) tuple. Let the mapping allocated second, whether retention of HC entries from previous by the NAT for this connection be (X1’:x1’). Shortly thereafter, runs impacts the call establishment. Observe from Table I the endpoint initiates a connection from the same (X:x) to an that the call success rate for the direct-blocked setup that external address (Y2:y2) and gets the mapping (X2’:x2’) on the NAT.