Relay Path Selection for Peer-to-peer Voice-over-IP Systems

A thesis submitted in fulfillment of the requirements for the degree of Doctor of Philosophy

Quang Duc Bui MSc., B.Eng.

Electrical and Computer Engineering
College of Science, Engineering and Health
RMIT University
August 2010

© Copyright by Quang D. Bui 2010
All Rights Reserved

To my wife, Duong, and my family.

ABSTRACT

Selecting one or multiple peers to relay voice calls is a critical component of large-scale peer-to-peer (P2P) voice-over-IP (VoIP) systems, such as Skype. The challenge is to determine good relay paths for a given voice call in a practical manner. We study different VoIP relay path selection schemes, which are usually instantiated during call session initiation and refreshed periodically. This not only allows end hosts behind Network Address Translators (NATs) or firewalls to establish voice calls but also allows communication to be switched quickly to alternate relay paths in periods of poor performance. We propose an enhanced version of existing algorithms for selecting VoIP relay paths and demonstrate that it improves voice quality. Then, we provide a series of simulations and analytical discussions on how relay path performance corresponds to overlay network scenarios. We show that there are more opportunities for P2P VoIP systems to obtain good relay paths when selecting relay nodes located in highly connected Transit domains. For better relay path performance, we recommend selecting relay nodes that are fewer than four hops from the source. We also show that, in general, increasing relay node density can reduce relay path length but generates more hop overlaps with the default path, owing to the breadth-first search used in existing algorithms. Finally, we fully address the problem of selecting VoIP relay paths by taking a new approach, including the development of an overall formulation, a new network model to effectively identify the optimal solution space, and a new heuristic algorithm for selecting VoIP relay paths.

ACKNOWLEDGEMENTS

I acknowledge with great pleasure the contributions of my supervisor, Professor Andrew Jennings, who accepted me to study with his team a few years ago. Without him this work would not have been possible. At the beginning of my study, I could not really imagine what kind of experience I was entering into. Professor Jennings has provided guidance, knowledge, advice, and direction whenever it was needed. He has always been very patient and has encouraged me whenever I met obstacles. I very much appreciated Professor Jennings's pragmatic approach to problem solving. I would also like to thank him for improving my writing skills and for supporting my participation in conferences. Besides that, a large part of the ideas presented in this thesis belong to Professor Jennings.

The simulations in my thesis were based on the simulation platform that I studied while working on the Traffic Engineering for Quality of Service in the Internet, at Large Scale (TEQUILA) project as a part of my Master's program at the University of Surrey, UK. Under this project, I received guidance from Professor George Pavlou and Doctor Panos Trimintzios. I learnt from them how to do research in the field of Internet Traffic Engineering, as well as how to develop programs on TEQUILA's simulation platform.

I would like to thank Doctor Himanshu Agrawal of the School of Electrical and Computer Engineering, RMIT University, Australia, who was studying with me under the supervision of Professor Jennings. He has always been willing to spend his valuable time patiently discussing problems with me, and has shared his experience and expertise without any reservation.

I would also like to thank the researchers working on the iPLANE project in the Department of Computer Science, University of Washington, USA. While I was trying to simulate the real Internet configuration, they were very helpful in instructing me how to use their inter-domain path latency database. All of my questions to them were answered usefully and without delay.

Finally, I would like to thank all of my family for their support and encouragement during my study. I would not have done this without your support.

TABLE OF CONTENTS

ABSTRACT
ACKNOWLEDGEMENTS
LIST OF PAPERS
LIST OF TABLES
LIST OF FIGURES

CHAPTER 1 INTRODUCTION
  1.1 Problem Statement and Research Objectives
  1.2 Limitations of the Dissertation
  1.3 Contributions
  1.4 Dissertation Organisation

CHAPTER 2 BACKGROUND
  2.1 Unstructured P2P Networks
    2.1.1 Non-hierarchical P2P Networks
      2.1.1.1 k-Walker Random Walk
      2.1.1.2 Directed BFS and Routing Indices Based Search
      2.1.1.3 Local Indices Based Search
    2.1.2 Hierarchical Unstructured P2P Networks
    2.1.3 Summary
  2.2 Structured P2P Networks
    2.2.1 Non-hierarchical Structured P2P Networks
      2.2.1.1 Chord
      2.2.1.2 Pastry
    2.2.2 Hierarchical Structured P2P Networks
    2.2.3 Summary
  2.3 Locality-aware Peer-to-Peer Algorithms
    2.3.1 Network Proximity in Distributed Hash Tables
      2.3.1.1 Geographic Layout
      2.3.1.2 Proximity Routing
      2.3.1.3 Proximity Neighbour Selection
    2.3.2 eQuus: a Locality-aware P2P System
    2.3.3 Summary
  2.4 Peer-to-Peer VoIP Systems
    2.4.1 VoIP Performance Parameters
    2.4.2 Skype
      2.4.2.1 Skype System Overview
      2.4.2.2 Skype Functions
      2.4.2.3 Skype Related Research
    2.4.3 P2P SIP Telephony
    2.4.4 Summary
  2.5 Relay Path Selection
    2.5.1 Resilient Overlay Network
    2.5.2 Detour
    2.5.3 PDF
    2.5.4 AS-Aware Peer-relay Protocol
    2.5.5 Earliest Divergence Rule
    2.5.6 Summary

CHAPTER 3 AS-LEVEL TOPOLOGY AND SIMULATION DESIGN
  3.1 Introduction
  3.2 Related Work
  3.3 AS-level Topology
  3.4 The Design Requirements of Simulation and Analysis Tools
  3.5 Scale-Free, Hierarchical and ToR Networks
    3.5.1 Scale-Free Model and Network Hierarchy
    3.5.2 ToR Graphs
  3.6 AS Graph Generator - A Simulation Package
  3.7 The Real Internet Configuration
  3.8 Summary

CHAPTER 4 MAXIMAL OVERLAP CLOSE RELAY
  4.1 Introduction
  4.2 Related Work
  4.3 Minimal Overlap Close Relay Algorithm
  4.4 Comparison between Relay Path Selection Schemes
    4.4.1 Simulations and Analyses of Alternate Relay Path Selection Schemes
    4.4.2 Overlap in First Hop Relay Statistic
  4.5 Experiment with Different Relay Node Allocation
  4.6 Experiments and Analysis Using the Real Network Configuration
    4.6.1 The Optimal Hop-count Distance for Relay Nodes
    4.6.2 The Relay Node Population
  4.7 Summary

CHAPTER 5 K-VIRTUAL SHORTEST RELAY PATH SELECTION
  5.1 Introduction
  5.2 Related Work
  5.3 Problem Statement, Modelling and Formulation
    5.3.1 Problem Statement
    5.3.2 Mathematical Formulation
    5.3.3 Network Model and Optimal Solution Space
  5.4 K Virtual Shortest Paths Algorithm
    5.4.1 The Algorithm
    5.4.2 The Costs of Using k-VSP Algorithm
  5.5 K-VSP Performance Evaluation
    5.5.1 Simulation Settings
    5.5.2 Performance Analysis
    5.5.3 Comparison between Two Versions of the k-VSP Algorithm
  5.6 Summary

CHAPTER 6 CONCLUSION AND FUTURE WORK
  6.1 Summary of Contributions
  6.2 Future Work

BIBLIOGRAPHY

LIST OF PAPERS

• Q. D. Bui and A. Jennings, "Relay path selection approaches in peer-to-peer VoIP systems," in Proc. of Australasian Telecommunication Networks and Application Conf. (ATNAC’08), 2008, pp. 361-366.
• Q. D. Bui and A. Jennings, "Relay node selection in large-scale VoIP overlay networks," in Proc. of 1st Intl. Conf. on Ubiquitous and Future Networks (ICUFN’09), 2009.
• H. Agrawal, A. Jennings, and Q. D. Bui, "A hybrid approach for Robust Traffic Engineering," in 4th Intl. Symp. on Wireless Pervasive Computing (ISWPC’09), 2009, pp. 1-5.

LIST OF TABLES

Table 2.1. Summary of different features in P2P applications [35]
Table 2.2. The Mean Opinion Score for measuring call quality
Table 3.1. Statistics of inferred AS relationships
Table 3.2. AS node name notations
Table 4.1. The main operations of algorithms
Table 5.1. Summary of Performance Trends of Relay Paths

LIST OF FIGURES

Figure 2.1. Centralised and decentralised (with or without hierarchical) P2P systems
Figure 2.2. Classification of P2P networks
Figure 2.3. Example Chord network [35]
Figure 2.4. An example network consisting of 5 cliques. Nodes that are close-by belong to the same clique, share the same ID, and are responsible for the same set of data items [54]
Figure 2.5. Skype network. There are three main entities: supernodes, ordinary hosts, and the login server [58]
Figure 2.6. Difference between SIP-using-P2P and P2P-over-SIP architectures [35]
Figure 2.7. Architecture of the Detour virtual Internet [77]
Figure 2.8. ASAP system structure [56]
Figure 2.9. Illustration of Earliest Divergence Rule [9]
Figure 3.1. Graph skeleton to make sure of node reachability
Figure 3.2. Adding additional links by following power-law
Figure 3.3. AS degree distribution of the real Internet and exampled simulated SGB graphs: ‘RouteViews’ - the real Internet degree distribution [56] vs. ‘SGB sim 0-10’ - randomly simulated graphs
Figure 3.4. CDF of low degree nodes in simulated graphs and the real Internet graph
Figure 3.5. CDF of the two data sets of latency
Figure 4.1. A comparison between the calculated delay and measured delay of relay paths
Figure 4.2. MOCR pseudo code
Figure 4.3. Relay path selection in different schemes
Figure 4.4. Hop overlap comparison between EDR, ASAP and MOCR (with 95% confidence intervals)
Figure 4.5. Delay comparison between EDR, ASAP and MOCR (with 95% confidence intervals)
Figure 4.6. Scatter plot of overlap in the first hop relay vs. the whole paths
Figure 4.7. Overlap distribution of MOCR paths achieved in different relay node allocations
Figure 4.8. Delay distribution of MOCR paths achieved in different relay node allocations
Figure 4.9. Average delay vs. Hop count constraints (with 95% confidence intervals)
Figure 4.10. Average number of hop overlaps vs. Hop count constraints (with 95% confidence intervals)
Figure 4.11. Performance comparison of relay paths generated by different algorithms with and without 3 hop-count constraint
Figure 4.12. Average delay deterioration of 3 hop-count constraint relay paths under different relay pool
Figure 4.13. Average overlap improvement of 3 hop-count constraint relay paths under different relay pool
Figure 4.14. Average delay of paths under different relay node density
Figure 4.15. Average hop overlap of paths under different relay node density (with 95% confidence intervals)
Figure 5.1. The representation of paths as arcs in 2D-Euclidean space; the solution spaces are colored in grey
Figure 5.2. Example of a default path with 4 intersects with the v-SP
Figure 5.3. An example of long default direct path from AS A to AS C; A relays traffic to C via the multi-homed AS B
Figure 5.4. The k-Virtual-Shortest-Path pseudo-code
Figure 5.5. Average one-way delay comparison between relay paths selected by algorithms, the default and the v-SP paths (with 95% confidence intervals)
Figure 5.6. CDF of one-way delay of relay paths selected by algorithms, the default and the v-SP paths
Figure 5.7. Session rank of one-way delays
Figure 5.8. Average overlap comparison between relay paths selected by algorithms (with 95% confidence intervals)
Figure 5.9. Percentage of overlap values compared between algorithms
Figure 5.10. Average overlaps of relay path selected in different default path lengths (with 95% confidence intervals)
Figure 5.11. Average delay of relay path selected in different default path lengths (with 95% confidence intervals)
Figure 5.12. Number of default paths in different default path lengths (with 95% confidence intervals)
Figure 5.13. The relationship between path distance and hop overlap parameters (with 95% confidence intervals)
Figure 5.14. Classifying relay paths within the solution space and the corresponding mean values of overlap (with 95% confidence intervals)
Figure 5.15. Identifying relay paths inside and outside the solution space and the corresponding mean values of path distance (with 95% confidence intervals)
Figure 5.16. CDF of delay comparison between the two k-VSP versions
Figure 5.17. CDF of number of overlaps comparison between the two k-VSP versions

CHAPTER 1

INTRODUCTION

1.1 Problem Statement and Research Objectives

Providing Quality of Service (QoS) for the Internet has become a critical demand due to the evolution of real-time applications, such as voice-over-IP (VoIP), Video on Demand (VOD) and video/audio instant messaging. A large number of architectures, technologies, and mechanisms enabling IP QoS have been created to enhance the conventional IP service model. The most important frameworks and technologies proposed are Integrated Services (IntServ) [1], Resource reSerVation Protocol (RSVP) [2], Differentiated Services (DiffServ) [3], and Multi-Protocol Label Switching (MPLS) [4]. However, delivering end-to-end QoS in the Internet remains a challenging task.

An important reason for the lack of deployment of many of these approaches is that the Internet has already become a social infrastructure comprising a large number of independently operated networks, i.e. autonomous systems (ASs), where peering points connect the separate ASs of the Internet into one cooperating infrastructure. It is difficult to widely deploy new approaches that significantly change the existing architecture of the network, such as those proposed in IntServ or DiffServ. The economics of peering also make the provisioning of end-to-end QoS unlikely. Whereas most peering agreements are bilateral contracts between ASs at peering points, end-to-end QoS is a cooperative effort of all ASs on the end-to-end path of a flow with service guarantees. Although an Internet Service Provider (ISP) may have an interest in providing QoS guarantees within its own AS, it lacks the incentive to support similar service guarantees for customers of remote ASs.

To overcome these end-to-end QoS interoperability issues, peer-to-peer (P2P) overlay networks have been developed as a higher-level mechanism that can support new services to users on top of the network layer without requiring changes to the infrastructure or its business practices. The key to the success of such overlay path switching is the availability of diverse paths. Most participating nodes in P2P systems can function as traffic relaying points for other nodes. This makes it possible to construct a rich set of alternate relay paths which can potentially help to bypass performance degradation. P2P implementation is also cost-effective for small companies and even individuals.

Over the past decade, P2P systems have attracted attention from Internet users as well as the research community. Since Napster and ICQ, the earliest P2P systems, emerging in 1999, more and more popular systems such as KaZaA, BitTorrent and Skype have been realised. Motivated by practical successes in VoIP deployment via P2P overlay networks, it is envisioned that overlay systems can be a promising alternative for end-to-end QoS delivery in IP networks. Decentralised structured peer-to-peer systems inherently offer high scalability, because capacity scales with the user population, as well as robustness and fault tolerance, because there is no centralised server and the network organises itself. The key to this approach is to find, among a rich set of routing paths, an alternative route that can avoid congestion and hence meet the QoS demands.

The objective of this work is to explore new approaches to select alternate relay paths that are suitable for transmitting VoIP traffic. In P2P systems, there is a trade-off between performance and overhead (and hence, latency) in selecting good QoS candidates. This is particularly important for large-scale networks. Large-scale P2P networks are vastly decentralised, and the host population provides an enormous diversity of alternate relay paths. Such a network is, however, a very dynamic system where peers frequently join and leave. Selecting relay paths by a brute-force solution that continuously monitors all possible alternatives is not feasible in this situation. Conversely, randomly selecting a relay can often result in poor options. Existing studies, e.g. [5], [6], [7], [8], [9], [10], suggest that ideal relay paths should exhibit good end-to-end performance characteristics with maximal diversity, i.e. how significantly relay paths deviate from the default path, because disjoint paths are unlikely to experience link degradation or failure at the same time. The first research question which needs to be carefully considered is how to quickly identify suitable QoS VoIP relay paths in a scalable way.

Since P2P systems operate on top of the physical network, their path selection is performed by end hosts running applications according to their QoS requirements, and the underlying IP routing infrastructure is not aware of overlay routing activity. In this sense, P2P routing is also regarded as selfish routing since it does not consider other traffic within the network. The selfish routing optimisation problem has, in theory, been addressed using the non-cooperative multiple optimisation approach from game theory and economics, which was first proposed by J. F. Nash [11]. A crucial property of selfish routing is that it is possible to reach a traffic equilibrium, which is known as the Nash equilibrium. That is, traffic equilibrium is reached in such a way that all routes that are used between a source-destination node pair have equal costs while all unused routes have a higher cost [12]. The main problem with Nash equilibria is that it is difficult in practice to exhibit such an equilibrium [13], because of the complexity of designing such a protocol and the frequency of overhead exchange between entities. Therefore, an interesting research question is how to quantify the optimisation problem of selecting VoIP relay paths given a set of optimisation objectives, and hence whether we can practically obtain near-optimal solutions for the problem in large-scale networks.
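As a minimal illustration of the equal-cost condition described above, the following Python sketch checks whether a given traffic split over the candidate routes of a single source-destination pair satisfies it. The function name, the tolerance parameter and the example values are hypothetical and are not taken from [11], [12] or [13]. Unused routes are accepted when their cost is not lower than the common cost of the used routes, which is an assumption about how ties are treated.

```python
from typing import Sequence


def is_traffic_equilibrium(
    costs: Sequence[float],
    flows: Sequence[float],
    tol: float = 1e-9,
) -> bool:
    """Check the equal-cost condition for one source-destination pair.

    costs[i] is the cost of route i and flows[i] is the traffic it carries.
    Hypothetical helper: a route counts as "used" when its flow exceeds tol.
    """
    if len(costs) != len(flows):
        raise ValueError("costs and flows must have the same length")

    used = [c for c, f in zip(costs, flows) if f > tol]
    unused = [c for c, f in zip(costs, flows) if f <= tol]

    if not used:
        return True  # no traffic at all: the condition holds vacuously

    common = min(used)
    # All used routes must share the same cost (within tolerance).
    if any(abs(c - common) > tol for c in used):
        return False
    # Assumption: an unused route may tie with, but not undercut, the used cost.
    return all(c >= common - tol for c in unused)
```

For example, with route costs of 10, 10 and 12 and a traffic split of 0.6, 0.4 and 0, the check passes; if the second route cost 11 instead, traffic on it would have an incentive to move and the check fails.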

1.2 Limitations of the Dissertation

In this dissertation, we investigate relay path selection algorithms proposed for large-scale P2P VoIP systems with regard to two significant performance metrics: path latency and path disjointness. VoIP service is obviously very sensitive to latency, while path disjointness is considered to improve performance since disjoint backup relay paths are unlikely to experience link degradation or failure at the same time. We later justify our selection of these performance parameters in Section 2.4.1.

Due to the stringent time requirement of voice calls, the task of selecting a suitable relay path to accommodate voice traffic needs to be processed rapidly. It is expected that either the relay candidates are available before making a voice call or they should be selected within the period of call initiation (i.e. several seconds at most). These are the prime conditions when we consider relay path selection algorithms.

P2P networks can be classified based on the control over data location. There are two main categories: unstructured and structured networks. Unstructured P2P systems use two kinds of searching: data lookup and keyword searching. Structured P2P networks usually employ distributed hash tables (DHTs), which typically support data lookup functionality [14]. More discussion about this topic can be found in the later sections.

This work assumes that P2P systems hold efficient search functionalities and protocols which allow peer operation and basic network management. Algorithms studied in this work provide advanced functionalities for localising and selecting relay peers in order to improve the search performance as well as the quality of the resulting VoIP calls. Specifically, we study algorithms which apply topological information about large-scale networks to the task of selecting relay paths. We focus on the AS-level path, as opposed to the physical IP-level path, because this greatly reduces the amount of topology information that needs to be maintained by end systems. It has been argued by T. Fei et al. [8] that this is a more scalable solution for large P2P networks.

1.3 Contributions

Our project has three major contributions. Firstly, we propose an enhanced version of existing algorithms focusing on techniques used to select cross-domain relay paths in large-scale VoIP systems. For scalability and efficiency, our heuristic employs AS-level path information so that the topological information maintained at end hosts can be significantly reduced while exhibiting better path performance. The proposed heuristic utilises the availability of a large number of relay peers in large-scale P2P systems and pre-determines, possibly during call instantiation or periodic update, a pool of close relay candidates to which VoIP traffic can be effectively switched in the event of network failure or degradation. We illustrate that the proposed algorithm can yield better quality paths under the same constraints as existing methods.

Secondly, we provide a series of simulations and analytical discussions on the correspondences of relay path performance to overlay network scenarios. We point out that there are more opportunities for P2P VoIP systems to obtain good relay paths when selecting relay nodes located at highly connected Transit domains. We then quantify the lower threshold for the first hop relay path length. To achieve better relay path performance, it is recommended to select relay nodes whose distances are less than four hops from their sources. We also show that, in general, increasing relay node density can reduce relay path length but generates more hop overlaps with the default path due to the breadth first search of existing algorithms.

Thirdly, we fully address the problem of selecting VoIP relay paths by taking a new approach, including the development of an overall formulation, a new network model, and a new heuristic algorithm for selecting VoIP relay paths. Our formulation and network model can be used to effectively present and identify the optimal solution space for the problem of VoIP relay path selection with regard to path latency and path diversity objectives, which has not been addressed before. Using this scheme of localising the optimal solution space, we develop a new algorithm for selecting alternate relay paths based on latency, hop overlap and the valley-free rule in the AS-level Internet.

1.4 Dissertation Organisation

The remainder of this thesis is organised as follows. In Chapter 2, we motivate our research objective of using P2P to provide QoS for VoIP systems through a discussion of the evolution of P2P systems and the benefit of applying P2P to VoIP networks. In this chapter, we also provide an overview of the key technologies in P2P and a discussion on VoIP quality performance.

In Chapter 3, we discuss some important aspects of simulating AS-level Internet topology. Subsequently, we introduce two AS graph simulations.

Chapter 4 explores methods for identifying suitable VoIP overlay paths in a scalable manner. We discuss and compare different path selection schemes in P2P overlay networks, and introduce a heuristic algorithm for alternate relay path selection. We then look at the choice of relay nodes and its impact on the performance of VoIP overlay paths in the interdomain environment. Using simulation of the real Internet configuration, we quantify the lower threshold for the first hop relay path length in hop-count.

In Chapter 5, we fully address the second research question by modelling and formulating the problem of optimal VoIP relay path selection with regard to path latency and path diversity objectives. We then describe a complete solution for selecting alternate relay paths in VoIP systems based on latency, hop overlaps and the valley-free nature of the current interdomain routing environment.

Finally, Chapter 6 summarises the major results of this dissertation and suggests future work.

CHAPTER 2

BACKGROUND

Peer-to-peer (P2P) networks are overlay networks, where nodes are end hosts in unrelated administrative domains in the Internet. Nodes in a P2P system play equal roles; hence, they are also called peers. These peers form a virtual overlay network on top of the physical Internet links by maintaining information about a set of other peers (i.e. neighbours) in the P2P layer. Benefits offered by P2P networks include: (1) no special administration or financial engagements are required; (2) they are distributed and decentralised, and thus potentially fault tolerant and load balanced; (3) they are self-organised and adaptive.

Central to P2P systems is the ability to efficiently look up and locate data items and to manage them accordingly. Many aspects of P2P systems rest on this functionality. In contrast to centralised server applications, decentralised systems store content in multiple, possibly distant locations within the system. There are two approaches to providing control over data location and network topology in P2P systems: unstructured and structured. In unstructured P2P networks, there is no rule that defines where a data item is stored and the network topology is symmetric. Searching functionality is typically provided through the flooding of query messages. In structured P2P networks, the network configuration and the data locations are precisely defined [15].

Decentralised P2P networks can be further classified into non-hierarchical and hierarchical based on whether the arrangement of the P2P network is a hierarchy (Figure 2.1). In general, the hierarchy of a P2P system is exhibited by the roles that peers play. In non-hierarchical (or pure P2P) systems, peers are totally equal in terms of their role in the network. In hierarchical P2P systems, such as hybrid systems, some peers serve as a superpeer for a set of normal nodes. Generally, pure P2P systems offer high resilience and load balancing. However, they cannot take advantage of node heterogeneity, scalability and routing efficiency as hierarchical P2P systems can [15].

Figure 2.2 provides a summary of the P2P system classifications used in this dissertation. Note that a hybrid P2P system (indicated with dashed lines) may, in practice, refer to either a hierarchical system (i.e. a combination of centralised and decentralised networks) or a combination of unstructured and structured P2P protocols such as Yappers [16].

[Figure 2.1 panels: Centralised P2P (S = server, P = peer); Nonhierarchical decentralised P2P (P = peer); Hierarchical decentralised P2P (SP = superpeer, P = peer).]

Figure 2.1. Centralised and decentralised (with or without hierarchical) P2P systems.

This chapter provides the fundamentals of P2P routing and its application to voice- over-IP (VoIP) systems. Particularly, we focus on the searching techniques used in decentralised peer-to-peer systems which show our motivations. The remainder of this chapter is organised as follows. In Sections 2.1 and 2.2, we provide brief overviews of some basic searches in unstructured and structured P2P networks, and discuss their advantages and drawbacks. In Section 2.3, we investigate in more detail some advanced locality-aware searching schemes in structured P2P networks that motivate our research. Section 2.4 introduces popular P2P VoIP systems. Section 2.5 relates our work to recent relay path selection schemes.

[Figure 2.2 content: P2P systems are classified by physical configuration (Centralised, Decentralised, Hybrid/Hierarchical), by search functionality (Unstructured, Structured, Hybrid), and by hierarchy (Nonhierarchical, Hierarchical).]

Figure 2.2. Classification of P2P networks

2.1 Unstructured P2P Networks

In this section, we describe the main features of unstructured P2P systems, using a combination of information from [14], [15]. More detailed discussions on P2P networks can be found in the same book [17].

In both centralised and unstructured P2P systems, no rule strictly defines where data are stored and which nodes are neighbours to others. These systems either rely on lookups via a central server which stores the locations of all data items or use a flooding technique. In the centralised P2P approach, e.g. Napster [18], the P2P application at an end host first needs to look up the location of a data item via a server, and then the data are transferred directly between peers. The flooding based approach in unstructured P2P systems, such as Gnutella [19], sends lookup probes based on Breadth First Search (BFS) to all peers participating in the system with depth limit D, where D refers to the system-wide maximum number of hops of a query probe. Thus, flooding reaches the maximum number of peers within a ring centred at the source node with a radius of D hops. It can be seen that the server-based system suffers from a single point of failure as well as being a bottleneck with regard to resources such as memory, processing power, and bandwidth. On the other hand, the flooding-based approach generates a large amount of overhead, much of which consists of replicated messages, inducing a high consumption of network bandwidth. As a result, neither approach scales well.

Alternative schemes have been proposed to address the problem of flooding, including BFS based and Depth First Search (DFS) based approaches. The BFS based schemes include k-walker random walk [20], iterative deepening [21], directed BFS [21], intelligent search [22], local indices based search [21], adaptive probabilistic search [23], etc. The routing indices based search [24] and the attenuated bloom filter search [25] are variations of DFS. Their main purpose is to reduce overhead and to increase the quality of query results.

Searching schemes in unstructured P2P systems can also be classified as blind and informed searches, or as deterministic and probabilistic searches. In an informed search, peers store some metadata to facilitate the search, while in a blind search, they do not keep information about data location. In a deterministic search, query forwarding is deterministic, while in a probabilistic approach, the query is forwarded probabilistically or randomly, or based on some kind of ranking. In the next sub-sections, we investigate several unstructured P2P searching schemes.
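The depth-limited flooding described above can be illustrated with a short Python sketch. It is only a simplified model under stated assumptions: the overlay graph, the peer names and the function name flood are hypothetical, and a peer here sends a copy to every neighbour (including the one it received the query from), which real systems may avoid.

```python
from collections import deque

def flood(graph, source, depth_limit):
    """Breadth-first flood from source, stopping after depth_limit hops.

    graph maps each peer to the list of its overlay neighbours
    (a hypothetical adjacency layout used only for this sketch).
    Returns (hops, messages): hops maps every reached peer to its hop
    count, messages counts every query copy sent, duplicates included.
    """
    hops = {source: 0}
    messages = 0
    queue = deque([source])
    while queue:
        peer = queue.popleft()
        if hops[peer] == depth_limit:
            continue  # depth limit D reached: this peer does not forward
        for neighbour in graph[peer]:
            messages += 1  # a copy is sent even if the neighbour already saw it
            if neighbour not in hops:
                hops[neighbour] = hops[peer] + 1
                queue.append(neighbour)
    return hops, messages

# Hypothetical five-peer overlay.
overlay = {
    "A": ["B", "C"],
    "B": ["A", "C", "D"],
    "C": ["A", "B", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
reached, sent = flood(overlay, "A", depth_limit=2)
print(reached, sent)
```

In this toy overlay, eight query copies are sent to reach only three peers other than the source, and peer E, three hops away, is never reached with D = 2. Both effects correspond to the replication overhead and the limited search horizon discussed in the text.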

2.1.1 Non-hierarchical P2P Networks

2.1.1.1 k-Walker Random Walk

The random walk algorithm [20] further improves the search efficiency in the Gnutella system. In this scheme, the querying node forwards a query message, called a walker, to k randomly selected neighbours. These neighbours repeatedly choose k of their neighbours and forward the walker to those neighbours. The procedure continues until the desired data item is found. Obviously, when k = 1, the amount of message overhead is reduced but the searching delay is the longest. Increasing k reduces the routing delay by a factor of k, on average, compared to the case of a single walker because k times more nodes are reached in the same number of steps.

The main issue for the k-walker random walk algorithm is data replication. The works in [20] and [26] have tried to address the problem, in terms of the average search overhead for completed queries, by analysing and simulating three replication strategies: uniform, proportional, and square-root. Their results show that the uniform and proportional replications achieve the same average search size, which is larger than that achieved by the square-root strategy. Square-root replication also has a smaller utilisation rate than the others. Square-root replication can be practically implemented by replicating data in proportion to the number of sites probed.
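A random walk search can be sketched in Python as follows. This is a simplified variant, stated as an assumption rather than the exact procedure of [20]: the source launches k walkers and each walker moves to one randomly chosen neighbour per step, up to a hypothetical step limit max_steps. The graph, peer names and function name are illustrative only.

```python
import random

def k_walker_search(graph, holders, source, k, max_steps, rng):
    """Launch k walkers from source; each takes one random hop per step.

    graph maps a peer to its neighbour list; holders is the set of peers
    that store the wanted item. Returns (peer_found_or_None, messages_sent).
    """
    walkers = [source] * k
    messages = 0
    for _ in range(max_steps):
        for i, position in enumerate(walkers):
            nxt = rng.choice(graph[position])
            messages += 1          # one query message per walker per step
            if nxt in holders:
                return nxt, messages
            walkers[i] = nxt
    return None, messages          # step limit reached without a hit

# Hypothetical overlay: the source's only neighbour holds the item.
overlay = {
    "src": ["hub"],
    "hub": ["src", "x", "y"],
    "x": ["hub"],
    "y": ["hub"],
}
hit, cost = k_walker_search(overlay, {"hub"}, "src", k=3, max_steps=10,
                            rng=random.Random(1))
miss, miss_cost = k_walker_search(overlay, set(), "src", k=3, max_steps=10,
                                  rng=random.Random(1))
```

When no peer holds the item, the search costs exactly k × max_steps messages, which makes the trade-off between the number of walkers and the message overhead explicit.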

2.1.1.2 Directed BFS and Routing Indices Based Search

In the directed BFS scheme [21], the source node only sends query messages to a set of its neighbours that meet certain criteria. The most common criterion used is a history of returning high-quality results. Subsequently, the selected neighbours forward the query in the same BFS fashion. Forwarding the query to a subset of neighbours helps the algorithm reduce the amount of routing overhead. By selecting “good” neighbours, it is possible to maintain the quality of query responses and to lessen routing delay. Heuristics for selecting good neighbours include:

• The closest neighbour, identified by the hop-count in previous messages;
• The most stable neighbour, identified by the highest number of returned queries; and
• The lowest load neighbour, identified by the message queue delay.

However, the message replication in this algorithm is not greatly reduced because, apart from the source, all downstream peers involved in the query procedure still broadcast the query based on BFS.

To further reduce overhead, routing indices based search is proposed in [24]. It is similar to directed BFS in that all nodes use information about their neighbours to guide the search. However, routing indices based search uses intelligent neighbour selection for the entire search process, not only at the querying node as in the case of directed BFS. In this scheme, a routing index (RI), a distributed data structure, helps a node to choose the best neighbours to which to forward query messages. Given a content query, the algorithm ranks the neighbours based on this data structure and selects the best few. Since an RI indicates a route via one of the node's neighbours, rather than to individual destinations, the size of the routing table is reduced. In practice, routing indices based search is proposed for content queries, e.g. a request for documents that contain a particular word. Hence, a good neighbour is typically one through which many documents can be quickly found.

2.1.1.3 Local Indices Based Search

The basic idea of local indices [21] is to obtain the same number of query results as flooding while having fewer nodes process each query. Each node in a local indices network maintains metadata (i.e. indices) on all nodes within k-hop distance. A local indices network employs a policy P for all nodes, which is typically a list of depths at which queries are processed. Using P, each node whose depth is listed in P can directly respond to a query by checking its local indices and returning the query results without moving the query to other nodes. However, if its depth is smaller than the depth limit defined in P, the node continues to forward the query message to all other neighbours. On the other hand, if the depth of a node is not listed in P, it simply forwards the query to all of its neighbours and does not check the local indices.

When a node joins, leaves or modifies its local indices, all nodes within k-hop distance from it update their local indices accordingly and broadcast an update message to all of their neighbours. On receiving the update message, those neighbours check whether the update contains information that affects their metadata.

In summary, the local indices based search broadcasts the query message based on a list of depths. Only the nodes at depths stated in the policy P process the query. As a result, this technique is claimed to greatly reduce the aggregate cost of processing queries over the entire system, while maintaining equally high quality of search results.

2.1.2 Hierarchical Unstructured P2P Networks

To overcome the drawbacks of flooding, a dynamic hierarchy has been introduced for unstructured P2P systems so that not every query message has to be flooded through the whole network. The hierarchical (or hybrid) P2P network is realised through a special type of relay peer called a superpeer or supernode. A superpeer is a peer which typically has high capacity. All superpeers connect to each other as a pure P2P network. Each superpeer operates as a centralised server to a set of normal nodes called leaf-nodes. However, to maintain the advantages of self-organisation and decentralisation, a superpeer should not have more than 50 to 100 leaf-nodes [27], depending on its processing power and connection bandwidth. Each ordinary node may connect to one or more superpeers. For each query, the querying node forwards the query message to one of its superpeers and waits for query results from this superpeer. In hierarchical P2P networks, queries are processed by superpeers only. The mechanisms that superpeers use include flooding and random walk. This organisation allows searches in hierarchical networks to be implemented more efficiently than in ordinary unstructured P2P systems, as the number of flooded messages is reduced significantly.

2.1.3 Summary

Unstructured P2P networks are extremely resilient to nodes joining and leaving and adapt well to a changing peer population because there is no special structure to maintain and no control over data storage. Since their searching mechanisms are often based on flooding, they suffer from generating a potentially huge amount of signalling traffic. Such a system design is broadly argued not to be scalable, despite the fact that many real systems are built on this scheme. To restrict network bandwidth consumption, popular unstructured P2P systems employ constraint algorithms which may in fact terminate a data query prematurely, before the desired data is found. For example, Gnutella uses flooding with a limited time-to-live (TTL) for query messages. As a result, unstructured P2P networks cannot guarantee the quality and performance of resource discovery.

2.2 Structured P2P Networks

Structured P2P networks are considered the second generation of P2P. In structured P2P networks, there is no central directory but there is tight control over the P2P network topology. The neighbour relationships between peers and the data locations are strictly defined. There is close coupling between network topology and resource location data. Searching in structured P2P networks is determined by the particular P2P topology, which should be reconstructed as peers join and leave. Structured P2P networks can guarantee finding a resource (data item) within a bounded number of hops.

Most structured P2P systems implement a distributed hash table (DHT), which is an approach for distributed and content-addressable data storage. DHTs employ additional techniques to manage data structures, to add redundancy, and to locate the nearest instances of a requested data item. A DHT contains table entries that are distributed among different peers located in arbitrary locations. Each data item is hashed to a unique numeric key in a namespace. Each peer in the network is responsible for a certain number of keys, i.e. a small part of the namespace, and is assigned a unique peer identifier.

The DHT search supports two basic operations: lookup(key) and put(key). The lookup(k) operation is used to localise (e.g., to return the IP address of) a peer responsible for the key k, and the put(k) operation is used to store a data object (or a pointer to the data object) with the key k in the peer responsible for k. A peer must publish the data objects originally stored on it using the put(k) operation before these objects can actually be retrieved by other peers [15].

To implement the lookup operation, each peer in a DHT network maintains a forwarding table that manages mappings between keys and information (e.g., IP address) of a set of its neighbours. When a peer receives a lookup request, it checks whether it can answer the request itself; if it cannot, the request is forwarded to one of its neighbours based on the forwarding table. The process continues until the result, as long as it exists, is found. Structured P2P systems are usually designed, based on their highly structured DHT protocols, to handle the strong dynamics of peers and to scale to a large number of potentially faulty peers.

2.2.1 Non-hierarchical Structured P2P Networks

The most well-known DHT schemes are Chord [28], Pastry [29], Tapestry [30] and CAN [31]. These schemes are based on particular flat data structures representing some traditional interconnection topologies, including ring, mesh, hypercube and other more special graphs such as the d-torus topology (used in CAN), the de Bruijn graph used in Koorde [32], the Butterfly topology used in Viceroy [33], etc. The underlying topologies have significant impacts on the performance, resilience, and other properties of DHT schemes. We briefly present the two most popular systems: Chord and Pastry.

2.2.1.1 Chord

Chord [28] uses a ring data structure. Chord’s peer identifiers form a uni-dimensional, circular identifier space. Both peers and keys in a Chord system are assigned m-bit identifiers based on a public hash function, such as SHA-1 [34], which generates 160-bit identifiers. Keys are assigned to peers as follows. Key k is assigned to the peer (called the successor) whose identifier is equal to or follows the identifier of key k in the identifier space. Each peer keeps a finger table, i.e. a list of neighbours, with at most m entries. A finger entry contains the identifier, the IP address of the relevant node and some additional routing information. The additional information is not essential for the lookup operation; however, it helps to accelerate the search process. The i-th entry in the finger table at peer n contains the first peer s that succeeds n by at least $2^i$, where $0 \le i < m$ and all the arithmetic is modulo $2^m$. Peer s is called the i-th finger of peer n and is calculated with the following formula:

$$\mathrm{finger}[i] = \mathrm{successor}\big((n + 2^{i}) \bmod 2^{m}\big) \qquad (2.1)$$

When a peer receives a lookup request for a given key, it checks for the closest preceding peer in its finger table and forwards the request to that peer. This process is repeated recursively until the peer responsible for the key is found. In the example in Figure 2.3, when peer 10 wants to find key 32, it looks up the finger table to find the closest match as the start value of 26, and sends the query to peer 26. Similarly, peer 26 in turn sends it to peer 30, which finally sends it to peer 33. Peer 33 is the successor peer for the identifier 32 in the network, and hence is responsible for storing information about key 32. The lookup latency is O(log N) because, in each step, the query is forwarded at least half the remaining distance around the ring [35].
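The finger rule of Equation (2.1) and the lookup walk described above can be sketched in Python as follows. This is a minimal illustration, not the Chord implementation: the 6-bit identifier space and the peer set are hypothetical, chosen so that the hops match the example in the text (the full ring of Figure 2.3 is not reproduced here), and all helper names are assumptions. Each peer also computes its fingers from a global view of the ring, which a real peer does not have.

```python
from bisect import bisect_left

M = 6                      # identifier bits (assumed for this sketch)
RING = 2 ** M              # identifiers live in 0 .. 2**M - 1
PEERS = sorted([1, 10, 18, 26, 30, 33, 45, 58])  # hypothetical peer identifiers

def successor(ident):
    """First peer whose identifier equals or follows ident on the ring."""
    i = bisect_left(PEERS, ident % RING)
    return PEERS[i % len(PEERS)]   # wrap around past the largest identifier

def finger_table(n):
    """finger[i] = successor((n + 2**i) mod 2**M) for 0 <= i < M."""
    return [successor((n + 2 ** i) % RING) for i in range(M)]

def in_open(x, a, b):
    """True if x lies strictly between a and b, moving clockwise."""
    return (a < x < b) if a < b else (x > a or x < b)

def in_half_open(x, a, b):
    """True if x lies in (a, b], moving clockwise."""
    return (a < x <= b) if a < b else (x > a or x <= b)

def lookup(start, key):
    """Return the list of peers visited until the key's successor is found."""
    path = [start]
    n = start
    while True:
        succ = successor((n + 1) % RING)
        if in_half_open(key, n, succ):
            path.append(succ)      # succ is responsible for the key
            return path
        nxt = succ                 # fallback: step to the immediate successor
        for f in reversed(finger_table(n)):
            if in_open(f, n, key): # closest finger that precedes the key
                nxt = f
                break
        path.append(nxt)
        n = nxt

fingers_of_10 = finger_table(10)
route = lookup(10, 32)
print(fingers_of_10, route)
```

With this hypothetical ring, peer 10 forwards to peer 26, peer 26 forwards to peer 30, and peer 30 hands the query to peer 33, the successor of key 32, which mirrors the hop sequence described in the text.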

Figure 2.3. Example Chord network [35].

2.2.1.2 Pastry

Pastry [29] uses a tree based data structure which can be generalised as a hypercube. The peer identifier is a 128-bit value interpreted in base $2^b$, where b is typically 4. Each peer A maintains a set of leaf nodes L in which the identifiers of half of the nodes are closest to and smaller than the identifier of peer A, and the identifiers of the remaining leaf nodes are closest to and larger than peer A’s identifier. This structure guarantees the correctness of routing because delivery is guaranteed unless |L|/2 peers with adjacent peer identifiers fail simultaneously.

In order to shorten the routing time, each Pastry peer also keeps a routing table to other peers in the identifier space. Each peer A keeps $2^b - 1$ entries for each prefix of its identifier. Each entry in row n refers to a peer whose identifier shares the first n digits with the identifier of the current peer A, but whose (n+1)-th digit has one of the $2^b - 1$ possible values other than the (n+1)-th digit of the current peer A.

Given a query request for the key k, peer A first tries to find in its leaf set a peer whose identifier is numerically closest to key k and forwards the query to that peer. If no such peer exists, peer A tries to find a peer in its routing table whose identifier shares a longer prefix with key k than A does. If there is no such peer, A forwards the query to a peer whose identifier shares a prefix with key k of the same length as A's but is numerically closer to key k than A. Pastry also uses additional proximity heuristics during the process of forwarding queries. To achieve a routing latency of O(log N), each peer needs to maintain O(log N) routing state. The searching schemes used in Tapestry [30] and [36] are variants of Pastry.
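The three-step forwarding decision described above can be sketched in Python as follows. This is a simplified illustration under several assumptions: identifiers are short hexadecimal strings (b = 4) rather than 128-bit values, the routing table is flattened into a plain list of known peers, numeric distance ignores wrap-around of the identifier space, and all names and identifiers are hypothetical.

```python
def shared_prefix_len(a, b):
    """Number of leading digits that identifiers a and b have in common."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count

def distance(a, b):
    """Numeric distance between two hex identifiers (wrap-around ignored)."""
    return abs(int(a, 16) - int(b, 16))

def next_hop(current, key, leaf_set, routing_peers):
    """Pick the next peer for key, or return current to deliver locally."""
    # Step 1: the key falls within the leaf-set range, so choose the
    # numerically closest peer among the leaf set and the current peer.
    members = leaf_set + [current]
    values = [int(p, 16) for p in members]
    if min(values) <= int(key, 16) <= max(values):
        return min(members, key=lambda p: distance(p, key))
    # Step 2: prefer a routing-table peer sharing a longer prefix with the key.
    own = shared_prefix_len(current, key)
    longer = [p for p in routing_peers if shared_prefix_len(p, key) > own]
    if longer:
        return max(longer, key=lambda p: shared_prefix_len(p, key))
    # Step 3: fall back to any known peer whose prefix is no shorter
    # and which is numerically closer to the key than the current peer.
    closer = [p for p in leaf_set + routing_peers
              if shared_prefix_len(p, key) >= own
              and distance(p, key) < distance(current, key)]
    if closer:
        return min(closer, key=lambda p: distance(p, key))
    return current

# Hypothetical state of peer 65a1.
leaves = ["6598", "65a0", "65b2", "65c3"]
table = ["d13d", "d4c2", "6a00"]
hop_leaf = next_hop("65a1", "65b0", leaves, table)
hop_prefix = next_hop("65a1", "d46a", leaves, table)
hop_fallback = next_hop("65a1", "6f00", leaves, table)
```

The three calls exercise the three cases in turn: a key inside the leaf-set range, a key for which the routing table offers a longer shared prefix, and a key for which only a numerically closer peer is available.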

2.2.2 Hierarchical Structured P2P Networks

Following the same notion as hierarchical unstructured P2P systems, hierarchical DHT P2P networks organise peers into different clusters. Each cluster forms its own overlay, and together the clusters constitute the entire hierarchical overlay topology. Peers in the higher tier of the network are called dominating peers or superpeers, which generally require more computing resources and network bandwidth and take more responsibility in routing than normal peers.

In most self-organised P2P hierarchical systems, an ordinary peer can become a superpeer based on one or several criteria, which depend on the design of the system for a particular application. These criteria comprise CPU power and storage capacity as proposed by Mizrak et al. in [37], or node stability, connection quality and network bandwidth as in Garces-Erice et al. [38]. The superpeer selection scheme is different for each system. Usually, if a superpeer detects a better peer within its sub-network, the superpeer can promote this peer to superpeer and demote itself to an ordinary node. In the case of superpeer failure, which can be detected through periodic keep-alive messages between peer neighbours, the P2P systems can use protocols such as volunteer service [37] to select another superpeer to take over the load of the failed one, or the affected peers may change over to existing neighbouring superpeers. Searching operations in each tier of the hierarchy can be similar to one of the non-hierarchical structured P2P schemes mentioned in previous sections.

Examples of hierarchical structured P2P systems include Kelips [39], Coral [40], Brocade [41], HIERAS [42], KaZaA [43], etc. More detail about the operation of hierarchical structured P2P systems can be found in Section 2.3, where we discuss locality-aware P2P systems.

2.2.3 Summary

Structured P2P networks have attracted the attention of the research community because of their advantages, including scalability, self-organisation and generality. They have bounded guarantees for resource discovery in which any existing resource can be located within a predetermined number of hops. These desirable characteristics of structured P2P networks are realised by the use of DHT schemes. They provide a location-independent naming infrastructure for constructing different types of applications. In hierarchical structured P2P systems, search queries can be processed even more efficiently because superpeers act as centralised servers. These systems also enjoy the advantages of the decentralised topology, i.e. resilience, stability, and load balancing, because there are relatively many superpeers in the system, which mitigates the potential problem of a single point of failure.

Structured P2P systems are considered costly to maintain because of the complex operations of DHTs, such as maintaining forwarding tables that should be updated whenever peers join or leave. Furthermore, superpeers in hierarchical structured systems play an important role at the top of the hierarchy; hence, failure of some of them might have a serious impact on the systems, e.g. the Skype registry servers failure in 2007 [44].

Nonetheless, structured P2P networks and their related protocols and algorithms are still a very attractive research topic and are considered the future evolution of P2P. Current trends of research include the development of flexible searches [45]; the selection of superpeers so as to maximise efficiency and speed up routing [46]; the correspondences between overlay P2P networks and the underlying physical network [47], [48]; relay path diversity [49]; and load balancing [50]. In the next section, we elaborate on the research trend of enabling locality-aware capability in structured P2P networks. All the P2P algorithms studied in this work are assumed to have such functionality.

2.3 Locality-aware Peer-to-Peer Algorithms

In most DHT systems, the basic hash functions applied to the IP address choose peer identifiers randomly and the neighbour relationships are established based solely on these peer identifiers. Locality awareness is not inherent in the design of DHTs. As a result, the routing stretch of object lookup, i.e. the ratio of the network distance travelled by a lookup query message to the distance between the requesting node and the nearest copy of the target object, can be high. Another challenge of DHT random hashing is load balancing. In a real world corpus, keyword frequency varies. The distribution typically follows Zipf’s law [51], meaning that a few keywords occur very often while many others occur rarely. The resulting load concentration is called a flash crowd or hot spot. These issues motivate a research trend towards giving P2P systems the capability of selecting peers in a geographically informed manner. This capability is called locality-awareness or network proximity in P2P systems.
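As an illustration, the routing stretch defined above can be written as a formula. The notation is an assumption introduced here for clarity and is not taken from the cited works: $p_0$ is the requesting node, $p_1, \dots, p_h$ are the peers visited by the lookup query, $r$ is the peer holding the nearest copy of the target object, and $d(\cdot,\cdot)$ is the network distance (e.g. latency) between two nodes.

```latex
\[
  \text{stretch} \;=\;
  \frac{\sum_{j=1}^{h} d(p_{j-1},\, p_{j})}{d(p_{0},\, r)}
\]
```

A stretch close to 1 means the overlay route is nearly as short as the direct network path to the nearest copy, while a randomly hashed DHT can yield a much larger value because consecutive overlay hops may be far apart in the physical network.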

2.3.1 Network Proximity in Distributed Hash Tables

Structured P2P networks (i.e. DHTs) like [6], [7], [28], [29], [30], [31], [52] offer a novel platform for a variety of scalable and decentralised distributed applications. These systems provide efficient and fault-tolerant routing, object location, and load balancing within a self-organising overlay network. One important aspect of these systems is how they exploit network proximity in the underlying Internet. An initial attempt to do so was made in the context of the CAN system, as reported by Ratnasamy et al. in [31]. This approach was quite successful in reducing path latencies. This work demonstrated the important role of a key space with high dimensionality.

Reference [53] discusses three basic approaches suggested for exploiting locality awareness in DHTs to improve routing performance: geographic layout, proximity routing and proximity neighbour selection.

2.3.1.1 Geographic Layout

In geographic layout approach, the identifiers are assigned in a manner that ensures that peers that are close in the network topology are close in the identifier space. In one implementation, peers measure the round-trip time (RTT) between themselves and a set of landmark servers to map the d-dimensional space onto the physical network, such that peers that are neighbours in the d-dimensional space (and therefore in each other’s routing tables) are close in the physical network. This technique can achieve good performance but it has the disadvantage that it is not fully self-organising; it requires a set of well- known landmark servers. In addition, it may cause significant imbalances in the distribution of peers that lead to hotspots. When considering the use of this method in Chord, Tapestry and Pastry, additional problems arise. Whilst geographic layout provides network locality in the routing, it sacrifices the diversity of neighbouring peers in the identifier space, which has consequences for failure resilience and availability of replicated key-value pairs. Both Chord and Pastry have the property that the integrity of their routing fabric is disrupted when an entire leaf set or successor set fails. Likewise, both protocols replicate key-value pairs on neighbouring peers in the namespace for fault tolerance. With a proximity-based identifier assignment, neighbouring peers, due to their proximity, are more likely to suffer correlated failures or to conspire.

2.3.1.2 Proximity Routing

Proximity routing was first proposed in CAN [31]. The routing tables are built without taking network proximity into account but the routing algorithm chooses a nearby peer at each hop from among the ones in the routing table. Each peer measures the RTT to each neighbour (i.e. routing table entry) and forwards messages to the neighbour with the maximum ratio of progress in the d-dimensional space to RTT. As neighbours are spread randomly over the network topology, the distance to the nearest neighbour is likely to be significantly larger than the distance to the nearest peer in the overlay. Additionally, this approach trades off the number of hops in the path against the network distance traversed at each hop; it may increase the number of hops. Because of these limitations, the technique is less effective than geographical layout. Proximity routing has also been used in a version of Chord [28]. Here, a small number of peers are maintained in each finger table entry rather than one, and a message is forwarded to the topologically closest peer among those entries whose identifier is closer to but counter clockwise from the message’s key. Since all entries are chosen from a specific region of the id space, the expected topological distance to the nearest among the entries is likely to be much larger than the distance of the nearest peer in the overlay. Furthermore, it appears that all these entries need to be maintained for this technique to be effective because not all entries can be used for all keys. This increases the overhead of peer joins and the size of routing tables. In conclusion, proximity routing affords some improvement in routing performance, but this improvement is limited by the fact that a small number of peers sampled from specific portions of the identifier space are not likely to be among the peers that are closest in the network topology.
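The forwarding rule described above for CAN can be illustrated with a minimal sketch. It assumes a plain Euclidean identifier space (torus wrap-around is ignored) and strictly positive RTT values; the names `Neighbour`, `distance` and `next_hop` are hypothetical and are not taken from [31].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Neighbour:
    name: str
    coords: tuple   # position in the d-dimensional identifier space
    rtt_ms: float   # measured round-trip time to this neighbour (assumed > 0)


def distance(a, b):
    """Euclidean distance in the identifier space (wrap-around ignored)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def next_hop(current, target, neighbours):
    """Return the neighbour with the largest progress-to-RTT ratio, or None."""
    here = distance(current, target)
    best, best_ratio = None, 0.0
    for n in neighbours:
        progress = here - distance(n.coords, target)
        if progress <= 0:
            continue  # this neighbour does not bring the message closer
        ratio = progress / n.rtt_ms
        if ratio > best_ratio:
            best, best_ratio = n, ratio
    return best
```

Note that a neighbour making less progress can win if its RTT is much smaller, which is how this rule trades hop count against the network distance of each hop.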

2.3.1.3 Proximity Neighbour Selection

In the proximity neighbour selection approach, routing table construction takes network proximity into account. Routing table entries are chosen to refer to peers that are nearby in the network topology, among all live peers with appropriate identifiers. The distance travelled by messages can be minimised without an increase in the number of routing hops. Tapestry and Pastry’s locality properties derive from mechanisms to build routing tables that take network proximity into account. They attempt to minimise the distance, according to the proximity metric, to each of the peers that appear in a peer’s routing table, subject to the constraints imposed on identifier prefixes. Pastry ensures the following invariant for each peer’s routing table:

Proximity invariant: Each entry in a peer X’s routing table refers to a peer that is near X, according to the proximity metric, among all live Pastry peers with the appropriate identifier prefix.

As a result of the proximity invariant, a message is normally forwarded in each routing step to a nearby peer, according to the proximity metric, among all peers whose identifier shares a longer prefix with the key. Moreover, the expected distance travelled in each consecutive routing step increases exponentially, because the density of peers decreases exponentially with the length of the prefix match. From this attribute, two distinct properties of Pastry with respect to network locality are derived:

Total distance travelled: The expected distance of the last routing step tends to dominate the total distance travelled by a message. As a result, the average total distance travelled by a message exceeds the distance between source and destination peer only by a small constant value.

Local route convergence: The paths of two Pastry messages sent from nearby peers with identical keys tend to converge at a peer near the source peers, in the proximity space, because in each consecutive routing step, the messages travel exponentially larger distances towards an exponentially shrinking set of peers.
Thus, the probability of a route convergence increases in each step, even in the case where earlier (smaller) routing steps have moved the messages farther apart. This result has significance for caching applications layered on Pastry.

The routing algorithms in Pastry and Tapestry allow very effective proximity neighbour selection because there is freedom to choose nearby routing table entries from among a large set of peers. This leads to very good route locality properties. Moreover, the join protocol allows Pastry to identify appropriate nearby peers by performing only a small number of network probes.
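The proximity invariant can be illustrated by a simplified sketch that fills each routing table slot with the nearest peer whose identifier fits that slot. It assumes equal-length identifier strings and global knowledge of all live peers and their distances, which a real Pastry node does not have (it relies on the join protocol and a small number of probes). The function names are hypothetical.

```python
def shared_prefix_len(a, b):
    """Number of leading digits that two identifiers have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def fill_routing_table(local_id, peers, proximity):
    """Map (row, digit) -> nearest peer id that fits the slot.

    A peer fits slot (row, digit) if it shares the first `row` digits with
    `local_id` and its next digit is `digit`. `proximity` maps each peer id
    to its distance from the local peer under the chosen proximity metric.
    """
    table = {}
    for pid in peers:
        if pid == local_id:
            continue
        row = shared_prefix_len(local_id, pid)
        slot = (row, pid[row])
        current = table.get(slot)
        if current is None or proximity[pid] < proximity[current]:
            table[slot] = pid  # keep the nearest candidate for this slot
    return table
```

For example, with local identifier `a1f0`, two candidates `a2c3` and `a2ff` compete for slot (1, '2'), and the one with the smaller proximity value is retained.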

2.3.2 eQuus: a Locality-aware P2P System

In this sub-section, we describe in more detail a recent piece of research on locality-aware P2P systems. T. Locher et al. [54] combine several advantages in their proposal of a locality-aware P2P system called eQuus. They assume that all peers are uniformly distributed in a two-dimensional Euclidean space to simplify the analysis of system properties. The locality awareness criterion in eQuus captures the latency of a lookup operation. The eQuus scheme guarantees that the distance travelled by a packet in the network is not much larger than the shortest distance between the source and the destination peer. eQuus employs a hierarchical hypercube topology. Groups of peers that are close to each other according to the chosen proximity metric form the vertices of a partial hypercube. Within such a group (called a clique, as it forms a complete graph), all peers share the same identifier, which is a bit string of a predefined length d. The length d of the identifiers is referred to as the dimension of the network. Since these nodes share the same identifier, they are also responsible for the same fraction of the key space. This has two interesting properties. First, these nodes ensure a certain degree of redundancy, which is required in case data is lost due to the sudden departure or failure of a particular node. Second, consistency among these nodes can be achieved quickly due to the short distance between them. In addition to the connections to all other clique neighbours, each peer has links to peers in other cliques. These additional links ensure the connectivity of the entire system. Figure 2.4 illustrates an example eQuus network.

Figure 2.4. An example network consisting of 5 cliques. Nodes that are close-by belong to the same clique, share the same ID, and are responsible for the same set of data items [54].

In order to give good locality properties, a newly arriving node must join the closest clique in the system, starting by contacting an arbitrary bootstrap node. The contacted node returns the address of one node of each clique in its routing table. The new node then contacts all those nodes and determines the closest one, e.g. by sending several ping messages and waiting for the replies. This procedure is similar to the mechanism described in [55]. Subsequently, a join message is sent to the closest node, which again returns the addresses of the cliques in its routing table. This step is repeated for O(log N) rounds until the closest clique has been found. The contacted node in the closest clique informs all the other nodes within the clique about the arrival of the new node and gives the new node all the information it needs to become a fully integrated clique member. The routing table can be copied from any other clique member, since they all share links to the same cliques. Since the degree of each node should not exceed O(log N) to maintain efficient searches, the number of nodes in a clique has to be limited. Conversely, a clique with too few nodes is confronted with a higher probability of data loss. Therefore, once the number of nodes reaches a certain lower bound, the remaining nodes in the clique have to join another clique, resulting in a merging of the two cliques. Likewise, if the number of nodes reaches a specific upper bound, the clique has to be split into two cliques. Hence, apart from the standard operations such as “JOIN” and “LOOKUP”, two additional operations, namely “MERGE” and “SPLIT”, are essential in eQuus. By using such a hierarchical structure, eQuus also reduces overhead, since nodes joining and leaving a clique only trigger communication among the nodes in the same clique. Changes in the routing tables are only necessary if entire cliques appear or disappear, due to the arrival or departure of a large number of nodes. This allows eQuus to maintain network stability, as the lifetime of cliques is much longer than the lifetime of individual nodes.
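The join search and the size-based maintenance rule described above can be sketched as follows. This is a simplified illustration, not the eQuus implementation of [54]: cliques are represented by plain identifiers, `routing_table` and `rtt` are hypothetical dictionaries standing in for the returned address lists and the ping measurements, and the bounds `lower` and `upper` are assumed parameters.

```python
def find_closest_clique(bootstrap, routing_table, rtt):
    """Greedy join search: keep moving to a strictly closer known clique.

    routing_table maps a clique to the cliques it has links to;
    rtt maps a clique to the distance measured by the joining node.
    """
    current = bootstrap
    while True:
        known = routing_table.get(current, [])
        best = min(known, key=lambda c: rtt[c], default=current)
        if rtt[best] >= rtt[current]:
            return current  # no known clique is strictly closer
        current = best


def maintenance_action(size, lower, upper):
    """Decide whether a clique must merge, split, or do nothing."""
    if size <= lower:
        return "MERGE"
    if size >= upper:
        return "SPLIT"
    return None
```

Requiring a strictly smaller distance before moving on guarantees that the search terminates even when two cliques are equally distant.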

2.3.3 Summary

P2P networks benefit from such topological information to improve the neighbour relationship. Peers can gradually select close-by neighbours in the virtual space to achieve much better response times for lookup operations and to reduce routing stretch. It also helps to give them inherently strong resilience to churn. This work is concerned with distributed structured locality-aware P2P systems that are practically applicable to providing VoIP service. Specifically, all ordinary peers in the studied systems should at least have some local awareness of the locations of, and paths to, their sets of neighbours, as proposed in the EDR system [8]. This knowledge can be acquired using light-weight measurements such as ping and traceroute, or by querying their local superpeers. Superpeers, on the other hand, are assumed to have full knowledge of the overlay network topology, which should be updated and disseminated periodically [56]. In Chapter 5 of this work, we follow the network model used in eQuus [54] and GNP [57] to represent the P2P topology in 2-dimensional Euclidean space. This simple model allows us to study the P2P path tendency and to sufficiently approximate the diversity degrees of relay paths.

2.4 Peer-to-Peer VoIP Systems

The evolution of P2P technologies realises the capability for end-to-end QoS communication services such as voice and video. P2P systems inherently offer high scalability, because capacity scales with the user population, as well as robustness and fault tolerance, since there is no centralised server and the network is self-organising. VoIP service can be provided as an application of the P2P network, where the VoIP clients form a self-organising P2P overlay network to locate and communicate with others. Compared to other P2P applications, P2P systems providing this delay-sensitive service have the following fundamental differences [35]. Firstly, traditional P2P applications such as file sharing systems provide multiple copies of a popular file. Therefore, the reliability of peers is not a problem. In the case of VoIP, on the other hand, we want to talk to the right person, and not to similarly named persons. Furthermore, user contact locations may change frequently, as opposed to the rather stable directory information and locations of a retrievable file. Secondly, when initiating a call, the caller actively waits for the other side to ring. Therefore, P2P VoIP systems are quite sensitive to lookup latency. Other applications, such as file sharing and directory lookup-based systems, can tolerate high lookup latency, and the actual file download time tends to be larger than the lookup latency. Thirdly, voice traffic consumes moderate network bandwidth, e.g. a Skype call typically takes only 3-16 kbps [58]. A voice relay peer, therefore, can handle more connections than a file sharing node. As a result, load balancing and data storage are generally not issues for VoIP peers. Finally, file sharing and directory services may suffer from flash crowd effects, whereas call access patterns are more uniformly distributed. These differences are summarised in Table 2.1.
In the following sub-sections, we study the main performance factors of the P2P VoIP systems impacting on our relay path selection algorithms in this dissertation. We then present some P2P VoIP systems.

Table 2.1. Summary of different features in P2P applications [35].

Properties/Types   File sharing   Directory   VoIP (for user lookup)
Data storage       Yes            No          No
Caching            Yes            Yes         No
Delay sensitive    No             No          Yes
Reliability        Having multiple independent copies of data helps   Only the intended user must be found

2.4.1 VoIP Performance Parameters

Generally, the quality of a VoIP call is influenced by three factors: delay, loss and delay variation (or jitter). The International Telecommunication Union (ITU) has defined the Mean Opinion Score (MOS), which is a subjective quality metric for evaluating speech quality as perceived by human listeners [59]. MOS is given on a scale of 1 (unacceptable) to 5 (excellent), as shown in Table 2.2 below.

Table 2.2. The Mean Opinion Score for measuring call quality

MOS    Rating      Perceived Quality
4-5    Excellent   Toll quality
3-4    Good        Cell phone quality
< 3    Fair        Unacceptable
< 2    Bad         Unintelligible

The ITU has also established a method, called the E-Model [60], for estimating VoIP call quality from measured network performance. MOS is computed using the nonlinear function shown in (2.1), where R is referred to as the R-factor.

$$\mathrm{MOS} = 1 + 0.035R + 7 \times 10^{-6}\, R\,(R - 60)(100 - R) \tag{2.1}$$

The R-factor is defined as a combination of different aspects of voice quality impairments:

$$R = R_0 - I_s - I_e - I_d + A \tag{2.2}$$

where R0 groups the effects of various noises; Is includes the effect of other impairments that occur simultaneously with the voice signal; Ie covers the impairment caused by different types of losses; Id represents the impairment caused by delays; and, A compensates for the above impairments under various user conditions. In practice, receivers’ buffers in many VoIP systems are used to smooth out the jitter incurred in transmission environments. Using ITU default values, (2.2) can be reduced to

$$R = 94.2 - I_e - I_d. \tag{2.3}$$
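As a small numerical illustration of (2.1) and (2.3), the following sketch maps given impairment values to an estimated MOS. The function names are hypothetical. The clamping of MOS to 1 for R ≤ 0 and to 4.5 for R ≥ 100 follows the usual E-Model convention and is an assumption here, since the text above states only the formula itself.

```python
def r_factor(i_e, i_d, r_base=94.2):
    """Reduced R-factor of (2.3): R = 94.2 - I_e - I_d."""
    return r_base - i_e - i_d


def mos_from_r(r):
    """Estimate MOS from the R-factor using (2.1).

    Outside 0 < R < 100 the value is clamped (assumed convention).
    """
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1 + 0.035 * r + 7e-6 * r * (r - 60) * (100 - r)
```

With no loss or delay impairment (I_e = I_d = 0), R = 94.2 and the estimated MOS is about 4.43; any positive impairment lowers R and therefore the MOS.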

Therefore, delay and loss are the two factors that need to be considered when designing a VoIP network. Recall from the beginning of this section that VoIP relay paths differ fundamentally from paths for other P2P applications in that they are sensitive to latency and loss and consume only moderate network bandwidth. In this work, hence, we do not consider network load as a significant factor. Instead, we consider the degree of path diversity, represented by the number of overlap hops between the default path and the alternate relay path. Given the rich inter-connectivity of large-scale P2P networks, increasing the degree of divergence is expected to improve end-to-end performance, because distant relay paths are unlikely to experience link degradation or failure at the same time. Reference [8] has validated the performance improvement obtained when its proposed algorithm, EDR, utilises such disjointness. Furthermore, by maximising relay path diversity, our overall multi-objective optimisation problem (cf. Chapter 5) can also help to improve load balancing, because load is distributed among disjoint links and dispersed to a more decentralised set of relay peers. Reference [8] also observed that paths with a high delay avoidance percentage are likely to have a high loss avoidance percentage, and vice versa. One of the main reasons might be that paths experience long delays if they traverse a congested part of the network, which also causes more losses. We accept this observation in our study to further reduce the complexity. Specifically, we use delay as the second performance metric when comparing different relay path selection schemes.
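The overlap measure used as the diversity metric can be sketched as follows. This is only an illustration: it assumes that a path is given as a list of node identifiers and that an "overlap hop" is an undirected link shared by both paths; the function names are hypothetical.

```python
def links(path):
    """Set of undirected links (hops) traversed by a path of node ids."""
    return {frozenset(pair) for pair in zip(path, path[1:])}


def overlap_hops(default_path, relay_path):
    """Number of links that the relay path shares with the default path."""
    return len(links(default_path) & links(relay_path))
```

For instance, a relay path that leaves the default path after the first hop and rejoins it for the last hop has an overlap of two, whereas a fully disjoint relay path has an overlap of zero.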

2.4.2 Skype

2.4.2.1 Skype System Overview

Skype is a P2P VoIP client developed from KaZaA [43] that allows its users to place voice calls and send text messages to other users of Skype clients. In essence, it is very similar to the MSN and Yahoo IM applications, as it has capabilities for voice calls, instant messaging, audio conferencing, and buddy lists. However, the underlying protocols and techniques it employs are quite different. Skype is not an open protocol. All control messages and voice traffic are encrypted. Nonetheless, its successful deployment has attracted much research work. One of the initial studies is the Skype system analysis published by S. A. Baset and H. Schulzrinne in [58]. We provide an overview of the Skype system based on this work, with a focus on the search functionality of Skype. There are two types of nodes in this overlay network: ordinary hosts and supernodes. An ordinary host is a Skype application that can be used to place voice calls and send text messages. A supernode is an ordinary host’s end-point on the Skype network. Any node with a public IP address and sufficient CPU, memory, and network bandwidth is a candidate to become a supernode. An ordinary host must connect to a supernode and must register itself with the Skype login server for a successful login. Although not a Skype node itself, the Skype login server is an important entity in the Skype network. User names and passwords are stored at the login server, and user authentication at login is also done at this server. The server also ensures that Skype login names are unique across the Skype name space. Figure 2.5 illustrates the relationship between ordinary hosts, supernodes and the login server.

Apart from the login server, there is no central server in the Skype network. Online and offline user information is stored and propagated in a decentralised fashion and so are the user search queries.

[Figure 2.5: diagram of the Skype network showing the Skype login server, supernodes and ordinary hosts, together with the neighbour relationships in the Skype network and the message exchange with the login server during login.]

Figure 2.5. Skype network. There are three main entities: supernodes, ordinary hosts, and the login server [58].

Network Address Translation (NAT) and firewall traversal are important Skype functions. Each Skype node uses a variant of the STUN [61] protocol to determine, in a decentralised manner, the type of NAT and firewall it is behind. Each Skype client builds and refreshes a table of reachable nodes. In Skype, this table is called the host cache, and it contains the IP addresses and port numbers of supernodes. The host cache is stored in the operating system’s registry for each Skype node.

Skype claims to have implemented a “third generation of P2P”, i.e. employing the Global Index (GI) technology [62]. The global indexing strategy is an advanced DHT-based keyword searching scheme. Recall that keyword searching is not directly supported by DHTs. Hence, the global indexing scheme has to maintain in each node the inverted lists of some keywords, and to assign each undivided keyword to a unique node in the system by hashing it to a unique key in the DHT layer. An inverted list for a keyword contains the identifiers of all objects in which the keyword appears. To answer a query consisting of multiple keywords, the query is sent to the peers responsible for those keywords. Their inverted lists are transmitted over the network and intersected to obtain a list of objects that contain all the keywords [14]. To reduce bandwidth consumption and lookup latency, caching is normally implemented: Skype clients cache the user information returned by earlier queries to avoid receiving it again in future queries.
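The generic inverted-list scheme described above can be sketched as follows. Since the Skype protocol is closed, this is not Skype's implementation; it only illustrates hashing a keyword to a responsible node and intersecting inverted lists. The function names, the use of SHA-1 and the modulo mapping onto `num_nodes` are assumptions made for illustration.

```python
import hashlib


def node_for_keyword(keyword, num_nodes):
    """Map a keyword to the index of the node responsible for it."""
    digest = hashlib.sha1(keyword.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % num_nodes


def answer_query(keywords, inverted_lists):
    """Intersect the inverted lists of all query keywords.

    inverted_lists maps a keyword to the set of object identifiers in
    which it appears (in a DHT, each list lives on the responsible node).
    """
    lists = [set(inverted_lists.get(k, ())) for k in keywords]
    if not lists:
        return set()
    return set.intersection(*lists)
```

In a deployed system the lists would be fetched from the nodes given by `node_for_keyword`, which is where the bandwidth cost that motivates caching arises.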

2.4.2.2 Skype Functions

Skype functions can be classified into start-up, login, user search, call establishment and tear down.

Start-up

When a Skype client is run for the first time after installation, it sends a Hypertext Transfer Protocol (HTTP) 1.1 GET request to the Skype server (i.e. skype.com). During subsequent start-ups, a Skype client only sends an HTTP 1.1 GET request to the Skype server (skype.com) to determine whether a new version is available.

Login

Login is perhaps the most critical function for the operation of Skype. It is during this process that a Skype client authenticates its user name and password with the login server, advertises its presence to other peers and its buddies, determines the type of NAT and firewall it is behind, and discovers online Skype nodes with public IP addresses. These newly discovered nodes are used to maintain the connection with the Skype network should the supernode to which the Skype client is connected become unavailable.

In the login process, a Skype client must establish a Transmission Control Protocol (TCP) connection with a supernode in order to connect to the Skype network. If it cannot connect to a supernode, it will report a login failure. After a Skype client is connected to a supernode, it must authenticate the user name and password with the Skype login server. The login server is the only central component in the Skype network. It stores Skype user names and passwords and ensures that Skype user names are unique across the Skype name space. Upon the first login after installation, the host cache is usually initialised with seven IP address and port pairs; these entries are called bootstrap supernodes. In the case where the host cache is initialised with more than seven IP address and port pairs, it always contains those seven bootstrap supernodes. It is with one of these IP address and port entries that a Skype client establishes a TCP connection when a user uses that Skype client to log onto the Skype network for the first time after installation. For the first-time login, a Skype client maintains a TCP connection with at least one bootstrap node to acquire the address of the login server. The Skype client then establishes a TCP connection with the login server and exchanges authentication information with it through a challenge-response mechanism. Skype is a P2P client, and P2P networks are very dynamic. A Skype client, therefore, must keep track of online nodes in the Skype network so that it can connect to one of them if its supernode becomes unavailable. Thus, at the end of the login process, the Skype client tries to contact about 20 distinct nodes to advertise its arrival on the network and to build an alternate node table of online nodes.
The subsequent login process is quite similar to the first-time login process. The Skype client builds a host cache after a user has logged in for the first time after installation. The host cache gets periodically updated with the IP address and port number of new peers. During subsequent logins, Skype client uses the login algorithm to determine at least one available peer out of the nodes present in the host cache. It then establishes a TCP connection with that node.
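The role of the host cache during login can be illustrated with a minimal sketch. The actual Skype login algorithm is not public, so this only mirrors the behaviour described above: try the cached entries until one supernode is reachable, otherwise report a login failure. The function name, the `can_connect` callback (which stands in for a real TCP connection attempt) and the example addresses are hypothetical.

```python
def pick_supernode(host_cache, can_connect):
    """Return the first reachable (ip, port) entry, or None.

    host_cache: ordered list of (ip, port) pairs of known supernodes.
    can_connect: callable that attempts a connection and returns a bool.
    A result of None corresponds to a reported login failure.
    """
    for address in host_cache:
        if can_connect(address):
            return address
    return None
```

A caller would pass a function that opens a TCP connection with a timeout; injecting it as a parameter keeps the sketch independent of any real network.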

User Search

Skype uses its GI technology to search for a user. Skype claims that search is distributed and is guaranteed to find a user if the user exists and has logged in during the last 72 hours. Extensive testing in [58] indicates that a Skype client can find, within 3-4 seconds, a user who has logged into the Skype network in the last 72 hours. However, the underlying search technique that Skype uses for user search is still not clear. Reference [58] suggests that it uses a combination of hashing and periodic controlled flooding to gain information about online Skype users.

Call Establishment and Teardown

For users that are not present in the buddy list, call placement is equal to user search plus call signalling. If both users are on public IP addresses, are online and are in each other's buddy lists, then upon pressing the call button, the caller's Skype client establishes a TCP connection with the callee's Skype client. Signalling information is exchanged over TCP. In the case where the caller is behind a port-restricted NAT and the callee is on a public IP address, signalling and media traffic do not flow directly between caller and callee. Instead, the caller sends signalling information over TCP to an online Skype node, which forwards it to the callee over TCP. This online node also routes voice packets from caller to callee over UDP and vice versa. If both users are behind a port-restricted NAT and a UDP-restricted firewall, both the caller's and the callee's Skype clients exchange signalling information over TCP with another online Skype node. There are advantages to having a node route the voice packets from caller to callee and vice versa. First, it provides a mechanism for users behind a NAT and firewall to talk to each other. Second, if users behind a NAT or firewall want to participate in a conference, and some users on public IP addresses also want to join the conference, this node serves as a mixer and broadcasts the conferencing traffic to the participants. The negative side is that there will be a lot of traffic flowing across this node. Also, users generally do not want arbitrary traffic to flow across their machines. During call tear-down, signalling information is exchanged over TCP between caller and callee if they are both on public IP addresses, or between caller, callee and their respective host caches.

2.4.2.3 Skype Related Research

S. A. Baset and H. Schulzrinne [58] also analyse several other key functions of Skype, including NAT/firewall traversal and supernode relay. Following this work, Guha et al. in [63] provide experimental data about the Skype system that is useful for the future design of P2P VoIP systems, including the population of online Skype clients, the number of supernodes, and their traffic characteristics. Y. Yu et al. in [64] study the Skype system in order to identify P2P traffic and carry out measurements on the Skype overlay network with a special ad-hoc tool. Similarly, K. Suh et al. in [65] try to detect and characterise Skype relay traffic through metrics. In [50], G. Caizzone et al. analyse the scalability of the Skype network. They point out that the capacity of links is not critical for the scalability of the Skype system; the limiting issue is instead the ratio of the number of Skype users to the number of supernodes. The supernodes determine the maximum size that the Skype network can reach. The measurement- and experiment-based studies [56], [66] have demonstrated that the Skype system uses a sub-optimal relay path selection mechanism with a large number of unnecessary probes, resulting in heavy network traffic. Reference [56] also suggests, through two scenarios, that overlay routing paths can be faster (or shorter) than the direct IP routing paths.

2.4.3 P2P SIP Telephony

The Skype protocol is proprietary and closed. Another approach to P2P VoIP implementation is based on the Session Initiation Protocol (SIP) [67], [68]. SIP is a control protocol originally used in the Internet Engineering Task Force (IETF) Internet telephony client-server architecture. Its job is to set up, modify, and tear down sessions between session users. The main functions of SIP are to act as a signalling protocol, and to define the type of session for which it is signalling. It can also support sessions via multicast or single unicast, a mesh of unicast sessions, or a combination of these choices. The four major functions that SIP supports are:

• Determination of a user's location;
• Determination of media for the session;
• Determination of the willingness of a user to participate;
• Call establishment, call transfer, and termination.

One of the most important architectural features of SIP is that it relies on the client-server model. The majority of the system cost of this model lies in maintenance and configuration, typically performed by a dedicated system administrator in the domain. It also means that quickly setting up the system in a small environment (e.g., for emergency communications or at a conference) is not easy. P2P Internet telephony using SIP [69], [70], [71], [72], [73] has been proposed to avoid the maintenance and configuration cost of the client-server architecture in SIP, and to prevent catastrophic failures of server-based systems. The IETF has established a working group (i.e. P2PSIP) that develops standards-track specifications for P2P SIP. This effort is based on the use of the REsource LOcation And Discovery (RELOAD) Base Protocol [74], a P2P signalling protocol. The P2P signalling protocol provides the network nodes that form an overlay network with abstract storage, messaging, and security services. In this sub-section, we provide a brief introduction to the P2P SIP telephony architectures of the study [35], focusing on the different P2P SIP architectural implementations. There are two approaches to combining SIP and P2P: replace the SIP location service by a P2P protocol (SIP-using-P2P) [69], [72], and, additionally, implement the P2P protocol itself using SIP messaging (P2P-over-SIP).
In the first case, P2P is used only for lookups and updates of SIP users’ IP addresses. A scalable and global P2P location service automatically makes the SIP lookups scalable. In the second case, the P2P maintenance protocol can further exhibit two modes: (1) tunnel the P2P protocol messages in SIP, e.g., as a message body or headers, or (2) reuse the semantics of some of the SIP messages and headers to convey proximity and location information. The two approaches are fundamentally similar, because there is a clear separation between the DHT layer and the SIP layer, as shown in Figure 2.6. The difference is that in P2P-over-SIP, the P2P maintenance protocol is also implemented using SIP. The two architectures compare as follows. In the SIP-using-P2P architecture, the system can use the optimisations and enhancements made in the external DHT. For example, the message overhead of DHT maintenance can be reduced. However, the algorithmic overhead in terms of the number of messages remains the same and depends on the particular DHT (e.g., Chord) in use. Some SIP-specific timers (e.g., the retransmission timeout) may not be acceptable for some DHT-based applications, especially if the timers translate into long DHT lookup and update latency.

Figure 2.6. Difference between SIP-using-P2P and P2P-over-SIP architectures [35]

In the SIP-using-P2P architecture, the node needs to implement the particular DHT connector. If multiple DHTs can be used then such implementations need to potentially implement all such DHT connectors.

Today, there are multiple P2P protocols that do not interoperate and are not meant to interoperate (e.g., Kademlia, Chord, OpenDHT [75]). Moreover, there is no single protocol or mechanism to talk to any DHT. Thus, the P2P-over-SIP architecture gives us an opportunity to build such an interface using SIP. Using SIP to build the DHT allows us to reuse the existing naming, routing, and security mechanisms of SIP. Moreover, the NAT and firewall traversal mechanisms in SIP can also be used to allow a node behind a NAT to become a super-node. In addition, SIP features such as redirect and proxy modes are readily reusable in a DHT’s iterative and recursive modes. Finally, we can transparently reuse existing SIP-based components such as voicemail and conferencing servers without requiring them to understand the DHT protocol in order to update the DHT to indicate that they provide the service.

2.4.4 Summary

VoIP has been an active area of research and development in the past decades, with a number of companies providing Personal Computer (PC)-to-PC and PC-to-phone calls. Their objective is mainly to provide low-cost call service to the Public Switched Telephone Network (PSTN) from the public Internet. VoIP system architecture can be either client-server or P2P. P2P VoIP has the benefit of avoiding the high maintenance and configuration cost of the client-server architecture, and of preventing the single-point-of-failure problem in server-based systems. In this section, we discussed the important performance parameters of P2P VoIP systems. We then presented two architectures of well-known P2P VoIP systems, namely Skype and P2P SIP networks. The remarkable impact of the Skype system and, consequently, the intense research effort in the field of P2P VoIP have motivated this work.

2.5 Relay Path Selection

In this section, we introduce several relay path selection schemes that influence our research work, including the Resilient Overlay Network (RON) [76], Detour [77], Path Diversity with Forward Error Correction System (PDF) [78], AS-aware Peer relay Protocol (ASAP) [56], and Early Divergence Rule (EDR) [8], [9].

2.5.1 Resilient Overlay Network

RON [76] is an architecture that allows distributed Internet applications to detect and recover from path outages and periods of degraded performance within several seconds, improving on today's BGP routing, which takes at least several minutes to recover. A RON is an application-layer overlay on top of the existing Internet routing substrate. The RON nodes monitor the functioning and quality of the Internet paths among themselves, and use this information to decide whether to route packets directly over the Internet or to relay them via other RON nodes, optimising application-specific routing metrics. In the evaluation reported in [76], RON's routing mechanism was able to detect, recover from, and route around all of the observed outages in less than twenty seconds on average, showing that its methods for fault detection and recovery work well at discovering alternate paths in the Internet. Furthermore, RON was able to improve the loss rate, latency, or throughput perceived by data transfers.

RON nodes exchange information about the quality of the paths among themselves via a routing protocol and build forwarding tables based on a variety of path metrics, including latency, packet loss rate, and available throughput. Each RON node obtains the path metrics using a combination of active probing experiments and passive observations of on-going data transfers. Each RON is explicitly designed to be limited in size - between two and fifty nodes - to facilitate aggressive path maintenance via probing without excessive bandwidth overhead. This allows RON to recover from problems in the underlying Internet in several seconds rather than several minutes.

Beyond fast recovery, the second goal of RON is to integrate routing and path selection with distributed applications more tightly than is traditionally done. This integration includes the ability to consult application-specific metrics in selecting paths, and the ability to incorporate application-specific notions of what network conditions constitute a "fault."

The third goal of RON is to provide a framework for the implementation of expressive routing policies, which govern the choice of paths in the network. For example, RON facilitates classifying packets into categories that could implement notions of acceptable use, or enforce forwarding rate controls.

Path Evaluation and Path Selection in RON

RON routers need an algorithm to determine whether a path is still alive, and a set of algorithms with which to evaluate potential paths. The responsibility of these metric evaluators is to provide a number quantifying how "good" a path is according to that metric. These numbers are relative, and are only compared to other numbers from the same evaluator. The two important aspects of path evaluation are the mechanism by which the data for two links are combined into a single path, and the formula used to evaluate the path.

Every RON node implements outage detection, which it uses to determine whether the virtual link between it and another node is still working by means of an active probing mechanism. On detecting the loss of a probe, the normal low-frequency probing is replaced by a sequence of consecutive probes, sent in relatively quick succession spaced by PROBE_TIMEOUT seconds. If OUTAGE_THRESH probes in a row elicit no response, then the path is considered "dead". If even one of them gets a response, then the subsequent higher-frequency probes are cancelled. Paths experiencing outages are rated on their packet loss rate history; a path having an outage will always lose to a path not experiencing an outage. The OUTAGE_THRESH and the frequency of probing (PROBE_INTERVAL) permit a trade-off between outage detection time and the bandwidth consumed by the (low-frequency) probing process.

RON does not attempt to find optimal throughput paths, but strives to avoid paths of low throughput when good alternatives are available. From the standpoint of improving the reliability of path selection in the face of performance failures, avoiding bad paths is more important than optimising to eliminate small throughput differences between paths.
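The outage detection procedure described above can be sketched in a few lines. This is only an illustrative sketch, not RON's implementation: the numeric values of PROBE_TIMEOUT and OUTAGE_THRESH, the `send_probe` callback, and the `scripted` test helper are all assumptions introduced here, and the sketch assumes that only the follow-up probes (not the initially lost probe) are counted against OUTAGE_THRESH.

```python
from typing import Callable, Iterable

PROBE_TIMEOUT = 3.0   # seconds between follow-up probes (assumed value)
OUTAGE_THRESH = 4     # consecutive unanswered probes before "dead" (assumed value)


def path_is_dead(send_probe: Callable[[float], bool],
                 outage_thresh: int = OUTAGE_THRESH,
                 probe_timeout: float = PROBE_TIMEOUT) -> bool:
    """Run after a regular low-frequency probe has been lost.

    send_probe(timeout) is a hypothetical callback that sends one probe and
    returns True if a response arrives within `timeout` seconds.
    """
    for _ in range(outage_thresh):
        if send_probe(probe_timeout):
            # A single response cancels the remaining high-frequency probes.
            return False
    # outage_thresh probes in a row elicited no response.
    return True


def scripted(responses: Iterable[bool]) -> Callable[[float], bool]:
    """Test helper: a fake probe function that replays scripted outcomes."""
    it = iter(responses)
    return lambda timeout: next(it)


print(path_is_dead(scripted([False, False, False, False])))  # True
print(path_is_dead(scripted([False, True])))                 # False
```

A larger OUTAGE_THRESH makes the detector less prone to declaring an outage on transient loss, at the cost of a longer detection time.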

Oscillating rapidly between multiple paths is harmful to applications sensitive to packet reordering or delay jitter. While each of the evaluation metrics applies some smoothing, this is not enough to avoid "flapping" between two nearly equal routes; RON routers therefore employ hysteresis. Based on an analysis of 5000 snapshots from a RON node's link-state table, the RON designers chose to apply a simple 5% hysteresis bonus to the "last good" route for the three metrics. This simple method appears to provide a reasonable trade-off between responsiveness to changes in the underlying paths and unnecessary route flapping.
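The effect of the hysteresis bonus can be illustrated with a small sketch. How RON applies the bonus internally is not detailed here, so the sketch makes an assumption: for a lower-is-better metric such as latency, a challenger replaces the "last good" route only if it is more than 5% better. The function name and route labels are hypothetical.

```python
from typing import Dict, Optional

HYSTERESIS = 0.05  # the 5% bonus granted to the "last good" route


def choose_route(latency_ms: Dict[str, float],
                 last_good: Optional[str]) -> str:
    """Pick a route by latency, favouring the incumbent by HYSTERESIS."""
    best = min(latency_ms, key=latency_ms.get)
    if last_good is None or last_good not in latency_ms:
        return best
    # Switch only if the challenger beats the incumbent by more than 5%.
    if latency_ms[best] < latency_ms[last_good] * (1 - HYSTERESIS):
        return best
    return last_good


# 97 ms is only 3% better than 100 ms, so the incumbent is kept.
print(choose_route({"direct": 100.0, "via_B": 97.0}, "direct"))  # direct
# 90 ms is 10% better, so the route changes.
print(choose_route({"direct": 100.0, "via_B": 90.0}, "direct"))  # via_B
```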

2.5.2 Detour

Detour [77] is an overlay framework proposed for implementing path diversity. It is described as a virtual Internet, in which routers "tunnel" packets over the public Internet instead of using dedicated links. This design allows easy deployment of experimental infrastructure and, unlike dedicated network testbeds, is subject to real Internet traffic loads. Detour is composed of a set of geographically distributed router nodes interconnected using tunnels. A tunnel can be thought of as a virtual point-to-point link. Each packet entering a tunnel is encapsulated into a new IP packet and forwarded through the Internet until it reaches the tunnel's exit point. This same mechanism has previously been used to form the multicast backbone (MBone) and the experimental IPv6 backbone (6BONE). Tunnels are useful because they allow new routing functionality to be prototyped while using the existing network infrastructure.

A host wishing to use the Detour network directs its outbound traffic to the nearest Detour router. Its packets are forwarded along tunnels within the Detour network and exit at a point close to the destination. In order for responses to return in the same fashion, the system must perform network address translation, so that the source address of the packet reflects the exit router and not the actual source. This complication is a necessary consequence of using tunnels to superimpose a new routing framework.

The Detour architecture is illustrated in Figure 2.7. It is important to notice that Detour routers are edge devices and do not appear in the core of the network. By controlling routing and congestion at the edge of the network, the Detour system expects to achieve sufficient control while avoiding the potential problems of supporting per-flow processing at the high traffic bandwidths found in the core of the network. Detour routers can exchange information about the measured latency, drop rate and bandwidth available along their tunnels. Detour also enables the use of dynamic multi-path routing to load-balance the system automatically and to avoid congestion before it occurs, by randomly assigning flows to good paths and by dynamically varying how traffic is spread across such paths. Furthermore, the Detour architecture allows routers to classify traffic and select the routing policy best suited to the needs of each traffic class.

Figure 2.7. Architecture of the Detour virtual Internet [77]

In summary, the Detour system enables multipath routing at the router level, whereas the other P2P relay path selection schemes considered in this section accomplish this task at the application level.

2.5.3 PDF

The Path Diversity with Forward Error Correction (PDF) system [78] has been proposed for a single sender and a single receiver communicating over packet-switched networks such as the Internet. The PDF system is similar to RON [76] in that it consists of a set of participating nodes that receive and forward packets to other nodes. However, PDF forwards packets simultaneously over multiple redundant paths rather than selecting a single optimal path on which to send all the packets. The central question for PDF in the one-sender, one-receiver scenario with path diversity is whether there exist sufficiently disjoint paths between a sender and a receiver on the Internet to result in uncorrelated loss patterns between the paths. If the paths are not entirely disjoint, then the probability of congestion on the links shared between the paths must be small in order to minimise the overall end-to-end loss. The PDF project has further characterised the disjointness between the redundant and default paths for various Internet topologies, and shows a significant reduction in packet loss using the PDF system compared with a single-path scheme.

To set up a communication channel between the sender and the receiver, the sender first executes traceroute from itself to all the participating nodes and to the receiver. The information returned by traceroute includes the link latencies and the names of the routers along the default path from the sender to the receiver and along all paths between the sender and the participating nodes. The sender also sends a setup packet to all participating nodes, instructing them to execute traceroute from themselves to the receiver. Next, all the participating nodes send to the sender the path information between themselves and the receiver obtained from traceroute. The sender now has the names of the routers and their associated link latencies for the default path, the paths between itself and all participating nodes, and the paths between the participating nodes and the receiver.
This information is used to compute the optimal redundant path. After the redundant path via a chosen relay node is selected, the sender sends a setup packet to the selected relay node, instructing it to forward packets to the receiver on behalf of the sender from then onward. The setup packet contains a flow ID and the IP address and port number of the receiver. Upon receiving the setup packet, the relay node stores an entry containing the flow ID, the IP address, and the port number of the receiver in a table. This table is used to forward packets on behalf of different senders to their receivers. Each packet sent from a different sender to the relay node contains a different flow ID in its header. The relay node forwards a packet to the right IP address and port number of the receiver based on its flow ID. Note that all the setup messages and executions of traceroute are done only at the start of the session.

In delay-sensitive applications, the PDF system chooses to use only one relay node to forward packets between a sender and a receiver. The reason is that the delay of a redundant path via multiple participating nodes in series is likely to be larger than that of a redundant path via only one node. However, PDF can easily be extended to the case where there are multiple redundant paths via multiple relay nodes. Intuitively, the path selection scheme first finds a set of redundant paths that are as disjoint as possible from the default path. Within this set of redundant paths, it then selects the one that results in minimum latency. An alternative might seem to be to select the redundant path based on the traffic characteristics of each link along the path between two nodes. However, PDF does not have knowledge of the loss rates and bandwidths of individual links; thus, it chooses the redundant path to be maximally disjoint from the default Internet path so that their losses are uncorrelated. This makes the PDF system effective in minimising the packet loss rate.

The PDF system does not aggressively send out probing packets to monitor the current loss rates and bandwidths between participating nodes. Instead, it simply uses the route information between participating nodes, and only invokes traceroute at the start of the session.
Hence, the solution is scalable, even though its performance is potentially lower than that of RON with aggressive probing for network conditions. One of the drawbacks of PDF system is that its performance depends on the information provided by traceroute. This information can be incomplete or inaccurate. For example, traceroute can only differentiate between routers and not switches.

Another drawback is that some ASs do not report accurately, or deliberately hide, information about their networks; only certain routers are visible to the outside and, therefore, complete information is not available.
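The two-stage selection described in this subsection (maximal disjointness first, minimum latency second) can be sketched as follows. This is a minimal sketch under stated assumptions rather than the PDF implementation: paths are represented as router-name lists such as traceroute might return, disjointness is measured as the number of directed links shared with the default path, and all router names, relay names and latencies are hypothetical.

```python
from typing import Dict, List, Tuple


def shared_links(path: List[str], default_path: List[str]) -> int:
    """Number of directed links that `path` has in common with the default path."""
    default = set(zip(default_path, default_path[1:]))
    return len(default & set(zip(path, path[1:])))


def select_redundant_path(default_path: List[str],
                          candidates: Dict[str, Tuple[List[str], float]]) -> str:
    """Return the relay whose path is maximally disjoint, then lowest latency.

    candidates maps relay name -> (router list sender..relay..receiver,
    end-to-end latency in ms). Raises ValueError if candidates is empty.
    """
    overlap = {relay: shared_links(routers, default_path)
               for relay, (routers, _) in candidates.items()}
    fewest = min(overlap.values())
    # Stage 1: keep only the most disjoint redundant paths.
    most_disjoint = [r for r, n in overlap.items() if n == fewest]
    # Stage 2: among those, pick the minimum-latency path.
    return min(most_disjoint, key=lambda r: candidates[r][1])


default = ["S", "r1", "r2", "r3", "D"]
candidates = {
    "A": (["S", "r1", "a1", "A", "a2", "D"], 80.0),   # shares link S-r1
    "B": (["S", "b1", "B", "b2", "D"], 120.0),        # fully disjoint
    "C": (["S", "c1", "C", "c2", "c3", "D"], 95.0),   # fully disjoint
}
print(select_redundant_path(default, candidates))  # C
```

Relay A has the lowest latency but is rejected in the first stage because it shares a link with the default path, which matches the priority given to disjointness over delay.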

2.5.4 AS-Aware Peer-relay Protocol

S. Ren et al. have proposed the AS-Aware Peer-relay Protocol (ASAP) in [56]. ASAP shows that, with knowledge of the AS topology, P2P VoIP systems can yield better voice call performance than AS-unaware systems such as Skype. It proposes a complete solution for building such a large VoIP overlay system. The design of ASAP is based on the following Internet properties.

(1) In general, peer nodes with the same IP prefix are relatively close to each other [79]. The collection of all peers with the same IP prefix forms a cluster. The direct IP routing latency between two peers in two different clusters can be estimated by the direct IP routing latency between any pair of nodes in their corresponding clusters.
(2) With publicly available BGP tables and updates (e.g., [80], [81]), an up-to-date annotated AS graph can be built.
(3) The number of AS hops and the latency of a direct IP routing path are correlated, and paths with more AS hops are likely to have longer latency [82].
(4) An Internet AS-level direct IP routing path usually has the valley-free property [83].

ASAP defines three types of nodes: bootstraps, cluster surrogates, and normal end hosts. The ASAP system structure is illustrated in Figure 2.8. The operations of the node types are as follows.

Bootstraps are normally powerful, dedicated, and always-on servers used to process VoIP user logins and nodes' join requests. They provide the following functions and services to make the entire system informative and intelligent:

1. Build an annotated up-to-date AS graph.
2. Build an IP prefix to cluster surrogate IP mapping table and an IP prefix to AS number (ASN) mapping table. Upon the join request of a new node, translate the node IP to its ASN and its cluster surrogate IP, and return the ASN and the cluster surrogate IP to the new node. Note that an AS can have multiple IP prefixes.
3. Disseminate the AS graph to surrogates, so that every surrogate has an up-to-date AS graph.
4. Select new surrogates for clusters upon surrogate failures.
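The join-request translation performed by a bootstrap (node IP to ASN and cluster surrogate IP) can be illustrated with Python's standard `ipaddress` module. This is a hedged sketch, not ASAP's code: the table contents use documentation-only address ranges and AS numbers, the function name `handle_join` is hypothetical, and resolving overlapping prefixes by longest-prefix match is an assumption made here.

```python
import ipaddress
from typing import Dict, Optional, Tuple

# Hypothetical bootstrap table: IP prefix -> (ASN, cluster surrogate IP).
# Two prefixes map to the same ASN, since an AS can have multiple IP prefixes.
PREFIX_TABLE: Dict[ipaddress.IPv4Network, Tuple[int, str]] = {
    ipaddress.ip_network("192.0.2.0/24"): (64500, "192.0.2.10"),
    ipaddress.ip_network("198.51.100.0/24"): (64501, "198.51.100.20"),
    ipaddress.ip_network("198.51.100.128/25"): (64501, "198.51.100.200"),
}


def handle_join(node_ip: str) -> Optional[Tuple[int, str]]:
    """Translate a joining node's IP to (ASN, cluster surrogate IP).

    Returns None when no known prefix covers the address.
    """
    addr = ipaddress.ip_address(node_ip)
    matches = [net for net in PREFIX_TABLE if addr in net]
    if not matches:
        return None
    # Assumption: the most specific (longest) matching prefix defines the cluster.
    best = max(matches, key=lambda net: net.prefixlen)
    return PREFIX_TABLE[best]


print(handle_join("192.0.2.55"))      # (64500, '192.0.2.10')
print(handle_join("198.51.100.130"))  # (64501, '198.51.100.200')
print(handle_join("203.0.113.1"))     # None
```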

[Figure: ASAP system structure, showing bootstraps (Bootstrap1, Bootstrap2), surrogates (SA, SB), clusters A, B and C, and end hosts h1, h2 and h3. Bootstrap's data structure: Internet AS graph, IP prefix to cluster surrogate IP table, IP prefix to ASN table. Cluster surrogate's data structure: Internet AS graph, cluster's close cluster set, cluster's top node table.]

Figure 2.8. ASAP system structure [56].

Surrogates are powerful, stable nodes with high-bandwidth network connections within a cluster. These nodes volunteer themselves to provide the following services:

1. Maintain the list of IP addresses of all end hosts in their clusters.
2. Periodically contact bootstrap nodes to retrieve the up-to-date annotated AS graph.
3. Periodically run an algorithm to construct close cluster sets for their clusters (the close cluster set of a cluster consists of those clusters whose end hosts have short direct IP routing latencies to any end host in this cluster).
4. Process close cluster set requests from other end hosts in their clusters.
5. Accept nodal information¹ from other end hosts in their clusters. If there are better end hosts, recommend the better end hosts as new surrogates, become normal end hosts, and notify the bootstraps and the other end hosts in their clusters of the changes.

End hosts are the millions of callers and callees in ASAP, and they have the following light duties:

1. Get their ASNs and the surrogate IP addresses of their clusters from bootstraps.
2. Become surrogates in their clusters, if they are the only nodes in their clusters.
3. Periodically publish their nodal information to their surrogates.
4. Run an algorithm for selecting relay peers when they initiate VoIP calls.

From the perspective of selecting relay paths, ASAP end hosts use an algorithm called 'select-close-relay' (hereinafter referred to as the ASAP algorithm), which typically chooses the closest peers for relaying voice traffic. This means that ASAP prefers the shortest relay paths. To compute suitable paths for relaying voice traffic, ASAP nodes use information from both the source and the destination nodes. Since this scheme does not take path diversity into account, ASAP relay paths may overlap with the default path in a number of hops.
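The preference for short relay paths can be illustrated with a deliberately simplified sketch. It is not the select-close-relay algorithm of [56], whose details are not reproduced here; it only assumes that "closeness" is the sum of the estimated latency from the source to a relay cluster and from that cluster to the destination, and that only clusters known to both endpoints are considered. All names and values are hypothetical.

```python
from typing import Dict, List


def pick_close_relays(src_latency_ms: Dict[str, float],
                      dst_latency_ms: Dict[str, float],
                      k: int = 1) -> List[str]:
    """Return the k relay clusters with the smallest source+destination latency."""
    # Use information from both endpoints: only clusters close to both qualify.
    common = src_latency_ms.keys() & dst_latency_ms.keys()
    ranked = sorted(common,
                    key=lambda c: (src_latency_ms[c] + dst_latency_ms[c], c))
    return ranked[:k]


src = {"clusterA": 20.0, "clusterB": 35.0, "clusterC": 15.0}
dst = {"clusterA": 30.0, "clusterB": 10.0, "clusterD": 5.0}
print(pick_close_relays(src, dst))       # ['clusterB']
print(pick_close_relays(src, dst, k=2))  # ['clusterB', 'clusterA']
```

Because the ranking considers latency only, nothing prevents the chosen relay path from sharing hops with the default path, which is the limitation noted above.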

2.5.5 Earliest Divergence Rule

Recently, Fei et al. [8], [9] have developed a scheme for disjoint overlay AS path selection suitable for large P2P VoIP systems, called the earliest divergence rule (EDR). EDR has demonstrated significant improvement in avoiding network degradation. The method assumes that there is a large number of relay candidates in the P2P VoIP system and that the routing decisions of the P2P system are computed using the local knowledge of source nodes only. For example, an end host can send traceroute or ping probes to its surrounding ASs to obtain the AS-level paths as well as latency information. Focusing on AS-level path information has the advantages of greater accuracy and lower overhead, while benefiting from the AS path diversity that results from the valley-free rule [83] in inter-domain routing. Figure 2.9 briefly illustrates EDR operation in an example configuration involving four possible choices of relay node.

1 Nodal information includes bandwidth, continuous online time, node processing power, and other related information.

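[Figure: example configuration with a source (Src), autonomous systems AS1, AS2, AS3 and AS4, four candidate relay nodes (Relay1 to Relay4), and two destinations (Dst1, Dst2).]

Figure 2.9. Illustration of Earliest Divergence Rule [9].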

In EDR, the source knows the paths towards all relay nodes and makes relay selections based on this information. Among the earliest-divergence relay candidates, EDR then chooses the k nodes with the largest allowed delay from the source to the relay node, i.e. on the first hop of the relay path. In Figure 2.9, when the destination is Dst1, EDR selects Relay1, Relay3 and Relay4 as candidate relay nodes, while it selects Relay1 and Relay2 when the destination is Dst2, to maximise path disjointness. An alternate relay path is selected randomly, or based on additional constraints, among those k nodes. The effectiveness of EDR is predicated on the following assumptions:

1. By choosing overlay paths that share fewer ASs, and therefore hopefully fewer physical links, with the default path, the correlation of performance between the paths should be as small as possible.
2. Paths that diverge early from the default path will also tend to converge back on it late, minimising their total overlap [9].
3. By going far from the default path, the likelihood that the relay path merges back into the default path is reduced.

As a result of this preference for long relay paths, EDR often chooses paths which have rather high delay. Such behaviour can affect the quality of a VoIP call, and some better-quality paths might be skipped.
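The selection steps described above can be sketched as follows. This is one possible reading of the rule, not the implementation of [8], [9]: divergence is measured as the number of leading ASs that the source-to-relay AS path shares with the default path, the delay constraint is modelled as a simple upper bound, and the topology below is hypothetical (it is not the configuration of Figure 2.9).

```python
from typing import Dict, List


def common_prefix_len(a: List[str], b: List[str]) -> int:
    """Number of leading ASs shared by two AS paths."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def edr_select(default_path: List[str],
               relay_paths: Dict[str, List[str]],
               relay_delay_ms: Dict[str, float],
               max_delay_ms: float,
               k: int) -> List[str]:
    """Pick up to k relays that diverge earliest from the default AS path."""
    if not relay_paths:
        return []
    shared = {r: common_prefix_len(p, default_path)
              for r, p in relay_paths.items()}
    earliest = min(shared.values())
    # Keep earliest-divergence candidates whose first-hop delay is allowed.
    tied = [r for r, s in shared.items()
            if s == earliest and relay_delay_ms[r] <= max_delay_ms]
    # Prefer the largest allowed source-to-relay delay (long relay paths).
    tied.sort(key=lambda r: (-relay_delay_ms[r], r))
    return tied[:k]


default = ["AS1", "AS2", "AS5"]
paths = {"R1": ["AS1", "AS3"], "R2": ["AS1", "AS2", "AS6"],
         "R3": ["AS1", "AS4"], "R4": ["AS1", "AS4", "AS7"]}
delays = {"R1": 40.0, "R2": 30.0, "R3": 60.0, "R4": 140.0}
print(edr_select(default, paths, delays, max_delay_ms=100.0, k=2))  # ['R3', 'R1']
```

In this example R2 is excluded because it follows the default path through AS2 before diverging, and R4 is excluded by the delay bound; the remaining candidates are ordered by decreasing delay, which reflects the preference for long relay paths criticised above.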

2.5.6 Summary

In this section, we described some relay path selection mechanisms that are closely related to our research work. The shortcomings in the design of their relay path selection methods have motivated us to concentrate on two questions in this dissertation: whether there is a way to combine the EDR and ASAP schemes so that alternate relay paths have small delay while maintaining a relatively small number of overlaps, and how much improvement in performance is gained from such a combination.

CHAPTER 3

AS-LEVEL TOPOLOGY AND SIMULATION DESIGN

3.1 Introduction

In this chapter, we turn our attention to the foundations of our study and the selection of an adequate research methodology. We first focus on the characteristics of the inter-domain routing environment, which motivate our choice of the autonomous system (AS)-level Internet topology, as opposed to the IP-level topology, as the level at which relay path selection algorithms operate. We discuss the advantages and drawbacks of selecting the AS-level topology, including the feasibility and scalability of relay path selection schemes, network efficiency, and path diversity.

Subsequently, we create a framework for simulating and analysing AS topology. Because the problem we look at is the selection of AS relay paths, we concentrate on macroscopic graph simulations, i.e. we overlook network design principles and do not consider network design decisions [84]. Important aspects here include the reproduction of the power-law distribution of node degrees and the network hierarchy observed in Internet topologies, and topology construction techniques that model network structures as annotated graphs, i.e., graphs with different link annotations. Finally, we introduce two simulation packages - a randomly simulated graph generator, and a real AS-level Internet topology simulation. These simulation tools are used as our test-beds throughout the dissertation.

The remainder of this chapter is organised as follows. Section 3.2 reviews existing approaches related to our study. Section 3.3 shows our motivation for selecting AS-level alternate relay paths in voice-over-IP (VoIP) peer-to-peer (P2P) overlay systems. In Section 3.4, we present the design requirements of the simulation and analysis tools for this work. In Section 3.5, we discuss the important aspects of the synthetic AS topologies that influence our AS graph simulator. Sections 3.6 and 3.7 respectively explain our two simulation packages for random graph simulation and the synthetic AS-level Internet. Finally, Section 3.8 summarises our work in this chapter.

3.2 Related Work

One important characteristic of inter-domain routing is that AS paths typically follow the valley-free rule, which was first reported by L. Gao in [83]. These results opened a line of research on the Type of Relationship (ToR) problem, e.g. [85], [86], [87], [88], [89]. AS paths in our simulator are generated with the valley-free pattern. For the synthetic AS topology, we use the CAIDA AS relationship database [90] in our simulations. We also use the research results in [84] to reproduce the actual proportions of AS relationships in our random AS graph simulator.

Several works have developed techniques to decompose the AS topology into different hierarchies based on the connectivity properties of BGP-derived AS graphs. R. Govindan and A. Reddy [91] propose a classification of ASs into four levels based on their AS degree. Z. Ge et al. [92] classify ASs into seven tiers based on inferred customer-provider relationships, i.e. they exploit the idea that provider ASs should be in higher tiers than their customers. L. Subramanian et al. [93] classify ASs into five tiers based on inferred customer-provider as well as peer-peer relationships. X. Dimitropoulos [94] classifies the ASs of the Internet into six types. The last approach is probably the most up-to-date and is now used in the CAIDA project [95].

3.3 AS-level Topology

In the inter-domain environment, ASs form a complete global network. The Border Gateway Protocol (BGP) is deployed on the border routers of ASs to exchange routing information with other BGP routers within the same AS or in neighbouring ASs. BGP belongs to the class of path-vector routing protocols, which means that each route announcement, called a BGP update, carries an AS path that is used to reach the announced destination. An AS path is a list of ASs arranged in the order in which they will be traversed to reach a particular destination.

Through the BGP AS paths, business agreements between ASs are realised. For example, small ASs pay larger ASs to connect to the Internet, whereas large ASs seek to attract customer ASs to increase their revenues. Moreover, ASs may wish to send packets via a certain upstream provider that is less expensive; small ASs typically avoid carrying transit traffic between their providers, since this would overwhelm their links. Thus, AS links reflect not only traffic flows but also money "flows". These business relationships, also known as AS relationships, influence how packets are routed between ASs; they are translated into technical specifications and have a direct impact on how BGP routers are configured [84].

AS relationships can be classified into three categories: customer-provider (C-P), peer-peer (P-P), and sibling-sibling (S-S). In the C-P category, a customer AS pays a provider AS for carrying traffic originated by the customer as well as for delivering traffic destined to the customer. In a P-P relationship, two ASs exchange traffic between their customers but do not exchange traffic from or to their providers or peers. A P-P relationship is typically established between two ASs of similar size and does not involve money exchange. However, the two ASs have mutual incentives to reduce their transit costs by establishing a P-P connection. In contrast to a P-P relationship, two sibling ASs can exchange traffic between their providers, customers, peers, or other siblings. Sibling ASs usually belong to the same organisation or to strongly affiliated organisations [83].

This work focuses on AS-level relay path selection algorithms for the following three reasons. The first is the degree of network information granularity.
In order to cope with large-scale networks, P2P relay path selection algorithms must use selective information. In such an environment, requiring every end host to have full knowledge of the locations (i.e. IP addresses) of a set of neighbouring nodes can be too expensive. In contrast, utilising AS-level location information reduces the state information and monitoring overhead of end hosts. Reference [8] has argued that the advantage of using AS-level path information is that it is relatively easy and accurate to measure, by using traceroute to identify the set of ASs crossed en route to a destination. There have been several studies [96], [97] on how to convert IP-level paths into AS-level paths. Furthermore, even if traceroute returns an incomplete IP-level path, it is still possible to accurately infer the ASs with which the nodes with unknown IP addresses are associated.

Secondly, as the overlay topology differs significantly from the physical network, it often results in zigzag routes over the physical Internet [98], i.e. queries are forwarded back and forth between peers in different locations due to message flooding. Zigzag routes cause unnecessary bandwidth consumption and high delays. The situation is even worse if those zigzag routes are cross-continental paths. There exist proposals to adapt the virtual overlay to the physical network via cross-layer communication, e.g. in [47], [98], [99], or using methods for signalling traffic compression [100], [101]. With AS-aware capability, however, relay path selection algorithms can naturally eliminate the problem of zigzag routing by confining the flood to an AS or a group of close ASs. This helps to reduce zigzag routes significantly, since about 93% of ASs are small regional ISPs or customer networks [94], while large inter-continental ASs, i.e. large backbone providers, are not likely to be the places where peers can be installed.

The third advantage of using AS-level information for routing is that relay paths can benefit from the path diversity characteristic of inter-domain routing. AS relationships result in restrictions on which AS paths can be used to route traffic between a source and a destination. In particular, a valid AS path must have the following hierarchical structure: an uphill segment of zero or more C-P or S-S links, followed by zero or one P-P link, followed by a downhill segment of zero or more provider-to-customer (P-C) or S-S links.
The paths that follow this pattern are called valley-free or valid; this property was first identified by L. Gao in [83]. Based on this valley-free nature of routing paths, T. Fei et al. [8], [9] have demonstrated that using AS-level relay paths increases path disjointness and hence improves service, since disjoint back-up relay paths are likely to avoid failed or degraded links in the default path. On the other hand, S. Ren et al. [56] have shown that the default direct AS paths can typically be lengthened due to the valley-free rule. As a result, one-hop relay routing via multi-homed customer ASs can be used to provide shorter communication paths.

These advantages have motivated us to choose AS-level topologies as the main test bed for the algorithms studied in the dissertation. The main drawback of studying the AS-level topology, however, is that we cannot investigate combined scenarios of intra-domain and inter-domain P2P overlay routing.
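The valley-free pattern defined in this section (an uphill segment, at most one peer link, then a downhill segment) can be checked with a small state machine. The sketch below is illustrative only; the link labels "c2p", "p2p", "p2c" and "s2s" are a notation assumed here for customer-to-provider, peer-peer, provider-to-customer and sibling-sibling links as seen in the direction of travel.

```python
from typing import List


def is_valley_free(links: List[str]) -> bool:
    """Check the pattern (c2p | s2s)* (p2p)? (p2c | s2s)* over a list of link types."""
    downhill = False  # becomes True after a peer link or the first p2c link
    for link in links:
        if link == "s2s":
            continue                 # sibling links are allowed in both segments
        if link == "c2p":
            if downhill:
                return False         # climbing again after descending: a valley
        elif link == "p2p":
            if downhill:
                return False         # at most one peer link, and only at the top
            downhill = True
        elif link == "p2c":
            downhill = True
        else:
            raise ValueError(f"unknown link type: {link}")
    return True


print(is_valley_free(["c2p", "c2p", "p2p", "p2c"]))  # True
print(is_valley_free(["c2p", "p2c", "c2p"]))         # False
print(is_valley_free(["p2p", "p2p"]))                # False
```

The second example is rejected because traffic would transit a customer AS between two of its providers, which is exactly the behaviour that small ASs avoid.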

3.4 The Design Requirements of Simulation and Analysis Tools

The next important task is to select a suitable research methodology. Under our research conditions, simulation is the most suitable choice. However, choosing the right simulation is not trivial. The research community has made significant advances in developing topology generators for Internet simulations [102], some of which can create networks with locality and hierarchy loosely based on the structure of the current Internet. One of the main problems is that network behaviour simulations have not yet tackled the large-scale nature of network topology and protocols. For example, the PlanetLab topology [103] comprises only about 500 sites, which certainly cannot substitute for the real AS network. More discussion of the difficulties in simulating the Internet can be found in [104]. Therefore, in this work, we decided to study the relay path selection algorithms in two simulated environments: randomly generated graphs and a simulated topology of the real AS-level Internet. These give us opportunities to work with different network sizes and configurations. The major requirements for such a simulation are:

• It must be able to simulate large-scale networks¹. Specifically, the simulation should be capable of generating graphs of up to several hundred thousand nodes, i.e. similar in size to the current AS-level Internet topology.
• Its routing must be able to reflect the valley-free characteristic of BGP in practical inter-domain routing, i.e. the ToR graph problem.
• The structure of the simulated graphs should be scale-free, reproducing the actual network hierarchy (cf. Section 3.5.1).
• The ToR quantities should be similar to the inferred AS relationship statistics of the Internet (cf. Section 3.5.2).

Clearly, random graph generators such as GT-ITM [105], [106] or Tier [107] are not suitable for our research, as they have not been designed for simulating a valley-free routing environment and their graphs are not scale-free. On the other hand, the BGP model in the SSFnet simulator [108], the most widely used BGP simulator, exhibits considerable memory demand, thereby preventing simulations larger than a few hundred BGP routers. BGP++ [109] also suffers from high memory demand, as it contains features that are unnecessary for our research. Two AS graph generators more relevant to our work are described by X. Dimitropoulos [89], [110] and R. Cohen [111]. Unfortunately, these are not yet available for public use. Therefore, we decided to build our own AS graph simulator that meets the above design requirements.

3.5 Scale-Free, Hierarchical and ToR Networks

1 The internal structure of a node is highly abstracted, as we do not consider intra-AS routing. This greatly reduces the memory requirement and shortens the computation time of a simulation.

3.5.1 Scale-Free Model and Network Hierarchy

The design of our AS graph simulator is influenced by work on growing scale-free graphs. This network characteristic was identified by the three Faloutsos brothers [112] from an analysis of the Internet backbone. They reported that the degree distribution of Internet nodes follows a power law. Subsequently, A. L. Barabasi and R. Albert developed the Scale-Free or Power-Law network model, which can be described as follows [113]:

1. The network grows in time.
2. A new node joining the network has preferences regarding the nodes to which it wants to be connected. This preferential attachment is modelled in the following way: each new node $i$ wants to connect to $m_0$ other nodes that are already in the network. The probability $\Pi_t(j)$ that some old node $j$ gets one of the $m_0$ edges is proportional to its current degree $k_t(j)$ at time $t$:

$$\Pi_t(j) = \frac{k_t(j)}{\sum_{v \in V_t} k_t(v)} = \frac{k_t(j)}{2 m_t} \qquad (3.1)$$

with $V_t$ being the set of nodes and $m_t$ the number of edges in the graph at time $t$. Thus, the network model works as follows:

1. Begin with a small network of at least $m_0$ nodes and some edges.
2. In each simulation step, add one node. For each of its $m_0$ edges, draw one of the nodes j that are already in the graph, each with probability $\Pi_t(j)$.

Albert and Barabási later show in [114] that, in each time step, the probability of an edge being attached to a node is proportional to the degree of that node. Thus, it is sufficient for a network model to exhibit this preferential attachment in order to generate scale-free networks. We apply this property in our simulator when creating random annotated AS graphs. Specifically, in each algorithm step, each node in our simulated growing graph may connect to another with a probability that is proportional to its current degree (cf. Section 3.6). This ensures that our AS graph simulator generates scale-free graphs.

Another important aspect of a graph simulation is that it should exhibit a hierarchical structure that is close to the Internet configuration. This network feature has been extensively studied in order to understand the topological structure of the AS connectivity graph (cf. Section 3.2). In this work, we divide our simulated graphs into five tiers. This division is close to the five-tier hierarchical network proposal of Subramanian et al. [93]. This hierarchy is simpler to implement than the network classifications of X. Dimitropoulos [94]. Since our simulated graphs are randomly generated, detailed AS taxonomy information has no further bearing on the problem studied.
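The preferential attachment rule of Equation (3.1) can be illustrated with a minimal Python sketch. It is not the SGB-based generator of this thesis: the function names `grow_scale_free` and `degrees`, the fully connected seed network of $m_0 + 1$ nodes, and the choice to draw distinct targets for each new node are assumptions made for illustration only.

```python
import random


def grow_scale_free(n_nodes, m0, seed=0):
    """Grow an undirected graph by degree-proportional attachment."""
    rng = random.Random(seed)
    # Assumed seed network: a complete graph on m0 + 1 nodes.
    edges = [(u, v) for u in range(m0 + 1) for v in range(u + 1, m0 + 1)]
    # Every edge contributes both of its endpoints, so node j occurs
    # k_t(j) times in this list and a uniform draw selects j with
    # probability k_t(j) / (2 * m_t), as in Equation (3.1).
    endpoints = [node for edge in edges for node in edge]
    for new in range(m0 + 1, n_nodes):
        targets = set()
        while len(targets) < m0:  # assumption: m0 distinct targets
            targets.add(rng.choice(endpoints))
        for target in sorted(targets):
            edges.append((new, target))
            endpoints.extend((new, target))
    return edges


def degrees(edges):
    """Return a dict mapping each node to its degree."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg
```

Keeping a flat list of edge endpoints avoids recomputing the normalising sum in every step, since the list length is always twice the number of edges.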

3.5.2 ToR Graphs

As mentioned in Section 3.3, a connection between two ASs can be one of the types C-P, P-C, P-P or S-S. However, actual data on the global hierarchical structure of the Internet are very difficult to collect due to the closed information policies of ISPs. The ToR problem is a line of research that tries to address this difficulty by inferring and annotating the ToRs of AS connections. A typical procedure for inferring the ToRs is as follows [115]:

1. Extract all AS links from one or several BGP database snapshots, such as RouteViews [80] or RIPE [81].
2. Apply heuristics to infer the C-P and P-C relationships, and annotate AS links.
3. Apply heuristics to infer P-P relationships, and annotate AS links (possibly overriding relationships inferred in step 2).
4. Fix suspicious-looking inferred relationships (e.g., a low-degree AS acting as provider to a high-degree AS).
5. Infer S-S connections (i.e. connections of ASs belonging to the same organisation) from WHOIS, and annotate AS links (possibly overriding previous relationship annotations).

In this work, we use the ToR-annotated AS topologies created by CAIDA [90] and convert them into the SGB format [116] used in our simulation. These topologies serve as our main test-bed for the studied algorithms (cf. Chapter 5). They also allow us to validate the results derived from simulated graphs (cf. Chapter 4). For the randomly simulated AS graphs, besides compliance with the scale-free model and hierarchy (cf. Section 3.5.1), we try to make our simulator generate numbers of ToR links similar to the actual percentages in the Internet statistics. Table 3.1 lists the statistics of inferred AS relationships of the Internet reported in [115].

Table 3.1. Statistics of inferred AS relationships

|                 | C-P links | P-P links | S-S links | Total  |
|-----------------|-----------|-----------|-----------|--------|
| Number of links | 34,552    | 3,553     | 177       | 38,282 |
| Percentage      | 90.26%    | 9.28%     | 0.46%     | 100%   |

3.6 AS Graph Generator - A Simulation Package

In order to evaluate methods for relay path selection in different network configurations, we have implemented the relay path selection algorithms in simulation. The Stanford GraphBase (SGB) package [116] has been chosen as it is flexible and powerful software that allows simulation of graphs with hundreds of thousands of nodes. SGB is provided as an integral part of the Network Simulator ns-2 [117]. The advantage of basing our AS graph generator on SGB is that we are able not only to generate large topologies but also to use the powerful SGB tools for manipulating those topologies. These tools allow us to associate desired network parameters such as bandwidth, delay, loss, load and cost with these topologies and to solve different traffic engineering problems. It is also rather easy to convert SGB settings to ns-2 format for the study of dynamic network scenarios.

Following the valley-free rule of BGP routing paths, we have developed a program to generate random annotated AS graphs in which a node represents an AS and a link carries a type of relationship (ToR) between ASs. We try to make the simulated graphs similar to the real AS-level Internet in terms of size, hierarchy, degree and the relationships among ASs. The number of nodes is random but lies roughly in the range of 15,000 to 35,000. The nodes are divided into five hierarchical tiers according to the guidelines from [93], [111], in which Tier 1 ASs are fully connected to each other, and each node in a lower layer connects to its corresponding provider in the layer above, constituting a kind of skeleton for the graphs. The latter ensures the reachability of a node, as every non-Tier 1 node has at least one provider. This setting is necessary because, in the policy routing environment we are simulating, AS paths are valley-free. If ToRs were assigned randomly, a source node might not belong to any permitted path (with respect to the adopted policy) to a specific destination even though it has physical connections to other nodes. The graph skeleton assignment is illustrated in Figure 3.1.

[Figure 3.1: tiered skeleton graph with node rows T1 (top), T2, T3, S4 and S5 (bottom). Legend: Peer - Peer relationship; Provider - Customer relationship.]

Figure 3.1. Graph skeleton to make sure of node reachability.

We utilise the node naming system of the Transit-Stub random graph generator of GT-ITM [106] to assign node names at each hierarchical level of the AS graphs, i.e. Stub ASs at levels 5 and 4, Transit ASs at level 3, and top-tier Transit ASs at levels 1 and 2, as shown in Table 3.2. The name of each node is assigned as soon as it is created during the process of generating the AS graph. This naming system later allows us to assign relay peers to a specific network layer in order to investigate relay traffic characteristics (cf. Section 4.5).

Table 3.2. AS node name notations

| AS categories     | Notations*   |
|-------------------|--------------|
| Transit AS Tier 1 | T1:x.0       |
| Transit AS Tier 2 | T2:x.y       |
| Transit AS Tier 3 | T3:x.y/a.0   |
| Stub AS Tier 4    | S4:x.y/a.b   |
| Stub AS Tier 5    | S5:x.y/a.b/c |

* Note: x, y, a, b, c ∈ N
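As a small illustration of how the notation in Table 3.2 supports assigning relay peers to a given layer, the following Python sketch reads the tier from a node name and draws relay peers from one tier. The helper names `tier_of` and `pick_relays_in_tier` are hypothetical and are not part of the SGB-based simulator.

```python
import random


def tier_of(name):
    """Return the tier number encoded in a name such as 'T3:1.2/4.0'."""
    prefix = name.split(":", 1)[0]  # e.g. 'T3' or 'S5'
    return int(prefix[1:])


def pick_relays_in_tier(names, tier, count, seed=0):
    """Randomly choose up to `count` relay peers among nodes of one tier."""
    rng = random.Random(seed)
    pool = sorted(n for n in names if tier_of(n) == tier)
    return rng.sample(pool, min(count, len(pool)))
```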

We then follow the power-law growing graph model [113] to add additional links to the graphs as follows. The probability of a new node connecting to node i of degree $d_i$ in layer l of the graph hierarchy is proportional to its degree:

$$P(i) = K_l \frac{d_i}{\sum_j d_j} \qquad (3.2)$$

where $\sum_j d_j$ is the total degree of all current nodes, and $K_l$ is a constant chosen for each layer of the graphs (l = 2, 3, 4, 5).
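To make Equation (3.2) concrete, the sketch below evaluates the layer-weighted connection probability for a toy set of nodes. The node names, degrees and $K_l$ values are placeholders chosen for illustration; they are not the constants tuned in our experiments.

```python
def attachment_probabilities(degree, layer, k_layer):
    """Return P(i) = K_l * d_i / sum_j d_j for every existing node i."""
    total_degree = sum(degree.values())
    return {
        node: k_layer[layer[node]] * d / total_degree
        for node, d in degree.items()
    }


# Hypothetical toy input: three nodes in layers 2, 3 and 4.
degree = {"T2:0.1": 6, "T3:0.1/0.0": 3, "S4:0.1/0.1": 1}
layer = {"T2:0.1": 2, "T3:0.1/0.0": 3, "S4:0.1/0.1": 4}
k_layer = {2: 1.0, 3: 0.5, 4: 0.2}  # placeholder constants
probabilities = attachment_probabilities(degree, layer, k_layer)
```

Note that when the $K_l$ values differ from 1, the resulting values need not sum to one over all nodes.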

By adjusting the value of $K_l$ in each layer through thousands of experiments, we are finally able to generate AS graphs with a degree distribution similar to that of the real Internet. Figure 3.2 illustrates the procedure of adding links following the power law. Figure 3.3 shows examples of 11 simulated graphs generated by our AS graph simulator, named “SGB sim 0-10”. The AS-level Internet topology from [56], which is derived from RouteViews [80] and other BGP routing tables updated on 26 September 2005, is given for comparison.

[Figure 3.2: the tiered skeleton graph of Figure 3.1 (node rows T1, T2, T3, S4, S5) with additional links. Legend: P-P skeleton link; P-C skeleton link; Additional P-C link; Additional P-P link.]

Figure 3.2. Adding additional links by following power-law.

It should be noted that the main purpose of our AS graph generation program is to compare different path selection methods, taking into account the overlap hops (cf. Chapter 4). Thus, the similarity between the simulated and the real AS graphs in the distribution of the lowest node degrees plays the more important role. Through extensive refinements, we approach such similarity for low-degree nodes, as shown in Figure 3.4. Although we achieve a very similar distribution for nodes of degree 3 and upward, a smaller number of 1- and 2-degree nodes is generated (about 18% fewer). This limitation of our AS graph generator affects the performance of the algorithms: generally, the number of overlap hops counted is higher. However, it does not affect our comparison between algorithms, as they are equally affected.

Finally, we assign ToRs to the hierarchical graphs such that ASs in the same layer are connected by a peer-peer relationship while ASs in different layers are connected by a customer-provider relationship. The assignment also adopts the suggestion from [115], so that approximately 90% of the links in the AS graphs are of the type C-P while the rest are of the type peer-peer¹.

¹ The sibling-sibling type has been ignored since it accounts for only a small portion (0.46%) of the relationships in AS graphs.

Figure 3.3. AS degree distribution of the real Internet and example simulated SGB graphs: ‘RouteViews’ - the real Internet degree distribution [56] vs. ‘SGB sim 0-10’ - randomly simulated graphs.

The SGB graph generator contains additional tools, such as tools for hashing to a particular node in a given graph and for computing the shortest path between two nodes, i.e. the Dijkstra algorithm [118]. However, the shortest path routing assumed by default in SGB is not valley-free aware, because the standard SGB graphs do not model AS relationships; this is likely to lead to an unrealistic simulation of inter-domain routing. It is already well known that actual AS paths in the Internet are substantially longer than the shortest paths [119], [120], [121]. In order to simulate practical AS paths, we have modified the Dijkstra algorithm in SGB based on the guideline in [122]. This tool allows us not only to simulate the AS path between any two nodes but also to compute one-hop and two-hop relay AS paths in the subsequent implementation of relay path selection algorithms. As a result, all AS paths in our simulation are valley-free shortest paths. In practice, AS paths may not be the shortest paths because each ISP may set different policies for its inter-domain routers when making routing decisions. However, our modified Dijkstra algorithm helps to simulate the AS path inflation effect observed in reality.
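The valley-free constraint can be illustrated with a short Python sketch. It is not the modified SGB Dijkstra routine used in this thesis: it is a hop-count breadth-first search over (node, phase) states, under the usual reading of the valley-free rule (zero or more customer-to-provider links, then at most one peer-peer link, then zero or more provider-to-customer links). The helper names `link`, `peer` and `valley_free_path` and the relationship labels are hypothetical.

```python
from collections import deque

UP, DOWN = 0, 1  # still climbing towards providers / already past the peak


def link(adj, customer, provider):
    """Add a customer-provider link in both directions."""
    adj.setdefault(customer, []).append((provider, "c2p"))
    adj.setdefault(provider, []).append((customer, "p2c"))


def peer(adj, a, b):
    """Add a peer-peer link in both directions."""
    adj.setdefault(a, []).append((b, "p2p"))
    adj.setdefault(b, []).append((a, "p2p"))


def valley_free_path(adj, src, dst):
    """Return a shortest (in hops) valley-free path, or None."""
    start = (src, UP)
    parent = {start: None}
    queue = deque([start])
    while queue:
        node, phase = queue.popleft()
        if node == dst:
            path, state = [], (node, phase)
            while state is not None:
                path.append(state[0])
                state = parent[state]
            return path[::-1]
        for nxt, rel in adj.get(node, []):
            # Uphill and peer links are only allowed before the peak.
            if rel in ("c2p", "p2p") and phase != UP:
                continue
            new_state = (nxt, UP if rel == "c2p" else DOWN)
            if new_state not in parent:
                parent[new_state] = (node, phase)
                queue.append(new_state)
    return None
```

Tracking the phase as part of the search state is what distinguishes this search from a plain shortest path search: the same AS may be reachable uphill but unusable once the path has started to descend.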

Figure 3.4. CDF of low degree nodes in simulated graphs and the real Internet graph.

While constructing this graph simulator, we also added two tools to the SGB package: the k-Shortest Paths (k-SP) algorithm [123] and the Equal Cost Multi-Paths (ECMP) algorithm [124], based on the guideline in [12]. These SGB modules have been used in the simulations and analyses of H. Agrawal [125]. Furthermore, we have enhanced the standard k-SP algorithm to comply with the valley-free rule. The enhanced k-SP is applied in our newly introduced algorithm for VoIP relay path selection in Chapter 5 of this dissertation. These software programs are available for download from the website [126].

3.7 The Real Internet Configuration

We use the AS topological information from the CAIDA annotated AS network generated on 28 July 2008 [90], which contains 28,531 distinct ASs and 114,976 unidirectional links. Another good source of network information is iPLANE [127]. In our simulations, we map latency information from the Inter-PoP links dataset of [127], dated 05 November 2008, onto the CAIDA AS topology using the following rules. If there is a mapping between two corresponding links in the two data sets, we assign the delay value from iPLANE directly to the CAIDA link. Out of the 114,976 links in the CAIDA topology, 31,136 links find a mapping. The remaining links are assigned delay values taken from random entries in the Inter-PoP links data set.

Intra-AS latency is computed as follows. The latency incurred by traffic passing through an AS is the mean latency of all intra-AS measurements for this AS in the Inter-PoP links dataset. Delay values for 7,593 ASs are obtained this way. If there is no measurement for an AS, we assign the mean latency of the whole iPLANE data set (i.e. 25 ms).

Figure 3.5 compares the distribution of latency values in the two data sets: the Inter-PoP links of iPLANE and the delay values assigned to the CAIDA topology. It can be seen that they have very similar distribution curves, although our CAIDA dataset lacks a few extreme values. We use this CAIDA topology with the assigned delay values in all of the experiments described in the following chapters. Using this test bed, we implement three existing proposals for VoIP relay path selection: the Earliest Divergence Rule (EDR) [8], [9], the select-close-relay() algorithm of the AS-Aware Peer-relay Protocol [56], and the Minimal Overlap Close Relay (MOCR) [128].
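The delay assignment rules above can be sketched in Python as follows. This is an illustrative outline only: the function names, the dictionary-based data layout and the fixed random seed are assumptions, and no iPLANE or CAIDA file format is parsed here.

```python
import random
from statistics import mean


def assign_link_delays(topology_links, measured, seed=0):
    """Map measured link delays (ms) onto topology links.

    topology_links: iterable of (as_a, as_b) tuples.
    measured: dict mapping (as_a, as_b) to a measured delay in ms.
    """
    rng = random.Random(seed)
    pool = list(measured.values())
    delays = {}
    for as_link in topology_links:
        if as_link in measured:
            delays[as_link] = measured[as_link]  # direct mapping
        else:
            delays[as_link] = rng.choice(pool)   # random measured entry
    return delays


def assign_intra_as_delays(as_numbers, intra_measurements, default_ms):
    """Use the mean of an AS's own measurements, else the global default."""
    return {
        asn: mean(intra_measurements[asn])
        if intra_measurements.get(asn)
        else default_ms
        for asn in as_numbers
    }
```

In the setting described above, `default_ms` would be the 25 ms mean of the whole data set.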

Figure 3.5. CDF of the two data sets of latency

3.8 Summary

This chapter gives the scope of our research and explains our research methodology. Specifically, we select the AS-level Internet topology because of its feasibility and scalability for relay path selection schemes. AS-level algorithms can also exploit network efficiency and AS path diversity. We have created a new random AS graph simulator based on the existing SGB simulation package. Our simulator reflects important factors of the AS-level Internet topology that have been discovered by recent research. These include the internal structure of the Internet, the relationships between ASs, and the valley-free AS paths that result from inter-domain routing behaviour. The flexibility of the SGB software also allows us to simulate the real Internet configuration at the AS level. These random and synthetic AS graphs give us opportunities to investigate different aspects of P2P overlay networks and to explore different features of relay path selection algorithms.

CHAPTER 4

MINIMAL OVERLAP CLOSE RELAY

4.1 Introduction

Voice-over-IP (VoIP) applications using peer-to-peer (P2P) overlay networks have grown dramatically. The largest P2P VoIP system, Skype, had about 521 million users by the end of the third quarter of 2009. It was reported that 20,365,656 concurrent Skype users were online as of 09 November 2009 [129]. Such significant success has motivated intense research activity in the field. P2P overlay systems are envisioned to become a promising alternative for end-to-end quality of service (QoS) delivery in IP networks. Such decentralised structured P2P systems are inherently scalable, robust and fault tolerant because there is no centralised server and the network self-organises.

One of the key features of P2P VoIP systems is the mechanism for selecting one or more suitable backup paths. This is particularly important in inter-domain routing scenarios, where the existing routing protocol, the Border Gateway Protocol (BGP), is limited to single-path routing [130]. Using P2P overlay networks to overcome this drawback of BGP has been proposed and demonstrated by a number of studies on multi-path switching [8], [49], [56], [76], [131], [132], [133], [134], [135]. Their common idea is to exploit the path diversity in IP networks to avoid congestion or degradation on the default path.

The operation of P2P VoIP overlay systems is usually based on special network components called supernodes and relay nodes. A supernode is responsible for detecting nearby end hosts and transferring signalling messages between caller and callee. Supernodes with sufficient bandwidth can act as relay nodes to exchange media traffic (i.e. voice calls) between the two parties. In this study, we investigate the task of selecting relay nodes to form alternate relay paths. Since the performance of an overlay path depends directly on how well the relay node is selected, it is useful to study alternative selection policies. It is assumed that communication between end hosts can take place either over the default direct path or over alternate relay paths, which are identified before or during communication initiation.

In general, there are two scenarios for making VoIP calls through overlay networks: intra-domain calls and cross-domain calls. These correspond to local and national/international calls in the traditional public telephone system. This work focuses on techniques used to select cross-domain relay paths in large-scale VoIP systems. For scalability and efficiency, it has been argued that autonomous system (AS)-level path information should be employed, so that the topological information maintained at end hosts can be significantly reduced while achieving better path performance [8], [56].

This chapter explores methods for identifying suitable VoIP overlay paths in a scalable manner, i.e. the first research question of this thesis. We discuss and compare different path selection schemes in P2P overlay networks. Based on extensive simulations, we introduce a heuristic algorithm for alternate relay path selection, which can be seen as a modified and combined version of the path selection schemes proposed in [8] and [56]. We show that the proposed algorithm yields better quality paths under the same constraints as existing methods. We then extend this finding by applying the path selection schemes in different overlay network scenarios. We observe a considerable improvement in path latency when relaying VoIP traffic through top-tier ASs. Furthermore, we implement relay path selection algorithms in a synthetic Internet configuration and determine an upper threshold for the hop-count length of first hop relay paths. By increasing the relay node density in the simulated system, we show that the delay of relay paths decreases while the number of overlap hops in relay paths increases. This indicates a conflict between the performance objectives of relay paths. We believe that these findings will be useful for network operators when exploiting P2P systems for QoS service provisioning. They may also be useful for overlay VoIP service providers when designing their P2P systems.

The remainder of this chapter is organised as follows. Section 4.2 reviews existing approaches related to our study, with their advantages and drawbacks. This motivates our new method of selecting alternate relay paths in P2P VoIP overlay systems, described in Section 4.3. In Section 4.4, we demonstrate the effectiveness of our proposal and present some observations about VoIP relay traffic through various simulations with random AS graphs. In Section 4.5, we describe our simulations of relay paths through different tier allocations. Section 4.6 introduces the hop length upper bound of the first hop relay and its correspondence to the quality of relay paths. We then describe the operation of a P2P overlay network under different conditions of relay peer distribution. Finally, Section 4.7 summarises the findings of the chapter.

4.2 Related Work

Researchers have demonstrated the potential of multi-path overlay routing for enhancing end-to-end network performance. Specifically, the Resilient Overlay Network (RON) proposed in [76] and the Scalable One-hop Source Routing (OHSR) in [133] have shown that overlay routing through a single intermediary node is sufficient to mitigate path failures. RON monitors the availability of IP-layer paths between every pair of participating nodes, and uses overlay paths to forward data packets when the direct paths fail. Due to its full mesh architecture, RON does not scale to large peer-to-peer networks. OHSR randomly picks k candidate relay nodes and chooses the best one to form a one-hop overlay path. When k is small, random selection can potentially eliminate good relay nodes. The benefit of the degree of diversity of alternate P2P overlay paths has been shown in studies such as [10], [136].

There is a range of investigations of Skype, which provide a good basis for our work. S. A. Baset and H. Schulzrinne [58] analyse key functions of Skype, including Skype client operations, Network Address Translation (NAT)/firewall traversal, and supernode relay. Following this work, Guha et al. [63] provide experimental data about the Skype system that are useful for the future design of P2P VoIP systems, including the population of online Skype clients, the number of supernodes, and their traffic characteristics. In [50], G. Caizzone et al. analyse the scalability aspects of the Skype network. The measurement- and experiment-based studies [56], [66] have demonstrated that the Skype system uses a sub-optimal relay path selection mechanism with a large number of unnecessary probes, resulting in heavy network traffic. Reference [56] also suggests two scenarios in which overlay routing paths can be faster (or shorter) than the direct IP routing paths.

Our work is influenced by two relay path selection algorithms [8], [56] that are applicable to P2P VoIP systems (cf. Sections 2.5.4 and 2.5.5). Our analyses in Section 4.6.2 partially follow the Skype superpeer statistics of [50].

4.3 Minimal Overlap Close Relay Algorithm

Motivated by the algorithms proposed for relay path selection, the Earliest Divergence Rule (EDR) [8] and the AS-Aware Peer-relay Protocol (ASAP) [56], described in detail in Sections 2.5.4 and 2.5.5, we propose a new method called Minimal Overlap Close Relay (MOCR) for selecting alternate relay paths, which takes into account both the number of overlap hops and the relay delay. MOCR computes relay paths in P2P systems using only the local information of source nodes, i.e. the same knowledge that EDR uses, without requiring any additional information. A source node approximates the round trip relay delay $D_{in,eg}$, following the estimation described in EDR, as the sum of the direct path delay $D_{de}$ and twice the delay from the source to the relay node $D_{in,r}$. That is,

$$D_{in,eg} = D_{de} + 2 D_{in,r}. \qquad (4.1)$$

The idea of MOCR is to utilise the relay path diversity of the EDR path selection approach, i.e. the earliest divergence point approximation. However, as opposed to EDR, which prefers relay paths with long latency, MOCR chooses the shortest relay paths among those relay candidates.

In all of our experiments in Section 4.4, we have also measured the actual round trip delay of each relay path. Comparing it with the corresponding delay calculated from Equation (4.1), Figure 4.1 shows that the two are strongly correlated. That is, a path with a long calculated delay tends to have a long actual delay, and vice versa. Since the round trip delays of the default direct paths are uniformly distributed, the actual relay delay between any two nodes therefore corresponds to the first hop relay delay. This leads to our assumption that, due to the strict valley-free rule in BGP routes, a relay peer that deviates just one or two hops from the earliest divergence point might be sufficient to ensure the disjointness of the relay path (from its corresponding default path) toward the destination. If our assumption is correct, MOCR can utilise shorter relay candidates by selecting short first hop relay paths, which may be eliminated in EDR. We investigate our proposal through extensive simulations in the next section.

Figure 4.1. A comparison between the calculated delay and measured delay of relay paths.

Minimal-Overlaps-Close-Relay()
    Find default direct path $P_{de}$ from $v_{in}$ to $v_{eg}$
    Obtain set of relay nodes $V^1_{relay} \subseteq V \setminus \{v_{in}\}$
    For each $v_r \in V^1_{relay}$ do
        Find relay path $P_r$ from $v_{in}$ to $v_r$
        Count number of overlap hops $Oh_r$ between $P_{de}$ and $P_r$
        Compute round trip delay $D_{in,eg}$ from $v_{in}$ to $v_r$ to $v_{eg}$
    End // of for loop
    Sort $V^1_{relay}$ in increasing order of $Oh_r$
    Add the first n vertices whose $P_r$ diverge earliest from $P_{de}$
        to the second set of relay nodes $V^2_{relay}$
    Sort $V^2_{relay}$ in increasing order of $D_{in,eg}$
    Return the first k vertices as relay candidates.

Figure 4.2. MOCR pseudo code

Figure 4.2 illustrates the MOCR algorithm. Based on the AS path information to relay nodes, the source node $v_{in}$ determines a set of relay candidates $V^1_{relay}$ in favour of the earliest divergence point from its default direct path to the destination (in a similar fashion to EDR). A subset of $V^1_{relay}$ containing the n paths with the smallest overlap values is selected as relay candidates, i.e. $V^2_{relay} \subseteq V^1_{relay}$. The latter are then sorted in order of increasing delay. Subsequently, the path with minimal delay is chosen.

In this pseudo code, $v_r$ is a relay node in the set of relay nodes V, $v_{eg}$ is the destination node, $D_{de}$ is the round trip delay of the default direct path, $D_{in,eg}$ is the round trip delay of the alternate path from $v_{in}$ to $v_{eg}$, $Oh_r$ is the number of overlap hops and P denotes a path.
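The two-stage selection of Figure 4.2 can be sketched in Python as below. This is a minimal illustration rather than our simulator code: paths are plain lists of node names, the overlap is assumed to be the number of links shared by the two paths, and the function names and parameters (`overlap_hops`, `estimated_rtt`, `mocr_candidates`) are hypothetical.

```python
def overlap_hops(path_a, path_b):
    """Count the links shared by two paths (assumed overlap definition)."""
    links_a = {frozenset(hop) for hop in zip(path_a, path_a[1:])}
    links_b = {frozenset(hop) for hop in zip(path_b, path_b[1:])}
    return len(links_a & links_b)


def estimated_rtt(d_direct, d_source_relay):
    """Equation (4.1): D_in,eg = D_de + 2 * D_in,r."""
    return d_direct + 2 * d_source_relay


def mocr_candidates(default_path, relay_paths, d_direct, d_to_relay, n, k):
    """Return up to k relay nodes following the steps of Figure 4.2.

    relay_paths: dict relay -> path (node list) from the source to the relay.
    d_to_relay: dict relay -> delay from the source to the relay.
    """
    scored = [
        (overlap_hops(default_path, path),
         estimated_rtt(d_direct, d_to_relay[relay]),
         relay)
        for relay, path in relay_paths.items()
    ]
    # Stage 1: keep the n candidates with the fewest overlap hops.
    second_set = sorted(scored, key=lambda s: s[0])[:n]
    # Stage 2: order them by estimated round trip delay and return k.
    second_set.sort(key=lambda s: s[1])
    return [relay for _, _, relay in second_set[:k]]
```

The parameter n controls how strictly the overlap criterion filters candidates before delay is considered; a large n lets the selection degenerate towards a purely delay-based choice.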

To illustrate, Figure 4.3 gives a simple example that involves three possible choices of relay node. EDR selects Relay 1, since the path from Source A to Relay 1 diverges at the earliest point (i.e. Source A in Figure 4.3) and its relay delay value $D_1 = 50$ ms is the highest. Therefore, EDR selection might skip shorter relay paths. On the other hand, ASAP chooses Relay 3, as it has the minimal delay (i.e. $D_3 = 20$ ms) from the source to the relay node. ASAP, however, may select paths containing more overlaps with the default path. In Figure 4.3, ASAP chooses a path from the source to the relay node with one more overlap hop (Node B) than EDR does.

In the case of MOCR, Relay 2 is chosen, as it has the lowest delay (i.e. $D_2 = 30$ ms) among the relays satisfying the earliest divergence condition. Because we consider both overlap and delay constraints, MOCR is expected to select paths with smaller delay while maintaining few overlaps.

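[Figure 4.3: example topology with the default path Src. A - B - C - D - Dst. E and three relay nodes (Relay 1, Relay 2, Relay 3); link delays of 10 ms, 10 ms, 20 ms and 30 ms are marked. Annotated relay delays: $D_1 = D_{A,Relay2} + D_{Relay2,Relay1} = 50$ ms; $D_2 = D_{A,Relay2} = 30$ ms; $D_3 = D_{A,B} + D_{B,Relay3} = 20$ ms. Legend: EDR path; MOCR path; ASAP path; Default path; Physical link.]

Figure 4.3. Relay path selection in different schemes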
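The three selection rules in the example of Figure 4.3 can be contrasted with a short Python sketch. It is a simplification under stated assumptions: each candidate is reduced to its first hop overlap count and its source-to-relay delay, "earliest divergence" is taken to mean the minimum overlap among candidates, and the EDR upper bound is applied directly to the candidate delay. The class and function names are hypothetical.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    name: str
    overlap: int     # overlap hops between source-to-relay and default path
    delay_ms: float  # delay from the source to the relay


def select_edr(cands, bound_ms):
    """Longest delay below the bound among earliest-diverging candidates."""
    earliest = min(c.overlap for c in cands)
    pool = [c for c in cands
            if c.overlap == earliest and c.delay_ms <= bound_ms]
    return max(pool, key=lambda c: c.delay_ms).name


def select_asap(cands):
    """Shortest delay of all candidates; overlap is ignored."""
    return min(cands, key=lambda c: c.delay_ms).name


def select_mocr(cands):
    """Shortest delay among earliest-diverging candidates."""
    earliest = min(c.overlap for c in cands)
    pool = [c for c in cands if c.overlap == earliest]
    return min(pool, key=lambda c: c.delay_ms).name


# Values read from Figure 4.3; Relay 3 shares the hop A-B with the default path.
candidates = [
    Candidate("Relay 1", 0, 50.0),
    Candidate("Relay 2", 0, 30.0),
    Candidate("Relay 3", 1, 20.0),
]
```

With these values and a bound of 200 ms, `select_edr`, `select_asap` and `select_mocr` return Relay 1, Relay 3 and Relay 2 respectively, matching the discussion of the example.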

Table 4.1 summarises the main operations we have implemented in the EDR, ASAP and MOCR algorithms.

Table 4.1. The main operations of algorithms

| | | EDR | ASAP | MOCR |
|---|---|---|---|---|
| Number of overlaps | Behaviour in first hop relay selection | Selects paths containing earliest-diverged point | Ignored overlap parameter | Selects paths containing earliest-diverged point |
| Number of overlaps | Simulated result (average number of overlaps at 50 node pool size) | 2.83 | 3.18 | 2.85 |
| Delay | Behaviour in first hop relay selection | Selects the longest below upper bound among those containing earliest-diverged point | Selects the shortest of all available paths | Selects the shortest among those containing earliest-diverged point |
| Delay | Simulated result (average delay in millisecond at 50 node pool size) | 197.5 | 173.5 | 175 |

4.4 Comparison between Relay Path Selection Schemes

4.4.1 Simulations and Analyses of Alternate Relay Path Selection Schemes

To compare with MOCR, we have also implemented the EDR and ASAP algorithms as representatives of two different schemes for selecting alternate relay paths proposed for VoIP service. For a fair comparison, we assume in all algorithms that source nodes make path selections based only on local knowledge. Specifically, when making a routing decision for a specific relay node, a source node can only use the calculated round-trip delay and (for EDR and MOCR) the computation of the earliest divergence point. The upper bound for the calculated round-trip delay used by EDR is set to 200 ms¹. We mention only the calculated delay throughout the chapter, since it closely corresponds to the actual delay of a path. The relay delay at relay nodes and the intra-AS delay are ignored, since we are only concerned with the length of relay paths in this experiment.

¹ During the simulations, we found that the assigned sets of link delays were somewhat higher than the real inter-domain link delays of the Internet. In order to set an upper bound for triggering decisions in the studied relay path selection algorithms, we chose a threshold of 200 ms one-way delay instead of 150 ms, to give the algorithms more choices in their path selection. This assumption does not affect our simulated results and subsequent analyses.

We then run the three algorithms on a large number of AS topologies created by our AS graph generator. For each AS graph, we randomly select 10 source-destination pairs. We also randomly assign 5% of the nodes in each graph as supernodes, which gives more than 1000 supernodes in each simulated graph. For each algorithm, we measure the number of overlap hops and the round trip delay of the 2 alternate paths it selects for different relay node pool sizes. Specifically, the pool size is gradually increased from 2 nodes to 500 nodes.

Figure 4.4 compares the average overlap hops of EDR, ASAP and MOCR. While EDR and MOCR exhibit very similar performance, i.e. the average overlap decreases as the relay candidate pool size increases, the average number of overlaps in ASAP is rather stable. This indicates that EDR and MOCR perform better than ASAP in terms of the number of overlap hops. MOCR achieves nearly the same overlap performance as EDR.

Our results can be explained as follows. EDR tries to select long relay paths (under the 200 ms upper bound) among paths that diverge from the default direct path at the earliest point. It assumes that, by doing so, it is more difficult for the relay paths to merge back into the direct path. In contrast, among the earliest diverged relay paths, MOCR uses the shortest first hop relay paths as relay candidates. We conjecture that diverging just one or two hops from the source is enough to keep relay paths away from the default direct path, due to the strict valley-free rule in inter-domain routing. In Figure 4.4, the curve of MOCR is close to that of EDR, indicating that our assumption is reasonable. That is, those short first hop relay paths are utilised efficiently by MOCR.

On the other hand, Figure 4.5 shows that both ASAP and MOCR find better alternate relay paths in terms of relay delay, compared to EDR. Because of its more efficient use of short first hop relay paths, MOCR generates short relay paths with low overlap. Our results suggest that MOCR produces relay paths whose delay performance is comparable to that of ASAP paths for all relay pool sizes.

Figure 4.4. Hop overlap comparison between EDR, ASAP and MOCR (with 95% confidence intervals).

Figure 4.5. Delay comparison between EDR, ASAP and MOCR (with 95% confidence intervals).

Overall, we have shown that MOCR achieves better performance than both EDR and ASAP when the problem of finding alternate relay paths takes both delay and hop overlap into consideration.

4.4.2 Overlap in First Hop Relay Statistic

The operation of the EDR and MOCR relay path selection algorithms is primarily based on the constraint on paths in the inter-domain routing environment, i.e. the valley-free rule. Specifically, by selecting relay paths containing the earliest divergence point, both EDR and MOCR rely on the valley-free constraint to ensure path diversity. Using the experimental results analysed in Section 4.4.1, we elaborate on the benefit of using the earliest divergence point. In particular, we collect the number of overlaps of the first hop relay paths, i.e. the paths from sources to the first relay peers, produced by the EDR and MOCR algorithms, and compare them with the corresponding total overlaps of the relay paths. The analysis is illustrated in Figure 4.6.

Figure 4.6. Scatter plot of overlap in the first hop relay vs. the whole paths

The figure shows that although there is a certain correlation (42.8%) between the magnitude of the first hop relay overlap and the overall overlap, a large portion of relay paths with a small number of first hop relay overlaps still overlap substantially with the default direct paths in total. As a result, using relay paths which deviate early from the direct path does not always ensure the selection of disjoint relay paths. This is the trade-off that the EDR and MOCR approximations accept in order to reduce the overhead of relay path computation.
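The comparison behind Figure 4.6 can be outlined as follows. This is a minimal sketch, assuming that paths are given as lists of AS identifiers, that overlap is counted as the number of undirected links shared with the default path, and that the reported correlation is a Pearson coefficient; the names `links`, `hop_overlap` and `pearson` and the example paths are illustrative and are not taken from the thesis simulator.

```python
from math import sqrt

def links(path):
    """Return the set of undirected links (AS pairs) along a path."""
    return {frozenset(pair) for pair in zip(path, path[1:])}

def hop_overlap(default_path, relay_path):
    """Number of links a relay path shares with the default direct path."""
    return len(links(default_path) & links(relay_path))

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical example: default path s-a-b-c-t, relay node r reached via x.
default = ["s", "a", "b", "c", "t"]
first_hop = ["s", "x", "r"]               # source to first relay peer
whole = ["s", "x", "r", "b", "c", "t"]    # source to relay to destination

# The first hop relay path is disjoint, yet the whole relay path merges back.
print(hop_overlap(default, first_hop), hop_overlap(default, whole))  # 0 2
```

Collecting the two overlap values for every simulated call and passing the two lists to `pearson` would give a single correlation figure of the kind quoted above.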

4.5 Experiment with Different Relay Node Allocation

We further extend the experiments of Section 4.4 by applying the relay path selection methods while changing the allocation of relay nodes. The purpose of these experiments is to see whether there is a relationship between relay node distribution and the selected relay paths. This may help improve our understanding of VoIP relay traffic. The relay node assignments are applied to 10 randomly generated AS graphs. In each graph we choose 100 source-destination pairs and compute the performance of alternate relay paths using the MOCR algorithm. We measure both the delay and the overlap of the selected relay paths. The results are shown in Figures 4.7 and 4.8.

Figure 4.7 shows the cumulative distribution functions of hop overlap for relay paths passing through different AS tiers. The difference in performance distributions among paths relayed in different tiers is not very noticeable. In [137], the author indicated that when the degree of the source AS becomes large, the benefit of EDR becomes less obvious because, after applying EDR, most of the relay nodes remain potential candidates. Recall that in our AS graph generator the ASs in Tiers 1, 2 and 3 generally have high degrees, while the ASs in Tiers 4 and 5 have lower degrees. However, our generated AS graphs are closer to the real AS topology in terms of size and connectivity distribution than the small datasets used in [137]. Possibly, in such large networks, the degrees of destination ASs also contribute more to the total hop overlap of relay paths. As a result, our observations do not exhibit outcomes similar to the experimental results in [137].

On the other hand, it can be seen from Figure 4.8 that the relay delays of alternate paths measured when supernodes are assigned to top-tier Transit ASs are much lower than the others, with more than 50% of relay paths finding short routes to destinations (i.e. less than or equal to 150 ms in our simulations). This indicates that relaying traffic through high-degree nodes yields much better paths in terms of delay. Reference [56] illustrates through two scenarios that overlay routing paths can be faster (or shorter) than the direct IP routing paths. The second scenario shows how multi-homed Stub ASs can further improve overlay routing. Our experimental results do not contradict the observations in [56]. However, our results in Figure 4.8 show that there is a higher probability of finding shorter relay paths if a P2P system relays its traffic through a highly connected Transit domain than through a Stub domain. Note that these observations are subject to the limitations of our work.

Figure 4.7. Overlap distribution of MOCR paths achieved in different relay node allocations.

Figure 4.8. Delay distribution of MOCR paths achieved in different relay node allocations.

4.6 Experiments and Analysis Using the Real Network Configuration

The validation of our proposed MOCR algorithm using the synthetic AS topology is provided in Chapter 5, together with comparisons against the new VoIP relay path selection algorithm introduced in that chapter. In this section, we use the synthetic AS topology to study several characteristics of relay peer location and distribution.

4.6.1 The Optimal Hop-count Distance for Relay Nodes

In this experiment, we look at the impact of the distance between the sources and the first-hop relay nodes on the relay paths. Following [56], we assume a relay delay of 20 ms for overlay traffic passing through a relay machine. Based on the experiment settings, we randomly assign 50% of the nodes in the simulated AS-level Internet graph to function as relay nodes. The three relay path selection algorithms, i.e. EDR, ASAP and MOCR, are then forced to select relay nodes that satisfy a hop-count constraint. Specifically, a relay node is selected as a candidate only if its distance from the source, i.e. the length of the first hop relay path, is greater than or equal to a constraint value. We gradually increase this value from 1 to 10 hops. For each value of the hop-count constraint, we use 1,000 random source-destination pairs to run path selection. For each pair of nodes, we measure the number of overlap hops and the latency of the 2 alternate paths each algorithm selects. The results are shown in Figures 4.9 and 4.10.

As the hop-count constraint increases, the average one-way delay of relay paths increases steadily (Figure 4.9). The increase only saturates when the length of the first hop relay paths reaches 8 hops. This is consistent with our observation in Figure 4.1 of Section 4.3 that there is a correlation between the distance of the first hop relay and the distance of the whole relay path. When the algorithms are restricted to selecting only very distant relay nodes, those relay nodes possibly find routes to the destinations that are short enough for the total average path length to stop increasing; thus, the curves saturate.
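The hop-count constraint on relay candidates can be sketched as below. This is only an illustration under simplifying assumptions: the AS graph is an undirected adjacency dictionary, distances are plain breadth first search hop counts that ignore the valley-free rule, and the names `hop_counts` and `relay_candidates` are hypothetical rather than part of EDR, ASAP or MOCR.

```python
from collections import deque

def hop_counts(graph, source):
    """Breadth first search: hop-count distance from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

def relay_candidates(graph, source, relays, min_hops):
    """Relay nodes at least min_hops away from the source, nearest first."""
    dist = hop_counts(graph, source)
    eligible = [r for r in relays if r in dist and dist[r] >= min_hops]
    return sorted(eligible, key=lambda r: (dist[r], r))

# Hypothetical chain topology s - a - b - c - d, all non-source nodes are relays.
chain = {"s": ["a"], "a": ["s", "b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
print(relay_candidates(chain, "s", {"a", "b", "c", "d"}, min_hops=3))  # ['c', 'd']
```

Raising `min_hops` from 1 to 10 and re-running the selection for each value mirrors the sweep described in this experiment.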

Figure 4.9. Average delay vs. Hop count constraints (with 95% confidence intervals)

Figure 4.10. Average number of hop overlaps vs. Hop count constraints (with 95% confidence intervals)

In Figure 4.10, the average number of overlap hops generated by all algorithms quickly saturates at a hop-count length of 3 hops. This suggests that selecting relay peers at least 3 hops away from the source makes relay paths diverge more from their default direct paths; however, first relay path distances beyond 4 hops do not guarantee further divergence. Typically, the minimal average overlap limit is linked to the degrees of the ASs in a particular network. In the practical AS topology, our results show that the benefit of increasing diversity for relay paths with first relay path distances greater than 4 hops is not obvious.

Selecting a hop-count constraint of 3 hops, we further analyse how the first relay hop-count constraint relates to relay path performance in the studied algorithms. We let the three algorithms, i.e. ASAP, EDR and MOCR, select their relay peers for 1,000 randomly generated VoIP calls in our simulation test-bed. Each algorithm is run with and without the 3 hop-count constraint. In this experiment, we also vary the size of the relay node pool from 1 to 100 nodes. From the pool, each algorithm selects 5 relay candidates. The actual relay node is chosen randomly among those 5 candidates. Other simulation settings are kept unchanged from the previous experiments. The results are illustrated in Figures 4.11, 4.12 and 4.13, in which the suffix 1 denotes selection with no hop-count constraint and the suffix 3 denotes the 3 hop-count constraint.

In Figure 4.11 (a) to (f), we show the trends of latency and hop overlap for constrained and unconstrained relay paths selected by the three algorithms with different relay pool sizes. In general, paths selected under the hop-count constraint have greater delay and less hop overlap. This is consistent with the observations of the previous experiment. The hop overlap curves ASAP-3, EDR-3 and MOCR-3 act as thresholds for those algorithms. That is, when ASAP, EDR and MOCR use 5 relay candidates from the pool for relay path selection, their path overlap cannot be smaller than the ASAP-3, EDR-3 and MOCR-3 threshold curves in Figure 4.11 (d) to (f).

Figures 4.12 and 4.13 show the average performance differences between constrained and unconstrained relay paths. The difference is calculated by subtracting the performance values (i.e. delay or hop overlap) of two corresponding paths selected with and without the hop-count constraint. We notice that both the delay and the overlap differences decrease for all algorithms as the size of the relay pool increases. Thus, the 3 hop-count threshold is more significant when the relay path selection algorithm uses a small pool of relay peers (i.e. around 20 relays). VoIP systems, however, impose stringent time limits on call establishment. During the call instantiation period, source nodes cannot examine too many relay peers. Therefore, our observation on the 3 hop-count constraint for relay nodes is an important recommendation when designing large-scale P2P VoIP systems. We conclude that relay path selection algorithms, in a practical Internet configuration, should select relay nodes whose distances to the source are equal to 3 or 4 hops for the best combination of delay and hop overlap.

Figure 4.11. Performance comparison of relay paths generated by different algorithms with and without 3 hop-count constraint.

Figure 4.12. Average delay deterioration of 3 hop-count constraint relay paths under different relay pool

Figure 4.13. Average overlap improvement of 3 hop-count constraint relay paths under different relay pool

4.6.2 The Relay Node Population

In this section, we examine the impact of the relay node population on the performance of existing relay path selection algorithms. The purpose of these experiments is to see whether there is a relationship between relay node distribution and the selected relay paths. Since the performance of an overlay path directly depends on how well the relay node is selected, it is worth studying the properties of the relay network with regard to relay node selection behaviour. We use the three relay path selection algorithms mentioned in Section 4.4.1 on our simulated AS-level Internet topology (cf. Section 3.7). We randomly choose 1,000 pairs of nodes in the graph as sources and destinations for the experiment. In each algorithm, the number of relay candidates is set to 10 paths. We measure the delay and the number of overlaps of the 2 best paths each algorithm selects. In the experiment, the relay nodes are randomly allocated and their density is varied from 10% to 100% of the AS nodes. The results are depicted in Figures 4.14 and 4.15.

When the density of relay nodes is increased, the delays of relay paths generated by all algorithms tend to decrease (Figure 4.14.a). We also show the analysis of means of the delay over all data in Figure 4.14.b. The decline of the average delay curves is consistent with our observation in Figure 4.9, because increasing the density of relay nodes shortens the distances from an end host to its surrounding relay peers. In Figures 4.15.a and 4.15.b, we observe an increase in the average number of overlap hops at low relay node densities, i.e. from 10% to 50%. The curves then saturate at 50%. This can be explained as follows. When generating relay paths, the algorithms run breadth first searches to build their relay node pools. In this particular example, the size of the relay node pool is 20 nodes. As a result, the list of 10 relay candidates at the source node contains only nearby relay nodes. As the density of relay nodes increases, the source finds closer and closer relay nodes to select from the pool. This increases the likelihood of relay paths merging back into the default paths. Therefore, the average number of overlap hops increases. This indicates a trade-off between delay and overlap caused by the breadth first search approach in existing algorithms.

In practice, Skype uses a high supernode density. We can only roughly deduce the AS-level relay node density of Skype since there is no official information. Currently, there are about 10 million Skype users online at any time of the day. Thus, the ratio of the number of online users to the number of ASs is about 350:1. The supernode population can be approximated using a ratio of online users to supernodes of about 300:1 [50]. Therefore, there are slightly more Skype supernodes than ASs. Although some ASs contain more than one supernode and it is difficult to replicate the real distribution of supernodes within the ASs, we can still claim that the AS-level density of supernodes in the Skype system is very high and approaches 100%. The high density of supernodes in this practical P2P VoIP overlay network motivates us to introduce a new method, explained in Chapter 5, for selecting alternate relay paths based on the valley-free nature of the current interdomain routing environment.
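The back-of-the-envelope estimate of Skype's supernode density can be written out as follows. The three input figures are the ones quoted in the text; the variable names are illustrative, and the result is only an average that says nothing about how supernodes are actually distributed across ASs.

```python
online_users = 10_000_000     # Skype users online at any time (from the text)
users_per_as = 350            # online users per AS, ratio 350:1 (from the text)
users_per_supernode = 300     # online users per supernode, ratio 300:1 [50]

as_count = online_users / users_per_as
supernodes = online_users / users_per_supernode
supernodes_per_as = supernodes / as_count   # reduces to 350 / 300

print(round(as_count), round(supernodes), round(supernodes_per_as, 2))
# 28571 33333 1.17
```

An average of roughly 1.17 supernodes per AS is what underlies the statement that there are slightly more supernodes than ASs.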

(a) Average delay of paths by different algorithms (with 95% confidence intervals)

(b) Analysis of Means delay (error rate alpha-level = 0.05)

Figure 4.14. Average delay of paths under different relay node density

(a) Average hop overlap of paths by different algorithms (with 95% confidence intervals)

(b) Analysis of Means hop overlap (error rate alpha-level = 0.05)

Figure 4.15. Average hop overlap of paths under different relay node density (with 95% confidence intervals)

4.7 Summary

Relay path selection is one of the critical components of VoIP overlay systems. This chapter looks at the choice of relay nodes and its impact on the performance of VoIP overlay paths in the interdomain environment. We discuss and compare the performance of two relay path selection approaches proposed for VoIP overlay systems through extensive simulations. We then propose a new method for relay path computation that takes into account both path disjointness and path latency. It is found that there is a considerable improvement in path performance using the new method. Our analysis also points out the relative merit of relay path selection schemes based on the degree of diversity of the first hop relay paths. Moreover, through extensive simulations, we found that there are more opportunities for P2P VoIP systems to obtain good relay paths when selecting relay nodes located at highly connected Transit ASs. Using the real Internet at the AS level and simulating different scenarios of relay path selection, we recommend selecting relay nodes whose distances are less than four hops away from the sources. Our analysis shows that the performance of relay paths in existing methods is influenced by the density of the relay nodes. There is a trade-off between the delay and hop overlap objectives with regard to the relay peer density. In general, increasing relay node density can reduce relay path length but generates more hop overlaps with the default path, due to the breadth first search manner of existing algorithms.

CHAPTER 5

K-VIRTUAL SHORTEST RELAY PATH SELECTION

5.1 Introduction

The main concerns in delay-sensitive peer-to-peer (P2P) applications such as voice-over-IP (VoIP) are P2P node failures due to peer churn and network degradation caused by overload at peer relays. Reactive solutions may induce more relay latency and loss. In practice, the proactive approach is widely used, exploiting the diversity offered by P2P path redundancy. However, the problem of selecting the optimal set of alternate relay paths in a large-scale P2P VoIP network has not yet been fully understood. In this work, we turn our attention to the question of how to quantify the optimisation problem of selecting VoIP relay paths given a set of optimisation objectives for the whole network. Within the scope of the thesis, we aim to address the problem of minimising path latency and maximising path diversity by formulating a multi-objective optimisation problem (MOP) for P2P VoIP networks. We then introduce a network model that effectively represents the optimal solution space of the MOP. The network model allows us to obtain, in practice, a close-to-optimal set of relay paths for large-scale VoIP networks using our new heuristic algorithm.

VoIP relay path selection methods are usually instantiated during call session initiation and refreshed periodically. This not only allows end hosts behind Network Address Translations (NATs) or firewalls, which cannot use the direct communication path, to establish voice calls, but also allows communication during periods of poor performance to be switched to alternate relay paths with minimum latency, thereby accommodating the stringent requirements of voice quality. We therefore only consider algorithms with fast convergence times that are applicable within the practical duration of VoIP call initiation (for large-scale P2P systems, several seconds at most).

Our contributions are:

• We model and formulate the problem of optimal VoIP relay path selection with regard to path latency and path diversity objectives, which has not been addressed before.
• We develop a complete solution for selecting alternate relay paths in VoIP systems based on latency, hop overlaps and the valley-free nature of the current interdomain routing environment.

The remainder of the chapter is organised as follows. In Section 5.2, we review existing research related to our study. The mathematical formulation of the problem and the network model are stated in Section 5.3. In Section 5.4, we present the algorithm for relay path selection. The experiments are then presented in Section 5.5, where we also discuss the advantages and disadvantages of our proposed relay path selection method. Finally, we summarise our findings in Section 5.6.

5.2 Related Work

In this chapter, we continue to use the three algorithms, i.e. EDR, ASAP and MOCR, described earlier in Sections 2.5.4, 2.5.5 and 4.3, for comparison with the newly proposed algorithm. It should be noted that all three algorithms select one best path out of a pool of relay peers. The mechanisms for collecting relay nodes into the pool are the same in those algorithms, as they all choose neighbouring relay nodes. We categorise these as breadth first search relay path selection. In Sections 5.4 and 5.5, we propose and demonstrate a new approach to selecting relay nodes which is not based on the breadth first search scheme and is close to the optimal solution.

The network model proposed in this work is based on early research on the network distance prediction problem and its applications in peer-to-peer systems. In their pioneering work, Francis et al. [138], [139] proposed a complete solution called IDMaps to maintain a virtual topology map of the Internet based on computed distances between hosts. While IDMaps is designed as a client-server solution, Eugene Ng et al. [57] examined the network distance prediction problem for peer-to-peer systems. Their approach assigns coordinates to each Internet host in such a way that the distances in the virtual Euclidean space approximate actual network distances such as round-trip time and one-way delay. This is called the Euclidean Coordinate Embedding technique, which includes two main categories: landmark-based schemes such as GNP [57] and Virtual Landmarks [140], and distributed schemes such as Vivaldi [141]. Using network distance based Euclidean coordinates, S. Lee [142] investigates the problem of one-hop relay selection in peer-to-peer systems and shows that Euclidean coordinate based systems can achieve close to optimal performance in terms of delay.
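The idea of using Euclidean coordinates for one-hop relay selection can be illustrated with a short sketch. It assumes that every host already holds 2-dimensional coordinates whose Euclidean distance approximates delay in milliseconds; the function `best_one_hop_relay` and the coordinate values are hypothetical and do not reproduce the specific methods of GNP, Vivaldi or [142].

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)

def best_one_hop_relay(coords, source, target, relays):
    """Return (relay, predicted delay) minimising source -> relay -> target distance."""
    def predicted(relay):
        return dist(coords[source], coords[relay]) + dist(coords[relay], coords[target])
    relay = min(relays, key=predicted)
    return relay, predicted(relay)

# Hypothetical coordinates; distances stand in for predicted delays in ms.
coords = {"s": (0.0, 0.0), "t": (10.0, 0.0), "r1": (5.0, 1.0), "r2": (5.0, 8.0)}
relay, delay = best_one_hop_relay(coords, "s", "t", ["r1", "r2"])
print(relay, round(delay, 2))  # r1 10.2
```

The appeal of such schemes is that the prediction needs only the coordinates of the three hosts involved, not direct measurements between every pair.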

5.3 Problem Statement, Modelling and Formulation

5.3.1 Problem Statement

The problem we are addressing in this work can be described as follows: Given a fixed, annotated autonomous system (AS) topology in which there are multiple routing paths between a source and a destination via the P2P overlay network, which routing offers the best degree of path diversity with regard to the default shortest path, at the minimal delay? Considering a set of source-destination pairs in a given AS graph, the problem reduces to the classical MOP. Optimality is achieved when all path delays are first optimised and path diversity is then optimised among the resulting solutions. In a network with N relay nodes, the complexity of computing the desired relay path connecting each pair of nodes, using two-hop relay in the worst case, is O(N^2). Given the large-scale network scenarios we are studying and the hardness of exhibiting the Nash equilibrium for this MOP [13], we need to introduce a heuristic method to approximate and evaluate the performance of relay paths.

5.3.2 Mathematical Formulation

We first state the general formulation of the studied problem. Consider a Type of Relationship (ToR) graph $G = (V, E)$, where $V$ is the set of vertices and $E$ is the set of directed edges. In the context of the ToR problem, a directed edge $(u, v)$, where $u, v \in V$, means that $u$ is a customer of $v$. A path in a ToR graph that complies with the valley-free rule is called a valid path. Precisely, a path $(v_1, v_2, \ldots, v_n)$ is valid if

$$\exists j,\ 1 \le j \le n:\ (v_i, v_{i+1}) \in E,\ 1 \le i \le j-1,\ \text{and}\ (v_i, v_{i-1}) \in E,\ j+1 \le i \le n,\quad i, j, n \in \mathbb{N}. \tag{5.1}$$
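Condition (5.1) can be checked mechanically. The sketch below assumes, as in the definition above, that only customer-to-provider edges are present (peer links are not modelled) and that `edges` is a set of directed pairs `(u, v)` meaning that `u` is a customer of `v`; the function name `is_valid_path` and the small AS fragment are illustrative.

```python
def is_valid_path(path, edges):
    """Return True if the node sequence satisfies the valley-free condition (5.1)."""
    n = len(path)
    j = 0
    # Uphill part: follow customer -> provider links as far as possible.
    while j + 1 < n and (path[j], path[j + 1]) in edges:
        j += 1
    # Downhill part: every remaining link must be provider -> customer.
    while j + 1 < n and (path[j + 1], path[j]) in edges:
        j += 1
    return j == n - 1

# Hypothetical fragment: A and B are customers of D; B and C are customers of E.
edges = {("A", "D"), ("B", "D"), ("B", "E"), ("C", "E")}
print(is_valid_path(["A", "D", "B"], edges))            # True: up to D, down to B
print(is_valid_path(["A", "D", "B", "E", "C"], edges))  # False: valley at B
```

The second path climbs again after descending to B, which is the kind of violation that relay paths typically introduce at the relay node.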

Given a set $M$ of source-destination pairs, the set of default direct paths for all pairs $\{s, t \in M\}$ is denoted as $P^{DE} = \{p_1^{DE}, p_2^{DE}, \ldots, p_m^{DE}, \ldots, p_M^{DE}\}$. Each source-destination pair has knowledge of $N$ relay nodes in the network with which to establish $N$ relay paths (each through at least one relay node), $P^{R}_{s,t} = \{p_1^{R}, p_2^{R}, \ldots, p_n^{R}, \ldots, p_N^{R}\}$. Among those $N$ relay paths, the relay path selection algorithm takes a subset of $K$ solution relay paths for each pair $\{s, t\}$, $P^{S}_{s,t} = \{p_1^{S}, \ldots, p_k^{S}, \ldots, p_K^{S}\}$, $1 \le K \le N$. This implies that the studied path selection algorithms select multiple relay solutions. We also define the set of links shared between a default path and a relay path as $r_n = p_m^{DE} \cap p_n^{R}$. Note that all default paths must be valid paths, while relay paths may not be, because relay paths typically violate the valley-free rule at relay nodes.

Using this notation, we define the latency and hop overlap objectives as follows. The relay path latency between two nodes $s$ and $t$ is computed as the average delay of the $K$ solutions. Taking the total path latency over all sources and destinations in the set $M$, we have

$$L = \frac{1}{K} \sum_{M} \sum_{P^{S}_{s,t}} \left[ \min_{p_n^{R} \in P^{R}_{s,t} \setminus P^{S}_{s,t}} l(p_n^{R}) \right] = \frac{1}{K} \sum_{M} \sum_{P^{S}_{s,t}} l(p_k^{S}) \tag{5.2}$$

where $l(p)$ denotes the latency function on path $p$, and $l(p_k^{S}) = \min_{p_n^{R} \in P^{R}_{s,t} \setminus P^{S}_{s,t}} l(p_n^{R})$ represents the latency of the path selected from the relay path pool $P^{R}_{s,t}$.

Similarly, we define the overlap objective as the average number of overlaps over all pairs. That is,

$$O = \frac{1}{K} \sum_{M} \sum_{P^{S}_{s,t}} \left[ \min_{p_n^{R} \in P^{R}_{s,t} \setminus P^{S}_{s,t}} |r_n| \right] \tag{5.3}$$

where $|r_n|$ denotes the size of $r_n$, i.e. the number of links. With the quantitative definitions (5.2) and (5.3), the general problem is formulated as follows:

$$\text{minimise } L \text{ and then minimise } O \tag{5.4}$$

subject to condition (5.1) for all direct paths between nodes; $M, N \subseteq V$; $s, t \in M$, $s \neq t$; $K \ge 1$.
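A small sketch may help to make the objectives (5.2) to (5.4) concrete. It assumes that each candidate relay path is already annotated with its latency and its overlap with the default path, that the K solutions are picked greedily by latency with overlap as a tie-breaker (one possible reading of "minimise L and then minimise O"), and that the totals are normalised by K as in (5.2); the dictionary layout and the function names are hypothetical.

```python
def select_k(relay_paths, k):
    """Pick k relay paths: lowest latency first, ties broken by smaller overlap."""
    return sorted(relay_paths, key=lambda p: (p["latency"], p["overlap"]))[:k]

def objectives(pools, k):
    """Return (L, O) for a dict mapping each (s, t) pair to its relay path pool."""
    total_latency = 0.0
    total_overlap = 0
    for relay_paths in pools.values():
        for path in select_k(relay_paths, k):
            total_latency += path["latency"]   # l(p_k^S)
            total_overlap += path["overlap"]   # |r_n| of the selected path
    return total_latency / k, total_overlap / k

# Hypothetical pools for two source-destination pairs (latency in ms).
pools = {
    ("s1", "t1"): [{"latency": 120, "overlap": 2},
                   {"latency": 100, "overlap": 3},
                   {"latency": 100, "overlap": 1}],
    ("s2", "t2"): [{"latency": 80, "overlap": 0},
                   {"latency": 90, "overlap": 1}],
}
print(objectives(pools, k=2))  # (185.0, 2.5)
```

Different selection algorithms can then be compared by the pair (L, O) they produce on the same pools.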

5.3.3 Network Model and Optimal Solution Space

We propose a new network model for illustrating the relay path selection problem. Our model is based on existing approaches to network distance coordinates [57], [142] and their application to selecting relay paths [54]. However, our Euclidean-based model is used only for explaining and evaluating the operation of path selection algorithms, and for approximating the optimal solution space of relay paths. It was shown in [57] that the performance of the GNP mechanism does not substantially improve with each successive dimension of the Euclidean space. Reference [54] uses a 2-dimensional Euclidean space to approximate node distribution and claims that it is sufficient for the P2P system to operate. For the purposes of simple approximation and evaluation, therefore, we represent paths between nodes s and t in an annotated AS network in a 2-dimensional Euclidean space. The length of a path indicates its delay value.

We assume in the general situation that all paths from s to t can only diverge at node u at the earliest and have to merge back at node v at the latest. Existing studies show that actual interdomain paths are typically longer due to the valley-free rule. In general, we represent the default direct path between s and t as containing an arc from u to v, and the shortest path between the two nodes under no valley-free condition (we call this path the virtual shortest path, or v-SP for short) as a straight line passing through the chord [u, v], as shown in Figure 5.1. Later, we will solve the more general case in which the default path intersects the v-SP.

It should be noted that the actual layout of paths as arcs may or may not be possible in a practical network configuration. The intuition behind this representation is to use arcs to identify the areas over which paths are spread, so that we can approximate the tendency of the network paths and identify the optimal solution for selecting relay paths under our given objectives. Applying a similar representation to that of the default path, it is easily seen that a relay path can be represented by an arc, either above or below the v-SP with regard to the default path, which diverges from the v-SP at node x and merges back into the v-SP at node y, where $x \in [u, v)$, $y \in (x, v]$. Generally, a relay path may cut the v-SP several times; hence, it should be represented as a path containing a sequence of disjoint parts (a sequence of arcs) which diverge from and merge into the v-SP at pairs $\{x_i, y_i\}$, $i = 1, 2, \ldots$ However, we will see later that what we consider is just the length of the relay path and the extent of the space between the relay path and the v-SP. Without loss of generality, we substitute it by a relay path with one disjoint segment, whose area is equal to the absolute value of the difference of its areas in the two regions, for easier illustration.

We denote the regions above and below the secant passing through {s, t} as the upper and lower regions, where the upper region contains the default arcs. Consider the dark areas in Figure 5.1 formed by the default arc and its projection across the v-SP (denoted as the projected default path). We observe the following: all arcs between any pair {x, y} lying within this area are shorter than the default path, which is the continuous arc in Figure 5.1. Our later analytical data show that the average delay of relay paths generated by the studied algorithms is normally greater than the average delay of default paths. Therefore, in terms of the objective of minimising delay, relay paths that lie in these grey areas should be considered first, since their delay is shorter than that of the default path. Hence, we call this area the solution space, and sub-divide it into upper and lower parts with respect to the corresponding regions.

Figure 5.1. The representation of paths as arcs in 2D-Euclidean space; the solution spaces are colored in grey. [Figure labels: upper region containing the default path; v-SP path through nodes s, u, x, y, v, t; lower region containing the projected default path and the optimal solution space; path p with node w, path q, and hopcount(w, q, G).]

In terms of path diversity, the objective is to select paths with the minimum number of hop overlaps between the default path and the relay path. Firstly, we consider a relay path in the upper region. If we gradually enlarge the area of a segment in the upper region between the chord {u, v} and the relay path by increasing the length of the relay path, there will be a point at which the segment fully overlaps the segment of the default path. In the upper solution space, we observe that the closer the relay curves are to the default path, the longer the paths and the higher the probability that relay paths merge into the default path. Outside the upper solution space, however, the further the relay curve is from the default path, the lower the probability of overlapping but the longer the latency. Secondly, we notice that in the lower region the relay paths are laid out according to their lengths: as relay paths become longer, they become more divergent from the default path. There is one special case, when a relay path fully overlaps the projected default path. In this case, the length of the relay path is equal to the length of the default direct path. We conclude for the path diversity objective that the preferred paths are those which lie in the lower region as far as possible from the v-SP, or in the area outside the solution space in the upper region as far as possible from the default path.

If both objectives in (5.4), i.e. path latency and path diversity, are considered simultaneously, it can be seen that the optimal solution space for the studied problem is the lower solution space bounded by the v-SP and the projected default path, because paths there have low delay and a small probability of overlapping with the default path.

We now solve the general case of multi-segment default paths, using the example in Figure 5.2. In this example, the default path cuts the v-SP at 4 points A, B, C and D. Thus, it can be divided into 2 segments: I and II. The grey parts indicate the projected parts in the solution space, and dashed lines indicate the projected default path. Hence, a default path that intersects the v-SP a number of times can be substituted by a single-segment default path. The area of the substituted segment represents the total area of all corresponding segments.

Figure 5.2. Example of a default path with 4 intersects with the v-SP. [Figure labels: intersection points A, B, C, D; Segment I; Segment II.]

To quantify the solution space, we introduce the path distance between two paths to approximate their tendency to merge into or diverge from each other. The path distance $d(p, q)$ between two paths $p$ and $q$ is defined as the total hop-count distance from every node in one path to the other path. Specifically, the hop-count distance from a node $w \in p$ to $q$ is measured by counting the hops on the shortest path (under no valley-free rule) from $w$ to path $q$, excluding its two end nodes, i.e. $q \setminus \{u, v\}$. That is,

$$d(p, q) = \sum_{w \in p} \mathrm{hopcount}(w, q \setminus \{u, v\}, G). \tag{5.5}$$
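The path distance (5.5) can be computed with a multi-source breadth first search, as sketched below. The sketch assumes an undirected adjacency dictionary (no valley-free rule), a connected graph and a path q with at least one interior node. It also skips the shared end nodes u and v on p itself, which is an assumption made here so that two identical paths have distance zero; the function name `path_distance` and the toy graph are illustrative.

```python
from collections import deque

def path_distance(graph, p, q, u, v):
    """Sum of hop counts from interior nodes of p to the nearest node of q without u, v."""
    targets = [node for node in q if node not in (u, v)]
    dist = {node: 0 for node in targets}
    queue = deque(targets)
    # Multi-source BFS: dist[x] is the hop count from x to the closest target.
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return sum(dist[w] for w in p if w not in (u, v))

# Hypothetical graph with two disjoint routes from u to v: via a and via b.
graph = {"u": ["a", "b"], "a": ["u", "v"], "b": ["u", "v"], "v": ["a", "b"]}
p = ["u", "a", "v"]
q = ["u", "b", "v"]
print(path_distance(graph, p, q, "u", "v"))  # 2
print(path_distance(graph, q, q, "u", "v"))  # 0
```

Running the search once from all nodes of q avoids a separate shortest-path computation for every node w on p.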

Path distance parameter indicates the extent between the two considering paths1. Therefore, it can be used to represent the degree of diversity indicating the probability that a relay path diverges from the default path. If we can measure path distances between relay path p R , default path p DE to the p vSP using (5.5), the comparison between dp DE , p vSP  and dp R , p vSP  will give us an approximation about the relay path placement. We denote the path distance between p DE and p vSP as dp DE , p vSP  D . It can be seen that  if dp R , p vSP  D (5.6), the relay path is inside the solution space;  if dp R , p vSP  D (5.7), it is outside the solution space. From these quantifications, we now summarise the relationships between path latency and path distance of relay paths (with regard to the v-SP) as in Table 5.1. Clearly, the trends of paths in Table 5.1 can be used to evaluate the path performance and to compare the effectiveness of relay path selection methods. The path distance between relay path p R to its default path p DE connecting source s to destination t, taking into account the position of p vSP , can be written as

dp DE , p R   D  dp R , pvSP  (5.8)

1 More precisely, the extent between two paths should be measured by the total (or partial integration) of all derivatives of area between two paths from u to v in Fig. 1. However at the level of approximating diversity degree of paths of a graph, it is not necessary to use such detailed calculation.

100 Table 5.1. Summary of Performance Trends of Relay Paths Upper region Lower region Solution Outside Inside Inside Outside space Path distance

Path latency

Overlap probability Milestones DE path v-SP Projected DE path if the relay path stretches in the upper region; or

d p DE , p R   D  d p R , p vSP  (5.9) if it stretches in the lower region. In practice, it is rather difficult to identify exactly in which region the selected relay path is stretching. However, several milestones can be obtained from (5.8) and (5.9), including  When the relay path fully overlaps the default path, dp DE , p R  0 ;  When the relay path is totally merged with the v-SP path, dp DE , p R  D ;  When the relay path is fully overlaps the projected default path, dp DE , p R  2D . The optimal solution of our studied problem (5.4) can be identified using this network model. As path distance between relay paths and their default paths can directly be measured, those measures combined with the condition (5.6) help to identify whether a selected relay path is lying in the lower region of the solution space, i.e. the grey column in Table 5.1. Quantitatively, such relay path holds the following path distance conditions:

$$d(p_R, p_{vSP}) \le D, \quad \text{and} \quad D \le d(p_R, p_{DE}) \le 2D \qquad (5.10)$$

Given the combined objective (5.4), we have shown that the ideal solution is a relay path that completely overlaps the v-SP, because in this situation it obtains the shortest delay while enjoying reasonably high divergence from the default direct path (higher than that of paths in the upper half of the solution space). This position is identified by our path distance as $d(p_R, p_{vSP}) = 0$. At this position, relay paths have the minimal latency, i.e. $\min L = l(p_{vSP})$, and enjoy a high degree of diversity, $d(p_R, p_{DE}) = D$. In the case of multiple relay path selection, the optimal solutions are paths that satisfy conditions (5.10).
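The region conditions can be sketched as a small classifier. This is only an illustration: the function name and the region labels are hypothetical, and it assumes that the boundaries in (5.6) and (5.10) are inclusive and that the three distances have already been measured with (5.5).

```python
def classify_relay_path(d_r_vsp: float, d_r_de: float, big_d: float) -> str:
    """Place a relay path relative to the solution space.

    d_r_vsp: path distance between the relay path and the v-SP
    d_r_de:  path distance between the relay path and the default path
    big_d:   D, the path distance between the default path and the v-SP
    """
    if big_d <= 0:
        raise ValueError("D must be positive")
    if d_r_vsp > big_d:                # condition (5.7): outside
        return "outside"
    # From here on condition (5.6) holds, so the path is inside the space.
    if big_d <= d_r_de <= 2 * big_d:   # second part of condition (5.10)
        return "inside-lower"
    if 0 <= d_r_de < big_d:
        return "inside-upper"
    return "inconsistent"              # measurements that fit neither region


print(classify_relay_path(0.0, 4.0, 4.0))  # relay path merged with the v-SP
print(classify_relay_path(3.0, 1.0, 4.0))  # relay path near the default path
print(classify_relay_path(5.0, 9.0, 4.0))  # relay path beyond distance D
```

The "inconsistent" label is an added safeguard for measured values that do not fit the idealised model; it is not part of the network model itself.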

5.4 K Virtual Shortest Paths Algorithm

The network model and the approximated optimal solution space of Section 5.3.3 indicate that we should select paths that are as close as possible to the v-SP. In this section, we propose the k Virtual Shortest Paths (k-VSP) algorithm, which is based on the valley-free rule of interdomain routing, to address the problem of selecting alternate VoIP relay paths. We also analyse and discuss the costs of using the algorithm in terms of the computation required at end hosts and the network overhead.

5.4.1 The Algorithm

Reference [56] has shown that multi-homed customer ASs can be used to further improve overlay routing. Consider the part of an AS-level network in Figure 5.3, in which AS B has multi-homed connections to two providers, AS D and AS E. Due to the valley-free rule, the default direct path between AS A and AS C is A-D-F-H-I-G-E-C. If AS B is chosen as the intermediary relay, the relay path between AS A and AS C is A-D-B-E-C. Despite the relay delay at AS B, the delay of this relay path is likely to be smaller than that of the direct path.

In Figure 5.3, if we assume there is no valley-free constraint, then the direct shortest path (e.g. computed with the Dijkstra algorithm [118]) between AS A and AS C is A-D-B-E-C, which is the same as the relay path named above. Therefore, if we can select a relay node belonging to this ‘virtual’ shortest path (out of many other choices), the source node has a chance of obtaining a relay path that runs in the direction of the shortest route to the destination. More precisely, the relay node should be selected in the AS of the virtual shortest path where the valley-free rule is violated when travelling from the source to the destination (i.e. AS B in the shortest path A-D-B-E-C in this example). This is possible because we have assumed a high density of relay nodes in the considered AS-level topology. We believe this algorithm works well for long default direct paths.

We extend our idea to cover the generation of multiple alternate relay paths in VoIP systems by proposing the use of the well-known k-Shortest Paths (k-SP) algorithm [123] instead of the Dijkstra function, with some modifications explained below.
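To make the notion of a valley-free violation point concrete, the following minimal sketch walks along an AS path and reports the first AS at which the path climbs again (or crosses a peer edge) after it has already gone downhill or crossed a peer edge. The edge annotations are an assumed reading of Figure 5.3, and the names `first_violation`, `edge_type` and `REL` are hypothetical.

```python
from typing import Dict, Optional, Sequence, Tuple

UP, DOWN, PEER = "c2p", "p2c", "p2p"  # customer-to-provider, reverse, peer

# Assumed annotation of Figure 5.3: (customer, provider) pairs and one peering.
REL: Dict[Tuple[str, str], str] = {
    ("A", "D"): UP, ("B", "D"): UP, ("B", "E"): UP, ("C", "E"): UP,
    ("D", "F"): UP, ("E", "G"): UP, ("F", "H"): UP, ("G", "I"): UP,
    ("H", "I"): PEER,
}


def edge_type(rel: Dict[Tuple[str, str], str], u: str, v: str) -> str:
    """Type of the edge when traversed from u to v."""
    if (u, v) in rel:
        return rel[(u, v)]
    reverse = rel[(v, u)]
    return {UP: DOWN, DOWN: UP, PEER: PEER}[reverse]


def first_violation(path: Sequence[str],
                    rel: Dict[Tuple[str, str], str]) -> Optional[str]:
    """Return the first AS where the valley-free rule breaks, else None."""
    descending = False  # True once a peer or downhill edge has been used
    for u, v in zip(path, path[1:]):
        kind = edge_type(rel, u, v)
        if descending and kind in (UP, PEER):
            return u  # traffic would need a relay at this AS
        if kind in (DOWN, PEER):
            descending = True
    return None


print(first_violation(list("ADBEC"), REL))     # virtual shortest path
print(first_violation(list("ADFHIGEC"), REL))  # default direct path
```

Under these assumed annotations the virtual shortest path A-D-B-E-C is flagged at AS B, while the default path A-D-F-H-I-G-E-C has no violation.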

Figure 5.3. An example of a long default direct path from AS A to AS C; A relays traffic to C via the multi-homed AS B. (Legend: default path, virtual shortest path, peer-peer edge, provider-customer edge.)

As shown in [76], one-hop and two-hop relays can be sufficient to bypass most performance degradation on the default direct path. Current relay path selection algorithms normally take two-hop relay as the limit in their relay node selection process, and select a second relay node only after the pool of one-hop relays has been used. The obvious reason is that the more relay nodes are used, the more relay delay is added to the path and the less stable the path becomes, because relay nodes dynamically join and leave the system. Path stability is not our objective, so we take only the additional relay delay into account. We can then see that a two-hop relay brings the relay path closer to the virtual path than a one-hop relay if the part of the virtual path downstream of the first relay node also violates the valley-free rule. Despite the additional relay delay, the path from the first relay node through the second to the destination may be shorter than the corresponding direct path. Therefore, we design the k-VSP algorithm to allow a more flexible use of two-hop relay. Specifically, a two-hop relay path is generated if there is more than one valley-free violation point along a virtual path.

We now explain the modification to the k-SP algorithm [123] that yields proper k-VSP relay paths. This modification concerns the selection of the first relay node when multiple relay paths are needed. When running experiments with the k-SP algorithm, we realised that several virtual paths between a given pair of nodes may contain interlacing links. As a result, the first-hop relay nodes located on two different virtual paths among the k shortest paths may be the same node if it lies on the interlaced part. This was not an issue in [143], as the preliminary k-VSP algorithm was only tested with the single shortest path function (namely, the Dijkstra function). There are two ways to solve this. One way is to replace the k-SP algorithm with an algorithm for the shortest sets of disjoint paths [144], [145], which ensures that there is no interlacing between virtual paths. The other way is to locate the first relay node immediately after each iteration that identifies a virtual path inside the k-SP function; if this relay node has already been used, we discard the virtual path and find the next one. In this chapter, we follow the latter approach.

In Section 5.5.2, we show that, compared to other methods, the k-VSP algorithm works more efficiently with regard to the overlap objective if default paths are longer than 3 hop-counts. For shorter default direct paths, it is thus better to use one of the existing algorithms for relay path selection. We recommend using MOCR or ASAP, as their paths exhibit lower latency.

Pseudo-code of the k-VSP algorithm is presented in Figure 5.4. In this algorithm, $P = \{p_1, p_2, \ldots, p_K\}$ represents the set of K virtual paths generated by the function k-SP(G, s, t, K) without applying the valley-free rule, and $S = \{p_1^R, p_2^R, \ldots, p_K^R\}$ is the set of relay paths. The function GetViolatedVFRVertex(v, $p_k$) identifies, starting from the node v, the position where a valley-free violation takes place along the path $p_k$. This function returns a 'Null' value when the default direct path and the virtual shortest path coincide, or when the remaining path from the first-hop relay toward the destination complies with the valley-free rule. If no first-hop relay is found by this function, we use the heuristic derived from Section 4.6.1 to select a relay node v in the v-SP whose hop distance satisfies $h(v) \le 3$. The function GenerateNextPathk-SP(G, s, t) implements one iteration of the k-SP algorithm to find the next path. Finally, the function GenerateRelayPath() creates a one-hop or two-hop relay path.

5.4.2 The Costs of Using k-VSP Algorithm

The new approach requires the end hosts of a VoIP system to store more knowledge about the network. Specifically, the end hosts must have the whole picture of the AS graph in order to compute virtual shortest paths. This is a limitation of the algorithm compared to EDR, MOCR and ASAP: EDR and MOCR require only partial topological information that is local to their end hosts, while an ordinary ASAP node stores only a list of close relay nodes. However, using the AS graph to compute relay paths at end hosts will not incur too much control overhead if a structured P2P system is utilised. For example, when several bootstraps are employed, they can provide such intelligence to a large-scale system by building an up-to-date annotated AS graph and disseminating it to supernodes and end hosts [56]. Furthermore, the memory required for an annotated AS graph with weights is rather small¹. Thus, it is acceptable for VoIP client software running on general-purpose computers to employ such data storage.

k-VSP (G, s, t, K)
    P := ∅; S := ∅;
    p_1 := Dijkstra(s, t, G);
    if (hop-count of p_1 ≤ 3) then
        S obtained by ASAP or MOCR algorithms;
    else
        k := 1;
        P := {p_1};
        while k ≤ K and P ≠ ∅ do
            p_k^R := ∅;
            P := P \ {p_k};
            v := GetViolatedVFRVertex(s, p_k);
            if v = ∅ then
                select v ∈ p_k, h(v) ≤ 3;
            if v has not been used then
                u := GetViolatedVFRVertex(v, p_k);
                if u = ∅ then    // i.e. one-hop relay
                    p_k^R := GenerateRelayPath(s, v, t);
                else             // i.e. two-hop relay
                    p_k^R := GenerateRelayPath(s, v, u, t);
                S := S ∪ {p_k^R};
                k := k + 1;
            p_k := GenerateNextPathk-SP(G, s, t);
            P := P ∪ p_k;
        end // of while-do
    return S;

Figure 5.4. The k-Virtual-Shortest-Path pseudo-code
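The main loop of Figure 5.4 can be sketched in runnable form as follows. This is a simplified illustration rather than the thesis implementation: the virtual paths are assumed to be supplied already ordered (standing in for successive k-SP iterations), the short-path branch that falls back to ASAP or MOCR is omitted, the valley-free check is passed in as a callable, and the fallback picks the farthest interior node within three hops of the source, which is one possible reading of the Section 4.6.1 heuristic. All function and variable names are hypothetical.

```python
from typing import Callable, Iterable, List, Optional, Sequence


def k_vsp_sketch(
    virtual_paths: Iterable[Sequence[str]],
    get_violated_vertex: Callable[[Sequence[str]], Optional[str]],
    k_max: int,
    max_hops: int = 3,
) -> List[List[str]]:
    """Turn ordered virtual paths into at most k_max relay paths."""
    used_first_relays = set()
    relay_paths: List[List[str]] = []
    for path in virtual_paths:
        if len(relay_paths) >= k_max:
            break
        source, target = path[0], path[-1]
        first = get_violated_vertex(path)
        if first is None:
            # Assumed fallback: farthest interior node within max_hops of s.
            interior = list(path[1:-1])[:max_hops]
            if not interior:
                continue
            first = interior[-1]
        if first in used_first_relays:
            continue  # interlaced virtual path: discard it, try the next one
        used_first_relays.add(first)
        # Look for a second violation downstream of the first relay.
        second = get_violated_vertex(path[list(path).index(first):])
        if second is None:
            relay_paths.append([source, first, target])          # one-hop
        else:
            relay_paths.append([source, first, second, target])  # two-hop
    return relay_paths


# Hypothetical violation table standing in for a real valley-free check.
violations = {
    ("A", "D", "B", "E", "C"): "B",
    ("A", "D", "B", "G", "C"): "B",
    ("A", "F", "X", "Y", "C"): "X",
    ("X", "Y", "C"): "Y",
}
lookup = lambda p: violations.get(tuple(p))
result = k_vsp_sketch(list(violations)[:3], lookup, k_max=2)
print(result)
```

In this toy run the second virtual path is discarded because its first relay (B) has already been used, and the third path yields a two-hop relay because a second violation is found downstream of X.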

1 The most recent Internet AS graphs require about 1.5 MB of storage.

The proposed algorithm helps to reduce the amount of calculation for relay paths in large-scale VoIP systems. In a network with N relay nodes, the computational complexity for the one-hop relay pool can be as large as $O(N)$, and it is $O(N^2)$ for two-hop relay paths. With our proposed algorithm, the pool size for one-hop relay paths is just equal to the number k used in the k-Shortest Paths algorithm, and it is as high as 2k paths if two-hop relay paths are used. In practice, the value of k is normally less than 10, so the number of relay path computations in k-VSP is typically small.
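As a rough order-of-magnitude illustration of this comparison, the sketch below counts candidate paths for a pool of N relays against the k-VSP bounds. The exact counts N and N² are simplifying assumptions that match the stated complexity orders, and the function name is hypothetical.

```python
def candidate_path_counts(pool_size: int, k: int) -> dict:
    """Candidate relay paths examined, to the order stated in the text."""
    return {
        "search_one_hop": pool_size,       # O(N) one-hop candidates
        "search_two_hop": pool_size ** 2,  # O(N^2) two-hop candidates
        "kvsp_one_hop": k,                 # one relay path per virtual path
        "kvsp_at_most": 2 * k,             # upper bound with two-hop relays
    }


counts = candidate_path_counts(pool_size=100, k=10)
print(counts)
```

For a 100-node pool this gives on the order of 10,000 two-hop candidates for an exhaustive search, against at most 20 paths for k-VSP with k = 10.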

5.5 K-VSP Performance Evaluation

5.5.1 Simulation Settings

In this work, we use our synthetic AS-level Internet topology described in Section 3.7 to evaluate relay path selection algorithms, as well as to validate the performance comparisons obtained using random graph simulation in Section 4.4.

5.5.2 Performance Analysis

We first compare the latency and hop overlap of relay paths selected by the four considered methods. In this experiment, 1,000 source-destination pairs are randomly chosen in the AS graph to simulate P2P VoIP call demands. These calls are assumed to be satisfied with a one-way delay below 200ms. Following [56], we choose a relay delay of 20ms. The execution time of the breadth first search algorithms increases as the pool of relays grows. On average, it took about ten minutes on our Linux server (Intel Core 2 Duo CPU at 2.66GHz, 2 GB RAM) to find a solution for a 100-node pool. Therefore, large relay pool sizes are not feasible for VoIP relay path selection. The size of the relay pool should be limited to 10 or 20 nodes for these algorithms to be practical during the initial session of a VoIP call. To be conservative, however, we gradually increase the size of the relay node pool, which is used by the ASAP, EDR and MOCR algorithms for selecting relay candidates, from 1 to 200 relay peers and measure the corresponding relay path performance.
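The delay criterion used in this experiment can be sketched as follows. The sketch assumes that the one-way delay of a relay path is the sum of the delays of its overlay segments plus the 20ms relay delay at every relay node; the function names and the sample segment delays are hypothetical.

```python
from typing import Sequence

RELAY_DELAY_MS = 20.0     # relay delay per relay node, following the text
ONE_WAY_LIMIT_MS = 200.0  # one-way delay limit for an acceptable call


def relay_path_delay_ms(segment_delays_ms: Sequence[float],
                        relay_delay_ms: float = RELAY_DELAY_MS) -> float:
    """One-way delay of a relay path built from consecutive segments.

    A path with n segments (s -> r1 -> ... -> t) passes through n - 1 relays.
    """
    if not segment_delays_ms:
        raise ValueError("a path needs at least one segment")
    relays = len(segment_delays_ms) - 1
    return sum(segment_delays_ms) + relays * relay_delay_ms


def meets_limit(delay_ms: float, limit_ms: float = ONE_WAY_LIMIT_MS) -> bool:
    """True if the delay is strictly below the one-way limit."""
    return delay_ms < limit_ms


one_hop = relay_path_delay_ms([60.0, 70.0])        # one relay node
two_hop = relay_path_delay_ms([80.0, 60.0, 50.0])  # two relay nodes
print(one_hop, meets_limit(one_hop))
print(two_hop, meets_limit(two_hop))
```

The second sample shows how the extra relay delay can push a two-hop relay path over the limit even when its segments sum to less than 200ms.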

Firstly, we collect the latency of the relay paths generated by the algorithms. For each relay pool size, each algorithm generates 1,000 relay paths, one for each source-destination pair. Figure 5.5 shows the average delay of the four algorithms. The average delay values of the default path and the virtual shortest path are also shown for reference. The paths selected by EDR exhibit the most delay, since EDR is designed to select far relay nodes. On average, EDR path delays approach 200ms, i.e. the one-way delay limit of VoIP calls. On the other hand, the mean latency of ASAP relay paths decreases as the number of relay candidates increases. This indicates that ASAP can effectively find short relay paths if a large number of relay candidates is available. MOCR relay paths exhibit lower delay than EDR but higher delay than ASAP. Compared to the three breadth first search styled methods, the k-VSP algorithm generates much lower delay. The flat shape of the k-VSP curve shows that k-VSP paths do not depend on the relay pool size, since the k-VSP algorithm determines relay nodes by inferring them from the v-SP. Furthermore, the k-VSP algorithm execution time is normally less than a second in all our experiments. Thus, the k-VSP algorithm is more suitable with regard to latency and computation time.

Figures 5.6 and 5.7 present the CDF and the ranked values of the delay of all 1,000 relay paths selected by each algorithm. About 73.3% of the relay paths generated by the k-VSP algorithm have a delay lower than 200ms, which is almost equal to that of the default paths. The other methods achieve this for less than 64% of their paths.

Secondly, we collect hop overlap data from the generated paths. Figures 5.8 and 5.9 show the hop overlap statistics. Among the breadth first search algorithms, ASAP relay paths contain more hop overlaps than the others, as ASAP ignores the overlap attribute. EDR and MOCR exhibit similar overlap performance, and both schemes do a significantly better job than ASAP.
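A hop overlap count of this kind can be sketched as below, assuming that an overlap is an AS-level link shared by the relay path and the default path regardless of direction. The helper names are hypothetical, and the two paths are the ones used in the example of Section 5.4.1.

```python
from typing import FrozenSet, Sequence, Set


def links(path: Sequence[str]) -> Set[FrozenSet[str]]:
    """Undirected links traversed by a path given as a node sequence."""
    return {frozenset(pair) for pair in zip(path, path[1:])}


def hop_overlap(relay_path: Sequence[str], default_path: Sequence[str]) -> int:
    """Number of links that the relay path shares with the default path."""
    return len(links(relay_path) & links(default_path))


default_path = list("ADFHIGEC")  # default direct path A-D-F-H-I-G-E-C
relay_path = list("ADBEC")       # relay path through the multi-homed AS B
print(hop_overlap(relay_path, default_path))
```

Here the two paths share only the access links A-D and E-C, so a relay path that follows the virtual shortest path diverges from the default path everywhere else.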

Figure 5.5. Average one-way delay comparison between relay paths selected by algorithms, the default and the v-SP paths (with 95% confidence intervals)

Figure 5.6. CDF of one-way delay of relay paths selected by algorithms, the default and the v-SP paths

Figure 5.7. Session rank of one-way delays

Figure 5.8. Average overlap comparison between relay paths selected by algorithms (with 95% confidence intervals)

Figure 5.9. Percentage of overlap values compared between algorithms

In terms of hop overlap, the k-VSP algorithm produces relay paths superior to those of the breadth first search schemes. The EDR and MOCR rules only become comparable to the k-VSP when the size of the relay node pool reaches 50 peers.

Obviously, there always exists at least one path for each source-destination pair that has no hop overlap with the default path (as long as the source and destination are both multi-homed). Those paths, however, may not always be suitable for VoIP because they may have to traverse a long way to the destination. Therefore, existing VoIP relay path selection algorithms cannot utilise those paths, as their choices are normally limited by the upper bound of latency (i.e. 200ms one-way delay in our experiment). Within this delay limit, our network model indicates that paths stretching in the lower region of the solution space can obtain a smaller number of overlaps. As a result of selecting paths close to the v-SP, the k-VSP algorithm may also benefit from the low probability of overlap with the default paths.

One interesting question is whether selecting relay nodes in the v-SP is always the best choice. To answer it, we have collected the relay path length, measured in hop count, for all paths generated by the k-VSP algorithm as well as by the breadth first search schemes, and analysed it together with the corresponding latency and diversity data. The relationships are presented in Figures 5.10 and 5.11. Figure 5.10 shows that the k-VSP is more useful when the default path is rather long. For default paths of 4 hops or more, the k-VSP selects paths with lower overlap than EDR and MOCR. Using the proposed network model, this can be explained as follows: long default paths with short v-SP paths create larger solution spaces, and hence there is a larger gap between the default and the v-SP paths. As the k-VSP relays traffic via short paths close to the v-SP, it can enjoy better path diversity in the case of long default paths.

Figure 5.10. Average overlaps of relay path selected in different default path lengths (with 95% confidence intervals)

In Figure 5.11, this explanation also applies to path latency. We can see that with default paths longer than 5 hop-counts, the k-VSP can generate paths with delay values equal to or less than those of the default paths (on average). In our collected data, only a small number of default direct paths have hop-count distances larger than 9 (Figure 5.12). These values are displayed in Figures 5.10 and 5.11 with either no or large interval bars. We consider these values outliers and discard them from our statistics.

We now consider the path distance characteristic proposed in our network model in Section 5.3.3 and its relationship with the diversity degree. In this experiment, we randomly select 1,000 source-destination pairs for relaying voice traffic. The relay pool size in the three breadth first search algorithms is set to 20 relay candidates. For the selected relay paths, we measure the hop overlaps and the path distances to the v-SP paths and to the default paths.

Figure 5.11. Average delay of relay path selected in different default path lengths (with 95% confidence intervals)

Figure 5.12. Number of default paths in different default path lengths (with 95% confidence intervals)

Figure 5.13 shows that as the path distance between relay paths and their corresponding default paths increases, the number of hop overlaps between them decreases (except for some outliers). The result indicates that our proposed path distance metric is strongly related to hop overlap and can well represent the diversity degree of relay paths. Using the path distance data collected in this experiment, we apply (5.10) to classify the relay paths selected by the four algorithms. Figure 5.14 clearly shows that, inside the solution space, relay paths lying in the lower region always have a lower number of hop overlaps than those in the upper region. This strengthens the network model for approximating optimal relay paths proposed in this chapter. In Figure 5.14, notice that in the upper solution space the k-VSP relay paths exhibit a smaller average hop overlap than other paths. In contrast, its relay paths in the lower solution space have a higher mean hop overlap. This indicates that the k-VSP relay paths are closer to the v-SP than the others, as foreseen by our network model.

Figure 5.13. The relationship between path distance and hop overlap parameters (with 95% confidence intervals)

Figure 5.14. Classifying relay paths within the solution space and the corresponding mean values of overlap (with 95% confidence intervals)

Figure 5.15. Identifying relay paths inside and outside the solution space and the corresponding mean values of path distance (with 95% confidence intervals)

In Figure 5.15, we see that all relay paths generated by k-VSP are inside the solution space. The mean value of k-VSP path distances to the v-SP is also very small compared to others. We conclude that our proposed k-VSP algorithm can effectively generate relay paths close to virtual shortest paths, and hence, enjoy low latency and overlap.

5.5.3 Comparison between Two Versions of the k-VSP Algorithm

We first proposed the k-VSP algorithm in [143]. In this preliminary version, the k-VSP considers only a single relay node for each virtual path to relay voice traffic. Recall from Section 5.4.1 that a relay path can be brought closer to the virtual path by using a second relay hop if the path from the first relay node to the destination is not valid, i.e. it violates the valley-free rule. Therefore, we have modified the k-VSP algorithm so that it is forced to select the second relay hop in this case.

In this section, we compare the two k-VSP versions. We measure the latency and the number of hop overlaps generated by the two versions for 1,000 random source-destination pairs. Figures 5.16 and 5.17 illustrate the cumulative distributions of the latency and overlap results. In both categories, the complete k-VSP version slightly outperforms the preliminary version. At the 200ms delay threshold, the complete k-VSP keeps 84.6% of relay paths below the threshold, while the preliminary one achieves only 81.8%. In terms of hop overlap, 73.8% of the paths of the complete k-VSP contain 3 or fewer overlapping hops, while the preliminary version achieves only 69.2%. These results demonstrate that by choosing a second relay node when the valley-free rule is violated on the path from the first relay to the destination, the k-VSP algorithm can generate relay paths even closer to the optimal solutions, i.e. the virtual shortest paths.

5.6 Summary

This chapter proposes a new approach to determine the optimal solution space for selecting the best VoIP relay paths. Our proposed heuristic algorithm is demonstrated to work well in the case of long hop-count direct communication paths. In this situation, we conclude that in order to generate relay paths with minimal delay and hop overlaps, P2P VoIP systems should select relay nodes that lie along the virtual shortest paths so that the relay path can stretch to fully overlap them. Analytical data show that all relay paths generated by our proposed algorithm are inside the solution space, which accommodates up to 73.4% of good relay paths, while the best of the existing methods, in terms of latency, can only achieve up to 63.8% of good paths. In terms of path diversity, our proposed algorithm exhibits a very small number of hop overlaps compared with existing methods that use small relay pool sizes.

Figure 5.16. CDF of delay comparison between the two k-VSP versions

Figure 5.17. CDF of number of overlaps comparison between the two k-VSP versions

CHAPTER 6

CONCLUSION AND FUTURE WORK

6.1 Summary of Contributions

This dissertation studied peer-to-peer (P2P) relay path selection algorithms that are applicable to providing voice-over-IP (VoIP) service in large-scale networks. We first developed a heuristic for light-weight relay path selection. Based on extensive simulations and analyses, we then identified some important characteristics of peer relay selection. Finally, we formulated the overlay problem of optimal VoIP relay path selection, and proposed a network model and a heuristic algorithm that can effectively localise the optimal space. Our contributions can be summarised as follows:

1. In Chapter 3, we developed two network simulation packages that are used, respectively, to generate large-scale random autonomous system (AS)-level graphs and to simulate synthetic AS-level Internet topologies. The random AS graph simulator generates graphs whose links follow a power-law link distribution and carry type of relationship (ToR) annotations. The synthetic AS-level Internet reproduces the CAIDA annotated AS graphs and the iPLANE link delay database. The two simulators allow us to test various relay path selection algorithms in different network scenarios. Our work on these network simulations has been reported and used in [125], [128], [143] and made available for download at [126].

2. In Chapter 4, we introduced a heuristic for light-weight alternate relay path selection based on path latency and path diversity. We then provided an extensive range of simulations and analyses of relay path selection algorithms. First, we looked at the choice of relay nodes and its impact on the performance of VoIP overlay connections. We found that relaying traffic through top tier ASs (i.e. ASs with high node degree) shortens relay paths considerably. Secondly, we quantified a lower threshold for the first hop relay path length in hop-count. In the chapter, we also showed the conflict between the performance objectives of relay paths. Our work on this part has been reported in [128], [143].

3. In Chapter 5, we formulated the overall problem of optimising VoIP relay paths in a P2P network with regard to the path latency and path diversity objectives, and then proposed a network model for approximating the optimal space for selecting relay paths. We then described a complete method for selecting a near-optimal set of alternate relay paths in VoIP systems based on latency, hop overlaps and the valley-free nature of the current interdomain routing environment. We believe our findings have created a new approach for solving the problem of selecting optimal VoIP relay paths in large-scale networks.

6.2 Future Work

It is possible to expand the scope of the problem of selecting optimal P2P overlay paths in large-scale VoIP networks by including more objectives, such as loss probability and peer load. In this multi-objective problem, one of the most important aspects that needs to be studied carefully is how to prioritise the set of objectives with regard to the specific design requirements of P2P VoIP systems. A large-scale simulation platform that can concurrently simulate the two scenarios of VoIP calls through overlay networks, i.e. intra-AS and inter-AS calls, would also be a very useful research tool. The IP-level path information of the simulation may also provide opportunities to study cross-layer interaction in P2P overlay networks.

BIBLIOGRAPHY

[1] S. Shenker and J. Wroclawski, "General characterization parameters for integrated service network elements," IETF RFC 2215, September 1997.
[2] R. Braden, L. Zhang, S. Berson, A. Herzog, and S. Jamin, "Resource reservation protocol (RSVP) - Version 1 functional specification," IETF RFC 2205, September 1997.
[3] S. Blake, D. Black, M. Carlson, E. Davies, Z. Wang, and W. Weiss, "An architecture for differentiated services," IETF RFC 2475, December 1998.
[4] E. Rosen, A. Viswanathan, and R. Callon, "Multiprotocol label switching architecture," IETF RFC 3031, 2001.
[5] W. Cui, I. Stoica, and R. H. Katz, "Backup path allocation based on a correlated link failure probability model in overlay networks," in Proc. of 10th IEEE Intl. Conf. on Network Protocols (ICNP'02), 2002, pp. 236-247.
[6] A. Nakao, L. Peterson, and A. Bavier, "A routing underlay for overlay networks," in Proc. of ACM 2003 Conf. on Applications, Technologies, Architectures, and Protocols for Computer Communications (ACM SIGCOMM'03), 2003, pp. 11-18.
[7] C. Tang and P. K. McKinley, "A distributed multipath computation framework for overlay network applications," Michigan State University, Tech. Rep. MSU-CSE-04-18, 2004.
[8] T. Fei, S. Tao, L. Gao, and R. Guerin, "How to select a good alternate path in large peer-to-peer systems?," in Proc. of 25th Annu. Joint Conf. of the IEEE Computer and Communications (INFOCOM'06), 2006, pp. 1-13.
[9] T. Fei, S. Tao, L. Gao, R. Guerin, and Z.-l. Zhang, "Light-weight overlay path selection in a peer-to-peer environment," in Proc. of 25th Annu. Joint Conf. of the IEEE Computer and Communications (INFOCOM'06), 2006, pp. 1-6.
[10] J. Han, D. Watson, and F. Jahanian, "An experimental study of Internet path diversity," Dependable and Secure Computing, IEEE Transactions on, vol. 3, pp. 273-288, 2006.
[11] J. F. Nash, "Equilibrium points in n-person games," in Proc. of the National Academy of Sciences of the United States of America, pp. 48-49, 1950.
[12] M. Pióro and D. Medhi, Routing, flow, and capacity design in communication and computer networks: Morgan Kaufmann Publishers, 2004.
[13] D. Quagliarella, J. Périaux, C. Poloni, and G. Winter, Genetic algorithms and evolution strategy in engineering and computer science: recent advances and industrial applications: John Wiley & Sons, 1998.
[14] W. Zheng, X. Liu, S. Shi, J. Hu, and H. Dong, "Peer-to-peer: A technique perspective," in Handbook on theoretical and algorithmic aspects of sensor, ad hoc wireless, and peer-to-peer networks, J. Wu, Ed.: Auerbach Publications, 2006, pp. 591-616.
[15] X. Li and J. Wu, "Searching techniques in peer-to-peer networks," in Handbook on theoretical and algorithmic aspects of sensor, ad hoc wireless, and peer-to-peer networks, J. Wu, Ed.: Auerbach Publications, 2006, pp. 617-642.
[16] P. Ganesan, Q. Sun, and H. Garcia-Molina, "Yappers: A peer-to-peer lookup service over arbitrary topology," in Proc. of 22nd Annu. Joint Conf. of the IEEE Computer and Communications (INFOCOM'03), 2003, pp. 1250-1260.
[17] J. Wu, Handbook on theoretical and algorithmic aspects of sensor, ad hoc wireless, and peer-to-peer networks: Auerbach Publications, 2006.
[18] "Napster," [Online]. Available: http://www.napster.com.
[19] "Gnutella," [Online]. Available: http://gnutella.wego.com.
[20] Q. Lv, P. Cao, E. Cohen, K. Li, and S. Shenker, "Search and replication in unstructured peer-to-peer networks," in Proc. of the 16th ACM Intl. Conf. on Supercomputing (ACM ICS'02), 2002, pp. 84-95.

[21] B. Yang and H. Garcia-Molina, "Improving search in peer-to-peer networks," in Proc. of 22nd IEEE Intl. Conf. on Distributed Computing Systems (IEEE ICDCS'02), 2002, pp. 5-14.
[22] V. Kalogeraki, D. Gunopulos, and D. Zeinalipour-Yazti, "A local search mechanism for peer-to-peer networks," in Proc. of 11th ACM Intl. Conf. on Information and Knowledge Management (ACM CIKM'02), 2002, pp. 300-307.
[23] D. Tsoumakos and N. Roussopoulos, "Adaptive probabilistic search for peer-to-peer networks," in Proc. of 3rd IEEE Intl. Conf. on Peer-to-Peer Computing (IEEE P2P'03), 2003, pp. 102-109.
[24] A. Crespo and H. Garcia-Molina, "Routing indices for peer-to-peer systems," in Proc. of 22nd IEEE Intl. Conf. on Distributed Computing Systems (IEEE ICDCS'02), 2002, pp. 23-34.
[25] S. C. Rhea and J. Kubiatowicz, "Probabilistic location and routing," in Proc. of 21st Annu. Joint Conf. of the IEEE Computer and Communications Societies (INFOCOM'02), 2002, pp. 1248-1257.
[26] E. Cohen and S. Shenker, "Replication strategies in unstructured peer-to-peer networks," in Proc. of ACM Annu. Conf. of the Special Interest Group on Data Communication (ACM SIGCOMM'02), 2002.
[27] A. Singla and C. Rohrs, "Ultrapeers: Another step towards Gnutella scalability," Gnutella developer forum, 2002.
[28] I. Stoica, R. Morris, D. Karger, M. F. Kaashoek, and H. Balakrishnan, "Chord: A scalable peer-to-peer lookup service for internet applications," in Proc. of ACM Annu. Conf. of the Special Interest Group on Data Communication (ACM SIGCOMM'01), 2001, p. 160.

[29] A. Rowstron and P. Druschel, "Pastry: Scalable, distributed object location and routing for large-scale peer-to-peer systems," in Proc. of 18th Intl. Conf. of the Distributed Systems Platforms (IFIP/ACM'01), 2001, pp. 329-350.
[30] K. Hildrum, J. D. Kubiatowicz, S. Rao, and B. Y. Zhao, "Distributed object location in a dynamic network," in Proc. of 14th Annu. ACM Symp. on Parallel Algorithms and Architectures (ACM SPAA'02), Winnipeg, Manitoba, Canada, 2002, pp. 41-52.
[31] S. Ratnasamy, P. Francis, M. Handley, R. Karp, and S. Shenker, "A scalable content-addressable network," in Proc. of 2001 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications (ACM SIGCOMM'01), San Diego, California, United States: ACM, 2001.
[32] D. R. Karger and M. F. Kaashoek, "Koorde: A simple degree-optimal distributed hash table," in Proc. of 2nd Intl. Workshop on Peer-to-Peer Systems (IPTPS'03), Berkeley, CA, 2003.
[33] D. Malkhi, M. Naor, and D. Ratajczak, "Viceroy: A scalable and dynamic emulation of the butterfly," in Proc. of 21st ACM Annu. Symp. on Principles of Distributed Computing (ACM PODC'02), 2002, pp. 183-192.

[34] "Secure Hash Standard," vol. FIPS Pub 180-1, NIST, Ed.: Springfield, VA, 1995.
[35] K. N. Singh, "Reliable, scalable and interoperable internet telephony," PhD Thesis: Columbia University, 2006.
[36] P. Maymounkov and D. Mazieres, "Kademlia: A peer-to-peer information system based on the XOR metric," in Proc. of 1st Intl. Workshop on Peer-to-Peer Systems (IPTPS'02), Cambridge, USA, 2002, p. 2.
[37] A. T. Mizrak, Y. Cheng, V. Kumar, and S. Savage, "Structured superpeers: Leveraging heterogeneity to provide constant-time lookup," in Proc. of 3rd IEEE Workshop on Internet Applications (WIAPP'03), 2003.
[38] L. Garcés-Erice, E. W. Biersack, K. W. Ross, P. A. Felber, and G. Urvoy-Keller, "Hierarchical peer-to-peer systems," Lecture notes in computer science, pp. 1230-1239, 2003.
[39] I. Gupta, K. Birman, P. Linga, A. Demers, and R. Van Renesse, "Kelips: Building an efficient and stable P2P DHT through increased memory and background overhead," in Proc. of 2nd Intl. Workshop on Peer-to-Peer Systems (IPTPS'03), 2003.
[40] M. J. Freedman and D. Mazieres, "Sloppy hashing and self-organized clusters," in Proc. of 2nd Intl. Workshop on Peer-to-Peer Systems (IPTPS'03), 2003.
[41] B. Y. Zhao, Y. Duan, L. Huang, A. Joseph, and J. Kubiatowicz, "Brocade: Landmark routing on overlay networks," in Proc. of 1st Intl. Workshop on Peer-to-Peer Systems (IPTPS'02), 2002.
[42] Z. Xu, R. Min, and Y. Hu, "Hieras: A DHT based hierarchical P2P routing algorithm," in Proc. of 32nd Intl. Conf. on Parallel Processing (ICPP'03), 2003, pp. 187-196.

[43] "KaZaA," [Online]. Available: http://kazaa.com.
[44] "Skype blames outage on user reboot," [Online]. Available: http://technology.timesonline.co.uk/tol/news/tech_and_web/article2292536.ece.
[45] Y. J. Joung, L. W. Yang, and C. T. Fang, "Keyword search in DHT based peer-to-peer networks," IEEE Journal on Selected Areas in Communications, vol. 25, p. 46, 2007.
[46] S. Roy, H. Pucha, Z. Zhang, Y. C. Hu, and L. Qiu, "Overlay node placement: Analysis, algorithms and impact on applications," in Proc. of 27th Intl. Conf. on Distributed Computing Systems (ICDCS'07), 2007, pp. 53-53.
[47] H. Xie, A. Krishnamurthy, A. Silberschatz, and Y. R. Yang, "P4P: Explicit communications for cooperative control between P2P and network providers," [Online]. Available: http://www.dcia.info/documents/P4P_Overview.pdf, 2007.
[48] S. Seetharaman, V. Hilt, M. Hofmann, and M. Ammar, "Preemptive strategies to improve routing performance of native and overlay layers," in Proc. of 26th Annu. Joint Conf. of the IEEE Computer and Communications Societies (INFOCOM’07), 2007.

[49] J. Han and F. Jahanian, "Impact of path diversity on multi-homed and overlay networks," in Proc. of the 2004 IEEE Intl. Conf. on Dependable Systems and Networks (DSN’04), 2004.
[50] G. Caizzone, A. Corghi, P. Giacomazzi, and M. Nonnoi, "Analysis of the scalability of the overlay Skype system," in Proc. of IEEE Intl. Conf. on Communications (ICC'08), 2008, pp. 5652-5658.
[51] L. Breslau, P. Cao, L. Fan, G. Phillips, and S. Shenker, "Web caching and Zipf-like distributions: Evidence and implications," in Proc. of 18th Annu. Joint Conf. of the IEEE Computer and Communications (INFOCOM'99), 1999, pp. 126-134.
[52] M. Zhang, J. Lai, A. Krishnamurthy, L. Peterson, and R. Wang, "A transport layer approach for improving end-to-end performance and robustness using redundant paths," in Proc. of USENIX Annu. Technical Conf. (USENIX'04), 2004, pp. 99-112.
[53] M. Castro, P. Druschel, Y. C. Hu, and A. Rowstron, "Exploiting network proximity in distributed hash tables," in Proc. of Intl. Workshop on Future Directions in Distributed Computing (FUDICO'02), 2002, pp. 52-55.
[54] T. Locher, S. Schmid, and R. Wattenhofer, "equus: A provably robust and locality-aware peer-to-peer system," in Proc. of 6th IEEE Intl. Conference on Peer-to-Peer Computing (P2P’06), 2006.
[55] M. Waldvogel and R. Rinaldi, "Efficient topology-aware overlay network," ACM SIGCOMM Computer Communication Review, vol. 33, p. 106, 2003.
[56] S. Ren, L. Guo, and X. Zhang, "ASAP: an AS-aware peer-relay protocol for high quality VoIP," in Proc. of 26th IEEE Intl. Conf. on Distributed Computing Systems (ICDCS’06), 2006, pp. 70-70.

[57] T. S. E. Ng and H. Zhang, "Predicting Internet network distance with coordinates-based approaches," in Proc. of 21st Annu. Joint Conf. of the IEEE Computer and Communications (INFOCOM’02), 2002, pp. 170-179.
[58] S. A. Baset and H. G. Schulzrinne, "An analysis of the Skype peer-to-peer Internet telephony protocol," in Proc. of 25th Annu. Joint Conf. of the IEEE Computer and Communications (INFOCOM’06), 2006, pp. 1-11.
[59] "Methods for subjective determination of transmission quality," ITU-T Recommendation P.800, August 1996.
[60] "The E-model, a computational model for use in transmission planning," ITU-T Recommendation G.107, March 2003.
[61] J. Rosenberg, J. Weinberger, C. Huitema, and R. Mahy, "STUN-simple traversal of user datagram protocol (UDP) through network address translators (NATs)," IETF RFC 3489, 2003.
[62] "Global Index (GI)," [Online]. Available: http://www.skype.com/skype_p2p explained.html.
[63] S. Guha, N. Daswani, and R. Jain, "An experimental study of the Skype peer-to-peer VoIP system," in Proc. of Intl. Workshop on Peer-to-Peer Systems (IPTPS’06), 2006.
[64] Y. Yu, D. Liu, J. Li, and C. Shen, "Traffic identification and overlay measurement of Skype," in Proc. of 2006 Intl. Conf. on Computational Intelligence and Security, 2006, pp. 1043-1048.
[65] K. Suh, D. R. Figueiredo, J. Kurose, and D. Towsley, "Characterizing and detecting relayed traffic: A case study using Skype," in Proc. of 25th Intl. Conf. on Computer Communications (INFOCOM’06), 2006.
[66] W. Kho, S. A. Baset, and H. Schulzrinne, "Skype relay calls: Measurements and experiments," in Proc. of 27th IEEE Intl. Conf. on Computer Communications (INFOCOM), 2008, pp. 1-6.
[67] H. Schulzrinne and J. Rosenberg, "Internet telephony: Architecture and protocols – an IETF perspective," Computer Networks and ISDN Systems, vol. 31, pp. 237-255, 1999.
[68] J. Rosenberg, H. Schulzrinne, G. Camarillo, A. Johnston, J. Peterson, R. Sparks, M. Handley, and E. Schooler, "SIP: session initiation protocol," IETF RFC 3261, June 2002.
[69] I. Baumgart, B. Heep, and S. Krause, "OverSim: A flexible overlay network simulation framework," in Proc. of 10th IEEE Global Internet Symp. (GI'07), Anchorage, AK, USA, 2007, pp. 79-84.
[70] K. Singh and H. Schulzrinne, "Peer-to-peer internet telephony using SIP," in Proc. of 2005 ACM Intl. Workshop on Network and Operating System Support for Digital Audio and Video (ACM NOSSDAV'05), Skamania, Washington, USA, 2005, pp. 13-14.

[71] S. Baset, H. Schulzrinne, E. Shim, and K. Dhara, "Requirements for SIP-based Peer-to-Peer Internet Telephony," Internet Draft draft-baset-sipping-p2preq-00, Internet Engineering Task Force, Oct 2005.
[72] A. Johnston, "SIP, P2P, and Internet Communications," Internet Draft draft-johnston-sipping-p2p-ipcom-01, Internet Engineering Task Force, Mar 2005.
[73] D. Bryan, B. Lowekamp, and C. Jennings, "A P2P Approach to SIP Registration," Internet Draft draft-bryan-sipping-p2p-02, Internet Engineering Task Force, Mar 2006.
[74] C. Jennings, B. Lowekamp, E. Rescorla, S. Baset, and H. Schulzrinne, "Resource location and discovery (RELOAD) base protocol," Internet Draft draft-ietf-p2psip-base-02, Internet Engineering Task Force, 2009, work in progress.
[75] S. Rhea, B. Godfrey, B. Karp, J. Kubiatowicz, S. Ratnasamy, S. Shenker, I. Stoica, and H. Yu, "OpenDHT: a public DHT service and its uses," SIGCOMM Computer Communication Review, vol. 35, pp. 73-84, 2005.
[76] D. Andersen, H. Balakrishnan, F. Kaashoek, and R. Morris, "Resilient overlay networks," in Proc. of 18th ACM Symp. on Operating Systems Principles, Alberta, Canada, 2001.

[77] S. Savage, T. Anderson, A. Aggarwal, D. Becker, N. Cardwell, A. Collins, E. Hoffman, J. Snell, A. Vahdat, G. Voelker, and J. Zahorjan, "Detour: a case for informed internet routing and transport," IEEE Micro, vol. 19, no. 1, pp. 50-59, January 1999.
[78] T. Nguyen and A. Zakhor, "Path diversity with forward error correction (PDF) system for packet switched networks," in Proc. of 22nd Annu. Joint Conf. of the IEEE Computer and Communications (INFOCOM'03), 2003, pp. 663-672.
[79] B. Krishnamurthy and J. Wang, "On network-aware clustering of web clients," in Proc. of ACM Conf. on Applications, Technologies, Architectures, and Protocols for Computer Communication (ACM SIGCOMM'00), 2000, p. 110.
[80] "Route Views Project," [Online]. Available: http://www.routeviews.org/.
[81] "RIPE Network Coordination Centre," [Online]. Available: http://www.ripe.net/.
[82] P. R. McManus, "A passive system for server selection within mirrored resource environments using AS path length heuristics," Applied Theory Communications, New York, 1999.
[83] L. Gao, "On inferring autonomous system relationships in the Internet," IEEE/ACM Transactions on Networking, vol. 9, pp. 733-745, 2001.
[84] C. X. Dimitropoulos, "Measuring and modeling Internet routing for realistic simulations," PhD Thesis, School of Electrical and Computer Engineering, Georgia Institute of Technology, Georgia, United States, 2006.
[85] G. Di Battista, M. Patrignani, and M. Pizzonia, "Computing the types of the relationships between autonomous systems," in Proc. of 22nd Annu. Joint Conf. of the IEEE Computer and Communications (INFOCOM’03), 2003, pp. 156-165 vol.1.

[86] X. Jianhong and G. Lixin, "On the evaluation of AS relationship inferences," in Proc. of IEEE Global Telecommunications Conference (GLOBECOM'04), 2004, pp. 1373-1377 vol.3.
[87] T. Erlebach, A. Hall, A. Panconesi, and D. Vukadinovic, "Cuts and disjoint paths in the valley-free path model," Internet Mathematics, vol. 3, pp. 333-359, 2006.
[88] R. Cohen and D. Raz, "The Internet dark matter-on the missing links in the AS connectivity map," in Proc. of 25th IEEE International Conference on Computer Communications (INFOCOM 2006), 2006, pp. 1-12.
[89] X. Dimitropoulos and G. Riley, "Modeling autonomous-system relationships," in Proc. of 20th Workshop on Principles of Advanced and Distributed Simulation (PADS'06), 2006, pp. 143-149.
[90] "The CAIDA AS relationships dataset <28-Jul-2008>," [Online]. Available: http://www.caida.org/data/active/as-relationships/.
[91] R. Govindan and A. Reddy, "An analysis of Internet inter-domain topology and route stability," in Proc. of 16th Annu. Joint Conf. of the IEEE Computer and Communications (INFOCOM’97), 1997, pp. 850-857 vol.2.

[92] Z. Ge, D. R. Figueiredo, S. Jaiswal, and L. Gao, "On the hierarchical structure of the logical Internet graph," in Proc. of SPIE ITCOM, 2001.
[93] L. Subramanian, S. Agarwal, J. Rexford, and R. H. Katz, "Characterizing the Internet hierarchy from multiple vantage points," in Proc. of 21st Annu. Joint Conf. of the IEEE Computer and Communications (INFOCOM’02), 2002.
[94] X. Dimitropoulos, D. Krioukov, G. Riley, and K. Claffy, "Revealing the autonomous system taxonomy: The machine learning approach," in Proc. of 7th Passive and Active Measurement Workshop (PAM’06), 2006.
[95] "Autonomous System Taxonomy Repository," [Online]. Available: http://www.caida.org/data/active/as_taxonomy/.
[96] Z. M. Mao, J. Rexford, J. Wang, and R. H. Katz, "Towards an accurate AS-level traceroute tool," in Proc. of the 2003 Conf. on Applications, Technologies, Architectures, and Protocols for Computer Communications (ACM SIGCOMM’03): ACM New York, NY, USA, 2003, pp. 365-378.
[97] Z. M. Mao, D. Johnson, J. Rexford, J. Wang, and R. Katz, "Scalable and accurate identification of AS-level forwarding paths," in Proc. of 23rd Annu. Joint Conf. of the IEEE Computer and Communications (INFOCOM’04), 2004, pp. 1605-1615.
[98] R. Schollmeier and G. Kunzmann, "GnuViz-Mapping the Gnutella network to its geographical locations," in Praxis der Informationsverarbeitung und Kommunikation (PIK). vol. 26, 2003, pp. 74-79.
[99] S. Seetharaman and M. Ammar, "On the interaction between dynamic routing in the native and overlay layers," in Proc. of 25th Annu. Joint Conf. of the IEEE Computer and Communications (INFOCOM’06), 2006.
[100] A. Moffat and A. Turpin, Compression and coding algorithms. New York: Kluwer Academic Pub, 2002.

[101] D. Salomon, Data compression: The complete reference. New York: Springer, 1997.
[102] K. I. Calvert, M. B. Doar, and E. W. Zegura, "Modeling internet topology," IEEE Communications magazine, vol. 35, pp. 160-163, 1997.
[103] "PlanetLab - An open platform for developing, deploying and accessing planetary-scale services," [Online]. Available: http://www.planet-lab.org/.
[104] S. Floyd and V. Paxson, "Difficulties in simulating the Internet," IEEE/ACM Transactions on Networking (TON), vol. 9, p. 403, 2001.
[105] E. W. Zegura, K. L. Calvert, and S. Bhattacharjee, "How to model an internetwork," in Proc. of Annu. Joint Conf. of the IEEE Computer and Communications (INFOCOM’96), 1996, pp. 594-602.
[106] "GT Internetwork Topology Models," [Online]. Available: http://www.isi.edu/nsnam/ns/ns-topogen.html-gt-itm.
[107] M. B. Doar, "A better model for generating test networks," in Proc. of IEEE Global Telecommunications Conference (GLOBECOM '96), New York, 1996, pp. 86-93.

[108] J. H. Cowie, D. M. Nicol, and A. T. Ogielski, "Modeling the global internet," Computing in Science and Engineering, vol. 1, pp. 42-50, 1999.
[109] X. Dimitropoulos and G. Riley, "The BGP++ simulation," in Georgia Tech University: [Online]. Available: http://www.ece.gatech.edu/research/labs/MANIACS, 2005.
[110] X. Dimitropoulos, G. Riley, D. Krioukov, and R. Sundaram, "Towards a topology generator modeling AS relationships," ICNP (extended abstract), 2005.
[111] R. Cohen and D. Raz, "Acyclic type of relationships between Autonomous Systems," in Proc. of 26th Annu. Joint Conf. of the IEEE Computer and Communications (INFOCOM’07), 2007, pp. 1334-1342.
[112] M. Faloutsos, P. Faloutsos, and C. Faloutsos, "On power-law relationships of the internet topology," in Proc. of Conf. on Applications, Technologies, Architectures, and Protocols for Computer Communication (ACM SIGCOMM’99), 1999, pp. 251-262.
[113] A. L. Barabási and R. Albert, "Emergence of scaling in random networks," Science, vol. 286, pp. 509-512, 1999.
[114] R. Albert and A. L. Barabási, "Statistical mechanics of complex networks," Reviews of modern physics, vol. 74, pp. 47-97, 2002.
[115] X. Dimitropoulos, D. Krioukov, M. Fomenkov, B. Huffaker, Y. Hyun, and G. Riley, "AS relationships: Inference and validation," ACM SIGCOMM Computer Communication Review, vol. 37, pp. 29-40, 2007.
[116] D. E. Knuth, The Stanford GraphBase: A platform for combinatorial computing. New York: ACM Press, 1993.

[117] S. McCanne and S. Floyd, "Network Simulator ns-2," The Vint project, [Online]. Available: http://www.isi.edu/nsnam/ns.
[118] E. W. Dijkstra, "A note on two problems in connexion with graphs," Numerische mathematik, vol. 1, pp. 269-271, 1959.
[119] H. Tangmunarunkit, R. Govindan, S. Shenker, and D. Estrin, "The impact of routing policy on Internet paths," in Proc. of 20th Annu. Joint Conf. of the IEEE Computer and Communications (INFOCOM’01), 2001.
[120] L. Gao and F. Wang, "The extent of AS path inflation by routing policies," in Proc. of the IEEE Global Internet Symp., 2002.
[121] N. Spring, R. Mahajan, and T. Anderson, "Quantifying the causes of path inflation," in Proc. of the Conf. on Applications, Technologies, Architectures, and Protocols for Computer Communication (ACM SIGCOMM’03), 2003.
[122] Z. M. Mao, L. Qiu, J. Wang, and Y. Zhang, "On AS-level path inference," in Proc. of 2005 Intl. Conf. on Measurement and Modeling of Computer Systems (ACM SIGMETRICS’05), 2005, p. 349.
[123] J. Y. Yen, "Finding the k shortest loopless paths in a network," Management Science, vol. 17, pp. 712-716, 1971.
[124] J. Moy, "RFC2328: OSPF Version 2," RFC Editor United States, 1998.
[125] H. Agrawal, A. Jennings, and Q. D. Bui, "A hybrid approach for robust traffic engineering," in Proc. of 4th Intl. Symp. on Wireless Pervasive Computing (ISWPC’09), 2009, pp. 1-5.
[126] Q. D. Bui: [Online]. Available: http://sites.google.com/site/quangdbui/.
[127] "iPLANE: An Information Plane for Distributed Services," [Online]. Available: http://iplane.cs.washington.edu/data.html.
[128] Q. D. Bui and A. Jennings, "Relay path selection approaches in peer-to-peer VoIP systems," in Proc. of Australasian Telecommunication Networks and Application Conf. (ATNAC’08), 2008, pp. 361-366.
[129] "Skype," [Online]. Available: http://en.wikipedia.org/wiki/Skype#Usage_and_traffic.
[130] J. He and J. Rexford, "Toward internet-wide multipath routing," IEEE Network, vol. 22, pp. 16-21, 2008.
[131] C. M. Cheng, Y. S. Huang, H. T. Kung, and C. H. Wu, "Path probing relay routing for achieving high end-to-end performance," in Proc. of IEEE Global Telecommunications Conference (GLOBECOM '04), 2004.
[132] A. Akella, B. Maggs, S. Seshan, A. Shaikh, and S. Sitaraman, "A measurement-based analysis of multihoming," in Proc. of 2003 Conf. on Applications, Technologies, Architectures, and Protocols for Computer Communication (ACM SIGCOMM’03), 2003.

[133] K. P. Gummadi, H. V. Madhyastha, S. D. Gribble, H. M. Levy, and D. Wetherall, "Improving the reliability of Internet paths with One-hop Source Routing," in Proc. of 6th USENIX OSDI, 2004.
[134] B. Zhao, L. Huang, J. Stribling, A. Joseph, and J. Kubiatowicz, "Exploiting routing redundancy via structured peer-to-peer overlays," in Proc. of 11th IEEE Intl. Conf. on Network Protocols (ICNP’03), 2003.
[135] S. Tao, K. Xu, A. Estepa, T. Fei, L. Gao, R. Guerin, J. Kurose, D. Towsley, and Z.-L. Zhang, "Improving VoIP quality through path switching," in Proc. of 24th Annu. Joint Conf. of the IEEE Computer and Communications (INFOCOM’05), 2005, pp. 2268-2278 vol. 4.
[136] Z. Li and P. Mohapatra, "The impact of topology on overlay routing service," in Proc. of 23rd Annu. Joint Conf. of the IEEE Computer and Communications (INFOCOM’04), 2004, pp. 408-418.
[137] T. Fei, "On the relay selection strategy in large peer-to-peer networks," in Dept. of Electrical and Computer Engineering. PhD Thesis, University of Massachusetts Amherst, 2007.
[138] P. Francis, S. Jamin, V. Paxson, L. Zhang, D. F. Gryniewicz, and Y. Jin, "An architecture for a global internet host distance estimation service," in Proc. of 18th Annu. Joint Conf. of the IEEE Computer and Communications (INFOCOM’99), New York, 1999.
[139] P. Francis, S. Jamin, C. Jin, Y. Jin, D. Raz, Y. Shavitt, and L. Zhang, "IDMaps: A global Internet host distance estimation service," IEEE/ACM Transactions on Networking (TON), vol. 9, pp. 525-540, 2001.
[140] L. Tang and M. Crovella, "Virtual landmarks for the Internet," in Proc. of the 3rd ACM SIGCOMM Conf. on Internet measurement, Miami, Florida, 2003, p. 152.
[141] F. Dabek, R. Cox, F. Kaashoek, and R. Morris, "Vivaldi: A decentralized network coordinate system," in Proc. of the 2004 Conf. on Applications, technologies, architectures, and protocols for computer communications (SIGCOMM’04), 2004, pp. 15-26.
[142] S. Lee, "Exploiting network distance based Euclidean coordinates for the one hop relay selection," in Management Enabling the Future Internet for Changing Business and New Computing Services, vol. 5787/2009: Springer Berlin / Heidelberg, 2009, pp. 527-530.
[143] Q. D. Bui and A. Jennings, "Relay node selection in large-scale VoIP overlay networks," in Proc. of 1st Intl. Conf. on Ubiquitous and Future Networks (ICUFN’09), 2009.
[144] R. Bhandari, Survivable networks: algorithms for diverse routing. Kluwer Academic Pub, 1999.
[145] J. W. Suurballe, "Disjoint path in a network," Networks, vol. 4, pp. 125-145, 1974.
