Analysis of Multilayer-Encryption Anonymity Networks
ANALYSIS OF MULTILAYER-ENCRYPTION ANONYMITY NETWORKS

by Khalid Shahbar

Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy at Dalhousie University, Halifax, Nova Scotia, October 2017

© Copyright by Khalid Shahbar, 2017

It is with genuine gratitude that I dedicate my thesis to the greatest mother, my mother. You are the reason I reached this stage. Your support started long ago, when you taught me the letters of the alphabet. To my father, who passed away before I started my PhD, I wish to share this moment with you.

Table of Contents

List of Tables
List of Figures
Abstract
List of Abbreviations Used
Acknowledgements
Chapter 1 Introduction
  1.1 Research Objectives
  1.2 Contributions
  1.3 Structure
Chapter 2 Overview of Multilayer-encryption Anonymity Networks
  2.1 Tor Network
  2.2 JonDonym Network
  2.3 I2P Network
  2.4 Summary
Chapter 3 Related Literature
  3.1 Measuring Anonymity
  3.2 Identifying Anonymity Networks by Discovering Infrastructure
  3.3 Identifying Application on Top of Anonymity Networks
  3.4 Discovering Hidden Services
  3.5 Packet Inspection
  3.6 First N-Packets for Traffic Classification
  3.7 Summary
Chapter 4 Weighted Factors for Measuring Anonymity Services
  4.1 Proposed Factors
    4.1.1 The Level of Information Available for the Service Provider
    4.1.2 Blocking Anonymity and Obfuscation Options
    4.1.3 Application and Anonymity
    4.1.4 Authority and Logs
    4.1.5 Threat Models
  4.2 Evaluation
    4.2.1 Factor Calculation
    4.2.2 Weight Calculation
    4.2.3 Weighted Anonymity Factor
    4.2.4 Evaluation Case Study
    4.2.5 Expanding the Quantification
  4.3 Summary
Chapter 5 Anon17: Network Traffic Dataset of Anonymity Services
  5.1 Data Collection and Traffic Types
    5.1.1 Tor Data
    5.1.2 TorApp
    5.1.3 Tor PT
    5.1.4 I2PApp80BW
    5.1.5 I2PApp0BW
    5.1.6 I2PUsers
    5.1.7 I2PApp
    5.1.8 JonDonym
  5.2 Dataset Features and Format
  5.3 Summary
Chapter 6 Research Methodology
  6.1 Data Collections
  6.2 Machine Learning Algorithms
    6.2.1 C4.5
    6.2.2 Random Forests
    6.2.3 Naive Bayes
    6.2.4 Bayesian Network
  6.3 Flow Exporters
  6.4 Summary
Chapter 7 Experiments on the Identification of Anonymity Networks
  7.1 Tor Behaviour to Circuits and Flows Analysis
    7.1.1 Cells in the Tor Network
    7.1.2 Circuit Level Classification
      7.1.2.1 Cells Per Circuit Life Time
      7.1.2.2 Uplink Cells
      7.1.2.3 The Ratio of the Downlink Cells to the Uplink Cells
      7.1.2.4 Exponentially Weighted Moving Average (EWMA)
    7.1.3 Flow Level Classification
    7.1.4 Evaluation of Circuit and Flow Level Approaches
      7.1.4.1 Setup
      7.1.4.2 Circuit Level Classification Data
      7.1.4.3 Flow Level Classification Data
    7.1.5 Performance Metrics
    7.1.6 Results and Discussion
      7.1.6.1 Circuit Level Classification Results
      7.1.6.2 Flow Level Classification Results
      7.1.6.3 The Performances of the Classifiers Employed
  7.2 The Effects of Shared Bandwidth on I2P Tunnels
    7.2.1 Data Collection and Setup
      7.2.1.1 Browsing
      7.2.1.2 Instant Relay Chat
      7.2.1.3 Downloading Files Using Torrent (I2PSnark)
    7.2.2 Data Analysis
      7.2.2.1 Tunnel-Based Data Analysis
      7.2.2.2 Applications and User-Based Data Analysis
    7.2.3 Clustering Tunnels Using SOM
    7.2.4 Discussion
  7.3 Summary
Chapter 8 Traffic Flow Analysis of Obfuscated Traffic
  8.1 Tor Pluggable Transports
    8.1.1 Data Collection
      8.1.1.1 Obfs3 Traffic
      8.1.1.2 FTE Traffic
      8.1.1.3 Scramblesuit Traffic
      8.1.1.4 Meek Traffic
      8.1.1.5 Flashproxy Traffic
      8.1.1.6 Other Traffic
    8.1.2 Pluggable Transport Flow Analysis
      8.1.2.1 Split and Cross-Validation Analysis
      8.1.2.2 Reduced Number of Features
      8.1.2.3 Binary Classification
    8.1.3 Discussion
  8.2 JonDonym Traffic Forwarding
    8.2.1 JonDonym Flow Behaviour
    8.2.2 TCP/IP and Skype Forwarding
  8.3 Summary
Chapter 9 Packet Momentum
  9.1 Packet Behaviour in Anonymity Networks
  9.2 Proposed Features
    9.2.1 Maximum Packet Size
    9.2.2 Frequency of Maximum Packet Size
    9.2.3 Second Maximum Packet Size
    9.2.4 Second Maximum Packet Size Frequency
    9.2.5 Packet Sequence
    9.2.6 Sequence Speed
    9.2.7 Packet Momentum
  9.3 Traffic Analysis Using Packet Momentum
    9.3.1 Anonymity Network Identifications
    9.3.2 Identification of Applications and Anonymity Networks
  9.4 Packet Momentum Validation
    9.4.1 Number of Packets
    9.4.2 Number of Features
  9.5 Performance Under Different Classifiers
  9.6 Summary
Chapter 10 Conclusion
  10.1 Dataset
  10.2 Anonymity Measurement
  10.3 Machine Learning Algorithms
  10.4 Traffic Flow Analysis of Anonymity Networks
  10.5 Efficiency and Accuracy Using Packet Momentum
  10.6 Future Work
Bibliography
Appendices
Appendix A Calculation of the Features on Packet Momentum
  A.1 Calculation of Packet Sequence
  A.2 Calculation of Sequence Speed
  A.3 Calculation of Packet Momentum
Appendix B Packet Momentum Pseudo code

List of Tables

Table 4.1 Default browser settings for anonymity services.
Table 4.2 Proposed anonymity factors.
Table 4.3 Calculating the weights.
Table 4.4 Final weights of the factors.
Table 4.5 Evaluated factors for users (A), (B) and (C).
Table 5.1 The number of traffic flows in each data set.
Table 5.2 Anon17 data set features.
Table 7.1 Flow exporter attributes.
Table 7.2 Circuit level classification results.
Table 7.3 Flow level classification results - uniform classes.
Table 7.4 Flow level classification results - downsampled classes.
Table 7.5 Methods used to achieve the best accuracy.
Table 7.6 Binary classifier on the tunnels.