Measuring, Understanding and Modelling Internet Traffic
PRODUCED ON ACID-FREE PAPER

MEASURING, UNDERSTANDING AND MODELLING INTERNET TRAFFIC

Nicolas Hohn

SUBMITTED IN TOTAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

JULY 2004

DEPARTMENT OF ELECTRICAL AND ELECTRONIC ENGINEERING
THE UNIVERSITY OF MELBOURNE
AUSTRALIA

To my parents, for their love, encouragement and constant support, without whom nothing would be.

Abstract

This thesis concerns measuring, understanding and modelling Internet traffic. We first study the origins of the statistical properties of Internet traffic, in particular its scaling behaviour, and propose a constructive model of packet traffic with physically motivated parameters. We base our analysis on a large amount of empirical data measured on different networks, and use a so-called semi-experimental approach to isolate the features of traffic we seek to model. These results lead to the choice of a particular Poisson cluster process, known as the Bartlett-Lewis point process, for a new packet traffic model. This model has a small number of parameters with simple networking meaning, and is mathematically tractable. It allows us to gain valuable insight into the underlying mechanisms creating the observed statistics.

In practice, Internet traffic measurements are limited by the very large amount of data generated by high bandwidth links. This leads us to also investigate traffic sampling strategies and their respective inversion methods. We argue that the packet sampling mechanism currently implemented in Internet routers is not practical when one wants to infer the statistics of the full traffic from partial measurements. We advocate the use of flow sampling for many purposes. We show that such a sampling strategy is much easier to invert and can give reasonable estimates of higher order traffic statistics, such as the distribution of the number of packets per flow and the spectral density of the packet arrival process.
This inversion technique can also be used to fit the Bartlett-Lewis point process model from sampled traffic.

We complete our understanding of Internet traffic by focusing on the small scale behaviour of packet traffic. To do so, we use data from a fully instrumented Tier-1 router and measure the delays experienced by all the packets crossing it. We present a simple router model capable of reproducing the measured packet delays, and propose a scheme to export router performance information based on busy period statistics. We conclude this thesis by showing how the Bartlett-Lewis point process can model the splitting and merging of packet streams in a router.

Declaration

This is to certify that:
(i) the thesis comprises only my original work;
(ii) due acknowledgement has been made in the text to all other material used; and
(iii) the thesis is less than 80000 words in length, exclusive of tables, maps, bibliographies, appendices and footnotes.

Nicolas Hohn

Preface

The work presented in this thesis is the result of original research conducted by the author. Parts of it have been published, or submitted for publication, as follows:

Chapters 3 and 4:
[81] N. Hohn, D. Veitch, and P. Abry, “Investigating the scaling behaviour of Internet flow arrivals”, in Proc. International Conference on Self-Similarity and Applications, Annales Mathématiques Blaise Pascal, Clermont-Ferrand, France, May 2002.
[80] N. Hohn, D. Veitch, and P. Abry, “Does fractal scaling at the IP level depend on TCP flow arrival processes?”, in Proc. ACM Internet Measurement Workshop, pp. 63–68, Marseille, France, November 2002.
[83] N. Hohn, D. Veitch, and P. Abry, “The impact of the flow arrival process in Internet traffic”, in Proc. IEEE ICASSP, pp. VI 37–40, Hong Kong, April 2003.
[3] P. Abry, P. Flandrin, N. Hohn, and D. Veitch, “Invariance d’échelle dans l’Internet”, in Proc. Colloque Mesure de l’Internet, Nice, France, May 2003.
[82] N. Hohn, D. Veitch, and P. Abry, “Cluster Processes, a Natural Language for Network Traffic”, IEEE Transactions on Signal Processing, Special Issue on Signal Processing in Networking, 51(8):2229–2244, August 2003.
[173] D. Veitch, N. Hohn, and P. Abry, “Multifractality in TCP/IP traffic: the case against”, (submitted).

Chapter 5:
[78] N. Hohn and D. Veitch, “Inverting sampled traffic”, in Proc. ACM Internet Measurement Conference, pp. 222–233, Miami, USA, October 2003. Best student paper award.
[79] N. Hohn and D. Veitch, “Inverting sampled traffic”, IEEE/ACM Transactions on Networking, (fast track submission).

Chapter 6:
[142] K. Papagiannaki, D. Veitch, and N. Hohn, “Origins of microcongestion in an access router”, in Proc. Passive and Active Measurement Workshop, Antibes, France, April 2004.
[84] N. Hohn, D. Veitch, K. Papagiannaki, and C. Diot, “Bridging router performance and queueing theory”, in Proc. ACM SIGMETRICS conference, New York, USA, June 2004.

Chapter 7:
[85] N. Hohn, D. Veitch and T. Ye, “Splitting and merging of a traffic model: validation”, (submitted).

Acknowledgements

If you have an apple and I have an apple and we exchange these apples, then you and I still each have one apple. But if you have an idea and I have an idea and we exchange these ideas, then each of us will have two ideas.
George Bernard Shaw

I would like to thank Darryl Veitch, my PhD advisor, for his support, guidance and availability. I thoroughly enjoyed working and “exchanging ideas” with him. I will look back on our enlightening late night discussions and desperate moments before deadlines with fond memories. He made my PhD studies a great experience, as much scientifically as personally.

Thanks go to Iven Mareels and Stephen Hanly, members of my PhD committee, for their assistance and suggestions over the course of my work and in the preparation of this thesis.
Financial support from the Commonwealth government of Australia through an International Postgraduate Research Scholarship, from the University of Melbourne, Ericsson and the Australian Research Council Special Research Center for Ultra-Broadband Information Networks was crucial to the successful completion of this project and is gratefully acknowledged.

The story that led me to leave the French Alps and complete a PhD in Australia is too long and too incredible to be fully recounted here. A couple of moments stand out: a job offer from the Bionic Ear Institute just days before I was due to reluctantly leave Australia to complete my military duties in France, and a fax from the Vice-Chancellor of the University of Melbourne to support my visa application when I was about to be deported. I cannot thank the people involved in these life-changing events enough.

Studying in Australia for my MSc and my PhD has been an amazing journey, not because of all the miles flown, but because I met some great people along the way. From a research perspective, I was very lucky to work at École Normale Supérieure de Lyon (France) with Patrice Abry on multiple occasions. I am grateful to the people at the Cooperative Association for Internet Data Analysis in San Diego (USA), École Normale Supérieure de Paris, Intel Research Cambridge and Laboratoire d’Informatique de Paris VI for their kind hospitality and financial support during my short visits. I would also like to give special thanks to the folks from the IP group at Sprint Advanced Technology Laboratories in San Francisco (USA) for making my stay there such a great experience.

On a more personal note, I would like to thank my friend Jean for taking me mountaineering on Makalu 2 in the Himalayas and thus showing me that one can still have a life during a PhD. I am also grateful to all the amazing people from the Melbourne University Mountaineering Club with whom I shared some wonderful adventures and epics.
Being so far from home means that I did not see my family as much as I would have wished. I thank them all for their support and understanding.

Last but not least, I would like to thank Andrea for coping with my working hours and my long overseas trips, and for bringing so much to my life over the years.

Melbourne, Australia
May 2004
Nicolas Hohn

Contents

List of Tables
List of Figures
Principal Notations

1 Introduction
  1.1 The Internet
    1.1.1 History and fundamentals
    1.1.2 Organization
  1.2 Philosophy and aims of this thesis
  1.3 Teletraffic engineering
    1.3.1 Definition
    1.3.2 Traffic modelling
  1.4 Internet traffic models
    1.4.1 Black box traffic models
    1.4.2 Physical models
  1.5 Contributions and thesis outline
    1.5.1 Contributions
    1.5.2 Outline
  1.6 How to read this thesis

2 Mathematical background
  2.1 Introduction
  2.2 Self-similarity and other scaling behaviours
    2.2.1 Self-similarity
    2.2.2 Long-Range Dependence
    2.2.3 Multifractals
    2.2.4 Infinitely Divisible Cascades
  2.3 Point Processes
    2.3.1 Introduction
    2.3.2 Definitions
    2.3.3 Moments
    2.3.4 Density spectrum
    2.3.5 Operations on point processes
  2.4 Wavelet analysis
    2.4.1 Definition
    2.4.2 Properties
    2.4.3 Estimation
    2.4.4 Making sense at small scales
  2.5 Conclusion

3 Empirical observations and semi-experiments
  3.1 Introduction
  3.2 The data and data processing
    3.2.1 Passive measurements
    3.2.2 First observations
    3.2.3 IP flow decomposition
    3.2.4 Central observations: biscaling and heavy tails
  3.3 Flow arrival process