Multi-Layer Network Monitoring and Analysis
Total Page:16
File Type:pdf, Size:1020Kb
UCAM-CL-TR-571 Technical Report ISSN 1476-2986 Number 571 Computer Laboratory Multi-layer network monitoring and analysis James Hall July 2003 15 JJ Thomson Avenue Cambridge CB3 0FD United Kingdom phone +44 1223 763500 http://www.cl.cam.ac.uk/ c 2003 James Hall This technical report is based on a dissertation submitted April 2003 by the author for the degree of Doctor of Philosophy to the University of Cambridge, King's College. Some figures in this document are best viewed in colour. If you received a black-and-white copy, please consult the online version if necessary. Technical reports published by the University of Cambridge Computer Laboratory are freely available via the Internet: http://www.cl.cam.ac.uk/TechReports/ Series editor: Markus Kuhn ISSN 1476-2986 Abstract Passive network monitoring offers the possibility of gathering a wealth of data about the traffic traversing the network and the communicating processes generating that traffic. Significant advantages include the non-intrusive nature of data capture and the range and diversity of the traffic and driving applications which may be observed. Conversely there are also associated practical difficulties which have restricted the usefulness of the technique: increasing network bandwidths can challenge the capacity of monitors to keep pace with passing traffic without data loss, and the bulk of data recorded may become unmanageable. Much research based upon passive monitoring has in consequence been limited to that using a sub-set of the data potentially available, typically TCP/IP packet headers gathered using Tcpdump or similar monitoring tools. The bulk of data collected is thereby minimised, and with the possible exception of packet filtering, the monitor's available processing power is available for the task of collection and storage. As the data available for analysis is drawn from only a small section of the network protocol stack, detailed study is largely confined to the associated functionality and dynamics in isolation from activity at other levels. Such lack of context severely restricts examination of the interaction between protocols which may in turn lead to inaccurate or erroneous conclusions. The work described in this dissertation attempts to address some of these limitations. A new passive monitoring architecture | Nprobe | is presented, based upon ‘off the shelf' components and which, by using clusters of probes, is scalable to keep pace with current high bandwidth networks without data loss. Monitored packets are fully captured, but are subject to the minimum processing in real time needed to identify and associate data of interest across the target set of protocols. Only this data is extracted and stored. The data reduction ratio thus achieved allows examination of a wider range of encapsulated protocols without straining the probe's storage capacity. Full analysis of the data harvested from the network is performed off-line. The activity of interest within each protocol is examined and is integrated across the range of protocols, allowing their interaction to be studied. The activity at higher levels informs study of the lower levels, and that at lower levels infers detail of the higher. A technique for dynamically modelling TCP connections is presented, which, by using data from both the transport and higher levels of the protocol stack, differentiates between the effects of network and end- process activity. The balance of the dissertation presents a study of Web traffic using Nprobe. Data collected from the IP, TCP, HTTP and HTML levels of the stack is integrated to identify the patterns of network activity involved in downloading whole Web pages: by using the links contained in HTML documents observed by the monitor, together with data extracted from the HTML headers of downloaded contained objects, the set of TCP connections used, and the way in which browsers use them, are studied as a whole. An analysis of the degree and distribution of delay is presented and contributes to the understanding of performance as perceived by the user. The effects of packet loss on whole page download times are examined, particularly those losses occurring early in the lifetime of connections before reliable estimations of round trip times are established. The implications of such early packet losses for pages downloads using persistent connections are also examined by simulations using the detailed data available. 5 Acknowledgements I must firstly acknowledge the very great contribution made by Micromuse Ltd., without whose financial support for three years it would not have been possible for me to undertake the research leading to this dissertation, and to the Computer Laboratory whose further support allowed me to complete it. Considerable thanks is due to my supervisor, Ian Leslie, for procuring my funding, for his advice and encouragement, his reading and re-reading of the many following chapters, his suggestions for improvement and, perhaps above all, his gift for identifying central issues on the occasions when I have become bogged down in the minutiae. Thanks also to Ian Pratt who is responsible for many of the concepts underlying the Nprobe monitor, and who has been an invaluable source of day-to{day and practical advice and guidance. Although the work described here has been largely an individual undertaking, it could not have proceeded satisfactorily without the background of expertise, discussion, imaginative ideas and support provided by my past and present colleagues in the Systems Research Group. Special mention must be made of Derek McAuley who initially employed me, of Simon Cosby, who encouraged me to make the transition from research assistant to PhD student, and of James Bulpin for his ever-willing help in the area of system administration: to them, to Steve Hand, Tim Harris, Keir Fraser, Jon Crowcroft, Dickon Reed, Andrew Moore, and to all the others, I am greatly indebted. Many others in the wider community of the Computer Laboratory and University Computing Service have also made a valuable contribution: to Margaret Levitt for her encouragement over the years, to Chris Cheney and Phil Cross for their assistance in providing monitoring facilities, to Martyn Johnson and Piete Brookes for accommodating unusual demands upon the Laboratory's systems, and to many others, I offer my thanks. Thank you to those who have rendered assistance in proof-reading the final draft of this dissertation: to Anna Hamilton, and to some of those mentioned above | you know who you are. Very special love and thanks goes to my wife, Jenny, whose continual encouragement and support has underpinned all of my efforts. She has taken on a vastly disproportionate share of our domestic and child-care tasks, despite her own work, and has endured much undeserved neglect to allow me to concentrate on my work. My love and a big thank-you also go to Charlie and Sam who have seen very much less of their daddy than they have deserved, and to Jo and Luke, `my boys', with whom I have recently sunk fewer pints than I should. Finally, the r^ole of an unlikely player deserves note. In the early 1980's I was unwillingly conscripted into the ranks of Maggie's army | the hundreds of thousands made unemployed in one of the cruelest and most regressive pieces of social engineering ever to be inflicted upon this country. For many it spelled unemployment for the remainder of their working lives. With time on my hands I embarked upon an Open University degree, the first step on a path which lead to the Computer Laboratory, the Diploma in Computer Science, and eventually to the writing of this dissertation | I was one of the fortunate ones. 7 Table of Contents 1 Introduction 21 2 Background and Related Work 31 3 The Nprobe Architecture 45 4 Post-Collection Analysis of Nprobe Traces 79 5 Modelling and Simulating TCP Connections 111 6 Non-Intrusive Estimation of Web Server Delays Using Nprobe 141 7 Observation and Reconstruction of World Wide Web Page Downloads 159 8 Page Download Times and Delay 175 9 Conclusions and Scope for Future Work 193 A Class Generation with SWIG 201 B An Example of Trace File Data Retrieval 205 C Inter-Arrival Times Observed During Web Server Latency Tests 213 Bibliography 219 9 Contents Table of Contents 7 List of Figures 14 List of Tables 16 List of Code Fragments and Examples 17 List of Acronyms 18 1 Introduction 21 1.1 Introduction . 21 1.2 Motivation . 22 1.2.1 The Value of Network Monitoring . 22 1.2.2 Passive and Active Monitoring . 22 1.2.3 Challenges in Passive Monitoring . 23 1.3 Multi-Protocol Monitoring and Analysis . 26 1.3.1 Interrelationship and Interaction of Protocols . 26 1.3.2 Widening the Definition of a Flow . 27 1.4 Towards a More Capable Monitor . 27 1.5 Aims . 29 1.6 Structure of the Dissertation . 29 2 Background and Related Work 31 2.1 Data Collection Tools and Probes . 31 2.2 Wide-Deployment Systems . 33 2.3 System Management and Data Repositories . 34 2.4 Monitoring at Ten Gigabits per Second and Faster . 34 2.5 Integrated and Multi-Protocol Approaches . 35 2.6 Flows to Unknown Ports . 38 2.7 Measurement Accuracy . 39 2.8 Studies Based upon Passively Monitored Data . 39 2.8.1 A Brief Mention of some Seminal Studies . 39 2.8.2 Studies of World Wide Web Traffic and Performance . 40 2.9 Summary . 42 10 Contents 3 The Nprobe Architecture 45 3.1 Design Goals . 45 3.2 Design Overview . 46 3.2.1 Principal Components . 46 3.2.2 Rationale . 48 3.3 Implementation . 51 3.3.1 Packet Capture . 51 3.3.2 Scalability . 54 3.3.3 On-Line Packet Processing and Data Extraction .