Flow-Based Compromise Detection Rick Hofstede Graduation Committee
Chairman: Prof. dr. P.M.G. Apers
Promotor: Prof. dr. ir. A. Pras
Co-promotor: Prof. Dr. rer. nat. G. Dreo Rodosek
Members:
dr. A. Sperotto, University of Twente, The Netherlands
Prof. dr. P.H. Hartel, University of Twente, The Netherlands; Delft University of Technology, The Netherlands; TNO Cyber Security Lab, The Netherlands
Prof. dr. ir. L.J.M. Nieuwenhuis, University of Twente, The Netherlands
Prof. dr. ir. C.T.A.M. de Laat, University of Amsterdam, The Netherlands
Prof. dr. ir. H.J. Bos, VU University, The Netherlands
Prof. Dr. rer. nat. U. Lechner, Universität der Bundeswehr München, Germany
Prof. Dr. rer. nat. W. Hommel, Universität der Bundeswehr München, Germany

Funding sources
EU FP7 UniverSelf - #257513
EU FP7 FLAMINGO Network of Excellence - #318488
EU FP7 SALUS - #313296
EIT ICT Labs - #13132 (Smart Networks at the Edge)
SURFnet GigaPort3 project for Next-Generation Networks

CTIT Ph.D. thesis series no. 16-384
Centre for Telematics and Information Technology (CTIT)
P.O. Box 217, 7500 AE Enschede, The Netherlands

ISBN: 978-90-365-4066-7
ISSN: 1381-3617
DOI: 10.3990/1.9789036540667
http://dx.doi.org/10.3990/1.9789036540667

Typeset with LaTeX. Printed by Gildeprint, The Netherlands. Cover design by David Young.

This thesis is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.
http://creativecommons.org/licenses/by-nc-sa/3.0/

This thesis has been printed on paper certified by FSC (Forest Stewardship Council).

FLOW-BASED COMPROMISE DETECTION

THESIS

to obtain the degree of doctor at the University of Twente, on the authority of the Rector Magnificus, prof. dr. H. Brinksma, on account of the decision of the graduation committee, to be publicly defended on Wednesday, June 29, 2016 at 16:45 by Richard Johannes Hofstede, born on May 28, 1988 in Ulm, Germany.

This thesis has been approved by:
Prof. dr. ir. A. Pras (promotor)
Prof. Dr. rer. nat. G. Dreo Rodosek (co-promotor)

Acknowledgements

The list of people who contributed in one way or another to this thesis is almost endless. To avoid forgetting anyone in these acknowledgements, I simply want to thank everyone who worked with me during my career as a Ph.D. student. You have helped shape this thesis into what it has become. Thank you!

Rick Hofstede
Diepenveen, May 29, 2016

Abstract

Brute-force attacks are omnipresent and manifold on the Internet, and aim at compromising user accounts by issuing large numbers of authentication attempts on applications and daemons. Widespread targets of such attacks are Secure SHell (SSH) and Web applications, for example. The impact of brute-force attacks and the resulting compromises is often severe: once a machine is compromised, attackers gain access to it, allowing it to be misused for all sorts of criminal activities, such as sharing illegal content and participating in Distributed Denial of Service (DDoS) attacks.

While the number of brute-force attacks is ever-increasing, we have seen that only a few brute-force attacks actually result in a compromise. It is these compromised devices, however, that require attention by security teams, as they may be misused for all sorts of malicious activities. We therefore propose a new paradigm in this thesis for monitoring network security incidents: compromise detection. Compromise detection allows security teams to focus on what is really important, namely detecting those hosts that have been compromised instead of all hosts that have been attacked. Speaking metaphorically, one could say that we target scored goals, instead of just shots on goal.

A straightforward approach for compromise detection would be host-based, by analyzing network traffic and log files on individual hosts.
Although this typically yields high detection accuracies, it is infeasible in large networks: these networks may comprise thousands of hosts, controlled by many people, on which agents need to be installed. In addition, host-based approaches lack a global attack view, i.e., which hosts in the same network have been contacted by the same attacker. We therefore take a network-based approach, where sensors are deployed at strategic observation points in the network. The traditional approach would be packet-based, but both high link speeds and high data rates make the deployment of packet-based approaches rather expensive. In addition, the fact that more and more traffic is encrypted renders the analysis of full packets useless. Flow-based approaches, however, aggregate individual packets into flows, providing major advantages in terms of scalability and deployment.

The main contribution of this thesis is to prove that flow-based compromise detection is viable. Our approach consists of several steps. First, we select two target applications, Web applications and SSH, which we found to be important targets of attacks on the Internet because of the high impact of a compromise and their wide deployment. Second, we analyze protocol behavior, attack tools and attack traffic to better understand the nature of these attacks. Third, we develop software for validating our algorithms and approach. Besides using this software for our own validations (in which we use log files as ground truth), our open-source Intrusion Detection System (IDS) SSHCure is extensively used by other parties, allowing us to validate our approach on a much broader basis.

Our evaluations, performed on Internet traffic, have shown that we can achieve detection accuracies between 84% and 100%, depending on the protocol used by the target application, the quality of the dataset, and the type of the monitored network.
Also, the wide deployment of SSHCure, as well as other prototype deployments in real networks, has shown that our algorithms can actually be used in production deployments. As such, we conclude that flow-based compromise detection is viable on the Internet.

Contents

1 Introduction
  1.1 Compromise Detection
  1.2 Network Monitoring
  1.3 Objective, Research Questions & Approach
  1.4 Contributions
  1.5 Thesis Organization

I Generic Flow Monitoring

2 Flow Measurements
  2.1 History & Context
  2.2 Flow Monitoring Architecture
  2.3 Packet Observation
  2.4 Flow Metering & Export
  2.5 Data Collection
  2.6 Lessons Learned
  2.7 Conclusions

3 Flow Measurement Artifacts
  3.1 Related Work
  3.2 Case Study: Cisco Catalyst 6500
  3.3 Experiment Setup
  3.4 Artifact Analysis
  3.5 Conclusions

II Compromise Detection

4 Compromise Detection for SSH
  4.1 Background
  4.2 SSH Attack Analysis
  4.3 Detecting SSH Brute-force Attacks
  4.4 Analysis of Network Traffic Flatness
  4.5 Including SSH-specific Knowledge
  4.6 Conclusions

5 Compromise Detection for Web Applications
  5.1 Background
  5.2 Histograms for Intrusion Detection
  5.3 Detection Approach
  5.4 Validation
  5.5 Conclusions

III Resilient Detection

6 Resilient Detection
  6.1 Background & Contribution
  6.2 DDoS Attack Metrics
  6.3 Detection Algorithms
  6.4 Validation
  6.5 Feasibility
  6.6 Conclusions

7 Conclusions
  7.1 Research Questions
  7.2 Discussion

Appendix A: Minimum Difference of Pair Assignment (MDPA)
  A.1 Calculation
  A.2 Normalization

Appendix B: CMS Backend URLs

Bibliography
Acronyms
About the Author

CHAPTER 1
Introduction

The Internet has become a critical infrastructure that facilitates most digital communications in our daily lives, such as banking traffic, secure communications and video calls. This makes the Internet a prime attack target for criminals, nation-states and terrorists, for example [61]. When it comes to attacks, there are two different classes that regularly make it to the news, namely those that are volumetric and aim at overloading networks and systems, such as Distributed Denial of Service (DDoS) attacks, and those that are particularly sophisticated, such as Advanced Persistent Threats (APTs). This is also confirmed by [158], but interestingly enough, it also reports brute-force attacks to be among the Top-3 of network attacks on the Internet [158], as shown in Figure 1.1. Although these attacks have existed for years, their popularity is still increasing [145], [147]. One of the main targeted services of these attacks is Secure SHell (SSH), a powerful protocol that allows for controlling systems remotely. The compromise of a target immediately results in adversaries gaining unprivileged control and the impact