Linköping University | Department of Computer and Information Science
Master’s thesis, 30 ECTS | Datateknik
2021 | LIU-IDA/LITH-EX-A--2021/013--SE

Generating Datasets Through the Introduction of an Attack Agent in a SCADA Testbed
A methodology of creating datasets for intrusion detection research in a SCADA system using IEC-60870-5-104

Swedish title: Hur en SCADA testmiljö med IEC-60870-5-104 protokollet under attack kan skapa data att använda för nätverksbaserade intrångdetekteringssystem (How a SCADA test environment using the IEC-60870-5-104 protocol under attack can create data for use in network-based intrusion detection systems)

August Fundin

Supervisor: Chih-Yuan Lin
Examiner: Simin Nadjm-Tehrani

Linköpings universitet, SE-581 83 Linköping, +46 13 28 10 00, www.liu.se

Upphovsrätt (Copyright)
This document is kept available on the Internet, or its future replacement, for 25 years from the date of publication, provided that no extraordinary circumstances arise. Access to the document implies permission for anyone to read, download, and print single copies for personal use, and to use it unchanged for non-commercial research and for teaching. A later transfer of the copyright cannot revoke this permission. All other use of the document requires the author's consent. Technical and administrative solutions exist to guarantee authenticity, security, and accessibility. The author's moral rights include the right to be named as the author to the extent required by good practice when the document is used as described above, as well as protection against the document being altered or presented in a form or context that is offensive to the author's literary or artistic reputation or distinctive character. For further information about Linköping University Electronic Press, see the publisher's website http://www.ep.liu.se/.

Copyright
The publishers will keep this document online on the Internet, or its possible replacement, for a period of 25 years starting from the date of publication, barring exceptional circumstances.
The online availability of the document implies permanent permission for anyone to read, to download, or to print out single copies for his/her own use and to use it unchanged for non-commercial research and educational purposes. Subsequent transfers of copyright cannot revoke this permission. All other uses of the document are conditional upon the consent of the copyright owner. The publisher has taken technical and administrative measures to assure authenticity, security and accessibility. According to intellectual property law the author has the right to be mentioned when his/her work is accessed as described above and to be protected against infringement. For additional information about the Linköping University Electronic Press and its procedures for publication and for assurance of document integrity, please refer to its www home page: http://www.ep.liu.se/.

© August Fundin

Abstract
In December 2015 a power outage was caused by a hacking attack in Ukraine. This further highlighted the ongoing increase of attacks on critical infrastructure and the vulnerabilities of the aging industrial control systems governing it. Supervisory Control and Data Acquisition (SCADA) is an example of such a system. Studying the intrusion of adversaries and anomalies in SCADA systems is no easy feat. Administrators of SCADA systems rarely share data as they risk getting their weaknesses detected. Hence, datasets containing this data need to be acquired through other means. In this study, a SCADA testbed simulating a real-world counterpart was used to create datasets for intrusion detection. As the testbed had no previously documented attacks, this study also investigated how the testbed reacted to generated attacks. This study focused on attacks on the communication protocol IEC-60870-5-104. The chosen approach to obtain datasets was to construct a so-called attack-bot, generating attacks during scenarios where network traffic was recorded.
After a scenario, a user has access to labeled network traffic, ready to be used when training intrusion detection systems. This kind of data is traditionally challenging to create. There are few publicly available high-quality testbeds, and generating data without a testbed comes with a whole set of difficulties. The results illustrate how this study’s approach can generate high-quality data with a rather small effort.

Acknowledgments
I would like to thank Chih-Yuan Lin and Simin Nadjm-Tehrani, my supervisor and my examiner, for their guidance as well as the valuable feedback and discussions throughout my work. I would also like to give hearty thanks to Erik Westring, Peter Andersson and Tommy Gustafsson at FOI for aiding me with RICS-el. And finally, thanks to all of you who gave me much-needed encouragement when I needed it!

Contents
Abstract
Acknowledgments
Contents
List of Figures
List of Tables
Abbreviations
1 Introduction
  1.1 Motivation
  1.2 Aim
  1.3 Research Questions
  1.4 Delimitations
  1.5 Thesis Outline
2 Background
  2.1 SCADA
  2.2 IEC-60870-5-104
  2.3 SCADA Vulnerabilities
  2.4 SCADA Exploits
  2.5 RICS-el
3 Related Work
  3.1 Dataset Generation
  3.2 Attack Types and Attack Evaluation
4 Methodology of Dataset Generation
  4.1 Attack-Bot Implementation
  4.2 Experiment Setup
  4.3 Dataset Generation Workflow
5 Attack Generation in RICS-el
  5.1 Attack Model
  5.2 Attack Scenario Implementation
6 Method of Evaluation
  6.1 Dataset Requirements
  6.2 Evaluation of Datasets Requirements
  6.3 Attack Impact Evaluation
7 Results and Evaluations
  7.1 Impact of Attacks
  7.2 Created Datasets
  7.3 Review of Requirements
8 Discussion
  8.1 Results
  8.2 Method
  8.3 Sources
  8.4 The Work in a Wider Context
9 Conclusion
  9.1 Dataset Creation in RICS-el
  9.2 Attack Generation in RICS-el
  9.3 Future Work
A Appendix Attack-Bot Configurations
  A.1 List of Flags
  A.2 Configfile Options
B Appendix Attack-Code Template
Bibliography

List of Figures
2.1 An overview of SCADA
2.2 APDU with fixed and variable length
2.3 APCI control field formats
2.4 Information contained in an ASDU
2.5 Overview of RICS-el
2.6 Interactions between bots and SCADA in RICS-el
4.1 Network configuration in the experiment setup
4.2 The attack-bot in RICS-el’s dataflow
4.3 The dataset generation workflow
4.4 Running scheduled attack scenarios
4.5 Flowchart of iterative dataset evaluation

List of Tables
2.1 Common ASDU functions in RICS-el
5.1 Overview of implemented attack-scenarios
5.2 IP addresses of IEC-104 devices
6.1 Attack success criteria
7.1 Result of the scanning attack
7.2 Results of the DoS attacks
7.3 Results of the sequence attack
7.4 Results of the MitM attacks
7.5 Result of the replay attack
7.6 Results of the injection attacks
7.7 Recorded datasets
7.8 Operator actions in each scenario
A.1 List of flags

Abbreviations
APCI  Application Protocol Control Information.
APDU  Application Protocol Data Unit.
ARP  Address Resolution Protocol.
ASDU  Application Service Data Unit.
CoT  Cause of Transmission.
CSV  Comma Separated Values.
DMZ  Demilitarized Zone.
DoS  Denial-of-Service.
FOI  Swedish Defence Research Agency.
HMI  Human-Machine-Interface.
ICMP  Internet Control Message Protocol.
ID  Identity.
IDS  Intrusion Detection System.
IE  Information Element.
IEC  International Electrotechnical Commission.
IEC-104  IEC-60870-5-104.
IO  Information Object.
IOA  Information Object Address.
IP  Internet Protocol.
IT  Information Technology.
ITF  Invalid Time Flag.
LAN  Local Area Network.
MitM  Man-in-the-Middle.
NIDS  Network Intrusion Detection System.
NSTB  National SCADA Test Bed.
NTP  Network Time Protocol.
ORG  Originator Address.
OT  Operational Technology.
pcap  Packet Capture.
PLC  Programmable Logic Controllers.
RICS  Resilient Information and Control Systems.
RTT  Round-Trip Time.
RTU  Remote Terminal Units.
S3  SUTD Security Showdown.
SCADA  Supervisory Control and Data Acquisition.
SQ  Structure Qualifier.
SSH  Secure Shell.
STARTDT  Start Data Transfer.
STOPDT  Stop Data Transfer.
SUTD  Singapore University of Technology and Design.
TCP  Transmission Control Protocol.
TCP/IP  Internet protocol suite.
TESTFR  Test Frame.
TTL  Time To Live.
VM  Virtual Machine.
VPN  Virtual Private Network.
WAN  Wide Area Network.

1 Introduction

SCADA is a control system that encompasses both the devices interfacing with physical machinery and the computers of geographically distributed critical infrastructure, such as power grids. Organizations managing power grids need SCADA systems to control and monitor safe and reliable operations [10].

SCADA systems and their protocols were previously used in isolated networks with proprietary solutions. However, this has changed over the last decades. Components are now standardized instead of specialized, to improve maintainability. Instead of proprietary software, more publicly known software is used to ease the integration of systems. Connections between SCADA networks and the organization’s corporate networks have been added. These changes have made SCADA systems easier to operate. At the same time, they have also made SCADA systems more vulnerable. The connections to the corporate network give intruders new ways to penetrate the system. Devices and protocols that then become exposed often have known vulnerabilities [10].

Cyberattacks targeting SCADA systems are undeniably happening in today’s society. The power grid cyberattack in Ukraine, December 2015, is believed to be the first example of a power outage deliberately caused by a hacking attack [25]. Since then, there has been an increase in reports of attacks on SCADA systems with malicious intent [41, 29]. Security researchers need to find ways of detecting anomalies and intrusions in SCADA systems.