DEGREE PROJECT IN INFORMATION AND COMMUNICATION TECHNOLOGY, SECOND CYCLE, 30 CREDITS STOCKHOLM, SWEDEN 2017

Implementation of a reference model of a typical IT infrastructure of the office network of a power utility company

DIMITRIOS VASILEIADIS

KTH ROYAL INSTITUTE OF TECHNOLOGY SCHOOL OF ELECTRICAL ENGINEERING

Abstract

Power utility companies play an important role in daily life, since they deliver the electrical power on which today's society depends. With the advance of technology, many of the procedures that these companies used to perform manually to deliver electrical power have been automated and are centrally controlled by Supervisory Control And Data Acquisition (SCADA) systems. This automation must therefore be protected from external attackers who want to harm control systems (e.g. SCADA), either by stealing sensitive data or by gaining control of the control system and changing parameters and functions that are essential for its correct operation. Exploiting vulnerabilities in the office network can bring an adversary a step closer to gaining access to the control system. It is not sufficient on its own, but the adversary can launch further attacks from there that target the control system.

The aim of this thesis is to construct a reference model of a typical IT infrastructure of the office network of a power utility company, with a simplified implementation in CRATE. CRATE (Cyber Range And Training Environment) is the environment that was used for the implementation of the thesis, provided by the Swedish Defense Research Agency (FOI). After the implementation is finished, a SCADA system of an enterprise will be installed in CRATE and connected to this office network. Once this is done, the Swedish Defense Research Agency will simulate cyber-attacks in a more complete infrastructure. The point of this thesis is to make the office infrastructure as close as possible to a real enterprise network, although there are some differences, partly on purpose and partly due to limitations.

Sammanfattning (Summary)

The existence of power utility companies is essential in today's society, since they deliver the electricity on which important societal functions depend. Today, advanced technology has made it possible to turn various procedures that were previously carried out manually to deliver electricity into procedures that are automated and centrally controlled via Supervisory Control And Data Acquisition (SCADA) systems. It is therefore of utmost importance to protect the automated system from external attackers, i.e. IT intrusions that aim to damage the control system in use, e.g. a SCADA control system. Damage can be done either by stealing sensitive data or by gaining control of the control system and changing parameters and functions that are essential for a well-functioning system. Exploiting vulnerabilities in the office network can bring an adversary a step closer to gaining access to the control system.

The purpose of this study is to construct a reference model of a typical IT infrastructure of the office network of a power utility company, with a simplified implementation in the Cyber Range And Training Environment (CRATE). CRATE is an environment for cyber exercises provided by the Swedish Defense Research Agency (Totalförsvarets forskningsinstitut, FOI), and it was used in this study to implement the reference model. Once the reference model of the office network has been implemented, a SCADA system will also be installed in CRATE and the two will be connected. FOI will thereby be able to simulate cyber-attacks in a more complete infrastructure. The aim of this study was to create an office network that is as realistic as possible, despite some differences from reality due to, among other things, certain limitations.

Acknowledgements

This thesis has been one of the most rewarding projects I have encountered at KTH and in my entire academic career. The knowledge I gained from this thesis is amazing. I would like to thank Prof. Mathias Ekstedt for giving me the opportunity to work on this thesis and to get to know the department that he works for, and for his constant feedback and support. He has a great will to assist his students as far as he can and to drive them to the best possible outcome. I would also like to thank the PhD candidate Matus Korman, who was my supervisor for this thesis. I express nothing but gratitude for the patience he showed with all the questions that I kept asking him, which most of the time felt totally irrelevant to the subject. I would also like to thank Mr. Asif Iqbal, a new PhD candidate at the department, who upon his arrival brought a lot of his knowledge of the industry, and not only that, a knowledge that we lack as students. Special gratitude goes to my friends for their support, not only during these two years at KTH but throughout my entire academic career. The best part always comes at the end, and this is where I would like to express my gratitude to my family, not only for their financial support all these years, but also for their psychological support, which has been a tremendous help, and for believing in me through my ups and downs! Thank you all!

Contents

Abstract

Sammanfattning

Acknowledgements

Contents

List of Figures

1 Introduction
1.1 Introduction to the subject area
1.2 SCADA systems
1.3 Cyber-attacks on SCADA networks
1.4 Corporate environment
1.5 CRATE
1.6 Scope and goal of the thesis
1.7 Thesis overview

2 Background
2.1 Literature review
2.2 Relevant concepts
2.3 Protocols
2.3.1 Transport Layer Protocols
2.3.2 Application Layer Protocols

3 Method and Implementation
3.1 Method
3.1.1 Validation
3.2 Implementation
3.2.1 Demilitarized zone
3.2.2 Office network
3.2.3 Intranet
3.2.4 Engineering network

4 Data flows
4.1 DMZ
4.2 Office network
4.3 Intranet
4.4 Engineering network

5 Discussion and conclusion

References

A APPENDIX - Virtual machines configuration

B APPENDIX - Thesis validation

C APPENDIX - Data flows source code

D APPENDIX - Screenshots from CRATE

List of Figures

1 Source: 2015 Dell Annual Security Report
2 CRATE Architecture [4]
3 Metamodel of CySeMoL's Probabilistic Relational Model (PRM) - [35]
4 OSI model - Created with [54]
5 RICS-el enterprise - Figure created with [54]
6 The DMZ network of the RICS-el enterprise - Figure created with [54]
7 The Office network of the RICS-el enterprise - Figure created with [54]
8 The Intranet network of the RICS-el enterprise - Figure created with [54]
9 The SCADA network of the RICS-el enterprise - Figure created with [54]
10 Screen-shot of the RICS-el enterprise
11 Screen-shot of the DMZ network of the RICS-el enterprise
12 Screen-shot of the Office network of the RICS-el enterprise
13 Screen-shot of the Intranet network of the RICS-el enterprise
14 Screen-shot of the SCADA network of the RICS-el enterprise

1 Introduction

1.1 Introduction to the subject area

When we refer to the IT (Information Technology) infrastructure of an enterprise, we refer to the composite hardware, software, network resources and services that the enterprise uses. IT has become necessary for enterprises to stay competitive. It supports communication between people (e.g. mail, VoIP, web servers and browsers), the storage of different kinds of data (e.g. file systems, file servers, databases) and operations of technical administration (e.g. of IT itself), and it even supports and actually controls industrial processes (e.g. production facilities, chemical facilities, power companies). Of course, IT does more than that: every modern car relies on a large amount of IT, without which it would not even start its engine.

As can be understood from the above, IT plays a big part in today's enterprises. This is why many companies have their own IT departments, and in many of them these are spread over different locations, whether they have one department or several. Besides securing the systems and the network of a company from external cyber-intruders, physical security of the servers and the systems may also be required in order for a company to stay up and running.

There have been many attacks on the IT systems of companies that led to financial and operational losses for either a short or a long period of time. Such an attack can be carried out either by attacking a system directly or by planting a file or trojan in another system, letting it reside there for a while, and then attacking the targeted system. This can be countered if the entire infrastructure of an enterprise is modelled and well defined with respect to the connections between its systems. Such a model makes it possible to test the infrastructure against various attacks and to prevent them.

1.2 SCADA systems

Supervisory Control And Data Acquisition (SCADA) systems have evolved greatly over the last years. Whereas these systems used to be standalone, with custom hardware and custom software, there are now many pieces of software and hardware to choose from according to the needs of the enterprise. This has reduced costs in many areas, including operational and maintenance costs. As a result, SCADA systems control a large percentage of critical infrastructures worldwide, including nuclear power plants, power transmission and electricity generating plants [1].

However, the benefits of present-day SCADA systems also come with downsides. One of them is their vulnerability to intrusions. An attacker can exploit vulnerabilities either in the computers on which SCADA systems are installed or in the network on which the SCADA systems operate. Without appropriate security measures, the entire infrastructure of an enterprise can be compromised.

In the scenario that this thesis presents, a SCADA system of a typical power utility enterprise is considered. The SCADA system supervises the condition of electricity generating plants and of cables in the entire infrastructure through measurements from sensors, and it controls whether switches shall turn on or off. These are some of the functions that we expect our SCADA system to have.

1.3 Cyber-attacks on SCADA networks

According to the Dell SonicWALL Security Annual Threat Report for 2015 [2], the number of attacks that targeted SCADA networks doubled in 2014 compared to 2013. In January 2014 alone there were 675,186 attacks, compared to 163,228 attacks in January 2013, which is roughly a fourfold increase for that month. These networks have been attacked in various ways, with buffer overflow vulnerabilities being the main attack method, accounting for almost 25% of the attacks.

Figure 1: Source: 2015 Dell Annual Security Report

The graph in Figure 1 shows the different types of attacks that occurred against SCADA networks in 2014 and their respective percentages. A share of these attacks starts from the office network, which increases the need to secure the infrastructure.

1.4 Corporate environment

One of the places where an intrusion can start is the corporate environment. For example, an employee of the company might download a file from the Internet to his or her computer that contains malware (such as a trojan, virus or worm) capable of spreading and infecting the entire office network. For the implementation of this thesis, a simple office network is considered, with few differences from a typical office network. A working environment typically contains cell phones, printers, computers etc. There can also be an intranet where different types of systems are installed, such as an enterprise resource planning system or a price book management system. Furthermore, a demilitarized zone exists in most office networks, typically hosting a mail server, a web server and possibly also an Internet proxy.

Due to the advance of technology and the fact that control systems have taken on more and more responsibilities, a separation had to be made as a first stage of prevention against cyber-attacks. If a SCADA system is installed on a computer located in the office network, it is more exposed to adversaries, because the office network is connected to the Internet; with the SCADA network separated from the office network, this is not the case. However, the SCADA system can still be affected through the office network, whenever it connects to the Internet to get updates or when there is careless information exchange between the two networks.
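The separation described above can be pictured as a small graph of network zones. The following is a minimal sketch only: the zone names follow this thesis, but the allowed links in `ALLOWED_LINKS` and the helper `shortest_path` are assumptions made for illustration and do not represent the actual firewall rules of the implementation.

```python
from collections import deque

# Hypothetical links between zones, assumed for illustration only.
ALLOWED_LINKS = {
    "Internet": {"DMZ"},
    "DMZ": {"Office"},
    "Office": {"Intranet", "Engineering"},
    "Intranet": set(),
    "Engineering": {"SCADA"},
    "SCADA": set(),
}


def shortest_path(links, start, goal):
    """Breadth-first search for the shortest zone-to-zone path, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        # Sort neighbours so the result is deterministic.
        for nxt in sorted(links.get(path[-1], ())):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None


if __name__ == "__main__":
    print(shortest_path(ALLOWED_LINKS, "Internet", "SCADA"))
```

Under these assumed links, an adversary on the Internet has no direct link to the SCADA zone but can still reach it through the office and engineering zones, which is the indirect exposure that the paragraph above describes.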

1.5 CRATE

Cyber Range And Training Environment (CRATE) is an environment created by the Swedish Defense Research Agency (FOI) [3]. It is a controlled environment with substantial resources, where many virtual machines can be deployed and configured smoothly. It also contains traffic emulators that can be used to generate traffic in the network. CRATE is used mostly for creating computer networks and performing cyber security experiments and exercises [4].

CRATE has been developed thanks to a donation from the Swedish Meteorological and Hydrological Institute (SMHI) [5] and the national supercomputer center to the Swedish Defense Research Agency. Old computers from these two organizations were used as a hardware platform for exercises, and investments were later made to support new hardware and maintain the infrastructure.

Figure 2: CRATE Architecture [4]

As can be seen in Figure 2, the CRATE hardware consists of 800 servers. There is a web-based tool that can be used to create computer networks by configuring the network infrastructure, and it is also possible to deploy different operating systems and applications on each computer of the network separately, as well as to specify users on each machine. All of this is located in the game network, which is the network that we as users are allowed to use. The administration network is managed by FOI [4].

CRATE also has a Key Management Service (KMS). A KMS is used to activate volume-licensed Microsoft [6] products in bulk. It allows Microsoft products to be activated within the internal network (in this case the CRATE network) without having to connect to a Microsoft server for product activation. There is a KMS server, and the client computers of the internal network locate it with the help of the Domain Name System (DNS) [7]. Clients try to contact the server every 7 days in order to renew their activation, because the activation of products on clients is valid for only 180 days. If a client cannot reach the KMS server before its activation expires, the machine becomes unlicensed. KMS supports only certain Microsoft operating systems [8].

In order to connect to CRATE, a VPN box is used, which provides access to the network managed by the Swedish Defense Research Agency. Additionally, there is a pre-configured virtual machine image that grants access to the game network and the web-based tool.
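The KMS timing described above can be illustrated with a little arithmetic. The sketch below is a simplification that uses only the two figures given in the text (a 180-day validity period and a 7-day renewal interval); the function names are hypothetical, and grace periods or other details of the real activation mechanism are deliberately ignored.

```python
ACTIVATION_VALIDITY_DAYS = 180  # validity period stated in the text
RENEWAL_INTERVAL_DAYS = 7       # renewal attempt interval stated in the text


def days_remaining(days_since_last_renewal):
    """Days of activation left after the last successful renewal (never negative)."""
    return max(0, ACTIVATION_VALIDITY_DAYS - days_since_last_renewal)


def is_licensed(days_since_last_renewal):
    """Assumption: the client counts as licensed until the validity period has elapsed."""
    return days_since_last_renewal < ACTIVATION_VALIDITY_DAYS


def renewal_attempts_within_validity():
    """Number of 7-day renewal attempts that fit inside one validity period."""
    return ACTIVATION_VALIDITY_DAYS // RENEWAL_INTERVAL_DAYS


if __name__ == "__main__":
    print(days_remaining(7), is_licensed(179), renewal_attempts_within_validity())
```

The point of the sketch is that a single failed contact is harmless under these figures: a client has many renewal attempts within one validity period, so it only becomes unlicensed after the KMS server has been unreachable for a long time.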

1.6 Scope and goal of the thesis

The scope of this thesis is to implement the infrastructure of an office network of a power utility enterprise. A SCADA system of an enterprise will then be connected to this infrastructure, and cyber-attacks will be carried out against the entire infrastructure to check for vulnerable spots.

There are two goals for this thesis. The first is realism: the aim is to create an infrastructure that is as realistic as possible, meaning that it could actually exist in the real world as the infrastructure of a power utility company. The second goal is the usability of the reference model, that is, how easy it is to carry out cyber-attack exercises in this infrastructure. After the completion of this thesis, the implementation will remain in CRATE and, once the infrastructure is fully developed (configurations etc.), it will be used for different cyber-attack exercises run by FOI. The plan is to make the reference model as easy to use as possible for anyone who uses this infrastructure for that purpose.

1.7 Thesis overview

Chapter 1 provides an introduction to this project, some general information about SCADA systems, cyber-attacks that have been carried out on these systems and some information about office networks. Chapter 2 provides a literature review for the subject and explains the protocols and concepts relevant to this project. Chapter 3 describes the methodology that this thesis work followed, the validation of the thesis and the implementation of the project in extensive detail, including which machines are used in each network. Chapter 4 describes the data flows that should exist in the infrastructure. Chapter 5 contains a discussion of the topic and the conclusion, as well as proposals for future work based on this thesis project.

There are also four appendices. Appendix A contains information about the source images of each virtual machine, and Appendix B contains two interviews with industry experts concerning the validation of this thesis. Appendix C contains the source code of some basic data flows that can be used in the infrastructure to generate network traffic, and Appendix D contains screenshots of the implementation of the infrastructure in the web tool provided by CRATE.

2 Background

2.1 Literature review

The following methodology was used for the literature review of this project. Three large databases of scientific papers were mainly used for the research: Google Scholar [9], the Association for Computing Machinery [10] and Science Direct [11]. The search was based on keywords such as "power utility companies", "office IT infrastructures", "SCADA systems" and "office networks".

The literature on office IT infrastructure is limited, especially for the office of a power utility enterprise. Some papers are close to this work, such as [12], which emphasizes the importance of the problem of vulnerability and risk analysis of critical infrastructures. [13] addresses cyber-security in critical infrastructures by proposing a new SCADA framework that contains real-time monitoring, anomaly detection, impact analysis and mitigation strategies.

However, a large part of the literature focuses entirely on the security of SCADA communication networks. A small part of this literature is briefly mentioned here. [14] focuses on how to improve SCADA security and proposes two ways to do so: (a) wrapping SCADA protocols with external cryptographic tools and (b) enhancing SCADA protocols with cryptographic techniques. The authors also test these proposals and provide results. [15] describes twenty-four risk assessment methods that have been either developed for or applied to SCADA systems. [16] gives a thorough review of documented standards for SCADA networks and reviews the state of the art in the communication and security aspects of SCADA. Finally, [17] gives an explanatory review of security measures taken in SCADA systems used in electric power grids.

The VIKING report [18] contributed a lot to this thesis. The VIKING objectives were:
• To investigate the vulnerability of SCADA systems and the cost of cyber-attacks on society
• To propose and test strategies and technologies to mitigate these weaknesses
• To increase the awareness of the importance of critical infrastructures and the need to protect them

There is one more paper that focuses on security issues in SCADA networks [19]. It gives an overview of research issues concerned with strengthening the security of SCADA networks and discusses the general threats and vulnerabilities that these networks face.

At the beginning of this thesis, the book [20] was studied extensively in order to understand how SCADA works when used for power systems and smart grids. It describes those systems in detail and explains each component of a SCADA network.

A paper written in 1996 [21] gives a definition of what a security infrastructure is and provides a list of objectives and functions that, according to the authors, a secure infrastructure must have.

Furthermore, part of the literature focuses on the security of critical infrastructures, such as [22], which proposes a SCADA framework with the following four components: 1) real-time monitoring, 2) anomaly detection, 3) impact analysis and 4) mitigation strategies. Along with this, an impact analysis method is developed that is used to evaluate system, scenario and leaf-level vulnerabilities.

In this project, a reference model of a typical power utility company was created. A reference model defines a reference architecture, and efforts have been made to provide generic reference models for different types of architectures, such as energy systems [23], and for different purposes, such as financial ones [24]. The main purpose of reference models is to reuse the architectural knowledge that they provide, and possibly to extend and modify them further.

For the different components that compose the infrastructure of the enterprise implemented in this thesis, ideas were taken mostly from on-line sources such as [25], whose members are IT support professionals. Systems that provide the kinds of services described in [26] as necessary for an IT infrastructure, such as a financial management process, were included in the implementation as well. In [27], a model is developed that conceptualizes and directs the emerging area of strategic management of information technology. In [28], it is mentioned that IT infrastructure includes a group of shared IT resources that provide a foundation to enable present and future business applications. These resources include a) computer hardware and software (e.g. operating systems), b) network and telecommunications technologies, c) key data, d) core data-processing applications and e) shared IT services.

2.2 Relevant concepts

According to the Software Engineering Institute (SEI) [29], a reference model is a division of functionality into elements together with the data flow among those elements. CRATE provides a framework for building reference models, and it is not the only one. There are many tools that provide a framework for designing enterprise architecture. One is the Agile Enterprise Architecture [30] tool, which can be used to create diagrams to plan, analyze and communicate the architecture of the enterprise. It can also be used to build road maps that describe the architecture content over time.

Another tool is Enterprise Architect [31]. This platform provides modeling for business and IT systems, software and systems engineering, real-time and embedded development, business process modeling and many more functions.

Furthermore, there is The Open Group Architecture Framework (TOGAF) [32], a framework that includes a detailed method and a set of supporting tools for developing an enterprise architecture.

There is a variety of tools for designing enterprise architecture models. How good a tool is depends on what it is used for. For example, in order to evaluate the interoperability of an information system, a model must be built, but the information required from the model will be completely different from the case where the model is used to evaluate the system's usability [33].
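The SEI definition quoted at the start of this section (elements plus the data flows among them) can be made concrete with a small data structure. This is an illustrative sketch only: the classes `Element` and `DataFlow`, the helper `undefined_endpoints` and the example entries are hypothetical and are not taken from CRATE or from any of the tools mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """A unit of functionality in the reference model."""
    name: str
    zone: str


@dataclass(frozen=True)
class DataFlow:
    """A flow of data between two elements, identified by name."""
    source: str
    target: str
    protocol: str


def undefined_endpoints(elements, flows):
    """Return the names used in data flows that are not defined as elements."""
    known = {element.name for element in elements}
    missing = set()
    for flow in flows:
        for endpoint in (flow.source, flow.target):
            if endpoint not in known:
                missing.add(endpoint)
    return sorted(missing)


# Hypothetical example entries.
ELEMENTS = [
    Element("office workstation", "Office"),
    Element("mail server", "DMZ"),
    Element("web server", "DMZ"),
]
FLOWS = [
    DataFlow("office workstation", "mail server", "SMTP"),
    DataFlow("office workstation", "file server", "FTP"),  # "file server" is not defined above
]

if __name__ == "__main__":
    print(undefined_endpoints(ELEMENTS, FLOWS))
```

A consistency check of this kind is one simple way to confirm that every data flow in a model refers to an element that actually exists in it.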

Moreover, there is P2CySeMoL, an attack graph tool [34] that is capable of estimating the cyber security of enterprise architectures. It is based on its predecessor, CySeMoL [35]. CySeMoL, however, relates more closely to the reference model implemented in this thesis project, because it includes attacks and countermeasures relevant to the security of industrial control and SCADA systems. CySeMoL uses Bayesian networks and attack graphs [36] to produce a map of reachability across an IT architecture model. From this it calculates the probability of an attacker reaching each attack step of each asset in the reference model, under given circumstances that can be changed.

Figure 3 shows the metamodel of the CySeMoL Probabilistic Relational Model (PRM). It consists of 22 classes, 102 attributes and 32 class relationships that show what information an architecture model should contain. For each class, two boxes that contain attributes can be seen. The upper box contains the countermeasures associated with the class, while the lower box contains the attack steps associated with the class. Only the parts of the CySeMoL metamodel that relate to what was implemented in this thesis are briefly introduced here. OperatingSystem and ApplicationClient are types of software. An OperatingSystem can be related to a NetworkZone, within which traffic is allowed between software instances. The NetworkInterface class can connect multiple NetworkZones and mark them as trusted or untrusted zones. A NetworkInterface relates to a Firewall, which imposes rules on the NetworkInterface. A ZoneManagementProcess is related to a NetworkZone and describes the management process of the systems located in that NetworkZone. A DataFlow has a Protocol, and a DataFlow can read from or write to a DataStore owned by a SoftwareInstance.

Figure 3: Metamodel of CySeMoL's Probabilistic Relational Model (PRM) - [35]
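To give an intuition for how probabilities can be propagated along an attack graph, the following toy sketch may help. It is not CySeMoL's actual model: the graph, the success probabilities and the function `reach_probabilities` are invented for illustration, and the calculation treats all incoming attack steps as independent (a noisy-OR combination), which is only an approximation when paths share earlier steps.

```python
from graphlib import TopologicalSorter

# Hypothetical attack steps with assumed success probabilities (not from CySeMoL).
EDGES = {
    ("internet", "office_pc"): 0.5,
    ("office_pc", "engineering_ws"): 0.4,
    ("office_pc", "file_server"): 0.5,
    ("file_server", "engineering_ws"): 0.5,
}


def reach_probabilities(edges, entry):
    """Probability of the attacker reaching each node, starting from `entry`."""
    # Map every node to its predecessors and the probability of each step.
    preds = {}
    for (src, dst), p in edges.items():
        preds.setdefault(dst, {})[src] = p
        preds.setdefault(src, {})
    # Visit nodes so that all predecessors are handled before their successors.
    order = TopologicalSorter({n: set(ps) for n, ps in preds.items()}).static_order()
    prob = {}
    for node in order:
        if node == entry:
            prob[node] = 1.0
            continue
        # Noisy-OR: the node is missed only if every incoming step fails.
        miss = 1.0
        for src, p in preds[node].items():
            miss *= 1.0 - prob[src] * p
        prob[node] = 1.0 - miss
    return prob


if __name__ == "__main__":
    for node, p in reach_probabilities(EDGES, "internet").items():
        print(f"{node}: {p:.2f}")
```

In this toy graph the engineering workstation can be reached both directly from the office computer and via the file server, so its estimated probability is higher than either path alone would give.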

2.3 Protocols

This subsection briefly explains the protocols that are used in the data flows (see chapter 4) and what they are used for.

The Open Systems Interconnection (OSI) reference model is a networking model created by the International Organization for Standardization (ISO) [37]. Its goal was to standardize data networking protocols to make communication among computers all over the world easier. As can be seen in Figure 4, the OSI model has seven layers, and each layer defines a set of typical networking functions. Protocols and standards implement the functions specified by each layer [38].

Figure 4: OSI model - Created with [54]

In the following subsections, the protocols that are used are defined, categorized by the layer to which each protocol belongs according to the OSI reference model.

2.3.1 Transport Layer Protocols

- Transmission Control Protocol (TCP)
Transmission Control Protocol (TCP) is a transport layer protocol that provides reliable host-to-host communication. It is a connection-oriented protocol, which means that a connection needs to be established before communication and exchange of data can take place [39]. The destination host sends acknowledgments to the source host when data is received. Error checking and flow control are also provided. TCP is used by a large number of applications, such as email and remote computer access.

- User Datagram Protocol (UDP)
User Datagram Protocol (UDP) is also a transport layer protocol. It serves the same purpose as TCP but with a number of differences. UDP is used to send individual datagrams between two networked devices without initiating a connection and without acknowledgments or retransmission [40]. It is a much more lightweight protocol than TCP. UDP datagrams can arrive out of order, or not arrive at all. No datagram has knowledge of the previous or the following datagram. Furthermore, there is no flow control, meaning that datagrams can arrive at the destination host faster than they can be processed.
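The difference between the two transport protocols can be seen in a few lines of socket code. The sketch below uses only Python's standard `socket` module on the loopback interface; the function names `udp_roundtrip` and `tcp_roundtrip` are hypothetical, and the example is not part of the data flow source code of this thesis.

```python
import socket


def udp_roundtrip(payload):
    """Send one datagram over loopback; no connection is set up first."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
        receiver.settimeout(2.0)
        sender.sendto(payload, receiver.getsockname())
        data, _ = receiver.recvfrom(4096)
        return data


def tcp_roundtrip(payload):
    """Send bytes over loopback; a connection must be established before sending."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))
        listener.listen(1)
        # The connection is completed here, before any data is exchanged.
        with socket.create_connection(listener.getsockname(), timeout=2.0) as client:
            conn, _ = listener.accept()
            with conn:
                conn.settimeout(2.0)
                client.sendall(payload)
                # A short payload normally arrives in one read on loopback.
                return conn.recv(4096)


if __name__ == "__main__":
    print(udp_roundtrip(b"ping"), tcp_roundtrip(b"ping"))
```

The UDP function simply addresses a datagram to the receiver, whereas the TCP function has to listen, connect and accept before the first byte is sent, which reflects the connection-oriented behaviour described above.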

2.3.2 Application Layer Protocols

- Domain Name System (DNS) protocol
The Domain Name System (DNS) is like the phone book of the Internet. When someone tries to reach a website using its host name, DNS translates that name into an Internet Protocol (IP) [41] address. A DNS server in an enterprise is necessary to keep the mappings for the IP addresses that have been assigned to the enterprise (typically enterprises are assigned a range of IP addresses). The DNS protocol can run over both transport layer protocols (TCP and UDP), and it uses port 53 [7].

- Dynamic Host Configuration Protocol (DHCP)
Dynamic Host Configuration Protocol (DHCP) is used to dynamically assign IP [41] addresses to devices that connect to a network [42]. DHCP greatly simplifies network administration, since IP addresses are assigned to hosts automatically rather than manually by a network administrator. It works over UDP and uses port 67 for the server and port 68 for the client.

- File Transfer Protocol (FTP)
File Transfer Protocol (FTP) is a protocol used to transfer files between a client and a server in a network. The FTP client creates a connection, usually from a random port N, to the server's command port 21. Afterwards, the server connects to the client from its data port 20 [43]. The first implementations of FTP were command-line utilities, but nowadays many graphical interfaces exist. FTP servers can also be accessed through web browsers. FTP runs over TCP, so the transferred data is broken down into segments and the recipient acknowledges the data it receives.

- Hyper Text Transfer Protocol (HTTP)
Hyper Text Transfer Protocol (HTTP) is a stateless (each request is executed independently) application-level protocol used for hypertext web browsing. It contains the set of rules that a web browser and a web server need in order to communicate. Different kinds of data (text, sound, video etc.) can be transferred over it. HTTP uses port 80 [44].

- HTTP over TLS (HTTPS)
HTTPS is the cryptographically secured version of the HTTP protocol. The Transport Layer Security (TLS) [45] protocol encrypts the established connection. HTTPS uses port 443 [46]. HTTPS can also run over SSL (see below) [47], but TLS is predominant nowadays.

- Lightweight Directory Access Protocol (LDAP)
Lightweight Directory Access Protocol (LDAP) is an application protocol used to access and edit directory services. A directory keeps information in a hierarchical order and can provide any set of records inside an intranet, for example a directory based on the last names of the employees. An LDAP server uses port 389 [48].

- Post Office Protocol version 3 (POP3)
Post Office Protocol version 3 (POP3) is a mail protocol that allows a client to retrieve mail from a mail server by downloading it to the local computer. POP3 uses port 110 [49]. Once the client has downloaded a mail to its machine, the mail is typically deleted from the server. However, an administrator can configure the server to keep mails for a defined period of time.

- Real-Time Media Flow Protocol (RTMFP)
Real-Time Media Flow Protocol (RTMFP) is a protocol used for the transmission of data between Adobe Flash Platform technologies [50]. RTMFP supports the direct transfer of data, audio and video between Flash players.

- Secure Sockets Layer (SSL)
The Secure Sockets Layer (SSL) protocol is, like TLS, a security protocol. TLS is the successor of SSL. Both are used to establish an encrypted link between application clients and servers over insecure networks, typically between a web browser and a web server [47]. SSL certificates are used to bind a cryptographic key to an organization's details [51]. With the use of SSL, sensitive information such as credit card numbers can be transmitted securely.

- Simple Mail Transfer Protocol (SMTP)
Simple Mail Transfer Protocol (SMTP) is used for sending and transferring mails. A mail can be sent either from a mail client to a mail server or from one mail server to another. An enterprise uses an SMTP server to handle all of its mail. SMTP uses port 25 [52].
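The port numbers mentioned in this subsection can be collected into a single lookup table, which is convenient when writing down data flows or firewall rules. The sketch below is illustrative: the port numbers are the ones given in the text, while the transport column for entries where the text does not state it (FTP, HTTP, HTTPS, LDAP, POP3, SMTP) reflects the usual default and is an assumption here, as is the helper `protocols_on_port`.

```python
# Protocol name -> (transport, port). Ports as given in the text above;
# transports not stated in the text are the usual defaults (assumption).
WELL_KNOWN_PORTS = {
    "DNS": ("TCP/UDP", 53),
    "DHCP server": ("UDP", 67),
    "DHCP client": ("UDP", 68),
    "FTP command": ("TCP", 21),
    "FTP data": ("TCP", 20),
    "HTTP": ("TCP", 80),
    "HTTPS": ("TCP", 443),
    "LDAP": ("TCP", 389),
    "POP3": ("TCP", 110),
    "SMTP": ("TCP", 25),
}


def protocols_on_port(port):
    """Return the names of the listed protocols that use the given port."""
    return sorted(
        name for name, (_, number) in WELL_KNOWN_PORTS.items() if number == port
    )


if __name__ == "__main__":
    for name, (transport, port) in sorted(WELL_KNOWN_PORTS.items()):
        print(f"{name:12} {transport:8} {port}")
```

RTMFP and SSL are left out of the table because the text does not give a port number for them.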

3 Method and Implementation

This chapter contains a description of the methodical aspects of this thesis work, as well as the implementation of the infrastructure in CRATE.

3.1 Method

This section describes how the thesis work was carried out. At the beginning, [1] was read thoroughly for a better understanding of SCADA systems. Since the thesis was going to be implemented in CRATE, it was then necessary to understand what CRATE is and how it works. The CRATE web tool was introduced, and a considerable amount of time was spent learning its purpose and use, becoming familiar with its graphical interface and examining its parts and components one by one. The interaction with FOI had limitations. CRATE provides a wiki, but many pages were written in Swedish, which made them hard to comprehend. Dr. Mathias and Mr. Matus helped as much as they could to overcome this.

Afterwards, a lot of information was gathered about the IT infrastructure of a typical power utility office, as well as the IT infrastructure of a typical office. The only major difference between the infrastructure built in this thesis and that of a typical office is that here there are engineering workstations that communicate with the SCADA network. Other than that, the two are the same. Furthermore, little information was available about the IT infrastructure of power utility companies, which is why the literature was consulted. Much of the information came from on-line material and papers, most of which were found using three major search engines: 1) Google Scholar [9], 2) Association for Computing Machinery [10] and 3) Science Direct [11].

After that, and before the implementation in CRATE started, some draft drawings were made to get an idea of how the infrastructure should look. The implementation in CRATE then began, although both the drawings and the CRATE model kept being updated until the final model was reached. A lot of time was spent in CRATE learning how to configure the machines, how to connect them and how to start building the infrastructure. Reference models that already existed in CRATE also helped.

After the entire implementation of the infrastructure was finished, Dr. Mathias arranged the two interviews that are shown in the Appendix, so that the model could be validated. The time for the interviews was limited to one hour each, so proper preparation was needed beforehand. The interviews were semi-structured. In a semi-structured interview, the interviewer develops an "interview guide" containing a list of questions that need to be covered, but may depart from these questions when necessary to gain a better understanding of the topic of interest. Semi-structured interviews are best used when there is only one chance to meet the respondent, which was the case in this thesis. This type of interview can provide reliable, comparable qualitative data [53], which was the aim of these two interviews. Qualitative data were obtained (as can be seen from Appendix B - Thesis validation) and were processed.

3.1.1 Validation

The validation of the thesis involved direct interaction with the industry. Two industry experts contributed their knowledge to it. For more information about the interviews, refer to Appendix B - Thesis validation. There are countless different IT infrastructures in the real world, so the goal of the validation was to obtain at least one positive assessment of the reference model of the office network IT infrastructure implemented in this thesis. During these interviews, especially the second, it was confirmed that the model could represent a realistic IT infrastructure of a typical company and, by extension, of a typical power utility company, which was what this thesis aimed for. Evidently, this infrastructure represents only a small company; large industries use a variety of more intelligent and specialized systems that could not be implemented in this thesis.

The preparation for the interviews is described above. The data gathered from the two interviews were processed qualitatively, and the results of the evaluation were positive. Both respondents showed interest in the infrastructure. The systems and applications used in it correspond to systems and applications used in real environments (even though more advanced and complex ones exist), open source software is found in real environments as well, and the separation of the networks reflects the separation of networks in real environments. From the above, the conclusion was that the infrastructure presented in this thesis can be a realistic infrastructure.

3.2 Implementation

As already mentioned, this thesis was implemented with the help of CRATE. This section presents the infrastructure that was modeled using CRATE, along with every component of that infrastructure. Our infrastructure (Figure 10) comprises the demilitarized zone, the office network, the Intranet network and the engineering network. The demilitarized zone (DMZ) is the network through which traffic between the enterprise and the outside world has to pass. The office network is the network where the employees of the enterprise are located. The Intranet contains systems and servers that the enterprise needs in order to function properly and that are used by the employees of the company. The engineering network is the network where the engineers who communicate with the SCADA network are located.

Figure 5: RICS-el enterprise - Figure created with [54]

The EXT network shown in Figure 5 is the external network through which the gateway connects to the Internet.
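The way the networks are chained together can be sketched as a small graph. The sketch below is illustrative and not taken from the CRATE model: the node names are shorthand for the networks and forwarding devices named in the text, and the direct link between the office and engineering networks is an assumption (the text only states that the two communicate). A breadth-first search then shows how many hops separate the external network from the engineering stations.

```python
from collections import deque

# Links between networks and forwarding devices, as described in the text.
# Assumption: the engineering network attaches directly to the office network.
LINKS = [
    ("EXT", "gw"),
    ("gw", "DMZ"),
    ("gw", "OFFICE"),
    ("OFFICE", "router"),
    ("router", "INTRANET"),
    ("OFFICE", "ENGINEERING"),
]


def build_graph(links):
    """Turn a list of undirected links into an adjacency mapping."""
    graph = {}
    for a, b in links:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the list of nodes or None if unreachable."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbour in sorted(graph.get(node, ())):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None
```

Under these assumptions, an adversary starting in EXT reaches the engineering stations only by passing the gateway and the office network, which is why the office network matters for the security of the control system.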

3.2.1 Demilitarized zone

Figure 6: The DMZ network of the RICS-el enterprise - Figure created with [54]

The demilitarized zone (DMZ) is the part of the enterprise network that contains hosts that should be accessible from the Internet (Figure 6). In general, the DMZ is where devices and hosts are placed that need to be reachable from outside, an exposure that puts them at higher risk. Figure 6 shows the network topology of the DMZ in the RICS-el enterprise. For the implementation of this thesis, the following machines were placed in the DMZ:

• Proxy server (proxy)

• Domain Name System (DNS) server (dns)

• Web server (web-server)

• Mail server (mail server)

To understand why these particular machines were placed in the DMZ, an explanation of what each machine does in the network is given.

A proxy server acts as an intermediary between the workstations in the enterprise network and the servers to which the workstations send requests. These are servers on the Internet, mostly web servers. For example, when an employee tries to reach a web site on the Internet, the request goes to the proxy server. If the proxy server has the requested web site in its cache memory, it sends the page to the employee without having to forward the request. If the web site is not in the cache memory, the proxy server behaves as a client itself, forwarding the employee's request as its own; when the response is received, it is forwarded back to the user who made the request. Proxy servers are useful in an enterprise network because they provide administrative control, facilitate security and offer caching. As a proxy server in this implementation, Squid [55] was installed.

A Domain Name System (DNS) [7] server hosts records of a DNS database. Those records are used to resolve DNS queries issued by client computers, such as queries for the names of the web sites of the RICS-el enterprise. Typically, for a small network, the DNS name space is administered by organizations that specialize in DNS administration, such as the Internet Service Provider (ISP) that the network belongs to, but in this case the enterprise has its own DNS server.

A Web server is a program that uses HTTP (Hyper Text Transfer Protocol) and hosts the files that make up the web pages served to users who request them. Whenever a client tries to reach the web page of the RICS-el enterprise, the Web server "constructs" the page that the client has requested. As a web server for this implementation, Apache [56] was installed and configured.

A mail server is the server that handles mail over the network. It receives mails from client computers outside the network and typically stores them on the server, so that users can log in to the server and read the mails (e.g. using a web interface) or download them to their computers. It also receives mails from inside the corporate network and delivers them to other mail servers so that they reach their destination. The mail server keeps logs of incoming and outgoing messages, can scan for viruses and filter spam, and in general has control over the mail of the enterprise.
As a mail server for this project, Postfix [57], which is an open source mail server, was used. As can be seen from the above, the components placed in the DMZ all need to be reachable from the Internet. This is why they have been placed on this side of the enterprise network. Figure 6 shows that the DMZ network connects directly to a gateway (gw). This gateway also contains a firewall and provides the first line of protection for the network. It receives packets from the Internet and forwards them to the DMZ. The gateway is also connected to the office network, which is described in the next section, so when a packet needs to be sent from the office network to the DMZ, the gateway forwards it to the correct destination.
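The cache-or-forward behaviour of the proxy server described in this section can be sketched as follows. This is a deliberately simplified illustration, not how Squid is implemented: the class name and the `fetch` callback are hypothetical, and a real proxy would also honour expiry and cache-control rules.

```python
from typing import Callable, Dict


class CachingProxy:
    """Toy forward proxy: answer from the cache, otherwise fetch and store."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        # `fetch` stands in for the request the proxy sends to the origin server.
        self._fetch = fetch
        self._cache: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> bytes:
        if url in self._cache:
            # Cache hit: the request never leaves the enterprise network.
            self.hits += 1
            return self._cache[url]
        # Cache miss: the proxy acts as a client on behalf of the workstation.
        self.misses += 1
        body = self._fetch(url)
        self._cache[url] = body
        return body
```

With this design, the second request for the same URL is served without contacting the origin server, which is the caching benefit mentioned above.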

3.2.2 Office network

Figure 7: The Office network of the RICS-el enterprise - Figure created with [54]

The office network is the part of the enterprise network where all the employees of the enterprise are located (Figure 7). This network contains almost everything that can be found in a typical corporate environment. Its components are the following:

• Workstations (offices)

• Mobile phones

• Printer

Since this is a small enterprise, the implementation contains eight (8) different workstations, two (2) different mobile phones and one (1) printer. One detail here is that the mobile phones connect wirelessly. This means that there is a wireless router whose operating system can also be compromised. Furthermore, an attacker does not have to breach the physical perimeter of the office, but can stand just outside and try to get into the network, since it is wireless.

All machines used in the implementation have source images loaded. Those source images include the applications that run on that machine. No two machines have exactly the same set of applications loaded. The source images used for the implementation vary, as can be seen in Appendix A.
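The requirement that no two machines carry exactly the same set of applications can be checked mechanically. The sketch below is only an illustration: the machine names and application lists are hypothetical placeholders, not the contents of Appendix A.

```python
from itertools import combinations

# Hypothetical source images: machine name -> installed applications.
SOURCE_IMAGES = {
    "office1": ["browser", "mail client", "pdf reader"],
    "office2": ["browser", "mail client", "Skype"],
    "office3": ["browser", "pdf reader", "Flash Player"],
}


def duplicate_images(images):
    """Return pairs of machine names whose application sets are identical."""
    return [
        (name_a, name_b)
        for (name_a, apps_a), (name_b, apps_b) in combinations(sorted(images.items()), 2)
        if set(apps_a) == set(apps_b)
    ]
```

An empty result means that every machine has a distinct application set; the order in which applications are listed does not matter.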

3.2.3 Intranet

Figure 8: The Intranet network of the RICS-el enterprise - Figure created with [54]

The Intranet of the enterprise is the part of the network that can be accessed only by the employees of the enterprise (Figure 8). A router acts as a gateway between the office network and the intranet. The following machines can be found in this network:

• File Transfer Protocol (FTP) Server

• Directory Service

• Database Server

• Enterprise Resource Planning (ERP) system

• Price-book Management System

• Work-flow Management System (WFMS)

• Content Management System (CMS)

• Customer Relationship Management (CRM) system

For a better understanding of why these machines were placed in the Intranet network of the enterprise, an explanation of each system is provided below.

A File Transfer Protocol (FTP) Server is used inside an enterprise so that all employees can share documents and files, especially files that are too big to be sent via email. It is basically a file storage system that can be used by employees of the company, even if they are not in the same local area network as their colleagues. They can upload any file they want and make it visible throughout the company, which also makes collaboration with a colleague at a different location much easier. As an FTP Server, Golden FTP Server [58] was used.

A Directory Service provides one more layer of security inside an enterprise. With it, an enterprise can be divided into Organizational Units (OUs), and each OU can have different access to different parts of the network. Each OU contains a list of users, who receive specific access according to their OU. Users also have to provide their credentials when trying to connect, so that the directory service can identify them. If a user provides no credentials, the directory service will direct him only to the services that have public access. As a directory server for this project, LDAP [48] was used.

A Database Server is essential for an enterprise and can be useful in many ways. It can store information about customers, billing information, purchase information and much more. In this project, it is going to be used for storing information about each customer, the types of services he uses from the enterprise and his payment history, but it could also be used to store enterprise reports, order logs etc. As a Database Server for this project, MySQL [59] was used.

An Enterprise Resource Planning (ERP) system is a system that helps automate many back office functions.
One important use of the ERP system is that it can hold financial information such as the assets, bank accounts, budget and available cash of an enterprise. As an ERP system for this project, the open source Odoo [60] was used.

A Price-book Management system helps set retail prices, that is, the amount for which the product of the enterprise is going to be sold to customers, according to their needs and the size of their orders. It can also be used to create invoices and keep track of the payments that each customer makes. Furthermore, it can be used to schedule price changes for a product, as well as to keep track of all the financial transactions of the company. This system can also be considered a Financial Transaction system. As a Price-book Management system for this project, the open source GnuCash [61] was used.

The Work-flow Management System (WFMS) can be used to define a set of actions so that a job is carried out automatically. A work flow is divided into stages, where at each stage either a group or an individual is responsible for the tasks that need to be done. Once one stage is done, the WFMS informs those responsible for the next stage that they can start working on their part of the work flow and provides them with all the data from the previous stages that they need. For example, a customer calls the enterprise to report a problem. Utility desk personnel receive the call and create a new job request in the WFMS. If it is a technical problem, a technician will see the request and try to resolve it. If it is not in his area of expertise, the technician will forward it to another department, and so on. In some cases, the first-line personnel only record the problem, and the technician classifies it and sends it to the corresponding department. As a Work-flow Management System for this project, Joget Community Edition [62] was used.

The Content Management System (CMS) is an application that makes web content management easier. It is basically a back-end system that helps make changes and updates to the web site without having to change the code, and stores the files used in the web site of an enterprise in an organized manner. In some cases, it can be used for marketing purposes by showing advertisements based on a user's specific characteristics. Furthermore, it can be used for internal purposes of a company, such as hosting its own documentation, wiki etc. As a Content Management System for this project, Joomla [63] was used.

The Customer Relationship Management (CRM) system is where all important information about the customers of the enterprise is held. Both new and old customers are saved in the CRM system. It helps with customer support: if a customer has a complaint, the CRM system can provide a solution or alert the human resources department about the complaint so that it can be checked whether the complaint was resolved. It can also help the marketing section of the enterprise promote new products or packages, either to all customers or to specific target groups. As a Customer Relationship Management system, the open source SuiteCRM [64] was used. SuiteCRM can be accessed through a browser at localhost/suitecrm once it is up and running on the local machine. The installation of SuiteCRM required the installation of the entire LAMP (Linux, Apache, MySQL, PHP) software bundle [65].
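The access model of the Directory Service described earlier in this section (users grouped into OUs, each OU with its own access, and public-only access without credentials) can be sketched as follows. All names, passwords and OU contents are hypothetical, and the sketch does not use LDAP itself; a real directory would store password hashes rather than clear text. Treating a wrong password like missing credentials is also an assumption of this sketch.

```python
from typing import Dict, Optional, Set, Tuple

# Services that anyone may reach without logging in (hypothetical).
PUBLIC_SERVICES: Set[str] = {"cms"}

# Hypothetical organizational units and the services each may reach.
OU_ACCESS: Dict[str, Set[str]] = {
    "accounting": {"cms", "erp", "pricebook"},
    "support": {"cms", "crm", "wfms"},
}

# Hypothetical user records: user name -> (password, OU).
USERS: Dict[str, Tuple[str, str]] = {
    "alice": ("alice-secret", "accounting"),
    "bob": ("bob-secret", "support"),
}


def allowed_services(username: Optional[str] = None,
                     password: Optional[str] = None) -> Set[str]:
    """Return the services a caller may use.

    Without valid credentials only the public services are returned.
    """
    record = USERS.get(username) if username else None
    if record is None or record[0] != password:
        return set(PUBLIC_SERVICES)
    return set(OU_ACCESS[record[1]])
```

For example, a user in the hypothetical accounting OU reaches the ERP and price-book systems but not the CRM system.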

3.2.4 Engineering network

Figure 9: The SCADA network of the RICS-el enterprise - Figure created with [54]

The engineering network of the RICS-el enterprise has only two engineering stations (Figure 9). These stations are equipped with an FTP client so that they can communicate with the substations located in the SCADA network (which is not shown in the figure) and exchange files, either files from the historian server of the SCADA system or other SCADA files that could be useful for the enterprise. Furthermore, the SCADA system can request files or updates from the engineering stations.
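A file retrieval by the FTP client on an engineering station could look like the sketch below, which uses Python's standard `ftplib`. The helper name, host name, user name and file name are hypothetical; the thesis does not specify which FTP client is installed.

```python
import io
from ftplib import FTP


def download(ftp: FTP, remote_name: str) -> bytes:
    """Retrieve one file over an already opened and logged-in FTP session."""
    buffer = io.BytesIO()
    # RETR streams the file; retrbinary calls the callback for each chunk.
    ftp.retrbinary(f"RETR {remote_name}", buffer.write)
    return buffer.getvalue()


# Usage sketch (needs a reachable server; all values are placeholders):
#
#   with FTP("historian.example") as ftp:
#       ftp.login("engineer", "password")
#       data = download(ftp, "readings.csv")
```

Passing the session in as a parameter keeps the helper independent of how the connection was opened, which also makes it easy to test with a stand-in object.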

4 Data flows

This chapter analyzes the data flows that should exist in the infrastructure. To produce the actual data flows, the virtual machines and the installed software would have to be configured in detail so that the traffic behaves as it would in a real case. Some basic data flows were created in this thesis, as can be seen in Appendix C, but the scripts that were created were not deployed in the laboratory environment. This chapter therefore takes a more theoretical approach, describing the data flows that should exist and should be implemented in the infrastructure rather than their actual implementation. As in the previous chapter, the data flows are described first for each network individually and then for the entire infrastructure, for a better understanding.

4.1 DMZ

The DMZ network of our infrastructure has many incoming and outgoing data flows. Requests from the external (EXT) network are redirected through the gateway to the demilitarized zone. All four (4) machines in the DMZ interact with the Internet. There are also incoming and outgoing flows between the DMZ and the office network: the workstations all use the proxy server, the mail server and the web server. This traffic is likewise redirected by the gateway to the correct recipient.

When a workstation in the office network wants to open a new site in its browser, it communicates with the proxy located in the DMZ. This proxy server acts as a web proxy, meaning that it forwards HTTP requests. Many proxy servers are configured to listen on port 8080, whereas Squid [55], the proxy installed in this project, listens on port 3128 by default. This data flow occurs every time an employee wants to visit a site, which is very often. The web server located in the DMZ accepts requests from the EXT network (the outside world) on port 80. These data flows are not allowed to continue to the Intranet. The DNS server listens on port 53, and a connection to it is made every time an employee needs to resolve an address that is not already known to the proxy server.

For the mail server located in the DMZ, there are several data flows to describe. A mail server sends and receives messages. For outgoing messages, the SMTP (Simple Mail Transfer Protocol) [52] server can use port 25, but there is no encryption, so this is not the optimal choice. SMTP can also use ports 587 and 465: the first uses the TLS (Transport Layer Security) [45] protocol to enhance the security of the message and the second uses the SSL (Secure Socket Layer) [47] protocol. For incoming messages, Post Office Protocol version 3 (POP3) [49] is used. Port 110 accepts incoming messages without encryption, while port 995 accepts them with the SSL protocol for security. These data flows occur many times during the day, since employees need to send and receive mails constantly.
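The DMZ data flows described above can be written down as a default-deny allow-list. The sketch below is an illustrative assumption, not the firewall configuration of the thesis: the ports are the ones named in the text, but the choice of which flows to permit (for instance, only the encrypted mail ports from the office network) is made up for the example.

```python
# (source network, destination network, destination port) -> description.
# Ports come from the text; the rule set itself is a hypothetical example.
ALLOWED_FLOWS = {
    ("EXT", "DMZ", 80): "HTTP to the web server",
    ("EXT", "DMZ", 25): "SMTP to the mail server",
    ("OFFICE", "DMZ", 53): "DNS queries",
    ("OFFICE", "DMZ", 587): "SMTP with TLS",
    ("OFFICE", "DMZ", 465): "SMTP with SSL",
    ("OFFICE", "DMZ", 995): "POP3 with SSL",
}


def is_allowed(src: str, dst: str, port: int) -> bool:
    """Default deny: a flow passes only if it is listed explicitly."""
    return (src, dst, port) in ALLOWED_FLOWS
```

Because nothing from EXT to the Intranet is listed, such traffic is rejected, matching the statement that external requests must not continue past the DMZ.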

4.2 Office network

As mentioned above, the office network exchanges data flows with the demilitarized zone. Besides that, it also interacts with the intranet and the SCADA network of the infrastructure. In general, the office network is the only network that interacts directly with all the other networks in the infrastructure. There are incoming and outgoing data flows with all the servers in the intranet, as well as with the SCADA network of the company. These flows are explained in the following subsections. There are also flows among the workstations, the mobile phones and the printer within the office network.

More specifically, workstations in the office network constantly use ports 80 and 443, which are the ports of the Hypertext Transfer Protocol (HTTP) [44] and HTTP Over TLS (HTTPS) [46]. Since some workstations have Skype [68] installed, they will use a random port from 1024 to 65535 for incoming connections, although the best option would be for the network administrator to configure a specific port for Skype. When Skype is open on an employee's machine, it communicates all the time, even if there are no active voice or video calls. Status updates, contact photos and messages sent to an employee all travel through Skype data flows. Furthermore, workstations that have Adobe Flash Player [66] installed use port 1935 to connect to Adobe Media Server with the Real-Time Media Flow Protocol (RTMFP) [50]. RTMFP itself runs over the User Datagram Protocol (UDP) [40]; the Transmission Control Protocol (TCP) [39] on the same port carries the related RTMP protocol [67]. Finally, each workstation in the office network may need to connect to the printer. To do so, port 9100 is used, over either UDP or TCP.
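Whether a TCP service such as the printer on port 9100 is reachable from a workstation can be checked with a few lines of Python. This is a generic connectivity probe offered as a sketch; the function name is hypothetical and the thesis does not describe such a check.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # The connection is closed again as soon as it has been opened.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

A call such as `tcp_port_open("printer.example", 9100)` (placeholder host name) would then report whether the printing port accepts connections. The probe only covers TCP; UDP services cannot be tested this way.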

4.3 Intranet

The Intranet communicates with the workstations of the office network and possibly with computers in other networks within the company. In a utility company of this size, employees check a lot of information every day that resides on the intranet servers. There is no other incoming or outgoing traffic for this network. All traffic is forwarded by the gateway placed between the networks. The workstations send data to all the machines in the Intranet, and each machine replies with data to the workstations. The Intranet systems are also interconnected and communicate with each other, which can happen thousands of times a day. Furthermore, a lot of information, such as customer meter readings, is fed into some of the Intranet servers daily.

When an employee in the office network wants to connect to the File Transfer Protocol (FTP) [43] server, the workstation opens a connection from a random local port to port 21, the command port of the FTP server. In active mode, the server then opens the data connection to the client from port 20, the data port of the FTP server.

For the Directory Service, we use Lightweight Directory Access Protocol (LDAP) [48]. LDAP can use both TCP and UDP as transport protocols, and the port that it uses is 389. LDAP traffic can also be encrypted with the SSL protocol, in which case port 636 is used.

For the Database Server, MySQL [59] is used. Port 3306 is the default port of the MySQL server.

For the Enterprise Resource Planning System, Odoo [60] was used. The Odoo server uses port 8069 by default.

The Price-book Management System (GnuCash [61]), the Work-flow Management System (Joget Community Edition [62]), the Customer Relationship Management System (SuiteCRM [64]) and the Content Management System (Joomla [63]) all use Tomcat [69], so they all run on port 8080. The idea here is that not every employee needs to connect to every system; each system has a responsible employee, for example an accountant for the Price-book Management System.
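For the FTP connection described earlier in this section, the server can only open the data connection once the client has told it where to connect. In active mode this is done with the PORT command of RFC 959, whose argument lists the four octets of the client's IPv4 address followed by the port split into a high and a low byte. The sketch below shows that encoding; the function names and the example address are illustrative.

```python
from typing import Tuple


def encode_port_argument(ip: str, port: int) -> str:
    """Build the argument of the FTP PORT command (RFC 959)."""
    octets = ip.split(".")
    if len(octets) != 4 or not 0 < port < 65536:
        raise ValueError("expected an IPv4 address and a port in 1..65535")
    # The 16-bit port is sent as two decimal bytes: port = high * 256 + low.
    high, low = divmod(port, 256)
    return ",".join(octets + [str(high), str(low)])


def decode_port_argument(argument: str) -> Tuple[str, int]:
    """Inverse of encode_port_argument: return (ip, port)."""
    parts = argument.split(",")
    if len(parts) != 6:
        raise ValueError("PORT argument needs six comma-separated numbers")
    ip = ".".join(parts[:4])
    port = int(parts[4]) * 256 + int(parts[5])
    return ip, port
```

For a client at 192.168.1.10 listening on port 50000, the argument is `192,168,1,10,195,80`, because 195 * 256 + 80 = 50000. The server then connects from its port 20 to that address and port.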

4.4 Engineering network

The engineering stations in the SCADA network of the enterprise communicate only with the workstations of the office network. They also communicate with the actual SCADA system, but since that system is not part of the infrastructure presented here, no data flows are described for that part. The workstations can ask the engineering stations for data from the historian server of the SCADA system, so the engineering stations first communicate with the SCADA system and then with the workstation that asked for the data. That data should be stored on an FTP server, so the communication takes place through ports 20 and 21, as mentioned above.

5 Discussion and conclusion

For the purposes of this thesis, a reference model of a typical IT infrastructure of the office network of a power utility company was constructed, with a simplified implementation in CRATE. Its importance lies in the need to secure critical infrastructures against cyber-attacks that can cause vast damage and affect a lot of people (this is also where this thesis work touches on sustainability). A solid implementation in CRATE makes it possible to begin a series of experiments on cyber-attacks against critical infrastructures. Based on this reference model, variations and more complex models can be built. The same idea applies to other critical infrastructures, such as government facilities, nuclear power plants etc. Office networks can be a source of attacks, so care must be given to that part as well.

A lot of information gathering took place regarding the components of an enterprise, what servers each enterprise has and how it uses them, since the functions of some servers may overlap. The author's lack of industry experience was a major setback for the implementation.

Several limitations had to be faced in the implementation of this thesis. Commercial software could not be used, so apart from some commercial freeware like Adobe Reader, the applications included in the source images that are loaded into the virtual machines are open source. It must be noted that these applications are usually of lower complexity, but they all cover the basic demands placed on such applications.

For a fully finished implementation of the model in CRATE, the validation and configuration of the machines is necessary, as well as the creation of scripts with data flows. This was not done in this thesis for two reasons. Firstly, when a machine was created and inserted into the model, a source image was created for that machine (see Appendix A - Virtual machines configuration).
When those source images were defined, there was a pool of operating systems and applications to choose from, so it was believed until late in the thesis that FOI would create the source images, which turned out not to be the case. Source images have to be created and uploaded to an FTP server that is provided by FOI. This leads to the second reason why the validation of the model is not part of this thesis. Creating and configuring these source images, as well as installing and configuring the different applications, takes a considerable amount of time. Such a validation requires many resources and the involvement of other parties, and although it is intended for the future, it has not taken place yet for feasibility reasons.

Future efforts in this line of work can include the implementation of infrastructures that are either similar to the one implemented in this thesis or completely different. The same infrastructure could be made more secure, but also less secure. For example, more firewalls, security systems and configuration components can be added to increase security. Another possible modification is to add more workstations, each with different applications, to increase the complexity. Furthermore, a second office network could be added and connected through a virtual private network (VPN) to the office infrastructure of this enterprise, and attacks could be simulated on that entire infrastructure. Additionally, other critical infrastructures could be implemented in CRATE to examine the effects of cyber-attacks on them as well, based on this infrastructure but with changes to adapt it to the scenario needed.

35 References

[1] Ronald L. Krutz, Securing SCADA systems, January 2005
[2] Dell Security Annual Threat Report, https://software.dell.com/docs/2015-dell-security-annual-threat-report-white-paper-15657.pdf
[3] Swedish Defence Research Agency, https://www.foi.se/en.html
[4] CRATE - Cyber Range And Training Environment, http://www.foi.se/en/Our-Knowledge/Information-Security-and-Communication/Information-Security/Lab-resources/CRATE/
[5] Swedish Meteorological and Hydrological Institute (SMHI), http://www.smhi.se/en
[6] Microsoft, https://www.microsoft.com/
[7] Domain Names - Implementation and Specification, https://www.ietf.org/rfc/rfc1035.txt, November 1987
[8] Microsoft Technet - Understanding KMS, https://technet.microsoft.com/en-us/library/ff793434.aspx
[9] Google Scholar, https://scholar.google.com/
[10] Association for Computing Machinery, http://www.acm.org/
[11] Science Direct, http://www.sciencedirect.com/
[12] Enrico Zio, Challenges in the vulnerability and risk analysis of critical infrastructures, Reliability Engineering and System Safety 152 (2016) 137–150, March 2016
[13] Chee-Wooi Ten, Govindarasu Manimaran, Chen-Ching Liu, Cybersecurity for Critical Infrastructures: Attack and Defense Modeling, July 2010
[14] Sandip C. Patel, Ganesh D. Bhatt, and James H. Graham, Improving the Cyber Security of SCADA Communication Networks, Communications of the ACM, vol. 53, no. 7, July 2009
[15] Yulia Cherdantseva, Pete Burnap, Andrew Blyth, Peter Eden, Kevin Jones, Hugh Soulsby, Kristan Stoddart, A review of cyber security risk assessment methods for SCADA systems, Computers & Security 56 (2016) 1–27, October 2015
[16] Jingcheng Gao, Jing Liu, Bharat Rajan, Rahul Nori, Bo Fu, Yang Xiao, Wei Liang, C. L. Philip Chen, SCADA communication and security issues, Security and Communication Networks 2014; 7:175–194, January 2013
[17] Edward Chikuni, Maxwell Dondo, Investigating the Security of Electrical Power Systems SCADA, 2007
[18] Framework 7 Collaborative STREP Project, Vital Infrastructure, Networks, Information and Control Systems Management, May 2010
[19] Vinay M. Igure, Sean A. Laughter, Ronald D. Williams, Security issues in SCADA networks, Computers & Security 25 (2006) 498–506, 2006
[20] Mini S. Thomas, John D. McDonald, Power System SCADA and Smart Grids, 2015
[21] Joseph G. Maley, Booz-Allen & Hamilton Inc., Enterprise Security Infrastructure, 1080-1383/96 IEEE Proceedings of WET ICE ’96, 1996
[22] Chee-Wooi Ten, Govindarasu Manimaran, Chen-Ching Liu, Cybersecurity for Critical Infrastructures: Attack and Defence Modeling, IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans, vol. 40, no. 4, July 2010
[23] Irlbeck, M., D. Bytschkow, G. Hackenberg and V. Koutsoumpas, Towards a bottom-up development of reference architectures for smart energy systems, Software Engineering Challenges for the Smart Grid (SE4SG), 2013 2nd International Workshop on. IEEE, pp. 9–16, 2013
[24] Australian Government - Department of Finance and Deregulation, Australian Government Architecture Reference Models, https://www.finance.gov.au/sites/default/files/AGA-RM-Final-v3.0-July-2013.pdf
[25] Hanson’s Geek - IT Support Professionals, https://www.hansongeek.com/login/service%20detail.php?recordID=2513000
[26] Carolina Ferreira, Andrea Nery, Placido Rogerio Pinheiro, A Multi-Criteria Model in Information Technology Infrastructure Problems, Procedia Computer Science 91 (2016) 642–651, July 2016
[27] J. C. Henderson, N. Venkatraman, Strategic Alignment: Leveraging Information Technology for Transforming Organizations, IBM Systems Journal, Vol. 38, Nos 2 & 3, 1999
[28] N.B. Duncan, Capturing flexibility of information technology infrastructure: A study of resource characteristics and their measure, Journal of Management Information Systems, 12(2), 37–57, 1995
[29] Software Engineering Institute, https://www.sei.cmu.edu/
[30] Agile Enterprise Architecture, https://www.corso3.com/enterprise-architecture
[31] Enterprise Architect - Version 13, http://www.sparxsystems.com/products/ea/
[32] The Open Group Architecture Framework, http://pubs.opengroup.org/architecture/togaf8-doc/arch/
[33] Pontus Johnson, Robert Lagerström, Per Närman and Mårten Simonsson, Enterprise Architecture Analysis with Extended Influence Diagrams, February 2007
[34] Hannes Holm, Khurram Shahzad, Markus Buschle, Student Member IEEE, and Mathias Ekstedt, Member IEEE, P2CySeMoL: Predictive, Probabilistic Cyber Security Modeling Language, November/December 2015
[35] T. Sommestad, M. Ekstedt and H. Holm, The cyber security modeling language: A tool for assessing the vulnerability of enterprise system architectures, IEEE Systems Journal, vol. 7, no. 3, September 2013
[36] Sheyner O., J. Haines, S. Jha, R. Lippmann and J. M. Wing, Automated generation and analysis of attack graphs, Security and privacy, 2002. Proceedings. 2002 IEEE Symposium on. IEEE, pp. 273–284
[37] International Organization for Standardization (ISO), http://www.iso.org/iso/home.html
[38] Wendell Odom, Open System Interconnection (OSI) Networking Model, Official Cert Guide - CCENT/CCNA ICND 1, Third Edition
[39] RFC 793 - Transmission Control Protocol (TCP), https://tools.ietf.org/html/rfc793, September 1981
[40] RFC 768 - User Datagram Protocol (UDP), https://www.ietf.org/rfc/rfc768.txt, August 1980
[41] RFC 791 - Internet Protocol - DARPA Internet Program - Protocol Specification, https://tools.ietf.org/html/rfc791, September 1981
[42] RFC 2131 - Dynamic Host Configuration Protocol (DHCP), https://www.ietf.org/rfc/rfc2131.txt, March 1997
[43] RFC 959 - File Transfer Protocol (FTP), https://www.ietf.org/rfc/rfc959.txt
[44] RFC 2616 - Hypertext Transfer Protocol (HTTP/1.1), https://tools.ietf.org/html/rfc2616, June 1999
[45] RFC 2246 - The Transport Layer Security (TLS) Protocol - Version 1.0, https://tools.ietf.org/html/rfc2246, January 1999
[46] RFC 2818 - HTTP Over TLS, https://tools.ietf.org/html/rfc2818, May 2000
[47] The Secure Sockets Layer (SSL) Protocol - Version 3.0, https://tools.ietf.org/html/draft-ietf-tls-ssl-version3-00, November 1996
[48] RFC 4511 - Lightweight Directory Access Protocol (LDAP): The Protocol, https://tools.ietf.org/html/rfc4511, June 2006
[49] RFC 1939 - Post Office Protocol Version 3, https://www.ietf.org/rfc/rfc1939.txt, May 1996
[50] RFC 7425 - Adobe’s RTMFP Profile for Flash Communication, https://tools.ietf.org/html/rfc7425, December 2014
[51] What are SSL Certificates?, https://www.digicert.com/ssl-certificate.htm
[52] RFC 2821 - Simple Mail Transfer Protocol, https://www.ietf.org/rfc/rfc2821.txt, April 2001
[53] Cohen D, Crabtree B, Qualitative Research Guidelines Project, http://www.qualres.org/HomeSemi-3629.html, July 2006
[54] Microsoft Visio, https://en.wikipedia.org/wiki/Microsoft_Visio
[55] Squid: Optimising Web Delivery, http://www.squid-cache.org/
[56] The Apache Software Foundation, https://www.apache.org/
[57] Postfix Mail Server, http://www.postfix.org/
[58] Golden FTP Server, http://www.goldenftpserver.com/
[59] MySQL, https://www.mysql.com/
[60] Open Source ERP and CRM - Odoo, https://www.odoo.com/
[61] Free - GnuCash, https://www.gnucash.org/
[62] Joget Workflow - Open Source Workflow Software and Business Process Management Software, https://www.joget.org/
[63] Joomla, https://www.joomla.org/
[64] SuiteCRM - Open Source CRM for the world, https://suitecrm.com/
[65] LAMP Software Bundle, https://en.wikipedia.org/wiki/LAMP_(software_bundle)
[66] Adobe Flash Player, http://www.adobe.com/software/flash/about/
[67] Adobe Media Server - Configure Ports, https://helpx.adobe.com/adobe-media-server/config-admin/configure-ports.html
[68] Skype, https://www.skype.com/en/
[69] Apache Tomcat, http://tomcat.apache.org/
[70] Oracle VM VirtualBox, https://www.virtualbox.org/
[71] MS-DOS, https://en.wikipedia.org/wiki/MS-DOS

A APPENDIX - Virtual machines configuration

For the virtual machines used in this project, a set of source images was created and loaded into the virtual machines. A source image is an image created with Oracle VM VirtualBox [70] that contains the operating system and the applications chosen for a virtual machine. The operating system and all applications must be installed before the source image is created. All operating systems and applications were installed with the default parameters defined by their distributions. Two tables follow. The first table lists, for each system and office shown in Figure 5, the virtual machine installed in CRATE and the source image it uses. The second table lists the content of each source image.

NAME               VIRTUAL MACHINE NAME          SOURCE IMAGE
Gateway            gateway                       ricsel-gateway
DNS                dns                           ricsel-dns
Web Server         web server                    www.ricsel.se
Mail Server        mail server                   ricsel-mail
Proxy Server       proxy                         ricsel-proxy
Office 1           office1.win                   ricsel-office1
Office 2           office2.win                   ricsel-office1
Office 3           office3.win                   ricsel-office2
Office 4           office4.win                   ricsel-office2
Office 5           office5.win                   ricsel-office3
Office 6           office6.win                   ricsel-office3
Office 7           office7.win                   ricsel-office4
Office 8           office8.win                   ricsel-office4
Mobile 1           mobile1.win                   ricsel-mobiles
Mobile 2           mobile2.win                   ricsel-mobiles
Printer            printer.win                   ricsel-printer
Gateway 2          gateway2.win                  ricsel-gateway
FTP Server         ftp.backoffice                ricsel-ftp
Directory Service  directory service.backoffice  ricsel-directory
Database Server    database server.backoffice    ricsel-database
ERP                erp.backoffice                ricsel-erp
Pricebook          pricebook.backoffice          ricsel-pricebook
Workflow           workflow.backoffice           ricsel-workflow
CMS                cms.backoffice                ricsel-cms
CRM                customer.backoffice           ricsel-customer
Gateway 3          gateway3                      ricsel-gateway
Engineer 1         engineer1                     ricsel-engineering
Engineer 2         engineer2                     ricsel-engineering

Table 1: Virtual machines and their source images

SOURCE IMAGE        OS                     APPLICATIONS
ricsel-gateway      Linux, Ubuntu 64       SSH Server All, Bird 1.4.0, Dnsmasq Generic, DHCP Server Generic, DHCP Server Linux Specific Options, Firewall Linux Generic
ricsel-dns          Linux, Debian          DNS Bind
www.ricsel.se       Linux, Ubuntu 64       WWW Server Apache2
ricsel-mail         Linux, Debian 64       Mail Server Generic
ricsel-proxy        Linux, Fedora 64       Zaboss Zproxy 1.0
ricsel-office1      Windows, Windows7 64   7-zip 15.12, Adobe Reader 11, Flash 10.1.53.64, Firefox 14.0.1, JRE 1.3.1, Microsoft Office 2013, Skype Generic, Windows Media Player 10.0.0.3997
ricsel-office2      Windows, Windows8 64   7-zip 9.20, Adobe Flash 21, Adobe Reader DC 2015.010.20060, Filezilla 3.5.3, Microsoft Office 2010, PowerPoint 2003 11.0.5529.0, Windows Media Player 11.0.5721.5262, Skype Generic
ricsel-office3      Linux, Ubuntu          Filezilla 3.17.0.1, Mozilla Firefox 47.0, p7Zip 9.20.1, Adobe Flash v11.2.202.644, Skype Generic
ricsel-office4      Linux, Fedora 64       Filezilla 3.16.1, Python 2.7, Adobe Flash 21, Wordpress, Google Chrome 49
ricsel-mobiles      Linux, Fedora x64      Adobe Reader, OpenJDK version 1.8.0 111, Adobe Flash v11.2.202.644, Adobe Reader DC 2015.010.20060, JRE 1.7.0.10, Skype Generic
ricsel-printer      Windows, Windows 2003  Windows Server 2003 Standard Edition SP1, Internet Explorer 6.0.3790.3959, MDAC 2.5, MSXML 3.0 Post SP8, Windows Media Player 10.0.0.3997
ricsel-ftp          Windows, Windows 7 64  Golden FTP Server 4.7
ricsel-directory    Linux, Debian 64       Directory Server LDAP
ricsel-database     Linux, Debian 64       mySQL Server All
ricsel-erp          Linux, Debian 64       Odoo All
ricsel-pricebook    Linux, Fedora 64       GnuCash 2.6.13
ricsel-workflow     Linux, Ubuntu 64       Joget Community Edition 5.0.9
ricsel-cms          Windows, Windows8 64   Joomla 3.6
ricsel-customer     Linux, Ubuntu 64       SuiteCRM 7.7.8
ricsel-engineering  Linux, Fedora 64       7-zip 15.12, Bro 2.4, Filezilla 3.17.01, Mail Server Generic, Python 2.7

Table 2: Applications and OS on each source image

B APPENDIX - Thesis validation

For the thesis validation, two interviews with two separate individuals took place. Mr. Stephan Stålered and Mr. Anders Ahrsjö are the two industry specialists who contributed to this part of the thesis. Their work is related to SCADA systems and infrastructure security, so their opinions are of real value for this work. After the implementation of the reference model in CRATE, Prof. Mathias Ekstedt kindly introduced me to them for an interview of around one hour each. Each interview was built around a core list of questions, but because it took the form of a discussion, different questions and answers came up in each interview after the introductory questions. A selection of the questions and answers is given below, together with the outcome drawn from each interview. The interviews are presented in chronological order, so the interview with Mr. Anders Ahrsjö is shown first.

• Q: From a first look, do you think the infrastructure is realistic?
  A: The infrastructure seems realistic, but only for a small power utility company.
• Q: In a real company, is there any open-source software running?
  A: Some open-source programs are used in an enterprise. Most of the time we are not aware that a program we use is open source.
• Q: Is there any system that is being used in companies like Rics-el that we have not taken into consideration in the architecture?
  A: In our company we have more than 900 systems and more than 400 applications (systems of systems).
• Q: Which of the systems you use would you say are not necessary?
  A: The systems that are outdated.
• Q: Concerning the office network of a company, is there any chance of having 2 or more separate office networks with different access rights to each network (e.g. office network with no Internet access and office network with Internet access)?
  A: There could be N systems, not only 1.
• Q: Following the previous question, is there any chance of having 2 or more separate intranet networks?
  A: The answer is the same as for the previous question.
• Q: What is your opinion regarding the way the systems are separated into the networks? Is this realistic?
  A: Yes, the separation is good.
• Q: What connections and data flow exist between the office networks and the SCADA (industrial control) environment? What applications communicate and what protocols do they use for that?
  A: There should be no connection at all between those two networks.

1st Interview Outline: Although Mr. Ahrsjö helped a lot with this project, his answers did not fully validate this thesis work. Furthermore, he was more interested in the SCADA network, which is not implemented in this thesis, than in the office network.

The interview with Mr. Stephan Stålered is shown below.

• Q: From a first look, do you think the infrastructure is realistic?
  A: Yes, it looks pretty realistic.
• Q: What is your opinion concerning the size of the company?
  A: For a small company, it is just fine.
• Q: In a real company, is there any open-source software running?
  A: In our company we use some open-source software, but we try to reduce this as much as possible. It is something that you do not want in a corporate environment.
• Q: Where do you think the FTP server should be placed? Intranet or DMZ?
  A: In general, nowadays there is no need for an FTP server, unless your company creates graphic models that require a lot of space. Other than that, the FTP server could be located in either of the two networks. You could also have one FTP server in each network.
• Q: Is there any system that is being used in companies like Rics-el that we have not taken into consideration in the architecture?
  A: For the size of company that you simulate, the systems that you use are sufficient.
• Q: Concerning the office network of a company, is there any chance of having 2 or more separate office networks with different access rights to each network (e.g. office network with no Internet access and office network with Internet access)?
  A: You can have 1 more office network and 1 more intranet, each with different access rights, but on the same physical LAN.
• Q: What is your opinion regarding the way the systems are separated into the networks? Is this realistic?
  A: The separation of the systems is mostly done in the way that is most convenient for the company, but in general it seems good.
• Q: In our infrastructure there are 4 networks: DMZ, office, intranet and engineering networks. What do you think of that separation? Are there any other networks in a typical corporate environment?
  A: In our case we have 2 more networks. One is a home network: when an employee takes his work laptop home and wants to connect to the company, he gets access through that network. The other is a network used to "run" the building. Every lamp, TV, plug etc. in our offices is connected straight to that network.

2nd Interview Outline: Mr. Stålered was very positive. The conversation was fruitful, he liked the work done in this thesis, and he confirmed that, taken as a whole, this could be a realistic infrastructure.

C APPENDIX - Data flows source code

In this appendix, the source code of some of the implemented data flows is provided. These scripts can be placed in the virtual machines and run as batch or bash scripts, depending on whether they contain MS-DOS [71] shell commands or are written for Unix-based systems. For this thesis, the following scripts were created:

• The following batch script opens the Google Chrome browser and connects to www.google.com. After 10 seconds the browser window is closed. It can be used on every virtual machine that runs Windows and has Google Chrome installed. The source code for this script is:

    start "" "C:path-to-application\Google\Chrome\Application\chrome.exe" http://www.google.com
    TIMEOUT /T 10
    taskkill /IM chrome.exe

• The following batch script sets the network interface to obtain its address via DHCP and registers the host with the DNS server. The source code for this script is:

    @echo off
    netsh interface ip set address "Local Area Connection" dhcp
    ipconfig /registerdns

• The following bash script connects to the MySQL database server and shows the tables that exist there. The source code for this script is:

    mysql -h database server address -u [user] -p[pass] <

• The following script is a command file for the Windows FTP client. It connects to the FTP server at the given IP address with the correct username and password, changes the local directory, uploads a .txt file and then closes the connection. The source code for this script is:

    open ftp ip address
    username
    password
    lcd c:\path to save txt
    put samplefile.txt
    quit
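
The scripts above mostly target Windows virtual machines, while several source images in Table 2 run Linux. As a minimal sketch, and not part of the scripts created for this thesis, the three client-side flows (web browsing, listing database tables, and uploading a file over FTP) could be written as bash functions along the following lines. The function names, the MYSQL_BIN and CURL_BIN override variables, and the use of the mysql command-line client and curl are assumptions made for illustration; host names, credentials and file names are placeholders.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not from the thesis): possible Linux counterparts of
# the data-flow scripts listed above. All names and defaults are illustrative.

# Open URL in BROWSER, keep it open for SECONDS, then stop that process.
browse_for() {
  local browser="$1" url="$2" seconds="$3"
  "$browser" "$url" >/dev/null 2>&1 &
  local pid=$!
  sleep "$seconds"
  kill "$pid" 2>/dev/null || true   # the browser may already have exited
  wait "$pid" 2>/dev/null || true   # reap the background process
}

# List the tables of database DB on HOST with the mysql command-line client.
# The password is passed on the command line, as in the original listing;
# other local users can see it, so this is only acceptable in a lab setup.
show_tables() {
  local host="$1" user="$2" pass="$3" db="$4"
  "${MYSQL_BIN:-mysql}" -h "$host" -u "$user" -p"$pass" -e 'SHOW TABLES;' "$db"
}

# Upload FILE to the login directory of the FTP server on HOST with curl.
ftp_upload() {
  local host="$1" user="$2" pass="$3" file="$4"
  "${CURL_BIN:-curl}" --silent --show-error -T "$file" \
    --user "$user:$pass" "ftp://$host/"
}
```

For example, assuming the Chrome binary is available as google-chrome, `browse_for google-chrome http://www.google.com 10` would mirror the first script on a Linux machine, and `ftp_upload 192.0.2.10 username password samplefile.txt` would mirror the last one. Stopping the launched process by its process ID is a narrower action than `taskkill /IM chrome.exe`, which targets every process with that image name.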

D APPENDIX - Screenshots from CRATE

In this appendix, screenshots from the topology as implemented in CRATE are provided.

Figure 10: Screen-shot of the RICS-el enterprise

Figure 11: Screen-shot of the DMZ network of the RICS-el enterprise

Figure 12: Screen-shot of the Office network of the RICS-el enterprise

Figure 13: Screen-shot of the Intranet network of the RICS-el enterprise

Figure 14: Screen-shot of the SCADA network of the RICS-el enterprise

TRITA EE 2017:01 ISSN 1653-5146

www.kth.se