
Linköping University | Department of Computer and Information Science Master’s thesis, 30 ECTS | Datateknik 2021 | LIU-IDA/LITH-EX-A--21/044--SE

Automating installation for cyber security research and testing public exploits in CRATE

Att automatisera mjukvaruinstallationer för cybersäkerhetsforskning och testandet av publika angreppskoder i CRATE

Johan Hedlin Joakim Kahlström

Supervisor: Niklas Carlsson
Examiner: Andrei Gurtov
External supervisor: Jonas Almroth

Linköpings universitet SE–581 83 Linköping +46 13 28 10 00, www.liu.se

Copyright

The publishers will keep this document online on the Internet - or its possible replacement - for a period of 25 years starting from the date of publication barring exceptional circumstances. The online availability of the document implies permanent permission for anyone to read, to download, or to print out single copies for his/her own use and to use it unchanged for non-commercial research and educational purposes. Subsequent transfers of copyright cannot revoke this permission. All other uses of the document are conditional upon the consent of the copyright owner. The publisher has taken technical and administrative measures to assure authenticity, security and accessibility. According to intellectual property law the author has the right to be mentioned when his/her work is accessed as described above and to be protected against infringement. For additional information about the Linköping University Electronic Press and its procedures for publication and for assurance of document integrity, please refer to its www home page: https://www.ep.liu.se/.

© Johan Hedlin, Joakim Kahlström

Abstract

As cyber attacks are an ever-increasing threat to many organizations, the need for controlled environments where cyber security defenses can be tested against real-world attacks is increasing. These environments, called cyber ranges, exist across the world for both military and academic purposes of various scales. As the function of a cyber range involves having a set of computers, virtual or physical, that can be configured to replicate a corporate network or an industrial control system, having an automated method of configuring these can streamline the process of performing different exercises. This thesis aims to provide a proof of concept of how the installation of software with known vulnerabilities can be performed and examines if the software is vulnerable directly after installation. The Cyber Range And Training Environment (CRATE) developed by the Swedish Defence Research Agency (FOI) is used as a testbed for the installations and FOI-provided tools are used for launching automated attacks against the installed software. The results show that installations can be performed without Internet access and with minimal network traffic being generated and that our solution can rewrite existing Chocolatey software packages to work with an on-premises repository with an 85% success rate. It is also shown that very few publicly available exploits succeed without any manual configuration of either the exploit or the targeted software. Our work contributes to making it easier to set up environments where cyber security research and training can be conducted by simplifying the process of installing vulnerable applications.

Contents

Abstract
Contents
List of Figures
List of Tables
List of Listings
Glossary
Acronyms

1 Introduction
  1.1 Motivation
  1.2 Aim
  1.3 Research questions
  1.4 Contributions
  1.5 Delimitations
  1.6 Disclaimer
  1.7 Thesis outline

2 Background
  2.1 Cyber Range And Training Environment (CRATE)
  2.2 Package managers for Windows
    2.2.1 Chocolatey
    2.2.2 Windows Package Manager
    2.2.3 Scoop
    2.2.4 Others
  2.3 Automation tools
    2.3.1 Ansible
    2.3.2 Chef
    2.3.3 Puppet
    2.3.4 Salt
  2.4 Virtual machine setup tools
    2.4.1 Boxstarter
    2.4.2 Packer
  2.5 Vulnerability-related naming schemes
    2.5.1 Common Platform Enumeration (CPE)
    2.5.2 Common Vulnerabilities and Exposures (CVE)
  2.6 Metasploit
  2.7 Scanning, Vulnerabilities, Exploits and Detection (SVED)
  2.8 SVED Visualization Tool (SVIZ)
  2.9 Related work

3 Method
  3.1 Automating software installation
    3.1.1 Package managers selection
    3.1.2 Automation tool selection
    3.1.3 Summary of the tool selection
    3.1.4 Chocolatey list of packages
    3.1.5 Feedback from installation
    3.1.6 Database
    3.1.7 Mapping to vulnerabilities
    3.1.8 Online installation tests
    3.1.9 Rate limiting and excessive use
    3.1.10 Internal repository
    3.1.11 Offline installation tests
  3.2 Selecting exploits for evaluation
    3.2.1 Version difference
    3.2.2 Name and vendor difference
    3.2.3 Exploit selection criteria
  3.3 Automatic testing of exploits
    3.3.1 Preparing VMs
    3.3.2 Creating an attack sequence
  3.4 Manual testing of exploits
  3.5 Vulnerable state

4 Results
  4.1 Automating software installation
    4.1.1 With online access
    4.1.2 Evaluating the reliability of Chocolatey’s online repository
    4.1.3 Internal repository
  4.2 Selecting exploits for evaluation
  4.3 Automatic testing of exploits
  4.4 Manual testing of exploits
  4.5 Vulnerable state

5 Discussion
  5.1 Results
    5.1.1 Internal repository
    5.1.2 First usage
    5.1.3 Exploit testing
  5.2 Method
    5.2.1 Alternative methods of performing automated software installations
    5.2.2 Less focus on automated testing
    5.2.3 Source criticism
  5.3 Challenges
    5.3.1 Hitting the rate limit during downloading and testing packages
    5.3.2 Database
    5.3.3 Corrupt output from Ansible
    5.3.4 Strange URLs
  5.4 The work in a wider context

6 Conclusion
  6.1 Research questions
  6.2 Future work
    6.2.1 Improving the mapping from program and version to CPE
    6.2.2 Improving the exploit suggestion process
    6.2.3 Further automating the internalization process

Bibliography

A Automatic exploit test results

List of Figures

2.1 Simplified illustration of CRATE
2.2 Typical attack sequence in SVED
2.3 Screenshot of the attack graph creator in SVED
2.4 Screenshot of the attack graph section of SVIZ
3.1 Information flow for automated software installation
3.2 Internal repository using Sonatype Nexus 3
4.1 Output from the software installation process
4.2 Installation result statistics
4.3 Installation results over time
4.4 SVIZ showing a part of the attack sequence

List of Tables

2.1 Example of CPE entries
3.1 Latest versions and their age (days old as of 2021-02-23) available from different package managers
3.2 Number of versions available from different package managers
3.3 Chocolatey’s comparison of automation tool integration
3.4 Example of database structure
3.5 Possible installation result codes
3.6 Example of Chocolatey packages with CPEs and actions
3.7 Example of version differences for pairs of program versions and CPE versions
3.8 Example of name and vendor differences
3.9 Example of name and vendor differences with partial matches allowed
4.1 Installation result statistics
4.2 Summary of exploit results
4.3 Manually tested exploits
4.4 Manually tested exploits, reason for failure and required action
A.1 Exploit results, file format
A.2 Exploit results, server
A.3 Exploit results, browser

List of Listings

2.1 Ansible YAML example
2.2 Chef Ruby example
2.3 Puppet example based on the official documentation
2.4 Puppet Bolt example
2.5 Salt example
3.1 Package list in JSON format
3.2 JSON example from Ansible
3.3 List of file extensions that are downloaded by the internalizer
3.4 Pseudocode for the version difference metric
3.5 Pseudocode for the name and vendor difference calculation
4.1 Playbook to put Elasticsearch into a vulnerable state
4.2 Playbook to put FreeSWITCH into a vulnerable state
5.1 URL with embedded variable
5.2 Part of the JSON file that got corrupted
5.3 Obfuscated URL

Glossary

Administration network (CRATE): see control plane (CRATE).

Capture The Flag (CTF): Cyber security competition where teams race to acquire text snippets (flags) through hacking exercises.

Control plane (CRATE): The part of CRATE where supporting systems such as exercise control services and virtual machine management are located. See also CRATE.

Cyber range: An isolated environment where cyber attacks and other security research can be performed in a controlled environment.

Event plane (CRATE): The part of CRATE where vulnerable machines are located and where environments are created for various exercises. Isolated from all non-event plane networks and the Internet. See also CRATE.

Game network (CRATE): see event plane (CRATE).

Penetration testing: A process where a group of people attempt to hack into a company’s systems (with their permission) for the purpose of finding and reporting issues that an attacker could take advantage of.

Regular expression (regex): A language used to specify patterns that can be matched against a given text input.

VirtualBox: A program for managing and running virtual machines.

Acronyms

API     Application Programming Interface
CLI     Command Line Interface
CPE     Common Platform Enumeration
CRATE   Cyber Range And Training Environment
CSV     Comma-Separated Values
CTF     Capture The Flag (see Capture The Flag (CTF))
CVE     Common Vulnerabilities and Exposures
CVSS    Common Vulnerability Scoring System
CWE     Common Weakness Enumeration
CyRIS   Cyber Range Instantiation System
DSL     Domain-Specific Language
EULA    End User License Agreement
FOI     Swedish Defence Research Agency
GUI     Graphical User Interface
HTTP    Hyper Text Transfer Protocol
ICMP    Internet Control Message Protocol
IP      Internet Protocol
JSON    JavaScript Object Notation
LAN     Local Area Network
MSI     Microsoft’s Windows Installer
NAT     Network Address Translation
NVD     U.S. National Vulnerability Database
OS      Operating System
PDF     Portable Document Format
PHP     PHP: Hypertext Preprocessor (recursive acronym)
RDP     Remote Desktop Protocol
SCADA   Supervisory Control And Data Acquisition
SMB     Server Message Block
SSH     Secure SHell
SVED    Scanning, Vulnerabilities, Exploits and Detection
SVIZ    SVED Visualization Tool (see also SVED)
URL     Uniform Resource Locator
VM      Virtual Machine
VPN     Virtual Private Network
VXLAN   Virtual eXtensible LAN (see also LAN)
WinRM   Windows Remote Management
XML     eXtensible Markup Language
YAML    YAML Ain’t Markup Language (recursive acronym)

1 Introduction

This chapter presents an introduction to what the project is about and the reason behind it.

1.1 Motivation

As modern society is connected to the Internet to a greater extent than ever before, the threat of cyber attacks is also increasing [1, 2]. Systems that previously were not connected to the Internet, such as Supervisory Control And Data Acquisition (SCADA) units, are increasingly being connected to meet a demand for remote control [3], and concepts like smart cities [4] introduce a wider attack surface [5]. During the last year, the demand to be able to work from home has increased due to the COVID-19 pandemic [6], which introduces yet another attack surface [7]. To counter this threat, multiple projects are in progress around the world to increase both education and research within the cyber security area [8, 9, 10, 11]. There are also efforts to improve the resilience against attacks on the underlying protocols that connect industrial devices [12]. Cyber ranges are well suited for controlled research and practice as they are isolated from the Internet and can provide the necessary resources as well as flexibility for different scenarios [13].

The Swedish Defence Research Agency (FOI) has an objective to increase the cyber defense capabilities of Sweden, where education and research are included. To reach this objective, FOI maintains the Cyber Range And Training Environment (CRATE), which provides a safe environment where cyber attacks and exploits can be tested and demonstrated [14]. The cyber range can be set up with a large number of virtual machines that can, for example, simulate different components in a business network. When testing different exploits it is common for these to only work on a few versions of a vulnerable piece of software, thus necessitating the ability to install specific versions of many different applications in an efficient way.

Today, FOI usually installs applications either by including them in the Operating System (OS) images that are deployed to the virtual machines, or through post-deployment scripts that have to be custom-made for each set of applications. Both of these solutions are rather inflexible, as preparing one OS image for every set of applications would lead to inefficient usage of disk space and take a lot of time to create. Creating installation scripts for every virtual machine would be quicker but require tailoring to the specific OS that the virtual machine contains. As Windows lacks a built-in package manager (although this might be about to change [15]), this would require a number of additional steps to either perform the installation through scripting or by first installing a third-party package manager.

By using a bare-bones base image that only contains the operating system and necessary remote management tools, applications could be installed automatically without having to create multiple images or custom scripts for each combination of software. FOI has created tools and databases that can query which exploits a given application and version is vulnerable to. These can likely also be used for mapping published software vulnerabilities to their corresponding applications and exploits. Having the ability to automatically install the correct software and version to test a given vulnerability would make it faster to demonstrate it for a client or use it as part of an exercise, and it would make it easier to test a large number of exploits without having to manually install each piece of software.

1.2 Aim

The purpose of this thesis is firstly to create a program that can take in a list of applications that should be installed, and then perform the installation of these. By having this program also include information about the relevant vulnerabilities and exploits for each installable application, it should then be possible to install known-vulnerable versions as well. As the targeted machines will be located inside an isolated cyber range with restricted Internet access, the created program should also be able to function without Internet access. It should also minimize the amount of network traffic to these machines to avoid leaving traces that can interfere with exercises inside CRATE. Since publicly available attacks usually target older versions of applications, it is also important that the applications that can be installed today are also available several years in the future when new attacks against them might have been published. Directly relying on installer packages from the Internet might thus create problems if binaries and other installation dependencies are later removed by a software vendor. A secondary aim of the thesis is to utilize this program’s ability to create a vulnerable system, taking advantage of the cyber range available at FOI to test a large number of exploits against a set of machines that have been configured with a large set of vulnerable programs. This serves to validate that the program can be used on a larger scale, as well as provide some data about the accuracy of publicly available exploits and if applications are vulnerable right after installation or if they usually require some configuration first.

1.3 Research questions

RQ1: How can older, potentially vulnerable software be stored reliably to support replicable security research?

RQ2: How can the installation process of such software be automated for scalable tests?

RQ3: Is the software in a vulnerable state from the point of installation and if not, how can it be placed into such a state?

1.4 Contributions

The main contributions of this thesis consist of the following:


• Developing a tool that automatically installs a given set of software using the automation tool Ansible and the Chocolatey package manager. We show that software can be stored in an internal repository for stability and installed automatically into virtual machines in a cyber range without the need for Internet access.

• Providing an insight into how reliable community-made Chocolatey packages are over a period of six years, and developing a tool that can semi-automatically internalize existing packages such that they can be stored reliably for later use and be used without Internet access.

• Using automated attacks from the Metasploit framework to show that a set of automatically selected applications with known vulnerabilities are mostly not vulnerable to attacks right after they have been installed. A method of placing these into a vulnerable state as part of the installation process is also demonstrated for a small set of packages.

1.5 Delimitations

To keep the scope of the thesis focused on creating a process and prototype for automated software installation, the work will only focus on installing software on a single version of Windows. Other versions of Windows, such as Windows 8, are also of interest and should work with small or no modifications to the prototype, but at least initially only one operating system will be targeted. Linux-based machines are also in use at FOI, and they would like to see support for Linux as well in the future, but for now, it would require a large amount of effort to maintain support for both platforms.

The exploit testing process will be divided into two methods. Automated testing will be limited to the types of attacks that can be automated efficiently. Reading through exploit scripts written by many different authors and trying to install dependencies, modifying the scripts to suit each target, and finding the correct parameters would take too much time to allow for large-scale testing of exploits. However, this will be done for a few selected programs as part of manual testing. This thesis will focus on using exploitation frameworks that include many scripts that conform to a given standard and can be configured through parameters that share the same name across all exploits.

Exploits that require manual interaction on the targeted system will only be considered in manual testing unless these actions are of a kind that can be automated. Copying a file to the target system can be automated quite easily, but sending a sequence of keyboard inputs or mouse clicks will be more difficult since each program would require a different set of inputs. FOI has tools that can automate many common tasks and sequences in an exploitation attempt, including copying files to the target and executing them, and these will be utilized to test as many exploits as possible. Developing exploits from scratch or using standalone exploit modules which do not follow a common structure will be considered out of scope for this thesis.

1.6 Disclaimer

The authors of this thesis do not take a stand on whether certain package managers contain more vulnerable software than others. Any existing program could potentially be vulnerable, and a larger selection of software will also lead to a larger number of possible vulnerabilities in this software. Since a vulnerability is usually patched as soon as possible after it has been discovered, all package managers which provide up-to-date versions of their programs are equal in terms of risk to the end-user. The availability of older, vulnerable versions does not indicate that any package manager is unfit for regular use.


1.7 Thesis outline

Chapter 2 presents the thesis background, the project’s environment, and the tools that are used. Tools related to security testing are then presented followed by related work. Chapter 3 contains a comparison between different tools for automation and program installation, presents the process of developing the automated software installer, how exploits are automatically chosen against a set of applications, the method of converting packages for offline use, and how exploits were tested against the applications. Chapter 4 contains the result of attempting to install and internalize a large set of applications, and the outcome of launching hundreds of attacks against a subset of these. Chapter 5 discusses the result and method along with some interesting observations and challenges. In Chapter 6 the conclusions are discussed and the research questions are answered. Lastly, thoughts about future work are presented.

2 Background

In this chapter, the project’s environment together with the tools used is presented. First, the environment and tools regarding the installation process are presented. Second, tools regarding the security tests used are presented. Third, related work within the area is presented.

2.1 Cyber Range And Training Environment (CRATE)

The Cyber Range And Training Environment (CRATE) is a cyber range that allows for a large number of virtual machines to be set up to simulate a large-scale network with multiple virtual organizations and industrial systems. Consisting of approximately 800 physical servers, it is developed and used by FOI for demonstrations, training, and other exercises [14].

The virtual machines in CRATE are split into an isolated game network, where all the vulnerable hosts and attacker-controlled machines are located, and an administration network, which interfaces the game network to the outside world and manages the state and configuration of all virtual machines located inside the game network. The administration network can be accessed through a set of Hyper Text Transfer Protocol (HTTP) Application Programming Interfaces (APIs), which can help automate the setup of large networks.

A simplified illustration with the relevant information for this project can be seen in Figure 2.1. Each relevant part shown in the figure is discussed in the following sections and chapters. In the figure, the administration network is called the control plane and the game network is known as the event plane. The easiest way to access the control plane is by using Secure SHell (SSH) through a Virtual Private Network (VPN), and the event plane by using either SSH or the Remote Desktop Protocol (RDP) tunneled through the physical nodes (which are themselves accessed through the control plane). The workstations that are used for development are located outside of this network, with access to both the Internet and CRATE. They are shown in the top-left of Figure 2.1.


[Diagram labels: Our workstations, Internet, Control plane, Event plane, VPN tunnels, Virtual machines, Database server, SVED, SVED injector, Virtualization servers, SRVproxy, Pegasus, Chocolatey repo]

Figure 2.1: Simplified illustration of CRATE

2.2 Package managers for Windows

While package managers that can install software without interaction from the user are the recommended way of acquiring software in Linux distributions [16], software on Windows systems is either downloaded directly from the vendor or installed through the Microsoft Store [17]. While the Microsoft Store provides a unified method of installing software from multiple vendors, automating the installation of packages from it does not seem straightforward¹. Some software can be distributed in a format that allows for unattended or silent installations, such as Microsoft’s Windows Installer (MSI) format, which has a /quiet option for performing the installation without displaying a user interface [18]. Other applications might be distributed in formats that use a non-standardized way of performing a silent install or might lack the option of a silent install altogether [19].

Since installing applications automatically is desirable for many system administrators [20], solutions have appeared that aim to overcome these issues by using custom-made installation procedures and scripts tailored for each application’s specific way of performing a silent install. Some of these solutions are discussed in the following sections.

Only free-to-use solutions are considered since many paid editions of these programs are licensed either per user or per machine that the software is deployed to. The latter would be prohibitively expensive for FOI due to CRATE containing up to thousands of virtual machines. A user-based licensing model might be more feasible, but could also be very expensive and difficult to manage if every user who wants to create virtual machines in CRATE with a custom set of software requires their own license.

¹ https://serverfault.com/questions/1018220/how-do-i-install-an-app-from-windows-store-using-
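The /quiet option of the MSI format mentioned above can be illustrated with a minimal Python sketch that builds an `msiexec` command line for an unattended installation. The function name and the example path are hypothetical and not taken from the thesis; only the standard `msiexec` switches `/i`, `/quiet`, `/norestart`, and `/l*v` are used.

```python
from pathlib import PureWindowsPath
from typing import List, Optional


def msi_silent_install_command(msi_path: str,
                               log_path: Optional[str] = None) -> List[str]:
    """Build an msiexec command line for an unattended MSI installation.

    The function name and arguments are illustrative assumptions.
    """
    command = [
        "msiexec",
        "/i", str(PureWindowsPath(msi_path)),  # /i: install the given package
        "/quiet",                               # do not display a user interface
        "/norestart",                           # do not reboot after installing
    ]
    if log_path is not None:
        # /l*v: write a verbose installation log to the given file
        command += ["/l*v", str(PureWindowsPath(log_path))]
    return command


if __name__ == "__main__":
    # Hypothetical installer path, used only to show the resulting command line.
    print(" ".join(msi_silent_install_command(r"C:\temp\example.msi")))
```

Returning the command as a list rather than a single string keeps it usable with `subprocess.run` on a Windows host without additional quoting. Installers in other formats would need their own, vendor-specific switches, which is the problem the package managers in the following sections try to solve.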


2.2.1 Chocolatey

Chocolatey² provides a set of tools that can be used to package standalone binaries, scripts, or installers into a unified format that can then be installed, uninstalled, or upgraded through their command-line utility [21]. There is also an officially supported repository³ of community-made packages available for many common desktop applications and services such as Adobe Acrobat Reader, VLC media player, Python 3, MySQL Server, and others. As of 2021-02-08, the repository contains 8 173 packages [22]. For many of these, there are also older versions available, which is of interest since many exploits are only applicable to certain versions before the vulnerabilities they exploit are patched. For example, the oldest version of Mozilla Firefox available in the repository is version 15.0 from 2012. However, due to issues with distribution rights the Chocolatey repository often cannot contain the installer binaries themselves, instead fetching them directly from the application vendor at install time [23]. Thus, it is not guaranteed that these old versions can still be installed. If manual packaging is an option, packages that include these installers can be created and hosted on a local repository for internal use.
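As an illustration of installing a specific, older version from either the community repository or a local one, the following Python sketch builds a `choco install` command line. It is a sketch under stated assumptions: the helper name, the package identifier, and the repository URL are hypothetical, while `--yes`, `--version`, and `--source` are options of the Chocolatey command-line utility.

```python
from typing import List, Optional


def choco_install_command(package: str,
                          version: Optional[str] = None,
                          source: Optional[str] = None) -> List[str]:
    """Build a `choco install` command line (illustrative helper)."""
    command = ["choco", "install", package, "--yes"]  # --yes: no prompts
    if version is not None:
        # Pin an exact version instead of installing the newest one.
        command.append(f"--version={version}")
    if source is not None:
        # Use a specific repository, for example one hosted locally.
        command.append(f"--source={source}")
    return command


if __name__ == "__main__":
    # Hypothetical package name, version, and internal repository URL.
    print(" ".join(choco_install_command(
        "example-package",
        version="1.2.3",
        source="http://repo.example.internal/chocolatey/",
    )))
```

Pinning the version only helps if the installer that the package downloads is still reachable, which is why hosting packages together with their installers on a local repository matters for older software.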

2.2.2 Windows Package Manager

The Windows Package Manager⁴ (also known as WinGet) is a tool from Microsoft that is currently available as a beta version. Like the other package managers, it is capable of installing packages through a command-line interface and retrieves the programs to be installed directly from the vendor [24]. Several common desktop applications such as Adobe Acrobat Reader, Google Chrome, VLC media player, and Python 3 are available from the community repository⁵, which as of 2021-02-08 contains 1 249 packages [25].

2.2.3 Scoop

Scoop⁶ has around 2 013 packages available in its repository as of 2021-02-08 [26, 27], with a mix of developer-oriented command-line tools and applications with a graphical interface. Many common programs such as web browsers and media players are available. It lacks good support for older versions since all applications have one metadata file each, preventing two versions with the same program name from existing. While a repository of different versions is available⁷ with 129 packages, it mainly contains beta versions and major releases of interpreters like PHP: Hypertext Preprocessor (PHP), Python, and Node.js where a new version might be incompatible with older code or make other large changes. The package names here encode a shortened version number, so a package named ”python39” might install Python 3.9.2. By default, Scoop also installs packages for the local user only, which could create problems for exploits that assume that programs are installed in a specified location, but packages can apparently be installed globally as well by supplying a command-line parameter.
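The limitation described above, where one metadata file per program name prevents several versions from coexisting, can be sketched with a dictionary keyed by name. The naming rule in `versioned_name` is an assumption made for illustration (it reproduces the "python39" example) and is not taken from Scoop's documentation.

```python
from typing import Dict


def versioned_name(name: str, version: str, parts: int = 2) -> str:
    """Hypothetical naming rule: append the first `parts` version components."""
    return name + "".join(version.split(".")[:parts])


def register(index: Dict[str, str], key: str, version: str) -> None:
    """Store one metadata entry per key; an existing key is overwritten."""
    index[key] = version


if __name__ == "__main__":
    # One entry per program name: the second version replaces the first.
    main_bucket: Dict[str, str] = {}
    register(main_bucket, "python", "3.9.2")
    register(main_bucket, "python", "3.8.10")
    print(main_bucket)

    # Encoding part of the version in the name lets both versions coexist.
    versions_bucket: Dict[str, str] = {}
    for version in ("3.9.2", "3.8.10"):
        register(versions_bucket, versioned_name("python", version), version)
    print(versions_bucket)
```

The second approach only distinguishes versions down to the encoded components, so two patch releases of the same minor version would still collide under this assumed rule.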

2.2.4 Others

Several other package managers are also available, but many of these either do not target Windows software or have too limited a selection of packages available.

² https://chocolatey.org/
³ https://chocolatey.org/packages
⁴ https://docs.microsoft.com/en-us/windows/package-manager/
⁵ https://github.com/microsoft/winget-pkgs
⁶ https://scoop.sh/
⁷ https://github.com/ScoopInstaller/Versions/tree/master/bucket


Ninite

Ninite⁸ offers a selection of about 90 applications (as of 2021-02-08) [28] that can be installed by selecting which applications should be installed and then downloading a single program that installs them with minimal user interaction. However, this selection needs to be made on Ninite’s website and the resulting installer cannot be run silently without purchasing a per-machine license. The installer always installs the latest version of each selected application, even if the installer itself was downloaded a long time ago, making this solution poorly suited for replicability purposes based on the motivations and aims discussed in Sections 1.1 and 1.2.

AppGet

AppGet⁹ contains about 1400 packages as of 2021-02-08 [29], but like Scoop it also lacks support for older versions due to the same architectural choice of only having one metadata file per application. On 2020-08-01 the service was shut down permanently due to Microsoft releasing WinGet (see Section 2.2.2) [29]. Since it is open-source and the package repository is still available for anyone to reuse for other projects, it would be possible (with some work) to install packages from AppGet, but compared to the alternatives that are actively maintained and offer older versions it might not be a good candidate for this thesis.

NuGet

NuGet10 is another package manager that allows for automated installation of packages and is also what Chocolatey is built on. In contrast to Chocolatey, its repository11 mainly deals with libraries for .NET development [30], and is thus of limited use for regular users. A version of Firefox seems to be available, but only a handful of versions exist, the newest of which is two years old. As of 2021-02-08, the repository contains 227 263 packages [31].

2.3 Automation tools

Several tools exist for installing software automatically and managing a large number of machines. Some of the most common ones are detailed in this section together with some examples of the configuration syntax for each of them. While most of these examples are tailored for Linux-based systems, they can be adapted to work on Windows as well.

2.3.1 Ansible

Ansible12 is an automation tool currently maintained by Red Hat. It aims to automate system deployment, software installation, and other configuration in a simple-to-use way without requiring specialized agent software to be installed on the managed machines [32]. Custom actions can be implemented in Python if the included selection of 3365 modules13 proves insufficient. The YAML Ain’t Markup Language (YAML) format is used to define which actions should be performed on a set of machines. A file containing a set of such actions, structured into a list of tasks, is called a playbook. An example of a short playbook containing two tasks can be seen in Listing 2.1.

8https://ninite.com/ 9https://appget.net/ 10https://www.nuget.org/ 11https://www.nuget.org/packages 12https://www.ansible.com/ 13https://docs.ansible.com/ansible/2.9/modules/list_of_all_modules.html


- name: Configure web servers
  hosts: webservers

  tasks:
    - name: Install latest version of Apache
      apt:
        name: apache2
        state: latest
    - name: Update Apache config file
      template:
        src: files/apache_web.conf
        dest: /etc/apache2/apache2.conf

Listing 2.1: Ansible YAML example

2.3.2 Chef

Chef14, or Chef Infra, is another automation tool, developed by Progress. It focuses on solving the problem of managing multiple machines in an organization. It is divided into three parts: a workstation from which the administrator can initiate changes and administer the network; a Chef server that collects the cookbooks, translates the instructions from the workstation into code and policies, and distributes them to the clients; and the Chef clients, which are the nodes (computers, servers, or virtual machines) that should be controlled [33]. This means that Chef uses a master-agent model where the administrator must upload the instructions, called recipes (or a cookbook if there are multiple recipes), to a server, which in turn converts the instructions into code that is sent to each machine. Each machine has an agent that receives the instructions and executes them. The recipes are written in a Ruby Domain-Specific Language (DSL), which could allow for a more complex set of instructions than YAML but is also more complex to set up and grasp for users not familiar with Ruby. At its foundation Chef is open-source, but the company also offers an enterprise version that provides server hosting and additional support for a cost [34]. Listing 2.2 shows an example of a recipe for setting up an Apache server using Ruby.

14https://www.chef.io/

node.default['main']['doc_root'] = "/vagrant/web"

execute "apt-get update" do
  command "apt-get update"
end

apt_package "apache2" do
  action :install
end

service "apache2" do
  action [:enable, :start]
end

directory node['main']['doc_root'] do
  owner 'www-data'
  group 'www-data'
  mode '0644'
  action :create
end

cookbook_file "#{node['main']['doc_root']}/index.html" do
  source 'index.html'
  owner 'www-data'
  group 'www-data'
  action :create
end

template "/etc/apache2/sites-available/000-default.conf" do
  source "vhost.erb"
  variables({ :doc_root => node['main']['doc_root'] })
  action :create
  notifies :restart, resources(:service => "apache2")
end

Listing 2.2: Chef Ruby example [35]

2.3.3 Puppet

Puppet15 is also a good candidate for automation. The main objective of Puppet is to solve the problem of managing multiple servers and computers, much like Ansible and Chef. It is also open-source and written in Ruby. Similar to Chef, it uses an agent for each client that receives and executes the instructions. The instructions that are executed are written in Puppet’s DSL, also called Puppet Code. See Listing 2.3 for a sample of the syntax. This code is used to describe the desired state of the system, not how to achieve that state [36]. Puppet then applies the necessary steps to achieve this state using the agent. However, as the objective of this project is only to set up a certain environment and not to maintain its state, some of Puppet’s features might be irrelevant for this thesis.

15https://puppet.com/

case $operatingsystem {
  centos, redhat: { $service_name = 'httpd' }
  debian, ubuntu: { $service_name = 'apache2' }
}

package { 'apache2':
  ensure => installed,
}

service { 'apache2':
  name      => $service_name,
  ensure    => running,
  enable    => true,
  subscribe => File['apache2.conf'],
}

file { 'apache2.conf':
  path    => '/etc/apache2/apache2.conf',
  ensure  => file,
  require => Package['apache2'],
  source  => "puppet:///modules/apache2/apache2.conf",
  # This source file would be located on the primary Puppet server at
  # /etc/puppetlabs/code/modules/apache2/files/apache2.conf
}

Listing 2.3: Puppet example based on the official documentation [37]

Puppet Bolt

Puppet Bolt16 is a newer solution from the Puppet developers which is agentless and can use SSH or the Windows Remote Management (WinRM) protocol to connect to machines in a network [38]. It is an open-source project which aims to perform as-needed orchestration directly from a local workstation. Scripts could be executed directly from the terminal or structured into YAML files much like how Ansible does it. See Listing 2.4 for an example of such a file. This could be a potential candidate as it is agentless and seems easy to integrate into an infrastructure.

16https://puppet.com/docs/bolt/latest/bolt.html

parameters:
  targets:
    type: TargetSpec

steps:
  - name: install_apache
    task: package
    targets: $targets
    parameters:
      action: install
      name: apache2
    description: "Install Apache using the packages task"

Listing 2.4: Puppet Bolt example [38]

2.3.4 Salt

Salt17, or SaltStack, is an open-source automation tool now owned by VMware and is based on Python. It is a powerful and scalable automation tool that uses ZeroMQ for fast and flexible communication with the managed nodes [39]. There is also the possibility for a node to act as a proxy for another node so that proprietary systems or resource-limited machines can be managed as well [40]. Salt has a structure with a Salt master and several Salt minions, where the Salt master controls the Salt minions that are located on each machine as an agent. As it uses YAML code to structure its instructions, it is very similar to the other alternatives. An example can be seen in Listing 2.5.

VMware also offers Salt SSH, which is an agentless version of Salt that executes the commands using the SSH protocol. This however negatively affects the speed of Salt, as it loses the benefits of ZeroMQ. This would be an acceptable loss, but unfortunately Salt SSH does not seem to be supported on Windows unless the enterprise version is purchased, which also uses WinRM instead of SSH [41]. This enterprise version seems to be part of VMware’s vRealize Automation offering, which costs around 1 400 SEK each year per managed node [42]. This would be very costly with the hundreds of machines in CRATE and is thus not an option.

17https://www.saltstack.com/

apache2:
  pkg.installed

apache2 Service:
  service.running:
    - name: apache2
    - enable: True
    - require:
      - pkg: apache2

Turn Off KeepAlive:
  file.replace:
    - name: /etc/apache2/apache2.conf
    - pattern: 'KeepAlive On'
    - repl: 'KeepAlive Off'
    - show_changes: True
    - require:
      - pkg: apache2

/etc/apache2/conf-available/tune_apache.conf:
  file.managed:
    - source: salt://files/tune_apache.conf
    - require:
      - pkg: apache2

Listing 2.5: Salt example [43]

2.4 Virtual machine setup tools

In addition to the more general automation tools presented so far, there are also solutions that aim to provide a method of setting up a virtual machine with a certain configuration or set of applications. While these sound like an excellent fit for the use case described in Section 1.1, they come with some limitations that might pose a problem for future expansion or architectural changes.

2.4.1 Boxstarter

Boxstarter18, created by the company Chocolatey Software, aims to simplify the deployment of Windows machines by automatically installing a given set of Chocolatey packages on them. It can install packages on a remote system and handle a couple of common issues that might prevent packages from installing, such as pending reboots, other concurrent installation processes, Windows updates being installed simultaneously, or disk encryption being enabled [44]. One difficulty, however, is that it needs to run on a Windows system and can only manage other Windows systems. Since the physical nodes in CRATE are Linux-based, this means that Boxstarter would have to be run locally on each Virtual Machine (VM) and be controlled via Ansible. It would also not be possible to utilize it to control Linux/Unix-based systems in the future.

18https://boxstarter.org/


2.4.2 Packer

Packer19 aims to automate the creation of virtual machine images for different platforms and can as part of this process also run other automation tools such as those mentioned in Section 2.3 to install software or make other configuration changes [45]. While this would perhaps be a good solution to automate the deployment of virtual machines in CRATE, it would also require that large parts of the existing infrastructure around preparing these are replaced. For this reason, it might be a good option to consider if the deployment process is to be redesigned in the future, but for this thesis it would likely lead to incompatibilities and issues if the created installation tool should be capable of installing packages on VMs that have already been set up by FOI’s existing tools.

2.5 Vulnerability-related naming schemes

2.5.1 Common Platform Enumeration (CPE)

Common Platform Enumerations (CPEs) are a method of representing software products and packages in a structured way [46]. Each CPE entry contains the type of product (e.g. operating system or application), the vendor, product name, and version. The version field can be omitted or replaced with an asterisk if no specific version of the product is referenced. There can also be information about a specific hardware platform, edition, language, etc. included in the CPE, all of which can also be omitted if desired. The current revision of the CPE standard (as of the beginning of 2021) is v2.3 from 2011 [47], but older v2.2 IDs are still common. An example of CPEs for a few applications can be seen in Table 2.1.

Software                  CPE v2.3
Mozilla Firefox 60        cpe:2.3:a:mozilla:firefox:60.0:*:*:*:*:*:*:*
VideoLAN VLC 3.0.12       cpe:2.3:a:videolan:vlc_media_player:3.0.12:*:*:*:*:*:*:*
Windows 10 2004 64-bit    cpe:2.3:o:microsoft:windows_10:2004:*:*:*:*:*:x64:*

Table 2.1: Example of CPE entries
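The structure of the entries in Table 2.1 can be illustrated with a short Python sketch that splits a CPE v2.3 formatted string into its named attributes. The function name parse_cpe23 is hypothetical and not part of any tool described in this thesis, and the sketch deliberately ignores escaped colons, which a complete parser would have to handle.

```python
# Attribute order of a CPE 2.3 formatted string, after the "cpe:2.3:" prefix.
CPE_FIELDS = (
    "part", "vendor", "product", "version", "update", "edition",
    "language", "sw_edition", "target_sw", "target_hw", "other",
)


def parse_cpe23(cpe):
    """Split a CPE 2.3 formatted string into a dict of named attributes.

    Simplifying assumption: no attribute contains an escaped colon.
    """
    pieces = cpe.split(":")
    if len(pieces) != 13 or pieces[0] != "cpe" or pieces[1] != "2.3":
        raise ValueError("not a CPE 2.3 formatted string: %r" % cpe)
    return dict(zip(CPE_FIELDS, pieces[2:]))


win10 = parse_cpe23("cpe:2.3:o:microsoft:windows_10:2004:*:*:*:*:*:x64:*")
print(win10["part"], win10["vendor"], win10["product"],
      win10["version"], win10["target_hw"])
```

An asterisk in a field means that no specific value is referenced, which is why most of the trailing attributes in Table 2.1 are wildcards.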

2.5.2 Common Vulnerabilities and Exposures (CVE)

Common Vulnerabilities and Exposures (CVEs) are a way of assigning a unique ID to a published vulnerability in software. This ID takes the form CVE-YYYY-NNNN, where YYYY is the year the vulnerability became publicly known (or when the CVE was requested, if earlier) and NNNN is a unique number of four digits or more. Each CVE entry contains a description of the vulnerability and references to relevant information such as security advisories or vendor websites [48]. The U.S. National Vulnerability Database (NVD) also maintains a database with additional information for every CVE, namely a Common Vulnerability Scoring System (CVSS) score indicating the severity of the vulnerability, a Common Weakness Enumeration-ID (CWE) that indicates which category of flaw gave rise to the vulnerability (such as out-of-bounds write, use-after-free, or improper authentication), and affected CPEs [49].
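As a small illustration of the ID format described above, the following Python sketch validates a CVE ID and extracts its year and sequence number. The helper name parse_cve_id is an assumption made for this example only.

```python
import re

# "CVE-", a four-digit year, "-", and a sequence number of four or more digits.
CVE_PATTERN = re.compile(r"CVE-(\d{4})-(\d{4,})")


def parse_cve_id(cve_id):
    """Return (year, sequence_number) for a well-formed CVE ID."""
    match = CVE_PATTERN.fullmatch(cve_id)
    if match is None:
        raise ValueError("malformed CVE ID: %r" % cve_id)
    return int(match.group(1)), int(match.group(2))


year, number = parse_cve_id("CVE-2017-0144")
print(year, number)
```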

19https://www.packer.io/


2.6 Metasploit

Metasploit20 is a penetration testing framework developed by Rapid7 that contains many modules for scanning networks, exploitation, obfuscation, and more [50]. These modules are written in Ruby and conform to a standardized specification that makes it easy to run modules as part of an automated process. As of 2021-02-26, there are 2102 modules available in the ”exploits” category, which all aim to exploit either network-based or local vulnerabilities in applications or embedded systems. The exploits are further divided into groups depending on their method of delivery, or what type of service they target. Remote exploits are launched against a network service without user interaction, while browser exploits attempt to exploit a user’s web browser. File format exploits are delivered as a file that the user needs to open with the vulnerable program.

There are also 592 payloads that can be delivered via an exploit to perform some operation. There are payloads for many operating systems and programming languages that can be chosen depending on the exploit target, e.g. an operating system service or a web app. Many of these platforms also have multiple payloads for different actions such as creating a shell that connects back to the attacker, creating a remote desktop session, adding a user account, rebooting the machine, formatting the hard drive, or playing a message over the speakers using text-to-speech.

2.7 Scanning, Vulnerabilities, Exploits and Detection (SVED)

Scanning, Vulnerabilities, Exploits and Detection (SVED) is a framework for designing and executing chains of attacks against any number of hosts in an automated way [51]. It is developed by FOI and used in CRATE for automating attacks using the Metasploit framework (see Section 2.6). An attack in SVED can consist of several actions and flow control that determines which action should be taken depending on the result of a previous one. For example, an attack against a network can start with a port scan of a certain machine. If port 80 (used for HTTP traffic) is open and the returned version matches an expected value, an exploit can be executed against that machine. If successful, the next action could be set up to gather passwords from the compromised machine which are then used to attack a database server, and so on. These attack chains could also include redundancy for failed attacks, such that another service is targeted instead of HTTP if no web server is running, or another machine might be selected as the target. These chains of events are set up manually, but FOI has also created a prototype of another framework that uses artificial intelligence to automatically select the best course of action for every given situation. This should allow for more automated exploitation of networks in CRATE, but so far it is not viable to use in practice.

For the purpose of this thesis, the main parts of SVED that are relevant are the administrative interface and the injectors. The administrative interface contains an API and a Graphical User Interface (GUI) for creating sequences of actions, and the injectors are virtual machines that can automatically be placed into the correct network segment inside CRATE’s event plane for the purpose of actually launching the attacks that are directed by the administrative part of SVED. Both the SVED manager service and an injector are shown in the center of Figure 2.1. Also see Figure 2.3 for an example of this interface.
An overview of the targeted network is displayed in the left-hand pane, and attacks and actions can be added and manipulated in the right-hand pane. The puzzle-piece icon marked with a red background is the starting point of the attack graph. Depending on the result of the action, execution can continue along either the green arrow for a successful action or the red for a failed one. The displayed example shows a small segment of a larger attack test, where a typical sequence of events is followed as described in Figure 2.2. The VM is first restored from a previously taken snapshot, then the injector is placed into the correct network segment, after which it checks if the target is responding, and finally it launches the exploit. If it succeeds, the established session is logged as a success and then destroyed, and then (or if any step fails) the VM is again restored from a snapshot in preparation for the next attack.

20https://www.metasploit.com/

1. Restore VM: Restore a snapshot of the VM's state taken before any attacks were launched.
2. Prepare injector: Place the injector into the same network as the target so that the injector can communicate with it.
3. Check target VM: Attempt to ping the target VM to see if it is functioning correctly.
4. Launch exploit: The injector launches the exploit against the target machine.
5. Collect logs: Logs are collected from the injector and stored for later analysis.

Figure 2.2: Typical attack sequence in SVED

Figure 2.3: Screenshot of the attack graph creator in SVED
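The flow control described for SVED, where each action has one successor for success and one for failure, can be sketched in Python as follows. This is not SVED's actual data model or API; the Action class, the execute_chain function, and the action names are assumptions made purely for illustration, and the actions themselves are stubs.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Action:
    name: str
    run: Callable[[], bool]            # returns True if the action succeeded
    on_success: Optional[str] = None   # next action; None ends the chain
    on_failure: Optional[str] = None


def execute_chain(actions: Dict[str, Action], start: str,
                  max_steps: int = 100) -> List[Tuple[str, bool]]:
    """Follow the success/failure edges from `start` and record each result."""
    trace = []
    current = start
    while current is not None and len(trace) < max_steps:
        action = actions[current]
        ok = action.run()
        trace.append((action.name, ok))
        current = action.on_success if ok else action.on_failure
    return trace


# Stubbed actions mirroring the typical sequence; the exploit is made to fail.
chain = {
    a.name: a for a in [
        Action("restore_vm", lambda: True, "prepare_injector", "final_restore"),
        Action("prepare_injector", lambda: True, "check_target", "final_restore"),
        Action("check_target", lambda: True, "launch_exploit", "final_restore"),
        Action("launch_exploit", lambda: False, "log_session", "final_restore"),
        Action("log_session", lambda: True, "final_restore", "final_restore"),
        Action("final_restore", lambda: True),
    ]
}

trace = execute_chain(chain, "restore_vm")
for name, ok in trace:
    print(name, "succeeded" if ok else "failed")
```

Because the stubbed exploit fails, execution skips the session logging step and goes directly to the final restore, which corresponds to following the red arrow in the attack graph.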

2.8 SVED Visualization Tool (SVIZ)

To analyze the logs generated from SVED, the SVED Visualization Tool (SVIZ) is used to gain an understanding of which actions and attacks succeeded and which failed. SVIZ was developed by Bedhammar and Johansson as part of their thesis project at FOI and can process the log files emitted by SVED to display the results in relation to their corresponding action in SVED [52]. See Figure 2.4 for an example of the interface. Exploits are represented with triangles and supporting actions such as resetting VMs are represented by cogwheels. Successfully completed actions are displayed in green, pending ones in yellow, and failed actions are marked with red. By using this tool, locating interesting events in a larger set of attacks is significantly easier, especially when having multiple machines running in parallel.


Figure 2.4: Screenshot of the attack graph section of SVIZ

2.9 Related work

Cyber range testing is an area that is currently evolving around the world [10, 13]. It is not a new idea, but it is highly relevant now that we are connected online more than ever. The need for an isolated space where cyber security and cyber defense can safely be tested and practiced is increasing. A cyber range is also a suitable place to arrange Capture The Flag (CTF) competitions, which are useful for gaining practical experience within different areas of cyber security. There is development in progress within academia, the industry, and the military, some with more open information available than others [11]. This section reviews some of the related work that is openly available.

At FOI there has already been a lot of research using CRATE. Holm and Sommestad [53] used CRATE for an empirical study of how reliable automated attacks are when used with as little modification as possible. The aim was to show whether advanced knowledge of systems and computers is necessary to perform a hack, or if a person without specialized knowledge can succeed thanks to the advancement of ready-made offensive cyber tools. The study performed 1223 exploitation attempts using 45 unique exploits on a total of 204 virtual machines, where the exploits were chosen based on the result of automatic vulnerability scanning. Only eight of the launched exploit attempts succeeded, all of which were from a single unique exploit module.

Gustafsson and Almroth studied in their article [54] the tools used in the automation of cyber ranges. First, they look at the current status and research within the field of automation for cyber ranges. Then they describe how CRATE is built, what automation tools are used within this system, and what role the tools have in research and training. Later the authors compare which automated tools are currently used at different cyber ranges and state which parts of each cyber range are automated.
This article concludes that automation has been utilized within cyber ranges for many years, but there is still a need for further improvements to be able to meet the demand for research and training capabilities.

The task of finding relevant published vulnerabilities given a software’s name and version has been examined in earlier studies. Since CVEs are of interest to systems administrators that wish to keep the machines under their control safe from vulnerabilities, Sanguino and Uetz [55] have examined the possibility of using an inventory of installed software products to automatically pick the best matching CPEs for each installed software. This would then be combined with automated monitoring of newly published CVEs to alert an administrator if a new vulnerability is published for a CPE that matches an installed piece of software. The authors use the edit distance between two strings to compare installed programs against CPEs, but also take advantage of the fact that their input data has separate fields for vendor and software name. This allows the matching algorithm to only match name ↔ name and vendor ↔ vendor instead of accidentally allowing matches between vendor and name from different software products. The authors conclude that using a fully automated assignment of CPEs in this fashion leads to many instances where the matching algorithm assigns erroneous identifiers because of differences in naming conventions between the application vendor and the CPE directory. Thus, a manual review of the suggested candidates from the automated system was necessary to maintain an inventory of CPEs that could be used for matching against the feed of new vulnerabilities.

Pham et al. propose the Cyber Range Instantiation System (CyRIS) [56] as a tool to create and set up a cyber range automatically. In many ways, it is similar to how CRATE works, with the use of virtual environments and setup automation. However, it is limited to one OS as of publication, CentOS 7, and is mainly focused on setting up specific training scenarios. CRATE already has many parts implemented and is more versatile in that it is used for both research and training at different scales. Parts of the work could be of interest but would be of more use in the setup of Linux machines, which are not in the scope of this thesis.
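The field-wise matching idea attributed to Sanguino and Uetz above can be illustrated with a minimal Python sketch. It is not their implementation: the normalization step (lowercasing and replacing spaces with underscores), the summed score, and the small candidate list are all assumptions made for this example. Only the use of edit distance on separate vendor and name fields is taken from the description.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def normalize(text):
    # Assumed normalization toward CPE naming conventions.
    return text.strip().lower().replace(" ", "_")


def best_cpe_match(vendor, product, candidates):
    """Pick the (vendor, product) candidate with the lowest summed distance.

    Vendor is only compared with vendor, and product only with product.
    """
    def score(candidate):
        cand_vendor, cand_product = candidate
        return (levenshtein(normalize(vendor), cand_vendor)
                + levenshtein(normalize(product), cand_product))
    return min(candidates, key=score)


# Illustrative (vendor, product) pairs in CPE style.
CANDIDATES = [
    ("mozilla", "firefox"),
    ("videolan", "vlc_media_player"),
    ("libreoffice", "libreoffice"),
]

match = best_cpe_match("VideoLAN", "VLC media player", CANDIDATES)
print(match)
```

A sketch like this always returns some candidate, even when none is a good fit, which hints at why the authors found that a manual review of the suggestions was still needed.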
There are several methods for automating software installation as described in Section 2.2 which all rely on a preexisting repository of scripts for passing the correct flags or options to each piece of software. If the desired software is not available in such a repository, it would have to be packaged in some way to allow for unattended installation. A study of different methods of achieving this [57] examines four ways that applications could be automatically installed. A script that sends keyboard and mouse input to interact with the software installer is the basis for three of these, where the method of generating the script differs. It can either be created by hand, by automatically finding text and buttons that according to a ruleset indicate which stage the installation process is in and which action should be taken, or by monitoring a user installing the software and recording the actions taken by that user. The fourth method presented is to again let a user install the program, but record all files and registry entries that are created and package these into a bundle that can later be recreated on a target machine. The author concludes that the methods that rely on interacting with the installer might be too time-consuming for the process to be worth the effort, or too unreliable in different environments where parameters like screen resolution might change the position of a button from what the script is expecting. They instead suggest that the method involving recording changes to the filesystem and the registry is the most reliable one due to it not needing to interact with the software installer at all. While a package manager with already written scripts for installing packages would likely be the most convenient solution, these alternative methods of installing applications can be more relevant when it comes to automatically preparing the software for exploitation attempts or general use.
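The filesystem half of the fourth method can be sketched as a before/after comparison of a directory tree. The Python example below is a simplified illustration and not the implementation from [57]: it hashes every file under a root directory, and the registry part (which is Windows-specific) is left out entirely. The function names and the file names in the demonstration are hypothetical.

```python
import hashlib
import os
import tempfile


def snapshot(root):
    """Map each file path (relative to root) to a SHA-256 digest of its content."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            with open(path, "rb") as handle:
                digest = hashlib.sha256(handle.read()).hexdigest()
            state[os.path.relpath(path, root)] = digest
    return state


def diff_snapshots(before, after):
    """Return the files that were created and the files whose content changed."""
    created = sorted(set(after) - set(before))
    modified = sorted(path for path in set(before) & set(after)
                      if before[path] != after[path])
    return created, modified


with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "settings.ini"), "w") as handle:
        handle.write("first_run=true\n")
    before = snapshot(root)

    # Stand-in for a user running an installer and starting the program once.
    with open(os.path.join(root, "app.exe"), "w") as handle:
        handle.write("binary placeholder\n")
    with open(os.path.join(root, "settings.ini"), "w") as handle:
        handle.write("first_run=false\n")
    after = snapshot(root)

created, modified = diff_snapshots(before, after)
print("created:", created)
print("modified:", modified)
```

The resulting lists identify which files would have to be bundled and recreated on a target machine.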
As stated in the research questions (Section 1.3), the software that is to be installed in CRATE should ideally be configured such that it is vulnerable to the attacks that are to be tested against it. This would in most cases include emulating a normal working environment where the programs have been started at least once, as opposed to a fresh installation where a license agreement might be displayed before the program is launched. This could interfere with the attack if, for example, the program does not expose certain functionality (or even launch) until the agreement has been accepted. From an overview of the installation scripts supplied in the repositories of existing package managers for Windows (see Section 2.2) it seems like these are only concerned with installing the application and not with the initial launch or configuration of it. Using a script or a registry and filesystem snapshot, as examined in [57], could be a solution for configuring applications and software packages to be in the proper state for them to be vulnerable to attacks.

A study by Dashevskyi et al. [58] aims to provide a solution for creating environments for security testing in a repeatable way. The authors mainly focus on web applications and related exploits, with Docker21 containers being used instead of virtual machines to host the vulnerable services. Configuration of these can be done through the use of Dockerfiles, which can contain instructions on how to build the containers via shell commands and the copying of files from the host system. This approach could likely be used for other applications as well, but might not be a good fit for client-side applications since Docker containers are supposed to run isolated from both each other and the rest of the system [59]. Combined with the limited support that Docker has for Windows, this could make it difficult to run regular desktop applications that use GUIs or use other types of functionality that might not be accessible from inside a container.

21https://www.docker.com/

3 Method

This chapter presents the method that was used, the process of building the tool and testing the result.

3.1 Automating software installation

The contents of this section primarily serve as a method of answering RQ2, but they also contain information that is relevant for the other research questions. The information flow that this work aims towards can be seen in Figure 3.1, where the administrator specifies in JavaScript Object Notation (JSON) format which programs each machine should have installed. An example is shown in Listing 3.1. This should then be processed to be compatible with an automation tool that orchestrates the installation process. As CRATE utilizes VirtualBox to host the virtual machines, it would be preferable to let the communication go through VirtualBox’s API, as this would reduce the amount of generated network traffic that could disturb exercises. SSH or WinRM could be used if the VirtualBox API does not work. The automation tool then instructs the package manager on the virtual machine which programs to install. Logs from the installation should then be transferred back to the administrator in JSON format. To decide which package manager and automation tool would be suitable for this project, those presented in Sections 2.2 and 2.3 will be compared in the following section.


{
  "vm_name": "win10-ent2004-x64-off2013.src",
  "username": "Administrator",
  "password": "admin_password",
  "packages": [
    {
      "name": "vlc",
      "version": "3.0.14",
      "platform": "win_chocolatey"
    },
    {
      "name": "firefox",
      "version": "89.0",
      "platform": "win_chocolatey"
    },
    {
      "name": "git",
      "version": "2.31.1",
      "platform": "win_chocolatey"
    }
  ]
}

Listing 3.1: Package list in JSON format
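To illustrate how a package list like the one in Listing 3.1 could be processed for an automation tool, the following Python sketch converts it into a list of Ansible-style task definitions. This is only one possible approach and not necessarily how the developed tool works: the function to_ansible_tasks is hypothetical, and the sketch assumes that the "platform" value can be used directly as the name of an Ansible module (here win_chocolatey, with its name, version, and state parameters). The credentials from the listing are left out since they are not needed for the conversion.

```python
import json

# Shortened version of the package list, without credentials.
PACKAGE_LIST = """
{
  "vm_name": "win10-ent2004-x64-off2013.src",
  "packages": [
    {"name": "vlc", "version": "3.0.14", "platform": "win_chocolatey"},
    {"name": "firefox", "version": "89.0", "platform": "win_chocolatey"},
    {"name": "git", "version": "2.31.1", "platform": "win_chocolatey"}
  ]
}
"""


def to_ansible_tasks(spec):
    """Build one task per package, keyed by the module named in "platform"."""
    tasks = []
    for package in spec["packages"]:
        tasks.append({
            "name": "Install %s %s" % (package["name"], package["version"]),
            package["platform"]: {
                "name": package["name"],
                "version": package["version"],
                "state": "present",
            },
        })
    return tasks


tasks = to_ansible_tasks(json.loads(PACKAGE_LIST))
# JSON is a subset of YAML, so this output can be read as a YAML task list.
print(json.dumps(tasks, indent=2))
```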

[Figure 3.1 diagram labels: Control Plane, Event Plane, Administrator, Physical Node, Virtual machines]

Figure 3.1: Information flow for automated software installation

3.1.1 Package managers selection

A comparison of the latest available versions and their age for a few applications is shown in Table 3.1. Some package managers have packages that refer to installers or Uniform Resource Locators (URLs) which point to the latest version available from the software vendor; these are marked with ”latest” in the table. These packages do not need to be updated by their author whenever a new version of the packaged application is released, but it becomes difficult to determine which version will be installed since the one specified in the package’s description or metadata will not be accurate.

                  Google Chrome          Mozilla Firefox    LibreOffice       VLC
Package manager   Version         Age    Version    Age     Version    Age    Version   Age
Chocolatey        latest          1      85.0.2     14      7.1.0      20     3.0.12    37
WinGet            latest          1      84.0       70      7.0.1      179    3.0.11    253
Scoop             87.0.4280.88    83     85.0.2     14      7.1.0.3    17     3.0.12    37
AppGet            latest          1      latest     0       6.4.2.2    334    3.0.11    253
Ninite            88.0.4324.190   1      85.0.2     14      7.1.0      20     3.0.12    37

Table 3.1: Latest versions and their age (days old as of 2021-02-23) available from different package managers

While having recent versions of software available is important for many users, the availability of older versions is of interest for the questions examined in this thesis. Since as many exploits as possible should be able to be tested, there is a need for a large set of older versions to increase the chance that a vulnerable version that matches the exploit is found. A comparison of the availability of older versions of a few common applications is shown in Table 3.2. Note that Google Chrome is only available as an auto-updating package from Chocolatey and WinGet, meaning that it is not possible to install older versions since all older versions in the repositories still download the latest version of Chrome during the installation process.

From Table 3.2 it is clear that Chocolatey offers the largest selection of versions, and thus it is the most relevant for use in this project. While the presence of older versions in the package repository is not a guarantee that they all work, as some applications may have been removed by the vendor or otherwise broken over time, having the ability to attempt an installation of an older version is better than only being able to install the latest version.

As mentioned in Section 1.6, the presence of older versions, which might contain vulnerabilities, does not at all indicate that a certain package manager is more vulnerable or worse than another. A regular user should typically always install the latest available stable version of each package, which is the default for all the examined package managers. The availability of older versions does not constitute a security risk to regular users of these managers.

Scoop, AppGet, and Ninite all lack the ability to specify a version when installing a package, which limits the number of available versions to one. Installing older versions with Scoop or AppGet could be done by extracting older versions of a package’s metadata from the version control systems used by these tools to manage their software repositories. However, this requires additional work if two applications from different points in time are to be installed at the same time, and also complicates the process of locating the correct package given a desired software name and version.

Package manager  Google Chrome  Mozilla Firefox  LibreOffice  VLC
                 (versions)     (versions)       (versions)   (versions)
Chocolatey       1              192              42           47
WinGet           1              18               9            2
Scoop            1              1                1            1
AppGet           1              1                1            1
Ninite           1              1                1            1

Table 3.2: Number of versions available from different package managers

3.1.2 Automation tool selection

The automation tools listed in Section 2.3 all present similar solutions to the same problem: performing a sequence of actions automatically on a set of remote machines. Thus, the choice of which one to use is mostly dependent on the need for special features (such as the preference for agentless solutions in this thesis), personal preference regarding the language used to define actions, or which tools have been used earlier for other projects. In this case, existing work has already been done at FOI using Ansible in combination with VirtualBox, and we have some previous experience with Ansible and Python.

The Chocolatey developers have also compared different automation tools that Chocolatey works well with [60]. As seen in Table 3.3, Ansible and Puppet both support all examined features from Chocolatey. As Ansible has support for all the examined features and is agentless, it is a fitting solution to use for the rest of the thesis.

Configuration managers compared: Ansible, Chef, Puppet, Salt

Features examined:
Manage packages
Install Chocolatey
Install Chocolatey from internal source
Manage sources
Manage source type
Manage features
Manage config settings

Table 3.3: Chocolatey’s comparison of automation tool integration [60]

3.1.3 Summary of the tool selection

From Sections 3.1.1 and 3.1.2 it can be seen that Ansible and Chocolatey seem fitting for use in this thesis, and they will therefore be the basis for the rest of the work described in this chapter. While tools like these can be used in combination to install programs automatically, FOI would like a tool that can produce a report on what software and versions are available to be installed, how the installation went, what errors if any were encountered, etc. in a format that can be integrated into another interface. To avoid making large changes to this future integration, it would also be beneficial to have this tool act as a unified interface to several package managers if one lacks some desired applications or other operating systems are to be supported.

This means that Ansible and Chocolatey can be used to do the heavy lifting, with our developed tool serving as a wrapper to these that exposes the needed functionality in a machine-readable format. If needed, this construction also allows for the replacement of Ansible/Chocolatey if another tool is later determined to be a better option.

3.1.4 Chocolatey list of packages

While FOI had an existing list of 1 615 Chocolatey packages with different versions for 149 unique packages, this was a small subset of the thousands of packages available in Chocolatey’s community repository. There was also more information available about each package, such as the name of the vendor, a description, publication date, and more, which could be useful to have. Unfortunately, this information is not easily gathered without making thousands of requests to Chocolatey’s web API.

Through monitoring the traffic generated when using Chocolatey’s Command Line Interface (CLI) tools to view available packages, it was discovered that it sends around 700 requests totaling 120 MB of data to display the names and latest version of around 5000 packages. Based on a few package samples, it was determined that each package has around 10 versions on average, which would require 2 000 requests to download all packages and versions. Since this would be equivalent to making three "list packages" calls with the CLI tools, it was deemed non-invasive enough that it would be feasible. A one-second delay was introduced between each request to decrease the impact on Chocolatey’s servers, and the entire package listing was then downloaded. In total, 3 118 requests were needed to accomplish this, which was slightly more than expected but still within acceptable limits.
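The delayed download described above can be sketched as follows. This is a minimal illustration, not the script used in the thesis: the function name download_all and the injected fetch and sleep callables are assumptions made so that the example does not depend on any particular HTTP library or on the layout of Chocolatey's API.

```python
import time
from typing import Callable, Iterable, List


def download_all(urls: Iterable[str],
                 fetch: Callable[[str], bytes],
                 delay_s: float = 1.0,
                 sleep: Callable[[float], None] = time.sleep) -> List[bytes]:
    """Fetch every URL in order, pausing delay_s seconds between requests.

    fetch is a hypothetical callable that performs one request; it is
    passed in so that the pacing logic can be tried without network access.
    """
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_s)  # no pause is needed before the first request
        results.append(fetch(url))
    return results
```

With a one-second delay, the 3 118 requests mentioned above spend about 52 minutes in pauses alone, in addition to the time taken by the requests themselves.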

3.1.5 Feedback from installation

Feedback from the installations on the virtual machine is needed in order to see their results. Without it, it is hard to know which software exists on the virtual machine. It is also relevant to know why an installation failed. As the feedback Chocolatey gives from the installations is local to each virtual machine, the information needs to be transferred to the host in a suitable manner.

Chocolatey prints its feedback directly to the terminal, with the possibility to set a verbose flag to control the amount of information. Ansible has good support for extracting the response from Chocolatey, which can be converted into JSON files. By extracting information from these JSON files, only the most relevant information can be filtered out and displayed in the terminal or somewhere else. The JSON files consist of all the logs regarding Ansible and Chocolatey, how the tasks went, potential error codes, and more. Listing 3.2 contains an example of this output for a package installation that failed due to a file that was missing from the vendor’s servers.
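As an illustration of the filtering step, the sketch below reduces one event to a handful of fields. The field names follow the structure shown in Listing 3.2, while the function name summarize_event and the choice of which fields to keep are assumptions made for this example and not necessarily what the developed tool does.

```python
import json


def summarize_event(event: dict) -> dict:
    """Reduce one Ansible runner event to the fields worth reporting."""
    data = event.get("event_data", {})
    res = data.get("res", {})
    lines = res.get("stdout_lines") or []
    return {
        "host": data.get("host"),
        "task": data.get("task"),
        "failed": event.get("event") == "runner_on_failed",
        "rc": res.get("rc"),
        "msg": res.get("msg"),
        # Last line of the captured Chocolatey output, if any
        "last_output": lines[-1] if lines else None,
    }


# Trimmed-down version of the event shown in Listing 3.2
SAMPLE_EVENT = json.loads("""
{
  "event": "runner_on_failed",
  "event_data": {
    "task": "Install WindowsAzureLibsForNet 2.9",
    "host": "win10-ent2004-x64-off2013.src",
    "res": {
      "rc": 404,
      "msg": "Error installing package(s) 'WindowsAzureLibsForNet'",
      "stdout_lines": [
        "Installing the following packages:",
        "Chocolatey installed 0/1 packages. 1 packages failed."
      ]
    }
  }
}
""")

summary = summarize_event(SAMPLE_EVENT)
```

Missing keys yield None instead of raising an exception, so the same function can be applied to events that carry no result data.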

3.1.6 Database

To keep track of all the programs and the associated information about attacks, installation results, versions, and more, a database was set up. To simplify the deployment of this database, it was placed on one of the workstations that were used during development (seen in the top-left of Figure 2.1). MariaDB¹ was chosen as the database engine so that both authors could collaborate on updating the collection of installation results and other data. The database table consists of eleven columns: software, cpe, action, name, version, verified_working, installs_ok, result, interaction_required, comment, and result_internal. The first three are from the Chocolatey list of packages above, followed by a parsed name and version from the software column and then information about the installation. An example of this can be seen in Table 3.4.

The output from the installation process for each package is parsed for common error strings to present a more condensed version of the error message. A result code based on this is also stored in the database to indicate what type of error occurred. Table 3.5 contains a list of these result codes.

name        version  verified_working  installs_ok  result           interaction_required  ...
Git         2.19.0   False             True         ok               —                     ...
OpenOffice  4.1.7    False             True         ok               —                     ...
openvpn     2.2.2    False             True         installer_hangs  yes                   ...

Table 3.4: Example of database structure
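The parsing of installer output into a result code could look roughly like the sketch below. The result codes are those of Table 3.5 and the first search string is taken from the output in Listing 3.2. The remaining search strings are illustrative placeholders only, since the strings actually searched for are not listed here, and the function name classify is likewise an assumption.

```python
# (result code, substring searched for in the lower-cased output).
# Only the first substring is taken from Listing 3.2; the others are
# placeholders and NOT verified Chocolatey messages.
ERROR_PATTERNS = [
    ("not_found", "the remote file either doesn't exist"),
    ("checksums_mismatch", "checksum"),
    ("possibly_broken", "possibly broken"),
    ("not_in_repo", "package was not found"),
]


def classify(output_lines, succeeded, timed_out=False):
    """Map the outcome of one installation to a result code."""
    if timed_out:
        return "installer_hangs"
    if succeeded:
        return "ok"
    text = "\n".join(output_lines).lower()
    for code, needle in ERROR_PATTERNS:
        if needle in text:
            return code
    return "other_error"
```

The patterns are tried in order, so more specific strings should be placed before more general ones.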

3.1.7 Mapping to vulnerabilities

To answer RQ3, exploit testing against the installed programs will be performed. To know what exploit to use against a specific program, a mapping needs to be created between the program and possible exploits. As an exploit often is tied to a specific version or multiple versions of a program, the use of CPE codes (as described in Section 2.5.1) creates a unified and standardized name for each program and version. Each of these is then matched to a set of possible Metasploit modules that can be used to exploit the installed program.

¹ https://mariadb.org/


{
  "uuid": "39d0dd24-2e12-4a75-a1f3-52b107db3ed3",
  "counter": 25,
  "stdout": "fatal: [win10-ent2004-x64-off2013.src]: FAILED! => {\"changed\": false, \"command\": \" [...]\"",
  "start_line": 26,
  "end_line": 28,
  "runner_ident": "d41cedc2-1f22-4c14-8036-1bb3bc4420c1",
  "event": "runner_on_failed",
  "pid": 64765,
  "created": "2021-05-04T11:56:22.861142",
  "parent_uuid": "9f306331-db9d-d558-fcd1-00000000000d",
  "event_data": {
    "playbook": "playbook.yml",
    "playbook_uuid": "e05eb35e-ad1c-4e3a-a992-7e8b53bb795a",
    "play": "Install Chocolatey packages",
    "play_uuid": "9f306331-db9d-d558-fcd1-000000000006",
    "task": "Install WindowsAzureLibsForNet 2.9",
    "task_uuid": "9f306331-db9d-d558-fcd1-00000000000d",
    "task_action": "win_chocolatey",
    "task_args": "",
    "task_path": "playbook.yml:36",
    "host": "win10-ent2004-x64-off2013.src",
    "remote_addr": "win10-ent2004-x64-off2013.src",
    "res": {
      "changed": false,
      "invocation": {
        "module_args": { [...] }
      },
      "rc": 404,
      "command": ":\\ProgramData\\chocolatey\\bin\\choco.exe install WindowsAzureLibsForNet --fail-on-unfound --yes --no-progress --limit-output --timeout 2700 --version 2.9",
      "stdout": "Installing the following packages:\\nWindowsAzureLibsForNet\r\n [...]",
      "stderr": "",
      "msg": "Error installing package(s) 'WindowsAzureLibsForNet'",
      "stdout_lines": [
        "Installing the following packages:",
        "WindowsAzureLibsForNet v2.9 [Approved]",
        "windowsazurelibsfornet package files install completed. Performing other installation steps.",
        "Attempt to get headers for https://download.microsoft.com/download/B/4/A/B4A8422F-C564-4393-80DA-6865A8C4B32D/MicrosoftAzureLibsForNet-x64.msi failed.",
        " The remote file either doesn't exist, is unauthorized, or is forbidden for url [...]",
        "Error while running 'C:\\ProgramData\\chocolatey\\lib\\WindowsAzureLibsForNet\\Tools\\ChocolateyInstall.ps1'.",
        "Chocolatey installed 0/1 packages. 1 packages failed.",
        " See the log for details (C:\\ProgramData\\chocolatey\\logs\\chocolatey.log)."
      ],
      "stderr_lines": []
    },
    "start": "2021-05-04T11:55:59.944377",
    "end": "2021-05-04T11:56:22.861014",
    "duration": 22.916637,
    "ignore_errors": true,
    "uuid": "39d0dd24-2e12-4a75-a1f3-52b107db3ed3"
  }
}

Listing 3.2: JSON example from Ansible


Result              Description
ok                  Installation completed successfully.
not_found           One or more files referenced in the package could not be downloaded from the Internet.
possibly_broken     The package was marked as possibly broken by Chocolatey due to failed automated tests.
not_in_repo         The package was not found in Chocolatey’s repository. Typos can cause this, but when using an existing list of packages the most likely cause is that the package has been removed from the repository.
checksums_mismatch  The checksum of a retrieved file does not match the one specified in the package.
installer_hangs     The installation process never finished and had to be cancelled. This can occur when an installer requires user input, which causes trouble in an automated install.
other_error         Another type of error occurred that caused the installation to fail.

Table 3.5: Possible installation result codes

This is done by using a script provided by FOI that uses string matching logic based on an input, in this case program name, vendor, and version. The output is a Comma-Separated Values (CSV) list with the name and version of the software, the CPE code, the confidence level of the match between the program and the CPE code, possible exploits for that CPE code, and the confidence level of the match between CPE and exploit. See Table 3.6 for an example of the suggested CPE and its matching exploit for a few applications. However, text matching does not always provide a correct CPE or exploit match, which is essential for a successful exploit, because even if a program is vulnerable, attacking it with the wrong exploit would fail.

Optimizing this algorithm is out of the scope for this thesis, but improvements can be made without changing the algorithm itself by adjusting the amount of data given to it for each package such that it can produce a good match. However, providing too much information could give it an opportunity to match against irrelevant information, so some care has to be taken when deciding what input data to provide. The result of this mapping is also stored in the database of programs.

While this process works quite well when there are exact matches to be found or only a few CPEs that are relevant to the given application, it can often suggest mappings with high confidence even though there are small differences that can be very important when looking for a vulnerable version. In the first row of the example in Table 3.6, the package php-manager 2.3.0 has been mapped to the CPE corresponding to PHP 4.3. This is completely wrong, as php-manager is used to manage several PHP installations and will not be vulnerable to attacks that target PHP itself. The second row is more accurate, where there is only a small difference in the version numbers. The third row is an exact match, and the associated exploit should hopefully work on that software. The last row has the correct application name, but version 7.34 is mapped to a CPE with version 6.34. The string-based matcher suggests this with high confidence since only one character differs, but the difference between version 6.34 and 7.34 is very likely too large to make the exploit targeting 6.34 work on version 7.34.

The CPEs also include a field with the name of the application vendor, which is sometimes missing or the same as the application name, but when present it can be matched against by the tools created by FOI. However, the vendor is usually absent from the name of Chocolatey packages, with the package containing Mozilla Firefox simply being called firefox. This reduces the opportunities that the matcher has to produce accurate results, and thus it would be beneficial to also have the vendor name available for every package in the Chocolatey repository. Thanks to the list of packages retrieved from Chocolatey in Section 3.1.4, this information is now available for every program in the repository.

Software           CPE                             Action
php-manager 2.3.0  cpe:/a:php:php:4.3              exploit/multi/http/activecollab_chat
IcoFx 1.6.4.02     cpe:/a:icofx:icofx:1.6.4        exploit/windows/fileformat/icofx_bof
freeSSHD 1.2.6     cpe:/a:freesshd:freesshd:1.2.6  exploit/windows/ssh/freesshd_authbypass
mirc 7.34          cpe:/a:mirc:mirc:6.34           exploit/windows/misc/mirc_privmsg_server

Table 3.6: Example of Chocolatey packages with CPEs and actions

3.1.8 Online installation tests

To verify which Chocolatey packages work and can install their program correctly, automated tests were conducted. Local virtual machines (outside CRATE so that they have access to the Internet) running Windows 10 were used as a test bench to install the different programs. By using the created tool, a set of untested programs was chosen at random such that only a single version of a program is tested on the same machine. Multiple sets can be created to test multiple versions in parallel, but this requires additional virtual machines, as installing two versions of the same package on a single machine would create conflicts.

Unfortunately, VirtualBox’s API might not be thread-safe, as concurrent runs sometimes failed due to VirtualBox-related errors. This means that even with multiple virtual machines, the installation tests cannot be performed on the same computer without interfering with each other. One solution would be to utilize multiple computers, which for this project is limited to two physical workstations and four physical nodes in CRATE.
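The constraint that a machine only receives one version of each program can be sketched as below. The function name build_test_sets and the representation of packages as (name, version) pairs are assumptions for this example; the selection logic of the developed tool may differ.

```python
import random
from collections import defaultdict


def build_test_sets(untested, rng=random):
    """Split (name, version) pairs into sets with one version per name.

    Each returned set is intended for one virtual machine, so no set
    contains two versions of the same program.
    """
    by_name = defaultdict(list)
    for name, version in untested:
        by_name[name].append(version)
    for versions in by_name.values():
        rng.shuffle(versions)  # pick versions in random order

    test_sets = []
    while any(by_name.values()):
        # Take at most one remaining version of every program
        test_sets.append([(name, versions.pop())
                          for name, versions in by_name.items()
                          if versions])
    return test_sets
```

The number of sets equals the largest number of untested versions of any single program, which also gives the number of virtual machines needed to test all of them in parallel.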

3.1.9 Rate limiting and excessive use

Chocolatey has implemented a rate-limiting system [61] to mitigate misuse of their API: if a certain number of requests are made within a certain period of time, the requesting Internet Protocol (IP) address is temporarily banned for one hour. Not all API calls are treated the same. For example, downloading Chocolatey itself is limited to five occurrences per minute per IP address, and downloads of other packages are limited to 20 attempts per minute per IP address. Individual users will probably never reach these limits, but organizations that are behind a proxy or share an IP address through Network Address Translation (NAT) may very well reach them. Chocolatey also has a different API for fetching information about each package, which the Chocolatey command-line tool uses for listing available packages. This API is not as sensitive to large amounts of traffic as the others, as Chocolatey’s listing command makes several hundred requests over a short period of time. However, while it may not be as sensitive, it is still important not to abuse it.

Chocolatey also has a more severe mechanism to protect themselves and the community from excessive use [62], where the result is a temporary (but not automatically lifted) IP address ban. The exact limit is not publicly available, but they provide a rough guideline of how clients using their API are supposed to behave. Excessive use could for example be to install 100 packages per hour on average over some period of time [62]. As Chocolatey expects that this would happen due to misconfiguration and not malicious intent, they can be contacted to resolve the issue.

A simple solution to avoid putting too much stress on Chocolatey’s services would be to implement a delay between downloading each package, but that would only be effective on a very limited number of machines and does not scale well at all when setting up whole networks and multiple machines. A better solution that is also recommended by Chocolatey would be to instead host an internal repository of packages, which can also cache downloaded packages from the Internet to reduce the load on their servers if a package is downloaded multiple times [62].
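For completeness, the simple delay-based approach could be sketched as a sliding-window throttle that stays below the limit of 20 package downloads per minute stated above. The class name DownloadThrottle and its parameters are assumptions for this example, and the sketch only paces downloads started from a single process, which is exactly why it does not help when many machines share one IP address.

```python
import time
from collections import deque


class DownloadThrottle:
    """Allow at most max_calls calls to wait() per period_s seconds."""

    def __init__(self, max_calls=20, period_s=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self._max_calls = max_calls
        self._period_s = period_s
        self._clock = clock
        self._sleep = sleep
        self._stamps = deque()  # start times of recent downloads

    def wait(self):
        """Block until another download may start, then record it."""
        now = self._clock()
        # Forget downloads that have left the sliding window
        while self._stamps and now - self._stamps[0] >= self._period_s:
            self._stamps.popleft()
        if len(self._stamps) >= self._max_calls:
            # Sleep until the oldest recorded download leaves the window
            self._sleep(self._period_s - (now - self._stamps[0]))
            now = self._clock()
            self._stamps.popleft()
        self._stamps.append(now)
```

A caller would invoke wait() immediately before each package download. The clock and sleep parameters exist only so that the behavior can be tried without real waiting.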

3.1.10 Internal repository

This section relates to RQ1 and will focus on how Chocolatey packages can be stored reliably in an internal repository.

Motivation

As mentioned in Section 3.1.9, Chocolatey recommends that organizations host their own internal repository for multiple reasons. Not only does it reduce the traffic load on their servers, but it also provides stability and reliability since dependencies can be stored on-premises. This also decreases the installation time for larger packages, as fetching files from an internal server is likely to be faster than downloading data directly from the Internet. As mentioned in Section 2.2.1, there are limitations on what Chocolatey can have in their public repositories due to distribution rights. Their automatic packaging service is also not feasible due to their pricing plan, which would cost upwards of 150 000 SEK per year assuming all VMs in CRATE would need to be managed using this system.

In a previous study at FOI, a proof of concept was made regarding how Chocolatey might be used in CRATE. This study concluded that Chocolatey was a good candidate for CRATE with regard to the installation of Windows programs, but that more work was needed. Another outcome from this study was an internal repository based on Sonatype Nexus Repository Manager OSS², which was populated with a few Chocolatey packages. The location of the repository inside CRATE can be seen in the bottom-center of Figure 2.1. For the repository to be useful, however, Chocolatey packages need to be downloaded together with all of their dependencies and then repacked so that packages can locate everything they need from this repository. Simply using a caching repository will not work, since it can only cache package downloads and not the actual installers that are typically retrieved from the Internet when the package is installed.

Internalization

To solve the issue of the packages not being self-contained, a tool was created which automatically downloads and repackages (internalizes) existing packages. First of all, information about the packages was needed. A Python script was created to fetch information about all the available Chocolatey packages, containing name, versions, URLs, and validation info among others. Precautions were taken to ensure that the API was not stressed too much. This information was structured in the eXtensible Markup Language (XML) format, which allowed for easy extraction of the relevant information.

The validation info is the result of Chocolatey’s automatic validation tests, which indicate if packages can be installed or if they are broken. They first test if the package contains the necessary metadata according to Chocolatey’s established specifications [63] and then if the installation and uninstallation can be done in silent mode without user interaction [64]. These tests are run periodically on both new and already existing packages. However, some old versions might still be broken even if the validation info states otherwise, so the validation and verification information cannot be blindly trusted. With this information, though, the known broken packages can be removed. The XML file is also used to get information about the vendor for the programs.

² https://www.sonatype.com/nexus/repository-oss


This information improves the mapping between the program and the CPE codes, which is explained in Section 3.1.7.

The second part is to download the packages and their dependencies and repack them before uploading them to the repository. This is done by using a Python script that requests the package using its name and version and downloads the content to a temporary directory. The downloaded package is a NuGet package, which is then opened and modified. The files that need to be modified are the specification file, which states what dependencies the package has, and the PowerShell scripts, which contain the URLs that are relevant to the package installation. This part is done recursively so that all dependencies are downloaded and rewritten as well. Each file in the package is searched for URLs through the use of a regular expression (regex). Since these URLs can appear inside comments or refer to documentation instead of a binary file, the matched URLs are filtered such that those that contain one of the extensions in Listing 3.3 are downloaded.

[.exe,.msi,.zip,.7z,..gz,.key,.pgp,.gpg,.asc,.sig, .xpi,.sha256,.msu,.iso,.rar,.plgx,.jar,.msp]

Listing 3.3: List of file extensions that are downloaded by the internalizer
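The URL search and extension filter can be illustrated with the sketch below. The regular expression and the function name find_download_urls are assumptions for this example, as the expression used by the internalizer is not reproduced here, and only a few of the extensions from Listing 3.3 are included. The sketch also simplifies the filter by requiring that the path of the URL ends with the extension.

```python
import re
from urllib.parse import urlparse

# A few of the extensions from Listing 3.3
DOWNLOAD_EXTENSIONS = (".exe", ".msi", ".zip", ".7z", ".msu", ".iso")

# Assumed pattern: a URL runs until whitespace, a quote, or a bracket
URL_RE = re.compile(r"https?://[^\s'\"<>)]+")


def find_download_urls(script_text):
    """Return URLs in script_text whose path ends in a known extension."""
    urls = URL_RE.findall(script_text)
    return [url for url in urls
            if urlparse(url).path.lower().endswith(DOWNLOAD_EXTENSIONS)]


EXAMPLE_SCRIPT = """
# See https://example.com/docs/install.html for details
$url = 'https://example.com/files/tool-1.0.msi'
$url64 = "https://example.com/files/tool-1.0-x64.exe"
"""

found = find_download_urls(EXAMPLE_SCRIPT)
```

In this made-up script, the documentation link in the comment is matched by the regular expression but discarded by the filter, while the two installer URLs are kept.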

This is not an exhaustive list, as it is difficult to account for every type of file that an application could need to download. The above list was built iteratively by attempting to internalize a set of packages, identifying which files failed to be downloaded, adding their extensions to this list, and repeating. As a measure to account for unknown file types, URLs that appear as part of an assignment to a variable whose name contains the substring "url" are downloaded even if they do not contain any of the extensions listed above. This is because Chocolatey packages often seem to contain a variable named url, url64, urlbase or similar that is downloaded as part of the installation process. As some vendors use URLs such as https://example.com/download?version=latest that lack any extension that can indicate which type of file will be downloaded, this can help to successfully internalize more packages while avoiding URLs that point to documentation or other info that is irrelevant for the automated installation process. Each downloaded file is then uploaded to the internal repository, and the URL pointing to it is rewritten such that it points to the internal repository instead of the Internet.

To improve the ratio of packages that can be successfully internalized, the script can also be run in a semi-manual mode where the user is given the option to ignore missing files (for when a URL was incorrectly identified or not needed for a successful installation) and is prompted for a replacement value for variables that appear inside URLs. As some package maintainers support multiple languages or simply want to reduce the effort of updating a package whenever a new version of the program is released by the vendor, URLs can contain variables that are evaluated and replaced at installation time. This poses a problem for the internalizer, as the URL is not valid when extracted from the file. For simpler types of variable references where the variable can be replaced by a simple search-and-replace operation, the user-provided value can be used to remove the variable reference and replace it with a static value. Some packages use variables for things such as language/locale or the current package version, which can be identified by the user of the script and entered manually. This can limit some functionality such that only a single language can be installed, but this is better than the package failing completely.

An assisted manual mode is also implemented where the user is offered to manually edit each script inside the package, which can be useful for more complicated variable replacements. The script then takes care of retrieving the URLs, rewriting them, and pushing the modified package to the internal repository. This saves time over completely manual packaging and offers a lot of flexibility for packages that are deemed as must-haves.

When the package has been modified, it is uploaded to the internal repository using an API. See Figure 3.2 for an example of how the repository manager interface looks after a few packages have been uploaded. To use the internal repository instead of the public one, one simply needs to change the source for Chocolatey, which can be done using Ansible.

Figure 3.2: Internal repository using Sonatype Nexus 3
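The handling of URLs assigned to variables whose name contains "url", and the rewriting of those URLs to point at the internal repository, can be sketched as follows. The base URL, the regular expression, and the function name rewrite_url_assignments are hypothetical and chosen for illustration only. The sketch handles simple quoted assignments in PowerShell scripts and ignores file name collisions.

```python
import re

# Hypothetical location of uploaded installers in the internal repository
INTERNAL_BASE = "http://nexus.example.internal/repository/installers/"

# Assumed pattern: $<name containing "url"> = '<http(s) URL>'
ASSIGN_RE = re.compile(
    r"\$(\w*url\w*)\s*=\s*['\"](https?://[^'\"]+)['\"]", re.IGNORECASE)


def rewrite_url_assignments(script_text, base=INTERNAL_BASE):
    """Point URL variable assignments at the internal repository.

    Returns the rewritten script and a mapping from each original URL
    to its internal replacement, so that the files can be mirrored.
    """
    mapping = {}

    def replace(match):
        original = match.group(2)
        # Use the last path segment, without any query string, as file name
        last_segment = original.rstrip("/").rsplit("/", 1)[-1]
        filename = last_segment.split("?")[0] or "download"
        mapping[original] = base + filename
        return match.group(0).replace(original, mapping[original])

    return ASSIGN_RE.sub(replace, script_text), mapping


SCRIPT = ("$url64 = 'https://example.com/download?version=latest'\n"
          "# docs: https://example.com/help\n")
rewritten, url_map = rewrite_url_assignments(SCRIPT)
```

The documentation link in the comment is left untouched since it is not part of a variable assignment, whereas the extension-less download URL is picked up and rewritten.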

3.1.11 Offline installation tests

After having internalized (see Section 3.1.10) packages that could successfully be installed with access to the Internet, these were also tested offline by setting up Chocolatey to pull packages from the internal repository instead of the public community repository. This serves to validate if the packages still work correctly after having been repackaged or if there are Internet-dependent parts remaining that the internalizer could not rewrite. These offline tests are conducted within CRATE, which means no access to the Internet and limited ways to access the VM itself. Traffic to the internal repository is routed through a proxy server which allows access to the repository from inside the isolated event plane (seen at the bottom of Figure 2.1).

To minimize the network traffic and logs produced on machines inside CRATE, the preferred way to control the VMs is through the VirtualBox API³. This API also has a community-made Python binding⁴ that FOI has used to create a plugin for Ansible that allows it to communicate over this API instead of over the network via WinRM. With only a little modification, this plugin could be used to execute playbooks from the physical node with instructions of which programs to install, and Chocolatey would install them inside the VM.

³ https://www.virtualbox.org/sdkref/index.html
⁴ https://pypi.org/project/virtualbox/


3.2 Selecting exploits for evaluation

This section, and the following ones, will present the method used to answer RQ3.

3.2.1 Version difference

To filter out some of the exploit suggestions from Section 3.1.7 that are less likely to be exploitable, each program’s version and its matched CPE version were compared using a distance metric more suited for version numbers. The method consisted of comparing each part of a dotted version specifier between each pair of supplied versions. The absolute difference between the integers making up these parts was multiplied by a weighting factor (starting at 10) that was divided by 10 for each part from left to right in the version number. These differences were then summed up for the entire version number and returned as the distance metric between each pair.

The comparison is stopped when one of the versions has no more parts left to compare. Continuing with the assumption that missing fields should be filled with zeroes could have been better in some cases, but Chocolatey packages sometimes have the date when the package was created appended to the version number. This would have caused large differences between for example 1.2.3.20210303 and 1.2.3 (a difference of 202103.03 in this case) even though the actual program versions are very similar or exactly the same. See Table 3.7 for an example of the distance metric being calculated for a few version pairs. Listing 3.4 also contains pseudocode of the metric calculations.

Program version  CPE version  Distance metric
6.6.0            6.2.0        4.0
2.4.4            2.0.1        4.3
2.2.4            2.2.8        0.4
1.3.1035         2.1          12.0

Table 3.7: Example of version differences for pairs of program versions and CPE versions

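import re

def strip_nondigit_chars(text):
    # Assumed helper (not shown in the original listing): keep digits only
    return re.sub(r"\D", "", text)

def version_similarity(ver1, ver2):
    diff = 0
    weight = 10

    # Pair up major versions, then minor, then patch level.
    # zip() stops as soon as the shorter version has no parts left.
    for part1, part2 in zip(ver1.split('.'), ver2.split('.')):
        ver1_part = strip_nondigit_chars(part1)
        ver1_part = int(ver1_part) if ver1_part else 0

        ver2_part = strip_nondigit_chars(part2)
        ver2_part = int(ver2_part) if ver2_part else 0

        diff = diff + abs(ver1_part - ver2_part) * weight
        weight = weight / 10

    return diff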

Listing 3.4: Pseudocode for the version difference metric
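Written as a formula, the metric described above can be expressed as follows, where the symbols are introduced here for illustration: a_i and b_i denote the integer value of the i-th dotted part of the two versions, and m and n denote their number of parts. The second line works through the second row of Table 3.7 (2.4.4 against 2.0.1).

```latex
\[
  d(a, b) = \sum_{i=1}^{\min(m,\,n)} 10^{\,2-i}\,\lvert a_i - b_i \rvert
\]
\[
  d(2.4.4,\; 2.0.1) = 10\cdot\lvert 2-2\rvert + 1\cdot\lvert 4-0\rvert
                      + 0.1\cdot\lvert 4-1\rvert = 4.3
\]
```

Since the first part carries a weight of 10, any difference in the major version contributes at least 10 to the distance, regardless of the remaining parts.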


3.2.2 Name and vendor difference

Another filtering step that was applied to the list of packages consisted of utilizing the structured information about each Chocolatey package to validate the given CPE suggestions. As the metadata contains separate fields for e.g. application name and vendor, a distance metric can be calculated between name ↔ name and vendor ↔ vendor for a given application and its suggested CPE. The provided mapping script is designed to accept unstructured input and will attempt to match any part of a candidate CPE with any part of the input data for a given application. Filtering based on the similarity between fields of the same type will reduce the number of erroneous suggestions while avoiding the need to reimplement the exploit-suggesting part of the mapping script.

The Python package fuzzywuzzy⁵ is used to calculate the distance metric between the package metadata and the suggested CPE. This package calculates the distance with a method similar to the one used by FOI’s existing script for mapping free-form text to CPEs, but also includes preprocessing steps that can ignore word order or allow partial matches between text strings [65]. This can be useful for packages where the Chocolatey package name might deviate from the one specified in the CPE, such as VLC against VLC Media Player, but can also make Git for Windows accidentally match versions of the Windows operating system itself. Having the ability to separate the vendor field from the name field helps in this regard, since it is unlikely that the Git for Windows vendor The Git Development Community will match the Windows vendor Microsoft. Listing 3.5 contains a pseudocode description of the metric calculations.

from fuzzywuzzy import fuzz

def split_cpe(cpe):
    # Assumed helper (not shown in the original listing): split a CPE URI
    # such as cpe:/a:vendor:product:version into vendor, product and
    # version, using an empty string for missing trailing fields
    fields = cpe.split(":")[2:5]
    fields += [""] * (3 - len(fields))
    return fields

def get_similarity(vendor, name, version, cpe):
    cpe_vendor, cpe_name, cpe_version = split_cpe(cpe)
    # CPE fields use underscores where the package metadata uses spaces
    cpe_vendor = cpe_vendor.replace("_", " ")
    cpe_name = cpe_name.replace("_", " ")
    # Invert the 0-100 match ratio so that 0 means a perfect match
    vendor_ratio = 100 - fuzz.token_set_ratio(vendor, cpe_vendor)
    name_ratio = 100 - fuzz.token_set_ratio(name, cpe_name)
    # version_similarity is defined in Listing 3.4
    ver_similarity = version_similarity(version, cpe_version)
    return (vendor_ratio, name_ratio, ver_similarity)

Listing 3.5: Pseudocode for the name and vendor difference calculation
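The benefit of comparing fields of the same type can be illustrated without any third-party dependency. The sketch below swaps fuzzywuzzy for difflib from the Python standard library, which computes a plain similarity ratio without the token-set preprocessing described above, so its numbers are not guaranteed to agree with those produced by fuzzywuzzy. The function names are assumptions for this example.

```python
from difflib import SequenceMatcher


def field_difference(a: str, b: str) -> int:
    """Return 0 for identical strings, up to 100 for unrelated ones."""
    ratio = SequenceMatcher(None, a.lower(), b.lower()).ratio()
    return round(100 * (1 - ratio))


def name_and_vendor_difference(vendor, name, cpe_vendor, cpe_name):
    """Compare vendor against vendor and name against name."""
    return (field_difference(vendor, cpe_vendor),
            field_difference(name, cpe_name))


# Git for Windows against a CPE for the Windows operating system
vendor_diff, name_diff = name_and_vendor_difference(
    "The Git Development Community", "Git for Windows",
    "Microsoft", "Windows")
```

Whatever the name comparison yields for this pair, the vendor difference is far above a threshold such as 15, since the two vendor strings have very little in common. This is what allows the suggestion to be rejected.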

An example of all three metrics (vendor difference, name difference, and version difference) can be seen in Table 3.8. Note that the match ratio for vendors and names produced by fuzzywuzzy has been inverted to match the behavior of the version difference field, where a lower value indicates a closer match. Thus, the vendor difference and the name difference fields range from 0 to 100 with 0 being the best possible match, and the version difference field ranges from 0 to ∞.

An example of the same mappings being compared with partial matches allowed is shown in Table 3.9. Python and Adobe Flash Player get lower differences since they partially match the suggested CPE, but other applications with less overlap can remain at the same difference. The application dotnet-windowshosting also becomes a much better match with the operating system Windows NT since "Windows" is present in full in the application name, which is less desirable. For the exploit selection process, non-partial metrics were used since this would reduce the number of false positives where an application is matched against an incorrect CPE.

5https://pypi.org/project/fuzzywuzzy/


Vendor | Application | Version | CPE | Vendor diff | Application diff | Version diff
--- | --- | --- | --- | --- | --- | ---
python | python2 | 2.7.17 | cpe:/a:python:python:2.6.7 | 0 | 8 | 2.0
imageglass | imageglass | 4.0.4.15 | cpe:/a:php:php:5.4.0 | 100 | 100 | 14.4
adobe | flashplayerppapi | 27.0.0.183 | cpe:/a:adobe:flash_player:7 | 0 | 21 | 200
dot | dotnet-windowshosting | 5.0.3 | cpe:/o:microsoft:windows:nt | 67 | 50 | 50

Table 3.8: Example of name and vendor differences

Vendor | Application | Version | CPE | Vendor diff | Application diff | Version diff
--- | --- | --- | --- | --- | --- | ---
python | python2 | 2.7.17 | cpe:/a:python:python:2.6.7 | 0 | 0 | 2.0
imageglass | imageglass | 4.0.4.15 | cpe:/a:php:php:5.4.0 | 100 | 100 | 14.4
adobe | flashplayerppapi | 27.0.0.183 | cpe:/a:adobe:flash_player:7 | 0 | 8 | 200
dot | dotnet-windowshosting | 5.0.3 | cpe:/o:microsoft:windows:nt | 33 | 0 | 50

Table 3.9: Example of name and vendor differences with partial matches allowed

3.2.3 Exploit selection criteria

Out of the applications which could successfully be installed (see Section 3.1.8) and packaged (see Sections 3.1.10 and 3.1.11), those with an application name difference and vendor difference of 15 or below and a version difference of five or below were selected as candidates for exploitation. These values were selected to limit the total number of attacks to a manageable level where some of the most irrelevant attacks would be skipped. As Metasploit contains around 2 000 exploits, and these only work against a subset of versions for the software they target, testing more software and version combinations than this would also lead to many attacks failing since all available exploits will have been tested already. Thus, if the best-matched exploits are tested first, there will be limited benefit in continuing to test exploits past this point.

Allowing a small difference for the name and vendor compensates for minor differences in naming convention, such as python2 versus python in Table 3.8. A maximum version difference of five also allows slightly mismatched versions to be tested in case the vulnerability is present across several versions. Thus, minor differences are still accepted, but those where the major version differs (1.x versus 2.x) are rejected. This also helps to catch errors in the name and vendor matching process where an application and its CPE candidate use different version schemes and thus have a high version difference. For example, the version control system Git might be mapped to a CPE referring to the web service GitLab since they have similar names, but as the latest version of the Git package is 2.31.1 and GitLab is on 13.11.3, the version difference will be 130.2, which will be rejected.
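The thresholds above can be sketched as a simple filter. This is a minimal illustration, not the script used in the thesis: the Candidate class and the is_selected function are hypothetical names, and the sample rows reuse the values from Table 3.8.

```python
from dataclasses import dataclass

# Thresholds stated in the selection criteria
MAX_NAME_DIFF = 15
MAX_VENDOR_DIFF = 15
MAX_VERSION_DIFF = 5


@dataclass
class Candidate:
    package: str
    cpe: str
    vendor_diff: float
    name_diff: float
    version_diff: float


def is_selected(c: Candidate) -> bool:
    """Keep a candidate only if all three differences are within bounds."""
    return (c.name_diff <= MAX_NAME_DIFF
            and c.vendor_diff <= MAX_VENDOR_DIFF
            and c.version_diff <= MAX_VERSION_DIFF)


# Rows taken from Table 3.8 (vendor diff, application diff, version diff)
candidates = [
    Candidate("python2", "cpe:/a:python:python:2.6.7", 0, 8, 2.0),
    Candidate("imageglass", "cpe:/a:php:php:5.4.0", 100, 100, 14.4),
    Candidate("flashplayerppapi", "cpe:/a:adobe:flash_player:7", 0, 21, 200),
    Candidate("dotnet-windowshosting", "cpe:/o:microsoft:windows:nt", 67, 50, 50),
]
selected = [c.package for c in candidates if is_selected(c)]
```

Of the four example rows, only python2 passes all three thresholds; flashplayerppapi is rejected on both the name difference (21) and the version difference (200).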

3.3 Automatic testing of exploits

As mentioned in Section 1.2, existing tools such as SVED (see Section 2.7) are used in conjunction with the ability to install vulnerable software to perform an evaluation of the success rate of publicly available exploits. To be able to test a large number of exploits against many pieces of software, the exploitation process itself must also be automated. This is achieved through the steps described in this section.


3.3.1 Preparing VMs

A set of VMs needs to be prepared for placement in CRATE to allow for attacks to be launched using SVED. These should have as many vulnerable programs as possible installed to allow as many exploits as possible to have a chance at succeeding. Since the automated software installation tool from Section 3.1 can install applications directly in CRATE without needing Internet access, it can be used on existing bare-bones OS images to avoid having to create new images with the applications baked in.

As for which applications to install, the program-to-CPE-to-exploit mapping mentioned in Section 3.1.7 will be used to install applications that have a good match with an exploit that is available in Metasploit. A number of exploits that were matched against these applications are not likely to succeed, however, due to missing prerequisites or the fact that some exploits depend on several applications. For example, an exploit against the QuickTime media player also requires the Safari browser to be installed for the exploit to be delivered to the target machine. These exploits are sometimes available in a standalone version as well, where a victim is expected to open a file that was sent to them.

The suggested exploits were filtered based on the reported accuracy of the suggestions, the metrics described in Sections 3.2.2 and 3.2.3, and the OS family of the exploit to remove exploits that would not work on Windows. Many of the remaining exploits might still not be applicable to the installed applications, but since testing can run overnight, this does not cost much additional time.

To speed up the process of launching attacks, multiple SVED injectors will be used to launch several attacks simultaneously. This requires multiple instances of the target system as well, as running several exploits concurrently against the same machine can cause conflicts and lead to instability if services crash as a result of a failed or unreliable exploit.

3.3.2 Creating an attack sequence

Each exploit attempt consists of several parts, which are described in this section. See Section 2.7 and Figure 2.2 for a description of SVED and an overview of the actions described below.

Firstly, the targeted VM is restored to a snapshot of a known-good state where all relevant applications are installed, but before any attacks have been attempted. This is needed since some exploits, particularly those that target services instead of client programs, can affect the targeted program such that it becomes unstable or crashes after the exploit attempt [66]. Exploits can also leave files or other traces that might interfere with further attempts. Using snapshots gives each exploit the best possible chance to succeed and ensures that the exploits that are tested later in the sequence are not at a disadvantage (due to a more unstable system) compared to those that appear earlier.

Secondly, the SVED injector that will deliver the exploit is switched into the same Virtual eXtensible LAN (VXLAN) as the target system, which makes the injector appear on the same network segment as the target so that it can communicate with it.

Thirdly, the injector attempts to verify that the target is online and responding by sending an Internet Control Message Protocol (ICMP) echo request (ping) to it and checking for a response.

Finally, the exploit is launched. The result of the exploit (whether or not any form of command-line session was created on the target system) and logs from the exploitation process are gathered by SVED and stored for later analysis.

Sequences of actions such as copying files to the target machine and opening these with specific applications are already present in SVED for the purpose of triggering a client-side exploit that requires some user interaction. Additional triggers were created for exploits that appeared several times against different software versions or with many possible settings for the exploit target (such as which OS or application version is targeted; SVED can try all targets automatically).
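The four steps of one exploit attempt can be sketched as follows. This is only an outline of the control flow described above, not SVED's interface: run_attack and its four callable parameters are hypothetical stand-ins for the snapshot restore, the VXLAN switch, the ICMP check, and the exploit launch.

```python
from typing import Callable, Dict


def run_attack(
    restore_snapshot: Callable[[], None],
    join_target_network: Callable[[], None],
    target_responds: Callable[[], bool],
    launch_exploit: Callable[[], bool],
) -> Dict[str, bool]:
    """Run one exploit attempt and report how far it got."""
    restore_snapshot()            # 1. known-good state, no earlier attack traces
    join_target_network()         # 2. put the injector on the target's segment
    if not target_responds():     # 3. ICMP echo check before attacking
        return {"launched": False, "session": False}
    session = launch_exploit()    # 4. True if a command-line session was created
    return {"launched": True, "session": session}
```

Passing the steps in as callables keeps the ordering logic separate from the tooling, so the same sequence can be exercised with stub functions before it is pointed at real machines.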


To analyze these logs, the tool SVIZ (described in Section 2.8) was used to visualize the attacks and make it easier to pick out the relevant information. As the test was launched with four machines running in parallel, the volume of logs was large.

3.4 Manual testing of exploits

After having automatically tested the exploits as described in Section 3.3, the exploits were manually examined and compared with the list of installed applications to determine if they would have any probability of succeeding. Since the mapping process is based on string matching, a high confidence score is not a guarantee that the listed exploit is suitable for the given program and version. This examination was performed by reading through the description of each suggested exploit and checking if the correct program is targeted, if the program version matches the ones the exploit supports, and if there are any specific preconditions that are not fulfilled by default (e.g. an additional module needs to be present, or a user account or a database table must be created).

The exploits that were identified as being good matches for their respective programs were manually tested to determine if these could be executed successfully outside SVED. The purpose of this was to validate that SVED could exploit the identified vulnerabilities correctly, and to see if any of these would behave differently when run several times. As exploits are sometimes not deterministic, this manual testing attempts to determine whether or not some exploits might have failed due to the inherent randomness of where objects are loaded into memory or other parameters that cannot be known in advance. This manual testing consists of launching exploits outside of SVED and, if this succeeds, creating an exploit sequence in SVED where a single exploit is run multiple times against a target. Some exploits might also rely on additional interaction outside of what is currently available in SVED, such as dismissing popups asking if the user wants to change their default web browser, accept the End User License Agreement (EULA), or enable automatic updates.
Another manual testing method consisted of using a virtual machine inside CRATE running the Metasploit framework (see Section 2.6), which allowed for manually preparing and executing attacks against the victim. The suggested exploit in question could then be executed with more control and with the ability to further investigate the potential configuration options of an exploit. This method also gives the ability to make the necessary adjustments to fine-tune the attack. The main drawback of this method is the time that needs to be spent on a single attack, which makes it infeasible to perform on a larger scale in this thesis. It is however powerful for small sample tests.

First, the virtual machines were prepared with the necessary programs, using the installation scripts from this project. The firewall and antivirus were disabled, as most payloads from Metasploit are detected rather quickly by default and it is out of scope to attempt to avoid detection. A snapshot was then taken to quickly be able to revert to the starting point. Then, the chosen exploit was loaded into Metasploit with default settings except for the IP address of the target system. The outcome of the attack determined the next action: if the attack succeeded, the test was over. If the attack failed, further investigation was made to figure out what went wrong, which could happen for various reasons. The protocol analysis tool Wireshark6 was used when investigating, specifically since it can give an insight into the network traffic between the attacker and the target. This can be used to determine if an attack caused any error messages to be returned to the attacking host, or if there was no program listening on a certain port. If the cause of failure was considered possible to work around, further adjustments were made in an effort to make the exploit succeed.
One example of such an adjustment could be to change the payload used in Metasploit, as these can vary in regards to stability and could target different platforms or require a certain script language to be installed. Another example is to look at the description of the exploit for possible requirements that need to be fulfilled before the exploit could work, such as creating a directory in a certain location, changing a configuration file, or starting a specific process. Because the description is in plain text, it is hard to automatically capture the requirements and determine how to set the program up correctly.

6https://www.wireshark.org/

3.5 Vulnerable state

As mentioned in Section 2.9, Chocolatey packages are typically only concerned with installing applications and not the initial configuration of them. This can lead to exploits against vulnerable applications failing due to a database service not being started or being configured to only accept local connections. This cannot be considered a failure on the exploit's part since a real-world deployment of a database server would involve having the service actually be running and likely accepting remote connections from another server as well. To determine whether a failed exploit is the result of an unreliable/broken exploit or simply because the software was not in the correct state, the exploits tested in Section 3.4 are also examined for configuration options and other parameters which prevent the application from being in a vulnerable state. The purpose of this is to identify if an exploit requires a specific configuration that might only be in use for a few use cases, or if the application was not configured at all after having been installed through a Chocolatey package. Simpler configuration actions such as starting a service or adding some dummy data to a database could then be automated as part of the automated software installation tool described in Section 3.1.

4 Results

In this chapter, the results are presented. They consist of two main sections, the automation of software installations and the testing of exploits.

4.1 Automating software installation

The automation process makes use of Ansible and Chocolatey together with custom Python scripts. As described in Sections 2.2.1 and 2.3.1, Ansible is used to automate the process and Chocolatey is the package manager used. The tool can operate in two different modes, one using Chocolatey’s public repository (which requires an Internet connection) and one which utilizes the internal repository. An example of the output displayed to the user can be seen in Figure 4.1.

Figure 4.1: Output from the software installation process

4.1.1 With online access

Installing software using Ansible and Chocolatey with online access worked well. By using Python scripts, a list with the wanted software was converted into an Ansible playbook. For Chocolatey, the software's name and version acted as a unique identifier to locate it in the repository, which is the only information required for installing a piece of software. The same is true for the internal repository as well.

Many packages, especially older packages, have been broken in one way or another. The most common issues were HTTP 404 errors where the files for an application are missing. The most likely cause of this is that a software vendor has decided to remove older versions of their software, or that the files have been moved to another server/location and the Chocolatey package has not been updated with this change. Another common error was that dependencies for a package failed to install, due to either having been superseded by another package or due to the aforementioned issues.

With the online repository, 811 out of the 1 354 tested packages (59.9%) that were deemed to be possibly vulnerable by the process described in Section 3.1.7 could be installed successfully. Table 4.1 and Figure 4.2 present a breakdown of the installation results using the categories defined in Table 3.5.

Result | % of packages
--- | ---
ok | 59.9
other_error | 22.1
not_found | 8.4
checksums_mismatch | 5.5
possibly_broken | 3.3
installer_hangs | 0.7
not_in_repo | 0.1

Table 4.1: Installation result statistics

Figure 4.2: Installation result statistics

4.1.2 Evaluating the reliability of Chocolatey's online repository

During the development of the automated installation tool, many packages were installed as part of the testing process. The results of these installations were saved into a database (see Section 3.1.6) to keep track of what packages had already been tested. The additional metadata downloaded for each package (mentioned in Section 3.1.4) included a field for the date on which the package was created, which can be combined with the installation results to compare these for packages of different ages. In Figure 4.3 the relative frequency of each installation status is shown over a time period of six years.

The graphs in Figure 4.3 show that the reliability of packages from Chocolatey's online repository is only moderate at best. Even packages that have been created within the last four months can only be installed successfully around 65% of the time. This figure varies a lot between months, but there is a downwards trend when looking back at previous years in Figure 4.3b. This shows that packages break over time, which can cause trouble when older versions of a program are required for security research. The rate of not_found errors, indicating that a vendor has removed one or more application files from their servers, increases especially for older packages. These errors are encountered for less than 5% of packages newer than four months, while the rate steadily increases up to 20% for packages around 5-6 years old.

(a) Grouped per month

(b) Grouped per year

Figure 4.3: Installation results over time
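The per-year grouping behind Figure 4.3b can be sketched as below. This is a minimal illustration under assumptions: the function name status_share_by_year and the record layout (creation year, status) are hypothetical, and no data from the thesis database is reproduced here.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def status_share_by_year(
    records: Iterable[Tuple[int, str]],
) -> Dict[int, Dict[str, float]]:
    """Map creation year -> {installation status: share of that year's packages}."""
    counts: Dict[int, Counter] = defaultdict(Counter)
    for year, status in records:
        counts[year][status] += 1
    shares: Dict[int, Dict[str, float]] = {}
    for year, counter in counts.items():
        total = sum(counter.values())
        shares[year] = {status: n / total for status, n in counter.items()}
    return shares
```

Normalizing within each year rather than over the whole data set is what makes years with very different package counts comparable in the same plot.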

4.1.3 Internal repository

The existing internal repository was populated with Chocolatey packages from Chocolatey's public repository by using custom Python scripts as mentioned in Section 3.1.10. The scripts downloaded and modified packages that were then uploaded to the internal repository. By changing the repository address that Chocolatey uses to search for packages, the internal repository could be used completely without access to the Internet. Because of this, its contents will also remain more reliable over time than those of the public repository.

However, some packages were more difficult to repackage than others. For example, some packages only contained a downloader, which can selectively download only the desired components of a software suite or present a nice interface to the user while downloading, instead of the full program. These cannot be repackaged automatically as the files that will be downloaded cannot be determined statically without executing the installer. Other packages had URLs that were calculated at runtime and thus could not be statically rewritten. Some had even obfuscated their URLs. While some of these could be manually rewritten, it would be a time-consuming process if a large number of packages need to be modified and thus not a viable choice unless there is a specific interest in a certain package.

About 85% of packages that can successfully be installed with an Internet connection also work when repackaged using the semi-automated process described in Section 3.1.10 (i.e. with a user answering questions about variable replacements and missing files, but not performing any manual editing of files). Since the internalization process takes additional time on top of the time it takes to install the resulting package, only a subset of the working packages were internalized and tested. In total, these currently amount to 378 packages that can successfully be installed inside CRATE. These will also remain installable in the future since the internal repository is located on-premises. On average, it takes around 24 seconds to internalize one package, which is dominated by a 5-second delay that is introduced after each HTTP request to reduce the load on the servers hosting the Chocolatey packages and the files referenced therein.
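The fixed delay after each HTTP request can be sketched as a small wrapper around whatever download function is in use. This is an assumption-laden illustration: fetch_politely is a hypothetical name, the fetch callable stands in for the real download code, and only the 5-second figure comes from the text.

```python
import time
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def fetch_politely(
    urls: Iterable[str],
    fetch: Callable[[str], T],
    delay_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[T]:
    """Fetch each URL in turn, pausing after every request to limit server load."""
    results: List[T] = []
    for url in urls:
        results.append(fetch(url))
        sleep(delay_seconds)  # fixed pause, regardless of how long the fetch took
    return results
```

Because the pause is applied per request, a package whose install script references several files pays the delay several times, which is consistent with the delay dominating the average internalization time.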

4.2 Selecting exploits for evaluation

After having retrieved a list of all Chocolatey packages and their versions as described in Section 3.1.4, 121 379 unique entries were passed to the CPE-to-exploit mapping script provided by FOI (see Section 3.1.7 for more details). This script was set up to only return matches with a confidence score of two or higher on a scale from 0 to 4. After processing these entries, 2 053 546 CPE candidates and possible exploits were returned. These were filtered based on the similarity between name, vendor, and version as described in Section 3.2.3, resulting in 4 641 entries remaining. These were distributed over four virtual machines in preparation for the automated testing. Due to issues where the VirtualBox API might not be thread-safe, only one VM was run on each physical machine, limiting the number of virtual machines that could be used for testing. Since multiple versions of the same application cannot be installed at the same time on a single machine, this also limits the number of versions that can be tested for a single package to four, one per VM. With these limitations, only 98 packages could be installed on the VMs for a total of 436 attacks using 137 unique exploits.
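The constraint of one version per package per VM can be sketched as a simple assignment. This is a hypothetical illustration, not the distribution logic used in the thesis: assign_to_vms is an assumed name, and the input is assumed to be a list of unique (package, version) pairs ordered by preference.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Entry = Tuple[str, str]  # (package name, version)


def assign_to_vms(
    entries: Iterable[Entry], vm_count: int = 4
) -> Tuple[List[List[Entry]], List[Entry]]:
    """Spread entries over VMs so that no VM gets two versions of one package.

    Returns the per-VM install lists and the entries that did not fit.
    """
    vms: List[List[Entry]] = [[] for _ in range(vm_count)]
    placed: Dict[str, int] = defaultdict(int)  # versions placed so far per package
    skipped: List[Entry] = []
    for package, version in entries:
        slot = placed[package]
        if slot < vm_count:
            # The n-th version of a package always goes to the n-th VM.
            vms[slot].append((package, version))
            placed[package] += 1
        else:
            skipped.append((package, version))
    return vms, skipped
```

With four VMs, a fifth version of the same package has nowhere to go and ends up in the skipped list, which mirrors the limit of four versions per package described above.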

4.3 Automatic testing of exploits

The automatic tests that were executed by SVED were run four times, where each run took on average seven hours. These were started in the afternoon and evaluated the next morning. The results from the attacks were visualized using SVIZ (see Section 2.8), which gave an overview of all the attacks with the ability to further investigate the logs. See Figure 4.4 for a partial view of the attack sequence. Many of the attempted attacks fail as indicated by the red triangle icons, but in the upper-right corner and on the left some green triangles, indicating successful attacks, can be seen.

In the first and second runs, all attacks failed. As most of the attacks were Flash-based on a system without Flash, this was expected. These attacks were likely suggested since their descriptions mention the web browser that the exploit was developed for, which causes the exploits to be suggested by the exploit suggestion process (see Section 3.1.7) even if only the web browser is installed and Flash is not. However, some attacks that had the potential to succeed also failed.


Figure 4.4: SVIZ showing a part of the attack sequence

Category | Successful exploits | Failed exploits | Total
--- | --- | --- | ---
fileformat | 4 | 24 | 28
server | 0 | 251 | 251
browser | 0 | 209 | 209

Table 4.2: Summary of exploit results

After investigation, the cause for some of the failures was found to be missing triggers (i.e. the exploit required some additional interaction that was not implemented in SVED), which meant that the exploits were not run on the victim machine. Other errors, such as bugs within the attack generation, also occurred and were later solved. When these problems had been resolved, two new tests were performed. These yielded better results where four of the attacks succeeded. A summary of the number of launched exploits grouped into exploits based on file format parsing vulnerabilities, exploits targeting a server application, and exploits targeting browsers can be seen in Table 4.2. A complete listing of the number of attempts per exploit and the result of these can be found in Appendix A.

4.4 Manual testing of exploits

After the first two automatic tests were completed, it was concluded that manual testing could help to pinpoint the issue of why some seemingly correct attacks were failing. Manual testing was performed as described in Section 3.4 on both local workstations using VirtualBox and the virtual environment in CRATE. Additionally, manual tests were performed against some additional programs that were not identified automatically in Section 3.1.7 but were available from Chocolatey and had matching exploits against them. A list of the programs that failed despite seeming vulnerable, along with the attempted exploit, the installed version, and the targeted version can be seen in Table 4.3.


Program | Exploit | Installed version | Targeted version
--- | --- | --- | ---
VLC | exploit/windows/fileformat/vlc_mkv | 2.2.5 | ≤ 2.2.8
elasticsearch-service | exploit/multi/elasticsearch/search_groovy_script | 1.4.1 | < 1.4.3
foxitreader | exploit/windows/fileformat/foxit_reader_uaf | 9.0.1.1049 | 9.0.1.1049
freeswitch | exploit/multi/misc/freeswitch_event_socket_cmd_exec | 1.8.5 | Any

Table 4.3: Manually tested exploits

The identified reason why each manually tested exploit failed, and the action required to make it succeed, are shown in Table 4.4.

Exploit | Failure reason | Required action
--- | --- | ---
exploit/windows/fileformat/vlc_mkv | Metasploit has no target for this version | Install VLC version 2.2.8
exploit/multi/elasticsearch/search_groovy_script | Required functionality does not trigger without data in database | Add random data to database (can be done remotely by the attacker)
exploit/windows/fileformat/foxit_reader_uaf | Exploit needs to be delivered via a network share | Manually create a network share and place a malicious .exe file there
exploit/multi/misc/freeswitch_event_socket_cmd_exec | Service only listening on localhost | Configured to listen on all network interfaces

Table 4.4: Manually tested exploits, reason for failure and required action

4.5 Vulnerable state

Ansible playbooks were created to put Elasticsearch and FreeSWITCH into a vulnerable state, as these two had exploits that failed due to missing configuration changes on the application side. These playbooks can be specified when installing packages, which means that these applications can be made vulnerable directly after the installation is completed. Listings 4.1 and 4.2 show these two playbooks, where the references to acl.conf.xml and event_socket.conf.xml in Listing 4.2 are local copies of configuration files that have been modified to allow incoming connections from the Local Area Network (LAN).

- name: Reboot
  win_reboot:

- name: Wait for service to start
  win_wait_for:
    port: 9200

- name: Create initial dummy entry
  win_uri:
    url: http://localhost:9200/type/type1/id1
    method: POST
    body: '{ "some": "json"}'
    status_code: 201

Listing 4.1: Playbook to put Elasticsearch into a vulnerable state


- name: Allow incoming event socket connections from anywhere
  win_copy:
    src: '{{item}}'
    dest: 'C:\Program Files\FreeSWITCH\conf\autoload_configs\'
  loop:
    - acl.conf.xml
    - event_socket.conf.xml

- name: Enable the FreeSWITCH service
  win_service:
    name: FreeSWITCH
    start_mode: auto
    state: started

Listing 4.2: Playbook to put FreeSWITCH into a vulnerable state

5 Discussion

In this chapter, the result and method will be discussed along with some interesting observations and challenges.

5.1 Results

5.1.1 Internal repository

The package installation results for packages of different ages shown in Figure 4.3 reinforce the importance of having an internal repository that is not subject to distribution regulations and thus can contain copies of installers and other binary files. But as mentioned in Section 4.1.3, the internal repository was not straightforward to populate. The repackaging of Chocolatey packages led to multiple challenges such as handling different standards and coding style preferences between packages, which made the process hard to automate. While we could successfully automate the process for most of the packages, some of the ones that failed would still need to be handled manually should a need for them arise. This was decided to be out of scope for this project, but could be of interest for specific pieces of software that are deemed to be relevant for use in the cyber range.

One of the most common practices by package maintainers that prevented automatic repackaging was the use of variables in URLs. Listing 5.1 contains an example of this where a variable named locale is embedded within the URL. This variable substitution can occur in several different ways [67], some of which could be handled statically and some which would require more advanced parsing and imitating parts of the PowerShell interpreter. Chocolatey packages, being written by many different authors, seem to use many of these methods depending on the personal preferences of the person creating each package. Thus, the created package rewriter cannot handle some packages because it is unable to identify URLs that contain more complicated variable substitutions. As detailed in Section 4.1.3 however, only 15% of packages fail to be internalized. This shows that the majority of packages either use no variable substitution in their URLs, or a simpler method of substitution such as the one in Listing 5.1 which can often be resolved statically.


$locale = 'en-US'
$locale = GetLocale -localeFile "LanguageChecksums.csv" -product $softwareName
$Url = "https://download.mozilla.org/?product=firefox-88.0.1&os=win&lang=${locale}"

Listing 5.1: URL with embedded variable
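A simple form of static resolution for URLs like the one in Listing 5.1 can be sketched in Python as follows. This is a hedged illustration, not the package rewriter created in this work: the function names and regular expressions are assumptions, and only variables assigned a plain string literal are handled.

```python
import re
from typing import Dict, Optional

# $name = 'literal' or $name = "literal", with no variables inside the value
ASSIGNMENT = re.compile(
    r"""^\s*\$(\w+)\s*=\s*(['"])([^'"$]*)\2\s*$""", re.MULTILINE)
# ${name} or $name inside a string
REFERENCE = re.compile(r"\$\{(\w+)\}|\$(\w+)")


def literal_variables(script: str) -> Dict[str, str]:
    """Collect variables that are assigned a plain string literal.

    Names are lower-cased because PowerShell variable names are
    case-insensitive.
    """
    return {m.group(1).lower(): m.group(3) for m in ASSIGNMENT.finditer(script)}


def resolve_url(url: str, variables: Dict[str, str]) -> Optional[str]:
    """Substitute known variables; return None if any reference is unknown."""
    unresolved = False

    def substitute(match: "re.Match[str]") -> str:
        nonlocal unresolved
        name = (match.group(1) or match.group(2)).lower()
        if name not in variables:
            unresolved = True
            return match.group(0)
        return variables[name]

    resolved = REFERENCE.sub(substitute, url)
    return None if unresolved else resolved
```

Note the limitation of this approach: a variable that is first given a literal default and later reassigned from a command, as $locale is in Listing 5.1, would be resolved to the default, which may or may not be the value the installer uses at runtime. Returning None for unknown references leaves such packages for manual handling instead of guessing.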

5.1.2 First usage

One obstacle that was not addressed in detail during this work was how to handle the first usage of a program. For the tool to be more useful, minimal user interaction is needed so that exploits launched against the programs do not fail due to unexpected dialog boxes and popups. However, many of the programs that have been internalized and can be installed automatically still require user interaction during the first use. These types of interactions vary a lot between programs and are not easily automatable, as they have to be handled individually. The process could be automated for individual programs by using scripting tools like AutoHotKey1 and AutoIt2 to create scripts that click through dialog boxes according to a predefined sequence of inputs. Another way could be to record any changes in the registry and apply these changes after that specific installation, so that the program behaves as if it had been set up using the graphical interface. Both of these solutions require individual processing that consumes time and does not scale well for hundreds of programs. These solutions are similar to the alternatives for automatic installations mentioned in Section 2.9, since both installing a program and manipulating its state use graphical interfaces and involve changes to the registry and the filesystem.

Using Ansible playbooks as mentioned in Section 4.5, some attempts were made to prepare the Portable Document Format (PDF) reader Adobe Acrobat Reader such that it would behave as if a user had previously interacted with it. Acrobat Reader has around 10-15 Metasploit modules targeting it, but most of these target Windows XP and will likely not work on Windows 10. However, it could still be relevant to have it prepared for automated installation if new attacks are discovered or because it has historically been a common target for file format attacks [68]. Upon first startup, Acrobat Reader displays a sequence consisting of an EULA, an offer to make it the default application for PDF files, and a getting started tour. These elements, especially the EULA, can interfere with the exploits if some parts of the PDF rendering process are not started until the dialog box has been dismissed. As an opened PDF file is not displayed until at least the EULA has been accepted, it can be assumed that this might have some impact on an exploit that relies upon the application being in a specific state.

By examining the Windows registry using a tool such as RegistryChangesView3, it was discovered that around 800 entries were created or modified after simply having dismissed all first-run dialog boxes. Manually examining these produced a few candidates that seemed to be responsible for suppressing the dialog boxes on future program launches, but the getting started tour still triggered. This shows that it is possible to emulate an in-use program right after installation, but with so many registry entries being manipulated there is a risk that the application becomes unstable if only a subset of them are set after installation. As they might contain machine-specific entries as well, simply including all of them might also lead to issues when repeating the installation on another machine. The large volume of entries overall also means that it requires a significant amount of manual work to differentiate between machine-specific entries, unimportant entries, and entries that control the first-use behavior.
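The comparison of registry state before and after dismissing the first-run dialogs can be sketched as a dictionary diff. This is a minimal illustration assuming that each snapshot has already been exported to a mapping from entry path to data; the Snapshot type and the diff_snapshots function are hypothetical, and this is not how RegistryChangesView works internally.

```python
from typing import Dict, Tuple

Snapshot = Dict[str, str]  # "key path\\value name" -> data


def diff_snapshots(before: Snapshot, after: Snapshot) -> Tuple[Snapshot, Snapshot]:
    """Return (created, modified) entries between two registry snapshots."""
    created = {k: v for k, v in after.items() if k not in before}
    modified = {k: v for k, v in after.items()
                if k in before and before[k] != v}
    return created, modified
```

The diff only narrows down the candidates; deciding which of the created or modified entries are machine-specific and which actually control the first-use behavior still requires the manual inspection described above.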

1 https://www.autohotkey.com/
2 https://www.autoitscript.com
3 https://www.nirsoft.net/utils/registry_changes_view.html


5.1.3 Exploit testing

With only four successful attacks, the results from the test were somewhat disappointing, although expected. When looking at the suggested exploits for the programs, many of them were badly matched or simply wrong. For example, exploits for Firefox that also rely on Adobe Flash Player will not succeed if Flash is not installed. Some exploits relied on vulnerabilities in web services where the back-end system had vulnerabilities, requiring a server with a combination of specific back-end systems to be set up as well. For example, an exploit targeting PHP might also require a website that makes use of the specific part of PHP which contains a vulnerability. This was not considered to be either within the scope or worth spending time on for this thesis.

These results are similar to those presented by Gustafsson and Almroth [53] where only a handful of exploits succeeded against a large set of virtual machines. This study differs from theirs in that file format- and browser-based exploits are tested in addition to exploits targeting server programs running on the target machine, but reaches a similar conclusion in that very few exploits that are automatically suggested succeed when attempted.

When attempting to locate programs manually, it proved difficult to find programs that both existed in Chocolatey’s repository as a valid package and had a reliable Metasploit module for that version. As seen in Section 4.4, there were only four unique programs that could both be successfully installed and exploited. There are different reasons for this. One is that Chocolatey’s way of distributing packages leads to packages breaking over time, as mentioned in Section 4.1.2, which makes it harder to get the right version needed for an exploit to work. Another reason is that while Metasploit has many exploit modules ready to use, few programs remain after filtering them down to those that work on relatively new versions of Windows (i.e. not targeting Windows Vista or XP) and whose target application is also available as a package from Chocolatey. However, this does not necessarily mean that there are no other vulnerable programs in Chocolatey’s repository. As mentioned in Section 1.5, we limit ourselves in this thesis to only look at Metasploit’s exploits and with Windows 10 as the operating system.

The targeted vulnerability in the VLC media player was present in the version that was installed in the test environment, but the exploit failed due to Metasploit not having a target definition for the installed version (i.e. the exploit was developed for a different version of VLC). As the vulnerability involves memory corruption, an exploit might contain references to the absolute addresses of certain symbols in the program binaries. When code is added or removed between versions, these addresses can change and cause the exploit to fail. Adding a new target definition could be done by comparing the binary files between versions and locating the new offset for each relevant symbol, but this is out of scope for this thesis. By installing VLC 2.2.8 instead (for which the Metasploit module was created), the exploit succeeded.

The exploit targeting Elasticsearch attempted to utilize the built-in search functionality of Elasticsearch to run arbitrary code. This initially failed despite the Elasticsearch service being started (after a reboot). Through some experimentation, it was discovered that adding some arbitrary data to its datastore caused the exploit to work.

For the exploit that relies on a generated PDF file, reading its description made it apparent that the exploit had to be delivered via a Server Message Block (SMB) network share. The generated PDF file could not itself contain any malicious code, but could link to a file on the network share that would be executed whenever a user opened the document.
Using Metasploit’s msfvenom command to create a program that would establish a command-line session back to the attacking machine and placing this program on a network share allowed the exploit to work reliably.

The FreeSWITCH exploit did not work by default after installing the program. This was determined to be caused by the service not being started by default and by it only listening on localhost. This made it inaccessible via the network, but the exploit could still be used to escalate privileges if the attacker had previously established a less privileged session to the target machine. To make it easier to test the exploit, the service was configured such that it could be accessed from any machine on the local network. After this, it could be successfully exploited.
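The failure of the VLC exploit against a version without a target definition can be illustrated with a small sketch. Metasploit modules are written in Ruby and define their own target tables. The Python structure below, the names `TARGETS`, `select_target` and `absolute_address`, and the offset value are placeholders chosen for illustration and do not describe the actual module.

```python
class NoTargetError(LookupError):
    """Raised when no target definition exists for the installed version."""


# Placeholder value for illustration only; this is NOT a real offset.
TARGETS = {
    "2.2.8": {"symbol_offset": 0x1000},
}


def select_target(installed_version, targets=TARGETS):
    """Return the target definition for a version or raise NoTargetError."""
    try:
        return targets[installed_version]
    except KeyError:
        raise NoTargetError(
            f"no target definition for version {installed_version}"
        ) from None


def absolute_address(module_base, target):
    """Compute an absolute address from a module base and a per-version offset."""
    return module_base + target["symbol_offset"]
```

Since the offsets are tied to one build of the binary, a lookup for any other version fails early instead of the exploit jumping to an address that no longer holds the expected code.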

5.2 Method

5.2.1 Alternative methods of performing automated software installations

The method for automated software installation described in this thesis was suitable for this scenario and the requirements for CRATE. However, there are many other ways to automate the setup process of a virtual machine and software installation, some of which are mentioned in Section 2.4. Researching these in more detail could have been useful to see whether they could have been good alternatives to the current deployment process in CRATE, but they would require significant additional work if the current process could no longer be used. This is not within the scope of this thesis but could have been an option to test out on a smaller scale to compare with the current solution.

5.2.2 Less focus on automated testing

As mentioned in Sections 4.3 and 5.1.3, the automated exploit testing process did not initially lead to any successful attacks against the targeted applications. This was due to many of the exploits not being relevant to the set of installed applications. Spending more time on manually testing more promising exploits, or on creating additional playbooks to simulate the first usage of a program as described in Section 5.1.2, could perhaps have been a better area to focus on than trying to solve the issues that arose around the automated testing process. However, being able to perform automatic testing with SVED on the applications installed in the virtual machines also serves to validate that they are in a vulnerable state, which can be useful for ensuring that environments that are created for an exercise are working as they should.

5.2.3 Source criticism

The sources used in this thesis vary, with information regarding the tools used mainly being retrieved from documentation written by the developers or from the source code itself. For software projects, this is often the most up-to-date and accurate source of information about the project’s purpose and functionality. As we had access to FOI’s cyber range CRATE and could ask people that are involved in building and maintaining it, most information regarding CRATE could be acquired directly. As it is used for research, there are several research papers involving CRATE which cover both exploit testing and automated deployment of virtual machines and networks. As a complement, research papers regarding other cyber ranges were also used to give a wider perspective and to find alternative solutions.

5.3 Challenges

5.3.1 Hitting the rate limit during downloading and testing packages

During the project, when we tried to install as many programs as we could, we noticed after a while that all of the programs failed to install and gave the same error message. After some research, it was found to be because of a built-in rate limit from Chocolatey that is explained in Section 3.1.9. Fortunately, it was only a one-hour limit, but to prevent it from happening again, we took some precautions. These included adding a small delay between downloading packages and only having one online installation running per computer. With the internal repository, this is not a problem.
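The precaution of spacing out downloads can be sketched as follows. The function name `install_with_delay`, the `install` callback and the default delay of five seconds are assumptions made for this example. The thesis does not state the delay that was actually used.

```python
import time


def install_with_delay(packages, install, delay_seconds=5.0, sleep=time.sleep):
    """Install packages one at a time, pausing between consecutive downloads.

    `install` is a caller-supplied function that installs one package and
    returns its result; `sleep` is injectable so the pause can be tested.
    """
    results = {}
    for index, package in enumerate(packages):
        if index > 0:
            sleep(delay_seconds)  # no pause is needed before the first package
        results[package] = install(package)
    return results
```

Running the installations sequentially in one loop also reflects the second precaution, namely that only one online installation runs per computer at a time.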

5.3.2 Database

A few months into the project we noticed that something was not correct with the database. There were programs with strange version numbers compared to other versions of the same program, programs with strange descriptions, and program names that did not exist in Chocolatey’s repository. The reason for this was that some values in different columns in the database got shifted by one row, resulting in, for example, program N getting the version number of program N − 1. As it was a subtle difference in a data set of around 15 000 records, the error was easy to miss and it took a while until it was discovered. What caught our attention was that some programs that could be installed successfully by hand ended up as "not_in_repo" in another instance. Fortunately, the raw data retrieved from Chocolatey’s API was unaffected, which meant that a new table could be created and the data from our tests could be recreated with the corrected package data available.
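A consistency check of the following kind could have revealed the shifted values earlier. It is a minimal sketch that assumes both the raw API data and the database rows are available as lists of dictionaries. The field names `name` and `version` and the function name are hypothetical and do not reflect the actual database schema.

```python
def find_shifted_rows(raw_records, table_rows, key="name", fields=("version",)):
    """Compare database rows against the raw records they were derived from.

    Returns a list of (key value, field) pairs for every field that differs,
    or (key value, "missing_in_raw") if the row has no raw counterpart.
    """
    raw_by_key = {record[key]: record for record in raw_records}
    mismatches = []
    for row in table_rows:
        raw = raw_by_key.get(row[key])
        if raw is None:
            mismatches.append((row[key], "missing_in_raw"))
            continue
        for field in fields:
            if row[field] != raw[field]:
                mismatches.append((row[key], field))
    return mismatches


raw = [
    {"name": "a", "version": "1.0"},
    {"name": "b", "version": "2.0"},
    {"name": "c", "version": "3.0"},
]
# The version column is shifted by one row, as described above.
shifted = [
    {"name": "a", "version": "1.0"},
    {"name": "b", "version": "1.0"},
    {"name": "c", "version": "2.0"},
]
print(find_shifted_rows(raw, shifted))
```

A long run of consecutive mismatches in the same field is a strong hint of a row shift rather than of isolated bad records.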

5.3.3 Corrupt output from Ansible

During the development of the tool, a strange problem arose. The logs from Ansible are by default sent to the terminal one task at a time, with the possibility to write verbose logs from Chocolatey directly to the terminal. The verbose logs are necessary to determine what went wrong in case of an error but are not necessary otherwise. However, since the software versions being installed are several years old in some cases, there is a relatively high probability that installations might fail. The logs from Chocolatey and Ansible come from stdout (standard output) and stderr (standard error) and are written directly to the terminal. By using a flag in the Python interface ansible-runner that is used to interact with Ansible, the log can be written to a JSON file as well. There is however a problem with ansible-runner in the version used during this thesis (1.4.7) where the JSON file could be partially corrupted when an output line is over 2000 characters long. A sample of such corrupt JSON code can be seen in Listing 5.2. This caused an error when parsing the log file, as the contents were no longer valid JSON data. The problem was not unique to this project, as it has been solved in a developer branch of ansible-runner [69]. The changes have not yet been applied to a release version of Ansible, but hopefully this will happen soon. In the meantime, a temporary patch has been created based on the pending solution to manually address this.
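As an illustration of the parsing problem, the sketch below recovers the intact JSON objects from a log in which stray terminal output has been interleaved. It is not the temporary patch mentioned above and not the upstream fix in ansible-runner, but a separate, tolerant reader built on the standard library's `json.JSONDecoder.raw_decode`. The function name `iter_json_objects` is hypothetical.

```python
import json


def iter_json_objects(text):
    """Yield every well-formed top-level JSON object found in `text`.

    Anything between the objects (for example ANSI-colored warnings) is skipped.
    """
    decoder = json.JSONDecoder()
    pos = 0
    while True:
        start = text.find("{", pos)
        if start == -1:
            return
        try:
            obj, end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            pos = start + 1  # not a valid object here, keep scanning
            continue
        yield obj
        pos = end


log = (
    '{"host": "192.168.56.106"}\n'
    "\x1b[1;35m[WARNING]: Chocolatey was missing from this system\x1b[0m\n"
    '{"counter": 13, "event": "runner_on_ok"}\n'
)
events = list(iter_json_objects(log))
```

This approach only helps when the stray text falls between objects. An object that is itself cut in half is skipped, so it is a way of salvaging a log rather than a replacement for fixing the writer.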

5.3.4 Strange URLs

When working with the internalization of Chocolatey packages, one challenge that was encountered was that the URLs did not follow a standard format across packages. The simplest variant was the packages that had one static URL for the 32-bit version of the application and one static URL for the 64-bit version. But there were also several "smarter" solutions, where the URLs changed depending on the current version of the program or were even calculated at runtime based on the operating system, architecture, or display language as reported by the operating system. There was even one package that tried to obfuscate the URL by encoding it (see Listing 5.3), which made it difficult to read statically. However, this could be circumvented by looking at the HTTP request made at runtime (which is displayed in the log files) and then replacing the obfuscated URL manually before saving the package in the repository.
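The difference between static and runtime-dependent URLs can be sketched with a simple classifier. The regular expression and the function name `classify_urls` are assumptions for this example and do not describe the internalization tool itself. The example script lines are invented.

```python
import re

# Matches http(s) URLs up to the next whitespace or quote character.
URL_PATTERN = re.compile(r"""https?://[^\s'"]+""")


def classify_urls(script_text):
    """Split the URLs found in an install script into static and dynamic ones.

    A URL is treated as dynamic if it contains a PowerShell variable reference.
    """
    static, dynamic = [], []
    for url in URL_PATTERN.findall(script_text):
        (dynamic if "$" in url else static).append(url)
    return static, dynamic


script = (
    "$url = 'https://example.com/app-1.0.exe'\n"
    '$url64 = "https://example.com/app-$version-x64.exe"\n'
)
static, dynamic = classify_urls(script)
```

A URL that is assembled entirely from variables, as in the obfuscated package, contains no literal `https://` and would not be found by this pattern, which is why the runtime log had to be consulted in that case.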


{ "host": "192.168.56.106", "uuid": "d8d8c7f5-2e57-4771-abac-48e8155201b7" }

^^[[ 1;35m[WARNING ]: Chocolatey was missing from this system, so it was installed during^^[[0m ^^[[ 1;35mthis task run.^^[[ 0m { "uuid": "874328ec-fa17-4a0b-b105-384ed6ba2705", "counter": 13, "stdout": "\u001b[0;33mchanged: [192.168.56.106]\u001b[0m", "start_line": 12, "end_line": 13, "runner_ident": "5b19763f-5169-4651-b89a-4a9a24d17397", "event": "runner_on_ok", "pid": 15214, "created": "2021-02-24T08:57:29.352492", "parent_uuid": "c5247a8a-752e-a6ac-0477-00000000000a" }

Listing 5.2: Part of the JSON file that got corrupted

5.4 The work in a wider context

While solutions that automatically install applications already exist in several forms, this thesis aims to develop a program that can interface with these existing solutions to provide a simpler way of installing applications inside a cyber range. This will be beneficial for exercise development since more varied software environments can be created in a shorter time span. We also provide statistical data on the rate at which software packages break due to their vendors removing files from their servers (see Section 4.1.2), which highlights the importance of preserving software for future cyber security studies.

Cyber security could be a subject of ethical discussions, as attacks often need to be performed and studied to learn and to develop defenses against them. Exploits, both publicly available and self-written, are necessary to understand vulnerabilities and how to mitigate them. While some research could be done safely on a regular computer, other areas, for example network-related exploits, might need a whole network of computers. It would be unethical to conduct security research against a production environment or end-users without consent. The use of cyber ranges provides a space where such research and education could be done ethically without negatively affecting anyone.

On a societal level, the use of cyber ranges improves cyber security in multiple areas. Governments, companies, universities and individual people could all benefit from cyber ranges as these could be designed for various purposes. In CRATE’s case, there exists a miniature city model which illustrates a fictional city. This is used to demonstrate some of the possible consequences a cyber attack could have in a city environment where more things are connected to the Internet. Training scenarios could be set up for testing the security for


$UrlBase= "76492d1116743f0423413b16050a5345MgB8AEQAMwBxAHUAQwBsAHUA WgBuAHkAbABjAG0AWgBYADMAagAwADMANgBxAFEAPQA9AHwAOABiADQANABmADAAZQA4 ADgAMgBiADQAMwBhADQAYgA5ADEAMABkADcAZAAwADcANAA2ADIANAAzADgAMwA0AGUA YQBhADEAMQBhADQAZQBjAGUAZgAxADUAZQBkADIANQBlADAANgA0ADkAOABmAGIAYQBl AGMAZQAxADcANQBiAGEAOAA3AGIANAA3ADgAZgA0ADAAZQA4AGIAOQAxADcAZQAwADcA NQA1AGYAYgBhAGMAOAA4ADkANgA1ADEANAAxADMANQA5AGEAYgBmADQAZgBhAGQANwBk ADQANQA2ADMANAAyADcANABiADgANwA4ADEAYQA4ADcAYwA2AGYAZAA0AGEANgBkADUA MwA4ADkAMQA3ADIANQAyADUAZgA2AGUAOAA4ADEANAAyADMAMgBmADEANgBjADIAYwAx ADIAOAAxAGQAOQA4AGIAZABhADUANQA3ADIAYgA3ADIANAAwADgAMwA4ADIANQAzAGUA YwAxADMAZQBkAGQAMgBhAGEANABiAGEAMwBhADIAOAAwADMAMABjAGEAYwA0ADcAZQAz ADMAZQBiADIAOQBiADgAOQAwAGIAZQAxAA==" $UrlBase64= "76492d1116743f0423413b16050a5345MgB8AFEANwBwAHYAYwBNAF EAdQBpAEUARwBVAEYAYwBNADMAMwAyAHYAVABSAEEAPQA9AHwAMwA0ADcANQA5ADAAMA BjAGUAYQAzADEANQA0ADkAMwBmAGIAYwAxADQANQAwAGQAMQBmAGQAMQA3ADYANwA2AG IAOAAyADUAZgAzADkAZAAxADQAZgA3AGYAYwA1ADMANgA5ADIAMAA1ADAAMwAwADQAMg BlAGYAOABhADYAYQBjAGIANQAwADMANAAwAGEANQBlADMAMQBiAGYAYwA3ADcAZAAzAD YAMAA4ADIAOQA0AGUAYgBhADAAZQAwAGUAOABkAGEAOABhADUANABlAGEAMwBlAGIAZA BlADEAZQA5ADQAYQBmADUANwBhADYANAAxAGEAMAA2ADgAMQA5ADYAMgA0ADUAYgBmAG EAMwAyADYAZgBkAGIAZABmAGYANwA4AGYANQAwADMAZQAxADcANwAxAGMAMwBkAGQAZg BhADUANgAzAGYAMwAxADYAOQA3ADQAZQA0ADcAZQA0ADEAMgBkAGUAYQA4ADUAZgA3AD QAOAAwADIAZQBlAGYAMQA4AGYANwBkAGQAOABiADkAZgA1ADIAZAA2ADYAYwAzADMAZQ A0ADMANQBjADMANgA1ADkAYgBjADUAYwA5AA==" $UrlBase= ConvertTo-SecureString $UrlBase -Key $UrIBase $UrlBase64= ConvertTo-SecureString $UrlBase64 -Key $UrIBase [...] $url= $UrlBase+ $FileVersion+ "_"+ $LangCode+ ".exe"

Listing 5.3: Obfuscated URL

opening bridges, controlling trains and traffic lights, just to mention a few. Decision-makers, who might not have deep technical knowledge, can then see the consequences in action without affecting a real city, which would also be orders of magnitude more expensive to execute.

6 Conclusion

This chapter presents answers to the research questions defined in Section 1.3 and provides thoughts on future work.

6.1 Research questions

The aim of this thesis was to assist in improving cyber security research and training by improving the installation process of outdated software. To do this, a tool to automate the installation process was needed and a way to store multiple versions of common software in a reliable way. This resulted in the following research questions:

RQ1: How can older, potentially vulnerable software be stored reliably to support replicable security research?

To study this, we utilized a package manager called Chocolatey, which can supply self-contained packages of software from its public repository. Chocolatey also provides the ability for organizations to host an internal repository, which gives them a more reliable repository that they are in control of. From an earlier study, FOI already had one of these internal repositories set up using the Sonatype Nexus Repository Manager.

Our contribution was to develop a tool to automatically repackage software from the public repository into this internal one. It uses a JSON-formatted input file that contains the name of the software along with the version. Most of the packages could be repackaged, although not all of them. The main reason for this was that files had been removed by the vendors themselves, which is why the public repository is not as reliable as an internal one.

RQ2: How can the installation process of such software be automated for scalable tests?

To automate the installation process we utilized Ansible, which integrates well with Chocolatey. When used together, the installation process can be automated and can scale up to install on multiple machines at the same time if needed. We created a tool that takes a JSON-formatted input file and generates a set of playbooks that Ansible can use. The tool then logs the output from each installation, including any errors that might occur, and provides the user with a summary of how the installation went. Full logs are also stored if further analysis is needed.
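The step from a JSON input file to a playbook can be sketched as below. This is not the tool developed in this thesis. The input layout (a list of objects with `name` and `version`), the function names, the host group `windows` and the repository URL in the usage example are assumptions. The `win_chocolatey` module and its `name`, `version`, `state` and `source` parameters are part of Ansible's Chocolatey integration. The playbook is serialized as JSON, which YAML parsers generally accept, to avoid depending on a third-party YAML library.

```python
import json


def build_playbook(packages, hosts="windows", source=None):
    """Build an Ansible playbook structure with one install task per package."""
    tasks = []
    for package in packages:
        args = {
            "name": package["name"],
            "version": package["version"],
            "state": "present",
        }
        if source:
            args["source"] = source  # e.g. the URL of an internal repository
        tasks.append({
            "name": f"Install {package['name']} {package['version']}",
            "win_chocolatey": args,
        })
    return [{"hosts": hosts, "gather_facts": False, "tasks": tasks}]


def render_playbook(playbook):
    """Serialize the playbook as JSON text."""
    return json.dumps(playbook, indent=2)


packages = json.loads('[{"name": "vlc", "version": "2.2.8"}]')
playbook = build_playbook(packages, source="http://nexus.example/repository/choco/")
print(render_playbook(playbook))
```

Generating one task per package keeps the result of each installation separate in Ansible's output, which makes it easier to summarize successes and failures afterwards.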

RQ3: Is the software in a vulnerable state from the point of installation and if not, how can it be placed into such a state?

None of the applications targeted by both automated and manual attacks proved to be exploitable right after installation. Of the four manually tested exploits, one failed due to a lack of support in the Metasploit framework, one required an additional vector of delivering the attack, one would have worked had the application contained any user data, and one was not exploitable due to its default configuration not exposing the vulnerable functionality. In total, three of the four applications could be deemed vulnerable to a remote attacker since the vulnerabilities are exposed assuming that the applications are used by an actual user and that the first exploit can be adapted to the installed version of the vulnerable program. Ansible playbooks were created for two of these applications such that they could be placed into an exploitable state as part of the installation process, demonstrating the created installation tool’s ability to simplify the process of rapidly deploying vulnerable software in a controlled environment.

6.2 Future work

6.2.1 Improving the mapping from program and version to CPE

As mentioned in Sections 3.1.7 and 3.2.2, the FOI-provided mapping script for producing CPE candidates from the retrieved Chocolatey data often produced incorrect results since it was designed to accept unstructured input and thus would also compare vendors against application names. A limited-scale test was performed using the filtering process based on fuzzy matching (mentioned in Section 3.2.2) to directly suggest CPE candidates from the dictionary1 provided by the NVD. This could sometimes produce better matches than the provided script, but ran much slower and required that the input data followed a known format with separate fields for name, vendor, and version. Since the suggested CPE is used to suggest an exploit that might work on the application, the quality of this mapping is essential to producing a relevant exploit suggestion. Thus, future work to improve the extraction of CPEs from either structured or unstructured input data would be a great benefit for the process of automatically suggesting relevant exploits for an application.
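The idea of matching structured fields separately can be sketched as follows. The sketch uses `difflib.SequenceMatcher` from the Python standard library as a stand-in for the fuzzy matching library used in this thesis, so the scores are not comparable. The function names, the threshold of 0.6 and the equal weighting of name and vendor are assumptions, and the candidate list is a hand-written example rather than data from the NVD dictionary.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a similarity ratio between 0 and 1, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def suggest_cpe(name, vendor, version, candidates, threshold=0.6):
    """Pick the best (vendor, product, version) candidate for a package.

    Names are only compared with products and vendors only with vendors,
    and the version must match exactly.
    """
    best, best_score = None, threshold
    for cand_vendor, cand_product, cand_version in candidates:
        if cand_version != version:
            continue
        score = (similarity(name, cand_product) + similarity(vendor, cand_vendor)) / 2
        if score > best_score:
            best, best_score = (cand_vendor, cand_product, cand_version), score
    return best


candidates = [
    ("videolan", "vlc_media_player", "2.2.8"),
    ("videolan", "vlc_media_player", "3.0.0"),
    ("mozilla", "firefox", "2.2.8"),
]
match = suggest_cpe("VLC media player", "VideoLAN", "2.2.8", candidates)
```

Keeping the comparison field by field avoids the problem described above where a vendor is compared against an application name, at the cost of requiring structured input.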

1 https://nvd.nist.gov/products/cpe

6.2.2 Improving the exploit suggestion process

Another area with improvement opportunities is the process of suggesting a Metasploit module given a CPE. With FOI’s existing scripts this is done through a combination of attempting to match CPEs to CVE codes and then to Metasploit modules, and by string matching the CPE against each Metasploit module’s title, description, and target definitions. This last method in particular could fail to capture the meaning of a statement such as “affects versions before 1.2.3” and instead suggest CPEs with the version 1.2.3. Utilizing a structured format for specifying which versions are affected, such as the CVE listings available from the NVD which use CPEs to indicate this, could make it easier to connect a Metasploit module, using its list of relevant CVEs, to a set of affected CPEs and thus a set of installable applications. This is similar to the current process, but additional work focused on capturing the meaning of “before version X” or “up to and including Y” could help to decrease the number of misidentified exploits.

It would also be relevant to be able to associate CVEs with their affected programs. Based on a vulnerability with a given CVE, this would enable listing the affected programs for that code if these are available from the internal repository. Having this ability would also make it easier to set up a new scenario where vulnerabilities can be tested. For this to work, a mapping would have to be stored in the database between the CVEs and the programs available at Chocolatey (or other repositories).
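Capturing the meaning of such version statements could start from something like the sketch below. It only handles the two phrasings quoted above and purely numeric, dot-separated versions. The patterns and function names are assumptions for illustration and are not part of FOI's scripts.

```python
import re

VERSION = r"(\d+(?:\.\d+)*)"

# Each entry pairs a phrasing with the comparison it implies.
PATTERNS = [
    (re.compile(r"before\s+(?:version\s+)?" + VERSION, re.I),
     lambda version, limit: version < limit),
    (re.compile(r"up to and including\s+(?:version\s+)?" + VERSION, re.I),
     lambda version, limit: version <= limit),
]


def parse_version(text):
    """Turn '1.2.3' into (1, 2, 3) so that versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def is_affected(statement, version):
    """Return True or False if the statement decides it, or None if no phrasing matched."""
    candidate = parse_version(version)
    for pattern, compare in PATTERNS:
        match = pattern.search(statement)
        if match:
            return compare(candidate, parse_version(match.group(1)))
    return None
```

With this, the statement “affects versions before 1.2.3” no longer selects version 1.2.3 itself, which is the misidentification described above. Versions with letters or build suffixes would need a more complete version parser.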

6.2.3 Further automating the internalization process

The package internalization process described in Section 3.1.10 still relies on user input for resolving situations where a variable is found in a URL or where any files could not be downloaded. By implementing a way of automatically resolving some of these variable references, more packages could be packaged without user interaction. Likewise, the introduction of more advanced methods of determining whether a URL is relevant for the package to function or is just included for reference could also help to eliminate more cases where the internalization process is interrupted to prompt for user input.
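One possible starting point for resolving variable references automatically is sketched below. It only handles variables that are assigned a literal string on a single line of the install script and leaves everything else for the user. The regular expressions and the function name `resolve_url` are assumptions and do not describe the existing internalization tool.

```python
import re

# Matches lines such as:  $version = '1.2.3'  (literal values only, no nested variables)
ASSIGNMENT = re.compile(r"""^\s*\$(\w+)\s*=\s*['"]([^'"$]*)['"]\s*$""", re.M)
# Matches $name and ${name} references inside a URL.
REFERENCE = re.compile(r"\$\{?(\w+)\}?")


def resolve_url(url, script_text):
    """Substitute known literal variables into `url`.

    Returns the partially or fully resolved URL and the names that are still
    unresolved. PowerShell variable names are case-insensitive, so names are
    compared in lower case.
    """
    variables = {name.lower(): value for name, value in ASSIGNMENT.findall(script_text)}
    unresolved = []

    def substitute(match):
        name = match.group(1)
        if name.lower() in variables:
            return variables[name.lower()]
        unresolved.append(name)
        return match.group(0)

    return REFERENCE.sub(substitute, url), unresolved


script = (
    "$version = '1.2.3'\n"
    '$url = "https://example.com/app-$version-$arch.exe"\n'
)
resolved, unresolved = resolve_url("https://example.com/app-$version-$arch.exe", script)
```

In this example `$version` is filled in while `$arch` is reported as unresolved, so the user would only be prompted for the values that genuinely depend on the runtime environment.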

Bibliography

[1] Microsoft. Microsoft Digital Defense Report. 2020. URL: https://query.prod.cms.rt.microsoft.com/cms/api/am/binary/RWxPuf (visited on 2021-05-18).
[2] Symantec. ISTR Internet Security Threat Report. 2019. URL: https://docs.broadcom.com/doc/istr-24-2019-en (visited on 2021-05-18).
[3] Simon Duque Anton, Daniel Fraunholz, Christoph Lipps, Frederic Pohl, Marc Zimmermann, and Hans D Schotten. “Two decades of SCADA exploitation: A brief history”. In: 2017 IEEE Conference on Application, Information and Network Security (AINS). IEEE. 2017, pp. 98–104.
[4] Rob Kitchin. “The real-time city? Big data and smart urbanism”. In: GeoJournal 79.1 (2014), pp. 1–14.
[5] Lovisa Mickelsson and Johan Bengtsson. Svenska smarta städer - En explorativ studie om förväntad nytta och potentiella sårbarheter. 2021.
[6] Erik Brynjolfsson, John J Horton, Adam Ozimek, Daniel Rock, Garima Sharma, and Hong-Yi TuYe. COVID-19 and remote work: An early look at US data. Tech. rep. National Bureau of Economic Research, 2020.
[7] Christina Meilee Williams, Rahul Chaturvedi, and Krishnan Chakravarthy. “Cybersecurity risks in a pandemic”. In: Journal of Medical Internet Research 22.9 (2020).
[8] Cyber Security for Europe. CyberSec4Europe - European Cybersecurity Competence Network. 2021-05. URL: https://cybersec4europe.eu/ (visited on 2021-05-28).
[9] William Newhouse, Stephanie Keith, Benjamin Scribner, and Greg Witte. “National initiative for cybersecurity education (NICE) cybersecurity workforce framework”. In: NIST special publication 800.2017 (2017), p. 181.
[10] Elina Suni, Juha Piispanen, Jarmo Nevala, Jani Päijänen, and Karo Saharine. Report on existing cyber ranges, requirements. 2020.
[11] Jon Davis and Shane Magrath. A Survey of Cyber Ranges and Testbeds. 2013. URL: https://apps.dtic.mil/sti/citations/ADA594524.
[12] Mohammad Borhani, Madhusanka Liyanage, Ali Hassan Sodhro, Pardeep Kumar, Anca Delia Jurcut, and Andrei Gurtov. “Secure and resilient communications in the industrial internet”. In: Guide to Disaster-Resilient Communication Networks. Springer, 2020, pp. 219–242.


[13] Maria Leitner, Maximilian Frank, Wolfgang Hotwagner, Gregor Langner, Oliver Maurhart, Timea Pahi, Lenhard Reuter, Florian Skopik, Paul Smith, and Manuel Warum. “AIT Cyber Range: Flexible Cyber Security Environment for Exercises, Training and Research”. In: Proceedings of the European Interdisciplinary Cybersecurity Conference. EICC 2020. Rennes, France: Association for Computing Machinery, 2020. ISBN: 9781450375993. DOI: 10.1145/3424954.3424959. URL: https://doi.org/10.1145/3424954.3424959.
[14] Swedish Defence Research Agency (FOI). CRATE - Cyber Range And Training Environment. 2020. URL: https://www.foi.se/en/foi/resources/crate---cyber-range-and-training-environment.html (visited on 2021-02-08).
[15] Microsoft and other contributors. Windows Package Manager CLI (aka winget). 2021. URL: https://github.com/microsoft/winget-cli/tree/5898d09203409d08742d97f8eb54e249316695e8 (visited on 2021-05-25).
[16] Debian Wiki Team. Software - Debian wiki. 2021. URL: https://wiki.debian.org/Software (visited on 2021-05-28).
[17] Microsoft. How to install programs from online sources on Windows 10. 2021. URL: https://support.microsoft.com/en-us/windows/how-to-install-programs-from-online-sources-on-windows-10-a503e8b6-e45b-fd5a-f4c5-5a08c8bd9821 (visited on 2021-05-28).
[18] Microsoft and other contributors. Standard Installer Command-Line Options. 2018. URL: https://docs.microsoft.com/en-us/windows/win32/msi/standard-installer-command-line-options (visited on 2021-02-08).
[19] Juan Jose Pablos. Unattended, A Windows deployment system: Unattended/Silent Installation Switches for Windows Apps. 2005. URL: http://unattended.sourceforge.net/installers.php (visited on 2021-05-19).
[20] Dan R Herrick and John B Tyndall. “Sustainable automated software deployment practices”. In: Proceedings of the 41st annual ACM SIGUCCS conference on User services. 2013, pp. 189–196.
[21] Chocolatey Software. Why Chocolatey? 2021. URL: https://chocolatey.org/why-chocolatey (visited on 2021-02-08).
[22] Chocolatey Software. Chocolatey Software | Packages. 2021. URL: https://community.chocolatey.org/packages/ (visited on 2021-02-08).
[23] Chocolatey Software. Chocolatey.org Packages Disclaimer. 2021. URL: https://docs.chocolatey.org/en-us/community-repository/community-packages-disclaimer (visited on 2021-05-31).
[24] Microsoft and other contributors. Windows Package Manager (preview). 2020. URL: https://docs.microsoft.com/en-us/windows/package-manager (visited on 2021-02-08).
[25] Microsoft and other contributors. The Microsoft community Windows Package Manager manifest repository. 2021. URL: https://github.com/microsoft/winget-pkgs (visited on 2021-02-08).
[26] Luke Sampson and other contributors. The next-generation default bucket for Scoop. 2021. URL: https://github.com/ScoopInstaller/Main/tree/master/bucket (visited on 2021-02-08).
[27] Luke Sampson and other contributors. "Extras" bucket for Scoop. 2021. URL: https://github.com/lukesampson/scoop-extras (visited on 2021-02-08).
[28] Secure by Design. Ninite - Install or Update Multiple Apps at Once. 2021. URL: https://ninite.com/ (visited on 2021-02-08).


[29] Keivan Beigi. Package repository for AppGet. 2020. URL: https://github.com/appget/appget.packages (visited on 2021-02-25).
[30] Microsoft and other contributors. An introduction to NuGet. 2019. URL: https://docs.microsoft.com/en-us/nuget/what-is-nuget (visited on 2021-02-08).
[31] Microsoft. NuGet Gallery. 2021. URL: https://www.nuget.org/ (visited on 2021-02-08).
[32] Red Hat and other contributors. Ansible Documentation. 2021. URL: https://docs.ansible.com/ansible/latest/index.html (visited on 2021-02-11).
[33] Progress Software Corporation and other contributors. An Overview of Chef Infra. 2021. URL: https://docs.chef.io/chef_overview/ (visited on 2021-02-11).
[34] Progress Software Corporation. Chef Enterprise Automation Stack. 2021. URL: https://www.chef.io/products/enterprise-automation-stack (visited on 2021-05-28).
[35] Erika Heidi. Configuration Management 101: Writing Chef Recipes. 2016. URL: https://www.digitalocean.com/community/tutorials/configuration-management-101-writing-chef-recipes (visited on 2021-02-25).
[36] Puppet. Introduction to Puppet. 2021. URL: https://puppet.com/docs/puppet/7.4/puppet_overview.html (visited on 2021-02-25).
[37] Puppet. Puppet language overview. 2021. URL: https://puppet.com/docs/puppet/7.6/intro_puppet_language_and_code.html (visited on 2021-05-18).
[38] Puppet. Getting started with Bolt. 2021. URL: https://puppet.com/docs/bolt/latest/getting_started_with_bolt.html (visited on 2021-05-18).
[39] SaltStack. Introduction to Salt. 2021. URL: https://docs.saltproject.io/en/latest/topics/index.html (visited on 2021-05-19).
[40] SaltStack. Salt Proxy Minion. 2021. URL: https://docs.saltproject.io/en/latest/topics/proxyminion/index.html (visited on 2021-05-19).
[41] Shane Lee (SaltStack). [BUG] Unable to install salt-ssh on windows (Comment). 2020. URL: https://github.com/saltstack/salt/issues/56982#issuecomment-621333844 (visited on 2021-05-19).
[42] VMware. vRealize Automation. 2020. URL: https://www.vmware.com/products/vrealize-automation.html (visited on 2021-05-19).
[43] Linode. Configure Apache with Salt Stack. 2021. URL: https://blog.linode.com/docs/guides/configure-apache-with-salt-stack/ (visited on 2021-05-18).
[44] Chocolatey Software. Why Boxstarter? 2020. URL: https://boxstarter.org/WhyBoxstarter (visited on 2021-05-25).
[45] HashiCorp. Why Use Packer? 2021. URL: https://www.packer.io/intro/why (visited on 2021-05-26).
[46] National Institute of Standards and Technology (NIST). Official Common Platform Enumeration (CPE) Dictionary. URL: https://nvd.nist.gov/products/cpe (visited on 2021-03-15).
[47] The MITRE Corporation. CPE - Common Platform Enumeration: CPE Specifications. URL: https://cpe.mitre.org/specification/ (visited on 2021-06-17).
[48] The MITRE Corporation. CVE - About CVE Records. URL: https://cve.mitre.org/cve/identifiers/ (visited on 2021-03-15).
[49] National Institute of Standards and Technology (NIST). NVD - General Information. URL: https://nvd.nist.gov/general (visited on 2021-03-19).


[50] Rapid7. Metasploit Framework. URL: https://docs.rapid7.com/metasploit/msf-overview/ (visited on 2021-02-26).
[51] Hannes Holm and Teodor Sommestad. “Sved: Scanning, vulnerabilities, exploits and detection”. In: MILCOM 2016-2016 IEEE Military Communications Conference. IEEE. 2016, pp. 976–981.
[52] Jennifer Bedhammar and Oliver Johansson. “Visualization of cyber security attacks”. MA thesis. 2020. URL: http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-167144.
[53] Hannes Holm and Teodor Sommestad. “So long, and thanks for only using readily available scripts”. In: Information and 25 (2017-03), pp. 47–61. DOI: 10.1108/ICS-08-2016-0069.
[54] Tommy Gustafsson and Jonas Almroth. “Cyber Range Automation Overview with a Case Study of CRATE”. In: The 25th Nordic Conference on Secure IT Systems (Nordsec 2020). 2020-11, pp. 192–209. DOI: 10.1007/978-3-030-70852-8_12.
[55] Luis Alberto Benthin Sanguino and Rafael Uetz. “Software vulnerability analysis using CPE and CVE”. In: arXiv preprint arXiv:1705.05347 (2017).
[56] Cuong Pham, Dat Tang, Ken-ichi Chinen, and Razvan Beuran. “CyRIS: a cyber range instantiation system for facilitating security training”. In: Proceedings of the Seventh Symposium on Information and Communication Technology. 2016, pp. 251–258.
[57] Anders Ydremark. “Approaches for Automated Software Installations in Windows”. Bachelor’s Thesis. 2014. URL: http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-107419.
[58] Stanislav Dashevskyi, Daniel Ricardo dos Santos, Fabio Massacci, and Antonino Sabetta. “TestREx: a framework for repeatable exploits”. In: International Journal on Software Tools for Technology Transfer 21.1 (2019), pp. 105–119.
[59] Docker. Docker overview. 2021. URL: https://docs.docker.com/get-started/overview/ (visited on 2021-05-27).
[60] Chocolatey Software and other contributors. Chocolatey Software Docs | Integrates with everything. 2020. URL: https://docs.chocolatey.org/en-us/features/integrations (visited on 2021-02-08).
[61] Chocolatey Software. Chocolatey.org Packages Disclaimer - Rate Limiting. URL: https://docs.chocolatey.org/en-us/community-repository/community-packages-disclaimer#rate-limiting (visited on 2021-03-17).
[62] Chocolatey Software. Chocolatey.org Packages Disclaimer - Excessive Use. URL: https://docs.chocolatey.org/en-us/community-repository/community-packages-disclaimer#excessive-use (visited on 2021-03-17).
[63] Chocolatey Software. Package Validator Moderation Service. 2021. URL: https://docs.chocolatey.org/en-us/community-repository/moderation/package-validator/ (visited on 2021-05-19).
[64] Chocolatey Software. Package Verifier Moderation Service. 2021. URL: https://docs.chocolatey.org/en-us/community-repository/moderation/package-verifier (visited on 2021-05-19).
[65] Adam Cohen. FuzzyWuzzy: Fuzzy String Matching in Python. 2011. URL: https://chairnerd.seatgeek.com/fuzzywuzzy-fuzzy-string-matching-in-python/ (visited on 2021-05-05).
[66] Maxwell Dondo, Jonathan Risto, and Reginald Sawilla. Reliability of exploits and consequences for decision support. Tech. rep. NATO Science and Technology Organization, 2015.


[67] Kevin Marquette. Everything you wanted to know about variable substitution in strings. 2017. URL: https://docs.microsoft.com/en-us/powershell/scripting/learn/deep-dives/everything-about-string-substitutions?view=powershell-7.1 (visited on 2021-05-18).
[68] Microsoft. Microsoft Security Intelligence Report Volume 11. Tech. rep. 2011-10.
[69] Alan Rominger. Ansible Runner Patch. URL: https://github.com/ansible/ansible-runner/commit/d51083716b47a49a94233e8e1bad4b3f2e337a9d (visited on 2021-02-26).

A Automatic exploit test results

| Category | Metasploit exploit name | Result | Count |
|---|---|---|---|
| fileformat | exploit/windows/fileformat/vlc_mkv | successful | 4 |
| fileformat | exploit/windows/fileformat/vlc_mkv | failed | 3 |
| fileformat | exploit/windows/fileformat/apple_quicktime_texml | failed | 12 |
| fileformat | exploit/windows/fileformat/apple_quicktime_pnsize | failed | 4 |
| fileformat | exploit/windows/fileformat/wireshark_mpeg_overflow | failed | 2 |
| fileformat | exploit/windows/fileformat/wireshark_packet_dect | failed | 1 |
| fileformat | exploit/windows/fileformat/vlc_smb_uri | failed | 1 |
| fileformat | exploit/windows/fileformat/vlc_webm | failed | 1 |

Table A.1: Exploit results, file format

| Category | Metasploit exploit name | Result | Count |
|---|---|---|---|
| server | exploit/windows/http/ektron_xslt_exec_ws | failed | 3 |
| server | exploit/windows/ftp/vermillion_ftpd_port | failed | 4 |
| server | exploit/windows/ftp/ms09_053_ftpd_nlst | failed | 9 |
| server | exploit/multi/http/oracle_weblogic_wsat_deserialization_rce | failed | 4 |
| server | exploit/multi/http/jenkins_metaprogramming | failed | 6 |
| server | exploit/multi/http/confluence_widget_connector | failed | 6 |
| server | exploit/multi/http/jenkins_xstream_deserialize | failed | 15 |
| server | exploit/multi/http/clipbucket_fileupload_exec | failed | 3 |
| server | exploit/windows/http/httpdx_handlepeer | failed | 2 |
| server | exploit/windows/http/integard_password_bof | failed | 2 |
| server | exploit/multi/mysql/mysql_udf_payload | failed | 4 |
| server | exploit/windows/http/icecast_header | failed | 2 |
| server | exploit/windows/http/mcafee_epolicy_source | failed | 1 |
| server | exploit/windows/http/mdaemon_worldclient_form2raw | failed | 2 |
| server | exploit/windows/misc/hp_imc_dbman_restartdb_unauth_rce | failed | 3 |
| server | exploit/windows/http/umbraco_upload_aspx | failed | 3 |
| server | exploit/windows/http/ultraminihttp_bof | failed | 4 |
| server | exploit/windows/iis/ms02_018_htr | failed | 12 |
| server | exploit/windows/iis/ms01_023_printer | failed | 3 |
| server | exploit/windows/iis/ms02_065_msadc | failed | 3 |
| server | exploit/multi/elasticsearch/search_groovy_script | failed | 3 |
| server | exploit/windows/iis/msadc | failed | 3 |
| server | exploit/multi/http/struts2_content_type_ognl | failed | 1 |
| server | exploit/multi/http/manage_engine_dc_pmp_sqli | failed | 4 |
| server | exploit/multi/http/splunk_mappy_exec | failed | 1 |
| server | exploit/multi/http/splunk_upload_app_exec | failed | 1 |
| server | exploit/multi/misc/wireshark_lwres_getaddrbyname | failed | 3 |
| server | exploit/multi/misc/wireshark_lwres_getaddrbyname_loop | failed | 15 |
| server | exploit/windows/iis/iis_webdav_scstoragepathfromurl | failed | 3 |
| server | exploit/windows/mysql/mysql_yassl_hello | failed | 8 |
| server | exploit/multi/http/jira_hipchat_template | failed | 6 |
| server | exploit/windows/misc/fb_cnct_group | failed | 15 |
| server | exploit/windows/misc/ibm_director_cim_dllinject | failed | 3 |
| server | exploit/multi/http/zabbix_script_exec | failed | 3 |
| server | exploit/windows/mssql/ms09_004_sp_replwritetovarbin | failed | 3 |
| server | exploit/windows/mssql/mssql_payload | failed | 3 |
| server | exploit/windows/mssql/mssql_payload_sqli | failed | 3 |
| server | exploit/windows/http/disk_pulse_enterprise_bof | failed | 3 |
| server | exploit/windows/smb/ms06_040_netapi | failed | 3 |
| server | exploit/windows/scada/advantech_webaccess_dashboard_file_upload | failed | 3 |
| server | exploit/windows/ssh/freesshd_authbypass | failed | 4 |
| server | exploit/windows/firewall/blackice_pam_icq | failed | 69 |

Table A.2: Exploit results, server

| Category | Metasploit exploit name | Result | Count |
|---|---|---|---|
| browser | exploit/windows/browser/adobe_flash_copy_pixels_to_byte_array | failed | 4 |
| browser | exploit/windows/browser/adobe_flash_avm2 | failed | 3 |
| browser | exploit/windows/browser/adobe_flashplayer_avm | failed | 4 |
| browser | exploit/windows/browser/adobe_flash_regex_value | failed | 3 |
| browser | exploit/windows/browser/apple_quicktime_marshaled_punk | failed | 4 |
| browser | exploit/windows/browser/apple_quicktime_mime_type | failed | 4 |
| browser | exploit/windows/browser/apple_quicktime_rdrf | failed | 16 |
| browser | exploit/windows/browser/apple_quicktime_rtsp | failed | 4 |
| browser | exploit/windows/browser/apple_quicktime_smil_debug | failed | 4 |
| browser | exploit/windows/browser/apple_quicktime_texml_font_table | failed | 4 |
| browser | exploit/multi/browser/adobe_flash_hacking_team_uaf | failed | 4 |
| browser | exploit/multi/browser/adobe_flash_nellymoser_bof | failed | 4 |
| browser | exploit/multi/browser/adobe_flash_net_connection_confusion | failed | 4 |
| browser | exploit/multi/browser/adobe_flash_opaque_background_uaf | failed | 4 |
| browser | exploit/multi/browser/adobe_flash_pixel_bender_bof | failed | 4 |
| browser | exploit/multi/browser/adobe_flash_shader_drawing_fill | failed | 4 |
| browser | exploit/multi/browser/adobe_flash_shader_job_overflow | failed | 4 |
| browser | exploit/multi/browser/firefox_escape_retval | failed | 8 |
| browser | exploit/multi/browser/firefox_pdfjs_privilege_escalation | failed | 8 |
| browser | exploit/multi/browser/firefox_proto_crmfrequest | failed | 8 |
| browser | exploit/windows/browser/ie_execcommand_uaf | failed | 3 |
| browser | exploit/multi/browser/firefox_proxy_prototype | failed | 8 |
| browser | exploit/multi/browser/firefox_svg_plugin | failed | 8 |
| browser | exploit/multi/browser/firefox_tostring_console_injection | failed | 8 |
| browser | exploit/multi/browser/firefox_webidl_injection | failed | 8 |
| browser | exploit/multi/browser/java_jre17_exec | failed | 8 |
| browser | exploit/multi/browser/java_rhino | failed | 12 |
| browser | exploit/windows/browser/mozilla_attribchildremoved | failed | 4 |
| browser | exploit/windows/browser/mozilla_firefox_onreadystatechange | failed | 4 |
| browser | exploit/windows/browser/mozilla_firefox_xmlserializer | failed | 4 |
| browser | exploit/windows/browser/mozilla_nstreerange | failed | 4 |
| browser | exploit/windows/browser/mozilla_mchannel | failed | 4 |
| browser | exploit/windows/browser/mozilla_reduceright | failed | 4 |
| browser | exploit/windows/browser/mozilla_nssvgvalue | failed | 4 |
| browser | exploit/windows/browser/mozilla_interleaved_write | failed | 4 |
| browser | exploit/windows/browser/ms07_017_ani_loadimage_chunksize | failed | 4 |
| browser | exploit/windows/browser/ms10_026_avi_nsamplespersec | failed | 4 |
| browser | exploit/windows/browser/ms10_090_ie_css_clip | failed | 3 |
| browser | exploit/windows/browser/ms13_059_cflatmarkuppointer | failed | 3 |
| browser | exploit/windows/browser/ms14_064_ole_code_execution | failed | 6 |

Table A.3: Exploit results, browser
