
Towards a Secure IoT Using Linux-Based Containers

Marcus Hufvudsson

Information Security, master's level (120 credits) 2017

Luleå University of Technology
Department of Computer Science, Electrical and Space Engineering

Abstract

The Internet of Things (IoT) consists of small, sensing, network enabled computing devices which can extend smart behaviour into resource constrained domains. This thesis focuses on evaluating the viability of Linux containers in relation to IoT devices. Three research questions are posed to investigate various aspects of this. (1) Can any guidelines and best practices be derived from creating a Linux container based security enhanced IoT platform? (2) Can the LiCShield project be extended to build dynamic, default deny seccomp configurations? (3) Are Linux containers viable on IoT platforms in regards to operational performance impact? To answer these questions, a literature review was conducted, research gaps identified and a research methodology selected. A Linux-based container platform was then created in which applications could be run. Experimentation was conducted on the platform and operational measurements collected. A number of interesting results were produced during the project. In relation to the first research question, it was discovered that the LXC templating code created could probably benefit other Linux container projects as well as the LXC project itself. Secondly, it was found that a robust, layered containerized security architecture could be created by utilizing basic container configurations and by drawing from best practices from LXC and Docker. In relation to the second research question, a proof of concept system was created to profile and build dynamic, default deny seccomp configurations. Analysis of the system shows that the developed method is viable. In relation to the final research question, container overhead with regards to CPU, memory, network I/O and storage was measured. In this project, there was no CPU overhead and only a slight performance decrease of 0.1 % on memory operations. With regards to network I/O, a speed decrease of 0.2 % was observed when a container received data and utilized NAT. On the other hand, while the container was sending data, a speed increase of 1.4 % was observed while the container was operating in bridge mode and an increase of 0.9 % was observed while utilizing NAT. Regarding storage overhead, a total of 508 KB base overhead was added to each container on creation. Due to these findings, the overhead containers introduce is considered negligible and containers are thus deemed viable on IoT devices.

Contents

1 Introduction
  1.1 IoT Classification
  1.2 Powerful IoT Use-Cases
  1.3 Introduction to Containers
  1.4 Problem Statement
  1.5 Research Questions
  1.6 Research Objectives
  1.7 Expected Contributions
  1.8 Delimitations
  1.9 Thesis Outline

2 Background
  2.1 Linux-Based Containers
  2.2 Linux Kernel Features
    2.2.1 Namespaces
    2.2.2 Root File System
    2.2.3 Cgroups
    2.2.4 Capabilities
    2.2.5 Secure Computing Mode
    2.2.6 Linux Security Modules

3 Related Work
  3.1 Container Security Studies
  3.2 Container Performance Studies
  3.3 Comparative Study
  3.4 Research Gap

4 Research Methodology
  4.1 Methodology Implementation
    4.1.1 Problem Identification and Motivation
    4.1.2 Definition of the Objectives for a Solution
    4.1.3 Design and Development
    4.1.4 Demonstration
    4.1.5 Evaluation
    4.1.6 Communication

5 Design
  5.1 Phase One - IoT Container Platform
  5.2 Phase Two - Dynamic Seccomp Profiling
  5.3 Phase Three - Container Performance Measurements
  5.4 Project Overview and Planning

6 Implementation
  6.1 Phase One - IoT Container Platform
    6.1.1 Base Operating System
    6.1.2 LXC Container Platform
    6.1.3 UTS Namespace
    6.1.4 Networking Namespace
    6.1.5 Mount Namespace
    6.1.6 Root File System
    6.1.7 Cgroups
    6.1.8 Capabilities
    6.1.9 Secure Computing Mode
  6.2 Phase Two - Dynamic Seccomp Profiling
    6.2.1 First Iteration
    6.2.2 Second Iteration
    6.2.3 Third Iteration
  6.3 Phase Three - Container Performance Measurements

7 Results
  7.1 Phase One - IoT Container Platform
    7.1.1 Base Operating System
    7.1.2 LXC Container Platform
    7.1.3 UTS Namespace
    7.1.4 Networking Namespace
    7.1.5 Mount Namespace
    7.1.6 Root File System
    7.1.7 Cgroups
    7.1.8 Capabilities
    7.1.9 Secure Computing Mode
  7.2 Phase Two - Dynamic Seccomp Profiling
  7.3 Phase Three - Container Performance Measurements
    7.3.1 CPU & Memory Operation Measurements
    7.3.2 Network Measurements

8 Discussion

9 Conclusions
  9.1 Future Work

Chapter 1

Introduction

The term Internet of Things was, according to Sundmaeker et al. [1], first mentioned by the founders of the MIT Auto-ID center. The term was then picked up by various news organizations. In 2005, the International Telecommunications Union (ITU) published a report on the IoT concept and its meaning. Sundmaeker et al. [1] summarize the ITU's report on the meaning of IoT:

”The ITU report adopts a comprehensive and holistic approach by suggesting that the Internet of Things will connect the world’s objects in both a sensory and intelligent manner through combining technological developments in item identification (”tagging things”), sensors and wireless sensor networks (”feeling things”), embedded systems (”thinking things”) and nanotechnology (”shrinking things”).”

In essence, the definition boils down to relatively small devices (compared to the more common traditional computers) which have the ability to communicate with other devices. The smart devices that constitute the IoT have steadily been making their way into society and will most likely continue to do so. Aerospace, automotive, telecommunication, housing, medical, agriculture, retail, processing industries and transportation are just some examples of industries that have seen adoption of IoT [1].

A diverging aspect of the IoT concept compared to traditional computing is its inherent heterogeneous nature. IoT devices can range from small identification tags (RFID), simple sensors and actuators to smart and more powerful, distributed intelligence devices (or even a mixture of all of them). These different types of devices must often interact with each other to provide a meaningful function. One example of how these various heterogeneous devices could work together to form a system is given by Atzori et al. [2]. In their health care example, tracking, identification/authentication, data collection and sensing are used in conjunction in order to provide the services needed in the domain. Tracking with the help of RFID systems could for example be useful to identify the location of a patient. Similarly, identification could be used to associate medication with the patient, and data collection could be performed by an intermediary device to which sensor nodes that keep track of the patient's status are attached.

1.1 IoT Classification

Bormann et al. [3] have proposed a set of terms to be used in the IoT domain. This thesis makes use of some of these terms to facilitate a common ground for the definition of the work presented. In their paper, Bormann et al. define a ”constrained node” by comparing it to a node operating on the Internet. In this context, an ”Internet node” could for example be a server or a laptop. In contrast to an Internet node, a constrained node is one that costs less and/or exhibits ”physical constraints on characteristics such as size, weight, and available power and energy” [3]. These constraints lead to lower expectations of the constrained device. Bormann et al. recognize that this definition is not very rigorous. It does however offer a relativistic definition that will always point toward significantly less powerful devices than what is currently the state of the art technology used in Internet nodes. Bormann et al. [3] further exemplify the definition of a constrained node by providing a list of typical facets exhibited:

• constraints on the maximum code complexity (ROM/Flash)

• constraints on the size of state and buffers (RAM)

• constraints on the amount of computation feasible in a period of time (”processing power”)

• constraints on the available power

• constraints on user interface and accessibility in deployment (ability to set keys, update software, etc.)

The first two items in this list, ”code complexity (ROM/Flash)” and ”size of state and buffers (RAM)”, are used to define three specific classes of constrained nodes in order to further extend and clarify the definition. Bormann et al. provide a table of the different classes.

Table 1: Classes of Constrained Devices [3]

Name | data size (e.g., RAM) | code size (e.g., Flash)
Class 0 | << 10 KiB | << 100 KiB
Class 1 | ∼ 10 KiB | ∼ 100 KiB
Class 2 | ∼ 50 KiB | ∼ 250 KiB

As can be seen in table 1, a constrained node differs significantly from an Internet node in terms of memory capacity. This table is constructed based on ”commercially available chips and design cores for constrained devices” [3]. Bormann et al. note that the boundaries of table 1 will change over time as technology progresses, and maybe they already have, since this thesis is written three years after [3] was published. The advent of ”single board computers”, such as the Raspberry Pi [4], could very well have skewed the table, although this is just speculation.

Class 0 devices are defined as very constrained sensor nodes that collect data from their sensor(s) and forward it to other devices. Class 0 devices will, under most circumstances, not be able to communicate securely with other devices via the Internet directly. Instead they rely on IoT proxies and/or gateway devices to forward their data. Meta-communication with class 0 devices, such as remote management, is virtually non-existent.

Class 1 devices provide a certain increase in the ability to communicate with other devices. Typically, they are not able to communicate employing popular communication protocols such as HTTP. However, class 1 devices can utilize specific protocols tailored for IoT devices, such as the Constrained Application Protocol (CoAP) [5]. This technically allows class 1 devices to communicate directly with other devices via the Internet. It should be noted however that these IoT-enabling protocols might not be as widely supported as the more popular Internet protocols. As such, a proxy or gateway-type device might still be needed to facilitate communication of the data.

Class 2 devices are less constrained and are usually capable of implementing most protocols used by regular Internet nodes. Bormann et al. note that even though class 2 devices often are capable of implementing higher level protocols like HTTP, it might not always be beneficial to use them, since a less demanding protocol could free up resources for other tasks. Therefore, even class 2 devices could need some form of proxy or gateway to facilitate the transmission of data. Bormann et al. note that there exist constrained devices beyond the class 2 definition which could easily implement and use the standard protocols used by Internet nodes. They do not go further in their definition, however, since devices beyond the class 2 definition ”are less demanding from a standards development point of view” [3] in regards to protocol utilization.

Even though Bormann et al. end their classification of constrained devices at class 2, there are plenty of more powerful IoT devices, both in terms of available memory (ROM/flash and RAM) and in terms of computational power and energy requirements. When IoT is described in literature, an architectural model consisting of three main layers is often referred to (figure 1). The model consists of a Perception layer [6] (also referred to as the sensing layer [7] or the ”physical perception” layer [8]), a Network layer [6] and an Application layer [6]. In relation to these layers, class 0-2 devices are typically located in the perception layer, as can be seen in figure 1.

Figure 1: IoT Layers Overview (Application, Network and Perception layers with example devices; the project focus is indicated)

The perception layer consists of physical devices used for identification, collecting information or otherwise acting in the physical world. Examples are: RFID tags/sensors, barcode/fingerprint scanners, IR sensors, ”smart cameras”, hearing aids, temperature sensors, GPS receivers, etc. Data to and from these devices are exchanged with the network layer. It's the responsibility of the network layer to transfer the data to the application layer. The network layer usually employs well known communication protocols such as Ethernet, wifi, cellular network protocols (GSM, UMTS, LTE, etc), Bluetooth, infrared protocols, etc. Finally, the application layer uses the data in some fashion to perform a useful function. This can for example be home automation (camera surveillance, remote thermostat settings, etc), smart electric power grids, traffic management etc. [7, 8, 9, 10, 11, 12]. The borders between the layers can arguably be fuzzy or at least challenging to interpret depending on how a specific IoT system is viewed.

1.2 Powerful IoT Use-Cases

As mentioned, more powerful IoT devices can be, and have been, used for a number of applications. Gill et al. [13] have developed a prototype system for delivering emergency information to elderly people. The project utilizes the Raspberry Pi [4] platform as the IoT device for receiving and delivering the information.

Another example of how a more powerful IoT device can be utilized is given by Shah & Haradi [14]. In their project, they build a biometric system backed by a cloud service. The back end communication is secured by strong encryption standards, in this case RSA and AES-256. This is a fairly good example of how a more powerful IoT device must be used in order to achieve the desired outcome of a project.

On the topic of biometrics, Sapes & Solsona [15] have built a proof of concept fingerprint scanner system utilizing the Raspberry Pi in conjunction with a small fingerprint reader (a class 0-2 device). The system is supported by client side devices such as a desktop PC or a mobile phone. The Raspberry Pi powered fingerprint scanner can thus exploit the various popular protocols and techniques for providing the back-end communication.

In another project, Vujović & Maksimović [16] utilize the Raspberry Pi as a sensor web node to detect fire in a home automation scenario. The Raspberry Pi was evaluated and chosen for its low cost, serviceability and support of a large number of peripherals.

Vemi & Panchev [17] utilize the flexibility of the Raspberry Pi to demonstrate various wifi penetration methods. In their paper, they present a technique in which the low powered Raspberry Pi is carried as a drone payload. The Raspberry Pi then probes and harvests wifi credentials of the surrounding environment.

Another project, by Ansari et al. [18], utilizes the Raspberry Pi as a physical intrusion detection system by detecting motion in a video stream provided by a camera. When the system detects motion, the images are sent to the cloud. In the event of a connection failure to the cloud, the system stores the images locally until the connection becomes available again. This research points to two interesting aspects which require a more powerful IoT device. To be able to send the images to the cloud, an implementation of the Internet protocol stack is required. Additionally, the storage space required for saving images from a camera locally can be rather high. Both of these requirements demand a device of a higher classification than class 2 devices can provide.

Finally, Sicari et al. [19] present a project in which a policy enforcement system is evaluated on a Raspberry Pi. The system is rather complex and can manage multiple client nodes. Since the memory requirements can be fairly high with many clients, the usage of a more powerful IoT device seems like a good fit.

Raspberry Pi, with its recent 10 million devices sold [4], is arguably one of the most successful vendors in the more powerful IoT device segment. A simple survey, however, reveals a wide variety of similar devices with varying specifications filling the market. The Snapdragon devices from Qualcomm [20], and the Edison compute module [21] and the Galileo board [22] from Intel, are some examples of big industry names investing in the market of more powerful IoT devices. Other examples include the Orange Pi [23], Beagleboard [24], Arduino Yún [25], Banana Pi [26], C.H.I.P [27], the UDOO boards [28] and the VoCore devices [29].

As seen so far in this section, there is a need for more powerful IoT devices, above the class 2 definition. The use-cases exemplified require more processing power, more RAM for performing calculations and storing application run-time data, larger non-volatile storage, access to other types of peripherals, as well as access to more commonly used communication protocols.

A related aspect is that since a more powerful IoT device shares many of the characteristics of a regular Internet node, the device is capable of exploiting the same (or similar) operating system services, software stacks and communication protocols, although in some cases perhaps to a lesser extent than an Internet node could. In essence, the more powerful IoT devices help to bridge the gap between class 0-2 devices and Internet nodes.

Since powerful IoT devices can run more complex software and implement common communication protocols used on the Internet, the security aspects of these devices could be viewed as more in tune with Internet nodes. This means that if a vulnerability is found in a piece of software utilized by both an Internet node and a more powerful IoT device, it stands to reason that the IoT device could also be affected by the vulnerability. An example could be a vulnerability in a web server used both by an IoT device as well as an Internet node. It's worth mentioning that security related incidents involving IoT devices do occur and that their impact can be rather significant [30].

These are some reasons why it could be interesting to focus on, and attempt to create, a more secure environment on these more powerful IoT devices. Therefore, this thesis will focus on IoT devices above the class 2 constraint but below the traditional Internet node (see figure 1 for the project focus). As has been shown, there are a number of applications for IoT devices in this segment and a fairly rich market of devices fitting into this more powerful category.

1.3 Introduction to Containers

The concept of containers has been around for a long time in computer science. The main idea centers around isolating and limiting the ”view” of an individual process in an operating system. Although the word ”container” is a relatively new name, other terms have been used to describe the same principles, or in some cases a subset of the principles. The chroot syscall for example, implemented in the late 1970s, is one example of how a process can be isolated in terms of its view of the file system. The chroot syscall will allow a process to change the perceived root of the file system, thereby limiting the files and directories the process can access.

Containers in general, and Linux containers in particular, use a wide variety of different techniques to limit the view of processes, covering not only the file system but also access to other processes, communication, syscalls, networking, etc. By isolating processes in this manner, it's not only possible to protect processes from each other, but other benefits follow as well, such as the ease of creating homogeneous environments or moving processes between physical computers.

1.4 Problem Statement

As mentioned earlier, powerful IoT devices capable of running operating systems, for example a Linux-based distribution, can utilize many of the same programs, software libraries and services as Internet nodes can. Since security challenges facing Internet nodes could also be present on IoT devices running the same software, there is a need to protect the system against various forms of attacks.

Since IoT devices are constrained in various ways, security features need to be tailored so as not to cause various forms of excessive overhead on the system. Additionally, needless interference from the security features with the applications running on the device is generally not desired. Containers have recently been considered mature enough to secure applications by isolating parts of the operating system. However, there are no detailed studies surrounding which subsystems are viable to use on IoT devices. Regarding operational performance of Linux containers on IoT devices, some research has been conducted, but only on a limited set of IoT devices, and some results regarding network performance are inconsistent.

Certain aspects of Linux containers can be challenging to configure in an optimal way. Subsystems such as Linux security modules (LSM) and seccomp offer fine-grained control over resources. As such, they can potentially be very powerful if configured in an optimal way. However, manually configuring these subsystems is infeasible due to both the dynamic nature of containers' resource access requirements (access to different resources is required depending on what application(s) are running in a container) and the amount of options and volume of resources to cover. Projects to remedy the complexities of LSM configuration exist, but no known project or tool addresses seccomp.

1.5 Research Questions

In this thesis, the following research questions are asked:

1. Can any guidelines and best practices be derived from creating a Linux container based security enhanced IoT platform?

2. Can ideas from the LiCShield project [31] be extended to build dynamic, default deny seccomp configurations?

3. Are Linux containers viable on constrained IoT platforms with regards to operational performance impact?

1.6 Research Objectives

The main objective of this research project is to construct a security enhanced computing platform for software applications running on IoT devices. This will be accomplished by selecting an IoT platform capable of running Linux, preferably one which has had little exposure to previous scientific research. A small Linux based operating system capable of supporting containers will then be built for the selected IoT platform. On top of this, a container software suite will be installed and customized to work with the selected platform, and an attempt will be made to build a custom IoT-centric container configuration.

Once the platform is built, the various isolation mechanisms of the containers will be tested and measured with regards to both security and performance. A system for creating dynamic and highly customized seccomp configurations will then be built and evaluated.

1.7 Expected Contributions

By the end of the project, it's expected that the experience gained from building the initial operating system hosting the containers can yield knowledge to developers of IoT devices in such a way that they can build an operating system capable of supporting Linux containers.

Once the host operating system is built, a container construction system will be built for use on the host system. It is expected that the construction system can draw from best practices as well as analysis to create security enhanced containers, such that they are isolated in a variety of aspects as well as constrained to only allow functions within the application's normal operating parameters.

Finally, it's expected that the current knowledge of utilizing Linux containers to support various security efforts on Linux capable IoT devices is expanded. At the end of this project, the testing and analysis done to determine the viability of Linux containers on IoT devices should extend the current knowledge of how Linux container technologies can be used on IoT devices. Hopefully, the knowledge produced could further the understanding of Linux container security in general as well as validate the techniques for use on IoT devices.

1.8 Delimitations

This project will investigate Linux-based container technologies and how they can be implemented to secure IoT devices. Therefore, the IoT device used must be capable of running the Linux kernel and a basic set of operating system utilities. Only one Linux-based IoT device will be used during this project.

The project will only consider security enhancements provided by the Linux kernel container implementations, even though other methods to secure a system are available. The container security is evaluated by directly testing the implemented features against the threats the features are supposed to protect against. No effort is made to try to circumvent the security system itself.

Some Linux container subsystems will not be considered in this project, mainly due to project time constraints. The recently implemented user namespace will not be included in this project, neither will the PTS namespace. Furthermore, Linux security modules will not be considered and only the device restriction subsystem of cgroups will be configured. Other cgroups controls will not be utilized.

1.9 Thesis Outline

In this section, an outline of the thesis is presented. In chapter 2, a background of the various container technologies used in Linux based operating systems is presented. In chapter 3, a literature review pertaining to the research questions and the problem definition is presented. In chapter 4, the argumentation for the selected research methodology as well as its implementation is shown. Chapters 5 and 6 present the method used to complete the research project, such as software & hardware used, research protocols and software development processes. In chapter 7, experimentation results are presented. Chapter 8 discusses and evaluates the results. Finally, chapter 9 draws conclusions from the research project and presents future work.

Chapter 2

Background

2.1 Linux-Based Containers

The Linux kernel provides a number of features which, when used in conjunction, form the basis for enabling containers on Linux-based operating systems. However, the kernel features alone are useless without tools that leverage them. Docker [32] and ”Linux Containers” (LXC) [33] are two examples of user space software which use the Linux kernel features to provide tools for creating, running and managing containers. Note that the user space software ”Linux Containers” (often referred to as LXC) can be confused with the general Linux kernel components which provide the features for implementing containers on Linux. To avoid this potential confusion, this thesis will use the term LXC to refer to the user space software that leverages the container related features in the kernel to implement containers on Linux-based operating systems.
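To make the role of this user space tooling concrete, the sketch below uses the liblxc C API (the library behind the LXC command line tools) to create and start a container. It is an illustrative sketch rather than part of the thesis implementation: the container name "demo" and the busybox template are assumptions, the program must be linked with -llxc, and it requires sufficient privileges to create containers.

    #include <stdio.h>
    #include <stdlib.h>
    #include <lxc/lxccontainer.h>

    int main(void)
    {
        /* Obtain a handle to a (possibly not yet existing) container named "demo". */
        struct lxc_container *c = lxc_container_new("demo", NULL);
        if (c == NULL) {
            fprintf(stderr, "failed to allocate container handle\n");
            return EXIT_FAILURE;
        }

        /* Create a minimal root file system using the busybox template
         * (template availability depends on the local LXC installation). */
        if (!c->is_defined(c) &&
            !c->createl(c, "busybox", NULL, NULL, LXC_CREATE_QUIET, NULL)) {
            fprintf(stderr, "failed to create container rootfs\n");
            lxc_container_put(c);
            return EXIT_FAILURE;
        }

        /* Start the container's init process and report its state. */
        if (!c->start(c, 0, NULL)) {
            fprintf(stderr, "failed to start container\n");
            lxc_container_put(c);
            return EXIT_FAILURE;
        }
        printf("container state: %s\n", c->state(c));

        c->shutdown(c, 30);      /* give the container 30 seconds to stop cleanly */
        lxc_container_put(c);    /* drop the reference to the handle */
        return EXIT_SUCCESS;
    }

Docker drives the same underlying kernel features, but through its own daemon and client rather than a C library linked into the application.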

The core idea of containers in general is to provide isolation between processes. A process running inside a container only has access to the resources specified for that specific container. The process should, in general, not be able to access any aspects of other containers nor areas of the operating system not specifically made available to the container. This core concept provides a number of potential advantages. The environment of a container could easily be replicated and/or moved between systems, and the isolation could provide an additional means of security. In figure 2 a schematic view of a system running two containers is shown. The Linux kernel and operating system environment are situated on top of the hardware platform, as is the case in general computing. However, a system utilizing containers will have a layer of user space tools responsible for managing containers. This example shows two containers, each isolating its own environment. It's assumed that a process running in container 1 can't access any resources (unless explicitly allowed) in container 2 nor in the host operating system.


Figure 2: Schematic Overview of Linux Containers

2.2 Linux Kernel Features

In this section, a brief overview of the various Linux kernel features which provide the mechanisms for containers is presented. Each subsection ends with potential security related use-cases within containers.

2.2.1 Namespaces

Namespaces is a general name for when a global kernel resource has been provided with isolation. Processes accessing a namespaced resource therefore cannot affect the same resource in other namespaces [34]. Table 2 summarizes the kernel resources that use namespaces.

Table 2: Linux Namespaces [34]

Namespace | Isolates
IPC | System V IPC, POSIX message queues
Network | Network devices, stacks, ports, etc.
Mount | Mount points
PID | Process IDs
User | User and group IDs
UTS | Hostname and NIS domain name

A container utilizing all of the namespaces in table 2 would protect processes in containers from a variety of potential malicious activities. The IPC and network namespacing could hinder snooping of process- and network data, the mount namespace could limit access to file systems, the PID namespace could prevent information leakage of processes and the user namespace could further hinder unauthorized access to resources outside of the container. The UTS namespace might not have a direct security related application but could rather serve to further protect the integrity of the system. Figure 3 shows an example of the PID namespace to illustrate the general operation of a namespaced resource. In this example, the process(es) in container 1 belong to PID NS 1 and the process(es) in container 2 belong to PID NS 2. Information about the processes in PID NS 1 can't be accessed by container 2 and vice versa.


Figure 3: Schematic view of the PID namespace
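To illustrate how such a namespace is requested from the kernel, the following minimal C sketch (illustrative only, not taken from the thesis implementation; it requires root privileges, and the hostname string is an arbitrary example) clones a child process into new PID and UTS namespaces:

    #define _GNU_SOURCE
    #include <sched.h>
    #include <signal.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/wait.h>
    #include <unistd.h>

    static char child_stack[1024 * 1024];   /* stack for the cloned child */

    /* Child entry point: runs inside the new PID and UTS namespaces. */
    static int child_fn(void *arg)
    {
        (void)arg;
        sethostname("container-demo", 14);  /* only affects this UTS namespace */
        printf("child sees its PID as %d\n", (int)getpid());  /* prints 1 */
        return 0;
    }

    int main(void)
    {
        /* CLONE_NEWPID and CLONE_NEWUTS request new PID and UTS namespaces. */
        pid_t pid = clone(child_fn, child_stack + sizeof(child_stack),
                          CLONE_NEWPID | CLONE_NEWUTS | SIGCHLD, NULL);
        if (pid == -1) {
            perror("clone");
            return EXIT_FAILURE;
        }
        waitpid(pid, NULL, 0);              /* reap the namespaced child */
        return EXIT_SUCCESS;
    }

Container runtimes such as LXC combine several such flags (including the mount, network and IPC variants) when creating a container's first process.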

2.2.2 Root File System

The Linux kernel has for a long time supported two syscalls (pivot_root and chroot) used to change the current root file system. These syscalls are a vital part of most boot-up sequences in Linux. The chroot syscall has also been used to isolate the file system view of a process in order to provide security. Linux containers make use of these syscalls to isolate the file system view of processes inside the container. By doing so, each container has its own, private file system and can't interfere with other containers' files, nor with the host file system. These concepts are illustrated in figure 4.


Figure 4: Schematic View of Private Filesystems
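A minimal sketch of the underlying mechanism is shown below. It is illustrative only: container runtimes such as LXC normally use pivot_root together with a mount namespace rather than a bare chroot, the path /srv/rootfs is a hypothetical pre-populated root file system, and the program must run as root.

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main(void)
    {
        const char *new_root = "/srv/rootfs";   /* hypothetical container rootfs */

        if (chroot(new_root) == -1) {           /* change the perceived root */
            perror("chroot");
            return EXIT_FAILURE;
        }
        if (chdir("/") == -1) {                 /* ensure the cwd is inside the new root */
            perror("chdir");
            return EXIT_FAILURE;
        }

        /* From here on, "/" refers to /srv/rootfs on the host; paths outside
         * the new root can no longer be reached through normal path lookups. */
        execl("/bin/sh", "sh", (char *)NULL);   /* assumes the rootfs provides /bin/sh */
        perror("execl");
        return EXIT_FAILURE;
    }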

2.2.3 Cgroups

Cgroups are a way of controlling access to, and usage of, various system resources [35]. CPU, memory, disk I/O, network I/O and device files are some of the most common and useful examples of hardware utilization that can be controlled via cgroups.

A container that has been set up to only allow a certain amount of memory usage as well as a limited number of CPU cycles could protect the entire system (as well as other containers running on the system) from a denial of service attack by a potentially malicious user. Even if a CPU- and/or memory intense process is run, the kernel will throttle the process's CPU time by restricting the CPU and will only allow the process to allocate a defined amount of memory. Other processes in the container will contend for the limited resources, but at least other processes outside of the container will not be affected. In figure 5 a schematic overview of cgroups is shown. In this example, the process in the container is trying to access the /dev/tty1 device file. The success of the operation depends on how the cgroups subsystem has been configured for the particular container.


Figure 5: Schematic Overview of Cgroups
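As a rough sketch of how such a limit is applied, the program below creates a memory cgroup, caps it at 64 MiB and moves itself into it. It assumes a cgroup v1 hierarchy mounted under /sys/fs/cgroup, root privileges and an illustrative group name; it is not the mechanism LXC itself uses, which instead applies cgroup limits through entries in its container configuration file.

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/stat.h>
    #include <sys/types.h>
    #include <unistd.h>

    /* Write a single string into a cgroup control file. */
    static int write_value(const char *path, const char *value)
    {
        FILE *f = fopen(path, "w");
        if (f == NULL) {
            perror(path);
            return -1;
        }
        fprintf(f, "%s\n", value);
        fclose(f);
        return 0;
    }

    int main(void)
    {
        /* Illustrative cgroup v1 paths; requires root and a mounted hierarchy. */
        const char *grp = "/sys/fs/cgroup/memory/demo-container";
        char path[256], pid[32];

        mkdir(grp, 0755);                              /* create the cgroup */

        snprintf(path, sizeof(path), "%s/memory.limit_in_bytes", grp);
        write_value(path, "67108864");                 /* cap memory at 64 MiB */

        snprintf(path, sizeof(path), "%s/tasks", grp);
        snprintf(pid, sizeof(pid), "%d", (int)getpid());
        write_value(path, pid);                        /* move this process into it */

        /* Allocations beyond 64 MiB by this process (or its children) will now
         * be reclaimed or OOM-killed by the kernel instead of exhausting the host. */
        return EXIT_SUCCESS;
    }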

2.2.4 Capabilities

Capabilities provide a way to partition and define access rights to various privileged operations or groups of operations; these can then be assigned to various system request mechanisms. Capabilities have been available since version 2.2 of the kernel [36]. Implementations of Linux containers can take advantage of capabilities and thus limit the privileged kernel operations the container has access to.

A container could, for example, block access to the kernel interface for process tracing (ptrace). Disallowing process tracing could block a malicious user from (among other operations) reading memory regions of arbitrary processes. This could strengthen the integrity of processes within a container. A schematic overview of capabilities is shown in figure 6. If the process in the figure were to attempt to call ptrace (request process tracing) on any process, the success of the operation would depend on how the capabilities subsystem was configured in the particular container.


Figure 6: Schematic view of Capabilities
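A minimal sketch of the mechanism is shown below; it drops CAP_SYS_PTRACE from the calling process's bounding set so that neither it nor its children can regain the capability. It is illustrative only (LXC and Docker drop capabilities through their configuration rather than direct prctl calls), and removing entries from the bounding set itself requires the CAP_SETPCAP capability.

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/prctl.h>
    #include <linux/capability.h>

    int main(void)
    {
        /* Remove CAP_SYS_PTRACE from the bounding set: once dropped, this
         * process and all of its descendants can never reacquire it, which
         * blocks ptrace-based inspection of other processes' memory. */
        if (prctl(PR_CAPBSET_DROP, CAP_SYS_PTRACE, 0, 0, 0) == -1) {
            perror("prctl(PR_CAPBSET_DROP)");
            return EXIT_FAILURE;
        }

        /* Hand over to the confined workload; /bin/sh is just a placeholder. */
        execl("/bin/sh", "sh", (char *)NULL);
        perror("execl");
        return EXIT_FAILURE;
    }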

2.2.5 Secure Computing Mode

The secure computing mode (seccomp) component is essentially a way for the kernel to restrict which syscalls a process is allowed to execute. Seccomp can operate in two modes, strict mode and filter mode [37]. In strict mode only four basic syscalls (read, write, exit and sigreturn) are allowed to be executed, and its use-case is fairly limited. In filter mode, a list of allowed or disallowed syscalls can be constructed, restricting which syscalls the process can invoke.

By utilizing a seccomp list, a process within a container can be heavily constrained in what privileges it has access to. For example, a container running a web server would typically not need to load and execute kernel modules. If a malicious user were to circumvent the security of the web server and gain access to the container, and the syscall for loading kernel modules is blocked in the container, the attacker is at least blocked from loading modules into the kernel on the system. Figure 7 illustrates how a seccomp filter can be configured to intercept syscalls made by a container process in an effort to evaluate whether they are allowed or not.


Figure 7: Secure Computing Mode Operations in Relation to Containers
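As a sketch of what a default deny filter looks like in code, the example below uses the libseccomp user space library (linked with -lseccomp) rather than hand-written BPF; the tiny allow list of only write and exit_group is an assumption made for the example and is far smaller than any real application would need.

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <seccomp.h>

    int main(void)
    {
        /* Default action: terminate the program on any syscall not explicitly allowed. */
        scmp_filter_ctx ctx = seccomp_init(SCMP_ACT_KILL);
        if (ctx == NULL) {
            perror("seccomp_init");
            return EXIT_FAILURE;
        }

        /* Minimal allow list for this toy example: write and exit_group only. */
        seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(write), 0);
        seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(exit_group), 0);

        if (seccomp_load(ctx) < 0) {        /* install the BPF filter in the kernel */
            seccomp_release(ctx);           /* no filter installed, safe to clean up */
            return EXIT_FAILURE;
        }
        /* The context is intentionally not released after a successful load:
         * with such a small allow list, even a munmap issued by free() would
         * be treated as a violation. */

        write(STDOUT_FILENO, "still alive\n", 12);   /* allowed */
        /* Any other syscall (e.g. opening a file) would now terminate the program. */
        _exit(0);
    }

LXC can load a comparable allow or deny list from a seccomp policy file referenced in the container configuration, which is the style of configuration this thesis builds on.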

2.2.6 Linux Security Modules

Linux security modules (LSM) is a Linux kernel framework which allows anyone to design a ”security module” that plugs into the kernel. The framework enables the kernel to support a wide variety of security add-ons in a unified way. At the time of this writing, both LXC and Docker support SELinux [38] and AppArmor [39], two popular LSMs. The LSMs implement mandatory access control (MAC) to restrict access to various resources.


Figure 8: Schematic view of Linux security modules

Chapter 3

Related Work

In this chapter, a literature review of related research papers is presented. The papers are divided into two main categories: (1) security research focusing on Linux containers and (2) performance studies of Linux containers, the majority of which reference IoT-type devices.

3.1 Container Security Studies

In this section, papers focusing on security hardening using Linux container technologies are presented. As noted by Reshetova et al. [40], little research has been carried out which considers the security aspects of OS-level virtualization in general and Linux containers in particular. The studies that do exist do, however, provide a good starting point for further security research on Linux containers, particularly in the context of IoT devices.

Wessel et al. [41] have implemented a prototype security system for Android (primarily targeted towards smartphones or similar devices) in which instances of programs are isolated from each other. Their solution utilizes Linux container technologies and focuses on namespaces, control groups and Linux security modules. Their solution extends further and also offers remote administration and management of devices, encryption and tunneling of network traffic as well as storage encryption. The solution does produce some overhead and degrades the battery life of the devices by 7.5 % in a worst case scenario. The study is interesting as it hints at the viability of utilizing Linux container technologies which could, in conjunction with other mechanisms, provide a system which improves the security of a device. Wessel et al. have focused on three main aspects in their work: isolation, communication, and storage. Isolation is provided between programs so they can't affect each other, and encryption is used to secure all communication and storage needs of the program. The study was carried out on a Nexus One and a Samsung Galaxy S3 smartphone.

In a paper by Miller & Chen [42], the security aspects of LXC combined with SELinux are studied. The paper analyses how SELinux is able to protect deployed containers against file traversal attacks. The study concludes that SELinux is a great addition to LXC and that proper use of it will reduce the attack surface of containers.

Mattetti et al. present a rather extensive paper on their LiCShield framework [31]. LiCShield was built to support automatic configuration of AppArmor or SELinux security rule sets. The framework works by analyzing the container behavior and recording the results of the activities within the container. Once a container has run for a specified amount of time, the rule set is created from the data gathered and the container can then be deployed with the configured AppArmor or SELinux rules. The results from the study show that the method is fairly reliable and works under most circumstances. There are two major issues facing the LiCShield framework: 1) glob pattern processing and related issues (such as generated file names) and 2) the inherent difficulty in determining when a program has run through its entire code base and thus performed the operations that should be white listed. Mattetti et al. have managed to show that their approach is viable and that future work might make the solution even better.

Reshetova et al. [40] did a fairly comprehensive security related study of OS-level container technologies (not only Linux based solutions). The main focus of the study was on the level of isolation provided by the various container technologies. Isolation is viewed in two ways by the authors: 1) how well the container is isolated from other containers and 2) how well a container is isolated from the host. The main conclusion of the study is that although work on OS-level container technology originated in the FreeBSD and Solaris operating systems, ”Linux has caught up in terms of features and the flexibility of the implementation” [40]. It should be noted that this paper is old enough to draw some conclusions which are not necessarily true today. One of them is that the implementation of cgroups is incomplete. Since the state of the Linux kernel is ever changing, it's hard to assert that any given feature is ”complete”; however, cgroups are at least more complete at the time of this writing than they were when [40] was published. Reshetova et al. conclude that Linux containers were the most mature container solution.

Bui [43] has conducted a study investigating the security of Docker. The paper focuses on two main aspects: ”(1) the internal security of Docker, and (2) how Docker interacts with the security features of the Linux kernel, such as SELinux and AppArmor, in order to harden the host system” [43]. The study lists the various security features provided to Docker via the underlying Linux container technologies. The study concludes that containers are ”fairly secure” [43] out of the box and that, combined with other Linux based security tools, they can provide even greater levels of security.

A paper by Borate & Chavan [44] highlights how sandboxing can act as an important part of securing a computing device. By introducing separation and isolation between processes, a higher resistance against cyber attacks is achieved. Borate & Chavan note that sandboxing mechanisms are useful for a wide variety of computing applications and that they could strengthen the operating system security regardless of what applications the system serves. Examples given by Borate & Chavan are smartphones, desktops, and the cloud.

It's worth mentioning that there is emerging research which tries to utilize available security measures to detect anomalous behavior. In a paper by Abed et al. [45], a method for detecting anomalous behavior in a container is extended. This is achieved by tracing the execution of container programs and constructing a system for detecting if the programs running in the container behave as predicted. Their results show promise and could in the future be used to further strengthen the security of virtualized environments, such as containers.

A paper by Gantikow et al. [46] studies the viability of utilizing Linux containers in High Performance Cluster (HPC) environments as opposed to hypervisor based virtualization. The study focuses on the performance overhead as well as the security aspects of Linux containers. The study finds that Linux containers are a viable option in both respects. Additionally, they found that Linux containers are still not considered as secure as hypervisor based virtualization. However, by including and using proper configuration of all Linux container components, containers can be strengthened to the point where they can compete with, and even be more secure than, an equivalent environment not using containers.

3.2 Container Performance Studies

In this section, papers evaluating the performance of Linux container technologies are presented. Papers focusing on IoT devices are prioritized. There exists a healthy amount of studies where the authors have studied the effect of Linux containers in various scenarios. Benchmarking is done either by specific benchmarking tools and/or by measuring the effects containers have on typical (IoT) applications. Most of the performance measurements focus on rather powerful IoT devices.

Krylovskiy [47] has analyzed the performance of Linux containers utilizing Docker [32] in various benchmarking setups on the Raspberry Pi 2 and the Raspberry Pi Model B+ [4]. Using tools designed to test performance of various computing aspects, Krylovskiy finds that both platforms perform well in Docker containers. The aspects measured were CPU, memory, disk I/O and network I/O. Krylovskiy did however uncover that when Docker was run with a networking namespace employing network address translation (NAT), the efficiency of network bandwidth was reduced considerably. This was solved by running the Docker container without the networking namespace isolation. It should be noted however that by bypassing the network namespace and operating the container directly on the host network layer, security is sacrificed [47]. Krylovskiy also evaluates the performance of two real-world IoT applications, a Message Queue Telemetry Transport (MQTT) broker called Mosquitto [48] and the LinkSmart Device Gateway [49]. These could be considered typical IoT gateway applications and are therefore good choices to evaluate. Krylovskiy found that although containers do introduce some overhead in these applications, they perform rather well. Krylovskiy notes that to be able to form a specific recommendation of using Linux containers on IoT devices, ”understanding of the IoT landscape and a case-by-case analysis are needed” [47].

A similar study to that of Krylovskiy was done by Morabito [50], where only the Raspberry Pi 2 was evaluated for its potential use as an IoT container platform. Like Krylovskiy, Morabito used Docker in conjunction with various benchmarking tools to analyze the performance of the device. The aspects measured were CPU, memory, disk I/O and network I/O. The results of the various benchmarks were similar; that is, the overhead of Linux containers was negligible. Regarding application performance, Morabito focused the analysis on two different applications, namely the MySQL server [51] and the Apache HTTP server [52]. The results from these benchmarking efforts show that the applications in question suffer negligible overhead. Morabito concludes the study by noting that the overall performance overhead is negligible on all computing aspects included in the study. Morabito notes that further benchmarking and performance analysis is needed on other computing platforms.

In a paper by Celesti et al. [53], the performance of a CoAP [5] implementation is evaluated on the Raspberry Pi B+. Celesti et al. do not use any benchmarking tools to evaluate the performance of different aspects of the Raspberry Pi B+; rather, they focus on a single metric (application response times) to evaluate the viability of Linux containers on the Raspberry Pi B+. As with the previous papers, Celesti et al. use Docker in their study. The choice of measuring only response times for the CoAP server in a container environment makes sense from a service perspective. From the point of view of a client, the processing time of the request is important. Although Celesti et al. find the overhead added by Linux containers to be acceptable, their findings do show a rather high overhead compared to running the CoAP application without containers. As has been observed by [47], the Docker NAT performance penalty could account for the rather high overhead compared to not using containers, but with no further details regarding the setup, this is only speculation.

A paper by Ismail et al. [54] investigates how Linux containers (also Docker in this case) can be used in edge computing scenarios to leverage the proximity to IoT sensors. The paper does not focus on evaluating the performance aspects of running Docker on IoT devices, but rather it investigates other criteria. These are: deployment and termination, resource and service management, fault tolerance and caching, compared to using traditional hypervisor based virtual machines. In their study, Ismail et al. find that, although there are some issues to overcome with container deployment, it's a viable platform to build edge computing services on.

In a paper by Ramalho & Neto [55], a study is presented which evaluates the Cubieboard2 [56] for use as an IoT edge node. The Cubieboard2 has similar specifications to the Raspberry Pi 2, which should make it capable of handling the various tasks performed in an IoT edge node scenario. The study compares the performance penalties of Docker versus full virtualization using KVM [57]. Aspects such as CPU, memory, disk I/O and network I/O are evaluated. Even though KVM virtualization performs rather well, it can't compete with the thin Linux container technology that powers Docker. As a result, Docker outperforms KVM in all performance benchmarks conducted by the authors.

Another effort in the study of Linux containers' viability as an IoT deployment platform is made by Morabito et al. [58]. They benchmark two different IoT devices: the Raspberry Pi 2 and the Raspberry Pi 3. Morabito et al. focus their efforts on Gateway as a Service (GaaS), which is essentially a use case for edge computing where IoT sensors can utilize the gateway to communicate their data to a back-end application. In their study they evaluate the performance of the Raspberry Pis with regards to CPU, memory, disk I/O and network I/O. Morabito et al. also introduce the measurement of energy consumption in their performance analysis. By doing this, they can determine if Linux container techniques introduce any significant overhead with regards to energy utilization. The study finds that there is a slight performance overhead to utilizing containers, but the effects are practically negligible, power consumption overhead included.

Another study, by Mulfari et al. [59], focuses on implementing a middleware solution on IoT edge nodes. In the paper, they implement the Message Oriented Middleware for Cloud (MOM4C) [60] on top of Docker containers and evaluate the system architecture. The study by Mulfari et al. is still in its early phase and it focuses on the overall system architecture to provide sensor nodes with a gateway-like service to communicate their results. The paper deems the use of Docker containers as a deployment model a success and notes that they didn't detect any significant overhead in their study. The initial findings are thus that Linux containers are a viable option to ease deployment of middleware.

Renner et al. [61] also evaluate Linux containers using Docker as a platform to deploy various resources which could be utilized by surrounding IoT devices. The vision presented by Renner et al. is edge nodes with the ability to dynamically allocate resources as they are needed by either applications or users. The study is performed on a Raspberry Pi 2B. An interesting metric evaluated in this paper is the spin-up time of containers. This factor is measured to evaluate the ability of the platform to dynamically allocate resources as they are requested. In their study, a varying number of containers are spun up and their performance measured. The study concludes that Linux containers are a viable option for providing a dynamic IoT environment with regards to performance.

In a paper by Morabito & Beijar [62], the performance impact of Linux containers using Docker on a Raspberry Pi 2 and an Odroid C1+ [63] is studied. The Odroid C1+ is a similar device to the Raspberry Pi 2; in some respects it's even more powerful. Morabito & Beijar are examining the viability of these devices as IoT gateways. This is done by measuring the overhead of CPU, memory and network I/O. Like the similar studies presented in this chapter, Morabito & Beijar note very small overhead in their performance measurements. Furthermore, they recognize the NAT overhead generated when using the networking namespace feature. They note that Docker can be run without the network namespace, but doing so increases the security risk. The study by Morabito & Beijar finds a rather high performance degradation on the Odroid C1+ compared to the Raspberry Pi 2 when using the networking namespace combined with NAT, which is an interesting find. Also, Morabito & Beijar do not note as high a performance degradation when using the networking namespace with NAT on the Raspberry Pi 2, which is also interesting as it deviates from the other studies. Morabito & Beijar note that the differences in performance probably relate to the difference in hardware between the two devices. Their conclusion is that, overall, the performance penalties for running containers on the platforms are negligible.

Bellavista & Zanni [64] explore the viability of their scalable extensions of the Kura application. Kura is an IoT gateway application which Bellavista & Zanni have deployed in a fog network. The platform they evaluate on is the Raspberry Pi 1 B+. Bellavista & Zanni also use Docker as their Linux container implementation. In their study, they focus on building a fog network to provide IoT gateway services to other IoT nodes and on deploying these services in containers. Bellavista & Zanni had to evaluate what file system to utilize in the Docker containers, as some options proved to have better performance and thus suit their application better. The authors find containers to be scalable and to introduce low overhead for their application domain.

Hajji & Tso [65] have extended a previous study they did to create a Linux container enabled Raspberry Pi cloud. In this study [65] they focus on investigating the viability of big data processing using the same Raspberry Pi cloud. Hajji & Tso use the Raspberry Pi 2B and Docker in their work. They focus on measuring the performance of Apache Spark [66] and HDFS (Hadoop Distributed File System) [67]. Hajji & Tso measure CPU, memory, network I/O, and energy consumption as they process big data jobs. They find that there is some CPU and memory overhead when operating in a container and that the network throughput decreases. It's unclear if Hajji & Tso are using networking namespaces or if the container operates in the host's network mode. Either way, the results indicate that big data processing is possible on smaller IoT devices in containerized environments and that it can be useful.

An interesting and somewhat different study was conducted by Claassen et al. [68], who measured the performance of different network drivers for Linux containers. The networking namespace can be virtualized using different drivers for the containers. Claassen et al. evaluate the performance of the veth, macvlan, and ipvlan drivers. The paper notes that container based networking efficiency is important as containers are emerging in distributed cloud environments. However, the findings in this paper could be useful in other container applications as well. The study finds that the macvlan driver is the most efficient network driver. This is coincidentally also the most secure one, as it provides the strongest isolation between containers.

To finish up this section, it could be worth noting that there are a variety of viable alternatives to the various Raspberry Pi models (which have had the predominant focus of researchers so far). Maksimović et al. have in their study [69] evaluated several different IoT-type boards comparable to the Raspberry Pi in processing power, RAM, disk size and networking bandwidth. The BeagleBone Black [70] and Udoo (Quad) [71], for example, seem comparable and could offer an alternative to the Raspberry Pi for a Linux container enabled IoT device.

3.3 Comparative Study

The papers in the literature review analyze a broad spectrum of Linux container use-cases. The security focused papers all advocate the use of Linux containers as a means of providing an additional layer of security around processes. The papers, however, range from reviewing general Linux container technologies to examining specific parts of security concepts. Due to the diverse nature of the security related studies, a comparative study was challenging to perform. However, the diversity could hint towards the flexibility of Linux containers. As can be seen in the top part of table 3, many different user space tools were used in the security analysis. Both LXC and Docker are represented, as well as the direct usage of the underlying kernel subsystems in some papers (marked N/A in the ”Container” column of table 3). Furthermore, the platforms on which the experiments were performed varied from smartphones to servers, indicating that the security features of Linux containers are viable in various hardware scenarios. Finally, in regards to the specific subsystems used, Linux security modules seem to be rather prominent in the research field. Regardless of which aspects of Linux containers are studied, the conclusions drawn are generally positive with regards to whether Linux containers increase security in the studied domain.

Performance wise, many different aspects have been studied by the various papers, such as CPU, memory, disk I/O and network I/O. These aspects have been the most popular to study. Some papers have added interesting domain specific measurements. For example, Morabito et al. [58] and Hajji & Tso [65] investigate energy overhead in containers, Celesti et al. [53] have measured application response time overhead and Renner et al. [61] measured container spin-up times. All of the results, however, showed that Linux containers have minimal/negligible impact on the measured variables. The papers by Krylovskiy [47] and Morabito & Beijar [62] note that network address translation (NAT) seems to play a major role in network I/O when utilizing containers. In general, studies utilizing Docker are over-represented and the Raspberry Pi is most commonly used in IoT-focused papers.

3.4 Research Gap

In section 3.1 various papers studying the security aspects of Linux containers were presented. In general, the research shows that containers could be used to provide a layer of protection against cyber attacks. However, no security focused studies were found which address the use of Linux containers on IoT devices. Most research found examined security aspects in cloud-like deployments. The studies closest to the IoT sphere are arguably [41] and [44], which both cover the validity of Linux containers on smartphones. In addition, no study has been found which clearly identifies which container subsystems are used, how they are configured or how they work to protect the container and the host operating system. Neither does any paper analyze container implementations to suggest guidelines or best practices for configuring a container to be as secure as possible. Examining this research gap could help answer the first research question posed in this thesis.

No paper has been found that focuses on building seccomp configurations based on the needs of the container. The LiCShield project [31] has devised a way to profile containers in an effort to create Linux security module configurations based on the specific needs of the container. A similar approach could theoretically be undertaken to build seccomp policies. The studies that do explicitly state that seccomp is used do not show how the seccomp configuration was performed, nor do they test the seccomp operations inside the container. This thesis will build upon the LiCShield project by Mattetti et al. in an effort to answer the second research question in this thesis.

None of the articles evaluates potential storage overhead caused by containers. Considering that there are IoT deployment scenarios that call for constrained IoT devices, storage space could start to play a more crucial role in the project design. Linux containers do introduce storage overhead, which could be challenging for certain IoT devices if it is too large.

Table 3: Comparative study matrix

Study | Platform | Container | Specifics | Conclusions
Wessel et al. [41] | Samsung Galaxy S3 | Custom | NS, Cgroups, LSM | Viable, reduced battery lifetime
Miller & Chen [42] | Servers | LXC | LSM (SELinux), Cgroups | Availability, performance, scalability, security
Mattetti et al. [31] | N/A | Docker | LSM (SELinux/AppArmor) | LSM profiling, increased security
Reshetova et al. [40] | N/A | N/A | Isolation features | Similar features
Bui [43] | N/A | Docker | Docker features, LSM | Fairly secure, recommend "non-privileged"
Borate & Chavan [44] | N/A | N/A | Linux kernel container features | Offers security, suggests grsec
Abed et al. [45] | N/A | Docker | Syscalls | Syscall anomaly detection
Gantikow et al. [46] | HPC | Docker | Cgroups, NS, CAP., LSM | High security, low performance overhead
Krylovskiy [47] | RPi, RPi2 | Docker | CPU, Mem, Disk, Net | Small overhead, NAT overhead
Morabito [50] | RPi2 | Docker | CPU, Mem, Disk, Net | Negligible overhead
Celesti et al. [53] | RPi | Docker | CoAP response times | Acceptable overhead
Ismail et al. [54] | N/A | Docker | Operational param. | Viable for use-cases
Ramalho & Neto [55] | Cubieboard2 | Docker | CPU, Mem, Disk, Net | Docker outperforms KVM
Morabito et al. [58] | RPi2, RPi3 | Docker | CPU, Mem, Disk, Net, Power | Practically negligible
Mulfari et al. [59] | N/A | Docker | Application (MOM4C) | Solution proved to work
Renner et al. [61] | RPi2 | Docker | Container spin-up time | Viable for a dynamic IoT environment
Morabito & Beijar [62] | RPi2, Odroid C1+ | Docker | CPU, Mem, Net | Negligible overhead, NAT overhead
Bellavista & Zanni [64] | RPi | Docker | Application (Kura) | Limited overhead
Hajji & Tso [65] | RPi2 | Docker | CPU, Mem, Net, Power | Big data processing possible
Claassen et al. [68] | N/A | Docker | Veth, Macvlan, Ipvlan | Macvlan fastest and offers most isolation
Maksimović et al. [69] | RPi, BB, P., Udoo | N/A | N/A | Many alternatives to RPi

Finally, most papers focus on relatively "high performance" IoT devices. The Raspberry Pi 1 (which is arguably the least powerful device evaluated) utilizes the Broadcom BCM2835 SoC [72], which hosts a 700 MHz ARM1176JZF-S CPU [73], a VideoCore IV GPU and 256 (or 512, depending on board revision) MB of RAM. The Raspberry Pi 2 upgraded the SoC to a Broadcom BCM2836 [74] which, among other features, has a 900 MHz quad-core ARM Cortex-A7 CPU [75]; the RAM was also upgraded to 1 GB. No papers were found which evaluate Linux containers on less powerful IoT devices. The two aspects presented in this and the previous paragraph will aid in answering the third and final research question posed in this thesis.

Chapter 4

Research Methodology

The work presented in this thesis is carried out utilizing the Design Science Research Methodology (DSRM), as defined by Peffers et al. [76]. A central aspect of DSRM is to provide a scientific research methodology tailored towards the theme of producing new knowledge from the construction of an artifact. In the context of DSRM, an artifact is defined very broadly and could refer to a variety of items. Peffers et al. do not go into great detail specifying artifacts and borrow several definitions from other papers. They list aspects of artifacts from Hevner et al., who characterize artifacts as "constructs, models, methods, and instantiations" [77], as well as the definition by Järvinen [78]: "new properties of technical, social and/or informational resources or their combination". The ambiguity of the term enables the methodology to be applied in a wide range of research projects, which is an ambition of the paper. The research model draws from, and brings together, several common traits of design science to produce a complete research methodology.

The research methodology focuses on six activities, each building upon the previous one in order to create an artifact in a scientific manner. The first activity is aimed at finding a relevant problem and defining it, stating what should be done in order to solve the problem, as well as indicating how a solution could be of interest. By completing this activity, an understanding of the problem is presented and motivation for a solution is created. In the second activity, a set of objectives is defined in order to solve the problem.

This activity ascertains that there exists a general plan for how the problem should be solved. In the third activity, focus is put on actually designing and developing the artifact that should solve the problem. This activity is fairly broad and the methods and tools used are highly coupled to the type of artifact that is being developed. In activity four, the built artifact should be demonstrated. Peffers et al. list "experimentation, simulation, case study, proof, or other appropriate activity" [76] as examples of how the artifact could be demonstrated. In the fifth activity, the artifact is evaluated on the basis of how well it supports a solution to the problem defined in the first activity. Once the evaluation is complete, it is possible to iterate back to activity three in an effort to improve the artifact further. In the final activity, the entire body of work that has been performed should be communicated: the problem, why it is important to solve, the artifact and its design and construction, as well as the results produced. The concept of the activities is further illustrated in figure 9.

Figure 9: Design Science Research Methodology Activities [76]

Design Science Research Methodology is considered to be well in line with the project presented in this thesis. The main focus of this project is to create specific guidelines and tools for securing an IoT device. These guidelines and tools are created by designing and building an IoT platform in which programs can execute securely. In other words, based on a need for a more secure general IoT platform, a tangible software based artifact is designed and built to meet the need in the form of a complete Linux-based

container system. The central theme of the project fits the Design Science Research Methodology. For this reason, DSRM is selected as the research methodology to be used in this project.

4.1 Methodology Implementation

The Design Science Research Methodology consists of six activities that should be performed in order (with possible reiteration of activities 3-5) to produce scientifically valid results [76]. In this section, each of the activities is described and an outline of how they will be implemented in the context of this thesis is presented. An overview of the activities is given before they are presented in greater detail.

4.1.1 Problem Identification and Motivation

In the first activity, a research problem should be identified and justified. This activity serves as the foundation for the artifact to be produced which solves the problem. This activity also includes showing the current state the problem is in and how important a solution to the problem is. This activity is covered in chapters 1 and 3 of this thesis. In chapter 1, a general description of the current state of IoT is introduced and, towards the end of the chapter, a specific problem definition is formulated and research questions are posed to address the problems presented. In chapter 3, a literature review is provided and research gaps are presented and addressed to further solidify the problem definition.

4.1.2 Definition of the Objectives for a Solution

In the second activity, a set of objectives that could provide a viable solution should rationally be inferred from the problem definition. The objectives can be either quantitative or qualitative in nature. As part of the introduction chapter, section 1.6 defines a series of research objectives to be performed

in the thesis project. The objectives are quantitatively defined based on the research problems and questions posed in sections 1.4 and 1.5.

4.1.3 Design and Development

The third activity in DSRM focuses on creating the artifact which will contain the research contribution. The activity encompasses designing the architecture and functionality of the artifact as well as the build process. In this thesis, chapters 5 and 6 define the Linux based system architecture to be built, the various tools and the IoT platform to be used, as well as the project planning.

4.1.4 Demonstration

In the fourth activity, it should be demonstrated how the artifact solves the problem defined in the first activity. This could be achieved with "experimentation, simulation, case study, proof, or other appropriate activity" [76]. In this thesis, this activity will mainly be described in chapter 7. Here, the results of the various evaluation protocols defined in chapter 6 will be presented. The methods used to demonstrate the effectiveness of the artifact will mainly consist of experimentation and simulation results.

4.1.5 Evaluation

In activity five, the artifact should be measured for how well it performs in relation to the solution of the problem. In essence, the results produced by the artifact are compared to the objectives of the solution (defined in the second activity). Peffers et al. list several examples of metrics and analysis techniques that could be used to perform the measurements. These could be comparisons, different quantitative measurements, results from surveys, simulation results, etc. [76]. It should be noted that this activity includes an optional step to iterate back to activity three to try to improve the solution. In this thesis, chapters 7 and 8 will provide the basis for evaluating the

system built. Chapter 7 provides the raw results from quantitative measurements taken during experimentation and simulation. Chapter 8 discusses the results, referencing back to the problem definition in section 1.4 and the research objectives in section 1.6. The measurements acquired while building the container platform and implementing various security features will be evaluated based on their relative effectiveness against direct attacks on the resources they are trying to protect. This will essentially be a pass/fail grade. In the section where the dynamic seccomp profiler is built, the resulting seccomp configuration is tested against a known working state, also producing a pass/fail grade. The final evaluation criterion, dealing with performance overhead measurements, will be evaluated for the platform and compared against the findings in related research.

4.1.6 Communication

The last activity is communication. Here, the problem is restated and the artifact's characteristics (such as its utility, design and effectiveness) are described. This activity is mainly covered in chapter 9 of this thesis. There, a summary of the research problem is reiterated and the outcome of activities 4 (demonstration) and 5 (evaluation) is summarized. Future work is also presented, providing suggestions for further developing IoT containers. In essence, this thesis report is the main form of communication. Additionally, mature code produced in the project could be shared with any open source project that would benefit from it.

Chapter 5

Design

This project is divided into three main phases. Each phase addresses a specific research question. This is done in an effort to better connect the practical project work with the theoretical problems defined in chapter 1.

5.1 Phase One - IoT Container Platform

The first phase of the project focuses on creating a functional IoT device supporting Linux containers in which experimentation and development can be conducted. In order to achieve this, the phase is divided into three key steps. During the phase, noteworthy practices are collected in an effort to later potentially form guidelines which could be generalized. The three key steps in this phase are:

1. Building and configuring a Linux based operating system capable of supporting containers on an IoT device

2. Creating secure, lightweight containers and utilities for use on the IoT device, drawing from the experience of Docker and LXC

3. Evaluating security aspects of the container features

In the first step, a small Linux-based operating system will be configured and built. The built system will include container support in the Linux kernel as well as a user space container software suite. In the second step, container templates will be constructed based on practices and guidelines from LXC and Docker. In the final step, experimentation on container instances will be conducted. The experimentation phase will be carried out in an iterative manner according to the following structure:

• A container feature to study will be selected

• The container feature is tested with regards to the resource it’s trying to protect

• Observations from the test are collected

• The process is repeated until all container features in the project are tested

5.2 Phase Two - Dynamic Seccomp Profiling

In phase two, an attempt to extend the work by Mattetti et al. [31] will be made. Mattetti et al. [31] showed that their LiCShield project had successfully built dynamic LSM configurations by profiling containers during normal/controlled operations. This thesis asks if it is possible to extend the work with regard to seccomp. The second phase will therefore attempt to profile containers with regard to syscalls and generate highly customized seccomp configurations.

5.3 Phase Three - Container Performance Measurements

In the final phase, operational performance of the built IoT device is evaluated and analyzed. In the literature review (section 3.2), multiple studies indicate that Linux-based containers perform very well on Raspberry Pis and equivalent devices. However, none of the studies has measured the performance of Linux containers on hardware much less powerful than a Raspberry Pi. Additionally, the storage overhead of Linux containers has not been investigated. Both of these aspects will be considered in this project. In addition, CPU, memory operation and network I/O overhead will be measured and analyzed.

5.4 Project Overview and Planning

The three phases have been mapped out and conditions for when the project should move from one phase to another have been identified. The project will proceed according to figure 10. In phase one, the Linux-based operating system will be built, together with a base container generator and security focused configuration templates. Once the base system is complete, the project will move on to phase two, investigating the profiling of containers in an effort to create dynamic seccomp configurations. As shown in figure 10, the remaining project time will be evaluated during this phase, since most of the project time is allocated to it. In phase three, performance measurements will be collected. It is expected that this activity will consume the least amount of project time.

Figure 10: Development Flowchart

Chapter 6

Implementation

6.1 Phase One - IoT Container Platform

In this project, the VoCore [29] device was chosen as the hardware platform on which to implement the project. There are various characteristics of the VoCore device which make it a good choice for the project. The hardware is fairly low-powered with limited resources. Table 4 depicts how the VoCore compares to the concept of a class 2 IoT device and the Raspberry Pi 1. As can be seen in table 4, the storage space of 16 MB of flash memory forces the base operating system as well as the container overhead to be small.

Table 4: Comparison of the VoCore with Class 2 and Raspberry Pi devices

IoT Device | CPU | RAM | Storage
Class 2 IoT device | N/A | 0.05 MB | 0.25 MB
VoCore | 360 MHz, RT5350 | 32 MB | 16 MB
Raspberry Pi 1 | 700 MHz, ARM1176JZF-S | 256 MB | 1000+ MB

Since the VoCore platform is very constrained, a small base operating system is needed. The VoCore supports the OpenWRT/LEDE project distributions, which are Linux-based operating system distributions designed for resource constrained hardware platforms. The OpenWRT/LEDE projects utilize a Linux kernel together with the busybox system and other, optional packages to create a complete Linux-based operating system image to be installed on a supported architecture. Figure 11 shows a schematic overview of how the software components work together. First, a configuration is created, specifying the target hardware platform. A Linux kernel is configured to suit the selected hardware platform and busybox is built, providing required operating system utilities. Any additional software requirement can be imported by creating packages. In this project, LXC packages were added to provide the user space container tools. Once the OpenWRT/LEDE build system is configured, the software is built and an image is created, ready to be put on the flash memory of the hardware. It is worth noting that OpenWRT was forked into the LEDE project; however, both projects share much of the same code and can, in the scope of this project, be thought of as synonymous. This study uses the LEDE project.
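As an illustration of this workflow, the commands below sketch how such an image is typically produced with the LEDE build system (the exact menuconfig selections for the VoCore target, the kernel container features and the LXC packages are not reproduced here, and the output location is indicative):

# Make the package feeds (including the LXC packages) available for selection
./scripts/feeds update -a
./scripts/feeds install -a

# Select the target platform, kernel features and packages to include
make menuconfig
make kernel_menuconfig

# Build the kernel, busybox and packages and assemble the flashable image
make -j4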

Figure 11: OpenWRT/LEDE build system

6.1.1 Base Operating System

When construction of the initial base operating system began, LEDE was downloaded, configured for the VoCore device and a system image was built. In the initial build, a default Linux kernel, busybox, various system tools and some debugging tools were built (no container software was included). The purpose of the initial build was to test the build process. The operating system image was installed on the VoCore and the busybox implementation was inspected. The final total system (including the kernel, busybox and the various system and debugging tools) consumed around 12 MB of storage space unpacked. It should be noted that this number could be reduced considerably since many system, testing and development tools were included.

The initial base operating system was considered a success and a second iteration began in an effort to include the LXC software suite and reconfigure the Linux kernel to add the proper container support. This phase was iterated a number of times to include all required features in the Linux kernel. The LXC software was available in LEDE as a software package feed, which was included in the build process. Once LXC had been fully added and all container support code had been added to the Linux kernel, the final size of the installation was around 14 MB. As noted in the previous paragraph, this number could be reduced considerably by removing superfluous system and development tools. The base operating system was at this point considered complete.

6.1.2 LXC Container Platform

Preparing an LXC container involves creating a new root file system for the container as well as generating a base configuration. The templating system available in LXC provided a specific template for busybox containers. However, the template was written assuming the host operating system had access to the bash shell and a few other utilities not available by default in the LEDE busybox configuration. The busybox template was therefore modified to accommodate the more constrained environment produced by the LEDE project.

Once the container template was working, containers could be created.
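As a sketch of this step (the container name below is illustrative, and the thesis does not state the exact invocation used), a container could be created from the modified template with the standard LXC tools:

# Create a container named "web01" from the modified busybox template
lxc-create -n web01 -t busybox

# Start the container, check its state and stop it again
lxc-start -n web01
lxc-ls --fancy
lxc-stop -n web01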

The resulting container root file system consisted of the busybox base operating system utilities, base configuration files and the init subsystem used for the container. In total, the container overhead was 508 KB. It should be noted that this could have been further reduced by removing superfluous tools in busybox and possibly by switching to the busybox init system. The final step in creating a secure IoT platform was to create a template configuration for the containers to be instantiated. To achieve this, best practices from the LXC and Docker projects were compiled and adapted to suit the constrained environment of this project. The subsequent subsections present each subsystem used in securing the container.

The different Linux container features work together to potentially form a layered security approach for each container. Any attempt to access potentially sensitive resources from inside a container requires the containerized process to be allowed through various subsystems. Some of the container subsystems also provide the same type of resource protection; in this way, containers are in these cases protected from a single point of failure. A general overview of the layered security approach can be seen in figure 12. The following example illustrates how a process inside a container must be allowed access by various subsystems to access a resource. For example, a process wishing to create a device file to access a certain kernel driver must first be allowed to create the device file by the capabilities subsystem. Secondly, the cgroups device subsystem must allow access to the device file. Finally, the operations performed on the device file must be allowed by seccomp. This example illustrates well how different container subsystems provide similar access protections to resources. By implementing the system with this overall, layered approach in mind, a greater level of container security could be achieved.

Figure 12: Layered Container Construction

6.1.3 UTS Namespace

When a container is created, a unique hostname is assigned. This feature provides confinement around the hostname used inside a container. Even though the hostname namespace might not be the most critical subsystem in terms of security, it does play an important role in identifying containers.
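In the container configuration this is a single setting; a minimal sketch using the same pre-3.0 LXC configuration keys as the other listings in this chapter (the hostname is illustrative):

lxc.utsname = t1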

6.1.4 Networking Namespace

In this project the virtual ethernet driver (veth) was used to create isolated networking namespaces for each container. A virtual ethernet device consists of a software based ethernet device inside the container, which is connected to a peer device on the host, forming a device pair. The devices can then be used in the manner that suits the IoT applications best. A common technique is to attach the virtual ethernet devices to an ethernet bridge. By doing so, container to host communication can be achieved if desired. If the bridge also includes a physical network interface of the IoT device, containers are able to communicate with other devices on the physical network. Multiple virtual ethernet devices can be added to a container; in this way several ethernet bridges can be created, forming private networks among the containers themselves.

The containers created in this project were each equipped with one virtual ethernet interface connected to a host bridge containing the physical ethernet connection of the VoCore. In this way, containers could connect to the outside network, which is assumed to be a fairly realistic requirement of a networked service running on an IoT device. Listing 1 shows the template network configuration for each container. The MAC addresses associated with the virtual ethernet devices are generated when the container starts. In case static MAC addresses are needed, the configuration can simply be changed to a hard-coded MAC address.

Listing 1: Container network configuration
lxc.network.type = veth
lxc.network.flags = up
lxc.network.link = br-lan
lxc.network.hwaddr = xx:xx:xx:xx:xx:xx

As noted by Claassen et al. [68], the macvlan network driver is the most efficient to use and could offer an edge over veth with regard to security, as it disallows host to container network communication. However, in this project the veth driver was chosen as it is better suited for performing the network address translation (NAT) tests which are a goal of this project.
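For deployments that need neither NAT nor host-to-container traffic, the network section of the template could instead be switched to macvlan; a hedged sketch, assuming the container is attached directly to a physical interface named eth0 (the actual interface name on the VoCore may differ):

lxc.network.type = macvlan
lxc.network.macvlan.mode = bridge
lxc.network.flags = up
lxc.network.link = eth0
lxc.network.hwaddr = xx:xx:xx:xx:xx:xx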

6.1.5 Mount Namespace

In this implementation it was decided to bind-mount the library directories of the host into the containers as read-only file systems by default. This saves disk space, and since the directories are mounted read-only, the containers cannot overwrite the libraries. On IoT devices where sufficient disk space is available, physically copying libraries to the destination container could yield a higher confidence in the container isolation. A vulnerability in the bind-mount system could allow containers write access to the host library directories, allowing an attacker to tamper with the system libraries not only in the container, but on the host and in all other containers utilizing the bind-mount as well. Physically copying libraries to a container limits any tampering with libraries to that container.
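A sketch of how such read-only bind mounts can be expressed in the container configuration (the set of directories shown is illustrative):

# Bind-mount host library directories into the container, read-only
lxc.mount.entry = /lib lib none ro,bind 0 0
lxc.mount.entry = /usr/lib usr/lib none ro,bind 0 0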

6.1.6 Root File System

The file system for each container is isolated into its own root directory. The containers are built in such a way that when the container starts, its root file system is shifted to a specially prepared location on disk. The templating system prepares the directory with the essential system tools to provide a basic system in which applications can later be installed and run.

The container creator described in section 6.1.2 starts by preparing a base root file system directory structure. The directory structure mainly conforms to the File System Hierarchy Standard [79]. After this, critical system files are installed in /etc. These are the passwd and group files, a minimal start-up script, inittab for the init process and a DHCP client script in case automatic IP configuration should be used. In the next step, the busybox application is installed and symlinks to the various busybox applets are created. Since busybox provides the basic system utilities for a fairly complete Linux based operating system, most of the required tools are thus installed in the container at this point. After busybox is installed, the init program is installed into the container. In this implementation, it was decided not to set a default root password. In case the user does not specifically supply a root password at container creation time, the root user will be locked out of the container. However, since the host operating system can always attach itself to the container as the root user after the container has been started, without supplying a password, it is trivial to unlock the root account at a later stage if needed. The decision to lock the root account by default was made because of the many security incidents that have happened over the years where default passwords were not changed after installation [30]. In this scenario, if remote access to the container should be allowed, the administrator can set a password once the container has been configured. Furthermore, since these containers will often be used to provide a secure computing environment for applications, user accounts will generally not be needed.
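A condensed sketch of these generator steps is given below (the rootfs path and the file names under files/ are placeholders, brace expansion assumes a bash-capable build host, and the real template script performs more work than shown here):

ROOTFS=/srv/lxc/web01/rootfs    # illustrative container rootfs location

# FHS-style directory skeleton
mkdir -p $ROOTFS/{bin,sbin,etc/init.d,lib,usr/lib,dev,proc,sys,tmp,var,home,root}

# Minimal /etc: accounts, inittab and a DHCP client script
cp files/passwd files/group files/inittab $ROOTFS/etc/
cp files/dhcp.script $ROOTFS/etc/init.d/

# Install busybox and create symlinks for its applets
cp /bin/busybox $ROOTFS/bin/
for applet in $($ROOTFS/bin/busybox --list); do
    ln -sf busybox $ROOTFS/bin/$applet
done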

6.1.7 Cgroups

In this project, the LXC templating system was examined in an effort to extract reasonable default values for the cgroups subsystem. LXC has chosen to only limit device access in default container configurations. This strategy was therefore used in this project as well.

The device cgroup limitations work on a default deny strategy. Access to all device files is denied by default, except for a few explicitly allowed ones. The allowed device files are those needed for common application operations. LXC allows access to the following devices: null, zero, consoles, (u)random, pts devices, rtc, tun and tty0/tty1. The configuration is presented in listing 2.

Listing 2: Default Deny Cgroups Device List
lxc.cgroup.devices.deny = a
lxc.cgroup.devices.allow = c 1:3 rwm
lxc.cgroup.devices.allow = c 1:5 rwm
lxc.cgroup.devices.allow = c 5:0 rwm
lxc.cgroup.devices.allow = c 5:1 rwm
lxc.cgroup.devices.allow = c 1:8 rwm
lxc.cgroup.devices.allow = c 1:9 rwm
lxc.cgroup.devices.allow = c 5:2 rwm
lxc.cgroup.devices.allow = c 136:* rwm
lxc.cgroup.devices.allow = c 254:0 rm
lxc.cgroup.devices.allow = c 10:200 rwm
lxc.cgroup.devices.allow = c 4:0 rwm
lxc.cgroup.devices.allow = c 4:1 rwm

6.1.8 Capabilities

LXC provides the option of defining capabilities as "default allow" or "default deny". In keeping with the goals of the project, the "default deny" strategy was selected for capabilities. Since the Linux capabilities are of a manageable number, an incremental method for determining the smallest set of whitelisted capabilities was employed. A test container was created with all capabilities turned off. The container was started in debugging mode while looking for capability violations.

In the first iteration, a violation against the CAP_NET_ADMIN capability was detected as the container attempted to set its own IP address. For any container wanting to operate on a network (which is considered highly likely), this capability must be granted by the container creator.

In the second iteration, a capability violation occurred when the service application under test (in this case, the embedded web server) tried to bind a socket to a privileged port. Web servers bind to port 80 (and/or 443 for SSL/TLS traffic). These are considered privileged ports and thus the opera- tion failed. The type of service applications running in the IoT environment that this project targets are likely to bind to privileged ports. Since services may include basic networking functionality like DHCP, NTP and DNS or common services like HTTP/HTTPS, SSH or LDAP it was decided to add the capability of allowing a container to bind to a privileged port.

On the third iteration, the container booted up and the embedded web server was able to bind to the privileged port and serve web requests. Since the offending capabilities in the previous iterations proved relatively easy to identify, and in line with the nature of the project's design goals, it was decided to make the final "default deny" list contain only the two exceptions identified in iterations 1 and 2. The final list of capabilities in the default container installation is given in listing 3.

Listing 3: Default container capabilities
CAP_NET_ADMIN
CAP_NET_BIND_SERVICE
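In LXC configuration terms, this default deny policy can be expressed with the lxc.cap.keep key, which drops every capability not explicitly listed; a minimal sketch:

lxc.cap.keep = net_admin net_bind_service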

6.1.9 Secure Computing Mode

For this phase of the project, a ”default permit” strategy was used when creating the seccomp configuration to be used by the container build program. A default permit strategy is synonymous with a blacklist. This means that all syscalls are by default allowed except those specifically listed in the blacklist. In section 6.2, an effort is made to further increase the security of a container by creating a ”default deny” seccomp generator to extend the LiCShield project. However, since this section is concerned with creating a general purpose template that can easily be used by any container, a ”default deny” approach is not realistic since it requires container profiling to be able to work.

A default permit strategy for building seccomp lists can, however, aid in protecting against several potentially dangerous syscalls which most applications will not need during normal operations. In this section, best practices from the LXC and Docker projects have been combined to create a default seccomp list. In table 5 the final seccomp blacklist is presented. The list contains syscalls which have a low probability of being used by service-type applications such as web servers, database engines, proxy servers, etc. The table lists the syscall, a reference to the source from where the syscall was added and a short description of the syscall functionality.

Table 5: Base Seccomp filter

Syscall | Reference | Description
kexec_load | LXC | Load a new kernel for later execution
init_module | LXC & Docker | Load kernel module
finit_module | LXC & Docker | Load kernel module
ptrace | Docker | Trace a process
acct | Docker | Toggle process accounting
add_key | Docker | Add a key to the kernel
keyctl | Docker | Kernel key management control
request_key | Docker | Get key from kernel keyring
adjtimex | Docker | Adjust kernel clock
clock_adjtime | Docker | Adjust system clock
clock_settime | Docker | Set system clock
settimeofday | Docker | Set system time
stime | Docker | Interface to system time settings
bpf | Docker | Run BPF map/program in the kernel
get_kernel_syms | Docker | Get kernel symbols
get_mempolicy | Docker | Get NUMA memory policy
set_mempolicy | Docker | Set NUMA memory policy
ioperm | Docker | Modify kernel I/O permissions
iopl | Docker | Also modifies kernel I/O permissions
kcmp | Docker | Process kernel resource comparison
lookup_dcookie | Docker | Lookup directory path
mbind | Docker | Set memory range policy
mount | Docker | Mount file system
umount | Docker | Unmount a file system
umount2 | Docker | Unmount a file system
move_pages | Docker | Move process pages
open_by_handle_at | Docker | Open file via a handle
name_to_handle_at | Docker | Get file handle
nfsservctl | Docker | Interface to kernel NFS server
perf_event_open | Docker | Performance monitoring
personality | Docker | Set execution domain of process
pivot_root | Docker | Change root file system
process_vm_readv | Docker | Transfer process data
process_vm_writev | Docker | Transfer process data
quotactl | Docker | Manipulate disk quotas
reboot | Docker | Reboot system
setns | Docker | Reassociate thread with a namespace
swapon | Docker | Enable file/device swapping
swapoff | Docker | Disable file/device swapping
sysfs | Docker | Retrieve file system information
sysctl | Docker | Read/write system parameters
unshare | Docker | Namespace manipulation
uselib | Docker | Older shared library loading
userfaultfd | Docker | Userspace page fault handling
ustat | Docker | Get file system statistics
vm86 | Docker | Enter virtual 8086 mode
vm86old | Docker | Enter virtual 8086 mode
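Such a blacklist is applied to a container through an LXC seccomp policy file referenced from the container configuration; an abbreviated sketch, assuming LXC's version 2 policy format and an illustrative file path (only a few of the syscalls from table 5 are repeated here):

# /var/lib/lxc/web01/seccomp.blacklist (abbreviated)
2
blacklist
kexec_load
init_module
finit_module
ptrace
mount
umount2
reboot

# referenced from the container configuration
lxc.seccomp = /var/lib/lxc/web01/seccomp.blacklist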

6.2 Phase Two - Dynamic Seccomp Profiling

In this section, the main development effort and contribution of the research project are presented. As mentioned in the literature review chapter, the LiCShield project [31] aims to profile containers to build Linux Security Module (such as AppArmor or SELinux) profiles. In this section, that work is extended to also encompass generated, dynamic seccomp profiles. The goal of this part of the project was to allow the seccomp container configuration to be specified with a default deny strategy by profiling the syscalls executed inside a container during a permissive profiling session. The list of syscalls would then be converted into a whitelist, allowing only the syscalls in the whitelist while blocking all others. This approach reduces the allowed syscalls to those explicitly used by the software executing in the container. The following sections present the work done in this phase.

Before this phase of the project was started, a container was created using the tools and configurations described in the previous sections. A web service serving a data file was added to the container. The container was verified to start, serve the data file via the web service and shut down cleanly. This container was used in the following experiments.

6.2.1 First Iteration

In the first iteration, the container initialization process was attached to a profiler designed to catch the syscalls executed by the process. All subsequent processes are also attached and their syscalls are caught. After the container has started, the primary function of the container is invoked, in this case an external call for a resource on a web service located inside the container. This is done to ensure that the syscalls utilized by the service are exercised. After the web resource is downloaded, the container is shut down and the profiler is detached when the container process exits. The syscall debug log was then processed by cleaning up superfluous debugging data, a list of the syscalls used during the entire run was collected and a seccomp rules file was generated. The default seccomp list used in the initial phase of the container setup was replaced by the newly generated seccomp list and the process was repeated (container started, web resource downloaded and container shut down). The container worked without violating any of the seccomp rules in the newly created whitelist.
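The thesis does not name the tracer used, but this profiling step can be approximated with strace on the host; the sketch below makes several assumptions (strace is available, the container is named web01, LXC's version 2 seccomp policy format is used and the file paths are illustrative):

# Trace the container init process and all of its children
strace -f -o /tmp/web01.trace lxc-start -n web01 -F

# ... exercise the web service from another host, then stop the container ...

# Reduce the trace to a sorted, unique list of syscall names
sed 's/^[0-9]* *//' /tmp/web01.trace | grep -oE '^[a-z0-9_]+' | sort -u > /tmp/web01.syscalls

# Wrap the list in a version 2 whitelist policy for lxc.seccomp
{ echo 2; echo whitelist; cat /tmp/web01.syscalls; } > /var/lib/lxc/web01/seccomp.whitelist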

6.2.2 Second Iteration

In the second iteration, an effort was made to prune the syscall trace of superfluous syscalls used mainly in the setup phase of the container and thus not necessarily needed during the normal operations of the container. To achieve this goal, the startup and shutdown process of the container was analyzed and patterns in container behaviour were identified. This led to an identifiable pattern that could be used to exclude syscalls not needed for container operations. To allow for a detailed analysis of the process execution flow of a container, a process tree was created (figure 13). The analysis work is described below.

Figure 13: Profiled Seccomp Tree

After initial process setup, the container calls fork two times and resets the process session ID, indicating that a process will be started. In the new daemon process, the lxc-monitord program is started utilizing the execve syscall (indicated by a square box in the container process tree figure). As can be seen in figure 13, two instances of the lxc-monitord program are created (processes 1309 & 1313). All syscalls up until the lxc-monitord point (including the syscalls used by lxc-monitord) were removed from consideration since they are not used by the container; however, the resulting seccomp file was not reduced compared to the initial list extracted in iteration 1. This is probably due to the fact that most syscalls called up until this point are mainly process startup-specific (such as loading libraries and setting up the process environment), which is shared by all Linux executables.

The next processes to investigate were 1310, 1311 & 1314. As can be seen in figure 13, these branches seem to be the last processes in the hierarchy before the main container process is started. These processes were also pruned since it was theorized that they were not directly involved in the container startup procedure. From the new profile set, a new seccomp list was created. The seccomp list generated after this second prune reduced the number of syscalls by two. The discarded syscalls were connect and . This new list was applied to the container and tested. The container started and the web service was able to serve the request just like in the first iteration. It would seem that during the initialization phase, the container startup code connects to some socket and waits for potential data. It is speculated that the process is trying to communicate with the lxc-monitord process, but this has not been verified.

After the successful attempt to prune the seccomp list, the container debug log was further analyzed to see if any additional syscalls could be pruned. The next worker process in line was the actual process that began running the container-specific code (process 1334 in figure 13). This process would switch execution context to the init program, which in turn called startup scripts and programs configured to be started on container boot (like the web service). An attempt was made to prune syscalls in the worker process just before init was called. A multitude of syscalls was pruned in this worker process; however, removing all these syscalls led to a seccomp violation error when trying to boot the container.

6.2.3 Third Iteration

In the third iteration, the final worker process prune list found in iteration 2 was analyzed together with process 1334's syscall trace to determine if any of the syscalls could be removed without causing a seccomp violation. During analysis of process 1334, it was found that some time prior to switching the execution context to the init program, capabilities were dropped and, more importantly, the prctl syscall was executed, instructing the kernel to load the seccomp list defined by the container (shown as "prctl PR_SET_SECCOMP, SECCOMP_MODE_FILTER" in figure 13). It would stand to reason that any syscall called after the seccomp list is loaded, while not explicitly allowed, would cause a seccomp violation. Therefore, syscalls were pruned prior to (and including) the prctl call, which yielded a list of 14 removed syscalls, illustrated in listing 4. It is noteworthy that this list contains some potentially dangerous syscalls (pivot_root, prctl, umount2) which would be beneficial to have denied by default.

Listing 4: List of worker process syscalls pruned in last stage
clone
fchdir
ioctl
llseek
openat
pipe2
pivot_root
prctl
sethostname
signalfd4
socketpair
statfs64
umount2
unlink

Removing all these syscalls however also caused a seccomp violation. Manually testing for the removal of syscalls from the list revealed that the unlink syscall was the cause of the seccomp violation. This was confusing as no call to unlink was shown in the container profile after the prctl call to enforce the seccomp list. However, due to time constraints, this anomaly could not be investigated further. Once the unlink syscall was re-added to the final whitelist, the container worked as expected.

It should also be noted that when the dynamic seccomp profile was used (from iteration 1 and forward) the lxc-attach program was unable to attach to the container. It is theorized that some syscall is missing in the whitelist which would allow the host to attach itself to a container but this has not been investigated.

Finally, a program was created using the experience gained in the iterations that automatically processes a syscall profile log of a container, prunes out any superfluous syscalls and generates a seccomp whitelist, ready to be used by the profiled container. The final seccomp whitelist contains 66 syscalls, which is a fairly restricted syscall list considering the several hundred syscalls exposed by the Linux kernel. The final list of syscalls for the web service container is presented in listing 5. Note the bind and listen syscalls in the list. These are syscalls explicitly used by the web service to establish a listening TCP socket in order to serve requests, showing that the profiler captures syscalls unique to processes spawned inside the container.

Listing 5: Final profiled seccomp list
accept
access
bind
brk
chdir
clock_gettime
close
dup2
epoll_create1
epoll_ctl
epoll_pwait
execve
exit_group
fcntl64
fork
fstat64
getcwd
getdents64
geteuid
getpid
getppid
getsockname
gettid
getuid
kill
listen
lstat64
mkdir
mknod
mmap2
mount
mprotect
nanosleep
open
pipe
read
readlink
readv
reboot
recvfrom
recvmsg
restart_syscall
rmdir
rt_sigaction
rt_sigprocmask
sendfile64
sendmsg
sendto
setitimer
setsid
setsockopt
set_thread_area
set_tid_address
shutdown
sigreturn
socket
stat64
symlink
umask
uname
wait4
waitid
write
writev
unlink

6.3 Phase Three - Container Performance Measurements

In the final step, a performance analysis was performed on the secure computing platform that was built. A benchmarking application was run both on the host and inside a container created with the configuration from the first phase of the project. The performance tests conducted were similar to those in the studies found in the literature review, i.e. overhead with respect to CPU, memory operations and network I/O was considered.

For benchmarking CPU- and memory-intensive operations, the nbench tool [80] was used. To obtain reliable results, the following measuring protocol was created. Before measurements started, the running processes within the operating system were reviewed to identify and remove potential programs that could influence the measurements. During the review, no programs that could significantly affect the measurements were identified. In addition, no network connections were active on the platform while the tests were run; all communication with the platform took place over a serial connection. By eliminating all network traffic during CPU and memory tests, a minimal amount of CPU cycles would be spent transmitting and/or receiving network data while the tests were running. In the first test, the nbench tool was executed on the host with no container started; then a container was started and the nbench tool was run inside the container. The nbench tool was run 10 times in consecutive order both on the host and in the container, and the average results were calculated to mitigate any spurious factors that could influence the results. This technique was employed by various performance studies examined in the literature review as well.
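A minimal sketch of the measurement loop used on both the host and inside the container (output file names are illustrative; the averages were calculated from the ten reports afterwards):

# Run nbench ten times in a row and keep each report
for i in 1 2 3 4 5 6 7 8 9 10; do
    ./nbench > nbench-run-$i.txt
done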

To test network bandwidth limitations, the iperf3 [81] utility was used. Iperf3 was first run on the host without any container present; this was done to measure a reference value for the VoCore platform without any potential influence from containers. Iperf3 was then run inside a container configured to operate on the network via an ethernet bridge using the veth driver. Finally, iperf3 was run inside a container configured to operate using network address translation (NAT), also using the veth driver. The last (NAT) test was added since some studies found in the literature review claim performance was significantly impacted by utilizing NAT in a container setup. A laptop was used to connect directly to the ethernet port on the VoCore device and acted as the client in the tests.

The iperf3 utility was always started on the VoCore device with the following parameters:

iperf3 -s -p 50000

The laptop always initialized connections to the VoCore by calling:

iperf3 -c <server address> -p 50000
iperf3 -c <server address> -p 50000 -R

The first invocation instructs the iperf3 utility to send data to the server once connected; the second invocation instructs iperf3 to request data from the server once the connection has been established. This way, both transmitting and receiving data are tested on the VoCore device. The results of the performance tests can be found in section 7.3.

The following netfilter rules were used in the last test (NAT) to configure the kernel to (1) masquerade packets going out from the container and (2) perform destination NAT on packets coming into the physical interface trying to access the iperf port used.

iptables -I FORWARD 1 -j ACCEPT
iptables -t nat -I POSTROUTING 1 -s 10.0.0.0/24 -o br-lan -j MASQUERADE
iptables -t nat -I PREROUTING 1 -i br-lan -d 192.168.1.2 -p tcp --dport 50000 -j DNAT --to 10.0.0.2

Chapter 7

Results

In this chapter, the various tests and results from chapter 6 are presented. In section 7.1, the results from the initial construction process and the security testing performed in phase one are presented. In section 7.2, the results from the dynamic seccomp application built in phase two are presented. Finally, section 7.3 presents the results from the last phase, the operational performance evaluation.

7.1 Phase One - IoT Container Platform

7.1.1 Base Operating System

In the beginning of phase one, a Linux container operating system was built with the help of the LEDE project and put on the VoCore IoT device. Table 6 presents the base operating system size. Counting the kernel, busybox, the libraries used by busybox and various configuration files, the operating system is around 5.3 MB in size. This does not account for any specific software installed on the system; however, table 6 does provide a fairly good indication of the minimum storage required to run a very limited Linux based operating system. In this project, once other tools were added to the system (such as developer tools, firewalling tools, the LXC user space tools, networking and performance tools), the total storage space requirement became 12 MB (not counting any container overhead).

Table 6: System size for base operating system utilities

Busybox | 325 KB
Libc | 602 KB
Libgcc | 77 KB
Kernel | 4270 KB
Total | 5274 KB

7.1.2 LXC Container Platform

Once the container installer was created and used to build a container on the platform, the resulting container storage overhead was 508 KB. Table 7 shows the storage overhead for the different parts of the container. Note that the kernel is shared between the host and the container and the libraries are mounted read-only from the host file system into the container, yielding a fairly low storage overhead per container. As mentioned in section 6.1.5, libraries could instead be copied into the container, adding at least 679 KB (Libc + Libgcc from table 6) and increasing the total storage overhead to a little over 1 MB per container.

Table 7: Base container size

Busybox | 325 KB
init | 12 KB
Various OS files | 171 KB
Total | 508 KB

7.1.3 UTS Namespace

In this test scenario, two containers (t1 and t2) are running simultaneously on the host (LEDE). In container t1, a kernel syscall to change the hostname is invoked with an argument to change the hostname to "newname". The hostname is then checked on the containers and the host. As can be seen in table 8, the first container has a new hostname while the hostnames of the second container and the host remain the same.

Table 8: Instance hostnames

Instance | Hostname
Container 1 | newname
Container 2 | t2
Host | LEDE

Even though a syscall to change the hostname in container t1 is called, neither container t2 nor the host is affected by the hostname change, thus showing that the UTS namespace is in effect.

7.1.4 Networking Namespace

In this section, the virtual networking devices are tested against data leakage between containers. In container t1, a tcpdump is initiated on the primary network interface while in container t2, ping packets are sent to the host.

Table 9: Results of tcpdump in container t1 and ping in container t2

Container | Tool | Packets Sent | Packets Received
t1 | tcpdump | 0 | 0
t2 | ping | 10 | 0

As can be seen from this test, no data is leaked between containers via the network devices. It should be noted that the capabilities subsystem was disabled while performing this test, as both sending raw packets (used by the ping utility) and putting network interfaces in promiscuous mode (as used by tcpdump) are disallowed by the capabilities subsystem. This is an example of the layered security approach utilized in this project. It should also be noted that this test depends highly on how the container network has been configured. It is possible to configure container networking in such a way that packets can traverse from one container boundary to another, which could be desired in a scenario where inter-container communication is needed.

7.1.5 Mount Namespace

In this section the security of the virtual directory mounting is tested. One of the virtual mount points seen by the container is the /lib directory. This virtual file system is read-only and should thus prevent alteration from inside a container. Two tests were made: creating a file in the /lib directory and altering a library. As can be seen in table 10, both actions are denied.

Table 10: Container access test to read-only mounted file system

Action | Path | Success
Create | /lib/test | No
Write | /lib/libc.so | No

As can be seen in the table above, even though the container has file system permissions to alter the /lib directory, the virtual mounting namespace prevents the container from writing to the directory.

7.1.6 Root File System

During the initialization sequence of a container, the root file system is moved to a specified point where a file system specific to the container is located. Creating a file inside the container limits the file to the specific container root file system unless explicitly shared. The results show that creating a file in container t1 will not make it accessible in container t2, as shown in table 11.

Table 11: Rootfs isolation of containers

Container | Path | Has Access
t1 | /test.txt | Yes
t2 | /test.txt | No

7.1.7 Cgroups

In this test, the CAP_MKNOD capability was temporarily enabled since, without it, the container is unable to create any device file. In section 6.1.7 the device files /dev/tty0 and /dev/tty1 are defined as allowed, while /dev/tty2 and above are not. Table 12 shows the results of device file creation on /dev/tty0 and /dev/tty2. As can be seen in the table, creating /dev/tty0 is allowed while creation of /dev/tty2 is disallowed, as was expected due to the cgroups configuration.

Table 12: Cgroups device restrictions

Action | Path | Success
mknod | /dev/tty0 | Yes
mknod | /dev/tty2 | No

The latter device file, however, is not explicitly allowed and is therefore denied. As observed in section 7.1.8, the mknod program and its associated syscalls were not allowed to run; should some mechanism be discovered to bypass the capabilities subsystem (or cgroups), this test exemplifies how the layered security approach will still block potentially harmful operations.

7.1.8 Capabilities

In this section, the capabilities subsystem was tested by attempting to violate two blocked capabilities: changing the owner of a file/directory and creating device files. As can be seen in table 13, neither of the operations was successful. This was expected since the corresponding capabilities were not allowed according to the implementation in section 6.1.8. Note that according to the cgroups configuration, access to the /dev/tty0 device file is allowed. However, the action is still blocked since the capabilities subsystem disallows creation of devices altogether.

Table 13: Capabilities testing

Action | Path | Success
chown | /tmp | No
mknod | /dev/tty0 | No

7.1.9 Secure Computing Mode

In this section, the seccomp configuration was tested by executing the umount syscall, which is in the seccomp blacklist. As can be seen in table 14 the umount syscall is denied.

Table 14: Seccomp violation test

Action | Path | Success
umount | /dev | No

7.2 Phase Two - Dynamic Seccomp Profiling

In this section, the outcome of the different seccomp profile generations is presented. As described in section 6.2, the seccomp profiler went through three rounds of iterations. The first iteration, presented in section 6.2.1, generated a list of all syscalls used in the lifetime of the container. As can be seen in the first row of table 15, 81 syscalls were included in this version of the whitelist. At this point, the general question of whether the solution was optimal in relation to security considerations was posed. Since the number of whitelisted syscalls could potentially increase security risks as more syscalls are allowed, it was decided to investigate whether any superfluous syscalls were present in the seccomp list.

In the second iteration (presented in section 6.2.2), a process tree of the container life cycle was created and analyzed in an effort to further reduce the whitelist. Two of the pruning attempts were successful (second and third rows in table 15), while the last one (operating on the main container startup process) failed (fourth row in table 15), leaving this iteration with two syscalls successfully pruned. Since the removal of superfluous syscalls proved successful in this iteration, it was decided to further analyze the syscall graph of the container life cycle in an attempt to pinpoint the optimal syscall whitelist, in which no superfluous syscalls would be added by the program.

In the last iteration, successful pruning of the last process led to a final seccomp whitelist of 66 syscalls, as shown in the final row of table 15. Since this iteration found the inception point of the seccomp enforcement, and the time allocated to this part of the project was running out, it was decided that no more iterations were to be made.

Table 15: Seccomp profiling iterations

Iteration   Start   Web Service   Stop   No. Syscalls
1           Yes     Yes           Yes    81
2.1         Yes     Yes           Yes    81
2.2         Yes     Yes           Yes    79
2.3         No      N/A           N/A    65
3           Yes     Yes           Yes    66
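The final step of such a profiler, turning a recorded trace into a whitelist, can be sketched as follows. The file names and the plain one-syscall-per-line output format are illustrative assumptions; the actual profiler and the seccomp policy format used by LXC contain additional detail.

    #include <stdio.h>
    #include <string.h>

    /* Read syscall names (one per line) from trace.txt, deduplicate them,
     * and emit a whitelist to whitelist.txt. File names and output format
     * are illustrative only. */
    #define MAX_SYSCALLS 512
    #define NAME_LEN     64

    int main(void)
    {
        static char names[MAX_SYSCALLS][NAME_LEN];
        int count = 0;
        char line[NAME_LEN];

        FILE *in = fopen("trace.txt", "r");
        if (in == NULL) { perror("trace.txt"); return 1; }
        while (fgets(line, sizeof(line), in) != NULL && count < MAX_SYSCALLS) {
            line[strcspn(line, "\r\n")] = '\0';   /* strip newline   */
            if (line[0] == '\0')
                continue;
            int seen = 0;
            for (int i = 0; i < count; i++)       /* skip duplicates */
                if (strcmp(names[i], line) == 0) { seen = 1; break; }
            if (!seen)
                snprintf(names[count++], NAME_LEN, "%s", line);
        }
        fclose(in);

        FILE *out = fopen("whitelist.txt", "w");
        if (out == NULL) { perror("whitelist.txt"); return 1; }
        for (int i = 0; i < count; i++)
            fprintf(out, "%s\n", names[i]);
        fclose(out);

        printf("wrote %d unique syscalls\n", count);
        return 0;
    }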

7.3 Phase Three - Container Performance Measurements

As was shown in section 3.2, there have been previous studies investigating the performance of containers on IoT hardware. However, all studies reviewed in section 3.2 focused on fairly powerful IoT devices, such as the Raspberry Pi 1 and above. Since this project was performed on the VoCore platform (which is significantly less powerful than a Raspberry Pi), additional performance measurements were conducted in this project to further evaluate the viability of containers on IoT devices.

7.3.1 CPU & Memory Operation Measurements

In this section, results from the nbench [80] benchmarking program are presented. The measurements are divided into two sets: host measurements, which are performed on the host operating system without any container present, and container measurements, which are performed inside a container. The nbench tool produces three index values: a memory index (Mi), an integer index (Ii) and a floating-point index (Fi).

Table 16: CPU & Memory Performance, native vs. inside container

           Host Measurements            Container Measurements
No         Mi      Ii      Fi           Mi      Ii      Fi
1          0.616   1.759   0.229        0.616   1.758   0.229
2          0.616   1.760   0.229        0.616   1.757   0.229
3          0.616   1.759   0.229        0.616   1.758   0.229
4          0.616   1.760   0.229        0.616   1.757   0.229
5          0.616   1.759   0.229        0.616   1.758   0.229
6          0.616   1.760   0.229        0.616   1.758   0.229
7          0.616   1.760   0.229        0.616   1.758   0.229
8          0.616   1.759   0.229        0.616   1.758   0.229
9          0.616   1.759   0.229        0.616   1.758   0.229
10         0.616   1.759   0.229        0.616   1.756   0.229
Average    0.616   1.7594  0.229        0.616   1.7576  0.229

As can be seen in table 16, there is virtually no performance penalty for running a program inside a container. Only for the integer-intensive calculations was a slowdown observed inside the container, averaging (1.7594 - 1.7576) / 1.7594 ≈ 0.1 %, as can be seen in the Ii column.

7.3.2 Network Measurements

In this section, network performance was measured while transmitting data to and from the VoCore device. The measurements were taken using the iperf3 [81] utility, which is designed to measure network performance. The network measurements were collected using TCP transfers between a laptop and the VoCore device over an Ethernet cable. Three separate measurements were performed. The first was obtained by running iperf3 on the host while no container was active. The second was taken by running iperf3 inside a container with the virtual Ethernet device bridged to the Ethernet LAN chipset. The third was obtained by running iperf3 inside a container with the virtual Ethernet device using network address translation (NAT) to and from the container.

Host Transfer

In this section, data transfers to and from the host, without any container, are shown. Table 17 shows data transfer from the laptop to the VoCore host. As can be seen at the bottom of the table, an average bandwidth of 4.31 Mbit/s is achieved. Table 18 shows the host sending data to the laptop. Here, an average bandwidth of around 5.72 Mbit/s is achieved.

Table 17: Laptop Transferring Data to Host

Interval          Transfer      Bandwidth        Retr   Cwnd
0.00-1.00 sec     550 KBytes    4.51 Mbits/sec   0      19.8 KBytes
1.00-2.00 sec     523 KBytes    4.29 Mbits/sec   0      19.8 KBytes
2.00-3.00 sec     523 KBytes    4.29 Mbits/sec   0      19.8 KBytes
3.00-4.00 sec     523 KBytes    4.29 Mbits/sec   0      19.8 KBytes
4.00-5.00 sec     525 KBytes    4.30 Mbits/sec   0      19.8 KBytes
5.00-6.00 sec     523 KBytes    4.29 Mbits/sec   0      19.8 KBytes
6.00-7.00 sec     523 KBytes    4.29 Mbits/sec   0      19.8 KBytes
7.00-8.00 sec     523 KBytes    4.29 Mbits/sec   0      19.8 KBytes
8.00-9.00 sec     523 KBytes    4.29 Mbits/sec   0      19.8 KBytes
9.00-10.00 sec    523 KBytes    4.29 Mbits/sec   0      19.8 KBytes

Total
0.00-10.00 sec    5.14 MBytes   4.31 Mbits/sec   0      sender
0.00-10.00 sec    5.10 MBytes   4.28 Mbits/sec          receiver

Table 18: Host Transferring Data to Laptop

Interval          Transfer      Bandwidth
0.00-1.00 sec     665 KBytes    5.44 Mbits/sec
1.00-2.00 sec     696 KBytes    5.70 Mbits/sec
2.00-3.00 sec     693 KBytes    5.68 Mbits/sec
3.00-4.00 sec     690 KBytes    5.65 Mbits/sec
4.00-5.00 sec     694 KBytes    5.69 Mbits/sec
5.00-6.00 sec     691 KBytes    5.66 Mbits/sec
6.00-7.00 sec     697 KBytes    5.71 Mbits/sec
7.00-8.00 sec     696 KBytes    5.70 Mbits/sec
8.00-9.00 sec     694 KBytes    5.69 Mbits/sec
9.00-10.00 sec    699 KBytes    5.72 Mbits/sec

Total                                            Retr
0.00-10.00 sec    6.82 MBytes   5.72 Mbits/sec   7      sender
0.00-10.00 sec    6.82 MBytes   5.72 Mbits/sec          receiver

Container Bridge Transfer

In this section, data transfers to and from the container, utilizing a network bridge to the Ethernet chipset, are shown. Table 19 shows data transfer from the laptop to the container. The average bandwidth is around 4.31 Mbit/s, exactly the same as when operating without any container. Table 20 shows the container sending data to the laptop; the average measured bandwidth in this setup is 5.80 Mbit/s, a slight increase from the previous measurement.

Table 19: Laptop Transferring Data to Container via Bridge

Interval          Transfer      Bandwidth        Retr   Cwnd
0.00-1.00 sec     550 KBytes    4.51 Mbits/sec   0      19.8 KBytes
1.00-2.00 sec     523 KBytes    4.29 Mbits/sec   0      19.8 KBytes
2.00-3.00 sec     509 KBytes    4.17 Mbits/sec   0      19.8 KBytes
3.00-4.00 sec     523 KBytes    4.29 Mbits/sec   0      19.8 KBytes
4.00-5.00 sec     523 KBytes    4.29 Mbits/sec   0      19.8 KBytes
5.00-6.00 sec     537 KBytes    4.40 Mbits/sec   0      19.8 KBytes
6.00-7.00 sec     523 KBytes    4.29 Mbits/sec   0      19.8 KBytes
7.00-8.00 sec     523 KBytes    4.29 Mbits/sec   0      19.8 KBytes
8.00-9.00 sec     523 KBytes    4.29 Mbits/sec   0      19.8 KBytes
9.00-10.00 sec    523 KBytes    4.29 Mbits/sec   0      19.8 KBytes

Total
0.00-10.00 sec    5.14 MBytes   4.31 Mbits/sec   0      sender
0.00-10.00 sec    5.11 MBytes   4.29 Mbits/sec          receiver

Table 20: Container Transferring Data to Laptop via Bridge

Interval          Transfer      Bandwidth
0.00-1.00 sec     684 KBytes    5.61 Mbits/sec
1.00-2.00 sec     693 KBytes    5.68 Mbits/sec
2.00-3.00 sec     697 KBytes    5.71 Mbits/sec
3.00-4.00 sec     691 KBytes    5.66 Mbits/sec
4.00-5.00 sec     693 KBytes    5.68 Mbits/sec
5.00-6.00 sec     689 KBytes    5.64 Mbits/sec
6.00-7.00 sec     694 KBytes    5.69 Mbits/sec
7.00-8.00 sec     693 KBytes    5.68 Mbits/sec
8.00-9.00 sec     694 KBytes    5.69 Mbits/sec
9.00-10.00 sec    696 KBytes    5.70 Mbits/sec

Total                                            Retr
0.00-10.00 sec    6.92 MBytes   5.80 Mbits/sec   14     sender
0.00-10.00 sec    6.92 MBytes   5.80 Mbits/sec          receiver

Container NAT Transfer

In this section, data transfers to and from the container via a NAT mechanism on the host are shown. Table 21 shows data transfer from the laptop to the container. The average bandwidth measured in this setup is 4.30 Mbit/s, again very similar to the previous two scenarios. Table 22 shows the container sending data to the laptop; the average value measured in this setup was 5.77 Mbit/s, which is also very similar to the previous measurements.

Table 21: Laptop Transferring Data to Container via NAT

Interval          Transfer      Bandwidth        Retr   Cwnd
0.00-1.00 sec     550 KBytes    4.51 Mbits/sec   0      17.0 KBytes
1.00-2.00 sec     520 KBytes    4.26 Mbits/sec   0      17.0 KBytes
2.00-3.00 sec     520 KBytes    4.26 Mbits/sec   0      17.0 KBytes
3.00-4.00 sec     520 KBytes    4.26 Mbits/sec   0      17.0 KBytes
4.00-5.00 sec     520 KBytes    4.26 Mbits/sec   0      17.0 KBytes
5.00-6.00 sec     520 KBytes    4.26 Mbits/sec   0      17.0 KBytes
6.00-7.00 sec     520 KBytes    4.26 Mbits/sec   0      17.0 KBytes
7.00-8.00 sec     520 KBytes    4.26 Mbits/sec   0      17.0 KBytes
8.00-9.00 sec     533 KBytes    4.37 Mbits/sec   0      17.0 KBytes
9.00-10.00 sec    520 KBytes    4.26 Mbits/sec   0      17.0 KBytes

Total
0.00-10.00 sec    5.12 MBytes   4.30 Mbits/sec   0      sender
0.00-10.00 sec    5.10 MBytes   4.27 Mbits/sec          receiver

Table 22: Container Transferring Data to Laptop via NAT

Interval          Transfer      Bandwidth
0.00-1.00 sec     687 KBytes    5.63 Mbits/sec
1.00-2.00 sec     694 KBytes    5.69 Mbits/sec
2.00-3.00 sec     693 KBytes    5.68 Mbits/sec
3.00-4.00 sec     693 KBytes    5.68 Mbits/sec
4.00-5.00 sec     690 KBytes    5.65 Mbits/sec
5.00-6.00 sec     690 KBytes    5.65 Mbits/sec
6.00-7.00 sec     691 KBytes    5.66 Mbits/sec
7.00-8.00 sec     693 KBytes    5.68 Mbits/sec
8.00-9.00 sec     693 KBytes    5.68 Mbits/sec
9.00-10.00 sec    694 KBytes    5.69 Mbits/sec

Total                                            Retr
0.00-10.00 sec    6.87 MBytes   5.77 Mbits/sec   11     sender
0.00-10.00 sec    6.87 MBytes   5.77 Mbits/sec          receiver

For illustrative purposes, a diagram (figure 14) of the average transfer speeds has been created. As can be seen in the figure, transfer speeds both inside and outside of the container (regardless of whether NAT is used or not) are fairly consistent: in the receiving direction the container differs from the host by at most (4.31 - 4.30) / 4.31 ≈ 0.2 %, and in the sending direction the container is in fact slightly faster than the host, by (5.80 - 5.72) / 5.72 ≈ 1.4 % in bridge mode and (5.77 - 5.72) / 5.72 ≈ 0.9 % with NAT.

[Bar chart omitted: average bandwidth (Mbit/s) for Host, Cont. Bridge and Cont. NAT; laptop sending: 4.31, 4.31, 4.30; host/container sending: 5.72, 5.80, 5.77.]

Figure 14: Average transfer speeds

Chapter 8

Discussion

This project set out to investigate the viability of utilizing Linux containers on IoT devices in terms of security and operational performance, as well as to compile guidelines and best practices for running Linux containers on IoT platforms. In general, this project has been able to answer all of the research questions satisfactorily. This chapter reflects on the outcome of the research questions and relates them to the findings of the literature review.

Regarding whether any best practices and guidelines can be created when securing IoT devices with containers, a number of valuable insights were gained during this project. To begin with, a modified version of the LXC container templating system, used to build busybox containers, was created. This code could aid others constructing containers on the OpenWRT/LEDE system. The produced code will be contributed either to the LXC project directly or to the OpenWRT/LEDE project, making the knowledge gained from this part available by default when utilizing those projects.

Another important lesson learned from this project is that Linux containers offer a layered security approach. Simply configuring containers to utilize the available features can improve the robustness of the container, as different subsystems overlap in their responsibilities. In some cases this protects containers against a single point of failure. The number of container configuration options, coupled with the fact that the various container subsystems offer layered security, yields a highly secure environment in terms of process isolation.

With regards to the capabilities subsystem, this thesis found that it is possible and relatively simple to manually create default deny capabilities configurations for a specific container. In the container construction and configuration phase, all capabilities were initially disallowed and the container started. If any capabilities violation occurred, the corresponding capability was added and the process was repeated until the container no longer violated any capabilities. Since there is a limited number of capabilities and probably only a subset is needed for a given container, finding the minimal capabilities configuration turned out to be fairly simple and not very time consuming.
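When iterating towards such a minimal configuration, it can be helpful to inspect which capabilities a process actually holds. A small helper along these lines (a sketch assuming libcap is available; compile with -lcap) can be run inside the container to compare the granted set against what the workload appears to need:

    #include <stdio.h>
    #include <sys/capability.h>

    /* Print the capability sets of the current process in the textual
     * form used by libcap, e.g. "= cap_chown,cap_mknod+ep". */
    int main(void)
    {
        cap_t caps = cap_get_proc();
        if (caps == NULL) {
            perror("cap_get_proc");
            return 1;
        }
        char *text = cap_to_text(caps, NULL);
        printf("current capabilities: %s\n", text);
        cap_free(text);
        cap_free(caps);
        return 0;
    }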

The findings in this project are generally in line with the research presented in section 3.1. Linux containers can offer a layered security approach that enables processes to be isolated both from each other and from the host operating system. The security mechanisms prevent a containerized application from executing and accessing potentially dangerous code and resources. In addition, this thesis indicates that the same security mechanisms utilized by containers on more powerful hardware, like desktops and servers, can also be used to secure the IoT domain. The closest piece of security focused research, hardware-wise, was by Wessel et al. [41], who conducted their research on two smartphones.

With regards to the second research question, this project has, at least initially, shown that default deny seccomp profiles can be automatically constructed in a similar way to that of the LiCShield project. This project succeeded in profiling the syscalls of a test container and its subprocesses. The list of profiled syscalls was pruned based on the initialization traces, and a restrictive, default deny seccomp profile was generated. Even though the profiler was only tested on one specific container, the solution should work on any container, since the profiler takes the specific syscall trace from a container as input and generates a seccomp list based on the operations executed inside the container.

In the study by Mattetti et al. [31], the authors showed that it is possible to profile a container in an effort to build dynamic, restrictive (default deny) configurations for Linux security module implementations such as SELinux and AppArmor. In this project, the work done by Mattetti et al. has been extended to also include dynamic and restrictive (default deny) seccomp configurations. It would be an extremely hard, tedious and time consuming task to build "default deny" lists manually for these specific subsystems. Adding the contributions in this project to those of Mattetti et al., two of the major security subsystems in the Linux container architecture can now be configured automatically in a secure fashion.

Finally, with regards to operational performance, the results in this project have been very positive. The performed tests of CPU and memory operation overhead support other studies, which all classify the observed overhead as essentially negligible. With regards to network I/O overhead, this thesis finds no major overhead for processes operating in containers, regardless of configuration. This project has also examined the storage size overhead of containers and concluded that the overhead can be made rather small. As was shown in this project, it is possible to construct very thin containers using the busybox project and an overlay bind-mounting technique to import libraries from the host operating system.

The fact that Linux containers are viable on IoT platforms has, in part, been shown in other research. In section 3.2 of the literature review, papers evaluating the performance of Linux containers are presented. However, all of those studies were conducted on one of the many Raspberry Pi versions or on devices with similar capabilities. In this thesis, all experiments have been conducted on the VoCore device, a much less powerful device than any of the Raspberry Pi (or similar) devices. It would have been conceivable that a more constrained device (such as the VoCore) would not perform as well as the Raspberry Pi-like devices. The results from the measurements in this thesis are, however, consistent with the studies in section 3.2 and thus support the argument that Linux containers produce negligible overhead, at least with regards to CPU and memory utilization. Furthermore, none of the studies in section 3.2 consider non-volatile memory (storage) overhead. In this study, an effort was made to create very small containers to accommodate devices with very constrained storage space. This thesis showed that containers can be made very thin; a container in this project was 508 KB. This could be further reduced by removing unwanted functionality from busybox.

As noted previously, most of the performance measurements made in this project are broadly consistent with current research. There is, however, one major deviation found in this project. Krylovskiy [47] and Morabito & Beijar [62] noted that Docker containers exhibited a relatively high network I/O overhead when configured to utilize a network address translation (NAT) configuration. In this thesis, no such overhead was found. In fact, the measurements show practically no difference in network I/O overhead regardless of whether the tests were run on the host (without any container running), inside a container in a bridged network configuration, or inside a container in a NAT network configuration. Without further analysis of, and access to more details surrounding, the work of Krylovskiy and Morabito & Beijar, it is considered impractical to draw any definite conclusions. Possible and/or partial explanations could be differences in hardware, an improved networking stack in the kernel, or diverging strategies in the Docker and LXC NAT configurations and/or implementations.

Some container subsystems were not considered in this project. Even though the PTS namespace is used in the project to provide private pseudo terminals, no security review has been attempted on the implementation. The relatively new feature of user namespaces, supported by both LXC and Docker, has not been considered due to time constraints. It should be noted that user namespaces are considered a major security feature and should be considered for inclusion in any container use-case. Additionally, the cgroups system is made up of several subsystems, controlling different aspects of a container. In this project, only the device cgroups subsystem has been utilized and tested; for specific use-cases, more subsystems should be considered for inclusion. Finally, Linux security modules (LSM) have not been considered in this project, in part because Mattetti et al. [31] have already made significant studies in this area.

Although this study was conducted on a specific hardware platform, the results should be generalisable to all hardware platforms capable of running a Linux-based operating system. The reason for this is that the entire container software stack should be hardware independent. The kernel features implementing containers are purely software based and do not require any extraordinary CPU features. The supporting container user space tools, like LXC and Docker, have been proven to run on a multitude of Linux distributions and hardware platforms, and as of this project, LXC has been adapted to the OpenWRT/LEDE distribution, opening up containerization to additional IoT platforms. Additionally, container automation tools, like the LiCShield [31] tool by Mattetti et al. and the dynamic seccomp profiler developed in this project, should run on any Linux-based platform as well. Finally, this project extended the performance measurements to include a fairly low-powered Linux-capable IoT device, further supporting the argument that containers are a viable option on IoT devices capable of running Linux.

Chapter 9

Conclusions

The Internet of Things (IoT) is a domain within computing that is growing fast, and as it grows, so do the security-related concerns. IoT devices are now used in various critical environments such as medical care, infrastructure and home automation, to name a few. In extreme cases, an insecure IoT device can jeopardize human health. It is therefore vital, at least in critical environments, that IoT devices are designed with security in mind. This thesis project set out to investigate the viability of utilizing Linux containers as a security mechanism on IoT platforms capable of running operating systems. To evaluate the viability of containers, both security and operational aspects were taken into account. Additionally, an attempt was made to extend current research to further increase the security mechanisms of Linux containers in general.

During the course of this project, it has been found that Linux containers are well suited as a security mechanism for isolating processes from each other. The container mechanisms available in recent Linux kernels provide multiple layers of isolation which can be configured to work together to shield software from various threats. However, even though many best practices exist and initial configurations are present to isolate processes by default, certain container subsystems must have their configuration manually altered or, in some cases, configured with tools, in order to optimize the security aspects of a container. This thesis presented best practices and guidelines for some of the container subsystems, as well as extended current research with a tool for optimized, automatic configuration of the seccomp subsystem, the latter being the most important contribution of this project. This project also confirmed that Linux containers are a viable solution from an operational standpoint.

In conclusion, this project reinforces the scientific view of containers as a lightweight, low-overhead option for increasing application security in IoT scenarios. Furthermore, all the components produced in this project are general and could be used in any container scenario, thus not limiting the solutions to IoT devices.

9.1 Future Work

As noted in the discussion chapter, user namespaces could have further increased the security of the containers in this project. Examining the potential increase in security, as well as the various challenges with user namespaces in relation to busybox, could further improve the effectiveness of containers on IoT devices.

It could be interesting to try to further shrink the 508 KB base container size produced in this project. This could produce knowledge about the viability of Linux containers on even more storage-constrained platforms.

During development of the dynamic seccomp profiler, some anomalies were experienced when trying to attach a console to the container via the lxc-attach program. In some cases, the lxc-attach program exited without any failure notice. Further investigation and development of the dynamic seccomp profiler may be needed before it can become a reliable technology. Furthermore, as noted in section 6.2.3, even though the profiled container showed no trace of the unlink syscall being used after the prctl call to enable the seccomp list, a violation against it still occurred. A deeper analysis of where this violation occurs, under what circumstances it occurs, and whether there are more syscalls that could cause this type of behaviour is needed. Additionally, no further analysis of the seccomp configuration produced after the third iteration has been conducted. It is possible that some superfluous syscalls made it into the final configuration.

The work presented in this thesis would therefore benefit from a post-profiling analysis, in which the generated seccomp profile is verified to actually utilize all the syscalls listed in its configuration. Finally, the seccomp profiler has only been tested on one specific container. The profiler would need to be tested on multiple containers, all utilizing different sets of syscalls, before it can be deemed to produce reliable dynamic seccomp configuration files.

In this thesis project, network I/O overhead was investigated in three different configurations. None of the three configurations displayed any significant overhead. However, in the literature review phase of this project, two studies were found which concluded that utilizing NAT together with Docker containers introduced significant overhead. These findings contradict those of this project, and no effort was made to investigate the underlying cause of the discrepancy. A deeper study focusing on the overhead caused by NAT in relation to containers could be beneficial for understanding the varying results.

Bibliography

[1] H. Sundmaeker, P. Guillemin, P. Friess, and S. Woelfflé, eds., Vision and Challenges for Realising the Internet of Things. Luxembourg: Publications Office of the European Union, 2010.

[2] L. Atzori, A. Iera, and G. Morabito, “The internet of things: A survey,” Computer Networks, vol. 54, no. 15, pp. 2787 – 2805, 2010.

[3] C. Bormann, M. Ersue, and A. Keranen, “Terminology for constrained-node networks,” RFC 7228, RFC Editor, May 2014. http://www.rfc-editor.org/rfc/rfc7228.txt.

[4] “Raspberry pi - teach, learn, and make with raspberry pi.” https://www.raspberrypi.org/. Accessed: 2017-02-20.

[5] Z. Shelby, K. Hartke, and C. Bormann, “The constrained application protocol (coap),” RFC 7252, RFC Editor, June 2014. http://www.rfc-editor.org/rfc/rfc7252.txt.

[6] J. King and A. I. Awad, “A distributed security mechanism for resource-constrained iot devices,” Informatica, vol. 40, no. 1, pp. 133–143, 2016.

[7] K. D. Chang, J. L. Chen, C. Y. Chen, and H. C. Chao, “Iot operations management and traffic analysis for future internet,” in 2012 Computing, Communications and Applications Conference, pp. 138–142, Jan 2012.

[8] Z. Yan, P. Zhang, and A. V. Vasilakos, “A survey on trust management for internet of things,” Journal of Network and Computer Applications, vol. 42, pp. 120 – 134, 2014.

[9] H. Ning and S. Hu, “Technology classification, industry, and education for future internet of things,” International Journal of Communication Systems, vol. 25, no. 9, pp. 1230–1241, 2012.

[10] R. Khan, S. U. Khan, R. Zaheer, and S. Khan, “Future internet: The internet of things architecture, possible applications and key challenges,” in 2012 10th International Conference on Frontiers of Information Technology, pp. 257–260, Dec 2012.

[11] M. C. Domingo, “An overview of the internet of things for people with disabilities,” Journal of Network and Computer Applications, vol. 35, no. 2, pp. 584 – 596, 2012. Simulation and Testbeds.

[12] C. Du and S. Zhu, “Research on urban public safety emergency management early warning system based on technologies for the internet of things,” Procedia Engineering, vol. 45, pp. 748–754, 2012.

[13] A. Q. Gill, N. Phennel, D. Lane, and V. L. Phung, “Iot-enabled emergency information supply chain architecture for elderly people: The australian context,” Information Systems, vol. 58, pp. 75–86, 2016.

[14] D. Shah and V. Haradi, “Iot based biometrics implementation on raspberry pi,” Procedia Computer Science, vol. 79, pp. 328–336, 2016.

[15] J. Sapes and F. Solsona, “Fingerscanner: Embedding a fingerprint scanner in a raspberry pi,” Sensors (Basel), vol. 16, p. 220, Feb 2016. sensors-16-00220 [PII].

[16] V. Vujović and M. Maksimović, “Raspberry pi as a sensor web node for home automation,” Computers & Electrical Engineering, vol. 44, pp. 153–171, 2015.

[17] S. Gergo Vemi and C. Panchev, “Vulnerability testing of wireless access points using unmanned aerial vehicles (uav),” in ECCWS2015 - Proceedings of the 14th European Conference on Cyber Warfare and Security 2015: ECCWS 2015, p. 425, Academic Conferences Limited, 2015.

[18] A. N. Ansari, M. Sedky, N. Sharma, and A. Tyagi, “An internet of things approach for motion detection using raspberry pi,” in Proceedings of 2015 International Conference on Intelligent Computing and Internet of Things, pp. 131–134, Jan 2015.

[19] S. Sicari, A. Rizzardi, D. Miorandi, C. Cappiello, and A. Coen-Porisini, “Security policy enforcement for networked smart objects,” Computer Networks, vol. 108, pp. 133 – 147, 2016.

[20] “Snapdragon mobile processors and chipsets — qualcomm.” https://www.qualcomm.com/products/snapdragon. Accessed: 2017-03-15.

[21] “The intel edison module — iot — intel software.” https://software.intel.com/en-us/iot/hardware/edison. Accessed: 2017-03-15.

[22] “The intel galileo board — iot — intel software.” https://software.intel.com/en-us/iot/hardware/galileo. Accessed: 2017-03-15.

[23] “Orange pi - orange pi plus.” http://www.orangepi.org/. Accessed: 2017-03-15.

[24] “Beagleboard.org - community supported open hardware computers for making.” https://beagleboard.org/. Accessed: 2017-03-15.

[25] “Arduino - arduinoboardyun.” https://www.arduino.cc/en/Main/ArduinoBoardYun. Accessed: 2017-03-15.

[26] “Banana pi - bpi single board computers official website.” http://www.banana-pi.org/. Accessed: 2017-03-15.

[27] “Get c.h.i.p. and c.h.i.p. pro - the smarter way to build smart things.” https://getchip.com/pages/chip. Accessed: 2017-03-15.

[28] “All-you-need mini pc android + linux + arduino — udoo.” http://www.udoo.org/. Accessed: 2017-03-15.

[29] “Vocore — coin-sized linux computer.” http://vocore.io/. Accessed: 2017-03-15.

[30] C. Kolias, G. Kambourakis, A. Stavrou, and J. Voas, “Ddos in the iot: Mirai and other botnets,” Computer, vol. 50, no. 7, pp. 80–84, 2017.

[31] M. Mattetti, A. Shulman-Peleg, Y. Allouche, A. Corradi, S. Dolev, and L. Foschini, “Securing the infrastructure and the workloads of linux containers,” in 2015 IEEE Conference on Communications and Network Security (CNS), pp. 559–567, Sept 2015.

[32] “Docker - build, ship, and run any app, anywhere.” https://www.docker.com/. Accessed: 2017-02-20.

[33] “Linux containers.” https://linuxcontainers.org/. Accessed: 2017-03-10.

[34] Namespaces(7) Linux Programmer’s Manual, September 2014.

[35] “Cgroups.” https://www.kernel.org/doc/Documentation/cgroup-v1/cgroups.txt. Accessed: 2017-05-05.

[36] Capabilities(7) Linux Programmer’s Manual, December 2015.

[37] seccomp(2) Linux Programmer’s Manual, December 2015.

[38] “Selinux wiki.” https://selinuxproject.org/page/Main_Page. Accessed: 2017-08-10.

[39] “Apparmor wiki.” http://wiki.apparmor.net/index.php/Main_Page. Accessed: 2017-08-10.

[40] E. Reshetova, J. Karhunen, T. Nyman, and N. Asokan, “Security of OS-level virtualization technologies: Technical report,” ArXiv e-prints, July 2014.

[41] S. Wessel, M. Huber, F. Stumpf, and C. Eckert, “Improving mobile device security with operating system-level virtualization,” Computers & Security, vol. 52, pp. 207 – 220, 2015.

[42] A. Miller and L. Chen, “Securing your containers: An exercise in secure high performance virtual containers,” in Proceedings of the International Conference on Security and Management (SAM), p. 1, The Steering Committee of The World Congress in Computer Science, Computer Engineering and Applied Computing (WorldComp), 2012.

[43] T. Bui, “Analysis of Docker Security,” ArXiv e-prints, Jan. 2015.

[44] I. Borate and R. Chavan, “Sandboxing in linux: From smartphone to cloud,” International Journal of Computer Applications, vol. 148, no. 8, 2016.

[45] A. S. Abed, T. C. Clancy, and D. S. Levy, “Applying bag of system calls for anomalous behavior detection of applications in linux containers,” in 2015 IEEE Globecom Workshops (GC Wkshps), pp. 1–5, Dec 2015.

[46] H. Gantikow, C. Reich, M. Knahl, and N. Clarke, Providing Security in Container-Based HPC Runtime Environments, pp. 685–695. Cham: Springer International Publishing, 2016.

[47] A. Krylovskiy, “Internet of things gateways meet linux containers: Performance evaluation and discussion,” in 2015 IEEE 2nd World Forum on Internet of Things (WF-IoT), pp. 222–227, Dec 2015.

[48] “Mosquitto: An open source MQTT v3.1/v3.1.1 broker.” https://mosquitto.org/. Accessed: 2017-02-20.

[49] “LinkSmart device gateway. Fraunhofer FIT.” https://linksmart.eu/redmine/projects/linksmart-local-connect/wiki/Device_Gateway. Accessed: 2017-02-20.

[50] R. Morabito, “A performance evaluation of container technologies on internet of things devices,” in 2016 IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), pp. 999–1000, April 2016.

[51] “Mysql.” https://www.mysql.com/. Accessed: 2017-02-20.

[52] “The apache http server project.” https://httpd.apache.org/. Accessed: 2017-02-20.

[53] A. Celesti, D. Mulfari, M. Fazio, M. Villari, and A. Puliafito, “Exploring container virtualization in iot clouds,” in 2016 IEEE International Conference on Smart Computing (SMARTCOMP), pp. 1–6, May 2016.

[54] B. I. Ismail, E. M. Goortani, M. B. A. Karim, W. M. Tat, S. Setapa, J. Y. Luke, and O. H. Hoe, “Evaluation of docker as edge computing platform,” in 2015 IEEE Conference on Open Systems (ICOS), pp. 130– 135, Aug 2015.

[55] F. Ramalho and A. Neto, “Virtualization at the network edge: A performance comparison,” in 2016 IEEE 17th International Symposium on A World of Wireless, Mobile and Multimedia Networks (WoWMoM), pp. 1–6, June 2016.

[56] “Cubieboard 2.” http://cubieboard.org/model/cb2/. Accessed: 2017-02-20.

[57] “Kernel virtual machine.” http://www.linux-kvm.org/. Accessed: 2017-02-20.

[58] R. Morabito, R. Petrolo, V. Loscrí, and N. Mitton, “Enabling a lightweight edge gateway-as-a-service for the internet of things,” in 2016 7th International Conference on the Network of the Future (NOF), pp. 1–5, Nov 2016.

[59] D. Mulfari, M. Fazio, A. Celesti, M. Villari, and A. Puliafito, Design of an IoT Cloud System for Container Virtualization on Smart Objects, pp. 33–47. Cham: Springer International Publishing, 2016.

[60] M. Fazio, A. Celesti, and M. Villari, Design of a Message-Oriented Middleware for Cooperating Clouds, pp. 25–36. Berlin, Heidelberg: Springer Berlin Heidelberg, 2013.

[61] T. Renner, M. Meldau, and A. Kliem, “Towards container-based resource management for the internet of things,” in 2016 International Conference on Software Networking (ICSN), pp. 1–5, May 2016.

[62] R. Morabito and N. Beijar, “Enabling data processing at the network edge through lightweight virtualization technologies,” in 2016 IEEE International Conference on Sensing, Communication and Networking (SECON Workshops), pp. 1–6, June 2016.

[63] “Odroid-c1+.” http://www.hardkernel.com/main/products/prdt_info.php?g_code=G143703355573. Accessed: 2017-02-20.

[64] P. Bellavista and A. Zanni, “Feasibility of fog computing deployment based on docker containerization over raspberrypi,” in Proceedings of the 18th International Conference on Distributed Computing and Networking, ICDCN ’17, (New York, NY, USA), pp. 16:1–16:10, ACM, 2017.

[65] W. Hajji and F. P. Tso, “Understanding the performance of low power raspberry pi cloud for big data,” Electronics, vol. 5, no. 2, p. 29, 2016.

[66] “Apache spark - lightning-fast cluster computing.” http://spark.apache.org/. Accessed: 2017-02-20.

[67] “Apache hadoop.” https://hadoop.apache.org/. Accessed: 2017-02-20.

[68] J. Claassen, R. Koning, and P. Grosso, “Linux containers networking: Performance and scalability of kernel modules,” in NOMS 2016 - 2016 IEEE/IFIP Network Operations and Management Symposium, pp. 713– 717, April 2016.

[69] M. Maksimović, V. Vujović, N. Davidović, V. Milošević, and B. Perišić, “Raspberry pi as internet of things hardware: performances and constraints,” design issues, vol. 3, p. 8, 2014.

[70] “Beaglebone black.” https://beagleboard.org/black. Accessed: 2017-02-20.

[71] “All-you-need mini pc android + linux + arduino.” http://www.udoo.org/. Accessed: 2017-02-20.

[72] Broadcom Europe Ltd, BCM2835 ARM Peripherals, 2 2012.

[73] ARM Limited, ARM1176JZF-S Technical Reference Manual, 11 2009. Revision: H.

[74] Broadcom Europe Ltd, QA7 ARM Quad A7 core, 8 2014. Rev 3.4.

[75] ARM Limited, Cortex-A7 MPCore Technical Reference Manual, 4 2013. Revision: F.

[76] K. Peffers, T. Tuunanen, M. Rothenberger, and S. Chatterjee, “A design science research methodology for information systems research,” J. Manage. Inf. Syst., vol. 24, pp. 45–77, Dec. 2007.

[77] A. R. Hevner, S. T. March, J. Park, and S. Ram, “Design science in information systems research,” MIS Q., vol. 28, pp. 75–105, Mar. 2004.

[78] P. Järvinen, “Action research is similar to design science,” Quality & Quantity, vol. 41, no. 1, pp. 37–54, 2007.

[79] Filesystem Hierarchy Standard Group, Filesystem Hierarchy Standard, 1 2004. Rev. 2.3.

[80] “nbench-byte 2.2.3 - download, browsing & more — fossies archive.” https://fossies.org/linux/misc/nbench-byte-2.2.3.tar.gz/. Accessed: 2017-06-20.

[81] “iperf - the tcp, udp and sctp network bandwidth measurement tool.” https://iperf.fr/. Accessed: 2017-06-20.
