Master thesis for Information engineering September 2020

Efficient naming for Smart Home devices in Information Centric Networks.

Caspar Rossland Lindvall, Mikael Söderberg

Master Programme in Computer and Information Engineering (Civilingenjörsprogrammet i informationsteknologi)

Department of Information Technology (Institutionen för informationsteknologi)
Visiting address: ITC, Polacksbacken, Lägerhyddsvägen 2
Postal address: Box 337, 751 05 Uppsala
Website: http://www.it.uu.se

Abstract

Current network trends point towards a significant discrepancy between data usage and the underlying architecture: a rapidly increasing amount of data is being sent from more devices, while usage is becoming data-centric instead of host-centric. Information Centric Network (ICN) is an alternative network paradigm that is designed for data-centric usage. ICN is based on uniquely naming data packets and making them location independent. This thesis researched how to implement efficient naming for ICN in a Smart Home scenario. The results are based on testing how the forwarding information base is populated in numerous different scenarios and how a node's duty cycle affects its power usage. The results indicate that hierarchical naming is best suited to hierarchical network topologies and flat naming to interconnected network topologies. An optimized duty cycle is strongly dependent on the specific network and, according to the results, a sub-optimal duty cycle can lead to excessive power usage.

Contents

1 Introduction
  1.1 Researched topics
2 Background
  2.1 Internet today
    2.1.1 Internet Protocol Suite
    2.1.2 Internet Of Things
    2.1.3 Constrained Application Protocol
    2.1.4 Lightweight machine to machine
    2.1.5 MQTT
    2.1.6 RIOT
  2.2 Information Centric Networks
    2.2.1 Architectural Principles
    2.2.2 Naming data
    2.2.3 Named Data Objects
    2.2.4 Forwarding Information Base
    2.2.5 Pending Interest Table
    2.2.6 Content Store
    2.2.7 Content Centric Network
    2.2.8 Named Data Network
    2.2.9 Content Delivery Network
  2.3 Security
    2.3.1 Difference to host centric network
    2.3.2 Certificate and origin authentication
    2.3.3 Vulnerabilities in ICN
  2.4 Tools
    2.4.1 CCN-lite
    2.4.2 RIOT native port
    2.4.3 Wireshark
    2.4.4 GNU Compiler Collection
    2.4.5 GitHub
3 Related works
4 Name convention
  4.1 Design
  4.2 Implementation
    4.2.1 Test setup
    4.2.2 Naming structure
    4.2.3 How to identify a node
    4.2.4 Considered name conventions
    4.2.5 Considered network topologies
    4.2.6 Interest and content
  4.3 Results
    4.3.1 Measurements
  4.4 Discussion
  4.5 Conclusion
5 Low power
  5.1 Background
    5.1.1 IoT Network Characteristics
    5.1.2 Duty cycles
    5.1.3 Push/Pull transmission
  5.2 Design and Implementation
    5.2.1 Parameters
    5.2.2 Network
  5.3 Results
  5.4 Discussion
  5.5 Conclusion
6 Evaluation
  6.1 Researched topics evaluation
  6.2 Conclusion
  6.3 Future Work

1 Introduction

The current network architecture was designed in the early 1970s and fared exceptionally well for the needs of that time. Networking requirements then revolved around a few selected stationary end-points sending and receiving packets over a well-established and secured line of communication. Internet usage has evolved drastically since then: more devices are connected and more raw data is being sent than was previously thought possible. Furthermore, the way the modern internet is used has changed greatly, and there is a common trend that the data itself, rather than its location, is becoming the primary focus. In most cases it does not matter where data is retrieved from as long as it is correct. The common host-centric view is becoming less appealing and the focus is moving towards an information-centric view.

To address the discrepancy between the internet architecture and its usage, the U.S. National Science Foundation set out to fund research projects under the Future Internet Architecture Program [29]. The common goal was to design a new internet architecture that meets modern and future needs while still having a similar structure to the current internet. Based on these principles, the network paradigm Information Centric Network (ICN) has been proposed. In ICN networks the data becomes a first-class entity that is uniquely and permanently named; such data is called a Named Data Object (NDO). Most of the data forwarded today is already information centered (video streaming, web pages, music, etc.) and could easily be translated into NDOs.

One of the key features of ICN is location and storage independence: NDOs are uniquely named, each copy of an NDO is practically interchangeable, and copies can thus co-exist in several locations (network nodes) simultaneously. Instead of requesting specific data from a known host, as is done traditionally, NDOs can be retrieved from an arbitrary source.
This means that an ICN architecture can use caching to reuse data [35], which reduces network congestion and increases delivery speed.

Several implementations have realized an ICN solution; the two most prominent are Content Centric Network (CCN) [34] and its continuation Named Data Network (NDN) [30]. Today NDN is the most widely used ICN implementation, but CCN and its subprojects are actively being developed, mainly its lightweight adaptation CCN-lite [22], which aims for a minimal, bare-bones implementation.

There are still significant design choices and uncertainties around how to properly implement a complete ICN solution, but continuous research improves the foundation. ICN networks are beginning to mature, mainly at a conceptual level but also at a practical level, and the range of possible applications to develop towards is increasing [16].

One of the more promising usage areas for ICN networks is IoT, according to A. Lindgren et al. [15]. IoT devices are primarily used in an information-centric way, meaning that an adaptation to ICN-based communication would likely be beneficial. One of the key concerns is how to achieve a flexible yet minimal namespace utilization that can cover a wide range of applications used by constrained IoT devices.

1.1 Researched topics

This thesis aims to research and find an effective naming convention for lightweight sensors in an ICN network. There are no standardized name conventions for ICN, as the naming should be optimized for its use and the targeted network. The core problem is how to design a name convention that can flexibly assign each sensor a unique name while achieving a low overhead. Furthermore, the protocol should propose power-efficient sensor settings. The core requirements this thesis aims to research are the following:

1. Achieve an efficient naming convention for a Smart Home ICN network.
2. Achieve power-efficient wireless communication.

2 Background

The Internet was designed to allow remote access to another computer and to enable communication between them. One of the earlier implementations of a network was called ARPANET and was developed by the Defense Advanced Research Projects Agency (DARPA) in 1969 [14]. At the time there were only a select few computers, and the core focus was to establish a reliable connection between the stationary computers. The transmitted data was secondary compared to the few expensive end-nodes in the network. ARPANET continued to be developed and came to support global connectivity with hundreds of computers [14].

Since ARPANET's creation, the way the Internet is used has changed drastically: a continuously growing amount of data is being sent by an increasing number of devices. By 2022 the annual amount of globally propagated data will have increased more than threefold compared to 2017. The usage of wireless and mobile devices will also increase from 48% to 71% [6].

Current internet trends point towards both existing and several new usage areas that are forced to utilize the current internet architecture, the Internet Protocol Suite (IPS). The problems the IPS was designed to solve did not take into account the unexpected explosive growth in users and in the raw amount of data that could be sent. Some of the major needs which the current internet structure does not natively support are flexible data distribution, data mobility, and data security. Kutscher et al. [21] describe that the increasing data traffic and number of devices are currently best handled by further infrastructure investments, by designing application-layer software overlays that cache data akin to Peer-to-Peer (P2P) applications, and by removing location dependency for accessing data. All of this is achieved to some extent today, but only through software overlays, which makes the solutions less efficient than native realizations. The continuous feature patching is unsustainable and only worsens the complexity of the Internet as a whole [13].

New internet paradigms are being explored which aim to better meet the internet needs of the future. ICN is a promising solution that matches the data-centric usage of today.

2.1 Internet today

There is a long history to draw on when designing ICN, both for aspects to adopt and for aspects to avoid. The following sections highlight some of the key features of today's internet usage. We first give an overview of the Internet Protocol Suite and then introduce the requirements of IoT and the Internet protocols that are designed for IoT.

2.1.1 Internet Protocol Suite

The Internet Protocol Suite (IPS), also called TCP/IP, is the model through which we interact over the internet. The IPS specifies how data should be structured in several abstraction layers in order to be sent from one location to another. The Open Systems Interconnection (OSI) model describes how the different layers are structured and used. The OSI model is structured around seven layers, which can be seen in figure 1. The layers adopt an hourglass form (see figure 2), which places few restrictions on how each layer is structured, as the layers can be flexibly defined and expanded upon. This has allowed the continuous evolution of the IPS and made it possible to use even today.

Figure 1: Visualization of the structure of the OSI layers.

Source: Y. Li, D. Li, W. Cui and R. Zhang, "Research based on OSI model," 2011 IEEE 3rd International Conference on Communication Software and Networks, Xi'an, 2011, pp. 554-557, doi: 10.1109/ICCSN.2011.6014631.

The Internet Protocol version 4 (IPv4) is historically the most prominent protocol in the network layer of the OSI model. IPv4 defines an address space of 32 bits, which translates to just under 4.3 billion unique addresses; this was seen as an excessive amount when IPv4 was released in 1981. However, almost all IPv4 addresses have by now been exhausted globally [6], which created a demand to expand the pool of possible addresses. IPv6 solves the address shortage by using 128-bit addresses instead of the 32-bit addresses of IPv4. Using 128-bit addresses should be sufficient for the future. The continued adaptive structure of the OSI model has been a key aspect of its longevity.
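The address-space figures above follow directly from the address widths. The short sketch below only redoes that arithmetic; the variable names are illustrative and not taken from any library.

```python
# Address-space sizes implied by the address widths mentioned in the text.
ipv4_addresses = 2 ** 32    # 32-bit IPv4 addresses
ipv6_addresses = 2 ** 128   # 128-bit IPv6 addresses

print(f"{ipv4_addresses:,}")      # 4,294,967,296 (just under 4.3 billion)
print(f"{ipv6_addresses:.3e}")    # 3.403e+38

# IPv6 has 2**96 times as many addresses as IPv4.
growth_factor = ipv6_addresses // ipv4_addresses
```

Note that these are raw address counts; parts of both address spaces are reserved, so the usable number is somewhat lower.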

2.1.2 Internet Of Things

The Internet of Things (IoT) refers to the concept of numerous different lightweight devices that are connected to the internet. The devices usually have constrained computational power, memory, and power supply. A thing commonly measures or creates data that is sent to other devices that handle the information. This type of communication is often called Machine to Machine (M2M). The total number of M2M connections will increase from 6.1 billion in 2017 to 14.6 billion by 2022 according to Cisco's latest network forecast [6].

While the number of connections increases 2.4-fold, the global M2M IP traffic will increase roughly 7-fold during the same period, from 3.7 exabytes (EB) per month in 2017 to more than 25 EB per month by 2022. The M2M share of the global IP traffic will increase from 3% to 6%. The strong growth of M2M communication indicates that IoT is becoming an increasingly integrated part of the future.

2.1.3 Constrained Application Protocol

The Constrained Application Protocol (CoAP) is a widely used application-layer protocol specially designed for constrained devices to communicate over networks with minimal overhead, which makes it suitable for Machine to Machine (M2M) communication. CoAP is also designed to handle transport over low-power networks with a low throughput rate. CoAP's packet structure is similar to HTTP's, which allows for easier translation between the two protocols. HTTP secures its data using Transport Layer Security (TLS) over TCP, while CoAP secures its data using Datagram TLS (DTLS) [7] over UDP. Both HTTP and CoAP support turning off this security feature. These design features make CoAP suitable for most IoT scenarios.

CoAP is a service-layer protocol, which in this case means that it combines the application and presentation layers to provide the specified features. Since CoAP runs over UDP, the constrained devices it runs on must support UDP-based communication.

2.1.4 Lightweight machine to machine

Machine to Machine (M2M) communication is not a new concept from the last decade, but it has become increasingly used since the introduction of constrained devices that are connected to networks. M2M as a concept is no more complicated than the name implies: machines communicate with one another with minimal to no human involvement. After the introduction of IoT, M2M communication increased significantly. To better support constrained devices, IoT included, there is a need to keep the overhead to a minimum and still have a protocol that handles the management of the devices. One such protocol is Lightweight M2M (LwM2M), which supports management of both the devices themselves and the applications running on them. The protocol uses the REST architectural design and is built on top of CoAP [33], and therefore gains the robustness and compactness that CoAP provides.

A key point that differentiates LwM2M from other M2M protocols is that it handles both device management and application management on the same device and offers cross-vendor support. Combined, this reduces the previous application stack on the devices to a single protocol, thereby reducing complexity, energy consumption, and development time for new applications.

2.1.5 MQTT

MQ Telemetry Transport, more commonly known as MQTT, is a lightweight publish/subscribe messaging protocol. It has a wide usage area but is prominently used in conjunction with low-power sensors in M2M/IoT communication [27].

MQTT uses publishers and subscribers, which interact to achieve the desired communication. A subscriber asks an intermediate handler called a broker to forward all messages matching a topic, which is a "/"-delimited string, e.g. sensor/value. A publisher creates data, e.g. a sensor value, assigns the corresponding topic, and then forwards the data to one or several pre-designated brokers. When the broker receives the new data, it forwards it to the nodes that previously subscribed to the topic.
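The publish/subscribe flow described above can be illustrated with a toy in-memory broker. This is a minimal sketch and not an MQTT implementation: the class `Broker` and its methods are hypothetical, topics are matched exactly (no wildcards), and there is no network transport, session handling, or quality-of-service level.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Callback = Callable[[str, str], None]


class Broker:
    """Toy broker: remembers subscriptions and forwards published messages."""

    def __init__(self) -> None:
        # Maps a topic string to the callbacks of its subscribers.
        self._subscribers: Dict[str, List[Callback]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callback) -> None:
        """Register a subscriber for an exact topic such as 'sensor/value'."""
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, payload: str) -> int:
        """Forward the payload to all subscribers; return the delivery count."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(topic, payload)
        return len(callbacks)


received = []
broker = Broker()
broker.subscribe("sensor/value", lambda topic, payload: received.append((topic, payload)))

broker.publish("sensor/value", "21.5")    # delivered to the subscriber
broker.publish("sensor/other", "unused")  # no subscriber, nothing delivered
print(received)  # [('sensor/value', '21.5')]
```

The publisher never learns who the subscribers are; it only knows the broker and the topic, which is the decoupling the protocol is built around.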

2.1.6 RIOT

RIOT is a real-time operating system (OS) designed for the development and deployment of IoT devices. It supports very limited hardware, down to 8-bit microcontrollers with no more than a couple of KB of memory [3]. RIOT supports a broad range of microcontroller architectures, comes with support for multiple network stacks (e.g. CCN-lite) and cryptographic libraries, and is open source and free to use. RIOT also has a basic hardware virtualizer [3] that makes it possible to run code as a user program in a terminal on a host OS, which streamlines testing and development.

2.2 Information Centric Networks

The future of the internet has been continuously discussed amongst professionals and avid network enthusiasts alike. In 2006 the renowned computer scientist Van Jacobson presented a Google Tech Talk called A New Way to look at Networking [19], which highlighted the history of the internet and the limitations it presents. Most importantly, he introduced a new design philosophy for computer networks which revolved around two key points:

• Making data addressable by uniquely naming it instead of addressing it by location.
• Deriving integrity and trust from the data itself instead of from securing the channel.

These design ideas later came to be the core principles of the broad term Information Centric Networking (ICN). Today there are several implementations of ICN with slightly different design choices; the core ICN principles are brought up in the following subsections.

2.2.1 Architectural Principles

One of the better design choices of the Internet Protocol Suite (IPS) is the so-called thin-waist or hourglass model (see figure 2). In Host Centric Networks (HCN) the model is centered around IP packets, the actual data that is sent from one address to another. The thin waist allows for relatively seamless expansion and changes on the other levels without restricting how the IP packets should be formed. Thus the inner layers can be independent of how the outer layers work. This is one of the chief reasons why the internet has flourished and evolved since its creation.

ICN recognizes the strength and flexibility that an hourglass structure provides and has adopted a solution to fit the needs of an ICN network. The HCN and ICN structures look quite similar. ICN is centered around content chunks, which are potentially segmented Named Data Objects (NDOs). Above the content chunks are the application-level layers, which cover how the end user interacts and communicates through an application. The lower layers can be generalized into how the NDOs should be propagated and over which medium.

In ICN networks there needs to be a way to make sure that the delivered information is unaltered and has not been maliciously modified. ICN achieves trust in received content through a verifiable bond between an NDO's name and its content. This bond is created by the content creator when the packet is made and makes the NDO verifiable and immutable [39].

ICN hierarchical naming has a similar structure to today's URLs, and the naming is rooted in a publisher prefix. Hierarchical naming uses the longest prefix matching principle: if a requested name cannot be matched in its entirety, it is instead matched against the name that shares the longest prefix with it, and the corresponding packet is returned. Hierarchical naming also enables aggregation of routing information for the network, which improves scalability.

Figure 2: Visualization of the hourglass (thin-waist) model.

Source: http://named-data.net/wp-content/uploads/hourglass.png

2.2.2 Naming data

To move away from location-based communication, ICN uses a naming convention to uniquely address all data. There are in general two different naming conventions, each with advantages and disadvantages depending on its use: a hierarchical and a flat namespace.

A hierarchical name scheme follows a top-down structure, e.g. "/Producer/Type/Value". The segments are delimited by a "/", which is not part of the name but only separates the segments. Hierarchical naming allows for a descriptive and self-explanatory representation of what the data represents and can, though it need not, be human readable. When deciding how to format the namespace, it is important to consider in what context the name will be used and how much information is needed within it. A longer name gives greater flexibility, routing precision, and identification of a packet, but at the cost of a larger overhead that might be overly complex for its use.

Hierarchical naming has a larger overhead than flat naming, which simply uses an incremented value, a hashed value, or some other one-dimensional scheme to distinguish between content. The main disadvantage of flat naming is that it is practically impossible to identify the content of a packet out of context. This significantly restricts when and how flat-named content can be handled. The requester would also need to know the exact naming convention of the publishing node beforehand. With a hierarchical name, by comparison, the requester can use the longest prefix matching principle: if the full name of the producing node is "/Producer/Type/Value", an NDO could be retrieved by requesting the name "/Producer/Type" and getting the longest matching content with that prefix.

Today hierarchical naming is practically the norm, as it provides greater flexibility and scalability than flat naming. Much like the hierarchical structure of IP addresses, hierarchical naming in ICN enables easier content aggregation, which would be of great benefit when merging with today's routing systems.

When routing NDOs, the network does not need to understand the meaning of a name; it only needs to compare names and forward packets. There is also no need for global name restrictions. An NDO only needs to be unique within the scope of its usage. A local home network could name its NDOs quite liberally, as they will only be used within the network, whereas if an NDO is to be sent and retrieved globally over the internet, the naming conventions matter far more and must be stricter to ensure that all NDOs have efficient and unique names.
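The trade-off between the two naming styles can be made concrete with a small sketch. The helper functions below are hypothetical, and the flat name is assumed to be a truncated SHA-256 hash of the hierarchical name, which is only one of several possible flat schemes.

```python
import hashlib
from typing import List


def components(name: str) -> List[str]:
    """Split a '/'-delimited name into segments; the delimiter is not part of the name."""
    return [segment for segment in name.split("/") if segment]


def is_prefix(prefix: str, name: str) -> bool:
    """True if every segment of prefix equals the corresponding leading segment of name."""
    p, n = components(prefix), components(name)
    return len(p) <= len(n) and n[:len(p)] == p


def flat_name(name: str, length: int = 4) -> str:
    """Assumed flat scheme: the first `length` bytes of SHA-256 over the name, as hex."""
    return hashlib.sha256(name.encode("utf-8")).digest()[:length].hex()


full = "/Producer/Type/Value"
print(components(full))                   # ['Producer', 'Type', 'Value']
print(is_prefix("/Producer/Type", full))  # True: a shorter name can address the content
print(is_prefix("/Producer/Ty", full))    # False: matching is per segment, not per character

hierarchical_bytes = len(full.encode("utf-8"))      # 20 bytes on the wire
flat_bytes = len(bytes.fromhex(flat_name(full)))    # 4 bytes on the wire
print(hierarchical_bytes, flat_bytes)
```

The flat name is five times shorter in this example, but nothing about the producer or the data type can be read from it, and a request for a prefix is no longer possible.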

2.2.3 Named Data Objects

ICN core communication revolves around two packet types: Interest packets, which request data by name, and Data packets, which are uniquely named NDOs that hold some information. A slightly simplified view can be seen in figure 3.

Figure 3: Visualization of the structure of an interest and data packet.

Source: https://named-data.net/project/archoverview

When the requester creates an interest, it primarily specifies the name it wishes to match against, but it also creates a nonce (number used once) to distinguish interests with the same name. The interest is then sent out into the network and is hopefully propagated to a node that has access to the requested NDO, either the publishing node or a node in the network that has cached the NDO.

The publishing node creates an NDO and uniquely names it according to its previously defined convention. The NDO carries a signature and meta information for external validation of the NDO (see section 2.3).
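The two packet types can be sketched as plain data structures. The field names, the 32-bit nonce width, and the exact-match rule in `satisfies` are assumptions made for illustration; real wire formats such as those of CCNx and NDN carry more fields and support other matching rules.

```python
import secrets
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Interest:
    """Request for data by name; the nonce separates interests with equal names."""
    name: str
    nonce: int = field(default_factory=lambda: secrets.randbits(32))


@dataclass(frozen=True)
class Data:
    """A Named Data Object: a name, its content, and a signature placeholder."""
    name: str
    content: bytes
    signature: bytes = b""  # placeholder; how it is computed is outside this sketch


def satisfies(data: Data, interest: Interest) -> bool:
    """Simplified rule: the data answers the interest if the names are equal."""
    return data.name == interest.name


first = Interest("/Producer/Type/Value", nonce=1)
second = Interest("/Producer/Type/Value", nonce=2)
reading = Data("/Producer/Type/Value", b"21.5")

print(first == second)            # False: same name, different nonce
print(satisfies(reading, first))  # True
```

Because the nonce is part of the interest, a node can tell a retransmitted or looping interest apart from a genuinely new request for the same name.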

2.2.4 Forwarding Information Base

Each node in the network topology has a Forwarding Information Base (FIB) which dictates how and where information should be propagated. The FIB usually uses longest prefix matching, which means that it forwards an interest over the interface registered for the longest prefix that matches the interest's name. If an interest with an unknown prefix reaches a node, it is dropped. It is also possible to create a dynamically growing FIB that accepts both predefined and new prefixes and broadcasts the incoming interests over its known interfaces. The node then has to record the efficiency of the different alternatives and save those that are efficient enough or yield the most satisfactory result according to some metric.

FIB (Node name)    Prefix    Network Interface
Node 1             /a/b//    Interface X
Node 2             /x/y/     Interface Y
Node X             Y         Interface Z

Table 1: A simplified view of a Forwarding Information Base (FIB).
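A FIB with longest prefix matching, as described in this section, can be sketched as follows. The class `Fib` is hypothetical and is not the CCN-lite data structure; matching is done per name segment, and the route for "/a" is added only to show that the longer prefix wins.

```python
from typing import Dict, Optional, Tuple


def split(name: str) -> Tuple[str, ...]:
    """Turn '/a/b/c' into ('a', 'b', 'c'), ignoring empty segments."""
    return tuple(segment for segment in name.split("/") if segment)


class Fib:
    """Maps name prefixes to outgoing interfaces."""

    def __init__(self) -> None:
        self._routes: Dict[Tuple[str, ...], str] = {}

    def add_route(self, prefix: str, interface: str) -> None:
        self._routes[split(prefix)] = interface

    def lookup(self, name: str) -> Optional[str]:
        """Return the interface of the longest matching prefix, or None (drop)."""
        parts = split(name)
        # Try the full name first, then shorten it one segment at a time.
        for length in range(len(parts), 0, -1):
            interface = self._routes.get(parts[:length])
            if interface is not None:
                return interface
        return None


fib = Fib()
fib.add_route("/a/b", "Interface X")
fib.add_route("/x/y", "Interface Y")
fib.add_route("/a", "Interface W")  # hypothetical extra route

print(fib.lookup("/a/b/temperature"))  # Interface X (longest match is /a/b)
print(fib.lookup("/a/c"))              # Interface W (only /a matches)
print(fib.lookup("/unknown/name"))     # None (unknown prefix, interest is dropped)
```

A dictionary lookup per candidate prefix keeps the sketch short; an implementation for constrained devices might instead walk a list or a trie, depending on memory and code size.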

2.2.5 Pending Interest Table

Each node in the network topology holds a Pending Interest Table (PIT), which is used to send data back to the requester/consumer. If numerous incoming interests request the same NDO, the PIT only stores the ones arriving on new network interfaces. When an NDO arrives at a node, the node tries to match its name against the PIT entries; if there are matches, it propagates the NDO to the corresponding interfaces and then removes the used entries from the PIT. This can be seen as a breadcrumb trail that is followed from the node where the NDO was retrieved back to the requester, removing each breadcrumb (PIT entry) and trying to populate the CS along the way.
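The breadcrumb behaviour of the PIT can be sketched in a few lines. The class `Pit` below is hypothetical; it assumes exact name matching and leaves out entry lifetimes and nonce checks, both of which a real forwarder needs.

```python
from collections import defaultdict
from typing import Dict, Set


class Pit:
    """Remembers on which interfaces each pending name was requested."""

    def __init__(self) -> None:
        self._entries: Dict[str, Set[str]] = defaultdict(set)

    def add_interest(self, name: str, interface: str) -> bool:
        """Record the interface; return True only if the name was not pending
        before, i.e. the interest has to be forwarded upstream."""
        is_new_name = name not in self._entries
        self._entries[name].add(interface)  # a set stores each interface once
        return is_new_name

    def consume(self, name: str) -> Set[str]:
        """Return the interfaces waiting for this name and remove the entry."""
        return self._entries.pop(name, set())


pit = Pit()
print(pit.add_interest("/a/b", "iface1"))  # True: forward the interest
print(pit.add_interest("/a/b", "iface2"))  # False: aggregated with the pending entry
print(pit.add_interest("/a/b", "iface1"))  # False: interface already recorded

print(sorted(pit.consume("/a/b")))  # ['iface1', 'iface2']
print(pit.consume("/a/b"))          # set(): the breadcrumb has been removed
```

Aggregating interests this way means that only one interest per name travels upstream, while the returning NDO still reaches every interface that asked for it.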

2.2.6 Content Store

The Content Store (CS) is a node's temporary storage for NDOs. When a node receives a new NDO, the CS tries to save it according to its internal caching strategy. Most modern strategies should be applicable, and it seems that in an IoT scenario advanced strategies perform at best as well as simple ones [35]. Usually the CS is only populated when routing an NDO or if the node itself is a creator, but it is also possible to forcefully inject an NDO into the CS. Increasing the number of entries the CS holds raises the probability that an incoming interest can be matched in the node's CS, which eliminates the need to propagate the interest further and alleviates network congestion.
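As an illustration of a simple caching strategy, the sketch below implements a content store with least-recently-used (LRU) eviction. LRU is chosen here only as an example of a simple strategy; the text does not state which strategy CCN-lite or the cited study uses, and the class `ContentStore` is hypothetical.

```python
from collections import OrderedDict
from typing import Optional


class ContentStore:
    """Fixed-size cache of NDOs keyed by name, evicting the least recently used."""

    def __init__(self, capacity: int) -> None:
        self._capacity = capacity
        self._items: "OrderedDict[str, bytes]" = OrderedDict()

    def put(self, name: str, content: bytes) -> None:
        """Store an NDO; evict the oldest entry if the store is over capacity."""
        if name in self._items:
            self._items.move_to_end(name)
        self._items[name] = content
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)  # drop the least recently used entry

    def get(self, name: str) -> Optional[bytes]:
        """Return cached content and mark it as recently used, or None on a miss."""
        if name not in self._items:
            return None
        self._items.move_to_end(name)
        return self._items[name]


cs = ContentStore(capacity=2)
cs.put("/a", b"1")
cs.put("/b", b"2")
cs.get("/a")        # "/a" is now the most recently used entry
cs.put("/c", b"3")  # store is full, so "/b" is evicted

print(cs.get("/b"))  # None
print(cs.get("/a"))  # b'1'
```

A larger `capacity` raises the chance of a cache hit, at the cost of memory that is scarce on constrained devices.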

2.2.7 Content Centric Network

CCN is the project Van Jacobson started at PARC to build the first implementation of his new vision for the internet. He led the development of a software codebase, and the first version became a baseline implementation of this architecture [31]. The implementation, which goes by the name CCNx, has spawned sub-projects and is where the popular CCN-lite originates from. CCN-lite has become widespread and has been implemented on many platforms, since it is written in C, has a very small codebase, and offers a wide range of configuration options, which makes it well suited for testing and research.

2.2.8 Named Data Network

NDN is one of five projects under the NSF Future Internet Architecture Program and is today the largest and most prominent ICN project. NDN is an alternative continuation of CCN. At the start of the NDN project CCNx was used as the codebase, but in 2013 the project split from the original codebase and created its own fork to better meet the demands placed on the project. The NDN project today has numerous sub-projects, e.g. NDN-lite [23] and Mini-NDN [26]. Mini-NDN is a lightweight emulation tool that allows for experimentation and research based on the NDN libraries, and NDN-lite is a bare-bones lightweight implementation of NDN.

2.2.9 Content Delivery Network

A Content Delivery Network (CDN) is an HCN solution that counterbalances some of the negative trade-offs of HCN, which have become far bigger in today's infrastructure than was first anticipated. Instead of always having only one server location for a specific website/source, the content is spread over multiple points across the network, anything from a few server hall locations to numerous ones. To determine where the content should be served from, a CDN relies on algorithms that can, for example, serve a request from the nearest server. This makes the network operate in such a way that, instead of asking for the location of the data, the data itself is searched for and fetched through the CDN. Handling a request this way puts more responsibility and computation on the network itself rather than on the end nodes. This leads to a quicker response for the user and also makes the service more reliable: if the nearest server is unable to handle the request due to an unforeseen error, the data can be fetched from another location and the request completed with only some extra time. The reduced data retrieval time is attributed to the shorter path for the data to travel, which also contributes to a reduction of network congestion.

The use of CDNs is growing rapidly, and according to Cisco 72% of the global IP traffic will be routed through CDNs by 2022 [6]. The increasing use of CDNs is largely due to the fact that they help providers deliver the same data more reliably and faster to a larger number of end users. CDNs try to solve networking problems similarly to ICN. However, since CDNs are built on top of an HCN-based structure, the full potential of an information-centric approach is not utilized.

2.3 Security

Security is a fundamental requirement and something that users demand. The security aspects of ICN communication are therefore heavily researched, and the ability to improve security and reduce complexity is desirable. Since ICN is designed with built-in security support, it has an advantage over the traditional infrastructure, which was designed without security as a priority.

2.3.1 Difference to host centric network

The security of IP networks is based on securing the session through which the data is sent [18]. To initiate a secure channel for communication, the endpoints of the session need to authenticate one another and negotiate a session key to encrypt the data transmitted through the channel. When public key cryptography is used for authentication, an endpoint has to prove that it holds the corresponding private key and that the public key is bound to the name of the data producer.

Instead of delivering packets of data to receivers that are identified by IP addresses, ICN lets consumers request the desired data using application-layer names. Naming the data enables ICN networks to secure data directly at the network layer. This is achieved by making every data packet verifiable with the producer's public key and, optionally, confidential. The verifiable bond is created by the producer, who creates a signature with its private key at the time of data creation; the signature is put into the packet for future verification [39].

Because each NDO is uniquely named, it can be independently requested and verified to ensure that it is the requested NDO and that it is up to date. This is achieved by making each NDO an immutable object: once created it cannot be altered.
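The idea of binding a name to its content can be illustrated with Python's standard library. Note the substitution: the text describes public-key signatures, whereas this sketch uses a keyed hash (HMAC-SHA256) with a shared secret as a symmetric stand-in, because the standard library offers no public-key signing. The functions `sign_ndo` and `verify_ndo` and the key are hypothetical.

```python
import hashlib
import hmac


def sign_ndo(key: bytes, name: str, content: bytes) -> bytes:
    """Compute a tag over the name and the content together.

    The name is length-prefixed so that bytes cannot be moved between
    the name and the content without changing the tag.
    """
    encoded_name = name.encode("utf-8")
    message = len(encoded_name).to_bytes(2, "big") + encoded_name + content
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_ndo(key: bytes, name: str, content: bytes, signature: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(sign_ndo(key, name, content), signature)


key = b"shared-secret-for-illustration-only"
signature = sign_ndo(key, "/Producer/Type/Value", b"21.5")

print(verify_ndo(key, "/Producer/Type/Value", b"21.5", signature))  # True
print(verify_ndo(key, "/Producer/Type/Value", b"99.9", signature))  # False: content altered
print(verify_ndo(key, "/Producer/Type/Other", b"21.5", signature))  # False: name altered
```

With a real public-key signature the verifier would only need the producer's public key, so any node could check the bond without being able to forge it; with a shared key, every verifier could also create valid tags.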

2.3.2 Certificate and origin authentication

A certificate is usually used to associate a public key with an identity name by means of a signature. In the same manner, the content data in ICN is bound to the packet name through a signature. This allows the data packet to be duplicated and stored at different places, and the place where it is stored does not need to be trusted: as long as the signature binding can be trusted and verified, the packet is valid.

Different ICN implementations take different approaches to verifying the producer's public key, partly depending on which naming scheme is used. However, producer verification is needed in all of the implementations.

In a hierarchically named network, one approach to producer verification is a hierarchical public key infrastructure (PKI) [17], where trust anchors can endorse the authenticity of a public key with a certificate signature. Due to the hierarchical structure, an entity can endorse public keys and create signatures for nodes that are lower down in the hierarchy. If a consumer receives an NDO and wants to verify the producer's public key, the consumer needs to traverse the endorsement tree until a trusted node or a pre-established trust anchor is reached. After verifying the producer's public key, the consumer only needs to verify the signature of each NDO created with the same key.

In a flat name structure, where the whole name or a part of the name of an NDO is a hash, the NDO can be self-certifying. With the use of a hash, an object's integrity and validity can be verified without the need for a PKI or other third-party structures.
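A self-certifying flat name can be sketched by deriving the name from a hash of the content. The function names and the "sha256:" name prefix are assumptions for illustration and do not follow any particular ICN naming specification.

```python
import hashlib


def self_certifying_name(content: bytes) -> str:
    """Derive a flat name from the content itself (assumed format: 'sha256:<hex>')."""
    return "sha256:" + hashlib.sha256(content).hexdigest()


def verify_hash_name(name: str, content: bytes) -> bool:
    """The content is valid for this name if hashing it reproduces the name."""
    return name == self_certifying_name(content)


name = self_certifying_name(b"21.5")

print(verify_hash_name(name, b"21.5"))  # True: the content matches its name
print(verify_hash_name(name, b"99.9"))  # False: altered content no longer matches
```

No key or certificate is involved, which is why no PKI is needed. The check covers integrity only: it shows that the content matches the name, but the requester must already have obtained the name from a source it trusts.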

2.3.3 Vulnerabilities in ICN

ICN is designed to avoid and improve upon shortcomings of IP networks but has not eliminated all security risks. New types of attacks have appeared in ICN, and the two major ones are Interest Flooding Attacks (IFA) and Content Poisoning Attacks (CPA) [31].

In an IFA, interest packets are sent out on the network for content that does not exist. Each such interest occupies an entry in the routers’ PITs until it expires, and when a PIT is filled with non-serviceable interests the router cannot handle legitimate interest packets until there is room again. IFA is the ICN counterpart of a denial-of-service attack in an IP network [1].

In a CPA, a legitimate interest is answered with an altered packet: the name of the packet is correct and legitimate but the content has been changed. The altered packet has a broken signature and the receiver will discard it. However, due to the computational strain of verifying the signature on every node, it is common to verify only sparsely. Bad packets can therefore spread through the network and populate caches at nodes where signature verification is not performed, which leads to cache poisoning [8]. Like IFA, CPA is a type of denial-of-service attack.
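The effect of an interest flood can be sketched with a bounded table of pending interests. This is a simplified Python model; the class name, the capacity and the lifetime measured in ticks are assumptions made for illustration and do not describe a particular forwarder.

```python
class BoundedPendingTable:
    """Toy table of pending interests with a fixed capacity."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.pending = {}  # name -> remaining lifetime in ticks

    def add_interest(self, name: str, lifetime: int) -> bool:
        """Return True if the interest is accepted, False if the table is full."""
        if name in self.pending:
            return True
        if len(self.pending) >= self.capacity:
            return False
        self.pending[name] = lifetime
        return True

    def tick(self) -> None:
        """Age all entries by one tick and drop the expired ones."""
        self.pending = {n: t - 1 for n, t in self.pending.items() if t > 1}


table = BoundedPendingTable(capacity=3)
for i in range(3):
    table.add_interest(f"/app/bogus/{i}", lifetime=2)   # attacker's interests

blocked = table.add_interest("/app/2/5/1/1", lifetime=2)    # table is full
table.tick()
table.tick()                                                # bogus entries expire
accepted = table.add_interest("/app/2/5/1/1", lifetime=2)   # room again
```

In the sketch the legitimate interest is rejected while the bogus entries are alive and accepted once they have expired.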

2.4 Tools

For development and testing, we utilized pre-made tools for simplifying the process and obtaining replicability. This section will go through the different tools that have been used throughout the project.

2.4.1 CCN-lite

CCN-lite is a library written entirely in C with a codebase of less than 2000 lines of code, developed for academic and experimental usage [22]. CCN-lite does not fully implement the functionality of PARC’s CCNx or the NDN forwarding daemon, but it supports multiple packet formats so that it can communicate with different ICN implementations, and it supports communication over raw and UDP connections. The CCN-lite library aims to keep the code as lightweight as possible while supporting a large selection of hardware and devices with memory down to a few KB of RAM. However, run-time efficiency is currently not the highest priority, meaning memory accesses and calculations could be improved since many data structures rely on linked lists. It supports multi-level scheduling both at the packet level and at the chunk level, where packet fragmentation is handled.

CCN-lite has been ported from the original codebase to a selection of different platforms. This has mainly been achieved through a community-driven process; hence the variety of supported devices, the level of documentation, and the completeness of the projects vary considerably. The codebase includes a set of terminal-based tools, built as separate modules, that can be used in conjunction with each other to achieve the desired CCN-lite functionality. When building these modules there are multiple compile options that turn optional support on or off, e.g. memory debugging, HTTP server, HMAC256 signatures, and more [5].

2.4.2 RIOT native port

RIOT native port is an adaptation that runs RIOT as a terminal process in Linux [36], FreeBSD, or Mac OS, which allows developers to test code for RIOT directly on their machine instead of uploading the program to a chip and then debugging it. A significant advantage is that better debugging and development tools, such as GDB and Valgrind, can be used in the local terminal. The RIOT native port emulates hardware by using system calls and signals at the API level, which means that the complete RIOT software stack can be compiled and executed on the supported operating systems mentioned above. This gives an easier starting point for developing a RIOT-based application, since it separates RIOT development from embedded programming, which can be a challenge of its own.

RIOT native port also supports virtual testbeds in which multiple instances of RIOT run simultaneously on the same machine and are connected through a virtual network [37]. The network topology is freely configurable, and the developer can specify a packet loss rate for the virtual network. This allows for debugging the code and its functionality without disturbance from the real world, where it is at times hard to pinpoint what is at fault.

2.4.3 Wireshark

Wireshark is a cross-platform tool that captures and dissects traffic on a computer network. It is one of the most widely used network dissection tools, and its popularity can be attributed to its open-source and freely available license. Wireshark can differentiate between network protocols and their respective layers and structures, and it displays the data in fields so that the user can decide which data is relevant to analyze and display. To give users an overview of what is being captured, sorting rules can be applied. The rules can categorize and color-code packets, protocols, or virtually anything based on a value that can be matched to one or multiple fields in a packet. Because matching goes down to the bit-wise representation of the captured packet, there is a wide range of possibilities for analysis and display.

Wireshark supports plug-ins, meaning that if a protocol is not currently supported, a user-defined dissector can be imported and used for dissection. Wireshark is not limited to capturing from a live network interface; it also supports reading previously captured network traffic from a pcap [38] file and reanalyzing it as if it were live traffic. Likewise, after capturing live traffic, Wireshark can save it to a file for later processing.

Wireshark has a wide range of uses but there are still some limitations. Wireshark is structured around pcap for storing and reading captured traffic, and what is possible to capture and read is therefore limited to the scope of pcap’s capabilities. As the development of pcap continues and its capabilities expand, Wireshark gains functionality at the same time.

For Wireshark to be able to dissect ICN communication, its dissecting capabilities need to be extended through a plug-in. The NDN project has several sub-projects, one of which, NDN-tools [32], provides an NDN plug-in for Wireshark. The packet format recommended and used in this project is ndntlv (NDN Type-Length-Value) [28], which can be seen as an NDN adaptation for use as an overlay on HCN, and it can be decoded with the NDN-tools extension.

2.4.4 GNU Compiler Collection

GNU Compiler Collection (GCC) is a compiler system produced by the GNU Project and is the standard compiler in the GNU toolchain [10]. It originates from the GNU C Compiler (GCC 1.0), which was first released in 1987 and at the time only supported the C programming language. Since then, support for other languages has been added and the project has been renamed [10]; supported languages include C++, Objective-C, Fortran, and Java [11]. GCC was first created to be the compiler for the GNU operating system and has since been expanded to support new instruction sets and architectures while remaining licensed under the GNU General Public License.

2.4.5 GitHub

The codebase was developed with the help of GitHub [12] for version control as well as to allow for remote access. GitHub was initially used according to a feature branch workflow, meaning features were developed on separate branches and continuously merged into the core codebase. However, towards the end of the project, work was done directly on the main development branch.

3 Related works

Design Choices for the IoT in Information-Centric Networks [24]

Lindgren et al. from the Information Centric Network Research Group (ICNRG) [15] describe the trade-offs of using ICN in an IoT scenario. Primarily, Lindgren et al. highlight the benefits of the core principles of ICN and how they natively match an IoT usage area. The article also articulates the current uncertainties around ICN and addresses them with potential solutions.

Namespace-wise, they point out the benefit of aggregated NDOs, composed of several nodes’ worth of data, for limiting bandwidth utilization and node interaction. However, they strongly advise against aggregated NDOs, as these would likely cause a severe increase in the network’s complexity, which is ill-suited for most constrained devices in an IoT network. The recommended approach is to reduce the namespace’s complexity and move the functionality elsewhere to allow for the easiest in-network NDO aggregation. This means a flat address space could be sufficient, but a hierarchical one is likely to bring some benefits depending on the ICN implementation.

A Survey of Information-Centric Networking [2]

Ahlgren et al. discuss the core principles of ICN and how varying implementations differ. They introduce the concept of immutable and verifiable Named Data Objects (NDOs) and how they are used in ICN. Furthermore, Ahlgren et al. highlight the importance of the name used for the NDOs, as routing and forwarding of data is based on the name. By uniquely naming the NDOs, the data can be independently verified and its integrity ensured. The unique naming makes the NDO location independent and allows intermediate nodes to cache NDOs for possibly faster data retrieval. The survey also highlights some of the areas in which ICN would be a better solution, with a focus on efficient content distribution. They conclude that several design ideas that are currently being researched need to be settled before ICN can be considered a new network alternative to today’s HCN solution.

Information Centric Networking in the IoT: Experiments with NDN in the Wild

Wählisch et al. explore the feasibility, advantages and challenges of an ICN-based solution for the Internet of Things. The results are based on experiments on real IoT networks with tens of constrained nodes in several different rooms and buildings. The achieved results indicate a positive use case for ICN in the IoT, as it could offer advantages over a host-centric approach in terms of both energy and memory usage.

Low-power Internet of Things with NDN & Cooperative Caching [25]

A significant factor for the success of ICN is that it should perform better than its HCN counterpart. This paper by Oliver Hahm et al. investigates, from an energy efficiency perspective, how ICN fares compared to an HCN alternative in IoT scenarios. Their results are based on extensive, large-scale experiments on up to 240 nodes and emulations of up to 1000 nodes. They conclude that their ICN implementation with NDN (see 2.2.8) can provide a significant advantage over a similar HCN approach in terms of energy efficiency. In their tests they achieve a content availability of over 90% and provide auto-configuration mechanisms enabling practical ICN implementations with reduced energy consumption.

4 Name convention

The namespace is one of the most important factors for an ICN. The namespace decides how the NDOs can be routed and constrains how the content can be expressed. An over-complex namespace will express the NDO precisely but put a larger strain on the network, whereas an over-simplistic namespace could cause problems from both routing and content creation perspectives. The following sections discuss how and why different suggested name conventions can be used as well as how they compare to each other.

4.1 Design

When designing a namespace convention there are numerous aspects to take into consideration: the size of the network, the number of addressable nodes, the computational power of the network nodes, and much more. To evaluate and compare how different namespace conventions affect a network, measurable data is needed. The sizes of the NDOs, as well as the total and average FIB sizes of a network’s nodes, give an insight into the efficiency of the implemented name convention.

A namespace should be designed and optimized for its target use, meaning that the format, the amount of needed information, and the complexity of both the content space and the namespace should reflect how the NDOs are to be used. Furthermore, a significant design choice is how the needed information of the NDOs should be expressed, i.e. the balance between an NDO’s namespace and content space. If a larger part of the information is stored in the namespace, it could lead to easier and more flexible routing at the cost of an overall increased network load. A smaller namespace, on the other hand, puts a larger strain on the routing properties of the network, as the same routing capabilities have to be achieved with less information.

In terms of functionality we drew inspiration from today’s HCN solutions. One of the more popular and lightweight network protocols is CoAP and its extension LwM2M. A core functionality of CoAP is to allow for basic node interaction and method signaling. By using a bit-defined convention in its header, the CoAP protocol can signal which method or type of packet is being sent. CoAP’s header and usage cannot be directly translated into an ICN scenario, but some of the functionality could be ported into the namespace, the content, or a combination of both. Deciding where and how much information should be represented is a crucial part of the design and will likely have a significant impact on the network as a whole. Furthermore, the namespace can be represented with a hierarchical or a flat naming, which are better suited for different scenarios. To motivate the design choices, we study the effect of different name conventions by testing them over several test networks.

4.2 Implementation

The implementation of the design ideas has been split into different parts. The sections cover how the CCN-lite environment was utilized for development, how and why the name conventions were implemented, the utilized networks, and lastly how the different name conventions differ.

4.2.1 Test setup

To study the effect of different name conventions, it is necessary to build a reliable and easy-to-use test setup in which we can make changes to the test system and see their effects without having to reconfigure and rebuild the system from scratch. The values that we change and experiment with are fed into the test setup in such a way that no recompilation of CCN-lite’s source code is needed; instead, the parameters are passed to the right parts of the system during initiation.

CCN-lite is a library that is built into multiple binary modules, each containing its own category of functionality. The test setup runs a separate instance of a CCN-lite relay in each of the terminal windows that get initiated. A CCN-lite relay acts as a node in the test network and manages the routing of network packets. The relay holds the FIB, PIT, and CS, decodes each NDO (interest or content), and decides how to process an incoming or outgoing packet. An incoming interest is forwarded and added to the PIT. If the incoming packet is instead a response to an entry in the PIT, it is routed back on the network interface that the corresponding interest arrived on, and the PIT entry is then removed.

A shell script was created to deploy our test setup in a Linux GNOME terminal environment. Due to the restrictions of the CCN-lite library, the script was built around its pre-compiled binary files. The script opens new terminal windows and then runs user-specified commands, which behaves the same as if we had manually started each terminal window and manually configured the running instance to have the desired functionality. To keep management control of the running instances in the different windows, the manually started window serves as a control window with options to restart the setup, tear down the setup, and start a different test scenario.

To keep control over the parameters that should not fluctuate during our tests, we cannot let the relay populate its CS itself: if the test environment executes for a long time, we cannot be certain that the CS matches previous runs, and that would introduce an unwanted error in our measurements. Our solution is to predefine the CS for each running relay; during the setup script’s execution each CS is populated with predefined content. The predefined content consists of data packets created with CCN-lite’s compiled module for packet creation, where the name, the content, and the packet format are defined before a packet is created and saved locally for later use.
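The relay behaviour described above can be sketched in a few lines of Python. This is a simplified model for illustration, not CCN-lite’s implementation: the class, the method names and the face labels are hypothetical, the CS is predefined and never filled at run time (as in our test setup), and the FIB lookup is a plain longest-matching-prefix search.

```python
class Relay:
    """Toy relay with a predefined CS, a static FIB and a PIT."""

    def __init__(self, content_store=None, fib=None):
        self.cs = dict(content_store or {})  # name -> content (predefined)
        self.fib = dict(fib or {})           # prefix -> outgoing face
        self.pit = {}                        # name -> faces the interest arrived on

    def _lookup(self, name):
        """Return the face of the longest FIB prefix matching the name."""
        matches = [p for p in self.fib
                   if name == p or name.startswith(p.rstrip("/") + "/")]
        return self.fib[max(matches, key=len)] if matches else None

    def on_interest(self, name, in_face):
        if name in self.cs:                  # answer directly from the CS
            return [("content", in_face, self.cs[name])]
        if name in self.pit:                 # already pending: remember the face
            self.pit[name].add(in_face)
            return []
        out_face = self._lookup(name)
        if out_face is None:                 # no route: drop the interest
            return []
        self.pit[name] = {in_face}           # leave a breadcrumb
        return [("interest", out_face, name)]

    def on_content(self, name, content):
        faces = self.pit.pop(name, None)     # the PIT entry is removed
        if faces is None:                    # unsolicited content is dropped
            return []
        return [("content", face, content) for face in sorted(faces)]


relay = Relay(content_store={"/app/2/5/1/1/1": "21.1"}, fib={"/app": "face0"})
from_cs = relay.on_interest("/app/2/5/1/1/1", "faceA")
forwarded = relay.on_interest("/app/2/5/1/2/1", "faceA")
returned = relay.on_content("/app/2/5/1/2/1", "19.8")
```

The first interest is served from the predefined CS, the second is forwarded on ”face0” and recorded in the PIT, and the returning content follows the PIT entry back to ”faceA”, after which the entry is gone.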
For relays to be able to communicate with one another, each relay needs a network interface and its FIB needs entries with matching namespaces coupled with an outgoing interface. The relay can then determine whether, and through which interface, a packet should propagate. A dynamically changing FIB would introduce uncertainties in the test results. To avoid this, we determined the FIB entries beforehand and populated the FIB at the start of execution. This allows us to build a network topology where we have full control of the network and can determine how the packets are propagated.

To be able to send packets based on the information in the PIT, we need a connection between the relays over which the packets travel. CCN-lite works partly as an overlay on top of today’s HCN network. One of its abstractions, called a face, realizes an ICN-based connection between machines, both over localhost and over larger networks such as the internet. A face is based on a UDP connection that is set up towards another relay. The connection established with a face is unidirectional: the relay can send a request to the relay at the other end and receive the response only over a momentarily established connection that is tied to the initial request and is disconnected when the corresponding PIT entry is removed. The contacted relay cannot later use the momentary connection to send a request of its own or for anything other than replying to the request. It needs its own face towards the first relay to be able to send requests to that relay. To achieve bidirectional ICN communication between two relays, each relay therefore needs a face towards the other.

To design a face we need to decide how and where a relay should route NDOs. This is realized by adding forwarding rules to a relay’s FIB. A forwarding rule is a mapping of a prefix to an outgoing face. An outgoing face can thereby appear in multiple forwarding rules, where the rules differ in which prefix is specified. These forwarding rules are predetermined and are added to the FIBs during the initiation of the test setup.

To add and edit the information of each relay that gets initiated by our shell script, and to control the relays at run-time, we use the management module of CCN-lite. The module uses Unix sockets [20] for inter-process communication to pass management messages, which allows for populating the PIT and FIB while keeping the ability for later management such as quitting and tearing down the setup.
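A forwarding rule as described above, a prefix mapped to an outgoing face, can be sketched as follows. The sketch matches names segment by segment and picks the longest matching prefix; the class name, the face labels and the example prefixes are hypothetical and only serve as illustration.

```python
def split_name(name: str) -> list:
    """Split a name such as "/app/2/5" into its segments."""
    return [segment for segment in name.split("/") if segment]


class Fib:
    """Toy FIB: a list of (prefix segments, outgoing face) rules."""

    def __init__(self):
        self.rules = []

    def add_rule(self, prefix: str, face: str) -> None:
        self.rules.append((split_name(prefix), face))

    def lookup(self, name: str):
        """Return the face of the longest prefix that matches the name."""
        segments = split_name(name)
        best = None
        for prefix, face in self.rules:
            if segments[:len(prefix)] == prefix:
                if best is None or len(prefix) > len(best[0]):
                    best = (prefix, face)
        return best[1] if best else None


fib = Fib()
fib.add_rule("/app", "face_up")         # default: towards the rest of the network
fib.add_rule("/app/2/5", "face_down")   # the same face appears in two rules
fib.add_rule("/app/2/6", "face_down")

down = fib.lookup("/app/2/5/1/1")   # longest match is "/app/2/5"
up = fib.lookup("/app/3/1/1/1")     # only "/app" matches
none = fib.lookup("/other/1")       # no rule matches
```

Here ”face_down” is used by two rules that differ only in their prefix, while every other name under ”/app” falls back to ”face_up”.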

4.2.2 Naming structure

Four different name conventions were implemented to research the suitability and efficiency of a name convention in a Smart Home scenario. The four name conventions use either a hierarchical or a flat naming, but all of them address a specific end-node with the following information: ”App”, ”House number”, ”Floor number”, ”Room number” and ”Node number”. How the name conventions were implemented and used is explained in the following sections.

4.2.3 How to identify a node

Throughout this section, the example node ”/app/2/5/1/1” is used to explain the first five segments,

/App/HouseNumber/FloorNumber/RoomNumber/NodeNumber

which are used to distinguish each node in the network. The use and proposed implementation of each segment are explained below.

App, the first segment, represents the overarching application that is currently being used. This offers flexibility to concurrently run different applications over the same network without them interfering with each other. The application segment is suggested to be represented with 1-3 characters, and the example ”/app/X/X/X/X” targets the network running an application called ”app”.

HouseNumber is used to differentiate between the different houses that may be connected to the same network. The house segment is suggested to be represented with 1 character, and for the example node ”/app/2/X/X/X” it means that the node is placed in house number 2.

FloorNumber, the third segment, states which floor of the house the node is on and is suggested to be represented with 1 character. For the example node ”/app/2/5/X/X” it means that the node is placed somewhere on the 5th floor.

RoomNumber states which room the node is in and is suggested to be represented with 1 character. For the example node ”/app/2/5/1/X” it means that the node is placed somewhere in room 1 on the 5th floor in house 2.

NodeNumber addresses each unique node in that specific room. The node segment is suggested to be represented with 1 character. The example ”/app/2/5/1/1” addresses node 1 in room 1 on the 5th floor in house 2.

With ”App”, ”House”, ”Floor”, ”Room” and ”Node”, all sensors are uniquely addressable. The hierarchical implementation separates the segments with a ”/” whereas the flat one concatenates the segments into one. Henceforth, ”/App/House/Floor/Room/Node/” and its flat counterpart ”/App House Floor Room Node” are referred to as the SensorID, the way to uniquely identify each node in a network.

4.2.4 Considered name conventions

When a node has received some arbitrary data, it needs to know how to decipher the information. How to handle and decode an NDO can be expressed by extending either the namespace or the content space. The information that needs to be conveyed for a Smart Home scenario is in both cases:

SubNode is used if a sensor has multi-sensing capabilities and needs to distinguish between the different measurements. The segment is suggested to be represented with 1 character.

Method signals the type of the message, e.g. GET, SET, ERROR. The method determines how the node will use the data. The segment is suggested to be represented with 1 character.

Type+Subtype is suggested to be represented with 2 characters with respect to the specific method type. The first byte represents the type and the second byte its sub-type, e.g. Temperature and Celsius.

PKT NMR is the final segment for all NDOs and numerically differentiates new NDOs from previous ones, e.g. ”PREFIX/1” and ”PREFIX/2”, which could respectively be the first and second measurement of a node.

By extending the namespace, ”/SubNode/Method/Type+SubType/PKT NMR” is appended as a suffix to the SensorID for both the hierarchical and the flat name convention. When using the content header instead, ”SubNode”, ”Method” and ”Type+Subtype” are bit-represented in a standardized header and the namespace is only extended with ”PKT NMR”. This results in four name conventions that utilize the namespace and content space differently and have comparative advantages and disadvantages for different scenarios. To summarize, the four considered name conventions are:

1. Hierarchical naming without content header (Hier).
”/App/House/Floor/Room/Node/SubNode/Method/Type+SubType/PKT NMR”
2. Hierarchical with content header (Hier CH).
”/App/House/Floor/Room/Node/PKT NMR”
3. Flat without content header (Flat).
”/App House Floor Room Node/SubNode/Method/Type+SubType/PKT NMR”
4. Flat with content header (Flat CH).
”/App House Floor Room Node/PKT NMR”
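The four conventions can be generated from the same set of fields, as the following Python sketch shows. The dataclass and function names are hypothetical, and the flat SensorID is assumed to be the plain concatenation of the five segments without any separator; the content header itself is not modelled.

```python
from dataclasses import dataclass


@dataclass
class NdoFields:
    app: str
    house: str
    floor: str
    room: str
    node: str
    sub_node: str
    method: str
    type_subtype: str
    pkt_nmr: int


def sensor_id(f: NdoFields, flat: bool) -> str:
    """SensorID: "/"-separated (hierarchical) or concatenated (flat, assumed)."""
    parts = [f.app, f.house, f.floor, f.room, f.node]
    return "/" + ("".join(parts) if flat else "/".join(parts))


def ndo_name(f: NdoFields, flat: bool, content_header: bool) -> str:
    """With a content header only PKT NMR is appended to the SensorID."""
    if content_header:
        suffix = [str(f.pkt_nmr)]
    else:
        suffix = [f.sub_node, f.method, f.type_subtype, str(f.pkt_nmr)]
    return sensor_id(f, flat) + "/" + "/".join(suffix)


example = NdoFields("app", "2", "5", "1", "1", "2", "G", "TC", 1)
names = {
    "Hier": ndo_name(example, flat=False, content_header=False),
    "Hier CH": ndo_name(example, flat=False, content_header=True),
    "Flat": ndo_name(example, flat=True, content_header=False),
    "Flat CH": ndo_name(example, flat=True, content_header=True),
}
```

For the example node, ”Hier” yields ”/app/2/5/1/1/2/G/TC/1” and ”Flat CH” yields ”/app2511/1” under the stated concatenation assumption.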

4.2.5 Considered network topologies

To evaluate the efficiency of the different name conventions, two extreme-case network topologies were tested with a varying number of nodes, each configured to support the four name conventions separately.

The first test scenario uses a chain-based network topology where each node is connected to two neighbors, except for the two end nodes, which only have one neighbor. A general chain-based network topology with M nodes is visualized in figure 4.

Figure 4: A chain-shaped network topology with M amount of nodes

Each node in a network needs to populate its FIB to determine how an incoming interest should be routed. One way to evenly distribute how an interest is propagated in a chain network is to route around three core nodes: the root node and the two extreme nodes (see figure 5). The root node (Nroot) is the middle node of the chain, and the two extreme nodes (NAy and NBy) are the final two addressable and content-producing nodes.

Figure 5: A generalized chain-shaped network topology displaying how an arbitrary hierarchical namespace can be implemented

The non-extreme nodes have exactly two outgoing interfaces, meaning two possible choices: either up (blue arrow) or down (green arrow) in the naming hierarchy. When an interest is sent to a node and its name matches one of the node’s sub-nodes (down the hierarchy), the interest is propagated through the network until it reaches the final node. In figure 6 an interest is sent to a middle node and is then propagated down to the final node NBy. The content is then returned by following the breadcrumb trail left by the interest (see figure 7).

Figure 6: Visualizes how an interest can be propagated from node NB to node NBy

Figure 7: Visualizes how the content gets routed from node NBy to node NB

If the incoming interest does not match any sub-node, the interest is instead propagated up the hierarchy to the root node, which then sends the interest towards the other final node (see figure 8).

Figure 8: Visualizes how an interest gets routed from node NB to node NAy

Figure 9: A chain-shaped network with 25 nodes displaying how the hierarchical namespace was implemented.

The number of nodes and the used namespace for a chain based network will determine how to appropriately balance and fill the nodes’ FIBs.

We designed an arbitrary M-long chain (see figure 4) such that, when configured to use a hierarchical namespace, all the non-center nodes propagate ”App” back towards the center node. This means that as long as the first segment is the target application ”App”, an incoming interest is propagated up the hierarchy to the center node unless a node has a longer matching entry in its FIB. If a node has a matching FIB entry longer than ”app”, the interest is propagated to the next node, and so on until the final node. A valid interest sent to an arbitrary node will thus eventually be propagated down the hierarchy to the specific node.

The other test scenario is a Fully Connected Mesh (FCM) topology, which is the opposite extreme of a chain, as all nodes are directly connected to all other nodes. A generalized FCM-based topology with five nodes is visualized in figure 10, and the routing properties for Node 1 are shown for a hierarchical namespace in figure 11 and for a flat namespace in figure 12.

Figure 10: A star-shaped network topology with five nodes using an arbitrary name convention

Figure 11: A fully connected mesh topology with 5 nodes emphasising Node 1’s routing for hierarchical naming

Figure 12: A fully connected mesh topology with 5 nodes emphasising Node 1’s routing for flat naming

4.2.6 Interest and content

The respective interest and content for the different name conventions were constructed and analyzed with the CCN-lite library. The following example demonstrates how a content object with hierarchical naming and without a content header is built. The name ”/app/2/5/1/1/2/G/TC/1” and the content ”21.1” are used, and the result is piped into CCN-lite’s decoder to display the byte representation of the packet.

Listing 1: How to create and display the contents of an NDO.

    $ $CCNL_HOME/build/bin/ccn-lite-mkC -s ndn2013 "/app/2/5/1/1/2/G/TC/1" | $CCNL_HOME/build/bin/ccn-lite-pktdump
    21.1
    # ccn-lite-pktdump, parsing 50 bytes
    # auto-detected NDN TLV format (as of Mar 2014)
    #
    0000 06 30 --
    0002 07 1e --
    0004 08 03 --
    0006 61 70 70 |app|
    0009 08 01 --
    000b 32 |2|
    000c 08 01 --
    000e 35 |5|
    000f 08 01 --
    0011 31 |1|
    0012 08 01 --
    0014 31 |1|
    0015 08 01 --
    0017 32 |2|
    0018 08 01 --
    001a 47 |G|
    001b 08 02 --
    001d 54 43 |TC|
    001f 08 01 --
    0021 31 |1|
    0022 14 00 --
    0024 15 05 --
    0026 32312e310a |21.1.|
    002b 16 03 --
    002d 1b 01 --
    002f 00 |.|
    0030 17 00 --
    0032 pkt.end

The respective content and interest sizes for the different name conventions are listed in table 2.

Packet sizes
Name convention    Content (B)    Interest (B)
Hier               48             48
Hier+CH            40             41
Flat               40             40
Flat+CH            32             33

Table 2: The size in bytes for the packet type between the different name conventions.
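The content sizes of the two conventions without a content header can be recomputed from the structure visible in Listing 1. The sketch below assumes 1-byte type and 1-byte length fields (valid for short values), the framing seen in the dump (an empty TLV of type 0x14, a 5-byte content including the trailing newline, a 3-byte TLV of type 0x16 and an empty TLV of type 0x17), and a flat SensorID that is the plain concatenation ”app2511”. The function names are hypothetical.

```python
def tlv(value_len: int) -> int:
    """Size of a TLV with 1-byte type and 1-byte length (short values only)."""
    assert value_len < 253, "longer values need a multi-byte length field"
    return 2 + value_len


def data_packet_size(components, content_len):
    """Return (inner length, total length) of a data packet as in Listing 1."""
    name = tlv(sum(tlv(len(c)) for c in components))  # 0x07 holding 0x08 TLVs
    meta = tlv(0)                                     # 0x14, empty in the dump
    content = tlv(content_len)                        # 0x15
    sig_info = tlv(tlv(1))                            # 0x16 holding 0x1b, 1 byte
    sig_value = tlv(0)                                # 0x17, empty in the dump
    inner = name + meta + content + sig_info + sig_value
    return inner, tlv(inner)                          # outer 0x06 TLV


hier = ["app", "2", "5", "1", "1", "2", "G", "TC", "1"]
flat = ["app2511", "2", "G", "TC", "1"]   # concatenated SensorID (assumption)

hier_inner, hier_total = data_packet_size(hier, content_len=5)
flat_inner, flat_total = data_packet_size(flat, content_len=5)
```

Under these assumptions the hierarchical packet has an inner length of 48 bytes and a total of 50 bytes, which agrees with the ”06 30” header and the 50 parsed bytes in Listing 1, and the flat packet has an inner length of 40 bytes. Both inner lengths coincide with the Content column of table 2 for Hier and Flat; the sketch does not cover the content header variants or the interest sizes.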

4.3 Results

To determine the efficiency and network impact of the different name conventions, several different networks were implemented and configured to use the different name conventions, and we evaluated how they impacted the forwarding information base (FIB).

4.3.1 Measurements

To evaluate and compare the efficiency of a hierarchical and a flat name convention, 20 different networks were realised. Ten networks used a fully connected mesh topology, consisting of 5, 9, 25, 49 and 99 nodes, each configured separately with hierarchical and flat naming. The other ten networks used a chain topology and were likewise configured for 5, 9, 25, 49 and 99 nodes with hierarchical and flat naming.

The average FIB-size for the different name conventions over a varying number of nodes (see figure 13) indicates that a chain-based network with hierarchical naming (Chain Hier) has the smallest average FIB-size, 28 bytes, for all network sizes. The second smallest average FIB-size is that of FCM Flat and Chain Flat for N=5, with a value of 68 bytes. The largest average FIB-size is that of FCM hierarchical (FCM Hier) for N=99, at 2450 bytes. The results strongly indicate that a chain network has a more efficient FIB-size utilization when using hierarchical naming, whereas an FCM network has a lower average FIB-size when using flat naming.

The sum of all nodes’ FIB-sizes for 5, 9, 25, 49, and 99 nodes is displayed in figure 14. The results indicate that Chain Flat, FCM Hier, and FCM Flat increase at nearly the same rate as N increases. Chain Hier increases at a significantly lower rate than the other networks. The total FIB-sizes for Chain Hier and FCM Hier are 145 and 500 bytes for N=5, compared to 3140 and 241 167 bytes for N=99, meaning the FCM Hier total is roughly 345% of the Chain Hier total for N=5 and 7680% for N=99.
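The relation between the two totals can be checked with a short calculation. The Python sketch below uses only the four totals quoted above; the dictionary layout and the function name are illustrative.

```python
# Total FIB-sizes in bytes as reported for figure 14
totals = {
    5: {"chain_hier": 145, "fcm_hier": 500},
    99: {"chain_hier": 3140, "fcm_hier": 241167},
}


def size_ratio(n: int) -> float:
    """FCM Hier total divided by Chain Hier total for a network of n nodes."""
    return totals[n]["fcm_hier"] / totals[n]["chain_hier"]


for n in totals:
    ratio = size_ratio(n)
    print(f"N={n}: {ratio:.2f} times, i.e. {100 * ratio:.0f}% of Chain Hier "
          f"or {100 * (ratio - 1):.0f}% larger")
# N=5: 3.45 times, i.e. 345% of Chain Hier or 245% larger
# N=99: 76.80 times, i.e. 7680% of Chain Hier or 7580% larger
```

The figures 345% and 7680% thus express the FCM Hier total as a percentage of the Chain Hier total; expressed as an increase, the corresponding values are about 245% and 7580%.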

Figure 13: A graph displaying the average FIB-sizes for 5, 9, 25, 49, and 99 number of nodes.

Figure 14: A graph displaying the total FIB-sizes for 5, 9, 25, 49, and 99 number of nodes.

4.4 Discussion

In section 4.2.3, four name conventions were implemented: a hierarchical, a hierarchical with a content header, a flat, and a flat with a content header. For routing purposes there is no distinction between using and not using a content header, so the hierarchical and flat namespaces were compared by how they populated the FIBs of different networks while achieving the same functionality. The results in figure 13 suggest that a hierarchical name convention outperforms a flat one in a chain-based network topology, whereas a flat name convention performs noticeably better in an FCM-based topology.

Hierarchical naming is well suited to a network topology with a hierarchical-like structure, e.g. a chain or a tree, as a node’s FIB only needs to be more precise the lower down the hierarchy the node is. The intermediate nodes in a network only need to know how to propagate an interest to the next forwarding node, not directly to the final end-node. This explains the difference between flat and hierarchical naming for a chain-based topology. Hierarchical naming (orange line, Chain Hier) in figure 13 has roughly the same average FIB-size independent of the number of nodes, as the average node always has two outgoing interfaces with only one namespace per interface, either up or down the hierarchy. In contrast, flat naming (blue line, Chain Flat) yields an average FIB-size that increases linearly with the number of nodes, because each node needs to know how to forward to every other individual node in the network instead of just its two neighbors.

In a fully connected mesh (see figure 10), all nodes are direct neighbors of each other. The flexibility that hierarchical naming gained in the chain-based topologies does not apply, as all NDOs are forwarded directly to the targeted node. The results in figure 13 show that hierarchical naming (green line) is strictly worse than flat naming (red line) for the FCM-based networks. This is expected, as a flat name expresses the concatenated segments of its hierarchical counterpart and therefore eliminates the data needed to represent each individual segment.
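The reasoning above can be illustrated with a small counting model. The following Python sketch is a simplification under stated assumptions, not the measurement code behind figures 13 and 14: it counts FIB entries rather than bytes, it assumes one entry per outgoing interface for a chain with hierarchical naming, and it assumes one entry per other node in every other case. The function names are hypothetical.

```python
def fib_entries_per_node(n_nodes: int, topology: str, naming: str) -> list[int]:
    """Number of FIB entries held by each node under the simplified model.

    Assumptions (illustrative, not measured values):
      * chain + hierarchical: one prefix per outgoing interface, so the two
        end nodes hold 1 entry and every intermediate node holds 2;
      * all other combinations: one entry per other node in the network.
    """
    if n_nodes < 2:
        raise ValueError("a network needs at least two nodes")
    if topology not in ("chain", "fcm") or naming not in ("flat", "hierarchical"):
        raise ValueError("unknown topology or naming")
    if topology == "chain" and naming == "hierarchical":
        return [1 if i in (0, n_nodes - 1) else 2 for i in range(n_nodes)]
    return [n_nodes - 1] * n_nodes


def average_entries(n_nodes: int, topology: str, naming: str) -> float:
    """Average number of FIB entries per node."""
    entries = fib_entries_per_node(n_nodes, topology, naming)
    return sum(entries) / len(entries)


for n in (5, 9, 25, 49, 99):
    hier = average_entries(n, "chain", "hierarchical")
    flat = average_entries(n, "chain", "flat")
    print(f"N={n}: chain hierarchical {hier:.2f} entries, chain flat {flat:.0f} entries")
```

In this model the chain with hierarchical naming stays below two entries per node on average, while flat naming grows linearly with N. For an FCM both namings need the same number of entries, so the difference seen in figure 13 comes from the size of each entry, which this sketch does not model.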

4.5 Conclusion

Section 4 has gone through the design ideas and how the designs were implemented and realized through four different name conventions: two flat and two hierarchical, with and without a content header (see 4.2.3). From a FIB perspective, the routing is identical with and without a content header. The results were based on 20 different networks, which were designed around the following three factors:

1. Hierarchical or flat naming.
2. The number of nodes in the network (5, 9, 25, 49, or 99).
3. The network topology (fully connected mesh or chain-based).

The results are displayed in figures 13 and 14. Both figures indicate that hierarchical naming most likely yields a more efficient FIB-size utilization for a chain-based network topology, whereas a flat name convention should yield a more efficient FIB-size utilization for an FCM-based topology.

5 Low power

Nodes that run on a small battery and are expected to work off a single charge for many months need to conserve as much energy as possible. In this type of application, a small and efficient name convention that reduces packet sizes, and thereby limits the number of retransmissions in a noisy environment, is only part of what makes a network energy efficient and lets a node operate over an extended period of time. This section describes how we approached requirement 2 in 1.1 and discusses the trade-offs as well as the different approaches to making an IoT node power efficient. Aspects such as radio duty cycle, burst transfers, and idle state all play a major role in reducing a node’s power consumption while still letting it produce and deliver data at a relevant interval and response time.

5.1 Background

To motivate the choices we have made, this section introduces the concepts that we took into consideration when performing our evaluation and tests of power consumption.

5.1.1 IoT Network Characteristics

First, we consider a single wireless network as a shared connection point for the transmission of data from multiple nodes in the same network, which is available for connection and transmission when no other node or wireless transmission is using the wireless domain. We consider the wireless link speed to be low, due both to the small microprocessors [9] typically used in IoT devices (which are easily overwhelmed) and to the power-conserving and lossy nature of IoT link layers, which yield low data rates.

A node can be in one of three states: active, transmitting, or sleeping. In the sleeping state, a node has its radio switched off and its CPU in sleep mode. A node in the sleeping state transitions to the active state when triggered by an external interrupt, generated either by a timer or by a sensor, e.g. by a temperature sensor if the temperature is above a certain threshold. In the active state, a node’s CPU is running and its radio transceiver is listening or transmitting. We consider a scenario where a sensor that has new data can wake up the node if it has been sleeping. The node then becomes active and processes the data accordingly, after which it may consider going back to sleep. This assumption is in line with the capabilities of typical IoT hardware and available IoT software platforms.

5.1.2 Duty cycles

The duty cycle of a node is the percentage of time it spends in the active state and is one of the most important aspects to consider for IoT nodes. Less time spent in the active state means less energy spent keeping the radio communication running, but also a lower data throughput. The duty cycle is therefore defined by the application’s requirements for latency and throughput. If, however, quick response and high throughput can be sacrificed in favor of power savings, a method called burst transfer can be used. This method lets the sensor pool multiple packets so that the next time it transmits it can send them all at once, and thus save the energy otherwise spent on repeated power-on and initialization.
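As a minimal sketch of the definition above, the duty cycle can be written as the active time divided by the length of one wake/sleep period. The Python helpers below are illustrative only; their names and the example values are assumptions and do not come from the test setup.

```python
def duty_cycle(active_s: float, sleep_s: float) -> float:
    """Fraction of one wake/sleep period that a node spends in the active state."""
    period_s = active_s + sleep_s
    if active_s < 0 or sleep_s < 0 or period_s == 0:
        raise ValueError("durations must be non-negative and not both zero")
    return active_s / period_s


def sleep_for_duty_cycle(active_s: float, target_duty: float) -> float:
    """Sleep time per period needed to reach a target duty cycle (0 < target <= 1)."""
    if not 0 < target_duty <= 1:
        raise ValueError("target duty cycle must be in (0, 1]")
    return active_s * (1 - target_duty) / target_duty


# Example: a node that is active for 1 s and then sleeps for 9 s.
print(f"duty cycle: {duty_cycle(1.0, 9.0):.0%}")
```

A lower duty cycle means a longer sleep time for the same active window, which is where the latency and throughput cost described above comes from.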

5.1.3 Push/Pull transmission

There are two principles for transmitting data between a network and an IoT device: push and pull. With the pull method, the sensor needs an active connection to the network to be able to respond as quickly as possible to a data request, which consumes a lot of power because the network circuit never powers down. With the push method, the node decides when it has data to send and can therefore decide whether the network connection should stay up until the next transmission or be powered down to save energy. The pull method can increase the availability of data in the network, since data is delivered when it is requested. With the push method, the availability of data depends on the sleep interval between network connections and can therefore differ significantly from the pull method. If the availability of data is less important than power savings, the device can apply the push method and, instead of powering on to send a single data reading, collect multiple readings and burst-transfer them together. This reduces the number of power-ons at the cost of lower data availability.
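The trade-off between the number of power-ons and data availability under the push method can be made concrete with a small sketch. It assumes a sensor that produces one reading at a fixed interval and sends as soon as a burst is full; the function names and the 60 s interval are hypothetical and chosen only for illustration.

```python
def wakeups_per_hour(sample_interval_s: float, burst_size: int) -> float:
    """Radio power-ons per hour when burst_size readings are sent together."""
    if sample_interval_s <= 0 or burst_size < 1:
        raise ValueError("interval must be positive and burst size at least 1")
    return 3600.0 / (sample_interval_s * burst_size)


def oldest_reading_age_s(sample_interval_s: float, burst_size: int) -> float:
    """Age of the oldest reading in a burst at the moment the burst is sent."""
    if sample_interval_s <= 0 or burst_size < 1:
        raise ValueError("interval must be positive and burst size at least 1")
    return sample_interval_s * (burst_size - 1)


for burst in (1, 5, 10):
    print(
        f"burst={burst}: {wakeups_per_hour(60.0, burst):.0f} power-ons/hour, "
        f"oldest reading delayed {oldest_reading_age_s(60.0, burst):.0f} s"
    )
```

Pooling ten readings cuts the number of power-ons by a factor of ten, but the oldest reading in each burst is then nine sampling intervals old when it is delivered.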

5.2 Design and Implementation

In our tests and measurements, we have made some simplifications to our parameters and test environment compared to a full-scale network implementation, so that we can pinpoint the parts we want to explore and avoid an overly complex environment with non-deterministic behavior. In this section we present the parameters we used, why we chose them, and how our test setup was configured.

5.2.1 Parameters

An IoT device network differs from other types of wireless communication, as described in 5.1.1. One drawback of using as little transmission energy as possible is that the network becomes more susceptible to radio disturbance, so the already slow network gets even slower due to packet loss. To model this type of behavior, the network standard IEEE 802.15.4 was used as a base because of its limited transmission speed, which is in the Kb/s range, and a network speed of 10 Kb/s was used to model the communication capacity between nodes. In wireless networks, the maximum achievable communication speed is greatly affected by the transmission distance and by disturbances that interfere with the transmission and cause packet loss. We therefore decided to keep this parameter constant in our setup, with the same value in the two different networks.

The test setup transmits a fixed amount of data, based on 1000 packets. Here we define packets as interests, since an interest consists largely of data information and is therefore a good representation of the benefits that come from our name convention. The packet size was calculated from an interest that contains the smallest namespace in 2 together with data version information, and is only 33 bytes, so the total amount of data propagating through the setups is 33000 bytes.

The variables in the networks are the duty cycle (see 5.1.2) and the number of nodes. By testing networks with an increasing number of nodes under different duty cycles, we can plot the power consumption for each configuration. When looking at where the biggest power savings in a network can be made, these two aspects stood out the most and are where an IoT device needs to optimize its usage.
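To give a feeling for these parameters, the sketch below computes the total payload and the raw airtime it implies. It assumes that 10 Kb/s means 10 000 bits per second and ignores link-layer headers, packet loss, and radio start-up time, so the numbers are a lower bound rather than measured values. The constant and function names are illustrative.

```python
import math

PACKET_SIZE_BYTES = 33   # size of one interest, from the text above
N_PACKETS = 1000         # fixed number of packets in the test
LINK_RATE_BPS = 10_000   # assumption: 10 Kb/s read as 10 000 bit/s


def total_payload_bytes() -> int:
    """Total amount of data that propagates through a test setup."""
    return PACKET_SIZE_BYTES * N_PACKETS


def airtime_s(n_bytes: int, rate_bps: int = LINK_RATE_BPS) -> float:
    """Time needed to put n_bytes on the link at the given bit rate."""
    return n_bytes * 8 / rate_bps


def packets_per_window(window_s: float) -> int:
    """Whole packets that fit in one active time window (headers ignored)."""
    return math.floor(window_s / airtime_s(PACKET_SIZE_BYTES))


print(f"total payload: {total_payload_bytes()} bytes")
print(f"airtime for all packets over one hop: {airtime_s(total_payload_bytes()):.1f} s")
print(f"packets per 0.3 s window: {packets_per_window(0.3)}")
```

Under these assumptions a single hop needs about 26.4 s of pure transmission time for the full data set, and roughly 11 packets fit into the smallest tested time window of 0.3 s.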

5.2.2 Network

The network setup was modified from the test setup in 4.2.1 so that we could quantify the communication between the nodes and see how long each node spends in the different states of the duty cycle. Given the duration and the amount of data transmitted in each state, we based the power consumption of each state of the IoT device on [4], and from that we can calculate the overall energy required for the entire network.

In the network, the propagation of packets does not depend on a probability that the receiving node is in a state where the packet cannot be delivered: a packet can be delivered whenever the sending node is in a transmission state. A node can, however, not receive and transmit a packet at the same time. These simplifications were made to keep effects unrelated to the explored parameters from disturbing the results. The network used a data push method so that the nodes could enter a sleep state instead of staying connected to the network in an idle state all the time, which is something a device with limited power wants to avoid.

The test was performed with network sizes from 5 to 99 nodes, and for each size the time window in the active state was varied from 0.3 to 15 seconds. All nodes in a network used the same active-state time in each iteration. The tested network topologies were a fully connected mesh and a chain network. For the chain network, the worst-case scenario of sending between the end-nodes was tested. In the fully connected mesh, one node acted as the sending node and the rest as receiving nodes, with each receiving node getting the same number of packets.
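The energy calculation described above can be sketched as a sum of the time spent in each state multiplied by the power drawn in that state. The power values in the following Python sketch are placeholders chosen for readability; they are not the values taken from [4], and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StatePower:
    """Power draw per state in milliwatts (placeholder values, not from [4])."""
    sleep_mw: float = 0.01
    active_mw: float = 10.0
    transmit_mw: float = 30.0


def node_energy_mj(sleep_s: float, active_s: float, transmit_s: float,
                   power: StatePower = StatePower()) -> float:
    """Energy in millijoules used by one node (mW multiplied by s gives mJ)."""
    return (sleep_s * power.sleep_mw
            + active_s * power.active_mw
            + transmit_s * power.transmit_mw)


def transmissions_per_packet(n_nodes: int, topology: str) -> int:
    """Link transmissions needed to deliver one packet in the tested scenarios.

    chain: worst case between the two end-nodes, so n_nodes - 1 hops.
    fcm:   the destination is a direct neighbor, so a single hop.
    """
    if n_nodes < 2:
        raise ValueError("a network needs at least two nodes")
    if topology == "chain":
        return n_nodes - 1
    if topology == "fcm":
        return 1
    raise ValueError("unknown topology")


print(node_energy_mj(sleep_s=100.0, active_s=10.0, transmit_s=1.0))
print(transmissions_per_packet(99, "chain"), transmissions_per_packet(99, "fcm"))
```

The network total is then the sum of `node_energy_mj` over all nodes. The hop count shows why the chain topology is expected to cost more: in the worst case every packet is re-transmitted by every intermediate node.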

5.3 Results

When sending data with different time windows and scaling up the number of nodes in an FCM network, we get the plots shown in figure 15. For the network consisting of 99 nodes, the overall power of the network decreases between time windows of 0.3s and 2.0s, but after 2.0s the power starts to increase, and with a time window over 14s it exceeds the power at Time=0.3. The networks with 25 and 49 nodes have a similar curve but do not exceed their initial power at the 0.3s time window. For the networks with 5 and 9 nodes the power decreases with each measurement, and the two curves are very close to each other with only small differences.

Figure 15: A graph displaying the power consumption for different numbers of nodes over different time window sizes in FCM networks.

Figure 15 plots the power of each network against the time window and shows how a network is affected by different time windows.

Figure 16: A graph displaying the power consumption for different time windows over different numbers of nodes in FCM networks.

To show the effect of a given time window, figure 16 plots the power against the number of nodes for each time window. There we can see that a time window of 0.9s consumes more power than a time window of 10s up to around 50 nodes, where the two lines cross each other. As the number of nodes in the network increases, a trend emerges where a larger time window leads to an increased power consumption, but at different rates.

Figure 17: A graph displaying the power consumption for different numbers of nodes over different time window sizes in chain networks.

For networks with a chain topology, figure 17 displays the curves plotted against power and time window in the same manner as in figure 15. All the curves in figure 17 display the same pattern: a high initial power consumption for a small time window that drops off as the time window increases. Note, however, the difference in power levels compared to the FCM networks.

Figure 18: A graph displaying the power consumption for different time windows over different numbers of nodes in chain networks.

Figure 18 is plotted in the same manner as figure 16 but for a chain network. What is interesting to note is the power used and that the lines are linear and grow at different rates. The line for the 0.9s time window grows the fastest and uses the most power. The 10s time window has the slowest increase and the lowest overall power usage.

5.4 Discussion

The power consumption of the two network topologies differs hugely, and arguably rightly so. In a chain network a packet needs to hop through every node between its source and its destination, and in every node the packet needs to be received in a time window and sent in a time window. If the time window is small, a node needs more time windows than it would with a larger one. As seen in both figure 15 and figure 17, the smallest time window creates a power peak for every configuration, hence avoiding repeated power-on and power-down is desirable. In an FCM network, by contrast, the destination node is the next hop for the packet, so the packet does not need to be re-transmitted and the network saves power. This is why the power of the chain networks is at a much higher level.

Figure 16, and especially its 0.9s and 10.0s time windows, shows how the cost of a long active time compares to the cost of multiple power-ons and power-downs. When the number of nodes in the network increases while the same amount of data is propagating, each node powers on to send only a small amount of data, and the idle time in the remainder of the time window is not utilized. In a network such as FCM, where each node does not have to propagate every packet, the amount of data each node receives and sends is much smaller, so the network can benefit from a smaller time window than a chain network.

In figure 18, where each line is linear and the lines differ only in slope, we can see the same type of behavior as in figure 16: a time window of 0.9s requires more power. Since each intermediate node in a chain network forwards all data, the effect on the power consumption is much more significant.

The N=99 line in figure 15 shows what happens when each node has a small amount of data to send and a large time window of which only a small part is used for transmission. This trend of rising power consumption with increasingly large time windows and very little data to send and receive will eventually appear in both topologies regardless of the node count. Once the point is passed where powering on costs less than staying connected to the network, fitting more packets into each time window no longer reduces the power usage, and powering down is the only option to save power.

Another option to combat the problem of powering up to send only a small amount of data is to use burst transfers and send multiple packets in the same time window. For this to be possible, the application must tolerate lower data availability, since the node powers up less frequently and hence delays the data delivery.
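The point at which powering down becomes worthwhile can be expressed as a break-even gap between transmissions. The sketch below assumes a simple model in which staying connected costs a constant idle power, sleeping costs a lower constant power, and each power-on costs a fixed amount of energy. All numeric values and names are hypothetical and only illustrate the comparison.

```python
def break_even_gap_s(wake_energy_mj: float, idle_mw: float, sleep_mw: float) -> float:
    """Gap between transmissions above which sleeping uses less energy than idling.

    Staying connected for t seconds costs idle_mw * t.
    Sleeping for t seconds and waking up again costs sleep_mw * t + wake_energy_mj.
    The two are equal at t = wake_energy_mj / (idle_mw - sleep_mw).
    """
    if idle_mw <= sleep_mw:
        raise ValueError("idle power must exceed sleep power")
    return wake_energy_mj / (idle_mw - sleep_mw)


def should_power_down(gap_s: float, wake_energy_mj: float,
                      idle_mw: float, sleep_mw: float) -> bool:
    """True if powering down during the gap saves energy under this model."""
    return gap_s > break_even_gap_s(wake_energy_mj, idle_mw, sleep_mw)


# Hypothetical values: 16 mJ per power-on, 10 mW idle, 2 mW asleep.
print(break_even_gap_s(16.0, 10.0, 2.0))        # break-even gap in seconds
print(should_power_down(0.5, 16.0, 10.0, 2.0))  # short gap: stay connected
print(should_power_down(5.0, 16.0, 10.0, 2.0))  # long gap: power down
```

With burst transfers the gap between transmissions becomes longer, which moves a node further past the break-even point and makes each power-on pay for more delivered data.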

5.5 Conclusion

Section 5 has highlighted the importance of designing a suitable duty cycle for the network’s nodes. Two different types of network topologies were tested: a chain-based and a fully connected mesh (FCM). The two topologies were tested with a varying number of nodes while the same amount of data propagated through the network. The power consumption was measured over the varying networks with different duty cycles. All networks were hugely affected by an excessively small time window, especially the chain network, due to its operating principle where each node in the network takes part in the data propagation. For a node to gain an advantage from powering down, the energy saved during the time between power-ons needs to exceed the energy cost of powering on. How to design an efficient time window depends significantly on the network and its needs. The results in 5.3 indicate that a time window of 4s would be a good alternative for the specific test scenarios.

6 Evaluation

This section further discusses the results from sections 4.2.2 and 5.3 and evaluates how they compare to the requirements in 1.1.

6.1 Researched topics evaluation

When designing name conventions there are several aspects to take into account while achieving sufficient flexibility for constrained devices. The researched topics in section 1.1 are discussed and evaluated independently below.

Achieve an efficient name convention for Smart Home ICN networks.

To flexibly name each unique sensor across a wide range of network topologies, we designed four different name conventions in section 4.2.2. The four name conventions have the same range of naming capabilities but a varying network impact, and should be applicable to most Smart Home scenarios. The interest and content sizes for the different name conventions are listed in table 2. Both the network topology and the computational power of the network’s nodes greatly restrict which name convention can be used.

To lower the overall network impact there are numerous aspects to take into consideration. One key aspect is how the routing capabilities differ between namespaces. The forwarding information base (FIB) determines how interests get routed within the network. The results in section 4.3 indicate how a better FIB utilization can be achieved. For the tested networks with a chain-based topology, a better FIB utilization was achieved with hierarchical naming. For the tested networks with an FCM-based topology, a better FIB utilization was achieved with flat naming.

However, it is noteworthy that the tested network scenarios are extreme cases and do not necessarily reflect a real network topology, though they give a plausible indication of when hierarchical or flat naming is more efficient than the other.

Achieve a power efficient wireless communication.

Power usage is a significant restricting factor for most lightweight networks. One way to optimize the power usage is to choose an appropriate duty cycle for the network’s nodes. The results in section 5.3 show how the time window for sending and receiving data before the nodes sleep can be optimized. Our results show that, for the designed protocol, sensors can save energy by avoiding a too small time window and by using burst transfers, so that a node pools its transfers and only spends the power of a single network power-on. For these methods to save power, however, the application’s data availability requirement and the node’s production rate need to be at a level where the power-on energy does not exceed the energy needed to maintain a constant network connection.

The results indicate that the network nodes can become more power efficient if the time window is properly designed for its use. If a time window is too short, the node may need to power on and off unnecessarily often, which is costly in energy. By using burst transfers, the node can save up several packets and transfer them using the power of a single network power-on. The power usage thus depends on the application’s data availability requirement and the node’s production rate in conjunction with the node’s time window.

6.2 Conclusion

This thesis researched how to implement efficient communication for Information Centric Networks (ICN) in a Smart Home scenario. The namespace is a core principle of ICN and is used to address and fetch all the data objects in the network.

The first researched area was how the namespace impacts the network’s forwarding information base (FIB). The results are based on measuring 20 different networks with varying network topologies and namespaces. The results indicate that a poorly designed namespace can put a significant strain on the network, as FIB-sizes can become excessively large and inefficient.

The second researched area was how to improve the nodes’ energy usage depending on the network topology and each node’s duty cycle. The duty cycle is the ratio between being awake (sending and receiving data) and sleeping (inactive). The results imply that a too small time window for being awake can lead to inefficient power usage, whereas large time windows lead to a slightly excessive power usage. An optimal time window could be determined for the tested networks, though a generalized one is hard to determine as it is strongly dependent on the specific network.

When implementing an ICN there are numerous aspects to take into consideration. This thesis has researched some of them, but further research with a wider range of network topologies and namespaces would be needed to conclusively determine how to design an efficient usage of ICN.

6.3 Future Work

There are several adaptations and other aspects that could have been explored to further investigate network functionality in ICN. This thesis was mainly focused on researching how the FIB and power usage varied with different networks, namespaces, and duty cycles. The following ideas could be further explored:

• Are there any other aspects that a namespace significantly impacts in an ICN? How would the namespace design differ if the target network was significantly larger?
• Apart from the node’s memory capacity, what other areas are affected by the FIB utilization? There are implementations with adaptive FIBs; could that be desirable from an IoT perspective?
• To achieve data confidentiality, how would the ICN network be impacted?

ICN is still a heavily researched network paradigm and numerous key aspects are still being developed. However, with further research ICN can take a step closer to being a viable alternative network paradigm that better meets future networking needs.

References

[1] A. Afanasyev, P. Mahadevan, I. Moiseenko, E. Uzun, and L. Zhang, “Interest flooding attack and countermeasures in named data networking,” accessed 2019-05-28. [Online]. Available: https://ieeexplore.ieee.org/abstract/document/6663516
[2] B. Ahlgren, C. Dannewitz, C. Imbrenda, D. Kutscher, and B. Ohlman, “A survey of information-centric networking,” accessed 2020-07-30. [Online]. Available: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6231276
[3] E. Baccelli, C. Gündoğan, O. Hahm, P. Kietzmann, M. Lenders, H. Petersen, K. Schleiser, T. Schmidt, and M. Wählisch, “Riot: an open source operating system for low-end embedded devices in the iot,” accessed 2019-04-12. [Online]. Available: https://riot-os.org/docs/riot-ieeeiotjournal-2018.pdf
[4] R. Balani, “Energy consumption analysis for bluetooth, wifi and cellular networks,” accessed 2020-02-15. [Online]. Available: http://www.nesl.ucla.edu/uploads/document/paperupload/254/PowerAnalysis.pdf
[5] “Ccn-lite, a lightweight implementation of the ccnx protocol and its variations,” CCN-lite, accessed 2019-04-2. [Online]. Available: https://github.com/cn-uofbasel/ccn-lite
[6] Cisco, “Cisco visual networking index: Forecast and trends, 2017–2022 white paper,” 2019-04-12. [Online]. Available: https://www.cisco.com/c/en/us/solutions/collateral/service-provider/visual-networking-index-vni/white-paper-c11-741490.html#_Toc532256789
[7] “Constrained application protocol (coap) draft-ietf-core-coap-18,” CoRE Working Group, accessed 2019-05-07. [Online]. Available: https://tools.ietf.org/html/draft-ietf-core-coap-18
[8] S. DiBenedetto and C. Papadopoulos, “Mitigating poisoned content with forwarding strategy,” accessed 2019-05-28. [Online]. Available: https://named-data.net/wp-content/uploads/2016/05/mitigating_poisoned_content.pdf
[9] C. B. M. Ersue and A. Keranen, “Terminology for constrained-node networks,” Internet Engineering Task Force (IETF), accessed 2019-05-13.
[10] “Gcc history,” GCC, accessed 2019-05-13. [Online]. Available: https://gcc.gnu.org/wiki/History
[11] “Gcc languages,” GCC, accessed 2019-05-13. [Online]. Available: https://gcc.gnu.org/onlinedocs/gcc/G_002b_002b-and-GCC.html#G_002b_002b-and-GCC
[12] “Github,” Github, accessed 2019-03-15. [Online]. Available: https://github.com/
[13] M. Handley, “Why the internet only just works,” accessed 2019-04-25. [Online]. Available: http://www0.cs.ucl.ac.uk/staff/M.Handley/papers/only-just-works.pdf/
[14] M. Hauben, “Behind the net: The untold history of the arpanet and computer science,” accessed 2020-07-30. [Online]. Available: http://www.columbia.edu/~rh120/ch106.x07
[15] “Applicability and tradeoffs of information-centric networking for efficient iot,” ICN Research Group, accessed 2019-05-07. [Online]. Available: https://tools.ietf.org/pdf/draft-lindgren-icnrg-efficientiot-03.pdf
[16] “Deployment considerations for information-centric networking (icn),” ICN Research Group, accessed 2019-05-07. [Online]. Available: https://tools.ietf.org/pdf/draft-irtf-icnrg-deployment-guidelines-05.pdf
[17] “Internet x.509 public key infrastructure certificate and certificate revocation list (crl) profile,” IETF, accessed 2019-05-22. [Online]. Available: https://tools.ietf.org/pdf/rfc5280.pdf
[18] “Security architecture for the internet protocol,” IETF, accessed 2019-04-9. [Online]. Available: https://tools.ietf.org/html/rfc4301
[19] V. Jacobson, “A new way to look at networking,” 2019-04-13. [Online]. Available: https://www.youtube.com/watch?v=8Z685OF-PS8
[20] M. Kerrisk, “unix - sockets for local interprocess communication,” accessed 2019-05-12. [Online]. Available: http://man7.org/linux/man-pages/man7/unix.7.html
[21] D. Kutscher, S. Eum, K. Pentikousis, I. Psaras, D. Corujo, D. Saucez, T. Schmidt, and M. Waehlisch, “Information-centric networking (icn) research challenges,” 2019.
[22] C. lite project group, “Ccn-lite project and community,” 2019-03-29. [Online]. Available: http://ccn-lite.net/
[23] N. lite project group from University of California, “Ndn-lite project,” 2019-03-30. [Online]. Available: https://github.com/named-data-iot/ndn-lite
[24] A. L. F. A. B. A. O. S. A. Malik, “Design choices for the iot in information-centric networks,” accessed 2019-04-12. [Online]. Available: https://ieeexplore.ieee.org/document/7444905
[25] O. H. E. B. T. S. M. W. C. A. L. Massoulié, “Low-power internet of things with ndn & cooperative caching,” accessed 2020-01-12. [Online]. Available: https://conferences.sigcomm.org/acm-icn/2017/proceedings/icn17-60.pdf
[26] mini NDN, “mini-ndn project,” 2019-03-30. [Online]. Available: https://github.com/named-data/mini-ndn
[27] MQTT, “mqtt — mq telemetry transport,” 2019-04-13. [Online]. Available: http://mosquitto.org/man/mqtt-7.html
[28] “Ndn packet structure,” Named Data Network, accessed 2019-04-19. [Online]. Available: http://named-data.net/doc/NDN-packet-spec/current/
[29] “Nsf future internet architecture project,” National Science Foundation, accessed 2019-03-27. [Online]. Available: http://www.nets-fia.net/
[30] “Named data network,” NDN?, accessed 2019-03-07. [Online]. Available: https://named-data.net/
[31] “Named data network,” NDN?, accessed 2019-03-07. [Online]. Available: https://named-data.net/project/faq
[32] “Ndn essential tools,” NDN community, accessed 2019-04-06. [Online]. Available: https://github.com/named-data/ndn-tools
[33] “Oma lightweight machine to machine requirements,” Open Mobile Alliance, accessed 2019-03-28. [Online]. Available: http://openmobilealliance.org/release/LightweightM2M/V1_2-20190124-C/OMA-RD-LightweightM2M-V1_2-20190124-C.pdf
[34] “Content centric network,” PARC, accessed 2019-03-07. [Online]. Available: https://wiki.fd.io/view/Cicn
[35] J. Pfender, A. Valera, and W. K. Seah, “Performance comparison of caching strategies for information-centric iot,” 2019-04-26. [Online]. Available: https://conferences.sigcomm.org/acm-icn/2018/proceedings/icn18-final38.pdf
[36] “Family: native,” Riot, accessed 2019-04-9. [Online]. Available: https://github.com/RIOT-OS/RIOT/wiki/Family%3A-native
[37] “Virtual riot network,” Riot, accessed 2019-04-9. [Online]. Available: https://github.com/RIOT-OS/RIOT/wiki/Virtual-riot-network
[38] “Wireshark faq,” Wireshark Foundation, accessed 2019-05-13. [Online]. Available: https://www.wireshark.org/faq.html
[39] Z. Zhang, Y. Yu, H. Zhang, E. Newberry, S. Mastorakis, Y. Li, A. Afanasyev, and L. Zhang, “An overview of security support in named data networking,” accessed 2019-04-2. [Online]. Available: https://named-data.net/wp-content/uploads/2018/07/ndn-0057-4-ndn-security.pdf
