
Masaryk University Faculty of Informatics

Analysis of SSH executables

Bachelor’s Thesis

Tomáš Šlancar

Brno, Fall 2019


This is where a copy of the official signed thesis assignment and a copy of the Statement of an Author is located in the printed version of the document.

Declaration

Hereby I declare that this paper is my original authorial work, which I have worked out on my own. All sources, references, and literature used or excerpted during elaboration of this work are properly cited and listed in complete reference to the due source.

Tomáš Šlancar

Advisor: RNDr. Daniel Kouřil, Ph.D.


Acknowledgements

I would like to thank my advisor RNDr. Daniel Kouřil, Ph.D., for his excellent guidance, patience, and time. I appreciate every piece of advice he gave me. I would also like to thank him and ESET for the provided collection of infected SSH executables. Last but not least, I would like to thank my family for all their support. They gave me a great environment while writing this thesis, and I appreciate it very much.

Abstract

The thesis describes the dynamic analysis of SSH executables, with the main focus on the client side. Some aspects relating to the server side are covered as well. The examination is performed only on the level of system calls and under OS Linux. The purpose is to propose a series of steps and create tools that would help to determine whether an unknown SSH binary contains malicious code or is safe, and whether that decision can be automated. The setup, analysis, and determination parts require minimal user effort. The whole process should be at least semi-automatic, and future steps for better automation are mentioned. The thesis ends with an evaluation of the proposed mechanisms together with ideas for better results. The correctness is tested on a collection of samples provided by ESET.

Keywords

SSH, Linux, dynamic analysis, Sysdig, Docker, system calls


Contents

Introduction

1 Collecting system calls
  1.1 Dynamic analysis
  1.2 Sysdig
  1.3 Docker
  1.4 Description of the implementation

2 The Data analysis
  2.1 The approach
    2.1.1 SSH client
    2.1.2 SSH server
  2.2 Description of the implementation

3 ESET samples and their analysis
  3.1 Overview of samples
  3.2 Success Rate
  3.3 Problems during analysis
  3.4 Ideas for improvements

4 Conclusion

Bibliography

A List of electronic attachments


Introduction

The analysis of executables is a broad field of research. Its point is to decide whether a program is doing what it is supposed to do. There is a thin line between malicious behavior and legitimate behavior. An excellent example of this is a program that hides its processes. That looks malicious, but for antiviruses, this is a necessity. Therefore, there is a need for a detailed description of the program behavior.

There are many ways to perform an examination. In general, they fall into two categories: static and dynamic analysis. The focus in this thesis is only on system calls, which fall under the category of dynamic analysis. There are more than enough tools for this job, but the vast majority targets Windows only. The target here is to perform the analysis under OS Linux and for one of the most used programs on it: SSH.

There exist a few tools, like the probably most used and versatile Cuckoo, or more Linux-specific ones like strace or Sysdig. Other programs are often built upon the Cuckoo core. Cuckoo is a useful tool primarily because it combines many inputs, like network traffic, system calls, library calls, checks against VirusTotal, and more. The disadvantage is the setup: it takes a long time, and it is a complicated process. On the other hand, strace and Sysdig collect only system calls and leave the decisions up to the user. Their log can be extensive, and it can take a long time to spot any malicious behavior. Strace can also be detected. Sysdig, however, receives all system calls happening in the system, in comparison to strace, which collects only those of the traced executable. Sysdig also offers a great variety of filters, some even for Docker containers. This fact comes in handy mainly because Docker containers provide a dynamic environment setup and good enough isolation [1], including network isolation. The result of this is the usage of Docker containers in combination with Sysdig for collecting the data.
Their more detailed description is in the first chapter, together with the script combining these tools.

The second chapter focuses on the evaluation of the data, the description of the approach, and the implementation. The SSH client has the advantage that there is no reason for parallelism, therefore no need for forks, clones, more threads, and similar. The logs collected from the SSH client in the same environment with the same steps will be identical. For the SSH server, this is not the case. However, the analysis is focused more on the client than on the server. Nevertheless, the server system calls are still collected and tested. Their data can still at least provide more insight into the evaluation and can lead the way to possible success.

A more thorough report is in the third chapter, as well as the description of the SSH samples. The collection provided by ESET contains SSH clients and SSH servers, too. The architectures in the collection are also very diverse. There was a need to eliminate some samples because their architecture could not be executed. Other eliminations were also in place, as some binaries were too hard to execute for various reasons. The description of problems connected with running the samples is in the third chapter too. The third chapter also discusses the success of this approach over the dataset provided by ESET and ideas for future development.

1 Collecting system calls

This thesis focuses on system calls, which fall under the category of dynamic analysis. Therefore, it is needed to monitor and capture all the malware's behavior on this level. This system-call collecting functionality is offered by a tool named Sysdig. It is a system-level monitoring tool with native container support, combining the functionality of programs like strace, tcpdump, htop, iftop, lsof, and more [2]. Nevertheless, executing the binary is required to collect any data. Execution of the malware in an isolated environment is essential. The main reasons are to prevent any real damage and to restrict any further spreading. Docker implements a secure way of containing the malware in a container while still being flexible for quick changes. Spawning a new container is quick compared to using VirtualBox1. Then the only thing left is to combine both these tools and control them from one place. For this purpose, a simple Python script was written. The result is that it is always possible to prepare the same environment and execute the SSH binary with the same steps in many ways. Both Sysdig and Docker are open-source projects; their source code is available on GitHub.

1.1 Dynamic analysis

Dynamic analysis is a form of analysis done by running the actual malware and observing its behavior. The observation can be at various levels, but running the program is the only common attribute. The highest level is user interaction: running the program and seeing what it visibly does. At a lower level, there can be an examination of RAM, processes, the registry, traffic on a network, and more. At the lowest level, there is an examination of library calls and system calls. Each tool uses a different approach to getting its data. This fact gives the investigator the option of choosing the right tool for a specific job because each tool has its advantages.

1. VirtualBox is a powerful virtualization tool by Oracle. It is possible to run VirtualBox on Windows, Linux, and macOS.


Running malware means one thing: the environment will be infected, and everything on the machine will be compromised. Malware could then even spread further, so as prevention, the execution needs to happen in a contained environment. Achieving such an environment usually means using some form of virtualization; for example, VirtualBox is often used.

Dynamic analysis is a way for an investigator to examine the malware, but there are problems connected with it. Triggering the malware is one of them. Malware usually tries to hide from detection. It can even check whether it runs inside a virtual environment or whether known detection tools are running alongside it. Having many detection tools collecting various inputs gives the investigator a better picture of the actual behavior of the malware. One such tool is, for example, Cuckoo. It collects system calls, library calls, network traffic, memory, and more.

This thesis focuses only on system calls. A system call is a way for programs to communicate with the operating system. Accessing memory, accessing files, all networking, creating processes, and more are jobs for the operating system; to be more precise, kernel jobs. For all the information available in system calls, choosing them is a good starting point for any analysis. However, there can be too much irrelevant data. In the case of SSH, there is a good chance of seeing the malware storing the stolen credentials somewhere. For example, when they are stored in a file, it would be possible to see a file creation and that the credentials are written there. It is similar for other methods like sending them over a network, using different processes, and more.
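As a toy illustration of the pattern just described (a file creation followed by credentials being written), the following sketch scans simplified (syscall, path) pairs. It is not part of the thesis tooling, and the file name is a made-up example.

```python
# Illustrative sketch only: flag files that are first opened/created and
# then written to, one pattern a credential stealer might leave in a
# system-call log.

def flag_suspicious_writes(events):
    """events: iterable of (syscall, path) tuples in trace order."""
    opened = set()
    flagged = []
    for syscall, path in events:
        if syscall in ("open", "openat", "creat"):
            opened.add(path)
        elif syscall == "write" and path in opened:
            flagged.append(path)
    return flagged

trace = [
    ("openat", "/usr/lib/libc.so.6"),
    ("openat", "/tmp/.sshpass"),   # hypothetical drop file
    ("write", "/tmp/.sshpass"),
]
print(flag_suspicious_writes(trace))  # ['/tmp/.sshpass']
```

A real analysis would of course have to correlate file descriptors and arguments, which the thesis scripts parse from the full Sysdig columns.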

1.2 Sysdig

Nowadays, keeping track of the behavior of a Linux server is a hard job. Containers and other ways of virtualization are getting more popular and are used more often [3]. Logging became more complicated than before, and with this fact in mind, Sysdig was created. Sysdig is a modern system-level monitoring tool. Its point is to log system calls and the network traffic of a server while still being able to distinguish each running process, even between containers. It works by running inside


the kernel as a driver and collecting the system calls there. It is, therefore, possible to monitor the execution from outside of the container. That is a significant advantage because malware cannot detect any monitoring tool (only that it runs inside a container), and it is not necessary to get a monitoring tool into every container. For comparison, strace also collects system calls, but it works by attaching ptrace2 to the binary. There can be only one ptrace per process, so the malware can detect it, which is unwanted behavior.

The log of Sysdig is in columns, and their ordering can be customized. The most interesting columns are the process name, process id, arguments, and event direction. Most event types are bidirectional, with few exceptions. However, there exist event types that can sometimes appear alone and sometimes in a pair. Execve is an excellent example of this. By default, execve has input and output, but the output can be under a different process than the input, and therefore the process names can differ. Sysdig also offers filtering by these columns, and not just by them. A column with the name of a container does not exist, but it is still possible to use it as a filter option.

Only five Sysdig instances can run simultaneously. However, one instance is enough because Sysdig offers to save the collected data to a file. This saved log can then be read again by Sysdig, and its filters can be applied to the data. Also, Sysdig needs to run with sudo or as root.

There were still small problems at the time of writing this thesis. Sometimes Sysdig took a small moment before recognizing a newly spawned container and, therefore, tagging its processes with the appropriate name of that container. The developers are aware of this issue and are trying to fix it. The temporary workaround was to use the sleep command and wait five seconds. There also should not be any significant processor usage.
Sysdig does not want to slow down the host, so if that should happen, it instead drops some information. The way this thesis uses Sysdig is by collecting all system calls. The benefit of this approach is that it can see the system calls of both the SSH client and the SSH server. Also, if an infected binary executes a different

2. Ptrace is a system call in Linux. It allows one process to control and inspect other processes. It is often used by debuggers to help fix bugs in code.

process that does not fall under the filters, it is still possible to track it in the whole log by process id. No information is lost. To get only the logs from the SSH client and the SSH server, the usage of filters is essential for this thesis. The most useful ones are filtering by process name and by container name. The process name of the SSH client is ssh, and for the SSH server it is sshd. Adding filters on the container name can prevent any mixup with other containers or the host machine.
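The capture and filtering steps described in this section could be assembled, for example, as below. The container names ("ssh-client", "ssh-server") and file names are illustrative assumptions, not necessarily the ones the thesis script uses; `sysdig -w` writes a full capture, and `sysdig -r` replays it with a filter.

```python
# Sketch of the two Sysdig invocations: one full capture, then offline
# filtering per container/process. The command lists could be run with
# subprocess.run(); they are only constructed here.

def capture_cmd(outfile):
    # collect ALL system calls on the host into a capture file
    return ["sysdig", "-w", outfile]

def filter_cmd(infile, container, proc):
    # replay the capture, keeping one container and one process name
    return ["sysdig", "-r", infile,
            "container.name=%s and proc.name=%s" % (container, proc)]

print(filter_cmd("scenario1.scap", "ssh-client", "ssh"))
```

Keeping the unfiltered capture file matches the thesis's approach: no information is lost, and the server and client logs can both be derived from one run.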

1.3 Docker

Docker has a leading role in the containerization market [3]. It offers secure isolation and custom network settings. The isolation security is as good as the host kernel isolation security [4]. Malware can escape only if there is a security vulnerability in the host kernel. A similar usage of Docker and containerization was done in the following paper [1].

Docker is composed of Docker images and Docker containers. The image describes how the container should look at startup. Docker has only one user in the beginning, and that is root. However, more users can be added later in the image description. A container is a virtual environment where various programs can be executed. Docker uses kernel namespaces for the container's isolation. A host can share files and folders with containers. For achieving this behavior, there are many options. Docker can create volumes, but a more convenient option is to mount them directly.

Docker also has its own DNS server. A container can connect to another container by the container's name. However, they have to be on the same network. The container always needs to be in at least one. If not specified, Docker assigns the default network to the container. Docker has Python API support in the library called docker-py. The library is on GitHub as open-source and is also downloadable using pip.

In this thesis, the usage of Docker starts by creating base Docker images. One is for the SSH server. It contains the installation of an SSH server package, importing the RSA3 key, and adding one user. The rest of the Docker images are for SSH clients. Each installs the SSH package,

3. RSA is a public-key (asymmetric) cipher.

enables 32-bit support, and adds additional libraries for some infected SSH binaries. Two containers need to run simultaneously. One is for the SSH server, and the other hosts the SSH client. The container with the SSH client needs additional changes: mounting a volume with the infected binary, other libraries that the binary requires, and finally, the RSA key. It is necessary to add all required libraries in specific versions to successfully execute the binary in a container, and that differs from binary to binary. Although the documentation says that only OpenSSL4 or LibreSSL5 is required [5], it is not always the case. For example, RPM-based Linux distributions, such as Fedora, have other dependencies. The command execution spawns a new container. Inside a container, there always has to be at least one running process; otherwise, the container shuts down. Also, there is an option to remove the container after it shuts down. Unfortunately, the container removal deletes all its logs. They could be hugely useful in case of debugging the executable; for example, they print the missing libraries.

Another important thing is the network. Creating one with a specific name and setting it to be internal is mandatory; internal means without access to the internet. The network has to contain only two containers. One is the SSH client, and the other is the SSH server. The result is that the malware cannot communicate through the internet nor spread through a network in any case. It needs to be completely isolated. A container is added to the network at its creation.
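The client-container setup described above might look roughly as follows with docker-py. The image name, network name, and mount targets are illustrative assumptions; only the keyword-argument construction is shown runnable, since the actual API calls need a Docker daemon.

```python
# Sketch: build the keyword arguments for docker-py's
# client.containers.run() for the SSH-client container, with the
# binary, RSA key, and extra libraries mounted read-only.

def client_run_kwargs(bin_path, key_path, lib_host, lib_cont):
    return {
        "image": "ssh-client-base",        # assumed image name
        "network": "ssh-analysis-net",     # created with internal=True
        "detach": True,
        "volumes": {
            bin_path: {"bind": "/opt/ssh", "mode": "ro"},
            key_path: {"bind": "/keys/id_rsa", "mode": "ro"},
            lib_host: {"bind": lib_cont, "mode": "ro"},
        },
    }

kwargs = client_run_kwargs("/samples/mal_ssh", "/keys/id_rsa.pub",
                           "/samples/lib64", "/usr/lib/x86_64-linux-gnu")
print(sorted(kwargs))  # ['detach', 'image', 'network', 'volumes']
```

With docker-py, the surrounding calls would then be roughly `client = docker.from_env()`, `client.networks.create("ssh-analysis-net", internal=True)`, and `client.containers.run(**kwargs)`.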

1.4 Description of the implementation

The data collector is a script written in Python. Its purpose is to run specific scenarios and use a combination of Docker and Sysdig to collect data. That means it creates and sets up a Docker network, then starts Sysdig, then spawns a container for the SSH server, and then a container for the SSH client.

It takes as an argument the environment (image name) that the SSH binary should run in. In the beginning, the idea was to run in

4. Available on GitHub: https://github.com/openssl/openssl. 5. Available on GitHub: https://github.com/libressl-portable/portable.

multiple environments. However, the environment did not make much difference, so only one argument is necessary after all. The differences could be only in the number of attempts at loading a library, because the loader looks in many locations in which the library can be. Other arguments are:

∙ --bin

∙ --server_bin

∙ --run_clean

∙ --lib_dir_32

∙ --lib_dir_64

The --bin argument should be the location of the malicious executable. The --server_bin is similar, but for the server. Their location in a container is known beforehand, so mounting these binaries is easy. For the libraries, this is not the case. The argument for them is expected to be in the format <path on host>:<path in container>. The default value for --lib_dir_64 is lib64/:/usr/lib/x86_64-linux-gnu/, and for --lib_dir_32 it is lib32/:/usr/lib32.

SSH can be used mainly in four ways: a combination of RSA key and root, a combination of RSA key and a regular user, a combination of password and root, and lastly, a combination of password and a regular user. The reason is that malware could act differently as a regular user and as root [6], and there are two ways of authentication. To be more precise, there exist more ways of authentication (using Kerberos6, for example), but these two are the most used ones.

The script starts by removing any previous network and creates a new one with no internet connection. The aim is to have a clean starting

6. Kerberos is a network authentication protocol.

point. The network should not contain any containers yet, and it is also necessary to prevent creating a new network with an existing name. The Docker Python API does not check that, so the accidental creation of a network with an existing name can lead to ambiguity. That can be a problem when assigning a container to the network, for example.

The next step is to iterate through every scenario. Each one starts with running Sysdig and saving the log to a file with the name of the current scenario. The executed Sysdig collects everything. The point of this is to have everything in case the infected binary tries to execute a different process, and also to be able to analyze the SSH server and the SSH client from the same execution if needed. It is better to have more data than to miss any. Then the container with the SSH server starts, and the script waits 30 seconds for it to boot up. The script mounts the public key, but in a different location than where the keys should be. The script then copies the public key file content to the file authorized_keys. This file allows authentication using the correct private key.

The SSH client is more complicated than the SSH server. The command depends on the scenario. In both cases, before executing the SSH, there is a five-second-long sleep so that Sysdig can pick up the newly spawned container. In the scenario with the password, it is necessary to pass the password using sshpass and also to automatically add the remote server to known hosts. SSH can read passwords on input only from sshpass. There is no difference in the command between root and a regular user. To automatically add the remote server, there is the argument -oStrictHostKeyChecking=no. In scenarios with authentication using the key, this is not the case; the key location depends on the user. For root, its location is /root/.ssh/. For other users, it is /home/<user>/.ssh/. The keys have to have specific rights and need to belong to the user. However, it is still not enough.
The SSH agent needs to know about the key and load it. Running eval `ssh-agent -s` and then ssh-add solves the problem. Mounting additional libraries is luckily the same in both cases and depends on the arguments. Then there is a 40-second waiting period before stopping the container. The main reason is that the malware might not trigger itself right away, so it would not be obvious that the binary is indeed infected, and researchers would have a harder job examining it. If there is any error in the meantime, the container stops early, and the script

prints the log from that container, so the user can see the error right away. It is not unusual that a new SSH binary is missing a previously unneeded dependency. The user needs to add it manually. For this purpose, there are two folders. One is lib32 for 32-bit libraries. The other is lib64 for 64-bit libraries. They both can be changed using the arguments.

After the container stops, the script waits for another few seconds and then ends Sysdig. Sysdig is then used again to filter the whole data into two files. The first filter is for the client container and the process ssh. The second one is for the process sshd and the server container. In the end, the script deletes both containers, and the next scenario is ready to go. The collected system calls are in three folders:

∙ output_raw — complete logs from Sysdig

∙ output_server — only logs from server containers and process sshd

∙ output — only logs from client containers and process ssh

The files from output or output_server are used in the next phase, depending on whether the SSH server or the SSH client is analyzed. This Python script collects data only from one SSH binary, so a bash script was also written, which iterates over a collection of SSH binaries and passes each one to the Python script. It takes two arguments. The first is a text file with the names of executables for data collection. The other is a folder with the actual binaries. These scripts need to be run with sudo or as root. The reason for that is Sysdig.
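The four client scenarios described in this chapter could be sketched as follows. The user name, password, and host name here are illustrative assumptions, not the thesis script's exact values; only the command strings are built.

```python
# Sketch of the four scenarios: {key, password} x {root, regular user}.
# Values such as "secret" and "ssh-server" are made-up placeholders.

def client_command(auth, user, host="ssh-server"):
    target = "%s@%s" % (user, host)
    if auth == "password":
        # sshpass feeds the password; StrictHostKeyChecking=no adds the
        # remote server to known hosts automatically
        return ("sshpass -p secret ssh -oStrictHostKeyChecking=no %s exit"
                % target)
    # key-based scenario: the key must first be loaded into the agent
    return ("eval `ssh-agent -s` && ssh-add && ssh "
            "-oStrictHostKeyChecking=no %s exit" % target)

scenarios = [(auth, user) for auth in ("key", "password")
             for user in ("root", "user1")]
for auth, user in scenarios:
    print(auth, user, "->", client_command(auth, user))
```

Each generated command corresponds to one Sysdig capture file named after the scenario, as described above.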

2 The Data analysis

Analyzing the data is a crucial part of this project. The goal is to decide whether it is possible to determine automatically which part of the collected system calls is malicious. The implementation is written in Python. However, first, an approach needs to be chosen.

2.1 The approach

To successfully choose an approach, understanding the structure of clean binaries is crucial. Hence it is necessary to find out in which way they differ, what the cause of these differences is, and lastly, how to minimize these differences. The structure in both cases is a composition of four phases:

∙ loading libraries

∙ initializing the connection

∙ transferring data

∙ ending the connection

2.1.1 SSH client

The client binaries offer a great advantage. Collected logs from two runs of one binary, with libraries located at the same locations and with the same steps, are identical in the order of system calls. The differences start occurring with different versions and different Linux distributions. The structure remains the same, and the operations as well, but the logs contain different system calls. To give a simple example: to open a library, one version uses open with fstat, but a newer version just uses openat. These are small changes, but they appear often in the log, and in a comparison, that makes for marginal differences.

The goal is to make the comparison with the closest SSH version possible, in the same environment. Copying the environment and finding the closest binary for each possibly malicious binary is not very efficient. The aim is to automate the analysis as much as possible.


The easiest way is to create an extensive collection of clean runs of different versions. On GitHub, it is written that the only required dependency is OpenSSL or LibreSSL [5]. The compilation is possible with optional parameters, such as Kerberos support. However, these authentication methods are rare rather than common. Nonetheless, altogether, it is still a relatively small number of possible combinations. Although for OpenSSH that should cover every combination, there are still more possible dependencies. RPM-based Linux distributions have other libraries in their OpenSSH package. However, it is still doable to make this comprehensive collection of clean runs.

It should be noted that there is more than one possible architecture. The differences occur mainly in names, for example, mmap and mmap2. However, it still means that all possible combinations of versions with dependencies need to be present for various architectures. From this collection, it is possible to make another type of collection, composed of valid transitions. This collection can also reduce the number of results of possibly malicious binary parts.

The final approach is to compare the malicious binary against the broad collection of clean runs, comparing the order of system calls, and to choose the one with the minimal diff. Those differences are then run against the collection of valid transitions. Comparing system calls only on the name level is enough; it is not necessary to add arguments to the comparison yet. The output of these two steps is as few as possible potentially malicious system calls. The malware behavior should be easily visible to the researcher.
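The selection step described above can be sketched as follows. Here difflib from the standard library stands in for the diff-match-patch library the thesis implementation uses, and the version names and syscall sequences are made-up examples.

```python
# Sketch: compare a suspect run's syscall-name sequence against every
# clean run and keep the clean run with the smallest diff.
import difflib

def closest_clean(suspect, clean_runs):
    """suspect: list of syscall names; clean_runs: dict name -> list."""
    best_name, best_diff = None, None
    for name, clean in clean_runs.items():
        diff = [d for d in difflib.ndiff(clean, suspect)
                if d.startswith(("+ ", "- "))]
        if best_diff is None or len(diff) < len(best_diff):
            best_name, best_diff = name, diff
    return best_name, best_diff

clean = {
    "openssh-7.9": ["openat", "read", "connect", "write", "close"],
    "openssh-6.6": ["open", "fstat", "read", "connect", "write", "close"],
}
suspect = ["openat", "read", "connect", "write",
           "openat", "write",           # extra calls: possible exfiltration
           "close"]
name, extra = closest_clean(suspect, clean)
print(name)   # openssh-7.9 (fewest differences)
print(extra)  # ['+ openat', '+ write']
```

The leftover `+` lines are exactly the kind of residue that would then be checked against the valid-transition corpus.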

2.1.2 SSH server

The server is more complicated than the SSH client. The name of the binary is sshd. The collected system calls are always different: there is an execution of another binary (ssh) and forking of processes. The approach used for the SSH client does not apply here very well. Maybe using some machine learning algorithms would give reasonable results, because it is possible to generate a big dataset of non-infected and infected system call logs from a relatively small number of binaries. However, this thesis is mainly about the SSH client, and choosing such an advanced algorithm is beyond its scope.

2.2 Description of the implementation

The implementation is a composition of two execution modes. The first mode is for the creation of a corpus of valid transitions. These transitions should capture changes between different versions; for example, as already mentioned, open with fstat to openat. The second execution mode takes a log from an infected run and tries to find malicious parts of that infected binary. Both share the code for making the diff between the same methods of connection and outputting the diff in three possible formats1: JSON, unified diff, and diff.

Creating a corpus takes two arguments. The first argument is a folder with clean runs of binaries. In that folder are text files from different versions of SSH and various methods of connection. The second argument is --make-corpus. With this argument, the code knows how to load the files, that it should compare each version with the same method of connection, and finally store the result in the folder. The output format is optional and can be specified by --format. However, the corpus can be loaded only in JSON format at the moment. That has an explanation: the script uses a specific diff structure inside the code, and JSON is almost identical to it, therefore offering the most effortless import and export.

Diff creation requires passing two arguments. The first argument is a path to folders containing text files with various methods of connection, where each folder is one infected binary to be tested. The second argument is --clean-runs with the path to the dataset of clean runs. As an optional argument, there is --clean-input-corpus with the path to the corpus made by the other mode of execution. The last possible optional argument is the output format, and it can be specified by --format.

The execution of the analysis of the malicious binaries starts by parsing all system calls. The script expects the columns from Sysdig to be in unchanged order.
Each line represents one system call in a specific direction. Usually, each system call is represented on two lines underneath each other, with the first direction in and the other out. That is not always the case. The out direction can come after a few other system calls, or the system call can even appear in only one direction.
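The direction-pairing logic just described can be sketched as below. The line format here is deliberately simplified (direction, process, syscall, arguments) compared to Sysdig's full column set, which the thesis script parses.

```python
# Sketch: pair `>` (in) and `<` (out) lines of the same syscall using a
# stack; unmatched entries are reported, as the thesis script warns too.

def pair_events(lines):
    stack, paired = [], []
    for line in lines:
        direction, proc, name, args = line.split(None, 3)
        if direction == ">":
            stack.append((proc, name, args))
        elif stack and stack[-1][1] == name:
            entry = stack.pop()
            paired.append((name, entry[2], args))
        else:
            # an out event with no matching in (can happen, see text)
            paired.append((name, None, args))
    return paired, stack  # the stack should end up empty

lines = [
    "> ssh openat fd=-1 name=/etc/ssh/ssh_config",
    "< ssh openat fd=3",
    "> ssh read fd=3",
    "< ssh read res=512",
]
events, leftover = pair_events(lines)
print(events[0])  # ('openat', 'fd=-1 name=/etc/ssh/ssh_config', 'fd=3')
print(leftover)   # []
```

A non-empty leftover stack at the end corresponds to the warning case the implementation handles before continuing the analysis.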

1. JavaScript Object Notation


Each line is converted to an input data class or an output data class, depending on the event direction. These two classes complement each other if their counterpart exists. They are then converted to the class which represents a system call. The first idea was to implement classes for the most frequent system calls and for those needing an individual approach (for example, having only an input argument), and to use a generic class for the rest. This approach is unnecessary because only one class is needed in the end; however, it has not been rewritten yet. From these classes, it is possible to get values like all arguments, event names, and others using method calls. The parsing uses a parser stack. At the end of the parsing, the parser stack should be empty. In case something wrong happens and the parser stack is not empty, the script warns the user about this issue, yet continues the analysis.

Parsed system calls are stored in a multi-level dictionary with a path-like structure. At the top level, the key is the name of the binary, at the next level the method of connection, and then the name of the Docker image in which it was executed. In the case of making the corpus or parsing the clean runs, the structure is slightly different. It has only two levels: the top level, which is the name of the executable, and then the method of connection.

The next step is doing the actual analysis and finding the differences. This analysis is done by comparing each infected SSH with every clean one, only between the same methods of connection. The script then keeps the diff with the minimum amount of changes because that is probably the closest one.

Finding the right diff library was a notably hard task. difflib from the standard library does not guarantee a minimal diff, as mentioned in its documentation. The result was poor; a few changes caused the rest to be marked as changed. The best result was given by a library from Google.
Its name is diff-match-patch2, and it uses Myers's diff algorithm, which should be the best general-purpose diff algorithm, according to the library documentation. However, the code uses an inner function, not a public function. The problem was with the execution time. Probably in most cases, users want a diff algorithm to end in a relatively small amount of time, because, by default, the execution ends when hitting a specific time limit. The library also uses pre-diff and post-diff procedures for speedup. In the case of this

2. Available on GitHub: https://github.com/google/diff-match-patch


thesis, time is not a deal-breaker. The output of the inner function is a list of tuples with the individual differences. The implementation then transfers this output into two lists of the same size. The transformation happens by adding each tuple to a specific list or to both lists, depending on whether the part is equal, removed, or added. Every time the diff parts are equal, the lists are filled up to have equal sizes. The filling usually happens right away, but there are some exceptional cases. When the diff in the tuple does not start with a newline character, then the characters up to the newline character are appended to the last name of the system call. That was often the case for the at in fstat and openat. The comparison is then made line by line to reduce the storage size. What is stored is a list of dictionaries containing two fields: clean and infected. Both are dictionaries as well and have the same structure. It is a list of system calls in which they differ and the number of the line where this list starts in the original log.

At the moment, the supported output formats are JSON, diff, and unified diff. These formats exist for a couple of specific reasons. The JSON output is targeted for the corpus or to be handled by other programs. The structure is similar to the internal structure used in the script. On the other hand, the diff and unified diff formats are for various viewers and fulfill the specification3. The original part is always a clean run. The other, modified part, is the infected run.

The script takes a relatively long time to finish; after all, many comparisons are happening here. Each run contains about 1300 lines, and all four methods need to be analyzed. The comparison is then made against a large dataset containing files with a similar number of lines. As already mentioned, the larger the clean dataset is, the more precise the results are, but also the slower the process is. This script could be modified to use something other than system calls.
The only thing that would need to change is the data parser because, at the moment, the scripts expect Sysdig's output format. However, if the script had a parser for ltrace, it would work the same way for library calls.
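The comparison step described above can be illustrated with Python's standard difflib. This is a simplified sketch, not the thesis script itself; the system call names below are placeholders.

```python
import difflib

def diff_syscalls(clean, infected):
    """Return unified-diff lines between two lists of system call names.

    The clean run is always the original side, the infected run the
    modified side, matching the convention described in the text.
    """
    return list(difflib.unified_diff(clean, infected,
                                     fromfile="clean", tofile="infected",
                                     lineterm=""))

# Placeholder call sequences, not real thesis logs:
clean_run = ["openat", "fstat", "read", "close"]
infected_run = ["openat", "fstat", "read", "write", "connect", "close"]
for line in diff_syscalls(clean_run, infected_run):
    print(line)  # extra calls in the infected run appear as "+" lines
```

A real implementation would additionally pad both sides to equal sizes and handle the split-name edge cases mentioned above.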

3. The specification follows GNU Diffutils by the GNU Project. It can be found at https://www.gnu.org/software/diffutils/manual/diffutils.html#Output-Formats


3 ESET samples and their analysis

A necessary part of this thesis is to test the hypothesis. The development happened alongside testing on one sample. This was successful, yet not satisfactory: the script could have been functional only for that one piece of malware. Thanks to ESET, it was possible to try more infected binaries and to obtain some qualitative results.

The malware dataset from ESET cannot be part of this thesis. The archive is treated as TLP:AMBER (do not share further than necessary, and do not post samples in public repositories). However, contact with ESET can be mediated on request. ESET also has a publicly available and very detailed description of their collected infected SSH binaries, written up in two white papers [7][8].

3.1 Overview of samples

There are 84 malware samples in the collection. It contains a large variety of malware binaries compiled across a spectrum of Linux distributions and even architectures. A significant thing to mention is that all malicious code resides inside the binary itself, not in a library or any other file.

The fundamental division is into two categories: SSH clients and SSH servers. The problem was deciding which was which. The tool named strings came in handy; the easiest way was to look for either usage: ssh or usage: sshd. Unfortunately, that was not the case for 21 binaries, and even after further examination, the decision was still not clear. There were definitely at least two server binaries; in one, a welcome text appeared when it was accessed. Most of these were probably clients. However, because the decision was not unanimous and the collection has 24 confirmed clients and even more servers, these binaries were not used in the end.

The next step is the architecture. It is clear upfront that not every architecture can be executed effortlessly. The collection contains x86-64, Intel 80386 (32-bit), ARM, and MIPS. The host running the analysis is x86-64, so running binaries of this architecture in Docker is not a problem at all. Intel 80386 is a little bit worse. In theory, it is possible; however, there is some additional work that

needed to be done. Many Linux distributions have multi-arch support; it usually requires installing a 32-bit interpreter and some libraries. In the end, it is still doable with a small amount of effort, and the Docker1 Hub also offers 32-bit images. With the remaining architectures, ARM and MIPS, it is more complicated. Using QEMU could potentially solve the issue; however, clean runs would need to be collected for them as well, because it is unknown how many differences the emulator would produce at the system level. The required effort was not worth four extra binaries, so these binaries were also removed from the tested collection. Altogether, 22 SSH clients and 43 SSH servers remained.

However, it was still not over. Running the binaries revealed more information about them and the struggles with their execution, especially since the only required libraries should be the ones from OpenSSL or LibreSSL. These struggles will be described later in the thesis. The final number of successfully executed binaries is 17 for the SSH clients and 29 for the SSH servers.

In the case of the servers, there was not much to discover. This approach focuses mostly on the SSH client, so the analysis of the SSH servers ended up with no success. Going through the analysis of the SSH clients gives more discoveries. Even in this small collection, there are many things to learn. It is even possible to see malware within the same family tree: samples that do very similar things or ultimately the same thing. This similarity happened twice in total.

The malicious code in an SSH client mainly does one thing, and that is stealing the authentication details, so either a password with a username or a private key. Theft of the private key happens in the collection in several cases; however, the username and password are the main target.
Only in two samples out of 17 was the objective a key as well as the password. Another nine binaries aim only for passwords, and two more try to prevent being observed. For the last four binaries, it was not possible either to trigger or to detect any malicious behavior.

In three cases, the binaries send the authentication details directly to a remote server. Unfortunately, it is not possible to track the actions further; this also holds for the undetected ones. There was a

1. A public repository of Docker images.

possibility to see some DNS action; however, because there is no internet access, the resolving failed. For the rest of the binaries, the stealing method is simple: just save the credentials into a file somewhere. A slightly amusing detail is that only in one case is the file located in /var/tmp, where even a regular user can create a file. The rest are in locations that require root or sudo access. They all rely on the fact that root or a sudo user runs SSH first. This step creates the file, and then chmod is run on it so that every user can write into it. These locations are, for example, inside /etc or /usr. The files have names that try to blend in with the others; half of them start with a dot, so with the basic ls command (no additional options), it is not possible to see them. It should be noted that the runs do not differ between root and a regular user, though the files may not be created due to missing permissions.

In most cases, the malware uses a cloned process for the actual malicious behavior. A possible reason could be to execute the malicious code fully; this way, the user also does not feel any delays caused by the malware.

The format in which the credentials are saved to the file differs from malware to malware. Some use base64 encoding of the text; in general, they use a plain text format. It contains the server URL, the name, and the password, and it is easy to distinguish each value on the line. It is also interesting that one malware recorded an additional piece of information: the direction of authentication. Maybe there exists a server version of this malware saving the credentials into the same file.

There was also one malware that tried to execute some other binary. This is an example of why the environment should resemble the infected one as much as possible: that additional file could lead to a better understanding of the malicious behavior as a whole.
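The general credential-record pattern just described can be sketched as follows. The whitespace-separated field layout and the field names are assumptions for illustration; the real formats vary from sample to sample.

```python
import base64

def parse_credential_line(line):
    """Parse one hypothetical stolen-credential record.

    Some samples base64-encode the whole line, so that is tried first.
    A plain-text line that happens to be valid base64 would be
    mis-decoded; that is a known limitation of this simple sketch.
    """
    try:
        line = base64.b64decode(line, validate=True).decode()
    except Exception:
        pass  # not valid base64, treat as plain text
    server, user, password = line.split()
    return {"server": server, "user": user, "password": password}
```

Such a parser is useful when triaging a suspected drop file found during analysis, regardless of whether the sample chose to encode it.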

3.2 Success Rate

The number of tested SSH client binaries was 17. It was possible to find some malicious behavior in 13 of them, which can be a positive sign for further development. However, the decision about the rest of them is

hard, and therefore the real success rate is hard to measure. If the malicious code was not triggered, the sample cannot simply be counted among the failed ones. If the behavior happens on the library level, this analysis will not discover anything either. It can also fail on DNS resolving, or the behavior was simply missed; there can be many reasons. The whole log produced by Sysdig looked clean as well, and even the description [7] pointed to the malware not being triggered rather than the malicious part being missed.

The clean dataset used in this thesis was relatively small and contained only 64-bit binaries. On average, it still reduces a log of 1,355 lines to a diff of 703 lines, which is about a half. If the clean dataset were more extensive, the results could be even better. The best reduction is from 1,316 lines to a 285-line diff. That SSH version and environment are surely similar to a known one; a detailed look shows it was a fairly standard 64-bit build, probably coming from CentOS (because their logs are similar), but the binary is stripped, so it is hard to tell. On the other hand, the worst reduction is from 1,794 to 1,187 lines. That binary is 32-bit, and the clean dataset contained no such architecture. This was the general pattern: infected 32-bit binaries produced longer diffs. On average, 32-bit binaries were reduced from 1,488 lines to 1,040, which is an unfortunate result. For 64-bit binaries, it was 520 lines of diff cut from a 1,282-line Sysdig log. That is good, considering that the clean dataset was relatively small. The clean dataset is composed of SSH builds from the packages of the following Linux distributions:

∙ CentOS 7

∙ CentOS 8

∙ Debian 8

∙ Debian 9

∙ Debian 10


∙ Ubuntu 18


They have a relatively new SSH version. However, the infected 64-bit binaries also contain older ones, and those could have made the average number of lines higher.

For the SSH servers, the final number of tested binaries was 28. The diffs usually had a similar number of lines as the Sysdig logs. As already mentioned in the first chapter, this approach mainly targeted SSH clients and was not optimal for SSH servers for multiple reasons. The most significant one is that the log was not identical, even for the same environment and steps.
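The reduction figures quoted above can be expressed as simple ratios; this small helper only restates the numbers already given in the text.

```python
def reduction(log_lines, diff_lines):
    """Fraction of the Sysdig log eliminated by diffing against the clean dataset."""
    return 1 - diff_lines / log_lines

# Figures from the text:
print(f"average : {reduction(1355, 703):.0%}")   # roughly half
print(f"best    : {reduction(1316, 285):.0%}")   # the 64-bit CentOS-like sample
print(f"worst   : {reduction(1794, 1187):.0%}")  # the 32-bit sample
```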

3.3 Problems during analysis

Many unexpected situations occurred during the analysis of these samples. The script development happened alongside the clean binaries and one infected binary. Each of those ran correctly, but running ESET's dataset pointed out some gaps.

Adding the required libraries was the biggest challenge when executing the dataset. The required ones should come from OpenSSL or LibreSSL. OpenSSL provides shared libraries called libcrypto.so.x.y.z, possibly with a letter suffix, the latest version being libcrypto.so.1.1. The naming convention is fairly apparent; however, it can differ between Linux distributions. In RPM-based distributions it is libcrypto.so.x. Although that is just a symbolic link to the one with the standard naming, the link can point to more than one specific version of the library. For example, libcrypto.so.10 can point to any of these versions:

∙ libcrypto.so.1.0.2t

∙ libcrypto.so.1.0.2r

∙ libcrypto.so.1.0.2q

∙ libcrypto.so.1.0.2o

∙ libcrypto.so.1.0.2k

∙ libcrypto.so.1.0.1g

∙ libcrypto.so.1.0.1e


Often the version can be without the letter. However, some of the binaries are built against a specific version, and consequently the binary requires exactly that version. The version cannot be obtained using the strings program; the only way to find out is to run the binary. If it passes, it has the correct one; otherwise, it prints out the needed one. RPM-based Linux distributions also require more libraries, such as libfipscheck, libz, and others.

Finding the right libraries is time-consuming work. The program ldd comes in handy: it shows the libraries needed by the binary, but it does not reveal chaining. Concretely, if a missing library has dependencies of its own, they are unknown until the library is added. Building the whole tree of dependencies is therefore an easy but time-consuming job.

An unexpected discovery also concerned configuration. In some cases, the settings were crucial to the execution of the program. In most cases, a misconfiguration did not throw any error, and even when it did, the clients were still able to connect to the server. Maybe the number of misconfigurations plays a vital role; however, even the same count of wrong configurations produced different results for two binaries. Another type of error was a failure to match any cipher, although the list of supported ciphers was long and contained the most common and well-known ones. One possible explanation could be that the binary is quite old and supports only outdated ciphers.

Parsing the collected system calls also showed gaps. The main one was that a system call sometimes appears as a pair of input and output events and sometimes not; execve was a good example, and finding those cases was hard. For one binary, Sysdig produced an unknown system call. Cloning the process was also unexpected, although it makes sense.
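The manual dependency hunt can be partially automated. A minimal sketch, assuming the standard textual output of ldd: parse it for libraries reported as not found, resolve those, and re-run until nothing is missing. The sample output below is illustrative.

```python
def missing_libs(ldd_output):
    """Return the libraries that `ldd` reports as 'not found'.

    Re-running ldd after supplying each missing library uncovers the
    chained dependencies one level at a time; looping this parser over
    repeated ldd invocations would automate the manual tree-building.
    """
    missing = []
    for line in ldd_output.splitlines():
        if "not found" in line:
            # ldd prints lines of the form "libfoo.so.1 => not found"
            missing.append(line.split("=>")[0].strip())
    return missing
```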

3.4 Ideas for improvements

Many things could be improved. Having a run from each version on each architecture is a necessity for the best results. A script could be written that downloads each version of SSH, even the outdated ones, and pairs it with every possible OpenSSL and LibreSSL version.


The goal is to have a massive clean dataset; the bigger this dataset is, the better the results will be. The environment in each run should also remain the same. At the moment, the clean executions were captured from SSH installed using package managers in their native environment. The library loading phase could differ, and a specific library could be loaded only after more attempts, which would mean additional system calls in which the runs can differ.

The diff algorithm could also be better. For the proof of concept, a general-purpose diff algorithm was chosen, and only system call names were compared; there was no special treatment. Some system calls are more prone to be used in malicious behavior than others: an extra write or connect system call is more suspicious than an extra mmap. More advanced analysis mechanisms would also be nice to see, for example, an approach similar to [1] or a machine learning algorithm, which would be especially useful for SSH servers.

The current status is semi-automatic. The user needs to execute the data collection manually, use its results in a different script that also has to be executed manually, and then manually search and decide whether the binary is malicious. The overall process needs many manual actions. The analysis should be as automatic as possible: a friendly web user interface with a simple input for file upload. After the binary is uploaded, the review would start, and the user would be notified once it is done. The analysis would mark the binary as clean, suspicious, or infected, and the diffs would be shown on the web page. The user could go through them and mark them as positive or false positive, and a machine learning model could learn from that.

Another thing is that the script does nothing with the arguments. In the current code, they are left unused. There exists a small chance that a malicious system call could be paired with a valid one, and the valid one shown as potentially malicious.
The arguments could prevent this, but it is important to give weight to the meaningful ones. For example, for a read system call, the file name can be interesting, and the read data as well, but the size not so much. The IP address of the Docker container could also be unified across runs. The Docker network IP currently varies and can be either in the range 172.18.0.x or 192.168.0.x, which could be confusing to those who do not know this fact.
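The weighting idea can be sketched in a few lines. The weights here are hypothetical values chosen for illustration, encoding only the intuition stated above that an extra write or connect is more suspicious than an extra mmap.

```python
# Hypothetical weights; the exact values are assumptions, not measured data.
SYSCALL_WEIGHTS = {"connect": 5, "write": 3, "openat": 2, "clone": 2, "mmap": 0}

def suspicion_score(extra_syscalls):
    """Score the system calls that appear only in the infected run.

    Unknown calls get a neutral weight of 1, so rare calls still
    contribute something without dominating the score.
    """
    return sum(SYSCALL_WEIGHTS.get(name, 1) for name in extra_syscalls)
```

A scored diff could then be sorted so the analyst sees the most suspicious differences first.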


Another thing is parallel runs. There is a lot of plain waiting time (at least 60 seconds for each scenario). The container and network names are currently fixed; UUIDs2 could extend them, and each scenario could then run in parallel. The same goes for the analysis.

Another significant improvement would be support for more architectures. For example, IoT3 devices are getting popular, and they often contain security vulnerabilities, so they are relatively easy targets. However, it would require a further look into the data, because SSH would need to be executed in an emulator. Alternatively, there could be a prepared device with the given architecture, and the script would delegate the job there.

Adding the required dependencies was also a time-consuming job, and this task could be automated further. The missing libraries were installed from packages, and those packages were taken from two websites. They could be taken from the official repositories as well; however, for older versions that does not have to be the case, and there is no guarantee that a package would not be outdated.

The script could also differentiate whether the input is an SSH server or an SSH client and adapt accordingly. This task was done manually using grep and searching for a typical SSH help message, which did not go well for some binaries. A better approach could be to differentiate according to the referenced configuration file, therefore sshd_config or ssh_config.

The whole process could be more automatic, so that a larger collection of infected binaries could be put through it, resulting in a classification for each binary. It could also be possible to see the diffs for each binary.
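The client/server differentiation could be scripted roughly as follows. This is a sketch of the heuristics named in the text (the config-file names and the usage banners), not the thesis tooling; real samples, as noted, can defeat both checks.

```python
def classify_ssh_bytes(data):
    """Guess client vs. server from strings embedded in the binary.

    The config-file heuristic is tried first, with the usage-banner
    heuristic from the manual triage as a fallback. Note the ordering:
    "usage: ssh" is a prefix of "usage: sshd", so the server banner
    must be checked before the client one.
    """
    if b"sshd_config" in data:
        return "server"
    if b"ssh_config" in data:
        return "client"
    if b"usage: sshd" in data:
        return "server"
    if b"usage: ssh" in data:
        return "client"
    return "unknown"

def classify_ssh_binary(path):
    """Convenience wrapper reading the whole file."""
    with open(path, "rb") as f:
        return classify_ssh_bytes(f.read())
```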

2. Universally Unique Identifier
3. Internet of Things

4 Conclusion

This thesis showed that there is potential in this method. It can save time for the investigator, especially when investigating multiple infected binaries. For SSH clients, the results are more promising than for the SSH servers. For SSH clients, at this point, it is mostly a matter of refinement; the SSH servers, however, would require further work and research. The potential could be there, although the thesis did not show any success. It could also be interesting to try similar steps and tools on programs more sophisticated than SSH.

The scripts written for the thesis are prototypes. The code was always adapted according to discoveries made during testing or development. Every script would need a rewrite for proper usage as a tool.


Bibliography

1. ABED, Amr S.; CLANCY, Charles; LEVY, David S. Intrusion Detection System for Applications Using Linux Containers. Lecture Notes in Computer Science. 2015. ISBN 9783319248585. ISSN 1611-3349. Available from DOI: 10.1007/978-3-319-24858-5_8.

2. Sysdig documentation [online]. 2019 [visited on 2019-12-13]. Available from: https://github.com/draios/sysdig/wiki.

3. Sysdig 2019 Container Usage Report: New Kubernetes and security insights [online]. 2019 [visited on 2019-12-13]. Available from: https://sysdig.com/blog/sysdig-2019-container-usage-report/.

4. Docker documentation [online]. 2019 [visited on 2019-12-13]. Available from: https://docs.docker.com/.

5. OpenSSH documentation [online]. 2019 [visited on 2019-12-13]. Available from: https://github.com/openssh/openssh-portable.

6. COZZI, E.; GRAZIANO, M.; FRATANTONIO, Y.; BALZAROTTI, D. Understanding Linux Malware. In: 2018 IEEE Symposium on Security and Privacy (SP). 2018. ISSN 2375-1207. Available from DOI: 10.1109/SP.2018.00054.

7. DUMONT, Romain; LÉVEILLÉ, Marc-Etienne M.; PORCHER, Hugo. THE DARK SIDE OF THE FORSSHE - A landscape of OpenSSH backdoors [online]. 2018 [visited on 2019-12-13]. Available from: https://www.welivesecurity.com/wp-content/uploads/2018/12/ESET-The_Dark_Side_of_the_ForSSHe.pdf. Technical report. ESET.

8. BILODEAU, Olivier; BUREAU, Pierre-Marc; CALVET, Joan; DORAIS-JONCAS, Alexis; LÉVEILLÉ, Marc-Étienne M.; VANHEUVERZWIJN, Benjamin. Operation Windigo – the vivisection of a large Linux server-side credential-stealing malware campaign [online]. 2014 [visited on 2019-12-13]. Available from: https://www.welivesecurity.com/wp-content/uploads/2014/03/operation_windigo.pdf. Technical report. ESET.


A List of electronic attachments

Alongside this thesis, other attachments such as scripts are contained in a zip file called attachment.zip. Its contents are:

∙ helpers/ folder, which contains various helper functions.

– ContainerDiffMaker.py is a helper function for making the actual diff between two lists of the same size.

– DiffParser.py is a helper function for parsing the output from the diff library into two lists of the same size.

– ParseToObject.py is a helper function which adds arguments to the internal diff storage structure.

∙ images/ folder, which contains prepared Docker images. They do not have to be used and can serve as inspiration. It also contains the server image used in the collector.

∙ keys/ folder, which already contains a 2048-bit RSA key pair.

∙ output/ empty folder, but a log from an SSH client goes into it.

∙ output_raw/ empty folder, but a complete log from Sysdig goes into it.

∙ output_server/ empty folder, but a log from an SSH server goes into it.

∙ output_formats/ folder, which contains the supported output formats.

– diff_output_format.py, the normal diff format. Only export is supported.

– json_diff_output_format.py, the JSON diff format. The only one that also supports import.

– unified_diff_output_format.py, the unified diff format. At the moment, only export is supported.


∙ syscalls/ folder, which contains code around system call representation and creation.

– types/ folder with classes for system calls. The classes are named after the appropriate system call. However, this naming convention turned out not to be convenient, and therefore the system call name does not have to be the same as the one of the class.

– Data.py, an abstract class for DataIn and DataOut. The class represents one line in the log file.

– DataBuilder.py, a class using the builder pattern. It builds either a DataIn or a DataOut.

– DataIn.py, extending Data. The class represents a system call in the in direction.

– DataOut.py, extending Data. The class represents a system call in the out direction.

– Directions.py, a class whose purpose is to be the enumeration of values for the system call direction.

– SystemCallFactory.py, a class using the factory pattern. Its purpose is to choose the most relevant class for the representation of a system call.

∙ analyzer.py, a script which does the analysis described in the thesis.

∙ collector.py, a script which does the log collection described in the thesis.

∙ image_builder.py, a script which builds the server image.

∙ script.sh, a script which wraps the analyzer and can iterate over a collection of SSH clients, passing each one to the analyzer and executing it accordingly.

∙ script_server.sh, a script which wraps the analyzer and can iterate over a collection of SSH servers, passing each one to the analyzer and executing it accordingly.

∙ LICENSE.md, the MIT license.
