Analysis of the Linux Random Number Generator in Virtualized Environment

Masaryk University
Faculty of Informatics

Master’s Thesis
Radka Cieslarová
Brno, Fall 2018

Declaration

Hereby I declare that this thesis is my original authorial work, which I have worked out on my own. All sources, references, and literature used or excerpted during elaboration of this work are properly cited and listed in complete reference to the due source.

Radka Cieslarová

Advisor: RNDr. Petr Švenda, Ph.D.

Acknowledgements

I would like to thank my great advisor Petr Švenda for his consultations, enthusiasm and tolerance. I am also grateful to other people for their support and willingness to help, especially to Radim, Káťa and Robert.

Abstract

This thesis analyzes the behavior of the Linux random number generator in a virtualized environment in two different ways. Firstly, entropy inputs to the generator are examined and their entropy is compared to the entropy of inputs in a non-virtualized environment. Secondly, the behavior of the Linux random number generator after restoring from snapshots is analyzed to observe so-called reset vulnerabilities.

Keywords

randomness testing, statistical test batteries, Linux random number generator, entropy, virtualized environment, VirtualBox

Contents

1 Introduction
2 Related work
  2.1 Widespread weak keys in network devices
    2.1.1 Vulnerabilities
    2.1.2 Weak entropy and the LRNG
  2.2 Virtual machine reset vulnerabilities
    2.2.1 TLS client vulnerabilities
    2.2.2 TLS server vulnerabilities
3 Randomness testing
  3.1 Statistical randomness testing
  3.2 NIST Statistical Test Suite
  3.3 Dieharder
  3.4 TestU01
  3.5 Randomness Testing Toolkit
4 Generating random data in Linux
  4.1 Function $RANDOM
  4.2 Linux random number generator
    4.2.1 Initialization
    4.2.2 Entropy collection
    4.2.3 Entropy pools update
    4.2.4 /dev/random
    4.2.5 /dev/urandom
    4.2.6 Changes to the LRNG in previous years
5 Generating random data in virtualized environment
  5.1 Access mediation to hardware resources
    5.1.1 Hardware resources not accessible to a guest
    5.1.2 Hardware resources exclusively assigned to a guest
    5.1.3 Hardware resources shared between guests
  5.2 Access mediation to CPU instructions
  5.3 Entropy Sources in Linux random number generator
    5.3.1 Disk I/O
    5.3.2 Human input
    5.3.3 Interrupt requests
    5.3.4 Timer in Oracle VirtualBox
6 Analysis of the LRNG entropy inputs in virtualized environment
  6.1 Methodology
  6.2 Baseline
    6.2.1 Testing output data
    6.2.2 Testing input data
  6.3 Detailed input analysis
    6.3.1 Disk I/O
    6.3.2 Interrupt requests
    6.3.3 Human input
7 Analysis of the LRNG within virtual machine snapshots
  7.1 Methodology
  7.2 Results
8 Conclusion
Bibliography
Index
A VirtualBox settings

1 Introduction

Random numbers are important for many different uses: in statistics, in computer simulations, and in cryptography. In cryptography, random numbers are necessary for creating initialization vectors, keys, seeds, and random noise. Hence, it is important to be able to generate random data of high quality. A widely used generator is the Linux random number generator (LRNG) from the Linux kernel. The LRNG uses several different entropy inputs. However, when it is used in a virtualized environment, access to these entropy sources might be limited or modified by the virtual machine monitor. The aim of this thesis is to analyze the LRNG behavior in a virtualized environment, especially in Oracle VirtualBox, in two different ways.
Firstly, entropy inputs to the generator are examined and their entropy is compared to the entropy of inputs in a non-virtualized environment. Secondly, the behavior of the Linux random number generator after restoring from snapshots is analyzed to observe so-called reset vulnerabilities.

The second chapter presents two papers connected to vulnerabilities in the LRNG. The chapter shows how even a good random number generator, which the LRNG is, may generate vulnerable keys. In the first paper, vulnerable keys generated by the LRNG are caused by limited access to entropy. The second paper presents so-called reset vulnerabilities, where the LRNG in virtual machines reverted from snapshots produces the same output.

The third chapter focuses on methods for measuring randomness, that is, on randomness testing. Statistical randomness testing and different randomness testing suites are presented.

Chapter four provides one of the most detailed descriptions of the LRNG on the Internet. A part of the LRNG was changed in 2016, thus many sources do not provide up-to-date information about the generating mechanism of the LRNG. A detailed study of the LRNG was necessary for the analysis in the practical part of the thesis.

The following chapter describes how the LRNG is influenced when used in a virtualized environment. The influence on all entropy inputs is examined with a focus on the virtual machine monitor Oracle VirtualBox.

In chapter six, an analysis of the LRNG entropy inputs in a virtualized environment is carried out. Firstly, the output of the LRNG is analyzed using statistical randomness testing. The kernel is then modified so that it allows the entropy inputs to be collected. The entropy in the entropy inputs to the LRNG is measured, and finally, the different sources of the entropy inputs are observed. The source code of the modified kernel is available as an electronic attachment.

The last chapter examines the virtual machine reset vulnerabilities in Oracle VirtualBox.
2 Related work

A significant body of research related to random data generation has been published. The publications focus on different topics in this area: the design of new generators [1, 2, 3, 4], weak or predictable generators [5, 6], random data analysis [7], vulnerabilities caused by using problematic generators [8, 9], etc. This chapter presents the two most closely related publications, which investigated issues connected to random data generation in the Linux operating system [10] and within a virtual machine [11].

2.1 Widespread weak keys in network devices

The paper Mining Your Ps and Qs: Detection of Widespread Weak Keys in Network Devices [10] was published in 2012. The paper focused on RSA and DSA keys, which can be vulnerable when generated by malfunctioning random number generators. The inspected keys came from TLS and SSH servers, and the researchers found that a surprisingly high number of those keys are vulnerable. The vulnerabilities found are presented in the following section.

In this research, a large network scan was performed and 5.8 million unique TLS certificates and 6.2 million unique SSH host keys were obtained. Afterwards, it was necessary to identify what hardware or software generated the obtained keys in order to find potentially vulnerable device models producing weak keys, and then to compare keys from a specific device model.

2.1.1 Vulnerabilities

Repeated keys: It was found that 61% of the TLS hosts and 65% of the SSH hosts used the same key as another host in the scan. Usually, the keys were the same due to the use of default keys or due to low entropy available during key generation. Many repeated keys were due to shared hosting providers. Another cause of repeated keys was distinct certificates with the same public key, belonging to the same organization. Neither of those two causes can be considered a vulnerability; nevertheless, many vulnerable keys still remain.
More than 5% of the TLS hosts served default keys or certificates from the manufacturer. These keys are preconfigured in the firmware and usually all devices of one device model share the same key pair. Private keys may be obtained by reverse engineering or, more simply, from public databases of these default keys if such a database exists for the device model.

In TLS, more than 0.3% of hosts (43,852 hosts) served repeated keys due to low entropy during key generation, and 98% of these certificates with repeated keys were self-signed.

In the case of SSH hosts, it was impossible to distinguish between default keys and keys repeated due to low entropy during key generation. However, 9.6% of the obtained SSH keys were repeated for one of these reasons.

Factorable RSA keys: When two RSA moduli share one of their prime factors, computing their GCD is simple and fast, and the GCD is exactly the shared prime. With that prime known, both moduli can be factored and the private keys computed. Based on shared prime factors, it was possible to obtain the private keys of 0.4% of the TLS certificates and 0.02% of the RSA SSH host keys. Most of those vulnerable keys were generated by devices from only a few different manufacturers. In one case, 576 devices from one manufacturer used keys generated from only 9 distinct prime factors. Most of the vulnerable keys were system-generated certificates and keys used by headless or embedded network devices like routers or firewalls.

DSA signature weaknesses: In the case of DSA keys, the problem lies in repeated ephemeral keys. When a DSA key is used to sign two different messages using the same ephemeral key, it is possible to efficiently compute the long-term private key from the public key and the signatures. In the scan, 0.05% of the obtained SSH DSA signatures (usually two signatures were obtained from one SSH host) contained the same r as at least one other signature, and 94% of the signatures with a repeated r also shared the public key with the matching signature.
Because of this reuse of ephemeral keys, it was possible to compute the private keys of 1.6% of the SSH DSA hosts.
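
The two key-recovery attacks described above can be illustrated with a minimal Python sketch. It is not the procedure used in [10]: the function names and the toy parameters (P, Q, G and the small primes) are hypothetical and chosen only for illustration, message digests are passed in directly as integers, and the pairwise GCD shown here would have to be replaced by a batched computation to handle millions of keys. The modular inverse via pow(x, -1, m) assumes Python 3.8 or newer.

```python
from math import gcd


def recover_rsa_private_exponents(n1, n2, e):
    """Return (d1, d2) for two RSA moduli that share one prime factor."""
    p = gcd(n1, n2)  # the shared prime, if exactly one factor is shared
    if p in (1, n1, n2):
        raise ValueError("moduli do not share exactly one prime factor")
    q1, q2 = n1 // p, n2 // p
    # With the factorization known, d is the inverse of e modulo phi(n).
    d1 = pow(e, -1, (p - 1) * (q1 - 1))
    d2 = pow(e, -1, (p - 1) * (q2 - 1))
    return d1, d2


def dsa_sign(h, x, k, p, q, g):
    """Textbook DSA signature of digest h with private key x and ephemeral key k."""
    r = pow(g, k, p) % q
    s = pow(k, -1, q) * (h + x * r) % q
    if r == 0 or s == 0:
        raise ValueError("degenerate signature, choose another k")
    return r, s


def recover_dsa_private_key(h1, s1, h2, s2, r, q):
    """Recover (k, x) from two signatures that reused the same ephemeral key k."""
    # s1 - s2 = k^-1 * (h1 - h2) mod q, so k = (h1 - h2) / (s1 - s2) mod q.
    k = (h1 - h2) * pow((s1 - s2) % q, -1, q) % q
    # s1 = k^-1 * (h1 + x * r) mod q, so x = (s1 * k - h1) / r mod q.
    x = (s1 * k - h1) * pow(r, -1, q) % q
    return k, x


# Toy DSA domain parameters (assumption): Q is prime and divides P - 1,
# and G generates the subgroup of order Q modulo P.
Q = 101
P = 607
G = pow(2, (P - 1) // Q, P)

if __name__ == "__main__":
    # Two toy RSA moduli sharing the prime 101.
    n1, n2, e = 101 * 103, 101 * 107, 7
    print("RSA private exponents:", recover_rsa_private_exponents(n1, n2, e))

    # Two DSA signatures made with the same ephemeral key.
    x, k = 57, 33
    r, s1 = dsa_sign(20, x, k, P, Q, G)
    _, s2 = dsa_sign(77, x, k, P, Q, G)
    print("recovered (k, x):", recover_dsa_private_key(20, s1, 77, s2, r, Q))
```

Both signatures in the DSA example carry the same r because r depends only on the ephemeral key and the domain parameters, which is why a repeated r in the scan data points to a reused ephemeral key.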
