Analysis of the Linux Random Number Generator in Virtualized Environment


Masaryk University
Faculty of Informatics

Analysis of the Linux random number generator in virtualized environment

Master's Thesis
Radka Cieslarová
Brno, Fall 2018

Declaration

Hereby I declare that this thesis is my original authorial work, which I have worked out on my own. All sources, references, and literature used or excerpted during elaboration of this work are properly cited and listed in complete reference to the due source.

Radka Cieslarová

Advisor: RNDr. Petr Švenda, Ph.D.

Acknowledgements

I would like to thank my great advisor Petr Švenda for his consulting, enthusiasm and tolerance. I am also grateful to other people for their support and willingness to help, especially to Radim, Káťa and Robert.

Abstract

This thesis analyzes the behavior of the Linux random number generator in a virtualized environment in two different ways. Firstly, entropy inputs to the generator are examined and their entropy is compared to the entropy of inputs in a non-virtualized environment. Secondly, the Linux random number generator behavior after restoring from snapshots is analyzed to observe so-called reset vulnerabilities.

Keywords

randomness testing, statistical test batteries, Linux random number generator, entropy, virtualized environment, VirtualBox

Contents

1 Introduction
2 Related work
  2.1 Widespread weak keys in network devices
    2.1.1 Vulnerabilities
    2.1.2 Weak entropy and the LRNG
  2.2 Virtual machine reset vulnerabilities
    2.2.1 TLS client vulnerabilities
    2.2.2 TLS server vulnerabilities
3 Randomness testing
  3.1 Statistical randomness testing
  3.2 NIST Statistical Test Suite
  3.3 Dieharder
  3.4 TestU01
  3.5 Randomness Testing Toolkit
4 Generating random data in Linux
  4.1 Function $RANDOM
  4.2 Linux random number generator
    4.2.1 Initialization
    4.2.2 Entropy collection
    4.2.3 Entropy pools update
    4.2.4 /dev/random
    4.2.5 /dev/urandom
    4.2.6 Changes to the LRNG in previous years
5 Generating random data in virtualized environment
  5.1 Access mediation to hardware resources
    5.1.1 Hardware resources not accessible to a guest
    5.1.2 Hardware resources exclusively assigned to a guest
    5.1.3 Hardware resources shared between guests
  5.2 Access mediation to CPU instructions
  5.3 Entropy Sources in Linux random number generator
    5.3.1 Disk I/O
    5.3.2 Human input
    5.3.3 Interrupt requests
    5.3.4 Timer in Oracle VirtualBox
6 Analysis of the LRNG entropy inputs in virtualized environment
  6.1 Methodology
  6.2 Baseline
    6.2.1 Testing output data
    6.2.2 Testing input data
  6.3 Detailed input analysis
    6.3.1 Disk I/O
    6.3.2 Interrupt requests
    6.3.3 Human input
7 Analysis of the LRNG within virtual machine snapshots
  7.1 Methodology
  7.2 Results
8 Conclusion
Bibliography
Index
A VirtualBox settings

1 Introduction

Random numbers are important for many different uses: in statistics, in computer simulations, and in cryptography. In cryptography, random numbers are necessary for creating initialization vectors, keys, seeds, and random noise. Hence, it is important to be able to generate random data of a high quality.

A widely used generator is the Linux random number generator (LRNG) from the Linux kernel. The LRNG uses several different entropy inputs. However, when it is used in a virtualized environment, access to these entropy sources might be limited or modified by the virtual machine monitor.
The aim of this thesis is to analyze the LRNG behavior in a virtualized environment, especially in Oracle VirtualBox, in two different ways. Firstly, entropy inputs to the generator are examined and their entropy is compared to the entropy of inputs in a non-virtualized environment. Secondly, the Linux random number generator behavior after restoring from snapshots is analyzed to observe so-called reset vulnerabilities.

The second chapter presents two papers connected to vulnerabilities in the LRNG. The chapter shows how even a good random number generator, which the LRNG is, may generate vulnerable keys. In the first paper, the vulnerable keys generated by the LRNG are caused by limited access to entropy. The second paper presents so-called reset vulnerabilities, where the LRNG in virtual machines reverted from snapshots produces the same output.

The third chapter focuses on randomness testing, that is, on methods for measuring randomness. Statistical randomness testing and different randomness testing suites are presented.

Chapter four provides one of the most detailed descriptions of the LRNG on the Internet. A part of the LRNG was changed in 2016, so many sources do not provide up-to-date information about the LRNG generating mechanism. A detailed study of the LRNG was necessary for the analysis in the practical part of the thesis.

The following chapter describes how the LRNG is influenced when used in a virtualized environment. The influence on all entropy inputs is examined with a focus on the virtual machine monitor Oracle VirtualBox.

In chapter six, an analysis of the LRNG entropy inputs in a virtualized environment is carried out. Firstly, the output of the LRNG is analyzed using statistical randomness testing. The kernel is then modified in such a way that it allows collecting entropy inputs. Entropy in the entropy inputs to the LRNG is measured, and finally, different sources of the entropy inputs are observed.
The source code of the modified kernel is available as an electronic attachment.

The last chapter examines the virtual machine reset vulnerabilities in Oracle VirtualBox.

2 Related work

A significant body of research related to random data generation has been published. The publications focus on different topics in this area: the design of new generators [1, 2, 3, 4], weak or predictable generators [5, 6], random data analysis [7], vulnerabilities caused by using problematic generators [8, 9], etc. This chapter presents the two most closely related publications, which investigated issues connected to random data generation in the Linux operating system [10] and within a virtual machine [11].

2.1 Widespread weak keys in network devices

The paper Mining Your Ps and Qs: Detection of Widespread Weak Keys in Network Devices [10] was published in 2012. The paper focused on RSA and DSA keys, which can be vulnerable when generated by malfunctioning random number generators. The inspected keys came from TLS and SSH servers, and the researchers found that a surprisingly high number of those keys are vulnerable. The vulnerabilities found are presented in the following section.

In this research, a large network scan was performed and 5.8 million unique TLS certificates and 6.2 million unique SSH host keys were obtained. Afterwards, it was necessary to identify what hardware or software generated the obtained keys in order to find potentially vulnerable device models producing weak keys, and then to compare keys from a specific device model.

2.1.1 Vulnerabilities

Repeated keys: It was found that 61% of the TLS hosts and 65% of the SSH hosts used the same key as another host in the scan. Usually, the keys were the same due to the use of default keys or due to low entropy available during the key generation. Many repeated keys were due to shared hosting providers. Another cause of repeated keys was distinct certificates with the same public key, belonging to the same organization. Neither of these two causes can be considered a vulnerability; nevertheless, many vulnerable keys remain.

More than 5% of the TLS hosts served default keys or certificates from the manufacturer. These keys are preconfigured in the firmware and usually all devices of one device model share the same key pair. Private keys may be obtained by reverse engineering or, more simply, from public databases of these default keys if such a database exists for the device model.

In TLS, more than 0.3% of hosts (43,852 hosts) served repeated keys due to low entropy during the key generation, and 98% of these certificates with repeated keys were self-signed.

In the case of SSH hosts, it was impossible to distinguish between default keys and keys repeated due to low entropy during key generation. However, 9.6% of the obtained SSH keys were repeated for one of these reasons.

Factorable RSA keys: When two RSA keys share one of their prime factors, computing the GCD of their moduli is simple and fast, and the shared factor makes it easy to compute both private keys. Based on shared prime factors, it was possible to obtain the private keys of 0.4% of the TLS certificates and 0.02% of the RSA SSH host keys.

Most of those vulnerable keys were generated by devices from only a few different manufacturers. In one case, 576 devices from one manufacturer used keys generated from only 9 distinct prime factors. Most of the vulnerable keys were system-generated certificates and keys used by headless or embedded network devices like routers or firewalls.

DSA signature weaknesses: In the case of DSA keys, the problem lies in repeated ephemeral keys. When a DSA key is used to sign two different messages using the same ephemeral key, it is possible to efficiently compute the long-term private key from the public key and the signatures.
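The ephemeral-key reuse described above can be illustrated with a short sketch. It uses the textbook DSA signing equation s = k^(-1) (h + x r) mod q and toy parameters chosen only for this example (P, Q, G, the hash values and the function names are all assumptions, not values from the paper [10]); real DSA parameters are far larger.

```python
# Toy DSA parameters, far too small for real use (illustrative assumptions).
P = 607   # prime modulus
Q = 101   # prime divisor of P - 1
G = 64    # element of order Q modulo P (2**6 mod 607)


def sign(h, x, k):
    """Sign message hash h with private key x and ephemeral key k."""
    r = pow(G, k, P) % Q
    s = (pow(k, -1, Q) * (h + x * r)) % Q
    return r, s


def recover_private_key(h1, s1, h2, s2, r):
    """Recover (x, k) from two signatures that share the same r.

    Subtracting the two signing equations eliminates x:
        s1 - s2 = k^(-1) * (h1 - h2)  (mod Q)
    which gives k, and then x follows from either equation.
    """
    k = ((h1 - h2) * pow(s1 - s2, -1, Q)) % Q
    x = ((s1 * k - h1) * pow(r, -1, Q)) % Q
    return x, k


if __name__ == "__main__":
    secret_x, reused_k = 57, 23
    r1, s1 = sign(10, secret_x, reused_k)
    r2, s2 = sign(75, secret_x, reused_k)
    assert r1 == r2  # the same ephemeral key yields the same r
    print(recover_private_key(10, s1, 75, s2, r1))
```

The sketch relies on Python 3.8 or later for `pow(a, -1, m)`, which computes a modular inverse. A repeated r value between two signatures under the same public key is exactly the signal an observer would look for.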
In the scan, 0.05% of the obtained SSH DSA signatures (usually two signatures were obtained from one SSH host) contained the same r as at least one other signature, and in 94% of these cases the repeated r value appeared together with the same public key. Because these signatures reused the same ephemeral key, it was possible to compute the private keys of 1.6% of the SSH DSA hosts.
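The shared-factor weakness described under Factorable RSA keys can be sketched in a similar way. The example below is a minimal illustration under stated assumptions: the primes are tiny toy values, the function names are hypothetical, and the moduli are compared pairwise, which is only practical for a handful of keys and is not necessarily how the scan in [10] processed millions of them.

```python
from math import gcd


def shared_prime(n1, n2):
    """Return a nontrivial common factor of two RSA moduli, or None."""
    g = gcd(n1, n2)
    return g if 1 < g < min(n1, n2) else None


def private_exponent(n, p, e=65537):
    """Compute the RSA private exponent once one prime factor p of n is known."""
    q = n // p
    return pow(e, -1, (p - 1) * (q - 1))


if __name__ == "__main__":
    # Two toy moduli that share the prime 1009 (illustrative values only).
    n1 = 1009 * 1013
    n2 = 1009 * 1019
    p = shared_prime(n1, n2)
    d1 = private_exponent(n1, p)
    # Encrypt and decrypt a small message to confirm the recovered key works.
    ciphertext = pow(42, 65537, n1)
    assert pow(ciphertext, d1, n1) == 42
```

Neither modulus needs to be factored directly: a single GCD exposes the common prime, and the remaining prime of each key follows by division.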