True Random Number Generation Using Genetic Algorithms on High Performance Architectures
True Random Number Generation using Genetic Algorithms on High Performance Architectures

by Jose Juan Mijares Chan

A thesis submitted to The Faculty of Graduate Studies of The University of Manitoba in partial fulfillment of the requirements of the degree of Doctor of Philosophy

Department of Electrical and Computer Engineering
The University of Manitoba
Winnipeg, Manitoba, Canada
October 2016

© Copyright 2016 by Jose Juan Mijares Chan

Thesis advisors: Gabriel Thomas, Parimala Thulasiraman
Author: Jose Juan Mijares Chan

Abstract

Many real-world applications use random numbers generated by pseudo-random number generators (PRNGs) and true random number generators (TRNGs). Unlike a pseudo-random number generator, which relies on an input seed to generate random numbers, a TRNG relies on a non-deterministic source to generate aperiodic random numbers. In this research, we develop a novel and generic software-based TRNG using a random source extracted from today's compute architectures. We show that non-deterministic events such as race conditions between compute threads follow a near Gamma distribution, regardless of whether the architecture is a multi-core processor or a co-processor. Our design moves this distribution towards a uniform distribution while ensuring the stationarity of the sequence of random variables. We correct the statistical deficiencies of the random numbers with a post-processing stage based on a heuristic evolutionary algorithm. Our post-processing algorithm is composed of two phases: (i) Histogram Specification and (ii) Stationarity Enforcement. We propose two techniques for histogram equalization, Exact Histogram Equalization (EHE) and Adaptive EHE (AEHE), that map the distribution of the random numbers to a user-specified distribution. EHE is an offline algorithm with O(N log N) complexity. AEHE is an online algorithm that improves performance using a sliding window and achieves O(N).
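As a rough illustration of the offline idea described above, the following minimal sketch flattens a histogram by ranking the samples, which is where the O(N log N) sort cost comes from. It is not the thesis implementation: the function names, the tie-breaking by arrival order, the 16 output levels, and the synthetic Gamma-distributed stand-in for measured race-condition timings are all assumptions made for this example.

```python
import math
import random
from collections import Counter


def exact_histogram_equalize(samples, levels):
    """Map samples onto `levels` output values with a near-flat histogram.

    Hypothetical helper for illustration; ties are broken by arrival
    order, which is an assumption of this sketch.
    """
    n = len(samples)
    # Sorting the indices by (value, position) dominates the cost: O(N log N).
    order = sorted(range(n), key=lambda i: (samples[i], i))
    out = [0] * n
    for rank, idx in enumerate(order):
        # Consecutive ranks are spread evenly over the output levels.
        out[idx] = rank * levels // n
    return out


def normalized_entropy(values, levels):
    """Shannon entropy of `values`, divided by its maximum log2(levels)."""
    n = len(values)
    counts = Counter(values)
    h = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return h / math.log2(levels)


# Synthetic stand-in for timing measurements (assumed, not measured data).
rng = random.Random(0)
raw = [int(rng.gammavariate(2.0, 20.0)) for _ in range(4096)]

eq = exact_histogram_equalize(raw, 16)
print(Counter(eq))
print(normalized_entropy(eq, 16))
```

Because 4096 is a multiple of 16, every output level receives exactly 256 samples in this example, so the equalized histogram is flat regardless of the shape of the input distribution. The mapping is also order-preserving: a larger raw sample never lands on a lower output level.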
Both algorithms ensure a normalized entropy in the range (0.95, 1.0]. The stationarity enforcement phase uses genetic algorithms to mitigate the statistical deficiencies in the output of histogram equalization by permuting the random numbers until wide-sense stationarity is achieved. By measuring the standard deviation of the power spectral density, we ensure that the quality of the numbers produced by the genetic algorithms is within the error level specified by the user. We develop two algorithms: a naive algorithm with an expected exponential complexity of E[O(e^N)], and an accelerated FFT-based algorithm with an expected quadratic complexity of E[O(N^2)]. The accelerated FFT-based algorithm exploits the parallelism found in genetic algorithms on a homogeneous multi-core cluster. We evaluate the effects of its scalability and data size using a standardized battery of tests, TestU01, and find the tuning parameters that ensure wide-sense stationarity on long runs.

Contributions

The goal of this thesis is to develop a novel TRNG in which random numbers extracted from a random source are permuted using evolutionary algorithms to improve solution quality. The following points summarize my contributions.

• Extraction of generic random source: I observed that events involving race conditions between computer threads follow a near Gamma distribution, independently of the modern computer architecture (CPU & GPU) and the underlying process scheduler. This observation led me to propose the following.
• TRNG construction: Given that the Gamma distribution found above is of poor quality for an ideal random number generator, I designed a software-based TRNG that improves the distribution towards a uniform distribution and ensures the stationarity of the sequence of random variables. To the best of my knowledge, no one has attempted this before.
• The proposed TRNG is composed of two post-processing stages: Histogram Specification and Stationarity Enforcement.
  – Histogram Specification: I developed two techniques, Exact Histogram Equalization (an offline algorithm) and Adaptive EHE (an online algorithm), that map the distribution of the random numbers to a uniform distribution.
  – Stationarity Enforcement: I used genetic algorithms to mitigate the statistical deficiencies from the histogram specification stage by permuting the random numbers until wide-sense stationarity is achieved. I developed two algorithms to achieve this: a naive algorithm and an accelerated FFT-based algorithm.
  – Parallelization: The accelerated FFT-based algorithm using genetic algorithms is parallelized on a homogeneous multi-core cluster of Intel Ivy Bridge processors, using the MapReduce programming model to increase performance.
• Evaluation: I presented a group of evaluations that highlight the performance and the quality conditions of my algorithm.
  – Performance analysis: The results on how scalability and data size affect performance suggest that an expected quadratic complexity in computation time can be achieved.
  – Quality analysis: Using the parallel version of the accelerated FFT-based algorithm with the genetic algorithms, I observed a relation between the window size and the quality of the results from a standardized battery of tests, TestU01. I also presented sub-optimal solutions with tuning parameters that ensure wide-sense stationarity on long runs.

Contents

Abstract
Contributions
Table of Contents
List of Figures
List of Tables
Acknowledgments
Dedication
Acronyms
List of symbols

1 Introduction
  1.1 The history of random numbers
  1.2 Random Numbers
  1.3 Characteristics of random numbers
    1.3.1 The uniform distribution of random numbers
    1.3.2 The correlation of random numbers
    1.3.3 The spectral density of random numbers
    1.3.4 The stationarity of random numbers
  1.4 Types of random number generators

2 The Extraction layer on True Random Number Generators
  2.1 The computer architectures of today
  2.2 The idea behind a TRNG on today’s computer architectures
    2.2.1 Data hazards
    2.2.2 Schedulers
    2.2.3 Strategy for designing a TRNG using a compute architecture
  2.3 Case study: TRNG design on GPUs
    2.3.1 Source of randomness found in GPUs
      The measurement of race conditions times
      The measurement of in-chip temperature calculation times
    2.3.2 Evaluation of TRNGs sources in GPUs
      Phenomena observation.
      Effects due to architectural changes.
      Behaviour consistency over long runs.
  2.4 Case study: TRNGs in Modern microprocessors
      Effects due to architectural changes.
      Behaviour consistency over long runs.
      Other TRNGs developed in hardware for CPUs.
  2.5 Summary

3 Distribution shape enhancement on TRNGs
  3.1 Modulus-based algorithms
    3.1.1 Optimizing the random numbers representation format
    3.1.2 Modulus
  3.2 Histogram-based algorithms
    3.2.1 Histogram equalization
    3.2.2 Exact Histogram Equalization
    3.2.3 Adaptive Exact Histogram Equalization
  3.3 Asymptotic analysis of algorithms
  3.4 Summary

4 Stationarity enforcement on True Random Number Generators
  4.1 Naive approach of stationarity enforcement
    4.1.1 Evaluation methodology
    4.1.2 Results
  4.2 Stationarity enforcement accelerated by a FFT-based algorithm
    4.2.1 Correlation metrics
      Autocorrelation
      Power spectral density
    4.2.2 Evaluation criteria
    4.2.3 Enhancements on the stationary enforcement block
    4.2.4 Evaluation methodology
      Characteristics of a GA post-processing stage
    4.2.5 Results
      Results from the Characteristics of a GA post-processing stage
  4.3 Asymptotic analysis of algorithms
  4.4 Summary

5 Quality Evaluation in True Random Number Generators
  5.1 Parallel scheme
  5.2 SmallCrush evaluation
  5.3 Results
  5.4 Discussion
  5.5 Summary

6 Summary of work

7 Conclusions and Future work
  7.1 Conclusions
  7.2 Future Work

Bibliography

List of Figures

1.1 On the left, Pakua or Bāguà (八卦) representation, literally "eight trigrams", by BenduKiwi, licensed under CC-AS 3.0; the 8 basic trigrams are presented surrounding the Yin and Yang, the duality of all things in nature. On the right, the I Ching hexagrams are presented in ascending order, starting at 0 in the upper left corner, going left to right and top to bottom, to 63 in the lower right corner.
1.2 On the left, The Royal Game of Ur from Mesopotamia by British Museum, licensed under CC-BY-3.0. On the right, an Egyptian dice (600-800 Before Common Era (BCE)) by Swiss