
CAS Pseudo-Random Numbers Generators Statistical Test Suits Пакеты Статистических Тестов Для


UDC 519.174.1 CAS pseudo-random numbers generators statistical test suits

M. N. Gevorkyan∗, D. S. Kulyabov∗†, A. V. Demidova∗, A. V. Korolkova∗
∗ Department of Applied Probability and Informatics, Peoples’ Friendship University of Russia (RUDN University), 6 Miklukho-Maklaya St, Moscow, 117198, Russian Federation
† Laboratory of Information Technologies, Joint Institute for Nuclear Research, 6 Joliot-Curie, Dubna, Moscow region, 141980, Russian Federation

Email: [email protected],[email protected],[email protected],[email protected]

For a long time, the pseudo-random number generators implemented in standard programming language libraries and mathematical packages were of poor quality. The situation has started to improve only relatively recently. Even now, many libraries and poorly supported mathematical packages still use old algorithms for pseudo-random number generation. We describe four current sets of statistical tests that can be used to check the generator used in a particular software system. The emphasis is on command-line utilities, which make it possible to avoid low-level programming in C or C++.

Key words and phrases: pseudo random number generators, TestU01, PractRand, DieHarder, gjrand.

1. Introduction

In 1995, George Marsaglia released a set of statistical tests that allowed users to check the quality of existing pseudo-random number generators. This battery of tests showed that the vast majority of generators then in use produced poor-quality sequences and failed most of the tests. The test set became widely known and pushed researchers to search for better algorithms for random number generation.

2. Generators in modern computer algebra systems

To date, modern versions of the standard libraries of actively supported programming languages and of computer algebra systems such as Maple [1], Mathematica [2], and SymPy [3] use the Mersenne twister (MT) algorithm. Proposed in 1997 [4], two years after the Marsaglia test suite was introduced, it was one of the first algorithms to be widely adopted as a high-quality replacement for linear congruential generators (LCG). MT passed all Diehard tests. MT takes its name from the Mersenne prime $2^{19937} - 1$, the period of its standard variant; depending on the implementation, a period of up to $2^{216091} - 1$ is provided. The main disadvantage of the algorithm is its relative complexity and, as a result, relatively low performance. Note also that much simpler and more efficient algorithms [5–7] have since been developed. Apart from this, MT provides a pseudo-random sequence of good quality and is quite suitable for most tasks.

Let us, however, move on to the main purpose of this work. If a researcher relies on pseudo-random number generation, how can the quality of the generated sequences be checked? This may be relevant when using a non-standard computer algebra system or an old version of a system. Even with relatively modern tools, the question of choosing a good initial value (seed) remains.

The obvious answer is to use a statistical test package. However, almost all open-source packages are implemented in C or C++, and using their functions directly requires a fair amount of low-level programming. Since the programming languages of computer algebra systems are quite high-level, calling C/C++ functions from them can be impossible or very time-consuming. In our opinion, this difficulty can be avoided by using the command-line utilities supplied with the test suites. These utilities eliminate the need to embed C code into the program and allow one to write a script that feeds the sequence of numbers under analysis to the input of the testing utility.
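The scripted approach described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not part of any test suite: the function names `words_to_bytes` and `stream_words` are ours, and the generator under test is assumed to be available as a callable returning 32-bit unsigned integers. The outputs are packed as little-endian raw bytes and written to a binary stream such as standard output.

```python
import random
import struct
import sys


def words_to_bytes(words):
    """Pack an iterable of 32-bit unsigned integers into little-endian raw bytes."""
    words = list(words)
    return struct.pack("<%dI" % len(words), *words)


def stream_words(next_word, out, chunk_words=4096, max_chunks=None):
    """Write outputs of next_word() to the binary stream `out`, chunk by chunk.

    Stops after max_chunks chunks; if max_chunks is None, runs until the
    reader (for example a testing utility) closes the pipe.
    Returns the number of chunks written.
    """
    written = 0
    try:
        while max_chunks is None or written < max_chunks:
            chunk = words_to_bytes(next_word() for _ in range(chunk_words))
            out.write(chunk)
            written += 1
    except BrokenPipeError:
        pass  # the reader finished and closed its input
    return written
```

As a usage example, Python's built-in Mersenne twister can stand in for the generator under test: `rng = random.Random(12345)` followed by `stream_words(lambda: rng.getrandbits(32), sys.stdout.buffer)`, where the seed 12345 is an arbitrary illustrative value. Saved in a hypothetical file `gen.py`, such a script could be piped to a tester, for instance `python gen.py | RNG_test stdin32` for PractRand or `python gen.py | dieharder -g 200 -a` for DieHarder; these invocations should be checked against the documentation of the installed versions.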

3. Current sets of statistical tests

As already noted, historically the first set of statistical tests for random number generators was the DieHard test suite [8], created in 1995 by George Marsaglia. It was distributed on CD, and its official web page is now available only in a web archive. The suite itself is outdated, but all of its tests are available in other suites. Four current test suites can be distinguished:

– TestU01 [9,10] by Pierre L’Ecuyer and Richard Simard. Written in ANSI C. Today it is the best-known test suite. It can test generators that produce numbers from the unit interval [0, 1). The latest version is 1.2.3, dated August 18, 2009.
– PractRand [11] by Chris Doty-Humphrey. Written in C++11 with C99 elements. Takes a stream of bytes as input and can test 32- and 64-bit generators. Able to handle very large amounts of data. The latest version is 0.94, dated August 4, 2018.
– gjrand [12]. The official website gives no author contact information. Written in C99. Accepts a stream of bytes as input. It comes with a set of generators that produce not only uniformly distributed pseudo-random sequences but also sequences with normal, Poisson, and other distributions. The latest version is 4.2.1, dated November 28, 2014.
– DieHarder [13] by Robert Brown. Positioned as the successor to the DieHard tests. Written in C. Requires GSL [14] and can test any generator with a GSL-style interface. The latest version is 3.31.1, dated June 19, 2017.

Table 1
Summary of four test suites

Package    Language    CMD  Unix  Windows  Version  Date        Ref.
TestU01    ANSI C      -    ++    ±        1.2.3    18.08.2009  [9]
PractRand  C99, C++11  +    +     +        0.94     04.08.2018  [11]
gjrand     C99         +    +     ±        4.2.1    28.11.2014  [12]
DieHarder  C99         +    ++    ±        3.31.1   19.06.2017  [13]

Table 1 summarizes the main characteristics of the test suites. The CMD column shows whether a command-line utility is provided. The Unix column indicates whether the suite can be installed on *nix systems. The Windows column is set to plus only if the program can be built without installing CygWin or MinGW.

All of the listed test suites are open source. TestU01 and DieHarder can be installed from the official repositories of many distributions, in particular Ubuntu 18.10. The other two test suites must be installed manually. Each of these packages allows one to run tests by linking its libraries into C/C++ programs. Three of them, all except TestU01, also provide a command-line utility. Let us briefly describe each of these utilities.
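Before turning to the individual utilities, note that the suites differ in the input they expect: TestU01 works with numbers from [0, 1), while PractRand and gjrand read a stream of bytes. A computer algebra system often returns floating-point values, so a conversion step may be needed. The Python sketch below shows one possible conversion. The helper names are ours, and scaling by $2^{32}$ (keeping the top 32 bits of each value) is an assumption made for illustration, not a convention prescribed by any of the suites.

```python
import math
import struct

SCALE = 2 ** 32  # number of distinct 32-bit words


def word_to_unit(word):
    """Map a 32-bit unsigned integer to a float in [0, 1)."""
    return word / SCALE


def unit_to_word(u):
    """Map a float in [0, 1) to a 32-bit unsigned integer (top 32 bits of u)."""
    if not 0.0 <= u < 1.0:
        raise ValueError("expected a value in [0, 1)")
    return math.floor(u * SCALE)


def units_to_bytes(values):
    """Convert floats in [0, 1) to little-endian raw bytes, 4 bytes per value."""
    return b"".join(struct.pack("<I", unit_to_word(u)) for u in values)
```

Only the 32 most significant bits of each value reach the byte stream, so any defects confined to lower-order bits of the floating-point output would not be visible to a tester fed in this way.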

References

1. Maple home site (2018). URL https://www.maplesoft.com/products/maple/
2. Mathematica home site (2018). URL https://www.wolfram.com/mathematica/
3. SymPy home site (2017). URL http://www.sympy.org/ru/index.html
4. M. Matsumoto, T. Nishimura, Mersenne twister: A 623-dimensionally Equidistributed Uniform Pseudo-random Number Generator, ACM Trans. Model. Comput. Simul. 8 (1) (1998) 3–30. doi:10.1145/272991.272995. URL http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/ARTICLES/mt.pdf
5. F. Panneton, P. L’Ecuyer, On the xorshift random number generators, ACM Trans. Model. Comput. Simul. 15 (4) (2005) 346–361. URL http://doi.acm.org/10.1145/1113316.1113319
6. M. E. O’Neill, PCG: A Family of Simple Fast Space-Efficient Statistically Good Algorithms for Random Number Generation, Tech. Rep. HMC-CS-2014-0905, Harvey Mudd College, Claremont, CA (Sep. 2014). URL https://www.cs.hmc.edu/tr/hmc-cs-2014-0905.pdf
7. P. Boldi, S. Vigna, On the lattice of antichains of finite intervals, Order 35 (1) (2018) 57–81. doi:10.1007/s11083-016-9418-8. URL https://doi.org/10.1007/s11083-016-9418-8
8. G. Marsaglia, The Marsaglia random number CDROM including the Diehard battery of tests of randomness (1995). URL https://web.archive.org/web/20160125103112/http://stat.fsu.edu/pub/diehard/
9. P. L’Ecuyer, R. Simard, TestU01: empirical testing of random number generators (2009). URL http://simul.iro.umontreal.ca/testu01/tu01.html
10. P. L’Ecuyer, R. Simard, TestU01: A C library for empirical testing of random number generators, ACM Transactions on Mathematical Software (TOMS) 33 (4) (2007) 22. URL http://www.iro.umontreal.ca/~lecuyer/myftp/papers/testu01.pdf
11. C. Doty-Humphrey, PractRand official site (2018). URL http://pracrand.sourceforge.net/
12. Gjrand random numbers official site (2014). URL http://gjrand.sourceforge.net/
13. R. G. Brown, D. Eddelbuettel, D. Bauer, Dieharder: A Random Number Test Suite (2017). URL http://www.phy.duke.edu/~rgb/General/rand_rate.php
14. M. Galassi, B. Gough, G. Jungman, J. Theiler, J. Davies, M. Booth, F. Rossi, GSL: GNU Scientific Library (2019). URL https://www.gnu.org/software/gsl/