Generating Random Permutations by Coin-Tossing: Classical Algorithms, New Analysis and Modern Implementation
Axel Bacher¹, Olivier Bodini¹, Hsien-Kuei Hwang², and Tsung-Hsi Tsai²

¹ LIPN, Université Paris 13, Villetaneuse, France
² Institute of Statistical Science, Academia Sinica, Taipei, Taiwan

March 10, 2016

Abstract

Several simple, classical, little-known algorithms in the statistical literature for generating random permutations by coin-tossing are examined, analyzed and implemented. These algorithms are either asymptotically optimal or close to being so in terms of the expected number of times the random bits are generated. In addition to asymptotic approximations to the expected complexity, we also clarify the corresponding variances, as well as the asymptotic distributions. A brief comparative discussion with numerical computations in a multicore system is also given.

1 Introduction

Random permutations are indispensable in widespread applications ranging from cryptology to statistical testing, from data structures to experimental design, from data randomization to Monte Carlo simulations, etc. Natural examples include the block cipher (Rivest, 1994), permutation tests (Berry et al., 2014) and the generalized association plots (Chen, 2002). Random permutations are also central in the framework of Boltzmann sampling for labeled combinatorial classes (Flajolet et al., 2007), where they intervene in the labeling process of samplers. Finding simple, efficient and cryptographically secure algorithms for generating large random permutations is thus of vital importance from a modern perspective. We are concerned in this paper with several simple classical algorithms for generating random permutations (each with the same probability of being generated), some having remained little known in the statistical and computer science literature, and focus mostly on their stochastic behaviors for large samples; implementation issues are also discussed.

(This research was partially supported by the ANR-MOST Joint Project MetAConC under the Grants ANR 2015-BLAN-0204 and MOST 105-2923-E-001-001-MY4.)

Algorithm Laisant-Lehmer (LL): when Unif[1, n!] is available. (Here and throughout this paper Unif[a, b] represents a discrete uniform distribution over all integers in the interval [a, b].) The earliest algorithm for generating a random permutation dates back to Laisant's work near the end of the 19th century (Laisant, 1888), which was later rediscovered in (Lehmer, 1960). It is based on the factorial representation of an integer

    k = c_1 (n−1)! + c_2 (n−2)! + ··· + c_{n−1} 1!    (0 ≤ k < n!),

where 0 ≤ c_j ≤ n − j for 1 ≤ j ≤ n − 1. A simple algorithm implementing this representation then proceeds as follows; see (Devroye, 1986, p. 648) or (Robson, 1969). Let k = Unif[0, n! − 1]. The first element of the random permutation is c_1 + 1, which is then removed from {1, ..., n}. The next element of the random permutation will then be the (c_2 + 1)st element of the n − 1 remaining elements. Continue this procedure until the last element is removed. A direct implementation of this algorithm results in a two-loop procedure; a simple one-loop procedure was devised in (Robson, 1969) and is shown below as Algorithm 1; see also (Plackett, 1968) and Devroye's book (Devroye, 1986, §XIII.1) for variants, implementation details and references.

Algorithm 1: LL(n, c)
  Input: c, an array with n elements
  Output: a random permutation of c
  begin
    u := Unif[1, n!]
    for i := n downto 2 do
      t := ⌊u/i⌋;  j := u − i·t + 1
      swap(c_i, c_j);  u := t
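For concreteness, the one-loop procedure can be transcribed almost line by line into Python. The sketch below is ours (the function name `ll_shuffle` and the use of Python's `random.randint` as a stand-in for Unif[1, n!] are illustrative choices, not part of the original algorithm); Python arrays are 0-indexed, so c_i corresponds to c[i-1].

```python
import math
import random

def ll_shuffle(c):
    """One-loop Laisant-Lehmer/Robson shuffle (a sketch of Algorithm 1).

    A single uniform integer u in [1, n!] is drawn, and the digits of its
    factorial representation are peeled off to choose the swap positions.
    """
    n = len(c)
    u = random.randint(1, math.factorial(n))     # stand-in for Unif[1, n!]
    for i in range(n, 1, -1):                    # i = n, n-1, ..., 2
        t = u // i
        j = u - i * t + 1                        # (u mod i) + 1, a value in [1, i]
        c[i - 1], c[j - 1] = c[j - 1], c[i - 1]  # swap c_i and c_j (1-based indices)
        u = t
    return c

# Example: a random permutation of 1..8 from a single draw of Unif[1, 8!].
print(ll_shuffle(list(range(1, 9))))
```

As noted below, such a transcription is only practical for small n: the single draw u already requires about log₂(n!) ≈ n log₂ n random bits and exact big-integer arithmetic.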
This algorithm is mostly useful when n is small, say less than 20, because n! grows very fast and the large-number arithmetic involved reduces its efficiency for large n. Also, the generation of the uniform distribution is better realized by the coin-tossing algorithms described in Section 2.

Algorithm Fisher-Yates (FY): when Unif[1, n] is available. One of the simplest and most widely used algorithms (based on a given sequence of distinct numbers {c_1, ..., c_n}) for generating a random permutation is the Fisher-Yates or Knuth shuffle (in its modern form due to Durstenfeld (Durstenfeld, 1964); see Wikipedia's page on the Fisher-Yates shuffle and (Devroye, 1986; Durstenfeld, 1964; Fisher and Yates, 1948; Knuth, 1998a) for more information). The algorithm starts by swapping c_n with a randomly chosen element in {c_1, ..., c_n} (each with the same probability of being selected), and then repeats the same procedure for c_{n−1}, ..., c_2; see Algorithm 2. See also the recent book (Berry et al., 2014) or the survey paper (Ritter, 1991) for a more detailed account.

Algorithm 2: FY(n, c)
  Input: c, an array with n ≥ 2 elements
  Output: a random permutation of c
  begin
    for i := n downto 2 do
      j := Unif[1, i];  swap(c_i, c_j)

Such an algorithm seems to have it all: single loop, one-line description, constant extra storage, efficient and easy to code. Yet it is not optimal in situations such as (i) when only a partial subset of the permuted elements is needed (see (Black and Rogaway, 2002; Brassard and Kannan, 1988)), (ii) when implemented on a non-array data structure such as a list (see (Ressler, 1992)), (iii) when numerical truncation errors are inherent (see (Kimble, 1989)), and (iv) when a parallel or distributed computing environment is available (see (Anderson, 1990; Langr et al., 2014)). On the other hand, at the memory-access level, direct generation of the uniform random variable results in a higher rate of cache misses (see (Andrés and Pérez, 2011)), making the algorithm less efficient than it seems, notably when n is very large; see also Section 6 for some implementation and simulation aspects. Finally, this algorithm is sequential in nature and the memory-conflict problem is subtle in parallel implementations; see (Waechter et al., 2011). Note that the implementation of this algorithm strongly relies on the availability of a uniform random variate generator, and its bit-complexity (number of random bits needed) is of linearithmic order (not linear); see (Lumbroso, 2013; Sandelius, 1962) and below for a detailed analysis.

From unbounded uniforms to bounded uniforms? Instead of relying on uniform distributions with a very large range, our starting question was: can we generate random permutations using bounded uniform distributions (for example, by flipping unbiased coins)? There are at least two different ways to achieve this (illustrative sketches of both are given after this list):

- Fisher-Yates type: simulate the uniform distribution used in the Fisher-Yates shuffle by coin-tossing, which can be realized either by von Neumann's rejection method (von Neumann, 1951) or by the Knuth-Yao algorithm for generating a discrete distribution by unbiased coins (see (Devroye, 1986; Knuth and Yao, 1976) and Section 2); and

- divide-and-conquer type: each element flips an unbiased coin and, depending on the outcome being head or tail, the elements are divided into two groups; the same procedure is continued recursively for each of the two groups. A random resampling is then achieved by an inorder traversal of the corresponding digital tree; see the next section for details.
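To make both strategies concrete, here is a small Python sketch (ours; the helper names `coin`, `unif_by_coins`, `fy_by_coins` and `split_shuffle` are illustrative and not taken from the papers cited above). The first part simulates Unif[1, i] from fair coin flips by a simple rejection scheme, in the spirit of the rejection and Knuth-Yao methods mentioned above, and plugs it into the Fisher-Yates loop of Algorithm 2; the second part implements the divide-and-conquer splitting, where concatenating the recursively shuffled groups corresponds to the inorder traversal of the implicit digital tree.

```python
import random

def coin():
    """One fair coin flip, returned as a random bit (0 or 1)."""
    return random.getrandbits(1)

def unif_by_coins(i):
    """Unif[1, i] from fair coin flips by rejection: draw ceil(log2 i)
    bits and retry whenever the value falls outside [0, i-1]."""
    k = (i - 1).bit_length()              # number of bits needed
    while True:
        x = 0
        for _ in range(k):
            x = (x << 1) | coin()
        if x < i:
            return x + 1                  # shift from [0, i-1] to [1, i]

def fy_by_coins(c):
    """Fisher-Yates type: Algorithm 2 driven only by coin flips."""
    for i in range(len(c), 1, -1):        # i = n, n-1, ..., 2
        j = unif_by_coins(i)
        c[i - 1], c[j - 1] = c[j - 1], c[i - 1]
    return c

def split_shuffle(c):
    """Divide-and-conquer type: every element flips a coin, the two groups
    are shuffled recursively, and the shuffled groups are concatenated
    (the inorder traversal of the implicit binary trie)."""
    if len(c) <= 1:
        return list(c)
    heads, tails = [], []
    for x in c:
        (heads if coin() == 0 else tails).append(x)
    return split_shuffle(heads) + split_shuffle(tails)

# Both calls return a uniformly random permutation of 1..10,
# using fair coin flips as the only source of randomness.
print(fy_by_coins(list(range(1, 11))))
print(split_shuffle(list(range(1, 11))))
```

The number of coin flips consumed by such procedures is the kind of bit-complexity cost studied in the remainder of the paper, although the variants analyzed there differ in how the bounded uniforms are produced.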
This realization naturally induces a binary trie (Knuth, 1998b), which is closely related to a few other binomial splitting processes that will be briefly described below; see (Fuchs et al., 2014). It turns out that exactly the same binomial splitting idea was already developed in the early 1960s in the statistical literature, in (Rao, 1961) and independently in (Sandelius, 1962), and analyzed later in (Plackett, 1968). The papers by Rao and by Sandelius also propose other variants, which have their own modern algorithmic interest. However, all these algorithms have remained little known, not only in computer science but also in statistics (see (Berry et al., 2014; Devroye, 1986)), partly because they rely on tables of random digits rather than on more modern computer-generated random bits, although the underlying principle remains the same. Since a complete and rigorous analysis of the bit-complexity of these algorithms remains open, for historical reasons and for completeness we will provide a detailed analysis of the algorithms proposed in (Rao, 1961) and (Sandelius, 1962) (and partially analyzed in (Plackett, 1968)), as well as of two versions of Fisher-Yates with different implementations of the underlying uniform Unif[0, n−1] by coin-tossing: one relying on von Neumann's rejection method (Devroye and Gravel, 2015; von Neumann, 1951) and the other on Knuth-Yao's tree method (Devroye, 1986; Knuth and Yao, 1976). As the ideas of these algorithms are very simple, it is no wonder that similar ideas also appeared in the computer science literature in different guises; see (Barker and Kelsey, 2007; Flajolet et al., 2011; Koo et al., 2014; Ressler, 1992) and the references therein. We will comment more on this in the next section.

We describe in the next section the algorithms we analyze in this paper. We then give a complete probabilistic analysis of the number of random bits used by each of them. Implementation aspects and benchmarks are briefly discussed in the final section. Note that the Fisher-Yates shuffle and its variants for generating cyclic permutations have been analyzed in (Louchard et al., 2008; Mahmoud, 2003; Prodinger, 2002; Wilson, 2009), but their focus is on data movements rather than on bit-complexity.

2 Generating random permutations by coin-tossing

We describe in this section three algorithms for generating random permutations, assuming that a bounded uniform Unif[0, r−1] is available for some fixed integer r ≥ 2. The first algorithm relies on the divide-and-conquer strategy