
AC 2007-458: SCOPE OF VARIOUS RANDOM NUMBER GENERATORS IN ANT SYSTEM APPROACH FOR TSP

S.K. Sen, Florida Institute of Technology. Syamal K. Sen ([email protected]) is currently a professor in the Dept. of Mathematical Sciences, Florida Institute of Technology (FIT), Melbourne, Florida. He received his Ph.D. (Engg.) in Computational Science from the prestigious Indian Institute of Science (IISc), Bangalore, India in 1973 and then continued as a faculty member of that institute for 33 years. He was a professor in the Supercomputer Education and Research Centre of IISc during 1996-2004 before joining FIT in January 2004. He held a Fulbright Fellowship for senior teachers in 1991 and worked at FIT. He also held faculty positions, on leave from IISc, in several universities around the globe, including the University of Mauritius (Professor, Maths., 1997-98), Mauritius; Florida Institute of Technology (Visiting Professor, Math. Sciences, 1995-96); Al-Fateh University (Associate Professor, Computer Engg., 1981-83), Tripoli, Libya; and the University of the West Indies (Lecturer, Maths., 1975-76), Barbados. He has published over 130 research articles in refereed international journals such as Nonlinear World, Appl. Maths. and Computation, J. of Math. Analysis and Application, Simulation, Int. J. of Computer Maths., Int. J. Systems Sci., IEEE Trans. Computers, Internl. J. Control, Internat. J. Math. & Math. Sci., Matrix & Tensor Qrtly, Acta Applicandae Mathematicae, J. Computational and Applied Mathematics, Advances in Modelling and Simulation, Int. J. Engineering Simulation, Neural, Parallel and Scientific Computations, Nonlinear Analysis, Mathematical and Computer Modelling, Int. J. Innovative Computing, Information and Control, J. Computational Methods in Sciences and Engineering, and Computers & Mathematics with Applications. Besides, he has coauthored seven books, including the most recent one entitled “Computational Error and Complexity in Science and Engineering” (with V. Lakshmikantham), Elsevier, Amsterdam, 2005.
He has also authored several book chapters. All his research and book publications are in several areas, mainly in computational science. He has been teaching courses in areas such as stochastic and deterministic operations research, applied statistical analysis, and computational mathematics since the late sixties. Further, he has been a member of the editorial boards of international journals such as Computer Science and Informatics (India), and Neural, Parallel and Scientific Computations (USA). He has also been cited in Marquis Who's Who (Sep 2005).

Gholam Ali Shaykhian, NASA. Gholam “Ali” Shaykhian ([email protected]) is a software engineer with the National Aeronautics and Space Administration (NASA), Kennedy Space Center (KSC), Engineering Directorate. He is a National Administrator Fellowship Program (NAFP) fellow and served his fellowships at Bethune Cookman College in Daytona Beach, Florida. Ali is currently pursuing a Ph.D. in Operations Research at Florida Institute of Technology. He received a Master of Science (M.S.) degree in Computer Systems from the University of Central Florida in 1985 and a second M.S. degree in Operations Research from the same university in 1997. His research interests include object-oriented methodologies, design patterns, software safety, and genetic and optimization algorithms. He teaches graduate courses in Computer Information Systems at Florida Institute of Technology's University College. Mr. Shaykhian is a senior member of the Institute of Electrical and Electronics Engineers (IEEE) and is the Vice-Chair (2005-2007), Education Chair (2003-2007), and Awards Chair of the IEEE Canaveral section. He is a professional member of the American Society for Engineering Education (ASEE), serving as the Program Chair and Web Master for the Minorities in Engineering Division of ASEE (2006-2008). He was an assistant professor and coordinator of the Information Systems program at the University of Central Florida prior to his full-time appointment at NASA KSC.

© American Society for Engineering Education, 2007

Scope of various random number generators in ant system approach for TSP

Abstract

Random numbers are essential ingredients to all stochastic methods including probabilistic heuristic ones. There are several random number generators. Given a method to solve a traveling salesman problem (TSP), which generator should one use to obtain the best result in terms of quality/accuracy and cost/computational complexity? The article attempts to seek an answer to this question when the specified method is the ant algorithm. Nature and characteristics of random numbers, their generators, and TSPs are specially stressed for a better appreciation.

1. Introduction

A given number cannot just be termed random unless we check/test the sequence to which it belongs. This is unlike the transcendental number π ≈ 3.14159265358, the algebraic number φ = (1 + √5)/2 ≈ 1.61803398874989 (the golden ratio), or the Hilbert number 2^√2 ≈ 2.66514414269023. The word random implies that the predictability (probability of correct prediction) is low and never 100%. As long as there is a finite number of outcomes, the predictability is never zero. In the case of tossing a fair coin, the predictability is 50%, while for rolling a six-faced fair die it is 16⅔%. However, an approximate global prediction with high predictability is possible so far as the character of the sequence is concerned. After a large number, say 500, of tosses of a coin, we may say that approximately 250 heads will be obtained. This number could be 245 or 256. By a statistical test, say the χ²-test, we may conclude that the coin is fair (unbiased) statistically/probabilistically. These heads and tails or, equivalently, 1s and 0s constitute a uniformly distributed random sequence. If we consider a class of 60 students and note their heights or weights one after the other, it is not possible to predict what the height/weight of the next student would be. These 60 heights or 60 weights are statistically random. But these random heights/weights are not uniformly distributed. We have invariably discovered that such heights/weights are normally distributed. Thus we have random sequences which are exponentially distributed, log-normally distributed, or distributed in any of infinitely many possible ways; the uniform distribution is only one of these. Once we have a procedure to produce a uniformly distributed random sequence, we can easily produce from it a random sequence having any other distribution. One may distinguish between global randomness and local randomness, although most randomness concepts are global.
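The claim that any other distribution can be derived from a uniform sequence can be made concrete with inverse-transform sampling. The sketch below (in Python rather than the Matlab used later in this paper; the function name and the rate parameter are our illustrative choices) turns uniform draws into exponentially distributed ones:

```python
import math
import random

def exponential_from_uniform(u, lam=1.0):
    """Inverse-transform sampling: map a uniform [0,1) draw u to an
    exponentially distributed value with rate lam, via -ln(1 - u)/lam."""
    return -math.log(1.0 - u) / lam

random.seed(42)  # fixed seed so the sketch is reproducible
uniform_draws = [random.random() for _ in range(100000)]
exp_draws = [exponential_from_uniform(u, lam=2.0) for u in uniform_draws]

# The sample mean should approach 1/lam = 0.5 for rate lam = 2
mean = sum(exp_draws) / len(exp_draws)
print(round(mean, 2))
```

The same device, with the appropriate inverse cumulative distribution function, yields normal, log-normal, or any other target distribution from the same uniform stream.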
In a near-infinite random sequence there could be long stretches of the same digit(s), say 1, though the overall sequence might still be random. Long sequences of the same digits, even though generated by a random process, would reduce the

local randomness of a sample. That is, a sample could be globally random for sequences of, say, 100,000 digits while not appearing at all random when a sequence of fewer than 500 digits is considered. Usually, in a statistical environment, the numeric sequence needs to be a large one (30 or more entries) before we can talk about whether the sequence is random or not. For example, in tossing a coin, denoting a head by 1 and a tail by 0, if we get 15 0's successively, can we say that the coin is statistically biased? The answer is no. However, if we get, say, 25 0's out of the first 30 throws, then we have a statistical reason, by, say, the χ²-test, to believe that the coin is biased, assuming that the tossing is done in a totally unbiased way. When we talk about the human population of a country or the red blood cell count in a human body, we do not normally use one human being or one red blood cell as the unit. If we did, very large numbers would need to be written or used in a computation. These may not be readily comprehensible by a human; such large numbers also occupy more space. So a more convenient unit, say a million, is usually employed. Thus, by an appropriate choice of the unit in a given environment, a sample size of 30 or larger (but comparable) could, as a rule of thumb, be considered large in statistics. Moreover, such sizes are easily comprehensible. Statistical randomness does not necessarily mean true randomness [1, 2]. Strictly speaking, any number which is the outcome of a process (artificial or natural) can at best be called a pseudo-random number (pseudo implying false) and can never be a true random number. In fact, it is strictly impossible to generate a true random number, since to generate one by any physical means or any artificial means such as a computer, we must pass through a natural or an artificial process. Thus the output of such a process is related to the input through the process and cannot be random.
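The 25-zeros-in-30-throws example above can be checked with a small χ²-test sketch (Python; the helper name is ours, and the only outside ingredient is the tabulated 5% critical value for one degree of freedom):

```python
def chi_square_statistic(observed, expected):
    """Pearson chi-square statistic: sum of (O - E)^2 / E over categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# 30 tosses of a supposedly fair coin: 25 tails (0s) and 5 heads (1s);
# under the fairness hypothesis we expect 15 of each.
stat = chi_square_statistic(observed=[25, 5], expected=[15, 15])

CRITICAL_1DF_5PCT = 3.841   # chi-square critical value, 1 d.o.f., 5% level
biased = stat > CRITICAL_1DF_5PCT
print(round(stat, 2), biased)  # 13.33 True: reject the fairness hypothesis
```

By contrast, 15 zeros followed by 15 ones gives a statistic of 0, so the frequency test alone cannot reject fairness for that obviously patterned sequence; this is why the independence tests discussed later are also needed.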
Even if we produce a number through a combination of a natural process and an artificial process, the number again cannot be a true random number. Strictly speaking, if any natural process, which we may term chaotic or possibly random, is exactly modeled, then the predictability is and has to be always 100%. Such exact modeling, however, is impossible for human beings/computers. Only nature does exact modeling, which may be very much beyond the human sphere of capturing the exactness of all parameters, known as well as unknown to us. Pseudo-random numbers are good enough to be used in the ant system approach, genetic algorithms, Monte Carlo methods, any evolutionary process, or any other randomized algorithm to solve many vitally important computational problems, specifically NP-hard [1, 3] problems such as the TSP, in polynomial time. It is, however, possible to compare random number generators statistically, based on large sample theory, with respect to a given type of problem and rank them. Unlike a deterministic algorithm, in which for a given input (data) we always get the same output (result) on a given platform (computer with software implemented on it), a randomized algorithm could produce different outputs for the same input. This happens because the initial seeds for generating random numbers will usually be different at different times. Thus we see that the performance of two or more generators could fluctuate. One may appear better than another in one run while it is worse in the next run. Because of this fluctuation, ranking/comparing random number generators seems to be controversial. If we consider large (size 30 or more) samples, then we will be able to compare/rank them in a reasonable way. Such a ranking is usually not prone to fluctuation (unlike the situation with a small sample) and is fairly stable.
Quasi-random sequence

In many evolutionary algorithms an explicit construction of point sets that fill out the s-dimensional (s-D) unit cube as uniformly as possible is desirable. If pseudo-random numbers (PRNs) are used, these tend to show some clustering effect [4]. This effect will be more or less pronounced depending on the PRNG employed. To minimize/eliminate this undesired effect, quasi-random number (QRN) sequences were introduced. The generators of these sequences are designed and developed so that they produce more uniformly distributed random numbers. The study of uniformly distributed RNs was started in 1916 by Weyl [5]. He introduced the notion of discrepancy, which reflects clustering and measures the quality of uniformity of a random point set. Hence QRNs are also known as low-discrepancy sequences. An ideal QRN sequence is one where discrepancy/clustering is nonexistent. For instance, if we generate K, say 1000, random points over a finite area comprising A, say 10, unit squares, then each of the A = 10 unit squares should contain exactly K/A = 100 points for an ideal QR sequence. If we now generate K = 10000 random points over the same finite area of A = 10 squares, then each of the A = 10 squares should contain exactly K/A = 1000 points. On the other hand, if we generate 10000 random points over the area A and divide the area A into 500 equal subareas, then each subarea will contain exactly 20 points if the points are ideal quasi-random points. If the QRNG that produced 10000 random points is used to generate just 1000 random points, then each of the 500 subareas should have exactly 2 points if that QRNG is an ideal QRNG. Generating an ideal QR sequence whose generator is comparable to the existing QRNGs in terms of computational complexity is an open problem. It is necessary to stress that an ideal QR sequence generator should be fast and comparable (in terms of rate of production of QRNs) to other existing QRNGs [6-8], since we need a large number of uniformly distributed random numbers for real-world applications.
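The clustering/discrepancy idea can be illustrated in one dimension with the base-2 van der Corput sequence, the 1-D building block of the Halton sequence discussed below. In this Python sketch (the helper names are ours), counting points per subinterval shows the quasi-random sequence filling the cells far more evenly than a pseudo-random one:

```python
import random

def van_der_corput(n, base=2):
    """n-th term of the base-b van der Corput low-discrepancy sequence:
    reflect the base-b digits of n about the radix point."""
    v, denom = 0.0, 1.0
    while n > 0:
        n, digit = divmod(n, base)
        denom *= base
        v += digit / denom
    return v

def counts_per_cell(points, cells=10):
    """How many points fall in each of `cells` equal subintervals of [0,1)."""
    counts = [0] * cells
    for p in points:
        counts[min(int(p * cells), cells - 1)] += 1
    return counts

quasi = [van_der_corput(i) for i in range(1, 1001)]
random.seed(1)
pseudo = [random.random() for _ in range(1000)]

print(counts_per_cell(quasi))   # counts close to the ideal 100 per cell
print(counts_per_cell(pseudo))  # noticeably more uneven counts
```

An ideal QR sequence, as defined above, would put exactly 100 points in every cell; the van der Corput counts come close but are not exact, which is precisely the residual discrepancy.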
It may not be impossible to produce an ideal QRNG, probably after very complicated computations needing unacceptable time to produce each QRN. Such a generator, even if possible, may not be usable/acceptable in practical applications in a meaningful way. Several methods have been developed by Halton [9], Sobol [10], Faure [11], and Niederreiter [12] for generating QR sequences for real-world applications. Fox [13] compared the efficiency of the Halton and Faure QRNGs and a linear congruential PRNG. Bratley et al. [14] included the Sobol QRNG in this comparison. QR sequences, along with their discrepancy measures and applications, have been discussed in [4, 15-21]. There are many computational problems (NP-hard or not), such as those in computer graphics [17], computational finance [22], linear algebra [23], linear and nonlinear optimization [24], and image processing [25], where QR sequence based algorithms are preferable for better numerical results.

Satisfaction of statistical tests for uniformly distributed RNGs is necessary but not sufficient

Uniformly distributed RNs are often the basis for generating RNs which could be normally, exponentially, log-normally, or otherwise distributed. An RN sequence should satisfy two statistical properties, viz., uniformity and independence. These properties imply that if the interval [0, 1] is divided into k subintervals, then the expected number of observations in each subinterval is K/k, where K is the total number of observations, and the probability of an observation occurring in a particular subinterval is independent of the previous observations [6-8, 26]. The generated numbers may not be uniformly distributed, their mean and variance may be too high or too low, and the numbers may display cyclic variations such as autocorrelation among the numbers.
These numbers may show some pattern, such as numbers being successively higher or lower than adjacent numbers, or several numbers above the mean being followed by several numbers below the mean [26]. Frequency tests such as the Kolmogorov-Smirnov test or the χ²-test or both may be used to test for uniformity, while a runs test, autocorrelation test, gap test, or poker test, or all four of the foregoing tests, may be employed to test for independence [26]. However, if a sequence of numbers satisfies all these tests as well as all other tests which one may find in the

literature or create, that still does not tell us that the sequence is completely free from any kind of pattern. However, satisfaction of these tests is absolutely necessary, in the statistical sense, for any RN sequence generated.

Quality of an RN sequence is based on accuracy and cost of the result

One might attempt to measure the degree of randomness of RN sequences generated by pseudo-RNGs such as the Matlab rand, the Simscript, the R250, and the Whlcg [8], and quasi-RNGs such as the Halton, the Faure, the Sobol, and the Niederreiter, probably based on the foregoing tests and/or some other means. Since the purpose of generating RNs is to use them in real-world applications, we believe that the statistical ranking of these RNGs should be based on the quality (accuracy) of the result as well as the cost (computational/time complexity) of producing the result for a given real-world problem or even a hypothetical problem. These rankings may sometimes differ from one application to another and/or from one algorithm to another. We, however, assume that these rankings are based on large (size ≥ 30) samples. Since we do not (and should not) start with a fixed seed to generate an RN sequence by any generator, we will get, unlike deterministic algorithms, different results at different runs/times. Further, even for a particular application, one RNG may perform better than another in one run while in the next run this performance could be worse. Thus there is a controversy about which RNG is most suited for a particular application/type of problem. We have reasons to believe that such a controversy can be put to rest when we consider a large sample of problems. Through our extensive numerical experiments, we have observed that the large-sample-based rankings remain fairly unaltered (stable) for a particular application/algorithm. In Section 2 we state the traveling salesman problem (TSP) and its integer program (IP) formulation and stress that it is NP-hard [1, 3].
The ant system approach is described in Section 3 while the different RNGs and their performance are presented in Section 4. Sections 5 and 6 comprise numerical experiments and conclusions, respectively.
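As an aside, one of the independence tests mentioned earlier, the runs test above and below the mean, can be sketched as follows (Python; the helper name is ours, and the expected-runs and variance formulas are the standard ones for this test, using the normal approximation):

```python
import math

def runs_above_below_mean(seq):
    """Count runs of consecutive values on the same side of the mean and
    compute the z-score against the count expected of an independent sequence."""
    mean = sum(seq) / len(seq)
    signs = [x >= mean for x in seq]
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    n1 = sum(signs)         # values at or above the mean
    n2 = len(seq) - n1      # values below the mean
    expected = 2 * n1 * n2 / (n1 + n2) + 1
    variance = (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)) / \
               ((n1 + n2) ** 2 * (n1 + n2 - 1))
    z = (runs - expected) / math.sqrt(variance)
    return runs, z

# An obviously patterned sequence: strictly increasing values give only
# 2 runs (all-below then all-above the mean), far fewer than expected.
runs, z = runs_above_below_mean(list(range(30)))
print(runs, abs(z) > 1.96)  # 2 True: independence is rejected at the 5% level
```

Note that this patterned sequence would pass a pure frequency (uniformity) test over [0, 30); only an independence test exposes it, which is the "necessary but not sufficient" point made above.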

2. Traveling salesman problem and its integer program model

Let there be n cities. The traveling salesman problem (TSP) is to determine the shortest/lowest-cost (or sometimes minimum-time) path from the city (city of origin) where the salesman starts his journey to the same city, where he returns after visiting each of the remaining n − 1 cities only once. Without any loss of generality, we define the TSP as one of finding a minimal-length closed tour that visits each of the n cities once. It is a deceptively simple combinatorial problem. There always exists an optimal solution of the TSP, unlike linear optimization problems. It models a host of very important real-world problems, such as the problem of milk/newspaper supply, that of flight scheduling, and that of transport (train/bus/boat) networks. The deterministic solution is to evaluate all possible paths and then take the one with the shortest distance/lowest cost/minimum time. But the number of all possible paths/tours that need to be evaluated, for an n-city asymmetric TSP, is (n − 1)!, n being the number of cities. This deterministic solution, though it gives the required global best path, is intractable (too much computing time) even for a significantly small TSP corresponding to a small real-world problem. A reasonable (not too large nor too small) TSP will consist of n = 100 cities. For just n = 21 cities, we need to evaluate the cost of each of 20! ≈ 2.43290200817664 × 10^18 paths. Assuming that the evaluation of each path takes 10^-9 sec. (a reasonable assumption as of 2006), to evaluate these 20! paths we would need about 77.15 years. For n = 26 cities (still a small TSP), we would need about 1.5511210043331 × 10^25 × 10^-9/(60*60*24*365) ≈ 4.92 × 10^8 years, and for n = 31 cities about 8.4 × 10^15 years. According to the Big Bang theory, the age of the universe is defined as the time elapsed between

the Big Bang and the present day. This age is estimated to be about 13.7 × 10^9 years = 13.7 billion years. According to the Wilkinson Microwave Anisotropy Probe project of NASA, the estimated age of the universe is between 13.5 and 13.9 billion years. Thus, to obtain the globally minimal path for a TSP of only 31 cities, the fastest available computer of 2006 would need about 8 × 10^15 years, compared to which even the estimated age of the universe is a numerical zero. Even if an optimal TSP tour were given, verifying its optimality is also intractable. This is because the TSP is an NP (nondeterministic polynomial-time)-hard problem. Designing a polynomial-time deterministic algorithm for the TSP is and remains an open problem. We, therefore, attempt to solve a symmetric TSP by employing a heuristic algorithm such as the greedy, the 2-opt, the 3-opt, simulated annealing (SA), the genetic algorithm (GA), or the ant system approach (AS) [6, 27]. A TSP is symmetric if, for any two cities i, j, the distance from city i to city j is the same as that from city j to city i. The efficiency of these algorithms varies from case to case and from size to size. However, such an algorithm is always polynomial-time and will produce a solution which is believed to be reasonably close to the true optimal path and is profitably usable in a real-world environment. We, however, will never know the true optimal path, in general. The survey paper [28] and the books [29, 30] discuss various developments on this subject. An elaborate description of the TSP and its possible solutions may be found in the Georgia Tech TSP website [31].
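The arithmetic behind these intractability figures can be reproduced directly (Python sketch; the one-path-per-nanosecond rate is the assumption stated in the text):

```python
import math

def brute_force_years(n_cities, paths_per_second=1e9):
    """Years needed to evaluate all (n - 1)! tours of an n-city asymmetric
    TSP at the assumed rate of one path evaluation per nanosecond."""
    paths = math.factorial(n_cities - 1)
    seconds = paths / paths_per_second
    return seconds / (60 * 60 * 24 * 365)

print(f"{brute_force_years(21):.1f}")   # about 77 years
print(f"{brute_force_years(26):.2e}")   # about 4.9e8 years
print(f"{brute_force_years(31):.2e}")   # about 8.4e15 years
```

Adding just five cities multiplies the work by roughly 26 × 27 × 28 × 29 × 30 ≈ 1.7 × 10^7, which is why no plausible improvement in hardware speed rescues the brute-force approach.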

Integer program modeling of TSP

A TSP can be exactly modeled as an integer program (IP) as follows. Consider the graph G = (N, A), where N = {1, 2, 3, …, n} is a set of n vertices which represent cities, and A = {(i, j) : i, j ∈ N} is an arc set. Associated with every arc (i, j) there is a nonnegative cost c_ij. That is, the cost of travel from vertex i to vertex j is c_ij. Let V be the starting (as well as the ending) vertex or city. The TSP then may be stated: given G = (N, A), the starting vertex V, and

C = {c_ij}, determine an optimal tour from and back to the starting city (vertex) V covering every vertex in the network only once. Let S be a subset of the vertex set N and |S| be the number of elements (vertices) of S. If c_ij = c_ji for all i, j ∈ {1, 2, 3, …, n}, the TSP is symmetric, else it is asymmetric; c_ij = 0 for all i = j. Let x_ij be an integer decision/flow variable whose value has to be 0 or 1. If x_ij is 1, then the salesman visits city j after visiting city i, else he does not visit city j after visiting city i. The IP formulation is then

Minimize  Σ_{i=1}^{n} Σ_{j=1}^{n} c_ij x_ij  subject to

Σ_{i=1}^{n} x_ij = 1,  j = 1, 2, …, n (column sums),   Σ_{j=1}^{n} x_ij = 1,  i = 1, 2, …, n (row sums)   (1)

Σ_{i∈S} Σ_{j∈S} x_ij ≤ |S| − 1  ∀ S ⊂ N \ {V},   x_ij ∈ {0, 1}  ∀ i, j ∈ {1, 2, …, n}   (2)

Constraints (1) ensure that every vertex (city) occurs only once in the tour, the first constraint in (2) is the sub-tour elimination constraint, and the second constraint in (2) implies that if the binary variable x_ij = 1 then the arc (i, j) is included in the tour, else it is not.

Consider the four-node symmetric TSP (since a three-node TSP is trivial), where the c_ij matrix is given as

    [c11 c12 c13 c14]   [0 4 7 5]
    [c21 c22 c23 c24] = [4 0 3 9]                                   (3)
    [c31 c32 c33 c34]   [7 3 0 6]
    [c41 c42 c43 c44]   [5 9 6 0]

Then the IP formulation of the TSP is

Minimize  Σ_{i=1}^{4} Σ_{j=1}^{4} c_ij x_ij = 4x12 + 7x13 + 5x14 + 3x23 + 9x24 + 6x34 + 4x21 + 7x31 + 3x32 + 5x41 + 9x42 + 6x43

subject to one-node conditions (8 conditions):

x21 + x31 + x41 = 1,  x12 + x32 + x42 = 1,  x13 + x23 + x43 = 1,  x14 + x24 + x34 = 1 : column sums (4 conditions)

x12 + x13 + x14 = 1,  x21 + x23 + x24 = 1,  x31 + x32 + x34 = 1,  x41 + x42 + x43 = 1 : row sums (4 conditions)

two-node conditions (6 conditions):

x12 + x21 ≤ 1,  x13 + x31 ≤ 1,  x14 + x41 ≤ 1,  x23 + x32 ≤ 1,  x24 + x42 ≤ 1,  x34 + x43 ≤ 1

three-node conditions (4 conditions):

x12 + x23 + x31 ≤ 2,  x12 + x24 + x41 ≤ 2,  x13 + x34 + x41 ≤ 2,  x23 + x34 + x42 ≤ 2

x_ij = 0 or 1  ∀ i, j

The LINDO [32] program for the foregoing IP is

    min 4x12+7x13+5x14+3x23+9x24+6x34+4x21+7x31+3x32+5x41+9x42+6x43
    st
    x21+x31+x41=1
    x12+x32+x42=1
    x13+x23+x43=1
    x14+x24+x34=1
    x12+x13+x14=1
    x21+x23+x24=1
    x31+x32+x34=1
    x41+x42+x43=1
    x12+x21<=1
    x13+x31<=1
    x14+x41<=1
    x23+x32<=1
    x24+x42<=1
    x34+x43<=1
    x12+x23+x31<=2
    x13+x34+x41<=2
    x23+x34+x42<=2
    x12+x24+x41<=2
    end
    inte x12
    inte x13
    inte x14
    inte x21
    inte x23
    inte x24
    inte x31
    inte x32
    inte x34
    inte x41
    inte x42
    inte x43

The output of this program is x14 = 1, x21 = 1, x32 = 1, x43 = 1, while the remaining eight variables are 0. The objective function value (shortest tour length) is 18. It can easily be appreciated that even formulating a general TSP consisting of, say, 100 nodes as an IP manually is very time consuming and difficult. Solving such a linear IP on the fastest available computer is simply intractable. It may be observed that while there are polynomial-time algorithms for solving a linear program, there exists no known polynomial-time algorithm to solve a general linear IP. Hence, as of now, our only way is to use a heuristic algorithm to obtain a solution of a given TSP, while its true global optimal solution will possibly remain unknown for ever.
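For an instance this small the IP answer can be cross-checked by brute-force enumeration of all (n − 1)! tours (a Python sketch; the function name is ours):

```python
from itertools import permutations

# Cost matrix (3) from the text: a symmetric 4-city instance
C = [[0, 4, 7, 5],
     [4, 0, 3, 9],
     [7, 3, 0, 6],
     [5, 9, 6, 0]]

def brute_force_tsp(cost):
    """Enumerate every tour starting and ending at city 0 and return the
    minimum length; feasible only for very small n."""
    n = len(cost)
    best = None
    for perm in permutations(range(1, n)):
        tour = (0,) + perm + (0,)
        length = sum(cost[tour[i]][tour[i + 1]] for i in range(n))
        if best is None or length < best:
            best = length
    return best

print(brute_force_tsp(C))  # 18, matching the IP solution
```

The IP solution x14 = x43 = x32 = x21 = 1 is exactly the tour 1 → 4 → 3 → 2 → 1 with length 5 + 6 + 3 + 4 = 18.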

3. Ant system approach

The ant system technique is developed in such a way that it attempts to mimic the real-world phenomenon that ants, using the shortest path, can find their way to a food source and can return to their colony/nest. The ant system (AS) approach was proposed by Dorigo [33] in 1991. Dorigo et al. [34] have given an excellent, easy-to-understand description of the ant system for optimization by a colony of cooperating agents. We briefly present their ant system here, keeping most of their notation with practically no change. We hope that this will help the reader should he desire to fall back on their paper [34]. In an AS algorithm [35], a colony of artificial ants constructs solutions to the given problem iteratively and communicates indirectly via artificial pheromone trails. In the TSP application these trails are laid on the edges. The AS procedure works well for both symmetric and asymmetric TSPs. The AS approach is essentially an autocatalytic process, i.e., a positive feedback mechanism mimicking the trail-laying and trail-following behavior of ants. It is versatile, robust, population-based, and suited for parallelization. The AS algorithm is versatile since it can be employed for similar versions of the same problem; it can be readily extended from the symmetric TSP to the asymmetric TSP. It is robust since it can be adapted with minimal change to combinatorial problems other than the TSP, such as the job-shop scheduling problem and nonlinear assignment problems. The AS approach is a population-based algorithm; it can exploit positive feedback as a search mechanism as well as inherent parallelism.

The AS algorithm [6, 34] is as follows. Let n be the number of cities and b_i, i = 1(1)n, be the number of ants in city i at time t. Let

m = Σ_{i=1}^{n} b_i = the total number of ants (fixed) at all times.   (4)

In the artificial ant system, time is discrete, unlike in the real ant system, where time is continuous. Also, let d_ij = [(x_i − x_j)² + (y_i − y_j)²]^(1/2) = the Euclidean distance between city i and city j, i.e., along the edge (i, j). The concerned TSP may be termed a Euclidean TSP; depending on the definition of the distance, the TSP may be given an appropriate name for convenience of reference. The information is kept in an n × n distance matrix W. Let η_ij = 1/d_ij = the visibility on the edge (i, j). This quantity raised to the power β, i.e., η_ij^β, is used to control the degree of visibility. Let τ_ij = the intensity (quantity) of the trail (scent) on the edge (i, j) at time t, which is a measure of the quantity of the chemical marker called pheromone (scent) deposited on the trail by each ant at time t. A given ant will more likely follow an edge that has a higher level of pheromone. The quantity τ_ij raised to the power α, viz., τ_ij^α, is used to control the ant's dependence on scent to decide on the edge it would traverse next. Each ant is an agent that (i) chooses the city (out of those not yet visited by it) to go to at time t + 1 with a probability that is a function of the

distance d_ij and the intensity of trail τ_ij present on the edge (i, j) at the current time t, (ii) is compelled to make legal tours (as per the TSP definition), i.e., is not allowed to transit to already visited cities until a tour is completed, and (iii) lays a trail (the substance pheromone in real ants) on each edge (i, j) visited/traversed when it completes a tour. One iteration of the AS algorithm here means m moves carried out by the m ants in the discrete time interval [t, t + 1]. One cycle of the AS algorithm means n iterations of the algorithm, after which each ant has completed a tour.

To start with, b_i (i = 1(1)n) ants are placed at the ith city. At each time step, the m ants, in unison, travel to the next allowable city based on a stochastic decision. This traveling takes place over a period of n time steps, at the end of which each ant will have completed a cycle (a closed tour touching each of the n cities once). Assume that at time t = 0 all m ants start their travel (simultaneously). At each of the moments n, 2n, 3n, … (i.e., after each block of n iterations), when cycles 1, 2, 3, … will have been completed, the trails' scent level is updated. The update rule for the pheromone levels is

τ_ij(t + n) = ρ τ_ij(t) + Σ_{k=1}^{m} Δτ_ij^k   (5)

where ρ < 1 (to impose a bound on the continued accumulation of trail) is a coefficient such that 1 − ρ represents the evaporation of scent/trail between time t and t + n, and Δτ_ij^k is the quantity per unit length of trail laid by the kth ant on the edge (i, j) between time t and t + n. The quantity Σ_{k=1}^{m} Δτ_ij^k measures the additional trail traffic, where

Δτ_ij^k = Q / L_k if the kth ant travels the edge (i, j) in its tour in [t, t + n], else 0   (6)

where Q is a constant and L_k is the tour length of the kth ant, so that the shorter the tour, the greater the chemical reinforcement. The quantity of trail τ_ij at time t = 0 is set to a small constant c. A data structure, say a cv list, where cv stands for "city-visited", is maintained. This list is a dynamically growing vector that consists of all the cities already visited by an ant up to time t (maintaining the order in which these cities have been visited by the ant) and prohibits the ant from revisiting them before the completion of n iterations. When a tour is completed, i.e., just after n iterations, the cv list is used to compute the distance of the path traveled by the ant. The cv list is then initialized (emptied) and the ant is free again to choose as per the aforementioned three conditions (i)-(iii). Let cv_k denote the cv list of the kth ant. Let the transition probability from city i to city j for the kth ant be defined as

p_ij^k(t) = [τ_ij(t)]^α [η_ij(t)]^β / Σ_{h ∈ allowed_k} [τ_ih(t)]^α [η_ih(t)]^β  if ant k has not yet visited city j, else 0   (7)

where allowed_k is the list of cities yet to be visited by the kth ant. So, at each time step t, ant k determines, in a probabilistic fashion, which city it will visit next. This probability is based on the distances to the cities it has not yet visited as well as on the amount of trail (pheromone) present at that moment on the different (distinct) allowed edges. Therefore the transition probability is a trade-off between visibility and trail intensity (quantity) at time t, thus implementing the autocatalytic process. A Matlab code (a modified version of the code in [6]) for the ant algorithm for the TSP is:

    function [] = antalgorithm()
    D = load('cityloc.txt');  % The file cityloc.txt is an n x 3 matrix.
    % Each row of cityloc has three elements. The second and third
    % elements represent the numerical coordinates (location) of the city
    % while the first element is the numerical name of the city starting from 1.
    Ncity = length(D);  % number of cities on tour
    Nants = Ncity;      % number of ants = number of cities (each b(i) is taken as 1)
    % Cities are located at (xcity, ycity)
    x = [D(1,2)]; y = [D(1,3)];
    for i=2:Nants,
      xcity = D(i,2); x = [x xcity];
      ycity = D(i,3); y = [y ycity];
    end;

    % Calculate the upper triangle of the distance matrix
    for i=1:Ncity-1,
      for j=i+1:Ncity,
        dcity(i,j) = sqrt((x(i)-x(j))^2 + (y(i)-y(j))^2);
      end,
    end;

% Complete the symmetric matrix
for i = 2:Ncity, for j = 1:i-1, dcity(i,j) = dcity(j,i); end, end;
t0 = clock;
vis = 1./dcity;
trail = .1*ones(Ncity,Ncity);
% Visibility vis should rightly be infinity at the city where the ant is
% present. So a division by zero resulting in infinity in Matlab is all right,
% although Matlab will produce a warning message. It is immaterial.
maxit = 300; a = 1; b = 5; rr = 0.5; Q = sum(1./(1:8)); dbest = 10^10; e = 5;

% Initialize tours
for ic = 1:Nants, [dum, paths] = sort(rand(1,Ncity)); tour(ic,:) = paths; end;
tour(:,Ncity+1) = tour(:,1); % Tour ends on the city where it started.
for it = 1:maxit,
  for ia = 1:Nants,
    for iq = 2:Ncity-1
      [iq tour(ia,:)];
      st = tour(ia,iq-1); nxt = tour(ia,iq:Ncity);
      prob = ((trail(st,nxt).^a).*(vis(st,nxt).^b))./sum((trail(st,nxt).^a).*(vis(st,nxt).^b));
      rcity = rand;
      % Matlab's pseudo-random number generator "rand" is used; rand needs
      % to be replaced throughout the program by the concerned quasi-/pseudo-
      % random number generator when we want to compare its output with rand's.
      for iz = 1:length(prob)
        % roulette-wheel selection of the next city, as in the code of [6]
        if rcity < sum(prob(1:iz))
          newcity = iq - 1 + iz;
          break
        end % if
      end % iz
      temp = tour(ia,newcity); % place the selected city next in the tour
      tour(ia,newcity) = tour(ia,iq);
      tour(ia,iq) = temp;
    end % iq
  end % ia

  % Calculate the length of each tour and the trail distribution
  phtemp = zeros(Ncity,Ncity);
  for ic = 1:Nants
    dist(ic,1) = 0;
    for id = 1:Ncity
      dist(ic,1) = dist(ic) + dcity(tour(ic,id),tour(ic,id+1));

      phtemp(tour(ic,id),tour(ic,id+1)) = Q/dist(ic,1);
    end % id
  end % ic
  [dmin,ind] = min(dist);
  if dmin < dbest % keep the best (shortest) tour length found so far
    dbest = dmin;
  end % if

  % Trail for the elite path
  ph1 = zeros(Ncity,Ncity);
  for id = 1:Ncity, ph1(tour(ind,id),tour(ind,id+1)) = Q/dbest; end;

  % Update trail
  trail = (1-rr)*trail + phtemp + e*ph1;
  dd(it,:) = [dbest dmin];
  [it dmin dbest];
end % it
disp(' Tour in terms of city numbers and the corresponding tour length')
format short g; disp([tour,dist]),
CPUtime = etime(clock, t0);
fprintf('Best Tour Length = %10.4f CPU time = %8.4f \n\n', dbest, CPUtime)
end

If we store the text file cityloc.txt as

1  8 10
2  9 11
3 10  6
4 15 17

just for testing the ant system algorithm named "antalgorithm" (and easily checking the output), then, issuing the Matlab command >> antalgorithm in the command window, we obtain the output

Tour in terms of city numbers and the corresponding tour length
1  2  3  4  1   28.496
3  1  2  4  3   26.455
4  3  1  2  4   26.455
3  1  2  4  3   26.455

Best Tour Length = 26.4547 CPU time = 0.1720.

In different runs the output may differ, but the best tour length is likely (though not necessarily) to be the same in each run. The CPU time needed could also differ from run to run. We have used the Matlab pseudo-random generator rand in the foregoing program. This generator can be replaced by any other quasi-/pseudo-random number generator (for the purpose of testing/comparing/ranking the different generators over real-world applications). The foregoing program is written mainly for the purpose of experimenting with different RNGs. Since we have taken the number of ants bi(t) = 1, i = 1(1)n, at t = 0, some generality is lost, but our purpose is not affected by this.
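As an illustration of the probabilistic city choice of Eq. (7) together with the roulette-wheel selection used in the Matlab listing, here is a minimal Python sketch; the function name next_city and the injectable rng parameter are ours, not part of the paper's code:

```python
import random

def next_city(trail, vis, current, allowed, a=1.0, b=5.0, rng=random.random):
    """Pick the next city for an ant via the Eq. (7) probabilities:
    p_j proportional to trail[current][j]**a * vis[current][j]**b,
    restricted to the not-yet-visited cities in `allowed`."""
    weights = [(trail[current][j] ** a) * (vis[current][j] ** b) for j in allowed]
    total = sum(weights)
    probs = [w / total for w in weights]
    # Roulette-wheel selection, as in the Matlab loop over iz:
    r = rng()
    cum = 0.0
    for j, p in zip(allowed, probs):
        cum += p
        if r < cum:
            return j
    return allowed[-1]  # guard against floating-point round-off
```

Passing a deterministic rng (e.g., lambda: 0.5) makes the choice reproducible, which is convenient when swapping pseudo- and quasi-random generators as the text suggests.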

How to generate/create TSPs A simple yet unbiased way to create a TSP is to use a random number generator, say Matlab rand. A Matlab code for obtaining a TSP, mainly for the purpose of comparing/ranking various TSP solvers (such as those based on the ant system approach, genetic algorithms, and simulated annealing) or various RNGs for a single solver, is as follows.

function [] = tspgeneration(n) % n is the number of cities

fid = fopen('cityloc.txt', 'w');
for d = 1:n,
  X = round(rand*100) + 2;
  Y = round(rand*100) + 2;
  fprintf(fid, '%d %d %d \n', d, X, Y);
end;
fclose(fid)

The TSP data will be found in the text file cityloc.txt, which can be used in antalgorithm. The coordinates of a city location are chosen uniformly at random from the integers in [2, 102] (round(rand*100) + 2). One may, however, change these values in the tspgeneration program as desired. Note that the generated TSP consisting of n cities is not a test TSP. In fact, if we define (as is usually done) a test problem as one whose solution is known, then it is impossible to have a nontrivial test TSP with reasonably large n. Since the TSP is NP-hard (i.e., neither the solution procedure nor the solution is verifiable in polynomial time), test TSPs in which the true global shortest path is known cannot be created even for an n as small as 30.

4. Quasi- and pseudo-random generators and testing procedures

Quasi-random number generators A quasi-random number generator (QRNG) produces a deterministic sequence that fills the interval [0, 1) as uniformly as possible at each point in time. If we divide the interval into n equal subintervals, start the generation of RNs at time t = 0, and generate K RNs by the time t = 1, then each subinterval should ideally contain K/n RNs at t = 1. Assuming the numbers of RNs generated at the equispaced discrete times t = 1, 2, 3, … are equal, then at time t = 2 each subinterval should ideally contain 2K/n RNs. Similarly, at t = 3 it should ideally contain 3K/n RNs, and so on. If such a generation happens, then the concerned RNG will be a perfect QRNG, in which the uniformity is 100% or, equivalently, the discrepancy is 0%. There is no simple and fast means to develop such an ideal QRNG comparable to the existing non-perfect QRNGs such as the Halton, the Sobol, and the Faure QRNGs, particularly for filling up an s-D cube. It is an open problem.
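The subinterval bookkeeping described above can be sketched with a small hypothetical Python helper (ours, not from the paper) that counts how many points of a sequence fall into each of n equal subintervals of [0, 1); for a perfect QRNG these counts would all be equal at every stage:

```python
def subinterval_counts(points, n):
    """Count how many points of a sequence in [0, 1) fall into each of
    the n equal subintervals; equal counts mean perfect uniformity."""
    counts = [0] * n
    for p in points:
        # min(...) guards a point exactly at 1.0 from indexing past the end
        counts[min(int(p * n), n - 1)] += 1
    return counts
```

Comparing these counts for, say, the first K outputs of a QRNG and a PRNG gives a crude empirical view of the discrepancy the text discusses.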

Mapping RNs in [a, b) given RNs in [0, 1) We can always generate an RN in the interval [a, b) if we have an RNG that generates RNs in the interval [0, 1). For example, if r is an RN in [0, 1), then the corresponding RN in [a, b) is a + r(b − a). We briefly describe below only some simple QRNGs. For other QRNGs, refer to [6-8]. However, Matlab programs for all these QRNGs are appended so that the reader can copy and paste them into a Matlab file and run them to get a first-hand feel of these QRNGs so far as their uniformity and time complexity are concerned.

van der Corput QRNG The van der Corput QRNG (= the 1-D Halton QRNG) constructs a quasi-random sequence as follows.

1. Write each integer n as a (unique) polynomial in a given prime base b:  n = Σ_{j=0}^{m} a_j(n) b^j,

where the coefficients a_j(n) ∈ {0, 1, …, b − 1} and m is the lowest integer that makes

a_j(n) = 0 for all j > m. (In a non-prime base, n can also be written as a unique

polynomial. However, for the van der Corput sequence, we specify a prime base and use it.)
2. Reflect the polynomial about the base point:  h_b(n) = Σ_{j=0}^{m} a_j(n) b^{−j−1}.
If n = 10, b = 3, then n = 1·3^2 + 0·3^1 + 1·3^0, so h_3(10) = 1·3^{−1} + 0·3^{−2} + 1·3^{−3} = 1/3 + 1/27 = 10/27 = 0.37037037037037 (correct up to 14 decimal places).

Thus the sequence of the first 11 QRNs in the van der Corput sequence corresponding to n = 0, 1, 2, …, 10 in the prime base b = 3 is {0, 1/3, 2/3, 1/9, 4/9, 7/9, 2/9, 5/9, 8/9, 1/27, 10/27}. This sequence, displayed to 5 significant digits, is {0 0.33333 0.66667 0.11111 0.44444 0.77778 0.22222 0.55556 0.88889 0.037037 0.37037}, obtained by issuing the Matlab command

>> format short g; qrseq = [0 1/3 2/3 1/9 4/9 7/9 2/9 5/9 8/9 1/27 10/27]
in its command window. If we choose the prime base b = 5, then we will similarly get a different van der Corput sequence. The van der Corput sequences are limited to one dimension. Many of the multi-dimensional sequences are directly or indirectly constructed using this approach. Therefore, a van der Corput sequence can be thought of as the building block for other QR sequences.
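The two steps above can be sketched in Python (a hypothetical helper, not the paper's Matlab code); using exact fractions makes it easy to check the base-3 sequence listed above:

```python
from fractions import Fraction

def van_der_corput(n, base=3):
    """Radical inverse of n in the given prime base: write n in base b
    (step 1), then reflect the digits about the base point (step 2)."""
    q = Fraction(0)
    denom = 1
    while n > 0:
        n, digit = divmod(n, base)   # peel off the lowest base-b digit
        denom *= base                # each digit's weight is b**-(j+1)
        q += Fraction(digit, denom)
    return q
```

For n = 10, base 3, this reproduces the worked value h_3(10) = 10/27.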

Halton QRNG The construction of a Halton sequence follows the same steps as the van der Corput sequence. However, a different prime base is used for each dimension, typically starting with the first prime, viz., 2 (the only even prime; note that 1 is not considered a prime). Therefore the Halton sequence in its first dimension is the van der Corput sequence in base 2, that in the second dimension is the van der Corput sequence in base 3, that in the third dimension is the van der Corput sequence in base 5, and so on. An s-dimensional Halton sequence is generated by pairing s one-dimensional sequences, each based on a different prime number (usually the first s primes) [7, 36]. The numbers in the sequence lie in cycles of b increasing terms, i.e., the sequence moves upward and falls back in cycles of b (where b is the prime base). Further, within each cycle, the terms are separated by 1/b. Table 1 demonstrates the cycle length for the first two dimensions (base 2 and base 3, respectively) of the Halton sequence. Thus a higher base means a longer cycle (and consequently more computation). The 24th dimension corresponds to the prime base 89; here each cycle has 89 terms. The 62nd dimension corresponds to the prime base 293 (the 62nd prime starting from the first positive prime 2); in this case, each cycle has 293 terms. Therefore, to fill the interval [0, 1), more points, i.e., more quasi-random numbers, are needed. Otherwise (with fewer points in higher dimensions), undesired bands will emerge [6-8]. Good 2-D projections may be obtained if the number of quasi-random points used is the product of the bases [6, 36]. For example, if the 44th and 45th dimensions are to be used (whose bases are 193 and 197, respectively), then 193·197 = 38021 points will be well distributed in these dimensions.

Table 1. Cycle length of Halton sequences for dimension 1 (base 2) and dimension 2 (base 3)

Cycle   Dimension 1 (b = 2)   Dimension 2 (b = 3)
        Generated QRNs        Generated QRNs
1       0, 1/2                0, 1/3, 2/3
2       1/4, 3/4              1/9, 4/9, 7/9
3       1/8, 5/8              2/9, 5/9, 8/9
4       3/8, 7/8              1/27, 10/27, 19/27
5       1/16, 9/16            4/27, 13/27, 22/27
6       5/16, 13/16           7/27, 16/27, 25/27
7       3/16, 11/16           2/27, 11/27, 20/27
8       7/16, 15/16           5/27, 14/27, 23/27
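A minimal Python sketch of a Halton generator, assuming the van der Corput construction described above (the helper names are ours). The first dimension uses base 2 and the second base 3, reproducing the initial entries of Table 1:

```python
def radical_inverse(n, base):
    # van der Corput radical inverse of n in the given prime base
    inv, denom = 0.0, 1.0
    while n > 0:
        n, digit = divmod(n, base)
        denom *= base
        inv += digit / denom
    return inv

def halton(num_points, bases=(2, 3)):
    """First num_points points of a Halton sequence: one van der Corput
    sequence per dimension, with a different prime base for each."""
    return [tuple(radical_inverse(n, b) for b in bases)
            for n in range(num_points)]
```

For an s-dimensional sequence one would pass the first s primes as bases, as the text describes.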

Faure QRNG In Faure sequences the base is again prime, but the same prime is used for every dimension. Further, the base is not fixed; it depends on the total number of dimensions and is chosen as the lowest prime number greater than or equal to the number of dimensions (e.g., 2 for s = 1). The previous dimension is used to generate the next dimension. This mitigates, to some extent, the problems of correlation between high dimensions that occur with the Halton sequences. The one-dimensional Faure sequence is identical to the van der Corput sequence in base 2 and to the first dimension of the Halton sequence. Only the first dimension contains cycles of increasing terms. For a description of the Faure QRNG along with illustrations, refer to [6-8].

Sobol QRNG The Sobol sequences all use the same prime base, namely 2. The order in which the elements appear is the only difference between two Sobol sequences of the same size: each dimension uses a different permutation of the same set of values. The permutations are generated using irreducible primitive polynomials modulo 2 (i.e., with coefficients that are either 0 or 1); for multiple dimensions, a distinct primitive polynomial is used for each dimension. A primitive polynomial is a polynomial that generates all elements of an extension field from a base field (a field K is an extension field of a field F if F is a subfield of K). The lowest-degree polynomial is preferred; however, as the dimension increases, higher-degree polynomials need to be used. The construction of a multi-dimensional sequence is thus similar to that of one dimension. For a description of the Sobol QRNG, refer to [6-8]; for a modified version of the Sobol QRNG for faster generation, refer to [4, 8].

Niederreiter QRNG The theory behind this method is quite complex and can be found in detail in Niederreiter [37]. Bratley et al. [38] have presented a simplified form of the Niederreiter QRNG. However, for the description of the Niederreiter procedure to produce QR sequences in s dimensions, the reader may refer [6-8].

Pseudo-RNGs We briefly discuss several PRNGs below. For details, refer to [6-8, 27, 39]. Linear congruential methods, which are special forms of multiple recursive methods, are the basis of the majority of the commonly used PRNGs. Other PRNGs are derived from the combined linear congruential method, the generalized feedback shift register method, the subtract-with-borrow procedure, and the lagged Fibonacci technique (a generalized LCG with a long period). For the detailed mathematical derivation of PR sequences, refer to Knuth [40].

Linear congruential method (generator) The linear congruential generator (LCG)

LCG(a, c, m, x0) generates RNs xn by the recursion xn = (a x_{n−1} + c) mod m, n = 1, 2, 3, …, where

0 ≤ x0 < m is the seed, m is a positive integer called the modulus, 0 ≤ a < m is the multiplier, and 0 ≤ c < m is the increment. On a 32-bit word machine, a reasonable choice of these parameters could be x0 = 0, m = 2^32 = 4294967296, a = 2147001325, c = 715136305. Also refer to [26].

The PRNGs developed based on this method are [6, 8]

(i) Apple (implemented on Apple computers): LCG(5^13, 0, 2^35, 1),
(ii) Derive (implemented in the Derive software): LCG(3·107·9786893, 1, 2^32, 0),
(iii) Drand (used by ANSI C drand48): LCG(7·443·739·11003, 11, 2^48, 0),
(iv) Maple (used in the Maple software): LCG(3·142473223027, 0, 10^12 − 11, 1),
(v) Minstd (a full-period generator): LCG(7^5, 0, 2^31 − 1, 1),
(vi) Randu (a 1960s non-full-period generator): LCG(2^16 + 3, 0, 2^31, 1)

The name "Apple" is given to that PRN generator because it is implemented on Apple computers. The other generators are similarly named after the machine/software on/in which they are implemented. This is just for convenience of reference and not for any specific reason.

The choice of the parameters a, c, m, x0 in LCG(a, c, m, x0) is important [26]. Other LCGs can be generated simply by changing the parameter values appropriately if a good PRNG is desired.
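A minimal Python sketch of the LCG recursion, instantiated with the Minstd parameters LCG(7^5, 0, 2^31 − 1, 1) from the list above (the function name lcg is ours):

```python
def lcg(a, c, m, seed):
    """Generator for LCG(a, c, m, x0): x_n = (a*x_{n-1} + c) mod m."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x  # an integer in [0, m); divide by m for a number in [0, 1)

# Minstd, a full-period generator: LCG(7**5, 0, 2**31 - 1, 1)
minstd = lcg(7**5, 0, 2**31 - 1, 1)
```

From seed 1, Minstd produces the well-known opening values 16807, 282475249, 1622650073, …, which is a convenient sanity check for any reimplementation.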

Simscript It is a linear congruential generator (LCG) originally implemented in the SIMSCRIPT II simulation programming language. It needs four inputs, viz., the modulus m (m > 0), the multiplier a (0 < a < m), the increment c (0 ≤ c < m), and the seed x0 (0 < x0 < m). The sequence of random numbers is generated by the recursion xn = (a x_{n−1} + c) mod m, where the inputs a = 630360016, c = 0, m = 2^31 − 1, and x0 = 1 are used. Other appropriate inputs could also be used depending on precision and accuracy factors.

Combined linear congruential methods (generators) To improve the quality of the PRN sequence, combinations of two or more LCGs have been used [6]. We briefly present some of these generators. However, such a combination can often be shown to be equivalent to a single LCG.

(i) Whlcg: LCG(170, 0, 30323, 1), LCG(171, 0, 30269, 1), LCG(172, 0, 30307, 1) ≡ LCG(16555425264690, 0, 27817185604309, 1),
(ii) Ranecu: LCG(40692, 0, 2147483399, 1), LCG(40014, 0, 2147483563, 1),
(iii) Lelcg: LCG(142, 0, 31657, 1), LCG(146, 0, 31727, 1), LCG(157, 0, 32363, 1) ≡ LCG(30890646900944, 0, 32504802982957, 1) (used in Excel).

The Matlab codes for the foregoing PRNGs are given in [6, 39] and are also included in the Appendix with some modifications for better clarity/computational efficiency.

Generalized feedback shift register method (generator) Using LCG(a, c, m, x0) = LCG(16807, 0, 2^31 − 1, 1), obtain the first 250 RNs. We then use the recursion xn = x_{n−p} ⊕ x_{n−q}, where n ≥ 251, p = 250, q = 103, and ⊕ is the bit-wise exclusive OR (XOR) operator: 0 ⊕ 0 = 0, 0 ⊕ 1 = 1, 1 ⊕ 0 = 1, 1 ⊕ 1 = 0. The 4-bit representations of the decimal numbers 13 and 5, say, are 1101 and 0101, respectively; then 13 ⊕ 5 = 8 (the decimal equivalent of the XOR binary output 1000). The Mersenne Twister is based on this method.
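The seeding-plus-XOR scheme just described can be sketched in Python (the function and parameter names are ours; the seed generator is the LCG(16807, 0, 2^31 − 1, 1) mentioned above):

```python
def gfsr(p=250, q=103, count=10, seed=1):
    """Sketch of the generalized feedback shift register recursion
    x_n = x_(n-p) XOR x_(n-q): seed the first p values with the
    multiplicative LCG(16807, 0, 2**31 - 1, seed), then XOR."""
    m = 2**31 - 1
    x, state = seed, []
    for _ in range(p):            # first p values come from the LCG
        x = (16807 * x) % m
        state.append(x)
    out = []
    for _ in range(count):
        nxt = state[-p] ^ state[-q]   # bit-wise exclusive OR
        state.append(nxt)
        out.append(nxt)
    return out
```

Python's `^` operator is exactly the bit-wise XOR used in the recursion, e.g. 13 ^ 5 == 8 as in the worked example.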

R250 It is a generalized feedback shift register (GFSR) generator. It needs two inputs p and q based on the primitive polynomial x^p + x^q + 1. It then uses the recursion xn = x_{n−p} ⊕ x_{n−q}.

Subtract-with-borrow method (generator) It is the lagged Fibonacci generator supplemented with an extra carry (borrow) bit, in the spirit of the multiply-with-carry method.

Matlab rand The rand command in Matlab produces an array of random numbers whose elements are uniformly distributed in the interval (0, 1). This generator produces floating-point values and is based on the subtract with borrow technique [41].

For further details and for Matlab codes for the foregoing PRNGs, refer [6, 39] as well as Appendix.

Comparison of QRNGs and PRNGs Intra-comparison as well as inter-comparison of QR and PR sequences, both numerically and graphically using Matlab, have been reported [6-8, 39]. As stated earlier and as expected, the QR sequences are more uniformly distributed, i.e., distributed with less discrepancy, than the PR sequences, assuming sufficiently large (not too small) sequences. The clustering effect as well as other pros and cons of various QRNGs and PRNGs are also studied experimentally in [6-8, 39].

Statistical tests for QRNGs and PRNGs An RN sequence must uphold two statistical properties: uniformity and independence [26]. To test for independence, a runs test, an autocorrelation test, a gap test, a poker test, or all of these can be performed on the given sequence [26]. Satisfaction of these tests is necessary but not sufficient; in fact, we do not yet have a set of necessary and sufficient statistical tests for random sequences. However, one may also generate random numbers which are normally (not uniformly), log-normally, or exponentially distributed. According to the autocorrelation test [26], there was a slightly mixed outcome regarding the independence of numbers in sequences generated by both PRNGs and QRNGs. Low autocorrelation usually implies a high degree of independence. The PRNGs Matlab rand, Simscript, R250, and Whlcg passed the autocorrelation test at the 5% significance level for sequences (of lengths 20, 200, 2000, and 20000), implying intra-independence of the numbers in a sequence. The QRNGs Halton, Faure, and Niederreiter (all in 1-D) passed the autocorrelation test at the 5% significance level for smaller sequences (of length 20) but failed for the larger sequences of length 20000. For detailed numerical results along with Matlab codes for the autocorrelation tests, refer to [7]. Thus there is a clear distinction between PRNGs and QRNGs: given a sequence, it is possible to a significant extent to determine whether it is a pseudo-random or a quasi-random sequence. However, both have definite scope in solving different kinds of real-world problems. That is, for some real-world problems PRNGs perform better, while for other problems QRNGs are more desirable [6-8, 39].
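A minimal Python sketch of a lag-k sample autocorrelation, the statistic underlying the autocorrelation test cited from [26] (the exact test statistic and critical values in [26] differ; this helper is only illustrative):

```python
def autocorrelation(seq, lag=1):
    """Sample autocorrelation of a sequence at the given lag; values
    near 0 suggest the numbers are (linearly) independent, values near
    +1 or -1 suggest a strong serial dependence."""
    n = len(seq)
    mean = sum(seq) / n
    var = sum((x - mean) ** 2 for x in seq)
    cov = sum((seq[i] - mean) * (seq[i + lag] - mean) for i in range(n - lag))
    return cov / var
```

A strictly alternating sequence such as 0, 1, 0, 1, … yields a lag-1 value close to −1, while a well-behaved RNG output should give a value close to 0.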

5. Numerical experiments and visualization

The website entitled "MP-TESTDATA - The TSPLIB Symmetric Traveling Salesman Problem Instances", with address http://elib.zib.de/pub/Packages/mp-testdata/tsp/tsplib/tsp/, is an excellent source of about 135 ready-made real-world (or akin to real-world) TSPs from a wide range of application areas [42]. The TSP sizes on this website vary from 16 to 85900 cities/nodes. To compare/rank PRNGs and QRNGs with respect to the ant algorithm presented in section 3, we have used some of these problems/examples for our numerical experiments. However, we report here results on randomly generated TSPs using the Matlab program tspgeneration. Since the comparison needs to be made with respect to a large sample (of size ≥ 30), each selected TSP is required to be run/executed 32-35 times by changing the initial seeds (or taking different seeds) for each run. When we consider small samples (sizes 5-7) of a test problem (one whose solution is known), a stable ranking might not always be possible: the RNGs could keep interchanging their positions at different runs, making it difficult for us to assign each RNG an unchangeable/stable rank with respect to a particular algorithm (e.g., the ant algorithms, simulated annealing, and genetic algorithms) based on the quality of results and the time complexity. Throughout, we have used the 5% significance level. The experiment uses the ant algorithm in which the number of starting ants is taken as the number of nodes/cities. The generality, viz., using bi ants at node/city i for i = 1, 2, 3, …, n in the general ant algorithm, is lost. However, such a loss does not affect the comparison/ranking as long as we implement the single-node-single-ant-based ant algorithm with all the PRNGs and QRNGs. The TSP solutions obtained by these runs of the foregoing implementation are summarized below.
Since the execution time required by the PC (Pentium® D CPU 2.80 GHz, 2.79 GHz, 1.00 GB RAM, Physical Address Extension, Microsoft Windows XP) employed here for a TSP of size, say, 40 is quite significant, instead of running the Matlab program antalgorithm for over 30 different TSPs of the same size, we have run this program for a few (around 5) TSPs of the same size. Interestingly, we did not find instability in ranking the 9 RNGs that we have considered here for this ant algorithm and for the random TSPs. So we report below the results with respect to only three TSPs of sizes 4, 17, and 200. However, ranking instability cannot be ruled out for small samples for all (> 9) quasi- and pseudo-RNGs mentioned in section 4, as well as for other RNGs not mentioned there. Besides ranking the RNGs based on the quality (i.e., 1/relative error) of the TSP solution as well as the time complexity, we have assigned each RNG a weighted rank based on the sum 0.1 * Time complexity + 0.9 * Relative error. The relative error of a (continuous) quantity Q is defined as the nonnegative value |Q − Q~| / |Q|, where Q is the exact value of the quantity and Q~ is the approximate value. By this definition it is impossible to find the relative error exactly, since the exact quantity Q is not known and will never be known (for if the exact quantity were known, the error would not need to be brought into the scene). So the practice is to replace the exact quantity by the most accurate quantity available (or a quantity of higher-order accuracy) [1]. Thus our Q here is the minimal shortest path produced by one of the nine RNGs, and the relative error in the TSP solution can then be computed.

The weights 0.1 and 0.9 are subjective. One may change these weights based on the relative importance of the parameters, viz., the time complexity and the relative error, in a given context. However, we would probably not be wrong in saying that, in the current scenario, over 90% of the computing resources available to us are unutilized and hence wasted. As of 2006, the mainframe concept dominant from the 1950s to the early 1990s is near-extinct, overridden by distributed computing, including non-centralized approaches. So the time complexity of good heuristics for the majority of real-world TSPs is significantly less important than the accuracy (1/relative error) of the solution (the shortest path). The unit of time complexity should be chosen (minutes, hours, days, etc.) so that the numerical value of the complexity is comparable to that of the relative error; otherwise, our weighted rank will be misrepresented. In the following two examples, viz., the 17- and 200-city TSPs, the graphs have been drawn using Microsoft Excel and not Matlab. The computer used for these examples is the PC described above, while the weighted rank is defined as stated after each table.
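The relative-error and weighted-rank computations described above can be sketched as follows (illustrative Python with hypothetical input values; the function names and the sample numbers are ours, not the paper's measured data):

```python
def relative_error(q_approx, q_best):
    """Relative error |Q - Q~| / |Q|, with the unknown exact Q replaced
    by the best (shortest) tour length found across the RNGs."""
    return abs(q_approx - q_best) / abs(q_best)

def weighted_score(time_sec, rel_err, time_unit_sec=60.0,
                   w_time=0.1, w_err=0.9):
    """Weighted figure of merit 0.1*(time in chosen units) + 0.9*relative
    error; the 60 s unit matches the 17-city example's 'Time/60'."""
    return w_time * (time_sec / time_unit_sec) + w_err * rel_err
```

Ranking the generators then amounts to sorting them by this score; choosing the time unit (seconds, minutes, days) keeps the two summands numerically comparable, as the text stresses.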

Example 1: 17-City TSP

Input cityloc.txt
1  16 67
2  94 20
3  50 34
4  82 80
5  18 53
6  48 71
7  32 62
8  23 41
9  70 35
10 13 75
11 51 17
12 97 20
13 37 30
14 85 42
15 57 82
16 93 46
17  2 35

             Shortest    Time         Relative     Ranking based on   Ranking based on   Weighted
             Path        Complexity   error        time complexity    relative error     rank
1. Apple     324.2866       8.7030    0.00000000          3                  1              2
2. Whlcg     330.7543      11.4210    0.01955439          7                  6              8
3. Ranecu    327.9157       9.8750    0.01106717          5                  4              5
4. Lelcg     327.9157      11.4220    0.01106717          8                  4              7
5. Rand      324.9306       4.7810    0.00198196          1                  2              1
6. Halton    328.5597       9.5780    0.01300555          4                  5              6
7. Sobol     324.2866      10.7970    0.00000000          6                  1              3
8. Faure     325.5814    2154.2340    0.00397689          9                  3              9
9. Nied      327.9157       8.0930    0.01106717          2                  4              4

Weighted Rank = 0.1 * (Time complexity / 60) + 0.9 * Relative Error

[Figure: bar chart "Random Sequence Generators - ANT System" for the 17-city TSP, showing for each of the nine generators (Apple, Whlcg, Ranecu, Lelcg, Rand, Halton, Sobol, Faure, Nied) the ranking based on time complexity, the ranking based on relative error, and the weighted rank from the table above.]

Example 2: 200-City TSP

Input cityloc.txt (printed in eight columns of city-number, x, y triples: cities 1-25 in the first column, 26-50 in the second, and so on up to 176-200)

1 97 75 26 36 18 51 60 32 76 5 22 101 87 3 126 82 72 151 17 81 176 68 79 2 74 40 27 79 42 52 30 41 77 49 39 102 7 88 127 94 31 152 15 5 177 72 39 3 54 95 28 15 46 53 56 20 78 85 101 103 21 81 128 77 86 153 62 73 178 97 56 4 31 29 29 81 31 54 4 77 79 44 45 104 16 84 129 34 89 154 37 73 179 52 31 5 90 48 30 20 45 55 14 23 80 12 49 105 24 13 130 50 84 155 39 91 180 101 35 6 54 36 31 67 68 56 41 71 81 8 12 106 30 88 131 36 100 156 27 77 181 68 62 7 58 26 32 20 53 57 61 71 82 42 77 107 65 28 132 58 54 157 12 34 182 28 4 8 13 100 33 88 41 58 13 69 83 79 57 108 19 73 133 71 26 158 93 43 183 46 92 9 4 92 34 65 42 59 23 99 84 59 81 109 6 80 134 99 4 159 102 78 184 83 51 10 39 77 35 57 12 60 28 98 85 81 57 110 98 5 135 61 20 160 15 94 185 37 99 11 27 19 36 36 86 61 25 101 86 38 48 111 56 37 136 101 87 161 74 9 186 55 34 12 54 82 37 51 84 62 95 86 87 56 15 112 87 28 137 63 37 162 76 62 187 36 9 13 92 7 38 39 90 63 74 88 88 14 86 113 27 2 138 8 7 163 35 88 188 27 87 14 31 27 39 34 70 64 51 98 89 5 96 114 14 71 139 52 43 164 85 24 189 90 55 15 26 94 40 79 65 65 71 91 90 69 70 115 6 98 140 53 40 165 34 88 190 71 64 16 3 95 41 21 95 66 4 88 91 38 38 116 4 82 141 19 82 166 11 88 191 55 14 17 45 66 42 5 47 67 74 61 92 54 8 117 82 17 142 74 38 167 93 53 192 32 5 18 40 27 43 13 62 68 97 74 93 27 8 118 20 35 143 14 23 168 6 83 193 98 46 19 61 17 44 87 24 69 49 40 94 67 76 119 11 93 144 48 87 169 84 30 194 2 66 20 24 17 45 100 73 70 78 69 95 57 27 120 38 62 145 97 70 170 67 6 195 97 25 21 71 74 46 34 95 71 13 87 96 95 64 121 99 101 146 20 46 171 14 5 196 67 95 22 44 89 47 83 71 72 28 61 97 100 38 122 47 61 147 92 84 172 66 72 197 30 77 23 31 51 48 57 101 73 4 42 98 91 39 123 6 91 148 75 80 173 75 79 198 99 65 24 90 77 49 5 35 74 58 39 99 31 91 124 26 101 149 55 92 174 77 3 199 22 90

25 90 40 50 92 54 75 70 82 100 40 24 125 89 82 150 100 82 175 38 72 200 95 87

             Shortest Path   Time          Relative       Ranking based on   Ranking based on   Weighted
             Length          Complexity    error          time complexity    relative error     rank
1. Apple     1297.1795       6250.275      0.008000281          3                  2              3
2. Whlcg     1286.8017       6813.093      0.000000000          8                  1              8
3. Ranecu    1312.9473       6510.484      0.019913671          5                  6              5
4. Lelcg     1323.5326       6753.797      0.027752169          7                  8              7
5. Rand      1314.1748       5015.484      0.020829116          1                  7              1
6. Halton    1301.4585       6376.844      0.011261827          4                  3              4
7. Sobol     1302.5518       6598.078      0.012091726          6                  4              6
8. Faure     1327.5822     336418.984*     0.031691363          9                  9              9
9. Nied      1302.7493       6178.391      0.012241496          2                  5              2

*Estimated. Weighted Rank = 0.1 * (Time complexity / (3600*24)) + 0.9 * Relative Error

[Figure: bar chart "Random Sequence Generators - ANT System" for the 200-city TSP, showing for each of the nine generators the ranking based on time complexity, the ranking based on relative error, and the weighted rank from the table above.]

6. Conclusions

The RNGs with a larger cycle/period and a larger degree of independence are, in general (for a wider variety of problems/algorithms), better than those with a smaller cycle and a smaller degree of independence. Our studies suggest that the choice of an RNG depends not only on the algorithm but also on the type of problem for which it is used. Earlier statistical experiments [6-8, 39] on large (≥ 30) samples have demonstrated that QRNGs are better than PRNGs for the Monte Carlo method for integration (single as well as multiple integrals). The reason that could be attributed is that QR sequences are more uniformly distributed (i.e., with less discrepancy) over a region than PR sequences. Geometrically, the integration problem needs to compute the hyper-volume under the hyper-surface, and hence a greater uniformity of the random (multi-dimensional) points is desired, although the randomness is reduced because of the increased uniformity. On the other hand, the experiments [6, 8, 27] performed on simulated annealing for the TSP have shown that PRNGs are better suited than QRNGs with respect to the quality (accuracy) of the result. Interestingly, no specific advantage in the choice of an RNG was discovered while solving the TSP using a genetic algorithm [6, 27]. However, for TSPs using an ant algorithm, we have a mixed result. For a TSP (in the ant algorithm) of moderate size (number of cities ≥ 30), the ranking in section 5 demonstrates that QRNGs and PRNGs are interleaved. So far as the accuracy is concerned, the PRNGs Whlcg and Apple are found marginally better than (or statistically near about the same as) the QRNGs Halton, Sobol, and Niederreiter, while the other PRNGs Ranecu, the popular Matlab rand, and Lelcg are slightly worse than the foregoing QRNGs. "Why is it so?" is a question which needs to be explored. Alternatively, which properties of an ant algorithm are affected by the enhanced uniformity of the QRNGs and which are not? Our quest will continue for a satisfactory answer to this question. However, in the foregoing findings, large (≥ 30) samples (each of moderate size) have not been used, mainly because of the large (but polynomial) time complexity. We hope to obtain a stable ranking of the RNGs for large samples and also to correlate the ant algorithm properties for TSPs with those of the RNGs to justify/visualize why such a ranking occurs.

Acknowledgment This article is a revised version resulting from the comments of the reviewers and the telephonic suggestions/comments of Dr. E. Graves, Program Chair, Mathematics Division of ASEE, to one of the authors. The authors thank them all.

Bibliography

[1] V. Lakshmikantham and S.K. Sen, Computational Error and Complexity in Science and Engineering, Elsevier, Amsterdam, 2005.
[2] M. Haahr, Introduction to randomness and random numbers, http://www.random.org/essay.html, 1999.
[3] E.V. Krishnamurthy and S.K. Sen, Introductory Theory of Computer Science, Affiliated East-West Press, New Delhi, 2004.
[4] S. Galanti and A. Jung, Low-discrepancy sequences: Monte Carlo simulation of option prices, The Journal of Derivatives, 5, 63-83, 1997.
[5] H. Weyl, Über die Gleichverteilung von Zahlen mod. Eins, Math. Ann., 77, 313-352, 1916.
[6] T. Samanta, Random Number Generators: MC Integration and TSP-solving by Simulated Annealing, Genetic and Ant System Approaches, Ph.D. thesis, Florida Institute of Technology (Department of Mathematical Sciences), Melbourne, Florida, USA, 2006.
[7] A. Reese, Quasi- Versus Pseudo-random Numbers with Applications to Nonlinear Optimization, Ph.D. thesis, Florida Institute of Technology (Department of Mathematical Sciences), Melbourne, Florida, USA, 2006.
[8] S.K. Sen, T. Samanta, and A. Reese, Quasi- versus pseudo-random generators: Discrepancy, complexity and integration-error based comparison, International Journal of Innovative Computing, Information and Control, 2, 3, 621-651, 2006.

[9] J.H. Halton, On the efficiency of certain quasi-random sequences of points in evaluating multi-dimensional integrals, Numerische Mathematik, 2, 84-90, 1960.
[10] I.M. Sobol, On the distribution of points in a cube and the approximate evaluation of integrals, U.S.S.R. Computational Mathematics and Mathematical Physics, 7, 86-112, 1967.
[11] H. Faure, Discrepance de suites associees a un systeme de numeration (en dimension s), Acta Arithmetica, XLI, 337-351, 1982.
[12] H. Niederreiter, Low-discrepancy and low-dispersion sequences, Journal of Number Theory, 30, 51-70, 1988.
[13] B.L. Fox, Algorithm 647: Implementation and relative efficiency of quasirandom sequence generators, ACM Transactions on Mathematical Software, 12, 4, 362-376, 1986.
[14] P. Bratley, B.L. Fox, and H. Niederreiter, Algorithm 738: Programs to generate Niederreiter's low-discrepancy sequences, ACM Transactions on Mathematical Software, 20, 4, 494-495, 1994.
[15] H. Niederreiter, Random Number Generation and Quasi-Monte Carlo Methods, SIAM, Philadelphia, 1992.
[16] P.K. Sarkar and M.A. Prasad, A comparative study of pseudo and quasi random sequences for the solution of integral equations, Journal of Computational Physics, 68, 66-88, 1987.
[17] P. Shirley, Discrepancy as a quality measure for simple distributions, Proceedings of Eurographics '91, 183-193, 1991.
[18] J. Struckmeier, Fast generation of low-discrepancy sequences, Journal of Computational and Applied Mathematics, 61, 29-41, 1995.
[19] L. Kocis and W. Whiten, Computational investigations of low-discrepancy sequences, ACM Transactions on Mathematical Software, 23, 2, 266-294, 1997.
[20] S.G. Henderson, B.A. Chiera, and R.M. Cooke, Generating "dependent" quasi-random numbers, Proceedings of the 32nd Conference on Winter Simulation, 527-536, 2000.
[21] S. Haupt and R. Haupt, Practical Genetic Algorithms, Wiley, New York, 1998.
[22] A. Papageorgiou and J.F. Traub, Faster evaluation of multi-dimensional integrals, Computers in Physics, 11, 574-578, 1997.
[23] M. Mascagni and A. Karaivanova, Matrix computations using quasirandom numbers, Springer-Verlag Lecture Notes in Computer Science, 552-559, 2000.
[24] F.J. Hickernell and Y. Yuan, A simple multistart algorithm for global optimization, OR Transactions, 1, 2, 1997.
[25] B. Hannaford, Resolution-first scanning of multidimensional spaces, CVGIP: Graphical Models and Image Processing, 55, 359-369, 1993.
[26] J. Banks, J. Carson, and B. Nelson, Discrete-Event System Simulation (2nd edition), Prentice-Hall, New Jersey, 1996.
[27] T. Samanta and S.K. Sen, Pseudo- versus quasi-random generators in heuristics for the traveling salesman problem, to appear.
[28] M. Junger, G. Reinelt, and G. Rinaldi, The traveling salesman problem, Operations Research and Management Sciences, 7, 225-330, 1995.
[29] E.L. Lawler, J.K. Lenstra, A.H.G. Rinnooy Kan, and D.B. Shmoys, The Traveling Salesman Problem: A Guided Tour of Combinatorial Optimization, Wiley, New York, 1985.
[30] G. Reinelt, The Traveling Salesman: Computational Solutions for TSP Applications, Springer-Verlag, New York, 1994.
[31] Georgia Tech TSP page, http://www.tsp.gatech.edu/.
[32] W.L. Winston, Operations Research: Applications and Algorithms (4th edition), Thomson, Belmont, California, 2004.
[33] M. Dorigo, Optimization, Learning, and Natural Algorithms, Ph.D. thesis (in Italian), Politecnico di Milano, Italy, 1992.
[34] M. Dorigo, V. Maniezzo, and A. Colorni, The ant system: Optimization by a colony of cooperating agents, IEEE Trans. on Systems, Man, and Cybernetics-Part B, 26, 1, 29-41, 1996.
[35] L.M. Gambardella and M. Dorigo, Solving symmetric and asymmetric TSPs by ant colonies, Proceedings of the IEEE Conference on Evolutionary Computation, ICEC96, Nagoya, Japan, 622-627, 1996.
[36] H. Chi, M. Mascagni, and T. Warnock, On the optimal Halton sequence, Mathematics and Computers in Simulation, 70, 9-21, 2005.
Page 12.1256.22 [37] H. Niederreiter, Low-discrepancy and low dispersion sequences, Journal of Number Theory, 30, 51-70, 1988. [38] P. Bratley, B.L. Fox, and H. Niederreiter, Implementation and test of low-discrepancy sequences, ACM Transections on Modeling and Computer Simulation, 2, 3, 195-213, 1992.

[39] V. Lakshmikantham, S.K. Sen, and T. Samanta, Comparing random number generators using Monte Carlo integration, International Journal of Innovative Computing, Information, and Control, 1, 2, 143-165, 2005. [40] D.E. Knuth, The Art of Computer Programming, Volume 2: Seminumerical Algorithms, Addison-Wesley, Reading, MA, 2nd Edition, 1981. [41] S.K. Park and K.W. Miller, Random number generators: Good ones are hard to find, Communications of the ACM, 31, 1192-1201, 1988. [42] MP-TESTDATA - The TSPLIB Symmetric Traveling Salesman Problem Instances, http://elib.zib.de/pub/Packages/mp-testdata/tsp/tsplib/tsp/ ,1995.

Appendix

We provide here Matlab codes for the RNGs used in this paper. In addition, we include Matlab codes for a discrepancy test, a frequency test, and a scatter plot. These codes are modified versions of the codes in [6], revised for better clarity and computational efficiency.

Matlab code for Apple PR sequence

function r = apple(p, q)
% The parameters used here are for the Apple (LCG-based) PRNG.
% If p=1 and q=500 then 500 PRNs will be generated.
persistent a b m x
if isempty(x), a = 5^13; b = 0; m = 2^35; x = 1; end;
if nargin < 1, p = 1; end;
if nargin < 2, q = 1; end;
pq = p*q; r = zeros(p, q);
for k = 1:pq, x = rem(a*x + b, m); r(k) = x/m; end;

In the foregoing code, by changing the values of a, b, m, and x, we can produce other LCGs.
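For readers who wish to experiment with these parameters outside Matlab, the same recurrence can be sketched in Python. This is only an illustration, not one of the codes from [6]; the defaults mirror the Apple generator above, and Python's exact integer arithmetic avoids any double-precision rounding when a*x grows large.

```python
def lcg(n, a=5**13, b=0, m=2**35, x=1):
    """Generate n uniform numbers in [0, 1) from the linear
    congruential recurrence x <- (a*x + b) mod m."""
    out = []
    for _ in range(n):
        x = (a * x + b) % m      # exact integer update
        out.append(x / m)        # scale into [0, 1)
    return out
```

Calling, e.g., lcg(500, a=171, b=0, m=30269, x=1) reproduces one component of the Wichmann-Hill scheme, showing how a single routine covers the whole LCG family.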

Matlab code for Whlcg PR sequence

function r = whlcg(p, q)
% If p=1 and q=500 then 500 PRNs will be generated.
persistent a1 a2 a3 b1 b2 b3 m1 m2 m3 x y z
if isempty(x),
  a1 = 171; b1 = 0; m1 = 30269; x = 1;
  a2 = 172; b2 = 0; m2 = 30307; y = 1;
  a3 = 170; b3 = 0; m3 = 30323; z = 1;
end;
if nargin < 1, p = 1; end;
if nargin < 2, q = 1; end;
pq = p*q; r = zeros(p, q);
for k = 1:pq,
  x = mod(a1*x + b1, m1); y = mod(a2*y + b2, m2); z = mod(a3*z + b3, m3);
  r(k) = mod(x/m1 + y/m2 + z/m3, 1);
end;
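The combining step above — summing the three scaled streams and taking the result modulo 1 — is the essence of the Wichmann-Hill scheme. A minimal Python sketch (illustrative only, not part of the paper's codes) makes the structure explicit:

```python
def wichmann_hill(n):
    """Three small LCGs combined modulo 1, with all seeds
    fixed at 1 as in the whlcg Matlab code above."""
    x, y, z = 1, 1, 1
    out = []
    for _ in range(n):
        x = (171 * x) % 30269
        y = (172 * y) % 30307
        z = (170 * z) % 30323
        # sum of the scaled streams, folded back into [0, 1)
        out.append((x / 30269 + y / 30307 + z / 30323) % 1.0)
    return out
```

Because the three moduli are pairwise coprime, the combined stream has a far longer period than any single component, which is the design motivation for this generator.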

Matlab code for Ranecu PR sequence

function r = ranecu(p, q)
% If p=1 and q=500 then 500 PRNs will be generated.
persistent a1 a2 b1 b2 m1 m2 x y
if isempty(x),
  a1 = 40014; a2 = 40692; b1 = 0; b2 = 0;
  m = 2^31; m1 = m - 85; m2 = m - 249;
  x = 1; y = 1;
end;
if nargin < 1, p = 1; end;
if nargin < 2, q = 1; end;
r = zeros(p, q); pq = p*q;
for k = 1:pq,
  x = mod(a1*x + b1, m1); y = mod(a2*y + b2, m2);
  r(k) = mod(x/m1 + y/m2, 1);
end;

Matlab code for lelcg PR sequence

function r = lelcg(p, q)
% If p=1 and q=500 then 500 PRNs will be generated.
persistent a1 a2 a3 b1 b2 b3 m1 m2 m3 x y z
if isempty(x),
  a1 = 157; a2 = 146; a3 = 142; b1 = 0; b2 = 0; b3 = 0;
  m1 = 32363; m2 = 31727; m3 = 31657; x = 1; y = 1; z = 1;
end;
if nargin < 1, p = 1; end;
if nargin < 2, q = 1; end;
r = zeros(p, q); pq = p*q;
for k = 1:pq,
  x = mod(a1*x + b1, m1); y = mod(a2*y + b2, m2); z = mod(a3*z + b3, m3);
  r(k) = mod(x/m1 + y/m2 + z/m3, 1);
end;

Matlab code for Halton QR sequence

function r = halton(p, q)
% If p=1 and q=500 then 500 QRNs will be generated.
dim = 2;
persistent seed seed2 base
if isempty(seed2),
  prm_numbers = primes(300); base = prm_numbers(dim); seed2 = base^4 - 1;
end
if nargin < 1, p = 1; end
if nargin < 2, q = 1; end
r = zeros(p, q); seed = seed2;
for k = 1:p*q,
  x = 0.0; base_inv = 1.0/base;
  while any(seed ~= 0)
    digit = mod(seed, base);
    x = x + digit*base_inv;
    base_inv = base_inv/base;
    seed = floor(seed/base);
  end
  r(k) = x;
  seed2 = seed2 + 1; seed = seed2;
end
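The while-loop above computes the radical inverse of the running seed: the base-b digits of the integer are mirrored about the "decimal" point. As a compact Python sketch of that single step (illustrative, not one of the paper's codes):

```python
def radical_inverse(k, base):
    """Radical inverse of the integer k in the given base -- the k-th
    term of the one-dimensional Halton (van der Corput) sequence."""
    x, inv = 0.0, 1.0 / base
    while k > 0:
        x += (k % base) * inv    # mirror the next digit past the point
        inv /= base
        k //= base
    return x
```

For base 2 the first few terms are 1/2, 1/4, 3/4, 1/8, ..., which is exactly the pattern the Matlab code generates one digit at a time.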

The code for generating the Sobol QR sequence is

function r = sobol(p, q)
% If p=1 and q=500 then 500 QRNs will be generated.
% Call sobol once at the prompt before executing the ant algorithm in
% which rand is replaced by sobol: the first call performs the
% initialization only, which saves repeated computation and hence
% computing time on subsequent calls.
global ip mdeg ix iv in fac
d = 1;
if isempty(ip), d = -1; end
if nargin < 1, p = 1; end
if nargin < 2, q = 1; end
r = zeros(p, q); maxdim = 6; maxbit = 30; pq = p*q;
for k1 = 1:pq,
  if d < 0,
    % one-time setup of the direction numbers
    ip = [0 1 1 2 1 4]; mdeg = [1 2 3 3 4 4];
    ix = zeros(1, 6); iv = zeros(maxdim, maxbit);
    iv(:) = [ones(1,6), 3,1,3,3,1,1, 5,7,7,3,3,5, 15,11,5,15,13,9, zeros(1,156)];
    for k = 1:maxdim,
      for j = 1:mdeg(k), iv(k,j) = iv(k,j)*2^(maxbit-j); end
      for j = (mdeg(k)+1):maxbit
        ipp = ip(k); i = iv(k, j-mdeg(k));
        i = bitxor(i, floor(i/2^mdeg(k)));
        for l = (mdeg(k)-1):-1:1,
          if bitand(ipp, 1) ~= 0, i = bitxor(i, iv(k, j-l)); end;
          ipp = floor(ipp/2);
        end
        iv(k,j) = i;
      end;
    end;
    fac = 1/2^maxbit; in = 0;
  else
    im = in;                  % find the position of the lowest zero bit
    for j = 1:maxbit,
      if bitand(im, 1) == 0, break; end;
      im = floor(im/2);
    end
    im = (j-1)*maxdim;
    for k = 1:min(d, maxdim),
      ix(k) = bitxor(ix(k), iv(im+k)); r(k1) = ix(k)*fac;
    end
    in = in + 1;
  end;
end

The code for generating the Faure QR sequence is

function r = faure(p, q)
% If p=1 and q=500 then 500 QRNs will be generated.
dim = 2; base = 2;
persistent seed C
if isempty(seed), seed = base^4 - 1; end
if nargin < 1, p = 1; end
if nargin < 2, q = 1; end
r = zeros(p, q);
MatDim = 35; C = zeros(MatDim, MatDim);
for y = 1:MatDim,
  for z = y:MatDim, C(y,z) = (dim-1)^(z-1)*nchoosek(z-1, y-1); end,
end
pq = p*q;
for k = 1:pq,
  n = dec2base(seed, base);   % digits of seed as a character array
  i = length(n);
  a = mod(n, base);           % recover the numeric digits from the characters
  a = fliplr(a);
  j = -[1:i]; base_inv = base.^j;
  c = C(1:i, 1:i); bc = c*a';
  b = mod(bc, base);
  x = base_inv*b;
  r(k) = x; seed = seed + 1;
end

The code for generating the Niederreiter QR sequence is

function r = nied1(p, q)
% If p=1 and q=500 then 500 QRNs will be generated.
persistent seed
d = 1;                 % d = dimension
if isempty(seed), seed = 1; end
if nargin < 1, p = 1; end
if nargin < 2, q = 1; end
pq = p*q; r = zeros(p, q);
for k = 1:pq,
  x = seed.*(2.^(d/(1+d)));
  r(k) = x - fix(x);
  seed = seed + 1;
end
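The nied1 recipe is a Weyl-type sequence: it returns the fractional parts of k*2^(d/(d+1)), which for d = 1 is frac(k*sqrt(2)). A short Python sketch of the same formula (illustrative only, not one of the paper's codes):

```python
def nied1(n, d=1):
    """Fractional parts of k * 2^(d/(d+1)) for k = 1..n -- the same
    Weyl-type recipe as the nied1 Matlab code (d = 1 gives frac(k*sqrt(2)))."""
    alpha = 2.0 ** (d / (d + 1))
    return [(k * alpha) % 1.0 for k in range(1, n + 1)]
```

Because 2^(d/(d+1)) is irrational, the fractional parts are equidistributed in [0, 1), which is what gives this one-line generator its low-discrepancy character.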

Matlab code for the Discrepancy Test (provided here for the reader to get a feel for the statistical test of discrepancy; no actual discrepancy test has been included in this paper.)

function [] = discr(N)
% Kolmogorov-Smirnov test for uniformity; N = number of RNs.
x = rand(1, N);
x1 = sort(x);                        % rank the RNs from small to large
dplus = (1:N)/N - x1;                % candidates for D+
dminus = x1 - (0:N-1)/N;             % candidates for D-
d = max(max(dplus), max(dminus));    % test statistic D
critval = 1.36/sqrt(N);              % critical value 1.36/sqrt(N) for alpha = 0.05 when N > 35
fprintf('\n       N          d    critval \n');
fprintf('%8i %10.5f %9.4f \n', N, d, critval);
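The same D+/D- construction translates directly to other languages. A minimal Python sketch (illustrative, not part of the paper's codes) that returns the Kolmogorov-Smirnov statistic for a sample against the uniform distribution on [0, 1]:

```python
def ks_uniform(sample):
    """Kolmogorov-Smirnov statistic D for uniformity on [0, 1],
    using the same D+ / D- construction as the discr Matlab code."""
    xs = sorted(sample)
    n = len(xs)
    d_plus = max((i + 1) / n - x for i, x in enumerate(xs))   # D+
    d_minus = max(x - i / n for i, x in enumerate(xs))        # D-
    return max(d_plus, d_minus)
```

Comparing the returned D against the critical value 1.36/sqrt(N) (alpha = 0.05, N > 35) gives the accept/reject decision printed by the Matlab code.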

Matlab code for the Scatter Plot (provided here so that the reader can visualize the uniformity of a random sequence produced by an RNG by varying the sequence size N; no actual scatter plot has been included in the paper.)

% For QRNGs, replace XX = rand(1,N) by XX = ****1(1,N) for 1-D and
% YY = rand(1,N) by YY = ****2(1,N) for 2-D, where **** should be
% Halton or Faure or Sobol or Niederreiter.
% For PRNGs, however, replace rand throughout by Apple or Ranecu or Whlcg.
function [] = scatterplot(N)
% N = number of RNs
XX = rand(1, N); YY = rand(1, N);
plot(XX, YY, 'k.');
title([num2str(N), ' pairs of random numbers generated by rand in [0, 1]'])

Matlab code for an LCG (different PR sequences are generated by changing the values of a, b, m, and x). This code is included so that the reader can perform numerical experiments and get a first-hand feel for how the values of a, b, m, and x affect the PR sequence.

function r = lcg(n)
% lcg is a linear congruential uniform PRN generator.
% If n=500 then 500 PR numbers will be generated.
% m = modulus, a = multiplier, b = increment, x = initial seed.
persistent a b m x
if isempty(x), a = 5^13; b = 0; m = 2^35; x = 1; end;
r = zeros(1, n);
for k = 1:n, x = rem(a*x + b, m); r(k) = x/m; end;
