
Problems with the Use of Computers for Selecting Jury Panels
Author(s): George Marsaglia
Source: Jurimetrics, Vol. 41, No. 4 (SUMMER 2001), pp. 425-427
Published by: American Bar Association
Stable URL: http://www.jstor.org/stable/29762720
Accessed: 02-06-2017 15:03 UTC


American Bar Association is collaborating with JSTOR to digitize, preserve and extend access to Jurimetrics

Problems with the Use of Computers for Selecting Jury Panels

The idea of random selection, choosing by lot, is rooted in history and [...]. In Ancient Greece, pieces of wood, "lots," each bearing the mark of a competitor, were placed in a helmet and drawn to determine choice battle assignments, division of plunder, and the like. A provision of Lex Pompeia Provinciea required that governors of Roman provinces be chosen by lot from eligible ex-consuls. Grafton reports that choice of exiles from Germany was made by "a maner & sort of a Lot sundrie times used in the sayde lande."1 According to Plato, "[t]he ancients knew that election by lot was the most democratic of all modes of appointment."2

The tradition of choosing by lot continues, but the difficulty of collecting thousands of lots in a large "helmet" makes the task more readily suited to computer automation. Thus, a Florida statute authorizes use of computers for choosing jury venires, if such drawing is "by lot and at random" by a method approved by the Florida Supreme Court.3 Most state codes have similar provisions.4

However, a problem plagues most attempts to use computers for choosing jury venires. Virtually all schemes for random selection by computer are based on a random number generator (RNG). A RNG is merely a set of computer instructions that combines an initial set of random values, called seeds, in a deterministic way to produce the desired result. Thus, the randomness of a random number generator is determined by the randomness of its input seed values. These are presumed to be chosen uniformly and independently from an available set of roughly 10-digit numbers. In any case, the number of possible

1. Richard Grafton, A Chronicle at Large and Meere History of the Affayres of England 95 (1568).
2. 5 Benjamin Jowett, The Dialogues of Plato 125 (2d ed. 1871).
3. Fla. Stat. ch. 40.225 (2000).
4. E.g., Ariz. Rev. Stat. §§ 21-312, 313 (2001); Me. Rev. Stat. Ann. tit. 14, §§ 1252-C, 1253-A (West 1999); Or. Rev. Stat. § 54.060 (1999); Pa. Cons. Stat. § 4525 (2001); Tex. Gov't Code Ann. § 62.011 (Vernon 2000).



outcomes from a particular RNG cannot exceed the number of possible choices for seeds. This last point is cause for concern in the use of random number generators for selecting jury venires.

To illustrate with an example that requires much smaller numbers than those we must ultimately consider, suppose we are to use a RNG to choose a ticket of six numbers from 1, 2, ..., 49, as in the Florida lottery. Suppose we will use the computer clock to set the seed for the RNG (a common practice), and the current clock value is stored in a 16-bit register. There are only 65,536 possible values for the seed, and thus we can reach at most 65,536 of the 13,983,816 possible lottery tickets. Thus, even if the 16-bit seed were a good uniformly random selection from the set of 65,536 (which it is not), we cannot consider that the entire selection was by lot and at random, since millions of possible tickets could never be selected by a seed with such limited possibilities.

Now turn to more realistic numbers, for example, the choice of 80 potential jurors from a list of 200 eligibles. There are 1,647,278,652,451,762,678,788,128,833,110,870,712,983,038,446,517,480,945,400 ways to select such a panel "by lot and at random." Because this is a fifty-seven-digit number, we would require a random number generator that uses six ten-digit seeds. (For comparison, a truly random shuffle of a deck of fifty-two cards would require a set of seven ten-digit seeds, or seven RNGs each using one seed, since the number of possible shuffles is a sixty-seven-digit number.) For a more extreme example, consider choosing a venire of 1,200 from a list of 500,000 eligibles in Palm Beach County. The totality of such choices is a number of 3,662 digits. It would require a random number generator with 367 ten-digit seeds (or 367 different RNGs, each having a single seed) to provide selection that was truly random and by lot.
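
The counts quoted above can be checked with exact integer arithmetic. The following Python sketch is an illustration only and is not part of the letter: the helper names `power_of_ten` and `ten_digit_seeds_needed` are hypothetical, and it assumes that each seed is drawn from 10^10 equally likely values. It reports the power of ten of each count, which is what the figures 57, 67, and 3,662 correspond to; the decimal expansions themselves are one digit longer.

```python
import math


def power_of_ten(count: int) -> int:
    """Return floor(log10(count)), read off exactly from the decimal string."""
    return len(str(count)) - 1


def ten_digit_seeds_needed(count: int) -> int:
    """Smallest s with (10**10)**s >= count, so s seeds can index every outcome."""
    s = 0
    while (10 ** 10) ** s < count:
        s += 1
    return s


# Lottery illustration: a 16-bit clock seed versus all six-number tickets from 49.
tickets = math.comb(49, 6)
reachable = 2 ** 16
print(f"{reachable:,} of {tickets:,} tickets reachable ({reachable / tickets:.2%})")

# The three selection problems mentioned in the text.
cases = {
    "80 jurors from 200": math.comb(200, 80),
    "shuffles of 52 cards": math.factorial(52),
    "1,200 jurors from 500,000": math.comb(500_000, 1_200),
}
for label, count in cases.items():
    print(f"{label}: about 10^{power_of_ten(count)}, "
          f"needs {ten_digit_seeds_needed(count)} ten-digit seeds")
```

Working with Python's arbitrary-precision integers avoids any floating-point rounding in the counts; only the percentage in the first line of output is a floating-point value.
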
Can we select, by lot and at random, from a collection so large that enumeration requires a number of several hundred or even several thousand digits? Or, can we interpret "by lot and at random" so that selection from a reasonably large proportion of the possible selections would still be deemed to have met the requirements of the statute? The latter solution seems unsatisfactory. Just as the player of a casino poker machine that displays forty hands of poker is entitled to the chance that his forty hands will all be straight flushes,5 a litigant should be entitled to the possible selection, however remote, of a preponderance of jurors who might favor his case, even though there may be only a few in a hundred. Furthermore, selection "by lot and at random" can be accomplished, but it requires more than the casual assignment of a random ten-digit seed value as practiced in many [...], or worse, choice of a seed by

5. Such a consideration led to a Michigan Game Control Board's requiring multiple-seed RNGs for certain gaming machines. This finding of Nov. 7, 2000, by Mark Robinson and Pat Leen of the Michigan Game Control Lab, is one of the first to point out the inadequacies of single-seed RNGs for some applications.


Letter to the Editor

a method obscured in some proprietary software code but ultimately dependent on the computer clock, and unverifiable after the fact. A multiple-seed RNG or many single-seed RNGs are available. We need only provide a satisfactory set of seeds, as many as the truly "by lot and at random" requirement calls for.

I have previously recommended that seed selection should be a well-defined procedure that is specified before the day of selection.6 The seed should be determined on or after that day from events that will be unpredictable before, but will become publicly available afterward. For example, to choose a single seed, we can specify today that the digits of the fourth rightmost column of sales for next Tuesday's ten most active stocks on the New York Stock Exchange will be used. This value will be available from most daily newspapers next Wednesday, but it is virtually unpredictable today. Alternatively, a drawing in the Florida lottery can be identified with its position in an enumeration of the possibilities 1, 2, 3, ..., 13983816. Thus, for example, the lottery draw 5, 16, 28, 34, 38, 43 might provide the index 8463225, corresponding to its position in that list. That index might serve as a random seed, unpredictable before the drawing but a matter of record afterward.

Such procedures are used in some counties, but at most for a few seed values. Hundreds of seed values could be provided for all counties by having the Office of the State Courts Administrator maintain a website at which hundreds of random seed values would be available weekly or even daily.

In short, implementation of a computer method for selection by lot and at random should:

• Use a random number generator that requires many seeds, certainly enough that the selection procedure will be able to provide every possible choice.7
• The randomness of the selections comes from the randomness of the seeds, since most RNGs produce output that is a fixed function of the seed values. The seeds should be chosen in a predetermined manner from events that are unpredictable but can be documented after they occur.

The number of possible selections must not exceed the number of possible choices for seed values. To find the number of 32-bit seeds necessary to choose, at random and by lot, a venire of k from a list of n eligibles, form

x = (n + .5) ln(n) - (k + .5) ln(k) - (n - k + .5) ln(n - k),

then take the first whole number greater than .045*x - .041. For example, with n = 200 and k = 80, this expression yields x = 132.7, with .045*x - .041 = 5.93, so that a RNG with six or more seeds would be necessary.

George Marsaglia
Professor Emeritus of Pure and Applied Mathematics, Computer Science and Statistics
Washington State and Florida State Universities
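
The rule of thumb for 32-bit seeds given in the letter can be compared with an exact count. The following Python sketch is an illustration under stated assumptions and is not the author's program: the function names `seeds_by_rule` and `seeds_exact` are hypothetical, and the exact version assumes that each 32-bit seed contributes 2^32 equally likely values. The constants in the rule are consistent with Stirling's approximation to ln C(n, k), since 1/(32 ln 2) ≈ 0.045 and ln(2π)/(64 ln 2) ≈ 0.041.

```python
import math


def seeds_by_rule(n: int, k: int) -> int:
    """Number of 32-bit seeds according to the rule of thumb (needs 0 < k < n)."""
    x = ((n + 0.5) * math.log(n)
         - (k + 0.5) * math.log(k)
         - (n - k + 0.5) * math.log(n - k))
    # "First whole number greater than" the expression .045*x - .041.
    return math.floor(0.045 * x - 0.041) + 1


def seeds_exact(n: int, k: int, seed_bits: int = 32) -> int:
    """Smallest s with 2**(seed_bits * s) >= C(n, k), using exact integers."""
    count = math.comb(n, k)
    bits = (count - 1).bit_length()  # bits needed to index `count` outcomes
    return -(-bits // seed_bits)     # ceiling division


# Worked example from the letter: a venire of 80 from a list of 200.
print(seeds_by_rule(200, 80), seeds_exact(200, 80))
```

Because the constants in the rule are rounded to two significant figures, the rule and the exact count may differ slightly for very large lists; the exact version is the safer check when the two disagree.
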

6. Memoranda from George Marsaglia, to Office of Florida State Courts Administrator (Feb. 5, 2001 and May 29, 2001) (on file with author).
7. Use of many single-seed RNGs is feasible, but more complicated. Short C programs that have passed all tests of randomness and use 256, 512, 1019, 1024 or 3068 seeds are available from the author; many of these can be found by searching for "Marsaglia + Random" on Internet search engines.

