Problems with the Use of Computers for Selecting Jury Panels
George Marsaglia
Jurimetrics, Vol. 41, No. 4 (Summer 2001), pp. 425-427. Published by the American Bar Association. Stable URL: http://www.jstor.org/stable/29762720

The idea of random selection, choosing by lot, is rooted in history and law. In Ancient Greece, pieces of wood, "lots," each bearing the mark of a competitor, were placed in a helmet and drawn to determine choice battle assignments, division of plunder, and the like. A provision of the Lex Pompeia Provinciea required that governors of Roman provinces be chosen by lot from eligible ex-consuls. Grafton reports that the choice of exiles from Germany was made by "a maner & sort of a Lot sundrie times used in the sayde lande."1 According to Plato, "[t]he ancients knew that election by lot was the most democratic of all modes of appointment."2

The tradition of choosing by lot continues, but the difficulty of collecting thousands of lots in a large "helmet" makes the task more readily suited to computer automation. Thus, a Florida statute authorizes the use of computers for choosing jury venires, provided the drawing is "by lot and at random" by a method approved by the Florida Supreme Court.3 Most state codes have similar provisions.4

However, a problem plagues most attempts to use computers for choosing jury venires. Virtually all schemes for random selection by computer are based on a random number generator (RNG). An RNG is merely a set of computer instructions that combines an initial set of random numbers, called seeds, in a deterministic way to produce the desired result. Thus, the randomness of a random number generator is determined by the randomness of its input seed values. These are presumed to be chosen uniformly and independently from an available set of roughly 10-digit numbers. In any case, the number of possible outcomes from a particular RNG cannot exceed the number of possible choices for seeds. This last point is cause for concern in the use of random number generators for selecting jury venires.

1. Richard Grafton, A Chronicle at Large and Meere History of the Affayres of England 95 (1568).
2. 5 Benjamin Jowett, The Dialogues of Plato 125 (2d ed. 1871).
3. Fla. Stat. ch. 40.225 (2000).
4. E.g., Ariz. Rev. Stat. §§ 21-312, -313 (2001); Me. Rev. Stat. Ann. tit. 14, §§ 1252-C, 1253-A (West 1999); Or. Rev. Stat. § 54.060 (1999); Pa. Cons. Stat. § 4525 (2001); Tex. Gov't Code Ann. § 62.011 (Vernon 2000).
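To make that last point concrete, here is a minimal sketch, in Python and using its built-in Mersenne Twister purely as a stand-in (the article names no particular generator, and the panel-drawing procedure here is hypothetical): because a seeded RNG is a fixed function of its seed, the procedure can never produce more distinct panels than there are distinct seed values.

```python
import random

# Illustrative only: Python's built-in generator stands in for whatever RNG a
# clerk's office might actually use; the drawing procedure is hypothetical.
def draw_panel(seed, eligible, panel_size):
    rng = random.Random(seed)        # all "randomness" enters through `seed`
    return sorted(rng.sample(eligible, panel_size))

eligible = list(range(1, 201))       # 200 eligibles, numbered 1..200

# The same seed always yields the same panel...
assert draw_panel(12345, eligible, 80) == draw_panel(12345, eligible, 80)

# ...so the number of distinct panels this procedure can ever return is
# bounded by the number of distinct seed values, not by C(200, 80).
```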
To illustrate with an example that requires much smaller numbers than those we must ultimately consider, suppose we are to use an RNG to choose a lottery ticket of six numbers from 1, 2, ..., 49, as in the Florida lottery. Suppose we will use the computer clock to set the seed for the RNG (a common practice), and the current clock value is stored in a 16-bit register. There are only 65,536 possible values for the seed, and thus we can select at most 65,536 of the possible 13,983,816 lottery tickets. Thus, even if the 16-bit seed were a good uniformly random selection from the set of 65,536 (which it is not), we cannot consider that the entire selection was by lot and at random, since millions of possible tickets could never be selected by a seed with such limited possibilities.

Now turn to more realistic numbers: for example, the choice of 80 potential jurors from a list of 200 eligibles. There are 1,647,278,652,451,762,678,788,128,833,110,870,712,983,038,446,517,480,945,400 ways to select such a panel "by lot and at random." Because this is a fifty-eight-digit number, we would require a random number generator that uses six ten-digit seeds. (For comparison, a truly random shuffle of a deck of fifty-two cards would require a set of seven ten-digit seeds, or seven RNGs each using one seed, since the number of possible shuffles is a sixty-eight-digit number.) For a more extreme example, consider choosing a venire of 1,200 from a list of 500,000 eligibles in Palm Beach County. The totality of such choices is a number of 3,662 digits. It would require a random number generator with 367 ten-digit seeds (or 367 different RNGs, each having a single seed) to provide selection that was truly random and by lot.

Can we select, by lot and at random, from a collection so large that enumeration requires a number of several hundred or even several thousand digits? Or can we interpret "by lot and at random" so that selection from a reasonably large proportion of the possible selections would still be deemed to have met the requirements of the statute? The latter solution seems unsatisfactory. Just as the player of a casino poker machine that displays forty hands of poker is entitled to the chance that his forty hands will all be straight flushes,5 a litigant should be entitled to the possible selection, however remote, of a preponderance of jurors who might favor his case, even though their frequency may be only a few in a hundred. Furthermore, selection by "lot and at random" can be accomplished, but it requires more than the casual assignment of a random ten-digit seed value as practiced in many jurisdictions, or, worse, choice of a seed by a method obscured in some proprietary software code but ultimately dependent on the computer clock, and unverifiable after the fact. A multiple-seed RNG or many single-seed RNGs are available. We need only provide a satisfactory set of seeds, as many as the truly "by lot and at random" requirement calls for.

5. Such a consideration led to the Michigan Game Control Board's requiring multiple-seed RNGs for certain gaming machines. This finding of Nov. 7, 2000, by Mark Robinson and Pat Leen of the Michigan Game Control Lab, is one of the first to point out the inadequacies of single-seed RNGs for some applications.
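The counts quoted above are easy to reproduce. The sketch below, a Python illustration in which a 16-bit seed and the language's built-in generator stand in for the clock-seeded software the text describes, tallies how many distinct 6-of-49 tickets all 65,536 possible seeds can reach, then computes the sizes of the jury-selection spaces and the number of ten-digit seeds each would require.

```python
import math
import random

# How many 6-of-49 tickets exist, and how many are reachable from a 16-bit seed?
total_tickets = math.comb(49, 6)                    # 13,983,816
reachable = {tuple(sorted(random.Random(s).sample(range(1, 50), 6)))
             for s in range(2**16)}                 # one ticket per possible seed
print(total_tickets, len(reachable))                # at most 65,536 tickets reachable

# Sizes of the selection spaces discussed in the text.
panels = math.comb(200, 80)                         # 58-digit number
venires = math.comb(500_000, 1_200)                 # 3,662-digit number
print(len(str(panels)), len(str(venires)))

# Smallest number of ten-digit seeds whose combinations cover a space this large.
def ten_digit_seeds_needed(count):
    m = 0
    while 10 ** (10 * m) < count:
        m += 1
    return m

print(ten_digit_seeds_needed(panels))               # 6
print(ten_digit_seeds_needed(venires))              # 367
```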
I have previously recommended that seed selection should be a well-defined procedure that is specified before the day of selection.6 It should be determined on or after that day from data that will be unpredictable before, but will become publicly available afterward. For example, to choose a single seed, we can specify today that the digits of the fourth rightmost column of sales for next Tuesday's ten most active stocks on the New York Stock Exchange will be used. This value will be available from most daily newspapers next Wednesday, but it is virtually unpredictable today. Alternatively, a drawing in the Florida lottery can be identified with its position in an enumeration of the possibilities 1, 2, 3, ..., 13,983,816. Thus, for example, the lottery draw 5, 16, 28, 34, 38, 43 might provide the index 8,463,225, corresponding to its position in that list. That index might serve as a random seed, unpredictable before the drawing but a matter of record afterward. Such procedures are used in some counties, but at most for a few seed values. Hundreds of seed values could be provided for all counties by having the Office of the State Courts Administrator maintain a website at which hundreds of random seed values would be available weekly or even daily.

In short, implementation of a computer method for selection by lot and at random should:

• Use a random number generator that requires many seeds, certainly enough that the selection procedure will be able to provide every possible choice.7
• Choose the seeds in a predetermined manner from events that are unpredictable but can be documented after they occur. The randomness of the selections comes from the randomness of the seeds, since most RNGs produce output that is a fixed function of the seed values.

The number of possible selections must not exceed the number of possible choices for seed values. To find the number of 32-bit seeds necessary to choose, at random and by lot, a venire of k from a list of n eligibles, form x = (n + .5)ln(n) - (k + .5)ln(k) - (n - k + .5)ln(n - k), then take the first whole number greater than .045x - .041. For example, with n = 200 and k = 80, this expression yields x = 132.7, with .045x - .041 = 5.93, so that an RNG with six or more seeds would be necessary.
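Both procedures just described can be written down in a few lines. The Python sketch below first ranks a lottery draw within an enumeration of all C(49, 6) possibilities, so the drawn ticket can serve as a documented seed, and then implements the rule of thumb for the number of 32-bit seeds, checking it against the exact binomial count. The lexicographic ordering is my assumption: the text does not say which enumeration yields the index 8,463,225 it quotes, so this function will in general return a different index for the same draw.

```python
import math

# 1. Position of a lottery draw in a lexicographic enumeration of all C(49, 6)
#    possible draws (assumed ordering; the article's own index may differ).
def lex_rank(draw, n=49):
    draw = sorted(draw)
    k = len(draw)
    rank, prev = 0, 0
    for i, v in enumerate(draw):
        for skipped in range(prev + 1, v):           # draws that sort before this one
            rank += math.comb(n - skipped, k - i - 1)
        prev = v
    return rank + 1                                  # 1-based position in the list

print(lex_rank([5, 16, 28, 34, 38, 43]))             # a matter of public record once drawn

# 2. The rule of thumb for the number of 32-bit seeds needed to choose a venire
#    of k from n eligibles, checked against the exact count.
def seeds_approx(n, k):
    x = (n + .5) * math.log(n) - (k + .5) * math.log(k) - (n - k + .5) * math.log(n - k)
    return math.floor(.045 * x - .041) + 1           # first whole number above .045x - .041

def seeds_exact(n, k):
    return math.ceil(math.comb(n, k).bit_length() / 32)

print(seeds_approx(200, 80), seeds_exact(200, 80))   # both give 6
```

With n = 200 and k = 80, the approximation and the exact count both call for six 32-bit seeds, matching the figure in the text.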