General Coupon Collecting Models and Multinomial Games

UNLV Theses, Dissertations, Professional Papers, and Capstones 5-2010 General coupon collecting models and multinomial games James Y. Lee University of Nevada Las Vegas Follow this and additional works at: https://digitalscholarship.unlv.edu/thesesdissertations Part of the Probability Commons Repository Citation Lee, James Y., "General coupon collecting models and multinomial games" (2010). UNLV Theses, Dissertations, Professional Papers, and Capstones. 352. http://dx.doi.org/10.34917/1592231 This Thesis is protected by copyright and/or related rights. It has been brought to you by Digital Scholarship@UNLV with permission from the rights-holder(s). You are free to use this Thesis in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s) directly, unless additional rights are indicated by a Creative Commons license in the record and/ or on the work itself. This Thesis has been accepted for inclusion in UNLV Theses, Dissertations, Professional Papers, and Capstones by an authorized administrator of Digital Scholarship@UNLV. For more information, please contact [email protected]. GENERAL COUPON COLLECTING MODELS AND MULTINOMIAL GAMES by James Y. Lee Bachelor of Science University of California, Los Angeles 2003 Master of Science Creighton University 2005 A thesis submitted in partial fulfillment of the requirements for the Master of Science in Mathematical Sciences Department of Mathematical Sciences College of Sciences Graduate College University of Nevada, Las Vegas May 2010 © Copyright by James Lee 2010 All Rights Reserved THE GRADUATE COLLEGE We recommend the thesis prepared under our supervision by James Y. Lee entitled General Coupon Collecting Models and Multinomial Games be accepted in partial fulfillment of the requirements for the degree of Master of Science in Mathematical Sciences Hokwon Cho, Committee Chair Malwane Ananda, Committee Member Sandra Catlin, Committee Member Evangelos Yfantis, Graduate Faculty Representative Ronald Smith, Ph. D., Vice President for Research and Graduate Studies and Dean of the Graduate College May 2010 ii ABSTRACT General Coupon Collecting Models and Multinomial Games by James Y. Lee Hokwon Cho, Ph.D., Examination Committee Chair Associate Professor of Mathematical Sciences University of Nevada, Las Vegas The coupon collection problem is one of the most studied problems in statistics. It is the problem of collecting r (r<∞) distinct coupons one by one from k different kinds (k<∞) of coupons. We note that this is equivalent to the classical occupancy problem which involves the random allocation of r distinct balls into k distinct cells. Although the problem was first introduced centuries ago, it is still actively investigated today. Perhaps its greatest feature is its versatility, numerous approaches, and countless variations. For this reason, we are particularly interested in creating a classification system for the many generalizations of the coupon collection problem. In this thesis, we will introduce models that will be able to categorize these generalizations. In addition, we calculate the waiting time for the models under consideration. Our approach is to use the Dirichlet Type II integral. We compare our calculations to the ones obtained through Monte Carlo simulation. Our results will show that our models and the method used to find the waiting times are ideal for solving problems of this type. iii TABLE OF CONTENTS ABSTRACT……………………………………………………………………............... iii LIST OF TABLES………………………………………………………………………... v LIST OF FIGURES……………………………………………………………………… vi ACKNOWLEDGEMENTS……………………………………………………………... vii CHAPTER 1 INTRODUCTION………………………………………………... 1 1.1 Motivation………………………………………………………………...….. 1 1.2 Assumptions and Definitions………………………………………………… 2 1.3 Probability Tools……………………………………………………………... 2 CHAPTER 2 COUPON COLLECTION MODELS……………………………..8 2.1 Classic Coupon Collecting Problem……………………………………..…… 8 2.2 Formulation of Basic Models………………………………………………… 9 CHAPTER 3 EXPECTED WAITING TIMES………………………………....16 3.1 E[X] for the Classic Coupon Collection Problem…………………………... 16 3.2 E[X] for Basic Models……………………………………………..……...... 19 CHAPTER 4 SUMMARY AND CONCLUSION…………………………….. 23 4.1 Simulation……………………………………………………………...……. 23 4.2 Summary of Results………………………………………………………… 28 4.3 Further Study…………………………………………………………..……. 28 4.4 Conclusion……………………………………………………………...…... 29 REFERENCES………………………………………………………………………..… 30 VITA…………………………………………………………………………………….. 32 iv LIST OF TABLES Table 3.1 E(WT) for OMM under EPC.....................................................……………… 19 Table 3.2 E(WT) for OMM under SSC, p[1] = 1/10………………………...…………… 21 Table 3.3 E(WT) for OMM under SSC, p[1] = 1/20…………………………………...… 22 Table 3.4 E(WT) for OMM under EPC, 2 complete sets……...………………………... 22 Table 4.1 Simulation results for OMM under EPC……………………………...……… 23 Table 4.2 Simulation results for OMM under SSC, p[1] = 1/10..……………..…………. 25 Table 4.3 Simulation results for OMM under SSC, p[1] = 1/20…………………………. 26 Table 4.4 Simulation results for OMM under EPC, 2 complete sets.…………………… 27 v LIST OF FIGURES Figure 2.1 Cell probabilities under EPC, 6 cells…….……..……………………………. 10 Figure 2.2 Cell probabilities under EPC, k cells…...……………………………………. 10 Figure 2.3 Cell probabilities under SSC, k = 6, Δ = 4/50…………………….…………. 11 Figure 2.4 Cell probabilities under SSC, general case...…………………………………12 Figure 2.5 Cell probabilities under SSC, 2 cells………………………………………… 12 Figure 2.6 Cell probabilities under MSC, one cell each....……………………………… 13 Figure 2.7 Cell probabilities under MSC, two cells…...…………………………………13 vi ACKNOWLEDGEMENTS I would like to thank everyone who has helped me during this journey. I especially would like to thank the UNLV Department of Mathematical Sciences for giving me this opportunity to receive my degree. I am extremely grateful for the guidance and supervision of my advisor, Dr. Hokwon Cho. You have pushed me and taught me how to become a better person. I couldn’t have finished without your support. In addition, I would like to extend my gratitude to the various faculty from whom I have learned a great deal, especially the members of my thesis committee. Lastly, I would like to thank my family for never giving up on me and staying with me to see the finish. Special thanks to my wife Jinhee for always believing in me. vii CHAPTER 1 INTRODUCTION 1.1 Motivation of the Problem The coupon collecting problem is one of the most well known problems among probability and statistics. It has been studied extensively ever since it was first formulated by many mathematicians and statisticians (Hald, 1984). It is still actively studied. We assume that there are k (<∞) distinct coupons to collect and the probability of collecting a coupon of type i (i = 0,1,…,k) is non-zero, and coupons are obtained one at a time. We are interested in the waiting time that represents the number of coupons until we have collected one of each. Generalizations of this problem include collecting a subset, collecting at least two of each coupon, and many others. The goal of this thesis is to classify or model these generalizations into more detailed and appropriate categories. In addition, we will find the expected waiting time to collect the coupons for each model we introduce. The coupon collection problem can be explained via a multinomial distribution. In general, the waiting time of a sequential decision problem such as the coupon collection problem can be found using the incomplete Dirichlet integrals. We will use Monte Carlo simulation and compare these to the expected waiting time. 1.2 Assumptions and Definitions Suppose that there are k (<∞) distinct types of coupons to collect. Denote the k probability of collecting a coupon of type i by p , where p 1. Then a complete set i i1 i refers to all k distinct objects in the set. A subset is any part of the complete set. A 1 singleton is an object that appears once and only once in the set. For example, if we roll a fair six-sided die, then the best case scenario would be to see all six faces of the die exactly once. Any extra object beyond the complete set is referred to as a surplus. 1.3 Some Important Probability Distributions In this section, we introduce some important probability distributions. Definition 1.3.1 (Binomial Distribution) A random variable X is said to have a binomial probability distribution with parameters (n,p) if the probability mass function is given by: n m n m p(1 p ) , m 0,1,... n P() X m m (1.1) 0, otherwise. We denote this by X~ Bin(n,p), and E( X ) np , and Var( X ) np (1 p ). Definition 1.3.2 (Multinomial Distribution) The Multinomial distribution is a generalization of the binomial distribution. Consider n independent trials ()n , where each trial results in one of k mutually exclusive outcomes, and each outcome has a k probability p of occurring, where p 1 and 0p 1, i = 1,..., k . Let Yi,n be the i i1 i i number of outcomes falling in cell i (1 ≤ i ≤ k) after n observations. It follows that 2 k 0 ≤ Yi,n ≤ n and Yn . Then a random vector Y is said to have the multinomial i1 in, probability mass function n! P( Y y ,... Y y ) pxx12 p ... pxk . 1 1k ky! y !... y ! 1 2 k 12 k (1.2) with parameters n and p = (p1 ,…,pk ). Definition 1.3.3 (Beta Distribution) The Beta distribution is a continuous distribution with the probability distribution function 1 11 x(1 x ) , 0 x 1, 0, 0 fx( | , ) B(,) 0, elsewhere, (1.3) where B(,) represents the beta function, 1 B( , ) x11 (1 x ) dx , 0 and E[ X ] , and Var[ X ] . ( )2 ( 1) 3 The Beta distribution is closely related to the binomial distribution. The role of the random variable is reversed in the binomial and the beta distribution. Second, consider a problem of the following: let X~ Bin ( n , p ) and we wish to calculate P( X m ) or P( X m ), such that we have n n m n m P( X m ) 1 P ( X m ) p (1 p ) .

General Coupon Collecting Models and Multinomial Games

Urns Filled with Bitcoins: New Perspectives on Proof-Of-Work Mining

What Is the Entropy of a Social Organization?

Ballot Theorems, Old and New

ABSTRACTS (In Order of Presentation) INVITED TALK 1: MARK NEWMAN, UNIVERSITY 0F MICHIGAN Estimating Structure in Networks from Complex Or Uncertain Data

Shuffling Large Decks of Cards and the Bernoulli-Laplace Urn Model

On (Multi)-Collision Times

The Block-Constrained Configuration Model Giona Casiraghi

Concentration Inequalities and Martingale Inequalities: a Survey

Nonlinear Pólya Urn Models and Self-Organizing Processes Tong

Electoral Voting and Population Distribution in the United States Paul Kvam University of Richmond, [email protected]

Polya Urn Model

On Polya-Friedman Random Walks Thierry Huillet