Poisson Process and Poisson Random Measure


Appendix A  Poisson Process and Poisson Random Measure

Reference [130] is the main source for the material on the Poisson process and the Poisson random measure.

A.1 Definitions

Definition A.1. Let $(\Omega,\mathcal{F},P)$ be a probability space, and let $S$ be a locally compact, separable metric space with Borel $\sigma$-algebra $\mathcal{B}$. Let $\mathscr{S}$ denote the collection of all countable subsets of $S$. A Poisson process $\Pi$ with state space $S$, defined on $(\Omega,\mathcal{F},P)$, is a map from $(\Omega,\mathcal{F},P)$ to $\mathscr{S}$ satisfying:

(a) for each $B$ in $\mathcal{B}$, $N(B) = \#\{\Pi \cap B\}$ is a Poisson random variable with parameter $\mu(B) = E[N(B)]$;

(b) for disjoint sets $B_1,\dots,B_n$ in $\mathcal{B}$, the random variables $N(B_1),\dots,N(B_n)$ are independent.

If $B_1, B_2, \dots$ are disjoint, then, by definition, we have

$$N\Big(\bigcup_{i=1}^{\infty} B_i\Big) = \sum_{i=1}^{\infty} N(B_i), \qquad \mu\Big(\bigcup_{i=1}^{\infty} B_i\Big) = \sum_{i=1}^{\infty} \mu(B_i).$$

Hence $N$ is a random measure and $\mu$ is a measure, both on $(S,\mathcal{B})$. The random measure $N$ can also be written in the following form:

$$N = \sum_{\varsigma \in \Pi} \delta_{\varsigma}.$$

Definition A.2. The measure $\mu$ is called the mean measure of the Poisson process $\Pi$, and $N$ is called a Poisson random measure associated with the Poisson process $\Pi$. The measure $\mu$ is also called the mean measure of $N$.

Remark: Let $S = [0,\infty)$ and let $\mu$ be Lebesgue measure on $S$. Then the Poisson random measure associated with the Poisson process with state space $S$ and mean measure $\mu$ is just the one-dimensional, time-homogeneous Poisson process defined as a pure-birth Markov chain with birth rate one. The random set $\Pi$ is composed of all jump times of the process.

Definition A.3. Assume that $S = \mathbb{R}^d$ for some $d \geq 1$. Then the mean measure is also called the intensity measure. If there exists a positive constant $c$ such that for any measurable set $B$,

$$\mu(B) = c\,|B|, \qquad |B| = \text{Lebesgue measure of } B,$$

then the Poisson process is said to be homogeneous with intensity $c$.

A.2 Properties

Theorem A.1. Let $\mu$ be the mean measure of a Poisson process with state space $S$. Then $\mu$ is diffuse; i.e., for every $x$ in $S$, $\mu(\{x\}) = 0$.

Proof. For any fixed $x$ in $S$, set $a = \mu(\{x\})$. Since $\Pi$ is a set of points, $N(\{x\})$ can only take the values 0 and 1; hence, by definition,

$$P\{N(\{x\}) = 2\} = \frac{a^2}{2}\,e^{-a} = 0,$$

which leads to the result. □

The next theorem describes the close relation between a Poisson process and the multinomial distribution.

Theorem A.2. Let $\Pi$ be a Poisson process with state space $S$ and mean measure $\mu$. Assume that the total mass $\mu(S)$ is finite. Then, for any $n \geq 1$, $1 \leq m \leq n$, and any set partition $B_1,\dots,B_m$ of $S$, the conditional distribution of the random vector $(N(B_1),\dots,N(B_m))$ given $N(S) = n$ is a multinomial distribution with parameters $n$ and

$$\Big(\frac{\mu(B_1)}{\mu(S)},\dots,\frac{\mu(B_m)}{\mu(S)}\Big).$$

Proof. For any nonnegative integers $n_1,\dots,n_m$ with $n_1 + \cdots + n_m = n$,

$$
\begin{aligned}
P\{N(B_1)=n_1,\dots,N(B_m)=n_m \mid N(S)=n\}
&= \frac{P\{N(B_1)=n_1,\dots,N(B_m)=n_m\}}{P\{N(S)=n\}} \\
&= \frac{\prod_{i=1}^m e^{-\mu(B_i)}\,\mu(B_i)^{n_i}/n_i!}{\mu(S)^n e^{-\mu(S)}/n!} \\
&= \binom{n}{n_1,\dots,n_m} \prod_{i=1}^m \Big(\frac{\mu(B_i)}{\mu(S)}\Big)^{n_i}. \qquad \Box
\end{aligned}
$$

Theorem A.3. (Restriction and union) (1) Let $\Pi$ be a Poisson process with state space $S$. Then, for every $B$ in $\mathcal{B}$, $\Pi \cap B$ is a Poisson process with state space $S$ and mean measure $\mu_B(\cdot) = \mu(\cdot \cap B)$. Equivalently, $\Pi \cap B$ can also be viewed as a Poisson process with state space $B$ whose mean measure is the restriction of $\mu$ to $B$.
(2) Let $\Pi_1$ and $\Pi_2$ be two independent Poisson processes with state space $S$ and respective mean measures $\mu_1$ and $\mu_2$. Then $\Pi_1 \cup \Pi_2$ is a Poisson process with state space $S$ and mean measure $\mu = \mu_1 + \mu_2$.

Proof. Direct verification of the definition of a Poisson process. □
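Theorem A.2 also underlies the standard two-step recipe for simulating a Poisson process with finite mean measure: first draw $N(S) \sim \mathrm{Poisson}(\mu(S))$, then scatter $N(S)$ i.i.d. points with distribution $\mu(\cdot)/\mu(S)$. The following minimal Python sketch is our own illustration (not from [130]), assuming NumPy is available; the helper name sample_homogeneous_pp is ours. It applies the recipe to a homogeneous process of Definition A.3 on a box and empirically checks property (a) of Definition A.1.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_homogeneous_pp(c, a, b, d, rng):
    """Sample a homogeneous Poisson process with intensity c on the box [a, b]^d:
    draw N(S) ~ Poisson(mu(S)), then, by Theorem A.2, scatter N(S) i.i.d.
    points with distribution mu/mu(S), which here is uniform on the box."""
    mu_S = c * (b - a) ** d              # total mean measure of the box
    n = rng.poisson(mu_S)
    return rng.uniform(a, b, size=(n, d))

# Empirical check of Definition A.1(a): for B = [0, 1/2)^2 inside [0, 1]^2,
# N(B) should be Poisson with parameter mu(B) = c / 4.
c, d = 10.0, 2
counts = []
for _ in range(20000):
    pts = sample_homogeneous_pp(c, 0.0, 1.0, d, rng)
    counts.append(np.sum(np.all(pts < 0.5, axis=1)))
counts = np.asarray(counts)
print(counts.mean(), counts.var())       # both should be close to c/4 = 2.5
```

The printed mean and variance should both be close to $\mu(B) = c/4 = 2.5$, consistent with $N(B)$ being Poisson.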
The remaining theorems in this section are stated without proof. The details can be found in [130].

Theorem A.4. (Mapping) Let $\Pi$ be a Poisson process with state space $S$ and $\sigma$-finite mean measure $\mu(\cdot)$. Consider a measurable map $h$ from $S$ to another locally compact, separable metric space $S'$. If the measure $\mu'(\cdot) = \mu(h^{-1}(\cdot))$ is diffuse, then

$$h(\Pi) = \{h(\varsigma) : \varsigma \in \Pi\}$$

is a Poisson process with state space $S'$ and mean measure $\mu'$.

Theorem A.5. (Marking) Let $\Pi$ be a Poisson process with state space $S$ and mean measure $\mu$. The mark of each point $\varsigma$ in $\Pi$, denoted by $m_\varsigma$, is a random variable taking values in a locally compact, separable metric space $S'$, with distribution $q(\varsigma,\cdot)$. Assume that:
(1) for every measurable set $B$ in $S'$, $q(\cdot,B)$ is a measurable function on $(S,\mathcal{B})$;
(2) given $\Pi$, the random variables $\{m_\varsigma : \varsigma \in \Pi\}$ are independent;
then

$$\tilde{\Pi} = \{(\varsigma, m_\varsigma) : \varsigma \in \Pi\}$$

is a Poisson process with state space $S \times S'$ and mean measure $\tilde{\mu}(dx,dm) = \mu(dx)\,q(x,dm)$. The Poisson process $\tilde{\Pi}$ is aptly called a marked Poisson process.

Theorem A.6. (Campbell) Let $\Pi$ be a Poisson process on the space $(S,\mathcal{B})$ with mean measure $\mu$. Then for any non-negative measurable function $f$,

$$E\Big[\exp\Big(-\sum_{\varsigma \in \Pi} f(\varsigma)\Big)\Big] = \exp\Big(\int_S \big(e^{-f(x)} - 1\big)\,\mu(dx)\Big).$$

If $f$ is a real-valued measurable function on $(S,\mathcal{B})$ satisfying

$$\int_S \min(|f(x)|, 1)\,\mu(dx) < \infty,$$

then for any complex number $\lambda$ such that the integral

$$\int_S \big(e^{\lambda f(x)} - 1\big)\,\mu(dx)$$

converges, we have

$$E\Big[\exp\Big(\lambda \sum_{\varsigma \in \Pi} f(\varsigma)\Big)\Big] = \exp\Big(\int_S \big(e^{\lambda f(x)} - 1\big)\,\mu(dx)\Big).$$

Moreover, if

$$\int_S |f(x)|\,\mu(dx) < \infty, \tag{A.1}$$

then

$$E\Big[\sum_{\varsigma \in \Pi} f(\varsigma)\Big] = \int_S f(x)\,\mu(dx), \qquad \operatorname{Var}\Big(\sum_{\varsigma \in \Pi} f(\varsigma)\Big) = \int_S f^2(x)\,\mu(dx).$$

In general, for any $n \geq 1$ and any real-valued measurable functions $f_1,\dots,f_n$ satisfying (A.1), we have

$$E\Big[\sum_{\text{distinct } \varsigma_1,\dots,\varsigma_n \in \Pi} f_1(\varsigma_1)\cdots f_n(\varsigma_n)\Big] = \prod_{i=1}^n E\Big[\sum_{\varsigma_i \in \Pi} f_i(\varsigma_i)\Big]. \tag{A.2}$$
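Campbell's theorem lends itself to a quick numerical sanity check. The sketch below is our own illustration, assuming NumPy is available; the test function and constants are arbitrary choices. It simulates a homogeneous Poisson process with intensity $c$ on $S = [0,1]$ and compares the empirical mean and variance of $\sum_{\varsigma \in \Pi} f(\varsigma)$ with $\int_S f\,d\mu$ and $\int_S f^2\,d\mu$.

```python
import numpy as np

rng = np.random.default_rng(1)
c = 5.0                                  # intensity on S = [0, 1], so mu(dx) = c dx
f = np.sin                               # a bounded test function satisfying (A.1)

sums = []
for _ in range(50000):
    n = rng.poisson(c)                   # N(S) ~ Poisson(mu(S)) with mu(S) = c
    pts = rng.uniform(0.0, 1.0, size=n)  # given N(S), points are i.i.d. uniform
    sums.append(f(pts).sum())
sums = np.asarray(sums)

# Campbell: E[sum f] = int f dmu = c (1 - cos 1), Var(sum f) = int f^2 dmu.
print(sums.mean(), c * (1 - np.cos(1.0)))         # both ≈ 2.298
print(sums.var(), c * (0.5 - np.sin(2.0) / 4))    # both ≈ 1.363
```

Both printed pairs should agree to within Monte Carlo error.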
Appendix B  Basics of Large Deviations

In probability theory, the law of large numbers describes the limiting average or mean behavior of a random population. The fluctuations around the average are characterized by a fluctuation theorem such as the central limit theorem. The theory of large deviations is concerned with the rare event of deviations from the average. Here we give a brief account of the basic definitions and results of large deviations. Everything is stated in a form that is sufficient for our needs, and all proofs are omitted. Classical references on large deviations include [30], [50], [168], and [175]. More recent developments can be found in [46], [28], and [69]. The formulations here follow mainly Dembo and Zeitouni [28]. Theorem B.6 is from [157].

Let $E$ be a complete, separable metric space with metric $\rho$. Generic elements of $E$ are denoted by $x$, $y$, etc.

Definition B.1. A function $I$ on $E$ is called a rate function if it takes values in $[0,+\infty]$ and is lower semicontinuous. For each $c$ in $[0,+\infty)$, the set $\{x \in E : I(x) \leq c\}$ is called a level set. The effective domain of $I$ is defined as $\{x \in E : I(x) < \infty\}$. If all level sets are compact, the rate function is said to be good. Rate functions will be denoted by other symbols as the need arises.

Let $\{X_\varepsilon : \varepsilon > 0\}$ be a family of $E$-valued random variables with distributions $\{P_\varepsilon : \varepsilon > 0\}$, defined on the Borel $\sigma$-algebra $\mathcal{B}$ of $E$.

Definition B.2. The family $\{X_\varepsilon : \varepsilon > 0\}$, or the family $\{P_\varepsilon : \varepsilon > 0\}$, is said to satisfy a large deviation principle (LDP) on $E$ as $\varepsilon$ converges to zero, with speed $\varepsilon$ and a good rate function $I$, if:

for any closed set $F$,
$$\limsup_{\varepsilon \to 0} \varepsilon \log P_\varepsilon\{F\} \leq -\inf_{x \in F} I(x), \tag{B.1}$$

for any open set $G$,
$$\liminf_{\varepsilon \to 0} \varepsilon \log P_\varepsilon\{G\} \geq -\inf_{x \in G} I(x). \tag{B.2}$$

Estimates (B.1) and (B.2) are called the upper bound and the lower bound, respectively. Let $a(\varepsilon)$ be a function of $\varepsilon$ satisfying $a(\varepsilon) > 0$ and $\lim_{\varepsilon \to 0} a(\varepsilon) = 0$. If the multiplication factor $\varepsilon$ in front of the logarithm is replaced by $a(\varepsilon)$, then the LDP has speed $a(\varepsilon)$.

It is clear that the upper and lower bounds are equivalent to the following statement: for all $B \in \mathcal{B}$,

$$-\inf_{x \in B^{\circ}} I(x) \leq \liminf_{\varepsilon \to 0} \varepsilon \log P_\varepsilon\{B\} \leq \limsup_{\varepsilon \to 0} \varepsilon \log P_\varepsilon\{B\} \leq -\inf_{x \in \bar{B}} I(x),$$

where $B^{\circ}$ and $\bar{B}$ denote the interior and the closure of $B$, respectively. An event $B \in \mathcal{B}$ satisfying

$$\inf_{x \in B^{\circ}} I(x) = \inf_{x \in \bar{B}} I(x)$$

is called an $I$-continuity set. Thus for an $I$-continuity set $B$, we have

$$\lim_{\varepsilon \to 0} \varepsilon \log P_\varepsilon\{B\} = -\inf_{x \in B} I(x).$$

If the values of $\varepsilon$ are only $\{1/n : n \geq 1\}$, we will write $P_n$ instead of $P_{1/n}$. If the upper bound (B.1) holds only for compact sets, then we say the family $\{P_\varepsilon : \varepsilon > 0\}$ satisfies the weak LDP.
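To make Definition B.2 concrete, here is a numerical illustration of our own (not from the text), assuming NumPy and SciPy are available. For i.i.d. fair coin flips with partial sums $S_n$, Cramér's theorem gives an LDP for $S_n/n$ with speed $1/n$ and rate function $I(x) = x\log(2x) + (1-x)\log(2(1-x))$. For $a > 1/2$, the closed set $[a,1]$ is an $I$-continuity set, so $\frac{1}{n}\log P\{S_n/n \geq a\} \to -I(a)$.

```python
import numpy as np
from scipy.stats import binom

def I(x):
    # Cramér rate function for the sample mean of fair Bernoulli trials,
    # valid for 0 < x < 1.
    return x * np.log(2 * x) + (1 - x) * np.log(2 * (1 - x))

a = 0.7
for n in (10, 100, 1000):
    k = int(np.ceil(n * a))
    tail = binom.sf(k - 1, n, 0.5)       # P{S_n >= n a}, an exact binomial tail
    print(n, np.log(tail) / n, -I(a))    # (1/n) log P_n{[a,1]} vs. -inf_{[a,1]} I
```

The printed values approach $-I(0.7) \approx -0.0823$ as $n$ grows; convergence is slow because the polynomial prefactor of the tail probability vanishes only at the logarithmic scale.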