Lecture Notes on Bayesian Nonparametrics Peter Orbanz
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Introduction to Lévy Processes
Introduction to Lévy Processes Huang Lorick [email protected] Document type These are lecture notes. Typos, errors, and imprecisions are expected. Comments are welcome! This version is available at http://perso.math.univ-toulouse.fr/lhuang/enseignements/ Year of publication 2021 Terms of use This work is licensed under a Creative Commons Attribution 4.0 International license: https://creativecommons.org/licenses/by/4.0/ Contents Contents 1 1 Introduction and Examples 2 1.1 Infinitely divisible distributions . 2 1.2 Examples of infinitely divisible distributions . 2 1.3 The Lévy Khintchine formula . 4 1.4 Digression on Relativity . 6 2 Lévy processes 8 2.1 Definition of a Lévy process . 8 2.2 Examples of Lévy processes . 9 2.3 Exploring the Jumps of a Lévy Process . 11 3 Proof of the Levy Khintchine formula 19 3.1 The Lévy-Itô Decomposition . 19 3.2 Consequences of the Lévy-Itô Decomposition . 21 3.3 Exercises . 23 3.4 Discussion . 23 4 Lévy processes as Markov Processes 24 4.1 Properties of the Semi-group . 24 4.2 The Generator . 26 4.3 Recurrence and Transience . 28 4.4 Fractional Derivatives . 29 5 Elements of Stochastic Calculus with Jumps 31 5.1 Example of Use in Applications . 31 5.2 Stochastic Integration . 32 5.3 Construction of the Stochastic Integral . 33 5.4 Quadratic Variation and Itô Formula with jumps . 34 5.5 Stochastic Differential Equation . 35 Bibliography 38 1 Chapter 1 Introduction and Examples In this introductive chapter, we start by defining the notion of infinitely divisible distributions. We then give examples of such distributions and end this chapter by stating the celebrated Lévy-Khintchine formula. -
Superprocesses and Mckean-Vlasov Equations with Creation of Mass
Sup erpro cesses and McKean-Vlasov equations with creation of mass L. Overb eck Department of Statistics, University of California, Berkeley, 367, Evans Hall Berkeley, CA 94720, y U.S.A. Abstract Weak solutions of McKean-Vlasov equations with creation of mass are given in terms of sup erpro cesses. The solutions can b e approxi- mated by a sequence of non-interacting sup erpro cesses or by the mean- eld of multityp e sup erpro cesses with mean- eld interaction. The lat- ter approximation is asso ciated with a propagation of chaos statement for weakly interacting multityp e sup erpro cesses. Running title: Sup erpro cesses and McKean-Vlasov equations . 1 Intro duction Sup erpro cesses are useful in solving nonlinear partial di erential equation of 1+ the typ e f = f , 2 0; 1], cf. [Dy]. Wenowchange the p oint of view and showhowtheyprovide sto chastic solutions of nonlinear partial di erential Supp orted byanFellowship of the Deutsche Forschungsgemeinschaft. y On leave from the Universitat Bonn, Institut fur Angewandte Mathematik, Wegelerstr. 6, 53115 Bonn, Germany. 1 equation of McKean-Vlasovtyp e, i.e. wewant to nd weak solutions of d d 2 X X @ @ @ + d x; + bx; : 1.1 = a x; t i t t t t t ij t @t @x @x @x i j i i=1 i;j =1 d Aweak solution = 2 C [0;T];MIR satis es s Z 2 t X X @ @ a f = f + f + d f + b f ds: s ij s t 0 i s s @x @x @x 0 i j i Equation 1.1 generalizes McKean-Vlasov equations of twodi erenttyp es. -
Birth and Death Process in Mean Field Type Interaction
BIRTH AND DEATH PROCESS IN MEAN FIELD TYPE INTERACTION MARIE-NOÉMIE THAI ABSTRACT. Theaim ofthispaperis to study theasymptoticbehaviorofa system of birth and death processes in mean field type interaction in discrete space. We first establish the exponential convergence of the particle system to equilibrium for a suitable Wasserstein coupling distance. The approach provides an explicit quantitative estimate on the rate of convergence. We prove next a uniform propagation of chaos property. As a consequence, we show that the limit of the associated empirical distribution, which is the solution of a nonlinear differential equation, converges exponentially fast to equilibrium. This paper can be seen as a discrete version of the particle approximation of the McKean-Vlasov equations and is inspired from previous works of Malrieu and al and Caputo, Dai Pra and Posta. AMS 2000 Mathematical Subject Classification: 60K35, 60K25, 60J27, 60B10, 37A25. Keywords: Interacting particle system - mean field - coupling - Wasserstein distance - propagation of chaos. CONTENTS 1. Introduction 1 Long time behavior of the particle system 4 Propagation of chaos 5 Longtimebehaviorofthenonlinearprocess 7 2. Proof of Theorem1.1 7 3. Proof of Theorem1.2 12 4. Proof of Theorem1.5 15 5. Appendix 16 References 18 1. INTRODUCTION The concept of mean field interaction arised in statistical physics with Kac [17] and then arXiv:1510.03238v1 [math.PR] 12 Oct 2015 McKean [21] in order to describe the collisions between particles in a gas, and has later been applied in other areas such as biology or communication networks. A particle system is in mean field interaction when the system acts over one fixed particle through the em- pirical measure of the system. -
1 Introduction
The analysis of marked and weighted empirical processes of estimated residuals Vanessa Berenguer-Rico Department of Economics; University of Oxford; Oxford; OX1 3UQ; UK and Mansfield College; Oxford; OX1 3TF [email protected] Søren Johansen Department of Economics; University of Copenhagen and CREATES; Department of Economics and Business; Aarhus University [email protected] Bent Nielsen1 Department of Economics; University of Oxford; Oxford; OX1 3UQ; UK and Nuffield College; Oxford; OX1 1NF; U.K. bent.nielsen@nuffield.ox.ac.uk 29 April 2019 Summary An extended and improved theory is presented for marked and weighted empirical processes of residuals of time series regressions. The theory is motivated by 1- step Huber-skip estimators, where a set of good observations are selected using an initial estimator and an updated estimator is found by applying least squares to the selected observations. In this case, the weights and marks represent powers of the regressors and the regression errors, respectively. The inclusion of marks is a non-trivial extention to previous theory and requires refined martingale arguments. Keywords 1-step Huber-skip; Non-stationarity; Robust Statistics; Stationarity. 1 Introduction We consider marked and weighted empirical processes of residuals from a linear time series regression. Such processes are sums of products of an adapted weight, a mark that is a power of the innovations and an indicator for the residuals belonging to a half line. They have previously been studied by Johansen & Nielsen (2016a) - JN16 henceforth - generalising results by Koul & Ossiander (1994) and Koul (2002) for processes without marks. The results presented extend and improve upon expansions previously given in JN16, while correcting a mistake in the argument, simplifying proofs and allowing weaker conditions on the innovation distribution and regressors. -
A Mathematical Theory of Network Interference and Its Applications
INVITED PAPER A Mathematical Theory of Network Interference and Its Applications A unifying framework is developed to characterize the aggregate interference in wireless networks, and several applications are presented. By Moe Z. Win, Fellow IEEE, Pedro C. Pinto, Student Member IEEE, and Lawrence A. Shepp ABSTRACT | In this paper, we introduce a mathematical I. INTRODUCTION framework for the characterization of network interference in In a wireless network composed of many spatially wireless systems. We consider a network in which the scattered nodes, communication is constrained by various interferers are scattered according to a spatial Poisson process impairments such as the wireless propagation effects, and are operating asynchronously in a wireless environment network interference, and thermal noise. The effects subject to path loss, shadowing, and multipath fading. We start introduced by propagation in the wireless channel include by determining the statistical distribution of the aggregate the attenuation of radiated signals with distance (path network interference. We then investigate four applications of loss), the blocking of signals caused by large obstacles the proposed model: 1) interference in cognitive radio net- (shadowing), and the reception of multiple copies of the works; 2) interference in wireless packet networks; 3) spectrum same transmitted signal (multipath fading). The network of the aggregate radio-frequency emission of wireless net- interference is due to accumulation of signals radiated by works; and 4) coexistence between ultrawideband and nar- other transmitters, which undesirably affect receiver nodes rowband systems. Our framework accounts for all the essential in the network. The thermal noise is introduced by the physical parameters that affect network interference, such as receiver electronics and is usually modeled as additive the wireless propagation effects, the transmission technology, white Gaussian noise (AWGN). -
Hyperprior on Symmetric Dirichlet Distribution
Hyperprior on symmetric Dirichlet distribution Jun Lu Computer Science, EPFL, Lausanne [email protected] November 5, 2018 Abstract In this article we introduce how to put vague hyperprior on Dirichlet distribution, and we update the parameter of it by adaptive rejection sampling (ARS). Finally we analyze this hyperprior in an over-fitted mixture model by some synthetic experiments. 1 Introduction It has become popular to use over-fitted mixture models in which number of cluster K is chosen as a conservative upper bound on the number of components under the expectation that only relatively few of the components K0 will be occupied by data points in the samples X . This kind of over-fitted mixture models has been successfully due to the ease in computation. Previously Rousseau & Mengersen(2011) proved that quite generally, the posterior behaviour of overfitted mixtures depends on the chosen prior on the weights, and on the number of free parameters in the emission distributions (here D, i.e. the dimension of data). Specifically, they have proved that (a) If α=min(αk; k ≤ K)>D=2 and if the number of components is larger than it should be, asymptotically two or more components in an overfitted mixture model will tend to merge with non-negligible weights. (b) −1=2 In contrast, if α=max(αk; k 6 K) < D=2, the extra components are emptied at a rate of N . Hence, if none of the components are small, it implies that K is probably not larger than K0. In the intermediate case, if min(αk; k ≤ K) ≤ D=2 ≤ max(αk; k 6 K), then the situation varies depending on the αk’s and on the difference between K and K0. -
Analysis of Error Propagation in Particle Filters with Approximation
The Annals of Applied Probability 2011, Vol. 21, No. 6, 2343–2378 DOI: 10.1214/11-AAP760 c Institute of Mathematical Statistics, 2011 ANALYSIS OF ERROR PROPAGATION IN PARTICLE FILTERS WITH APPROXIMATION1 By Boris N. Oreshkin and Mark J. Coates McGill University This paper examines the impact of approximation steps that be- come necessary when particle filters are implemented on resource- constrained platforms. We consider particle filters that perform in- termittent approximation, either by subsampling the particles or by generating a parametric approximation. For such algorithms, we de- rive time-uniform bounds on the weak-sense Lp error and present associated exponential inequalities. We motivate the theoretical anal- ysis by considering the leader node particle filter and present numeri- cal experiments exploring its performance and the relationship to the error bounds. 1. Introduction. Particle filters have proven to be an effective approach for addressing difficult tracking problems [8]. Since they are more compu- tationally demanding and require more memory than most other filtering algorithms, they are really only a valid choice for challenging problems, for which other well-established techniques perform poorly. Such problems in- volve dynamics and/or observation models that are substantially nonlinear and non-Gaussian. A particle filter maintains a set of “particles” that are candidate state values of the system (e.g., the position and velocity of an object). The filter evaluates how well individual particles correspond to the dynamic model and set of observations, and updates weights accordingly. The set of weighted particles provides a pointwise approximation to the filtering distribution, which represents the posterior probability of the state. -
Introduction to Lévy Processes
Introduction to L´evyprocesses Graduate lecture 22 January 2004 Matthias Winkel Departmental lecturer (Institute of Actuaries and Aon lecturer in Statistics) 1. Random walks and continuous-time limits 2. Examples 3. Classification and construction of L´evy processes 4. Examples 5. Poisson point processes and simulation 1 1. Random walks and continuous-time limits 4 Definition 1 Let Yk, k ≥ 1, be i.i.d. Then n X 0 Sn = Yk, n ∈ N, k=1 is called a random walk. -4 0 8 16 Random walks have stationary and independent increments Yk = Sk − Sk−1, k ≥ 1. Stationarity means the Yk have identical distribution. Definition 2 A right-continuous process Xt, t ∈ R+, with stationary independent increments is called L´evy process. 2 Page 1 What are Sn, n ≥ 0, and Xt, t ≥ 0? Stochastic processes; mathematical objects, well-defined, with many nice properties that can be studied. If you don’t like this, think of a model for a stock price evolving with time. There are also many other applications. If you worry about negative values, think of log’s of prices. What does Definition 2 mean? Increments , = 1 , are independent and Xtk − Xtk−1 k , . , n , = 1 for all 0 = . Xtk − Xtk−1 ∼ Xtk−tk−1 k , . , n t0 < . < tn Right-continuity refers to the sample paths (realisations). 3 Can we obtain L´evyprocesses from random walks? What happens e.g. if we let the time unit tend to zero, i.e. take a more and more remote look at our random walk? If we focus at a fixed time, 1 say, and speed up the process so as to make n steps per time unit, we know what happens, the answer is given by the Central Limit Theorem: 2 Theorem 1 (Lindeberg-L´evy) If σ = V ar(Y1) < ∞, then Sn − (Sn) √E → Z ∼ N(0, σ2) in distribution, as n → ∞. -
Fclts for the Quadratic Variation of a CTRW and for Certain Stochastic Integrals
FCLTs for the Quadratic Variation of a CTRW and for certain stochastic integrals No`eliaViles Cuadros (joint work with Enrico Scalas) Universitat de Barcelona Sevilla, 17 de Septiembre 2013 Damped harmonic oscillator subject to a random force The equation of motion is informally given by x¨(t) + γx_(t) + kx(t) = ξ(t); (1) where x(t) is the position of the oscillating particle with unit mass at time t, γ > 0 is the damping coefficient, k > 0 is the spring constant and ξ(t) represents white L´evynoise. I. M. Sokolov, Harmonic oscillator under L´evynoise: Unexpected properties in the phase space. Phys. Rev. E. Stat. Nonlin Soft Matter Phys 83, 041118 (2011). 2 of 27 The formal solution is Z t x(t) = F (t) + G(t − t0)ξ(t0)dt0; (2) −∞ where G(t) is the Green function for the homogeneous equation. The solution for the velocity component can be written as Z t 0 0 0 v(t) = Fv (t) + Gv (t − t )ξ(t )dt ; (3) −∞ d d where Fv (t) = dt F (t) and Gv (t) = dt G(t). 3 of 27 • Replace the white noise with a sequence of instantaneous shots of random amplitude at random times. • They can be expressed in terms of the formal derivative of compound renewal process, a random walk subordinated to a counting process called continuous-time random walk. A continuous time random walk (CTRW) is a pure jump process given by a sum of i.i.d. random jumps fYi gi2N separated by i.i.d. random waiting times (positive random variables) fJi gi2N. -
Levy Processes
LÉVY PROCESSES, STABLE PROCESSES, AND SUBORDINATORS STEVEN P.LALLEY 1. DEFINITIONSAND EXAMPLES d Definition 1.1. A continuous–time process Xt = X(t ) t 0 with values in R (or, more generally, in an abelian topological groupG ) isf called a Lévyg ≥ process if (1) its sample paths are right-continuous and have left limits at every time point t , and (2) it has stationary, independent increments, that is: (a) For all 0 = t0 < t1 < < tk , the increments X(ti ) X(ti 1) are independent. − (b) For all 0 s t the··· random variables X(t ) X−(s ) and X(t s ) X(0) have the same distribution.≤ ≤ − − − The default initial condition is X0 = 0. A subordinator is a real-valued Lévy process with nondecreasing sample paths. A stable process is a real-valued Lévy process Xt t 0 with ≥ initial value X0 = 0 that satisfies the self-similarity property f g 1/α (1.1) Xt =t =D X1 t > 0. 8 The parameter α is called the exponent of the process. Example 1.1. The most fundamental Lévy processes are the Wiener process and the Poisson process. The Poisson process is a subordinator, but is not stable; the Wiener process is stable, with exponent α = 2. Any linear combination of independent Lévy processes is again a Lévy process, so, for instance, if the Wiener process Wt and the Poisson process Nt are independent then Wt Nt is a Lévy process. More important, linear combinations of independent Poisson− processes are Lévy processes: these are special cases of what are called compound Poisson processes: see sec. -
Financial Modeling with L´Evy Processes and Applying L
FINANCIAL MODELING WITH LEVY´ PROCESSES AND APPLYING LEVY´ SUBORDINATOR TO CURRENT STOCK DATA by GONSALGE ALMEIDA Submitted in partial fullfillment of the requirements for the degree of Doctor of Philosophy Dissertation Advisor: Dr. Wojbor A. Woyczynski Department of Mathematics, Applied Mathematics and Statistics CASE WESTERN RESERVE UNIVERSITY January 2020 CASE WESTERN RESERVE UNIVERSITY SCHOOL OF GRADUATE STUDIES We hereby approve the dissertation of Gonsalge Almeida candidate for the Doctoral of Philosophy degree Committee Chair: Dr.Wojbor Woyczynski Professor, Department of the Mathematics, Applied Mathematics and Statis- tics Committee: Dr.Alethea Barbaro Associate Professor, Department of the Mathematics, Applied Mathematics and Statistics Committee: Dr.Jenny Brynjarsdottir Associate Professor, Department of the Mathematics, Applied Mathematics and Statistics Committee: Dr.Peter Ritchken Professor, Weatherhead School of Management Acceptance date: June 14, 2019 *We also certify that written approval has been obtained for any proprietary material contained therein. CONTENTS List of Figures iv List of Tables ix Introduction . .1 1 Financial Modeling with L´evyProcesses and Infinitely Divisible Distributions 5 1.1 Introduction . .5 1.2 Preliminaries on L´evyprocesses . .6 1.3 Characteristic Functions . .8 1.4 Cumulant Generating Function . .9 1.5 α−Stable Distributions . 10 1.6 Tempered Stable Distribution and Process . 19 1.6.1 Tempered Stable Diffusion and Super-Diffusion . 23 1.7 Numerical Approximation of Stable and Tempered Stable Sample Paths 28 1.8 Monte Carlo Simulation for Tempered α−Stable L´evyprocess . 34 2 Brownian Subordination (Tempered Stable Subordinator) 44 i 2.1 Introduction . 44 2.2 Tempered Anomalous Subdiffusion . 46 2.3 Subordinators . 49 2.4 Time-Changed Brownian Motion . -
A Dirichlet Process Mixture Model of Discrete Choice Arxiv:1801.06296V1
A Dirichlet Process Mixture Model of Discrete Choice 19 January, 2018 Rico Krueger (corresponding author) Research Centre for Integrated Transport Innovation, School of Civil and Environmental Engineering, UNSW Australia, Sydney NSW 2052, Australia [email protected] Akshay Vij Institute for Choice, University of South Australia 140 Arthur Street, North Sydney NSW 2060, Australia [email protected] Taha H. Rashidi Research Centre for Integrated Transport Innovation, School of Civil and Environmental Engineering, UNSW Australia, Sydney NSW 2052, Australia [email protected] arXiv:1801.06296v1 [stat.AP] 19 Jan 2018 1 Abstract We present a mixed multinomial logit (MNL) model, which leverages the truncated stick- breaking process representation of the Dirichlet process as a flexible nonparametric mixing distribution. The proposed model is a Dirichlet process mixture model and accommodates discrete representations of heterogeneity, like a latent class MNL model. Yet, unlike a latent class MNL model, the proposed discrete choice model does not require the analyst to fix the number of mixture components prior to estimation, as the complexity of the discrete mixing distribution is inferred from the evidence. For posterior inference in the proposed Dirichlet process mixture model of discrete choice, we derive an expectation maximisation algorithm. In a simulation study, we demonstrate that the proposed model framework can flexibly capture differently-shaped taste parameter distributions. Furthermore, we empirically validate the model framework in a case study on motorists’ route choice preferences and find that the proposed Dirichlet process mixture model of discrete choice outperforms a latent class MNL model and mixed MNL models with common parametric mixing distributions in terms of both in-sample fit and out-of-sample predictive ability.