ISSN 2279-9362
Two classes of bivariate distributions on the unit square
Antonio Lijoi Bernardo Nipoti
No. 238 January 2012
www.carloalberto.org/research/working-papers
© 2012 by Antonio Lijoi and Bernardo Nipoti. Any opinions expressed here are those of the authors and not those of the Collegio Carlo Alberto.
A. Lijoi
Università degli Studi di Pavia & Collegio Carlo Alberto, Italy. E-mail: [email protected]
B. Nipoti
MD Anderson Cancer Center, USA & Collegio Carlo Alberto, Italy. E-mail: [email protected]
February 2012
Abstract We study a class of bivariate distributions on the unit square that are obtained by means of a suitable transformation of exponentially and polynomially tilted σ–stable distributions. It is interesting to note that they appear in a class of models used in Bayesian nonparametric inference where dependent nonparametric prior processes are obtained via normalization. The approach we undertake is general, even if we focus on the distributions that result from considering Poisson–Dirichlet and normalized inverse–Gaussian processes. We study some of their properties such as mixed moments and correlation and provide extensions to the multivariate case. Finally we implement an algorithm to simulate from such distributions and observe that it may also be used to sample vectors from a wider class of distributions.
Key words and phrases: Completely random measures; Generalized arcsine distribution; Inverse–Gaussian distribution; Tilted stable distributions; Poisson–Dirichlet process; Random variate generator.
1 Introduction
The present paper introduces two new families of distributions on the unit square (0, 1)^2 that are obtained by suitably transforming random variables whose probability distribution is a polynomial or exponential tilting of a positive σ-stable distribution. The main motivation for the analysis we are going to develop comes from possible applications to Bayesian nonparametric inference. Indeed, polynomially and exponentially tilted random variables are connected to two-parameter Poisson–Dirichlet and normalized generalized gamma processes that represent two well-known classes of nonparametric priors used in various research areas even beyond Bayesian statistics. See, e.g.,
Pitman & Yor (1997), Brix (1999), Pitman (2003, 2006) and Lijoi et al. (2007). If we confine ourselves to the case where σ = 1/2, the stable distribution has a closed analytic form depending on a parameter c > 0, namely

f_{1/2}(x) = \sqrt{\frac{c}{2\pi}} \, x^{-3/2} \exp\Big( -\frac{c}{2x} \Big) \, \mathbf{1}_{(0,\infty)}(x)    (1)

where \mathbf{1}_A stands for the indicator function of the set A. This is also known as the Lévy density. The polynomially tilted random variable related to the two-parameter Poisson–Dirichlet random probability measure has density function
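The density (1) is straightforward to simulate. A minimal sketch (the function name and NumPy usage are ours, not the paper's) exploits the classical fact that c/Z^2 has exactly the density in (1) when Z is standard normal:

```python
import numpy as np

def sample_levy(c, size, rng=None):
    """Draw from the Levy density f_{1/2}(x) = sqrt(c/(2*pi)) x^{-3/2} exp(-c/(2x)).

    Uses the classical representation of the one-sided 1/2-stable law:
    if Z ~ N(0, 1), then c / Z**2 has density (1) with scale parameter c.
    """
    rng = np.random.default_rng(rng)
    z = rng.standard_normal(size)
    return c / z**2
```

A quick Monte Carlo check is that the empirical Laplace transform of the draws matches e^{-\sqrt{2cλ}}, the transform of a Lévy random variable with scale c.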
g_{1/2,\theta}(x) \propto x^{-\theta} f_{1/2}(x),    (2)

for some θ > −1/2, where ∝ means that the above expression lacks the proportionality constant. See Pitman & Yor (1997). Similarly, the density function of the exponentially tilted random variable that is related to the normalized inverse–Gaussian prior is
g_{1/2,\beta}(x) \propto e^{-\beta x} f_{1/2}(x),    (3)

for some β > 0. See Lijoi et al. (2005). When σ ≠ 1/2 there is no closed form expression for the density f_σ: one can just characterize it through its Laplace transform, and the same can be said for the tilted distributions.
The construction we are going to resort to is quite simple. Let X0, X1 and X2 be independent and positive random variables whose probability distribution belongs to the same parametric family. We define the random vector (W1,W2) in such a way that Wi = Xi/(X0 +Xi), for i = 1, 2.
As we shall see, when the X_j's have a density of the form (2) or (3) it is possible to obtain an explicit form of the density of (W_1, W_2) and to analyze some of its properties such as moments and correlations. When σ ≠ 1/2 it is not possible to deduce exact analytic forms of the quantities of interest and one must rely on some suitable simulation algorithm that generates realizations of the vector (W_1, W_2). Such a construction can also be extended to incorporate the case of d-dimensional vectors generated by the d + 1 independent random variables X_0, X_1, . . . , X_d, with d ≥ 2. In terms of statistical applications, the analytic and computational results that will be illustrated throughout the following sections might be useful for the construction of dependent nonparametric priors, namely collections {p̃_z : z ∈ Z} of random probability measures indexed by a covariate, or set of covariates, z. The definition of covariate-dependent priors has recently been the object of very active research in the Bayesian nonparametric literature, since such priors are amenable to use in a variety of contexts ranging from nonparametric regression to meta-analysis, from spatial statistics to time series analysis and so on. See Hjort et al. (Eds) for a recent overview. In our case, if we set Z = {1, . . . , d} we can define p̃_z as a convex linear combination of independent random probabilities with weights W_z and (1 − W_z). The dependence among the W_z's will induce dependence among the p̃_z's. Moreover, a suitable parameterization of the distribution of the X_i's, for i = 0, 1, . . . , d, ensures that the marginal distribution of p̃_z is the same for each z. For example, with d = 2 we can follow such a construction to obtain dependent
Dirichlet processes (p̃_1, p̃_2) by letting X_0, X_1 and X_2 be independent gamma random variables: the distribution of (W_1, W_2) then corresponds to the bivariate beta of Olkin & Liu (2003). We will
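For intuition, the gamma-based construction just mentioned can be sketched as follows (a hypothetical helper of our own; the shape parameters and NumPy conventions are assumptions, not the paper's notation). Marginally each W_i is Beta(shape_i, shape_0), and jointly (W_1, W_2) follows the Olkin & Liu (2003) bivariate beta:

```python
import numpy as np

def bivariate_beta_olkin_liu(shape0, shape1, shape2, size, rng=None):
    """W_i = X_i / (X_0 + X_i) with independent gamma variables X_i.

    Marginally W_i ~ Beta(shape_i, shape0); the shared denominator X_0
    makes W_1 and W_2 positively dependent (Olkin-Liu bivariate beta).
    """
    rng = np.random.default_rng(rng)
    x0 = rng.gamma(shape0, size=size)
    x1 = rng.gamma(shape1, size=size)
    x2 = rng.gamma(shape2, size=size)
    return x1 / (x0 + x1), x2 / (x0 + x2)
```

The shared component X_0 is what induces dependence between the two coordinates, mirroring the role it plays throughout the paper.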
not enter the details of Bayesian nonparametric modeling here and will focus on some structural properties of the distribution of the vector (W_1, . . . , W_d). The structure of the paper is as follows. In Section 2 we provide a quick résumé on completely random measures: these are useful for defining random probability measures that are analytically tractable in a Bayesian inferential framework and are connected to the random variables considered for defining distributions on (0, 1)^2. In Section 3 we study the bivariate density that arises in the construction of dependent two-parameter Poisson–Dirichlet (PD) random measures. We start by defining the marginal density of the weights W_1 and W_2, which turns out to be a generalization of the arcsine distribution. The joint density of (W_1, W_2) can be seen as a bivariate generalized arcsine distribution. For some values of the parameters, we obtain closed expressions for mixed moments, correlation and correlation of odds ratios of (W_1, W_2). In Section 4 we consider the vector (W_1, W_2) whose marginals are obtained via a suitable transformation of inverse–Gaussian random variables. Such a vector appears when dealing with dependent normalized inverse–Gaussian (NIG) processes. Both Sections 3 and 4 are completed by a natural extension of the distributions to (0, 1)^d with d > 2. As already mentioned, the density of (W_1, W_2) is not always available and one needs to resort to some simulation algorithm in order to generate realizations of (W_1, W_2). Such an algorithm is thoroughly studied in Section 5, where we devise it by relying on the random variate generator proposed in Devroye (2009). We particularly focus on polynomially tilted σ–stable random variables with σ ∈ (0, 1). This may be useful since it allows us to estimate quantities that we do not know how to compute analytically but that may be interesting for statistical inference.
2 A quick overview on completely random measures
Completely random measures can be considered as an extension, to general parameter spaces, of processes with independent increments on [0, +∞). In order to provide a formal definition, let M_X be the space of boundedly finite measures over (X, X), namely if m ∈ M_X one has m(A) < ∞ for any bounded set A ∈ X. Note that one can define a suitable topology on M_X so that it is possible to consider the Borel σ–algebra of sets B(M_X) on M_X. For details see Daley & Vere-Jones (2008).

Definition 1. A measurable mapping µ from a probability space (Ω, F, P) into (M_X, B(M_X)) is called a completely random measure (CRM) if, for any A_1, . . . , A_n in X such that A_i ∩ A_j = ∅ when i ≠ j, the random variables µ(A_1), . . . , µ(A_n) are mutually independent.
CRMs are almost surely discrete, that is any realization of a CRM is a discrete measure with probability one. Any CRM µ may be represented as
µ = µc + µ0,
where µ_c = \sum_{i=1}^{\infty} J_i \delta_{X_i} is a CRM with both the positive jumps J_i and the locations X_i random, and µ_0 = \sum_{i=1}^{M} V_i \delta_{x_i} is a measure with random masses V_i at fixed locations x_i in X. Moreover,
M ∈ N ∪ {∞} and V_1, . . . , V_M are mutually independent and independent from µ_c. The component without fixed jumps, µ_c, is characterized by the Lévy–Khintchine representation, which states that
there exists a measure ν on R^+ × X such that

\int_{R^+ \times B} \min\{s, 1\} \, ν(ds, dx) < \infty    (4)

for any B ∈ X and

E\Big[ \exp\Big( -\int_X f(x) \, µ_c(dx) \Big) \Big] = \exp\Big( -\int_{R^+ \times X} [\, 1 - \exp(-s f(x)) \,] \, ν(ds, dx) \Big)

for any measurable function f : X → R such that \int_X |f(x)| \, µ_c(dx) < ∞ almost surely. The measure ν characterizes µ_c and is referred to as the Lévy intensity of µ_c. One can then define a CRM by assigning the measure ν. Moreover, a CRM µ can define a random probability measure whose distribution acts as a nonparametric prior for Bayesian inference. For example, if ν satisfies (4) and ν(R^+ × X) = ∞, then the corresponding CRM µ defines p̃ = µ/µ(X). See, e.g., Regazzini et al. (2003). On the other hand, if X = R^+ and µ is such that lim_{t→∞} µ((0, t]) = ∞ almost surely, then F(t) = 1 − exp(−µ((0, t])) is a so-called neutral to the right distribution function that is used in survival analysis. See Doksum (1974). Here below we provide two illustrative examples, namely σ–stable and inverse–Gaussian CRMs, that are related to the random quantities involved in the definition of the vectors we will be studying in the next sections. We display their Lévy intensities and the corresponding Laplace functional transforms.
Example 1. Let σ be a parameter in (0, 1) and α a finite and non-null measure on (X, X). A CRM µ_σ with Lévy intensity

ν(ds, dx) = \frac{σ}{Γ(1 − σ) \, s^{1+σ}} \, ds \, α(dx)

is termed σ–stable with parameter measure α on X. For any measurable function f : X → R such that \int_X |f(x)|^σ α(dx) < ∞ the Laplace functional of µ_σ is given by

E\big[ e^{-\int_X f(x) \, µ_σ(dx)} \big] = e^{-\int_X f(x)^σ \, α(dx)}

and, for any B ∈ X, the Laplace transform of µ_σ(B) is that of a positive stable random variable, that is

E\big[ e^{-λ µ_σ(B)} \big] = e^{-λ^σ α(B)}.    (5)
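As a numerical sanity check of (5), one can integrate 1 − e^{−λs} against the σ-stable Lévy intensity above and recover the Laplace exponent λ^σ (here for a set with α(B) = 1). The sketch below is our own illustration, using SciPy quadrature:

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma

def stable_laplace_exponent(lam, sigma):
    """Integrate (1 - e^{-lam*s}) against the sigma-stable Levy intensity
    sigma / (Gamma(1 - sigma) * s^{1 + sigma}); the result should equal
    lam**sigma, matching the Laplace transform in (5) with alpha(B) = 1."""
    f = lambda s: ((1.0 - np.exp(-lam * s))
                   * sigma / (gamma(1.0 - sigma) * s**(1.0 + sigma)))
    # split at 1 to let QUADPACK handle the integrable singularity at 0
    head, _ = quad(f, 0.0, 1.0)
    tail, _ = quad(f, 1.0, np.inf)
    return head + tail
```

The agreement with λ^σ for several values of σ gives a concrete handle on the Lévy–Khintchine formula of the previous section.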
Example 2. Let α be a finite and non-null measure on (X, X). A CRM µ with Lévy intensity

ν(ds, dx) = \frac{1}{\sqrt{2π}} \frac{e^{-s/2}}{s^{3/2}} \, ds \, α(dx)

is called inverse–Gaussian with parameter measure α on X. For any measurable function f : X → R such that \int_X \sqrt{1 + 2|f(x)|} \, α(dx) < ∞ the Laplace functional of µ is given by

E\big[ e^{-\int_X f(x) \, µ(dx)} \big] = e^{-\int_X \sqrt{1 + 2 f(x)} \, α(dx) + α(X)},

and, for any B ∈ X, the Laplace transform of µ(B) is that of an inverse–Gaussian random variable, that is

E\big[ e^{-λ µ(B)} \big] = e^{-\sqrt{1 + 2λ} \, α(B) + α(B)}.    (6)
3 A bivariate generalized arcsine distribution
We first consider the case where the random vector (W1,W2) is obtained by transforming polyno- mially tilted positive stable distributions. We will deal with the case where σ = 1/2 since it yields some mathematical tractability. The case of general σ in (0, 1) will be tackled only via simulation.
3.1 A generalized arcsine distribution
A generalized arcsine distribution was introduced by Lamperti (1958) and plays an important role in probability theory since it is connected to excursions of a skew Bessel process. In this section we give a definition of a new two-parameter generalization of the arcsine distribution. We define such a generalization as a function of independent random variables with polynomially tilted Lévy distribution: this turns out to be important in defining dependent PD random probability measures.
Let µ_σ, with σ ∈ (0, 1), be a σ-stable CRM with parameter measure α on X. If σ = 1/2 then, according to (5), the random total mass µ_σ(X) has Laplace transform equal to e^{-\sqrt{λ} M}, where M = α(X). This is the Laplace transform of a Lévy random variable on R^+ with scale parameter c = M^2/2 in the parametrization used in (1). Let µ_{σ,θ}, with σ ∈ (0, 1) and θ > −σ, be a random measure taking values in M_X and with parameter measure α on X. This implies that E[µ_{σ,θ}(A)] = α(A) for any bounded A in X. Suppose further that the probability distribution
Pσ,θ of µσ,θ is absolutely continuous with respect to the probability distribution Pσ of the σ–stable
CRM µ_σ. Moreover,

\frac{dP_{σ,θ}}{dP_σ}(µ) = \frac{Γ(θ + 1)}{Γ(θ/σ + 1)} \, M^{θ/σ} \, µ(X)^{-θ}.    (7)
Observe that µ_σ = µ_{σ,0}. From (7) it obviously follows that the probability distribution, say Q_{σ,θ}, of the random total mass µ_{σ,θ}(X) is absolutely continuous with respect to the distribution Q_σ of µ_σ(X) and we have

P( µ_{σ,θ}(X) ≤ T ) = Q_{σ,θ}([0, T]) = \frac{Γ(θ + 1)}{Γ(θ/σ + 1)} \, M^{θ/σ} \int_{[0,T]} y^{-θ} \, Q_σ(dy).
When σ = 1/2 and c = M^2/2 we have
Q_{1/2,θ}([0, T]) = \frac{Γ(θ + 1)}{Γ(2θ + 1)} \, (2c)^{θ} \int_0^T y^{-θ} f_{1/2}(y) \, dy,

since µ_{1/2}(X) has a Lévy distribution with scale parameter c. This provides an expression for the density function f_{1/2,θ} of µ_{1/2,θ}(X) and it allows us to give the following definition.

Definition 2. A random variable has a Poisson–Dirichlet distribution with parameters (1/2, θ, c), where c > 0 is a scale parameter and θ > −1/2, if its density is equal to

f_{1/2,θ}(x) = \frac{Γ(θ + 1)}{Γ(2θ + 1)} \frac{c^{1/2+θ}}{2^{1/2-θ} \, π^{1/2}} \exp\Big( -\frac{c}{2x} \Big) \, x^{-3/2-θ} \, \mathbf{1}_{(0,+∞)}(x).    (8)
In the sequel we refer to such a random variable, for short, as PD(1/2, θ; c).
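Sampling a PD(1/2, θ; c) random variable is straightforward, since (8) shows that the reciprocal of such a variable is gamma distributed. A minimal sketch (the function name is ours):

```python
import numpy as np

def sample_pd_half(theta, c, size, rng=None):
    """Draw from the PD(1/2, theta; c) density in (8).

    By (8), 1/X has a Gamma(theta + 1/2, rate c/2) distribution, so
    X = c / (2 G) with G ~ Gamma(theta + 1/2, 1).  Setting theta = 0
    recovers the Levy (one-sided 1/2-stable) law with scale c.
    """
    rng = np.random.default_rng(rng)
    g = rng.gamma(theta + 0.5, size=size)
    return c / (2.0 * g)
```

A simple check on the output uses E[1/X] = (2θ + 1)/c, which follows directly from the gamma representation above.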
If θ = 0, (8) obviously reduces to the Lévy density. Let us now consider independent random variables X_0 and X_1 with distribution PD(1/2, θ; c_i), for i = 0, 1, and set a = c_1/c_0. We then define a random variable W as

W = \frac{X_1}{X_1 + X_0},

which leads to the following definition.
Definition 3. A random variable has a generalized arcsine distribution (GA) with parameter a > 0 if its density is equal to

g_θ(w) = \frac{Γ^2(θ + 1)}{Γ(2θ + 1)} \frac{a^{1/2+θ} \, 4^{2θ}}{π} \frac{w^{-1/2+θ} (1 − w)^{-1/2+θ}}{[\, w + a(1 − w) \,]^{1+2θ}} \, \mathbf{1}_{(0,1)}(w).    (9)
Note that the density in (9) is related to (3.4) in Carlton (2002), where one can find the family of finite-dimensional distributions of the PD(1/2, θ) random probability measure p̃_{1/2,θ} = µ_{1/2,θ}/µ_{1/2,θ}(X). Moreover, it reduces to the arcsine distribution if a = 1 (i.e. X_0 and X_1 are identically distributed) and θ = 0. Another well-known two-parameter generalization of the arcsine density is due to Lamperti (1958) and is defined as
f(w) = \frac{α_2 \sin(π α_1)}{π} \, \frac{w^{α_1 − 1} (1 − w)^{α_1 − 1}}{α_2^2 \, w^{2α_1} + 2 α_2 \, w^{α_1} (1 − w)^{α_1} \cos(π α_1) + (1 − w)^{2α_1}}.    (10)

We observe that if θ = 0 then the density in (9) is a reparameterization of the density in (10) with α_1 = 1/2. In this case a = α_2^{-2}. The nth moment of a random variable W with density (9) is given by

E[W^n] = \frac{4^θ \, a^{-1/2-θ}}{\sqrt{π}} \, \frac{Γ(1 + θ) \, Γ(\frac12 + θ + n)}{Γ(1 + 2θ + n)} \, {}_2F_1\Big( 1 + 2θ, \frac12 + θ + n; \, 1 + 2θ + n; \, \frac{a − 1}{a} \Big),    (11)

where {}_2F_1(x_1, x_2; y; z) is the Gauss hypergeometric function.
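Formula (11) is easy to evaluate with a standard implementation of the Gauss hypergeometric function. The sketch below (our naming; SciPy assumed) also lends itself to two deterministic consistency checks: the zeroth moment equals 1 for any (a, θ), and for θ = 0, a = 1 the mean reduces to the arcsine value 1/2.

```python
import numpy as np
from scipy.special import gamma as G, hyp2f1

def ga_moment(n, a, theta):
    """n-th moment of the generalized arcsine density (9), following (11)."""
    const = (4.0**theta * a**(-0.5 - theta) / np.sqrt(np.pi)
             * G(1 + theta) * G(0.5 + theta + n) / G(1 + 2 * theta + n))
    return const * hyp2f1(1 + 2 * theta, 0.5 + theta + n,
                          1 + 2 * theta + n, (a - 1) / a)
```

For n = 0 the hypergeometric factor collapses to (1 − z)^{−(1/2+θ)} = a^{1/2+θ}, which cancels the a-dependence of the constant, as it must.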
3.2 A bivariate generalized arcsine distribution
The construction of a vector of generalized arcsine random variables is worked out in a fashion similar to the way the bivariate beta is obtained in Olkin & Liu (2003). Accordingly, introduce three independent random variables X_0, X_1 and X_2 with distribution PD(1/2, θ; c_i), for i = 0, 1, 2, respectively.
Define, then, the random variables W1 and W2 as
W_i = \frac{X_i}{X_i + X_0}, \qquad i = 1, 2.
Marginally, W1 and W2 have (univariate) GA distribution with parameters respectively equal to a1 = c1/c0 and a2 = c2/c0. This leads us to give the following definition. Definition 4. A two-dimensional random vector has a bivariate generalized arcsine distribution
(bGA) with parameters (a1, a2) if its density coincides with
g_{1,2;θ}(w_1, w_2) = \frac{Γ^3(θ + 1) \, Γ(\frac32 + 3θ)}{Γ^3(2θ + 1)} \frac{(a_1 a_2)^{1/2+θ} \, 4^{3θ}}{π^{3/2}} \, \frac{w_1^{2θ} \, w_2^{2θ} \, (1 − w_1)^{-1/2+θ} (1 − w_2)^{-1/2+θ}}{[\, w_1 w_2 + a_1 (1 − w_1) w_2 + a_2 (1 − w_2) w_1 \,]^{3/2+3θ}} \, \mathbf{1}_{(0,1)^2}(w_1, w_2).    (12)
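Realizations of a bGA vector can be generated directly from the construction above, combining the inverse-gamma representation implied by (8) with the definition of (W_1, W_2). A minimal sketch of our own:

```python
import numpy as np

def sample_bga(theta, c0, c1, c2, size, rng=None):
    """Simulate a bGA vector (W_1, W_2) with density (12) via its construction:
    W_i = X_i / (X_i + X_0), with independent X_i ~ PD(1/2, theta; c_i),
    each sampled as X_i = c_i / (2 G_i) with G_i ~ Gamma(theta + 1/2, 1)."""
    rng = np.random.default_rng(rng)
    x = [c / (2.0 * rng.gamma(theta + 0.5, size=size)) for c in (c0, c1, c2)]
    return x[1] / (x[1] + x[0]), x[2] / (x[2] + x[0])
```

In the exchangeable case c_0 = c_1 = c_2 each marginal is symmetric around 1/2, while the shared component X_0 induces positive correlation between W_1 and W_2.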
We now move on to studying some properties of a bGA vector defined above. In particular we will first focus on the (n_1, n_2)th mixed moment of (W_1, W_2). For generic parameters a_1, a_2 and θ we have the following integral representation:
E[W_1^{n_1} W_2^{n_2}] = \frac{Γ^3(θ + 1) \, Γ(\frac32 + 3θ) \, Γ(\frac12 + θ) \, Γ(2θ + n_2 + 1)}{Γ^3(2θ + 1) \, Γ(\frac32 + 3θ + n_2)} \, \frac{a_1^{1/2+θ} \, a_2^{-1-2θ} \, 4^{3θ}}{π^{3/2}}
\times \int_0^1 w_1^{-3/2-θ+n_1} (1 − w_1)^{-1/2+θ} \, {}_2F_1\Big( \frac32 + 3θ, \, 2θ + n_2 + 1; \, \frac32 + 3θ + n_2; \, \frac{w_1 (a_2 + a_1 − 1) − a_1}{w_1 a_2} \Big) \, dw_1.    (13)

Since in general the integral cannot be evaluated exactly, one needs to resort to a suitable numerical algorithm for an approximate evaluation. We will deal with this later. However, for particular values of a_1, a_2 and θ one can obtain an explicit form for the mixed moments of (W_1, W_2), as displayed in the next
Proposition 1. Let (W_1, W_2) have a bGA distribution with parameters a_1 and a_2. For an explicit form of the mixed moments of (W_1, W_2) we have the following three cases, corresponding to different values of (a_1, a_2):
(1.a) if a1 = a2 = 1, then