Sub-Independence: An Expository Perspective
G. G. Hamedani
Department of Mathematics, Statistics and Computer Sciences, Marquette University, Milwaukee, WI

This is the author's final, peer-reviewed manuscript, made available through e-Publications@Marquette; it is not the published version. The published version appeared in Communications in Statistics: Theory and Methods, Vol. 42, No. 20 (2013): 3615-3638, © Taylor & Francis, and this version appears here with permission.

Abstract
Limit theorems as well as other well-known results in probability and statistics are often based on the distribution of the sums of independent random variables. The concept of sub-independence, which is much weaker than that of independence, is shown to be sufficient to yield the conclusions of these theorems and results. It also provides a measure of dissociation between two random variables which is much stronger than uncorrelatedness.

Keywords
Characteristic function, Independence, Limit theorems, Sub-independence

1. Introduction
Limit theorems as well as other well-known results in probability and statistics are often based on the distribution of the sums of independent (and often identically distributed) random variables rather than on the joint distribution of the summands. Therefore, the full force of independence of the summands is not required. In other words, what is needed is the convolution of the marginal distributions rather than the joint distribution of the summands, which, in the case of independence, is the product of the marginal distributions.

The concept of sub-independence can help to provide solutions for some modeling problems in which the variable of interest is the sum of a few components. Examples include household income, the total profit of major firms in an industry, and a regression model Y = g(X) + ε where g(X) and ε are uncorrelated but may not be independent. For example, in Bazargan et al. (2007), the return value of significant wave height (Y) is modeled by the sum of a cyclic function of a random delay D, ĝ(D), and a residual term ε̂. They found that the two components are at least uncorrelated but not independent, and used sub-independence to compute the distribution of the return value.
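To make the convolution point above concrete, here is a minimal numerical sketch (not part of the original paper; it assumes NumPy, and the two pmfs are arbitrary choices for illustration). It verifies that the pmf of X + Y obtained by convolving the marginal pmfs agrees with a Monte Carlo simulation; independent draws are used in the simulation, but the point of sub-independence, developed below, is that the same convolution formula remains valid under a much weaker condition on the joint distribution.

```python
# Illustrative sketch (not from the paper): for independent summands -- and, more
# generally, for sub-independent summands -- the distribution of X + Y is the
# convolution of the marginal distributions.  Checked here for two small discrete pmfs.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical marginal pmfs on the supports {0, 1, 2} and {0, 1, 2, 3}.
support_x, pmf_x = np.array([0, 1, 2]), np.array([0.2, 0.5, 0.3])
support_y, pmf_y = np.array([0, 1, 2, 3]), np.array([0.1, 0.4, 0.4, 0.1])

# Convolution of the marginals gives the pmf of X + Y on {0, ..., 5}.
pmf_sum = np.convolve(pmf_x, pmf_y)

# Monte Carlo check with independent draws (independence implies sub-independence).
x = rng.choice(support_x, size=200_000, p=pmf_x)
y = rng.choice(support_y, size=200_000, p=pmf_y)
empirical = np.bincount(x + y, minlength=len(pmf_sum)) / len(x)

print(np.round(pmf_sum, 3))     # exact pmf of X + Y via convolution
print(np.round(empirical, 3))   # empirical pmf; agrees up to simulation error
```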
Let X and Y be two random variables (rv's) with joint and marginal cumulative distribution functions (cdf's) F_{X,Y}, F_X, and F_Y, respectively. Then X and Y are said to be independent if and only if

(1.1)   F_{X,Y}(x, y) = F_X(x) F_Y(y)   for all (x, y) ∈ ℝ²,

or, equivalently, if and only if

(1.2)   φ_{X,Y}(s, t) = φ_X(s) φ_Y(t)   for all (s, t) ∈ ℝ²,

where φ_{X,Y}(s, t), φ_X(s), and φ_Y(t) are, respectively, the corresponding joint and marginal characteristic functions (cf's). Note that (1.1) and (1.2) are also equivalent to

(1.3)   P(X ∈ A and Y ∈ B) = P(X ∈ A) P(Y ∈ B)   for all Borel sets A, B.

The concept of sub-independence, as far as we have gathered, was formally introduced by Durairajan (1979) and is stated as follows: the rv's X and Y with cdf's F_X and F_Y are sub-independent (s.i.) if the cdf of X + Y is given by

(1.4)   F_{X+Y}(z) = (F_X * F_Y)(z) = ∫_ℝ F_X(z − y) dF_Y(y),   z ∈ ℝ,

or, equivalently, if and only if

(1.5)   φ_{X+Y}(t) = φ_{X,Y}(t, t) = φ_X(t) φ_Y(t)   for all t ∈ ℝ.

The drawback of the concept of sub-independence in comparison with that of independence has been that the former does not have an equivalent definition in the sense of (1.3), which some believe to be the natural definition of independence. We believe we have now found such a definition, which is stated below. We give two separate definitions, one for the discrete case (Definition 1.1) and the other for the continuous case (Definition 1.2).

Let (X, Y): Ω → ℝ² be a discrete random vector with range ℜ(X, Y) = {(x_i, y_j): i, j = 1, 2, …} (finite or countably infinite). Consider the events

A_i = {ω ∈ Ω: X(ω) = x_i},   B_j = {ω ∈ Ω: Y(ω) = y_j},

and

A^z = {ω ∈ Ω: X(ω) + Y(ω) = z},   z ∈ ℜ(X + Y).

Definition 1.1
The discrete rv's X and Y are s.i. if, for every z ∈ ℜ(X + Y),

(1.6)   P(A^z) = Σ_{i,j: x_i + y_j = z} P(A_i) P(B_j).

To see that (1.6) is equivalent to (1.5), suppose X and Y are s.i. via (1.5); then

Σ_i Σ_j e^{it(x_i + y_j)} f(x_i, y_j) = Σ_i Σ_j e^{it(x_i + y_j)} f_X(x_i) f_Y(y_j)   for all t ∈ ℝ,

where f, f_X, and f_Y are the probability functions of (X, Y), X, and Y, respectively. Let z ∈ ℜ(X + Y); grouping the terms on each side according to the value of x_i + y_j and using the uniqueness of the coefficients of e^{itz} (uniqueness theorem for cf's), we obtain

e^{itz} Σ_{i,j: x_i + y_j = z} f(x_i, y_j) = e^{itz} Σ_{i,j: x_i + y_j = z} f_X(x_i) f_Y(y_j),

which implies (1.6). For the converse, assume (1.6) holds and reverse the last two steps above to arrive at (1.5).

For the continuous case, we observe that the half-plane H = {(x, y): x + y < 0} can be written as a countable disjoint union of rectangles,

H = ⋃_{i=1}^∞ E_i × F_i,

where E_i and F_i are intervals. Now let (X, Y): Ω → ℝ² be a continuous random vector and, for c ∈ ℝ, let

A_c = {ω ∈ Ω: X(ω) + Y(ω) < c}

and

A_i^(c) = {ω ∈ Ω: X(ω) − c/2 ∈ E_i},   B_i^(c) = {ω ∈ Ω: Y(ω) − c/2 ∈ F_i}.

Definition 1.2
The continuous rv's X and Y are s.i. if, for every c ∈ ℝ,

(1.7)   P(A_c) = Σ_{i=1}^∞ P(A_i^(c)) P(B_i^(c)).

To see that (1.7) is equivalent to (1.4), observe that the left-hand side of (1.7) is

(1.8)   P(A_c) = P(X + Y < c) = P((X, Y) ∈ H_c),

where H_c = {(x, y): x + y < c}. Now, if X and Y are s.i. in the sense of (1.4), then P(A_c) = (P_X × P_Y)(H_c), where P_X and P_Y are the probability measures on ℝ defined by P_X(B) = P(X ∈ B) and P_Y(B) = P(Y ∈ B), and P_X × P_Y is the product measure. We also observe that the right-hand side of (1.7) is

(1.9)   Σ_{i=1}^∞ P(A_i^(c)) P(B_i^(c)) = Σ_{i=1}^∞ P(X − c/2 ∈ E_i) P(Y − c/2 ∈ F_i)
                                        = Σ_{i=1}^∞ P(X ∈ E_i + c/2) P(Y ∈ F_i + c/2)
                                        = Σ_{i=1}^∞ (P_X × P_Y)((E_i + c/2) × (F_i + c/2)).

Now, (1.8) and (1.9) are equal if H_c = ⋃_{i=1}^∞ (E_i + c/2) × (F_i + c/2), which is true since the points of H_c are obtained by shifting each point of H to the right by c/2 units and then up by c/2 units.
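As a concrete check of Definition 1.1, the following Python sketch (assuming NumPy; the particular 3 × 3 joint pmf is our own illustrative construction, not an example from the paper) exhibits a pair of rv's on {0, 1, 2} that satisfy (1.6), and hence the equivalent cf condition (1.5), while failing to be independent.

```python
# Illustrative sketch (this particular joint pmf is our own construction, not an
# example from the paper): discrete rv's X and Y on {0, 1, 2} that are
# sub-independent in the sense of (1.6) but not independent.
import numpy as np

# Joint pmf q[i, j] = P(X = i, Y = j); both marginals are uniform on {0, 1, 2}.
q = np.array([[1/9,  1/18, 1/6 ],
              [1/6,  1/9,  1/18],
              [1/18, 1/6,  1/9 ]])
p_x = q.sum(axis=1)   # marginal pmf of X
p_y = q.sum(axis=0)   # marginal pmf of Y

# Condition (1.6): for every z, P(X + Y = z) equals the convolution of the marginals.
for z in range(5):
    lhs = sum(q[i, j] for i in range(3) for j in range(3) if i + j == z)
    rhs = sum(p_x[i] * p_y[j] for i in range(3) for j in range(3) if i + j == z)
    assert np.isclose(lhs, rhs), (z, lhs, rhs)

# Equivalent cf condition (1.5): phi_{X+Y}(t) = phi_X(t) * phi_Y(t) for all t.
for t in np.linspace(-5, 5, 101):
    phi_joint = sum(q[i, j] * np.exp(1j * t * (i + j)) for i in range(3) for j in range(3))
    phi_x = sum(p_x[i] * np.exp(1j * t * i) for i in range(3))
    phi_y = sum(p_y[j] * np.exp(1j * t * j) for j in range(3))
    assert np.isclose(phi_joint, phi_x * phi_y)

# X and Y are nevertheless not independent: q is not the product of its marginals.
print(np.allclose(q, np.outer(p_x, p_y)))   # False
```

The joint pmf is obtained from the independent uniform joint (every cell 1/9) by adding a perturbation whose row sums, column sums, and anti-diagonal sums all vanish; the marginals and the distribution of X + Y are therefore unchanged, while the individual cells no longer factor into the product of the marginals.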
Remark 1.1
(i) Note that H can be written as a union of squares and triangles. The triangles are congruent to {(x, y): 0 ≤ y < x, 0 ≤ x < 1}, which in turn can be written as a disjoint union of squares: for example, take [0, 1/2) × [0, 1/2), then [1/2, 3/4) × [0, 1/4), and so on.

(ii) The discrete rv's X, Y, and Z are s.i. if (1.6) holds for each pair and, for every s ∈ ℜ(X + Y + Z),

(1.10)   P(A^s) = Σ_{i,j,k: x_i + y_j + z_k = s} P(A_i) P(B_j) P(C_k),

where C_k and A^s are defined analogously to A_i, B_j, and A^z above. For the n-variate case we need 2^n − n − 1 equations of the above form.

(iii) The representation (1.7) can be extended to the multivariate case as well.

(iv) For the sake of simplicity of the computations, we use (1.5) and its extension to the multivariate case as our definition of sub-independence throughout this work.

We may on some occasions have asked ourselves whether there is a concept between "uncorrelatedness" and "independence" of two random variables. It seems that sub-independence is such a concept: it is much stronger than uncorrelatedness and much weaker than independence. The notion of sub-independence is important in the sense that, under the usual assumptions, Khintchine's Law of Large Numbers and the Lindeberg–Lévy Central Limit Theorem, as well as other important theorems in probability and statistics, hold for a sequence of s.i. random variables. While sub-independence can be substituted for independence in many cases, it is difficult (in general) to find conditions under which the former implies the latter. Even in the case of two discrete identically distributed rv's X and Y, the joint distribution can assume many forms consistent with sub-independence. In order for two random variables X and Y to be s.i., the probabilities p_i = P(X = x_i), i = 1, 2, …, n, and q_ij = P(X = x_i, Y = x_j), i, j = 1, 2, …