On products of Gaussian random variables
Željka Stojanac∗1, Daniel Suess†1, and Martin Kliesch‡2
1Institute for Theoretical Physics, University of Cologne, Germany
2Institute of Theoretical Physics and Astrophysics, University of Gdańsk, Poland
May 29, 2018
Sums of independent random variables form the basis of many fundamental theorems in probability theory and statistics, and therefore, are well understood. The related problem of characterizing products of independent random variables seems to be much more challenging. In this work, we investigate such products of normal random variables, products of their absolute values, and products of their squares. We compute power-log series expansions of their cumulative distribution function (CDF) based on the theory of Fox H-functions. Numerically we show that for small arguments the CDFs are well approximated by the lowest orders of this expansion. For the two non-negative random variables, we also compute the moment generating functions in terms of Meijer G-functions, and consequently, obtain a Chernoff bound for sums of such random variables.

Keywords: Gaussian random variable, product distribution, Meijer G-function, Chernoff bound, moment generating function

AMS subject classifications: 60E99, 33C60, 62E15, 62E17
1. Introduction and motivation
Compared to sums of independent random variables, our understanding of products is much less comprehensive. Nevertheless, products of independent random variables arise naturally in many applications including channel modeling [1,2], wireless relaying systems [3], quantum physics (product measurements of product states), as well as signal processing. Here, we are particularly motivated by a tensor sensing problem (see Ref. [4] for the basic idea). In this problem we consider n_1 × n_2 × ⋯ × n_d tensors T ∈ R^{n_1 × n_2 × ⋯ × n_d} and wish to recover them from measurements of the form y_i := ⟨A_i, T⟩ with the sensing tensors also being of rank one, A_i = a_i^1 ⊗ a_i^2 ⊗ ⋯ ⊗ a_i^d with a_i^j ∈ R^{n_j}. Additionally, we assume that the entries of the a_i^j are iid Gaussian random variables. Applying such maps to properly normalized rank-one tensors results in a product of d Gaussian random variables.

Products of independent random variables have already been studied for more than 50 years [5] but are still subject of ongoing research [6–9]. In particular, it was shown that the probability density function of a product of certain independent and identically distributed (iid) random variables from the exponential family can be written in terms of Meijer G-functions [10]. In this work, we characterize cumulative distribution functions (CDFs) and moment generating functions arising from products of iid normal random variables, products of their squares, and products of their absolute values. We provide power-log series expansions of the CDFs of these distributions and demonstrate numerically that low-order truncations of these series provide tight approximations. Moreover, we express the moment generating functions of the two latter distributions in terms of Meijer G-functions. We state the corresponding Chernoff bounds, which bound the CDFs of sums of such products. Finally, we find simplified estimates of the Chernoff bounds and illustrate them numerically.

∗E-mail: [email protected]  †E-mail: [email protected]  ‡E-mail: [email protected]

[Figure 1: three panels ("Comparison in Prop. 1.1 of true bound and C_θ" for N = 3, N = 6, and N = 13), each plotting probability against t; the legends distinguish "true bound", "upper bound", and "const = 1".]

Figure 1: Comparison of the CDF P(\prod_{i=1}^N g_i^2 \le t) (blue lines) with the bound stated in Proposition 1 (red lines) for different values of N. Here, we chose θ for each value t numerically from all possible values between 10^{-5} and 0.5 − 10^{-5} with step size 10^{-5} such that the right-hand side in (1) is minimal.
1.1. Simple bound

We first state a simple but quite loose bound on the CDF of the product of iid squared standard Gaussian random variables.
Proposition 1. Let \{g_i\}_{i=1}^N be a set of iid standard Gaussian random variables (i.e., g_i ∼ N(0, 1)). Then

  P\Big[\prod_{i=1}^N g_i^2 \le t\Big] \le \inf_{\theta \in (0,1/2)} C_\theta^N \, t^\theta,   (1)

where

  C_\theta = \frac{\Gamma(-\theta + \frac{1}{2})}{\sqrt{\pi}\, 2^\theta} > 0.
Proof. For θ < 1/2 the moment generating function^1 of log(g_i^2) is given by

  E\, e^{-\theta \log g_i^2} = E\, |g_i|^{-2\theta} = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} |x|^{-2\theta} e^{-x^2/2}\, dx = \frac{2^{-\theta}}{\sqrt{\pi}} \int_0^\infty z^{-\theta - \frac{1}{2}} e^{-z}\, dz = \frac{2^{-\theta}\, \Gamma(-\theta + \frac{1}{2})}{\sqrt{\pi}},   (2)

where we have used the substitution z = x^2/2. Note that for θ ≥ 1/2 the integral diverges. Now, the proposition is a simple consequence of the Chernoff bound:

  P\Big(\prod_{i=1}^N g_i^2 \le t\Big) = P\Big(\sum_{i=1}^N \log(g_i^2) \le \log t\Big) \le \inf_{\theta > 0} e^{\theta \log t} \Big(E\, e^{-\theta \log g_i^2}\Big)^N = \pi^{-N/2} \inf_{\theta \in (0,1/2)} \frac{t^\theta}{2^{N\theta}}\, \Gamma(-\theta + \tfrac{1}{2})^N.   (3)
^1 Throughout the article, log(x) denotes the natural logarithm of x.
Figure 1 shows that this bound is indeed quite loose. This holds in particular for values of t that are very small. However, since Proposition 1 is based on the Chernoff inequality – a tail bound on sums of random variables – one cannot expect good results for such intermediate values. As a matter of fact, the upper bound in (1) becomes slightly larger than the trivial bound P(Y ≤ t) ≤ 1 already for quite small values of t. Deriving good approximations for such values of t and any N is the focus of the next section.
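The looseness of the bound can be reproduced with a short numerical sketch, a minimal Monte Carlo check of Proposition 1 (the θ-grid and the sample size below are our own illustrative choices, not taken from the paper):

```python
import math
import random

def chernoff_bound(t, N, grid=1000):
    # inf over theta in (0, 1/2) of C_theta^N * t^theta with
    # C_theta = Gamma(1/2 - theta) / (sqrt(pi) * 2^theta), cf. Eq. (1)
    best = float("inf")
    for i in range(1, grid):
        theta = 0.5 * i / grid
        C = math.gamma(0.5 - theta) / (math.sqrt(math.pi) * 2**theta)
        best = min(best, C**N * t**theta)
    return best

def empirical_cdf(t, N, samples=100_000, seed=0):
    # Monte Carlo estimate of P(prod_i g_i^2 <= t)
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        prod = 1.0
        for _ in range(N):
            prod *= rng.gauss(0.0, 1.0) ** 2
        hits += prod <= t
    return hits / samples

for t in (0.005, 0.01, 0.05):
    assert empirical_cdf(t, N=3) <= chernoff_bound(t, N=3)
```

For small t the optimized bound stays close to 1 while the empirical CDF is markedly smaller, which matches the behavior visible in Figure 1.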
2. Power-log series expansion for cumulative distribution functions
Throughout the article, we use the following notation. Let N ∈ ℕ and denote by \{g_i\}_{i=1}^N a set of iid standard Gaussian random variables (i.e., g_i ∼ N(0, 1) for all i ∈ [N]). We denote the random variables considered in this work by

  X := \prod_{i=1}^N g_i, \qquad Y := \prod_{i=1}^N g_i^2, \qquad Z := \prod_{i=1}^N |g_i|.   (4)

The probability density function of X can be written in terms of Meijer G-functions [10]. We provide a similar representation for the CDFs of X, Y, and Z as well as the corresponding power-log series expansion using the theory developed for the more general Fox H-functions [11]. It is worth noting that series of Meijer G-functions have already been extensively studied. For example, in [12] expansions in series of Meijer G-functions as well as in series of Jacobi and Chebyshev polynomials have been provided. In this work, we investigate the power-log series expansion of special instances of Meijer G-functions. An advantage of this expansion is that it does not contain any special functions (except derivatives of the Gamma function at 1). For the sake of completeness, we give a brief introduction to Meijer G-functions together with some of their properties in Appendix A; see Refs. [6, 10] for more details. The following proposition is a special case of a result by Cook [13, p. 103] which relies on the Mellin transform. However, for this particular case we provide an elementary proof.
Proposition 2 (CDFs in terms of Meijer G-functions). Let \{g_i\}_{i=1}^N be a set of independent identically distributed standard Gaussian random variables (i.e., g_i ∼ N(0, 1)) and X := \prod_{i=1}^N g_i, Y := \prod_{i=1}^N g_i^2, Z := \prod_{i=1}^N |g_i| with N ∈ ℕ. Define the function G_α by

  G_\alpha(z) := 1 - \frac{1}{2^\alpha \pi^{N/2}}\, G^{0,\,N+1}_{N+1,\,1}\!\big( z \,\big|\, 1,\ 1/2,\ \dots,\ 1/2 \,;\, 0 \big).   (5)

Then, for any t > 0,

  P(X \le t) = P(X \ge -t) = G_1(2^N / t^2),   (6)
  P(Y \le t) = G_0(2^N / t),   (7)
  P(Z \le t) = G_0(2^N / t^2).   (8)

In Ref. [4] we have provided a proof of the above proposition by first deriving a result for the random variable X. The results for the random variables Y and Z then follow trivially. For completeness, we derive a proof for the random variable Y in Appendix B (see Lemma 16). The statements for X and Z are simple consequences of this result, since for t > 0,

  P(Z \le t) = P(Z^2 \le t^2) = P(Y \le t^2),   (9)
  P(X \le t) = \frac{1}{2}\big(2\, P(X \le t)\big) = \frac{1}{2}\big(P(X \le t) + 1 - P(X \le -t)\big) = \frac{1}{2}\big(P(-t \le X \le t) + 1\big) = \frac{1}{2}\big(P(X^2 \le t^2) + 1\big) = \frac{1}{2}\big(P(Y \le t^2) + 1\big).   (10)
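Proposition 2 can be checked numerically. The sketch below assumes the third-party mpmath package is available; it first rewrites the Meijer G-function via the inversion property (90), G^{0,N+1}_{N+1,1}(z | a ; b) = G^{N+1,0}_{1,N+1}(1/z | 1−b ; 1−a), so that mpmath's meijerg evaluates it as a convergent series in t/2^N. For N = 1 the result must reduce to the chi-squared CDF erf(√(t/2)):

```python
import math
from mpmath import mp, meijerg, pi, mpf

mp.dps = 25

def cdf_Y(t, N):
    # P(Y <= t) = 1 - pi^(-N/2) G^{0,N+1}_{N+1,1}(2^N/t | 1, 1/2, ..., 1/2 ; 0),
    # rewritten via the inversion property as G^{N+1,0}_{1,N+1}(t/2^N | 1 ; 0, 1/2, ..., 1/2)
    z = mpf(t) / 2**N
    G = meijerg([[], [1]], [[0] + [mpf(1) / 2] * N, []], z)
    return 1 - G / pi ** (mpf(N) / 2)

# N = 1: P(g^2 <= t) = erf(sqrt(t/2))
for t in (0.1, 0.5, 2.0):
    assert abs(float(cdf_Y(t, 1)) - math.erf(math.sqrt(t / 2))) < 1e-12
```

For N ≥ 2 the lower parameters coincide (the logarithmic case), where mpmath may need slightly perturbed parameters; the power-log expansions of this section avoid that issue altogether.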
Now we are ready to present the main result of this section. The proof is based on the theory of power-log series of Fox H-functions [11, Theorem 1.5].
Theorem 3 (CDFs as power-log series expansion). Let \{g_i\}_{i=1}^N be a set of independent identically distributed standard Gaussian random variables (i.e., g_i ∼ N(0, 1)) and X := \prod_{i=1}^N g_i, Y := \prod_{i=1}^N g_i^2, Z := \prod_{i=1}^N |g_i| with N ∈ ℕ. Define the function f_{ν,ξ} by

  f_{\nu,\xi}(u) := \nu + \frac{1}{2^\xi \pi^{N/2}} \sum_{k=0}^{\infty} u^{-1/2-k} \sum_{j=0}^{N-1} H_{kj}\, [\log u]^j   (11)

with

  H_{kj} := \frac{(-1)^{Nk}}{j!} \sum_{n=j}^{N-1} \Big(\frac{1}{2} + k\Big)^{-(n-j+1)} \sum_{j_1+\dots+j_N = N-1-n} \; \prod_{t=1}^{N} \Big\{ \sum_{\ell_1+\dots+\ell_{k+1} = j_t} \frac{\Gamma^{(\ell_{k+1})}(1)}{\ell_{k+1}!} \prod_{i=1}^{k-1} (k-i+1)^{-(\ell_i+1)} \Big\}   (12)

and with j_i ∈ ℕ_0 and ℓ_i ∈ ℕ_0. Then, for any t > 0,

  P(X \le t) = P(X \ge -t) = f_{1/2,\,1}(2^N / t^2),   (13)
  P(Y \le t) = f_{0,0}(2^N / t),   (14)
  P(Z \le t) = f_{0,0}(2^N / t^2).   (15)

In Figure 2 we compare the CDFs of X, Y, and Z with the approximations obtained from truncating the power-log series (11) at low orders: taking only the leading term into account (i.e., k = 0), we already obtain a good approximation to the CDF for relatively small values of t. Furthermore, the error decreases with an increasing number of factors N. Truncating the power-log series at the next highest order k = 1 yields an error that we can only resolve in the log-error plot as shown in the insets of Figure 2. Finding explicit error bounds is still an open problem. The main difficulty seems to be that the series (12) contains large terms with alternating signs given by sign Γ^{(ℓ)}(1) = (−1)^ℓ that cancel out, so that the CDFs give indeed a value in [0, 1]. To prove the above result, it is enough to obtain the power-log series for the Meijer G-function from (5),

  G^{0,\,N+1}_{N+1,\,1}\!\big( z \,\big|\, 1,\ 1/2,\ \dots,\ 1/2 \,;\, 0 \big),   (16)
keeping in mind that in the case of the random variables X and Z we have z = 2^N/t^2, and in the case of the random variable Y we have z = 2^N/t with t > 0. Then Theorem 3 is a direct consequence of Proposition 2 and Theorem 4.

Theorem 4 (Power-log expansion of the relevant Meijer G-function). For z > 0,

  G^{0,\,N+1}_{N+1,\,1}\!\big( z \,\big|\, 1,\ 1/2,\ \dots,\ 1/2 \,;\, 0 \big) = \pi^{N/2} - \sum_{k=0}^{\infty} z^{-1/2-k} \sum_{j=0}^{N-1} H_{kj}\, [\log z]^j   (17)

with H_{kj} defined in (12).

Proof. By [11, Theorem 1.2 (iii)], the Meijer G-function (16) is an analytic function of z in the sector |\arg z| < N\pi/2. This condition is satisfied for N ≥ 3 since \arg z = 0 for z ∈ ℝ_0^+ and \arg z = π for z ∈ ℝ_-. Therefore, we have

  G^{0,\,N+1}_{N+1,\,1}\!\big( z \,\big|\, 1,\ 1/2,\ \dots,\ 1/2 \,;\, 0 \big) = - \sum_{i=1}^{N+1} \sum_{k=0}^{\infty} \operatorname*{Res}_{s = a_{ik}} \Big[ \mathcal{H}^{0,\,N+1}_{N+1,\,1}(s)\, z^{-s} \Big],   (18)
[Figure 2: three rows of three panels each (N = 3, N = 6, N = 13), plotting against t the exact CDFs of (a) X = \prod_{i=1}^N g_i, (b) Y = \prod_{i=1}^N g_i^2, and (c) Z = \prod_{i=1}^N |g_i|, together with the truncated power-log approximations for k = 0 and k = 1; insets show the approximation errors on a logarithmic scale.]
Figure 2: The plots of the CDFs (red lines) of the random variables X, Y, and Z with their power-log series (11) truncated at k = 0 (green lines) and k = 1 (blue lines), where \{g_i\}_{i=1}^N is a set of iid standard Gaussian random variables. Each inset shows the approximation error of the truncations on a logarithmic scale for k = 0 (green line) and k = 1 (blue line).
where \mathcal{H}^{0,\,N+1}_{N+1,\,1} is given by (85) in the Supplemental Materials with m = 0, n = p = N + 1, q = 1, a_1 = 1, a_2 = … = a_{N+1} = 1/2, and b_1 = 0, i.e.,

  \mathcal{H}^{0,\,N+1}_{N+1,\,1}(s) = \frac{\prod_{i=1}^{N+1} \Gamma(1 - a_i - s)}{\Gamma(1 - b_1 - s)},   (19)

and a_{ik} = 1 - a_i + k with k ∈ ℕ_0 denote the poles of Γ(1 - a_i - s). In our scenario we have simple poles a_{1k} = k and poles a_{2k} = 1/2 + k of order N with k ∈ ℕ_0. Therefore,

  G^{0,\,N+1}_{N+1,\,1}\!\big( z \,\big|\, 1,\ 1/2,\ \dots,\ 1/2 \,;\, 0 \big) = - \sum_{k=0}^{\infty} \operatorname*{Res}_{s = a_{1k}} \Big[ \mathcal{H}^{0,\,N+1}_{N+1,\,1}(s)\, z^{-s} \Big] - \sum_{k=0}^{\infty} \operatorname*{Res}_{s = a_{2k}} \Big[ \mathcal{H}^{0,\,N+1}_{N+1,\,1}(s)\, z^{-s} \Big]   (20)

with

  \mathcal{H}^{0,\,N+1}_{N+1,\,1}(s) = \frac{\prod_{i=1}^{N+1} \Gamma(1 - a_i - s)}{\Gamma(1 - b_1 - s)} = \frac{\Gamma(1 - 1 - s)\, \Gamma^N(1 - 1/2 - s)}{\Gamma(1 - 0 - s)} = \frac{\Gamma(-s)\, \Gamma^N(1/2 - s)}{\Gamma(1 - s)} = \frac{\Gamma(-s)\, \Gamma^N(1/2 - s)}{-s\, \Gamma(-s)} = - \frac{\Gamma^N(1/2 - s)}{s},   (21)
where we have used that Γ(1 − s) = −s · Γ(−s). For the simple poles a_{1k} we have [11, (1.4.10)]^2

  - \operatorname*{Res}_{s = a_{1k}} \Big[ \mathcal{H}^{0,\,N+1}_{N+1,\,1}(s)\, z^{-s} \Big] = h_{1k}\, z^{-a_{1k}},   (22)

where the constants h_{1k} are

  h_{1k} = \lim_{s \to a_{1k}} \Big[ -(s - a_{1k})\, \mathcal{H}^{0,\,N+1}_{N+1,\,1}(s) \Big].   (23)

For k ∈ ℕ, using (21) we have

  h_{1k} = \lim_{s \to k} \Big[ -(s - k)\, \mathcal{H}^{0,\,N+1}_{N+1,\,1}(s) \Big] = \lim_{s \to k} (s - k)\, \frac{\Gamma^N(1/2 - s)}{s} = \lim_{s \to k} \Big( 1 - \frac{k}{s} \Big)\, \Gamma^N(1/2 - s) = 0.   (24)

For k = 0 we have

  h_{10} = \lim_{s \to 0} \Big[ -s\, \mathcal{H}^{0,\,N+1}_{N+1,\,1}(s) \Big] = \lim_{s \to 0} s\, \frac{\Gamma^N(1/2 - s)}{s} = \lim_{s \to 0} \Gamma^N(1/2 - s) = \pi^{N/2}.   (25)

Combining (24), (25), and (22) with (20) we obtain

  G^{0,\,N+1}_{N+1,\,1}\!\big( z \,\big|\, 1,\ 1/2,\ \dots,\ 1/2 \,;\, 0 \big) = \pi^{N/2} - \sum_{k=0}^{\infty} \operatorname*{Res}_{s = a_{2k}} \Big[ \mathcal{H}^{0,\,N+1}_{N+1,\,1}(s)\, z^{-s} \Big],   (26)

where a_{2k} = 1/2 + k for all k ∈ ℕ_0. Having (20) in mind, the result now follows from the following lemma.
^2 Note that there is a minus sign missing in [11, (1.4.10)] due to a typo.
Lemma 5 (Residues of Γ^N). For k ∈ ℕ_0 and a_{2k} = 1/2 + k,

  \operatorname*{Res}_{s = a_{2k}} \Big[ \mathcal{H}^{0,\,N+1}_{N+1,\,1}(s)\, z^{-s} \Big] = z^{-1/2-k} \sum_{j=0}^{N-1} H_{kj}\, [\log z]^j,   (27)

where the H_{kj} are defined in (12).

In the following, we denote the n-th derivative of a function f by f^{(n)} or [f]^{(n)}. Furthermore, the n-th derivative of a product of functions f and g is denoted by [f · g]^{(n)}.
Proof. Since a_{2k} = 1/2 + k is an N-th order pole of Γ(1/2 − s) for all k ∈ ℕ_0, also the integrand \mathcal{H}^{0,N+1}_{N+1,1}(s) = -s^{-1}\, \Gamma^N(1/2 - s) has a pole a_{2k} of order N for all k ∈ ℕ_0. Thus

  \operatorname*{Res}_{s = 1/2+k} \Big[ \mathcal{H}^{0,N+1}_{N+1,1}(s)\, z^{-s} \Big]
  = \frac{1}{(N-1)!} \lim_{s \to 1/2+k} \Big[ (s - 1/2 - k)^N\, \mathcal{H}^{0,N+1}_{N+1,1}(s)\, z^{-s} \Big]^{(N-1)}
  = \frac{1}{(N-1)!} \lim_{s \to 1/2+k} \Big[ (s - 1/2 - k)^N \cdot \big(-s^{-1}\, \Gamma^N(1/2 - s)\big)\, z^{-s} \Big]^{(N-1)}
  = \frac{1}{(N-1)!} \lim_{s \to 1/2+k} \Big[ \big((s - 1/2 - k)\, \Gamma(1/2 - s)\big)^N \cdot (-s^{-1})\, z^{-s} \Big]^{(N-1)}
  = \frac{1}{(N-1)!} \lim_{s \to 1/2+k} \Big[ \mathcal{H}_1(s)\, \mathcal{H}_2(s)\, z^{-s} \Big]^{(N-1)},   (28)

where

  \mathcal{H}_1(s) := \big((s - 1/2 - k)\, \Gamma(1/2 - s)\big)^N,   (29)
  \mathcal{H}_2(s) := -s^{-1}.   (30)

Using the Leibniz rule and \frac{\partial^i}{\partial s^i}[z^{-s}] = (-1)^i z^{-s} [\log z]^i we can expand the expression in the limit in (28), i.e.,

  \big[\mathcal{H}_1(s)\, \mathcal{H}_2(s)\, z^{-s}\big]^{(N-1)} = \sum_{n=0}^{N-1} \binom{N-1}{n} [\mathcal{H}_1(s)]^{(N-1-n)}\, \big[\mathcal{H}_2(s)\, z^{-s}\big]^{(n)}
  = \sum_{n=0}^{N-1} \binom{N-1}{n} [\mathcal{H}_1(s)]^{(N-1-n)} \sum_{j=0}^{n} \binom{n}{j} (-1)^j [\mathcal{H}_2(s)]^{(n-j)}\, z^{-s}\, [\log z]^j
  = z^{-s} \sum_{j=0}^{N-1} \sum_{n=j}^{N-1} \binom{N-1}{n} \binom{n}{j} (-1)^j [\mathcal{H}_1(s)]^{(N-1-n)}\, [\mathcal{H}_2(s)]^{(n-j)}\, [\log z]^j.   (31)
Plugging (31) into (28) we obtain

  \operatorname*{Res}_{s=1/2+k} \Big[ \mathcal{H}^{0,N+1}_{N+1,1}(s)\, z^{-s} \Big]
  = \frac{1}{(N-1)!} \lim_{s \to 1/2+k} z^{-s} \sum_{j=0}^{N-1} \sum_{n=j}^{N-1} \binom{N-1}{n} \binom{n}{j} (-1)^j [\mathcal{H}_1(s)]^{(N-1-n)}\, [\mathcal{H}_2(s)]^{(n-j)}\, [\log z]^j
  = z^{-1/2-k} \sum_{j=0}^{N-1} \frac{(-1)^j}{(N-1)!} \sum_{n=j}^{N-1} \binom{N-1}{n} \binom{n}{j} \lim_{s \to 1/2+k}[\mathcal{H}_1(s)]^{(N-1-n)} \lim_{s \to 1/2+k}[\mathcal{H}_2(s)]^{(n-j)}\, [\log z]^j
  = z^{-1/2-k} \sum_{j=0}^{N-1} H_{kj}\, [\log z]^j,   (32)

where

  H_{kj} := \frac{(-1)^j}{(N-1)!} \sum_{n=j}^{N-1} \binom{N-1}{n} \binom{n}{j} \lim_{s \to 1/2+k}[\mathcal{H}_1(s)]^{(N-1-n)} \lim_{s \to 1/2+k}[\mathcal{H}_2(s)]^{(n-j)}.   (33)

It is easy to see (proof by induction) that for \mathcal{H}_2(s) = -s^{-1} and ℓ ∈ ℕ_0 it holds that

  \mathcal{H}_2^{(\ell)}(s) = (-1)^{\ell+1}\, \ell!\, s^{-\ell-1}   (34)

and

  \lim_{s \to 1/2+k} [\mathcal{H}_2(s)]^{(\ell)} = (-1)^{\ell+1}\, \ell!\, (1/2 + k)^{-\ell-1} \quad \text{for all } k ∈ ℕ_0.   (35)

Plugging this into (33) and simplifying leads to

  H_{kj} = \sum_{n=j}^{N-1} (-1)^{n+1} (1/2 + k)^{-(n-j+1)} \frac{1}{j!\,(N-1-n)!} \lim_{s \to 1/2+k}[\mathcal{H}_1(s)]^{(N-1-n)}.   (36)

Comparing the above expression with (12), it is enough to show that

  \lim_{s \to 1/2+k}[\mathcal{H}_1(s)]^{(N-1-n)} = (-1)^{Nk-n-1}\, (N-1-n)! \sum_{j_1+\dots+j_N = N-1-n} \prod_{t=1}^{N} \Big\{ \sum_{\ell_1+\dots+\ell_{k+1} = j_t} \frac{\Gamma^{(\ell_{k+1})}(1)}{\ell_{k+1}!} \prod_{i=1}^{k-1} (k-i+1)^{-(\ell_i+1)} \Big\},   (37)

which is a direct consequence of the following result.
Lemma 6. For \mathcal{H}_1 defined in (29) and k ∈ ℕ_0,

  \lim_{s \to 1/2+k}[\mathcal{H}_1(s)]^{(j)} = (-1)^{N(k-1)+j}\, j! \sum_{j_1+\dots+j_N = j} \prod_{t=1}^{N} \Big\{ \sum_{\ell_1+\dots+\ell_{k+1} = j_t} \frac{\Gamma^{(\ell_{k+1})}(1)}{\ell_{k+1}!} \prod_{i=1}^{k-1} (k-i+1)^{-(\ell_i+1)} \Big\}.   (38)
To prove this result we use the following lemma, whose proof can be found in Appendix B.

Lemma 7. Define f(s) := (s - 1/2 - k)\, \Gamma(1/2 - s). Then for every j ∈ ℕ and k ∈ ℕ_0 it holds that

  \lim_{s \to 1/2+k} f^{(j)}(s) = (-1)^{k+j-1}\, j! \sum_{\ell_1+\ell_2+\dots+\ell_{k+1} = j} \frac{\Gamma^{(\ell_{k+1})}(1)}{\ell_{k+1}!} \prod_{i=1}^{k} (k-i+1)^{-(\ell_i+1)}.   (39)
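Lemma 7 can be sanity-checked numerically. In the pure-Python sketch below (the helper names are ours), f is rewritten via the recurrence Γ(z) = Γ(z + k + 1)/(z(z+1)⋯(z+k)) so that the removable singularity at s = 1/2 + k disappears, and the derivatives Γ^{(ℓ)}(1) are hard-coded up to second order using Γ'(1) = −γ and Γ''(1) = γ² + π²/6:

```python
import math

EULER_GAMMA = 0.5772156649015329
# Gamma^(l)(1) for l = 0, 1, 2
GAMMA_DERIVS_AT_1 = [1.0, -EULER_GAMMA, EULER_GAMMA**2 + math.pi**2 / 6]

def f_smooth(s, k):
    # (s - 1/2 - k) * Gamma(1/2 - s) with the singular factor cancelled:
    # equals -Gamma(3/2 + k - s) / prod_{m=0}^{k-1} (1/2 + m - s)
    den = 1.0
    for m in range(k):
        den *= 0.5 + m - s
    return -math.gamma(1.5 + k - s) / den

def nth_deriv(f, x, n, h=1e-4):
    # central finite differences, for n = 0, 1, 2 only
    if n == 0:
        return f(x)
    if n == 1:
        return (f(x + h) - f(x - h)) / (2 * h)
    return (f(x + h) - 2 * f(x) + f(x - h)) / h**2

def compositions(total, parts):
    # nonnegative integer tuples of given length summing to total
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest

def lemma7_rhs(j, k):
    # right-hand side of Eq. (39)
    total = 0.0
    for ells in compositions(j, k + 1):
        term = GAMMA_DERIVS_AT_1[ells[-1]] / math.factorial(ells[-1])
        for i, ell in enumerate(ells[:-1], start=1):
            term *= (k - i + 1) ** (-(ell + 1))
        total += term
    return (-1) ** (k + j - 1) * math.factorial(j) * total

for k in (0, 1, 2):
    for j in (1, 2):
        lhs = nth_deriv(lambda s: f_smooth(s, k), 0.5 + k, j)
        assert abs(lhs - lemma7_rhs(j, k)) < 1e-4
```

For instance, for k = 1, j = 1 both sides evaluate to γ − 1, in line with the alternating-sign structure mentioned after Theorem 3.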
Proof of Lemma 6. As by definition \mathcal{H}_1 = f^N, we have

  \lim_{s \to 1/2+k}[\mathcal{H}_1(s)]^{(j)} = \lim_{s \to 1/2+k}\big[f^N(s)\big]^{(j)},   (40)

with f defined in Lemma 7. Using Leibniz' rule leads to

  \lim_{s \to 1/2+k}[\mathcal{H}_1(s)]^{(j)} = \sum_{j_1+j_2+\dots+j_N = j} \binom{j}{j_1, j_2, \dots, j_N} \prod_{1 \le t \le N} \lim_{s \to 1/2+k} f^{(j_t)}(s).   (41)

Applying Lemma 7 we obtain

  \lim_{s \to 1/2+k}[\mathcal{H}_1(s)]^{(j)} = \sum_{j_1+\dots+j_N = j} \binom{j}{j_1, j_2, \dots, j_N} \prod_{1 \le t \le N} \Big[ (-1)^{k+j_t-1}\, j_t! \sum_{\ell_1+\dots+\ell_{k+1} = j_t} \frac{\Gamma^{(\ell_{k+1})}(1)}{\ell_{k+1}!} \prod_{i=1}^{k} (k-i+1)^{-(\ell_i+1)} \Big]
  = \sum_{j_1+\dots+j_N = j} (-1)^{\sum_{t=1}^N (k+j_t-1)}\, \frac{j!}{j_1!\, j_2! \cdots j_N!}\, j_1!\, j_2! \cdots j_N! \prod_{1 \le t \le N} \Big\{ \sum_{\ell_1+\dots+\ell_{k+1} = j_t} \frac{\Gamma^{(\ell_{k+1})}(1)}{\ell_{k+1}!} \prod_{i=1}^{k} (k-i+1)^{-(\ell_i+1)} \Big\},   (42)

which coincides with the lemma statement (38).
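As a concrete consistency check of Theorem 3, the leading (k = 0) term of the series can be evaluated directly: for k = 0 the coefficient (12) simplifies to H_{0j} = (1/j!) Σ_{n=j}^{N-1} 2^{n-j+1} Σ_{j_1+…+j_N=N-1-n} Π_t Γ^{(j_t)}(1)/j_t!. The pure-Python sketch below hard-codes the derivatives of Γ at 1 up to second order (so it covers N ≤ 3) and reproduces the known small-t behavior P(g² ≤ t) = erf(√(t/2)) ≈ √(2t/π) for N = 1:

```python
import math

EULER_GAMMA = 0.5772156649015329
# Gamma^(j)(1) for j = 0, 1, 2 (enough for N <= 3)
GAMMA_DERIVS_AT_1 = [1.0, -EULER_GAMMA, EULER_GAMMA**2 + math.pi**2 / 6]

def compositions(total, parts):
    # nonnegative integer tuples of given length summing to total
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest

def H0(N, j):
    # H_{0j} from Eq. (12) specialized to k = 0
    s = 0.0
    for n in range(j, N):
        inner = 0.0
        for comp in compositions(N - 1 - n, N):
            term = 1.0
            for jt in comp:
                term *= GAMMA_DERIVS_AT_1[jt] / math.factorial(jt)
            inner += term
        s += 2 ** (n - j + 1) * inner
    return s / math.factorial(j)

def cdf_Y_leading(t, N):
    # k = 0 truncation of Eq. (14): P(Y <= t) ~ f_{0,0}(2^N / t)
    u = 2**N / t
    return u**-0.5 / math.pi ** (N / 2) * sum(
        H0(N, j) * math.log(u) ** j for j in range(N))

# N = 1: the truncation is exactly sqrt(2 t / pi), the leading term of erf(sqrt(t/2))
for t in (0.01, 0.1):
    assert abs(cdf_Y_leading(t, 1) - math.sqrt(2 * t / math.pi)) < 1e-12

# N = 3, small t: a plausible probability, to be compared against Monte Carlo
print(cdf_Y_leading(0.01, 3))
```

Larger N only requires extending the table of Γ-derivatives at 1 (e.g., via mpmath); this is an illustrative sketch, not the authors' implementation.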
3. Moment generating functions and Chernoff bounds
There exists no moment generating function (MGF) for the product X of N iid standard Gaussian random variables. However, the moments clearly exist and are given by

  E[X^k] = E\Big[\prod_{i=1}^N g_i^k\Big] = \prod_{i=1}^N E[g_i^k] = E[g_1^k]^N = \begin{cases} [(k-1)!!]^N & \text{for even } k, \\ 0 & \text{for odd } k, \end{cases}

with n!! := n(n-2)(n-4)\cdots denoting the double factorial. Thus, the moments of Y = X^2 and Z = |X| also exist. The following proposition additionally provides the MGF on [0, ∞) for the random variables Y and Z. The proof of this proposition follows from properties (95), (91), and (97) in the Supplemental Materials and can be found in Ref. [4].
Proposition 8 (MGF of Y and Z). Let \{g_i\}_{i=1}^N be a set of N iid standard Gaussian random variables, i.e., g_i ∼ N(0, 1). For the random variables Y := \prod_{i=1}^N g_i^2 and Z := \prod_{i=1}^N |g_i| and all t > 0,

  E\, e^{-tY} = \frac{1}{\pi^{N/2}}\, G^{N,1}_{1,N}\!\Big( \frac{1}{2^N t} \,\Big|\, 1 \,;\, 1/2,\ 1/2,\ \dots,\ 1/2 \Big),   (43)

  E\, e^{-tZ} = \frac{1}{\pi^{(N+1)/2}}\, G^{N,2}_{2,N}\!\Big( \frac{1}{2^{N-2} t^2} \,\Big|\, 1,\ 1/2 \,;\, 1/2,\ 1/2,\ \dots,\ 1/2 \Big).   (44)
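A quick numerical check of (43), again assuming the third-party mpmath package is available: for N = 1 the right-hand side must reduce to the classical χ²-MGF, E e^{-t g²} = (1 + 2t)^{-1/2}.

```python
from mpmath import mp, meijerg, pi, mpf

mp.dps = 25

def mgf_Y(t, N):
    # E e^{-tY} = pi^(-N/2) G^{N,1}_{1,N}(1/(2^N t) | 1 ; 1/2, ..., 1/2), Eq. (43)
    z = 1 / (mpf(2)**N * t)
    return meijerg([[1], []], [[mpf(1) / 2] * N, []], z) / pi ** (mpf(N) / 2)

for t in (0.1, 1.0, 5.0):
    assert abs(float(mgf_Y(t, 1)) - (1 + 2 * t) ** -0.5) < 1e-12
```

For N ≥ 2 the repeated lower parameters make the G-function logarithmic, and mpmath may then need slightly perturbed parameters.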
Remark 9. All moments of Y and Z exist, and the MGF (given by the Meijer G-function) is smooth at the origin. Hence, we indeed have

  E[Y^k] = (-1)^k \lim_{t \searrow 0} \frac{\partial^k}{\partial t^k}\, E\, e^{-tY}.   (45)

Remark 10. Knowing all the moments of the random variable Y, it seems trivial to compute the corresponding moment generating function for t < 0,
  E\, e^{tY} = \sum_{j=0}^{\infty} E\big[Y^j\big]\, \frac{t^j}{j!}.   (46)

It is well known that one can exchange the expectation and the series if the series converges absolutely; however, this is not the case here. Even more,

  \lim_{j \to \infty} \frac{t^j}{j!}\, E\big[Y^j\big] = \infty   (47)

for N ≥ 2 and any t > 0. Hence, the moment series does indeed not converge for any t ≠ 0 and N > 1. Moreover, computing the MGF of Y via the Meijer G-function allows for the following result, which could be of independent interest.
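The divergence (47) is easy to observe directly: with E[Y^j] = [(2j−1)!!]^N, the j-th term of the moment series can be tabulated (a minimal sketch; the values of t and N below are arbitrary choices):

```python
import math

def moment_term(t, j, N):
    # t^j E[Y^j] / j!  with  E[Y^j] = ((2j-1)!!)^N
    double_fact = math.prod(range(1, 2 * j, 2)) if j > 0 else 1  # (2j-1)!!
    return t**j * double_fact**N / math.factorial(j)

# For N = 2 and t = 0.1 the terms eventually blow up instead of tending to 0
terms = [moment_term(0.1, j, 2) for j in (0, 10, 20, 30)]
assert terms[0] == 1.0
assert all(b > a for a, b in zip(terms, terms[1:]))
```

The term ratio behaves like 0.1·(2j+1)²/(j+1) for N = 2, which grows linearly in j, so the series cannot converge for any t ≠ 0.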
Corollary 11. For k ∈ ℕ_0 and N ∈ ℕ (with the convention that (−1)!! = 1) it holds that

  \lim_{t \searrow 0} \frac{1}{t^k}\, G^{N,1}_{1,N}\!\Big( \frac{1}{2^N t} \,\Big|\, 1-k \,;\, 1/2,\ 1/2,\ \dots,\ 1/2 \Big) = \big[ \sqrt{\pi}\, (2k-1)!! \big]^N.   (48)

Proof. The result follows from the following two observations:

  E[Y^k] = E\Big[\prod_{i=1}^N g_i^{2k}\Big] = \prod_{i=1}^N E\big[g_i^{2k}\big] = [(2k-1)!!]^N,   (49)

  \frac{\partial^k}{\partial t^k}\, E\, e^{-tY} = \frac{\partial^k}{\partial t^k} \Big[ \frac{1}{\pi^{N/2}}\, G^{N,1}_{1,N}\!\Big( \frac{1}{2^N t} \,\Big|\, 1 \,;\, 1/2,\ \dots,\ 1/2 \Big) \Big] = \frac{1}{\pi^{N/2}}\, \frac{(-1)^k}{t^k}\, G^{N,1}_{1,N}\!\Big( \frac{1}{2^N t} \,\Big|\, 1-k \,;\, 1/2,\ \dots,\ 1/2 \Big),   (50)

where the last equality can be easily proven by induction.

The analogous results also hold for the random variable Z but appear to be more technical. The k-th moment of Y and Z can also be obtained as the N-th power of the k-th moment of a squared Gaussian random variable or of its absolute value, respectively. However, often it is also important to know the MGF of the random variable, for instance, in order to obtain a tail bound for sums of iid copies of the random variable (Chernoff's inequality). Next, using Proposition 8 we compute the Chernoff bound for the random variables \sum_{j=1}^M Y_j and \sum_{j=1}^M Z_j.

Proposition 12 (Chernoff bound for \sum_{j=1}^M Y_j and \sum_{j=1}^M Z_j). Let \{g_{i,j}\}_{i=1,j=1}^{N,M} be a set of independent identically distributed standard Gaussian random variables (i.e., g_{i,j} ∼ N(0, 1)). Then, for Y_j := \prod_{i=1}^N g_{i,j}^2 with j ∈ [M] we have
  P\Big( \sum_{j=1}^M Y_j \le t \Big) \le \frac{1}{\pi^{MN/2}} \min\{ F_Y(t), G_Y(t) \},   (51)

where

  F_Y(t) := e^{t/2^N} \Big[ G^{N,1}_{1,N}\!\big( 1 \,\big|\, 1 \,;\, 1/2,\ 1/2,\ \dots,\ 1/2 \big) \Big]^M,   (52)

  G_Y(t) := e^{M/2} \Big[ G^{N,1}_{1,N}\!\Big( \frac{t}{2^{N-1} M} \,\Big|\, 1 \,;\, 1/2,\ 1/2,\ \dots,\ 1/2 \Big) \Big]^M.   (53)
Furthermore, for Z_j := \prod_{i=1}^N |g_{i,j}| with j ∈ [M] we have

  P\Big( \sum_{j=1}^M Z_j \le t \Big) \le \frac{1}{\pi^{M(N+1)/2}} \min\{ F_Z(t), G_Z(t) \},   (54)

where

  F_Z(t) := e^{t \cdot 2^{-(N-2)/2}} \Big[ G^{N,2}_{2,N}\!\big( 1 \,\big|\, 1,\ 1/2 \,;\, 1/2,\ 1/2,\ \dots,\ 1/2 \big) \Big]^M,   (55)

  G_Z(t) := e^{M} \Big[ G^{N,2}_{2,N}\!\Big( \frac{t^2}{2^{N-2} M^2} \,\Big|\, 1,\ 1/2 \,;\, 1/2,\ 1/2,\ \dots,\ 1/2 \Big) \Big]^M.   (56)

Proof of Proposition 12. For θ > 0, by Chernoff's inequality, we have
  P\Big( \sum_{j=1}^M Y_j \le t \Big) \le \min_{\theta > 0} e^{\theta t} \prod_{j=1}^M E\, e^{-\theta Y_j} = \min_{\theta > 0} e^{\theta t} \big( E\, e^{-\theta Y_1} \big)^M.   (57)

Applying Proposition 8 we obtain

  P\Big( \sum_{j=1}^M Y_j \le t \Big) \le \min_{\theta > 0} e^{\theta t} \Big[ \frac{1}{\pi^{N/2}}\, G^{N,1}_{1,N}\!\Big( \frac{1}{2^N \theta} \,\Big|\, 1 \,;\, 1/2,\ \dots,\ 1/2 \Big) \Big]^M = \frac{1}{\pi^{MN/2}} \min_{\theta > 0} e^{\theta t} \Big[ G^{N,1}_{1,N}\!\Big( \frac{1}{2^N \theta} \,\Big|\, 1 \,;\, 1/2,\ \dots,\ 1/2 \Big) \Big]^M.   (58)

Next, we define the function

  f_Y(\theta) := e^{\theta t} \Big[ G^{N,1}_{1,N}\!\Big( \frac{1}{2^N \theta} \,\Big|\, 1 \,;\, 1/2,\ \dots,\ 1/2 \Big) \Big]^M.   (59)

To compute the minimizer, we calculate the corresponding derivative,

  \frac{d}{d\theta} f_Y(\theta) = t\, e^{\theta t} \Big[ G^{N,1}_{1,N}\!\Big( \tfrac{1}{2^N \theta} \,\Big|\, 1 \,;\, 1/2,\dots,1/2 \Big) \Big]^M + e^{\theta t}\, M \Big[ G^{N,1}_{1,N}\!\Big( \tfrac{1}{2^N \theta} \,\Big|\, 1 \,;\, 1/2,\dots,1/2 \Big) \Big]^{M-1} \frac{d}{d\theta}\, G^{N,1}_{1,N}\!\Big( \tfrac{1}{2^N \theta} \,\Big|\, 1 \,;\, 1/2,\dots,1/2 \Big)
  = e^{\theta t} \Big[ G^{N,1}_{1,N}\!\Big( \tfrac{1}{2^N \theta} \,\Big|\, 1 \,;\, 1/2,\dots,1/2 \Big) \Big]^{M-1} \Big\{ t\, G^{N,1}_{1,N}\!\Big( \tfrac{1}{2^N \theta} \,\Big|\, 1 \,;\, 1/2,\dots,1/2 \Big) + M\, \frac{d}{d\theta}\, G^{N,1}_{1,N}\!\Big( \tfrac{1}{2^N \theta} \,\Big|\, 1 \,;\, 1/2,\dots,1/2 \Big) \Big\}.
[Figure 3: three panels, (a) N = 4, (b) N = 5, (c) N = 7.]

Figure 3: Numerical comparison of the Chernoff bounds stated in Proposition 12. The green and blue lines show the bounds in (52) and (53), respectively. The red line shows (58), where for each t the minimum over θ is approximated numerically by a minimum over a finite number of points. Here, we chose the values of θ, over which the minimum is taken, evenly spaced on a logarithmic scale between 2^{-N/2+2} and M/t.
[Figure 4: three panels, (a) N = 4, (b) N = 5, (c) N = 7.]

Figure 4: The analogue of Fig. 3 for the random variable \sum_{j=1}^M Z_j. The green and blue lines show the bounds in (55) and (56), respectively.
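The numerical minimization behind Figures 3 and 4 can be sketched without any special functions by estimating the MGF in (57) by Monte Carlo (the sample size, seed, and θ-grid below are our own illustrative choices):

```python
import math
import random

def mc_chernoff_bound(t, N, M, thetas, samples=50_000, seed=1):
    # min over the grid of e^{theta t} (E e^{-theta Y})^M, cf. Eq. (57),
    # with the MGF of Y = prod_i g_i^2 estimated by Monte Carlo
    rng = random.Random(seed)
    ys = []
    for _ in range(samples):
        y = 1.0
        for _ in range(N):
            y *= rng.gauss(0.0, 1.0) ** 2
        ys.append(y)
    best = float("inf")
    for theta in thetas:
        mgf = sum(math.exp(-theta * y) for y in ys) / samples
        best = min(best, math.exp(theta * t) * mgf**M)
    return best

# log-spaced grid, roughly mimicking the choice described in the caption of Fig. 3
thetas = [2.0 ** (-4 + 0.25 * i) for i in range(40)]
bound = mc_chernoff_bound(t=0.5, N=4, M=3, thetas=thetas)
assert 0.0 < bound < 1.0
```

This grid search plays the role of the red line in Figure 3; the closed-form choices θ = 1/2^N and θ = M/(2t) derived below replace it by the explicit bounds (52) and (53).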
Since the moment generating function, when it exists, is positive (i.e., E[e^{θY}] ≥ e^{µθ} with µ = E(Y)), we have \frac{d}{d\theta} f_Y(\theta) = 0 if and only if

  t\, G^{N,1}_{1,N}\!\Big( \frac{1}{2^N \theta} \,\Big|\, 1 \,;\, 1/2,\ \dots,\ 1/2 \Big) + M\, \frac{d}{d\theta}\, G^{N,1}_{1,N}\!\Big( \frac{1}{2^N \theta} \,\Big|\, 1 \,;\, 1/2,\ \dots,\ 1/2 \Big) = 0.   (60)

Applying (92) from the Supplemental Materials leads to

  \frac{d}{d\theta}\, G^{N,1}_{1,N}\!\Big( \frac{1}{2^N \theta} \,\Big|\, 1 \,;\, 1/2,\ \dots,\ 1/2 \Big) = \frac{-1}{2^N \theta^2} \Big( \frac{1}{2^N \theta} \Big)^{-1} G^{N,1}_{1,N}\!\Big( \frac{1}{2^N \theta} \,\Big|\, 0 \,;\, 1/2,\ \dots,\ 1/2 \Big) = \frac{-1}{\theta}\, G^{N,1}_{1,N}\!\Big( \frac{1}{2^N \theta} \,\Big|\, 0 \,;\, 1/2,\ \dots,\ 1/2 \Big).   (61)

Therefore, we need to find θ such that

  t\, G^{N,1}_{1,N}\!\Big( \frac{1}{2^N \theta} \,\Big|\, 1 \,;\, 1/2,\ \dots,\ 1/2 \Big) - \frac{M}{\theta}\, G^{N,1}_{1,N}\!\Big( \frac{1}{2^N \theta} \,\Big|\, 0 \,;\, 1/2,\ \dots,\ 1/2 \Big) = 0
  \iff t\, G^{1,N}_{N,1}\!\big( 2^N \theta \,\big|\, 1/2,\ \dots,\ 1/2 \,;\, 0 \big) - \frac{M}{\theta}\, G^{1,N}_{N,1}\!\big( 2^N \theta \,\big|\, 1/2,\ \dots,\ 1/2 \,;\, 1 \big) = 0,   (62)

where the second equation follows from (90). However, directly solving (62) for θ is still intractable. Our idea is to approximate both Meijer G-functions appearing in said equation with the lowest order of a power-log series expansion. In this way we obtain a good-enough but not necessarily optimal choice of θ for the bound on the CDF of \sum_{j=1}^M Y_j from (58). The necessary power-log series expansion is provided in the next lemma. The proof can be found in Appendix B.

Lemma 13 (Power-log series expansion related to \sum_{j=1}^M Y_j). For x ∈ {0, 1},
  G^{1,N}_{N,1}\!\big( z \,\big|\, 1/2,\ 1/2,\ \dots,\ 1/2 \,;\, x \big) = \sum_{k=0}^{\infty} z^{-1/2-k} \sum_{j=0}^{N-1} H^x_{kj}\, [\log z]^j,   (63)

where

  H^x_{kj} = \frac{(-1)^{Nk-1}}{j!} \sum_{n=j}^{N-1} \frac{(-1)^{n-j}}{(n-j)!}\, \Gamma^{(n-j)}(x + 1/2 + k) \sum_{j_1+\dots+j_N = N-1-n} \prod_{t=1}^{N} \Big\{ \sum_{\ell_1+\dots+\ell_{k+1} = j_t} \frac{\Gamma^{(\ell_{k+1})}(1)}{\ell_{k+1}!} \prod_{i=1}^{k-1} (k-i+1)^{-(\ell_i+1)} \Big\}.   (64)

Using the full expansion it is still difficult to obtain the solution of (62). Therefore, we take the simplest approximation (k = 0, j = N − 1), i.e.,

  G^{1,N}_{N,1}\!\big( 2^N \theta \,\big|\, 1/2,\ \dots,\ 1/2 \,;\, x \big) \approx (2^N \theta)^{-1/2}\, H^x_{0,N-1}\, [\log(2^N \theta)]^{N-1},   (65)

where

  H^0_{0,N-1} = \frac{-1}{(N-1)!}\, \Gamma(1/2),   (66)
  H^1_{0,N-1} = \frac{-1}{(N-1)!}\, \Gamma(1 + 1/2) = \frac{-1}{(N-1)!}\, \frac{1}{2}\, \Gamma(1/2) = \frac{1}{2} H^0_{0,N-1},   (67)
and compute the corresponding solution. The approximation becomes better as the argument grows. Thus, since z = 2^N θ, the approximation improves as N grows. Next, we solve the following equation for θ:

  t\, (2^N \theta)^{-1/2} H^0_{0,N-1} [\log(2^N \theta)]^{N-1} - \frac{M}{\theta}\, (2^N \theta)^{-1/2}\, \frac{1}{2} H^0_{0,N-1} [\log(2^N \theta)]^{N-1} = 0,   (68)

which is equivalent to solving

  (2^N \theta)^{-1/2} H^0_{0,N-1} [\log(2^N \theta)]^{N-1} \Big( t - \frac{M}{2\theta} \Big) = 0.   (69)

The solutions are θ = 1/2^N and θ = M/(2t). Plugging this result into (58) we obtain

  P\Big( \sum_{j=1}^M Y_j \le t \Big) \le \frac{1}{\pi^{MN/2}} \min\Big\{ e^{t/2^N} \Big[ G^{N,1}_{1,N}\!\big( 1 \,\big|\, 1 \,;\, 1/2,\ \dots,\ 1/2 \big) \Big]^M,\; e^{M/2} \Big[ G^{N,1}_{1,N}\!\Big( \frac{t}{2^{N-1} M} \,\Big|\, 1 \,;\, 1/2,\ \dots,\ 1/2 \Big) \Big]^M \Big\}.   (70)
Similarly, we derive the Chernoff bound for the random variable \sum_{j=1}^M Z_j. With Proposition 8 we obtain

  P\Big( \sum_{j=1}^M Z_j \le t \Big) \le \min_{\theta > 0} e^{\theta t} \big( E\, e^{-\theta Z_1} \big)^M = \min_{\theta > 0} e^{\theta t} \Big[ \frac{1}{\pi^{(N+1)/2}}\, G^{N,2}_{2,N}\!\Big( \frac{1}{2^{N-2} \theta^2} \,\Big|\, 1,\ 1/2 \,;\, 1/2,\ \dots,\ 1/2 \Big) \Big]^M = \frac{1}{\pi^{M(N+1)/2}} \min_{\theta > 0} e^{\theta t} \Big[ G^{N,2}_{2,N}\!\Big( \frac{1}{2^{N-2} \theta^2} \,\Big|\, 1,\ 1/2 \,;\, 1/2,\ \dots,\ 1/2 \Big) \Big]^M.   (71)
" !#M 1 1, 1/2 f (θ) := eθt GN,2 . (72) Z 2,N N−2 2 2 θ 1/2, 1/2,..., 1/2 To compute the minimizer, we calculate the corresponding derivative, d f (θ) dθ Z " !#M 1 1, 1/2 = t · eθt GN,2 2,N N−2 2 2 θ 1/2, 1/2,..., 1/2 " !#M−1 ! 1 1, 1/2 d 1 1, 1/2 + Meθt · GN,2 GN,2 2,N N−2 2 2,N N−2 2 (73) 2 θ 1/2,..., 1/2 dθ 2 θ 1/2,..., 1/2 " !#M−1 1 1, 1/2 = eθt GN,2 2,N N−2 2 2 θ 1/2, 1/2,..., 1/2 ( ! !) 1 1, 1/2 d 1 1, 1/2 · t · GN,2 + M GN,2 . 2,N N−2 2 2,N N−2 2 2 θ 1/2,..., 1/2 dθ 2 θ 1/2,..., 1/2
15 Similarly as before, we apply (92) to obtain ! d 1 1/2, 1 GN,2 2,N N−2 2 dθ 2 θ 1/2, 1/2,..., 1/2 −1 ! 1 −2 1 0, 1/2 = · · GN,2 (74) N−2 2 N−2 3 2,N N−2 2 2 θ 2 θ 2 θ 1/2, 1/2,..., 1/2 ! −2 1 0, 1/2 = · GN,2 . 2,N N−2 2 θ 2 θ 1/2, 1/2,..., 1/2 Thus, we need to find θ > 0 such that ! ! 1 1, 1/2 2M 1 0, 1/2 t · GN,2 − GN,2 = 0 (75) 2,N N−2 2 2,N N−2 2 2 θ 1/2, 1/2,..., 1/2 θ 2 θ 1/2, 1/2,..., 1/2 which is equivalent to ! ! 2M 2,N N−2 2 1/2, 1/2,..., 1/2 2,N N−2 2 1/2, 1/2,..., 1/2 t · GN,2 2 θ − GN,2 2 θ = 0 , (76) 0, 1/2 θ 1/2, 1 where the second equation follows from (90). As already experienced in the analysis of the random PM variable j=1 Yj, an analytic solution for the optimal value of θ from (76) is infeasible. Once again, we solve the approximate equality obtained by replacing the Meijer G-functions with their lowest order in the power-log series expansion. In this way we obtain a good-enough but not necessarily PM optimal choice of θ for the bound on the CDF of j=1 Zj from (71). We postpone the proof of the following Lemma, which contains said power-log series expansion, to Sec.B. PM Lemma 14 (Power-log series expansion related to j=1 Zj). For x ∈ {0, 1}
  G^{2,N}_{N,2}\!\big( z \,\big|\, 1/2,\ 1/2,\ \dots,\ 1/2 \,;\, 1/2,\ x \big) = \sum_{k=0}^{\infty} z^{-1/2-k} \sum_{j=0}^{N-1} \bar{H}^x_{kj}\, [\log z]^j,   (77)

where

  \bar{H}^x_{kj} = \frac{(-1)^{Nk-1}}{j!} \sum_{n=j}^{N-1} \frac{(-1)^{n-j}}{(n-j)!} \Big( \sum_{i=0}^{n-j} \binom{n-j}{i} \Gamma^{(n-j-i)}(1 + k)\, \Gamma^{(i)}(x + 1/2 + k) \Big) \sum_{j_1+\dots+j_N = N-1-n} \prod_{t=1}^{N} \Big\{ \sum_{\ell_1+\dots+\ell_{k+1} = j_t} \frac{\Gamma^{(\ell_{k+1})}(1)}{\ell_{k+1}!} \prod_{i=1}^{k-1} (k-i+1)^{-(\ell_i+1)} \Big\}.   (78)
Using the full expansion it is still difficult to obtain the solution of (76). Therefore, we take the simplest approximation (k = 0, j = N − 1), i.e.,

  G^{2,N}_{N,2}\!\big( 2^{N-2} \theta^2 \,\big|\, 1/2,\ \dots,\ 1/2 \,;\, 1/2,\ x \big) \approx (2^{N-2} \theta^2)^{-1/2}\, \bar{H}^x_{0,N-1}\, [\log(2^{N-2} \theta^2)]^{N-1},   (79)

where

  \bar{H}^0_{0,N-1} = \frac{-1}{(N-1)!}\, \Gamma(1)\, \Gamma(1/2),   (80)
  \bar{H}^1_{0,N-1} = \frac{-1}{(N-1)!}\, \Gamma(1)\, \Gamma(1 + 1/2) = \frac{-1}{(N-1)!}\, \Gamma(1)\, \frac{1}{2}\, \Gamma(1/2) = \frac{1}{2} \bar{H}^0_{0,N-1},   (81)

and compute the corresponding solution. The approximation of the aforementioned Meijer G-function becomes better as the argument grows. As z = 2^{N-2} θ^2, the approximation improves with growing N. Thus, plugging these approximations into (76) we need to solve the following equation for θ,