On Some Statistical Properties of the Multivariate q-Gaussian Distribution and its Application to Smoothed Functional Algorithms
Debarghya Ghoshdastidar
Ph.D. Candidate, Computer Science & Automation
Work done with: Dr. Ambedkar Dukkipati and Prof. Shalabh Bhatnagar
Outline
1 Multivariate q-Gaussian Distribution: Nonextensive Information Theory; q-Gaussian distribution; Moments of q-Gaussian; Generating multivariate q-Gaussian
2 Smoothed Functional Algorithms: Stochastic Optimization Framework; Smoothed Functional method; Optimization using SF
3 q-Gaussian based SF Algorithms: q-Gaussian as smoothing kernel; Proposed two-timescale algorithm; Convergence of algorithm
4 Discussions: Summary; Origin of q-Gaussian; q-Central Limit Theorem
Boltzmann-Gibbs-Shannon entropy
For a discrete probability mass function $p$,
$$H(p) = -\sum_{x \in \mathcal{X}} p(x) \ln p(x),$$
which measures the uncertainty of a random variable. The maximum entropy principle states: choose the distribution, among those satisfying the given constraints, that maximizes the entropy. The uniform, exponential, and normal distributions can all be formulated as maximum entropy distributions.
Tsallis entropy
For a discrete probability mass function $p$,
$$H_q(p) = -\sum_{x \in \mathcal{X}} p(x)^q \ln_q p(x),$$
where the q-logarithm $\ln_q(x) = \frac{x^{1-q} - 1}{1-q}$, $q \in \mathbb{R}$, $q \neq 1$.
- Proposed in the context of thermodynamics [1].
- Tends to Shannon entropy as $q \to 1$.
- Pseudo-additive in nature, i.e., for $X$ and $Y$ independent,
$$H_q(X,Y) = H_q(X) + H_q(Y) + (1-q)\, H_q(X)\, H_q(Y).$$
- The q-logarithm is the same as the Box-Cox transformation used in statistics to "make the data more normal distribution-like".

[1] C. Tsallis. Possible generalization of Boltzmann-Gibbs statistics. Journal of Statistical Physics 52 (1-2), 1988.
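Since the q-logarithm and pseudo-additivity are purely algebraic, they are easy to check numerically. A minimal sketch (the NumPy helper below is ours, not from the slides):

```python
import numpy as np

def tsallis_entropy(p, q):
    """Tsallis entropy H_q(p) = (1 - sum_x p(x)^q) / (q - 1) for a pmf p."""
    p = np.asarray(p, dtype=float)
    if np.isclose(q, 1.0):
        return -np.sum(p * np.log(p))     # Shannon limit as q -> 1
    return (1.0 - np.sum(p**q)) / (q - 1.0)

# For independent X and Y the joint pmf is the outer product, and
# H_q(X,Y) = H_q(X) + H_q(Y) + (1-q) H_q(X) H_q(Y).
q = 1.5
px = np.array([0.2, 0.8])
py = np.array([0.5, 0.3, 0.2])
lhs = tsallis_entropy(np.outer(px, py).ravel(), q)
rhs = (tsallis_entropy(px, q) + tsallis_entropy(py, q)
       + (1 - q) * tsallis_entropy(px, q) * tsallis_entropy(py, q))
assert np.isclose(lhs, rhs)
```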
Tsallis entropy functional
For a continuous probability distribution $p$,
$$H_q(p) = \frac{1 - \int_{\mathcal{X}} p(x)^q \, dx}{q - 1}, \quad q \in \mathbb{R}.$$

q-expectation
The corresponding generalization of expectation is
$$\langle f \rangle_q = \frac{\int f(x)\, p(x)^q \, dx}{\int p(x)^q \, dx} = E_{p_q}[f(X)],$$
where the escort distribution $p_q(x) = \frac{p(x)^q}{\int p(x)^q \, dx}$.
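A small quadrature sketch of the q-expectation under the escort distribution (the grid-based helper is an illustration, not from the slides):

```python
import numpy as np

def q_expectation(f, pdf, q, grid):
    """<f>_q = E_{p_q}[f(X)]: expectation under the escort distribution,
    approximated by trapezoidal quadrature on a fixed grid."""
    p = pdf(grid)
    escort = p**q / np.trapz(p**q, grid)   # escort density p_q
    return np.trapz(f(grid) * escort, grid)

# Example: by symmetry, the q-mean of a standard Gaussian is still 0,
# while the q-variance differs from the usual variance for q != 1.
grid = np.linspace(-10.0, 10.0, 20001)
gauss = lambda x: np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)
print(q_expectation(lambda x: x, gauss, 1.5, grid))      # ~ 0
print(q_expectation(lambda x: x**2, gauss, 1.5, grid))   # q-variance != 1
```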
q-Gaussian distribution:
- Nonextensive generalization of the Gaussian distribution.
- Associated with the Lévy super-diffusion process [2].
- Obtained from Tsallis entropy maximization under the constraints
  q-mean $\langle X \rangle_q = \mu_q$ and q-variance $\langle (X - \mu_q)^2 \rangle_q = \sigma_q^2$.

[2] D. Prato and C. Tsallis. Nonextensive foundation of Lévy distributions. Physical Review E 60 (2), 2398–2401, 1999.
q-Gaussian distribution
$$G_q(x) = \frac{1}{\sigma_q K_q} \left[ 1 - \frac{(1-q)}{(3-q)\,\sigma_q^2} (x - \mu_q)^2 \right]_+^{\frac{1}{1-q}} = \frac{1}{\sigma_q K_q} \exp_q\!\left( -\frac{(x-\mu_q)^2}{(3-q)\,\sigma_q^2} \right) \quad \text{for all } x \in \mathbb{R},$$
where $y_+ = \max(y, 0)$ is the Tsallis cut-off condition and
$$K_q = \begin{cases} \sqrt{\dfrac{\pi(3-q)}{1-q}}\; \dfrac{\Gamma\!\left(\frac{2-q}{1-q}\right)}{\Gamma\!\left(\frac{5-3q}{2(1-q)}\right)} & \text{for } -\infty < q < 1, \\[2ex] \sqrt{\dfrac{\pi(3-q)}{q-1}}\; \dfrac{\Gamma\!\left(\frac{3-q}{2(q-1)}\right)}{\Gamma\!\left(\frac{1}{q-1}\right)} & \text{for } 1 < q < 3. \end{cases}$$
Properties:
- A probability distribution only for $q < 3$.
- Provides a family of distributions, with behavior controlled by the parameter $q$.
- $E[X] = \mu_q$ for $q < 2$.
- $\mathrm{Var}[X] = \frac{3-q}{5-3q}\, \sigma_q^2$ for $q < \frac{5}{3}$.
- Finite support for $q < 1$; infinite support and power-law tails for $q > 1$.
- One-to-one correspondence with Student's t for $q > 1$.

Special cases:
- Gaussian distribution as $q \to 1$
- Cauchy distribution for $q = 2$
- Uniform distribution as $q \to -\infty$
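The density and normalizing constant above translate directly into code; a sketch (assuming SciPy's gamma function) that checks normalization and the Cauchy special case:

```python
import numpy as np
from scipy.special import gamma

def K_q(q):
    """Normalizing constant of the univariate q-Gaussian, q < 3, q != 1."""
    if q < 1:
        return (np.sqrt(np.pi * (3 - q) / (1 - q))
                * gamma((2 - q) / (1 - q)) / gamma((5 - 3*q) / (2*(1 - q))))
    return (np.sqrt(np.pi * (3 - q) / (q - 1))
            * gamma((3 - q) / (2*(q - 1))) / gamma(1 / (q - 1)))

def q_gaussian_pdf(x, q, mu=0.0, sigma=1.0):
    base = 1 - (1 - q) * (x - mu)**2 / ((3 - q) * sigma**2)
    return np.maximum(base, 0.0)**(1 / (1 - q)) / (sigma * K_q(q))

x = np.linspace(-200, 200, 400001)
for q in (0.5, 1.5, 2.0):
    print(q, np.trapz(q_gaussian_pdf(x, q), x))   # each ~ 1
# q = 2 with sigma = 1 is exactly the standard Cauchy density 1/(pi(1+x^2)).
```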
Multivariate q-Gaussian distribution [3]:
$$G_{q,\sigma}(X) = \frac{1}{\sigma^N K_{q,N}} \left[ 1 - \frac{(1-q)\, \|X\|^2}{\big((N+4)-(N+2)q\big)\, \sigma^2} \right]_+^{\frac{1}{1-q}}$$
for all $X \in \mathbb{R}^N$, with mean $\mu = 0$, covariance $\Sigma = \sigma^2 I$ and normalizing constant
$$K_{q,N} = \begin{cases} \left(\dfrac{(N+4)-(N+2)q}{1-q}\right)^{N/2} \pi^{N/2}\, \dfrac{\Gamma\!\left(\frac{2-q}{1-q}\right)}{\Gamma\!\left(\frac{2-q}{1-q} + \frac{N}{2}\right)} & \text{for } q < 1, \\[2ex] \left(\dfrac{(N+4)-(N+2)q}{q-1}\right)^{N/2} \pi^{N/2}\, \dfrac{\Gamma\!\left(\frac{1}{q-1} - \frac{N}{2}\right)}{\Gamma\!\left(\frac{1}{q-1}\right)} & \text{for } 1 < q < \frac{N+4}{N+2}. \end{cases}$$

[3] C. Vignat and A. Plastino. Central limit theorem and deformed exponentials. Journal of Physics A: Mathematical and Theoretical 40(45), 2007.
Support set:
$$\Omega_q = \begin{cases} \left\{ x \in \mathbb{R}^N : \|x\|^2 \le \frac{\big((N+4)-(N+2)q\big)\,\sigma^2}{1-q} \right\} & \text{for } q < 1, \\[1ex] \mathbb{R}^N & \text{for } 1 < q < \frac{N+4}{N+2}. \end{cases}$$

Consider $X = \big(X^{(1)}, X^{(2)}, \ldots, X^{(N)}\big)$ q-Gaussian distributed with $q \in \left(-\infty, \frac{N+4}{N+2}\right)$, $q \neq 1$, normalized so that
- $E\big[(X^{(i)})^2\big] = 1$ for all $i = 1, \ldots, N$;
- $E\big[X^{(i)} X^{(j)}\big] = 0$ for all $i, j = 1, \ldots, N$, $i \neq j$.
Define
$$\rho(X) = 1 - \frac{1-q}{(N+4)-(N+2)q}\, \|X\|^2,$$
and let $b, b_1, b_2, \ldots, b_N \in \mathbb{N}$.
Theorem (Generalized co-moments):
$$E_{G_q}\!\left[ \frac{\big(X^{(1)}\big)^{b_1} \big(X^{(2)}\big)^{b_2} \cdots \big(X^{(N)}\big)^{b_N}}{\big(\rho(X)\big)^{b}} \right] = \begin{cases} \left( \dfrac{(N+4)-(N+2)q}{|1-q|} \right)^{\frac{1}{2}\sum_{i=1}^N b_i} \bar{K}\, \displaystyle\prod_{i=1}^N \frac{b_i!}{2^{b_i} \left(\frac{b_i}{2}\right)!} & \text{if } b_i \text{ is even for all } i = 1, 2, \ldots, N, \\[1ex] 0 & \text{otherwise}, \end{cases}$$
with
$$\bar{K} = \begin{cases} \dfrac{\Gamma\!\left(\frac{1}{1-q} - b + 1\right) \Gamma\!\left(\frac{1}{1-q} + 1 + \frac{N}{2}\right)}{\Gamma\!\left(\frac{1}{1-q} + 1\right) \Gamma\!\left(\frac{1}{1-q} - b + 1 + \frac{N}{2} + \frac{1}{2}\sum_{i=1}^N b_i\right)} & \text{if } q \in (-\infty, 1), \\[2ex] \dfrac{\Gamma\!\left(\frac{1}{q-1}\right) \Gamma\!\left(\frac{1}{q-1} + b - \frac{N}{2} - \frac{1}{2}\sum_{i=1}^N b_i\right)}{\Gamma\!\left(\frac{1}{q-1} + b\right) \Gamma\!\left(\frac{1}{q-1} - \frac{N}{2}\right)} & \text{if } q \in \left(1, 1 + \frac{2}{N+2}\right). \end{cases}$$
Limiting case:
$$\lim_{q \to 1} E_{G_q}\!\left[ \frac{\big(X^{(1)}\big)^{b_1} \big(X^{(2)}\big)^{b_2} \cdots \big(X^{(N)}\big)^{b_N}}{\big(\rho(X)\big)^{b}} \right] = E_{G}\!\left[ \prod_{i=1}^N \big(X^{(i)}\big)^{b_i} \right]$$

Generalized moments and q-moments:
$$\left\langle \frac{\big(X^{(1)}\big)^{b_1} \big(X^{(2)}\big)^{b_2} \cdots \big(X^{(N)}\big)^{b_N}}{\big(\rho(X)\big)^{b}} \right\rangle_q = \frac{2}{(N+2-Nq)}\, E_{G_q}\!\left[ \frac{\big(X^{(1)}\big)^{b_1} \big(X^{(2)}\big)^{b_2} \cdots \big(X^{(N)}\big)^{b_N}}{\big(\rho(X)\big)^{b+1}} \right]$$
$m$th-order moments and q-moments: the relations hold only if the Gamma functions exist.
- If $b = 0$ and $\sum_{i=1}^N b_i = m$, then $\bar{K}$ is a function of $m$.
- $m$-th order moments and co-moments exist for $q < 1 + \frac{2}{N+m}$, $q \neq 1$.
- $m$-th order q-moments exist for $q < 1 + \frac{2}{N+m-2}$, $q \neq 1$.
- The distribution, as well as 1st and 2nd order q-moments, exist for all $q < 1 + \frac{2}{N}$.
- Usual mean: $\mu = \mu_q$ for $q < 1 + \frac{2}{N+1}$.
- Covariance: $\Sigma = \frac{(N+2)-Nq}{(N+4)-(N+2)q}\, \Sigma_q$ for $q < 1 + \frac{2}{N+2}$.
This motivates us to express the multivariate q-Gaussian in terms of its q-moments.
Lemma
Given $Z \sim \mathcal{N}(0, I_{N \times N})$ and $a \sim \chi^2(\nu)$, $\nu > 0$, independent, let
$$Y = \sqrt{\frac{\nu}{a}}\, Z.$$
Then $Y \sim G_q(0, I_{N \times N})$, where $q = 1 + \frac{2}{N + \nu}$.

Lemma
Let $Y \sim G_q(0, I_{N \times N})$ for some $q \in \left(1, 1 + \frac{2}{N+2}\right)$ and
$$X = \sqrt{\frac{2-q}{N+2-Nq}}\; \frac{Y}{\sqrt{1 + \frac{q-1}{N+2-Nq}\, Y^\top Y}}.$$
Then $X \sim G_{q'}(0, I_{N \times N})$, where $q' = 1 - \frac{q-1}{(N+4)-(N+2)q} \in (-\infty, 1)$.
Sampling algorithm (given $q$, $\mu_q$ and $\Sigma_q$):
1. Generate an $N$-dimensional vector $Z \sim \mathcal{N}(0, I_{N \times N})$.
2. Generate a chi-squared random variate
$$a \sim \begin{cases} \chi^2\!\left(\frac{2(2-q)}{1-q}\right) & \text{for } -\infty < q < 1, \\[1ex] \chi^2\!\left(\frac{N+2-Nq}{q-1}\right) & \text{for } 1 < q < 1 + \frac{2}{N}. \end{cases}$$
3. Compute
$$Y = \begin{cases} \sqrt{\frac{N+2-Nq}{1-q}}\; \frac{Z}{\sqrt{a + Z^\top Z}} & \text{for } -\infty < q < 1, \\[1ex] \sqrt{\frac{N+2-Nq}{q-1}}\; \frac{Z}{\sqrt{a}} & \text{for } 1 < q < 1 + \frac{2}{N}. \end{cases}$$
4. $X = \mu_q + \Sigma_q^{1/2} Y \sim G_q(\mu_q, \Sigma_q)$.
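A direct NumPy transcription of this sampling algorithm, with an empirical check of the covariance relation from the moments slide (function names are ours):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_q_gaussian(q, mu, Sigma_q, size):
    """Draw `size` samples from G_q(mu, Sigma_q), valid for q < 1 + 2/N, q != 1."""
    N = len(mu)
    Z = rng.standard_normal((size, N))
    L = np.linalg.cholesky(Sigma_q)                      # Sigma_q^{1/2}
    if q < 1:
        a = rng.chisquare(2 * (2 - q) / (1 - q), size)
        r = np.sqrt(a + np.sum(Z**2, axis=1))
        Y = np.sqrt((N + 2 - N*q) / (1 - q)) * Z / r[:, None]
    else:
        a = rng.chisquare((N + 2 - N*q) / (q - 1), size)
        Y = np.sqrt((N + 2 - N*q) / (q - 1)) * Z / np.sqrt(a)[:, None]
    return mu + Y @ L.T

# Empirical check: Cov[X] = (N+2-Nq)/((N+4)-(N+2)q) * Sigma_q.
N, q = 3, 1.2
X = sample_q_gaussian(q, np.zeros(N), np.eye(N), size=200_000)
print(np.cov(X.T) * ((N + 4) - (N + 2)*q) / (N + 2 - N*q))   # ~ identity
```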
System: a discrete event system $\{Y_m : m \ge 0\}$, controlled by a parameter $\theta \in C$, a closed and convex subset of $\mathbb{R}^N$.

Cost: the long-run average cost $J(\theta) = E_{\nu_\theta}[h(Y)]$, where $\nu_\theta$ is the stationary distribution of the process and $h(Y)$ is the single-stage cost.

Objective: minimize $J(\theta)$ with respect to $\theta \in C$.

Issue: there is no analytical relationship between $J(\theta)$ and $\theta$.

Solution: perform the optimization with derivatives of $J$, estimated using the Smoothed Functional approach.
Assumptions:
- The process is ergodic for a given $\theta$, i.e., for large $L$,
$$J(\theta) = E_{\nu_\theta}[h(Y)] \approx \frac{1}{L} \sum_{m=0}^{L-1} h(Y_m).$$
- $J(\cdot)$ is twice continuously differentiable for all $\theta \in C$.
- The process remains stable under the sequence of parameter updates (technically, we assume the existence of a stochastic Lyapunov function).
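To make the ergodicity assumption concrete, here is a toy parameterized process (entirely hypothetical, not from the slides) whose long-run average cost is estimated by time-averaging a single trajectory:

```python
import numpy as np

rng = np.random.default_rng(1)

def estimate_J(theta, L=100_000):
    """Time-average of the single-stage cost h(y) = y^2 along one trajectory
    of a toy AR(1) process controlled by theta."""
    y, total = 0.0, 0.0
    for _ in range(L):
        y = 0.8 * y + theta + rng.standard_normal()   # ergodic since |0.8| < 1
        total += y * y                                # single-stage cost h(y)
    return total / L

print(estimate_J(0.5))   # approximates J(0.5) = E_{nu_theta}[h(Y)]
```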
Let $f : C \to \mathbb{R}$ be any function. Then, given a kernel $G_\beta : \mathbb{R}^N \to \mathbb{R}$ satisfying the Rubinstein conditions [4], we have

Definition (Smoothed Functional)
$$S_\beta[f(\theta)] = \int_{\mathbb{R}^N} G_\beta(\eta)\, f(\theta - \eta)\, d\eta$$

Figure: the unsmoothed function $f(x) = x^2 - \frac{1}{4} e^{-x^2} \cos(8\pi x)$ (left) and the smoothed function $S_{0.1}[f(x)]$ (right).

[4] R. Y. Rubinstein. Simulation and the Monte Carlo Method. John Wiley, 1981.
Rubinstein conditions:
1. $G_\beta(\eta) = \frac{1}{\beta^N} G\!\left(\frac{\eta}{\beta}\right)$, where $G\!\left(\frac{\eta}{\beta}\right) = G_1\!\left(\frac{\eta^{(1)}}{\beta}, \frac{\eta^{(2)}}{\beta}, \ldots, \frac{\eta^{(N)}}{\beta}\right)$.
2. $G_\beta(\eta)$ is piecewise differentiable in $\eta$.
3. $G_\beta(\eta)$ is a probability density function, so that $S_\beta[f(\theta)] = E_{G_\beta(\eta)}[f(\theta - \eta)]$.
4. $\lim_{\beta \to 0} G_\beta(\eta) = \delta(\eta)$, where $\delta(\eta)$ is the Dirac delta function.
5. $\lim_{\beta \to 0} S_\beta[f(\theta)] = f(\theta)$.

Examples of smoothing kernels:
- Gaussian distribution with covariance matrix $\beta^2 I_{N \times N}$
- Cauchy distribution with scale parameter $\beta$
- Uniform distribution on the interval $\left[-\frac{\beta}{2}, \frac{\beta}{2}\right]^N$
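Because $G_\beta$ is a density, $S_\beta[f]$ is an expectation and can be approximated by Monte-Carlo averaging. A sketch with a Gaussian kernel and the rippled function from the earlier figure:

```python
import numpy as np

rng = np.random.default_rng(2)

# f from the earlier figure: a quadratic bowl with high-frequency ripples.
f = lambda x: x**2 - 0.25 * np.exp(-x**2) * np.cos(8 * np.pi * x)

def smoothed(theta, beta, n_samples=100_000):
    """S_beta[f](theta) = E[f(theta - eta)], eta ~ N(0, beta^2)."""
    eta = beta * rng.standard_normal(n_samples)
    return np.mean(f(theta - eta))

# Smoothing with beta = 0.1 averages out the cos(8*pi*x) ripples:
print(f(0.0), smoothed(0.0, beta=0.1))
```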
Optimization methods:
- Gradient descent algorithm:
$$x_{n+1} = P_{[-1,1]}\left(x_n - \frac{1}{n} \nabla_x f(x_n)\right)$$
- Gradient descent on the smoothed functional:
$$x_{n+1} = P_{[-1,1]}\left(x_n - \frac{1}{n} \nabla_x S_\beta[f(x_n)]\right)$$

Figure: optimum found using the plain gradient (red) and SF (yellow).
Smoothed Gradient
$$\nabla_\theta S_\beta[f(\theta)] = E_{G(\eta)}\big[\, g_1(\eta)\, f(\theta + \beta\eta) \mid \theta \,\big]$$

Stochastic framework:
- Random vector $\eta \sim G = G_1$.
- Process $\{Y_m\}$ controlled by the parameter $(\theta + \beta\eta)$.
- The function $g_1$ depends on the nature of $G$.
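For the Gaussian kernel the form factor reduces to the classical $g_1(\eta) = \eta/\beta$; a Monte-Carlo sketch of the resulting gradient estimate, with $f$ evaluated exactly rather than through a simulation to keep things minimal:

```python
import numpy as np

rng = np.random.default_rng(3)

def sf_gradient(f, theta, beta=0.05, n_samples=1_000_000):
    """Estimate grad S_beta[f](theta) = E[(eta/beta) f(theta + beta*eta)],
    eta ~ N(0, I): the Gaussian-kernel instance of the smoothed gradient."""
    eta = rng.standard_normal((n_samples, theta.size))
    return np.mean((eta / beta) * f(theta + beta * eta)[:, None], axis=0)

f = lambda x: np.sum(x**2, axis=-1)            # f(theta) = ||theta||^2
print(sf_gradient(f, np.array([1.0, -2.0])))   # ~ grad f = (2, -4)
```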
Lemma
The $N$-variate q-Gaussian distribution with q-covariance $\beta^2 I_{N \times N}$ satisfies the Rubinstein conditions for all $q < 1 + \frac{2}{N}$, $q \neq 1$.

One-simulation q-SF Gradient
$$\nabla_\theta S_{q,\beta}[J(\theta)] = E_{G_q(\eta)}\big[\, g_1(\eta)\, J(\theta + \beta\eta) \mid \theta \,\big],$$
where
$$g_1(\eta) = \frac{2\eta}{\beta\, (N + 2 - Nq)\left(1 - \frac{(1-q)}{(N+2-Nq)} \|\eta\|^2\right)}.$$

Lemma
$$\nabla_\theta S_{q,\beta}[J(\theta)] - \nabla_\theta J(\theta) = o(\beta) \quad \text{for all } q < 1 + 2/N,\ q \neq 1.$$
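The form factor above transcribed into code ($\eta$ is assumed to be drawn with q-covariance $I_{N \times N}$):

```python
import numpy as np

def g1_q(eta, q, beta):
    """g1(eta) = 2*eta / (beta*(N+2-Nq)*(1 - (1-q)*||eta||^2/(N+2-Nq)))."""
    N = eta.shape[-1]
    c = N + 2 - N * q
    rho = 1 - (1 - q) * np.sum(eta**2, axis=-1, keepdims=True) / c
    return 2 * eta / (beta * c * rho)
```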
Approximation:
$$\nabla_\theta S_{q,\beta}[J(\theta)] \approx \frac{1}{ML} \sum_{n=0}^{M-1} g_1(\eta_n) \left( \sum_{m=0}^{L-1} h(Y_{nL+m}) \right),$$
where $\{Y_{nL+m}\}$ is controlled by $(\theta + \beta\eta_n)$, for large $M, L$.

Gradient descent method:
1. Fix $M$, $L$, $q$ and $\beta$.
2. Set the parameter $\theta_0 = \theta_{\text{initial}}$.
3. For $k = 0$ to a fixed number of steps:
   1. Estimate $\nabla_{\theta_k} S_{q,\beta}[J(\theta_k)]$ using the above approximation.
   2. Update $\theta_{k+1} = P_C\big(\theta_k - a_k \nabla_{\theta_k} S_{q,\beta}[J(\theta_k)]\big)$.
4. Output the final parameter vector.
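A sketch of this nested-loop method on a synthetic problem: noisy evaluations of a known cost $J(\theta) = \|\theta\|^2$ stand in for the simulated system, and only the $q > 1$ branch of the sampling algorithm is inlined. All names and constants are illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)

N, q, beta, M, L = 2, 1.2, 0.1, 50, 5

def sample_eta():
    """q-Gaussian perturbation with q-covariance I (q > 1 branch)."""
    Z = rng.standard_normal(N)
    a = rng.chisquare((N + 2 - N*q) / (q - 1))
    return np.sqrt((N + 2 - N*q) / (q - 1)) * Z / np.sqrt(a)

def g1(eta):
    c = N + 2 - N*q
    return 2 * eta / (beta * c * (1 - (1 - q) * (eta @ eta) / c))

theta = np.array([2.0, -3.0])
for k in range(2000):
    grad = np.zeros(N)
    for n in range(M):                          # outer loop over perturbations
        eta = sample_eta()
        cost = np.mean([(theta + beta*eta) @ (theta + beta*eta)
                        + 0.1 * rng.standard_normal() for _ in range(L)])
        grad += g1(eta) * cost                  # SF gradient accumulation
    theta = np.clip(theta - grad / (M * (k + 1)), -10, 10)  # a_k = 1/(k+1); clip = P_C
print(theta)  # should drift toward the minimizer (0, 0)
```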
Remark: the parameter $\theta$ should lie in some bounded set (usually convex); this is taken care of by the projection $P_C$.

Issue:
- Estimating $\nabla_{\theta_k} S_{q,\beta}[J(\theta_k)]$ requires considerable computation.
- The nested-loop scenario increases the complexity further.

Two-timescale approach:
- Perform gradient estimation and parameter updates simultaneously.
- Update the gradient estimate with larger step-sizes.
- Update the parameter with smaller step-sizes.
Idea of two-timescale: Given the update rules
$$x_{n+1} = x_n + a_n f(x_n, y_n),$$
$$y_{n+1} = y_n + b_n g(x_n, y_n),$$
where $a_n, b_n \downarrow 0$ and $\frac{a_n}{b_n} \to 0$, the updates can be viewed as discretizations of the ODEs
$$\dot{x}(t) = f\big(x(t), y(t)\big), \qquad \dot{y}(t) = g\big(x(t), y(t)\big).$$
- Faster timescale: for $x$ quasi-static, $y_n \to \lambda(x)$, the globally asymptotically stable equilibrium of $\dot{y}(t) = g\big(x, y(t)\big)$.
- Slower timescale: $y(t)$ tracks $\lambda(x(t))$, and so $x_n \to x^*$, a stable equilibrium of $\dot{x}(t) = f\big(x(t), \lambda(x(t))\big)$.
The updates converge to $\big(x^*, \lambda(x^*)\big)$.
Step-sizes: $(a_n)_{n \ge 0}, (b_n)_{n \ge 0} \subset \mathbb{R}_+$ satisfying
$$\sum_{n=0}^\infty a_n^2 < \infty, \quad \sum_{n=0}^\infty b_n^2 < \infty, \quad \sum_{n=0}^\infty a_n = \sum_{n=0}^\infty b_n = \infty, \quad \text{and } a_n = o(b_n), \text{ i.e., } \lim_{n \to \infty} \frac{a_n}{b_n} = 0.$$
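One standard choice satisfying all of these conditions (an illustration; the particular exponents are ours):

```python
# a_n = 1/(n+1) and b_n = 1/(n+1)^(2/3):
#   sum a_n = sum b_n = infinity      (steps large enough to reach anywhere),
#   sum a_n^2, sum b_n^2 < infinity   (noise is eventually averaged out),
#   a_n / b_n = (n+1)^(-1/3) -> 0     (theta moves on the slower timescale).
a = lambda n: 1.0 / (n + 1)
b = lambda n: 1.0 / (n + 1) ** (2 / 3)
```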
The Gq-SF1 Algorithm:
1. Fix $M$, $L$, $q$ and $\beta$.
2. Set the gradient estimate $Z_0 = 0$ and the parameter $\theta_0 = \theta_{\text{initial}}$.
3. For $n = 0$ to $M - 1$:
   1. Generate $\eta_n \in \mathbb{R}^N$, $\eta_n \sim G_{q,1}$.
   2. For $m = 0$ to $L - 1$:
      - Simulate $Y_{nL+m}$ with parameter $(\theta_n + \beta\eta_n)$.
      - $Z_{nL+m+1} = (1 - b_n) Z_{nL+m} + b_n\, g_1(\eta_n)\, h(Y_{nL+m})$.
   3. Update $\theta_{n+1} = P_C\big(\theta_n - a_n Z_{(n+1)L}\big)$.
4. Output the final parameter vector.
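A compact sketch of Gq-SF1 on the same synthetic stand-in problem as before (noisy evaluations of $J(\theta) = \|\theta\|^2$ replace the simulated system; names and constants are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)

N, q, beta, M, L = 2, 1.2, 0.1, 5000, 10
a = lambda n: 1.0 / (n + 1)               # slower timescale (parameter)
b = lambda n: 1.0 / (n + 1) ** (2 / 3)    # faster timescale (gradient)

def sample_eta():
    Zv = rng.standard_normal(N)
    av = rng.chisquare((N + 2 - N*q) / (q - 1))
    return np.sqrt((N + 2 - N*q) / (q - 1)) * Zv / np.sqrt(av)

def g1(eta):
    c = N + 2 - N*q
    return 2 * eta / (beta * c * (1 - (1 - q) * (eta @ eta) / c))

theta, Z = np.array([2.0, -3.0]), np.zeros(N)
for n in range(M):
    eta = sample_eta()
    for m in range(L):
        h = (theta + beta*eta) @ (theta + beta*eta) + 0.1 * rng.standard_normal()
        Z = (1 - b(n)) * Z + b(n) * g1(eta) * h      # faster: gradient estimate
    theta = np.clip(theta - a(n) * Z, -10.0, 10.0)   # slower: projected update
print(theta)  # should settle near the minimizer (0, 0)
```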
Claim
The updates $\theta_n$ converge to the neighbourhood of a local minimum.

Faster timescale (averaging the natural timescale):
$$Z_{nL+m+1} = Z_{nL+m} + b_n \big( g_1(\eta_n) h(Y_{nL+m}) - Z_{nL+m} \big) = Z_{nL+m} + b_n \big( E[g_1(\eta_n) h(Y_{nL+m}) \mid \mathcal{G}_{nL+m-1}] - Z_{nL+m} \big) + A_{nL+m},$$
where $\mathcal{G}_{nL+m} = \sigma(\theta_k, \eta_k, Y_j)_{k \le n,\, j \le nL+m}$.
- $(A_{nL+m}, \mathcal{G}_{nL+m})$ is a martingale difference term with bounded variance.
- $\theta_n, \eta_n$ are quasi-static during the above update.
So we can say [5] that $Z_{nL}$ tracks $g_1(\eta_n) J(\theta_n + \beta\eta_n)$.

[5] V. S. Borkar. Stochastic Approximation: A Dynamical Systems Viewpoint. Cambridge University Press, 2008.
Slower timescale:
$$\theta_{n+1} = P_C\big(\theta_n - a_n g_1(\eta_n) J(\theta_n + \beta\eta_n)\big) = P_C\big(\theta_n + a_n \left[ -\nabla_{\theta_n} J(\theta_n) + \Delta(\theta_n) + \xi_n \right]\big),$$
with noise term
$$\xi_n = \nabla_{\theta_n} S_{q,\beta}[J(\theta_n)] - g_1(\eta_n) J(\theta_n + \beta\eta_n),$$
a martingale difference term with bounded variance, and error term
$$\Delta(\theta) = \nabla_\theta J(\theta) - \nabla_\theta S_{q,\beta}[J(\theta)],$$
which satisfies $\|\Delta(\theta)\| = o(\beta)$.
Convergence of the algorithm: the updates track the ODE [6]
$$\dot{\theta}(t) = \tilde{P}_C\big( -\nabla_{\theta(t)} J(\theta(t)) + \Delta(\theta(t)) \big),$$
where $\tilde{P}_C f(x) = \lim_{\epsilon \downarrow 0} \frac{P_C(x + \epsilon f(x)) - x}{\epsilon}$.
If $\Delta(\theta(t)) \to 0$, the updates would converge to the stable fixed points of
$$\dot{\theta}(t) = \tilde{P}_C\big( -\nabla_{\theta(t)} J(\theta(t)) \big),$$
which are the local minima. Since $\|\Delta(\theta)\| = o(\beta)$, for small $\beta$ the algorithm converges to some $\epsilon$-neighbourhood of the minima.

[6] H. J. Kushner and D. S. Clark. Stochastic Approximation Methods for Constrained and Unconstrained Systems. Springer-Verlag, 1978.
Summary:
- Higher-order moments and q-moments of the multivariate q-Gaussian distribution.
- An algorithm to generate q-Gaussian random vectors.
- The q-Gaussian generalizes the existing class of smoothing kernels.
- Two-timescale q-SF gradient descent algorithms converge to a neighbourhood of local minima for all $q < 1 + 2/N$, $q \neq 1$.
- Simulation results show that we can reach closer to the global minimum compared to other methods.
Lévy α-stable distributions: "When sharks can't find food, they abandon Brownian motion."
- $p_X$ is a stable distribution if, for independent $X, Y \sim p_X$, the distribution of $aX + bY$ has the same shape as $p_X$.
- Commonly observed in nature and finance, where $\alpha \in (0, 2]$.
- Associated with Lévy flights and long-range interactions.
- Characteristic function: $\varphi(t) := E[\exp(itX)] = \exp(-|ct|^\alpha)$ for some $c > 0$.

Generalized Central Limit Theorem [7]:
- CLT: if $X_1, X_2, \ldots$ are i.i.d. with zero mean and variance $\sigma^2 < \infty$, then $Z_n = \frac{1}{\sqrt{n}} \sum_{i=1}^n X_i$ tends to a normal distribution.
- GCLT: if $X_1, X_2, \ldots$ are i.i.d. with zero mean and infinite variance (i.e., with power-law tails), then $Z_n = \frac{1}{c_n} \sum_{i=1}^n X_i$ tends to an α-stable distribution.

[7] B. V. Gnedenko and A. N. Kolmogorov. Limit Distributions for Sums of Independent Random Variables. Addison-Wesley, 1968.
Origin of q-Gaussian:
- If the p.d.f. of a single jump is Gaussian, the p.d.f. of the $N$-step jump tends to a Gaussian.
- Question: when does the p.d.f. of the $N$-step jump tend to an α-stable distribution?
- Such a p.d.f. is obtained from maximization of the BGS entropy with a constraint of the form
$$\int E[\exp(-itX)]\, \exp(-|ct|^\alpha)\, dt = \text{constant} \quad \text{for some } c > 0.$$
- There is no physical interpretation for such a constraint.
- Better interpretations arise when Tsallis entropy is maximized with q-moment constraints.
- If the p.d.f. of a single jump is q-Gaussian with $q > \frac{5}{3}$, then the p.d.f. of the $N$-step jump tends to an α-stable distribution; if $q < \frac{5}{3}$, it tends to a normal distribution.
Vignat's approach [8]: uses the fact that if $Z \sim \mathcal{N}(0, 1)$ and $a \sim \chi^2(\nu)$, then
$$Y = \sqrt{\frac{\nu}{a}}\, Z \sim G_q(0, 1), \quad \text{where } q = 1 + \frac{2}{1 + \nu}.$$

Theorem: if $X_1, X_2, \ldots$ are i.i.d. with zero mean and unit variance, and $a \sim \chi^2(\nu)$ is independent of them, then
$$Z_n = \frac{1}{\sqrt{an}} \sum_{i=1}^n X_i$$
converges weakly to a q-Gaussian distribution.
- Shows connections to Tsallis' results.

[8] C. Vignat and A. Plastino. Central limit theorem and deformed exponentials. Journal of Physics A: Mathematical and Theoretical 40(45), 2007.
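A quick numerical illustration of the theorem (sample sizes are arbitrary): dividing the scaled sum by the common chi-squared variate produces visibly heavier tails than the normal limit of the ordinary CLT:

```python
import numpy as np

rng = np.random.default_rng(6)

nu, n, trials = 5, 500, 10_000
X = rng.uniform(-np.sqrt(3), np.sqrt(3), (trials, n))  # i.i.d., mean 0, variance 1
a = rng.chisquare(nu, trials)
Zn = X.sum(axis=1) / np.sqrt(a * n)                    # Vignat's Z_n

# After standardizing, Z_n shows the heavier tails of a q-Gaussian:
Zn_std = Zn / Zn.std()
normal = rng.standard_normal(trials)
print(np.percentile(np.abs(Zn_std), 99),   # noticeably larger than ...
      np.percentile(np.abs(normal), 99))   # ... the normal's ~2.58
```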
Tsallis' approach [9]: generalize to q-algebra, q-calculus, q-Fourier transforms, etc.
- q-independence: $X$ and $Y$ are q-independent if
$$\varphi_{q, X+Y}(t) = \varphi_{q, X}(t) \otimes_q \varphi_{q, Y}(t).$$
- q-convergence: $X_1, X_2, \ldots$ are q-convergent to $X_\infty$ if
$$\lim_{n \to \infty} \varphi_{q, X_n}(t) = \varphi_{q, X_\infty}(t) \quad \text{locally uniformly in } t.$$

Theorem: for $q \in (1, 2)$, if $X_1, X_2, \ldots$ are q-independent and identically distributed with q-mean $\mu_q$ and $(2q-1)$-variance $\sigma^2_{2q-1}$, then
$$Z_n = \frac{X_1 + X_2 + \cdots + X_n - n\mu_q}{C_{q,n,\sigma}}$$
q-converges to a $q_{-1}$-Gaussian distribution.

[9] S. Umarov, C. Tsallis and S. Steinberg. On a q-Central Limit Theorem Consistent with Nonextensive Statistical Mechanics. Milan Journal of Mathematics 76 (1), 307–328, 2008.