
Lecture Series on Math Fundamentals for MIMO Communications
Topic 1. Complex Random Vector and Circularly Symmetric Complex Gaussian Matrix

Drs. Yindi Jing and Xinwei Yu, University of Alberta
Last modified on: January 17, 2020


Contents
Part 1. Definitions and Results
1.1 Complex random variable/vector/matrix

1.2 Circularly symmetric complex Gaussian (CSCG) matrix

1.3 Isotropic distribution, decomposition of CSCG random matrix, Complex Wishart matrix and related PDFs

Part 2. Some MIMO applications

Part 3. References


Part 1. Definitions and Results
1.1 Complex random variable/vector/matrix
• Real-valued random variable
• Real-valued random vector
• Complex-valued random variable
• Complex-valued random vector
• Real-/complex-valued random matrix


Real-valued random variable
Def. A map from the sample space Ω to the real number set R.
• A sample space Ω;
• A measure P(·) defined on Ω;
• An experiment on Ω;
• A function: each outcome of the experiment ↦ a real number.
Notation: random variable X; sample value x.
Outcomes of the experiment: in Ω. "Invisible". Does not matter.
What matters: the value x assigned by the function.
Examples: Flip a coin and head → 0, tail → 1. A random bit of 0 or 1. Roll a die and even → 0, odd → 1.


How to describe a random variable? Def. The cumulative distribution function (CDF) of X:

$$F_X(x) = P(\omega \in \Omega : X(\omega) \le x) = P(X \le x).$$
Def. The probability density function (PDF) of X:
$$f_X(x) = \frac{dF_X(x)}{dx}.$$

Keep in mind: a random variable, though the result of an experiment, can be separated from the physical experiment. The values matter, not the outcomes of the experiment. In other words, ω is invisible; all one can see is x.


Common concepts, models, and properties of random variables:
• Discrete random variable, continuous random variable, mixed random variable
• Mean, variance, moments
• Multiple random variables: joint CDF/PDF, marginal CDF/PDF, conditional CDF/PDF, etc.
• Multiple random variables: independence and correlation


(Real-valued) random vector.

$$\mathbf{X} = \begin{bmatrix} X_1 \\ \vdots \\ X_K \end{bmatrix}, \qquad \text{value } \mathbf{x} = \begin{bmatrix} x_1 \\ \vdots \\ x_K \end{bmatrix}.$$
• K-dimensional.
• Each $X_i$: a random variable.
• Described through the joint PDF:
$$f_{\mathbf{X}}(\mathbf{x}) = f_{X_1, X_2, \cdots, X_K}(x_1, x_2, \cdots, x_K).$$


Mean vector and covariance matrix
• Mean vector:
$$\mathbf{m} = E(\mathbf{X}) = \begin{bmatrix} E(X_1) \\ \vdots \\ E(X_K) \end{bmatrix} = \begin{bmatrix} m_1 \\ \vdots \\ m_K \end{bmatrix}.$$
• Covariance matrix:
$$\Sigma = E\{(\mathbf{X} - \mathbf{m})(\mathbf{X} - \mathbf{m})^T\} = \Big[\, E[(X_i - m_i)(X_j - m_j)] \,\Big]_{i,j=1}^{K},$$
$\Sigma \succeq 0$ (positive semidefinite matrix).


Complex-valued random variable.

$$X = X_r + jX_s.$$

Equivalent to 2-dimensional real-valued random vector:

$$\hat{X} = \begin{bmatrix} X_r \\ X_s \end{bmatrix}.$$
To describe $\hat{X}$: the joint PDF of $(X_r, X_s)$ $\iff$ the joint PDF of the random vector $\hat{X}$:
$$f_{\hat{X}}(\hat{x}) = f_{X_r, X_s}(x_r, x_s).$$


Complex-valued random vector
Equivalent to a real-valued random vector of twice the dimension:

$$\mathbf{X} = \mathbf{X}_r + j\mathbf{X}_s \iff \hat{\mathbf{X}} = \begin{bmatrix} \mathbf{X}_r \\ \mathbf{X}_s \end{bmatrix}.$$
To describe X, use the joint PDF of $(\mathbf{X}_r, \mathbf{X}_s)$ (twice the dimension of X):
$$f_{\hat{\mathbf{X}}}(\hat{\mathbf{x}}) = f_{\mathbf{X}_r, \mathbf{X}_s}(\mathbf{x}_r, \mathbf{x}_s) = f_{X_{r,1}, \cdots, X_{r,K}, X_{s,1}, \cdots, X_{s,K}}(x_{r,1}, \cdots, x_{r,K}, x_{s,1}, \cdots, x_{s,K}).$$


Real-/complex-valued random matrix: $\mathbf{X} = [x_{ij}]$, an $M \times N$ matrix. Each entry $x_{ij}$ is a real-/complex-valued random variable. Also use $\mathbf{X}$ for a sample or a realization.
$\iff$ an $(MN)$-dimensional real-/complex-valued random vector.

To distinguish random vectors from random variables, use $\mathbf{x}$ for both a random vector and its realization; reserve $\mathbf{X}$ for both a random matrix and its realization.


The vectorization operation for $\mathbf{X} = [x_{ij}]$. By columns:

$$\mathrm{vec}(\mathbf{X}) = \begin{bmatrix} x_{11} \\ \vdots \\ x_{M1} \\ \vdots \\ x_{1N} \\ \vdots \\ x_{MN} \end{bmatrix} = \begin{bmatrix} \mathbf{x}_{\mathrm{col},1} \\ \vdots \\ \mathbf{x}_{\mathrm{col},N} \end{bmatrix}.$$
By rows:
$$\mathrm{vec}_{\mathrm{row}}(\mathbf{X}) = \begin{bmatrix} x_{11} & \cdots & x_{1N} & \cdots & x_{M1} & \cdots & x_{MN} \end{bmatrix} = \begin{bmatrix} \mathbf{x}_{\mathrm{row},1} & \cdots & \mathbf{x}_{\mathrm{row},M} \end{bmatrix}.$$
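The two conventions map directly onto numpy's column-major and row-major flattening. A minimal sketch (the matrix entries are arbitrary):

```python
import numpy as np

X = np.arange(6).reshape(2, 3)       # a 2 x 3 matrix [[0, 1, 2], [3, 4, 5]]
M, N = X.shape

vec_col = X.flatten(order="F")       # stack columns: [0, 3, 1, 4, 2, 5]
vec_row = X.flatten(order="C")       # stack rows:    [0, 1, 2, 3, 4, 5]

# The two orderings differ only by a permutation of indices.
P = np.zeros((M * N, M * N))
for k in range(M * N):
    i, j = k % M, k // M             # vec_col[k] is entry (i, j) of X
    P[k, i * N + j] = 1              # the same entry sits at i*N + j in vec_row
assert np.array_equal(P @ vec_row, vec_col)
```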

Remarks on the vectorization operation
• To describe the real-/complex-valued random matrix X is to describe the real-/complex-valued random vector vec(X).
• For a real-valued random matrix: the joint PDF of the entries of vec(X).
• For a complex-valued random matrix: two ways.
  – Vectorize first, then separate: $\widehat{\mathrm{vec}(\mathbf{X})} = \begin{bmatrix} \mathrm{vec}(\mathbf{X})_r \\ \mathrm{vec}(\mathbf{X})_s \end{bmatrix}$, where $\mathrm{vec}(\mathbf{X})_r$ and $\mathrm{vec}(\mathbf{X})_s$ are the real and imaginary parts of $\mathrm{vec}(\mathbf{X})$.
  – Separate first, then vectorize: $\mathrm{vec}(\hat{\mathbf{X}}) = \mathrm{vec}\begin{bmatrix} \mathbf{X}_r \\ \mathbf{X}_s \end{bmatrix}$.
The two are equivalent; the difference is a permutation of indices, i.e., a linear transformation by a permutation matrix.


1.2 Circularly symmetric complex Gaussian (CSCG) matrix
• Real-valued Gaussian random variable
• Real-valued Gaussian random vector and random matrix
• Complex-valued Gaussian random vector and CSCG random vector
• CSCG random matrix


Real-valued Gaussian random variable
Gaussian/normal distribution: $X \sim \mathcal{N}(m, \sigma^2)$; m: mean, $\sigma^2$: variance. PDF:
$$f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-m)^2}{2\sigma^2}}.$$
Standard Gaussian distribution: $\mathcal{N}(0, 1)$.
Q-function: if $X \sim \mathcal{N}(0, 1)$,
$$Q(x) \triangleq P(X > x) = \int_x^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{t^2}{2}}\, dt.$$
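Numerically, the Q-function is usually evaluated through the complementary error function via the standard identity $Q(x) = \frac{1}{2}\mathrm{erfc}(x/\sqrt{2})$; a minimal sketch:

```python
import math

def q_function(x: float) -> float:
    """Gaussian tail probability Q(x) = P(X > x) for X ~ N(0, 1)."""
    return 0.5 * math.erfc(x / math.sqrt(2))

print(q_function(0.0))   # 0.5
print(q_function(1.96))  # about 0.025 (the familiar two-sided 5% point)
```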


Real-valued Gaussian random vector and random matrix
Random vector: $\mathbf{x} = [X_1, \cdots, X_K]^T$.
Def. The random vector x is a Gaussian random vector if $X_1, X_2, \cdots, X_K$ are jointly Gaussian.
Jointly Gaussian RVs:
• $X_1, X_2$ are jointly Gaussian if both are Gaussian and $X_1|X_2$, $X_2|X_1$ are Gaussian.
• Can be generalized to more Gaussian RVs, e.g., $X_1, X_2, X_3$ are jointly Gaussian if each pair is jointly Gaussian and $(X_1, X_2)|X_3$, $(X_1, X_3)|X_2$, $(X_2, X_3)|X_1$ are jointly Gaussian.
• Independent Gaussian RVs are jointly Gaussian.
• Linear combinations of jointly Gaussian RVs are Gaussian.

• The PDF of x (the joint PDF of $X_1, \cdots, X_K$) is
$$f_{\mathbf{x}}(\mathbf{x}) = \frac{1}{(\sqrt{2\pi})^K \det^{1/2}(\Sigma)}\, e^{-\frac{1}{2}(\mathbf{x} - \mathbf{m})^T \Sigma^{-1} (\mathbf{x} - \mathbf{m})},$$
where m is the mean vector and Σ is the covariance matrix.
• Notation: $\mathbf{x} \sim \mathcal{N}(\mathbf{m}, \Sigma)$.
• Several special cases:
  – i.i.d. $\sim \mathcal{N}(0, 1)$:
$$f_{\mathbf{x}}(\mathbf{x}) = \frac{1}{(\sqrt{2\pi})^K}\, e^{-\frac{\|\mathbf{x}\|_2^2}{2}} = \frac{1}{(\sqrt{2\pi})^K}\, e^{-\frac{1}{2}\sum_{i=1}^K x_i^2}.$$
  – Independent only:
$$f_{\mathbf{x}}(\mathbf{x}) = \frac{1}{(\sqrt{2\pi})^K\, \sigma_1 \cdots \sigma_K}\, e^{-\sum_{i=1}^K \frac{(x_i - m_i)^2}{2\sigma_i^2}}.$$
  – Zero-mean.
  – Other cases ...


For a random matrix X which is $M \times N$, conduct vectorization to obtain vec(X), which is an (MN)-dimensional random vector.
• Working with the $M \times N$ matrix X $\iff$ working with the (MN)-dimensional vector vec(X).
• Thus the mean vector is MN-dimensional and the covariance matrix is $(MN) \times (MN)$.
• X is a Gaussian matrix if vec(X) is a Gaussian random vector.
• Matrix normal distribution: the PDF has the following format:
$$f_{\mathbf{X}}(\mathbf{X}) = \frac{e^{-\frac{1}{2}\mathrm{tr}[\mathbf{V}^{-1}(\mathbf{X} - \mathbf{M})^T \mathbf{U}^{-1}(\mathbf{X} - \mathbf{M})]}}{(\sqrt{2\pi})^{MN}\, \left(\sqrt{\det(\mathbf{U})}\right)^N \left(\sqrt{\det(\mathbf{V})}\right)^M},$$
for some $M \times M$ matrix U and $N \times N$ matrix V. Equivalently, $\mathrm{vec}(\mathbf{X}) \sim \mathcal{N}(\mathrm{vec}(\mathbf{M}), \mathbf{V} \otimes \mathbf{U})$.
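A matrix-normal sample can be drawn by coloring an i.i.d. Gaussian matrix on both sides: if $\mathbf{U} = \mathbf{A}\mathbf{A}^T$ and $\mathbf{V} = \mathbf{B}\mathbf{B}^T$, then $\mathbf{X} = \mathbf{M} + \mathbf{A}\mathbf{Z}\mathbf{B}^T$ with i.i.d. $\mathcal{N}(0,1)$ entries in Z gives $\mathrm{vec}(\mathbf{X}) \sim \mathcal{N}(\mathrm{vec}(\mathbf{M}), \mathbf{V} \otimes \mathbf{U})$. A minimal sketch with illustrative U and V:

```python
import numpy as np

rng = np.random.default_rng(0)
Md, Nd = 3, 2                                    # matrix dimensions M x N

Mean = np.zeros((Md, Nd))                        # illustrative mean matrix
U = np.array([[2.0, 0.5, 0.0],                   # illustrative row covariance (M x M)
              [0.5, 1.0, 0.2],
              [0.0, 0.2, 1.5]])
V = np.array([[1.0, 0.3],                        # illustrative column covariance (N x N)
              [0.3, 0.8]])

A = np.linalg.cholesky(U)                        # U = A A^T
B = np.linalg.cholesky(V)                        # V = B B^T

n = 200000
Z = rng.standard_normal((n, Md, Nd))
X = Mean + A @ Z @ B.T                           # n matrix-normal samples

vecs = X.transpose(0, 2, 1).reshape(n, -1)       # column-stacking vec of each sample
print(np.max(np.abs(np.cov(vecs.T) - np.kron(V, U))))   # close to 0
```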


• If the columns of X are i.i.d., each following $\mathcal{N}(\mathbf{m}, \Sigma)$, the mean vector of vec(X) is $\tilde{\mathbf{m}} = [\mathbf{m}^T, \cdots, \mathbf{m}^T]^T$ and the covariance matrix is $\tilde{\Sigma} = \mathrm{diag}\{\Sigma, \cdots, \Sigma\} = \mathbf{I} \otimes \Sigma$.
  – Notice that m is M-dimensional and Σ is $M \times M$.
  – PDF:
$$f_{\mathbf{X}}(\mathbf{X}) = f_{\mathrm{vec}(\mathbf{X})}(\mathrm{vec}(\mathbf{X})) = \frac{1}{(\sqrt{2\pi})^{MN} \det^{1/2}(\tilde{\Sigma})}\, e^{-\frac{1}{2}(\mathrm{vec}(\mathbf{X}) - \tilde{\mathbf{m}})^T \tilde{\Sigma}^{-1} (\mathrm{vec}(\mathbf{X}) - \tilde{\mathbf{m}})}$$
$$= \prod_{n=1}^{N} \frac{1}{(\sqrt{2\pi})^{M} \det^{1/2}(\Sigma)}\, e^{-\frac{1}{2}(\mathbf{x}_{\mathrm{col},n} - \mathbf{m})^T \Sigma^{-1} (\mathbf{x}_{\mathrm{col},n} - \mathbf{m})}$$
$$= \frac{1}{(\sqrt{2\pi})^{MN} \det^{N/2}(\Sigma)}\, e^{-\frac{1}{2}\mathrm{tr}[(\mathbf{X} - \mathbf{M})^T \Sigma^{-1} (\mathbf{X} - \mathbf{M})]},$$
where $\mathbf{M} = [\mathbf{m}, \cdots, \mathbf{m}]$.


– For this special case,
$$\Sigma = \frac{1}{N}\, E[(\mathbf{X} - \mathbf{M})(\mathbf{X} - \mathbf{M})^T].$$
– For zero-mean ($\mathbf{m} = \mathbf{0}$), $\Sigma = \frac{1}{N} E[\mathbf{X}\mathbf{X}^T]$ and
$$f_{\mathbf{X}}(\mathbf{X}) = \frac{1}{(\sqrt{2\pi})^{MN} \det^{N/2}(\Sigma)}\, e^{-\frac{1}{2}\mathrm{tr}[\mathbf{X}^T \Sigma^{-1} \mathbf{X}]}.$$
– Many works use the simplified notation $\mathbf{X} \sim \mathcal{N}(\mathbf{m}, \Sigma)$ for this case.
  ∗ Cannot use this for the general case.
  ∗ Understand what it really means.
  ∗ Rigorously speaking, Σ is not the covariance matrix of X.
  ∗ For the general case, the right-hand side of the top formula is the average covariance matrix of the columns of X.


• If the rows of X are i.i.d., each following $\mathcal{N}(\mathbf{m}, \Sigma)$:
  – Notice that m is N-dimensional (a column vector) and Σ is $N \times N$.
  – PDF:
$$f_{\mathbf{X}}(\mathbf{X}) = \frac{1}{(\sqrt{2\pi})^{MN} \det^{M/2}(\Sigma)}\, e^{-\frac{1}{2}\mathrm{tr}[(\mathbf{X} - \mathbf{M}^T) \Sigma^{-1} (\mathbf{X} - \mathbf{M}^T)^T]}.$$
  – For zero-mean, $\Sigma = \frac{1}{M} E[\mathbf{X}^T\mathbf{X}]$ and
$$f_{\mathbf{X}}(\mathbf{X}) = \frac{1}{(\sqrt{2\pi})^{MN} \det^{M/2}(\Sigma)}\, e^{-\frac{1}{2}\mathrm{tr}[\mathbf{X} \Sigma^{-1} \mathbf{X}^T]}.$$
  – The difference from the i.i.d.-column case is a $(\cdot)^T$ operation.
  – Be careful with the position of $(\cdot)^T$ and understand why.


Complex-valued Gaussian random vector and circularly symmetric complex Gaussian vector
Def. A K-dimensional complex-valued random vector $\mathbf{x} = \mathbf{x}_r + j\mathbf{x}_s$ is a complex Gaussian random vector if
$$\hat{\mathbf{x}} = \begin{bmatrix} \mathbf{x}_r \\ \mathbf{x}_s \end{bmatrix}$$
is a real-valued Gaussian random vector.
Def. The complex-valued Gaussian random vector x is circularly symmetric if
$$\Sigma_{\hat{\mathbf{x}}} = E\{(\hat{\mathbf{x}} - E[\hat{\mathbf{x}}])(\hat{\mathbf{x}} - E[\hat{\mathbf{x}}])^T\} = \frac{1}{2}\begin{bmatrix} \mathrm{Re}\{\mathbf{Q}\} & -\mathrm{Im}\{\mathbf{Q}\} \\ \mathrm{Im}\{\mathbf{Q}\} & \mathrm{Re}\{\mathbf{Q}\} \end{bmatrix}$$
for some $K \times K$ (Hermitian and) positive-semidefinite matrix Q, i.e., $\mathbf{Q} \succeq 0$.


• Notation for x being CSCG: $\mathbf{x} \sim \mathcal{CN}(\mathbf{m}, \mathbf{Q})$, where m is the mean vector and Q is the covariance matrix, both complex in general.
• Notation: $\hat{\mathbf{Q}} = \begin{bmatrix} \mathrm{Re}\{\mathbf{Q}\} & -\mathrm{Im}\{\mathbf{Q}\} \\ \mathrm{Im}\{\mathbf{Q}\} & \mathrm{Re}\{\mathbf{Q}\} \end{bmatrix}$.
• The real part and the imaginary part must be jointly Gaussian.
• The real part and the imaginary part have the same covariance matrix.
• Since $\hat{\mathbf{Q}} \succeq 0$ implies Hermitian, $\mathrm{Re}\{\mathbf{Q}\}$ must be symmetric and $\mathrm{Im}\{\mathbf{Q}\}$ must be anti-symmetric (skew-symmetric).


Before more details on the general CSCG random vector, consider a 1-dimensional random variable: $X \sim \mathcal{CN}(m, Q)$.
• Q is a non-negative real number.
• $X_r$ and $X_s$ are independent and have the same variance, which equals Q/2.
• $X \sim \mathcal{CN}(0, 1)$ means that the real part and the imaginary part of X are i.i.d. $\sim \mathcal{N}(0, 1/2)$.
• PDF of $X \sim \mathcal{CN}(m, Q)$:
$$f_X(x) = \frac{1}{\pi Q}\, e^{-\frac{|x - m|^2}{Q}}.$$
(A numerical check is sketched below.)
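A minimal sketch verifying these scalar properties by simulation; the values of m, Q, and the sample size are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
m, Q, n = 1 + 2j, 4.0, 200000

# X ~ CN(m, Q): real and imaginary parts independent, each N(Re/Im(m), Q/2).
X = m + np.sqrt(Q / 2) * (rng.standard_normal(n) + 1j * rng.standard_normal(n))

print(np.mean(X))                        # close to m = 1+2j
print(np.var(X.real), np.var(X.imag))    # each close to Q/2 = 2
print(np.mean(np.abs(X - m) ** 2))       # total variance close to Q = 4
```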


Back to the general K-dimensional CSCG random vector: $\mathbf{x} \sim \mathcal{CN}(\mathbf{m}, \mathbf{Q})$.
• PDF:
$$f_{\mathbf{x}}(\mathbf{x}) = \frac{1}{\pi^K \det(\mathbf{Q})}\, e^{-(\mathbf{x} - \mathbf{m})^H \mathbf{Q}^{-1} (\mathbf{x} - \mathbf{m})}.$$
Can be derived from the PDF of the real-valued case.
• Some identities related to the mappings $\mathbf{x} \to \hat{\mathbf{x}}$ and $\mathbf{Q} \to \hat{\mathbf{Q}}$:
  – $\mathbf{C} = \mathbf{A}\mathbf{B} \Leftrightarrow \hat{\mathbf{C}} = \hat{\mathbf{A}}\hat{\mathbf{B}}$.
  – $\mathbf{C} = \mathbf{A}^{-1} \Leftrightarrow \hat{\mathbf{C}} = \hat{\mathbf{A}}^{-1}$.
  – $\det(\hat{\mathbf{A}}) = \det(\mathbf{A}\mathbf{A}^H) = |\det(\mathbf{A})|^2$.
  – $\mathbf{y} = \mathbf{A}\mathbf{x} \Leftrightarrow \hat{\mathbf{y}} = \hat{\mathbf{A}}\hat{\mathbf{x}}$.
  – $\mathrm{Re}(\mathbf{x}^H \mathbf{y}) = \hat{\mathbf{x}}^H \hat{\mathbf{y}}$.
  – $\mathbf{Q} \succeq 0 \Leftrightarrow \hat{\mathbf{Q}} \succeq 0$.
  – $\mathbf{U}$ is unitary if and only if $\hat{\mathbf{U}}$ is orthonormal.
• For zero-mean, i.e., $\mathbf{x} \sim \mathcal{CN}(\mathbf{0}, \mathbf{Q})$, the PDF is
$$f_{\mathbf{x}}(\mathbf{x}) = \frac{1}{\pi^K \det(\mathbf{Q})}\, e^{-\mathbf{x}^H \mathbf{Q}^{-1} \mathbf{x}}.$$
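Using the linear-transformation lemma stated on the next slide, a CSCG vector with prescribed covariance can be generated as $\mathbf{x} = \mathbf{m} + \mathbf{L}\mathbf{z}$, where $\mathbf{L}\mathbf{L}^H = \mathbf{Q}$ and z is standard CSCG. A minimal sketch with an illustrative Q, also checking that the pseudo-covariance vanishes (circular symmetry):

```python
import numpy as np

rng = np.random.default_rng(1)
K, n = 2, 200000

m = np.array([1.0, 1j])
Q = np.array([[2.0, 0.5 + 0.5j],
              [0.5 - 0.5j, 1.0]])        # Hermitian positive definite

L = np.linalg.cholesky(Q)                # Q = L L^H
z = (rng.standard_normal((K, n)) + 1j * rng.standard_normal((K, n))) / np.sqrt(2)
x = m[:, None] + L @ z                   # each column ~ CN(m, Q)

d = x - m[:, None]
print(np.round(d @ d.conj().T / n, 2))   # covariance, close to Q
print(np.round(d @ d.T / n, 2))          # pseudo-covariance, close to 0
```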


Lemma. If $\mathbf{x} \sim \mathcal{CN}(\mathbf{m}, \Sigma)$, then for any A and b, $\mathbf{y} = \mathbf{A}\mathbf{x} + \mathbf{b} \sim \mathcal{CN}(\mathbf{A}\mathbf{m} + \mathbf{b}, \mathbf{A}\Sigma\mathbf{A}^H)$: any linear transformation (affine function) of a CSCG random vector is also a CSCG random vector.
Lemma. If x and y are independent CSCG random vectors, then $\mathbf{z} = \mathbf{x} + \mathbf{y}$ is CSCG.
Lemma. If $\mathbf{x} \sim \mathcal{CN}(\mathbf{0}, \mathbf{I})$ and Φ is a random unitary matrix that is independent of x, then $\mathbf{y} = \Phi\mathbf{x}$ also follows $\mathcal{CN}(\mathbf{0}, \mathbf{I})$ and is independent of Φ. x and y are equivalent in distribution.
Lemma. Let x be a zero-mean complex-valued random vector with $E[\mathbf{x}\mathbf{x}^H] = \mathbf{Q}$; then the entropy of x satisfies $H(\mathbf{x}) \le \log\det(\pi e \mathbf{Q})$, with equality if and only if x is CSCG, i.e., $\mathbf{x} \sim \mathcal{CN}(\mathbf{0}, \mathbf{Q})$.


Circularly symmetric complex Gaussian random matrix
• Following the previous slides, a complex-valued random matrix X is Gaussian if vec(X) is a complex-valued Gaussian random vector.
• A complex-valued random matrix X is CSCG if vec(X) is a CSCG random vector.
• Fundamentally, work on vec(X) using the previous definitions and results:
$$f_{\mathrm{vec}(\mathbf{X})}(\mathrm{vec}(\mathbf{X})) = \frac{1}{\pi^{MN} \det(\mathbf{Q})}\, e^{-(\mathrm{vec}(\mathbf{X}) - \mathbf{m})^H \mathbf{Q}^{-1} (\mathrm{vec}(\mathbf{X}) - \mathbf{m})},$$
where m is (MN)-dimensional and Q is $(MN) \times (MN)$.


• If the columns (or rows) of X ($M \times N$) are i.i.d. $\sim \mathcal{CN}(\mathbf{0}, \mathbf{I})$, meaning that all entries of X are i.i.d. $\sim \mathcal{CN}(0, 1)$, then X is a CSCG matrix and its PDF is
$$f_{\mathrm{vec}(\mathbf{X})}(\mathrm{vec}(\mathbf{X})) = \frac{1}{\pi^{MN}}\, e^{-\mathrm{vec}(\mathbf{X})^H \mathrm{vec}(\mathbf{X})},$$
or equivalently,
$$f_{\mathbf{X}}(\mathbf{X}) = \frac{1}{\pi^{MN}}\, e^{-\|\mathbf{X}\|_F^2} = \frac{1}{\pi^{MN}}\, e^{-\mathrm{tr}(\mathbf{X}^H\mathbf{X})},$$
where $\|\mathbf{X}\|_F$ is the Frobenius norm of X.
  – Many works use the simplified notation for this case: $\mathbf{X} \sim \mathcal{CN}(\mathbf{0}_{M \times N}, \mathbf{I}_{M \times M})$.
  – It implies that the columns are independent and each column has the same covariance matrix $\mathbf{I}_{M \times M}$.
  – Not true for the general case. May cause confusion or mistakes if taken for granted.


• If the columns of X ($M \times N$) are independent and CSCG, i.e., $\mathbf{x}_{\mathrm{col},n} \sim \mathcal{CN}(\mathbf{m}_n, \mathbf{Q}_n)$, then X is a CSCG matrix and its PDF is
$$f_{\mathbf{X}}(\mathbf{X}) = \frac{e^{-\sum_{n=1}^{N} (\mathbf{x}_{\mathrm{col},n} - \mathbf{m}_n)^H \mathbf{Q}_n^{-1} (\mathbf{x}_{\mathrm{col},n} - \mathbf{m}_n)}}{\pi^{MN} \det(\mathbf{Q}_1) \cdots \det(\mathbf{Q}_N)} = \frac{e^{-(\mathrm{vec}(\mathbf{X}) - \mathbf{m}_{\mathrm{full}})^H \mathbf{Q}_{\mathrm{full}}^{-1} (\mathrm{vec}(\mathbf{X}) - \mathbf{m}_{\mathrm{full}})}}{\pi^{MN} \det(\mathbf{Q}_{\mathrm{full}})},$$
where $\mathbf{m}_{\mathrm{full}} = [\mathbf{m}_1^T, \cdots, \mathbf{m}_N^T]^T$ and $\mathbf{Q}_{\mathrm{full}} = \mathrm{diag}\{\mathbf{Q}_1, \cdots, \mathbf{Q}_N\}$.
• The case with independent row vectors can be analyzed similarly.
• Matrix normal distribution for the CSCG case: the PDF has the following format:
$$f_{\mathbf{X}}(\mathbf{X}) = \frac{e^{-\mathrm{tr}[\mathbf{V}^{-1}(\mathbf{X} - \mathbf{M})^H \mathbf{U}^{-1}(\mathbf{X} - \mathbf{M})]}}{\pi^{MN} \det^{N}(\mathbf{U}) \det^{M}(\mathbf{V})},$$
for some $M \times M$ matrix U and $N \times N$ matrix V. Equivalently, $\mathrm{vec}(\mathbf{X}) \sim \mathcal{CN}(\mathrm{vec}(\mathbf{M}), \mathbf{V} \otimes \mathbf{U})$.

1.3 Isotropic distribution, decomposition of CSCG random matrix, Wishart matrix and related PDFs
• Isotropic distribution and decomposition of CSCG random matrix
• Wishart matrix and related PDFs


Isotropic distribution and decomposition of CSCG random matrix
Def. An n-dimensional random complex unit vector u is isotropically distributed if its probability density is invariant to all unitary transformations. That is,
$$f_{\mathbf{u}}(\mathbf{u}) = f_{\mathbf{u}}(\Phi\mathbf{u}) \quad \text{for any } \Phi \text{ with } \Phi^H\Phi = \mathbf{I}.$$
• Uniform distribution on the set of unit vectors.
• The PDF depends on the magnitude (length) only, not the direction.
• Elements of u are dependent.


The PDF of an isotropically distributed unit vector u is
$$f(\mathbf{u}) = \frac{\Gamma(n)}{\pi^n}\, \delta(\mathbf{u}^H\mathbf{u} - 1).$$
The PDF of any L elements of u ($L < n$):
$$f(\mathbf{u}^{(L)}) = \frac{\Gamma(n)}{\pi^L\, \Gamma(n - L)}\, \left(1 - (\mathbf{u}^{(L)})^H\mathbf{u}^{(L)}\right)^{n - L - 1}, \quad (\mathbf{u}^{(L)})^H\mathbf{u}^{(L)} \le 1,$$
where $\mathbf{u}^{(L)}$ is a vector of any L elements of u.


Def. An $n \times n$ unitary matrix U is isotropically distributed if its probability density is invariant when left-multiplied by any deterministic unitary matrix, that is,
$$f_{\mathbf{U}}(\mathbf{U}) = f_{\mathbf{U}}(\Phi\mathbf{U}) \quad \text{for any } \Phi \text{ with } \Phi^H\Phi = \mathbf{I}.$$
• Uniform distribution on the set of unitary matrices.
• Same density in all "directions".
• The PDF is also invariant when right-multiplied by a unitary matrix.
• The transpose and the conjugate of U are also isotropically distributed.
• Any column of U is a random complex unit vector.
• Columns of U are dependent.
The PDF of an isotropically distributed unitary matrix U is
$$f(\mathbf{U}) = \frac{\prod_{i=1}^{n} \Gamma(i)}{\pi^{n(n-1)/2}}\, \delta(\mathbf{U}^H\mathbf{U} - \mathbf{I}).$$
There are general results on the moments of the elements of U.


Theorem: Let X be an $m \times n$ standard CSCG matrix, i.e., $\mathbf{X} \sim \mathcal{CN}(\mathbf{0}, \mathbf{I})$, where $m \ge n$. Let $\mathbf{X} = \Phi\mathbf{R}$ be the QR decomposition, normalized so that the diagonal elements of R are positive. Then:
• Φ is an isotropically distributed unitary matrix.
• The elements of R are independent of each other.
• R is independent of Φ.
• The elements of R above the diagonal are $\mathcal{CN}(0, 1)$.
• The i-th diagonal element of R is such that $r_{ii}^2$ is half of a $\chi^2$ random variable with $2(m - i + 1)$ degrees of freedom: $2r_{ii}^2 \sim \chi^2_{2(m-i+1)}$.
• The case of m < n can be considered similarly.
(A numerical check is sketched below.)
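A minimal sketch checking the diagonal statistics: after fixing the QR phases so that diag(R) > 0, $E[r_{ii}^2] = m - i + 1$ since $2r_{ii}^2 \sim \chi^2_{2(m-i+1)}$. Dimensions and trial count are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, trials = 6, 4, 20000

r_diag_sq = np.zeros(n)
for _ in range(trials):
    X = (rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))) / np.sqrt(2)
    Phi, R = np.linalg.qr(X)
    # Absorb phases so the diagonal of R is positive (Phi would take the inverse phases).
    ph = np.diag(R) / np.abs(np.diag(R))
    R = np.diag(ph.conj()) @ R
    r_diag_sq += np.diag(R).real ** 2
print(r_diag_sq / trials)   # close to [m, m-1, m-2, m-3] = [6, 5, 4, 3]
```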


Theorem: Let X be an $m \times n$ standard CSCG matrix, i.e., $\mathbf{X} \sim \mathcal{CN}(\mathbf{0}, \mathbf{I})$. Let $\mathbf{X} = \mathbf{U}\Sigma\mathbf{V}^H$ be the singular value decomposition (SVD). Then:
• U and V are isotropically distributed unitary matrices.
• U, Σ, V are independent.
• See later parts for the distributions of the elements of Σ.


Theorem: Let X be an $m \times n$ standard CSCG matrix, i.e., $\mathbf{X} \sim \mathcal{CN}(\mathbf{0}, \mathbf{I})$, where $n \ge m$. Then X is unitarily similar to the $m \times n$ bidiagonal matrix
$$\frac{1}{\sqrt{2}}\begin{bmatrix} x_{2n} & 0 & \cdots & & 0 \\ y_{2(m-1)} & x_{2(n-1)} & 0 & \cdots & 0 \\ & \ddots & \ddots & & \vdots \\ & & y_2 & x_{2(n-(m-1))} & 0 \cdots 0 \end{bmatrix},$$
where the $x_i^2$ and $y_i^2$ are independent and follow the $\chi^2$ distribution with i degrees of freedom.


Complex Wishart matrix and related PDFs
Def. Let X be an $m \times n$ ($m \ge n$) random matrix where each row is a zero-mean CSCG random vector following $\mathcal{CN}(\mathbf{0}, \mathbf{V})$ and the rows are independent. The $n \times n$ matrix
$$\mathbf{W} = \mathbf{X}^H\mathbf{X} = \sum_{i=1}^{m} \mathbf{x}_{\mathrm{row},i}^H \mathbf{x}_{\mathrm{row},i}$$
is a (centralized) Wishart matrix. The distribution of W is called the (centralized) Wishart distribution, denoted as $\mathcal{CW}_n(\mathbf{V}, m)$.
• PDF of the complex Wishart matrix:
$$f_{\mathbf{W}}(\mathbf{W}) = \frac{\det^{m-n}(\mathbf{W})}{\pi^{n(n-1)/2}\, \Gamma(m) \cdots \Gamma(m-n+1)\, \det^{m}(\mathbf{V})}\, e^{-\mathrm{tr}(\mathbf{V}^{-1}\mathbf{W})}.$$
• Similar results hold for m < n and for column-independent CSCG random matrices.


Consider the special case of $\mathbf{V} = \mathbf{I}$, i.e., $\mathbf{X} \sim \mathcal{CN}(\mathbf{0}, \mathbf{I})$. Define
$$\mathbf{W} = \begin{cases} \mathbf{X}\mathbf{X}^H & m < n \\ \mathbf{X}^H\mathbf{X} & m \ge n \end{cases}$$
and $N = \max\{m, n\}$, $M = \min\{m, n\}$. The probability distribution of W is called the Wishart distribution with parameters m and n (notice that $N \ge M$ and W is $M \times M$).
• PDF of W:
$$f_{\mathbf{W}}(\mathbf{W}) = \frac{\det^{N-M}(\mathbf{W})}{\pi^{M(M-1)/2}\, \Gamma(N) \cdots \Gamma(N-M+1)}\, e^{-\mathrm{tr}(\mathbf{W})}.$$
• Joint PDF of the ordered eigenvalues $\lambda_1 \ge \cdots \ge \lambda_M \ge 0$:
$$f(\lambda_1, \cdots, \lambda_M) = C(M, N)\, e^{-\sum_{i=1}^{M} \lambda_i}\, \prod_{i=1}^{M} \lambda_i^{N-M} \prod_{i<j} (\lambda_i - \lambda_j)^2.$$


• Joint PDF of the unordered eigenvalues:
$$f(\lambda_1, \cdots, \lambda_M) = \frac{C(M, N)}{M!}\, e^{-\sum_{i=1}^{M} \lambda_i}\, \prod_{i=1}^{M} \lambda_i^{N-M} \prod_{i<j} (\lambda_i - \lambda_j)^2$$
for $\lambda_1, \cdots, \lambda_M \ge 0$.
• Marginal PDF of an unordered eigenvalue:
$$f_\lambda(\lambda) = \frac{1}{M} \sum_{i=1}^{M} \frac{(i-1)!}{(i-1+N-M)!}\, \left[L_{i-1}^{N-M}(\lambda)\right]^2 \lambda^{N-M} e^{-\lambda}, \quad \lambda \ge 0,$$
where $L_k^{N-M}(x) = \frac{1}{k!}\, e^x x^{M-N} \frac{d^k}{dx^k}\left(e^{-x} x^{N-M+k}\right)$ is the Laguerre polynomial of order k.
• There are results on the PDFs of the maximum and minimum eigenvalues.
• Inverse Wishart distribution: Y follows the inverse Wishart distribution if its inverse follows a Wishart distribution.
• Non-centralized Wishart matrix: when X has non-zero mean.
(A Monte Carlo check of the marginal PDF is sketched below.)
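A minimal sketch comparing the empirical eigenvalue histogram of $\mathbf{W} = \mathbf{X}^H\mathbf{X}$ against the marginal formula above, assuming scipy is available for the generalized Laguerre polynomials; m, n, and the trial count are illustrative:

```python
import numpy as np
from math import factorial
from scipy.special import genlaguerre   # standard generalized Laguerre L_k^{(a)}

rng = np.random.default_rng(3)
m, n = 6, 4
N, M = max(m, n), min(m, n)

# Pool the eigenvalues of many Wishart samples W = X^H X, X entries i.i.d. CN(0,1).
eigs = []
for _ in range(20000):
    X = (rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))) / np.sqrt(2)
    eigs.extend(np.linalg.eigvalsh(X.conj().T @ X))
eigs = np.asarray(eigs)

def f_lambda(lam):
    """Marginal PDF of one unordered eigenvalue (formula above)."""
    s = sum(factorial(i - 1) / factorial(i - 1 + N - M)
            * genlaguerre(i - 1, N - M)(lam) ** 2 for i in range(1, M + 1))
    return s * lam ** (N - M) * np.exp(-lam) / M

hist, edges = np.histogram(eigs, bins=100, range=(0, 30), density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
print(np.max(np.abs(hist - f_lambda(centers))))   # small (Monte Carlo error only)
```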


Part 2. MIMO applications
• MIMO channel model
• MIMO capacity
• Diversity analysis of distributed space-time coding with multiple antennas
• Performance analysis of massive MIMO with ZF


MIMO channel model
• A wireless link: multi-path, delay spread, mobility, etc.
• Frequency-flat fading channel: the delay spread of the channel is negligible compared to the symbol interval. The coherence bandwidth of the channel is much larger than the signal bandwidth; therefore, all frequency components of the signal experience the same magnitude of fading.
• Frequency-selective fading channel (the counterpart).


MIMO system with M transmit antennas and N receive antennas:

[Figure: a transmitter with antennas 1, ..., M and a receiver with antennas 1, ..., N]

• At a given time/transmission, for frequency-flat fading over the bandwidth of interest, the channel can be written as a matrix:

$$\mathbf{H} = \begin{bmatrix} h_{11} & h_{12} & \cdots & h_{1M} \\ h_{21} & h_{22} & \cdots & h_{2M} \\ \vdots & \vdots & \ddots & \vdots \\ h_{N1} & h_{N2} & \cdots & h_{NM} \end{bmatrix},$$
where $h_{nm}$ is the channel gain from the m-th TX antenna to the n-th RX antenna.
• Each $h_{nm}$ is a complex value (quadrature-carrier multiplexing).
• Affected by multi-path fading, path-loss, and shadowing.


• i.i.d. Rayleigh fading model
  – The number of scatterers is large, and all scattered contributions are non-coherent and of approximately equal energy (via the central limit theorem).
  – Indoor and no line-of-sight.
  – Enough spacing between antennas (e.g., half-wavelength or larger) for independent entries.
  – Each channel coefficient is modeled as a circularly symmetric complex Gaussian random variable with zero mean.
  – The magnitude of each channel entry (channel gain) follows the Rayleigh PDF; the magnitude-square follows the exponential PDF.


– The channel variance $\sigma_h^2$ depends on the large-scale fading (e.g., distance) and is often normalized to 1.
$$h_{nm} \sim \mathcal{CN}(0, \sigma_h^2) \text{ and i.i.d.}; \qquad \mathrm{Re}(h_{nm}), \mathrm{Im}(h_{nm}) \sim \mathcal{N}(0, \sigma_h^2/2) \text{ independent}.$$
$$f_{|h_{nm}|}(x) = \frac{2x}{\sigma_h^2}\, e^{-\frac{x^2}{\sigma_h^2}}, \quad x \ge 0.$$
$$f_{|h_{nm}|^2}(x) = \frac{1}{\sigma_h^2}\, e^{-\frac{x}{\sigma_h^2}}, \quad x \ge 0.$$
– H is CSCG: $\mathbf{H} \sim \mathcal{CN}(\mathbf{0}, \sigma_h^2\mathbf{I})$, which is $N \times M$. (A numerical check is sketched below.)
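A minimal sketch generating i.i.d. Rayleigh channel coefficients and checking the magnitude statistics ($\sigma_h^2 = 1$; sample size illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
n, sigma2_h = 200000, 1.0

# One channel coefficient h ~ CN(0, sigma2_h), drawn n times.
h = np.sqrt(sigma2_h / 2) * (rng.standard_normal(n) + 1j * rng.standard_normal(n))

g = np.abs(h) ** 2
print(np.mean(g), np.var(g))    # exponential: mean = sigma2_h = 1, var = sigma2_h^2 = 1
print(np.mean(np.abs(h)))       # Rayleigh mean = sqrt(pi * sigma2_h) / 2 ~ 0.886
```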


• Correlated Rayleigh channel model
  – H ($N \times M$) is CSCG with zero mean.
  – Correlation matrix of H:
$$\mathbf{R}_H = E\left[\mathrm{vec}(\mathbf{H})\mathrm{vec}(\mathbf{H})^H\right], \quad \text{which is } MN \times MN.$$
  – Another representation:
$$\mathrm{vec}(\mathbf{H}) = \mathbf{R}_H^{1/2}\mathrm{vec}(\tilde{\mathbf{H}}), \quad \text{where } \mathrm{vec}(\tilde{\mathbf{H}}) \sim \mathcal{CN}(\mathbf{0}, \mathbf{I}_{MN}).$$
  – Transmit correlation matrix and receive correlation matrix:
$$\mathbf{R}_{t,H} = \frac{1}{N} E\left[\mathbf{H}^H\mathbf{H}\right], \qquad \mathbf{R}_{r,H} = \frac{1}{M} E\left[\mathbf{H}\mathbf{H}^H\right].$$


– Kronecker model: when 1) the transmit correlation is independent of the receive antennas and vice versa, and 2) the cross-correlation equals the Kronecker product of the corresponding transmit and receive correlations,
$$\mathbf{R}_H = \mathbf{R}_{r,H} \otimes \mathbf{R}_{t,H}.$$
In this case, we have $\mathbf{H} = \mathbf{R}_{r,H}^{1/2}\tilde{\mathbf{H}}\mathbf{R}_{t,H}^{1/2}$, where $\tilde{\mathbf{H}} \sim \mathcal{CN}(\mathbf{0}, \mathbf{I})$. (A sampling sketch follows.)
– See previous slides for the PDF.
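Under the Kronecker model, a correlated realization is obtained by coloring an i.i.d. CSCG matrix on both sides. A minimal sketch with illustrative exponential-correlation matrices $[\mathbf{R}]_{ij} = \rho^{|i-j|}$ (any factors with $\mathbf{L}\mathbf{L}^H$ equal to the correlation matrices work, since $\tilde{\mathbf{H}}$ is isotropic):

```python
import numpy as np

rng = np.random.default_rng(5)
N, M = 2, 3

def exp_corr(dim, rho):
    """Illustrative exponential correlation matrix [R]_ij = rho^|i-j|."""
    idx = np.arange(dim)
    return rho ** np.abs(idx[:, None] - idx[None, :])

Rr, Rt = exp_corr(N, 0.7), exp_corr(M, 0.5)
Lr, Lt = np.linalg.cholesky(Rr), np.linalg.cholesky(Rt)

n = 100000
Ht = (rng.standard_normal((n, N, M)) + 1j * rng.standard_normal((n, N, M))) / np.sqrt(2)
H = Lr @ Ht @ Lt.conj().T                 # H = Rr^{1/2} H_tilde Rt^{1/2}

# Check the transmit correlation (1/N) E[H^H H] against Rt.
Rt_hat = np.mean(np.swapaxes(H, 1, 2).conj() @ H, axis=0) / N
print(np.round(Rt_hat, 2))                # close to Rt
```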


MIMO Capacity for CSI at the receiver only
System description:
• Point-to-point multiple-antenna system with M transmit antennas and N receive antennas, with channel matrix H.
• MIMO transceiver equation:
$$\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{n},$$
  – x ($M \times 1$) contains the signals sent by the M transmit antennas; its m-th entry is the signal sent by the m-th antenna.
  – $E[\mathbf{x}^H\mathbf{x}] \le P$, where P is the maximum transmit power.
  – y ($N \times 1$) contains the received signals at the N receive antennas; its n-th entry is the signal received at the n-th antenna.
  – n: noise vector. Assume $\mathbf{n} \sim \mathcal{CN}(\mathbf{0}, \Sigma_n)$, independent of H and x.
Problem: to analyze the capacity when the receiver knows H.
• Capacity: $C = \max_{f_{\mathbf{x}}(\mathbf{x})} I(\mathbf{x}; \mathbf{y}) = \max_{f_{\mathbf{x}}(\mathbf{x})} \left[H(\mathbf{y}) - H(\mathbf{y}|\mathbf{x})\right]$.


Sketch of method and result.
• $H(\mathbf{y}|\mathbf{x}) = H(\mathbf{H}\mathbf{x} + \mathbf{n}|\mathbf{x}) = H(\mathbf{n}|\mathbf{x}) = H(\mathbf{n})$. From the previous lemma, $H(\mathbf{n}) = \log\det(\pi e \Sigma_n)$.
• From the transceiver equation, $\Sigma_{\mathbf{y}} = \mathbf{H}\Sigma_{\mathbf{x}}\mathbf{H}^H + \Sigma_n$.
• Zero-mean x saves power and does not hurt the capacity, so assume $E[\mathbf{x}] = \mathbf{0}$. Therefore $E[\mathbf{y}] = \mathbf{0}$ and $\mathrm{tr}\,\Sigma_{\mathbf{x}} = E[(\mathbf{x} - E[\mathbf{x}])^H(\mathbf{x} - E[\mathbf{x}])] \le E(\mathbf{x}^H\mathbf{x})$.
• For any $\Sigma_{\mathbf{x}}$ with $\mathrm{tr}\,\Sigma_{\mathbf{x}} \le P$, from a previous lemma,
$$\max H(\mathbf{y}) = \log\det(\pi e \Sigma_{\mathbf{y}}) = \log\det[\pi e(\mathbf{H}\Sigma_{\mathbf{x}}\mathbf{H}^H + \Sigma_n)],$$
achieved when y is CSCG, i.e., when x is CSCG.
• Thus,
$$C_{\mathrm{MIMO,H}} = \max_{\mathrm{tr}\,\Sigma_{\mathbf{x}} \le P} E\log\det(\mathbf{I}_N + \Sigma_n^{-1}\mathbf{H}\Sigma_{\mathbf{x}}\mathbf{H}^H) = \max_{\Sigma_{\mathbf{x}} \text{ diagonal},\ \mathrm{tr}\,\Sigma_{\mathbf{x}} \le P} E\log\det(\mathbf{I}_N + \Sigma_n^{-1}\mathbf{H}\Sigma_{\mathbf{x}}\mathbf{H}^H).$$


• For any permutation matrix Π, HΠ has the same distribution as H, and $\log\det(\mathbf{X})$ is concave for $\mathbf{X} \succeq 0$. Thus
$$E\log\det(\mathbf{I}_N + \Sigma_n^{-1}\mathbf{H}\Sigma_{\mathbf{x}}\mathbf{H}^H) = \frac{1}{M!}\sum_{\Pi} E\log\det(\mathbf{I}_N + \Sigma_n^{-1}\mathbf{H}\Pi\Sigma_{\mathbf{x}}\Pi^H\mathbf{H}^H)$$
$$\le E\left\{\log\det\left(\mathbf{I}_N + \Sigma_n^{-1}\mathbf{H}\left[\frac{1}{M!}\sum_{\Pi}\Pi\Sigma_{\mathbf{x}}\Pi^H\right]\mathbf{H}^H\right)\right\} = E\log\det\left(\mathbf{I}_N + \frac{\mathrm{tr}\,\Sigma_{\mathbf{x}}}{M}\Sigma_n^{-1}\mathbf{H}\mathbf{H}^H\right),$$
with equality when $\Sigma_{\mathbf{x}} = (\mathrm{tr}\,\Sigma_{\mathbf{x}}/M)\mathbf{I}_M$. Thus,
$$C_{\mathrm{MIMO,H}} = E\log\det\left(\mathbf{I}_N + \frac{P}{M}\Sigma_n^{-1}\mathbf{H}\mathbf{H}^H\right),$$
achieved when $\Sigma_{\mathbf{x}} = (P/M)\mathbf{I}_M$.


Special/asymptotic cases and discussions.
• With i.i.d. noises, each following $\mathcal{CN}(0, \sigma^2)$,
$$C_{\mathrm{MIMO,H}} = E\log\det\left(\mathbf{I}_N + \frac{P}{M\sigma^2}\mathbf{H}\mathbf{H}^H\right) = E\log\det\left(\mathbf{I}_N + \frac{\rho}{M}\mathbf{H}\mathbf{H}^H\right).$$
• When there is no CSI at the TX:
  – No reason to transmit more energy on one antenna than on another; thus, the same average energy/power across antennas.
  – No reason for correlation or dependence between the transmit signals of different antennas.
• When N is fixed and $M \to \infty$:
$$\frac{1}{M}\mathbf{H}\mathbf{H}^H \to \mathbf{I}_N, \qquad C \to N\log(1 + \rho).$$
• When M is fixed and $N \to \infty$:
$$\frac{1}{N}\mathbf{H}^H\mathbf{H} \to \mathbf{I}_M, \qquad C \approx M\log(1 + \rho N/M).$$
(A Monte Carlo sketch follows.)
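The ergodic capacity is easy to estimate by Monte Carlo; a minimal sketch comparing it with the fixed-N, large-M limit (using base-2 logs, so rates are in bits per channel use; dimensions, SNR, and trial count are illustrative):

```python
import numpy as np

rng = np.random.default_rng(6)
rho = 10.0                                  # SNR with unit-variance i.i.d. noise

def ergodic_capacity(N, M, rho, trials=20000):
    """Monte Carlo estimate of E log2 det(I_N + (rho/M) H H^H) for i.i.d. Rayleigh H."""
    c = 0.0
    for _ in range(trials):
        H = (rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))) / np.sqrt(2)
        c += np.log2(np.linalg.det(np.eye(N) + (rho / M) * H @ H.conj().T).real)
    return c / trials

print(ergodic_capacity(2, 2, rho))          # ergodic capacity (Monte Carlo)
print(ergodic_capacity(2, 64, rho))         # approaches the large-M limit ...
print(2 * np.log2(1 + rho))                 # ... N log2(1 + rho) ~ 6.92
```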


• General case:
$$C_{\mathrm{MIMO,H}} = E\left[\sum_{k=1}^{K} \log\left(1 + \frac{\rho}{M}\lambda_k\right)\right] = K\, E\left[\log\left(1 + \frac{\rho}{M}\lambda\right)\right],$$
where
  – $\lambda_1, \cdots, \lambda_K$ are the eigenvalues of $\mathbf{H}\mathbf{H}^H$,
  – λ represents one unordered eigenvalue, and $K = \mathrm{rank}(\mathbf{H})$,
  – use the eigenvalue distributions on previous slides for further calculations.


Performance analysis of distributed space-time coding for MIMO relay networks
System description:

[Figure: relay network — a transmitter with M antennas, R single-antenna relays, and a receiver with N antennas; transmitter-relay channels $f_{mr}$, relay-receiver channels $g_{rn}$; Step 1: time 1 to T, Step 2: time T+1 to 2T]

• One transmitter with M transmit antennas.
• One receiver with N receive antennas.
• R relays, each with a single transmit and receive antenna.
• Block-fading channels and CSI at the receiver only.
• Transmitter-relay channels: the $f_{mr}$'s. Relay-receiver channels: the $g_{rn}$'s.


Transceiver protocol and model.
• Two-step transmission, each step containing T time slots.
  Step 1: transmitter → relays. Step 2: relays → receiver.
• Relay processing: distributed space-time coding (DSTC). Each relay applies a linear transformation to its received signal and then transmits:
$$\mathbf{t}_i = \sqrt{\frac{P_2}{P_1 + 1}}\, \mathbf{A}_i\mathbf{r}_i,$$
  – $\mathbf{r}_i$: received vector at Relay i in Step 1.
  – $\mathbf{t}_i$: transmit vector from Relay i in Step 2.
  – $\mathbf{A}_i$: a pre-determined (unitary) $T \times T$ matrix.
  – $P_1$: transmit power of Step 1.
  – $P_2$: transmit power per relay in Step 2.


• End-to-end transceiver equation:
$$\mathbf{X} = \sqrt{\beta}\,\mathbf{S}\mathbf{H} + \mathbf{W},$$
  – X: $T \times N$ received signal matrix.
  – $\beta \triangleq \alpha\frac{P_1 T}{M}$, where $\alpha \triangleq \frac{P_2}{P_1 + 1}$.
  – $\mathbf{S} \triangleq \left[\mathbf{A}_1\mathbf{s} \ \cdots \ \mathbf{A}_R\mathbf{s}\right]$ is the distributed space-time code, depending on the $\mathbf{A}_i$'s and the information vector s.
  – $\mathbf{H} \triangleq \left[(\mathbf{f}_1\mathbf{g}_1)^t \ \cdots \ (\mathbf{f}_R\mathbf{g}_R)^t\right]^t$ is the $RM \times N$ equivalent channel matrix.
  – $\mathbf{W} \triangleq \sqrt{\alpha}\left[\sum_{i=1}^{R} g_{i1}\mathbf{A}_i\mathbf{v}_i \ \cdots \ \sum_{i=1}^{R} g_{iN}\mathbf{A}_i\mathbf{v}_i\right] + \mathbf{w}$ is the equivalent noise term; $\mathbf{v}_i$ is the noise vector at Relay i and w is the noise at the receiver.


Result 1. Define
$$\mathbf{R}_W = \mathbf{I} + \alpha\mathbf{G}^H\mathbf{G}.$$
Given that $\mathbf{s}_k$ is transmitted and the corresponding distributed space-time code is $\mathbf{S}_k$, the rows of X are independent, each CSCG distributed with the same covariance $\mathbf{R}_W^t$. The conditional PDF of $\mathbf{X}|\mathbf{S}_k$ is
$$f(\mathbf{X}|\mathbf{S}_k) = \frac{1}{(\pi^N \det\mathbf{R}_W)^T}\, e^{-\mathrm{tr}\left[(\mathbf{X} - \sqrt{\beta}\mathbf{S}_k\mathbf{H})\mathbf{R}_W^{-1}(\mathbf{X} - \sqrt{\beta}\mathbf{S}_k\mathbf{H})^H\right]}.$$

Sketch of proof:
• With known CSI, since the $\mathbf{v}_i$'s and w are independent CSCG, X is a linear combination of CSCG random matrices and constants, and thus also a CSCG random matrix.
• It is straightforward to see that $E(\mathbf{X}|\mathbf{S}_k) = \sqrt{\beta}\mathbf{S}_k\mathbf{H}$.
• The rows of X are independent. (The columns are not.)


• Entry-wise,
$$x_{tn} = \sqrt{\beta}\,[\mathbf{S}_k\mathbf{H}]_{tn} + \sqrt{\alpha}\sum_{i=1}^{R}\sum_{\tau=1}^{T} g_{in}\, a_{i,t\tau}\, v_{i\tau} + w_{tn},$$
so
$$\mathrm{Cov}(x_{t_1n_1}, x_{t_2n_2}) = \delta_{t_1t_2}\left(\alpha\begin{bmatrix} g_{1n_1} & \cdots & g_{Rn_1} \end{bmatrix}\begin{bmatrix} \bar{g}_{1n_2} \\ \vdots \\ \bar{g}_{Rn_2} \end{bmatrix} + \delta_{n_1n_2}\right).$$
By combining the results in matrix form, the covariance matrix of each row is $\mathbf{I}_N + \alpha\mathbf{G}^t\bar{\mathbf{G}} = \mathbf{R}_W^t$.
• Therefore, the PDF of the i-th row is
$$f([\mathbf{X}]_i|\mathbf{S}_k) = \left(\pi^N\det\mathbf{R}_W\right)^{-1}\, e^{-[\mathbf{X} - \sqrt{\beta}\mathbf{S}_k\mathbf{H}]_i \mathbf{R}_W^{-1} [\mathbf{X} - \sqrt{\beta}\mathbf{S}_k\mathbf{H}]_i^H}.$$
• With independent rows, the PDF of X is $f(\mathbf{X}|\mathbf{S}_k) = \prod_{i=1}^{T} f([\mathbf{X}]_i|\mathbf{S}_k)$.
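This row-covariance structure can be checked by simulation. A minimal sketch under the stated assumptions (unit-variance CSCG relay and receiver noise); the dimensions, α, G, and the unitary $\mathbf{A}_i$ below are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(7)
T, R, N, alpha = 4, 3, 2, 0.5

# Fixed (known-CSI) relay-receiver gains and unitary relay matrices A_i.
G = (rng.standard_normal((R, N)) + 1j * rng.standard_normal((R, N))) / np.sqrt(2)
A = [np.linalg.qr(rng.standard_normal((T, T))
                  + 1j * rng.standard_normal((T, T)))[0] for _ in range(R)]

n = 50000
rows = np.zeros((n, N), dtype=complex)        # collect the first row of W per trial
for t in range(n):
    v = (rng.standard_normal((R, T)) + 1j * rng.standard_normal((R, T))) / np.sqrt(2)
    w = (rng.standard_normal((T, N)) + 1j * rng.standard_normal((T, N))) / np.sqrt(2)
    W = np.sqrt(alpha) * np.column_stack(
        [sum(G[i, m] * (A[i] @ v[i]) for i in range(R)) for m in range(N)]) + w
    rows[t] = W[0]

emp = rows.T @ rows.conj() / n                # empirical row covariance
RW = np.eye(N) + alpha * G.conj().T @ G
print(np.max(np.abs(emp - RW.T)))             # small: each row has covariance R_W^t
```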

Result 2. The maximum-likelihood (ML) decoding is
$$\arg\min_{\mathbf{S}_k} \mathrm{tr}\left[\left(\mathbf{X} - \sqrt{\beta}\mathbf{S}_k\mathbf{H}\right)\mathbf{R}_W^{-1}\left(\mathbf{X} - \sqrt{\beta}\mathbf{S}_k\mathbf{H}\right)^H\right].$$
With this decoding, the pairwise error probability (PEP) of mistaking $\mathbf{S}_k$ for $\mathbf{S}_l$ has the following upper bound:
$$P(\mathbf{S}_k \to \mathbf{S}_l) \le E_{\mathbf{H}}\, e^{-\frac{\beta}{4}\mathrm{tr}\left[(\mathbf{S}_k - \mathbf{S}_l)^H(\mathbf{S}_k - \mathbf{S}_l)\mathbf{H}\mathbf{R}_W^{-1}\mathbf{H}^H\right]}.$$


Sketch of proof:
• The ML decoding rule follows directly from the likelihood function.
• Chernoff upper bound: for any λ > 0,
$$P(\mathbf{S}_k \to \mathbf{S}_l) \le E_{\mathbf{H}}\, e^{\lambda(\ln P(\mathbf{X}|\mathbf{S}_l) - \ln P(\mathbf{X}|\mathbf{S}_k))}$$
$$= E_{\mathbf{H},\mathbf{W}}\, e^{-\lambda\,\mathrm{tr}\left[\beta(\mathbf{S}_k - \mathbf{S}_l)^H(\mathbf{S}_k - \mathbf{S}_l)\mathbf{H}\mathbf{R}_W^{-1}\mathbf{H}^H + \sqrt{\beta}(\mathbf{S}_k - \mathbf{S}_l)\mathbf{H}\mathbf{R}_W^{-1}\mathbf{W}^H + \sqrt{\beta}\mathbf{W}\mathbf{R}_W^{-1}\mathbf{H}^H(\mathbf{S}_k - \mathbf{S}_l)^H\right]}$$
$$= E_{\mathbf{H}} \int_{\mathbf{W}} e^{-\lambda\,\mathrm{tr}[\cdots]}\, \left(\pi^N\det\mathbf{R}_W\right)^{-T} e^{-\mathrm{tr}(\mathbf{W}\mathbf{R}_W^{-1}\mathbf{W}^H)}\, d\mathbf{W}.$$
• The result follows by completing the square in the exponent (the integral of a CSCG PDF is 1) and choosing λ = 1/2, which gives the 1/4 factor in the bound.


Result 3. With i.i.d. Rayleigh fading channels and a fully diverse space-time code, the diversity order of DSTC is
$$d = \begin{cases} \min\{M, N\}R & \text{if } M \ne N \\ MR\left(1 - \dfrac{\log\log P}{M\log P}\right) & \text{if } M = N \end{cases}.$$
Sketch of proof:
• First bound $\mathbf{R}_W$ with either of the following:
$$\mathbf{R}_W \le (\mathrm{tr}\,\mathbf{R}_W)\mathbf{I} = \left(N + \frac{P_2}{P_1 + 1}\sum_{n=1}^{N}\sum_{i=1}^{R} |g_{in}|^2\right)\mathbf{I}_N,$$
$$\mathbf{R}_W \le \left(1 + \frac{P_2 R}{P_1 + 1}\lambda_{\max}\right)\mathbf{I}_N,$$
where $\lambda_{\max}$ is the maximum eigenvalue of $\mathbf{G}^H\mathbf{G}/R$, whose properties can be derived from results on Wishart matrices.
• Optimal power allocation with respect to the error rate bound: $P_1 = \frac{P}{2}$ and $P_2 = \frac{P}{2R}$, where P is the total transmit power.


• Use the Chernoff bound in Result 2 and average over the $f_{mi}$'s to get
$$P(\mathbf{S}_k \to \mathbf{S}_l) \lesssim E_{g_{in}} \prod_{i=1}^{R}\left(1 + \frac{PT\sigma_{\min}^2 \sum_{n=1}^{N} |g_{in}|^2}{8MNR\left(1 + \frac{1}{NR}\sum_{i=1}^{R}\sum_{n=1}^{N} |g_{in}|^2\right)}\right)^{-M},$$
where $\sigma_{\min}^2$ is the minimum singular value of $(\mathbf{S}_k - \mathbf{S}_l)^H(\mathbf{S}_k - \mathbf{S}_l)$.
• Further conduct the calculations by splitting each integration range into $[0, x) \cup [x, \infty)$, which splits the integration into $2^R$ terms.
• Calculate the order of each term with respect to P, and find a good/the optimal choice of x to minimize the order with respect to P.
• Refer to [4] for details.


Performance analysis of massive MIMO with ZF
Single-cell multi-user massive MIMO system model
• One base station (BS) with M antennas ($M \gg 1$).
• K single-antenna users, $M \ge K$.
• Channel matrix G contains independent Rayleigh fading coefficients; perfect CSI at the BS.
  – Uplink: $\mathbf{G} = \mathbf{H}\mathbf{D}^{1/2}$ ($M \times K$). The k-th column $\mathbf{g}_k$ is the channel vector between the BS antennas and user k.
  – Downlink: $\mathbf{G} = \mathbf{D}^{1/2}\mathbf{H}$ ($K \times M$). The k-th row $\mathbf{g}_k$ is the channel vector between user k and the BS antennas.
  – $\mathbf{H} \sim \mathcal{CN}(\mathbf{0}, \mathbf{I})$ and $\mathbf{D} = \mathrm{diag}\{\beta_1, \cdots, \beta_K\}$ containing the large-scale coefficients.


Models for uplink with ZF
• Users send signals to the BS.
• Transceiver equation:
$$\mathbf{y} = \sqrt{p_u}\,\mathbf{G}\mathbf{s} + \mathbf{w}.$$
  – $\mathbf{s} = [s_1 \cdots s_K]^T$: vector of information symbols; normalization: $E\{\mathbf{s}\mathbf{s}^H\} = \mathbf{I}$.
  – $p_u$: average transmit power of each user.
  – $\mathbf{w} \sim \mathcal{CN}(\mathbf{0}, \mathbf{I})$: noise vector.
  – y: received vector at the BS.
• Zero-forcing reception:
$$\mathbf{A} = \sqrt{\alpha}(\mathbf{G}^H\mathbf{G})^{-1}\mathbf{G}^H = \sqrt{\alpha}\,\mathbf{D}^{-1/2}(\mathbf{H}^H\mathbf{H})^{-1}\mathbf{H}^H,$$
$$\mathbf{r} = \mathbf{A}\mathbf{y} = \sqrt{\alpha p_u}\,\mathbf{s} + \sqrt{\alpha}\,\mathbf{D}^{-1/2}(\mathbf{H}^H\mathbf{H})^{-1}\mathbf{H}^H\mathbf{w}.$$
r ($K \times 1$): the signal vector after ZF reception at the BS.

Results on uplink sum-rate
• SNR of user k:
$$\mathrm{SNR}_k = \frac{\beta_k p_u}{[(\mathbf{H}^H\mathbf{H})^{-1}]_{kk}}.$$
The coefficient α has no effect since it appears in both the signal and the noise as a scaling factor.
• Need the following results on the CSCG and inverse Wishart distributions:
$$E\{[(\mathbf{H}^H\mathbf{H})^{-1}]_{kk}\} = \frac{1}{K}E\{\mathrm{tr}(\mathbf{H}^H\mathbf{H})^{-1}\} = \frac{1}{M - K},$$
$$E\left\{\frac{1}{[(\mathbf{H}^H\mathbf{H})^{-1}]_{kk}}\right\} = M + 1 - K.$$
Sketch of proof:
  – From the QR decomposition $\mathbf{H} = \Phi\begin{bmatrix}\mathbf{R} \\ \mathbf{0}\end{bmatrix}$,
$$[(\mathbf{H}^H\mathbf{H})^{-1}]_{KK} = [\mathbf{R}^{-1}\mathbf{R}^{-H}]_{KK} = \frac{1}{([\mathbf{R}]_{KK})^2}.$$
  – Use the result on the distribution of $[\mathbf{R}]_{KK}$ from the QR-decomposition theorem in Part 1.3. (A numerical check is sketched below.)
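A minimal sketch verifying both expectations by Monte Carlo (M, K, and trial count are illustrative):

```python
import numpy as np

rng = np.random.default_rng(8)
M, K, trials = 16, 4, 20000

inv_kk = np.zeros(trials)
for t in range(trials):
    H = (rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K))) / np.sqrt(2)
    inv_kk[t] = np.linalg.inv(H.conj().T @ H)[0, 0].real

print(np.mean(inv_kk), 1 / (M - K))      # E{[(H^H H)^{-1}]_kk} = 1/(M-K)
print(np.mean(1 / inv_kk), M + 1 - K)    # E{1/[(H^H H)^{-1}]_kk} = M+1-K
```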


• Thus
$$E\left\{\frac{1}{\mathrm{SNR}_k}\right\} = \frac{1}{\beta_k p_u (M - K)}, \qquad E\{\mathrm{SNR}_k\} = \beta_k p_u (M + 1 - K).$$
• Capacity approximation and bounds:
$$R_k \approx \log[1 + E(\mathrm{SNR}_k)] = \log[1 + \beta_k p_u (M + 1 - K)],$$
$$R_k \ge \log\left(1 + \frac{1}{E(1/\mathrm{SNR}_k)}\right) = \log[1 + \beta_k p_u (M - K)],$$
$$R_k \le \log[1 + E(\mathrm{SNR}_k)] = \log[1 + \beta_k p_u (M + 1 - K)].$$


Models for downlink with ZF
• The BS sends signals to all users.
• $\mathbf{s} = [s_1 \cdots s_K]^T$: vector of information symbols; normalization: $E\{\mathbf{s}\mathbf{s}^H\} = \mathbf{I}$.
• Zero-forcing precoding at the BS:
$$\mathbf{A} = \sqrt{\alpha}\,\mathbf{G}^H(\mathbf{G}\mathbf{G}^H)^{-1} = \sqrt{\alpha}\,\mathbf{H}^H(\mathbf{H}\mathbf{H}^H)^{-1}\mathbf{D}^{-1/2}.$$
  – The BS transmits the processed signal: $\mathbf{x} = \sqrt{p}\,\mathbf{A}\mathbf{s}$.
  – Average transmit power of the BS: Kp; p is the power per user.
  – From $E\{\mathbf{x}^H\mathbf{x}\} = Kp$, we have $E\{\mathrm{tr}(\mathbf{A}^H\mathbf{A})\} = K$. Applying the expectation result from the uplink analysis,
$$\alpha\, E\{\mathrm{tr}[\mathbf{D}^{-1}(\mathbf{H}\mathbf{H}^H)^{-1}]\} = K \;\Rightarrow\; \alpha = \tilde{\beta}(M - K),$$
where $\tilde{\beta}^{-1} = \frac{1}{K}\sum_{k=1}^{K}\beta_k^{-1}$.

• Transceiver equation:
$$\mathbf{r} = \sqrt{p}\,\mathbf{G}\mathbf{A}\mathbf{s} + \mathbf{w} = \sqrt{\tilde{\beta}(M - K)p}\,\mathbf{s} + \mathbf{w}.$$
  – $\mathbf{w} \sim \mathcal{CN}(\mathbf{0}, \mathbf{I})$: noise vector.
  – r: received signal vector at the users.
• SNR of user k:
$$\mathrm{SNR}_k = \tilde{\beta}(M - K)p.$$
• Achievable rate of user k:
$$R_k = \log[1 + \tilde{\beta}(M - K)p].$$
(A numerical check of the power normalization is sketched below.)
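A minimal sketch checking the ZF precoder's power normalization, i.e., that $\alpha = \tilde{\beta}(M - K)$ gives $E\{\mathrm{tr}(\mathbf{A}^H\mathbf{A})\} = K$ (the large-scale coefficients and dimensions are illustrative):

```python
import numpy as np

rng = np.random.default_rng(9)
M, K, trials = 32, 4, 5000
beta = np.array([1.0, 0.5, 0.25, 2.0])        # illustrative large-scale coefficients

beta_tilde = 1 / np.mean(1 / beta)
alpha = beta_tilde * (M - K)
Dinv_half = np.diag(beta ** -0.5)

acc = 0.0
for _ in range(trials):
    H = (rng.standard_normal((K, M)) + 1j * rng.standard_normal((K, M))) / np.sqrt(2)
    A = np.sqrt(alpha) * H.conj().T @ np.linalg.inv(H @ H.conj().T) @ Dinv_half
    acc += np.trace(A.conj().T @ A).real
print(acc / trials, K)                        # close to K: normalization holds
```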

The achievable-rate analysis method can be generalized to imperfect CSI, multi-cell, and other more general settings.


References
[1] I. E. Telatar. Capacity of multi-antenna Gaussian channels. European Transactions on Telecommunications, 10:585-595, Nov. 1999.
[2] T. L. Marzetta and B. M. Hochwald. Capacity of a mobile multiple-antenna communication link in Rayleigh flat fading. IEEE Transactions on Information Theory, 45:139-157, Jan. 1999.
[3] A. Edelman. Eigenvalues and Condition Numbers of Random Matrices. Ph.D. thesis, MIT, Department of Mathematics, 1989.
[4] Y. Jing and B. Hassibi. Diversity analysis of distributed space-time codes in relay networks with multiple transmit/receive antennas. EURASIP Journal on Advances in Signal Processing, 2008.
[5] B. Clerckx and C. Oestges. MIMO Wireless Networks: Channels, Techniques and Standards for Multi-Antenna, Multi-User and Multi-Cell Systems. 2nd edition, Academic Press, 2013.
[6] E. G. Larsson, H. Q. Ngo, H. Yang, and T. L. Marzetta. Fundamentals of Massive MIMO. Cambridge University Press, 2016.
