

MIMO Wireless Communication

Daniel W. Bliss, Keith W. Forsythe, and Amanda M. Chan

■ Wireless communication using multiple-input multiple-output (MIMO) systems enables increased spectral efficiency for a given total transmit power. Increased capacity is achieved by introducing additional spatial channels that are exploited by using space-time coding. In this article, we survey the environmental factors that affect MIMO capacity. These factors include channel complexity, external interference, and channel estimation error. We discuss examples of space-time codes, including space-time low-density parity-check codes and space-time turbo codes, and we investigate receiver approaches, including multichannel multiuser detection (MCMUD). The ‘multichannel’ term indicates that the receiver incorporates multiple antennas by using space-time-frequency adaptive processing. The article reports the experimental performance of these codes and receivers.

Multiple-input multiple-output (MIMO) systems are a natural extension of developments in antenna array communication. While the advantages of multiple receive antennas, such as gain and spatial diversity, have been known and exploited for some time [1, 2, 3], the use of transmit diversity has only been investigated recently [4, 5]. The advantages of MIMO communication, which exploits the physical channel between many transmit and receive antennas, are currently receiving significant attention [6–9]. While the channel can be so nonstationary that it cannot be estimated in any useful sense [10], in this article we assume the channel is quasistatic.

MIMO systems provide a number of advantages over single-antenna-to-single-antenna communication. Sensitivity to fading is reduced by the spatial diversity provided by multiple spatial paths. Under certain environmental conditions, the power requirements associated with high spectral-efficiency communication can be significantly reduced by avoiding the compressive region of the information-theoretic capacity bound. Here, spectral efficiency is defined as the total number of information bits per second per Hertz transmitted from one array to the other.

After an introductory section, we describe the concept of MIMO information-theoretic capacity bounds. Because the phenomenology of the channel is important for capacity, we discuss this phenomenology and associated parameterization techniques, followed by examples of space-time codes and their respective receivers and decoders. We performed experiments to investigate channel phenomenology and to test coding and receiver techniques.

Capacity

We discuss MIMO information-theoretic performance bounds in more detail in the next section. Capacity increases linearly with signal-to-noise ratio (SNR) at low SNR, but increases logarithmically with SNR at high SNR. In a MIMO system, a given total transmit power can be divided among multiple spatial paths (or modes), driving the capacity closer to the linear regime for each mode, thus increasing the aggregate spectral efficiency. As seen in Figure 1, which assumes an optimal high spectral-efficiency MIMO channel (a channel matrix with a flat singular-value distribution), MIMO systems enable high spectral efficiency at much lower required energy per information bit.
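The linear-versus-logarithmic argument above can be illustrated numerically. The following is a minimal sketch, not a reproduction of Figure 1: it assumes an idealized channel with `n_modes` equal, unit-gain parallel modes and no array gain, and the function name `aggregate_capacity` is hypothetical.

```python
import numpy as np


def aggregate_capacity(total_snr, n_modes):
    """Spectral efficiency (bits/s/Hz) when total_snr is split equally
    across n_modes parallel unit-gain modes (an idealizing assumption)."""
    total_snr = np.asarray(total_snr, dtype=float)
    # Each mode sees total_snr / n_modes and contributes log2(1 + SNR).
    return n_modes * np.log2(1.0 + total_snr / n_modes)


if __name__ == "__main__":
    for snr_db in (-10.0, 0.0, 10.0, 20.0):
        snr = 10.0 ** (snr_db / 10.0)
        row = ", ".join(
            f"M={m}: {float(aggregate_capacity(snr, m)):6.2f}" for m in (1, 4, 8, 16)
        )
        print(f"total SNR {snr_db:5.1f} dB -> {row}")
```

Under these assumptions, adding modes changes little at low SNR, where every configuration is already in the linear regime, and helps most at high SNR, where a single mode would be operating in the logarithmic regime.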

VOLUME 15, NUMBER 1, 2005 LINCOLN LABORATORY JOURNAL 97 • BLISS, FORSYTHE, AND CHAN MIMO Wireless Communication

The information-theoretic bound on the spectral efficiency is a function of the total transmit power and the channel phenomenology. In implementing MIMO systems, we must decide whether channel estimation information will be fed back to the transmitter so that the transmitter can adapt. Most MIMO communication research has focused on systems without feedback. A MIMO system with an uninformed transmitter (without feedback) is simpler to implement, and at high SNR its spectral-efficiency bound approaches that of an informed transmitter (with feedback).

FIGURE 1. Spectral-efficiency bound as a function of noise-spectral-density-normalized energy per information bit ($E_b/N_0$). The graph compares four different $M \times M$ multiple-input multiple-output (MIMO) systems, assuming channel matrices with flat singular-value distribution. [Plot: spectral efficiency (bits/sec/Hz) versus $E_b/N_0$ (dB), with curves for M = 1, 4, 8, and 16.]

One of the environmental issues with which communication systems must contend is interference, either unintentional or intentional. Because MIMO systems use antenna arrays, localized interference can be mitigated naturally. The benefits extend beyond those achieved by single-input multiple-output systems, that is, a single transmitter and a multiple-antenna receiver, because the transmit diversity nearly guarantees that nulling an interferer cannot unintentionally null a large fraction of the transmit signal energy.

Phenomenology

We discuss channel phenomenology and channel parameterization techniques in more detail in a later section. Aspects of the channel that affect MIMO system capacity, namely, channel complexity and channel stationarity, are addressed in this paper. The first aspect, channel complexity, is a function of the richness of scatterers. In general, capacity at high spectral efficiency increases as the singular values of the channel matrix increase. The distribution of singular values is a measure of the relative usefulness of various spatial paths through the channel.

Space-Time Coding and Receivers

In order to implement a MIMO communication system, we must first select a particular coding scheme. Most space-time coding schemes have a strong connection to well-known single-input single-output (SISO) coding approaches and assume an uninformed transmitter (UT). Later in the article we discuss space-time low-density parity-check codes, space-time turbo codes, and their respective receivers. Space-time coding can exploit the MIMO degrees of freedom to increase redundancy, spectral efficiency, or some combination of these characteristics [11]. Preliminary ideas are discussed elsewhere [6].

A simple and elegant solution that maximizes diversity and enables simple decoupled detection is proposed in Reference 12. More generally, orthogonal space-time block codes are discussed in References 13 and 14. A general discussion of distributing data across transmitters (linear dispersive codes) is given in Reference 15. High SNR design criteria and specific examples are given for space-time trellis codes in Reference 16. Unitary codes optimized for operation in Rayleigh fading are presented in Reference 17. Space-time coding without the requirement of channel estimation is also a common topic in the literature. Many differential coding schemes have been proposed [18]. Under various constraints at the transmitter and receiver, information-theoretic capacity can be evaluated without conditioning on knowledge of the propagation channel [19, 20]. More recently, MIMO extensions of turbo coding have been suggested [21, 22]. Finally, coding techniques for informed transmitter systems have received some interest [23, 24].

Experimental Results

Because information-theoretic capacity and practical performance are dependent upon the channel phenomenology, a variety of experiments were performed. Both channel phenomenology and experimental procedures are discussed in later sections. Experiments were performed in an outdoor nonstationary environment in mixed residential, industrial, and light urban settings. Intentional high-power interference was included.

Information-Theoretic Capacity

The information-theoretic capacity of MIMO systems has been widely discussed [7, 25]. The development of the informed transmitter (“water filling”) and uninformed transmitter approaches is repeated in this section, along with a discussion of the relative performance of these approaches. (The concept of “water filling” is explained in the sidebar entitled “Water Filling.”) In addition, we introduce the topic of spectral-efficiency bounds in the presence of interference, and we discuss spectral-efficiency bounds in frequency-selective environments. Finally, we summarize alternative channel performance metrics.

WATER FILLING

Water filling is a metaphor for the solution of several optimization problems related to channel capacity. The simplest physical example is perhaps the case of spectral allocation for maximal total capacity under a total power constraint. Let $x_k$ denote the power received in the $k$th frequency cell, which has interference (including thermal noise) denoted $n_k$. If the total received power is constrained to be $x$, then the total capacity is maximized by solving

$$\max_{\{x_k\}:\,\sum_k x_k = x} \sum_k \log\!\left(1 + \frac{x_k}{n_k}\right) = \max_{\{x_k\}:\,\sum_k x_k = x} \sum_k \log(n_k + x_k) - \sum_k \log(n_k).$$

Use Lagrange multipliers and evaluate

$$\frac{\partial}{\partial x_k}\left[\sum_j \log(n_j + x_j) - \mu\left(\sum_j x_j - x\right)\right]$$

to find a solution. The solution satisfies $x_k + n_k = \mu^{-1}$ for all nonzero $x_k$. Figure A illustrates the solution graphically as an example of water filling. The difference between the water level (blue) and the noise level (red) is the power allocated to the signal in each frequency cell. The volume of the water is the total received power of the signal. Note that cells with high levels of interference are not used at all.

A similar solution results when the capacity is expressed by

$$\sum_k \log(1 + g_k x_k)$$

for gains $g_k$. One can write the gains as $g_k = n_k^{-1}$ and use the water-filling argument above. In this context, cells with low gains may not be used at all.

FIGURE A. Notional water-filling example. [Plot: power versus frequency, showing the noise level and the water level.]

Informed Transmitter

For narrowband MIMO systems, the coupling between the transmitter and receiver for each sample in time can be modeled by using

$$\mathbf{z} = \mathbf{H}\mathbf{x} + \mathbf{n}, \tag{1}$$

where $\mathbf{z}$ is the complex receive-array output, $\mathbf{H} \in \mathbb{C}^{n_R \times n_T}$ is the $n_R \times n_T$ (number of receive by transmit antenna)


channel matrix, $\mathbf{x}$ is the transmit-array vector, and $\mathbf{n}$ is zero-mean complex Gaussian noise.

The capacity is defined as the maximum of the mutual information [26]

$$I(\mathbf{z}, \mathbf{x} \mid \mathbf{H}) = \left\langle \log_2\!\left[\frac{p(\mathbf{z} \mid \mathbf{x}, \mathbf{H})}{p(\mathbf{z} \mid \mathbf{H})}\right] \right\rangle, \tag{2}$$

over the source conditional probability density $p(\mathbf{x} \mid \mathbf{H})$ subject to various transmit constraints, where the expectation value is indicated by the notation $\langle \cdots \rangle$. Noting that the mutual information can be expressed as the difference between two conditional entropies

$$I(\mathbf{z}, \mathbf{x} \mid \mathbf{H}) = h(\mathbf{z} \mid \mathbf{H}) - h(\mathbf{z} \mid \mathbf{x}, \mathbf{H}), \tag{3}$$

that

$$h(\mathbf{z} \mid \mathbf{x}, \mathbf{H}) = h(\mathbf{n}) = n_R \log_2(\pi e \sigma_n^2),$$

and that $h(\mathbf{z} \mid \mathbf{H})$ is maximized for a zero-mean Gaussian source $\mathbf{x}$, the capacity is given by

$$C = \sup_{\langle \mathbf{x}\mathbf{x}^\dagger \rangle} \log_2 \frac{\left|\sigma_n^2 \mathbf{I}_{n_R} + \mathbf{H}\langle \mathbf{x}\mathbf{x}^\dagger \rangle \mathbf{H}^\dagger\right|}{\left|\sigma_n^2 \mathbf{I}_{n_R}\right|}, \tag{4}$$

where the notation $|\cdots|$ indicates determinant, $\dagger$ indicates Hermitian conjugate, and $\mathbf{I}_{n_R}$ indicates an identity matrix of size $n_R$. A variety of possible constraints exist for $\langle \mathbf{x}\mathbf{x}^\dagger \rangle$, depending on the assumed transmitter limitations. Here we assume that the fundamental limitation is the total power transmitted. Optimization over the $n_T \times n_T$ noise-normalized transmit covariance matrix, $\mathbf{P} = \langle \mathbf{x}\mathbf{x}^\dagger \rangle / \sigma_n^2$, is constrained by the total noise-normalized transmit power $P_o$. By allowing different transmit powers at each antenna, we can enforce this constraint by using the form $\mathrm{tr}\{\mathbf{P}\} \le P_o$. The informed transmitter (IT) channel capacity is achieved if the channel is known by both the transmitter and receiver, giving

$$C_{IT} = \sup_{\mathbf{P};\ \mathrm{tr}(\mathbf{P}) = P_o} \log_2 \left|\mathbf{I}_{n_R} + \mathbf{H}\mathbf{P}\mathbf{H}^\dagger\right|. \tag{5}$$

To avoid radiating negative power, we impose the additional constraint $\mathbf{P} > 0$ by using only a subset of channel modes.

The resulting capacity is given by

$$C_{IT} = \log_2 \left|\frac{P_o + \mathrm{tr}\{\mathbf{D}^{-1}\}}{n_+}\,\mathbf{D}\right|. \tag{6}$$

A water-filling argument establishes that the entries $d_m$ in the diagonal matrix $\mathbf{D} \in \mathbb{R}^{n_+ \times n_+}$ contain the $n_+$ top-ordered eigenvalues of $\mathbf{H}\mathbf{H}^\dagger$. The values $d_m$ must satisfy

$$d_m > \frac{n_+}{P_o + \mathrm{tr}\{\mathbf{D}^{-1}\}}. \tag{7}$$

If Equation 7 is not satisfied for some $d_m$, it will not be satisfied for any smaller $d_m$.

In this discussion we assume that the environment is stationary over a period long enough for the error associated with channel estimation to vanish asymptotically. In order to study typical performance of quasistationary channels sampled from a given probability distribution, capacity is averaged over an ensemble of quasistationary environments. Under the ergodic assumption (that is, the ensemble average is equal to the time average), the mean capacity $\langle C_{IT} \rangle$ is the channel capacity.

Uninformed Transmitter

If the channel is not known at the transmitter, then an optimal transmission strategy is to transmit equal power from each antenna, $\mathbf{P} = (P_o / n_T)\,\mathbf{I}_{n_T}$ [7]. Assuming that the receiver can accurately estimate the channel, but the transmitter does not attempt to optimize its output to compensate for the channel, the uninformed transmitter (UT) maximum spectral efficiency bound is given by

$$C_{UT} = \log_2 \left|\mathbf{I}_{n_R} + \frac{P_o}{n_T}\,\mathbf{H}\mathbf{H}^\dagger\right|. \tag{8}$$

This is a common transmit constraint, as it may be difficult to provide the transmitter channel estimates. The sidebar entitled “Toy 2 × 2 Channel Model” discusses an example of IT and UT capacities for a simple line-of-sight environment.

Capacity Ratio

At high SNR, $C_{IT}$ and $C_{UT}$ converge. This can be observed in the large $P_o$ limit of the ratio of Equations 6 and 8,

$$\frac{C_{IT}}{C_{UT}} \to \frac{\log_2 \left|\frac{P_o + \mathrm{tr}\{\mathbf{D}^{-1}\}}{n_{\min}}\,\mathbf{D}\right|}{\log_2 \left|\mathbf{I}_{n_{\min}} + \frac{P_o}{n_T}\,\mathbf{D}\right|} \to \frac{n_{\min}\left[\log_2(P_o) - \log_2(n_{\min})\right] + \log_2|\mathbf{D}|}{n_{\min}\left[\log_2(P_o) - \log_2(n_T)\right] + \log_2|\mathbf{D}|} \to 1, \tag{9}$$

where the $n_{\min}$ diagonal entries in $\mathbf{D}$ contain all nonzero eigenvalues of $\mathbf{H}^\dagger\mathbf{H}$. If $n_T > n_R$, then the convergence to one is logarithmically slow.

At low SNR the ratio $C_{IT}/C_{UT}$ is given by

$$\frac{C_{IT}}{C_{UT}} \to \frac{\log_2\left[(P_o + 1/d_{\max})\,d_{\max}\right]}{\log_2 \left|\mathbf{I}_{n_R} + \frac{P_o}{n_T}\,\mathbf{H}\mathbf{H}^\dagger\right|} = \frac{\log\left(1 + P_o\,\mathrm{maxeig}\{\mathbf{H}\mathbf{H}^\dagger\}\right)}{\mathrm{tr}\left\{\log\left(\mathbf{I}_{n_R} + \frac{P_o}{n_T}\,\mathbf{H}\mathbf{H}^\dagger\right)\right\}} \approx \frac{\mathrm{maxeig}\{\mathbf{H}\mathbf{H}^\dagger\}}{\frac{1}{n_T}\,\mathrm{tr}\{\mathbf{H}\mathbf{H}^\dagger\}}, \tag{10}$$

using Equation 6 with $n_+ = 1$ and Equation 8. Given this asymptotic result, we can make a few observations. The spectral-efficiency ratio is given by the maximum to the average eigenvalue ratio of $\mathbf{H}^\dagger\mathbf{H}$. If the channel is rank one, such as in the case of a multiple-input single-output (MISO) system, the ratio is approximately equal to $n_T$. Finally, in the special case in which $\mathbf{H}^\dagger\mathbf{H}$ has a flat eigenvalue distribution, the optimal transmit covariance matrix is not unique. Nonetheless, the ratio $C_{IT}/C_{UT}$ approaches one.

Interference

By extending the discussion in the previous section [8, 27], we can calculate capacity in the presence of uncooperative (worst case) external interference $\boldsymbol{\eta}$, in addition to the spatially-white complex Gaussian noise $\mathbf{n}$ considered previously. The mutual information is again given by Equations 2 and 3, where entropy $h(\mathbf{z} \mid \mathbf{x}, \mathbf{H})$ in the presence of the external interference becomes $h(\mathbf{n} + \boldsymbol{\eta})$,

$$h(\mathbf{z} \mid \mathbf{x}, \mathbf{H}) \le \log_2 \left|\pi e \left\{\sigma_n^2 \mathbf{I} + \sigma_n^2 \mathbf{R}\right\}\right|,$$

and $\sigma_n^2 \mathbf{R}$ is the spatial-interference covariance matrix. Equality is achieved if and only if the interference amplitudes have a Gaussian distribution. Thus the worst-case informed capacity, the maximum-minimum mutual information,

$$C_{int} = \sup_{p(\mathbf{x} \mid \mathbf{H})}\ \inf_{p(\boldsymbol{\eta})} I(\mathbf{z}, \mathbf{x} \mid \mathbf{H}), \tag{12}$$

becomes

$$C_{IT,int} = \sup_{\tilde{\mathbf{P}};\ \mathrm{tr}(\tilde{\mathbf{P}}) = P_o} \log_2 \left|\mathbf{I} + \tilde{\mathbf{H}}\tilde{\mathbf{P}}\tilde{\mathbf{H}}^\dagger\right|, \tag{13}$$

using

$$\tilde{\mathbf{H}} \equiv (\mathbf{I} + \mathbf{R})^{-1/2}\,\mathbf{H}. \tag{14}$$

Gaussian interference corresponds to a saddle point of the mutual information at which the maximum-minimum capacity is achieved. The capacity in the presence of Gaussian interference has a form identical to Equation 6 under the transformation $\mathbf{D} \to \tilde{\mathbf{D}}$, where $\tilde{\mathbf{D}}$ contains the eigenvalues of $\tilde{\mathbf{H}}\tilde{\mathbf{H}}^\dagger$. The transmitted noise-normalized power covariance matrix $\tilde{\mathbf{P}}$ is calculated by using $\tilde{\mathbf{H}}$. Similarly, the uninformed transmitter spectral-efficiency bound in the presence of noise is given by the same transformation of $\mathbf{H} \to \tilde{\mathbf{H}}$.

In the limit of high spectral efficiency for $n_J$ infinite J/S jammers, the loss in capacity approaches

$$C_{int} \to \frac{\min(n_T,\ n_R - n_J)}{\min(n_T,\ n_R)}\,C. \tag{15}$$

In general, the theoretical capacity is not significantly affected as long as the number of antennas is much larger than the number of jammers. This resistance to the effects of jammers is demonstrated experimentally later in the article.

Frequency-Selective Channels

In environments in which there is frequency-selective fading, the channel matrix $\mathbf{H}(f)$ is a function of frequency. Exploiting the orthogonality of frequency channels, the capacity in frequency-selective fading can be calculated by using an extension of Equations 6 and 8. For the uninformed transmitter, this leads to the frequency-selective spectral-efficiency bound


TOY 2 × 2 CHANNEL MODEL

Because the distribution of channel matrix eigenvalues is essential to the effectiveness of multiple-input, multiple-output (MIMO) communication, we employ a toy example for the purposes of introduction, and we discuss the eigenvalue distribution of a 2 × 2 narrowband MIMO system in the absence of environmental scatterers. To visualize the example, we can imagine two receive and two transmit antennas located at the corners of a rectangle. The ratio of channel matrix eigenvalues can be changed by varying the shape of the rectangle. The columns of the channel matrix $\mathbf{H}$ (in Equation 1 in the main article) can be viewed as the receiver-array response vectors, one vector for each transmit antenna,

$$\mathbf{H} = \sqrt{2}\,\begin{pmatrix} a_1 \mathbf{v}_1 & a_2 \mathbf{v}_2 \end{pmatrix},$$

where $a_1$ and $a_2$ are constants of proportionality (equal to the root-mean-squared transmit-to-receive attenuation for transmit antennas 1 and 2 respectively) that take into account geometric attenuation and antenna gain effects, and $\mathbf{v}_1$ and $\mathbf{v}_2$ are unit-norm array response vectors. For the purpose of this discussion, we assume $a = a_1 = a_2$, which is valid if the rectangle deformation does not significantly affect overall transmitter-to-receiver distances.

The capacity of the 2 × 2 MIMO system is a function of the channel singular values and the total transmit power. Eigenvalues of $\mathbf{H}\mathbf{H}^\dagger$ are given by

$$\mu_{1,2} = 2a^2\left(1 \pm \left|\mathbf{v}_1^\dagger \mathbf{v}_2\right|\right),$$

where the absolute value is denoted by $|\cdots|$. The separation between receive array responses can be described in a convenient form in terms of generalized beamwidths [40],

$$b_{12} = \frac{2}{\pi} \arccos\left\{\left|\mathbf{v}_1^\dagger \mathbf{v}_2\right|\right\}.$$

For small angular separations, this definition of beamwidths closely approximates many ad hoc definitions for physical arrays. Figure A displays the eigenvalues $\mu_1$ and $\mu_2$ as a function of generalized beamwidth separation. When the transmit and receive arrays are small, indicated by a small separation in beamwidths, one eigenvalue is dominant. As the array apertures become larger, indicated by a larger separation, one array’s individual elements can be resolved by the other array. Consequently, the smaller eigenvalue increases. Conversely, the larger eigenvalue decreases slightly.

FIGURE A. Eigenvalues of $\mathbf{H}\mathbf{H}^\dagger$ for a 2 × 2 line-of-sight channel as a function of antenna separation. [Plot: eigenvalue/$a^2$ (dB) versus generalized beamwidth separation.]

Equations 6 and 7 in the main article are employed to determine the capacity for the 2 × 2 system. The “water-filling” technique (explained in a previous sidebar) must first determine if both modes in the channel are employed. Both modes are used if the following condition is satisfied,

$$\mu_2 > \frac{2}{P_o + \frac{1}{\mu_1} + \frac{1}{\mu_2}}, \qquad P_o > \frac{1}{\mu_2} - \frac{1}{\mu_1} = \frac{\left|\mathbf{v}_1^\dagger \mathbf{v}_2\right|}{a^2\left(1 - \left|\mathbf{v}_1^\dagger \mathbf{v}_2\right|^2\right)},$$

assuming $\mu_1 > \mu_2$.

If the condition is not satisfied, then only the stronger channel mode is employed and the capacity, from Equation 6, is given by

$$C_{IT} = \log_2(1 + \mu_1 P_o) = \log_2\left[1 + 2a^2\left(1 + \left|\mathbf{v}_1^\dagger \mathbf{v}_2\right|\right)P_o\right];$$

otherwise, both modes are used and the capacity is given by

$$C_{IT} = \log_2 \left|\frac{P_o + \frac{1}{\mu_1} + \frac{1}{\mu_2}}{2}\begin{pmatrix} \mu_1 & 0 \\ 0 & \mu_2 \end{pmatrix}\right| = \log_2\left\{\mu_1 \mu_2 \left(\frac{P_o + \frac{1}{\mu_1} + \frac{1}{\mu_2}}{2}\right)^2\right\} = 2\log_2\left\{a^2\left[1 - \left|\mathbf{v}_1^\dagger \mathbf{v}_2\right|^2\right]P_o + 1\right\} - \log_2\left[1 - \left|\mathbf{v}_1^\dagger \mathbf{v}_2\right|^2\right].$$

Figure B displays the resulting capacity as a function of $a^2 P_o$ (mean single-input single-output SNR) for two beamwidth separations, 0.1 and 0.9. At low values of $a^2 P_o$ the capacity associated with small beamwidth separation performs best. In this regime, capacity is linear with receive power, and small beamwidth separation increases the coherent gain. At high values of $a^2 P_o$ large beamwidth separation produces a higher capacity as the optimal MIMO system distributes the energy between modes.

FIGURE B. The informed transmitter capacity of a 2 × 2 line-of-sight channel, assuming antenna beamwidth separations of 0.1 (solid line) and 0.9 (dashed line). [Plot: spectral efficiency (b/sec/Hz) versus $a^2 P_o$ (dB).]

The total received power is given by

$$2a^2\left(1 + \left|\mathbf{v}_1^\dagger \mathbf{v}_2\right|\right)P_o$$

when using one mode, and

$$2a^2 P_o + \frac{2\left|\mathbf{v}_1^\dagger \mathbf{v}_2\right|^2}{1 - \left|\mathbf{v}_1^\dagger \mathbf{v}_2\right|^2}$$

when using two modes, where $P_o$ is the total noise-normalized power. In both cases, the total received power is much larger than $a^2 P_o$.

In complicated multipath environments, small arrays employ scatterers to create virtual arrays of a much larger effective aperture. The effect of the scatterers upon capacity depends on their number and distribution in the environment. The individual antenna elements can be resolved by the larger effective aperture produced by the scatterers. As demonstrated in Figure A, the ability to resolve antenna elements is related to the number of large singular values of the channel matrix and thus the capacity.
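The mode-selection rule and the capacities in this sidebar can be checked numerically. The sketch below is illustrative only: it implements the water-filling test of Equation 7 and the capacities of Equations 6 and 8 in terms of eigenvalues, and applies them to the toy eigenvalues $\mu_{1,2}$. The function names (`toy_eigenvalues`, `informed_capacity`, `uninformed_capacity`) and the default `a2 = 1.0` are assumptions made for the example.

```python
import numpy as np


def toy_eigenvalues(beamwidth_sep, a2=1.0):
    """Eigenvalues mu1 >= mu2 of H H^dagger for the 2 x 2 toy channel.

    beamwidth_sep is b12 = (2/pi) arccos|v1^dagger v2|; a2 is a^2.
    """
    c = np.cos(np.pi * beamwidth_sep / 2.0)  # |v1^dagger v2|
    return 2.0 * a2 * (1.0 + c), 2.0 * a2 * (1.0 - c)


def informed_capacity(eigs, p_total):
    """Water-filling capacity in bits/s/Hz (Equations 6 and 7).

    Returns (capacity, number of modes used).
    """
    d = np.sort(np.asarray(eigs, dtype=float))[::-1]
    d = d[d > 0.0]
    # Try the largest mode set first and drop the weakest mode until
    # Equation 7 holds for every mode that is kept.
    for n_plus in range(len(d), 0, -1):
        top = d[:n_plus]
        level = (p_total + np.sum(1.0 / top)) / n_plus
        if top[-1] * level > 1.0:
            return float(np.sum(np.log2(level * top))), n_plus
    return 0.0, 0  # reached only when no transmit power is available


def uninformed_capacity(eigs, p_total, n_t=2):
    """Equal-power bound of Equation 8, written in terms of eigenvalues."""
    d = np.asarray(eigs, dtype=float)
    return float(np.sum(np.log2(1.0 + p_total * d / n_t)))


if __name__ == "__main__":
    for sep in (0.1, 0.9):
        mu = toy_eigenvalues(sep)
        for p_db in (-10.0, 0.0, 10.0, 20.0):
            p = 10.0 ** (p_db / 10.0)
            c_it, modes = informed_capacity(mu, p)
            c_ut = uninformed_capacity(mu, p)
            print(f"b12={sep}, a^2 Po={p_db:5.1f} dB: "
                  f"C_IT={c_it:6.3f} ({modes} mode(s)), C_UT={c_ut:6.3f}")
```

Dropping the weakest mode first is sufficient because, as noted for Equation 7 in the main article, a condition that fails for some eigenvalue also fails for every smaller one.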

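$$C_{UT,FS} = \frac{\int df\ C_{UT}\left(P_o;\ \mathbf{H}(f)\right)}{\int df} \approx \frac{\sum_{n=1}^{n_f} \Delta f\ \log_2 \left|\mathbf{I} + \frac{P_o}{n_T}\,\mathbf{H}(f_n)\mathbf{H}^\dagger(f_n)\right|}{\sum_{n=1}^{n_f} \Delta f} \approx \frac{1}{n_f} \log_2 \left|\mathbf{I} + \frac{P_o}{n_T}\,\breve{\mathbf{H}}\breve{\mathbf{H}}^\dagger\right|, \tag{16}$$

where the distance between frequency samples is given by $\Delta f$ and the $n_f$-bin frequency-partitioned channel matrix is given by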


$$\breve{\mathbf{H}} \equiv \begin{pmatrix} \mathbf{H}(f_1) & 0 & \cdots & 0 \\ 0 & \mathbf{H}(f_2) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \mathbf{H}(f_{n_f}) \end{pmatrix}. \tag{17}$$

For the informed transmitter channel capacity, power is optimally distributed amongst both spatial modes and frequency channels. The capacity can be expressed

$$C_{IT,FS} \approx \max_{\breve{\mathbf{P}}} \frac{1}{n_f} \log_2 \left|\mathbf{I} + \breve{\mathbf{H}}\breve{\mathbf{P}}\breve{\mathbf{H}}^\dagger\right|, \tag{18}$$

which is maximized by Equation 6 with the appropriate substitutions for the frequency-selective channel, and diagonal entries in $\mathbf{D}$ in Equation 7 are selected from the eigenvalues of $\breve{\mathbf{H}}\breve{\mathbf{H}}^\dagger$. Because of the block diagonal structure of $\breve{\mathbf{H}}$, the $(n_T \cdot n_f) \times (n_T \cdot n_f)$ space-frequency noise-normalized transmit covariance matrix $\breve{\mathbf{P}}$ is a block diagonal matrix, normalized so that $\mathrm{tr}\{\breve{\mathbf{P}}\} \le n_f P_o$.

Other Performance Metrics

The information-theoretic capacity is not the only possible metric of performance. As an example, another useful performance metric is the outage capacity [16], or the achievable spectral-efficiency bound, assuming a given probability of error-free decoding of a frame. In many practical situations this metric may be the best measure of performance, for example, in the case in which the system can resend frames of data.

Channel Phenomenology

In this section we describe tools for modeling, estimating, and characterizing MIMO channels. These topics are discussed in greater detail elsewhere [25, 28]. First we introduce the standard model and simple modifications to it. Then we discuss the simplest channel characterization, which is mean receive power, followed by a description of channel estimation techniques, methods for determining how much channels have changed, and channel parameterization and estimation techniques.

Standard Model

A variety of techniques are used to simulate the channel matrix [29]. The simplest approach is to assume that all the entries in the channel matrix are sampled from identical independent complex Gaussians, $\mathbf{H} \propto \mathbf{G}$. This assumption corresponds to an environment with complicated multipath scattering. While this approach is convenient from the perspective of performing analytic calculations, it may provide a channel eigenvalue distribution that is too flat. At the other extreme, channels can be characterized by a diversity order [30], which is used to indicate an effective cut-off in the eigenvalue distribution induced by spatial correlation. A number of approaches that introduce spatial correlations have been suggested. One approach uses the form

$$\mathbf{H} = \mathbf{M}_L \mathbf{G} \mathbf{M}_R. \tag{19}$$

The above model results in a $(n_T \cdot n_R) \times (n_T \cdot n_R)$ link-by-link covariance matrix of the Kronecker product form $(\mathbf{M}_L \mathbf{M}_L^\dagger) \otimes (\mathbf{M}_R^\dagger \mathbf{M}_R)^*$ for the entries in the channel matrix $\mathbf{H}$. This product structure can arise from a spherical Green’s function model of propagation, provided several additional conditions are met. First, scatterers are concentrated around (but not too close to) the transmitter and receiver. Second, multiple scattering of a particular kind (from transmitter element to transmitter scatterer to receiver scatterer to receiver element) dominates propagation. Third, scatterers are sufficiently separated in angle when viewed by their associated array.

Received Power

It is often convenient to parameterize the incoming signal power in terms of $a^2 P_o$, where $a^2$ is the mean-squared link attenuation. It can be employed to easily compare performance by using different constraints and environments. This choice corresponds to the typical noise-normalized received power for a single receive and single transmit antenna radiating power $\sigma_n^2 P_o$. However, this choice can be mildly misleading because the total received power will, in general, be much larger than $a^2 P_o$. In general, $a^2$ is defined by the Frobenius norm squared of the channel matrix normalized by the number of transmitters and receivers,

$$a^2 = \frac{\mathrm{tr}\{\mathbf{H}\mathbf{H}^\dagger\}}{n_T n_R}. \tag{20}$$

The total received noise-normalized power produced by a set of orthogonal receive beamformers is given by


$\mathrm{tr}\{\mathbf{H}\mathbf{P}\mathbf{H}^\dagger\}$. The uninformed transmitter rate is maximized by sending equal power to all transmit antennas so that $\mathrm{tr}\{\mathbf{H}\mathbf{P}\mathbf{H}^\dagger\}$ becomes $(P_o / n_T)\,\mathrm{tr}\{\mathbf{H}\mathbf{H}^\dagger\} = n_R a^2 P_o$. It is worth noting that $\mathbf{P}$ is not in general optimized by the informed transmitter to maximize received power but to maximize capacity.

The total received power for the capacity-optimized informed transmitter, given an arbitrary channel matrix, is

$$\mathrm{tr}\{\mathbf{H}\mathbf{P}_{IT}\mathbf{H}^\dagger\} = \mathrm{tr}\left\{\mathbf{D}\left(\frac{P_o + \mathrm{tr}(\mathbf{D}^{-1})}{n_+}\,\mathbf{I}_{n_+} - \mathbf{D}^{-1}\right)\right\} = P_o\,\frac{\mathrm{tr}\{\mathbf{D}\}}{n_+} + \frac{\mathrm{tr}\{\mathbf{D}\}\,\mathrm{tr}\{\mathbf{D}^{-1}\} - n_+^2}{n_+}. \tag{21}$$

The first term in Equation 21 is bounded from below by

$$P_o\,\frac{\mathrm{tr}\{\mathbf{D}\}}{n_+} \ge P_o\,\frac{\mathrm{tr}\{\mathbf{H}\mathbf{H}^\dagger\}}{\min\{n_T, n_R\}} \ge \max\{n_T, n_R\}\,a^2 P_o. \tag{22}$$

The second term in Equation 21 is bounded from below by zero. Consequently, the total received power is greater than or equal to $\max\{n_T, n_R\}\,a^2 P_o$.

For very small $a^2 P_o$, far from the nonlinear regime of the Shannon limit, the optimal solution is to maximize received power. This is done by transmitting the best mode only, setting $n_+ = 1$. In this regime the total received power is given by

$$\mathrm{tr}\{\mathbf{H}\mathbf{P}_{IT}\mathbf{H}^\dagger\} \to P_o\,\mathrm{maxeig}\{\mathbf{H}\mathbf{H}^\dagger\}. \tag{23}$$

This result is bounded from above by $n_T n_R a^2 P_o$, which is achieved if there is only a single nontrivial mode in the channel.

Channel Estimation

The Gaussian probability density function for a multivariate, signal-in-the-mean, statistical model of the received signal $\mathbf{Z}$, assuming $\mathbf{T} \in \mathbb{C}^{n_T \times n_s}$ is the transmit sequence, is given by

$$p(\mathbf{Z} \mid \mathbf{R}, \mathbf{H}, \mathbf{T}) = \frac{e^{-\mathrm{tr}\left[(\mathbf{Z} - \mathbf{H}\mathbf{T})^\dagger \mathbf{R}^{-1} (\mathbf{Z} - \mathbf{H}\mathbf{T})\right]}}{\pi^{n_s n_R}\,|\mathbf{R}|^{n_s}}, \tag{24}$$

where $\mathbf{R}$ is the noise-plus-interference covariance matrix. The maximum-likelihood estimate of $\mathbf{H}$ is given by

$$\hat{\mathbf{H}} = \mathbf{Z}\mathbf{T}^\dagger\left(\mathbf{T}\mathbf{T}^\dagger\right)^{-1}, \tag{25}$$

assuming that the reference signals in $\mathbf{T}$ are known and $\mathbf{T}\mathbf{T}^\dagger$ is nonsingular.

The previous channel-estimation discussion explicitly assumed flat fading. However, the frequency-selective channels can be estimated by first estimating a finite impulse-response MIMO channel, which can be transformed to the frequency domain. A finite impulse-response extension of Equation 1 is given by introducing delayed copies of $\mathbf{T}$ at delays $\delta_1, \delta_2, \ldots, \delta_{n_{taps}}$,

$$\breve{\mathbf{T}} \equiv \begin{pmatrix} \mathbf{T}(\delta_1) \\ \mathbf{T}(\delta_2) \\ \vdots \\ \mathbf{T}(\delta_{n_{taps}}) \end{pmatrix}, \tag{26}$$

so that the transmit matrix has dimension $(n_T \cdot n_{taps}) \times n_s$. The resulting wideband channel matrix has the dimension $n_R \times (n_T \cdot n_{taps})$,

$$\begin{pmatrix} \hat{\mathbf{H}}(\delta_1) & \hat{\mathbf{H}}(\delta_2) & \cdots & \hat{\mathbf{H}}(\delta_{n_{taps}}) \end{pmatrix} = \mathbf{Z}\breve{\mathbf{T}}^\dagger\left(\breve{\mathbf{T}}\breve{\mathbf{T}}^\dagger\right)^{-1}. \tag{27}$$

Using this form, an effective channel filter is associated with each transmit-to-receive antenna link. By assuming regular delay sampling, we can use a discrete Fourier transform to construct the explicit frequency-selective form,

$$\begin{pmatrix} \hat{\mathbf{H}}(f_1) & \hat{\mathbf{H}}(f_2) & \cdots & \hat{\mathbf{H}}(f_{n_{taps}}) \end{pmatrix} = \begin{pmatrix} \hat{\mathbf{H}}(\delta_1) & \hat{\mathbf{H}}(\delta_2) & \cdots & \hat{\mathbf{H}}(\delta_{n_{taps}}) \end{pmatrix}\left(\mathcal{F}_{n_{taps}} \otimes \mathbf{I}_{n_T}\right), \tag{28}$$

where the $n$-point discrete Fourier transform is represented by $\mathcal{F}_n$ and the Kronecker product is represented by $\otimes$.

Channel-Difference Metrics

A variety of metrics are possible. In investigating channel variations, no one metric will be useful for all situations. As an example, two completely different channels can have the same capacity. Depending upon the issue being investigated, we may wish to think of these matrices as being similar or very different. Here two metrics are discussed. Both metrics are ad hoc, but motivations are provided. The first metric measures differences in channel singular-value distributions. The second metric is sensitive to differences in both the singular-value distribution and the channel eigenvector structure.

Eigenvalue-Based Metric

As was mentioned earlier, MIMO capacity is only a function of the channel singular values. Equivalently, capacity is invariant under channel-matrix transformations of the form

$$\mathbf{H} \to \mathbf{W}_1 \mathbf{H} \mathbf{W}_2^\dagger, \tag{29}$$

where $\mathbf{W}_1$ and $\mathbf{W}_2$ are arbitrary unitary matrices. Consequently, for some applications it is useful to employ a metric that is also invariant under this transformation. Because capacity is a function of the structure of the channel singular-value distribution, the metric should be sensitive to this structure.

The channel capacity is a function of $\mathbf{H}\mathbf{H}^\dagger$. A natural metric would employ the distance between the capacity for two channel matrices at the same average total received power, that is, the same $a^2 P_o$,

$$\Delta C_{UT} = \log_2 \left|\mathbf{I} + \frac{a^2 P_o}{n_T}\,\mathbf{H}_a\mathbf{H}_a^\dagger\right| - \log_2 \left|\mathbf{I} + \frac{a^2 P_o}{n_T}\,\mathbf{H}_b\mathbf{H}_b^\dagger\right|. \tag{30}$$

However, there are two problems with this definition. First, the difference is a function of $P_o$. Second, there is degeneracy in $\mathbf{H}$ singular values that gives a particular capacity. To address the first issue, the difference can be investigated in a high SNR limit, giving

$$\Delta C_{UT} \approx \log_2 \left|\mathbf{H}_a\mathbf{H}_a^\dagger\right| - \log_2 \left|\mathbf{H}_b\mathbf{H}_b^\dagger\right| = \sum_{m=1}^{\min(n_T, n_R)} \left[\log_2 \lambda_m(\mathbf{H}_a\mathbf{H}_a^\dagger) - \log_2 \lambda_m(\mathbf{H}_b\mathbf{H}_b^\dagger)\right], \tag{31}$$

where $\lambda_m(\mathbf{X})$ indicates the $m$th largest eigenvalue of $\mathbf{X}$. To increase the sensitivity to the shape of the eigenvalue distribution, the metric is defined to be the Euclidean difference, assuming that each eigenvalue is associated with an orthogonal dimension, giving

$$\delta^2(\mathbf{H}_a, \mathbf{H}_b) \equiv \sum_{m=1}^{\min(n_T, n_R)} \left[\log_2 \lambda_m(\mathbf{H}_a\mathbf{H}_a^\dagger) - \log_2 \lambda_m(\mathbf{H}_b\mathbf{H}_b^\dagger)\right]^2. \tag{32}$$

Fractional Receiver Loss Metric

In this section we introduce a power-weighted mean $\cos^2\theta$ metric. The metric takes into account both the eigenvalue and eigenvector structure of the channels. It is motivated by the effect of receive-beamformer mismatch on capacity. Starting with Equation 8, the low SNR uninformed transmitter capacity approximation is given by

$$C = \log_2 \left|\mathbf{I} + \frac{P_o}{n_T}\,\mathbf{H}\mathbf{H}^\dagger\right| \approx \log_2(e)\,\mathrm{tr}\left\{\frac{P_o}{n_T}\,\mathbf{H}\mathbf{H}^\dagger\right\} = \log_2(e)\,\frac{P_o}{n_T} \sum_m \mathbf{h}_m^\dagger \mathbf{h}_m = \log_2(e)\,\frac{P_o}{n_T} \sum_m \left|\mathbf{w}_m^\dagger \mathbf{h}_m\right|^2, \qquad \mathbf{w}_m \equiv \frac{\mathbf{h}_m}{\|\mathbf{h}_m\|}, \tag{33}$$

where $\mathbf{h}_m$ is the column of the channel matrix associated with transmitter $m$, and $\|\cdots\|$ indicates the $l_2$ norm. In the low SNR limit, the optimal receive beamformer is given by the matched response given in $\mathbf{w}_m$. If some other beamformer is employed, labeled $\mathbf{w}'_m$, then signal energy is lost, adversely affecting the capacity,

$$C' \approx \log_2(e)\,\frac{P_o}{n_T} \sum_m \left|\mathbf{w}'^{\dagger}_m \mathbf{h}_m\right|^2. \tag{34}$$

One possible reason that a beamformer might use the wrong matched spatial filter is channel nonstationarity. The fractional capacity loss is given by

$$\frac{C'}{C} \approx \frac{\sum_m \left|\mathbf{w}'^{\dagger}_m \mathbf{h}_m\right|^2}{\sum_m \|\mathbf{h}_m\|^2} = \frac{\sum_m \|\mathbf{h}_m\|^2 \left|\frac{\mathbf{h}'^{\dagger}_m \mathbf{h}_m}{\|\mathbf{h}'_m\|\,\|\mathbf{h}_m\|}\right|^2}{\sum_m \|\mathbf{h}_m\|^2} = \frac{\sum_m \|\mathbf{h}_m\|^2 \cos^2\theta_m}{\sum_m \|\mathbf{h}_m\|^2}, \tag{35}$$


which is the power-weighted mean $\cos^2\theta$ estimate, where $\cos\theta_m$ is defined to be the inner product between the “good” and “bad” unit-norm array responses for the mth transmitter. It is generally desirable for metrics to be symmetric with respect to $\mathbf{H}$ and $\mathbf{H}'$, thus avoiding moral attributions with regard to channel matrices. Using the previous discussion as motivation, a symmetric form is given by

$$\gamma(\mathbf{H}, \mathbf{H}') \equiv \frac{\sum_m \|\mathbf{h}_m\| \, \|\mathbf{h}'_m\| \cos^2\theta_m}{\sum_m \|\mathbf{h}_m\| \, \|\mathbf{h}'_m\|} , \tag{36}$$

where the “power-weighted” expectation is evaluated over transmitters.

Singular Values

The singular-value distribution of $\mathbf{H}$, or the related eigenvalue distribution of $\mathbf{H}\mathbf{H}^\dagger$, is a useful tool for understanding the expected performance of MIMO communication systems. From the discussion earlier, we can see that the channel capacity is a function of channel singular values, but not the singular-vector structure of the channel. Thus channel phenomenology can be investigated by studying the statistics of channel singular-value distributions.

Channel Parameterization

A commonly employed model assumes the channel is proportional to a matrix $\mathbf{G}$, where the entries are independently drawn from a unit-norm complex circular Gaussian distribution. While the distribution is convenient, it does suffer from a singular-value distribution that is overly optimistic for many environments. As was previously discussed, one solution is to introduce spatial correlations using the transformation $\mathbf{F} = b \, \mathbf{M}_L \mathbf{G} \mathbf{M}_R^\dagger$ [29]. While this approach is limited, it produces simply more realistic channels than the uncorrelated Gaussian model. The spatial correlation matrices can be factored so that $\mathbf{M}_L = \mathbf{U} \mathbf{A}_{\alpha_L} \mathbf{U}^\dagger$ and $\mathbf{M}_R = \mathbf{V} \mathbf{A}_{\alpha_R} \mathbf{V}^\dagger$, where $\mathbf{U}$ and $\mathbf{V}$ are unitary matrices, and $\mathbf{A}_{\alpha_L}$ and $\mathbf{A}_{\alpha_R}$ are positive-semidefinite diagonal matrices.

Assuming that the number of transmit and receive antennas are equal and have similar spatial correlation characteristics, the diagonal matrices can be set equal, $\mathbf{A}_\alpha = \mathbf{A}_{\alpha_L} = \mathbf{A}_{\alpha_R}$, producing the new random channel matrix $\mathbf{F}$, where

$$\mathbf{F} = b \, \mathbf{U} \mathbf{A}_\alpha \mathbf{U}^\dagger \mathbf{G}' \mathbf{V} \mathbf{A}_\alpha \mathbf{V}^\dagger = b \, \mathbf{U} \mathbf{A}_\alpha \mathbf{G} \mathbf{A}_\alpha \mathbf{V}^\dagger \tag{37}$$

and

$$\mathbf{A}_\alpha = \sqrt{n} \, \frac{\mathrm{diag}\{\alpha^0, \alpha^1, \ldots, \alpha^{n-1}\}}{\sqrt{\mathrm{tr}\left(\mathrm{diag}\{\alpha^0, \alpha^1, \ldots, \alpha^{n-1}\}^2\right)}} , \tag{38}$$

where b is used to set overall scale, n is given by the size of $\mathbf{A}_\alpha$, and $\mathbf{U}$ and $\mathbf{V}$ indicate random unitary matrices. Used here is the fact that arbitrary unitary transformations do not affect the statistics of the Gaussian matrix. The form of $\mathbf{A}_\alpha$ given here is somewhat arbitrary, but has the satisfying characteristics that as $\alpha \rightarrow 0$, a rank-one channel matrix is produced, and as $\alpha \rightarrow 1$, a spatially uncorrelated Gaussian matrix is produced. Furthermore, empirically this model provides good fits to experimental distributions. The normalization for $\mathbf{A}_\alpha$ is chosen so that the expected value of $\|\mathbf{F}\|_F^2$ is $b^2 n_T n_R$, where $\|\cdot\|_F$ indicates the Frobenius norm.

Channel Parameter Estimation

An estimate for $\hat{\alpha}$ associated with particular transmit and receive locations is given by minimizing the mean-square metric given in Equation 32,

$$\hat{\alpha} = \arg\min_\alpha \left\langle \delta^2\left[ \hat{\mathbf{H}}, \hat{\mathbf{F}}(\alpha) \right] \right\rangle , \tag{39}$$

where $\hat{X}$ indicates the estimated value of X. Here the expectation, denoted by $\langle \cdot \rangle$, indicates averaging is over an ensemble of $\mathbf{F}$ for a given α and an ensemble of $\mathbf{H}$ for given transmit and receiver sites.

It is worth noting that this approach does not necessarily provide an unbiased estimate of α. Estimates of α, using the metric introduced here, are dependent upon the received SNR. Data presented later in the article have sufficiently high SNR such that α can be estimated within ±0.02.

Space-Time Low-Density Parity-Check Codes

This section of the article introduces low-density parity-check (LDPC) codes, which were studied extensively by R.G. Gallager [31]. The significance of modern implementations of LDPC codes rests on iterative decoding algorithms that, for LDPC codes, are applications of techniques formulated for Bayesian belief networks, which are introduced below. This section also discusses a simple application of LDPC to space-time codes.
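The channel metrics of Equations 32 and 36 and the α-parameterized channel model of Equations 37 and 38 can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: NumPy is assumed to be available, the function names are hypothetical, the shaping matrix uses the square-root normalization that makes tr(A²) = n, and the random unitary matrices are drawn from a QR factorization, which yields unitary but not exactly Haar-distributed matrices.

```python
import numpy as np

def eigen_metric(Ha, Hb):
    """delta^2 (Eq. 32): squared distance between log2 eigenvalue profiles.

    The eigenvalues of H H^dagger are the squared singular values of H,
    which np.linalg.svd returns in descending order.
    """
    la = np.linalg.svd(Ha, compute_uv=False) ** 2
    lb = np.linalg.svd(Hb, compute_uv=False) ** 2
    return float(np.sum((np.log2(la) - np.log2(lb)) ** 2))

def loss_metric(H, Hp):
    """gamma (Eq. 36): power-weighted mean cos^2(theta) over transmitters.

    Columns of H and Hp are the array responses of each transmitter.
    """
    n, n_p = np.linalg.norm(H, axis=0), np.linalg.norm(Hp, axis=0)
    inner = np.abs(np.sum(Hp.conj() * H, axis=0))  # |h'_m^dagger h_m|
    cos2 = (inner / (n * n_p)) ** 2
    return float(np.sum(n * n_p * cos2) / np.sum(n * n_p))

def shaping_matrix(alpha, n):
    """A_alpha (Eq. 38), normalized so that tr(A^2) = n (assumed form)."""
    d = float(alpha) ** np.arange(n)
    return np.diag(np.sqrt(n) * d / np.sqrt(np.sum(d ** 2)))

def random_unitary(n, rng):
    """A random unitary matrix from the QR factorization of a Gaussian matrix."""
    g = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    q, _ = np.linalg.qr(g)
    return q

def sample_channel(alpha, n, rng, b=1.0):
    """One draw of F = b U A G A V^dagger (Eq. 37) with unit-variance entries in G."""
    G = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    A = shaping_matrix(alpha, n)
    U, V = random_unitary(n, rng), random_unitary(n, rng)
    return b * U @ A @ G @ A @ V.conj().T
```

Averaging `eigen_metric` between a measured channel and draws of `sample_channel(alpha, ...)` over a grid of α values, then taking the minimizing α, would be one way to approximate the estimator of Equation 39.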


BAYESIAN BELIEF NETWORKS

[…] is often based on the Bayesian belief networks popularized by Pearl, in the context of machine learning, in a well-known monograph [1]. An interpretation of various decoding algorithms in terms of Bayesian belief networks is presented elsewhere [2].

To appreciate the use of belief networks for decoding, consider the probability density function denoted $p(x_1, x_2, x_3, x_4)$ in Figure A. This function factors in the manner shown in the figure, expressing simpler variable dependencies than those allowed by the multivariate notation $p(x_1, \ldots, x_l)$. The factorization can be represented by a directed acyclic graph as shown, with directed arrows expressing conditional probabilities of the more general form $p(x \,|\, u_1, \ldots, u_l)$.

For decoding purposes, each node in the graph maintains an alphabet (for example, the symbol alphabet for coding applications) and several (probability) distributions over this alphabet. One probability distribution, denoted $\pi(x)$, can be interpreted as a prior density on the alphabet while another (nonnegative, but not a normalized density) distribution, denoted $\lambda(x)$, can be interpreted as a likelihood function on the alphabet.

In addition, each node keeps track of a belief function that is the product of priors and likelihoods: $\pi(x)\lambda(x)$. The maximum of the belief function can be used as a decision on the value of the node’s alphabet.

To evaluate a consistent set of distribution functions, messages are received and transmitted from each node. Messages that flow from parent to child are denoted $\pi_k^P(u_k)$ and are treated as if they were priors, while messages that flow from child to parent are denoted $\lambda_k^C(x)$ and are treated as if they were likelihoods. At each node, messages received from parents and children are used to update the internal (for that node) prior and likelihood functions $\pi(x)$ and $\lambda(x)$ for the node’s alphabet.

Nodes are activated in any order, subject only to the requirement that all incoming messages are available. When a node is activated, it calculates its internal prior and likelihood functions and then makes its messages available to its parent and child nodes. Initial settings of the internal functions are provided (but not shown in the figure) to enable the process to start.

For loopless graphs, the order of activation does not matter and the process converges. Unfortunately, for decoding applications, interesting graphs have loops, so order of activation matters and convergence is not guaranteed. Typically, nodes are activated in a repetitive pattern for a certain number of iterations until a stopping criterion is met. Symbol decisions are based on the belief function.

[Figure A panels]

Variable dependencies (loopless directed acyclic graph (DAG), directed Markov field, Bayesian belief network):

$$p(x_1, x_2, x_3, x_4) = p(x_4 | x_3) \, p(x_3 | x_1) \, p(x_2 | x_1) \, p(x_1)$$

Messages passed: the update node (alphabet $x \in A$) receives priors $\pi^P$ from its parent nodes (alphabets $u_k \in B_k$) and likelihoods $\lambda^C$ from its child nodes; it sends likelihoods $\lambda^P$ to its parents and priors $\pi^C$ to its children.

Node calculations and messages:

$$\pi(x) = \sum_{u} p(x | u_1, \ldots, u_s) \prod_{k=1}^{s} \pi_k^P(u_k)$$

$$\lambda(x) = \prod_k \lambda_k^C(x)$$

$$\text{Belief: } \lambda(x) \, \pi(x)$$

$$\pi_j^C(x) = \pi(x) \prod_{k \neq j} \lambda_k^C(x)$$

$$\lambda_j^P(u_j) = \sum_{x, \, u_k : k \neq j} \lambda(x) \, p(x | u_1, \ldots, u_s) \prod_{k \neq j} \pi_k^P(u_k)$$

FIGURE A. Bayesian belief networks provide a framework for representing conditional probabilities in a graphical manner. Each node has a symbol alphabet on which it maintains a belief function that factors as a product of a prior-like function and a likelihood-like function. Beliefs are updated by passing messages among nodes in a manner suggested by the terminology. Initial states and a node update order must be chosen. Only in special cases do the iterations converge to a Bayesian decision, but for many interesting applications, the iterative technique is both practical and effective. Turbo codes and low-density parity-check codes have decoders based on this paradigm.

References
1. J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference (Morgan Kaufmann, San Mateo, Calif., 1988).
2. R.J. McEliece, D.J.C. MacKay, and J.-F. Cheng, “Turbo Decoding as an Instance of Pearl’s ‘Belief Propagation’ Algorithm,” IEEE J. Sel. Areas Commun. 16 (2), 1998, pp. 140–152.

Low-Density Parity-Check Codes

LDPC codes were developed by Gallager, who studied their distance properties and decoding in a well-known monograph [31]. With the advent of graphical decoding techniques, soft-decision decoding of LDPC codes became practical, resulting in renewed interest in these codes. Subsequent developments in code design and decoding have led to codes that achieve levels of performance astonishingly close to the Shannon capacity [32], albeit at the cost of extremely long codewords. However, decoding complexity of LDPC codes scales linearly (with a fixed number of iterations) with the code length, making relatively long codes practical.

LDPC codes are linear block codes defined by a parity-check matrix. Each symbol in the codeword is involved in only a few parity-check equations. Consequently, most entries in the parity-check matrix are zero. Regular LDPC codes have $n_C$ parity-check equations for each symbol, and each parity-check equation involves $n_R$ symbols. Thus, if the dimensionality of the parity-check matrix is r × c, we have $r n_R = c n_C$ for regular LDPC codes. As an example, the LDPC code used for some of the experiments described later satisfies (r, c) = (512, 1024) and $(n_R, n_C)$ = (8, 4). More powerful codes that are not regular are also known [33].

LDPC Decoding

Recently, graphical decoding techniques have motivated practical code design. Bayesian belief networks [34] can be used to formulate decoders for LDPC codes and turbo codes (the sidebar entitled “Bayesian Belief Networks” provides more information). However, beyond connecting the decoding algorithm of LDPC codes to Bayesian belief networks, a thorough explanation of the steps in this algorithm is outside the scope of this article; we present only a concise summary.

For LDPC codes, Figure 2 shows a graph illustrating data and parity-check dependencies for the codewords. In general, each nonzero entry in the parity-check matrix indicates the edge of a graph connecting a parity-check node (row index) and a codeword symbol (column index). The example in Figure 2 is a single parity-check code on four symbols. The graph shows the symbol nodes $c_1, \ldots, c_4$, the data nodes $z_1, \ldots, z_4$, and the parity-check nodes, labeled by zeroes. Each edge between a parity-check node and a symbol node corresponds to a nonzero entry in the parity-check matrix.

Decoding occurs by treating the graph as a Bayesian belief network using the conditional probabilities $p(z_k | c_k)$, which express the likelihood ratios, and

$$p(0 \,|\, c_{i_1}, \ldots, c_{i_l}) = \delta\left( \sum_k c_{i_k} \right) ,$$

which expresses the parity-check relation. The resulting algorithm can be viewed as sweeping through the rows and columns of the parity-check matrix, updating likelihood ratios $l_k$ for each nonzero entry in the matrix. The notation below denotes $l_{i_j}$ as the likelihood ratio stored with the $i_j$-th (nonzero) entry in a fixed (for the given step) row or column of the parity-check matrix. In this form, the iterative steps of the algorithm are summarized for the simple case of a binary symbol alphabet by the equations:


[Figure 2 panels]

Parity-check matrix:

1 1 0 0
0 1 1 0
0 0 1 1

Bayesian belief network: evidentiary nodes (observations) $z_1, z_2, z_3, z_4$; codeword component nodes $c_1, c_2, c_3, c_4$, linked to the observations through $p(z|c)$; parity-check nodes, labeled 0, with $p(0 \,|\, c_{i_1}, \ldots, c_{i_s}) = \delta\left(\sum_k c_{i_k}\right)$.

Some initialization: flat priors for codeword nodes, $\pi(c_i)$; fixed likelihoods for evidentiary nodes, $\lambda_k(x) = \delta(x - z_k)$; node firing order z, 0, c, z, 0, c, ···; stopping rule: parity check satisfied.

FIGURE 2. Application of Bayesian belief networks to low-density parity-check codes. Soft-decision decoding of low-density parity-check codes can be based on Bayesian belief networks. Both the redundancies in codewords $c_k$ and the relationship between the codewords and the data $z_k$ can be represented graphically. The data-codeword relationship is expressed through the probability densities p(z|c), which are assumed to be independent sample-to-sample. Redundancies in the codewords are expressed in a similar notation as $p(0 \,|\, c_{i_1}, \ldots, c_{i_s})$, where the symbols $c_{i_k}$, 1 ≤ k ≤ s, are involved in a parity check. In this manner, all dependencies are expressed graphically through conditional probability densities as required for the formalism of Bayesian belief networks.

1. Row sweeps

$$\tanh\left( \frac{l_{i_k}^{(\mathrm{new})}}{2} \right) = \prod_{j \neq k} \tanh\left( \frac{l_{i_j}^{(\mathrm{old})}}{2} \right)$$

2. Column sweeps [column c with log-likelihood ratio $l_c^{(\mathrm{LLR})}$]

$$l_{i_k}^{(\mathrm{new})} = \sum_{j \neq k} l_{i_j}^{(\mathrm{new})} + l_c^{(\mathrm{LLR})}$$

3. Bit decisions (column c)

$$\mathrm{sign}\left( \sum_k l_{i_k} + l_c^{(\mathrm{LLR})} \right) .$$

For the code used in the experiments, each row sweep involves eight $l_{i_j}$ per row and each column sweep four $l_{i_j}$ per column.

Each of the row (column) operations is independent of any other row (column) operation and hence can be implemented in any order or in parallel. This allows considerable acceleration of hardware decoders. Decoding can be halted after a fixed number of iterations or after the parity-check equations are satisfied.

Some simplifications that are not possible for nonbinary symbol alphabets are involved in the binary case. In this more general context, the row/column sweeps are expressed by:

1. Row sweeps

$$\pi(\beta_{i_k}) \, \mathbf{p}_{i_k}^{(\mathrm{new})} = 2^{-n} \, \mathbf{U}_n \bigodot_{j \neq k} \left[ \mathbf{U}_n \, \pi(\beta_{i_j}) \, \mathbf{p}_{i_j}^{(\mathrm{old})} \right]$$

2. Column sweeps

$$\mathbf{p}_{i_k}^{(\mathrm{new})} \propto \left[ \bigodot_{j \neq k} \mathbf{p}_{i_j}^{(\mathrm{old})} \right] \odot \mathbf{p}_c^{(\mathrm{LF})}$$

3. Symbol decisions

$$\left[ \bigodot_m \mathbf{p}_{i_m} \right] \odot \mathbf{p}_c^{(\mathrm{LF})} .$$
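The binary row and column sweeps can be sketched in a few lines of code. The example below is a minimal illustration, not the decoder used in the experiments: it assumes log-likelihood ratios defined as log p(0)/p(1) (so a positive value favors bit 0), a schedule in which all rows are swept and then all columns, and the small parity-check matrix of Figure 2. The function name `ldpc_decode` and its arguments are hypothetical.

```python
import math

# Parity-check matrix from Figure 2 (three checks on four bits).
H_FIG2 = [[1, 1, 0, 0],
          [0, 1, 1, 0],
          [0, 0, 1, 1]]

def ldpc_decode(H, llr, max_iter=20):
    """Iterative decoding with row (tanh rule) and column (sum) sweeps.

    llr[c] is the channel log-likelihood ratio log p(0)/p(1) for column c.
    Returns the hard bit decisions after the stopping rule fires.
    """
    rows, cols = len(H), len(H[0])
    # One likelihood ratio is stored per nonzero entry of H.
    msg = {(r, c): llr[c] for r in range(rows) for c in range(cols) if H[r][c]}
    bits = [0 if v >= 0 else 1 for v in llr]
    for _ in range(max_iter):
        # Row sweep: each entry is updated from the other entries in its row.
        new = {}
        for r in range(rows):
            idx = [c for c in range(cols) if H[r][c]]
            for k in idx:
                prod = 1.0
                for j in idx:
                    if j != k:
                        prod *= math.tanh(msg[(r, j)] / 2.0)
                prod = max(min(prod, 1.0 - 1e-12), -1.0 + 1e-12)  # keep atanh finite
                new[(r, k)] = 2.0 * math.atanh(prod)
        # Bit decisions: sum of all entries in the column plus the channel LLR.
        total = [llr[c] + sum(new[(r, c)] for r in range(rows) if H[r][c])
                 for c in range(cols)]
        bits = [0 if t >= 0 else 1 for t in total]
        # Stopping rule: all parity checks satisfied.
        if all(sum(H[r][c] * bits[c] for c in range(cols)) % 2 == 0
               for r in range(rows)):
            break
        # Column sweep: each entry is updated from the other entries in its column.
        msg = {(r, c): total[c] - new[(r, c)] for (r, c) in new}
    return bits
```

For the Figure 2 matrix the only codewords are 0000 and 1111, so a single weak, wrongly signed observation is corrected by its neighbors. Because each row update reads only the old values in its own row, the inner loops could be run in parallel, as the text notes.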


The components of the vector $\mathbf{p}_k$ express probabilities for the values of the kth symbol, the permutation $\pi_k$ indicates the effect of a particular nonbinary coefficient in the parity-check equation, $\mathbf{U}_n$ is a Walsh-Hadamard matrix, and the notation $\odot$ denotes the Hadamard (component by component) product.

LDPC Space-Time Extension

There are a variety of extensions of LDPC codes to space-time codes, which are introduced and explained in the sidebar entitled “Space-Time Codes.” For the experiments described below, only one type of extension was considered.

Each space-time channel transmits one of several possible quadrature phase-shift keying (QPSK) waveforms with slightly offset carrier frequencies. The differential frequencies are sufficiently large to effectively decorrelate the transmitted waveforms over the length of a codeword (1024 bits) even if the data sequences in each channel are identical. These differential frequencies are also large compared to the expected Doppler spreads and small compared to the signal bandwidth.

In the simplest example of such a code, the I and Q components of a transmitter represent, respectively, two different LDPC codewords. Each transmitter sends the same complex baseband sequence (QPSK) shifted in frequency. The transmitter outputs, viewed collectively as a vector at any instant, vary in time and thus effectively probe the environment characterized by the channel matrix. Since the transmitted vector varies significantly over the duration of a codeword, the coding provides spatial diversity. Decoding occurs by forming likelihood ratios based on channel-matrix estimates and then using the iterative decoder described above. Note that the channel matrix can change during the codeword, in which case channel-matrix estimates can vary sample to sample.

The LDPC space-time code just described exhibits full spatial redundancy among all transmitters. Less redundancy, and therefore higher data rates, can be achieved by dividing the transmitters into subsets, each of which is fully redundant yet different from any other subset. For example, the space-time code discussed later, in the section on experiments, uses four transmitters. The first two transmitters send two bits (redundant in I and Q) of a symbol of an LDPC codeword over GF(16). The remaining two transmitters send the other two bits of the same symbol. Decoding is based on likelihood functions built over GF(16) using estimates of the channel matrices. Again, differential frequencies among transmitters enable spatial diversity.

Space-Time Turbo Code and Multichannel Multiuser Detectors

While the theoretical performance is determined by the channel phenomenology, practical MIMO performance requires the selection of a space-time code and an appropriate matched receiver. In this section we discuss the space-time turbo code used in this example. We develop a maximum-likelihood formulation of a multiple-antenna multiuser receiver, and we discuss suboptimal implementations of the receiver. We also introduce minimum-mean-squared-error extensions of the receiver, and we discuss the value and use of training data.

Space-Time Turbo Code

Turbo codes, introduced elsewhere [35], illustrate that codes constructed with simple components, such as interleavers and convolutional encoders, combined with an iterative decoding process can achieve near-Shannon capacity performance. The iterative decoding process, taking advantage of information exchange among component decoders, provides a feasible way to approach optimal performance. For each component decoder, the best decoding algorithm is the maximum a posteriori (MAP) algorithm or the BCJR algorithm [36], which is derived from the MAP principle. Modifications of the MAP algorithm include log-MAP and max-log-MAP [37]. Recently, implementation of turbo decoders has been carried out and high data-rate decoding is possible [38].

A number of space-time extensions of turbo coding have been suggested [21, 22]. The approach used here, which was introduced elsewhere [39], provides a 2-bit/sec/Hz link for a 4 × 4 MIMO system with independent QPSK waveforms from each transmitter. A single data stream is turbo encoded and the encoded data stream is distributed redundantly amongst the transmitters. The turbo encoder employs a rate-1/3, 16-state convolutional encoder twice with two different 4096-bit random interleavers. The distribution of systematic bits is such


that each systematic bit is sent twice on two different transmitters. The parity bits are sent once, distributed randomly amongst the transmitters. The difference in weighting between the systematic and parity bits provides an effective puncturing of the code. Because more energy is dedicated to systematic bits, remodulation errors have a reduced effect on subtraction performance, in principle improving the performance of the iterative multiuser detection for a given bit error rate.

Multichannel Multiuser Detector

The multichannel multiuser detector (MCMUD) algorithm, discussed elsewhere [3, 39, 40], is a minimum-mean-squared-error (MMSE) extension to an iterative implementation of a maximum-likelihood multiple-antenna receiver. The MCMUD algorithm employed for this analysis iteratively combines a blind space-time-frequency adaptive beamformer with a multiuser detector.

SPACE-TIME CODES

Space-time codes are used with multiple transmitters to provide spatial as well as temporal redundancy in the data received by an array of antennas. There are two basic approaches to space-time coding. In the first approach, the transmitter can be informed of the propagation channel by the receiver and thus adjust its coding accordingly. This approach offers the largest information-theoretic capacity but can be difficult to accomplish in a dynamic environment. The second approach, which is taken here, uses fixed codes of various rates that offer good performance on average (over all channels). These codes share transmitted power equally among all spatial channels.

The number of different types of space-time codes is too large to provide a useful overview here. Instead we briefly describe two important categories of space-time codes that are not treated in the text.

Block Orthogonal Codes

For data $\mathbf{Z}$ and channel matrix $\mathbf{H}$, consider a set of matrix symbols $\mathbf{S}$ contained in $\mathcal{S}$. The information bits are encoded in matrices that are constrained to lie in the class $\mathcal{S}$. This class is defined by the property that $\mathbf{S}\mathbf{S}^\dagger$ is proportional to the identity matrix with a fixed (independent of $\mathbf{S}$) proportionality constant. The maximum-likelihood decision for $\mathbf{S}$ is based on finding

$$\arg\min_{\mathbf{S} \in \mathcal{S}} \| \mathbf{Z} - \mathbf{H}\mathbf{S} \|^2 = \arg\max_{\mathbf{S} \in \mathcal{S}} \mathrm{Re} \, \mathrm{tr}\left( \mathbf{Z} \mathbf{S}^\dagger \mathbf{H}^\dagger \right) ,$$

which involves a linear function in the entries of $\mathbf{S}$. For some simple classes $\mathcal{S}$, linearity of the likelihood function decouples decisions on the data symbols. For example, consider the Alamouti code [1].

$$\mathcal{S} = \left\{ \mathbf{S} : \mathbf{S} = \begin{pmatrix} s_1 & -s_2^* \\ s_2 & s_1^* \end{pmatrix} \right\} .$$

The information symbols $s_1$ and $s_2$ are sent redundantly over both channels. The likelihood function is linear in each $s_k$, decoupling demodulation decisions.

Another example of an orthogonal matrix code is

$$\mathbf{S} = \begin{bmatrix} \mathbf{S}_0 & \mathbf{S}_0^* \end{bmatrix} \quad \text{with} \quad \mathbf{S}_0 = \begin{pmatrix} s_1 & -s_2 & -s_3 & -s_4 \\ s_2 & s_1 & s_4 & -s_3 \\ s_3 & -s_4 & s_1 & s_2 \\ s_4 & s_3 & -s_2 & s_1 \end{pmatrix} .$$

Space-Time Trellis Codes

Figure A provides an example of a space-time trellis code. A pair of bits $(I_t^1, I_t^2)$ at time t enters a convolutional encoder with integer coefficients $a_p^k$ and $b_p^k$ at the pth lag in the kth channel. The input bits are interpreted as the integers 0 or 1. Computations occur modulo 4 and result in an integer value between 0 and 3 for each channel. A fixed mapping between these four integers and the quadrature phase-shift keying (QPSK) alphabet completes the coding and modulation.

The trellis code is defined by the coefficients $\{a_p^k, b_p^k\}$. These are often chosen under one of several design criteria, also shown in the figure. Each codeword is a matrix symbol $\mathbf{C}$. The probability of an error in deciding between two such symbols can be bounded by (Bhattacharyya bound)

$$p_e \leq \exp\left\{ -\frac{E}{4 N_0} \mathrm{tr}\left[ (\mathbf{C}_1 - \mathbf{C}_2)^\dagger \mathbf{H}^\dagger \mathbf{H} (\mathbf{C}_1 - \mathbf{C}_2) \right] \right\} .$$

The approximation $\mathbf{H}^\dagger \mathbf{H} \approx n_R \mathbf{I}_{n_T}$ motivates one of the design criteria shown in the figure. Integrating over $\mathbf{H}$ motivates the other. In both cases r denotes the rank of the matrix difference $\mathbf{C}_1 - \mathbf{C}_2$. Constrained searches over the code coefficients are commonly used to find codes with the smallest possible error between closest codewords under either criterion. When $4 \geq r n_R$, it is important to ensure that the rank of the matrix difference is not too small. When $4 < r n_R$, maximizing the Euclidean distance between the two codewords $\mathbf{C}_k$ becomes important.

[Figure A panels]

Notation: rank r matrix codeword $\mathbf{C}$; $n_T$ transmitters; $n_R$ receivers.

Design criteria for space-time trellis codes ($r n_R \leq 4$ or $r n_R \geq 4$):

$$p_e \leq \left| \prod_{k=1}^{r} \lambda_k\left[ (\mathbf{C}_1 - \mathbf{C}_2)(\mathbf{C}_1 - \mathbf{C}_2)^\dagger \right] \right|^{-n_R} \left( \frac{E}{4 N_0} \right)^{-r n_R} , \qquad p_e \leq \frac{1}{4} \exp\left\{ -n_R \frac{E}{4 N_0} \mathrm{tr}\left[ (\mathbf{C}_1 - \mathbf{C}_2)(\mathbf{C}_1 - \mathbf{C}_2)^\dagger \right] \right\}$$

Example of space-time trellis code ($a^k, b^k \in \{0, 1, 2, 3\}$), kth transmitter: the data bit pair $(I_t^1, I_t^2)$ is trellis coded into the codeword symbol

$$x_t^k = \sum_{p=0}^{v_1} I_{t-p}^1 a_p^k + \sum_{q=0}^{v_2} I_{t-q}^2 b_q^k \bmod 4 ,$$

which is QPSK modulated as $i^{x_t^k}$.

FIGURE A. Space-time trellis codes introduce spatial as well as temporal redundancy in the transmitted data. Code design often involves a pruned search over a class of codes based on a simple figure of merit. For example, the minimum least-squares distance between codewords (represented by space-time matrices $\mathbf{C}_k$) can be maximized. In the example shown, an alphabet consisting of the integers modulo 4 is used for convolutional encoding at each transmitter. The resulting output symbols are mapped to a QPSK alphabet. The coefficients $a^k$, $b^k$ determine the code. Note that the spectral efficiency is 2 bits/sec/Hz.

Reference
1. S.M. Alamouti, “A Simple Transmit Diversity Technique for Wireless Communications,” IEEE J. Sel. Areas Commun., 16 (8), 1998, pp. 1451–1458.
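The decoupling of symbol decisions for the Alamouti code in the sidebar can be checked with a short sketch. Expanding Re tr(Z S† H†) with Y = H†Z shows that the statistic separates into one term in s1 and one in s2, with d1 = Y11 + Y22* and d2 = Y21 − Y12*. The code below is an illustration under stated assumptions: the QPSK mapping, the noise-free channel helper, and all function names are hypothetical and chosen only for the example.

```python
import math

# Unit-energy QPSK alphabet (an assumed mapping for illustration).
QPSK = [complex(a, b) / math.sqrt(2) for a in (1, -1) for b in (1, -1)]

def alamouti_encode(s1, s2):
    """Matrix symbol S: rows index the two transmitters, columns the two symbol periods."""
    return [[s1, -s2.conjugate()],
            [s2, s1.conjugate()]]

def channel(H, S):
    """Noise-free received block Z = H S for an nR x 2 channel matrix H."""
    return [[sum(H[r][i] * S[i][t] for i in range(2)) for t in range(2)]
            for r in range(len(H))]

def alamouti_decode(Z, H):
    """Maximize Re tr(Z S^dagger H^dagger) separately over s1 and s2."""
    nr = len(H)
    # Y = H^dagger Z is a 2 x 2 matrix.
    Y = [[sum(H[r][i].conjugate() * Z[r][t] for r in range(nr))
          for t in range(2)] for i in range(2)]
    d1 = Y[0][0] + Y[1][1].conjugate()  # depends on s1 only
    d2 = Y[1][0] - Y[0][1].conjugate()  # depends on s2 only

    def pick(d):
        # Nearest constant-modulus symbol: maximize the correlation Re(d s*).
        return max(QPSK, key=lambda s: (d * s.conjugate()).real)

    return pick(d1), pick(d2)
```

Without noise, d1 and d2 equal the squared Frobenius norm of H times s1 and s2, so each symbol is recovered by an independent four-way search rather than a joint sixteen-way search.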

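We present here the results of the maximum likelihood (ML) formulation of MCMUD, employing a quasistatic narrowband MIMO-channel model. The number of receive antennas $n_R$ by number of samples $n_s$ data matrix, $\mathbf{Z} \in \mathbb{C}^{n_R \times n_s}$, is given by

$$\mathbf{Z} = \mathbf{H}\mathbf{T} + \mathbf{N} , \tag{40}$$

where the channel matrix $\mathbf{H} \in \mathbb{C}^{n_R \times n_T}$ contains the complex attenuation between each transmit antenna and receive antenna; $\mathbf{T} \in \mathbb{C}^{n_T \times n_s}$ is the transmitted sequence; and $\mathbf{N} \in \mathbb{C}^{n_R \times n_s}$ is additive Gaussian interference plus noise. The probability density for a multivariate signal-in-the-mean model is given by

$$p(\mathbf{Z} \,|\, \mathbf{R}, \mathbf{H}, \mathbf{T}) = \frac{e^{-\mathrm{tr}\{ (\mathbf{Z} - \mathbf{H}\mathbf{T})^\dagger \mathbf{R}^{-1} (\mathbf{Z} - \mathbf{H}\mathbf{T}) \}}}{\pi^{n_s n_R} |\mathbf{R}|^{n_s}} , \tag{41}$$

where $\mathbf{R}$ indicates the spatial covariance matrix of the interference plus noise, $|\cdot|$ indicates the determinant of a matrix, † indicates the Hermitian conjugate, and tr indicates the trace of a matrix. Maximizing the probability density with respect to $\mathbf{H}$ is equivalent to minimizing the tr{} in Equation 41,

$$\mathrm{tr}\{ (\mathbf{Z} - \mathbf{H}\mathbf{T})^\dagger \mathbf{R}^{-1} (\mathbf{Z} - \mathbf{H}\mathbf{T}) \} , \tag{42}$$

which is satisfied by

$$\hat{\mathbf{H}} = \mathbf{Z} \mathbf{T}^\dagger (\mathbf{T} \mathbf{T}^\dagger)^{-1} , \tag{43}$$

assuming $\mathbf{T}\mathbf{T}^\dagger$ is not rank deficient. Substituting $\hat{\mathbf{H}}$,

$$p \rightarrow \frac{e^{-\mathrm{tr}\{ \mathbf{Z} \mathbf{P}_T^\perp \mathbf{Z}^\dagger \mathbf{R}^{-1} \}}}{\pi^{n_s n_R} |\mathbf{R}|^{n_s}} , \tag{44}$$

where the matrix $\mathbf{P}_T \equiv \mathbf{T}^\dagger (\mathbf{T}\mathbf{T}^\dagger)^{-1} \mathbf{T}$ projects onto the row space spanned by $\mathbf{T}$, and $\mathbf{P}_T^\perp = \mathbf{I}_{n_s} - \mathbf{P}_T$ projects onto the orthogonal complement of the row space of $\mathbf{T}$. Maximizing with respect to an internal parameter of $\mathbf{R}$ gives

$$\mathrm{tr}\{ \mathbf{R}^{-1} \mathbf{Z} \mathbf{P}_T^\perp \mathbf{Z}^\dagger \mathbf{R}^{-1} \dot{\mathbf{R}} \} - n_s \, \mathrm{tr}\{ \mathbf{R}^{-1} \dot{\mathbf{R}} \} = 0 , \tag{45}$$

where $\dot{\mathbf{R}}$ indicates the derivative of $\mathbf{R}$ with respect to some internal parameter. This relationship is satisfied when

$$\hat{\mathbf{R}} = \frac{\mathbf{Z} \mathbf{P}_T^\perp \mathbf{Z}^\dagger}{n_s} , \tag{46}$$

assuming that $\mathbf{R}$ is not rank deficient. Using these results, the ML statistic for estimating $\mathbf{T}$ is given by

$$\max_{\mathbf{R}, \mathbf{H}} p(\mathbf{Z} \,|\, \mathbf{R}, \mathbf{H}, \mathbf{T}) = \left( \frac{\pi e}{n_s} \right)^{-n_s n_R} \left| \mathbf{Z} \mathbf{P}_T^\perp \mathbf{Z}^\dagger \right|^{-n_s} . \tag{47}$$

The determinant of $\mathbf{Z} \mathbf{P}_T^\perp \mathbf{Z}^\dagger$ is minimized to demodulate the signals for all transmitters jointly.

Although it is theoretically possible to use the statistic $|\mathbf{Z} \mathbf{P}_T^\perp \mathbf{Z}^\dagger|$ directly for demodulation, an iterative approach is much more practical. We define $\mathbf{T}^\dagger \equiv (\mathbf{T}_A^\dagger \; \mathbf{T}_B^\dagger)$ to be a partitioned form of $\mathbf{T}$, where the $n_A \times n_s$ matrix $\mathbf{T}_A$ contains the signals associated with a particular subset of $n_A$ transmit antennas and the $(n_T - n_A) \times n_s$ matrix $\mathbf{T}_B$ contains the signals associated with all other transmit antennas. By factoring $\mathbf{P}_{T_B}^\perp = \mathbf{X}^\dagger \mathbf{X}$, the rows of $\mathbf{X}$ form an orthonormal basis for the complement of the row space of $\mathbf{T}_B$ such that $\mathbf{X}\mathbf{X}^\dagger = \mathbf{I}$, where the symmetric identity matrix has a dimension of $n_s$ minus the number of rows in $\mathbf{T}_B$. By defining $\mathbf{Z}_X \equiv \mathbf{Z}\mathbf{X}^\dagger$ and $\mathbf{T}_X \equiv \mathbf{T}_A \mathbf{X}^\dagger$, we can show that

$$\left| \mathbf{Z} \mathbf{P}_T^\perp \mathbf{Z}^\dagger \right| = \left| \mathbf{Z}_X \mathbf{P}_{T_X}^\perp \mathbf{Z}_X^\dagger \right| . \tag{48}$$

The determinant can be factored into terms with and without reference to $\mathbf{T}_A$,

$$\left| \mathbf{Z}_X \mathbf{P}_{T_X}^\perp \mathbf{Z}_X^\dagger \right| = \left| \mathbf{Z}_X \mathbf{Z}_X^\dagger \right| \left| \mathbf{I}_{n_s} - \mathbf{P}_{T_X} \mathbf{P}_{Z_X} \right| . \tag{49}$$

Because the first term is free of $\mathbf{T}_A$, demodulation is performed by minimizing the second term. This form suggests an iterative approach, where the signal associated with each transmitter, in turn, is considered to be user A and is demodulated by minimizing $|\mathbf{I}_{n_s} - \mathbf{P}_{T_X} \mathbf{P}_{Z_X}|$.

If $\mathbf{T}_A$ is a row vector, such that $n_A = 1$, then the second term can be simplified and interpreted in terms of a beamformer

$$\begin{aligned}
\left| \mathbf{I}_{n_s} - \mathbf{P}_{T_X} \mathbf{P}_{Z_X} \right| &= 1 - (\mathbf{T}_X \mathbf{T}_X^\dagger)^{-1} \mathbf{T}_X \mathbf{P}_{Z_X} \mathbf{T}_X^\dagger \\
&= 1 - \frac{\mathbf{w}_A^\dagger \mathbf{Z}_X \mathbf{T}_X^\dagger}{n_s} \\
&= 1 - \frac{\mathbf{w}_A^\dagger \mathbf{Z} \mathbf{P}_{T_B}^\perp \mathbf{T}_A^\dagger}{n_s} ,
\end{aligned} \tag{50}$$

where

$$\begin{aligned}
\mathbf{w}_A &= \hat{\mathbf{R}}_X^{-1} \hat{\mathbf{H}}_A , \\
\hat{\mathbf{R}}_X &\equiv \frac{1}{n_s} \mathbf{Z}_X \mathbf{Z}_X^\dagger = \frac{1}{n_s} \mathbf{Z} \mathbf{P}_{T_B}^\perp \mathbf{Z}^\dagger , \\
\hat{\mathbf{H}}_A &\equiv \mathbf{Z}_X \mathbf{T}_X^\dagger (\mathbf{T}_X \mathbf{T}_X^\dagger)^{-1} = \mathbf{Z} \mathbf{P}_{T_B}^\perp \mathbf{T}_A^\dagger (\mathbf{T}_A \mathbf{P}_{T_B}^\perp \mathbf{T}_A^\dagger)^{-1} .
\end{aligned} \tag{51}$$

The $n_R \times 1$ vector $\mathbf{w}_A$ contains the receive beamforming weights, $\hat{\mathbf{R}}_X$ is the interference-mitigated signal-plus-noise covariance matrix estimate, and $\hat{\mathbf{H}}_A$ is the channel estimate associated with $\mathbf{T}_A$. It is worth noting that the form for $\hat{\mathbf{H}}_A$ is simply the column of $\hat{\mathbf{H}}$, given in Equation 43, associated with $\mathbf{T}_A$.

$$\begin{aligned}
\hat{\mathbf{H}} &= \mathbf{Z} \mathbf{T}^\dagger (\mathbf{T}\mathbf{T}^\dagger)^{-1} \\
&= \mathbf{Z} \left( \mathbf{T}_A^\dagger \; \mathbf{T}_B^\dagger \right) \begin{pmatrix} \mathbf{T}_A \mathbf{T}_A^\dagger & \mathbf{T}_A \mathbf{T}_B^\dagger \\ \mathbf{T}_B \mathbf{T}_A^\dagger & \mathbf{T}_B \mathbf{T}_B^\dagger \end{pmatrix}^{-1} \\
&\equiv \mathbf{Z} \left( \mathbf{T}_A^\dagger \; \mathbf{T}_B^\dagger \right) \begin{pmatrix} \mathbf{M}_{1,1} & \mathbf{M}_{1,2} \\ \mathbf{M}_{1,2}^\dagger & \mathbf{M}_{2,2} \end{pmatrix}^{-1} \\
&= \mathbf{Z} \left( \mathbf{T}_A^\dagger \; \mathbf{T}_B^\dagger \right) \begin{pmatrix} (\mathbf{M}_{1,1} - \mathbf{M}_{1,2} \mathbf{M}_{2,2}^{-1} \mathbf{M}_{1,2}^\dagger)^{-1} & \cdot \\ -\mathbf{M}_{2,2}^{-1} \mathbf{M}_{1,2}^\dagger (\mathbf{M}_{1,1} - \mathbf{M}_{1,2} \mathbf{M}_{2,2}^{-1} \mathbf{M}_{1,2}^\dagger)^{-1} & \cdot \end{pmatrix} .
\end{aligned} \tag{52}$$

By focusing on the first column and substituting in for $\mathbf{M}_{1,1}$, $\mathbf{M}_{1,2}$, and $\mathbf{M}_{2,2}$, we can find $\hat{\mathbf{H}}_A$.

$$\begin{aligned}
\hat{\mathbf{H}}_A &= \mathbf{Z} \mathbf{T}_A^\dagger \left( \mathbf{T}_A \mathbf{T}_A^\dagger - \mathbf{T}_A \mathbf{T}_B^\dagger [\mathbf{T}_B \mathbf{T}_B^\dagger]^{-1} \mathbf{T}_B \mathbf{T}_A^\dagger \right)^{-1} \\
&\quad - \mathbf{Z} \mathbf{T}_B^\dagger [\mathbf{T}_B \mathbf{T}_B^\dagger]^{-1} \mathbf{T}_B \mathbf{T}_A^\dagger \left( \mathbf{T}_A \mathbf{T}_A^\dagger - \mathbf{T}_A \mathbf{T}_B^\dagger [\mathbf{T}_B \mathbf{T}_B^\dagger]^{-1} \mathbf{T}_B \mathbf{T}_A^\dagger \right)^{-1} \\
&= \mathbf{Z} \mathbf{P}_{T_B}^\perp \mathbf{T}_A^\dagger (\mathbf{T}_A \mathbf{P}_{T_B}^\perp \mathbf{T}_A^\dagger)^{-1} ,
\end{aligned} \tag{53}$$

which is the same form found in Equation 51.

Demodulation is performed by maximizing the magnitude of the inner product of the beamformer output $\mathbf{w}_A^\dagger \mathbf{Z}$ and the interference-mitigated reference signal $\mathbf{T}_A \mathbf{P}_{T_B}^\perp$.

Suboptimal Implementation

A variety of suboptimal but computationally more efficient variants are possible. In general, these approximations become increasingly valid as the number of samples in the block increases.

The first computational simplification is found by noting that the normalization term of the channel estimate in Equation 51 can be approximated by

$$\begin{aligned}
\hat{\mathbf{H}}_A &= \mathbf{Z} \mathbf{P}_{T_B}^\perp \mathbf{T}_A^\dagger (\mathbf{T}_A \mathbf{P}_{T_B}^\perp \mathbf{T}_A^\dagger)^{-1} \\
&= \mathbf{Z} \mathbf{P}_{T_B}^\perp \mathbf{T}_A^\dagger (\mathbf{T}_A \mathbf{T}_A^\dagger)^{-1/2} \\
&\quad \times \left( \mathbf{I} - [\mathbf{T}_A \mathbf{T}_A^\dagger]^{-1/2} \mathbf{T}_A \mathbf{P}_{T_B} \mathbf{T}_A^\dagger [\mathbf{T}_A \mathbf{T}_A^\dagger]^{-1/2} \right)^{-1} \\
&\quad \times (\mathbf{T}_A \mathbf{T}_A^\dagger)^{-1/2} \\
&\approx \mathbf{Z} \mathbf{P}_{T_B}^\perp \mathbf{T}_A^\dagger (\mathbf{T}_A \mathbf{T}_A^\dagger)^{-1} .
\end{aligned} \tag{54}$$

(We did not assume that $\mathbf{T}_A$ is a row vector in the previous discussion.)

The second approximation reduces the computation cost of the projection operator. The operator that projects on the orthogonal complement of the row space of $\mathbf{M}$ is given by

$$\mathbf{P}_M^\perp = \mathbf{I} - \mathbf{M}^\dagger (\mathbf{M}\mathbf{M}^\dagger)^{-1} \mathbf{M} . \tag{55}$$

This operator can be approximated by

$$\mathbf{P}_M^\perp \approx \prod_m \left[ \mathbf{I} - \mathbf{M}_m^\dagger (\mathbf{M}_m \mathbf{M}_m^\dagger)^{-1} \mathbf{M}_m \right] , \tag{56}$$

where $\mathbf{M}_m$ indicates the mth row in the matrix. By repeating the application of this approximate projection operator, we can reduce the approximation error at the expense of additional computational complexity.

MMSE Extension

Because of the effects of delay and Doppler-frequency spread, the model given in Equation 40 for the received signal is incomplete for many environments, adversely affecting the performance of the spatial-beamformer interpretation of the ML demodulator. Because turbo codes require relatively long block lengths to be effective, they are particularly sensitive to Doppler offsets. Extending the beamformer to include delay and Doppler corrects this deficiency. With this approach, the spatial-beamformer interpretation presented in Equation 50 is formally the same, but all projectors are extended to include delay and Doppler spread. The data matrix is replaced with

$$\mathbf{Z}_{STF} \equiv \left[ \mathbf{Z}_X^\dagger(\delta t_1, \delta f_1) \;\; \mathbf{Z}_X^\dagger(\delta t_1, \delta f_2) \;\; \cdots \;\; \mathbf{Z}_X^\dagger(\delta t_1, \delta f_{n_{\delta f}}) \;\; \cdots \;\; \mathbf{Z}_X^\dagger(\delta t_{n_{\delta t}}, \delta f_{n_{\delta f}}) \right]^\dagger , \tag{57}$$

which is a $(n_R \cdot n_{\delta f} \cdot n_{\delta t}) \times n_s$ matrix that includes possible signal distortions. The new channel estimate has dimension $(n_R \cdot n_{\delta f} \cdot n_{\delta t}) \times n_T$, but $\mathbf{T}$ remains the same. The MMSE beamformer is given by

$$\mathbf{w}_{STF} = \arg\min \left\| \mathbf{w}_{STF}^\dagger \mathbf{Z}_{STF} - \mathbf{T}_X \right\|^2 = (\mathbf{Z}_{STF} \mathbf{Z}_{STF}^\dagger)^{-1} \mathbf{Z}_{STF} \mathbf{T}_X^\dagger . \tag{58}$$

Figure 3 shows a diagram for this demodulator (MCMUD).

[Figure 3 block diagram. Block for each transmitter: space-time-frequency adaptive beamformer, temporal subtraction, channel estimation. Shared blocks: space-time demultiplexer, turbo decoder (output: info bits), turbo encoder, space-time multiplexer.]

FIGURE 3. Diagram of a multichannel multiuser detector (MCMUD) space-time turbo-code receiver. The receiver iteratively estimates the channel and demodulates the signal. The space-time frequency-adaptive beamformer component compensates for spatial, delay, and frequency-offset correlations. By iteratively decoding the signal, previous signal estimates can be used to temporally remove contributions from other transmitters, which is a form of multiuser detection.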
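The channel estimate of Equation 43, the per-user form of Equations 51 and 53, the beamformer weights, and the approximate projector of Equation 56 can be sketched numerically as follows. This is a minimal illustration rather than the MCMUD implementation: NumPy is assumed to be available, the function names are hypothetical, and only the purely spatial case (no delay or Doppler taps) is shown.

```python
import numpy as np

def proj_perp(M):
    """Projector onto the orthogonal complement of the row space of M (Eq. 55)."""
    Mh = M.conj().T
    return np.eye(M.shape[1]) - Mh @ np.linalg.solve(M @ Mh, M)

def proj_perp_approx(M):
    """Row-by-row product approximation of Eq. 56 (exact for orthogonal rows)."""
    n = M.shape[1]
    P = np.eye(n, dtype=complex)
    for row in M:
        m = row[None, :]                                 # 1 x ns row vector
        P = P @ (np.eye(n) - m.conj().T @ m / (m @ m.conj().T).real)
    return P

def ml_channel_estimate(Z, T):
    """Joint estimate H = Z T^dagger (T T^dagger)^-1 (Eq. 43)."""
    Th = T.conj().T
    return Z @ Th @ np.linalg.inv(T @ Th)

def user_channel_estimate(Z, TA, TB):
    """Estimate for user A after projecting out the other users (Eq. 51)."""
    P = proj_perp(TB)
    TAh = TA.conj().T
    return Z @ P @ TAh @ np.linalg.inv(TA @ P @ TAh)

def beamformer(Z, TA, TB):
    """Weights w_A = R_X^-1 H_A with R_X = Z P Z^dagger / ns (Eq. 51)."""
    P = proj_perp(TB)
    R = Z @ P @ Z.conj().T / Z.shape[1]
    return np.linalg.solve(R, user_channel_estimate(Z, TA, TB))
```

With a known reference matrix `T`, the first column of `ml_channel_estimate(Z, T)` and `user_channel_estimate(Z, T[:1], T[1:])` agree to numerical precision, which is the identity derived in Equations 52 and 53.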


Training Data

In principle, there is no need for training data, because the channel and information can be estimated jointly. Furthermore, the use of training data competes directly with information bits. For reasonably stationary channels, the estimate for the previous frame can be employed as an initial estimate for the demodulator. However, for more quickly moving channels some training data is useful. Here, a small amount of training data is introduced within a frame (20%). This provides an initial channel estimate for the space-time-frequency adaptive beamformer.

In the experiment, knowledge of the encoded signal is used to provide that training data. Because the number of training samples is relatively small, it is useful to use a small number of temporal and frequency taps during the first iteration. Larger dimension space-time-frequency processing is possible by using estimates of the data.

Phenomenological Experiment

This section presents channel-complexity and channel-stationarity experimental results for MIMO systems. We introduce the experiments and then discuss channel mean attenuation and channel complexity. We then discuss the variation of MIMO channels as a function of time and as a function of frequency.

Experimental System

The employed experimental system is a slightly modified version of the system used previously at Lincoln Laboratory [3, 41]. The transmit array consists of up to eight arbitrary waveform transmitters. The transmitters can support up to a 2-MHz bandwidth. These transmitters can be used independently, as two groups of four coherent transmitters, or as a single coherent group of eight transmitters. The transmit systems can be deployed in the laboratory or in vehicles. When operating coherently as a multiantenna transmit system, the individual transmitters can send independent sequences by using a common local oscillator. Synchronization between transmitters and receiver and transmitter geolocation is provided by GPS receivers in the transmitters and receivers.

The Lincoln Laboratory array receiver system is a high-performance sixteen-channel receiver system that can operate over a range of 20 MHz to 2 GHz, supporting a bandwidth up to 8 MHz. The receiver can be deployed in the laboratory or in a stationary “bread truck.”

MIT Campus Experiment

The experiments were performed during July and August 2002 on and near the MIT campus in Cambridge, Massachusetts. These outdoor experiments were performed in a frequency allocation near the PCS band (1.79 GHz). The transmitters periodically emitted 1.7-sec bursts containing a combination of channel-probing and space-time-coding waveforms. A variety of coding and interference regimes were explored for both moving and stationary transmitters. The space-time-coding results are discussed later in the article [39, 40]. Channel-probing sequences using both four and eight transmitters were employed.

The receive antenna array was placed on top of a tall one-story building (at Brookline Street and Henry Street), surrounded by two- and three-story buildings. The transmit array was located on the top of a vehicle within two kilometers of the receive array. Different four- or eight-antenna subsets of the sixteen-channel receiver were used to improve statistical significance. The receive array had a total aperture of less than 8 m, arranged as three subapertures of less than 1.5 m each.

The channel-probing sequence supported a bandwidth of 1.3 MHz with a length of 1.7 msec repeated ten times. All four or eight transmitters emitted nearly orthogonal signals simultaneously.

Attenuation

Figure 4 displays the peak-normalized mean-squared SISO attenuation averaged over transmit and receive antenna pairs for a given transmit site for the outdoor environment. The uncertainty in the estimate is evaluated by using a bootstrap technique.

FIGURE 4. Scatter plot of the peak-normalized mean-squared single-input single-output (SISO) link attenuation $a^2$ versus link range for the outdoor environment near the PCS frequency allocation. The error bars indicate a range of plus or minus one standard deviation of the estimates at a given site.

Channel Complexity

We present channel complexity by using three different approaches: variation in $a^2$ estimates, eigenvalue cumulative distribution functions (CDF), and α estimate CDFs. Table 1 is a list of transmit sites used for these results. The table includes the distance (range) between transmitter and receiver, the velocity of the transmitter, the number of transmit antennas, and the estimated α for the transmit site. Uncertainty in α is determined by using the bootstrap technique [42]. The CDF values reported here are evaluated over appropriate entries from Table 1. The systematic uncertainty in the estimation of α caused by estimation bias, given the model, is less than 0.02.

Figure 5 displays CDFs of $a^2 = \mathrm{tr}\{H H^\dagger\}/(n_T n_R)$ estimates normalized by mean $a^2$ for each transmit site. CDFs are displayed for narrowband SISO, 4 × 4, and 8 × 8 MIMO systems. Because of the spatial diversity, the variation in mean antenna-pair received power decreases dramatically as the number of antenna pairs increases, as we would expect. This reduction in variation demonstrates one of the most important statistical effects that MIMO links exploit to improve communication link robustness. For example, if we wanted to

Table 1. List of Transmit Sites

Site  Location                      Range (m)  Velocity (m/sec)  Number of antennas  α
1     Henry and Hasting             150        0.0               8                   0.79 ± 0.01
2     Brookline and Erie            520        0.0               8                   0.80 ± 0.01
3     Boston University (BU)        430        0.0               8                   0.78 ± 0.01
4     BU at Storrow Drive           420        0.0               4                   0.72 ± 0.01
5     Glenwood and Pearl            250        10.0              4                   0.85 ± 0.01
6     Parking lot                   20         0.1               4                   0.78 ± 0.02
7     Waverly and Chestnut          270        0.2               4                   0.67 ± 0.02
8     Vassar and Amherst            470        0.7               4                   0.68 ± 0.02
9     Chestnut and Brookline        140        0.1               4                   0.70 ± 0.02
10    Harvard Bridge                1560       11.6              4                   0.69 ± 0.02
11    BU Bridge                     270        2.7               4                   0.83 ± 0.04
12    Vassar and Mass Ave           1070       7.6               4                   0.59 ± 0.01
13    Peters and Putnam             240        9.1               4                   0.87 ± 0.05
14    Glenwood and Pearl            250        5.2               4                   0.76 ± 0.02
15    Brookline and Pacific         780        7.2               4                   0.86 ± 0.03
16    Pearl and Erie                550        0.1               4                   0.71 ± 0.04
17    Storrow Drive and BU Bridge   410        9.2               4                   0.85 ± 0.03
18    Glenwood and Magazine         370        0.0               4                   0.78 ± 0.02


operate with a probability of 0.9 to close the link, we would have to operate the SISO link with an excess SISO SNR ($a^2 P_o$) margin of over 15 dB. The MIMO systems received the added benefit of array gain, which is not accounted for in the figure.

Figures 6 and 7 present CDFs of eigenvalues for 4 × 4 and 8 × 8 mean-squared-channel-matrix-element-normalized narrowband channel matrices, $\mathrm{eig}\{H H^\dagger\}$. The CDFs are evaluated over all site lists. Some care must be taken in interpreting these figures because eigenvalues are not independent. Nonetheless, the steepness of the CDFs is remarkable. We might interpret this to indicate that optimized space-time codes should operate with a relatively high probability of success.

Figure 8 shows the CDFs for α estimates. The mean values of α for each environment are 0.76 for 4 × 4 sites and 0.79 for 8 × 8 sites, where the form x ± y indicates the estimated value x with statistical uncertainty y estimated by using a bootstrap uncertainty estimation technique. While we might expect smaller variation in the 8 × 8 systems because of the much larger number of paths, this effect may have been exaggerated in Figure 8 because of the limited number of 8 × 8 sites available in the experiment.

Channel Stationarity

Figure 9 displays the temporal variation of eigenvalues of $H H^\dagger$ for stationary and moving transmitters. In this figure the normalization is fixed, allowing for overall shifts in attenuation. As we would expect, the eigenvalues of the moving transmitter vary significantly more than those of the stationary environment. However, the eigenvalues of the stationary transmitter do vary somewhat. While the transmitters and receivers are physically stationary, the environment does move. This effect is particularly noticeable near busy roads. Furthermore, while the multiple antennas are driven with the same local oscillator, given the commercial grade transmitters, there are always some small relative-frequency offsets. The example variation is given for transmit sites 7 and 14 from Table 1.

FIGURE 5. Cumulative distribution function (CDF) of channel $a^2$ estimates, normalized by the mean $a^2$ for each site, for SISO, 4 × 4, and 8 × 8 MIMO systems.

FIGURE 6. CDF of narrowband channel eigenvalue distributions for 4 × 4 MIMO systems.

FIGURE 7. CDF of narrowband channel eigenvalue distributions for 8 × 8 MIMO systems.

FIGURE 8. CDF of α estimates for 4 × 4 and 8 × 8 MIMO systems.
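The quantities plotted in Figures 5 through 7 can be illustrated with a short numerical sketch. It assumes NumPy and uses randomly generated complex Gaussian matrices as a hypothetical stand-in for the measured channel estimates, so the resulting curves will not match the experimental figures; the function names and the bootstrap resample count are likewise illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def mean_sq_attenuation(H):
    """a^2 = tr{H H^dagger} / (n_T n_R)."""
    n_r, n_t = H.shape
    return np.trace(H @ H.conj().T).real / (n_t * n_r)

def normalized_eigenvalues_db(H):
    """Eigenvalues of H H^dagger (dB) after scaling H to unit mean-squared element."""
    Hn = H / np.sqrt(mean_sq_attenuation(H))
    eig = np.linalg.eigvalsh(Hn @ Hn.conj().T)
    return 10.0 * np.log10(np.maximum(eig, 1e-300))  # guard against rounding below zero

def empirical_cdf(samples):
    """Sorted sample values and the cumulative probability at each value."""
    x = np.sort(np.asarray(samples).ravel())
    p = np.arange(1, x.size + 1) / x.size
    return x, p

def bootstrap_std(values, statistic=np.mean, n_resamples=1000):
    """Bootstrap estimate of the standard deviation of a statistic."""
    values = np.asarray(values)
    idx = rng.integers(0, values.size, size=(n_resamples, values.size))
    return np.std([statistic(values[i]) for i in idx], ddof=1)

# Hypothetical data: 200 synthetic 4 x 4 channels with unit-variance entries.
shape = (200, 4, 4)
channels = (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

eig_db = np.concatenate([normalized_eigenvalues_db(H) for H in channels])
x, p = empirical_cdf(eig_db)          # eigenvalue CDF, in the style of Figure 6

a2 = np.array([mean_sq_attenuation(H) for H in channels])
a2_unc = bootstrap_std(a2)            # bootstrap uncertainty of the mean a^2
print(f"mean a^2 = {a2.mean():.3f} +/- {a2_unc:.3f}")
```

Because each matrix is normalized before the eigenvalues are taken, the eigenvalues of a 4 × 4 channel always sum to 16 in linear units, which is why only their distribution, not their overall level, carries information in this kind of plot.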


While the moving-transmitter eigenvalues fluctuate more than those of the stationary transmitter, the values are remarkably stable in time. Conversely, an example of the time variation of the power-weighted mean $\cos^2\theta$ metric (from Equation 36), displayed in Figure 10, varies significantly for the moving transmitter within 10 msec. This variation indicates that the eigenvector structure varies significantly, while the distribution of eigenvalues tends to be more stable. In the example, the stationary transmitter is located at site 7, and the moving transmitter is located at site 14. Over the same period the stationary transmitter is relatively stable. Figures 11 and 12 display CDFs for stationary and moving transmitters. The significant variation of the moving transmitter is an indication that implementing an informed transmitter MIMO system would be very challenging for the moving transmitter, but might be viable for some stationary MIMO systems.

FIGURE 9. Eigenvalues (λ) of $H H^\dagger$ as a function of time for (a) stationary and (b) moving transmitters. The same overall attenuation, estimated at t = 0, is used for all time samples.

FIGURE 10. Example time variation of power-weighted mean $\cos^2\theta$, $\gamma\{H(t_0), H(t)\}$, for stationary and moving 4 × 4 MIMO systems.

FIGURE 11. CDF of time variation of power-weighted mean $\cos^2\theta$, $\gamma\{H(t_0), H(t)\}$, for a stationary 4 × 4 MIMO system. The graph displays contours of CDF probabilities of 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, and 0.9. Because there is little variation, all curves are compressed near a γ value of 1.0.

FIGURE 12. CDF of time variation of power-weighted mean $\cos^2\theta$, $\gamma\{H(t_0), H(t)\}$, for a moving 4 × 4 MIMO system. Contours of CDF probabilities of 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, and 0.9 are displayed.

Frequency-Selective Fading

Figure 13 gives an example of the frequency variation of the power-weighted mean $\cos^2\theta$. The variation is indicated by using the metric presented in Equation 36. In the example, the stationary transmitter is located at site


7. Relatively small frequency offsets induce significant changes in $\gamma\{H(f_0), H(f)\}$. Figure 14 shows the CDF of the frequency-selective channel variation. This sensitivity indicates that there is significant resolved delay spread and that, to safely operate with the narrowband assumption, bandwidths less than 100 kHz should be employed. We note that delay spread, and the resulting frequency-selective fading, are both a function of environment and link length. Consequently, some care must be taken in interpreting this result.

FIGURE 13. Example of frequency-selective variation of the power-weighted mean $\cos^2\theta$, $\gamma\{H(f_0), H(f)\}$.

FIGURE 14. CDF of frequency-selective variation of the power-weighted mean $\cos^2\theta$, $\gamma\{H(f_0), H(f)\}$. The graph displays contours of CDF probabilities of 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, and 0.9.

Space-Time Low-Density Parity-Check-Code Experiments

A low-density parity-check code over GF(16) provides the basis of the example of experimental and simulated results shown in Figure 15. The code used is half rate with length 1024. The MIMO wireless link is realized with four cohered transmitters located on a stationary van several hundred meters away from an array of receivers situated on a one-story building. The environment consists predominantly of two- and three-story residential buildings and some commercial buildings of similar heights in an urban setting. Propagation delay spreads are typically several microseconds and Doppler spreads are at most a few hundred hertz. There is typically no identifiable line-of-sight component in the propagation. The signal has a pulse-shaped QPSK modulation and bandwidth of about 100 kHz. Coding provides a spectral efficiency of 2 bits/sec/Hz.

FIGURE 15. Measured and simulated results in bit error rate probability for a space-time low-density parity-check (LDPC) code at a spectral efficiency of 2 bits/sec/Hz. Bit error rates are evaluated for an ensemble of 4 × 4 MIMO systems. The estimated channel matrices are used in the simulation to model propagation. Each estimated channel matrix supports a theoretical capacity that can be in excess of 2 bits/sec/Hz. The matrix is scaled until it supports a capacity of exactly 2 bits/sec/Hz. The resulting scale factor is used to evaluate the excess (beyond Shannon) $E_b/N_0$ associated with the (unscaled) channel matrix. Agreement between measured and simulated results is good to about 1 dB. About 4 to 5 dB excess $E_b/N_0$ is required to reliably complete the link.

The receiver consists of sixteen channels fed by low-gain elements with wide azimuth beamshapes. The elements are oriented in various directions, not necessarily pointing at the sources. For the example below, four element subarrays are chosen at random to provide multichannel receivers. In other words, C(16, 4) 4 × 4


MIMO links can be evaluated. The channel transfer matrices have a random structure that varies from subarray to subarray.

Figure 15 shows symbol-error probability as a function of excess $E_b/N_0$, which is related to the excess spectral efficiency (beyond 2 bits/sec/Hz) predicted by a capacity bound, given the measured channel transfer matrix. For this example, in a comparatively stationary environment, the channel transfer matrices are used both for the simulated results and for the computation of excess $E_b/N_0$. As the figure shows, about 4.5 dB excess $E_b/N_0$ is required to complete the link at 2 bits/sec/Hz. Simulations agree to within about 1 dB.

Space-Time Turbo-Code Experiments

In this section we present the experimental performance of a space-time turbo code. We begin by discussing the experimental parameters, and then we summarize the performance of the MIMO system with stationary transmitter and receiver in a dynamic environment. Additionally, for an even more complicated environment, we describe performance results with a mobile transmitter and multiple strong interferers.

Experimental Parameters

Outdoor experiments were performed in a frequency allocation near the PCS band, using a sixteen-channel receiver. A variety of coding and interference regimes were explored for both moving and stationary transmitters. Channel-probing sequences and four- and eight-transmitter space-time codes were transmitted. This section reports on the outdoor performance results of space-time turbo codes for 4 × 4 MIMO systems. The outdoor experiments were performed during July and August 2002 on and near the MIT campus. The receive antenna array was placed on top of a one-story building (at Brookline Street and Henry Street) surrounded by two- and three-story buildings.

For the examples discussed in this article, quadrature-phase-shift-key (QPSK) signals were transmitted on four antennas at $123 \times 10^3$ chips per second for a total data rate of 246 kb/sec, using the space-time code discussed earlier. A 160-kHz spectral limit was enforced by using a root-raised-cosine pulse shaping. Total transmit power was approximately 100 mW, radiated from 0-dBi antennas. We discuss two examples with different transmit locations. In both examples the link does not have line of sight. Different four-antenna subsets of the sixteen-channel receiver were used to improve statistical significance.

Example 1. In this example, the transmitter was located in the parking lot at Boston University (University Road and Storrow Drive) with about a half-kilometer separation between the transmitter and receiver. Figure 16 shows the geometry of the experiment. Traffic on Storrow Drive is typically heavy and the posted speed limit of 45 mph is generally misinterpreted as the minimum allowed speed. While the transmitter is stationary, the environment is nonstationary because of the traffic.

FIGURE 16. Example 1: map of MIMO communication experiment near the MIT campus, including the locations of the transmitter and receiver.

Example 2. The transmitter was moving at 10 m/sec at a range of 500 m. Figure 17 shows the geometry of the experiment. To simulate the effects of local oscillator errors, we introduced artificial frequency offsets at the transmitters. These errors were within ±80 Hz.

Two wideband jammers were transmitting at a range of 100 m. Each jammer was received at a jammer-to-noise ratio (JNR) of approximately 25 dB. Figure 18 shows the eigenvalues of the noise-normalized interference-plus-noise spatial covariance matrix. The “noise” eigenvalues of the jammer spatial eigenvalue distribution


are slightly higher than we would naively expect, given the 0-dB noise normalization. This behavior is probably an indication of either delay spread or nonstationarity in the received jammer signal. Either of these explanations presents additional challenges to the receiver.

Both the delay and the Doppler spread affect the design and performance of the receiver. Here a space-time-frequency adaptive processor is employed. The number of delay and frequency taps in the adaptive processor depends upon the phenomenology. Delay spread was found to be less than ±4 µsec. For the stationary environment, in quiet regions (no nearby traffic), no Doppler spread was detected. For the stationary transmitter near heavy traffic in experimental example 1, the Doppler spread was found to be within ±150 Hz. For the moving transmitter in experimental example 2, the Doppler spread was found to be within ±180 Hz.

FIGURE 17. Example 2: map of MIMO communication experiment near the MIT campus, including the locations of the transmitter, receiver, and jammers at a fixed jammer-to-noise ratio (JNR).

FIGURE 18. Eigenvalue distribution of the noise-normalized interference-plus-noise spatial covariance matrix.

Experimental Example 1

Bit error rates for various detector alternatives are reported as a function of mean single-input single-output (SISO) SNR, $a^2 P_o$. Here, $a^2$ is the mean-squared link attenuation. Figure 19 displays the results for four detection variations:

1. Training-data-based adaptive spatial beamforming (three turbo iterations).
2. Coarse training-data-based space-frequency beamforming (one turbo iteration; Doppler taps: {–1, 0, 1}).
3. Space-time-frequency beamforming employing decision-directed channel estimation without multiuser detection (three turbo iterations; Doppler taps: {–4/3, –2/3, 0, 2/3, 4/3}; temporal taps: {–1/2, 0, 1/2}).
4. MCMUD with space-time-frequency beamforming (three turbo iterations; Doppler taps: {–4/3, –2/3, 0, 2/3, 4/3}; temporal taps: {–1/2, 0, 1/2}).

Doppler taps are represented in resolution cells (60 Hz).

FIGURE 19. Bit error rate of 4 × 4, 2-bit/sec/Hz space-time turbo code as a function of mean SISO signal-to-noise ratio (SNR) ($a^2 P_o$) for a Boston University transmit location, using adaptive spatial beamforming, coarse training-data-based space-frequency adaptive beamforming (SFAP), space-time-frequency adaptive beamforming (STFAP) employing decision-directed channel estimation, and MCMUD with space-time-frequency adaptive beamforming.

Performance improves with receiver complexity; the algorithm, however, must bootstrap up in complexity iteratively. Starting with the highest complexity on the first iteration increases the probability of converging to the wrong solution. Because the channel contains significant Doppler spread, the spatial beamformer performs poorly. With the relatively long block lengths of the turbo code, Doppler beamforming is required in this environment. We note that experimental performance is essentially the same as was found in simulations. Furthermore, the experimental performance is


similar to the simulated performance of the best space-time codes.

Experimental Example 2

The experimental data includes the effects of two high-power wideband jammers, a moving transmitter, and local oscillator errors. Experimental performance of this space-time turbo code for a stationary transmitter in the absence of interference is discussed elsewhere [39].

Figure 20 shows the bit error rate of the space-time turbo code using the MCMUD receiver. The bit error rate is displayed in terms of the mean SISO signal-to-interference-plus-noise ratio (SINR). This is the average SINR at a given receive antenna, assuming that all power of the transmit array is transmitted from a single transmit antenna. We note that this experimental system in this difficult environment operates at an SINR that is 25 dB better than the information-theoretic SISO bound, and operates probably at least 35 dB better than a practical SISO system. Furthermore, there is only approximately a 3-dB loss in $a^2 P_o$ performance compared to the performance in an environment without jammers.

FIGURE 20. Bit error rate of 4 × 4, 2-bit/sec/Hz space-time turbo code using the MCMUD receiver with space-time-frequency beamforming as a function of mean SISO signal-to-interference-plus-noise ratio (SINR).

The effectiveness of the receiver is due in part to its ability to compensate for delay and frequency spread. The MCMUD employs a space-time-frequency adaptive processor that uses a four-antenna receiver with temporal and frequency taps that cover a range of ±4 microseconds and ±200 Hz.

Summary

In this article we addressed information-theoretic, phenomenological, coding, and receiver issues for MIMO communication. Performance bounds assuming either an informed transmitter or an uninformed transmitter were presented for flat-fading, frequency-selective, and jammed environments. A channel phenomenology parameterization was introduced. Experimental phenomenological results were reported, the results indicating that the observed channels can be typically characterized by high degrees of complexity. Furthermore, for environments with transmitters on moving vehicles, the channel varies significantly on a time scale less than 10 msec. Two space-time coding techniques were introduced, one based on LDPC and the other on turbo codes. Experimental demodulation performance results were presented for a variety of environments, including those with wideband jammers. In the presence of the jammer, the MIMO system (using the MCMUD receiver) operated dramatically better than SISO systems.

Acknowledgments

The authors would like to thank Peter Wu of Lincoln Laboratory for his help developing the space-time turbo code, and Naveen Sunkavally of MIT and Nick Chang of the University of Michigan for their help with the experiment. The authors would also like to thank the excellent Lincoln Laboratory staff involved in the MIMO experiment, in particular Sean Tobin, Jeff Nowak, Lee Duter, John Mann, Bob Downing, Peter Priestner, Bob Devine, Tony Tavilla, and Andy McKellips. We also thank Ali Yegulalp of Lincoln Laboratory and Vahid Tarokh of Harvard University for their thoughtful comments, and Dorothy Ryan of Lincoln Laboratory for her helpful comments. Finally, the authors would like to thank the MIT New Technology Initiative Committee for their support.


REFERENCES

1. W.C. Jakes, Microwave Mobile Communications (Wiley, New York, 1974).
2. R.A. Monzingo and T.W. Miller, Introduction to Adaptive Arrays (Wiley, New York, 1980).
3. K.W. Forsythe, D.W. Bliss, and C.M. Keller, “Multichannel Adaptive Beamforming and Interference Mitigation in Multiuser CDMA Systems,” Thirty-Third Asilomar Conf. on Signals, Systems & Computers 1, Pacific Grove, Calif., 24–27 Oct. 1999, pp. 506–510.
4. A. Wittneben, “Basestation Modulation Diversity for Digital SIMULCAST,” Proc. IEEE Vehicular Technology Conf., St. Louis, Mo., 19–22 May 1991, pp. 848–853.
5. V. Weerackody, “Diversity for Direct-Sequence Spread Spectrum Using Multiple Transmit Antennas,” Proc. IEEE Int. Communications Conf. 3, Geneva, 23–26 May 1993, pp. 1775–1779.
6. G.J. Foschini, “Layered Space-Time Architecture for Wireless Communication in a Fading Environment When Using Multi-Element Antennas,” Bell Labs Tech. J. 1 (2), 1996, pp. 41–59.
7. I.E. Telatar, “Capacity of Multi-Antenna Gaussian Channels,” Eur. Trans. Telecommun. 10 (6), 1999, pp. 585–595.
8. D.W. Bliss, K.W. Forsythe, A.O. Hero, and A.L. Swindlehurst, “MIMO Environmental Capacity Sensitivity,” Thirty-Fourth Asilomar Conf. on Signals, Systems & Computers 1, Pacific Grove, Calif., 29 Oct.–1 Nov. 2000, pp. 764–768.
9. D.W. Bliss, K.W. Forsythe, and A.F. Yegulalp, “MIMO Communication Capacity Using Infinite Dimension Random Matrix Eigenvalue Distributions,” Thirty-Fifth Asilomar Conf. on Signals, Systems & Computers 2, Pacific Grove, Calif., 4–7 Nov. 2001, pp. 969–974.
10. T.L. Marzetta and B.M. Hochwald, “Capacity of a Mobile Multiple-Antenna Communication Link in Rayleigh Flat Fading,” IEEE Trans. Inf. Theory 45 (1), 1999, pp. 139–157.
11. L. Zheng and D.N.C. Tse, “Diversity and Freedom: A Fundamental Tradeoff in Multiple-Antenna Channels,” IEEE Trans. Inf. Theory 49 (9), 2003, pp. 1076–1093.
12. S.M. Alamouti, “A Simple Transmit Diversity Technique for Wireless Communications,” IEEE J. Sel. Areas Commun. 16 (8), 1998, pp. 1451–1458.
13. V. Tarokh, H. Jafarkhani, and A.R. Calderbank, “Space–Time Block Codes from Orthogonal Designs,” IEEE Trans. Inf. Theory 45 (5), 1999, pp. 1456–1467.
14. G. Ganesan and P. Stoica, “Space–Time Block Codes: A Maximum SNR Approach,” IEEE Trans. Inf. Theory 47 (4), 2001, pp. 1650–1656.
15. B. Hassibi and B. Hochwald, “High-Rate Linear Space-Time Codes,” Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing 4, Salt Lake City, Utah, 7–11 May 2001, pp. 2461–2464.
16. V. Tarokh, N. Seshadri, and A.R. Calderbank, “Space–Time Codes for High Data Rate Wireless Communication: Performance Criterion and Code Construction,” IEEE Trans. Inf. Theory 44 (2), 1998, pp. 744–765.
17. B.M. Hochwald and T.L. Marzetta, “Unitary Space–Time Modulation for Multiple-Antenna Communications in Rayleigh Flat Fading,” IEEE Trans. Inf. Theory 46 (2), 2000, pp. 543–564.
18. B.M. Hochwald and W. Sweldens, “Differential Unitary Space-Time Modulation,” IEEE Trans. Com. 48 (12), 2000, pp. 2041–2052.
19. K.W. Forsythe, “Capacity of Flat-Fading Channels Associated with a Subspace-Invariant Detector,” 34th Asilomar Conf. on Signals, Systems and Computers 1, Pacific Grove, Calif., 29 Oct.–1 Nov. 2000, pp. 411–416.
20. K.W. Forsythe, “Performance of Space-Time Codes over a Flat-Fading Channel Using a Subspace-Invariant Detector,” 36th Asilomar Conf. on Signals, Systems and Computers 1, Pacific Grove, Calif., 3–6 Nov. 2002, pp. 750–755.
21. A. Stefanov and T.M. Duman, “Turbo Coded Modulation for Wireless Communications with Antenna Diversity,” Proc. IEEE Vehicular Technology Conf. 3, Amsterdam, 19–22 Sept. 1999, pp. 1565–1569.
22. Y. Liu, M.P. Fitz, and O.Y. Takeshita, “Full Rate Space–Time Turbo Codes,” IEEE J. Sel. Areas Commun. 19 (5), 2001, pp. 969–980.
23. H. Sampath and A.J. Paulraj, “Joint Transmit and Receive Optimization for High Data Rate Wireless Communication Using Multiple Antennas,” Conf. Record Thirty-Third Asilomar Conf. on Signals, Systems & Computers 1, Pacific Grove, Calif., 24–27 Oct. 1999, pp. 215–219.
24. N. Sharma and E. Geraniotis, “Analyzing the Performance of the Space-Time Block Codes with Partial Channel State Feedback,” Proc. Wireless Communications and Networking Conf. 3, Chicago, 23–28 Sept. 2000, pp. 1362–1366.
25. D.W. Bliss, K.W. Forsythe, A.O. Hero, and A.F. Yegulalp, “Environmental Issues for MIMO Capacity,” IEEE Trans. Signal Process. 50 (9), 2002, pp. 2128–2142.
26. T.M. Cover and J.A. Thomas, Elements of Information Theory (Wiley, New York, 1991).
27. F.R. Farrokhi, G.J. Foschini, A. Lozano, and R.A. Valenzuela, “Link-Optimal Space–Time Processing with Multiple Transmit and Receive Antennas,” IEEE Comm. Lett. 5 (3), 2001, pp. 85–87.
28. D.W. Bliss, A.M. Chan, and N.B. Chang, “MIMO Wireless Communication Channel Phenomenology,” IEEE Trans. Antennas Propag. 52 (8), 2004, pp. 2073–2082.
29. D. Gesbert, H. Bölcskei, D.A. Gore, and A.J. Paulraj, “Performance Evaluation for Scattering MIMO Channel Models,” Thirty-Fourth Asilomar Conf. on Signals, Systems & Computers 1, Pacific Grove, Calif., 29 Oct.–1 Nov. 2001, pp. 748–752.
30. H. Bölcskei and A.J. Paulraj, “Performance of Space-Time Codes in the Presence of Spatial Fading Correlation,” Thirty-Fourth Asilomar Conference on Signals, Systems & Computers 1, Pacific Grove, Calif., 29 Oct.–1 Nov. 2000, pp. 687–693.
31. R.G. Gallager, Low-Density Parity-Check Codes (MIT Press, Cambridge, Mass., 1963).
32. S.-Y. Chung, G.D. Forney, Jr., T.J. Richardson, and R. Urbanke, “On the Design of Low-Density Parity-Check Codes within 0.0045 dB of the Shannon Limit,” IEEE Commun. Lett. 5 (2), 2001, pp. 58–60.
33. T.J. Richardson, M.A. Shokrollahi, and R.L. Urbanke, “Design of Capacity-Approaching Irregular Low-Density Parity-Check Codes,” IEEE Trans. Inf. Theory 47 (2), 2001, pp. 619–637.
34. R.J. McEliece, D.J.C. MacKay, and J.-F. Cheng, “Turbo Decoding as an Instance of Pearl’s ‘Belief Propagation’ Algorithm,” IEEE J. Sel. Areas Commun. 16 (2), 1998, pp. 140–152.
35. C. Berrou, A. Glavieux, and P. Thitimajshima, “Near Shannon Limit Error-Correcting Coding and Decoding: Turbo-Codes,” Proc. IEEE Int. Communications Conf. 2, Geneva, 23–26 May 1993, pp. 1064–1070.
36. L.R. Bahl, J. Cocke, F. Jelinek, and J. Raviv, “Optimal Decoding of Linear Codes for Minimizing Symbol Error Rate,” IEEE Trans. Inf. Theory 20 (2), 1974, pp. 284–287.
37. P. Robertson, E. Villebrun, and P. Hoeher, “Comparison of Optimal and Sub-Optimal MAP Decoding Algorithms Operating in the Log Domain,” Proc. IEEE Int. Communications Conf. 2, Seattle, 18–22 June 1995, pp. 1009–1013.
38. P.H.-Y. Wu and S.M. Pisuk, “Implementation of a Low Complexity, Low Power, Integer-Based Turbo Decoder,” Proc. IEEE Global Telecommunications Conf., San Antonio, Tex., 25–29 Nov. 2001, pp. 946–951.
39. D.W. Bliss, P.H. Wu, and A.M. Chan, “Multichannel Multiuser Detection of Space-Time Turbo Codes: Experimental Performance Results,” Thirty-Sixth Asilomar Conference on Signals, Systems & Computers 2, Pacific Grove, Calif., 3–6 Nov. 2002, pp. 1343–1348.
40. D.W. Bliss, “Robust MIMO Wireless Communication in the Presence of Interference Using Ad Hoc Arrays,” MILCOM 2003, 2, Oct. 2003, pp. 1382–1385.
41. C.M. Keller and D.W. Bliss, “Cellular and PCS Propagation Measurements and Statistical Models for Urban Multipath on an Antenna Array,” Proc. 2000 IEEE Sensor Array and Multichannel Signal Processing Workshop, Cambridge, Mass., 16–17 Mar. 2000, pp. 32–36.
42. B. Efron, The Jackknife, the Bootstrap and Other Resampling Plans (Society for Industrial and Applied Mathematics, Philadelphia, 1982).
43. K.W. Forsythe, “Utilizing Waveform Features for Adaptive Beamforming and Direction Finding with Narrowband Signals,” Linc. Lab. J. 10 (2), 1997, pp. 99–126.
44. J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference (Morgan Kaufmann, San Mateo, Calif., 1988).


 .   .   .  is a a staff member in the is a senior staff member in the is an associate staff member Advanced Sensor Techniques Advanced Sensor Techniques in the Advanced Sensor Tech- group. He received M.S. and group. He received S.B. and niques group. She received Ph.D. degrees in physics from S.M. degrees, both in math- MSEE and BSEE degrees in the University of California ematics, from MIT. In 1978 electrical engineering from at San Diego, and a B.S.E.E. he joined Lincoln Laboratory, the University of Michigan. in electrical engineering from where he has worked in the Her interests are in chan- Arizona State University. areas of spread-spectrum com- nel phenomenology. She Previously employed by Gen- munication, adaptive sensor- has previously worked with eral Dynamics, he designed array processing, and syn- implementation of synthetic avionics for the Atlas-Centaur thetic aperture radar (SAR) aperture geolocation of cel- launch vehicle, and performed imaging. His work on spread- lular phones. Most recently, research and development spectrum systems includes she has worked on the imple- of fault-tolerant avionics. As electromagnetic modeling mentation of MIMO channel a member of the supercon- (geometric theory of diffrac- parameterization. ducting magnet group, he tion) of antennas mounted on performed magnetic field an airframe, error-correction calculations and optimization coding, jam-resistant synchro- for high-energy particle-accel- nization techniques, and digi- erator superconducting mag- tal matched-filter design and nets. His doctoral work was in performance. 
In the area of high-energy particle physics, adaptive sensor-array process- searching for bound states of ing he helped develop a num- gluons, studying the two-pho- ber of signal-processing algo- ton production of hadronic rithms that exploit waveform final states, and investigating features to achieve levels of innovative techniques for performance (beamforming, lattice-gauge-theory calcula- direction finding, geolocation, tions. At Lincoln Laboratory and other forms of parameter he focuses on multiantenna estimation) beyond those adaptive signal processing, attainable by nonexploitive primarily for communication techniques. His work on SAR systems, and on parameter imaging involves techniques estimation techniques and for resolution enhancement bounds, primarily for geolo- and interference rejection for cation. His current research foliage-penetration systems. includes ultrawide bandwidth communication, geolocation techniques using vector sensor arrays, multiple-input multiple-output (MIMO) radar concepts, algorithm development for multichannel multiuser detectors, MIMO communication channel phenomenology, and information theoretic bounds on MIMO communication systems.

126 LINCOLN LABORATORY JOURNAL VOLUME 15, NUMBER 1, 2005