[ Dirk Wübben, Dominik Seethaler, Joakim Jaldén, and Gerald Matz]

[A survey with applications in wireless communications]

attice reduction is a powerful concept for solving diverse problems involving point lattices. Signal processing applications where lat- tice reduction has been successfully used include global positioning system (GPS), frequency estimation, color space estimation in JPEG pictures, and particularly data detection and precoding in Lwireless communication systems. In this article, we first provide some back- ground on point lattices and then give a tutorial-style introduction to the theoretical and practical aspects of reduction. We describe the most important lattice reduction algorithms and comment on their performance and computational complexity. Finally, we discuss the application of lattice reduction in wireless communications and statistical signal processing. Throughout the article, we point out open problems and interesting ques- tions for future research.

INTRODUCTION Lattices are periodic arrangements of discrete points (see “Lattices”). Apart from their widespread use in pure mathematics, lattices have found applica- tions in numerous other fields as diverse as cryptography/cryptanalysis, the geometry of numbers, factorization of integer polynomials, subset sum and knapsack problems, integer relations and diophantine approximations, the spigot algorithm for p, materials science and solid-state physics (specifically crystallography), and coding theory. Recently, lattices have also been applied in a variety of signal processing problems.

Digital Object Identifier 10.1109/MSP.2010.938758 Date of publication: 19 April 2011 © DIGITAL VISION

IEEE SIGNAL PROCESSING MAGAZINE [70] MAY 2011 1053-5888/11/$26.00©2011IEEE LATTICES m A (point) lattice L is a periodic arrangement of discrete 5 u #u, x a , b,,0 , 1. points. Two-dimensional examples are the square, rhombic, ,51 and hexagonal lattices shown in Figure S1. The Voronoi region V1L2 of a lattice is the set of points that are closer to the origin than to any other lattice point. The Voronoi region is a lattice invariant, i.e., it does not depend on a specific Square Lattice Rhombic Lattice Hexagonal Lattice lattice . Translating either the fundamental parallelotope or the Voronoi region to all lattice points induces a tessellation of Euclidean space. Illustrations of the parallelotope (blue) and the Voronoi region (red) and the associated tessellations for our two-dimensional example lattices are shown in Figure S2.

Square Lattice Rhombic Lattice Hexagonal Lattice

[FIGS1] Two-dimensional examples of the square, rhombic, and hexagonal lattices.

Any lattice can be characterized in terms of a (nonunique) 5 1 c 2 basis B b1 bm that allows any lattice point to be rep- resented as a superposition of integer multiples of the basis

vectors b,, i.e., any x [ L can be written as

m 5 [ Z x a z, b,, z, . ,51

If x [ L and x [ L, then kx 1 ,x [ L for any k, , [ Z. 1 2 1 2 [FIGS2] Illustrations of the parallelotope and the Voronoi To any lattice basis, there is an associated fundamental paral- region and the associated tessellations of two-dimensional lelotope P1B2 , i.e., the set of points that can be written as example lattices.

Lattices are used to develop powerful source and channel optimization problems with integer variables). An example is codes for many communications applications, specifically in provided in [6], where an integer least squares problem—which scenarios with multiple terminals or with side-information is NP-hard—is solved in an approximate but efficient manner (e.g., [1]). Different from that strand of work, our focus in this using lattice reduction techniques. While the application con- article is on the principle of lattice reduction. Lattice reduc- text in that paper was GPS, the authors mention radar imaging tion is concerned with finding improved representations of a and magnetic resonance imaging as other fifieldselds wwherehere given lattice using algorithms like Lenstra, Lenstra, Lovász their results could have impact. Another signalnal pro- (LLL) reduction [2] or Seysen reduction [3]. It is a topic of cessing application of lattice reduction is colorlor great interest, both as a theoretical tool and as a practical space estimation in JPEG images [7]. Here, technique. Since we feel that lattice reduction may be relevant the quantized discrete cosine transform to a much wider class of engineering problems than currently coefficients in the three color planes are considered, this survey article targets a broader signal process- viewed as lattice points in three-dimen- ing audience. We give a tutorial-style introduction to lattices sional (3-D) space, with the 3-D lattice and lattice reduction algorithms, discuss the application of lat- being determined by the color space trans- tice reduction in wireless communications and parameter esti- formation matrix. Building on the analogy too mation, and as a by-product provide convenient entry points simultaneous Diophantine approximations, [8][8] and [9][9] to the literature on the topic. Wherever appropriate, we also use lattice reduction techniques for determining the intercept of point out possible topics for future research. multiple periodic pulse trains and for estimating their parame- Lattice reduction plays an important role in the above-men- ters. A somewhat similar approach was used to develop a lattice- tioned fields as it leads to efficient solutions for several classical reduction-based implementation of the maximum likelihood problems in lattice theory. For example, lattice reduction is inti- (ML) estimator for the frequency of a sinusoid in noise [10]. mately linked to the search for the shortest vector in a lattice, Lattice reduction has further been applied to various problems which in turn is related to the development of the now famous in wireless communications, e.g., to the equalization of frequen- sphere decoding algorithm (e.g., [4] and [5]). With regard to sig- cy-selective channels [11], to receiver design for unitarily pre- nal processing applications, lattice reduction is potentially use- coded quadrature amplitude modulation (QAM) transmission ful in problems that involve (i.e., over flat fading channels [12], to joint data detection and

IEEE SIGNAL PROCESSING MAGAZINE [71] MAY 2011 MIMO WIRELESS MIMO systems use multiple antennas at both link ends and Reduction for Data Detection,” the receive antennas cooper- offer tremendous performance gains without requiring addi- ate and channel state information is available at the receive tional bandwidth or transmit power. Two important MIMO side. In contrast, the MIMO precoding scheme considered in gains are the multiplexing gain, which corresponds to an the section “Lattice Reduction for Precoding” requires chan- increase of the data rate, and the diversity gain, which corre- nel state information at the transmit side and allows for dis- sponds to an increase of the transmission reliability. tributed noncooperative receive antennas. A useful MIMO performance figure is the (spatial) diversi- ty order, defined as the asymptotic slope of the error proba- 1 2 Transmit SideChannel Receive Side bility Pe SNR as a function of signal-to-noise ratio (SNR) on a log-log scale (see Figure S4). The diversity order depends on h1,1 the statistics of the MIMO channel and on the transceiver s 1 hk,1 y1 scheme. For spatial multiplexing the maximum diversity order h (“full diversity”) equals N, whereas for MIMO precoding the h1,O 1,M hk,O maximum diversity order equals M. sO yk hN,O hN,1 s h y M k,M N 100 hN,M 10 dB 10−1 One Decade 10−2 [FIGS3] MIMO system using M transmit antennas and N Two (SNR) −3 receive antennas. e 10

P Decades 10−4 In this article, we consider MIMO systems with M transmit 10 dB 10−5 antennas that transmit parallel data streams s,, , 5 1, c, M. 0102030 At the receive side, N receive antennas pick up mixtures of the SNR 5 M 1 transmit signals, i.e., yk g ,51 hk,, s, wk, where wk denotes additive Gaussian noise (Figure S3). In the MIMO spa- [FIGS4] Illustration of MIMO systems with diversity order tial multiplexing system considered in the section “Lattice 1 (blue) and with diversity order 2 (red).

channel estimation in quasi-synchronous code division multiple the reduced basis offers advantages with respect to performance access (CDMA) [13], and to equalization in precoded orthogonal and complexity. For example, it was shown recently that in frequency division multiplexing (OFDM) systems [14]. some scenarios even suboptimum detection/precoding Recently (see, e.g., [15] and [16]), lattice reduction (and lat- techniques can achieve full diversity (see “MIMO Wireless”) tice theory in general) turned out to be extremely useful for when preceded by LLL lattice reduction (e.g., [18]–[23]). detection and precoding in wireless multiple-input multiple-out- While the multitude of applications illustrates the theoreti- put (MIMO) systems, i.e., systems that use multiple antennas at cal significance of lattice reduction, its practical importance is the transmit and receive side [17] (see “MIMO Wireless” for the corroborated by the fact that very-large-scale integration (VLSI) basic notions of MIMO systems). The fundamental idea here is to implementations of lattice reduction algorithms for MIMO sys- exploit the discrete nature of the digital data and view the chan- tems have recently been presented [24]–[28]. These hardware nel matrix that acts on the data as a basis (generator) of a point implementations were motivated by the fact that two core prob- lattice. Via this interpretation, detection and precoding problems lems in the practical realization of MIMO systems are data can be tackled using tools from lattice theory. Typically, a three- detection at the receive side and broadcast precoding at the stage procedure is pursued: transmit side, and that several groups have proposed to use lat- 1) An improved basis for the lattice induced by the channel is tice reduction algorithms to solve these problems efficiently. In determined via lattice reduction. The original basis and the fact, it turned out that lattice reduction approaches have the reduced basis are related via a unimodular matrix. potential to achieve high performance at low computational 2) The detection/precoding problem is solved with respect to complexity. The hardware implementations of lattice reduction the reduced basis. algorithms will be applicable to WiMAX, WiFi, and 3GPP/3GPP2 3) The solution is transformed back to the original domain systems, which constitute an economically significant market. using the unimodular matrix. Since the reduced basis has nicer mathematical properties POINT LATTICES (e.g., smaller orthogonality defect and smaller condition num- In this section, we discuss the basic concepts relating to ber), solving detection and precoding problems with respect to point lattices in a real Euclidean space, we provide a simple

IEEE SIGNAL PROCESSING MAGAZINE [72] MAY 2011 motivating example for the use LATTICES ARE PERIODIC Matrices satisfying these prop- of lattice reduction in MIMO erties are called unimodular ARRANGEMENTS OF DISCRETE POINTS. wireless systems, and we dis- and form a group under matrix cuss briefly the necessary modi- multiplication. fications for complex-valued lattices. Further details on the Any lattice basis also describes a fundamental parallelotope theory of lattices can be found in [4] and [29]–[33]. according to

m P1 2 ! 5 u #u , REAL-VALUED LATTICES B e x` x a , b,, 0 , 1 f . A real-valued lattice L is a discrete additive subgroup of Rn. ,51 Any lattice can be characterized in terms of a set of m # n lin- P1B2 is also a so-called fundamental region, i.e., a region that 5 c 6 early independent basis (or generator) vectors b1, , bm , completely covers the span of B when shifted to all points of n b, [ R , as the lattice. Another important fundamental region is the Voronoi region, defined as the set of points in Rn that are clos- m L! 5 [ Z er to the origin than to any other lattice point, e x ` x a z, b,, z, f . ,51 V1L2 ! 5x | y x y # y x 2 y y for all y [ L6. Here, Z denotes the set of integers and m is referred to as the rank or dimension of the lattice. Lattices are periodic Voronoi regions can be associated to all other lattice points by arrangements in the sense that translations of the lattice by an a simple translation of V1L2. In contrast to the fundamental arbitrary integer multiple of any lattice point leave the lattice parallelotope P1B2 , the Voronoi region V1L2 is a lattice unchanged; formally, for any x, y [ L there is y 1 kx [ L for invariant, i.e., it is independent of the specific choice of a lat- all k [ Z. For convenience, we will often arrange the basis vec- tice basis. 3 5 1 c 2 tors into an n m matrix B b1 bm and simply call B Clearly, different bases lead to different fundamental paral- the basis of the lattice. Since the basis vectors were assumed lelotopes. However, the volume (here, volume is defined in linearly independent, B has full (column) rank, i.e., the m-dimensional space spanned by the columns of B) of rank1B2 5 m. Any element of the lattice can then be represent- P1B2 is the same for all bases of a given lattice. This volume ed as x 5 Bz for some z [ Zm. equals the so-called lattice , which is a lattice The simplest lattices are the cubic lattices, obtained for invariant defined as the square-root of the determinant of the T n 5 m by choosing the basis vectors as b, 5 e,, where e, Gramian B B, denotes the ,th column of the n-dimensional identity matrix 5 1 c 2 5 L 5 Zn k L k ! ! T In e1 en . In this case we have B In and . Note, det1B B2. (1) however, that due to its periodicity, Zn can also be generated by the basis vectors 5e , e 1 ke , c, e 1 ke 6, k [ Z, i.e., If the lattice has full rank (i.e., if n 5 m), the lattice determi- | 1 2 1 n 1 B 5 B 1 k10 e ce 2. While B 5 I is an orthogonal basis, the nant equals the magnitude of the determinant of the basis | 1 1 n | columns of B become more and more colinear as k increases matrix B, i.e., |L| 5 |det1B2|. For a transformed basis B 5 BT | | 1 T 2 (i.e., the condition number of B increases). Obviously, when (with T being unimodular), we have in deed det B B 5 | Zn 1 T T 2 5 2 1 2 1 T 2 5 k L k 2 working with , the basis In is to be preferred over B. Figure 1 det T B BT det T det B B . The “quality” of a illustrates the case n 5 m 5 2 by showing the bases B 5 I and | 2 5 1 1 2 B B 3 0 e1 along with some lattice characteristics defined later in this section. x2 The foregoing example revealed that the basis for a given lattice is not unique. Indeed, it can be shown that there exist infinitely many bases for a lattice; it is exactly these degrees of freedom with the choice of a lattice basis that are exploited by ∼ lattice reduction algorithms (see the section “Lattice b2 b2 Reduction Techniques”). 
The requirement that two matrices B x1 and BT (with T an m 3 m matrix) span the same lattice is ∼ b1 = b1 equivalent to BZm 5 BTZm and thus Zm 5 TZm. The last equality holds if and only if T is invertible and both T and T21 have integer elements. Equivalently, T must be a matrix with integer elements and determinant |det1T2| 5 1; in fact, these | [FIG1] Example lattice Z2 with bases B 5 Q 10R and B 5 Q 13R 01 01 two properties guarantee that the inverse exists and also is an and associated fundamental parallelograms P1B2 (in red) and | integer matrix. This follows from the observation that the P1 B2 (in blue). The lattice determinant [defined in (1)] equals 1 2 -1 12 2 k1l 1 2 1 2 [ Z 1 2 k, l th element of T equals 1 det Tkl /det T (Tkl k L k 5 1 and the orthogonality defects [see (2)] are j B 5 1 and 1 | 2 is obtained by deleting the kth row and lth column of T). j B 5 !10.

IEEE SIGNAL PROCESSING MAGAZINE [73] MAY 2011 lattice basis can be measured in terms of the orthogonality zero-forcing (ZF) detector that first equalizes the channel by | 5 21 5 1 21 defect, defined as computing s ZF H y s H w and then uses a simple component-wise slicer (quantizer) that delivers the element in 1 m j1 2 5 Z2 | | B q ||b,||. (2) closest to sZF. The square decision regions for sZF corre- |L| ,5 1 spond to decisions regions for y that are given by the funda- For any m 3 m positive definite matrix A with elements mental parallelogram P1H2 centered around each lattice 1 2 # m ak,,, the Hadamard inequality states that det A w,51 a,,, point. Unless the channel is orthogonal, the ZF decision with equality if and only if A is diagonal. Setting A 5 BTB regions are different from the ML decision regions, therefore this implies that the orthogonality defect is bounded from resulting in a larger number of detection errors. below as j1B2 $ 1, with equality if and only if B is orthogo- The error probability of a detector is largely determined nal. The fundamental parallelotope, lattice determinant, by the minimum distance of the lattice points from the and orthogonality defect are illustrated for the lattice Z2 in boundaries of the associated decision region. This distance Figure 1. can be interpreted as the maximum amount of noise that can To any lattice, there is an associated dual lattice, defined by be tolerated without ending up in an incorrect decision region. This minimum distance is much smaller for ZF Lw ! 5xw [ span1B2| xT xw [ Z for all x [ L6. detection than for ML detection, thereby explaining the infe- riority of ZF detection. The problem with the ZF detector can If B is a basis for the primal lattice L, then a basis Bw for the be seen in the fact that it is based on the (pseudo)inverse of dual lattice Lw can be obtained via the right Moore-Penrose the channel matrix H and that the component-wise slicer pseudoinverse, i.e., (quantizer) ignores the correlation introduced by the equal- izer. This implies that ZF detection is not independent of the Bw 5 B1BT B2 21. (3) basis (induced by the channel matrix), which may often be rather unfavorable. The fundamental idea now is to apply T w 5 Since B B Im, it follows that the primal and dual basis ZF detection with respect to a different, improved basis T w | vectors are bi-orthogonal, i.e., b, b 5 0 for , 2 k. H 5 HT (with unimodular T). This is possible since k | w 5 1 5 1 5 21 [ Z2 Geometrically, this means that the dual basis vector bk is y Hs w Hz w where z T s due to the uni- orthogonal to the subspace spanned by the primal basis vec- modularity of T. Lattice-reduction aided ZF detection then tors b , c, b 2 , b 1 , c, b . This is useful since for any performs equalization with the (pseudo)inverse of the 1 k 1 k 1 m | x 5 Bz [ L we can recover the kth integer coefficient via improved basis H, followed by component-wise integer quan- w 5 T zk x bk . The determinant of the dual lattice is easily seen to tization (yielding an estimate of z) and multiplication by T be given by 0 Lw 0 5 1/|L|. The cubic lattice Zn is an example of (yielding an estimate of s). This procedure is equivalent to a lattice that is self-dual in the sense that L 5 Lw. decision regions that correspond to the fundamental paral- | lelogram P1H2. Performance is improved by lattice reduc- | MOTIVATING EXAMPLE tion since P1H2 is more similar to the Voronoi region V1L2 To illustrate the relevance of lattices and lattice reduction to than P1H2 . 
While ZF detection with respect to a reduced MIMO communications, we consider a system transmitting basis is generally not equivalent to ML detection, lattice- s two integer symbols s 5 Q 1R [ Z2 over two transmit anten- reduction-aided ZF detection typically results in large perfor- s2 nas (see also “Motivating Example: MIMO Detection”). This mance gains (see the sections “Lattice Reduction for Data example is inspired by a similar one in [34]. At the receiver, Detection” and “Lattice Reduction for Precoding”). 5 1 1 two antennas receive the samples y1 h1,1 s1 h1,2 s2 w1 5 1 1 and y2 h2,1 s1 h2,2 s2 w2, where hk,, denotes the channel COMPLEX-VALUED LATTICES coefficient between receive antenna k and transmit antenna In many practical problems, e.g., in wireless communications, , and w1 and w2 denotes noise. We rewrite the observation the quantities involved are complex valued. The previous y1 h1,1 h model as y 5 Q R 5 x 1 w with x 5 Q R s 1 Q 1,2 R s and discussion of real-valued point lattices can be generalized to the y 1 2 w1 2 h2,1 h2,2 w 5 Q R. Clearly, the noiseless receive vectors x form a lat- complex case in a more or less straightforward manner. w2 h tice with basis given by the channel matrix H 5 Qh1,1 1,2 R, Specifically, a complex-valued lattice of rank m in the n-dimen- h2,1 h2,2 i.e., x [ L1H2. sion complex space Cn is defined as

If the elements of the noise vector are independent and m L!e ` 5 [ Z f identically distributed (i.i.d.) Gaussian, optimal ML detection x x a z, b,, z, j , ^ ,51 amounts to finding the lattice point xML lying closest to y. This [ Cn Z 5 Z 1 Z leads to decision regions that equal the Voronoi regions of the with complex basis vectors b, and j j denoting lattice and hence are independent of the specific lattice basis. the set of complex integers (also known as Gaussian integers). While ML detection is optimal, the search for the closest lat- By arranging the basis vectors into an n 3 m complex-valued tice point is computationally hard, especially in high dimen- matrix B and noticing that the complex mapping x 5 Bz can be sions. As a computationally simpler alternative, we may use a equivalently expressed as

IEEE SIGNAL PROCESSING MAGAZINE [74] MAY 2011 MOTIVATING EXAMPLE: MIMO DETECTION In both (a) and (b), green circles illustrate the detector’s Figure S5 illustrates a rotated square receive lattice induced noise robustness, which is determined by the distance of 2 by the channel realization H 5 Q 14R and integer symbols the lattice points from the associated decision boundaries. 2 2 3 | s [ Z2. The channel matrix corresponds to the basis vectors In this example, a reduced basis is obtained as h 5 h and 2 | | 1 1 h 5 Q 1R and h 5 Q4 R (shown in black in (b)), which are h 5 h 1 2h , i.e., H 5 HT with T 5 Q1 2R. This new basis 1 2 2 2 3 2 2 1 01 seen to be almost collinear. This means that H is poorly con- [shown in blue in (a)] is optimal in the sense that it consists ditioned; its orthogonality defect equals j1H2 5 !5. The ML of two orthogonal shortest vectors (in general, there is no decision regions, shown in (a), correspond to the Voronoi guarantee that an orthogonal basis exists nor that it can regions of the lattice and are rotated squares. For the specif- be found via lattice reduction). The fundamental parallelo- | ic receive vector shown as light-blue square, the closest lat- gram associated with H is congruent with the Voronoi ^ 5 2 2 tice point is xML Q R (marked as red circle) which region of the lattice. Thus, ZF equalization with respect to 1 2 2 | corresponds to the ML decision s^ 5 H 1x^ 5 Q 2R. With H followed by a Z2 slicer and a remapping z S s here (but ML ML 2 1 ZF detection, the decision regions are given by the shifted not in general) is equivalent to ML detection. Specifically, | 2 versions of the fundamental parallelogram P1H2 associated the closest lattice point Hz 5 Q 2R corresponds to the 1 with H [see (b)]. Due to the poor condition number of H, (transformed) symbol vector z 5 T21s 5 Q 0 R and hence to 21 5 5 22 the ZF and ML decision regions are markedly different. the symbol decision s Tz Q2 R, which coincides with the ^ 2 1 Indeed, the ZF estimates are given by x 5 Q 1R and ML result. 2 ZF 3 ^ 5 Q 3R Z ^ sZF 2 1 sML.

ML Detector ZF Detector y2 y2

y1 y1

(a) (b)

2 [FIGS5] A rotated square receive lattice induced by the channel realization H 5 Q 14R and integer symbols s [ Z2. (a) ML 2 2 3 decision regions and (b) decision regions of ZF detector.

R5x6 R5B6 2J5B6 R5z6 that are short and therefore this basis is called reduced. Unless a b 5 a b a b , (4) J5x6 J 5B6 R5B6 J5z6 stated otherwise, the term “short” is to be interpreted in the usual Euclidean sense. There are several definitions of lattice it is seen that any m-dimensional complex-valued lattice in Cn reduction with corresponding reduction criteria, such as can be dealt with as a 2m-dimensional real-valued lattice in R2n. Minkowski reduction [41]–[43], Hermite-Korkine-Zolotareff Yet, many of the concepts and algorithms from the real-valued reduction [44], [45], Gauss reduction [46], LLL reduction [2], space can be (and have been) formulated directly in the complex [33], Seysen reduction [3], and Brun reduction [47], [48]. The domain with minor modifications (see, e.g., [14] and [35]–40]). corresponding lattice reduction algorithms yield reduced bases This is sometimes advantageous since the problem dimension is with shorter basis vectors and improved orthogonality; they not increased, the basis matrix need not obey the structure in provide a tradeoff between the quality of the reduced basis and (4), and the algorithm complexity can be lower. the computational effort required for finding it. In the following, we discuss the basics of the various lattice LATTICE REDUCTION TECHNIQUES reduction approaches and the underlying reduction criteria. Lattice reduction techniques have a long tradition in mathe- MATLAB implementations of some of the algorithms are provid- matics in the field of number theory. The goal of lattice basis ed as supplementary material in IEEE Xplore (http://ieeexplore. reduction is to find, for a given lattice, a basis matrix with ieee.org). Since most of the lattice reduction techniques are favorable properties. Usually, such a basis consists of vectors based on an orthogonal decomposition of a lattice basis matrix,

IEEE SIGNAL PROCESSING MAGAZINE [75] MAY 2011 exactly zero, b, has no component into the direction of QR AND GRAM-SCHMIDT c b1, , b,21 and is correspondingly orthogonal to the space The unnormalized Gram-Schmidt orthogonalization of a spanned by these vectors. However, for general lattices such a basis b , p, b is described by the iterative basis updates 1 m strictly orthogonal basis does not exist and one has to settle for ,21 T ^ ^ ^ b, bk a basis satisfying less stringent criteria. b, 5 b,2 m, b , with m, 5 . a ,K k ,k y ^ y 2 k51 bk In contrast, the (thin) QR decomposition represents BASIC APPROACH the given basis vectors in terms of orthonormal basis As discussed in the section “Point Lattices,” the columns of

vectors q1, …, qm as the matrix B define the lattice L. The same lattice is also generated by any matrix that is constructed from B by the fol- , 5 lowing elementary column operations: b, a rk,, qk. k51 1) Reflection: This transformation performs a sign change by | Some algebra reveals the following relations between QR 2 52 multiplying a specific column by 1, i.e., b, b,; the cor- decomposition and Gram-Schmidt orthogonalization: responding unimodular matrix reads • The ,th column q, of Q is the normalized version of the ^ ^ 5 y ^ y 1 2 corresponding Gram-Schmidt vector b,, i.e., q, b,/ b, . , T T 5 I 2 2e, e,. (5) • The length of the ,th Gram-Schmidt vector equals the R m ,th diagonal element of the triangular matrix R, i.e., y ^ y 5 2) Swap: Here, two columns of the basis matrix are inter- b, r,,,. | changed. Swapping column k and , according to b, 5 b and • The off-diagonal elements of the triangular matrix R are | k 5 related to the Gram-Schmidt coefficients according to bk b, amounts to postmultiplication of the basis matrix m 5 ,,k rk,,/rk,k. with the unimodular matrix Using the above analogies, any lattice reduction technique for- 1k,,2 T T T T mulated in terms of the QR decomposition can be rephrased 5 2 2 1 1 TS Im ekek e,e, ek e, e, ek. (6) using Gram-Schmidt orthogonalization and vice versa. 3) Translation: This operation adds one column to another | 5 1 column, i.e., b, b, bk. Such a translation is character- we first briefly review the QR decomposition. Some texts build ized by the unimodular matrix on the (unnormalized) Gram-Schmidt orthogonalization instead 1 2 k,, 5 1 T of the QR decomposition (see “QR and Gram-Schmidt” for a dis- TT Im ek e,. (7) cussion of their relation). In practical implementations, the QR decomposition is preferable since it can be performed efficiently A translation by an integer multiple corresponds to repeated | 1k,,2 5 1 m m [ Z and numerically more stable (using e.g., Givens rotations or application of TT , i.e., b, b, bk with is 3 1k,,2 4m 5 1 m T Housholder transformations [49]) than Gram-Schmidt. obtained via postmultiplication with TT Im ek e,. Subtraction of integer multiples of a column can be achieved QR DECOMPOSITION by reflecting the corresponding column before and after the | 3 $ 5 2 m Consider an n m lattice basis B with n m and translation, i.e., b, b, bk corresponds to the unimodu- 1 2 5 1k2 3 1k,,2 4m 1k2 5 3 1k,,2 4 2m 5 2 m T rank B m. The (thin) QR decomposition factorizes B lar matrix TR TT TR TT Im eke,. 5 5 1 c 2 3 according to B QR, where Q q1 qm is an n m We note that only a translation actually impacts the quality of column-orthogonal matrix and R is an m 3 m upper triangular the basis directly, whereas reflections and column swaps help to matrix with positive diagonal elements. We denote the element systematically perform the appropriate translations. Since uni- , of R in row k and column by rk,,. Since Q has orthogonal col- modular matrices form a group under multiplication, any T 5 umns with unit norm, we have Q Q Im. The QR decomposi- sequential combination of the above three elementary operations tion amounts to expressing the ,th column of B in terms of the corresponds to post-multiplying B with a specific unimodular c orthonormal basis vectors q1, ,q, as matrix T. The goal of lattice reduction algorithms is to determine , a sequence of elementary column operations (equivalently, a uni- b, 5 r , q . 
modular matrix T) that transforms the given basis B into a a k, k | k51 reduced basis B 5 BT according to the specific requirements of T 5 Here, qk b, rk,, characterizes the component of b, collinear the corresponding reduction criterion. In the following, we dis- with q . Furthermore, r, , describes the component of b, which cuss various lattice reduction approaches. Hereafter, the QR fac- k , | | | | c | 5 | is orthogonal to the space spanned by b1, ,b,21 or, equiva- tors of B will be denoted by Q and R, i.e., B QR. c lently, by q1, , q,21. The QR decomposition gives a descriptive explanation for MINKOWSKI AND HERMITE-KORKINE-

the orthogonality of a basis. A basis vector b, is almost orthogo- ZOLOTAREFF REDUCTION c nal to the space spanned by b1, , b,21, if the absolute values H. Minkowski [41]–[43] developed the field of geometry of num- c of r1,,, , r,21,, are close to zero. If these elements of R are bers and exploited the concept of point lattices to formulate the

IEEE SIGNAL PROCESSING MAGAZINE [76] MAY 2011 corresponding theory. He introduced a very strong reduction | GAUSS REDUCTION criterion that requires that the first vector b1 of the ordered | | Below, we provide pseudocode for Gauss reduction. The basis B is a shortest nonzero vector in L1B2. In the following, algorithm takes a two-dimensional basis matrix B as input when we speak of “shortest vectors,” this is meant to implicitly | and success ively performs column swaps and size reduc- exclude the trivial all-zeros vector. All subsequent vectors b, for tions until a Gauss reduced basis is obtained. 2 # , # m have to be shortest vectors such that the set of vec- | | | | | 1: B d B tors b , c, b, can be extended to a basis of L1B2. Thus, b, is a 1 | 2: repeat shortest vector in L1B2 that is not a linear combination of | | | | c 4 b1, , b,21. 3: b1 b2 | | The definition of a Hermite-Korkine-Zolotareff-reduced basis | | bT b | d 2 l |1 2 k is related to that of Minkowski. It requires that the projection of 4: b2 b2 7 7 b1 | b1 the basis vectors b, onto the orthogonal complement of the | | | | y y # y y 5 c 6 5: until b1 b2 space spanned by b1, , b,21 are the shortest vectors of the corresponding projected lattices [44], [45]. Consequently, the | | first vector b is again a shortest vector of L1B2. and column swapping operations is repeated until the length 1 | Due to their high computational complexity (see the section of b is shorter—after the preceding size reduction step— 1 | “Complexity of Lattice Reduction”), the Minkowski and than that of b2, which implies that no further column swap- Hermite-Korkine-Zolotareff reductions will not be considered in ping operation is performed (see “Gauss Reduction”). After a the context of MIMO detection and precoding (cf. the sections finite number of iterations, this algorithm provides a Gauss- | | | “Lattice Reduction for Data Detection” and “Lattice Reduction reduced basis B, where y b y # y b y and the properties of a 1 2 | | for Precoding”). size-reduced basis are satisfied. In particular, b1 and b2 are the two shortest vectors in the lattice L that form a basis for SIZE REDUCTION L. “Example: Size and Gauss Reduction” illustrates these two A rather simple but not very powerful criterion is given by the reduction techniques with a simple example. | so-called size reduction. A basis B is called size reduced if the | elements of the corresponding upper triangular matrix R satisfy LLL REDUCTION the condition A powerful and famous reduction criterion for arbitrary lat- tice dimensions was introduced by A.K. Lenstra, | # 1 | # , , # |r k,,| 2 |rk ,k| for 1 k m . (8) H.W. Lenstra, and L. Lovász in [2], and the algorithm they proposed is known as the LLL (or L3) algorithm (see also [29] | | Thus, the component of any vector b, into the direction of q and [33]). It can be interpreted as an extension of | k , , . for k is not longer than half of the length of bk perpendicu- Gauss reduction to lattices of rank m 2. | | | | || lar to span5q , c,q 2 6. If this condition is not fulfilled for an In particular, a basis B with QR decomposition B 5 QR is 1 k 1 | index pair 1k,,2, the length of b, can be reduced by subtracting called LLL reduced with parameter 1/4 ,d#1, if | a multiple of bk where the integer multiplication factor is given m5 | | # | # 1 | # , , # by < r k,,/ r k,k; with < ; denoting the rounding operation. 
This |r k,,| 2 | rk,k|, for 1 k m, (9a) subtraction corresponds to applying the unimodular translation 3 1k,,2 4 2m d | 2 # | 2 1 | 2 , 5 c matrix T T [cf. (7)]. We note that size reduction does not |r ,21,,21| |r ,,,| |r ,21,,| , for 2, , m. (9b) involve any column swaps. A basis fulfilling (8) is often called weakly reduced. The choice of the parameter d affects the quality of the reduced basis and the computational complexity. Larger d results in a better GAUSS REDUCTION basis at the price of higher complexity (see also the section “LLL In contrast to the other reduction approaches, the reduction Complexity”). A common choice is d53/4. The inequality in (9a) method introduced by C.F. Gauss in the context of binary qua- is the condition for a size-reduced basis [cf. (8)] and (9b) is the so- 5 | 2 dratic forms is restricted to lattices of rank m 2, i.e., called Lovász condition. The quantities |r ,21,,21| and n32 | 2 | 2 B 5 1b b 2 [ R [46]. For such two-dimensional lattices, |r , ,| 1 |r ,2 ,| in the Lovász condition equal the squared lengths 1 2 , 1, | | Gauss reduction constructs a basis that fulfills the reduction cri- of the components of b,2 and b, that are orthogonal to the space | | 1 teria introduced by Minkowski and Hermite-Korkine-Zolotareff. spanned by b , c,b,2 . If the Lovász condition is not fulfilled for |1 |2 In addition to size reduction, Gauss reduction also two vectors b,21 and b,, these vectors are swapped (similar to includes column swapping operations that correspond to Gauss reduction) and the resulting basis matrix is again 1k,,2 applying the unimodular matrices TS defined in (6). In par- QR-decomposed and size reduced. This process of subsequent size ticular, after first size reducing the given basis matrix B, the reduction and column swapping operations [which corresponds to | | | columns of the resulting basis B 5 1 b b 2 are swapped if the the application of the reflections, swaps, and translations defined in | | 1 2 length of b1 is larger than that of b2 and the resulting basis is (5)–(7)] is repeated until—after a finite number of iterations—an again size reduced. This process of successive size reduction LLL-reduced basis is obtained (see “The LLL Algorithm” and [2],

IEEE SIGNAL PROCESSING MAGAZINE [77] MAY 2011 EXAMPLE: SIZE AND GAUSS REDUCTION We illustrate size reduction and Gauss reduction with a small example. Figure S6 shows a two-dimensional lattice L spanned 5 2.2 5 3.2R ~ ~ by the basis vectors b1 Q R and b2 Q (shown in blue). ′ 1 1 b2 b1 = b1 b2 The orthogonality defect of this basis equals j1B2 5 8.1. ~′ –2b1 Size Reduction −b1 The QR decomposition of this basis yields R 5 Q2.417 3.327R. 0 0.414 5 . 5 ~ ~ Since |r1,2| 3.327 |r1,1|/2 1.208, the basis is not size ′ b2 = b1 reduced. The corresponding size reduction step is to replace | b with b 5 b 2 mb where m5<3.327/2.417; 5 1, leading to 2 2 2 1 | the new basis vector b 5 Q1R (shown in red). The first basis 2 0 | 5 vector remains unchanged, i.e., b1 b1. Since | [FIGS6] Illustration of size reduction (red) and Gauss reduction r 5 r 2 r 5 0.91 , |r | /2 5 1.208, the new basis is size 1,2 1,2 1,1 1,1 | (green) for a two-dimensional lattice L spanned by the basis reduced; its orthogonality defect is j1B2 5 2.42. vectors b 5 Q2.2R and b 5 Q3.2R (shown in blue). 1 1 2 1 Gauss Reduction | | After the size reduction, b is more than twice as long as b . This We note that the above sequence of size reduction, col- 1 | | 2 4 umn swap, and size reduction can be described in terms of suggests to perform a column swap b1 b2 that leads to the 12.2 basis matrix Q R that is actually upper triangular. Since a unimodular matrix T that decomposes into the elementa- | 01| |r | 5 2.2 . |r | /2 5 1/2, we perform another size reduction ry column operations [see (5)–(7)] 1,2 | |1,1 | | | | r 5 2 < ; 5 0.2 r 5 r step, i.e., b2 b2 2.2/1 b1 Q R and b1 b2. Since b1 is |r 1 shorter than b no further column swap or size reduction is pos- 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 2 T 5 T 1 T 1,2 T 1 T 1,2 T 1 T 1,2 T 1,2 T 1 sible and a Gauss-reduced basis has been obtained, i.e., the basis R T R S R T T R | Br 5 Q1 0.2R consists of two shortest vectors that span L. The 1 21 01 1 22 213 0 1 | 5 a ba b a b 5 a b. orthogonality defect of this basis is j1Br2 5 1.02. 01 10 01 1 22

[50], and [51] for more details). LLL reduction results in many and the dual basis Bw. M. Seysen defined the orthogonality | interesting properties of the corresponding reduced basis B. For criterion example, it can be shown that (9a) and (9b) imply that the length of | m the shortest vector b1 in an LLL-reduced basis is upper bounded in 1 2 5 y y 2 y wy 2 S B a b, b, , terms of the length of a shortest vector x in the lattice, i.e., ,51 | min y y # !am21 y y a5 d2 21 b1 xmin with Q 1/4R (it was observed that in practice LLL reduction provides a much better approxima- which achieves its minimum, S1B2 5 m, if and only if the tion of the shortest vector than suggested by this bound). basis B is orthogonal. A basis B is called S-reduced if Furthermore, the orthogonality defect of an LLL-reduced basis is S1B2 # S1BT2 holds for all possible unimodular transforma- bounded (though not necessarily minimal), i.e., tion matrices T. Thus, for an S-reduced basis, no unimodular | j1B2 # !4 am1m212. This property has been used in [18] to show transformation matrix T leads to a further reduction of the that suboptimum detectors aided by LLL lattice reduction achieve corresponding Seysen’s measure. The determination of an S full diversity (see also [14] and [19]–[23]). We refer to the section -reduced basis in general is computationally too expensive. “LLL Complexity” for a discussion of the complexity of LLL. Hence, one typically considers relaxed versions of lattice reduction based on Seysen’s measure that successively per- DUAL LATTICE REDUCTION form elementary column operations with respect to just two Instead of applying a particular lattice reduction algorithm to basis vectors [cf. (5)–(7)] to find a local minimum of S1B2. the basis B, a reduction of the dual basis Bw defined in (3) can The greedy variant that selects the two columns involved in be performed. We note that a reduction of Bw by a unimodular the basis update such that S1B2 is maximally decreased is matrix T corresponds to a reduction of the primal basis B by referred to as S2 reduction. A MATLAB implementation of S2 the unimodular matrix T21. For lattice reduction based on the reduction is provided as supplementary material in IEEE

LLL algorithm, this dual lattice-reduction approach was pro- Xplore (http://ieeexplore.ieee.org). The application of S2 posed in [18] to improve the performance of lattice-reduction- reduction for data detection has recently been proposed in based linear equalization schemes (see the section “Lattice [36] (see also the section “Lattice Reduction for Data Reduction for Data Detection”). Detection”).

SEYSEN’S LATTICE REDUCTION ALGORITHM BRUN’S ALGORITHM The basic principle of Seysen’s lattice reduction algorithm In 1919, V. Brun [47], [48] proposed an efficient algorithm for [3], [52] lies in the simultaneous reduction of the basis B finding approximate integer relations (see also [8]), i.e., finding

IEEE SIGNAL PROCESSING MAGAZINE [78] MAY 2011 Lattice reduction based on Brun’s algorithm performs itera- THE LLL ALGORITHM T tive column updates (translations/size reductions) such that |u1t,| We provide pseudocode for the LLL algorithm, summariz- decreases (the column pair and the size reduction parameter can ing the main algorithmic steps. The algorithm inputs are be determined very efficiently). This process is repeated as long as the basis matrix B and the reduction parameter d. | the orthogonality defect of the corresponding reduced basis 1: B d B decreases. Compared to LLL and Seysen reduction, Brun reduc- | | | 2: 3 Q, R4 d qr1 B2 tion performs poorer but has significantly lower complexity. A 3: , d 2 MATLAB implementation of Brun reduction is available as sup- 4: repeat plementary material in IEEE Xplore (http://ieeexplore.ieee.org). | | | r,2 , | d 2 l 1, k 5: b, b, | b,21 r,21, ,21 REDUCTION PERFORMANCE d | 2 . | 2 1 | 2 In the following, the quality of the different reduction schemes is 6: if |r ,21,,21| |r ,,,| |r ,21,,| then | | | compared by means of the condition number k1B2 (the ratio of 7: b,21 4 b, | the maximum and the minimum singular value of B), the 8: , d max1, 2 1, 22 | orthogonality defect j1B2, and the normalized Seysen criterion 9: else | | Sr1B2 5 S1B2/m. These three performance metrics are lower- 5 , 2 10: for k 2 to 1 do bounded by one and should be as small as possible. The compari- | | | r , | d 2 l k, k son here does not take into account the complexity of the various 11: b, b, | bk rk,k lattice reduction methods. While the preceding discussion of lat- 12: end for tice reduction methods was restricted to real-valued basis matri- 13: , d , 1 1 ces, we here compare the respective extensions to complex-valued basis matrices (cf. the section “Complex-Valued 14: end if Lattices”) for the case of complex Gaussian matrices whose ele- , . 15: until m ments are i.i.d. with zero mean and unit variance. For the LLL In lines 5 and 11, a size reduction is performed, whereas line 7 algorithm, we used the common choice d53/4. Better results at implements a column swap. In the above pseudocode, we the expense of higher complexity can be achieved with larger d. omitted the updates of the QR decomposition, which essential- For matrices of dimension 6 3 6, Figure 2 shows the cumula- ly consist of simple Givens rotations and are necessary after tive distribution functions (CDFs) of the different performance each basis change (lines 5, 7, and 11). Actual implementations of the LLL algorithm (e.g., [50]) operate directly on the QR fac- tors and additionally provide the corresponding unimodular matrix T as output. MATLAB code for such an implementation 1 is made available as supplementary material in IEEE Xplore 3/4 None LLL (http://ieeexplore.ieee.org). 1/2 DLLL CDF 1/4 SEY 0 0 1234 ~ ln κ (B) n integer vectors t, [ Z that are (almost) orthogonal to a given (a) vector u [ Rn while being as short as possible. It has been real- 1 ized in [35] that Brun’s algorithm can also be used for lattice 3/4 | 5 1/2 reduction. In particular, let us consider a transformed bases B CDF | | 1/4 1 c 2 5 5 1 c 2 b1 bm BT where T t1 tm is unimodular. The 0 | 01234 orthogonality defect of B is determined by the lengths of the basis ~ | ln ξ (B) vectors b,, which equal (b) | m 1 y y 2 5 y y 2 5 l 1 T 2 2 b, Bt, a k uk t, , (10) 3/4 k51 1/2 l . 5 c CDF 1/4 where k 0 and uk, k 1, , m, denote the eigenvalues and T l 0 eigenvectors of the Gramian B B. 
Assuming that 1 is the largest 01234 l W l 2 ~ eigenvalue and that 1 k, k 1, (10) suggests that decreas- ln S′ (B) | 2 ing the orthogonality defect by reducing y b, y requires t, to be as (c) orthogonal as possible to u1; this is exactly the approximate inte- ger relation problem solved by Brun’s algorithm. The assumption [FIG2] Performance assessment of LLL, dual| LLL, and| Seysen S2 k j l W l is particularly well justified for precoding in wireless reduction| in terms of the CDF of (a) ln ( B), (b) ln (B), and 1 k (c) ln Sr(B) (i.i.d. complex Gaussian basis matrices with zero mean MIMO systems; see the section “Brun’s Algorithm for Lattice- and unit variance, m 5 n 5 6 ). LLL and dual LLL lie practically on Reduction-Assisted Precoding.” top of each other.

IEEE SIGNAL PROCESSING MAGAZINE [79] MAY 2011 metrics (on a log scale) for the LLL-reduced basis, for the shortest vector problems, and any method for computing the

S2-reduced basis (labeled “SEY”), for dual LLL (“DLLL”—LLL shortest lattice vector is thus applicable for use in the applied to the dual basis Bw), and for the unreduced basis. All lat- Hermite-Korkine-Zolotareff reduction [4]. The sphere decod- tice reduction algorithms scale the three performance metrics er [4] may be used to solve the shortest vector problems. down significantly. Furthermore, SEY marginally outperforms Since the sphere decoder has a (worst case and average) com- LLL and DLLL (which perform identically). plexity that grows exponentially with the dimension m of the Figure 3 shows the CDFs of the squared length of the longest lattice [54], it is practically feasible only for lattices of moder- | |w primal basis vector b, and of the longest dual basis vector b ,. It is ate dimension. In this context, it is interesting to note that seen that for the primal basis SEY and LLL perform identically the sphere decoding algorithm of Fincke and Pohst was in and superior to DLLL; in contrast, for the dual basis SEY and fact presented as a solution to the shortest vector problem in DLLL achieve equivalent results with LLL being inferior. This con- [5], rather than the closest vector problem to which it is typi- forms that SEY reduces the primal and dual basis simultaneously cally applied in the context of detection and precoding. while LLL only reduces the primal basis and DLLL only reduces the dual basis. LLL COMPLEXITY The metrics considered in this section give an indication of the The complexity of the LLL algorithm is traditionally assessed performance of the various algorithms; they are not necessarily in by counting the number of LLL iterations, denoted K, where one-to-one correspondence, though, with other performance met- an iteration is started whenever the Lovász condition in (9b) is rics like bit error rate (BER) in a communication system. tested (this corresponds to one repeat loop in the pseudocode provided in “The LLL Algorithm”). Thus, an iteration may COMPLEXITY OF LATTICE REDUCTION consist of the selection of a new pair of columns to operate on, or a column swap followed by a size reduction (cf. the section COMPLEXITY OF MINKOWSKI AND “LLL Reduction”). The number of LLL iterations required to HERMITE-KORKINE-ZOLOTAREFF REDUCTION reduce a given basis was treated already in [2], which made The Minkowski and Hermite-Korkine-Zolotareff reductions clever use of the quantity

are the strongest but also the computationally most demand- n21 k ! | 2 ing to obtain. In both the Minkowski and the Hermite-| D q q |r ,,,| . Korkine-Zolotareff-reduced lattice basis the first vector b of k51 ,51 | 1 the reduced basis B corresponds to the shortest vector in the It can be shown that D is decreased by (at least) a multiplicative lattice. This implies that the computation of the Minkowski factor d [cf. (9b)] whenever two columns are swapped by the LLL and Hermite-Korkine-Zolotareff-reduced bases are at least as algorithm and left unchanged by the size reduction. If the origi- complex as the computation of the shortest lattice vector, a nal basis matrix B is integer-valued, then D $ 1 throughout the problem known to be NP-hard (under randomized reduc- execution of the LLL algorithm [2], which in turn implies that tions) [53]. In fact, the computation of a Hermite-Korkine- the number of swap operation carried out by the LLL algorithm Zolotareff-reduced basis essentially consists of solving m is finite and that the algorithm always terminates. In fact, given integer basis vectors of bounded Euclidean length V, the algo- rithm terminates after at most 1 None 3/4 # 2 1 LLL K m logt V m 1/2 DLLL CDF 1/4 SEY iterations [2], [55], where t ! 1/ "d . 1. This also implies that 0 0 2 4 6810 the complexity (in terms of binary operations) of the LLL algo- 2 rithm is polynomial in the (binary) description length of the basis max {||b˜ B|| } (a) B—a widely celebrated result. For an integer-valued basis B, the 1 binary description length is the number of bits required to speci- 3/4 fy the elements of B. A similar statement can be made also in the 1/2 extreme case where d 5 1, although the proof of this statement CDF 1/4 requires some additional work [56]. 0 The complexity analysis for the case of general (real- or com- 0 1 2 345 plex-valued) B is complicated by the fact that it is no longer nec- {||˜ w||2} max bB essarily true that D $ 1. However, other strictly positive lower (b) (and upper) bounds on D can be found in this case. Based on such bounds, it was shown in [55] that the number of LLL itera- [FIG3] The CDF of maximum squared length of reduced basis tions in general is upper bounded as vectors for (a) the primal basis and (b) the dual basis (i.i.d. complex A Gaussian basis matrices with zero mean and unit variance, # 2 1 K m logt m, (11) m 5 n 5 6 ). a

IEEE SIGNAL PROCESSING MAGAZINE [80] MAY 2011 ! | where A max, |r ,,,| and LATTICE REDUCTION IS EXTREMELY algorithm must be finite fol- ! | a min, |r ,,,|. The bound (11) USEFUL FOR DETECTION AND lows by [3, Corollary 9] implies that the LLL algorithm although the bound given there PRECODING IN WIRELESS terminates for arbitrary bases. is too loose to be useful in a The expression in (11) was used MIMO SYSTEMS. practical complexity analysis. in [55] to prove polynomial aver- There is, to our knowledge, no age complexity of the LLL algorithm for the case where the basis similar statement available for Brun’s algorithm. It is likely vectors are uniformly distributed inside the unit sphere in Rm. that an argument similar to [3] could be used to show that This result has subsequently been extended to i.i.d. real- and Brun’s algorithm also always terminates. It is also likely that complex-valued Gaussian bases in the context of MIMO commu- (similar to the LLL algorithm) there are bases that require an nications [40], [57]. arbitrarily large number of iterations to reduce, both for It is possible to further upper bound (11) based on the condi- Brun’s algorithm and Seysen’s algorithm. tion number of B according to [57] Like the LLL algorithm, Seysen’s and Brun’s algorithm have a data-dependent complexity, which makes numerical # 2 k1 2 1 K m logt B m. (12) studies of their complexity relevant. Such investigations have found that Brun’s algorithm tends to be an order of magnitude This bound is valuable specifically because it applies to the LLL less complex than LLL and Seysen’s algorithm (at the expense reduction of both the primal and the dual basis due to of reduced performance) [35]. The overall complexity (in k1B2 5k1Bw 2. It further follows from (12) that a large number terms of floating point operations) of reducing real-valued of LLL iterations can occur only when the original basis matrix bases is slightly lower for the LLL algorithm than for Seysen’s B is poorly conditioned. Although the opposite is not true (there algorithm [58]. No comparable analysis has been made for the are arbitrarily poorly conditioned bases that are simultaneously complex-valued case. LLL reduced), it has been shown (by explicit construction of corresponding bases) that the number of LLL iterations can be NUMERICAL RESULTS arbitrarily large [57]. This means that there is no universal We illustrate the effective complexity of LLL and Seysen lattice upper bound on the number of LLL iterations in the MIMO con- reduction for random n 3 m bases whose elements are text, or equivalently that the worst-case complexity is unbound- i.i.d. complex Gaussian with zero mean and unit variance. In the ed. This said, it should be noted that (in addition to the remainder of this article, such matrices will, for brevity, be polynomial average complexity) the probability mass function of referred to as i.i.d. Gaussian matrices. Figures 4 and 5 show the K has an exponential tail [57], which implies that atypically complementary cumulative distribution function (CCDF) of the large values of K are rare. Numerical experiments also indicate number of (complex) LLL and Seysen iterations. For Seysen that the complexity of the algorithm is relatively low in practice. 
reduction, we used the greedy S2 implementation in which a pair of basis vectors is always selected so that the largest reduction in COMPLEXITY OF SEYSEN AND BRUN REDUCTION the Seysen measure over any pair of two basis vectors is obtained There are much fewer analytical results for the complexity of [3], [52]; here, an iteration is defined as the selection of a pair of

Seysen and Brun reduction than for the LLL algorithm. That basis vectors, followed by their S2 reduction. In Figure 4, one can the number of S2 reductions performed by Seysen’s clearly see that the probability of experiencing an atypically large

100 100 m = n = 2 m = n = 2 m = n = 3 m = n = 3 10−1 m = n = 4 10−1 m = n = 4 m = n = 5 m = n = 5 m = n = 6 m = n = 6 10−2 m = n = 7 10−2 m = n = 7 m = n = 8 m = n = 8 CCDF 10−3 CCDF 10−3

10−4 10−4

10−5 10−5 0 10 20 30 40 50 60 0 10 20 30 40 50 60 70 Number of Iterations Number of Iterations

[FIG4] The CCDF of the number of LLL iterations when reducing [FIG5] The CCDF of the number of Seysen iterations when n 3 m random bases with i.i.d. complex Gaussian elements with reducing n 3 m random bases with i.i.d. complex Gaussian zero mean and unit variance. elements with zero mean and unit variance.

number of LLL iterations vanishes exponentially fast, and also that the number of iterations increases with the dimension of the basis. The number of Seysen iterations shows a remarkable similarity to the LLL case; this observation has yet to be confirmed analytically.

TOPICS FOR FUTURE RESEARCH
There are a number of open problems concerning the complexity of lattice reduction. One is the lack of analytical results regarding the complexity of Seysen's and Brun's algorithms. Another important open problem is the development of better (probabilistic) bounds on the complexity of the LLL algorithm. In essence, bounds of the form (11) and (12) refer to the worst-case complexity over some set of matrices, e.g., a set of matrices with bounded condition number. However, as noted previously, given a specific condition number there are LLL-reduced bases with this condition number. This means that for some matrices the bound in (12) is overly pessimistic. Numerical evidence also indicates that the average complexity is significantly lower than what is suggested by currently available bounds.

LATTICE REDUCTION FOR DATA DETECTION
The application of lattice reduction for efficient near-optimum data detection in MIMO wireless systems has attracted a lot of attention over recent years (see "MIMO Wireless" and, e.g., [15], [16], [18], [20], [21], [23]–[36], [50], [51], and [59]). Here, parallel data streams are transmitted using multiple antennas to increase the spectral efficiency at the cost of increased complexity for data detection at the receiver.

Optimal ML detection achieving full diversity can be performed using sphere decoding [4], [5], [60], whose complexity has been analyzed, e.g., in [54], [61], and [62]. Less complex suboptimal detection schemes are abundant but mostly do not achieve full diversity. Interestingly though, suboptimal detectors augmented with lattice reduction perform close to optimal and have the potential to achieve full diversity. This is of practical interest since lattice-reduction-aided detection also has two major complexity advantages over sphere decoding: 1) lattice reduction complexity is low (LLL has polynomial average complexity as opposed to the exponentially growing average complexity of sphere decoding) and 2) with lattice reduction, the main computational burden accrues only when the channel changes, whereas with sphere decoding the computationally expensive steps have to be performed for each new data vector.

SYSTEM MODEL AND BACKGROUND
In the following, we consider a spatial multiplexing MIMO system with M transmit and N ≥ M receive antennas as shown in Figure 6. Here, the lattice concepts and lattice reduction algorithms apply with n = N and m = M. At the transmitter, the data is demultiplexed into M parallel data streams that are mapped to symbols from the QAM alphabet S ⊂ Z_j and simultaneously transmitted over the M antennas. For QAM constellations, the condition S ⊂ Z_j can easily be satisfied by an appropriate combination of scaling and translation. Consider S = {-(1+j)/√2, (1-j)/√2, -(1-j)/√2, (1+j)/√2}, i.e., power-normalized 4-QAM. By adding (1+j)/√2 and then multiplying by 1/√2, this constellation is transformed to {0, 1, j, 1+j} ⊂ Z_j. The fixed additive offset can be moved into the observation, whereas the multiplicative scaling can be incorporated into the channel (see, e.g., [16] and [50] for details). We consider one time slot of the discrete-time complex MIMO baseband model. Let s denote the complex-valued transmitted data vector with i.i.d. elements, each of which has power normalized to one. The corresponding received vector y is given by (see, e.g., [17])

y = H s + w    (13)

with the N × M (tall) channel matrix H and the noise vector w. The noise is assumed to be spatially white complex Gaussian with variance σ_w^2 (in the case of spatially correlated noise, a spatial whitening filter can be used to obtain an equivalent model with white noise).

[FIG6] Block diagram for a MIMO spatial multiplexing system.

CONVENTIONAL MIMO DETECTORS
The ML detector for detecting the transmitted data vector based on (13) amounts to

ŝ_ML = arg min_{s ∈ S^M} ||y - H s||^2,    (14)

where S^M denotes the set of possible data vectors. For a reasonable number of antennas and modulation order, solving this closest vector problem via exhaustive search is typically not feasible. Consequently, various more efficient optimum and suboptimum data detection algorithms have been proposed in the literature. In particular, this includes efficient ML implementations based on sphere decoding (see, e.g., [4] and [60]) and low-complexity suboptimum algorithms based on linear equalization and successive interference cancelation (SIC) (see, e.g., [17]).

With linear equalization, the received vector is first passed through a linear spatial filter and the resulting filter output is then quantized (sliced) component-wise with respect to the symbol alphabet S. For linear ZF equalization, the filter is designed to completely suppress spatial interference in the filter output signal. To this end, the received vector is multiplied with the (left) Moore-Penrose pseudoinverse of the channel matrix, H^+ = (H^H H)^{-1} H^H, yielding

s̃_ZF = H^+ y = s + H^+ w.    (15)
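As a concrete illustration of (13)-(15), the following minimal NumPy sketch simulates one time slot and performs ZF detection by equalization and component-wise slicing. It is an illustrative sketch only (not the supplementary MATLAB code mentioned later in this article); all function and variable names are our own.

import numpy as np

def zf_detect(H, y, alphabet):
    """ZF detection: equalize with the left pseudoinverse, cf. (15), then slice."""
    s_zf = np.linalg.pinv(H) @ y                     # unconstrained LS estimate H^+ y
    # component-wise slicing to the closest constellation point
    return np.array([alphabet[np.argmin(np.abs(alphabet - v))] for v in s_zf])

# Toy example: M = N = 4 antennas, power-normalized 4-QAM, i.i.d. Gaussian channel.
rng = np.random.default_rng(0)
M = N = 4
alphabet = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)
s = rng.choice(alphabet, size=M)                     # transmitted data vector
H = (rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))) / np.sqrt(2)
w = 0.05 * (rng.standard_normal(N) + 1j * rng.standard_normal(N))
y = H @ s + w                                        # received vector, cf. (13)
print(zf_detect(H, y, alphabet))

Exhaustive ML search according to (14) would instead enumerate all |S|^M candidate vectors, which quickly becomes infeasible as M and the constellation size grow.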


[FIG7] Block diagram for a MIMO spatial multiplexing system with lattice-reduction-aided data detection.

Note that this is the unconstrained least squares solution of (14). ZF detection is optimal (i.e., equivalent to ML) for orthogonal channel matrices but otherwise suffers from noise enhancement. An alternative to ZF equalization with improved performance is the minimum mean-square error (MMSE) equalizer, which minimizes the overall error consisting of residual spatial interference and noise at the filter output. As shown in [63] and [64], MMSE detection is mathematically equivalent to ZF detection with respect to an extended system model with the (N + M) × M channel matrix H̄ = [H; σ_w I] and the length-(N + M) receive vector ȳ = [y; 0], i.e.,

s̃_MMSE = H̄^+ ȳ = (H^H H + σ_w^2 I)^{-1} H^H y.

Even better performance can be achieved by nonlinear detection schemes such as SIC (or decision-feedback) detection [17], [64], [65]. With these methods, data symbols are detected successively by canceling the effect of previously detected symbols. Hence, the order in which the symbols are detected affects the performance and can be optimized. For example, the V-BLAST detection ordering algorithm [65] finds the ordering that leads to the maximum postequalization SNR at each detection step. Several computationally efficient algorithms have been proposed for this task in the literature, e.g., [63]–[66]. During SIC detection, the yet undetected data symbols have to be suppressed ("nulled"), which can be done using either a ZF or an MMSE approach (see [64] for more details).

LATTICE-REDUCTION-AIDED DETECTION
For common suboptimal detection approaches like linear equalization and SIC detection (see the previous section), the performance strongly depends on the specific channel realization (see, e.g., [50] and [67]). For example, ZF detection is optimal if the channel realization H happens to be orthogonal but results in poor performance otherwise. It is therefore natural to try to apply the detector not directly to H, but rather to a transformed system model with a "more orthogonal" channel matrix H̃ [16], [50], which can be obtained with lattice reduction. To this end, the channel matrix H is interpreted as a basis for the discrete lattice of noise-free received vectors given by Hs. To determine a corresponding reduced basis H̃ with better properties, one can use one of the lattice reduction algorithms described in the section "Lattice Reduction Techniques" with B = H. Since H is in general complex-valued, we note that either the corresponding lattice reduction algorithm has to be adapted to the complex-valued case or the equivalent real-valued system model has to be used (cf. the section "Complex-Valued Lattices"). Figure 7 illustrates the application of lattice reduction to data detection.

We first rewrite the system model (13) as

y = H T T^{-1} s + w = H̃ z + w,    (16)

where H̃ = HT and z = T^{-1} s. Assuming for the moment that the symbol alphabet is given by the complex integers, i.e., S = Z_j, it follows from the unimodularity of T that z ∈ Z_j^M. Note that Hs and H̃z describe the same lattice point, but the reduced matrix H̃ is more orthogonal than the original channel matrix H. Thus, equalizing with the pseudoinverse H̃^+ leads to z̃_ZF = H̃^+ y = z + H̃^+ w, which suffers less noise amplification than conventional equalization according to (15). Consequently, a quantization based on z̃_ZF is more reliable than that based on s̃_ZF. Denoting the quantization result obtained with the reduced channel H̃ by ẑ_ZF, the final detection result is obtained as T ẑ_ZF.

For practical systems, the symbol alphabet is a finite subset of the infinite set of integers, i.e., S ⊂ Z_j (recall that we assumed S to be appropriately scaled and translated). Consequently, the domain of the transformed symbols z is also a subset of Z_j^M. Appropriately taking into account the constellation boundary in the transformed domain leads to a shaping problem, which is illustrated in Figure 8 for a two-dimensional real-valued pulse amplitude modulation (PAM) constellation. For the original constellation, the optimum quantizer is equivalent to rounding (quantization with respect to Z_j^M) followed by clipping to enforce the constellation boundary. In contrast, the decision regions for the transformed constellation differ from those of Z_j^M at the constellation boundary. Implementing these decision regions would again be computationally very expensive [68]. A simple but suboptimal alternative consists of the following three steps [16], [68]: 1) quantize (i.e., round) z with respect to Z_j^M; 2) return to the original symbol domain by multiplying with T; and 3) requantize (i.e., clip) the result with respect to S^M (the last step need not be performed if the intermediate result already belongs to the symbol constellation).
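The three-step procedure can be summarized in a short sketch. We assume a routine lll_reduce(H) returning the reduced basis H̃ = HT together with the unimodular matrix T (e.g., a complex-valued LLL implementation, which is not shown here), and a clip function that enforces the constellation boundary; both names are placeholders.

import numpy as np

def lr_aided_zf_detect(H, y, lll_reduce, clip):
    """Lattice-reduction-aided ZF detection: equalize in the reduced basis,
    round, transform back with T, and re-quantize (clip) to the alphabet."""
    H_red, T = lll_reduce(H)                  # assumed: H_red = H @ T, T unimodular
    z_zf = np.linalg.pinv(H_red) @ y          # z_zf = H_red^+ y (less noise enhancement)
    z_hat = np.round(z_zf.real) + 1j * np.round(z_zf.imag)   # 1) quantize w.r.t. Z_j^M
    s_hat = T @ z_hat                         # 2) back to the original symbol domain
    return clip(s_hat)                        # 3) re-quantize (clip) w.r.t. S^M

# Example clip for the translated alphabet {0, 1, j, 1+j} (real/imag parts in {0, 1}):
clip01 = lambda s: np.clip(np.round(s.real), 0, 1) + 1j * np.clip(np.round(s.imag), 0, 1)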

This simple approach entails a noticeable performance loss [68], specifically for small constellation sizes.

[FIG8] Illustration of the shaping problem: (a) original two-dimensional 4-PAM constellation S^2 = {−1, 0, 1, 2}^2 (blue dots) embedded in Z^2; (b) constellation resulting with the unimodular transformation matrix T^{-1} = [1 0; 1 1]. Slicer boundaries are indicated by dashed lines.

The extension of linear lattice-reduction-aided detection according to the MMSE criterion is achieved by applying lattice reduction to the extended channel matrix H̄ = [H; σ_w I] [50], [51]. In addition to improved performance, MMSE-based lattice reduction has a significantly smaller complexity than ZF-based lattice reduction; this can be explained by the fact that (due to the scaled identity matrix at the bottom) the extended channel matrix H̄ is more orthogonal and better conditioned than H. Furthermore, as shown in [69], applying the MMSE criterion also limits the performance loss incurred by quantizing with respect to Z_j instead of S.
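The MMSE extension only changes how the extended channel matrix and receive vector are built before lattice reduction and equalization. A small sketch under the same assumptions as above, with sigma_w denoting the noise standard deviation (names are illustrative):

import numpy as np

def mmse_extended_model(H, y, sigma_w):
    """Extended system model used for MMSE(-based lattice reduction):
    H_bar = [H; sigma_w * I], y_bar = [y; 0]."""
    N, M = H.shape
    H_bar = np.vstack([H, sigma_w * np.eye(M)])
    y_bar = np.concatenate([y, np.zeros(M, dtype=complex)])
    return H_bar, y_bar

# ZF processing with respect to (H_bar, y_bar) is mathematically equivalent to
# MMSE processing for the original model (13); lattice reduction is then simply
# applied to H_bar instead of H.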

To further improve the detection performance, lattice reduction can be combined with SIC detection [50], [51]. The resulting method is equivalent to the algorithm for finding the so-called Babai point [70]. While the reduced channel generally does not fulfill the V-BLAST ordering criterion, such an ordering can be achieved by using a postsorting algorithm [64]. We note that lattice-reduction-aided SIC detection can be interpreted as a generalization of V-BLAST in which not only column swaps but also translation/size-reduction steps are performed.

PERFORMANCE RESULTS
In the following, we compare the performance of various lattice-reduction-aided data detection algorithms. Gauss reduction will not be considered, since it only applies to lattices of rank two. Furthermore, Hermite-Korkine-Zolotareff reduction is not considered as it results only in marginal performance improvements compared to other lattice reduction methods but features a much higher computational complexity [71], [72]. To compare the different combinations, the BER performance is investigated for a MIMO system with M = N = 6 antennas employing 4-QAM. The SNR is given by Eb/N0, with Eb denoting the average receive energy per bit.

Figure 9 shows BER versus Eb/N0 for the ML, ZF, and MMSE detectors along with their lattice-reduction-aided versions. Lattice reduction was achieved using LLL, S2 reduction (referred to as SEY), and DLLL (LLL applied to the dual basis). It can be observed that lattice-reduction-aided linear equalization achieves full diversity order (in this case six), leading to strong performance improvements compared to conventional linear equalization, which achieves a diversity order of only one. For the case of V-BLAST transmission over an i.i.d. Gaussian channel, the property of achieving full diversity was proven for LLL and DLLL in [18] (see also [14] and [19]–[23]). For general channel statistics and space-time mappings, full diversity of lattice-reduction-aided detection based on the LLL algorithm and the MMSE criterion follows from recent results in [69].

Furthermore, we can observe that SEY and DLLL noticeably outperform LLL. This can be explained as follows (cf. [59]). For ZF equalization, the layer with the worst postequalization SNR dominates the overall performance. The postequalization SNR of layer ℓ equals 1/(σ_w^2 ||h_ℓ^w||^2) and is thus inversely proportional to the squared Euclidean length of the dual basis vector h_ℓ^w [59]. Thus, for lattice-reduction-aided linear equalization, the goal is to find a reduced basis for which the longest dual vector is as short as possible, so that the worst postequalization SNR is maximized. Both SEY and DLLL reduce the dual basis and thus achieve a strong reduction of the longest dual basis vector [cf. Figure 3(b)], thereby explaining the performance advantage over LLL-aided linear equalization.

[FIG9] BER versus Eb/N0 for a MIMO system using ZF detection (red) and MMSE detection (blue) and their lattice-reduction-aided variants. ML detection is shown as an ultimate benchmark. Note that the SEY and DLLL variants lie virtually on top of each other. The MIMO system used N = M = 6 antennas, a 4-QAM symbol constellation, and an i.i.d. Gaussian channel.
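The dual-basis argument can also be checked numerically: the rows of the pseudoinverse correspond (up to conjugation) to the dual basis vectors, so the worst-layer SNR is governed by the longest row of H^+. The following illustrative sketch assumes unit symbol power, as in the system model above.

import numpy as np

def post_equalization_snrs(H, sigma_w2):
    """ZF post-equalization SNRs per layer: 1 / (sigma_w^2 * ||l-th row of H^+||^2)."""
    H_pinv = np.linalg.pinv(H)
    dual_norms2 = np.sum(np.abs(H_pinv) ** 2, axis=1)   # squared dual basis vector lengths
    return 1.0 / (sigma_w2 * dual_norms2)

# A reduction that shortens the longest dual basis vector (e.g., Seysen reduction or
# LLL applied to the dual basis) therefore raises the worst-layer SNR.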

Figure 10 shows BER versus Eb/N0 for plain SIC detectors based on ZF and MMSE and with V-BLAST ordering (referred to as OSIC) and for their lattice-reduction-aided variants. It can be observed that lattice reduction significantly improves the performance of SIC detectors. Furthermore, OSIC-LLL, OSIC-SEY, and OSIC-DLLL perform almost identically. OSIC-LLL offers the advantage that the QR decomposition required for SIC detection can be directly provided by the LLL algorithm.

[FIG10] BER versus Eb/N0 for a MIMO system using optimally ordered SIC (OSIC) based on ZF (red) and MMSE (blue) and their lattice-reduction-aided variants (all of which lie practically on top of each other). ML detection is shown as an ultimate benchmark. The MIMO system used N = M = 6 antennas, a 4-QAM symbol constellation, and an i.i.d. Gaussian channel.

TOPICS FOR FUTURE RESEARCH
The application of lattice reduction to MIMO detection is an active research area with several interesting open problems. Specifically, the impact of imperfect channel state information on lattice-reduction-aided detection is little understood. Numerical evidence (e.g., [50]) suggests that lattice reduction gains are preserved with imperfect channel knowledge. However, there are neither analytical performance results for this case nor specifically tailored detector designs.

Another topic of high practical relevance is the extension of lattice-reduction-aided hard-output detectors (as discussed in this article) to the soft-output case. This is relevant for coded MIMO systems where iterative detection and decoding is enabled by exchanging soft information (e.g., log-likelihood ratios). Apart from list-based approaches (e.g., [73]–[75]), extensions into this direction seem difficult since the required quantization has to be performed in a transformed domain, where the information about the bit labels is not explicit.

Furthermore, no analytical performance results along the lines of [18], [20], and [21] (i.e., diversity order and SNR gap for LLL) are known for Seysen's lattice reduction algorithm. Finally, it would be very important to develop efficient hardware architectures for the various lattice-reduction-aided data detection algorithms (first results in this direction have been reported in [25] and [26]).

LATTICE REDUCTION FOR PRECODING
Another promising application of lattice reduction in wireless communications is efficient precoding. Precoding is used to realize MIMO gains in the following scenarios: 1) the receive antennas are not colocated (e.g., if they belong to distinct users) so that no spatial receiver processing is possible and 2) as much computational complexity as possible shall be shifted from the receiver to the transmitter (base station). The main prerequisite for precoding is channel state information at the transmitter.

We will focus on precoding schemes that consist of pre-equalization in conjunction with a perturbation of the data vector to minimize the transmit power [76]–[78]. Such vector perturbation (VP) precoding schemes enable simple per-antenna (or per-user) receiver processing.

As with data detection (see the section "Lattice Reduction for Data Detection"), the problem of finding the optimum (in the sense of minimizing the transmit power) perturbation vector is equivalent to a closest lattice point problem [cf. (14)], which can be solved using sphere-decoding approaches (in the precoding context, this is often referred to as sphere-encoding [77]). However, since this optimum approach may be computationally too expensive, several efficient approximate (i.e., suboptimum) VP techniques have been developed in the literature. Examples include linear precoding, Tomlinson-Harashima precoding (THP) [77], [78], and variants involving lattice reduction [16], [79]. Recently, it was shown that approximate VP preceded by lattice reduction using the LLL algorithm can achieve full diversity [18].

SYSTEM MODEL AND BASIC APPROACH
We again consider the linear input/output relation (13), where the transmitter is equipped with M transmit antennas and there are N single-antenna users (or a single user equipped with N receive antennas). However, in contrast to the section "Lattice Reduction for Data Detection," we now assume N ≤ M (i.e., there are more transmit antennas than receive antennas/users and the channel H is a fat matrix). In this context, the discussion from the sections "Point Lattices" and "Lattice Reduction Techniques" applies with n = M and m = N to the lattice generated by the M × N right pseudoinverse P = H^H (H H^H)^{-1} of the channel H. Note that this right pseudoinverse P is different from the left pseudoinverse H^+ used in the context of MIMO detection. At each time instant, the base station transmits N data symbols. The ℓth data symbol d_ℓ ∈ S is intended for user (or receive antenna) ℓ. With perfect channel state information at the transmitter, interference-free transmission to each user is achieved by using ZF precoding/pre-equalization (e.g., [76]). For simplicity, we do not consider MMSE-based precoding (e.g., [76]), even though it generally performs better than ZF precoding.

Here the transmit vector is obtained by multiplying the data vector d = (d_1 ... d_N)^T with the pseudoinverse P. The main problem of this linear ZF precoding scheme lies in the fact that the transmit power ||P d||^2 of the pre-equalized signal can become very large.
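A minimal sketch of this linear ZF precoder (function and variable names are illustrative):

import numpy as np

def zf_precode(H, d):
    """Linear ZF precoding with the right pseudoinverse P = H^H (H H^H)^{-1}
    of the fat N x M channel H; returns the transmit vector s = P d and ||P d||^2."""
    P = H.conj().T @ np.linalg.inv(H @ H.conj().T)
    s = P @ d
    return s, np.linalg.norm(s) ** 2

# Each user l then observes y_l = d_l + w_l (interference-free reception), but the
# transmit power ||P d||^2 can be very large when H is ill conditioned.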


[FIG11] Block diagram for a MIMO precoding system using VP.

This happens specifically if H is poorly conditioned and d is oriented along a channel singular vector associated with a small singular value. This power enhancement can be combatted by VP [77] (see Figure 11 for a block diagram). With VP, the transmit vector is formed by adding a (scaled) integer perturbation vector z ∈ Z_j^N to the data vector before pre-equalization, i.e., s = P(d + τz), with τ a real-valued constant. This allows the data vector to be reoriented into a more favorable signal space direction. VP amounts to using a periodically extended symbol constellation S + τZ_j^N in which all possible perturbed versions of a symbol vector constitute an equivalence class that actually carries the information (see Figure 12 for an illustration). The real-valued constant τ is chosen such that all translates of the symbol alphabet are nonoverlapping. The integer perturbation vector z ∈ Z_j^N is designed to minimize the transmit power, i.e.,

z_opt = arg min_{z ∈ Z_j^N} ||P(d + τ z)||^2.    (17)

The receive vector y = Hs + w = d + τz + w is free of spatial interference and hence allows for per-antenna detection based on y_ℓ = d_ℓ + τ z_ℓ + w_ℓ. The perturbation can be removed by subjecting the receive values to a modulo-τ operation (i.e., shifting the translated constellation back to the origin). Data detection then amounts to slicing y_ℓ mod τ with respect to the constellation S. We note that VP precoding has been shown to achieve the full diversity order of M.

Finding the optimum perturbation vector according to (17) amounts to a closest vector problem that can be solved by sphere-encoding [77]. In fact, we look for z ∈ Z_j^N such that τPz is closest in Euclidean distance to the vector −Pd. This problem is very similar to the ML detection problem (just replace H, s, and y in (14) with −τP, z, and Pd, respectively), with the difference that the perturbation vector z comes from the infinite lattice Z_j^N, whereas the symbol vector s is taken from the finite constellation S^M.

Several suboptimal precoding techniques can be interpreted as approximations to (17). Specifically, (17) can be solved approximately by first computing the (unconstrained) least-squares solution by applying the pseudoinverse of −τP to Pd and then rounding the result to the nearest integer. This leads to the perturbation vector

z_ZF = ⌊(−τP)^+ P d⌉ = ⌊−(1/τ) d⌉ = 0,    (18)

where ⌊·⌉ denotes component-wise rounding (the last equality follows from the fact that the real part and the imaginary part of d_ℓ/τ are less than 1/2 in magnitude). We thus reobtain plain ZF precoding, i.e., s_ZF = Pd. The diversity order achieved by ZF precoding is only M − N + 1.
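The VP transmit and per-user receive operations described above can be sketched as follows. How the perturbation vector z is obtained (exactly via sphere-encoding or by one of the approximations discussed in this section) is deliberately left abstract; the modulo step assumes that the translated constellation lies in [0, τ) on the real and imaginary axes, and all names are illustrative.

import numpy as np

def vp_transmit(P, d, z, tau):
    """Vector-perturbation precoding: s = P (d + tau * z), cf. (17)."""
    return P @ (d + tau * z)

def vp_receive_symbol(y_l, tau, slicer):
    """Per-user detection: remove the perturbation with a modulo-tau operation,
    then slice with respect to the constellation S (slicer is a placeholder)."""
    folded = (y_l.real % tau) + 1j * (y_l.imag % tau)   # shift the translates back to the origin
    return slicer(folded)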

Alternatively, using a decision feedback approach (similar to SIC detection) for finding the elements of the perturbation vector z leads to nonlinear THP [78]. THP can be interpreted as a successive optimization of the elements of the perturbation vector instead of a joint optimization. It performs better than ZF precoding, but its diversity order also equals only M − N + 1.

[FIG12] Illustration of the main idea of VP for the real-valued constellation S^2 = {−1, 1}^2 (shown in blue). The periodic extension of S^2 is obtained with τ = 4 (boundaries are shown as dashed lines). The equivalence class for the symbol vector (1, 1)^T is indicated with stars; this equivalence class contains the vectors (1, 1)^T + τ(−k, k)^T = (1 − 4k, 1 + 4k)^T, which become increasingly orthogonal to (1, 1)^T as |k| increases.

LATTICE-REDUCTION-AIDED PRECODING
As with lattice-reduction-aided data detection (see the section "Lattice Reduction for Data Detection"), using lattice reduction for precoding is motivated by the fact that optimum and good approximate choices of the integer perturbation can be found much more efficiently if the lattice basis that appears in the closest vector problem (17) is more orthogonal. In contrast to lattice-reduction-aided data detection, lattice reduction for precoding does not suffer from the shaping problem, i.e., the relaxation from a finite to an infinite lattice. Hence, in the context of precoding, lattice reduction itself does not imply any performance loss (cf. [68]).

We now briefly review the basic concepts of lattice-reduction-aided precoding [16], [18], [79]. In the following, the relation between the original precoding matrix P and the corresponding reduced matrix P̃ is given by P̃ = PT, where T is a unimodular transformation matrix obtained by a certain lattice reduction algorithm (see the section "Lattice Reduction Techniques" with B = P). The cost function in (17) can be rewritten in terms of the reduced matrix P̃ as

||P(d + τ z)||^2 = ||P T T^{-1}(d + τ z)||^2 = ||P̃(d̃ + τ z̃)||^2,    (19)

where d̃ = T^{-1} d and z̃ = T^{-1} z. The transformed perturbation vector z̃ is still integer-valued since T is unimodular. Hence, we arrive at a cost function that is completely equivalent to that used in the original formulation (17). Consequently, the optimum perturbation vector in (17) can also be found by first minimizing the lattice-reduction-transformed cost function (19) over z̃ and then applying the transformation T to the corresponding result z̃_opt, i.e., z_opt = T z̃_opt. Clearly, lattice reduction itself does not imply any loss of optimality. However, the optimum solution can be found more efficiently via sphere-encoding since P̃ is more orthogonal than P. Furthermore, using conventional approximation techniques (like linear ZF or nonlinear THP) results in a better approximation quality when preceded by lattice reduction.

As an example, let us consider lattice-reduction-aided ZF precoding. Here, the reduced basis P̃ is used to compute the (unconstrained) least squares solution of (19) followed by a rounding operation [cf. (18)],

z̃_ZF = ⌊(−τ P̃)^+ P̃ d̃⌉ = ⌊−(1/τ) d̃⌉.

In general, z̃_ZF ≠ 0 since d̃ = T^{-1} d is an integer linear combination of data symbols and thus can have elements that are larger than τ. Finally, the perturbation vector obtained by lattice-reduction-aided ZF precoding is given by T z̃_ZF. If T is obtained by the LLL algorithm (see the section "LLL Reduction"), this approach can be shown to achieve the full diversity order of M [18].
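A compact sketch of lattice-reduction-aided ZF precoding as just described, assuming a reduction routine reduce_basis(P) (e.g., an LLL implementation) that returns P̃ = PT and the unimodular T; the names are illustrative only.

import numpy as np

def lr_zf_perturbation(P, d, tau, reduce_basis):
    """LR-aided ZF choice of the perturbation vector: round the unconstrained
    least squares solution of (19) in the reduced basis, then map back with T."""
    P_red, T = reduce_basis(P)                # assumed: P_red = P @ T, T unimodular
    d_red = np.linalg.solve(T, d)             # d_tilde = T^{-1} d
    z_red = -d_red / tau                      # (-tau P_red)^+ P_red d_tilde = -d_tilde / tau
    z_red = np.round(z_red.real) + 1j * np.round(z_red.imag)   # round to Gaussian integers
    return T @ z_red                          # perturbation vector T z_tilde_ZF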

PERFORMANCE
Figure 13 illustrates the symbol-error-rate (SER) versus SNR performance for the various precoding schemes with M = 8 transmit antennas, N = 8 users, 4-QAM symbols, and i.i.d. Gaussian channels. We compared linear ZF precoding and nonlinear THP with and without lattice reduction. We considered the LLL algorithm and Brun's algorithm for lattice reduction. The corresponding lattice-reduction-aided precoding schemes are referred to as ZF-LLL, THP-LLL, ZF-Brun, and THP-Brun. As a performance benchmark, we also considered the exact (optimum) solution of (17) via sphere-encoding. We can observe that lattice reduction significantly improves the performance of the approximate (i.e., ZF and THP) precoding schemes. With lattice reduction using the LLL algorithm, close to optimum performance is achieved and the full diversity order (in this case, M = 8) is obtained. The lattice-reduction-aided precoding schemes based on the simple Brun's algorithm suffer from a performance loss as compared to the corresponding schemes assisted by the LLL algorithm, but are still able to achieve a large part of the available diversity. This is in contrast to ZF and THP without lattice reduction, which just achieve a diversity order of one (cf. [76]).

[FIG13] SER versus SNR for ZF precoding (red) and THP (blue) without lattice reduction, with Brun lattice reduction, and with LLL lattice reduction. Optimum VP (green) is shown as a performance benchmark. The results were obtained for N = M = 8 antennas, a 4-QAM symbol constellation, and an i.i.d. Gaussian channel.
BRUN'S ALGORITHM FOR LATTICE-REDUCTION-ASSISTED PRECODING
The use of Brun's algorithm for lattice reduction in the context of precoding in wireless systems has been proposed in [35]. This was motivated by the fact that under the usual i.i.d. Gaussian model, the channel matrix H typically has just one small singular value, associated with the left singular vector u_1 [67], [76]. When reducing P, this means that P^H P = (H H^H)^{-1} has one dominating eigenvalue with associated eigenvector u_1. As explained in the section "Brun's Algorithm," P can be reduced by finding approximate integer relations for u_1 via Brun's algorithm. Full details are provided in [35]. A high-throughput VLSI implementation of Brun's algorithm for lattice-reduction-aided precoding (including fixed-point considerations) can be found in [24].

TOPICS FOR FUTURE RESEARCH
There are various interesting directions for future research dealing with lattice-reduction-aided precoding. In particular, no analytical results on the performance (i.e., diversity order and SNR gap) of lattice-reduction-aided precoding using Brun's algorithm (complementing the analytical findings of [18] for the LLL algorithm) are available. Furthermore, little is known about the computational complexity of sphere-encoding (with and without lattice reduction), which is in contrast to sphere-decoding techniques for data detection, where various results on the complexity of sphere-decoding (without lattice reduction) have been provided in the literature (see [54], [61], and [62]). Indeed, a theoretical underpinning of the complexity savings achieved with lattice reduction for sphere-encoding is completely missing. Here, only numerical complexity results are available (see, e.g., [79] for sphere-encoding aided by the LLL algorithm). Finally, similar to MIMO detection, little is known about lattice-reduction-aided precoding in situations with imperfect channel state information.

PHASE UNWRAPPING
In the previous sections, we have studied applications of lattice reduction in wireless communications in considerable detail. We next briefly discuss the usefulness of lattice reduction for a signal processing application outside MIMO wireless. Specifically, we focus on the phase unwrapping problem, which was used to formulate lattice-reduction-based estimators of the frequency and phase of a single sinusoid [10] and of the parameters of polynomial-phase signals [80]. In line with the suggestions in [10, Sec. 7], we discuss a more general setup with a nonlinear phase and nonuniformly spaced samples. This case has not been dealt with explicitly up to now and is intended to stimulate further research. The model we consider is potentially useful in applications like radar, sonar, geophysics, biomedical signal processing, sensor networks, and communications.

Consider the complex discrete-time signal x[t] = e^{jψ[t]} + w[t], where ψ[t] is a phase function and w[t] is additive noise. The phase function is modeled using a basis expansion, i.e.,

ψ[t] = Σ_{k=1}^{K} u_k[t] θ_k,

where {u_k[t]}_{k=1}^{K} is a prescribed set of linearly independent basis functions and θ = (θ_1 ... θ_K)^T is an unknown parameter vector that we want to estimate.

We assume that we are given measurements of the wrapped phase φ[t] ∈ [0, 2π) of x[t] at L irregularly spaced time instants t_1, ..., t_L, i.e., φ[t_ℓ] = (Σ_{k=1}^{K} u_k[t_ℓ] θ_k + ε[t_ℓ]) mod 2π; here, ε[t] denotes the phase error caused by the additive noise w[t]. The case of polynomial phase signals with uniform sampling (t_ℓ = ℓ) considered in [80] is obtained with the model u_k[t_ℓ] = ℓ^{k-1} (frequency and phase estimation for a single sinusoid amounts to the special case K = 2 [10]).

By defining the length-L vectors φ = (φ[t_1] ... φ[t_L])^T, u_k = (u_k[t_1] ... u_k[t_L])^T, and ε = (ε[t_1] ... ε[t_L])^T, and the L × K matrix U = (u_1 ... u_K), we obtain the equivalent matrix-vector model

φ = Uθ + ε mod 2π.

[FIG14] Illustration of the phase unwrapping problem with unwrapped phase (blue), wrapped phase (gray), and integer unwrapping function (red); sampling instants are marked with dots.

If we had the unwrapped phase φ̃ = Uθ + ε at our disposal, the parameter vector θ could be estimated according to a simple least squares approach. The wrapped and unwrapped phase are related as φ̃ = φ − 2πz, where the integer vector z ∈ Z^L models the unknown unwrapping (see Figure 14). We use an extended least squares approach to jointly estimate θ and z according to

(θ̂, ẑ) = arg min_{θ ∈ R^K, z ∈ Z^L} ||φ − 2πz − Uθ||^2.    (20)

For any given unwrapping vector z, the associated parameter estimate equals

θ̂(z) = U^+ (φ − 2πz).    (21)

Here, U^+ denotes the (left) pseudoinverse of U. By inserting θ̂(z) into the cost function (20), we obtain after some algebra

ẑ = arg min_{z ∈ Z^L} ||P_U^⊥ (φ − 2πz)||^2.    (22)

Here, P_U^⊥ = I − UU^+ denotes the rank-m (m = L − K) orthogonal projection matrix onto the orthogonal complement of the column span of U. The final parameter estimate is obtained as θ̂(ẑ) according to (21), with ẑ the solution to (22).

The integer least squares problem (22) is recognized to be a closest vector problem with respect to the m-dimensional lattice induced by P_U^⊥, similar to the MIMO detection and precoding problems. Hence, all lattice-reduction-aided approximate solution techniques discussed in the MIMO context can be applied to the phase unwrapping problem as well. The main difference is that with MIMO the lattice is determined by the random channel matrix, whereas here it depends on the underlying basis {u_k[t]}_{k=1}^{K} and the sampling instants t_1, ..., t_L. This difference prompts several questions regarding the performance and complexity of lattice-reduction-aided estimation of the parameter vector θ, which are left for future research. Furthermore, lattice reduction algorithms are particularly interesting in the context of phase unwrapping since here the lattice dimension grows linearly with the number of sampling points, which in practice can be much larger than the lattice dimension (i.e., number of antennas) for MIMO.
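To make the estimator (20)–(22) concrete, the following sketch builds the projection P_U^⊥ and evaluates the integer least squares cost for a set of candidate unwrapping vectors; the integer search itself (exact sphere decoding or a lattice-reduction-aided approximation) is deliberately left abstract, and all names are illustrative.

import numpy as np

def phase_unwrap_ls(phi, U, candidate_zs):
    """Joint estimation of theta and z according to (20)-(22): evaluate the
    projected cost (22) for each candidate z and return the best theta via (21)."""
    U_pinv = np.linalg.pinv(U)
    P_perp = np.eye(U.shape[0]) - U @ U_pinv                      # P_U^perp = I - U U^+
    costs = [np.linalg.norm(P_perp @ (phi - 2 * np.pi * z)) ** 2 for z in candidate_zs]
    z_hat = candidate_zs[int(np.argmin(costs))]
    theta_hat = U_pinv @ (phi - 2 * np.pi * z_hat)                # parameter estimate (21)
    return theta_hat, z_hat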

CONCLUSIONS
In this article, we provided a survey of lattice reduction techniques and their application to wireless communications and parameter estimation. After reviewing the basic concepts of lattices, we described the various lattice reduction algorithms that have been proposed in the literature for solving classical problems in lattice theory (such as the shortest vector problem). MATLAB code for some of the lattice reduction algorithms has been made available as supplementary material in IEEE Xplore (http://ieeexplore.ieee.org). We then discussed how the lattice reduction principle can be applied to simplify the detection and precoding problem in wireless communications with emphasis on multiple antenna systems. These communications problems were complemented by a discussion of the usefulness of lattice reduction algorithms in parameter estimation problems involving phase unwrapping. In all of these applications, the fundamental approach is as follows: 1) use a lattice reduction algorithm to determine an improved (i.e., "more orthogonal") basis for the lattice of interest; 2) solve the given problem with respect to the reduced basis; and 3) transform the solution back to the original domain.

One of the key results in the literature showed that in certain setups lattice reduction using the LLL algorithm allows suboptimum detection and precoding techniques to achieve full diversity. By means of numerical results we demonstrated that the various lattice reduction approaches (such as LLL, dual LLL, and Seysen) offer different advantages depending on the specific performance target (such as the orthogonality defect of the lattice basis or the error rate of a subsequent data detector). In particular, we showed that suboptimum linear and nonlinear detection and precoding schemes aided by lattice reduction are able to achieve excellent error rate performance.

We reviewed the known analytical results about the complexity of the various lattice reduction algorithms, which were complemented by numerical complexity assessments. Here, one of the main results is that for lattices resulting from i.i.d. Gaussian channels (a model often used in wireless communications) the LLL algorithm has an average complexity that scales polynomially with the lattice dimension. This is in contrast to optimum detection and precoding approaches (e.g., sphere decoding), which feature an expected complexity that scales exponentially with the lattice dimension. We conclude that the tools provided by lattice theory are very powerful and at the same time easily applied to simplify the hard detection and precoding problems in wireless communications. The practical feasibility and importance of lattice reduction is corroborated by recent hardware implementations of various lattice reduction algorithms.

Lattice reduction and its application to wireless communications remains an active research area with several interesting open problems, e.g., the analysis of performance and complexity of lattice reduction in specific applications and practical hardware implementations. Our hope is that this survey article provides a convenient entry point to this exciting field and motivates a larger number of researchers to apply lattice reduction algorithms to a wider class of signal processing problems.

ACKNOWLEDGMENTS
We thank N. Sidiropoulos and the anonymous reviewers for their careful and critical reading and for their numerous comments and suggestions, which resulted in a significant improvement of this article. This work was supported in part by the STREP project MASCOT (IST-026905) within the Sixth Framework of the European Commission, by grant S10606 of the Austrian Science Fund (FWF), by grant ICA08-0046 of the Swedish Foundation for Strategic Research (SSF), and by the FP7 grant 228044 of the European Research Council (ERC).

AUTHORS
Dirk Wübben ([email protected]) received the Dipl.-Ing. (FH) degree in electrical engineering from the University of Applied Science Münster, Germany, in 1998, and the Dipl.-Ing. (Uni) degree and the Dr.-Ing. degree in electrical engineering from the University of Bremen, Germany, in 2000 and 2005, respectively. He was a visiting student at the Daimler Benz Research Departments in Palo Alto, California, in 1997, and in Stuttgart, Germany, in 1998.
From 1998 to 1999 he was with the specific performance target (such as the orthogonality the Research and Development Center of Nokia Networks, defect of the lattice basis or the error rate of a subsequent Düsseldorf, Germany. In 2001, he joined the Department of data detector). In particular, we showed that suboptimum lin- Communications Engineering, University of Bremen, Germany, ear and nonlinear detection and precoding schemes aided by where he is currently a senior researcher and lecturer. His lattice reduction are able to achieve excellent error rate per- research interests include wireless communications, signal pro- formance. cessing for multiple antenna systems, cooperative communica- We reviewed the known analytical results about the complexi- tion systems, and channel coding. He was a technical program ty of the various lattice reduction algorithms that were comple- cochair of the 2010 International ITG Workshop on Smart mented by numerical complexity assessments. Here, one of the Antennas and has been member of the program committee of main results is that for lattices resulting from i.i.d. Gaussian several international conferences. His doctoral dissertation on channels (a model often used in wireless communications) the reduced complexity detection algorithms for MIMO systems LLL algorithm has an average complexity that scales polynomial- received the Bremer Studienpreis in 2006. He is a Member of ly with the lattice dimension. This is in contrast to optimum the IEEE. detection and precoding approaches (e.g., sphere decoding), Dominik Seethaler ([email protected]) re- which feature an expected complexity that scales exponentially ceived the Dipl.-Ing. and Dr. techn. degrees in electrical/ with the lattice dimension. We conclude that the tools provided communication engineering from Vienna University of by lattice theory are very powerful and at the same time easily Technology, Austria, in 2002 and 2006, respectively. From applied to simplify the hard detection and precoding problems in 2002 to 2007, he was a research and teaching assistant at the wireless communications. The practical feasibility and impor- Institute of Communications and Radio Frequency tance of lattice reduction is corroborated by recent hardware Engineering, Vienna University of Technology. From 2007 to

2009, he was a postdoctoral researcher at the Communication Technology Laboratory, ETH Zurich, Switzerland. Since 2009, he has been freelancing in the area of home robotics.
Joakim Jaldén ([email protected]) received the M.Sc. and Ph.D. degrees in electrical engineering from the Royal Institute of Technology (KTH), Stockholm, Sweden, in 2002 and 2007, respectively. From 2007 to 2009, he held a postdoctoral research position at the Vienna University of Technology, Austria. He also studied at Stanford University, California, from 2000 to 2002, and worked at ETH, Zürich, Switzerland, as a visiting researcher, in 2008. In 2009, he joined the Signal Processing Lab within the School of Electrical Engineering at KTH, Stockholm, Sweden, as an assistant professor. He won the IEEE Signal Processing Society's 2006 Young Author Best Paper Award for his work on MIMO communications, and in 2007, he won first prize in the Student Paper Contest at the International Conference on Acoustics, Speech and Signal Processing. He also received the 2009 Ingvar Carlsson Award by the Swedish Foundation for Strategic Research.
Gerald Matz ([email protected]) received the Dipl.-Ing. and Dr. techn. degrees in electrical engineering in 1994 and 2000, respectively, and the Habilitation degree for communication systems in 2004, all from Vienna University of Technology, Austria. Since 1995, he has been with the Institute of Communications and Radio-Frequency Engineering, Vienna University of Technology, where he is currently associate professor. In 2004 and 2005, he was an Erwin Schrödinger Fellow with the Laboratoire des Signaux et Systèmes, Supélec, France. In 2007, he was a guest researcher with the Communication Theory Lab at ETH Zurich, Switzerland. He has directed or actively participated in several research projects funded by the Austrian Science Fund (FWF), the Vienna Science and Technology Fund (WWTF), and the European Union. He has published more than 130 papers in international journals, conference proceedings, and edited books. His research interests include wireless communications, statistical signal processing, and information theory. He serves on the IEEE Signal Processing Society (SPS) Technical Committee on Signal Processing for Communications and Networking and on the IEEE SPS Technical Committee on Signal Processing Theory and Methods. He is an associate editor for IEEE Transactions on Signal Processing. He was an associate editor for IEEE Signal Processing Letters (2004-2008) and for the EURASIP journal Signal Processing (2007-2010). He was technical program cochair of EUSIPCO 2004 and has been on the Technical Program Committee of numerous international conferences. He received the 2006 Kardinal Innitzer Most Promising Young Investigator Award and is a Senior Member of the IEEE.

REFERENCES
[1] R. Zamir, S. Shamai, and U. Erez, "Nested linear/lattice codes for structured multiterminal binning," IEEE Trans. Inform. Theory, vol. 48, no. 6, pp. 1250-1276, June 2002.
[2] A. K. Lenstra, H. W. Lenstra, and L. Lovász, "Factoring polynomials with rational coefficients," Mathematische Annalen, vol. 261, no. 4, pp. 515-534, 1982.
[3] M. Seysen, "Simultaneous reduction of a lattice basis and its reciprocal basis," Combinatorica, vol. 13, no. 3, pp. 363-376, 1993.
[4] E. Agrell, T. Eriksson, A. Vardy, and K. Zeger, "Closest point search in lattices," IEEE Trans. Inform. Theory, vol. 48, no. 8, pp. 2201-2214, Aug. 2002.
[5] U. Fincke and M. Pohst, "Improved methods for calculating vectors of short length in a lattice, including a complexity analysis," Math. Comput., vol. 44, no. 170, pp. 463-471, 1985.
[6] A. Hassibi and S. Boyd, "Integer parameter estimation in linear models with applications to GPS," IEEE Trans. Signal Processing, vol. 46, no. 11, pp. 2938-2952, Nov. 1998.
[7] R. Neelamani, R. G. Baraniuk, and R. de Queiroz, "Compression color space estimation of JPEG images using lattice basis reduction," in Proc. IEEE Int. Conf. Image Processing (ICIP), Thessaloniki, Greece, Oct. 2001, vol. I, pp. 890-893.
[8] I. V. L. Clarkson, "Approximation of linear forms by lattice points with applications to signal processing," Ph.D. dissertation, Australian Nat. Univ., Canberra, Australia, 1997.
[9] I. V. L. Clarkson, S. D. Howard, and I. M. Y. Mareels, "Estimating the period of a pulse train from a set of sparse, noisy measurements," in Proc. Int. Symp. Signal Processing and Its Applications, Aug. 1996, vol. 2, pp. 885-888.
[10] I. V. L. Clarkson, "Frequency estimation, phase unwrapping and the nearest lattice point problem," in Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP), Phoenix, AZ, Mar. 1999, pp. 1609-1612.
[11] W. H. Mow, "Maximum likelihood sequence estimation from the lattice viewpoint," Ph.D. dissertation, The Chinese Univ. of Hong Kong, June 1991.
[12] G. Rekaya, J.-C. Belfiore, and E. Viterbo, "A very efficient lattice reduction tool on fast fading channels," in Proc. IEEE Int. Symp. Information Theory and Its Applications (ISITA), Parma, Italy, Oct. 2004, pp. 714-717.
[13] K. J. Kim and R. A. Iltis, "Joint constrained data detection and channel estimation algorithms for QS-CDMA signals," in Proc. Asilomar Conf. Signals, Systems and Computers, Pacific Grove, CA, 2001, vol. 1, pp. 394-398.
[14] X. Ma, W. Zhang, and A. Swami, "Lattice-reduction aided equalization for OFDM systems," IEEE Trans. Wireless Commun., vol. 8, no. 4, pp. 1608-1613, Apr. 2009.
[15] H. Yao and G. W. Wornell, "Lattice-reduction-aided detectors for MIMO communication systems," in Proc. IEEE Global Communications Conf. (GLOBECOM), Taipei, Taiwan, Nov. 2002.
[16] C. Windpassinger and R. F. H. Fischer, "Low-complexity near-maximum-likelihood detection and precoding for MIMO systems using lattice reduction," in Proc. IEEE Information Theory Workshop (ITW), Paris, France, Mar. 2003, pp. 345-348.
[17] A. Paulraj, R. Nabar, and D. Gore, Introduction to Space-Time Wireless Communications. Cambridge, U.K.: Cambridge Univ. Press, 2003.
[18] M. Taherzadeh, A. Mobasher, and A. K. Khandani, "LLL reduction achieves the receive diversity in MIMO decoding," IEEE Trans. Inform. Theory, vol. 53, no. 12, pp. 4801-4805, Dec. 2007.
[19] M. Taherzadeh, A. Mobasher, and A. K. Khandani, "Communication over MIMO broadcast channels using lattice-basis reduction," IEEE Trans. Inform. Theory, vol. 53, no. 12, pp. 4567-4582, Dec. 2007.
[20] C. Ling, "Towards characterizing the performance of approximate lattice decoding in MIMO communications," in Proc. Int. ITG Conf. Source and Channel Coding (SCC), Munich, Germany, Apr. 2006.
[21] C. Ling, "Approximate lattice decoding: Primal versus dual basis reduction," in Proc. IEEE Int. Symp. Information Theory (ISIT), Seattle, WA, July 2006, pp. 1-5.
[22] J. Jaldén and P. Elia, "LR-aided MMSE lattice decoding is DMT optimal for all approximately universal codes," in Proc. IEEE Int. Symp. Information Theory (ISIT), Seoul, Korea, June 2009, pp. 1263-1267.
[23] X. Ma and W. Zhang, "Performance analysis for MIMO systems with lattice-reduction aided linear equalization," IEEE Trans. Commun. Technol., vol. 56, no. 2, pp. 309-318, Feb. 2008.
[24] A. Burg, D. Seethaler, and G. Matz, "VLSI implementation of a lattice-reduction algorithm for multi-antenna broadcast precoding," in Proc. IEEE Int. Symp. Circuits and Systems (ISCAS), May 2007, pp. 673-676.
[25] B. Gestner, W. Zhang, X. Ma, and D. V. Anderson, "VLSI implementation of a lattice reduction algorithm for low-complexity equalization," in Proc. IEEE Int. Conf. Circuits and Systems for Communications (ICCSC), May 2008, pp. 643-647.
[26] D. Wu, J. Eilert, and D. Liu, "A programmable lattice-reduction aided detector for MIMO-OFDMA," in Proc. IEEE Int. Conf. Circuits and Systems for Communications (ICCSC), Shanghai, China, May 2008, pp. 293-297.
[27] L. G. Barbero, D. L. Milliner, T. Ratnarajah, J. R. Barry, and C. F. N. Cowan, "Rapid prototyping of Clarkson's lattice reduction for MIMO detection," in Proc. IEEE Int. Conf. Communications (ICC), Dresden, Germany, June 2009, pp. 1-5.
[28] L. Bruderer, C. Studer, M. Wenk, D. Seethaler, and A. Burg, "VLSI implementation of a low-complexity LLL lattice reduction algorithm for MIMO detection," in Proc. IEEE Int. Symp. Circuits and Systems (ISCAS), Paris, France, May 2010, pp. 3745-3748.

[29] H. Cohen, A Course in Computational Algebraic Number Theory, 3rd ed. Berlin, Germany: Springer-Verlag, 1996.
[30] J. H. Conway and N. J. A. Sloane, Sphere Packings, Lattices and Groups, 3rd ed. New York: Springer-Verlag, 1998.
[31] N. D. Elkies, "Lattices, linear codes, and invariants, Part I," Notices Amer. Math. Soc., vol. 47, no. 10, pp. 1238-1245, Nov. 2000.
[32] N. D. Elkies, "Lattices, linear codes, and invariants, Part II," Notices Amer. Math. Soc., vol. 47, no. 11, pp. 1382-1391, Dec. 2000.
[33] P. Q. Nguyen and B. Vallée, Eds., The LLL Algorithm: Survey and Applications. Berlin, Germany: Springer-Verlag, 2010.
[34] H. Yao, "Efficient signal, code, and receiver designs for MIMO communication systems," Ph.D. dissertation, Dept. Elect. Eng. and Comp. Sci., Massachusetts Inst. Technol., Cambridge, MA, June 2003.
[35] D. Seethaler and G. Matz, "Efficient vector perturbation in multi-antenna multi-user systems based on approximate integer relations," in Proc. European Signal Processing Conf. (EUSIPCO), Florence, Italy, Sept. 2006.
[36] D. Seethaler, G. Matz, and F. Hlawatsch, "Low-complexity MIMO detection using Seysen's lattice reduction algorithm," in Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP), Honolulu, HI, Apr. 2007, pp. 53-56.
[37] Y. H. Gan, C. Ling, and W. H. Mow, "Complex lattice reduction algorithm for low-complexity full-diversity MIMO detection," IEEE Trans. Signal Processing, vol. 57, no. 7, pp. 2701-2710, July 2009.
[38] H. W. Mow, "Universal lattice decoding: A review and some recent results," in Proc. IEEE Int. Conf. Communications (ICC), Paris, France, June 2004, vol. 5, pp. 2842-2846.
[39] Y. H. Gan and H. W. Mow, "Complex lattice reduction algorithms for low-complexity MIMO detection," in Proc. IEEE Global Communications Conf. (GLOBECOM), St. Louis, MO, Nov. 2005, pp. 2953-2957.
[40] C. Ling and N. Howgrave-Graham, "Effective LLL reduction for lattice decoding," in Proc. IEEE Int. Symp. Information Theory (ISIT), Nice, France, June 2007, pp. 196-200.
[41] H. Minkowski, "Ueber positive quadratische Formen," Journal für die reine und angewandte Mathematik, vol. 1886, no. 99, pp. 1-9, 1886.
[42] H. Minkowski, "Ueber die positiven quadratischen Formen und über kettenbruchähnliche Algorithmen," Journal für die reine und angewandte Mathematik, vol. 1891, no. 107, pp. 278-297, 1891.
[43] H. Minkowski, Geometrie der Zahlen. Leipzig, Germany: Teubner Verlag, 1896.
[44] C. Hermite, "Extraits de lettres de M. Ch. Hermite à M. Jacobi sur différents objects de la théorie des nombres," Journal für die reine und angewandte Mathematik, vol. 1850, no. 40, pp. 279-290, 1850.
[45] A. Korkine and G. Zolotareff, "Sur les formes quadratiques," Mathematische Annalen, vol. 6, no. 3, pp. 366-389, 1873.
[46] C. F. Gauss, Untersuchungen über höhere Arithmetik (Disquisitiones Arithmeticae). Berlin, Germany: Springer-Verlag, 1889.
[47] V. Brun, "En generalisation av kjedebrøken I," Skr. Vidensk. Selsk. Kristiana, Mat. Nat. Klasse, vol. 6, pp. 1-29, 1919.
[48] V. Brun, "En generalisation av kjedebrøken II," Skr. Vidensk. Selsk. Kristiana, Mat. Nat. Klasse, vol. 6, pp. 1-24, 1920.
[49] G. Golub and C. van Loan, Matrix Computations, 2nd ed. London: The John Hopkins Univ. Press, 1993.
[50] D. Wübben, R. Böhnke, V. Kühn, and K. D. Kammeyer, "MMSE-based lattice reduction for near-ML detection of MIMO systems," in Proc. ITG Workshop Smart Antennas (WSA), Munich, Germany, Mar. 2004.
[51] D. Wübben, R. Böhnke, V. Kühn, and K. D. Kammeyer, "Near-maximum-likelihood detection of MIMO systems using MMSE-based lattice reduction," in Proc. IEEE Int. Conf. Communications (ICC), Paris, France, June 2004, vol. 2, pp. 798-802.
[52] B. A. LaMacchia, "Basis reduction algorithms and subset sum problems," Master's thesis, Massachusetts Inst. Technol., May 1991.
[53] M. Ajtai, "The shortest vector problem in L2 is NP-hard for randomized reductions," in Proc. 30th Annu. ACM Symp. Theory of Computing, Dallas, TX, May 1998, pp. 10-19.
[54] J. Jaldén and B. Ottersten, "On the complexity of sphere decoding in digital communications," IEEE Trans. Signal Processing, vol. 53, no. 4, pp. 1474-1484, Apr. 2005.
[55] H. Daudée and B. Vallée, "An upper bound on the average number of iterations of the LLL algorithm," Theor. Comput. Sci., vol. 123, no. 1, pp. 95-115, Jan. 1994.
[56] A. Akhavi, "The optimal LLL algorithm is still polynomial in fixed dimension," Theor. Comput. Sci., vol. 297, no. 1, pp. 3-23, Mar. 2003.
[57] J. Jaldén, D. Seethaler, and G. Matz, "Worst- and average-case complexity of LLL lattice reduction in MIMO wireless systems," in Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP), Las Vegas, NV, Apr. 2008, pp. 2685-2688.
[58] L. G. Barbero, T. Ratnarajah, and C. Cowan, "A comparison of complex lattice reduction algorithms for MIMO detection," in Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP), Las Vegas, NV, Apr. 2008, pp. 2705-2708.
[59] D. Wübben and D. Seethaler, "On the performance of lattice reduction schemes for MIMO data detection," in Proc. Asilomar Conf. Signals, Systems and Computers, Pacific Grove, CA, Nov. 2007, pp. 1534-1538.
[60] A. D. Murugan, H. El Gamal, M. O. Damen, and G. Caire, "A unified framework for tree search decoding: Rediscovering the sequential decoder," IEEE Trans. Inform. Theory, vol. 52, no. 3, pp. 933-953, 2007.
[61] B. Hassibi and H. Vikalo, "On the sphere decoding algorithm I. Expected complexity," IEEE Trans. Signal Processing, vol. 53, no. 8, pp. 2806-2818, Aug. 2005.
[62] H. Vikalo and B. Hassibi, "On the sphere decoding algorithm II. Generalizations, second-order statistics, and applications to communications," IEEE Trans. Signal Processing, vol. 53, no. 8, pp. 2819-2834, Aug. 2005.
[63] B. Hassibi, "An efficient square-root algorithm for BLAST," in Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP), Istanbul, Turkey, June 2000, pp. 5-9.
[64] D. Wübben, R. Böhnke, V. Kühn, and K. D. Kammeyer, "MMSE extension of V-BLAST based on sorted QR decomposition," in Proc. IEEE Vehicular Technology Conf. (VTC), Orlando, FL, Oct. 2003, pp. 508-512.
[65] P. W. Wolniansky, G. J. Foschini, G. D. Golden, and R. A. Valenzuela, "V-BLAST: An architecture for realizing very high data rates over the rich-scattering wireless channel," in Proc. URSI Int. Symp. Signals, Systems and Electronics, Pisa, Italy, Sept. 1998, pp. 295-300.
[66] D. Wübben, R. Böhnke, J. Rinas, V. Kühn, and K. D. Kammeyer, "Efficient algorithm for decoding layered space-time codes," Electron. Lett., vol. 37, no. 22, pp. 1348-1350, Oct. 2001.
[67] H. Artés, D. Seethaler, and F. Hlawatsch, "Efficient detection algorithms for MIMO channels: A geometrical approach to approximate ML detection," IEEE Trans. Signal Processing, vol. 51, no. 11, pp. 2808-2820, Nov. 2003.
[68] C. Studer, D. Seethaler, and H. Bölcskei, "Finite lattice-size effects in MIMO detection," in Proc. Asilomar Conf. Signals, Systems and Computers, Oct. 2008, pp. 2032-2037.
[69] J. Jaldén and P. Elia, "DMT optimality of LR-aided linear decoders for a general class of channels, lattice designs, and system models," IEEE Trans. Inform. Theory, vol. 56, no. 10, pp. 4765-4780, Oct. 2010.
[70] L. Babai, "On Lovász lattice reduction and the nearest lattice point problem," Combinatorica, vol. 6, no. 1, pp. 1-13, 1986.
[71] C. Windpassinger and R. F. H. Fischer, "Optimum and sub-optimum lattice-reduction-aided detection and precoding for MIMO communications," in Proc. Canadian Workshop Information Theory, Waterloo, ON, Canada, May 2003, pp. 88-91.
[72] D. Wübben, "Effiziente Detektionsverfahren für Multilayer-MIMO-Systeme" (in German), Ph.D. dissertation, Universität Bremen, Arbeitsbereich Nachrichtentechnik, Shaker Verlag, Germany, Feb. 2006.
[73] P. Silvola, K. Hooli, and M. Juntti, "Suboptimal soft-output MAP detector with lattice reduction," IEEE Signal Processing Lett., vol. 13, no. 6, pp. 321-324, June 2006.
[74] X.-F. Qi and K. Holt, "A lattice-reduction-aided soft demapper for high-rate coded MIMO-OFDM systems," IEEE Signal Processing Lett., vol. 14, no. 5, pp. 305-308, May 2007.
[75] W. Zhang and X. Ma, "Approaching optimal performance by lattice-reduction aided soft detectors," in Proc. Conf. Information Sciences and Systems (CISS), Baltimore, MD, Mar. 2007, pp. 818-822.
[76] C. B. Peel, B. M. Hochwald, and A. L. Swindlehurst, "A vector-perturbation technique for near-capacity multiantenna multiuser communication—Part I: Channel inversion and regularization," IEEE Trans. Commun., vol. 53, no. 1, pp. 195-202, Jan. 2005.
[77] B. M. Hochwald, C. B. Peel, and A. L. Swindlehurst, "A vector-perturbation technique for near-capacity multiantenna multiuser communication—Part II: Perturbation," IEEE Trans. Commun., vol. 53, no. 3, pp. 537-544, Mar. 2005.
[78] C. Windpassinger, R. F. H. Fischer, T. Vencel, and J. B. Huber, "Precoding in multiantenna and multiuser communications," IEEE Trans. Wireless Commun., vol. 3, no. 4, pp. 1305-1316, July 2004.
[79] C. Windpassinger, R. F. H. Fischer, and J. B. Huber, "Lattice-reduction-aided broadcast precoding," IEEE Trans. Commun., vol. 52, no. 12, pp. 2057-2060, Dec. 2004.
[80] R. G. McKilliam, I. V. L. Clarkson, B. G. Quinn, and B. Moran, "Polynomial-phase estimation, phase unwrapping and the nearest lattice point problem," in Proc. Asilomar Conf. Signals, Systems, and Computers, Pacific Grove, CA, Oct. 2009, pp. 483-485.
[SP]
