
Signal Shaping for Non-Uniform Beamspace Modulated mmWave Hybrid MIMO Communications

Shuaishuai Guo, Member, IEEE, Haixia Zhang, Senior Member, IEEE, Peng Zhang, Member, IEEE, Shuping Dang, Member, IEEE, Chengcheng Xu, and Mohamed-Slim Alouini, Fellow, IEEE

Abstract: This paper investigates adaptive signal shaping methods for millimeter wave (mmWave) multiple-input multiple-output (MIMO) communications based on the maximizing the minimum Euclidean distance (MMED) criterion. In this work, we utilize the indices of analog precoders to carry information and optimize the symbol vector sets used for each analog precoder activation state. Specifically, we first propose a joint optimization based signal shaping (JOSS) approach, in which the symbol vector sets used for all analog precoder activation states are jointly optimized by solving a series of quadratically constrained quadratic programming (QCQP) problems. JOSS exhibits good performance, but at a high computational complexity. To reduce the computational complexity, we then propose a full precoding based signal shaping (FPSS) method and a diagonal precoding based signal shaping (DPSS) method, where the full or diagonal digital precoders for all analog precoder activation states are optimized by solving two small-scale QCQP problems. Simulation results show that the proposed signal shaping methods can provide considerable performance gain in reliability in comparison with existing mmWave transmission solutions.

Index Terms: mmWave MIMO communications, signal shaping, hybrid precoder, beamspace

The work of S. Guo and P. Zhang was supported in part by Young Taishan Scholars, in part by the National Natural Science Foundation of China under Grant 61801266 and in part by the Shandong Natural Science Foundation under Grant ZR2018BF033. The work of C. Xu and H. Zhang was supported by the National Natural Science Foundation of China under Grant No. 61860206005 and No. 61671278 (Corresponding author: Haixia Zhang).
S. Guo, H. Zhang and C. Xu are with Shandong Provincial Key Laboratory of Wireless Communication Technologies and School of Control Science and Engineering, Shandong University, Jinan 250061, China (email: shuaishuai [email protected]; [email protected]; chengchengxu@mail.sdu.edu.cn).
P. Zhang is with the School of Computer Engineering, Weifang University, Weifang 261061, China (e-mail: [email protected]).
S. Dang and M.-S. Alouini are with the Computer, Electrical and Mathematical Science and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia (email: [email protected]; [email protected]).

arXiv:2006.12705v1 [eess.SP] 23 Jun 2020

I. INTRODUCTION

MILLIMETER wave (mmWave) communications are the next frontier for wireless communications. As the signal frequency goes higher, the required antenna size becomes smaller and a large number of antennas can be integrated in a limited area. Owing to the cost and hardware complexity, it is impractical to equip each antenna with a radio frequency (RF) chain. As a result, multiple-input multiple-output (MIMO) systems with reduced RF chains are becoming a new trend for mmWave MIMO communications, where hybrid precoding, which divides the signal processing into analog and digital domains to reduce the number of RF chains, has attracted a lot of attention. Given such an mmWave hybrid MIMO system offering a fixed transmission rate of $n$ bits per channel use (bpcu), we are interested in finding the optimal transmit vector set $\mathcal{X}_N = \{\mathbf{x}_1, \mathbf{x}_2, \cdots, \mathbf{x}_N\}$ ($N = 2^n$) subject to an average power constraint to maximize the minimum Euclidean distance (MMED) among the noise-free received signal vectors. In this work, we also discuss the extension of the proposed signal shaping methods to other optimization criteria, including the minimizing the symbol error rate (MSER) criterion and the maximizing the mutual information (MMI) criterion.

A. Related Work

Transmit vectors of mmWave hybrid MIMO systems are jointly determined by the hybrid precoders and the symbol vectors. All existing precoding and symbol vector optimization approaches can be regarded as signal shaping methods. To summarize, we classify related works into three categories according to the precoding strategy.

1) Best Beamspace Based Signal Shaping (BBSS): In this category, only a single pair of analog and digital precoders is employed at the transmitter to steer the beam to the best beamspace during the transmission in a coherent time slot. The signal shaping can be optimized by carefully designing the hybrid precoders. To do so, [1] proposed an orthogonal matching pursuit (OMP) based precoding in fully-connected hybrid (FCH) mmWave MIMO systems leveraging the channel sparsity. To improve the spectral efficiency (SE), the authors of [2] developed alternating minimization algorithms for the hybrid precoder optimization. [3] and [4] investigated successive interference cancellation (SIC) based precoding in partially-connected hybrid (PCH) mmWave MIMO systems. By trading off the SE and implementation complexity, a hybrid precoder design with dynamic partially-connected MIMO structure was proposed in [5]. Recently, [6] has developed a deep-learning-enabled mmWave massive MIMO framework for effective hybrid precoder optimization, where the hybrid precoders are selected through a training based deep neural network with a substantially reduced complexity. Existing hybrid precoding solutions typically require a large number of high-resolution phase shifters and therefore still suffer from high hardware complexity and power consumption. To address this issue, the authors of [7] employed a limited number of low-resolution phase shifters and an antenna switch network to realize the hybrid precoders. It is worth mentioning that the hybrid precoder solutions that maximize the SE are based on a Gaussian input assumption, so the designs are far from optimal in practical mmWave MIMO communications with finite alphabet inputs [8]. With practical finite alphabet inputs, [9]–[11] have recently developed various effective and efficient hybrid precoding methods to maximize the mutual information, which are referred to as MMI precoding. However, it should be emphasized that the information carrying capability of changing the precoder activation state has not been explored by the BBSS approach, which leaves room for further optimization.

2) Uniform Beamspace Modulation Based Signal Shaping (UBMSS): In this category, the information carrying capability of changing the precoder activation state has been explored by uniformly activating a set of precoders. For example, a receive spatial modulation (RSM) for line-of-sight (LOS) mmWave MIMO communication systems was proposed in [12], where a set of precoders that steer the beams to each receive antenna was adopted. Later, a virtual space modulation (VSM) transmission scheme and hybrid precoder designs were proposed in [13], [14]. Using the sparse scattering nature of the mmWave channel, [15] proposed a spatial scattering modulation (SSM). Relying on the beam index for modulation, the authors of [16] developed a beam index modulation (BIM) scheme and showed its superiority in SE for mmWave communications. Roughly speaking, the above transmission schemes differ only slightly in the employed analog and digital precoders. They are the same in activating each beamspace with equal probability, since the symbol vector sets used for all beamspace activation states are the same. This results in limited performance because different beamspaces corresponding to different channels have inherently different information carrying capabilities. Besides, the employed symbol vector set has not been optimized.

3) Non-Uniform Beamspace Modulation Based Signal Shaping (NUBMSS): Most recently, we proposed a generalized non-uniform beamspace modulation (NUBM) for mmWave communications in [17], where the beamspace is activated more flexibly. In the proposed NUBM scheme, good beamspaces are activated with high probabilities while poor beamspaces are activated with low probabilities. It has been theoretically proven that NUBM proposed in [17] outperforms the best beamspace selection (BBS) approach in terms of SE. It has also been proven in [18] that NUBM is capacity-achieving for MIMO communications subject to a limited number of RF chains. The analysis on SE is based on the Gaussian input assumption. With finite alphabet inputs in practice, the beamspaces can be activated with non-equal probabilities by employing different symbol vector sets for different analog precoder activation states, such as the adaptive modulation schemes studied in [19], [20]. However, the adaptive modulation schemes can only be chosen from a limited number of modulation orders, which are powers of two. How to optimize the input to each beamspace in the complex domain remains unsolved. In this paper, we target this problem.

B. Contributions

The paper attempts to optimize the multi-dimensional symbol vector set for each beamspace activation state in the complex domain. It is an intricate task since the multiple symbol vector set optimization couples the discrete set size optimization and the continuous set entry optimization in the complex domain.

• Firstly, we propose a joint optimization based signal shaping (JOSS) method, where the symbol vector sets used for each analog precoder activation state are optimized. The sizes of the sets are optimized in a recursive way. Given an optimized set size, the optimization of the entries in the sets is formulated as a quadratically constrained quadratic programming (QCQP) problem and can be solved by existing algorithms. JOSS exhibits good performance in reliability, but at a high computational complexity.
• Secondly, to reduce the complexity of JOSS, we then propose a full-precoding based signal shaping (FPSS) method and a diagonal-precoding based signal shaping (DPSS) method. Based on all adaptive modulation candidates, we refine the modulation symbol vector sets with full digital precoders or diagonal digital precoders. In our design, the full/diagonal digital precoders for each analog precoder activation state are different and jointly optimized by solving a small-scale QCQP problem.
• Thirdly, comprehensive comparisons among JOSS, FPSS, and DPSS are made in terms of reliability and computational complexity. To show the superiority of the proposed designs over existing mmWave transmission solutions, we also compare the proposed signal shaping aided NUBM with BBSS and UBMSS in terms of minimum Euclidean distance and symbol error rate (SER).
• Fourthly, we probe into the capability of the proposed signal shaping methods for mmWave hybrid MIMO systems in approaching the fully-digital signal shaping (FDSS) methods for mmWave fully-digital MIMO systems. Moreover, we investigate the impact of channel state information (CSI) estimation errors and hardware impairments on the performance by simulations. We discuss the extension to orthogonal frequency division multiplexing (OFDM) based mmWave broadband MIMO systems. In addition, the impact of the hybrid receiver and a discussion of the implementation challenges are also included.

C. Organization and Notations

The remainder of the paper is organized as follows. Section II describes the system model. Section III formulates the optimization problems. The proposed signal shaping methods are presented in Section IV. Section V discusses the implementation challenges and the extension to other criteria. Section VI presents the simulation results. Section VII concludes the paper.

In this paper, scalars are represented by italic lower-case letters. Boldface upper-case and lower-case letters are used to denote matrices and column vectors. $(\cdot)^T$ and $(\cdot)^H$ stand for the transpose and transpose-conjugate operations, respectively.

$\mathrm{Tr}(\mathbf{A})$ and $\mathrm{rank}(\mathbf{A})$ denote the trace and rank of matrix $\mathbf{A}$, respectively. Furthermore, $\|\mathbf{A}\|_F$ is the Frobenius norm of matrix $\mathbf{A}$ and $\mathrm{diag}(\mathbf{A})$ denotes a vector formed by the diagonal elements of matrix $\mathbf{A}$. For a vector $\mathbf{a}$, $\|\mathbf{a}\|_2$ denotes its $\ell_2$ norm. Moreover, $\mathrm{diag}(\mathbf{a})$ denotes a diagonal matrix whose diagonal elements are assigned by vector $\mathbf{a}$. $\odot$ and $\otimes$ denote the Hadamard and Kronecker products. $\mathbf{I}_N$ indicates the $N \times N$ identity matrix. $\mathbf{0}_n$ and $\mathbf{1}_n$ are $n$-dimensional all-zeros and all-ones vectors, respectively. $\mathcal{CN}(\boldsymbol{\mu}, \boldsymbol{\Sigma})$ denotes a complex Gaussian vector with mean $\boldsymbol{\mu}$ and covariance $\boldsymbol{\Sigma}$. $\mathbb{C}$ denotes the set of complex numbers. $j$ represents the imaginary unit. $\lfloor \cdot \rfloor$ denotes the floor operation. $\binom{n}{m}$ is a binomial coefficient. $\mathcal{U}^{M \times N}$ denotes the set of all $M \times N$-dimensional matrices whose elements have unit magnitude. For a set $\mathcal{A}$, $|\mathcal{A}|$ represents its size. $\log_2(\cdot)$ stands for the logarithmic function of base 2. For clarity, we tabulate all abbreviations in Table I and important notations in Table II.

TABLE I
SUMMARY OF ABBREVIATIONS

| Abbreviation | Full name |
|---|---|
| ADC | analog-to-digital converter |
| AMSS | adaptive modulation-based signal shaping |
| AoAs | angles of arrival |
| AoDs | angles of departure |
| AP | analog precoder |
| AWGN | additive white Gaussian noise |
| BBSS | best beamspace based signal shaping |
| BBS | best beamspace selection |
| BIM | beam index modulation |
| bpcu | bits per channel use |
| CSI | channel state information |
| DAC | digital-to-analog converter |
| DPSS | diagonal-precoding based signal shaping |
| DP | digital precoder |
| EE | energy efficiency |
| FCH | fully connected hybrid |
| FDSS | fully-digital signal shaping |
| FPSS | full-precoding based signal shaping |
| GBM | generalized beamspace modulation |
| JOSS | joint optimization based signal shaping |
| LOS | line-of-sight |
| MIMO | multiple-input multiple-output |
| ML | maximum-likelihood |
| MMED | maximizing the minimum Euclidean distance |
| MMI | maximizing the mutual information |
| mmWave | millimeter wave |
| MSER | minimizing the symbol error rate |
| MRC | maximum ratio combining |
| NUBM | non-uniform beamspace modulation |
| NUBMSS | NUBM based signal shaping |
| OFDM | orthogonal frequency division multiplexing |
| OMP | orthogonal matching pursuit |
| PCH | partially connected hybrid |
| QCQP | quadratically constrained quadratic programming |
| RF | radio frequency |
| RSM | receive spatial modulation |
| SE | spectral efficiency |
| SER | symbol error rate |
| SIC | successive interference cancellation |
| SNR | signal-to-noise ratio |
| SSM | spatial scattering modulation |
| UBMSS | uniform beamspace modulation based signal shaping |
| UPA | uniform planar array |
| VSM | virtual spatial modulation |

TABLE II
SUMMARY OF SYSTEM MODEL NOTATIONS

| Notation | System parameter |
|---|---|
| $\mathbf{f}_t$, $\mathbf{f}_r$ | transmit and receive antenna array response vectors |
| $\{\mathbf{F}_{RF}^k\}$ | analog precoders |
| $\{(\mathbf{F}_{BB})_l^k\}$ | digital precoders |
| $\mathbf{H}$ | channel matrix |
| $K$ | number of candidate analog precoders |
| $m$ | rank of the channel |
| $n$ | transmission rate in bpcu |
| $\mathbf{n}$ | noise vector |
| $N$ | number of transmit vectors |
| $N_t$ | number of transmit antennas |
| $N_r$ | number of receive antennas |
| $N_{RF}$ | number of transmit radio frequency chains |
| $N_{RF}$ | number of receive radio frequency chains |
| $P_s$ | average power constraint for the symbol vectors |
| $P_x$ | average power constraint for the transmit vectors |
| $\{\mathbf{s}_l^k\}$ | symbol vectors |
| $\{\hat{\mathbf{s}}_l^k\}$ | precoded symbol vectors |
| $\{\mathbf{x}_i\}$ | transmit vectors |
| $\mathcal{X}_N$ | transmit vector set |
| $\mathbf{y}$ | receive vector |
| $\{\mathcal{S}_k\}$ | symbol vector sets |
| $\mathcal{Z}$ | set of symbol vector sets |

II. SYSTEM MODEL

We consider a point-to-point mmWave MIMO system, where the transmitter has $N_t$ antennas and the receiver has $N_r$ antennas. Let $\mathbf{x}_i \in \mathbb{C}^{N_t}$ denote the transmitted signal vector. Then, the received signal vector $\mathbf{y} \in \mathbb{C}^{N_r}$ can be presented as

$$\mathbf{y} = \mathbf{H}\mathbf{x}_i + \mathbf{n}, \tag{1}$$

where $\mathbf{n} \in \mathbb{C}^{N_r}$ denotes the additive white Gaussian noise (AWGN) vector with mean zero and variance $\sigma^2$ at the receiver, i.e., $\mathbf{n} \sim \mathcal{CN}(\mathbf{0}_{N_r}, \sigma^2 \mathbf{I}_{N_r})$; $\mathbf{H} \in \mathbb{C}^{N_r \times N_t}$ is the channel matrix between the transmitter and the receiver. Due to the limited number of scatterers in the mmWave propagation environment, the commonly used rich-scattering model at low frequencies is no longer applicable. Here, we adopt the Saleh-Valenzuela model [21], which is given by

$$\mathbf{H} = \frac{1}{\sqrt{L}} \sum_{l=1}^{L} \alpha_l \, \mathbf{f}_r\!\left(\theta_l^r, \phi_l^r\right) \mathbf{f}_t^H\!\left(\theta_l^t, \phi_l^t\right), \tag{2}$$

where $L$ represents the number of effective propagation paths, and $\alpha_l$ is the channel coefficient of the $l$-th path. $\theta_l^r \in [0, \pi)$ and $\phi_l^r \in [0, 2\pi)$ are the elevation and azimuth angles of arrival (AoAs). $\theta_l^t \in [0, \pi]$ and $\phi_l^t \in [0, 2\pi]$ represent the elevation and azimuth angles of departure (AoDs). Finally, $\mathbf{f}_t(\theta_l^t, \phi_l^t)$ and $\mathbf{f}_r(\theta_l^r, \phi_l^r)$ denote the transmitter and receiver antenna array response vectors. In this paper, a uniform planar array (UPA) with $W_1$ and $W_2$ elements ($W_1 = W_2 = \sqrt{N_w}$) in the horizontal and vertical dimensions is considered, whose array response vector can be written as

$$\mathbf{f}_w(\theta, \phi) = \frac{1}{\sqrt{N_w}} \left[ 1, \ldots, e^{j\frac{2\pi}{\lambda} d \left( x \sin(\phi)\sin(\theta) + y \cos(\theta) \right)}, \ldots, e^{j\frac{2\pi}{\lambda} d \left( (W_1 - 1)\sin(\phi)\sin(\theta) + (W_2 - 1)\cos(\theta) \right)} \right]^T, \tag{3}$$

where $\lambda$ and $d$ represent the signal wavelength and the antenna spacing, respectively. In addition, $w = t$ or $r$ in $N_w$ and $\mathbf{f}_w(\theta, \phi)$, $0 \le x \le (W_1 - 1)$ and $0 \le y \le (W_2 - 1)$, where $x$ and $y$ stand for the antenna indices in the two-dimensional plane.

In this paper, we assume $\mathbf{H}$ is perfectly known by the transceivers. Although this is an ideal assumption, in practical applications CSI at the receiver can be obtained by downlink channel estimation. Specifically, in time division duplex (TDD) systems with uplink and downlink channel reciprocity, CSI at the transmitter can be acquired by uplink channel estimation. In frequency division duplex (FDD) systems, CSI at the transmitter can be acquired by feeding back the estimated CSI from the receiver.

In this work, we adopt the commonly considered hybrid analog and digital array architectures, which significantly reduce the number of required RF chains by cascading an analog feed network after the baseband digital signal processor. Fig. 1 depicts two major hybrid array architectures, namely the FCH array and the PCH array. In both cases, the transmitter has $N_t$ antennas but only $N_{RF}$ ($\ll N_t$) RF chains and is capable of transmitting up to $N_{RF}$ independent data streams simultaneously [22]–[24]. In an FCH array architecture, each of the $N_{RF}$ RF chains is connected to all $N_t$ antennas via $N_t$ phase shifters and an $(N_{RF} + 1)$-port combiner. As a consequence, the fully-connected architecture provides the full beamforming gain of massive antenna arrays but with a high hardware complexity of in total $N_{RF} N_t$ phase shifters and $N_t$ combiners. The PCH architecture is also referred to as sub-array, where $N_t$ antennas are partitioned into $N_t / N_{RF}$ groups and each RF chain is connected to only a subset of antennas. Therefore, the number of required phase shifters is reduced to $N_t$, and no power combiner is needed. There exists a trade-off between energy efficiency (EE) and SE for the two hybrid architectures. That is, the FCH architecture can provide the full beamforming gain at the expense of hardware cost/power consumption, whereas the low complexity PCH architecture realizes a low beamforming gain [25]. Moreover, it is noteworthy that FCH and PCH arrays are chosen as examples to introduce our work and the proposed designs can be directly extended to mmWave communications with other array structures.

Fig. 1. FCH and PCH mmWave MIMO transmitter structures. (a) FCH transmitter structure. (b) PCH transmitter structure.

III. PROBLEM FORMULATION

In this paper, we are interested in the optimization of the transmit vector set. That is, for an mmWave hybrid MIMO system with a target transmission rate of $n$ bpcu, we aim to find the optimal vector set $\mathcal{X}_N = \{\mathbf{x}_1, \mathbf{x}_2, \cdots, \mathbf{x}_N\}$, where $N = 2^n$. Maximizing the minimum Euclidean distance among the noise-free received signal vectors is our target, where the minimum Euclidean distance can be expressed as

$$d_{\min}(\mathcal{X}_N, \mathbf{H}) = \min_{i, i' \in \mathcal{I},\, i \neq i'} \|\mathbf{H}\mathbf{x}_i - \mathbf{H}\mathbf{x}_{i'}\|_2. \tag{4}$$

With a hybrid structure, a transmit vector $\mathbf{x}_i \in \mathcal{X}_N$ can be expressed as

$$\mathbf{x}_i = \mathbf{F}_{RF}^k \hat{\mathbf{s}}_l^k, \tag{5}$$

where $\hat{\mathbf{s}}_l^k = (\mathbf{F}_{BB})_l^k \mathbf{s}_l^k$ can be regarded as a symbol vector precoded by the digital precoder $(\mathbf{F}_{BB})_l^k$; $\mathbf{F}_{RF}^k$ denotes the $k$-th analog precoder and $\hat{\mathbf{s}}_l^k$ denotes the $l$-th precoded symbol vector when $\mathbf{F}_{RF}^k$ is activated. We use $\mathcal{F}_{RF} = \{\mathbf{F}_{RF}^1, \mathbf{F}_{RF}^2, \cdots, \mathbf{F}_{RF}^K\}$ of size $K$ to denote the set of all analog precoder candidates. Sets $\mathcal{S}_1, \mathcal{S}_2, \cdots, \mathcal{S}_K$ of sizes $n_1, n_2, \cdots, n_K$ are used to denote the precoded symbol vector sets when $\mathbf{F}_{RF}^1, \mathbf{F}_{RF}^2, \cdots, \mathbf{F}_{RF}^K$ are activated, respectively.

Remark: Based on this notation, we can clearly see the differences among BBSS, UBMSS and NUBMSS. In BBSS, only a fixed precoder $\mathbf{F}_{RF}^1$ is adopted and other precoders will not be activated. In other words, we have the precoded symbol vector set sizes $n_1 = N$ and $n_2 = n_3 = \cdots = n_K = 0$. In UBMSS, a subset of $\hat{K} = 2^{\lfloor \log_2 K \rfloor}$ precoders in $\mathcal{F}_{RF}$ are uniformly activated to send equal-size $\mathcal{S}_1, \mathcal{S}_2, \cdots, \mathcal{S}_{\hat{K}}$. That is, $n_1 = n_2 = \cdots = n_{\hat{K}} = N/\hat{K}$ and $n_{\hat{K}+1} = n_{\hat{K}+2} = \cdots = n_K = 0$. In NUBMSS, all precoders are non-uniformly activated subject to a size constraint $\sum_{k=1}^{K} n_k = N$. From this perspective, BBSS and UBMSS can be regarded as special realizations of NUBMSS, and the globally optimized NUBMSS will inherently outperform BBSS and UBMSS.

For convenience, we further define $\mathcal{Z} \triangleq \{\mathcal{S}_1, \mathcal{S}_2, \cdots, \mathcal{S}_K\}$ and the minimum Euclidean distance among the noise-free received signal vectors can be rewritten as

$$d_{\min}(\mathcal{X}_N, \mathbf{H}) = d_{\min}(\mathcal{F}_{RF}, \mathcal{Z}, \mathbf{H}) = \min_{\substack{\mathbf{F}_{RF}^k \hat{\mathbf{s}}_l^k \neq \mathbf{F}_{RF}^{k'} \hat{\mathbf{s}}_{l'}^{k'} \\ \mathbf{F}_{RF}^k, \mathbf{F}_{RF}^{k'} \in \mathcal{F}_{RF} \\ \hat{\mathbf{s}}_l^k \in \mathcal{S}_k,\ \hat{\mathbf{s}}_{l'}^{k'} \in \mathcal{S}_{k'} \\ \mathcal{S}_k, \mathcal{S}_{k'} \in \mathcal{Z}}} \left\| \mathbf{H}\left(\mathbf{F}_{RF}^k \hat{\mathbf{s}}_l^k - \mathbf{F}_{RF}^{k'} \hat{\mathbf{s}}_{l'}^{k'}\right) \right\|_2. \tag{6}$$

The signal shaping becomes a problem of finding $\mathcal{F}_{RF}$ and $\mathcal{Z}$ to maximize $d_{\min}(\mathcal{F}_{RF}, \mathcal{Z}, \mathbf{H})$ subject to a size constraint that

$$\sum_{k=1}^{K} n_k = N, \tag{7}$$

and an average power constraint $P_s$ that

$$P(\mathcal{Z}) = \frac{1}{N} \sum_{k=1}^{K} \sum_{l=1}^{n_k} (\hat{\mathbf{s}}_l^k)^H \hat{\mathbf{s}}_l^k \le P_s. \tag{8}$$

Thus, the signal shaping problem for UBMSS in mmWave hybrid MIMO communications can be formulated as

$$
\begin{aligned}
(\mathrm{P1}):\quad \text{Given}:\ & \mathbf{H}, N \\
\text{Find}:\ & \mathcal{F}_{RF} = \{\mathbf{F}_{RF}^1, \mathbf{F}_{RF}^2, \cdots, \mathbf{F}_{RF}^K\},\ \mathcal{Z} = \{\mathcal{S}_1, \mathcal{S}_2, \cdots, \mathcal{S}_K\} \\
\text{Maximize}:\ & d_{\min}^2(\mathcal{F}_{RF}, \mathcal{Z}, \mathbf{H}) \\
\text{Subject to}:\ & \sum_{k=1}^{K} n_k = N, \\
& \mathbf{F}_{RF}^k \in \mathcal{U}^{N_t \times N_{RF}}, \\
& P(\mathcal{Z}) \le P_s.
\end{aligned} \tag{9}
$$

The variables $\mathcal{F}_{RF}$ and $\mathcal{Z}$ are coupled. To solve the problem, we have to decouple them. In this paper, we propose to first optimize $\mathcal{F}_{RF}$ and then find the optimal $\mathcal{Z}$ based on the optimized $\mathcal{F}_{RF}$. Since the channel considered in this paper is sparse, each precoder in $\mathcal{F}_{RF}$ should not steer the beam to the zero space of $\mathbf{H}$. To guarantee this, we provide a singular matrix approximation approach, which can be described as follows. First, we perform singular value decomposition (SVD) as $\mathbf{H} = \mathbf{U}\boldsymbol{\Lambda}\mathbf{V}^H$, where $\mathbf{U} \in \mathbb{C}^{N_r \times m}$ is the left-singular matrix, $\boldsymbol{\Lambda} \in \mathbb{C}^{m \times m}$ is the diagonal matrix with $m = \mathrm{rank}(\mathbf{H})$ non-zero singular values as diagonal entries and $\mathbf{V} \in \mathbb{C}^{N_t \times m}$ is the right-singular matrix. Since the symbol vectors should be transmitted through the subspace spanned by $\mathbf{V}$ and for $N_{RF}$ data streams, there are $\binom{m}{N_{RF}}$ subspace matrices, i.e., $K = \binom{m}{N_{RF}}$, and we denote the subspace matrices as $\mathbf{F}_1, \mathbf{F}_2, \cdots, \mathbf{F}_K$. The subspace matrices are implemented by fully-digital structures. In our work, we adopt analog precoders and digital precoders to approximate $\{\mathbf{F}_1, \mathbf{F}_2, \cdots, \mathbf{F}_K\}$. The approximation can be performed by solving

$$
\begin{aligned}
(\mathrm{P2}):\quad & \min_{\{\mathbf{F}_{RF}^k\}, \{\mathbf{F}_{BB}^k\}} \ \|\mathbf{F}_k - \mathbf{F}_{RF}^k \mathbf{F}_{BB}^k\|_F^2 \\
\text{subject to}:\ & \mathbf{F}_{RF}^k \in \mathcal{U}^{N_t \times N_{RF}},\ \mathbf{F}_{BB}^k \in \mathbb{C}^{N_{RF} \times N_{RF}}, \\
& \|\mathbf{F}_{RF}^k \mathbf{F}_{BB}^k\|_F^2 = \|\mathbf{F}_k\|_F^2.
\end{aligned} \tag{10}
$$

The problem can be solved by numerous existing algorithms, e.g., the OMP algorithm for FCH MIMO systems in [1] or the SIC algorithm for PCH MIMO systems in [3]. Besides, we note that only $\{\mathbf{F}_{RF}^k\}$ is useful in our designs and $\{\mathbf{F}_{BB}^k\}$ is considered in the next step of optimizing $\hat{\mathbf{s}}_l^k$. It is noteworthy that this paper just provides one way of designing $\mathcal{F}_{RF}$, and the signal shaping methods that follow are suitable for any feasible $\mathcal{F}_{RF}$.

Given a feasible analog precoder set $\mathcal{F}_{RF}$, the signal shaping problem is reduced to finding the symbol vector sets for different analog precoder activation states, which can be given by

$$
\begin{aligned}
(\mathrm{P3}):\quad \text{Given}:\ & \mathbf{H}, N, \mathcal{F}_{RF} \\
\text{Find}:\ & \mathcal{Z} = \{\mathcal{S}_1, \mathcal{S}_2, \cdots, \mathcal{S}_K\} \\
\text{Maximize}:\ & d_{\min}^2(\mathcal{F}_{RF}, \mathcal{Z}, \mathbf{H}) \\
\text{Subject to}:\ & \sum_{k=1}^{K} n_k = N, \\
& P(\mathcal{Z}) \le P_s.
\end{aligned} \tag{11}
$$

Remark: The choice $K = \binom{m}{N_{RF}}$ is due to the fact that the number of mutually-independent non-zero subspace matrices is $\binom{m}{N_{RF}}$. The rationale for using $\binom{m}{N_{RF}}$ subspace matrices instead of restricting to the subspace matrix corresponding to the strongest $N_{RF}$ singular vectors is that the differences between subspace matrices are also employed to enlarge the mutual Euclidean distances among the noise-free received signal vectors. We note that even though there exist $K$ legitimate beamspaces, it does not mean that all $K$ beamspaces will be activated during the transmission. Whether a subspace will be used or not is determined by whether the associated symbol vector set is non-empty. If the associated symbol vector set of a beamspace matrix is empty, the corresponding beamspace will not be activated during the transmission phase. In this subsection, we propose to solve the original problem (P1) by solving two separate subproblems (P2) and (P3). That is, we first find the non-zero beamspace set formed by feasible analog precoders, and then optimize the symbol vectors for each beamspace activation state. However, it is difficult to provide a rigorous proof for the equivalence in splitting (P1) into (P2) and (P3). To investigate the capability of the splitting in approaching the optimal performance, we compare the proposed signal shaping methods obtained by solving (P2) and (P3) with the signal shaping obtained by directly solving a relaxed problem of (P1), i.e.,

$$
\begin{aligned}
(\mathrm{RP1}):\quad \text{Given}:\ & \mathbf{H}, N \\
\text{Find}:\ & \mathcal{X}_N = \{\mathbf{x}_1, \mathbf{x}_2, \cdots, \mathbf{x}_N\} \\
\text{Maximize}:\ & d_{\min}^2(\mathcal{X}_N, \mathbf{H}) \\
\text{Subject to}:\ & P(\mathcal{X}_N) \le P_x,
\end{aligned} \tag{12}
$$

where the hybrid structure is relaxed and the average power constraint $P_x$ on the transmit vector set is given by

$$P(\mathcal{X}_N) = \mathbb{E}(\|\mathbf{x}\|^2) = \frac{1}{N} \sum_{i=1}^{N} \mathbf{x}_i^H \mathbf{x}_i \le P_x. \tag{13}$$

Problem (RP1) can be regarded as the formulation of the signal shaping for mmWave fully-digital MIMO systems, whose solution can provide a performance bound for the solution of (P1). The detailed discussion on the solution of (RP1) and the comparison are included in Sections V-B and VI-A, respectively. Next, we focus our attention on solving (P3) since (P2) can be solved by existing algorithms.
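To make the beamspace count $K = \binom{m}{N_{RF}}$ concrete, the short sketch below enumerates the candidate subspaces as index subsets. It assumes, as one reading of the text, that each subspace matrix collects $N_{RF}$ of the $m$ right-singular vectors in $\mathbf{V}$; the function name `beamspace_index_sets` and the values $m = 4$, $N_{RF} = 2$ are illustrative only.

```python
from itertools import combinations
from math import comb


def beamspace_index_sets(m, n_rf):
    """List every way to pick n_rf column indices out of the m columns of V.

    Assumption for illustration: the k-th subspace matrix F_k is built from
    the columns of V whose indices appear in the k-th subset.
    """
    return list(combinations(range(m), n_rf))


# Illustrative values: channel rank m = 4 and N_RF = 2 RF chains.
index_sets = beamspace_index_sets(4, 2)
K = len(index_sets)
print(K, index_sets)
```

Under this reading, BBSS would use only the first subset (the strongest singular vectors, if the columns of $\mathbf{V}$ are sorted by singular value), whereas the proposed formulation keeps all $K$ subsets as candidates and lets the set sizes $n_k$ decide which ones are actually activated.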

IV. SIGNAL SHAPING METHODS

Problem (P3) is a set optimization problem. It includes the set size optimization, i.e., finding the optimal $n_1, n_2, \cdots, n_K$ that satisfy the size constraint $\sum_{k=1}^{K} n_k = N$. After that, one still needs to perform set entry optimization, i.e., finding the optimal set entries in $\mathcal{S}_1, \mathcal{S}_2, \cdots, \mathcal{S}_K$. To solve the problem, we propose three signal shaping approaches in this section.

A. Joint Optimization Based Signal Shaping (JOSS)

The set size optimization is a discrete optimization satisfying $\sum_{k=1}^{K} n_k = N$ and $n_k \ge 0$, $k = 1, 2, \cdots, K$. According to the analysis in [26], there are $\binom{N+K-1}{K-1}$ feasible solutions, which is a large number. For instance, given $N = 64$, i.e., $n = 6$ bpcu, and $K = |\mathcal{F}_{RF}| = 10$, there are around $\binom{73}{9} = 9.7 \times 10^{10}$ feasible set size solutions. For each set size solution, one also needs to perform set entry optimization. Thus, exhaustive search over all feasible set size solutions is prohibitive. To solve this problem for practical systems, we resort to a greedy recursive design method which was first introduced in [26].

To introduce the recursive design method, we first define a matrix $\mathbf{G}_N \in \mathbb{C}^{N_t \times N N_{RF}}$ by

$$\mathbf{G}_N \triangleq \Big[ \underbrace{\mathbf{F}_{RF}^1, \cdots, \mathbf{F}_{RF}^1}_{n_1}, \underbrace{\mathbf{F}_{RF}^2, \cdots, \mathbf{F}_{RF}^2}_{n_2}, \cdots, \underbrace{\mathbf{F}_{RF}^K, \cdots, \mathbf{F}_{RF}^K}_{n_K} \Big], \tag{14}$$

which corresponds to a feasible solution $(n_1, n_2, \cdots, n_K)$. With the definition of $\mathbf{G}_N$, the recursive design can be described as follows. Given $\mathbf{G}_{N-1}$, we can choose an $\mathbf{F}_{RF}^k \in \mathcal{F}_{RF}$ to adjoin to $\mathbf{G}_{N-1}$ for generating $|\mathcal{F}_{RF}|$ candidates of $\mathbf{G}_N$. For each candidate of $\mathbf{G}_N$, we perform the set entry optimization and obtain the corresponding candidates of $\mathcal{X}_N$. Then, by comparing all of the candidates of $\mathcal{X}_N$, we can obtain a suboptimal $\mathcal{X}_N$ from all of the candidates and the corresponding suboptimal $\mathbf{G}_N$. According to this principle, we use the optimal $\mathcal{X}_2$ and $\mathbf{G}_2$, which can be obtained by exhaustive search, to find a suboptimal $\mathcal{X}_3$ and $\mathbf{G}_3$, then $\mathcal{X}_4$ and $\mathbf{G}_4$ and so on, until the size constraint is satisfied.

For notational convenience, we use $\mathbf{G}$ to represent $\mathbf{G}_N$. The set entry optimization in the recursive design can be performed as follows. We define $\mathbf{S}_l^k \triangleq \mathrm{diag}(\hat{\mathbf{s}}_l^k)$ for all $k = 1, 2, \cdots, K$, $l = 1, 2, \cdots, n_k$ and express the transmit vector $\mathbf{x}_i$ as

$$\mathbf{x}_i = \mathbf{G}\mathbf{D}_z\mathbf{o}_i, \tag{15}$$

where $\mathbf{D}_z \in \mathbb{C}^{N N_{RF} \times N N_{RF}}$ is a diagonal matrix defined by

$$
\mathbf{D}_z \triangleq
\begin{bmatrix}
\mathbf{S}_1^1 & \mathbf{0} & \cdots & \cdots & \cdots & \cdots & \mathbf{0} \\
\mathbf{0} & \ddots & & & & & \vdots \\
\vdots & & \mathbf{S}_{n_1}^1 & & & & \vdots \\
\vdots & & & \ddots & & & \vdots \\
\vdots & & & & \mathbf{S}_1^K & & \vdots \\
\vdots & & & & & \ddots & \mathbf{0} \\
\mathbf{0} & \cdots & \cdots & \cdots & \cdots & \mathbf{0} & \mathbf{S}_{n_K}^K
\end{bmatrix}, \tag{16}
$$

and $\mathbf{o}_i \in \mathbb{R}^{N N_{RF}}$ as $\mathbf{o}_i \triangleq \mathbf{g}_i \otimes \mathbf{1}_{N_{RF}}$, where $\mathbf{g}_i$ is the $i$th $N$-dimensional vector basis with all zeros except the $i$th entry being one. Based on these definitions, the square of the pairwise Euclidean distances can be expressed as

$$
\begin{aligned}
\|\mathbf{H}\mathbf{x}_i - \mathbf{H}\mathbf{x}_{i'}\|_2^2 &= \|\mathbf{H}\mathbf{G}\mathbf{D}_z\mathbf{o}_i - \mathbf{H}\mathbf{G}\mathbf{D}_z\mathbf{o}_{i'}\|_2^2 \\
&= (\mathbf{o}_i - \mathbf{o}_{i'})^H \mathbf{D}_z^H \mathbf{G}^H \mathbf{H}^H \mathbf{H}\mathbf{G}\mathbf{D}_z (\mathbf{o}_i - \mathbf{o}_{i'}) \\
&= \mathrm{Tr}\left( \mathbf{D}_z^H \mathbf{R}_{HG} \mathbf{D}_z \Delta\mathbf{O}_{ii'} \right),
\end{aligned} \tag{17}
$$

where $\mathbf{R}_{HG} = \mathbf{G}^H \mathbf{H}^H \mathbf{H}\mathbf{G}$ and $\Delta\mathbf{O}_{ii'} = (\mathbf{o}_i - \mathbf{o}_{i'})(\mathbf{o}_i - \mathbf{o}_{i'})^H$. Given any two diagonal matrices $\mathbf{D}_u = \mathrm{diag}(\mathbf{u})$ and $\mathbf{D}_v = \mathrm{diag}(\mathbf{v})$, we have an equality $\mathrm{Tr}(\mathbf{D}_u \mathbf{U} \mathbf{D}_v \mathbf{V}^H) = \mathbf{u}^H (\mathbf{U} \odot \mathbf{V}) \mathbf{v}$, based on which (17) can be re-expressed as

$$\|\mathbf{H}\mathbf{x}_i - \mathbf{H}\mathbf{x}_{i'}\|_2^2 = \mathbf{z}^H \mathbf{Z}_{ii'} \mathbf{z}, \tag{18}$$

where $\mathbf{z} = \mathrm{diag}\{\mathbf{D}_z\} \in \mathbb{C}^{N N_{RF}}$ and $\mathbf{Z}_{ii'} = \mathbf{R}_{HG}^H \odot \Delta\mathbf{O}_{ii'} \in \mathbb{C}^{N N_{RF} \times N N_{RF}}$. As a consequence, the average power constraint can be expressed as

$$P(\mathcal{Z}) = \frac{1}{N} \mathrm{Tr}\left( \mathbf{D}_z \mathbf{D}_z^H \right) = \frac{1}{N} \mathbf{z}^H \mathbf{z} \le P_s. \tag{19}$$

Based on the above reformulations, the set entry optimization becomes

$$
\begin{aligned}
(\mathrm{P4}):\quad \text{Given}:\ & \mathbf{Z}_{ii'},\ \forall i \neq i' \in \{1, 2, \cdots, N\} \\
\text{Find}:\ & \mathbf{z} \\
\text{Maximize}:\ & \min_{i \neq i'} \ \mathbf{z}^H \mathbf{Z}_{ii'} \mathbf{z} \\
\text{Subject to}:\ & \mathbf{z}^H \mathbf{z} \le N P_s.
\end{aligned} \tag{20}
$$

Because the minimum Euclidean distance monotonically increases with the average power, maximizing the minimum Euclidean distance in (P4) can also be reformulated as minimizing the average power for a target minimum distance $d_T$, which can be expressed by

$$
\begin{aligned}
(\mathrm{P5}):\quad \text{Given}:\ & \mathbf{Z}_{ii'},\ \forall i \neq i' \in \{1, 2, \cdots, N\} \\
\text{Find}:\ & \mathbf{z} \\
\text{Minimize}:\ & \mathbf{z}^H \mathbf{z} \\
\text{Subject to}:\ & \mathbf{z}^H \mathbf{Z}_{ii'} \mathbf{z} \ge d_T^2,\ \forall i \neq i' \in \{1, 2, \cdots, N\}.
\end{aligned} \tag{21}
$$

It is worth mentioning that problem (P5) is formulated without any power constraint, and hence the optimized transmit vectors should be further scaled to satisfy the average power constraint. Problem (P5) is an optimization problem in which both the objective function and the constraints are quadratic functions. That is, (P5) is a typical quadratically constrained quadratic programming (QCQP) problem with $N N_{RF}$ complex variables and $\binom{N}{2}$ constraints, which can be solved by existing algorithms, e.g., the algorithm in [27], with a complexity of $O(N^4 N_{RF}^2)$. To make the recursive design process clear, we list the JOSS in Algorithm 1. The algorithm in [27] for solving non-convex QCQP problems is a kind of gradient descent algorithm. It starts from a randomly generated solution and is updated when the objective function is decreased. Since the objective function is lower bounded by 0, the algorithmic convergence can thereby be ensured. The convergence rate is high and has been investigated in [27].

Algorithm 1 JOSS algorithm
Input: $H$, $N$, $\mathcal{F}_{RF}$
Output: $\mathcal{X}_N$
  Generate $|\mathcal{F}_{RF}|^2$ feasible candidates of $G_2$, and compute $z_2$ by solving (P5) using the algorithm in [27]. Compare all the candidates of $\mathcal{X}_2$, which are generated by $G_2$ and $z_2$, and obtain the optimal $\mathcal{X}_2$ and the corresponding $G_2$.
  Initialize $t = 3$.
  repeat
    Generate $|\mathcal{F}_{RF}|$ feasible $G_t$ based on $G_{t-1}$, and compute $z_t$ by solving (P5) using the algorithm in [27]. Compare all the candidates of $\mathcal{X}_t$, which are generated by $G_t$ and $z_t$, and obtain the optimal $\mathcal{X}_t$ and the corresponding $G_t$.
    Update $t \leftarrow t + 1$.
  until $t > N$.
  Output the optimized $\mathcal{X}_N$.

According to a similar complexity analysis in [26], [28], the aggregated computational complexity of JOSS is
$$ C_{JO} = O\left(N_{iter}^{JO} K N^5 N_{RF}^2\right), \tag{22} $$
where $N_{iter}^{JO}$ denotes the average number of iterations that the algorithm in [27] takes to converge when solving (P5).

B. Full Precoding Based Signal Shaping (FPSS)

Observing the computational complexity in (22), it is found that the complexity is at least the fifth power of the transmit vector set size $N$. In the case with a large $N$, we have to resort to other methods for solving this problem. In this subsection, we propose the full precoding based signal shaping approach. The idea is explained as follows. First, we express $x_i \in \mathcal{X}_N$ as
$$ x_i = F_{RF}^k F_{BB}^k s_l^k. \tag{23} $$
We denote $\mathcal{S}_1^c, \mathcal{S}_2^c, \cdots, \mathcal{S}_K^c$ as unprecoded symbol vector sets, which are optimized in a given codebook similarly to that in adaptive modulation schemes, and $\mathcal{Z}_c = \{\mathcal{S}_1^c, \mathcal{S}_2^c, \cdots, \mathcal{S}_K^c\}$. The expression in (23) is slightly different from the expression in (5). The difference lies in that we refine the $\{s_l^k\}$ associated with the same $F_{RF}^k$ by a common digital precoder, i.e., $(F_{BB})_1^k = (F_{BB})_2^k = \cdots = (F_{BB})_{n_k^c}^k = F_{BB}^k$, while in (5) the $\{s_l^k\}$ are respectively precoded by different $\{(F_{BB})_l^k\}$. The new expression can be regarded as a special case of the general case in (5). The optimized performance is therefore inherently inferior to that of the globally optimized solution. However, the key benefit is that the expression reduces the optimization complexity. The detailed optimization procedure is described as follows.

Given $\mathcal{Z}_c$, which is chosen from feasible normalized modulation symbol vector sets for adaptive modulation schemes, we optimize the digital precoders $F_{BB}^1, F_{BB}^2, \cdots, F_{BB}^K$ to refine $\mathcal{S}_1^c, \mathcal{S}_2^c, \cdots, \mathcal{S}_K^c$ in $\mathcal{Z}_c$. By denoting $n_1^c, n_2^c, \cdots, n_K^c$ as the set sizes of $\mathcal{S}_1^c, \mathcal{S}_2^c, \cdots, \mathcal{S}_K^c$, we re-express $x_i$ in (23) as
$$ x_i = F_{RF}^k F_{BB}^k s_l^k = W D_q e_i, \tag{24} $$
where the matrix $W$ of size $N_t \times K N_{RF}^2$ is defined as
$$ W \triangleq \Bigl[\, \overbrace{F_{RF}^1, \cdots, F_{RF}^1}^{N_{RF}},\ \overbrace{F_{RF}^2, \cdots, F_{RF}^2}^{N_{RF}},\ \cdots,\ \overbrace{F_{RF}^K, \cdots, F_{RF}^K}^{N_{RF}} \,\Bigr], \tag{25} $$
the diagonal matrix $D_q$ of size $K N_{RF}^2 \times K N_{RF}^2$ is defined as
$$ D_q \triangleq \begin{bmatrix} Q_{BB}^1 & 0 & \cdots & 0 \\ 0 & Q_{BB}^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & Q_{BB}^K \end{bmatrix}, \tag{26} $$
the diagonal matrix $Q_{BB}^k$ of size $N_{RF}^2 \times N_{RF}^2$ is defined as
$$ Q_{BB}^k = \mathrm{diag}\left(\mathrm{vec}\left(\left(F_{BB}^k\right)^T\right)\right) = \mathrm{diag}\left\{ \left(F_{BB}^k\right)_{1,1}, \left(F_{BB}^k\right)_{1,2}, \cdots, \left(F_{BB}^k\right)_{N_{RF}, N_{RF}} \right\}, \tag{27} $$
the vector $e_i$ of size $K N_{RF}^2 \times 1$ can be expressed as
$$ e_i = \tilde{g}_k \otimes \tilde{s}_l^k, \tag{28} $$
the vector $\tilde{s}_l^k$ of size $N_{RF}^2 \times 1$ is expressed as
$$ \tilde{s}_l^k = \bigl[\, \overbrace{(s_l^k)^T, (s_l^k)^T, \cdots, (s_l^k)^T}^{N_{RF}} \,\bigr]^T, \tag{29} $$
and $\tilde{g}_k$ is a $K$-dimensional basis vector with all zeros except the $k$th entry being one. Because the proof of the reformulation is the same as that in [27], we do not repeat it here for simplicity.

With the reformulation in (24), we rewrite the square of the Euclidean distance between $Hx_i$ and $Hx_{i'}$ as
$$ \|Hx_i - Hx_{i'}\|^2 = \|H W D_q (e_i - e_{i'})\|^2 = (e_i - e_{i'})^H D_q^H W^H H^H H W D_q (e_i - e_{i'}) = \mathrm{Tr}\left( D_q^H R_{HW} D_q \Delta E_{ii'} \right), \tag{30} $$
where $R_{HW} = W^H H^H H W$ and $\Delta E_{ii'} = (e_i - e_{i'})(e_i - e_{i'})^H$. Similarly, we can rewrite (30) as
$$ \|Hx_i - Hx_{i'}\|^2 = q^T Q_{ii'} q, \tag{31} $$
where $q = \mathrm{diag}\{D_q\} \in \mathbb{C}^{K N_{RF}^2}$ and $Q_{ii'} = R_{HW} \odot \Delta E_{ii'}^T \in \mathbb{C}^{K N_{RF}^2 \times K N_{RF}^2}$. Accordingly, the power constraint can be given by
$$ P(\mathcal{Z}) = \frac{1}{N} \mathrm{Tr}\left( D_q D_q^H \right) = \frac{1}{N} q^H q \le 1. \tag{32} $$
According to (31) and (32), the original problem can be expressed as
$$ \begin{aligned} (\mathrm{P6}):\ & \text{Given}: Q_{ii'},\ \forall i \ne i' \in \{1, 2, \cdots, n_k + 1\} \\ & \text{Find}: q \\ & \text{Maximize}: \min\ q^T Q_{ii'} q \\ & \text{Subject to}: q^T q \le N. \end{aligned} \tag{33} $$
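The step from the trace form in (30) to the quadratic form in (31) relies on a standard identity for diagonal matrices, which can be checked numerically. The following is a minimal sketch, assuming the missing operator in the definition of $Q_{ii'}$ is the element-wise (Hadamard) product and that the quadratic form uses the conjugate transpose of $q$. The names `A` (standing in for $HW$), `q`, and `d` (standing in for $e_i - e_{i'}$) are illustrative and not from the paper.

```python
import numpy as np


def distance_forms(A, q, d):
    """Evaluate one squared distance in three equivalent ways.

    A : (m, n) complex matrix, playing the role of H W.
    q : (n,) complex vector, the diagonal of D_q.
    d : (n,) complex vector, playing the role of e_i - e_i'.
    """
    D = np.diag(q)

    # Direct evaluation: || A D d ||^2
    direct = np.linalg.norm(A @ D @ d) ** 2

    # Trace form, as in (30): Tr(D^H R D dE), with R = A^H A and dE = d d^H
    R = A.conj().T @ A
    dE = np.outer(d, d.conj())
    trace_form = np.trace(D.conj().T @ R @ D @ dE).real

    # Quadratic form, as in (31): q^H (R o dE^T) q, "o" = element-wise product
    Q = R * dE.T
    quad_form = (q.conj() @ Q @ q).real

    return direct, trace_form, quad_form


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((3, 5)) + 1j * rng.standard_normal((3, 5))
    q = rng.standard_normal(5) + 1j * rng.standard_normal(5)
    d = rng.standard_normal(5) + 1j * rng.standard_normal(5)
    print(distance_forms(A, q, d))
```

All three returned values should agree up to floating-point error, which is why the distance constraints become quadratic in the single vector $q$.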

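TABLE III
COMPARISONS AMONG DIFFERENT SIGNAL SHAPING METHODS

| Approach | Parameters to be optimized | Number of variables | Number of constraints | Computational complexity |
|---|---|---|---|---|
| JOSS in Section IV-A | $\{\hat{s}_l^k = (F_{BB})_l^k s_l^k\}$ | $N N_{RF}$ | $\binom{N}{2}$ | $O\left(N_{iter}^{JO} K N^5 N_{RF}^4\right)$ |
| FPSS in Section IV-B | Full $\{F_{BB}^k\}$ | $K N_{RF}^2$ | $\binom{N}{2}$ | $O\left(N_c N_{iter}^{FP} K^2 N^2 N_{RF}^4\right)$ |
| DPSS in Section IV-C | Diagonal $\{\hat{F}_{BB}^k\}$ | $K N_{RF}$ | $\binom{N}{2}$ | $O\left(N_c N_{iter}^{DP} K^2 N^2 N_{RF}^2\right)$ |
| FDSS in Section V-B | Fully-digital $\{x_i\}$ | $N N_t$ | $\binom{N}{2}$ | $O\left(N_{iter}^{FD} N^4 N_t^2\right)$ |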
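To make the problem sizes in Table III concrete, the variable and constraint counts can be evaluated for a given system. The following is a small sketch; the function name `problem_sizes` and the choice $K = 3$ in the usage example are hypothetical and are not taken from the paper.

```python
from math import comb


def problem_sizes(N_t, N_RF, n, K):
    """Return (number of variables, number of constraints) per method.

    N_t  : number of transmit antennas
    N_RF : number of RF chains
    n    : transmission rate in bits per channel use, so N = 2**n
    K    : number of analog precoder activation states (assumed given)
    """
    N = 2 ** n
    constraints = comb(N, 2)  # one distance constraint per unordered pair
    return {
        "JOSS": (N * N_RF, constraints),
        "FPSS": (K * N_RF ** 2, constraints),
        "DPSS": (K * N_RF, constraints),
        "FDSS": (N * N_t, constraints),
    }


if __name__ == "__main__":
    # N_t = 64, N_RF = 2, n = 3 as in a (64, 4, 2, 3, 3) system; K = 3 is assumed.
    for name, (n_vars, n_cons) in problem_sizes(64, 2, 3, 3).items():
        print(f"{name}: {n_vars} variables, {n_cons} constraints")
```

The constraint count is identical for all four methods, so the difference in complexity comes from the number of optimization variables.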

By introducing an auxiliary variable $\tau$, similarly to Section IV-A, the optimization problem can be transformed to be
$$ \begin{aligned} (\mathrm{P7}):\ & \text{Given}: Q_{ii'},\ \forall i \ne i' \in \{1, 2, \cdots, N\} \\ & \text{Find}: q \\ & \text{Minimize}: q^T q \\ & \text{Subject to}: q^T Q_{ii'} q \ge \tau,\ \forall i \ne i' \in \{1, 2, \cdots, N\}. \end{aligned} \tag{34} $$
It is a QCQP problem with $K N_{RF}^2$ variables and $\binom{N}{2}$ constraints. The problem can also be solved by using existing algorithms, e.g., the one in [27]. The corresponding computational complexity is around $O(N_{iter}^{FP} K^2 N^2 N_{RF}^4)$, where $N_{iter}^{FP}$ denotes the average number of iterations that the algorithm in [27] takes to converge when solving (P7). For all candidates for $\mathcal{Z}_c$, we perform the refinement, compare them in terms of the minimum Euclidean distance, and then obtain the final signal shaping. The aggregated computational complexity can thereby be expressed as $O\left(N_c N_{iter}^{FP} K^2 N^2 N_{RF}^4\right)$, where $N_c$ denotes the number of feasible candidates for $\mathcal{Z}_c$ in the adaptive modulation scheme.

C. Diagonal Precoding Signal Shaping (DPSS)

Besides employing the full digital precoders $F_{BB}^1, F_{BB}^2, \cdots, F_{BB}^K$ to refine $\mathcal{S}_1^c, \mathcal{S}_2^c, \cdots, \mathcal{S}_K^c$, one can also employ diagonal precoders $\hat{F}_{BB}^1, \hat{F}_{BB}^2, \cdots, \hat{F}_{BB}^K$ to perform the refinement, which can reduce the optimization and implementation complexity. Similarly, the refinement can be performed as follows. First, we define a matrix $\hat{W}$ of size $N_t \times K N_{RF}$ as
$$ \hat{W} \triangleq \left[ F_{RF}^1, F_{RF}^2, \cdots, F_{RF}^K \right], \tag{35} $$
a diagonal matrix $\hat{D}_q$ of size $K N_{RF} \times K N_{RF}$ as
$$ \hat{D}_q \triangleq \begin{bmatrix} \hat{F}_{BB}^1 & 0 & \cdots & 0 \\ 0 & \hat{F}_{BB}^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \hat{F}_{BB}^K \end{bmatrix}, \tag{36} $$
and a vector $\hat{e}_i$ of size $K N_{RF} \times 1$ as
$$ \hat{e}_i = \tilde{g}_k \otimes s_l^k. \tag{37} $$
Then, the transmit vector in use of diagonal precoders can be expressed by
$$ x_i = F_{RF}^k \hat{F}_{BB}^k s_l^k = \hat{W} \hat{D}_q \hat{e}_i. \tag{38} $$
Similarly, we can directly optimize $\hat{F}_{BB}^1, \hat{F}_{BB}^2, \cdots, \hat{F}_{BB}^K$ jointly by solving
$$ \begin{aligned} (\mathrm{P8}):\ & \text{Given}: \hat{Q}_{ii'},\ \forall i \ne i' \in \{1, 2, \cdots, N\} \\ & \text{Find}: \hat{q} \\ & \text{Minimize}: \hat{q}^T \hat{q} \\ & \text{Subject to}: \hat{q}^T \hat{Q}_{ii'} \hat{q} \ge \tau,\ \forall i \ne i' \in \{1, 2, \cdots, N\}, \end{aligned} \tag{39} $$
where $\hat{q} = \mathrm{diag}\{\hat{D}_q\} \in \mathbb{C}^{K N_{RF}}$, $\hat{Q}_{ii'} = R_{H\hat{W}} \odot \Delta \hat{E}_{ii'}^T \in \mathbb{C}^{K N_{RF} \times K N_{RF}}$, $R_{H\hat{W}} = \hat{W}^H H^H H \hat{W}$ and $\Delta \hat{E}_{ii'} = (\hat{e}_i - \hat{e}_{i'})(\hat{e}_i - \hat{e}_{i'})^H$.

It is also a QCQP problem with $K N_{RF}$ variables and $\binom{N}{2}$ constraints. The corresponding computational complexity is around $O(N_{iter}^{DP} K^2 N^2 N_{RF}^2)$, where $N_{iter}^{DP}$ denotes the average number of iterations that the algorithm in [27] takes to converge when solving (P8). For all candidates for $\mathcal{Z}_c$, the aggregated computational complexity for refinement can thereby be expressed as $O\left(N_c N_{iter}^{DP} K^2 N^2 N_{RF}^2\right)$.

D. Comparison

The proposed signal shaping approaches are quite similar, since the optimizations are all conducted by solving QCQP problems and the numbers of constraints are the same. However, the numbers of their optimization variables are different, which induces differences in computational complexity. To make these differences clear, we list them in Table III.

V. EXTENSION AND DISCUSSION

The most distinctive characteristic of mmWave communications is the broadband nature. To benefit from the broadband nature, we discuss the signal shaping for OFDM-based mmWave MIMO communications. Besides, we also discuss the performance loss compared to fully-digital signal shaping (FDSS), implementation challenges, the design with a hybrid receiver structure, and the extension to MSER and MMI signal shaping methods in this section.

A. Extension to Broadband mmWave Communications

Using OFDM, the proposed signal shaping methods can be directly extended to broadband mmWave MIMO systems. Particularly, let $\tilde{k}$ represent the sub-carrier index and $\tilde{K}$ be the number of carriers. The received signal in the frequency domain can be given by
$$ y[\tilde{k}] = \rho H[\tilde{k}] F_{RF}^k \hat{s}_l^k[\tilde{k}] + n[\tilde{k}], \quad \tilde{k} = 0, 1, \cdots, \tilde{K} - 1, \tag{40} $$

where $H[\tilde{k}]$ denotes the channel matrix of the $\tilde{k}$-th sub-carrier, which is also characterized by the Saleh-Valenzuela model [2]
$$ H[\tilde{k}] = \frac{1}{\sqrt{L}} \sum_{l=1}^{L} \alpha_l f_r\left(\theta_l^r, \phi_l^r\right) f_t^H\left(\theta_l^t, \phi_l^t\right) e^{-j \frac{2\pi l \tilde{k}}{\tilde{K}}}, \tag{41} $$
and $\hat{s}_l^k[\tilde{k}]$ represents the digital precoded signal vector when $F_{RF}^k$ is activated. We assume that the sub-channels corresponding to all sub-carriers are of the same rank, i.e., $m = \mathrm{rank}(H[1]) = \mathrm{rank}(H[2]) = \cdots = \mathrm{rank}(H[\tilde{K}])$. The number of signal vector combinations for $N_{RF}$ data streams per sub-carrier equals $\binom{m}{N_{RF}}$. Because the transmissions over all carriers share the same analog precoders [17], the set $\mathcal{F}_{RF} = \{F_{RF}^1, F_{RF}^2, \cdots, F_{RF}^K\}$ can be obtained by solving
$$ \begin{aligned} (\mathrm{P9}):\ & \min_{\{F_{RF}^k\},\, \{F_{BB}^k[\tilde{k}]\}}\ \sum_{\tilde{k}=0}^{\tilde{K}-1} \left\| F_k[\tilde{k}] - F_{RF}^k F_{BB}^k[\tilde{k}] \right\|_F^2 \\ & \text{subject to}: F_{RF}^k \in \mathcal{U}^{N_t \times N_{RF}}, \\ & \phantom{\text{subject to}: } F_{BB}^k[\tilde{k}] \in \mathbb{C}^{N_{RF} \times N_{RF}}, \\ & \phantom{\text{subject to}: } \left\| F_{RF}^k F_{BB}^k[\tilde{k}] \right\|_F^2 = \left\| F_k[\tilde{k}] \right\|_F^2, \end{aligned} \tag{42} $$
where $F_k[\tilde{k}]$ is a matrix composed of $N_{RF}$ singular vectors of $H[\tilde{k}]$. Algorithms for solving (P9) can be found in the literature, such as [1], [2] and [29]. Based on $\mathcal{F}_{RF}$ and $H[\tilde{k}]$, we can use the proposed signal shaping method to design $\mathcal{X}_N[\tilde{k}]$ for each sub-carrier.

B. Performance Loss

To measure the performance loss of the proposed designs, we compare the proposed signal shaping with FDSS, which is obtained by directly solving (RP1). The FDSS can be obtained as follows. Firstly, we introduce $\hat{o}_i \in \mathbb{C}^{N N_t}$ as $\hat{o}_i \triangleq g_i \otimes 1_{N_t}$ and rewrite $x_i$ as
$$ x_i = \hat{G} \hat{D}_z \hat{o}_i, \tag{44} $$
where $\hat{G}$ is a matrix of dimension $N_r \times N N_t$ expressed as
$$ \hat{G} = [\, \overbrace{H, H, \cdots, H}^{N} \,], \tag{45} $$
the matrix $\hat{D}_z \in \mathbb{C}^{N N_t \times N N_t}$ is a diagonal matrix represented by
$$ \hat{D}_z \triangleq \begin{bmatrix} X_1 & 0 & \cdots & 0 \\ 0 & X_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & X_N \end{bmatrix}, \tag{46} $$
and
$$ X_i = \mathrm{diag}\{x_i\}. \tag{47} $$
Based on the similar reformulation illustrated in Sections IV-A and IV-B, we can optimize $\mathcal{X}_N = \{x_1, x_2, \cdots, x_N\}$ directly by solving
$$ \begin{aligned} (\mathrm{P10}):\ & \text{Given}: \hat{Z}_{ii'},\ \forall i \ne i' \in \{1, 2, \cdots, N\} \\ & \text{Find}: \hat{z} \\ & \text{Minimize}: \hat{z}^T \hat{z} \\ & \text{Subject to}: \hat{z}^T \hat{Z}_{ii'} \hat{z} \ge \tau,\ \forall i \ne i' \in \{1, 2, \cdots, N\}, \end{aligned} \tag{48} $$
where $\hat{z} = \mathrm{diag}\{\hat{D}_z\} \in \mathbb{C}^{N N_t}$, $\hat{Z}_{ii'} = R_{\hat{G}} \odot \Delta \hat{O}_{ii'}^T \in \mathbb{C}^{N N_t \times N N_t}$, $R_{\hat{G}} = \hat{G}^H \hat{G}$ and $\Delta \hat{O}_{ii'} = (\hat{o}_i - \hat{o}_{i'})(\hat{o}_i - \hat{o}_{i'})^H$. It is a QCQP problem with $N N_t$ variables and $\binom{N}{2}$ constraints. The corresponding computational complexity is around $O(N_{iter}^{FD} N^4 N_t^2)$, where $N_{iter}^{FD}$ denotes the average number of iterations that the algorithm in [27] takes to converge when solving (P10). The comparison between FDSS and the hybrid signal shaping methods, including JOSS, FPSS and DPSS, is also included in Table III. FDSS can be adopted as a benchmark to measure the performance of the proposed signal shaping methods for mmWave hybrid MIMO systems.

C. Implementation Challenges

The proposed signal shaping is designed once per coherence time. After that, the output results can be saved, and the signal shaping is performed according to the saved results at each symbol time. In high-symbol-rate mmWave communications, the digital part of the signal shaping can be efficiently performed. Therefore, the switching speed of analog precoders is the key factor that determines whether beamspace modulation schemes can be realized in practical mmWave communications. To address this concern, Wang and Zhang have researched the switching speed of analog phase shifters in [14]. Specifically, there are four types of phase shifters, which are semiconductor, ferroelectric, ferrite, and micro-electromechanical phase shifters. It was shown in [30] that the switching times of semiconductor and ferroelectric phase shifters are on the order of nanoseconds. A low-cost phase shifter design with a switching time of tens of nanoseconds was reported in [31]. Thanks to these hardware developments, the challenge can be well addressed for high-rate transmissions.

The other implementation challenge is the computational complexity. Even though FPSS and DPSS greatly reduce the computational complexity, the complexity still increases with the second power of the number of analog precoders. To reduce the computational complexity, we can use only a subset of the beamspace activation states, where several strong beamspace activation states are selected. One extreme case is that only the strongest beamspace activation state is selected, in which case the signal shaping reduces to BBSS.

D. Hybrid Receiver-Aware Design

Considering that the signals are typically processed by a hybrid receiver, we can re-express the signal model as
$$ \hat{y} = W_{BB}^H W_{RF}^H H x + W_{BB}^H W_{RF}^H n. \tag{49} $$
In such a system, the hybrid receiver can be firstly designed by
$$ \begin{aligned} (\mathrm{P11}):\ & \min_{W_{RF},\, W_{BB}}\ \left\| W_r - W_{RF} W_{BB} \right\|_F^2 \\ & \text{subject to}: W_{RF} \in \mathcal{U}^{N_r \times N_{RF}^r},\ W_{BB} \in \mathbb{C}^{N_{RF}^r \times m}, \end{aligned} \tag{50} $$
where $W_r \in \mathbb{C}^{N_r \times N_{RF}^r}$ is a matrix combined by right-singular vectors; $W_{BB} \in \mathbb{C}^{N_{RF}^r \times N_{RF}^r}$ represents the digital combiner; $W_{RF} \in \mathcal{U}^{N_r \times N_{RF}^r}$ stands for the analog combiner and $N_{RF}^r$

$$ H = \begin{bmatrix} 0.2274 - 0.3324j & 0.6728 + 0.4259j & -0.6300 - 0.9119j & 1.1798 + 0.5234j \\ -0.5838 + 1.1369j & -0.0901 - 1.0357j & -0.1849 + 0.6814j & -0.4524 - 0.0135j \\ 0.3709 - 0.2147j & 0.6403 + 0.1256j & -0.9033 - 0.6788j & 1.4362 + 0.3338j \\ -0.4873 + 1.0504j & -0.2734 - 0.8536j & 0.0812 + 0.6987j & -0.6052 - 0.0623j \end{bmatrix}. \tag{59} $$

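For readers who want to reuse the constant channel in (59), the matrix can be entered directly as a complex array. The following is a sketch under a stated assumption: the second number in each entry is taken to be the imaginary part. The helper `channel_summary` is hypothetical and only reports generic properties of the matrix.

```python
import numpy as np

# Constant 4 x 4 channel of (59). Assumption: each entry is "real + imaginary",
# i.e., the second number of each pair is the imaginary part.
H = np.array([
    [0.2274 - 0.3324j, 0.6728 + 0.4259j, -0.6300 - 0.9119j, 1.1798 + 0.5234j],
    [-0.5838 + 1.1369j, -0.0901 - 1.0357j, -0.1849 + 0.6814j, -0.4524 - 0.0135j],
    [0.3709 - 0.2147j, 0.6403 + 0.1256j, -0.9033 - 0.6788j, 1.4362 + 0.3338j],
    [-0.4873 + 1.0504j, -0.2734 - 0.8536j, 0.0812 + 0.6987j, -0.6052 - 0.0623j],
])


def channel_summary(H):
    """Return the shape, singular values (descending), and Frobenius norm of H."""
    singular_values = np.linalg.svd(H, compute_uv=False)
    return {
        "shape": H.shape,
        "singular_values": singular_values,
        "frobenius_norm": np.linalg.norm(H),  # Frobenius norm for 2-D input
    }


if __name__ == "__main__":
    for key, value in channel_summary(H).items():
        print(key, value)
```

The singular values indicate how many spatial directions of this channel carry significant gain, which is useful context when comparing the shaping methods on it.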
[Figure 2 plot. Vertical axis: Minimum Euclidean Distance (0 to 3.5). Legend: DPSS + OMP AP, FCH; FPSS + OMP AP, FCH; JOSS + OMP AP, FCH; DPSS + SIC AP, PCH; FPSS + SIC AP, PCH; JOSS + SIC AP, PCH; FDSS, Fully Digital.]

Fig. 2. Minimum Euclidean distances among the noise-free received vectors when different signal shaping methods are applied in a (4, 4, 2, 3, 3) mmWave MIMO system over a constant channel.

is the number of receive RF chains. The problem can be solved by existing algorithms developed in [1]–[5]. Then, by replacing $H$ with $W_{BB}^H W_{RF}^H H$ and employing the proposed signal shaping methods, we can obtain the hybrid receiver-aware signal shaping.

E. Extension to MSER and MMI Signal Shaping

With a maximum-likelihood (ML) detector employed at the receiver, the SER of mmWave MIMO systems is upper bounded by [28]
$$ P_s(\mathcal{X}_N) = \frac{1}{2N} \sum_{i=1}^{N} \sum_{\substack{i'=1, \\ i' \ne i}}^{N} \exp\left( -\frac{\rho}{4} \left\| H(x_i - x_{i'}) \right\|_2^2 \right), \tag{51} $$
where $\rho$ represents the SNR. Based on the reformulation in Section IV-A, we have
$$ \left\| H(x_i - x_{i'}) \right\|_2^2 = z^T Z_{ii'} z, \tag{52} $$
and the upper bound can be re-expressed as
$$ P_s(z) = \frac{1}{2N} \sum_{i=1}^{N} \sum_{\substack{i'=1, \\ i' \ne i}}^{N} \exp\left( -\frac{\rho}{4} z^T Z_{ii'} z \right). \tag{53} $$
Thus, the MSER signal shaping can be formulated as
$$ \begin{aligned} (\text{SER-OP}):\ & \text{Given}: Z_{ii'},\ \forall i \ne i' \in \{1, 2, \cdots, N\},\ \rho \\ & \text{Find}: z \\ & \text{Minimize}: P_s(z) \\ & \text{Subject to}: z^T z \le N. \end{aligned} \tag{54} $$
Given $\mathcal{X}_N$ as inputs, the mutual information of mmWave MIMO systems can be written as [32]
$$ I(x; y | H) = \log_2 N - \frac{1}{N} \sum_{i=1}^{N} \mathbb{E}_n \left\{ \log_2 \sum_{i'=1}^{N} \exp\left( -\rho \left( \left\| H(x_i - x_{i'}) + n \right\|_2^2 - \left\| n \right\|_2^2 \right) \right) \right\}, \tag{55} $$
which is lower bounded by [33]
$$ I_{LB}(x; y | H) = \log_2 N + N_r (1 - \log_2 e) - \frac{1}{N} \sum_{i=1}^{N} \log_2 \sum_{i'=1}^{N} \exp\left( -\frac{\rho \left\| H(x_i - x_{i'}) \right\|_2^2}{2} \right). \tag{56} $$
With the reformulation $\| H(x_i - x_{i'}) \|_2^2 = z^T Z_{ii'} z$ when $i \ne i'$, the mutual information lower bound can be expressed as
$$ I_{LB}(z) = \log_2 N + N_r (1 - \log_2 e) - \frac{1}{N} \sum_{i=1}^{N} \log_2 \left[ 1 + \sum_{\substack{i'=1, \\ i' \ne i}}^{N} \exp\left( -\frac{\rho\, z^T Z_{ii'} z}{2} \right) \right]. \tag{57} $$
Thus, the MMI signal shaping can be formulated as
$$ \begin{aligned} (\text{MI-OP}):\ & \text{Given}: Z_{ii'},\ \forall i \ne i' \in \{1, 2, \cdots, N\},\ \rho \\ & \text{Find}: z \\ & \text{Maximize}: I_{LB}(z) \\ & \text{Subject to}: z^T z \le N. \end{aligned} \tag{58} $$
By replacing (P5) with (SER-OP) and (MI-OP), the JOSS approach can be directly extended to the designs based on the MSER and MMI criteria, respectively. (SER-OP) and (MI-OP) can also be solved by existing algorithms, e.g., the algorithm designed in [34]. The computational complexities are of the same order as that for solving (P5), because the objective functions of (SER-OP) and (MI-OP) are functions of $\{z^T Z_{ii'} z\}$, whose computation dominates the computational complexity. In a similar way, FPSS, DPSS, and FDSS can be extended. It should be remarked that the MSER and MMI criteria are equivalent to the MMED criterion in the high SNR regime. The reason is that the upper bound on the SER given in (53) and the lower bound on the MI given in (57) are dominated by the minimum Euclidean distance term because of the exponential operator. Moreover, it should be noted that the MMED design is invariant to the instantaneous SNR, while the MSER and MMI designs have to be updated as the SNR varies.

VI. SIMULATION AND ANALYSIS

In this section, we present simulation results to show the superiority of NUBM aided by the proposed signal shaping methods and to validate its effectiveness in mmWave broadband systems, in systems with channel estimation errors, and in systems with hardware impairments. For clear notation, we use the parameters $(N_t, N_r, N_{RF}, n, L)$ to characterize an mmWave MIMO system.
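The two quantities that recur in the evaluation, the minimum Euclidean distance among the noise-free received vectors and the SER upper bound in (51), can be computed directly from a channel matrix and a transmit vector set. The following is a minimal sketch; the function names and the toy inputs in the usage example are illustrative assumptions and do not reproduce the paper's simulation setup.

```python
import itertools

import numpy as np


def min_euclidean_distance(H, X):
    """Minimum of ||H x_i - H x_j|| over all pairs i != j.

    H : (N_r, N_t) channel matrix.
    X : (N, N_t) array whose rows are the transmit vectors x_i.
    """
    Y = X @ H.T  # row i is the noise-free received vector H x_i
    pairs = itertools.combinations(range(len(Y)), 2)
    return min(np.linalg.norm(Y[i] - Y[j]) for i, j in pairs)


def ser_upper_bound(H, X, rho):
    """Upper bound (51): (1 / 2N) * sum_{i != j} exp(-rho/4 * ||H(x_i - x_j)||^2)."""
    Y = X @ H.T
    N = len(Y)
    total = 0.0
    for i in range(N):
        for j in range(N):
            if i != j:
                total += np.exp(-rho / 4.0 * np.linalg.norm(Y[i] - Y[j]) ** 2)
    return total / (2 * N)


if __name__ == "__main__":
    # Toy example (assumed values): identity channel, two antipodal vectors.
    H = np.eye(2, dtype=complex)
    X = np.array([[1, 0], [-1, 0]], dtype=complex)
    print(min_euclidean_distance(H, X))
    print(ser_upper_bound(H, X, rho=1.0))
```

Because each term of the bound decays exponentially in the squared pairwise distance, the smallest distance dominates the sum at high SNR, which is consistent with the MMED criterion.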

A. Performance over a Constant mmWave MIMO Channel

Considering the time consumption for the CSI acquisition and the computation latency in the design procedure, the proposed signal shaping methods are more suitable for mmWave MIMO communications that experience slowly varying channels. The potential applications include wireless backhaul communications, wireless big data communications for data centers, and in-car/in-device high-rate data communications. Thus, we first investigate the application over a constant mmWave MIMO channel.

In a (4, 4, 2, 3, 3) mmWave MIMO system over a constant channel given in (59), we demonstrate the minimum Euclidean distance of the proposed signal shaping methods for mmWave hybrid MIMO systems with different structures and analog precoding (AP) methods in Fig. 2. In particular, we compare DPSS, FPSS, and JOSS in FCH MIMO with OMP AP [1] and those in PCH MIMO with SIC AP [3]. FDSS with a fully-digital structure is also included as a benchmark. Simulation results show that the proposed hybrid JOSS and FPSS methods closely approach FDSS with a fully-digital structure, i.e., they obtain almost the same minimum Euclidean distance. This indicates that solving (P2) and (P3) to obtain the solution to (P1) is an effective way. Observing our methods of solving (P3), it is found that JOSS and FPSS outperform DPSS. Specifically, the minimum Euclidean distances achieved by JOSS and FPSS are around 10% higher than that achieved by DPSS. It is also shown that PCH with the proposed signal shaping methods can achieve similar performance to the FCH structure. This is because the number of transmit antennas is small. When the number of transmit antennas becomes larger, the difference between the PCH and the FCH structures will also become larger.

[Figure 3 plot. Horizontal axis: Minimum Euclidean Distance (0 to 20). Vertical axis: CDF (0 to 1). Legend: JOSS + OMP AP; FPSS + OMP AP; BBSS + MMI AP&DP; DPSS + OMP AP; AMSS + OMP AP&DP; BBSS + OMP AP&DP; UBMSS + OMP AP&DP.]

Fig. 3. CDF of the minimum Euclidean distances among the noise-free received vectors when different signal shaping methods are applied in a (64, 4, 2, 3, 3) mmWave FCH MIMO system.

[Figure 4 plot. Horizontal axis: SNR (dB) (-10 to -3). Vertical axis: SER ($10^{-4}$ to $10^{-1}$). Legend: UBMSS + OMP AP&DP; BBSS + OMP AP&DP; AMSS + OMP AP&DP; DPSS + OMP AP; BBSS + MMI AP&DP; FPSS + OMP AP; JOSS + OMP AP.]

Fig. 4. SER of different signal shaping methods in a (64, 4, 2, 3, 3) mmWave FCH MIMO system.

B. Performance in mmWave Massive FCH MIMO Systems

In a (64, 4, 2, 3, 3) mmWave FCH MIMO system, we evaluate the cumulative distribution function (CDF) of the minimum Euclidean distances among all noise-free received vectors, the SER, and the computational complexity quantified by the number of floating-point operations of various schemes, as illustrated in Figs. 3-4 and Table IV. In detail, seven signal shaping schemes are compared in the system setup. “UBMSS + OMP AP&DP” represents the UBMSS scheme with analog precoders and digital precoders (AP&DP) generated by the OMP algorithm [1]. “BBSS + OMP AP&DP” and “AMSS + OMP AP&DP” stand for the BBSS scheme with OMP AP&DP and adaptive modulation-based signal shaping (AMSS) [20] with OMP AP&DP, respectively. “BBSS + MMI AP&DP” represents the BBSS scheme with MMI AP&DP [9], which is designed by assuming finite alphabet inputs. Observing the comparison results in Figs. 3 and 4, we find that JOSS exhibits the best performance and slightly outperforms FPSS. Both of them outperform existing transmission solutions for mmWave MIMO communications. The proposed DPSS enjoys lower computational complexity compared to FPSS and JOSS, but the reduction of complexity greatly reduces the system performance. The performance loss of DPSS compared to FPSS mainly comes from the limited symbol vector set refinement capability, because only the diagonal elements in the precoding matrix can be adjusted. Observing the comparison in computational complexity quantified by the number of operations in

TABLE IV
COMPUTATIONAL COMPLEXITY OF DIFFERENT SIGNAL SHAPING METHODS BY THE NUMBER OF FLOATING-POINT OPERATIONS IN A (64, 4, 2, 3, 3) MMWAVE FCH MIMO SYSTEM.

| Signal Shaping Methods | Computational Complexity |
|---|---|
| JOSS + OMP AP | $4.89 \times 10^7$ |
| FPSS + OMP AP | $7.45 \times 10^6$ |
| BBSS + MMI AP&DP | $4.82 \times 10^6$ |
| DPSS + OMP AP | $3.21 \times 10^6$ |
| AMSS + OMP AP&DP | $5.03 \times 10^4$ |
| BBSS + OMP AP&DP | $2.31 \times 10^4$ |
| UBMSS + OMP AP&DP | $4.92 \times 10^4$ |

[Figure 5 plot. Horizontal axis: Minimum Euclidean Distance (0 to 20). Vertical axis: CDF (0 to 1). Legend: JOSS + SIC AP; FPSS + SIC AP; BBSS + MMI AP&DP; DPSS + SIC AP; AMSS + SIC AP&DP; BBSS + SIC AP&DP; UBMSS + SIC AP&DP.]

Fig. 5. CDF of the minimum Euclidean distances among the noise-free received vectors when different signal shaping methods are applied in a (64, 4, 2, 3, 3) mmWave PCH MIMO system.

[Figure 6 plot. Horizontal axis: SNR (dB) (-10 to -3). Vertical axis: SER ($10^{-4}$ to $10^{-1}$). Legend: UBMSS + SIC AP&DP; BBSS + SIC AP&DP; AMSS + SIC AP&DP; DPSS + SIC AP; BBSS + MMI AP&DP; FPSS + SIC AP; JOSS + SIC AP.]

Fig. 6. SER of different signal shaping methods in a (64, 4, 2, 3, 3) mmWave PCH MIMO system.

TABLE V
COMPUTATIONAL COMPLEXITY OF DIFFERENT SIGNAL SHAPING METHODS BY THE NUMBER OF FLOATING-POINT OPERATIONS IN A (64, 4, 2, 3, 3) MMWAVE PCH MIMO SYSTEM.

| Signal Shaping Methods | Computational Complexity |
|---|---|
| JOSS + SIC AP | $1.98 \times 10^7$ |
| FPSS + SIC AP | $3.01 \times 10^6$ |
| BBSS + MMI AP&DP | $2.53 \times 10^6$ |
| DPSS + SIC AP | $1.61 \times 10^6$ |
| AMSS + SIC AP&DP | $3.51 \times 10^5$ |
| BBSS + SIC AP&DP | $9.78 \times 10^4$ |
| UBMSS + SIC AP&DP | $3.43 \times 10^5$ |

Table IV, as expected, the proposed signal shaping involves additional symbol vector optimization for performance improvement and incurs considerable computational complexity. However, the complexity is acceptable for mmWave MIMO communications, in which the transmit signals propagate over slow fading channels.

C. Performance in mmWave Massive PCH MIMO Systems

Besides the FCH MIMO system, we also make the comparisons in a PCH MIMO system, as illustrated in Figs. 5-6 and Table V. In detail, seven signal shaping schemes are compared in the system setup. “UBMSS + SIC AP&DP” represents the UBMSS scheme with AP&DP generated by the SIC algorithm [3]. “BBSS + SIC AP&DP” and “AMSS + SIC AP&DP” stand for the BBSS scheme with SIC AP&DP and AMSS [20] with SIC AP&DP, respectively. From the comparison results, we can draw a similar conclusion that the proposed JOSS and FPSS outperform existing mmWave MIMO transmission solutions. DPSS can only outperform existing signal shaping methods that are obtained by assuming complex Gaussian inputs. Besides, by jointly observing the results in Figs. 4 and 6, one can find that FCH MIMO with the proposed signal shaping greatly outperforms PCH MIMO with the proposed signal shaping. The better performance of FCH MIMO results from the higher beamforming gain of FCH MIMO.

D. Performance in mmWave MIMO-OFDM Systems

For broadband mmWave MIMO communications, we simulate the SER of a (16, 9, 2, 3, 3) mmWave FCH MIMO-OFDM system with 128 sub-carriers. Similarly, we compare NUBM using FPSS, DPSS and JOSS with UBMSS and BBSS. The results are illustrated in Fig. 7, which validates that the proposed signal shaping can be easily extended to broadband communications and brings considerable gain. Specifically, NUBM with FPSS and JOSS outperforms BBSS with MMI AP&DP by around 0.8 dB. They outperform AMSS and UBMSS with OMP AP by 5 dB and 10 dB, respectively.

E. Performance in the Presence of Channel Estimation Errors

All of the designs are based on perfect CSI at the transceivers. To show the robustness in the presence of channel estimation errors, a simplified channel error model $H_{im} = H + H_e$ [35], [36] is adopted, where $H_e$ denotes the matrix of channel estimation errors with each entry obeying a complex Gaussian distribution with zero mean and variance $\sigma_e^2$; $\sigma_e^2$ is proportional to the variance of the noise, i.e., $\sigma_e^2 = \eta \sigma_n^2$. It is noteworthy that the channel estimation error model is just an example demonstrating the worst case. In the simulation, we set $\eta = 0.1$ and compare the simulation results with those using perfect CSI, as illustrated in Fig. 8. Simulation results demonstrate that all schemes experience

[Figure 7 plot. Horizontal axis: SNR (dB) (-10 to -3). Vertical axis: SER ($10^{-4}$ to $10^{-1}$). Legend: UBMSS + OMP AP&DP; BBSS + OMP AP&DP; AMSS + OMP AP&DP; DPSS + OMP AP; BBSS + MMI AP&DP; FPSS + OMP AP; JOSS + OMP AP.]

Fig. 7. SER of different signal shaping methods in a (16, 9, 2, 3, 3) mmWave FCH MIMO-OFDM system with 128 sub-carriers.

[Figure 8 plot. Horizontal axis: SNR (dB) (-10 to -3). Vertical axis: SER ($10^{-2}$ to $10^{0}$). Legend: UBMSS, BBSS, AMSS, DPSS, FBBSS, FPSS, and JOSS, each with $\eta = 0.1$ and with $\eta = 0$.]

Fig. 8. SER of different signal shaping methods in a (36, 4, 2, 3, 3) mmWave FCH MIMO system with perfect CSI ($\eta = 0$) and imperfect CSI ($\eta = 0.1$).

similar performance losses in the presence of imperfect CSI, and the proposed JOSS and FPSS maintain their superiority over other schemes. In other words, all schemes have similar robustness to channel estimation errors, and in the presence of a similar level of channel estimation errors, the proposed signal shaping methods achieve the best performance in comparison with existing transmission solutions.

F. Performance in the Presence of Hardware Impairments

mmWave communication systems could be equipped with imperfect hardware, such as phase/amplitude inconsistent circuits, low-resolution digital-to-analog converters (DAC) and analog-to-digital converters (ADC). To show the robustness of the proposed designs, we simulate the performance of the proposed JOSS method in mmWave MIMO systems using low-resolution DAC at the transmitter, as illustrated in Fig. 9. As depicted in Fig. 9, the proposed JOSS method maintains good performance when the 5-bit DAC is adopted. When the 4-bit and 3-bit DAC are adopted, there are 0.4 dB and 0.8 dB performance losses, respectively. Despite that, it can be observed that the achieved performance gain is substantial compared to AMSS.

[Figure 9 plot. Horizontal axis: SNR (dB) (-10 to -2). Vertical axis: SER ($10^{-4}$ to $10^{-1}$). Legend: AMSS + Perfect DAC; JOSS + 3-bit DAC; JOSS + 4-bit DAC; JOSS + 5-bit DAC; JOSS + Perfect DAC.]

Fig. 9. SER of the proposed JOSS method in (16, 4, 2, 3, 3) mmWave FCH MIMO systems using perfect DAC and imperfect DAC.

VII. CONCLUSION

For mmWave MIMO communications with CSI at the transmitter, we investigated signal shaping methods according to the MMED criterion. Different from existing BBSS schemes that only activate a fixed beamspace per coherence time and UBMSS schemes that equiprobably activate each beamspace, our proposed methods activate different beamspaces with different probabilities and with different symbol vector sets. In other words, our designs are more general, which also results in better performance.

Specifically, in mmWave hybrid MIMO communication systems, we split the transmit vector shaping problem into an analog precoder design problem and a symbol vector set optimization problem. Then, based on existing work on analog precoder optimization, we dedicated our effort to the symbol vector set optimization. We proposed three signal shaping methods: JOSS, FPSS and DPSS. Among them, JOSS optimizes the symbol vector sets for each optimized analog precoder directly, including the set size optimization and set entry optimization. The search space of JOSS is the largest, and thus JOSS has the highest computational complexity. To reduce the complexity, we adopted full or diagonal precoders to refine predefined symbol vector sets, i.e., FPSS and DPSS. By reducing the optimization search space, the computational complexity is reduced accordingly. Finally, we discussed the proposed signal shaping methods in the applications in OFDM-based mmWave MIMO communications and in mmWave MIMO communications with hybrid transceivers.

Simulation results revealed that the proposed JOSS and FPSS outperform existing BBSS and UBMSS methods; FPSS

exhibits similar performance compared to JOSS but with much lower complexity; DPSS also reduces the complexity considerably but at the cost of significant performance loss. Moreover, simulations also validated that the proposed signal shaping methods can be extended to mmWave MIMO-OFDM systems, mmWave MIMO systems with hybrid transceivers, and mmWave MIMO systems with imperfect CSI and hardware impairments. The superiority of the proposed signal shaping is maintained in these systems. In summary, the proposed signal shaping methods achieve better performance than BBSS and UBMSS, and can be a promising candidate for future mmWave MIMO communications.

REFERENCES

[1] O. E. Ayach, S. Rajagopal, S. Abu-Surra, Z. Pi, and R. W. Heath, “Spatially sparse precoding in millimeter wave MIMO systems,” IEEE Trans. Wireless Commun., vol. 13, no. 3, pp. 1499–1513, Mar. 2014.
[2] X. Yu, J.-C. Shen, J. Zhang, and K. B. Letaief, “Alternating minimization algorithms for hybrid precoding in millimeter wave MIMO systems,” IEEE J. Sel. Topics Signal Process., vol. 10, no. 3, pp. 485–500, Feb. 2016.
[3] L. Dai, X. Gao, J. Quan, S. Han, and C. I, “Near-optimal hybrid analog and digital precoding for downlink mmwave massive MIMO systems,” in Proc. IEEE ICC 2015, London, UK, June 2015, pp. 1334–1339.
[4] S. Han, C. I, Z. Xu, and C. Rowell, “Large-scale antenna systems with hybrid analog and digital beamforming for millimeter wave 5G,” IEEE Commun. Mag., vol. 53, no. 1, pp. 186–194, Jan. 2015.
[5] S. Park, A. Alkhateeb, and R. W. Heath, “Dynamic subarrays for hybrid precoding in wideband mmwave MIMO systems,” IEEE Trans. Wireless Commun., vol. 16, no. 5, pp. 2907–2920, May 2017.
[6] H. Huang, Y. Song, J. Yang, G. Gui, and F. Adachi, “Deep-learning-based millimeter-wave massive MIMO for hybrid precoding,” IEEE Transactions on Vehicular Technology, vol. 68, no. 3, pp. 3027–3032, Mar. 2019.
[7] M. Li, Z. Wang, H. Li, Q. Liu, and L. Zhou, “A hardware-efficient hybrid beamforming solution for mmwave mimo systems,” IEEE Wireless Communications, vol. 26, no. 1, pp. 137–143, Feb. 2019.
[8] Y. Wu, C. Xiao, Z. Ding, X. Gao, and S. Jin, “A survey on MIMO transmission with finite input signals: Technical challenges, advances, and future trends,” Proceedings of the IEEE, vol. 106, no. 10, pp. 1779–1833, Oct. 2018.
[9] R. Rajashekar and L. Hanzo, “Hybrid beamforming in mm-wave MIMO systems having a finite input alphabet,” IEEE Trans. Commun., vol. 64, no. 8, pp. 3337–3349, Aug. 2016.
[10] Y. Wu, D. W. K. Ng, C. Wen, R. Schober, and A. Lozano, “Low-complexity MIMO precoding for finite-alphabet signals,” IEEE Trans. Wireless Commun., vol. 16, no. 7, pp. 4571–4584, July 2017.
[11] J. Jin, Y. R. Zheng, W. Chen, and C. Xiao, “Hybrid precoding for millimeter wave MIMO systems with finite-alphabet inputs,” in Proc. IEEE GLOBECOM 2017, Dec. 2017, pp. 1–6.
[12] N. S. Perovic, P. Liu, M. D. Renzo, and A. Springer, “Receive spatial modulation for LOS mmwave communications based on TX beamforming,” IEEE Commun. Lett., vol. 21, no. 4, pp. 921–924, Apr. 2017.
[13] M. Lee and W. Chung, “Adaptive multimode hybrid precoding for single-RF virtual space modulation with analog phase shift network in MIMO systems,” IEEE Trans. Wireless Commun., vol. 16, no. 4, pp. 2139–2152, Apr. 2017.
[14] W. Wang and W. Zhang, “Transmit signal designs for spatial modulation with analog phase shifters,” IEEE Trans. Wireless Commun., vol. 17, no. 5, pp. 3059–3070, May 2018.
[15] Y. Ding, K. J. Kim, T. Koike-Akino, M. Pajovic, P. Wang, and P. Orlik, “Spatial scattering modulation for uplink millimeter-wave systems,” IEEE Commun. Lett., vol. 21, no. 7, pp. 1493–1496, July 2017.
[16] Y. Ding, V. Fusco, A. Shitvov, Y. Xiao, and H. Li, “Beam index modulation wireless communication with analog beamforming,” IEEE Trans. Veh. Technol., vol. 67, no. 7, pp. 6340–6354, July 2018.
[17] S. Guo, H. Zhang, P. Zhang, P. Zhao, L. Wang, and M.-S. Alouini, “Generalized beamspace modulation using multiplexing: A breakthrough in mmwave MIMO,” IEEE J. Sel. Areas Commun., vol. 37, no. 9, pp. 2014–2028, Sep. 2019.
[18] S. Guo, H. Zhang, and M.-S. Alouini, “MIMO capacity with reduced RF chains,” 2019. [Online]. Available: https://arxiv.org/abs/1901.03893
[19] S. Gao, X. Cheng, and L. Yang, “Generalized beamspace modulation for mmwave MIMO,” in Proc. IEEE GLOBECOM 2018, Abu Dhabi, UAE, Dec. 2018, pp. 1–6.
[20] S. Gao, X. Cheng, and L. Yang, “ with limited RF chains: Generalized beamspace modulation (GBM) for mmwave massive MIMO,” IEEE J. Sel. Areas Commun., vol. 37, no. 9, pp. 2029–2039, Sep. 2019.
[21] A. A. M. Saleh and R. Valenzuela, “A statistical model for indoor multipath propagation,” IEEE J. Sel. Areas Commun., vol. 5, no. 2, pp. 128–137, Feb. 1987.
[22] X. Gao, L. Dai, S. Han, C. I, and R. W. Heath, “Energy-efficient hybrid analog and digital precoding for mmwave MIMO systems with large antenna arrays,” IEEE J. Sel. Areas Commun., vol. 34, no. 4, pp. 998–1009, Apr. 2016.
[23] V. Jamali, A. M. Tulino, G. Fischer, R. Muller, and R. Schober, “Reflect- and transmit-array antennas for scalable and energy-efficient mmwave massive MIMO,” arXiv preprint arXiv:1902.07670, 2019.
[24] H. Yan, S. Ramesh, T. Gallagher, C. Ling, and D. Cabric, “Performance, power, and area design trade-offs in millimeter-wave transmitter beamforming architectures,” IEEE Circuits and Systems Magazine, vol. 19, no. 2, pp. 33–58, May 2019.
[25] I. Ahmed, H. Khammari, A. Shahid, A. Musa, K. S. Kim, E. De Poorter, and I. Moerman, “A survey on hybrid beamforming techniques in 5G: Architecture and system model perspectives,” IEEE Commun. Surveys Tuts., vol. 20, no. 4, pp. 3060–3097, Jun. 2018.
[26] S. Guo, H. Zhang, P. Zhang, D. Wu, and D. Yuan, “Generalized 3-D constellation design for spatial modulation,” IEEE Trans. Commun., vol. 65, no. 8, pp. 3316–3327, Aug. 2017.
[27] P. Cheng, Z. Chen, J. A. Zhang, Y. Li, and B. Vucetic, “A unified precoding scheme for generalized spatial modulation,” IEEE Trans. Commun., vol. 66, no. 6, pp. 2502–2514, June 2018.
[28] S. Guo, H. Zhang, P. Zhang, S. Dang, C. Liang, and M.-S. Alouini, “Signal shaping for generalized spatial modulation and generalized quadrature spatial modulation,” IEEE Trans. Wireless Commun., vol. 18, no. 8, pp. 4047–4059, Aug. 2019.
[29] F. Sohrabi and W. Yu, “Hybrid analog and digital beamforming for OFDM-based large-scale MIMO systems,” in Proc. IEEE SPAWC, Edinburgh, UK, July 2016, pp. 1–6.
[30] R. R. Romanofsky, Array Phase Shifters: Theory and Technology. Antenna Engineering Handbook, 4th ed, New York, NY, USA: McGraw-Hill, 2007.
[31] N. Co, “A low cost analog phase shifter product family for military, commercial and public safety applications,” Microw. J., vol. 49, no. 3, pp. 152–156, Mar. 2006.
[32] S. Guo, H. Zhang, J. Zhang, and D. Yuan, “On the mutual information and constellation design criterion of spatial modulation MIMO systems,” in Proc. IEEE ICCS, Nov. 2014, pp. 487–491.
[33] W. Zeng, C. Xiao, and J. Lu, “A low-complexity design of linear precoding for MIMO channels with finite-alphabet inputs,” IEEE Wireless Communications Letters, vol. 1, no. 1, pp. 38–41, Feb. 2012.
[34] W. Wang and W. Zhang, “Diagonal precoder designs for spatial modulation,” in Proc. IEEE ICC, London, UK, June 2015, pp. 2411–2415.
[35] S. Guo, H. Zhang, P. Zhang, and D. Yuan, “Link-adaptive mapper designs for space-shift-keying-modulated MIMO systems,” IEEE Trans. Veh. Technol., vol. 65, no. 10, pp. 8087–8100, Oct. 2016.
[36] S. Guo, H. Zhang, P. Zhang, and D. Yuan, “Adaptive mapper design for spatial modulation with lightweight feedback overhead,” IEEE Trans. Veh. Technol., vol. 66, no. 10, pp. 8940–8950, Oct. 2017.