Hybrid Beamforming/Combining for Millimeter Wave MIMO

SUBMITTED TO GLOBALSIP 1

Hybrid Beamforming/Combining for Millimeter Wave MIMO: A Machine Learning Approach Jiyun Tao∗, Jing Xing∗, Jienan Chen∗Member, IEEE, Chuan Zhang†Member, IEEE, Shengli Fu‡Senior Member, IEEE ∗University of Electronic Science and Technology of China, Chengdu, China (Email:[email protected]) †National Mobile Communications Research Laboratory, Southeast University, Nanjing, China ‡Department of Electrical Engineering, University of North Texas, Denton, Texas, USA

i.e., analog beamformer, and optimizes the digital stage with Abstract—Hybrid beamforming (HB) has emerged as a promis- the optimization goal of full digital beamforming matrix. ing technology to support ultra high transmission capacity and Then the digital beamformer is fixed and the analog stage with low complexity for Millimeter Wave (mmWave) multiple- input and multiple-output (MIMO) system. However, the design is optimized. The two processes are performed iteratively of digital and analog beamformer is a challenge task with until the algorithm is converged. However, the solution is non-convex optimization, especially for the multi-user scenario. sub-optimal by the limitation of analog beamformer based Recently, the blooming of deep learning research provides a new on matrix decomposition methods. To this end, the result vision for the signal processing of communication system. In will prone to be a local-optimum point for the separating this work, we propose a deep neural network based HB for the multi-User mmWave massive MIMO system, referred as optimizing digital and analog part. DNHB. The HB system is formulated as an autoencoder neural As alternative optimization tools, machine learning and network, which is trained in a style of end-to-end self-supervised artificial intelligence (AI) provide new approaches for solving learning. With the strong representation capability of deep neural the over-complicated problems in communication systems, network, the proposed DNHB exhibits superior performance than the traditional linear processing methods. According to the which have been applied in intelligent radio network [10]– simulation results, DNHB outperforms about 2 dB in terms of bit [12], backhaul optimization for mmWave system [13] and error rate (BER) performance compared with existing methods. signal processing for physical layer, [14], [15] and network Index Terms—HB, mmWave Massive MIMO, Deep learning, traffic analysis [16], [17] etc. As the communication problem Autoencoder neural network just gets what you transmit, it is exactly as the same as the expectation of autoencoder neural network which hopes outputs equal inputs. Hence, to address the above critical challenges, I.INTRODUCTION we introduce a deep neural network based HB design method The massive multiple-input multiple-output (MIMO) with by mapping HB multi-user system as an autoencoder (AE) Millimeter Wave (mmWave) can provide a high spatial gain neural network, referred as DNHB. and diversity gain for the high data transmission, which is The contributions of this work are summarized as follows . considered as a key technology for the future wireless communication system [1]. The HB technology, which consists • Deep autoencoder neural network based self-supervised of digital and analog beamformer, is with low hardware and learning. Instead of applying the data and labels from computation complexity and maintains high data transmission matrix decomposition method for training, we integrate capacity. [2] the CSI matrix into a hidden layer of the considered AE neural network. By training the DNN in an end- arXiv:1910.06585v4 [eess.SP] 3 Jan 2020 However, the global optimization of digital and analog beamformer is still a challenge task [3]. The analog beam- to-end style, the proposed AE neural network HB can former/combiner is implemented with a constant modulus break through the performance limit of the existing matrix constraint phase shifter network [4]. Besides that, the analog decomposition method. beamformer/combiner and digital beamformer/combiner are • Supporting multi-user neural network scenario. Com- coupled. Consequently, the joint optimization of HB system is pared with our previous work, we propose a splitting an non-convex problem. This problem is even more difficult neural network to support multi-user massive MIMO for the multi-user scenario, where the beamformer/combiner system scenario. After the channel network layer, we design for each user can only be optimized separatively [5]. decompose the network to several sub-layer networks to The existing methods [6]–[9] to solve the challenges and match the practical multi-user MIMO system. achieve feasible near-optimal solutions attempt to approximate • Superior performance over the existing methods. The the full digital optimal beamformer by decoupled the design proposed design outperforms the existing methods about of analog and digital stage. It first fixes one processing stage, 2 dB in bit error rate (BER) performance. The organization of this paper is as follows. In section II, This work was supported by National Nature Science Foundation of China with Grant No. 61971107. Corresponding author is Jienan Chen the multi-user mmWave massive MIMO HB system module ([email protected]) is illustrated and the optimization problem is clarified. In 2 SUBMITTED TO GLOBALSIP section III, multi-user mmWave massive MIMO the AE based r f Nr HB system structure is proposed. The map from traditional Digital/ N Baseband RF chain r RF chain Ns Combiner structure to the neural network is detailed explained. The H 1 User1 following sections present the comparison of simulation results Digital/ rf N KNs Baseband Nt t in BER with other methods. The proposed method can get 2-3 Beamformer rf H Nr dB gain in the contrast of traditional algorithms. K Digital/ N RF chain r RF chain Baseband Ns The notation in this paper is introduced as below A repre- Combiner UserK sents the matrix and A(i, j) represents the (i, j)−th entry of Digital Analog Analog Digital a matrix A. |·| is the modulus of a complex number and k·kF Beamformer Beamformer Combiner Combiner is the Frobenius norm. For a vector or matrix, the superscripts (a) traditional multi-user hybrid beamforming system T ∗ H (·) , (·) and (·) represent transpose, complex conjugate and Fig. 1. Illustration of a narrowband downlink single-cell multi-user mmWave complex conjugate transpose, respectively. <(·) is the real-part massive MIMO HB system. rf operator and =(·) is the imaginary-part operator. E[·] denotes N N N expectation, and IK is the K × K identity matrix. complexKN GaussianN rf noise.N The received signal after beamformer at k-th user is given by

rf H H HN H N X N II.SYSTEMMODELANDPROBLEMFORMULATION ˜sk = WDkWAkHkFAFDksk + WDkWAkHk FAFDlsl | {z } l6=k desired signals A. system model | {z } effective interference H H A typical narrowband downlink single-cell multi-user + WDkWAknk, | {z } mmWave massive MIMO system is as shown in Fig. 1. A effective noise base station (BS) is equipped with Nt transmit antennas, and (3) rf Nt radio frequency (RF) chains. Each RF chains serves K rf Nr ×N rf where WAk ∈ C r is the analog combining matrix users, which is equipped with Nr receiver antennas and Nr rf N ×Ns and W k ∈ r is the digital combining matrix for RF chains [1]. The number of independent data streams is Ns, D C the k-th user. Since W k is also implemented by the analog which means that total KNs data streams are processed by the A phase shifters, all elements of WAk should have the constant BS. To guarantee the quality with the limited number of RF 2 chains, the number of the transmitted streams is constrained amplitude such that |WAk(i, j)| = 1. In this paper, it rf rf is assumed that perfect channel state information (CSI) is by KNs ≤ Nt ≤ Nt for the BS and Ns ≤ Nr ≤ Nr for each user. available at both the transmitter and receiver and that there is perfect synchronization between them. At the BS, the transmitted symbols are assumed to be rf processed by a Nt ×KNs digital beamformer FD and then by rf an analog beamformer FA of dimension Nt ×Nt to construct B. Problem formulation the final transmitted signal. Notably, the digital beamformer In this work, we employ the sum-MSE [18], [19] of all users FD enables both amplitude and phase modification, while and all streams as the performance measure and optimization only phase changes (phase-only control) can be realized by objective for the joint transmit and receive HB design, which analog beamformer FA with only phase shifters. To this end, 2 is defined as each element of FA is normalized to satisfy |FA(i, j)| = 1. K K Mathematically, the transmitted signal can be written as ∆ X X n 2 o MSE = MSEk = E ksk − ˜skkF , (4) K k=1 k=1 X x = Fs = FAFDs = FAFDksk, (1) where MSEk denotes the MSE of the k-th user. By substituting k=1 (3) into (4) and irrespective of effective interference, we have

K where F = FAFD denotes the HB matrix with ∆ X H H MSE = E{||sk − (WDkWAkHFAFDksk FD=[FD1, FD2, ..., FDK ]. FDk is a digital beamformer matrix (5) k=1 for k = 1, 2, ..., K, and s ∈ KNs×1 is the vector of signal C + WH WH n )||2 }. symbols which is the concatenation of each user’s data stream Dk Ak k F T vector such as s = [sT , sT , ..., sT ] , where s denotes the data 1 2 K k Finally, the optimization problem in the narrowband scenario stream vector for user k. It is assumed that E[ssH ] = I . KNs for multi-user can be formulated as For user k, the received signal can be modeled as K X X minmize MSEk H H yk = HkFAFDksk + Hk FAFDlsl + nk, (2) F ,FA,W ,W Dk Ak Dk k=1 l6=k H H subject to tr FAFDFD FA ≤ P, (6) 2 Nr ×Nt where Hk ∈ C is the channel matrix for the k-th user |FA(i, j)| = 1,∀i, j, Ns×1 2 2 and nk ∈ C is the vector of i.i.d. CN (0, σ ) additive |WAk(i, j)| = 1,∀i, j, k, SHELL et al.: BARE DEMO OF IEEETRAN.CLS FOR IEEE JOURNALS 3

parameters set of the real channel and imaginary channel in PS PS rf the n-layer digital beamformer neural network. Nr Ns Nr PS PS Similarly, the receivers signal after baseband combining can User1 be stated as KN N rf N s t t n

Noise layer Noise ˜

Channel Layer Channel S = f (SA; αr), (9) PS PS r N rf Nr r Ns PS PS where the output of the digital combiner neural network can UserK ˜ T T T T Digital Analog Analog Digital be written as S = [<(˜s), =(˜s)]. Here, ˜s = ˜s1 ,˜s2 , ...,˜sK Channel H Beamformer Beamformer Combiner Combiner denotes the concatenation of each user’s signal symbols pro- (b) AE neural network multi-user beamforming system cessed by HB system. For each element ˜sk, it denotes the re- Fig. 2. A simpliﬁed hardware block diagram of downlink single-cell multi- user mmWave massive MIMO autoencoder based HB system. ceived data stream vector for user k. Here, SA= [<(sA), =(sA)] represents the output of the analog combiner neural network. The complex signal processed by analog combiner be written where P is the total power budget at the BS. Obviously, the as sA. Besides, αr represents the parameters set of the real optimization problem is non-convex and it is intractable to channel and imaginary channel in the n-layer digital combiner obtain global optima for similar constrained joint optimization neural network. problems [20].

III.AE BASED HB DESIGN B. Analog beamformer/combiner design AE neural network is a subclass of the artiﬁcial neural The analog beamformer/combiner neural network should network used for unsupervised learning. AE neural network also satisfy the constraints of analog phase shifters to match was ﬁrstly proposed in data compression, by learning a pre- the practical hardware scheme. For fully connected beam- sentation for a set of data, such as images. Traditional AE former, each RF chain is connected to all Nr antennas via neural networks are designed to recovery the original data from phase shifter neural network. The transmit complex signals compressed information, which is similar to the optimization processed by analog beamformer can be written as problem as writing (6). In this work, we propose a neural HB st = ρ[st , st , ..., st ]T network by mapping the hardware block diagram of downlink 1 2 Nt rf rf rf Nt Nt Nt single-cell multi-user mmWave massive MIMO system shown jθt jθt jθt P p,1 P p,2 P p,Nt T = ρ[ sD,1e , sD,2e ,..., sD,N rf e ] , in Fig. 2 . p=1 p=1 p=1 t (10) t A. Digital beamformer/combiner design where sn, n ∈ {1, 2, ..., Nt} denotes the transmitted signal of rf For digital beamformer/combiner, the input complex signal the n-th antenna, and sD,p, p ∈ {1, 2, ..., Nt } denotes the p- x of the neural network is divided into two parts, the real part th RF chain signal processed by digital beamformer. To meet <(x) and the imaginary part =(x). Therefore, the input signal the transmit power constraint, the power control parameter ρ of the fully connected neural network can be written as X = is set as Nt [<(x), =(x)]. The function for a one layer fully connected X t 2 −1/2 ρ = (P sn ) . (11) neural network is given by n=1

Y =[<(y), =(y)] = [σ(W<<(x) − W==(x) + b<), (7) For user k, the processed signal by analog combiner can be σ(W<=(x) + W=<(x) + b=))], modeled as where Y is the concatenation of the real part <(y), and the T imaginary part =(y) of the processed complex signal y. The sA,k =[sA,k,1, sA,k,2, ..., s rf ] A,k,Nr weights of the real and imaginary channel can be stated as W< Nr Nr and W , respectively. The biases of the real and imaginary X r jθr X r jθr = =[ s e k,m,1 , s e k,m,2 , ..., b b k,1 k,1 channel can be stated as < and =, respectively. Here, the m=1 m=1 (12) symbol σ(·) denotes the activation function. Nr r jθ rf Digital beamformer/combiner can adjust both the phase X r k,m,N T sk,1e r ] , and the amplitude of the original signals without limitations. m=1 We employ two n-layers fully connected neural networks rf to implement baseband beamformer/combiner, respectively. where sA,k,q, q ∈ {1, 2, ..., Nr } denotes the q-th RF chain’s r Concisely, the processed signal of the digital beamformer is signal of the user k. Here, θk,m,q, m ∈ {1, 2, ..., Nr} denotes represented as the phase parameter between the m-th receive antenna and the n q-th RF chain for the user k. SD = ft (S; αt), (8) Notice that the formulas (10) and (12) involve multiplication where SD = [<(sD), =(sD)]. The output complex signal sD of complex number such as xejθ, combining with Euler’s of the digital beamformer neural network can be obtained by formula xejθ is equivalent to n combining SD. Here, ft represents the concatenation of n- jθ layers fully connected neural network and αt represents the y = xe = x(cosθ + j sin θ). (13) 4 SUBMITTED TO GLOBALSIP

In phase shifter neural network, the multiplication in (13) Fully, K=2 can be stated as Fully, K=4 10 -1 partially, K=2 Y = [<(y), =(y)] partially, K=4 = [(<(x)cosθ − =(x) sin θ), (<(x) sin θ + =(x)cosθ)]. (14) 10 -2

Similar to the writing (8) and (9) the output of the analog BitError Rate beamformer/combiner can be represented respectively as 10 -3 t t S = gt(SD; θ ), r r (15) SA = gr(S ; θ ), 10 -4 -20 -18 -16 -14 -12 -10 -8 -6 -4 -2 where gt(·)/gr(·) denotes the analog beamformer/combiner SNR(dB) neural network. The phase parameters are represented as Fig. 3. Illustration of the BER comparison with different number of users θt/θr. rf rf rf K = 2, 4 DNHF. The parameters are set as Nr = 2, Nt = KNr and the total data streams KNs = 2 × K C. Channel transmission

The channel transmission and noise adding process are HBF[20] OMP[19] realized by neural network with ﬁxed parameters, for the any -1 MO[17] 10 DNHB, partially connected k-th user, which is stated as DNHB, fully connected r r r Sk = [<(sk), =(sk)] -2 -8 -7.5 t t (16) 10 = [<(s Hk + nk), =(s Hk + nk)] 2 1.5 h iT BitError Rate r r T r T r T t -3 where s = (s1) , (s2) ,..., (sK ) . s is the transmitted 10 1 r signal at BS, and sk is the received signal at user k. The

CSI matrix between the user k with the BS is represented -3 10 -4 10 as Hk. nk represents the corresponding noise vector of i.i.d. -20 -18 -16 -14 -12 -10 -8 -6 -4 -2 2 SNR(dB) CN 0, σk . Fig. 4. Illustration of the comparison with some existing methods. The rf rf D. Optimization Problem Formation parameters are set as K = 2, Nt = Nr = 2 and the totally data streams KN = 4. In the multi-user system, the sum-MSE (4) is regarded as s the reconstruction error. Thus, the loss function of DNHB is K The Fig. 4 is the comparison with other exsiting beamform- ˜ X 2 L =L S, S = Eksk − ˜skkF ing methods, such as the conventional orthogonal matching k=1 pursuit based algorithm in [22] (labelled with ’OMP’) and K one conventional HB algorithm in [23] (labelled with ’HBF’) X h 2 2 i = E k< (sk) − < (˜sk)kF + k= (sk) − = (˜sk)kF and the manifold optimization HB algorithm in [18] (labelled k=1 with ’MO’). It illustrates the BER performance comparison (17) when K = 2. The proposed DNHB algorithm has 2 ∼ 3dB performance advantages. IV. SIMULATION In the simulation experiments, the mmWave propagation V. CONCLUSION channel is based on a geometry channel model [21]. The con- In this paper, a new vision on the HB design is proposed ﬁguration of the MIMO system simulation is set as, Nt = 64 rf in the perspective of autoencoder neural network. Compared at BS, Nr = 16, Nr = 2 and Ns = 2 at each user. The with traditional matrix decomposition, the AE neural network number of channel clusters is 2 and 2 rays with each cluster. provides a strong representation ability to map the non- Fig. 3 shows the BER comparison of the different number of convex HB design to a network training process. The method users K = 2, 4 in fully connected and partially connected ana- exhibits signiﬁcantly superior performance than the traditional log beamformer/combiner. As it depicts, the fully connected linear matrix decomposition methods, in terms of BER. The analog beamformer/combiner has 3dB better performance than proposed work provides a new vision on the HB design, which partially connected analog beamformer/combiner in BER. The may have great potential in guiding designs in the future partially connected analog beamformer/combiner has more intelligent radio. constriction due to its structure. And the 2 users has superior performance in comparison than 4 users, as the information between different users cannot be transmitted and exchanged in the multi-user system. With the increases in number of users is in the system, its performance is suggested to get worse. SHELL et al.: BARE DEMO OF IEEETRAN.CLS FOR IEEE JOURNALS 5

REFERENCES systems. In 2017 IEEE International Conference on Communications (ICC), pages 1–6, May 2017. [1] Yikun Yu, Peter G. M. Baltus, and Arthur H. M. van Roermund. Millimeter-Wave Wireless Communication. In Yikun Yu, Peter G.M. [13] S. Hur, T. Kim, D. J. Love, J. V. Krogmeier, T. A. Thomas, and Baltus, and Arthur H.M. van Roermund, editors, Integrated 60GHz RF A. Ghosh. Millimeter Wave Beamforming for Wireless Backhaul and Beamforming in CMOS, Analog Circuits and Signal Processing, pages Access in Small Cell Networks. IEEE Transactions on Communications, 7–18. Springer Netherlands, Dordrecht, 2011. 61(10):4391–4403, October 2013. [2] S Mumtaz, J Rodriguez, and Linglong Dai. mmWave Massive MIMO: [14] Tianqi Wang, Chao-Kai Wen, Hanqing Wang, Feifei Gao, Tao Jiang, A Paradigm for 5G. 2016. and Shi Jin. Deep Learning for Wireless Physical Layer: Opportunities [3] D.P. Palomar, J.M. Cioffi, and M.A. Lagunas. Joint tx-rx beamforming and Challenges. arXiv:1710.05312 [cs, math], October 2017. arXiv: design for multicarrier mimo channels: a unified framework for convex 1710.05312. optimization. IEEE Transactions on Signal Processing, 51(9):2381– [15] Timothy J. O’Shea, Tugba Erpek, and T. Charles Clancy. Deep Learning 2401, September 2003. Based MIMO Communications. arXiv:1707.07980 [cs, math], July [4] Andreas F. Molisch, Vishnu V. Ratnam, Shengqian Han, Zheda Li, 2017. arXiv: 1707.07980. Sinh Le Hong Nguyen, Linsheng Li, and Katsuyuki Haneda. Hybrid Beamforming for Massive MIMO: A Survey. IEEE Communications [16] G. Aceto, D. Ciuonzo, A. Montieri, and A. Pescap. Mobile en- Magazine, 55(9):134–141, 2017. crypted traffic classification using deep learning: Experimental evalu- [5] Irfan Ahmed, Hedi Khammari, Adnan Shahid, Ahmed Musa, ation, lessons learned, and challenges. IEEE Transactions on Network Kwang Soon Kim, Eli De Poorter, and Ingrid Moerman. A Survey on and Service Management, 16(2):445–458, June 2019. Hybrid Beamforming Techniques in 5g: Architecture and System Model [17] G. Aceto, D. Ciuonzo, A. Montieri, V. Persico, and A. Pescap. Know Perspectives. IEEE Communications Surveys & Tutorials, 20(4):3060– your big data trade-offs when classifying encrypted mobile traffic with 3097, 2018. deep learning. In 2019 Network Traffic Measurement and Analysis [6] Jiang Jing, Cheng Xiaoxue, and Xie Yongbin. Energy-efficiency based Conference (TMA), pages 121–128, June 2019. downlink multi-user hybrid beamforming for millimeter wave massive MIMO system. The Journal of China Universities of Posts and [18] T. Lin, J. Cong, Y. Zhu, J. Zhang, and K. Ben Letaief. Hybrid Telecommunications, 23(4):53–62, August 2016. beamforming for millimeter wave systems using the mmse criterion. [7] Weiheng Ni and Xiaodai Dong. Hybrid Block Diagonalization for IEEE Transactions on Communications, 67(5):3693–3708, May 2019. Massive Multiuser MIMO Systems. arXiv:1504.02081 [cs, math], April [19] M. Joham, W. Utschick, and J. A. Nossek. Linear transmit processing 2015. arXiv: 1504.02081. in mimo communications systems. IEEE Transactions on Signal [8] A. Alkhateeb, G. Leus, and R. W. Heath. Limited Feedback Hybrid Processing, 53(8):2700–2712, Aug 2005. Precoding for Multi-User Millimeter Wave Systems. IEEE Transactions [20] O. E. Ayach, S. Rajagopal, S. Abu-Surra, Z. Pi, and R. W. Heath. on Wireless Communications, 14(11):6481–6494, November 2015. Spatially sparse precoding in millimeter wave mimo systems. IEEE [9] Duy H. N. Nguyen, Long Bao Le, and Tho Le-Ngoc. Hybrid MMSE Transactions on Wireless Communications, 13(3):1499–1513, March precoding for mmWave multiuser MIMO systems. In 2016 IEEE 2014. International Conference on Communications (ICC), pages 1–6, Kuala [21] F. Sohrabi and W. Yu. Hybrid digital and analog beamforming design Lumpur, Malaysia, May 2016. IEEE. for large-scale antenna arrays. IEEE Journal of Selected Topics in Signal [10] Jiyun Tao, Qi Wang, Siyu Luo, and Jienan Chen. Constrained Deep Processing, 10(3):501–513, April 2016. Neural Network based Hybrid Beamforming for Millimeter Wave Massive MIMO Systems. 2019 IEEE International Conference on [22] M. Kim and Y. H. Lee. Mse-based hybrid rf/baseband processing for Communications Workshops (ICC), pages 1–1. millimeter-wave communication systems in mimo interference channels. [11] T. J. O’Shea, K. Karra, and T. C. Clancy. Learning to communicate: IEEE Transactions on Vehicular Technology, 64(6):2714–2720, June Channel auto-encoders, domain specific regularizers, and attention. In 2015. 2016 IEEE International Symposium on Signal Processing and Infor- [23] X. Yu, J. Shen, J. Zhang, and K. B. Letaief. Alternating minimization mation Technology (ISSPIT), pages 223–228, December 2016. algorithms for hybrid precoding in millimeter wave mimo systems. IEEE [12] X. Gao, L. Dai, Y. Sun, S. Han, and I. Chih-Lin. Machine learning Journal of Selected Topics in Signal Processing, 10(3):485–500, April inspired energy-efficient hybrid precoding for mmWave massive MIMO 2016.