arXiv:1912.10561v1 [cs.IT] 22 Dec 2019 n ehooy(AS) hwl akhPoic,SuiAra Saudi Province, Makkah [email protected]. Thuwal, (KAUST), Technology and rcsnRsac,See,Email: Sweden, Research, ali.behravan Ericsson LTE superposition of multi-user study-item DL a of as [2]. name well (DMST) the as transmission under [1] 13, 12 Release Release mit LTE (ICI) in tion inter-cell for an (NAICS) as cancellat suppression and/or interference introduced network-assisted been the different frequency of has in extension time, NOMA NOMA instance, in considered For applications. has resources and 3GPP co-scheduled radio Particularly, 5G are code. same beyond candi- UEs the and multiple a 5G share NOMA, LTE, as With for attention technique systems. access considerable multiple received date has to (NOMA) leading UEs, receivers. among simple interference with performance mutual system benefit no high the is have there designs that orthogonal downlink and Such (UL) transmission. uplink (DL) both (NR) in radio new waveform 5G OFDMA Also, utilizes schemes. orthogo- (OMA) as access (SC)-FDMA multiple nal single-carrier multipl LTE- and division (OFDMA) and frequency access introduced, (LTE) orthogonal been developed Evolution have Advanced Long-Term schemes code) Then, (T: 1G- (C: respectively. TDMA CDMA a In (FDMA), provide and access in manner. time) multiple to resources division complexity-efficient is frequency radio and 3G, goal with cost- the (UEs) spectrum-, Here, equipments design. user multiple systems cellular the mrv h nryefiinyadteedt-n transmissi end-to-end the systems. applied NOMA-based and be of efficiency delay hybr can complex energy of pairing techniques the user cost smart improve the different the as complexity, demonstrated, well receiver As as request the both repeat reducing automatic of to a as particular paid Here, delay transmission, efficiency. is its NOMA-based methods and improve (DL) to of ways complexity downlink number different and implementation a MU- (UL) propose the to uplink we compared reduce NOMA, Then, to this of negligible. de- gain In is possible such MIMO, 5G. the ( in work- beyond where multiple-input-multiple-output ended multi-user MIMO), a in b that and comparisons as use NOMA simulation discussions the it possible present the (NR). we for with Particularly, review radio cision. it continue first new leave to we 5G to paper, not for and decided 3GPP item, was in it study-item However, a as considered ero Makki, Behrooz oae-lmAoiii ihteKn bulhUiest o University Abdullah the with is Alouini Mohamed-Slim ero ak,KihaCit,adAiBhaa r with are Behravan Ali and Chitti, Krishna Makki, Behrooz ntels e er,nnotooa utpeaccess multiple non-orthogonal years, few last the In in interest of is schemes access multiple of design The Abstract uvyo OA urn ttsadOpen and Status Current NOMA: of Survey A } Nnotooa utpeacs NM)hsbeen has (NOMA) access multiple —Non-orthogonal @ericsson.com eirMme,IEEE Member, Senior .I I. NTRODUCTION { ero.ak,krishna.chitti, behrooz.makki, eerhChallenges Research rsn hti l ervnadMhmdSi Alouini, Mohamed-Slim and Behravan Ali Chitti, Krishna , i,Email: bia, Science f ttention o and ion etween MU- iga- ity. on to id e Lsses(ntematm,ms h icsin are discussions threefold: the are paper most the cont of meantime, The tions transmission). the DL for (in applicable/easy-to-extend 5G. beyond a systems in as use UL NOMA possible with for it continue leave the to to in not explain and we decided work-item, that OMA was reasons the it the to following, to addition due in However, NOMA, NR [28]. UL whether [27], least) on (at guidelines provide support and should NOMA of eval enough to benefits study-item the not fashion. a considered OMA-based are 3GPP an there 2018, in in Particularly, that them number serve such to large access resources a for orthogonal with requesting networks possibility UEs dense a of in as transmission suggested been data complexit has for coordination at NOMA and techniques reasons, pairing these OMA UE For has receiver, existing NOMA of the cost settings, shown outperform the parameter As to [26]. proper potential [25], with the to [4], works, complexity these [19]–[21], and receiver in NOMA [22]–[24], the schemes using reduce pairing to cases repeat UE the automatic low-complexity typical hybrid DL develop to the as both (HARQ) incorporate such in request to methods [15]–[18], NOMA transmission UL of data and performance [13]–[16] [3], ultimate the termine distinctness. codin symbol-level spreading, interleaving is on or based and them scrambling, is which among bit- design difference signature in UEs’ main the differences the with implementation, NOMA along interleave- and, (ID: superposition the IDMA ciple follow and techniques These [11], IGMA mu [12]. [10], division) spread [8], (WSMA) equality (MUSA) access Welch-bound tiple access [9], resource shared interleave-grid) (RS: (IG: RSMA multi-user [6], [7], [4], division) code) pattern spread) sparse (PD: (SC: PDMA SCMA [3], [5], NOMA domain power ing, nti ae,w td h efrac fNM in • NOMA of performance the study we paper, this In de- to presented been have results fundamental Various includ- NOMA for proposed been have schemes Different • rvd udlnsfrtersaceso o oimprove NOMA. to of how practicality on the researchers the we not for conclusions of guidelines Such Particularly, provide work-item. conclusion a the as NOMA. NOMA to with continuing leading on discussions the 3GPP study-item present in 15 presented conclusions Release final the summarize We oM-IO ntrso lc ro ae(LR,is complexity. (BLER), implementation rate its motivate error to large of that terms not in compared NOMA MU-MIMO, we of to As gain estimation. cases performance relative channel the the non-ideal show, for and presented ideal are both with results the multi-user Here, different conditions. and in NOMA the (MU-MIMO) compare multiple-input-multiple-output WSMA-based to of results performance evaluation link-level present We elw IEEE Fellow, prin- ribu- uate g, y. l- 1 2

• We demonstrate different techniques to reduce the imple- called a Welch bound equality (WBE) set. The constituent SSs mentation complexity of NOMA-based systems. Here, we satisfy WB as an ensemble and not individually, so there exists concentrate on developing low-complexity schemes for several sets S with similar correlation properties satisfying the UE pairing, receiver design and NOMA-HARQ, where WB for the same optimal PI. Also, it is required to have a H simple methods can be applied to reduce the implementa- low correlation value, given as ρij = s sj , between the i 2 tion complexity of NOMA remarkably. These results are | | K constituent vectors of S. The motivation to have TSC = L interesting for academia because each of the proposed is that, at the equality, several performance metrics in the schemes can be extended and studied analytically in a system, such as sum-capacity and sum- mean square error separate technical paper. (MSE), are optimized simultaneously [10], [11]. This makes it As we demonstrate, different techniques can be applied to an attractive option for multiple access implementation. Such reduce the implementation complexity of NOMA. Moreover, optimization and SS generation are well understood in the there is a need to improve the spectral efficiency and the prac- context of interference avoidance techniques [30, Chapter 2]. ticality of implementation, in order to have NOMA adopted Other PIs that may be considered for the SS genera- by the industry. tion include the worst-case matrix coherence given as µ = maxi6=j ρij and the minimum chordal distance dcord (for II. PERFORMANCE ANALYSIS detailed mathematical definition see [31]). Optimizing each PI separately will result in a set of SSs each with a different In this section, we first present the principles of WSMA set of correlation properties. Each of these sets is a subset as an attractive spreading-based NOMA technique. Then, of the WBE set. The number of vectors in each set must we compare the performance of WSMA NOMA with MU- be decided before the optimization of the respective PIs. As MIMO and summarize the final conclusions presented in 3GPP an example, optimizing TSC will result in a WBE set whose Release 15 study-item on NOMA. constituent vectors may have unequal correlation among them. Similarly, optimizing the worst-case matrix coherence µ will A. WSMA-based NOMA also produce a WBE set but with an additional property that WSMA is a spreading-based NOMA scheme [29]. Here, the each constituent SS is equally correlated with every other SS key feature is to use non-orthogonal short spreading sequences in the set. Such a set is known as a Grassmann set or an with relatively low cross-correlation for distinguishing multi- equiangular set, and the optimization problem is often referred ple users, and the spreading sequences are non-sparse. The to as line-space packing problem [32]. WSMA spreading sequences are based on the Welch bound At times, it may be required to have zero correlation [10], [11], the details of which are explained in the following. between few vectors of the constituent SSs in the set. In that Let us consider K UEs and signals of dimension L. The case, optimizing dcord is an attractive option. The optimization focus here is limited to symbol-level NOMA where each problem is then referred to as sub-space packing problem UE is assigned a UE specific vector from a set of pre- [31]. Equations (3)-(5) show the correlation properties of S, designed vectors. These vectors jointly have certain correlation each generated by optimizing a different PI, w.r.t the element- H properties. Consider K vectors, sk, k = 1,...,K called wise absolute value of the (K K) Grammian matrix (S S) { } × H signature sequences (SS), such that each sk is of the dimension when the number of active UEs K =4. This S S matrix is . L 2 independent of the dimension L and is given for different PIs (L 1) and sk 2 = 1, k, where sk 2 = l=1 sk,l . Let × || || ∀ || || P 1 S = [s1, s2, , sK ], be the overall (L K) signature matrix. as follows K··· × The factor L is referred to as the overloading factor in the WSMA context. Since one of the objectives of NOMA is to 1 ρ12 ρ13 ρ14 K   support a higher user density, it is required to have L > 1. H ρ12 1 ρ23 ρ24 K S S TSC = , (3) However, beyond a certain value of , the system will be | | ρ13 ρ23 1 ρ34 L   interference-limited. Depending on the required correlation ρ14 ρ24 ρ34 1  properties of S, a certain performance indicator (PI) is chosen and optimized for the generation of S. One such PI is the total 1 ρ ρ ρ squared correlation (TSC) and is given as   H ρ 1 ρ ρ K K S S µ = , (4) H 2 | | ρ ρ 1 ρ TSC = s sj , (1)   X X | i | ρ ρ ρ 1 i=1 j=1 where ( )H denotes the Hermitian operator. This scalar PI is · 1 0 ρ13 ρ14 lower-bounded by a value called the Welch bound (WB) and   H 0 1 ρ23 ρ24 is given as [10], [11] S S d = . (5) | | cord ρ13 ρ23 1 0  2   K ρ14 ρ24 0 1  TSC . (2) ≥ L 1WSMA is mainly designed for overloaded systems where K>L. How- On obtaining the optimal value of the chosen PI, TSC in ever, for simplicity, (3)-(5) show the Grammian matices w.r.t the mentioned this case, the WB is satisfied with equality and the set S is PIs at 100% overloading where K = L. 3

In another case, a baseline system for comparison could be Input bits Channel coding and QAM symbol spreading and Output symbols optional scrambling modulation optional interleaving based on MU-MIMO [33], [34]. In this case, both the NOMA and the MU-MIMO systems could be compared for the same Figure 1. Baseband transmitter implementation of WSMA-based NOMA at number of users per RE. In addition to the frequency domain, a user. the space domain provides additional degrees of freedom (DoF) to the BS. With multiple receive antennas at the BS, a joint space-frequency multiuser detector may be employed. B. NOMA vs MU-MIMO The MU-MIMO system relies only on the spatial separation A generalized block diagram of the baseband transmitter while NOMA has additional frequency domain, the assumed for NOMA implementation is shown in Fig. 1. The infor- spreading domain, for UE separation. The same multiuser mation bits of a UEk are channel coded and then digitally detector, with a little or no modification in implementation, modulated. For a bit-level NOMA implementation, the channel may be used for both NOMA and MU-MIMO. For the coded bits may be scrambled by a UE specific scrambling MU-MIMO, an additional UE grouping and scheduling each sequence and then digitally modulated. Symbol-level UE spe- group over orthogonal REs must also be considered for a cific NOMA block appears after the quadrature amplitude fair comparison. Since increasing UE density is one of the modulation (QAM)-modulation block. Using WSMA, each NOMA objectives, a comparison for the maximum number of incoming QAM-symbol qk is repeated L times in a weighted admissible UEs at a given target BLER may also be verified manner by a UE assigned SS sk to obtain an (L 1) output while comparing NOMA and MU-MIMO. × symbol vector qksk, i.e., a symbol spreading functionality. Considering ideal channel estimation, Figs. 2 and 3 show The repeated symbols qksk may optionally be interleaved to the link-level performance comparison of WSMA with MU- increase the randomness of the multiuser interference (MUI) MIMO when the modulation is QPSK and the transport block to simplify the detector implementation. Usually, S is pre- size (TBS) is 20 bytes. The carrier frequency is 700 MHz generated in the system. To achieve collision-free multiple and we assume that the channels follow the Tapped Delay access, these SSs may be pre-assigned to the UEs, such that no Line (TDL-C) model [35, Section 7.7.2]. Note that TDL-C two UEs have the same SS. Cooperation among the UEs may channel model may be used for simplified evaluation of non- further improve the performance, but comes at an increased line-of-sight (NLOS) communication. The considered channel complexity and additional communication overhead among the model for link-level simulations suffices the NOMA setup UEs before the actual transmission to the base station (BS). which usually targets a high user density coupled with low With K NOMA transmitters, the received (L 1) composite mobility and small delay spread values. vector can be mathematically written as × There are four receive antennas at the BS and each UE is equipped with a single antenna. It is assumed that each UE K is transmitting with a unit power value over its allocated 6 y = hk sk√pkqk + z, (6) X ⊙ physical resource blocks (PRBs) and 12 data OFDM symbols. i=1 The detector at the BS is MMSE (M: minimum) based. With where for a UEk, qk is its QAM-symbol, pk is its transmit the spread length 4, the codebook is based on the PI TSC. The power, and hk is the (L 1) fading channel to the BS. Also, channel encoding employed is the rate-matched LDPC code. × z is the (L 1) zero mean AWGN vector, and is the element- There is no scrambling and interleaving at the transmitters. × ⊙ wise multiplication. Symbol repetition at each UE may be AWGN is assumed to have a unit variance. In Figs. 2 and performed in the frequency domain, but it can also be applied 3, the average BLER values per UE are shown for varying in the code domain. With a scalar value pk, the transmitter number of UEs K =6 and K = 12, respectively. For a given allocates the same power to all its incoming QAM symbols. K, to have the same number of UEs per PRB as in the case This may be replaced by an (L 1) per subcarrier power of WSMA, MU-MIMO divides the UEs into varying number × allocation power vector pk. Finally, note that (6) assumes that of groups G and varying number of UEs per group Nu such each UE is equipped with a single transmit antenna. However, that K = GNu. it may be extended for higher number of transmit antennas, From Figs. 2 and 3, it can be observed that, for the where each spatial layer may have its own SS. assumed setup and various values of G, WSMA outperforms With WSMA, each UE uses L times more resources to MU-MIMO, in terms of BLER, if ideal channel estimation transmit the same number of QAM-symbols. This may not is considered. There is also a saturation observed for MU- be spectrally efficient. Hence it is important to overload the MIMO when it is heavily loaded. MU-MIMO systems are system, i.e., increase K for a fixed L, to increase the sum- interference-limited, i.e., beyond a certain signal-to-noise ratio rate. This may lead to situations that require optimization of (SNR), an increase in the transmit power at each user may different metrics with conflicting interests. result in diminishing returns of the performance. A user’s The baseline system for comparison with NOMA could be signal is drowned in the multiple access interference (MAI). an OMA setup when there are K = L users. This is similar With the overloading that NOMA targets, the available spatial to 100% overloading with SS matrix equal to identity matrix DoF in MU-MIMO system are not sufficient to isolate the of size K K, and the UEs are scheduled over orthogonal constituent signals from the received composite signal. This resources. The× receiver, BS in this case, receives the composite leads to the saturation in the BLER of MU-MIMO. signal and separates UEs in the frequency domain. With NOMA, on the other hand, due to symbol repetition 4

100 0 G=6, 1UE/PRB, MU-MIMO 10 G=6, 2UEs/PRB, MU-MIMO G=3, 2UEs/PRB, MU-MIMO G=3, 4UEs/PRB, MU-MIMO G=2, 3UEs/PRB, MU-MIMO G=2, 6UEs/PRB, MU-MIMO G=1, 6UEs/PRB, MU-MIMO -1 -1 12 UEs/PRB, WSMA 10 6 UEs/PRB, WSMA 10

1.3 dB Gain Ideal channel estimation 1 dB Gain by NOMA -2 by NOMA 10 10-2 Average BLER per UE Average BLER per UE Ideal channel estimation

-3 10 10-3 -20 -15 -10 -5 0 5 10 15 -20 -15 -10 -5 0 5 10 SNR (dB) SNR (dB)

Figure 2. NOMA vs MU-MIMO (K = 6, an ideal channel estimation). Figure 3. NOMA vs MU-MIMO (K = 12, an ideal channel estimation).

100 K=12 by the low correlation spreading, the energy per resource K=6 element (RE) on an average is reduced (note: the SS are 10-1 unit ). This ensures that users’ signals perceive a lower Non-ideal channel estimation MAI. This, however, comes at a possible reduced spectral 10-2 efficiency, since each NOMA UE will consume L times more

REs than its MU-MIMO counterpart. This is very prominent at 10-3 lower overloading factor, where the MU-MIMO outperforms Average BLER per UE

NOMA. Hence a trade-off exists between the overloading and 10-4 spectral efficiency. Nevertheless, with NOMA the error floors -15 -10 -5 0 5 10 15 20 SNR (dB) are lowered thereby providing a possibility to squeeze in more UEs per RE for the same target BLER as in MU-MIMO. Figure 4. NOMA vs MU-MIMO. K = 6 and K = 12, non-ideal channel However, as G increases, MU-MIMO experiences saturation estimation. in lower BLERs and the difference between NOMA and MU- MIMO decreases. This is because by increasing G in MU- possible advantage of MU-MIMO over WSMA and vice versa. MIMO, the number of users per RE is reduced, leading to less However, considering different cases, the performance gain of multiuser interference for each user. More importantly, even NOMA, compared to MU-MIMO, is negligible. with an ideal channel estimation, the relative performance gain of NOMA, compared to MU-MIMO, is negligible, and the relative performance gain decreases with the number of UEs. C. NOMA for beyond 5G For instance, considering the parameter setting of Figs. 2 and During the NOMA Study in 3GPP for 5G NR, a large 2 3 and BLER 10− , NOMA-based data transmission reduces number of link-level simulations of transmission schemes and the required SNR, compared to MU-MIMO, only by 1.3 and 1 corresponding receivers were carried out [27]. Particularly, dB in the cases with K =6 and K = 12 UEs, respectively. A 14 different companies, each with its own NOMA scheme, definite advantage of having a higher UE density with NOMA provided link-level results and studied the BLER for more is visible from Fig. 3. than 35 cases. The link-level parameters were generally well Figure 4 compares WSMA and MU-MIMO for both K =6 aligned among companies, which enabled easy comparison and K = 12, but with non-ideal channel estimation. A TDL- between different methods. Moreover, all NOMA schemes, A channel model is assumed. At the UEs, the modulation including those supported by Rel-15, performed similarly at is 16-QAM and the TBS is 60 bytes. As before, the BLER link-level in key conditions. Then, 8 companies provided performance of MU-MIMO depends on its configuration, i.e., system-level simulation results, where in total 37 different sets on the parameters G and Nu. For K = 12, the performance of NOMA versus baseline results were provided. As opposed of MU-MIMO with G =6 and WSMA is very similar over a to the cases with link-level simulations, widely different pa- wide range of considered SNRs, with the latter outperforming rameter sets were used in the system-level simulations, with the former only beyond a target BLER of 10−3. When the different baselines, making comparisons intractable. Here, setup is relatively less dense, i.e., K = 6, a similar trend is the results have been presented for both synchronous and observed between MU-MIMO with G = 3 and WSMA. The asynchronous operation models, while the main focus was on target BLER beyond which WSMA performs better is now the synchronous operation. 10−2. At lower SNR, less than 0 dB, MU-MIMO has a better According to the results presented during the 3GPP study- performance when compared to WSMA. This is also the case item, in ideal conditions, NOMA can be better or worse than for MU-MIMO with G = 2. These results do not indicate a MU-MIMO, depending on the number of UEs and simulation 5

UE Messages of UE (1) (2) ... (1) g Messages of ... UE (1) (1)

SIC of signal UE Retransmit (1) if UE signal UE UE retransmission g 1 : NACK 1 : ACK 1 deocoding signal deocoding stops or BS decodes 1 : NACK 1 : ACK 1 but (1) is not BS 2 : ACK decoded yet Decoding - Decode and - Decode and remove (1) using all retransmitted signals process at BS remove (1) - Decode 1 - Decode 1 Figure 5. UL NOMA. UEs with different channels qualities are paired and - Decode (2) the BS performs SIC to decode the signals sequentially. Figure 6. Reducing the expected number of retransmissions in NOMA. If the BS fails to decode both signals, it asks for retransmission from only one of the UEs, while the other UE delays the retransmission. The retransmission gives parameters. With realistic channel estimation in multipath, the chance to decode the retransmitted signal. Then, removing the interference, the BS can decode the other failed signal interference-free and with no need however, the relative perfromance gain of NOMA decreases, for retransmission. and MU-MIMO may outperform NOMA, depending on the parameter settings/channel model. No clear gain from NOMA over Rel-15 mechanisms were observed in all studied sce- suffers from poor diversity. Also, NOMA is faced with error narios. Particularly, in a large number of conditions, link- propagation problem where, if the receiver fails to decode a level results from many companies show no gain over Rel- signal, its interference affects the decoding probability of all 15 techniques and the system-level simulations do not show remaining signals which should be decoded sequentially. For conclusive gain. In summary, in harmony with our results these reasons, there may be a high probability for requiring presented in Figs. 2-4, it was hard to find worthwhile NOMA multiple HARQ retransmissions leading to high end-to-end gains. This was the main reason that 3GPP decided not (E2E) packet transmission delay [19]–[21]. The following to continue with NOMA as a work-item, and leave it for schemes develop NOMA-HARQ protocols with low imple- beyond 5G where new use-cases with ultra-dense UEs may mentation complexity. be motivating for NOMA. 1) Smart NOMA-HARQ [36]: Our proposed retransmission process is explained in Fig. 6. Assume that in Slot 1 the BS III. REDUCINGTHE IMPLEMENTATION COMPLEXITY can not decode correctly the signals of the UEs, i.e., X1(1) One of the key challenges of NOMA is the implementation and X2(1). While buffering both failed signals, it asks UE2 to complexity, in different terms of UE pairing, signal decoding, delay the retransmission of the failed signal. Also, it asks UE1 (resp. UE2) to retransmit the failed signal X1 (resp. send a CSI acquisition, etc. This is specially because NOMA is useful (1) in dense networks where the implementation complexity of the new signal X2(2)) in Slot 2. At the end of Slot 2, the BS first system increases rapidly with the number of UEs. This section combines the two interference-affected copies of X1(1) and decodes it using, e.g., maximum ratio combining (MRC). If proposes different techniques to reduce the implementation complexity of NOMA. These results are interesting because the signal of UE1 is correctly decoded, the BS has the chance to use SIC, remove X1 and decode both the failed and the 1) they provide guidelines to use NOMA with relatively low (1) complexity. Also, 2) each of the proposed schemes, which have new signals of UE2, i.e., X2(1) and X2(2) received in Slots 1 been filed in patent applications, can be studied analytically and 2, respectively, interference-free and with no need for the retransmission from UE2. Finally, UE2 starts retransmitting by academia in a separate paper. X2 only if the retransmission of UE1 stops (either because For generality and in harmony with the discussions in 3GPP, (1) the maximum number of retransmissions is reached or the BS we present the proposed schemes for UL NOMA. However, has correctly decoded the signal of UE1) while the signal of it is straightforward to extend our proposed approaches to UE2 has not been decoded yet. the cases with DL transmission. We consider the cases with In this way, NOMA gives an opportunity to reduce the pairing a cell-center and a cell-edge UE, i.e., UE1 and UE2 in number of retransmissions, and improve the E2E throughput. Fig. 5, respectively, with g1 g2 where gi denotes the channel ≥ Also, the fairness between the UEs increases because the gain in the UEi-BS link. However, the discussions hold for arbitrary number of paired UEs. Moreover, for simplicity, we required number of retransmissions of the cell-edge UE, i.e., present the setups for the cases with power-domain NOMA UE2 in Fig. 6, decreases remarkably. The keys to enable such and successive interference cancellation (SIC)-based receivers, a setup are that 1) the BS should decode all buffered signals while the same approaches are applicable for different types of in each round and 2) it should inform the UEs about the NOMA-based data transmission/receivers. We concentrate on appropriate retransmission times. reducing the implementation complexity of the HARQ-based 2) Dynamic UE Pairing in NOMA-HARQ [37]: Here, the data transmission, UE pairing and receiver as follows. objective is to improve the performance gain of NOMA- HARQ by adding virtual diversity into the network. In our proposed setup, depending on the message decoding status, A. HARQ using NOMA different pairs of UEs may be considered for data transmission Due to the CSI acquisition and UE pairing overhead, in different retransmission rounds. As an example, considering NOMA is of most interest in fairly static channels with Fig. 5, assume that UE1 and UE2 with g1 g2 are paired no frequency hopping where channels remain constant for and send their signals to the BS in a NOMA-based≥ fashion. a number of packet transmissions. As a result, the network However, the BS fails to decode X1(1) (and with high 6

Successful decoding UE In this way, compared to the cases with conventional Unsuccessful decoding OMA techniques, using the adaptive multiple access scheme,

UE along with HARQ, makes it possible to exploit the net-

BS work/frequency diversity and increase the UEs’ achievable rates. Moreover, our proposed scheme satisfies the tradeoff between the receiver complexity and the network reliability,

Frequency and, compared to the state-of-the-art OMA-based systems, UE UE UE UE UE UE UE UE UE & UE & UE & improves the service availability/the network reliability sig- nificantly. Finally, the proposed scheme improves the fairness UE UE UE UE UE UE UE UE UE & & UE between UEs and is useful in buffer-limited systems. Finally, note that, while we presented the proposed NOMA- Time HARQ schemes for RTD (Type II) HARQ [39], the same Figure 7. Multiple access adaptation in different (re)transmission rounds. If approaches are applicable for other HARQ protocols as well. the signal of an UE is not correctly decoded, it has the chance to reuse the spectrum resource of the other UE during retransmissions. B. Simplifying the UE Pairing With NOMA, optimal UE pairing becomes challenging probability X2(1)). Then, in [37], we propose that in the as the number of UEs increases, because it leads to huge retransmission(s) UE1 can be paired with a new UE, namely, CSI acquisition and feedback overhead as well as running UE0 with g0 g1. This is intuitively because a large portion complex optimization algorithms [22]–[24]. For these reasons, ≥ of the SNR required for successful decoding of X1(1) has we present low-complexity UE pairing schemes as follows. been provided in Round 1. Thus, although we have failed, we 1) Rate-based UE Pairing [40]: Consider a dense network are very close to successful decoding and the signal can be with N UEs and Nc time-frequency chunks where Nc < N, correctly decoded by a small boost in the retransmission round. i.e., when the number of resources are not enough to serve all Such a boost can be given by pairing UE1 with UE0 having a UEs in orthogonal resources. An optimal UE pairing algorithm better channel to the BS. On the other hand, pairing UE0 and needs to know all NcN channel coefficients and all N rate UE1 gives UE2 the chance to use a separate resource block demands of the UEs, making the whole system impractical to retransmit its own signal interference-free. Also, although as N and/or Nc increases. This is especially because a large the channel coefficients are constant, pairing different UEs in portion of this information is used only for UE pairing and successive rounds provides the BS with different SNR/SINR not for data transmission. (I: interference) powers, i.e., diversity, which improves the To limit the CSI requirement, in [40], we propose that the performance of HARQ protocols considerably. In this way, UE pairing is performed only based on the UEs rate demands the fairness between the UEs and the expected E2E packet and the probability of successful pairing. The proposed scheme transmission delay of the network are improved. is based on the following procedure: 3) Multiple Access Adaptation in Retransmissions [38]: • Step 1: The BS asks all UEs to send their rate demands. Our proposed scheme can be well explained in Fig. 7. In • Step 2: Receiving the UEs’ rate demands and without our proposed setup, each UE starts data transmission in its knowing the instantaneous CSI, the BS finds the prob- own dedicated bandwidth in an OMA-based fashion. Then, ability that two specific UEs can be successfully served if a UE’s message is not correctly decoded in a time slot, through NOMA-based data transmission (see [40] for the in the following retransmission rounds it is allowed to reuse detail procedure of finding these probabilities). the bandwidth of the other UE as well. Let us denote the • Step 3: If the probability of successful pairing for two resource block at time i and bandwidth wj by B(i, wj ). As specific UEs, i.e., the probability that the BS can correctly an example, consider time Slot 2 in Fig. 7 where the UE2’s decode their signals, exceeds some predefined threshold, message is not correctly decoded while the message of UE1 is the BS assigns resources for UL transmission and asks successfully decoded by the BS. Then, in Slot 3, UE2 uses both those UEs to send pilots sequentially. w1 and w2 to retransmit its message. On the other hand, UE1 • Step 4: Using the received pilots from those paired UEs, only uses w1 to send a new message. Using, e.g., repetition the BS estimates the channel qualities in that specific re- time diversity (RTD) HARQ, in B(3, w1) and B(3, w2), UE2 sources, decides if the UEs can be paired and determines sends the same signal as in B(2, w1) and the BS decodes the the appropriate power level of each UE such that their message based on all three copies of the signal. An example rate demands can be satisfied. method for message decoding is to first use SIC to decode • Step 5: The BS informs the paired UEs about the power the message of UE1 in B(3, w1), then remove this message levels to use and sends synchronization signals such that from the received signal in B(3, w1), and use MRC of the their transmit timings are synchronized. three copies of the UE2’s signal to decode its message. Note In this way, with our proposed scheme the CSI is acquired that, even if the message of UE1 is not correctly decoded in only if the BS estimates a high probability for successful Slot 3, the BS can still perform, e.g., MRC of the three (two UE pairing. This reduces the CSI overhead considerably, interference-free and one interference-affected) copies of the particularly in dense networks and/or in the cases with multiple UE2’s signal. antennas at the UEs. Finally, as we show in [40], to have 7

NACK for both UEs Backhaul link Point B: ACK for UE , • SIC-based receiver at one BS ACK or NACK without decoding the signal Success for UE Failed UE BS BS • Backhauling between the BSs • Remove-and-decode receiver at other BS UE signal Decode & UE signal Decode & Power of Decode Feedback Feedback UE remove remove (i) UE signal process process Power of UE UE signal UE signal UE signal UE signal

Power Power UE UE allocation Power of UE UE Γ() Γ() D Γ() D Time Frequency A B C Sent by UEs SIC-based decoding at BS Decoding at BS (ii) Sent by UEs Power of UE Power of UE

OMA-based NOMA-based OMA-based Power region region region allocation Power of UE BS buffers the undecoded signals of UEs Frequency Slot 1 Slot 2 Points A and C: • OMA-based data transmission

(iii) Power of UE Power of UE Figure 9. Adapting the decoding scheme based on the estimated successful Power Power

Power of UE1 Power of UE2 Power of UE3 allocation Power of UE decoding probability. Power Power

allocation Frequency Frequency

Figure 8. UE pairing in COMP-NOMA. If a cell-edge UE can be successfully Considering Fig. 9, if the signal of UE1 is correctly decoded, paired with a cell-center UE of one BS, it can be paired with each of the cell- center UEs of other BSs with SIC-based receiver only at one BS. the BS continues in the typical SIC-based receiver scheme to first remove the signal of UE1 and then decode the signal of UE2 interference-free. On the other hand, if the BS fails to the maximum number of successful paired UEs, the BS can decode the signal of UE1, it immediately sends NACKs for initially consider the pairs with the highest and lowest rate both UEs without decoding the UE2’s signal. This is to save demands. For instance, assume r1 ... r where r is the N i on the decoding delay and because there is low probability for rate demand of UE . Then, an appropriate≥ ≥ UE pairing approach i successful decoding of the second signal without removing the would be (r1, rN ), (r2, rN−1),.... first interfering signal. For instance, let us denote the decoding 2) UE Pairing in COMP-NOMA: The high-rate reliable delay for decoding a codeword of length L by Γ(L). Then, as backhaul links give the chance to simplify the UE pairing shown in Fig. 9, the proposed scheme reduces the E2E delay in coordinated multi-point (COMP) networks using NOMA. by Γ(L) in Slot 2. The BS buffers the undecoded signal of The idea can be well presented in Fig. 8. Here, depending UE2 for process in the next rounds of HARQ. Then, depending on the UEs positions and rate demands, they may be served on the decoding approach of the BS and its corresponding with different multiple access schemes. For instance, in Point decoding delay, in each time slot the UEs’ data transmission A (resp. C) of Fig. 8 where the channel g21 (resp. g22) is synchronized correspondingly. experiences a good quality, UEs transmit in an OMA-based In this way, the proposed setup reduces the implementation fashion and the BSs may use typical OMA-based receivers complexity considerably and improves the E2E throughput with no need for backhauling. In Point B, however, NOMA because the decoding scheme is adapted depending on the is used in a COMP-based fashion and the UEs may share estimated probability of successful decoding. Particularly, an spectrum. As an example, with UE2 being in Point B, BS1 interested reader may follow the same method as in [41] may use SIC-based receiver to first decode-and-remove the to study the E2E performance gain of the proposed scheme message of UE1 and then decode the message of UE2. Then, analytically. Finally, while we presented the setup for the cases using the backhaul link, BS2 is informed about the message of with two UEs, it can be shown that the relative performance UE2 (and, possibly, UE1) and, removing the interfering signal, gain of the proposed scheme increases with the number of it decodes the signal of UE3 interference-free. Finally, as paired UEs. demonstrated in the figure, depending on the rate requirements and channel conditions, different NOMA-based transmissions IV. CONCLUSIONS with partial spectrum sharing can be considered in Point B, which affect the data transmission, backhauling and message In this paper, we studied the challenges and advantages decoding schemes correspondingly. of NOMA as a candidate technology in dense networks. As The advantages of the proposed scheme are: 1) SIC-based we showed through simulations and in harmony with the receiver is used only in BS1. Also, 2) UE pairing algorithm discussions in the 3GPP Release 15 study-item on NOMA, can be run only in one of the BSs. That is, NOMA-based data NOMA may or may not outperform the typical OMA-based transmission is used as long as at least one of the BSs can schemes such as MU-MIMO, in terms of BLER. However, for find a good pair for UE2. Finally, 3) pairing (UE1,UE2), BS2 the current use-case scenarios of interest, the relative perfor- can consider each of its own cell-center UEs to be paired with mance gain of NOMA was not so much such that it could not them as long as the interference to BS1 is not high. That is, convince the 3GPP to continue with it as a work-item. On the BS2 does not need to run advanced UE pairing algorithms. other hand, the unique properties of NOMA give the chance to develop different techniques reducing its implementation C. Receiver Adaptation complexity, which may make it more suitable for practical Compared to OMA-based systems, the sequential decoding implementation. Therefore, there is a need to improve the process of the BS may lead to large E2E transmission delay, as spectral efficiency and the practicality of implementation, in well as high receiver complexity/energy consumption [4], [25], order to have NOMA adopted by the industry. [26]. Therefore, it is beneficial to use the sequential decoding REFERENCES only if there is high probability for successful decoding. This [1] 3GPP, TR 36.866V.12.0.1, Study on network-assisted interference can- is the motivation for the scheme proposed in the following. cellation and suppression (NAICS) for LTE, March 2014. 8

[2] 3GPP, RP-150496, MediaTek, New SI Proposal: Study on downlink [26] L. Liu, Y. Chi, C. Yuen, Y. L. Guan, and Y. Li, “Capacity-achieving multiuser superposition transmission for LTE, March 2015. MIMO-NOMA: Iterative LMMSE detection,” IEEE Trans. Signal Pro- [3] P. Xu, Z. Ding, X. Dai, H. V. Poor, NOMA: An information cess., vol. 67, no. 7, pp. 1758–1773, April 2019. theoretic perspective, arXiv:1504.07751, Available at: [27] 3GPP, TR 38.812, Study on Non-Orthogonal Multiple Access (NOMA) https://arxiv.org/abs/1504.07751. for NR, Dec. 2018. [4] F. Wei and W. Chen, “Low complexity iterative receiver design for sparse [28] 3GPP, R1-166056, Final Minutes report RAN185-v100, Available at: code multiple access,” IEEE Trans. Commun., vol. 65, no. 2, pp. 621– http://www.3gpp.org/ftp/tsg ran/WG1 RL1/TSGR1 86/Docs/. 634, Feb. 2017. [29] 3GPP, R1-1802767, “Signature design for NoMA,” Ericsson, RAN1-92, [5] Y. Yuan and C. Yan, “NOMA study in 3GPP for 5G,” in Proc. IEEE Athens, Feb. 2018. ISTC, Hong Kong, Dec. 2018, pp. 1–5. [30] D. Popescu and C. Rose, Interference Avoidance Methods for Wireless [6] S. Chen, B. Ren, Q. Gao, S. Kang, S. Sun, and K. Niu, “Pattern Systems. Springer Science & Business Media, 2006. division multiple access-a novel nonorthogonal multiple access for fifth- [31] R. Calderbank, A. Thompson, and Y. Xie, “On block coherence of generation radio networks,” IEEE Trans. Veh. Technol., vol. 66, no. 4, frames,” Applied and Computational Harmonic Analysis, vol. 38, no. 1, pp. 3185–3196, April 2017. pp. 50–71, 2015. Wavelets XI [7] 3GPP, R1-164688, Resource Spread Multiple Access, Qualcomm, May [32] J. A. Tropp, “Complex equiangular tight frames,” in , vol. 2016. 5914. International Society for Optics and Photonics, 2005, p. 591401. [33] 3GPP, R1-1806248, “MU-MIMO operation in NR with configured [8] Yuan, G. Yu, W. Li, Y. Yuan, X. Wang, and J. Xu, “Multi-user shared grant,” Ericsson, RAN1-93, Busan, May 2018. access for internet of things,” in Proc. IEEE VTC Spring’2016, Nanjing, [34] 3GPP, R1-1808981, “Link level evaluations for MU-MIMO,” Ericsson, China, May 2016, pp. 1–5. RAN1-94, Gothenburg, Aug. 2018. [9] 3GPP, R1-163992, Non-Orthogonal Multiple Access Candidate for NR, [35] 3GPP, TR-38.900 Study on channel model for frequency spectrum above Samsung, May 2016. 6 GHz, June 2017. [10] NOMA Design Principles and Performance, document IMT-2020, Eric- [36] B. Makki, M. Hashemi, A. Behravan, Smart HARQ in NOMA- sson, Beijing, China, Jun. 2017. based networks, Patent application, Ericsson, Application number: [11] X. Li, H. Chen, Y. Qian, B. Rong, and M. R. Soleymani, “Welch bound PCT/EP2018/062393, May, 2018. analysis on generic code division multiple access codes with interference [37] B. Makki, M. Hashemi, A. Behravan, Dynamic user pairing in NOMA- free windows,” IEEE Trans. Wireless Commun., vol. 8, no. 4, pp. 1603– HARQ networks, Patent application, Ericsson, Application number: 1607, April 2009. PCT/EP2018/062456, May, 2018. [12] C. Xu, Y. Hu, C. Liang, J. Ma, and L. Ping, “Massive MIMO, non- [38] B. Makki, M. Hashemi, A. Behravan, Hybrid automatic repeat request orthogonal multiple access and interleave division multiple access,” using an adaptive multiple access scheme, Patent application, Ericsson, IEEE Access, vol. 5, pp. 14 728–14 748, Aug. 2017. Application number: PCT/EP2018/053633, Feb., 2018. [13] F. Mokhtari, M. R. Mili, F. Eslami, F. Ashtiani, B. Makki, M. Mir- [39] B. Makki and T. Eriksson, “On the performance of MIMO-ARQ systems mohseni, M. Nasiri-Kenari, and T. Svensson, “Download elastic traffic with channel state information at the receiver,” IEEE Trans. Commun., rate optimization via NOMA protocols,” IEEE Trans. Veh. Technol., vol. 62, no. 5, pp. 1588–1603, May 2014. vol. 68, no. 1, pp. 713–727, Jan. 2019. [40] B. Makki, M. Hashemi, A. Behravan, CSI-constrained user pairing in [14] Y. Sun, D. W. K. Ng, Z. Ding, and R. Schober, “Optimal joint power and dense uplink NOMA networks, Patent application, Ericsson, Application subcarrier allocation for full-duplex multicarrier non-orthogonal multiple number: PCT/EP2018/053733, Feb., 2018. access systems,” IEEE Trans. Commun., vol. 65, no. 3, pp. 1077–1091, [41] B. Makki, T. Svensson, G. Caire, and M. Zorzi, “Fast HARQ over finite March 2017. blocklength codes: A technique for low-latency reliable communication,” [15] Z. Ding, R. Schober, and H. V. Poor, “A general MIMO framework for IEEE Trans. Wireless Commun., vol. 18, no. 1, pp. 194–209, Jan. 2019. NOMA downlink and uplink transmission based on signal alignment,” IEEE Trans. Wireless Commun., vol. 15, no. 6, pp. 4438–4454, June 2016. [16] Z. Yang, Z. Ding, P. Fan, and N. Al-Dhahir, “A general power allocation scheme to guarantee quality of service in downlink and uplink NOMA systems,” IEEE Trans. Wireless Commun., vol. 15, no. 11, pp. 7244– 7257, Nov. 2016. [17] N. Zhang, J. Wang, G. Kang, and Y. Liu, “Uplink nonorthogonal multiple access in 5G systems,” IEEE Commun. Lett., vol. 20, no. 3, pp. 458–461, March 2016. [18] H. Tabassum, E. Hossain, and J. Hossain, “Modeling and analysis of uplink non-orthogonal multiple access in large-scale cellular networks using poisson cluster processes,” IEEE Trans. Commun., vol. 65, no. 8, pp. 3555–3570, Aug. 2017. [19] J. Choi, “On HARQ-IR for downlink NOMA systems,” IEEE Trans. Commun., vol. 64, no. 8, pp. 3576–3584, Aug. 2016. [20] D. Cai, Z. Ding, P. Fan, and Z. Yang, “On the performance of NOMA with hybrid ARQ,” IEEE Trans. Veh. Technol., vol. 67, no. 10, pp. 10 033–10 038, Oct. 2018. [21] Z. Shi, S. Ma, H. ElSawy, G. Yang, and M. S. Alouini, “Cooperative HARQ-assisted NOMA scheme in large-scale D2D networks,” IEEE Trans. Commun., vol. 66, no. 9, pp. 4286–4302, Sept. 2018. [22] M. S. Ali, H. Tabassum, and E. Hossain, “Dynamic user clustering and power allocation for uplink and downlink non-orthogonal multiple access (NOMA) systems,” IEEE Access, vol. 4, pp. 6325–6343, Oct. 2016. [23] W. Liang, Z. Ding, Y. Li, and L. Song, “User pairing for downlink non- orthogonal multiple access networks using matching algorithm,” IEEE Trans. Commun., vol. 65, no. 12, pp. 5319–5332, Dec. 2017. [24] H. Zhang, D. Zhang, W. Meng, and C. Li, “User pairing algorithm with SIC in non-orthogonal multiple access system,” in Proc. IEEE ICC’2016, Kuala Lumpur, Malaysia, May 2016, pp. 1–6. [25] Y. Chi, L. Liu, G. Song, C. Yuen, Y. L. Guan, and Y. Li, “Practical MIMO-NOMA: Low complexity and capacity-approaching solution,” IEEE Trans. Wireless Commun., vol. 17, no. 9, pp. 6251–6264, Sept. 2018.