Hybrid Transceiver Design for Beamspace MIMO-NOMA in Code-Domain for MmWave Communication Using Lens Antenna Array

As a hybrid MIMO architecture, beamspace multiple input multiple output (MIMO) can significantly reduce the number of required radio frequency (RF) chains in millimeter wave (mmWave) massive MIMO systems without obvious performance loss, in which, however, the number of users supported cannot be larger than that of RF chains. To break this fundamental limit, we introduce the concept of code-domain non-orthogonal multiple access (NOMA) into beamspace MIMO. A beam selection scheme is proposed first to maximize the sum-rate by utilizing the quasi-orthogonality of the beamspace channel. Furthermore, a low-complexity detection algorithm is developed to realize the transceiver design in mmWave communications using lens antennas. Finally, numerical results of the decoding complexity at receiver side are analyzed. Simulation results show that the proposed beamspace MIMO-NOMA in code domain can achieve higher spectrum and energy efficiency compared with the existing beamspace MIMO.

To address the problems of unprecedented traffic volume and limited bandwidth, millimeter-wave (mmWave) communication (ranging from around 30 GHz to 300 GHz) becomes a promising technology for the future mobile communication systems [1]- [3]. Compared to the conventional wireless communications in the microwave bands, mmWave communications possess much more abundant available bandwidth. Furthermore, thanks to the short wavelength of mmWave radios, tens-to-hundreds of antenna elements can be integrated onto a small-size chip, enabling the application of massive MIMO to provide more multiplexing and beamforming gain [4]- [6]. Consequently, mmWave technology has drawn much attention from technical standard organizations, e.g., IEEE 802.11ad Task Group, IEEE 802.15 Task Group 3c, and Wireless Gigabit Alliance.
However, in practice, mmWave massive MIMO is difficult to implement due to the high complexity and energy consumption [7]- [9]. Particularly, because each antenna in MIMO systems usually requires one dedicated radio-frequency (RF) chain, the hardware cost and energy consumption caused by a large number of RF chains in mmWave massive MIMO systems become unaffordable [10]- [12].
As a promising solution for mmWave MIMO, the hybrid architecture combines the digital and analog architecture together to balance the hardware/power constraints and performance gain [1], [13], [14]. By the use of analog circuit, the number of simultaneous data streams between the transmitter (Tx) and receiver (Rx) is much less than that of the antennas, which largely decreases the number of RF chains required, thereby reducing the hardware cost and energy consumption significantly [15], [16].
The concept of beamspace MIMO using lens antenna array was proposed in the pioneering work [4], [5] to reduce the number of required RF chains in mmWave massive MIMO systems effectively. By deploying the lens antenna array, which plays a role in realizing spatial discrete Fourier transformation [25], beamspace MIMO can transform the conventional spatial channel into beamspace one so as to capture the channel sparsity at the mmWave frequencies [26]. Accordingly, the dominant beams are selected based on the sparse beamspace channel to reduce the number of required RF chains. Compared with phase-shifter network employing a large number of phase-shifters, power splitters/combiners and signal/control lines, the hardware cost and energy consumption incurred by lens antenna array is relatively low.
Nevertheless, a fundamental limit of beamspace MIMO is that, each RF chain can only serve one user at the same time-frequency resource, thus the maximum number of users served cannot exceed that of the RF chains [4], [7], [25]. To break this limit, in [27], a spectrum and energy efficient mmWave transmission scheme integrating the new concept of power domain non-orthogonal multiple access (NOMA) [28]- [30] with beamspace MIMO, i.e., beamspace MIMO-NOMA, was proposed.
In this paper, we focus on hybrid transceiver design for MIMO-NOMA deploying a lens antenna array. In contrast to power-domain NOMA, we consider the code domain here. Sparse code multiple access (SCMA) [31]-[33] and pattern division multiple access (PDMA) [34], [35] are two code-domain NOMA techniques proposed for 5G wireless networks to provide high spectrum efficiency by using multidimensional complex codebooks/constellations. In an SCMA system, the coded bits of different users are mapped to sparse multi-dimensional codewords and transmitted on the same orthogonal frequency division multiplexing (OFDM) subcarrier, providing massive connectivity with low multiuser detection complexity [36]-[38]. A potential performance gain can be achieved by integrating SCMA into beamspace MIMO, so that the number of served users can exceed that of RF chains. The benefits of SCMA can be generalized to PDMA and other code-domain NOMA systems. Specifically, the contributions of this paper are summarized as follows.
1) We propose a beam selection scheme to reduce the required number of RF chains. Specifically, by taking the intra-beam interference into consideration and utilizing the quasi-orthogonality of the beamspace channel, we select the beams through a low-complexity algorithm based on the sum-rate maximization criterion.
2) For beamspace MIMO-SCMA and MIMO-PDMA, the conventional magnitude maximization (MM) and interference-aware (IA) beam selection suffer from serious intra-beam interference. We therefore propose to perform the beam selection based on the factor graph and the codebook design, respectively. Simulation results show that the proposed beam selection algorithms achieve near-optimal sum-rate performance.
3) We propose a low-complexity iterative algorithm to realize SCMA/PDMA detection in practical mmWave communications using a lens antenna array.
As mentioned above, the inter-beam interference caused by multiple transmit antennas needs to be removed before processing the intra-beam interference. Because the complexity of maximum likelihood (ML) detection is unaffordable in mmWave systems due to the large antenna arrays, we propose to perform inter- and intra-beam interference cancellation in one step by using a joint factor graph and MPA. In addition, we put forward a threshold-based MPA, where a belief threshold is applied to control the algorithm process. Simulation results show that the proposed scheme achieves a much lower computational complexity with only a slight performance degradation when the threshold is set appropriately.
The rest of this paper is organized as follows. The system model of the proposed code-domain beamspace MIMO-NOMA is introduced in Section II. Section III and IV propose beam selection schemes for beamspace MIMO-PDMA and beamspace MIMO-SCMA systems respectively based on the achievable sum rate analysis. In Section V, a low-complexity joint MPA combined with threshold-based MPA is proposed. Numerical and simulation results are provided in Section VI. Finally, conclusions are drawn in Section VII.
Notation: We use bold letters to denote column vectors or matrices. For a matrix or vector, the operations $(\cdot)^T$, $(\cdot)^H$, $(\cdot)^{-1}$ and $\|\cdot\|$ denote transpose, conjugate transpose, matrix inversion and Euclidean norm, respectively. The shorthand form $\mathcal{CN}(m, \sigma^2)$ denotes the probability density function (PDF) of a complex Gaussian random variable (RV) with mean $m$ and variance $\sigma^2$. Similarly, $\mathrm{Ray}(\sigma^2)$ represents the PDF of a Rayleigh RV with mean $\sigma\sqrt{\pi/2}$. Furthermore, $\mathrm{Exp}(1/\lambda)$ represents the PDF of an exponential RV with mean $\lambda$. We use $E[x]$ to denote the mean of a random variable $x$, and $\mathrm{diag}(\mathbf{h})$ to denote the diagonal matrix whose $k$-th diagonal element corresponds to the $k$-th entry of vector $\mathbf{h}$. $\mathbb{C}$ represents the set of complex numbers.

II. SYSTEM MODEL
In this paper, we consider a typical single-cell downlink mmWave communication system, where the base station (BS) employs $N$ antennas and $N_{RF}$ RF chains. Following an introduction to the system model of existing beamspace MIMO, the proposed code-domain beamspace MIMO-NOMA is presented in detail.

A. Beamspace MIMO
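As shown in Fig. 1(a), for traditional MIMO in the spatial domain, the receive signal vector $\mathbf{y} \in \mathbb{C}^{K \times 1}$ for all $K$ users in the downlink can be presented as
$$\mathbf{y} = \mathbf{H}^H \mathbf{P} \mathbf{x} + \mathbf{n}, \tag{1}$$
where $\mathbf{x} \in \mathbb{C}^{K \times 1}$ is the transmit signal vector for all $K$ users with normalized power, $\mathbf{P} \in \mathbb{C}^{N \times K}$ represents the precoding matrix, $\mathbf{n}$ is the additive noise vector, and $\mathbf{H} = [\mathbf{h}_1, \mathbf{h}_2, \cdots, \mathbf{h}_K]$ is the channel matrix, with $\mathbf{h}_k \in \mathbb{C}^{N \times 1}$ the channel vector between the BS and the $k$th user. Applying the Saleh-Valenzuela channel model to mmWave communications [27], we have
$$\mathbf{h}_k = \beta_k^{(0)} \mathbf{a}\big(\psi_k^{(0)}\big) + \sum_{l=1}^{L} \beta_k^{(l)} \mathbf{a}\big(\psi_k^{(l)}\big), \tag{2}$$
where the first term is the line-of-sight (LoS) component of the $k$th user, with $\beta_k^{(0)}$ and $\psi_k^{(0)}$ its complex gain and spatial direction, respectively; the second term is the non-line-of-sight (NLoS) component of the $k$th user, $L$ is the total number of NLoS components, and $\mathbf{a}(\psi) \in \mathbb{C}^{N \times 1}$ is the array steering vector. For the typical uniform linear array (ULA) with $N$ antennas, we have
$$\mathbf{a}(\psi) = \frac{1}{\sqrt{N}} \left[ e^{-j 2 \pi \psi m} \right]_{m \in \Gamma(N)}, \tag{3}$$
where $\Gamma(N) = \{ q - (N-1)/2,\ q = 0, 1, \cdots, N-1 \}$ is a symmetric set of indices centered around zero. The spatial direction is defined as $\psi \triangleq \frac{d}{\lambda} \sin\theta$, where $\theta$ is the physical direction, $\lambda$ is the signal wavelength, and $d$ is the antenna spacing satisfying $d = \lambda/2$ at mmWave frequencies.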
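As a hedged illustration of the channel model just described, the following minimal sketch builds the ULA steering vector over the centered index set and sums one LoS path with a few NLoS paths. The function names, the path-gain statistics (a unit-variance LoS gain and weaker NLoS gains) and the uniform draw of spatial directions over [-0.5, 0.5] are assumptions made for illustration only; they are not taken from the paper.

```python
import cmath
import math
import random


def steering_vector(psi, n_antennas):
    """ULA steering vector a(psi) with indices q - (N - 1) / 2, q = 0..N-1."""
    indices = [q - (n_antennas - 1) / 2 for q in range(n_antennas)]
    scale = 1 / math.sqrt(n_antennas)
    return [scale * cmath.exp(-2j * math.pi * psi * m) for m in indices]


def sv_channel(n_antennas, n_nlos, rng):
    """One LoS path plus n_nlos NLoS paths (gain model is an assumption)."""

    def gain(sigma):
        # Circularly symmetric complex Gaussian gain with std sigma.
        return complex(rng.gauss(0, sigma), rng.gauss(0, sigma)) / math.sqrt(2)

    # With d = lambda / 2, psi = (d / lambda) * sin(theta) lies in [-0.5, 0.5].
    paths = [(gain(1.0), rng.uniform(-0.5, 0.5))]
    paths += [(gain(0.1), rng.uniform(-0.5, 0.5)) for _ in range(n_nlos)]

    h = [0j] * n_antennas
    for beta, psi in paths:
        a = steering_vector(psi, n_antennas)
        h = [h_i + beta * a_i for h_i, a_i in zip(h, a)]
    return h


rng = random.Random(0)
a = steering_vector(0.1, 8)
h = sv_channel(8, 2, rng)
norm_a = sum(abs(v) ** 2 for v in a)  # steering vectors have unit norm
```

The $1/\sqrt{N}$ scaling keeps every steering vector at unit norm, so the path gains alone set the channel power.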
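The conventional spatial domain channel in (2) can be transformed to the beamspace one by employing a carefully designed discrete lens array (DLA), as shown in Fig. 1(b). Specifically, such a DLA plays the role of an $N \times N$ spatial discrete Fourier transform matrix $\mathbf{U}$, which contains the array steering vectors of $N$ orthogonal directions covering the entire space as
$$\mathbf{U} = \left[ \mathbf{a}(\bar{\psi}_1), \mathbf{a}(\bar{\psi}_2), \cdots, \mathbf{a}(\bar{\psi}_N) \right]^H, \tag{4}$$
where $\bar{\psi}_n$ is the spatial direction of the $n$th beam, $n = 1, 2, \ldots, N$. Then we can obtain the receive signal vector in beamspace MIMO as
$$\tilde{\mathbf{y}} = \mathbf{H}^H \mathbf{U}^H \mathbf{P} \mathbf{x} + \mathbf{n} = \tilde{\mathbf{H}}^H \mathbf{P} \mathbf{x} + \mathbf{n}, \tag{5}$$
where $\tilde{\mathbf{H}} \in \mathbb{C}^{N \times K}$ is the beamspace channel defined as
$$\tilde{\mathbf{H}} = \left[ \tilde{\mathbf{h}}_1, \tilde{\mathbf{h}}_2, \cdots, \tilde{\mathbf{h}}_K \right] = \mathbf{U}\mathbf{H} = \left[ \mathbf{U}\mathbf{h}_1, \mathbf{U}\mathbf{h}_2, \cdots, \mathbf{U}\mathbf{h}_K \right]. \tag{6}$$
Here $\tilde{\mathbf{h}}_k$ is the beamspace channel of the $k$th user. In (6), the $N$ rows of $\tilde{\mathbf{H}}$ correspond to the $N$ orthogonal beams with spatial directions $\bar{\psi}_1, \bar{\psi}_2, \cdots, \bar{\psi}_N$, respectively. Note that the number of dominant scatterers in mmWave propagation environments is quite limited. Therefore, the number of NLoS components satisfies $L \ll N$, which gives the beamspace channel a sparse structure, i.e., the number of dominant elements of $\tilde{\mathbf{h}}_k$ is much smaller than $N$. As a result, selecting only a small number of appropriate beams to reduce the dimension of the MIMO system will not bring obvious performance loss. Consequently, we can approximate the receive signal vector as
$$\tilde{\mathbf{y}} \approx \tilde{\mathbf{H}}_r^H \mathbf{P}_r \mathbf{x} + \mathbf{n}, \tag{7}$$
where $\tilde{\mathbf{H}}_r = \tilde{\mathbf{H}}(s, :)_{s \in \mathcal{B}}$, $\mathcal{B}$ contains the indices of the selected beams, and $\mathbf{P}_r$ is the dimension-reduced digital precoding matrix. As the dimension of $\mathbf{P}_r$ is much smaller than that of the original digital precoding matrix $\mathbf{P}$ in (1), beamspace MIMO can significantly reduce the number of required RF chains, as shown in Fig. 1(b). Note that the number of RF chains cannot be smaller than the number of users if the spatial multiplexing gain is to be guaranteed. Without loss of generality, we consider the case $N_{RF} = K$ in this paper.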
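The sparsity argument above can be checked numerically with a small sketch. It applies a spatial DFT, built from steering vectors of $N$ orthogonal directions, to a single-path channel and measures how much energy the strongest beams capture. The beam grid $(n - (N+1)/2)/N$ for $n = 1, \ldots, N$ and all function names are assumptions chosen for illustration; the paper does not state the grid explicitly in this section.

```python
import cmath
import math


def steering_vector(psi, n_antennas):
    """ULA steering vector with indices centered around zero."""
    indices = [q - (n_antennas - 1) / 2 for q in range(n_antennas)]
    scale = 1 / math.sqrt(n_antennas)
    return [scale * cmath.exp(-2j * math.pi * psi * m) for m in indices]


def beam_direction(beam, n_antennas):
    """Assumed uniform beam grid, beam = 1..N."""
    return (beam - (n_antennas + 1) / 2) / n_antennas


def to_beamspace(h):
    """Apply U (rows a(psi_n)^H) to a spatial channel vector h."""
    n = len(h)
    out = []
    for beam in range(1, n + 1):
        a = steering_vector(beam_direction(beam, n), n)
        out.append(sum(a_i.conjugate() * h_i for a_i, h_i in zip(a, h)))
    return out


def dominant_beams(h_beam, count):
    """Zero-based indices of the `count` strongest beams."""
    order = sorted(range(len(h_beam)), key=lambda n: abs(h_beam[n]), reverse=True)
    return order[:count]


N = 16
# A path exactly on the grid direction of beam 3 occupies a single beam.
hb_on = to_beamspace(steering_vector(beam_direction(3, N), N))
# An off-grid path leaks into neighbouring beams but stays concentrated.
hb_off = to_beamspace(steering_vector(0.07, N))
energy = sorted((abs(v) ** 2 for v in hb_off), reverse=True)
top3_share = sum(energy[:3]) / sum(energy)
```

In this toy setting, three of the sixteen beams carry most of the energy of the off-grid path, which is the property that beam selection relies on.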

B. Proposed Beamspace MIMO-NOMA in Code Domain
In order to further improve spectrum efficiency and connectivity density, we propose to combine code domain NOMA (SCMA, PDMA) with beamspace MIMO. We consider SCMA in this section, which can be generalized to PDMA easily. Fig. 2 illustrates the precoding procedure of SCMA, in which M-ary symbols are mapped to sparse codewords according to the codebooks allocated. Due to the overloading property of SCMA, in the proposed beamspace MIMO-SCMA system, J users can be simultaneously served within I subcarriers and therefore the overloading factor λ = J/I > 1 [31].
The block diagram of the SCMA spreader for the $k$th beam, i.e., the $k$th selected antenna after beam selection, is shown in Fig. 3, $k = 1, 2, \ldots, K$. The spreader processes $J$ symbols in one transmission period, in which the $\log_2(M)$-bit symbol $s^k_j$ is mapped to one of the columns of the sparse codeword matrix $\mathbf{C}^k_j \in \mathbb{C}^{I \times M}$ to obtain the codeword of the $j$th layer, $\mathbf{x}^k_j \in \mathbb{C}^{I \times 1}$, $j = 1, 2, \ldots, J$. The codewords from all $J$ layers are then added together to obtain the transmit vector on the $k$th beam, $\mathbf{x}^k \in \mathbb{C}^{I \times 1}$. Each symbol is thus transmitted over $I$ resources, such as time slots or orthogonal frequency division multiplexing (OFDM) tones, without inter-carrier interference.
We use the indicator matrix $\mathbf{F}^k$ to describe the sparse structure of the codewords on the $k$th beam, i.e., $f^k_{j,i} = 1$ denotes that the $i$th chip of $\mathbf{x}^k_j$ is nonzero. Without loss of generality and for notational brevity, we assume that the indicator matrix $\mathbf{F}^k$ at each transmit antenna is regular, with the same column weight $d_c$ and the same row weight $d_r$.
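The spreading operation and the regular indicator matrix can be illustrated with a toy sketch for $I = 4$, $J = 6$ and $M = 4$. The indicator matrix below is a commonly used regular 4 x 6 example with column weight 2 and row weight 3, and the codebook (rotated QPSK points placed on the nonzero chips) is purely hypothetical; neither is the codebook design of the paper.

```python
import cmath
import math

I, J, M = 4, 6, 4

# Toy regular indicator matrix: F[i][j] = 1 if layer j occupies resource i.
F = [
    [1, 1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1, 0],
    [0, 1, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 1],
]


def toy_codebook(layer):
    """Hypothetical I x M codebook: rotated QPSK on the nonzero chips."""
    qpsk = [cmath.exp(1j * (math.pi / 4 + math.pi / 2 * m)) for m in range(M)]
    rotation = cmath.exp(1j * math.pi * layer / 12)
    return [
        [rotation * qpsk[m] if F[i][layer] else 0j for m in range(M)]
        for i in range(I)
    ]


def spread(symbols):
    """Map each layer's symbol to a codeword column and sum over layers."""
    x = [0j] * I
    for layer, symbol in enumerate(symbols):
        codebook = toy_codebook(layer)
        for i in range(I):
            x[i] += codebook[i][symbol]
    return x


symbols = [0, 1, 2, 3, 0, 1]      # one M-ary symbol per layer
x = spread(symbols)               # transmit vector over I resources
overloading = J / I               # more layers than resources
column_weights = [sum(F[i][j] for i in range(I)) for j in range(J)]
row_weights = [sum(row) for row in F]
```

Each resource carries the superposition of only $d_r = 3$ layers, which is the sparsity that message passing detection later exploits.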
We consider that all the $N$ transmit antennas share the same $I$ subcarriers for multiplexing. Each user is equipped with $I$ antennas. In the received signal of user $j$, $\mathbf{x}^k$ satisfies the power constraint $E\big[\|\mathbf{x}^k\|^2\big] = P$; $\mathbf{h}^k_j \in \mathbb{C}^{I \times 1}$ indicates the channel fading vector between the $k$th beam and the $j$th user, the entries of which are assumed to be independently and identically distributed (i.i.d.) complex Gaussian random variables with zero mean and unit variance; $\mathbf{n}^{n_r}_j \sim \mathcal{CN}(\mathbf{0}, \mathbf{I})$ is the additive white Gaussian noise (AWGN) vector over the $n_r$th antenna of user $j$.

III. BEAM SELECTION FOR PDMA
For traditional beam selection, including the MM [4] and IA [10] schemes, $K$ users select $K$ beams from $N$ candidates. When considering PDMA, the non-orthogonality of the code introduces an overloading factor greater than one. As a result, $J$ users select $K$ beams among $N$ candidates, $J > K$, which can be performed in two phases: 1) use the MM algorithm to select $J$ beams ($J > K$) from the $N$ candidates based on the beamspace channels of the $J$ users; 2) choose $K$ beams from the $J$ beams selected in the first phase.
Here we define MM selection as a criterion. Because only a small amount of elements of the beamspace channel matrix have dominant values contributing a lot to the receive signal, MM selection takes advantage of the properties of the sparse nature of the beamspace channel.
First, to apply the MM selection, we define the index set of the selected beams for the $j$-th user as
$$\mathcal{M}^{(j)} = \left\{ n : |\tilde{h}_{n,j}|^2 \ge \xi^{(j)} \max_{n} |\tilde{h}_{n,j}|^2 \right\},$$
where $\tilde{h}_{n,j}$ is the $n$-th element of the beamspace channel $\tilde{\mathbf{h}}_j$, $j = 1, 2, \ldots, K$, and $\xi^{(j)} \in [0, 1]$ is the threshold used to define $\mathcal{M}^{(j)}$. Then we can have the beam index set for all the users as
$$\mathcal{M} = \bigcup_{j} \mathcal{M}^{(j)},$$
which is used by the transmitter to identify the dominant beams to be selected for transmission. After obtaining the beam indices, we derive the new channel matrix containing only a subset of beams from the original channel matrix as $\big[\tilde{\mathbf{H}}(n, :)\big]_{n \in \mathcal{M}}$, the size of which depends on the number of dominant beams $n_d = |\mathcal{M}|$. In fact, MM selection often leads to multiple selections of the same beam for different users, and the number of required RF chains is not fixed across channel realizations. Second, for beamspace MIMO-PDMA, we further select $K$ beams from the $J$ candidates since there are only $K$ RF chains ($K < J$), which yields the selected index set $\mathcal{P}$. By setting the threshold $\psi^{(j)} \in [0, 1]$ carefully, beam selection continues until $|\mathcal{P}| \le K$. However, the conventional MM may suffer from serious inter-user interference on the same beam if it is selected by different users. To deal with this situation, an IA beam selection scheme is proposed in [10] to select $J$ beams in total. However, when the system supports only $K$ beams due to the limited number of RF chains, the IA scheme cannot be used in a beamspace MIMO-PDMA system directly.
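The first phase can be sketched as follows. The sketch assumes the usual form of the MM criterion, in which a beam is kept for a user when its power is at least a fraction $\xi$ of that user's strongest beam; this rule, the function name `mm_select` and the toy magnitudes are assumptions for illustration rather than the exact procedure of the paper.

```python
def mm_select(beamspace_columns, xi):
    """Per-user dominant beam sets and their union under a relative threshold."""
    per_user = []
    union = set()
    for h in beamspace_columns:
        power = [abs(v) ** 2 for v in h]
        peak = max(power)
        chosen = {n for n, p in enumerate(power) if p >= xi * peak}
        per_user.append(chosen)
        union |= chosen
    return per_user, sorted(union)


# Toy beamspace channel magnitudes: one list per user, one entry per beam.
columns = [
    [0.90, 0.10, 0.05, 0.30],  # user 0, strongest on beam 0
    [0.20, 0.80, 0.10, 0.05],  # user 1, strongest on beam 1
    [0.85, 0.05, 0.40, 0.10],  # user 2, also strongest on beam 0
]

sets_tight, beams_tight = mm_select(columns, xi=0.5)
sets_loose, beams_loose = mm_select(columns, xi=0.2)
```

With the tighter threshold, users 0 and 2 both select beam 0, so three users map to only two distinct beams. This is the repeated-selection effect noted above, and it also shows why the number of beams (and RF chains) returned by MM varies with the threshold and the channel realization.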
To solve the problems above, we propose a beam selection scheme in this section based on the properties of PDMA. In the precoding process, the signals for different users are stacked. PDMA can separate the signals of different users through the message passing algorithm (MPA) at the receiver. By designing the pattern for each user, inter-user interference in beamspace MIMO-SCMA is introduced deliberately to achieve better performance.
As shown in Fig. 4(a), the pattern of each beam consists of $I$ rows and $J$ columns, where grey and blue boxes represent 1 and 0, respectively. Each pattern $\mathbf{P}^{(n)}$, $n = 1, 2, \ldots, N$, has a one-to-one mapping to the indicator matrix $\mathbf{F}^{(n)}$. Fig. 4(b) shows the intra-beam and inter-beam interference in beamspace MIMO-PDMA systems. $\mathbf{P}$ is the interference stacking matrix obtained by adding the $K$ selected pattern matrices. The optimal approach is an exhaustive search that selects $K$ beams among the $N$ pattern matrices. However, its complexity of $\binom{N}{K}$ makes it unacceptable in practical scenarios.
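To give a feeling for why the exhaustive search is impractical, the short sketch below counts the number of candidate beam subsets. The value $N = 256$ matches the simulation setup of this paper, while $K = 16$ is only an illustrative choice.

```python
import math


def exhaustive_candidates(n_beams, k_selected):
    """Number of ways to choose k_selected beams out of n_beams."""
    return math.comb(n_beams, k_selected)


small = exhaustive_candidates(4, 2)      # tiny case, easy to enumerate by hand
large = exhaustive_candidates(256, 16)   # illustrative massive MIMO case
digits = len(str(large))                 # order of magnitude of the search
```

Even for this moderate $K$, the count has more than twenty decimal digits, which motivates a greedy, low-complexity alternative.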
We propose a low-complexity factor graph allocation (FGA) algorithm. The basic idea is to select the factor graph that differs most with existing interference stacking matrix P to make the variance of the elements in final P as small as possible. By canceling dominant interference, FGA can obtain much better capacity and BER performance.
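The basic idea of picking the pattern that keeps the stacking matrix as flat as possible can be sketched as one greedy step. The helper names and the tiny 2 x 2 patterns are hypothetical, and the sketch covers only the variance criterion stated above, not the complete FGA procedure.

```python
def stack_with(stack, pattern):
    """Element-wise sum of the current stacking matrix and a candidate pattern."""
    return [[s + p for s, p in zip(row_s, row_p)] for row_s, row_p in zip(stack, pattern)]


def element_variance(matrix):
    """Population variance of all matrix elements."""
    values = [v for row in matrix for v in row]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def pick_next_pattern(stack, candidates):
    """Greedy step: choose the candidate that minimizes the stacked variance."""
    return min(
        candidates,
        key=lambda name: element_variance(stack_with(stack, candidates[name])),
    )


stack = [[1, 0], [0, 1]]            # patterns selected so far
candidates = {
    "same": [[1, 0], [0, 1]],       # overlaps completely with the stack
    "complement": [[0, 1], [1, 0]], # fills exactly the empty positions
}
choice = pick_next_pattern(stack, candidates)
```

Here the complementary pattern is chosen because it yields an all-ones stacking matrix with zero variance, whereas the overlapping pattern would pile two layers onto the same positions.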
The algorithm consists of two stages, which will be explained as below.

A. Stage 1: Identify NIBs
We define beam $k$ as a non-interference beam (NIB) if its indicator matrix (pattern) has no conflict with the existing interference stacking matrix $\mathbf{P}$. In a typical mmWave communication scenario, $I$ and $J$ can be much larger than 4 and 6, respectively; thus there can be a few NIBs that keep the elements of the interference stacking matrix $\mathbf{P}$ equal to 0 or 1.
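A minimal sketch of the NIB test follows, under the reading that a pattern has no conflict when adding it leaves every element of the stacking matrix at 0 or 1. The function names and the 2 x 3 toy patterns are assumptions for illustration.

```python
def is_nib(stack, pattern):
    """True if adding the pattern keeps every stacked element at 0 or 1."""
    return all(
        s + p <= 1
        for row_s, row_p in zip(stack, pattern)
        for s, p in zip(row_s, row_p)
    )


def add_pattern(stack, pattern):
    """Return the stacking matrix after including one more pattern."""
    return [[s + p for s, p in zip(row_s, row_p)] for row_s, row_p in zip(stack, pattern)]


def identify_nibs(stack, patterns):
    """Indices of candidate beams whose patterns do not conflict with the stack."""
    return [k for k, pattern in enumerate(patterns) if is_nib(stack, pattern)]


stack = [[1, 0, 0], [0, 1, 0]]
patterns = [
    [[0, 1, 0], [0, 0, 1]],  # no overlap with the stack
    [[1, 0, 0], [0, 0, 1]],  # overlaps at position (0, 0)
    [[0, 0, 1], [1, 0, 0]],  # no overlap with the stack
]

nibs = identify_nibs(stack, patterns)
stack_after = add_pattern(stack, patterns[0])
nibs_after = identify_nibs(stack_after, patterns)
```

The set of NIBs shrinks as patterns are accumulated, so the test has to be repeated against the updated stacking matrix after each selection.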

B. Stage 2: Search the Best Beam Among NIBs
After selecting all NIBs, we conduct interference-aware (IA) beam selection [10] among the NIBs. In the first phase of stage 2, we identify interference users (IUs) and non-interference users (NIUs) and adopt the best beams for the NIUs. The task in the second phase of stage 2 is to search for another $\mathrm{Card}(\Gamma_{IU}) = K - \mathrm{Card}(\Gamma_{NIU})$ beams from the remaining $\mathrm{Card}(\Gamma_{NIB}) - \mathrm{Card}(\Gamma_{NIU})$ beams according to the criterion of maximizing the achievable sum rate.
We start by sorting the beams of the beamspace channel, let b * j ∈ {1, 2, · · · , N} denote the strongest beam index of jth user. We define beam index :) and A =H(s, :) s∈{b * j |j∈ΓNIU} . We summarize the procedure of the proposed solution in Algorithm 1.
IV. BEAM SELECTION FOR SCMA
Compared with PDMA, SCMA focuses on the codeword design of each user to achieve better bit error rate (BER) performance. After the NIB identification and best unselected beam selection procedures, we propose a codebook selection (CS) algorithm to allocate codebooks to users dynamically, aiming at minimizing the channel correlation. A low channel correlation between beams is beneficial for inter-beam interference cancellation.
We commonly use union bound to estimate the average symbol error probability (ASEP) so as to evaluate the error performance, assuming the symbols are transmitted with equal probability. In general, ASEP is dominated by the nearest neighbor of symbols, which results in a tight upper bound. However, it is difficult to find the nearest neighbor in multiuser scenarios. To tackle this, we take into account all possible constellation points that contribute to the ASEP. The ASEP for the j-th user with joint ML detection is upper bounded by where is the pair-wise error probability (PEP).
Here τ j is the j-th dimension-wise distance for the downlink BC channel. It can be seen from (15), PEP is in direct proportion with τ j . Inspired by this, we propose a low-complexity codebook selection (CS) algorithm. The basic idea is selecting codebook that has the least squares norm on each beam. By maximizing the minimum codeword distance among different beams, CS will obtain much better capacity and BER performance.
The algorithm consists of two stages: 1) identify all the NIBs by considering the potential inter-beam interference, where layers with the same pattern are eliminated; 2) choose the best unselected beam with dynamic codebook allocation. Different from the PDMA scenario, the values of the codewords must be taken into consideration in SCMA. We summarize the procedure in Algorithm 2.

V. PROPOSED LOW-COMPLEXITY DETECTION SCHEME
In this section, conventional maximum likelihood (ML) detection is described, based on which, we propose the joint MPA.

A. ML Detection
Aiming at detecting $\mathbf{x}$ mapped from $\mathbf{s} \in \mathcal{M}^{(K \cdot J) \times 1}$, the ML detector outputs the estimate $\hat{\mathbf{s}}$, where $\mathbf{y}_j$ and $\mathbf{H}_j$ represent the receive signal and channel matrix of user $j$, respectively. While ML is optimal for MIMO-SCMA, its high detection complexity restricts its application in practical systems. Taking the MIMO-SCMA example shown in Fig. 3, the number of candidates to be searched (the ergodic number) increases exponentially with not only the number of beams $K$ but also the number of users $J$.
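The exponential growth can be made concrete with a one-line count. Since $\mathbf{s}$ has $K \cdot J$ entries, each taking one of $M$ values, a brute-force search visits $M^{KJ}$ hypotheses; the parameter values below are illustrative only.

```python
def ml_candidates(m_ary, n_beams, n_layers):
    """Size of the joint hypothesis space searched by brute-force ML."""
    return m_ary ** (n_beams * n_layers)


single_beam = ml_candidates(4, 1, 6)   # M = 4, K = 1, J = 6
two_beams = ml_candidates(4, 2, 6)     # adding one beam squares the count
growth = two_beams // single_beam
```

Doubling the number of beams from one to two multiplies the search space by $4^6 = 4096$, which is why a detector that exploits sparsity is needed.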

B. Joint MPA
Here we propose a low-complexity joint MPA for downlink beamspace MIMO-SCMA. Because the resource nodes are intermediate variables at the detector, we ignore them and build a new mapping function between layer nodes and receiver nodes, as shown in Fig. 5. Thanks to the sparsity of both the codebook and the equivalent channel matrix, the proposed joint MPA detector reduces the complexity exponentially. For clarity, two parameters, $M$ and $K$, will be varied later to observe the complexity ratio of ML (conventional MPA) over joint MPA.
The proposed joint MPA detection can be decomposed into two steps. First, a virtual new codebook for the different layers is built at the receiver. Because different transmit antennas share the same subcarriers, each receiver antenna captures a combination of the spreading sequences from $KJ$ layers. Also, the new mapping function transfers the influence of the channel to the codebook. According to Fig. 5, we can obtain Equation (17). Let $\mathbf{F}^k$ be the original indicator matrix of the $k$-th selected beam. The new indicator matrix has $K' = K \cdot I$ rows and $J' = K \cdot J$ columns, with row index $k' = 1, 2, \ldots, K'$ and column index $j' = 1, 2, \ldots, J'$. Let $\mathbf{C}^k_j \in \mathbb{C}^{I \times M}$ be the original codebook for layer $j$ of selected beam $k$, from which the virtual codebook $\mathbf{C}_{j'}$ for layer $j' = J \cdot (k - 1) + j$ in the new factor graph is built. Second, following the joint factor graph and virtual codebook, the joint MPA is performed, i.e., the total $KJ$ original symbols are decoded together. The normalized probabilities delivered from the receive node and the layer node are denoted as $r^i_{k' \to j'}(s)$ and $q^i_{j' \to k'}(s)$, respectively, and $\hat{s}_{j'}$ is the final estimate of the layer node $j'$.
Assuming no a priori probability is available, the initial probabilities are set to be equal as After obtaining the initial probability, the computation is performed as a traditional MPA detector, which exchanges extrinsic information between layer nodes and receive nodes. The i-th iteration is For the k-th receive node, M k (s) is The position sets ofF are defined as ζ j = k F Here I T represents the number of iterations. As shown above, the joint MPA detector explores and utilizes the sparsity of codebook and equivalent MIMO channel matrix, the computation falls in a more rational region and fixed. The ergodic number is an exponential function about represents the row weight of joint indicator matrixF. Actually the complexity in terms of exact number of complex and real operations can be derived in subsection C.

C. Complexity Analysis
The complexity is investigated in terms of the number of real floating point operations (flops). One complex addition and one complex multiplication involve 2 and 6 flops, respectively. The total numbers of calculations for one block in ML detection and in joint MPA are given in (25) and (26). Here $d^k_r$ and $d^k_c$ represent the row weight and column weight of the $k$th indicator matrix after beam selection.

D. Threshold-Based MPA Detection
Original MPA focuses on the maximum number of iterations and updates the message of function nodes (FNs) and variable nodes (VNs) in spite of the information of codewords, which involves many unnecessary calculations. In joint MPA, a belief threshold T h is set to choose credible codewords promptly. The presented scheme computes the reliability of every codeword in each iteration and judges whether there is a user that has a reliable codeword.
The detailed procedure of the proposed scheme is summarized in Algorithm 3.
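The early-termination idea can be sketched independently of the message updates. The sketch below takes a precomputed trace of per-layer beliefs (a stub standing in for the MPA iterations) and stops as soon as every layer has a reliable codeword. The reliability test used here, the largest belief being at least $T_h$ times the second largest, is an assumption chosen for illustration and is not claimed to be the judgment rule of Algorithm 3.

```python
def reliable_symbol(beliefs, threshold):
    """Index of the dominant codeword if it beats the runner-up by `threshold`."""
    order = sorted(range(len(beliefs)), key=lambda m: beliefs[m], reverse=True)
    best, second = order[0], order[1]
    if beliefs[best] >= threshold * beliefs[second]:
        return best
    return None


def detect_with_threshold(belief_trace, threshold, max_iter):
    """Return (iterations used, decisions); belief_trace[t][j] is a stub input."""
    trace = belief_trace[:max_iter]
    for t, beliefs_per_layer in enumerate(trace, start=1):
        decided = [reliable_symbol(b, threshold) for b in beliefs_per_layer]
        if all(d is not None for d in decided):
            return t, decided          # every layer is reliable: stop early
    # Fall back to a hard decision after the last available iteration.
    final = [max(range(len(b)), key=lambda m: b[m]) for b in trace[-1]]
    return len(trace), final


# Stub beliefs for two layers and M = 4 over three iterations.
trace = [
    [[0.40, 0.30, 0.20, 0.10], [0.50, 0.20, 0.20, 0.10]],
    [[0.70, 0.10, 0.10, 0.10], [0.80, 0.10, 0.05, 0.05]],
    [[0.94, 0.02, 0.02, 0.02], [0.97, 0.01, 0.01, 0.01]],
]

iters_low, decisions_low = detect_with_threshold(trace, threshold=2, max_iter=3)
iters_high, decisions_high = detect_with_threshold(trace, threshold=10, max_iter=3)
```

A smaller threshold ends the loop earlier at the risk of committing to a belief that has not yet converged, which is the complexity versus BER trade-off examined in the results section.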

VI. NUMERICAL AND SIMULATION RESULTS
We evaluate the performance of the proposed code-domain beamspace MIMO-NOMA system in this section. Specifically, a typical downlink mmWave massive MIMO system is considered, where the BS is equipped with a ULA with $N = 256$ antennas and $N_{RF}$ RF chains ($N_{RF} \le N$). In the proposed schemes, $K$ beams are selected ($K \le N_{RF}$) and $J$ users are served simultaneously by $I$ resources such as time slots or OFDM tones. As defined above, $\lambda = J/I$ ($\lambda > 1$) is the overloading factor based on the SCMA/PDMA property. $M$ represents the dimension of the codebook. Specifically, we choose $M = 4$ in this paper, as shown in Fig. 2. Fig. 6 and Fig. 7 show the sum-rate performance comparison against the signal-to-noise ratio (SNR) between the proposed beam selection schemes (FGA and CS) and the conventional schemes (MM and IA) for 16 and 32 users, respectively. Three typical mmWave massive MIMO schemes are considered here for comparison: (1) "Fully digital MIMO", where each antenna is connected to one RF chain, i.e., $N_{RF} = N$; (2) "Beamspace MIMO", where each beam serves only one user with $N_{RF} = K$; (3) the proposed beamspace MIMO-NOMA, which integrates code-domain NOMA and beamspace MIMO. From Fig. 6 and Fig. 7, we can observe that the proposed beam selection schemes achieve better sum-rate performance than MM and IA beam selection. FGA beam selection exhibits around 1.5 dB and 2.5 dB SNR gain for the 16-user and 32-user scenarios, respectively, in the high-SNR region. Since the CS algorithm builds on the FGA algorithm, the beam selection scheme designed for beamspace MIMO-SCMA outperforms FGA, which does not consider codebook design. As can be seen from Fig. 6 and Fig. 7, the proposed CS algorithm provides an obvious performance gain over FGA for both the 16-user and 32-user scenarios. Specifically, the gap for 32 users is larger than that for 16 users.

A. Beam Selection for MIMO-PDMA and MIMO-SCMA
The energy efficiency of beamspace MIMO systems can be defined as $\zeta = \frac{R}{\rho + N_{RF} P_{RF}}$ (bps/Hz/W). Here $\rho$ and $P_{RF}$ are the transmit power and the energy consumption of each RF chain, respectively. For beamspace MIMO-SCMA, the number of RF chains is $N_{RF} = J/\lambda$, while for the fully digital system $N_{RF} = N$. The transmitter energy efficiency of beamspace MIMO-SCMA is therefore much higher than that of the fully digital system.
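As a hedged numerical illustration of this definition, the sketch below compares the two architectures. All numbers (sum rates, transmit power, per-chain power, $J$ and $\lambda$) are hypothetical placeholders chosen only to show how the RF-chain term dominates the denominator; they are not simulation results from this paper.

```python
def energy_efficiency(sum_rate, tx_power, n_rf, p_rf):
    """zeta = R / (rho + N_RF * P_RF), in bps/Hz/W."""
    return sum_rate / (tx_power + n_rf * p_rf)


N = 256                      # BS antennas
J, overloading = 24, 1.5     # hypothetical user count and overloading factor
n_rf_beamspace = int(J / overloading)   # N_RF = J / lambda
n_rf_digital = N                        # one RF chain per antenna

rho = 0.03                   # hypothetical transmit power (W)
p_rf = 0.3                   # hypothetical power per RF chain (W)

ee_beamspace = energy_efficiency(40.0, rho, n_rf_beamspace, p_rf)
ee_digital = energy_efficiency(60.0, rho, n_rf_digital, p_rf)
```

Even when the fully digital system is credited with a higher hypothetical sum rate, its 256 RF chains make its energy efficiency roughly an order of magnitude lower in this toy setting.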

B. Low-Complexity Threshold-Based Joint MPA Detection
As shown in (25) and (26), we define the complexity ratio $C_{JMPA}/C_{ML}$ to describe the degree of reduction. Fig. 8 and Fig. 9 depict how the complexity ratio changes with $M$ and $K$, respectively; it decreases as both $M$ and $K$ increase, implying that joint MPA is applicable to high-order modulation and multi-antenna systems. Fig. 10 presents the BER performance curves for ML, joint MPA and the proposed threshold-based detection. As expected, the joint MPA detector shows almost the same performance as the ML detector over the whole SNR region in mmWave channels. For the threshold-based scheme, when the threshold is small, the information updating between FNs and VNs finishes as soon as the threshold is reached, so the complexity is reduced significantly at the expense of some BER degradation. As the threshold increases, the BER performance improves, and when $T_h = 10$ the proposed threshold-based scheme achieves near-optimal (joint MPA) BER. Fig. 11 depicts the relationship between the complexity ratio and the threshold $T_h$. For joint MPA, the iteration number $I_T$ is constant and equal to 5. For the proposed threshold-based scheme, when $T_h = 2$, the complexity ratio is a little higher than 20%, which means that most detections exit the loop after the first iteration. When $T_h = 10$, the complexity is about 88% of that of joint MPA with very close BER performance, which means each detection runs 4.4 iterations on average.
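The iteration figures quoted above follow from simple arithmetic if the per-iteration cost is taken to be constant, so that the complexity ratio relative to joint MPA equals the average iteration count divided by $I_T$. That proportionality is an assumption of this sketch.

```python
def average_iterations(complexity_ratio, max_iterations):
    """Average iterations implied by a complexity ratio, assuming equal cost per iteration."""
    return complexity_ratio * max_iterations


I_T = 5
avg_high_threshold = average_iterations(0.88, I_T)  # T_h = 10
avg_low_threshold = average_iterations(0.20, I_T)   # T_h = 2, lower bound
```

A ratio of 88% corresponds to 4.4 iterations on average, and a ratio just above 20% corresponds to slightly more than one iteration, consistent with most detections stopping after the first pass.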

VII. CONCLUSION
In this paper, we propose a new beamspace MIMO-NOMA in the code domain that combines PDMA and SCMA with beamspace MIMO, so as to break the fundamental limit of existing beamspace MIMO that the number of users cannot exceed that of RF chains. We put forward beam selection schemes to improve the spectrum efficiency of both beamspace MIMO-PDMA and beamspace MIMO-SCMA systems. Based on these schemes, a low-complexity threshold-based joint MPA is proposed. The simulation results exhibit an obvious complexity reduction while keeping the BER performance acceptable.