Generalized Polarization-Space Modulation

A novel generalized polarization-space modulation (GPSM) scheme is proposed for polarized multiple-input multiple-output (MIMO) systems with a limited number of radio frequency (RF) chains. In the spatial domain, multiple dual-polarized (DP) transmit antennas are activated, and the combinations of their indices are used to convey information. In the polarization domain, depending on the random input bits, only one polarization state is selected for each active DP transmit antenna to transmit information following the rule of polarization shift keying. At the receiver, the maximum likelihood detector is employed as a benchmark to detect the information bits used to select the polarization states and the activated DP antennas. Imperfect channel state information (CSI) is taken into account in the detector. Two detectors with lower computational complexity, i.e., a linear detector and a sphere decoding (SD) detector, are proposed to relieve the computational burden. At the cost of some average bit error probability (ABEP) performance, the proposed linear detector reduces the computational complexity significantly. The proposed SD detector achieves the optimum ABEP performance while lowering the computational complexity by reducing the search space. A closed-form union upper bound (UUB) on the ABEP of the GPSM system with imperfect CSI at the receiver is analytically derived and validated through simulations. From the UUB, a looser asymptotic bound on the ABEP, which sheds light on the diversity gain and the coding gain, is derived. Numerical results show that the signal-to-noise ratio loss caused by increasing the number of transmit antennas is less than 3 dB while the spectral efficiency is increased by 7 b/s/Hz. Therefore, GPSM can be a promising candidate for downlink massive MIMO systems to achieve a high spectral efficiency with a limited number of RF chains.

By allowing multiple RF chains at the transmitter, the conventional single-RF MIMO system has been generalized into MIMO systems with multiple but a limited number of RF chains, such as generalized SM (GSM) [24], [25], generalized SSK (GSSK) [26], and generalized QSM [27]. In these systems, the degrees of freedom (DoF) in the spatial domain are further exploited to increase the spectral efficiency.
Moreover, an additional DoF available in the polarization domain is promising for increasing the spectral efficiency and reducing the required antenna spacing [28]-[32]. As a polarized single-RF MIMO scheme, a dual-polarized (DP)-SM system, where only one of the horizontal and vertical polarization states is chosen to transmit a signal, was proposed in [33] and analyzed in [34]-[36]. For example, if the transmitted bit is 0, the vertical polarization state is selected; otherwise, the horizontal one is selected. Therefore, the polarization DoF offers only 1 b/s/Hz of multiplexing gain in the DP-SM system. To exploit the available polarization DoF more efficiently, a polarization shift keying (PolarSK) system with an optimized signal constellation diagram was proposed in [37] for DP single-RF single-input multiple-output (SIMO) schemes. In the PolarSK system, more polarization states are employed to increase the spectral efficiency. However, PolarSK systems use only one DP transmit antenna, so that the DoF available in the spatial domain is not well exploited. Thus, exploiting the available resources in both the spatial and polarization domains and considering a MIMO scheme that supports multiple RF chains, we propose a novel generalized polarization-space modulation (GPSM) system in this paper. In the proposed GPSM, polarized signals are generated by active DP transmit antennas and then amplified by RF chains to exploit the polarization-domain DoF. The active DP antennas are selected to obtain the multiplexing gain achievable through the spatial-domain DoF.
0090-6778 © 2019 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
Due to a flexible structure of GPSM, GSM [24], DP-SM [33] and PolarSK [37] can be recognized as special configurations of the proposed GPSM. Beyond proposing the novel GPSM system, the following challenging issues are addressed in this paper. To the authors' knowledge, there is no existing work addressing these issues in the published literature.
1) At the receiver, the optimal maximum likelihood (ML) detector is designed as a benchmark to analyze the performance limit of the GPSM system. Most state-of-the-art works assume that the receiver has perfect channel state information (CSI). However, in a practical system, perfect CSI may not be available due to the limited length of pilot signals. Therefore, we design the ML detector considering the channel estimation errors. Recently, imperfect CSI has been taken into account in SM systems [38], [39].
2) In the GPSM system, the computational complexity of the optimum detector increases dramatically with the number of transmit antennas. Thus, it is necessary to develop a computationally efficient but highly accurate detector for the proposed GPSM system with multiple active DP antennas and multiple polarization states. In this paper, a linear detector and a sphere decoding (SD) detector are also proposed, taking into account imperfect CSI. In the proposed linear detector, the searches over the polarization and spatial domains are separated. Moreover, when detecting the symbol in the polarization domain, the search process for symbol detection is removed when the polarization constellation in [37] is used. Therefore, the linear detector is able to reduce the computational complexity significantly with a minor sacrifice in bit error performance. In particular, the reduced search space in the SD detector is designed to keep all possible optimum candidates for the symbol detection, so that the SD detector can achieve the same bit error performance as the ML detector.
3) For the ML detector, closed-form union upper bounds (UUBs) on the average bit error probability (ABEP) have been previously derived for DP-SM [33], [34], [36] and PolarSK [37], respectively. However, the error probability caused by a joint wrong detection of activated DP transmit antennas and polarization states has not been investigated in existing works. Therefore, it is not straightforward to use state-of-the-art UUBs to analyze the performance of GPSM. In this paper, an upper bound on the ABEP of the GPSM with imperfect CSI is analytically derived and verified to be tight through link-level Monte Carlo simulations. The coding gain and the diversity gain are derived to evaluate the impact of the joint use of spatial-domain and polarization-domain DoFs to convey information. From the simulations and the analytical performance analysis, it is verified that the GPSM system achieves the same diversity gain as the DP-SM and PolarSK systems. Moreover, the analytic results show that the diversity gain of the GPSM system with imperfect CSI depends on the model of the channel estimation error.
4) The performance of the GPSM is also analyzed numerically. The analysis shows that, when GPSM can correctly detect the information mapped onto the polarization constellation, a smaller constellation size can be used at the same spectral efficiency. Thus, a lower ABEP than that of DP-SM can be achieved. Moreover, the extension from PolarSK to GPSM sheds light on the massive MIMO scheme, which is important for emerging 5G applications. Due to the use of additional transmit antennas, the GPSM results in a slight SNR gap in comparison with the PolarSK. However, the spectral efficiency can be improved by as much as 7 b/s/Hz over the PolarSK in the scenarios considered.
The organization of this paper is as follows. Section II introduces the system model, including the mapping and the ML detection process. In Section III, a linear detector and an SD detector are proposed to reduce the computational complexity, and their computational complexity is analyzed in Section IV. In Section V, an analytical expression for an upper bound on the ABEP of the GPSM system over fading channels is derived. Section VI illustrates numerical results. Section VII concludes this paper.
The notations in this paper are summarized as follows. i) $I_N$ denotes an $N \times N$ identity matrix. ii) $A^H$ denotes the Hermitian transpose of a matrix $A$. iii) $\det(A)$ and $A^\dagger$ denote the determinant and the adjugate matrix of a matrix $A$, respectively. iv) $\|a\|$ denotes the two-norm of a vector $a$. v) $E(A)$ denotes the expected value of the random variable $A$. vi) $S\{a \mid b\}$ denotes the set of all $a$ satisfying condition $b$. vii) $\mathrm{Re}(a)$ and $\mathrm{Im}(a)$ respectively denote the real and imaginary parts of a complex number $a$, with its corresponding angle denoted by $\angle(a)$; $a^*$ denotes the conjugate of the complex number $a$. viii) $\lfloor\cdot\rfloor$, $\lceil\cdot\rceil$, and $\lfloor\cdot\rceil$ denote the floor, ceiling, and round operations for a real number.

A. Transmitter
The proposed GPSM system for a generic N T × N R DP-MIMO scheme is illustrated in Fig. 1, where N T denotes the number of transmit DP antennas, N RF denotes the number of RF chains, (M, K) determines the size of the polarized signal constellation, and (q V,nRF , q H,nRF , k nRF ) is the parameter set defining the polarized signal amplified by the n RF -th RF chain.
In the polarization domain, polarized states are generated following [37, Eq. (1)]: for $q_V, q_H = 1, 2, \ldots, M$, and $k = 1, 2, \ldots, K$. $p_V$ and $p_H$ are the signals transmitted by the vertically and horizontally polarized transmit antennas, respectively. In this paper, the PolarSK signal constellations for $K = 1$ and $K = 2$, i.e., the $C_1$ and $C_2$ constellations in [37, Fig. 2], are employed as typical examples. In wireless communications, binary bits are transmitted. Therefore, each selection of the polarization state conveys $\log_2(M^2K)$ bits of information in the polarization domain. Moreover, the $C_2$ constellation diagram is optimized following [37, Algorithm 2]. In the spatial domain, an array of DP transmit antennas is used to convey information following the rule of GSSK [26] with $N_{RF}$ RF chains. In polarization-switchable antennas [41], [42], one oscillator with a single RF chain is used to excite one DP antenna and generate electromagnetic waves with a polarization state selected by diodes. Therefore, we assume that $N_{RF}$ polarized signals are generated by $N_{RF}$ DP transmit antennas to exploit the polarization-domain DoF, and that the combination of $N_{RF}$ active DP transmit antennas is selected to simultaneously obtain the multiplexing gain of the spatial-domain DoF, following the rule of the GSSK system [26]. In a transmit antenna array with $N_T$ DP antennas, where the modulated PolarSK signal $s_{n_T}$ is transmitted over the $n_T$-th DP antenna. $2N_{RF}$ columns of $s$ are generated according to (1). The elements in the other $2N_T - 2N_{RF}$ columns of $s$ are set to 0. $s$ is normalized such that $\|s\|^2 = 1$. Signals with linear, circular, or elliptic polarization states, determined by $q_V$, $q_H$, and $k$, are generated by each activated DP transmit antenna according to (1). For each transmission, $N_{RF}$ sets of $(k, q_V, q_H)$ are selected to convey $N_{RF}\log_2(M^2K)$ bits of information, since $N_{RF}$ polarized signals are transmitted by $N_{RF}$ DP antennas.
The parameters of the transmitted polarized signals are denoted by $k = [k_1, k_2, \ldots, k_{N_{RF}}]$, $q_V = [q_{V,1}, q_{V,2}, \ldots, q_{V,N_{RF}}]$, and $q_H = [q_{H,1}, q_{H,2}, \ldots, q_{H,N_{RF}}]$, where $k_{n_{RF}}$, $q_{V,n_{RF}}$, and $q_{H,n_{RF}}$ denote the selection of $k$, $q_V$, and $q_H$ in the $n_{RF}$-th activated transmit antenna, respectively. Since $E[\|s\|^2] = 1$ and $E[\|x_{n_{RF}}\|^2] = \frac{1}{N_{RF}}$, the PolarSK signal transmitted by the $n_{RF}$-th activated transmit antenna is given by and a vector $x_{k,q_V,q_H} = [x_1, \ldots, x_{n_{RF}}, \ldots, x_{N_{RF}}]^T$ denotes the decimation of the nonzero elements in $s$. For each transmission, the selection of the polarization state conveys $N_{RF}\log_2(M^2K)$ bits, and the selection of the activated transmit DP antennas conveys $L = \left\lfloor \log_2 \binom{N_T}{N_{RF}} \right\rfloor$
bits. Thus, the spectral efficiency of the GPSM system is given by $R = L + N_{RF}\log_2(M^2K)$ b/s/Hz. It is worth noting that GSM [24], DP-SM [33], and PolarSK [37] are special configurations of the proposed GPSM. If $k_{n_{RF}} = 0$, only a vertically linearly polarized signal is transmitted and the GPSM system becomes the GSM system, whose data rate is $L + N_{RF}\log_2 M$ b/s/Hz. If $N_{RF} = 1$ and $k \in \{0, \frac{\pi}{2}\}$, the GPSM system becomes the DP-SM system. When $k = 0$, the transmitted polarization state becomes $x_{k,q_V,q_H} = \left[\exp\left(-j2\pi\frac{q_V - 1}{M}\right), 0\right]^H$, so that $q_H$ is not available to convey information. Likewise, when $k = \pi/2$, $q_V$ is not available. Therefore, the spectral efficiency of the DP-SM system becomes $\log_2(2MN_T)$ b/s/Hz. In contrast, if $N_T = 1$, the DoF in the spatial domain is lost and the GPSM system becomes the PolarSK system, whose spectral efficiency is $\log_2(M^2K)$ b/s/Hz. We observe that, for the same set of $K$, $M$, and $N_T$, the data rate of GPSM is significantly higher than those of the state-of-the-art GSM, DP-SM, and PolarSK. When $N_T = N_{RF}$, the GPSM specializes to a spatial multiplexing (SMX) system with the PolarSK signal constellation. In comparison with the SMX system, the proposed GPSM system may require a higher SNR to achieve the same ABEP at the same spectral efficiency. However, the SMX system has to be equipped with $N_T$ RF chains at the transmitter. Therefore, the proposed GPSM is promising for applications that require a high data rate at a limited cost of RF chains. Example 1: In a GPSM system with $M = 4$, $N_T = 5$, $N_{RF} = 2$, and $K = 2$, we have the data rate $R = 13$ b/s/Hz. We assume that data bits are to be transmitted. The first 3 bits, [1 1 0], are conveyed by selecting $l = 7$, which activates the second and the fifth transmit DP antennas to transmit signals according to [26, (1).
Similarly, for $s_5$ transmitted by the fifth DP antenna, we can obtain s 5 = 1 √ in this transmission, $n_T = [3, 5]$, $q_V = [2, 3]$, $q_H = [3, 2]$, and $k = [1, 2]$, and the transmitted signal is given by denotes the channel matrix from the $n_T$-th DP antenna at the transmitter to the $n_R$-th DP antenna at the receiver, and $X$ denotes the expected power ratio of the co-polarized channel to the cross-polarized channel [50]. In this paper, for the sake of simplicity, we assign a constant value to $X$, as in [51]. For different $n_T$, $H_{n_T}$ is assumed to be independently distributed, and we consider the narrowband fading channel model, so that $h_{VV,n_R,n_T}$, $h_{VH,n_R,n_T}$, $h_{HV,n_R,n_T}$, and $h_{HH,n_R,n_T}$ are independent complex normally distributed random variables with unit variance.
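The bit budget of Example 1 can be checked numerically. The following is a minimal sketch, assuming that $M^2K$ is a power of two and that $L$ is the floor of $\log_2$ of the number of antenna combinations. The function names are illustrative, and the lexicographic ordering of antenna combinations in `antenna_combination` is an assumption made here, not the mapping rule of [26].

```python
from itertools import combinations
from math import comb


def floor_log2(n: int) -> int:
    """floor(log2(n)) for a positive integer, computed without floating point."""
    return n.bit_length() - 1


def gpsm_bits(n_t: int, n_rf: int, m: int, k: int):
    """Return (spatial bits L, polarization bits, total rate R) per channel use."""
    states = m * m * k                       # M^2 K polarization states per antenna
    if states & (states - 1):
        raise ValueError("M^2 K is assumed to be a power of two")
    spatial = floor_log2(comb(n_t, n_rf))    # L = floor(log2 C(N_T, N_RF))
    polarization = n_rf * floor_log2(states)
    return spatial, polarization, spatial + polarization


def antenna_combination(l: int, n_t: int, n_rf: int):
    """Hypothetical mapping: the l-th (1-based) combination in lexicographic order."""
    return list(combinations(range(1, n_t + 1), n_rf))[l - 1]


# Example 1: M = 4, N_T = 5, N_RF = 2, K = 2
example = gpsm_bits(n_t=5, n_rf=2, m=4, k=2)
spatial_index = int("110", 2) + 1            # spatial bits [1 1 0] give l = 7
active = antenna_combination(spatial_index, n_t=5, n_rf=2)
```

Setting `n_t=1, n_rf=1` removes the spatial bits and leaves only the $\log_2(M^2K)$ polarization bits, which corresponds to the PolarSK special case described above.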

B. Optimum Receiver
At the receiver, both the index of the combination of activated transmit antennas and the transmitted polarization state need to be determined. As a benchmark, the ML detector [43] is employed to detect the GPSM signal, i.e., [38, Eq. (2)] In a practical GPSM system, the receiver may not be able to obtain perfect CSI. In this subsection, we design the ML detector with imperfect CSI. We assume that the transmitter transmits mutually orthogonal pilot sequences for channel estimation before data transmission, and that the CSI is estimated at the receiver. The true channel matrix is given by the sum of the estimated channel matrix $\hat{H}$ and the error matrix $E$ at the receiver, i.e., $H = \hat{H} + E$, (12) where $\sigma_e^2$ denotes the variance of the CSI estimation error. Each element of $E$ is complex Gaussian distributed.
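The estimation-error model can be illustrated with a short simulation. This is only a sketch under stated assumptions: the error matrix is drawn independently of the true channel, the cross-polarization ratio $X$ is ignored, and all entries are circularly symmetric complex Gaussian. The helper names are hypothetical.

```python
import random


def complex_gaussian(variance: float, rng: random.Random) -> complex:
    """Zero-mean circularly symmetric complex Gaussian sample with the given variance."""
    s = (variance / 2.0) ** 0.5              # standard deviation per real dimension
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def draw_channel_and_estimate(n_r: int, n_t: int, sigma_e2: float, rng: random.Random):
    """Draw a true channel H, an error E, and the estimate H_hat = H - E."""
    rows, cols = 2 * n_r, 2 * n_t            # two polarizations per DP antenna
    H = [[complex_gaussian(1.0, rng) for _ in range(cols)] for _ in range(rows)]
    E = [[complex_gaussian(sigma_e2, rng) for _ in range(cols)] for _ in range(rows)]
    H_hat = [[H[i][j] - E[i][j] for j in range(cols)] for i in range(rows)]
    return H, H_hat, E


rng = random.Random(0)
H, H_hat, E = draw_channel_and_estimate(n_r=2, n_t=4, sigma_e2=0.01, rng=rng)
```

With `sigma_e2=0.0` the estimate coincides with the true channel, which recovers the perfect-CSI case.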
Substituting (12) into (7), we obtain Therefore, assuming that σ 2 e is known at the receiver, the ML detector of GPSM system with imperfect CSI is given by

A. Linear Detector
The computational complexity of a linear detector is significantly lower than that of the ML detector since the search process is avoided [45], [46]. However, it is difficult to derive a linear detector from the ML detector without loss of optimality. Nevertheless, if we detect $l$ using the estimated $s$ regardless of $(k, q_V, q_H)$, which is selected from a finite set, a linear detector can be obtained with a very low computational complexity, as verified in Theorem 1.
Theorem 1: For $N_R \ge N_{RF}$, the linear detector of the index of the combination of activated DP antennas, $l$, is given by Given the detected $\tilde{l}$-th combination of active DP transmit antennas, the receiver knows $\tilde{n}_T$. Then, for the $\tilde{n}_T$-th active DP transmit antenna, the optimum estimate of the transmit signal is given by so that the optimum estimates of the parameters of the PolarSK constellation symbol, $q_{V,\tilde{n}_{RF}}$, $q_{H,\tilde{n}_{RF}}$, and $k_{\tilde{n}_{RF}}$, are, respectively, given by and where
Algorithm 1: Pseudo-Code for the Sub-Optimum Linear Detector. Input: $\hat{H}$, $y$. Output: $\tilde{l}$, $\tilde{k}$,
The pseudo-code for the proposed linear detector is given in Algorithm 1, where d l , d l,0 , k0 , l 0 , k 0 , d k , and d k,0 are local variables.
Proof: See Appendix A. If K > 2, the distance between PolarSK constellation points in different circles of latitude in the Poincaré sphere is so small that the ABEP performance of the PolarSK is degraded significantly. Therefore, the value of K is usually assigned to be equal to 1 or 2.
1) $K = 1$: In this case, the detection of $k_{\tilde{n}_{RF}}$ is not needed. Therefore, lines 15-22 in Algorithm 1 are skipped.

B. Sphere Decoding Detector
Linear detectors usually suffer from performance loss. In contrast, SD detectors keep all possible optimum candidates for the symbol detection, and thus they can achieve the same ABEP performance as the optimum ML detector [47]-[49]. In this subsection, an SD detector is designed for the proposed GPSM system.
As introduced in the system model section of this paper, $A_l$ denotes the $l$-th combination of activated DP transmit antennas. For any given $l$, by substituting (12) into (7), we have where $\hat{H}_{A_l}$ and $E_{A_l}$ are the channel estimate of $H_{A_l}$ and its estimation error, respectively. Now using QR decomposition, as in [52], we obtain where The triangular matrix $R_{A_l}$ is denoted by Each $R_{p_1,p_2}$ is a $2 \times 2$ matrix. When $p_1 = p_2$, $R_{p_1,p_2}$ becomes an upper triangular matrix.
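The QR step can be illustrated with classical Gram-Schmidt orthogonalization for complex vectors. The sketch below is a generic textbook version for a tall matrix with full column rank, stored as a list of rows; it is not the fixed-point hardware implementation of [55], and the function name is illustrative.

```python
import math


def qr_gram_schmidt(A):
    """Classical Gram-Schmidt QR of an m x n complex matrix (m >= n, full column rank).

    Returns Q (m x n, orthonormal columns) and R (n x n, upper triangular)
    such that A = Q R.
    """
    m, n = len(A), len(A[0])
    Q = [[0j] * n for _ in range(m)]
    R = [[0j] * n for _ in range(n)]
    for j in range(n):
        v = [A[i][j] for i in range(m)]      # start from the j-th column of A
        for p in range(j):
            # projection coefficient q_p^H a_j
            R[p][j] = sum(Q[i][p].conjugate() * A[i][j] for i in range(m))
            v = [v[i] - R[p][j] * Q[i][p] for i in range(m)]
        norm = math.sqrt(sum(abs(t) ** 2 for t in v))
        R[j][j] = complex(norm)
        for i in range(m):
            Q[i][j] = v[i] / norm
    return Q, R


A = [[1 + 1j, 2 + 0j], [0.5j, 1 - 1j], [2 + 0j, 0.3 + 0.2j]]
Q, R = qr_gram_schmidt(A)
```

Multiplying the received vector by the conjugate transpose of `Q` turns the detection metric into one that depends on the upper triangular `R` only, which is what allows a layer-by-layer search.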

By defining $\tilde{y}$
we obtain $\tilde{y}$ Therefore, for a given $l$, the optimum detection problem of $(k, q_V, q_H)$ is given by Given a squared Euclidean distance $d$, rewrite (31) as follows which includes the optimum solution. By substituting (27) into (32), we obtain an alternative expression for $S_1$ as (33), shown at the bottom of this page. Defining $S_p$ for $p = 1, 2, \ldots, N_{RF}$ as (34), shown at the bottom of this page, we can obtain the following Lemma 1. Lemma 1: Proof: Equation (35) holds as long as $S_p \subseteq S_{p+1}$ for any $p = 1, 2, \ldots, N_{RF} - 1$.
For any [k,q V ,q H ] ∈ S p , the following inequality holds according to S p defined in (34).
Then, we can obtain Therefore, [k,q V ,q H ] ∈ S p+1 for any [k,q V ,q H ] ∈ S p , and the lemma has been proved. Based on (32), the search set for the SD detector can be reduced sequentially using the following observations. 1) If only one solution is in S 1 , then this is the optimum solution. 2) If the optimum solution is in S p , it has to be in S p+1 .
Based on this sequentially reduced search set, the SD detector of the proposed GPSM system is described as follows. Before searching for the optimum solution, we set $d = +\infty$ to guarantee that the optimum solution is in the set $S_1$. During the search, $d$ is updated sequentially and the search space is reduced accordingly. First, we find all possible candidates in $S_{N_{RF}}$, and then update $d$ accordingly for each candidate. Second, we find all candidates in $S_{p-1}$ given the set of possible candidates in $S_p$, and then update $d$ accordingly. Third, we obtain the only candidate in $S_1$. This candidate is the optimum solution for a specific $l$. Finally, the index $l$ that leads to the minimum Euclidean distance between the optimum solution and the received signal is determined. The proposed SD detector is summarized in Algorithm 2, where $d_0$ and $l_0$ are local variables.
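The shrinking-radius search described above can be sketched generically. The code below is a simplified depth-first sphere decoder over an upper triangular system with scalar entries and a common finite alphabet per layer. It is not Algorithm 2: the $2 \times 2$ block structure, the PolarSK constellation, and the outer loop over $l$ are omitted, and all names are illustrative. A brute-force search is included only as a reference.

```python
import itertools
import math


def sphere_decode(R, y, alphabet):
    """Minimize ||y - R x||^2 over x in alphabet^n for an upper triangular R."""
    n = len(R)
    best = {"d": math.inf, "x": None}        # d starts at +infinity

    def search(level, x, partial):
        if level < 0:                        # a full candidate inside the sphere
            best["d"], best["x"] = partial, list(x)
            return
        for s in alphabet:
            x[level] = s
            # row `level` only involves the layers level, ..., n-1
            r = y[level] - sum(R[level][j] * x[j] for j in range(level, n))
            d = partial + abs(r) ** 2
            if d < best["d"]:                # prune branches outside the sphere
                search(level - 1, x, d)

    search(n - 1, [0j] * n, 0.0)             # start from the last layer
    return best["x"], best["d"]


def brute_force(R, y, alphabet):
    """Exhaustive reference search over all alphabet^n candidates."""
    n = len(R)

    def cost(x):
        return sum(abs(y[i] - sum(R[i][j] * x[j] for j in range(i, n))) ** 2
                   for i in range(n))

    x = min(itertools.product(alphabet, repeat=n), key=cost)
    return list(x), cost(x)


alphabet = [1 + 0j, -1 + 0j, 1j, -1j]
R = [[2 + 0j, 0.5 + 0.5j, -0.3 + 0j], [0j, 1.5 + 0j, 0.7j], [0j, 0j, 1 + 0j]]
x_true = [1j, -1 + 0j, 1 + 0j]
noise = [0.05 - 0.02j, -0.03j, 0.04 + 0j]
y = [sum(R[i][j] * x_true[j] for j in range(3)) + noise[i] for i in range(3)]
x_sd, d_sd = sphere_decode(R, y, alphabet)
```

Because a branch is discarded only when its partial distance already reaches the best full distance found so far, the search returns the same minimizer as the exhaustive reference while visiting fewer candidates.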
Since every possible optimum solution is evaluated in Algorithm 2, the ABEP performance of the SD detector is the same as that of the ML detector. Nevertheless, since less

IV. COMPLEXITY EVALUATION
In this section, we provide the computational complexity of the proposed detectors. In general, floating-point operations (flops) are machine dependent. To compare the complexity of the detectors appropriately, we assume that each real addition, multiplication, division, flooring, ceiling, rounding, or comparison costs one flop. Moreover, we assume that each computation of a basic function, including $\angle(\cdot)$, $\exp(\cdot)$, the square root, and trigonometric functions, costs $N_f$ flops. The key operations used in all detectors are summarized in TABLE I, where $q$ denotes an integer, $a$ denotes a complex number, a denotes a column vector with $2N_{RF}$ elements, b denotes a column vector with $2N_R$ elements, A denotes a $2N_{RF} \times 2N_{RF}$ matrix, and B denotes a $2N_R \times 2N_{RF}$ matrix. TABLE I is obtained as follows.
A complex addition and a complex multiplication cost 2 and 6 flops, respectively. Computing the product of a real number and a complex number costs 2 flops. Computing $a^* a$ for a complex number $a$ costs 3 flops.
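These counts follow from expanding each complex operation into real additions and multiplications. A minimal sketch that tallies them is given below, assuming that a real subtraction costs the same as a real addition; the class and function names are illustrative, and complex numbers are represented as `(re, im)` tuples.

```python
class FlopCounter:
    """Counts real additions, subtractions, and multiplications, one flop each."""

    def __init__(self):
        self.flops = 0

    def add(self, a, b):
        self.flops += 1
        return a + b

    def sub(self, a, b):
        self.flops += 1
        return a - b

    def mul(self, a, b):
        self.flops += 1
        return a * b


def complex_add(fc, a, b):
    # two real additions
    return (fc.add(a[0], b[0]), fc.add(a[1], b[1]))


def complex_mul(fc, a, b):
    # four real multiplications, one subtraction, one addition
    re = fc.sub(fc.mul(a[0], b[0]), fc.mul(a[1], b[1]))
    im = fc.add(fc.mul(a[0], b[1]), fc.mul(a[1], b[0]))
    return (re, im)


def real_times_complex(fc, r, a):
    # two real multiplications
    return (fc.mul(r, a[0]), fc.mul(r, a[1]))


def squared_modulus(fc, a):
    # a* a = re^2 + im^2: two multiplications and one addition
    return fc.add(fc.mul(a[0], a[0]), fc.mul(a[1], a[1]))


def flops_of(op, *args):
    """Run one operation with a fresh counter and return its flop count."""
    fc = FlopCounter()
    op(fc, *args)
    return fc.flops
```

For instance, `flops_of(complex_mul, (1.0, 2.0), (3.0, 4.0))` tallies the six real operations of one complex multiplication.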
The computation of a and A costs $32N_RN_{RF} - 4N_{RF}$ flops and the adjugate matrix of A. Because $A$ is a Hermitian matrix, $A^\dagger$ is a Hermitian matrix too. Therefore, we only need to compute the elements in the upper triangular region of $A^\dagger$. Given a matrix $A$, the computation of $\det(A)$ and $A^\dagger$ costs To compute the QR decomposition, Gram-Schmidt orthogonalization [53] and the Householder transformation [54] are commonly employed. In this paper, we use Gram-Schmidt orthogonalization for complex vectors for the sake of simple hardware implementation [55]. For a $2N_R \times 2N_{RF}$ matrix, the computation of its QR decomposition costs , $\cos k \exp\left(2j\pi\frac{q-1}{M}\right)$ does not change during transmission for a given signal constellation diagram and given $k$ and $q$. Therefore, $\cos k \exp\left(2j\pi\frac{q-1}{M}\right)$ is cached as a constant. According to this table and these assumptions, the computational complexities of the ML detector and the proposed detectors are summarized as follows:

A. ML Detector
We observe that in the ML detector defined by (11), operation 1 in TABLE I is computed $2^{L + N_{RF}\log_2(M^2K)}$ times. Each computation is accompanied by a comparison. Therefore, the computational complexity of the ML detector is

B. Linear Detector
In Algorithm 1, operation 2 in TABLE I is computed $2^L$ times and costs In line 4, a comparison of two real numbers is executed $2^L$ times and costs $2^L$ flops. In line 9, operation 3 is computed once and costs −10N RF (6N f +14) flops, respectively. For $K = 2$, following (23), $N_{RF}$ extra flops are required to compare $\eta_V$ and $\eta_H$ to detect $k$. When $K > 2$, lines 12-17 cost $N_{RF}(KN_f + 3K)$ flops. Therefore, for the $K = 1$, $K = 2$, and $K > 2$ signal constellations, the flops consumed by the linear detector are given by (41), shown at the bottom of this page, where For massive MIMO configurations, $2^L$ is large, and the flops consumed by the linear receiver in the large-$N_T$ regime are asymptotically equal to $T_A$.

C. SD Detector
The computational complexity of the proposed SD detector is random because the size of the search space is random. Thus, in Section VI, we count the average number of flops consumed by the proposed SD detector through Monte Carlo simulations.

A. Analytic ABEP
In general, the exact ABEP has no closed-form expression. Therefore, the UUB technique [44, Eq. (3)] is employed to derive a tight analytic upper bound on the ABEP of the GPSM system. Taking into account imperfect CSI, the UUB on the ABEP is provided in Theorem 2.
Proof: See Appendix B.

B. Diversity Gain Analysis
For an error probability $P(\rho)$ and two constants $O$ and $C$, if lim Therefore, we obtain an asymptotic upper bound on the ABEP for perfect CSI as: where the coding gain, $C_{GPSM}$, is computed by (50), provided at the bottom of the next page. Although the asymptotic upper bound is looser than the UUB, it sheds light on the diversity gain of the considered GPSM system in the high SNR regime. From (49), the GPSM has the same diversity gain as the DP-SM and PolarSK, i.e., $2N_R$. Since the DP-SM and the proposed GPSM have the same diversity gain, their coding gains are compared in the next section. As shown in the numerical results section, when the GPSM can correctly detect the information that is mapped onto polarized signals, a smaller constellation size can be used to achieve a higher coding gain than the DP-SM at the same spectral efficiency. For imperfect CSI, the diversity gain depends on the model of the channel estimation error $\sigma_e$. In previous works, $\sigma_e^2$ is modeled either as a constant or as $\frac{1}{N_p\rho}$, where $N_p$ is the number of pilot signals [38]-[40]. However, to generalize the channel estimation error model, we assume that Note that the previous models of $\sigma_e^2$ can be considered as two special cases of (51). The asymptotic ABEP with imperfect CSI is derived as follows.
Substituting (51) into (47) and (49), we obtain the asymptotic ABEP of the proposed GPSM system with imperfect CSI as a piecewise expression in $\beta$, with separate cases for $\beta = 0$, $0 < \beta < 1$, and $\beta \ge 1$. Thus, with imperfect CSI, we can summarize the asymptotic diversity gain of the GPSM, as a function of $\beta$, as follows:
$$\text{diversity gain} = \begin{cases} 0, & \beta = 0, \\ 2\beta N_R, & 0 < \beta < 1, \\ 2N_R, & \beta \ge 1. \end{cases} \quad (53)$$
Note that (53) indicates that the diversity gain with imperfect CSI depends on the value of $\beta$. When $\beta = 0$, the diversity order is zero, and the ABEP is a constant in the high SNR regime. When $0 < \beta < 1$, the diversity order is $2\beta N_R$. When $\beta \ge 1$, the diversity order is $2N_R$. In particular, when $\beta > 1$, the asymptotic ABEP of the GPSM system is independent of the quality of the CSI. That is, imperfect CSI results in the same performance as perfect CSI in the high SNR regime.
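The three regimes of the diversity order stated above can be collected in one small helper. This is a minimal sketch that restates the text; the function name is illustrative, and negative values of $\beta$ are rejected as an assumption about the model's domain.

```python
def gpsm_diversity_order(beta: float, n_r: int) -> float:
    """Asymptotic diversity order of GPSM with imperfect CSI as a function of beta.

    beta = 0      -> 0          (error floor at high SNR)
    0 < beta < 1  -> 2*beta*N_R
    beta >= 1     -> 2*N_R      (same as perfect CSI)
    """
    if beta < 0:
        raise ValueError("beta is assumed to be non-negative")
    return 2 * n_r * min(beta, 1.0)
```

For example, with `n_r=2` the order grows linearly from 0 to 4 as `beta` goes from 0 to 1 and then saturates.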

VI. NUMERICAL RESULTS
Numerical results and insights on the proposed GPSM system are presented in this section. Unless otherwise specified, the parameters are listed in TABLE II.

A. Complexity Analysis
To shed light on massive MIMO, we analyze the computational complexity of the detectors as a function of $N_T$ in Fig. 2. Numerical results show that the computational complexities of all detectors are $O(N_T^{N_{RF}})$. Nevertheless, the flops consumption of the ML detector is dramatically higher than those of the proposed linear and SD detectors, even though the ML detector and the SD detector achieve the same ABEP performance. Interestingly, for $N_T = 2$ and $N_{RF} = 2$, the inversion operation in lines 3 and 9 of Algorithm 1 requires more flops than the QR decomposition in line 3 of Algorithm 2. Therefore, the linear detector costs more flops than the SD detector in this situation. In the large-$N_T$ regime, the flops consumed by computing the matrix inversion and detecting $(q_{V,\tilde{n}_{RF}}, q_{H,\tilde{n}_{RF}}, k_{\tilde{n}_{RF}})$ in the linear detector are shown to be negligible. In the large-$N_T$ regime, when $N_{RF} = 1$, the average flops consumed by the linear detector are only 17% and 8% of those of the ML detector for the $C_1$ and $C_2$ signal constellations, respectively. The average flops consumed by the SD detector for the $C_1$ and $C_2$ signal constellations are 41% and 26% of those of the ML detector. In contrast, for $N_{RF} = 2$, the average flops consumed by the SD detector are only 5% and 2% of those of the ML detector for the $C_1$ and $C_2$ signal constellations, respectively. Moreover, the average flops consumed by the SD detector are very close to those consumed by the linear detector. Since the SD detector is capable of achieving the optimum bit error performance, the SD detector is recommended for $N_{RF} = 2$.

B. ABEP Analysis for Perfect CSI
First, the derived UUB and the asymptotic bound on the ABEP of GPSM systems, computed by (43) and (49), are illustrated in Fig. 3, which shows that the UUB tightly bounds the ABEP computed by Monte Carlo simulation. The comparison of detectors for the GPSM is also shown in Fig. 3. From Fig. 3(a,b), we can see that the ABEP of the linear detector is quite close to those of the optimum ML and SD detectors for $N_{RF} = 1$. Since the linear detector requires a much lower computational complexity than the SD detector, the linear detector is recommended for $N_{RF} = 1$. However, when multiple DP antennas are active, the linear detector leads to a large ABEP due to the inter-channel interference (ICI) among the activated transmit antennas [1]. For $C_1$ GPSM with $N_{RF} = 2$, the linear detector leads to an SNR gap of 8 dB at an ABEP of $10^{-6}$ in comparison with the SD detector. For $C_2$ GPSM with $N_{RF} = 2$, the SNR gap is 4 dB. Thus, for $N_{RF} = 2$, the SD detector is recommended when there is no strict limitation on the computational complexity.
The proposed GPSM jointly exploits DoF available in the signal, the polarization and the spatial domains. In the system design stage, we need to clarify how to balance bit rates among all variables such as N T , N R , N RF , K, and M for a fixed spectral efficiency. According to (52), the asymptotic ABEP is uniquely determined by the coding gain C GPSM with a given value of N R and a fixed channel estimation error. Therefore, we compare the coding gain of the GPSM system with various variables specified in TABLE III. Therein, the system is specialized as the SMX system with PolarSK signal constellation with N T = N RF , and is specialized as the PolarSK system with N T = 1.
It is observed from TABLE III that the proposed GPSM systems, equipped with a larger array of transmit DP antennas, have a greater coding gain than the SMX system for the same spectral efficiency $R = 6$ b/s/Hz even though the GPSM system requires one less RF chain than the SMX system. With the same number of RF chains, e.g., $N_{RF} = 2$, the proposed GPSM system achieves a spectral efficiency of 13 b/s/Hz, which is 3 b/s/Hz greater than that of the SMX system, even though the GPSM and the SMX systems have the same asymptotic ABEP, i.e., $C_{GPSM} = -14.61$ dB. With the same number of transmit DP antennas, e.g., $N_T = 2$, the SMX and the GPSM systems require two RF chains and one RF chain, respectively. In this situation, we observe that the SMX system requires 2.13 dB more SNR than the GPSM does to achieve the same asymptotic ABEP, targeting the same spectral efficiency of $R = 6$ b/s/Hz. In contrast, for $N_T = 2$ and $R = 8$ b/s/Hz, the coding gain of the SMX system is 3.6 dB higher than that of the GPSM. Therefore, while requiring fewer RF chains, the proposed GPSM system has a higher asymptotic ABEP than the SMX system at the same spectral efficiency. Moreover, it is observed from TABLE III that the ABEP decreases with an increasing $N_T$ for a fixed spectral efficiency and a fixed $N_{RF}$. In the system design stage, enhancing the reliability of the GPSM system therefore comes at the cost of an increased number of transmit antennas.
A comparison of 8 b/s/Hz GPSM with DP-SM is also given in Fig. 4 for one RF chain. By letting $k = 0$, $q_H = 1$, $\hat{k} \in \{0, \pi/2\}$, and $\hat{q}_H = 1$, we obtain the asymptotic expression for the ABEP of DP-SM as where the coding gain can be computed by substituting $N_{RF} = 1$ and $k \in \{0, \frac{\pi}{2}\}$ into (45), (44) and (50), as (55), provided at the bottom of the next page.
Since GPSM and DP-SM systems achieve the same diversity gain, the asymptotic ABEPs of DP-SM and PolarSK with the same $N_R$ are parallel lines with the same slope of $2N_R$. With the same data rate, the GPSM system outperforms the DP-SM system. In particular, the SNR gap between the asymptotic ABEPs of the GPSM with the $C_1$ constellation and DP-SM is 2 dB. The SNR gap between the asymptotic ABEPs of the GPSM with the $C_2$ constellation and DP-SM is 4 dB. Thus, GPSM achieves a lower asymptotic ABEP than DP-SM.
According to (49) and (54), the SNR gap G between the GPSM and DP-SM systems is the ratio of their coding gains at the same asymptotic ABEP. To obtain a more precise comparison of the GPSM and the DP-SM, the SNR gap G versus the spectral efficiency is plotted in Fig. 5. For the GPSM with the C 1 constellation and K = 1, the spectral efficiency is log 2 (M 2 N T ) b/s/Hz, whereas for the GPSM with the C 2 constellation and K = 2, the spectral efficiency is log 2 (2M 2 N T ) b/s/Hz. Therefore, for the GPSM with the C 1 constellation, the spectral efficiency is even when log 2 (N T ) is odd, and vice versa. However, for the GPSM with the C 2 constellation, the spectral efficiency is odd when log 2 (N T ) is odd, and vice versa. In addition, we can observe that the SNR gap increases with the spectral efficiency, owing to a more efficient exploitation of the polarization-domain resource.

Fig. 6. Impact of increasing number of transmit antennas on the ABEP of the GPSM system with the SD detector. When N T = 1, the GPSM system becomes the PolarSK system [37]. Markers show simulation results.
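The spectral-efficiency expressions quoted for the C 1 and C 2 constellations can be checked numerically with a minimal sketch. It assumes that "M 2" denotes a single constellation-size factor that is a power of two; the function name and the example values (factor 2, N T = 8) are hypothetical and chosen only for illustration. The parity statements in the text hold when log 2 of that factor is odd, as it is in this example.

```python
import math

def gpsm_spectral_efficiency(m_factor: int, n_t: int, k: int) -> float:
    """Spectral efficiency in b/s/Hz for single-RF GPSM.

    Evaluates log2(K * M * N_T), which gives log2(M N_T) for K = 1 (C1)
    and log2(2 M N_T) for K = 2 (C2). `m_factor` stands for the quantity
    the text writes as "M 2" (assumed to be a power of two).
    """
    return math.log2(k * m_factor * n_t)

# Hypothetical example: factor 2 and N_T = 8, so log2(N_T) = 3 is odd.
se_c1 = gpsm_spectral_efficiency(m_factor=2, n_t=8, k=1)  # C1 constellation
se_c2 = gpsm_spectral_efficiency(m_factor=2, n_t=8, k=2)  # C2 constellation
print(se_c1, se_c2)  # 4.0 (even) and 5.0 (odd)
```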
The single-RF MIMO system is recognized as a candidate for massive MIMO implementations [12], [13]. Therefore, the ABEPs of the GPSM with various sizes of transmit antenna array are plotted against the SNR in Fig. 6, where the SD detector is employed. When N T = 1, the GPSM system becomes the PolarSK system. As (49) indicates, increasing the size of the DP transmit antenna array does not affect the diversity gain achieved by the GPSM transmission, but decreases the coding gain and leads to an SNR gap. For a given signal constellation diagram with a single RF chain, N RF = 1, the SNR gap introduced by increasing N T from 1 to 128 is less than 2 dB, which is quite small, while the data rate is increased by 7 b/s/Hz compared with the PolarSK system. To investigate the impact of adding transmit antennas on the ABEP performance, the SNR gap, computed by $G_{\text{massive}} = 10 \log_{10} \left( C_{\text{PolarSK}} / C_{\text{GPSM}} \right)$, is plotted in Fig. 7, where C GPSM is computed by (50), and C PolarSK is computed by substituting N T = 1 into (44) and (50). Moreover, we observe from Fig. 7 that for the GPSM with the C 2 constellation, the SNR gap is negative for a large number of transmit antennas N T . This means that in the high SNR regime, the ABEP decreases even though a larger spectral efficiency is achieved. To verify the results for the GPSM with the C 2 constellation, simulations and analytic UUBs of the ABEP with N T = 1 and N T = 256 are plotted in Fig. 8. Numerical results show that the ABEP for N T = 256 is higher than that for N T = 1 in the low SNR regime. However, N T = 256 results in a lower ABEP in the high SNR regime. In particular, in a massive MIMO system with N T transmit antennas, a log 2 N T multiplexing gain can be achieved without sacrificing much transmit power. Therefore, the proposed GPSM system can be a candidate for massive single-RF MIMO implementations.

Fig. 7. SNR gap between the GPSM system and the PolarSK system [37]. Markers show simulation results.
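The trade-off discussed here, a small SNR gap in exchange for log 2 N T extra bits, can be illustrated with a short sketch. The SNR gap is taken as 10 log 10 of the ratio of the PolarSK coding gain to the GPSM coding gain, with both gains on a linear scale. The function names and the numerical coding gains below are hypothetical placeholders, not values from (44) or (50).

```python
import math

def snr_gap_db(c_polarsk: float, c_gpsm: float) -> float:
    """SNR gap in dB as 10*log10(C_PolarSK / C_GPSM), linear-scale gains."""
    return 10.0 * math.log10(c_polarsk / c_gpsm)

def extra_bits(n_t: int) -> float:
    """Bits per channel use gained by indexing N_T DP transmit antennas."""
    return math.log2(n_t)

# Going from N_T = 1 (PolarSK) to N_T = 128 adds 7 b/s/Hz.
print(extra_bits(128))  # 7.0

# Hypothetical coding gains: a positive gap means the GPSM needs more SNR,
# a negative gap means the GPSM needs less SNR than PolarSK.
print(snr_gap_db(c_polarsk=1.0, c_gpsm=0.8) > 0)  # True
print(snr_gap_db(c_polarsk=1.0, c_gpsm=1.2) < 0)  # True
```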

C. ABEP Analysis for Imperfect CSI
With imperfect CSI, the UUB on the ABEP, the asymptotic ABEP and the simulation results for various CSI estimation error models are illustrated in Fig. 9. With imperfect CSI, the upper bound on the ABEP is tight in the high SNR regime. The asymptotic ABEP, computed from the diversity gain, is validated. As indicated in (52), the diversity gain and coding gain depend on the model of the channel estimation error. As shown in Fig. 9(a, b), using the model $\sigma_e^2 = \frac{1}{N_p} \rho^{-1}$, the ABEP with imperfect CSI has the same diversity gain as that with perfect CSI. However, the coding gain is reduced by $10 \log_2 \left( 1 + \frac{1}{N_p} \right)$ dB, which leads to an SNR gap of $10 \log_2 \left( 1 + \frac{1}{N_p} \right)$ dB. For a constant CSI estimation error $\sigma_e^2$, as shown in Fig. 9(c, d), the ABEP has a floor, whose value is equal to the ABEP under an SNR of $\rho = \frac{1 - \sigma_e^2}{\sigma_e^2}$. Therefore, the GPSM system with a lower ABEP at $\rho = \frac{1 - \sigma_e^2}{\sigma_e^2}$ is more robust to imperfect CSI in the high SNR regime. When we assume $\sigma_e^2 = \frac{1}{4} \rho^{-\beta}$, as shown in Fig. 9(e, f), both the diversity gain and coding gain are reduced by imperfect CSI.
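For the constant-error model, the SNR at which the ABEP floor is evaluated, ρ = (1 − σ e ²)/σ e ², is easy to tabulate. The following is a minimal sketch; the function names and the example error variances are hypothetical, and the sketch only evaluates the stated expression without computing the ABEP itself.

```python
import math

def floor_snr(sigma_e2: float) -> float:
    """Linear-scale SNR (1 - sigma_e^2) / sigma_e^2 that sets the ABEP floor."""
    if not 0.0 < sigma_e2 < 1.0:
        raise ValueError("sigma_e2 must lie strictly between 0 and 1")
    return (1.0 - sigma_e2) / sigma_e2

def to_db(x: float) -> float:
    """Convert a linear power ratio to decibels."""
    return 10.0 * math.log10(x)

# Hypothetical error variances: a smaller variance pushes the floor
# to the ABEP of a higher equivalent SNR.
for sigma_e2 in (0.1, 0.01, 0.001):
    print(sigma_e2, round(to_db(floor_snr(sigma_e2)), 2))
```

Under this model, raising the transmit SNR beyond this point does not lower the ABEP further, which is why the ABEP evaluated at this SNR serves as the robustness measure.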

VII. CONCLUSIONS
The GSM, PolarSK, and DP-SM systems have been generalized into the GPSM for DP-MIMO scenarios. Two less computationally complex detectors, i.e., the linear and SD detectors, have been proposed. The SD detector achieves the optimum ABEP performance while requiring less complexity than the ML detector. The linear detector reduces the computational complexity significantly at the expense of ABEP performance. The UUB and asymptotic bound on the ABEP have been analytically derived, which enable us to show a diversity gain of 2N R . Moreover, we have verified that the diversity gain depends on the model of the channel estimation error. We have found that the proposed GPSM system outperforms the DP-SM system because the well-exploited DoF in the polarization domain enlarge the minimum Euclidean distance between any pair of closest symbols. Simulation results have shown that the SNR gap in the high SNR regime introduced by increasing the number of antennas from 1 to 128 is less than 2 dB, while the data rate is increased by 7 b/s/Hz. Thus, the GPSM can be a promising candidate for downlink SM massive MIMO implementations with a reduced number of RF chains at the transmitter.

A. Proof of Theorem 1
We first rewrite (11) as follows: Regardless of the fact that x l,k,qV,qH is selected from a finite set with a degree of 2 NRF log 2 (M 2 K) , we assume that the optimum estimate of x l,k,qV,qH for each selection of A l is given by and therefore we can obtain (15). Again substituting (A.4) into (16), the sub-optimal detection of k, q V , and q H is given by Maximizing (A.7), we have the sub-optimum detections of q V,ñRF and q H,ñRF , respectively, by (17) and (18).

B. Proof of Theorem 2
According to (11), the UUB of the GPSM is given by where $N_{(l,k,q_V,q_H),(\hat{l},\hat{k},\hat{q}_V,\hat{q}_H)}$ is the Hamming distance between symbols $(l, k, q_V, q_H)$ and $(\hat{l}, \hat{k}, \hat{q}_V, \hat{q}_H)$. The pairwise error probability (PEP) is defined by where $Q(x) = \frac{1}{\sqrt{2\pi}} \int_x^{\infty} e^{-u^2/2} \, du$, vectors s and ŝ are composed following (5), and ρ 0 is defined in (47). Substituting where B(z 1 , z 2 ) is the beta function, and G z1,z2 is defined by (46). Substituting (B.5) into (B.1), we obtain the closed-form expression for the UUB on the ABEP of the proposed GPSM system as (43).
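The structure of a union upper bound, a Hamming-distance-weighted sum of pairwise error probabilities, can be sketched generically. The sketch below uses the common textbook form with equiprobable symbols labelled by their natural binary index, which is an assumption and not necessarily the exact normalization of (B.1). The PEP is passed in as a caller-supplied function, since the closed form in (B.5) is not reproduced here, and all function names are hypothetical.

```python
import math
from typing import Callable

def q_function(x: float) -> float:
    """Gaussian tail probability Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits between two symbol labels."""
    return bin(a ^ b).count("1")

def union_bound_abep(num_bits: int, pep: Callable[[int, int], float]) -> float:
    """Generic union upper bound on the average bit error probability.

    Sums hamming_distance(s, s_hat) * pep(s, s_hat) over all ordered pairs
    of distinct symbols and normalizes by (number of symbols * bits/symbol).
    """
    n_sym = 1 << num_bits
    total = 0.0
    for s in range(n_sym):
        for s_hat in range(n_sym):
            if s_hat != s:
                total += hamming_distance(s, s_hat) * pep(s, s_hat)
    return total / (n_sym * num_bits)

# Toy usage with a hypothetical constant PEP; a real evaluation would
# plug in the averaged PEP of the system under study.
toy_pep = lambda s, s_hat: q_function(3.0)
print(union_bound_abep(num_bits=2, pep=toy_pep))
```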