Spatially Coupled Protograph LDPC-Coded Hierarchical Modulated BICM-ID Systems: A Promising Transmission Technique for 6G-Enabled Internet of Things

As a spectrally efficient technique for unequal error protection (UEP), hierarchical modulation (HM) bit-interleaved coded modulation (BICM) with iterative decoding (ID) has attracted interest in the wireless communication community. In this article, we investigate spatially coupled (SC) protograph low-density parity-check (P-LDPC)-coded $M$-ary quadrature amplitude modulation (QAM) HM-BICM-ID systems. We first develop an information-theoretic methodology to calculate $(\log_{2}M)/2$ types of constellation-constrained average mutual information (AMI), which can be used to characterize the performance limits of different layers in the HM-BICM systems. We further propose a two-stage design approach to construct a novel type of constellations, called structural quadrant (SQ) constellations, and develop a quadrant-based harmonic mean analysis to evaluate the nonfeedback and iterative-feedback asymptotic performance of the proposed constellations. In addition, we conceive a performance-analysis tool, referred to as the multistream-based extrinsic information transfer (MS-EXIT) algorithm, for predicting the decoding thresholds of all individual coded-bit streams in the proposed SC P-LDPC-coded HM-BICM-ID systems. Simulation results not only agree well with the theoretical analyses but also indicate that the proposed SC P-LDPC-coded HM-BICM-ID systems are remarkably superior to the state-of-the-art counterparts. Therefore, the proposed SC P-LDPC-coded HM-BICM-ID systems are able to provide diverse Quality of Service (QoS) for future wireless applications, such as 6G-enabled Internet of Things (IoT).


I. INTRODUCTION
WITH the rapid development of modern communication technology, large coverage and multiple users have become the most fundamental features of digital transmission scenarios, such as multimedia transmissions [1], broadcasting systems [2], and cellular networks [3]. Hence, there is a surge of demand for advanced technologies that realize high-throughput and high-reliability wireless communications, especially for the 6G-enabled massively data-intensive Internet-of-Things (IoT) applications [4]. To achieve this goal, many researchers have focused their attention on hierarchical modulation (HM), which is a powerful solution to meet the requirements of diverse communication services simultaneously [5]. In particular, the HM scheme not only can carry multiple transmitted data streams at the same time but also possesses strong flexibility and compatibility, because the layered structure of HM can provide unequal error protection (UEP) for these transmitted data streams based on their different importance and priorities [6]. For this reason, HM has been widely used in various communication scenarios, such as multiclass data transmissions [7], relay systems [8], [9], digital broadcast systems [10], and cooperative communication systems [11]. Given these intrinsic UEP characteristics, HM-aided techniques are well suited to optimizing the resource allocation in 6G-enabled IoT.
In order to achieve higher transmission throughput, bit-interleaved coded modulation (BICM) [12] can be incorporated into HM to formulate a spectrally efficient framework, referred to as HM-BICM. Owing to the advantages of intrinsic UEP and spectral efficiency, extensive research on analyzing and optimizing the HM-BICM systems has been carried out. For instance, some attempts have concentrated on deriving excellent mapping rules, which are among the critical parameters that determine the achievable rates of HM-BICM systems [13]-[16]. In [13], a type of nonuniform constellations has been designed for HM-BICM systems based on the minimum-Euclidean-distance (MED) property of adjacent labels. Moreover, by adjusting the MED ratio between different labels, a type of dynamic constellations has been presented to improve the performance of HM-BICM systems under varying channel conditions [14]. Furthermore, a protection-level-exchanging scheme has been introduced in HM-BICM systems [15], which can be used to enhance the performance of cooperative networks by adopting two different constellations to process the information at the source node and relay node, respectively. Hossain and Jornet [16] have applied HM-BICM to ultrabroadband terahertz (THz) communications and generated a novel constellation, which is referred to as the hierarchical bandwidth modulation (HBM). Besides, the design and implementation of hierarchical amplitude and phase-shift keying (HAPSK) constellations and hierarchical quadrature amplitude modulation (HQAM) constellations have been discussed in [17] and [18], respectively. However, BICM is not an optimal transmission technique due to its independent demapping framework. The performance of such a technique can be further boosted by including iterative decoding (ID) between the demodulator and decoder to process the soft information iteratively [19]. 
Therefore, an enhanced version of HM-BICM systems, called HM-BICM-ID systems, has been formulated [20]. HM-BICM-ID not only maintains the spectral-efficiency superiority of HM-BICM but also exhibits a higher achievable rate and better error performance. Recently, the constellation design for HM-BICM-ID systems has been intensely investigated [20], [21].

Error-correction codes (ECCs) are another important factor that determines the overall performance of HM-BICM systems. During the past decade, a significant amount of research effort has been devoted to the theoretical investigation and practical application of ECCs for the HM-BICM systems. In particular, a type of capacity-approaching ECCs, called low-density parity-check (LDPC) codes, has been widely investigated in the HM scenarios [17], [21]. For instance, a degree-distribution design strategy, which can be used to construct UEP LDPC codes, has been presented for HM-BICM systems [22]. Moreover, two different complexity-reduced decoding algorithms have been proposed for the LDPC-coded HM-BICM systems [23] and their HM-BICM-ID counterparts [24], respectively. Although conventional LDPC codes exhibit desirable error performance in the ID context, their irregular structure leads to relatively high encoding and decoding complexities [25]. To overcome this weakness, a class of structured LDPC codes, referred to as protograph LDPC (P-LDPC) codes, has been conceived [26]. To facilitate the analysis and design of P-LDPC-coded BICM systems, a variety of protograph-based extrinsic information transfer (P-EXIT) algorithms have been proposed [27]-[29]. Following these information-theoretic advancements, the optimization of P-LDPC-coded BICM-ID counterparts has been investigated in Poisson pulse-position modulation (PPM) scenarios [30], [31]. Motivated by the advantages of protographs, a type of convolutional-based P-LDPC codes, referred to as spatially coupled (SC) P-LDPC codes, has also been studied. 
Such codes not only inherit the appealing advantages of P-LDPC codes but also have the convolutional feature to improve the error performance [32], [33]. In the past few years, a variety of research works related to the SC P-LDPC codes have been carried out over different channels, such as binary erasure channels (BECs) [34] and additive white Gaussian noise (AWGN) channels [35]. Due to their advantages (e.g., outstanding performance and simple implementation), the SC P-LDPC codes have great potential to be a desirable error-correction coding scheme for HM scenarios. Nevertheless, as far as we know, there have been few previous works on SC P-LDPC codes for bandwidth-limited systems, especially HM-BICM-ID systems, to date.
Inspired by the aforementioned discussion, a comprehensive investigation on SC P-LDPC-coded $M$-ary QAM HM-BICM-ID systems is conducted in this article, whose main contributions are fourfold. First, an information-theoretic method is proposed to calculate $(\log_2 M)/2$ types of constellation-constrained average mutual information (AMI), which can be utilized to exactly evaluate the achievable capacity limits of different layers in HM-BICM systems. Second, a novel type of constellations, referred to as structural quadrant (SQ) constellations, is designed based on a two-stage design approach, while a quadrant-based harmonic mean analysis is presented to measure the nonfeedback and iterative-feedback performance of different constellations in HM-BICM-ID systems. The SQ constellations not only exhibit higher capacities and larger harmonic means than the state-of-the-art counterparts in HM-BICM scenarios but also attain desirable performance gains in HM-BICM-ID scenarios. Third, we develop an asymptotic-performance analytical tool, called the multistream-based EXIT (MS-EXIT) algorithm, to predict the asymptotic convergence performance of all individual coded-bit streams in HM-BICM-ID systems. Finally, analytical and experimental results reveal that our proposed SC P-LDPC-coded HM-BICM-ID systems significantly outperform the existing counterparts. As a consequence, the proposed SC P-LDPC-coded HM-BICM-ID systems can be considered a promising technique for wireless communication applications with UEP requirements, such as massive IoT.
The remainder of this work consists of five parts. In Section II, we present the SC P-LDPC-coded HM-BICM-ID system model and develop an information-theoretic methodology to calculate its AMIs. In Section III, we propose a two-stage design approach to construct the SQ constellations and illustrate their advantages from the perspective of both capacity and quadrant-based harmonic mean analyses. In Section IV, we put forward an MS-EXIT algorithm and estimate the convergence performance for HM-BICM-ID systems with different constellations. We perform simulations to demonstrate the superiority of the proposed SC P-LDPC-coded HM-BICM-ID systems in Section V and give conclusions in Section VI.
II. SC P-LDPC-CODED HM-BICM-ID SYSTEM AND AMI CALCULATION
We first briefly describe the SC P-LDPC-coded HM-BICM-ID system model. Subsequently, we propose an information-theoretic methodology to calculate the constellation-constrained AMI of such a system.

A. SC P-LDPC-Coded HM-BICM-ID Systems
The system model of an SC P-LDPC-coded HM-BICM-ID scheme is depicted in Fig. 1. As can be seen, $(\log_2 M)/2$ different information-bit streams are first encoded by $(\log_2 M)/2$ individual SC P-LDPC encoders to yield their corresponding coded-bit streams. These $(\log_2 M)/2$ coded-bit streams are then permuted by their individual bit-level interleavers. Next, every $m = \log_2 M$ consecutive coded bits output from these interleavers are grouped together and modulated into an $M$-ary transmitted signal by a given hierarchical modulator, e.g., an $M$-QAM hierarchical modulator. It should be noted that the coded bits output from the layer-1 interleaver, layer-2 interleaver, ..., layer-$(\log_2 M)/2$ interleaver are assumed to be the highest priority data stream, second highest priority data stream, ..., lowest priority data stream, respectively. These data streams are successively assigned to the most significant bit (MSB) positions, second MSB positions, ..., least significant bit (LSB) positions in the mapping process (see footnote 3). To ease the understanding of the HM process, a specific example is given in Fig. 2. As illustrated in this figure, for the Gray constellation of a 16-QAM HM-BICM-ID system, the number of labeling bits per symbol $m$, the number of layers $(\log_2 M)/2$, and the number of labeling bits per group are 4, 2, and 2, respectively. Besides, the minimum distance between the labels in adjacent quadrants is denoted as $d_1$, while the minimum distance between the labels in a given quadrant is denoted as $d_2$. By adjusting the distance ratio $r_d = d_1/d_2$, the UEP level can be changed for the layer-1 and layer-2 coded-bit streams. Through the above procedure, two length-$n$ coded-bit streams are transformed into a length-$(2n)/m$ $M$-ary modulated symbol sequence.

Footnote 3: In this article, ... [15], [16], [20] and 64-ary HM-BICM-ID systems [18]. Moreover, the numbers of labeling bits in each group are assumed to be identical and equal to two (i.e., $m_1 = m_2 = \cdots = m_{(\log_2 M)/2} = 2$) [15], [16], [18], [20].

Then, the $j$th modulated symbol $x_j$ is transmitted over an AWGN channel, and the corresponding output $y_j$ is given by

$$y_j = x_j + n_j \tag{1}$$

where $j = 1, 2, \ldots, 2n/m$; $x_j$ stands for a modulated symbol taken from an $M$-ary constellation set $\chi$; and $n_j$ represents the complex Gaussian noise with zero mean and variance $\sigma^2 = N_0/2$ in each dimension. Furthermore, the transmitted energy per symbol is assumed to be normalized to unity, i.e., $\mathbb{E}[|x_j|^2] = 1$, where $\mathbb{E}[\cdot]$ denotes the expectation function.
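The mapping and channel model described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `hm16qam_gray` and `awgn` are hypothetical, and the labeling used here is a generic Gray-labelled hierarchical 16-QAM (layer-1 bits select the quadrant, layer-2 bits select the point inside it), which is not necessarily the exact labeling of Fig. 2. The constellation is scaled so that the average symbol energy is one for any distance ratio $r_d = d_1/d_2$.

```python
import math
import random

def hm16qam_gray(r_d):
    """Return {label: point} for a unit-energy hierarchical 16-QAM (assumed labeling).

    label = (b1, b2, b3, b4); (b1, b2) is the layer-1 group, (b3, b4) the layer-2 group.
    """
    # Average energy is (d1/2)^2 + (d1/2 + d2)^2; solve for d2 so that it equals 1.
    d2 = 1.0 / math.sqrt((r_d / 2) ** 2 + (r_d / 2 + 1) ** 2)
    d1 = r_d * d2
    inner, outer = d1 / 2, d1 / 2 + d2
    points = {}
    for label in range(16):
        b1, b2, b3, b4 = [(label >> s) & 1 for s in (3, 2, 1, 0)]
        # Layer-1 bits pick the sign (quadrant); layer-2 bits pick inner/outer level.
        i = (1 if b1 else -1) * (inner if b3 else outer)
        q = (1 if b2 else -1) * (inner if b4 else outer)
        points[(b1, b2, b3, b4)] = complex(i, q)
    return points

def awgn(x, n0, rng):
    """y = x + n with complex Gaussian noise of variance N0/2 per dimension."""
    sigma = math.sqrt(n0 / 2)
    return x + complex(rng.gauss(0, sigma), rng.gauss(0, sigma))
```

With `r_d = 1` the sketch reduces to the uniform 16-QAM grid; larger `r_d` pushes the quadrants apart and therefore protects the layer-1 bits more strongly at the expense of the layer-2 bits.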
On the receiver side, the signal is processed by a hierarchical serial-parallel concatenated structure, which is formed by a multiple-input-multiple-output (MIMO) hierarchical demodulator and $(\log_2 M)/2$ single-input-single-output (SISO) decoders. To be specific, given the received signal $y_j$ and the a priori log-likelihood ratios (LLRs) of the hierarchical demodulator, i.e., $L^k_{A,\mathrm{Dem}}$, the extrinsic LLRs output from the hierarchical demodulator can be computed (i.e., $L^k_{E,\mathrm{Dem}} = L^k_{P,\mathrm{Dem}} - L^k_{A,\mathrm{Dem}}$) and passed to their corresponding layer-$k$ deinterleavers through a serial-to-parallel operation, where $k = 1, 2, \ldots, (\log_2 M)/2$. After deinterleaving, the extrinsic LLRs $L^k_{E,\mathrm{Dem}}$ serve as the a priori LLRs $L^k_{A,\mathrm{Dec}}$ for the corresponding layer-$k$ decoders. Subsequently, the extrinsic LLRs output from the $(\log_2 M)/2$ individual decoders can be calculated (i.e., $L^k_{E,\mathrm{Dec}} = L^k_{P,\mathrm{Dec}} - L^k_{A,\mathrm{Dec}}$) and sent to the corresponding layer-$k$ interleavers. After being permuted, these extrinsic LLRs are processed by a parallel-to-serial operation and fed back to the hierarchical demodulator, where they serve as the updated a priori input for the next ID iteration. With this decoding framework, the extrinsic LLRs are updated iteratively, which promotes the information convergence and improves the overall performance of HM-BICM-ID systems. Note that the max-sum approximation of the log-domain maximum a posteriori probability (Max-Log-MAP) algorithm [36] and the belief-propagation (BP) algorithm [33] are exploited in the hierarchical demodulator and the $(\log_2 M)/2$ decoders, respectively.
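To make the demodulator side of this exchange concrete, the following Python sketch computes max-log extrinsic LLRs for one received symbol. It is a simplified illustration rather than the demodulator of [36]: the helper names `gray_16qam` and `maxlog_extrinsic_llrs` are hypothetical, the constellation is a plain uniform Gray 16-QAM, and the LLR sign convention $L = \ln[P(b=0)/P(b=1)]$ is an assumption. Leaving the $i$th a priori LLR out of the metric is equivalent, under the max-log rule, to subtracting it from the a posteriori LLR.

```python
import math

def gray_16qam():
    """Uniform unit-energy Gray 16-QAM; label = (b1, b2, b3, b4) (assumed labeling)."""
    a = 1 / math.sqrt(10)
    level = {(0, 0): -3 * a, (0, 1): -a, (1, 1): a, (1, 0): 3 * a}
    return {(b1, b2, b3, b4): complex(level[(b1, b3)], level[(b2, b4)])
            for b1 in (0, 1) for b2 in (0, 1) for b3 in (0, 1) for b4 in (0, 1)}

def maxlog_extrinsic_llrs(y, points, n0, llr_apriori):
    """Extrinsic LLRs ln P(b=0)/P(b=1) of all labeling bits of one symbol (max-log)."""
    m = len(llr_apriori)
    out = []
    for i in range(m):
        best = {0: -math.inf, 1: -math.inf}
        for label, x in points.items():
            # Channel metric plus the a priori knowledge of the other m-1 bits only.
            metric = -abs(y - x) ** 2 / n0
            metric += sum(llr_apriori[j] for j in range(m)
                          if j != i and label[j] == 0)
            best[label[i]] = max(best[label[i]], metric)
        out.append(best[0] - best[1])
    return out
```

In a full receiver these values would be split per layer, deinterleaved, and handed to the layer-$k$ decoders as a priori LLRs.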
Remark 1: As proved in [25] and [37], LDPC-based codes (including the P-LDPC codes and SC P-LDPC codes), which possess excellent performance over AWGN channels, can also perform well over other memoryless channels. For this reason, although an AWGN channel is assumed in this article, our proposed design can be applicable to other types of ergodic channels, such as fast fading channels.
B. AMI Calculation
1) CM Capacity: As is well known, the AMI between the channel input and the corresponding output specifies the maximum achievable rate in the case of error-free transmissions [38], [39]. Thus, the AMI can be considered as the performance limit for the HM system over a given channel.
Given an $M$-ary constellation, when the modulated symbol is chosen from the corresponding set $\chi$ with equal probability, the constellation-constrained AMI for coded modulation (CM), i.e., the CM capacity $C_{\mathrm{CM}}$, over an AWGN channel is evaluated as

$$C_{\mathrm{CM}} = m - \mathbb{E}_{x,y}\left[\log_2 \frac{\sum_{z \in \chi} p(y|z)}{p(y|x)}\right] \tag{2}$$

where $p(y|z)$ and $p(y|x)$ denote the conditional probability density functions (PDFs) of the received signal $y$ given the complex symbol $z$ taken from $\chi$ and the modulated symbol $x$, respectively. It is noteworthy that CM is considered an optimal transmission technique, which can obtain the maximum achievable rate [40]. In the BICM-ID scenario, an iterative procedure is executed at the receiver to exchange the soft information between the demodulator and decoder. Based on this procedure, each labeling bit of a modulated symbol can be estimated with the aid of the a priori information of the remaining $m - 1$ labeling bits, which enhances the system reliability. In fact, BICM-ID is capable of achieving the maximum information rate when a sufficient number of iterations is performed [41], [42]. Consequently, BICM-ID is viewed as an optimal transmission technique and can achieve the same constellation-constrained AMI as CM [40], [43], i.e., $C_{\mathrm{BICM\text{-}ID}} = C_{\mathrm{CM}}$.
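2) BICM Capacity: It is worth noting that BICM is a suboptimal alternative technique due to its independent demapping structure [41]. The constellation-constrained AMI, i.e., the BICM capacity $C_{\mathrm{BICM}}$, can be measured by

$$C_{\mathrm{BICM}} = m - \sum_{i=1}^{m} \mathbb{E}_{b,y}\left[\log_2 \frac{\sum_{z \in \chi} p(y|z)}{\sum_{z \in \chi_i^{(b)}} p(y|z)}\right] \tag{3}$$

where $\chi_i^{(b)}$ stands for the constellation subset with the $i$th labeling bit being $b$ ($b \in \{0, 1\}$).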
3) HM-BICM Capacity: Based on the relationship between the labeling bits and the parallel channels, an $M$-ary constellation is equivalent to $m$ parallel, independent, binary-input memoryless subchannels [12]. Moreover, these $m$ subchannels correspond to the $m$ labeling bits within the modulated symbols. Thereby, $C_{\mathrm{BICM}}$ defined in (3) can be reformulated as

$$C_{\mathrm{BICM}} = \sum_{i=1}^{m} I(b; y \mid S = i) \tag{4}$$

where $I(b; y \mid S = i)$ denotes the conditional MI and $S$ denotes the index of a labeling bit (i.e., $1 \le S \le m$). In an HM-BICM system, in order to simultaneously process the $(\log_2 M)/2$ individual coded-bit streams with different priorities, the constellation should be divided into $(\log_2 M)/2$ layers, and each layer is composed of two fixed positions within the corresponding labeling bit sequence. Due to this feature of the HM-BICM system, its AMI analysis must be divided into $(\log_2 M)/2$ parts, each of which corresponds to exactly one of these $(\log_2 M)/2$ layers. Let $C^k_{\mathrm{HM\text{-}BICM}}$ denote the constellation-constrained AMI of the $k$th layer, where $k = 1, 2, \ldots, (\log_2 M)/2$. Then, $C^k_{\mathrm{HM\text{-}BICM}}$ is equal to the summation of the conditional MIs corresponding to the $k$th group of labeling bits within a labeling bit sequence. Consequently, $C^k_{\mathrm{HM\text{-}BICM}}$, referred to as the layer-$k$ capacity, can be measured by

$$C^k_{\mathrm{HM\text{-}BICM}} = \sum_{i=2k-1}^{2k} I(b; y \mid S = i) \tag{5}$$

[Fig. 3: MSED [20], MSED-B [20], M3 [15], and HBM [16] constellations. A 16-QAM modulation is considered.]

Remark 2: $C^k_{\mathrm{HM\text{-}BICM}}$ is used to evaluate the maximum achievable rate of the $k$th layer (resp. the $k$th coded-bit stream) in a BICM scenario. However, in actual implementation, a constellation possessing a larger $C^k_{\mathrm{HM\text{-}BICM}}$ may also help an SC P-LDPC code achieve more desirable error performance in a BICM-ID scenario than counterparts possessing a smaller $C^k_{\mathrm{HM\text{-}BICM}}$.
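The per-bit and per-layer quantities above can be estimated numerically. The following Python sketch is a minimal Monte Carlo illustration, assuming a uniform Gray-labelled 16-QAM with unit symbol energy and two labeling bits per layer; the names `gray_16qam`, `bit_mi`, and `layer_capacities` are hypothetical and the labeling is not one of the constellations compared in this article. For each labeling-bit position it averages $\log_2$ of the ratio between the full likelihood sum and the likelihood sum over the subset that shares the transmitted bit value, and then adds the per-bit MIs group by group.

```python
import math
import random

def gray_16qam():
    """Uniform unit-energy Gray 16-QAM; label = (b1, b2, b3, b4), layer 1 = (b1, b2)."""
    a = 1 / math.sqrt(10)
    level = {(0, 0): -3 * a, (0, 1): -a, (1, 1): a, (1, 0): 3 * a}
    return {(b1, b2, b3, b4): complex(level[(b1, b3)], level[(b2, b4)])
            for b1 in (0, 1) for b2 in (0, 1) for b3 in (0, 1) for b4 in (0, 1)}

def bit_mi(points, es_n0_db, n_samples=4000, seed=1):
    """Monte Carlo estimate of I(b; y | S = i) for every labeling-bit position i."""
    rng = random.Random(seed)
    n0 = 10 ** (-es_n0_db / 10)          # Es is normalized to 1
    sigma = math.sqrt(n0 / 2)
    labels = list(points)
    m = len(labels[0])
    acc = [0.0] * m
    for _ in range(n_samples):
        lab = rng.choice(labels)
        y = points[lab] + complex(rng.gauss(0, sigma), rng.gauss(0, sigma))
        like = {l: math.exp(-abs(y - points[l]) ** 2 / n0) for l in labels}
        total = sum(like.values())
        for i in range(m):
            # Likelihood mass of the subset whose i-th bit equals the transmitted bit.
            subset = sum(v for l, v in like.items() if l[i] == lab[i])
            acc[i] += math.log2(total / subset)
    return [1.0 - a / n_samples for a in acc]

def layer_capacities(mi, bits_per_layer=2):
    """Sum the per-bit MIs over each consecutive group of labeling bits."""
    return [sum(mi[k:k + bits_per_layer]) for k in range(0, len(mi), bits_per_layer)]
```

For example, `layer_capacities(bit_mi(gray_16qam(), 5.0))` returns two numbers, the first for the layer carried by the MSB positions and the second for the layer carried by the LSB positions; the sum of all per-bit MIs approximates $C_{\mathrm{BICM}}$ at that SNR.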

A. Proposed SQ Constellations
In wireless communication systems, constellation is a critical parameter to determine the transmission rate and error performance. Therefore, the constellations should be carefully designed and optimized so as to meet the practical requirements for different transmission environments. For example, the Gray constellation, which is considered as an optimal mapping scheme in the BICM systems, is not applicable in ID systems since there is nearly no performance gain in the iterative process [20], [44]. In order to address this weakness, a variety of new constellations have been proposed for the BICM-ID systems [45]. Different from conventional BICM-ID systems, HM-BICM-ID systems can be viewed as layered systems, whose constellations are decomposed into several different layers so as to facilitate the processing of different data streams. Thus, the investigation on the constellation design for HM-BICM-ID systems deserves further exploration. Fig. 3 shows four well-performing constellations, i.e., maximum squared Euclidean distance (MSED) constellation [20], MSED-B constellation [20], M3 constellation [15], and HBM constellation [16], proposed in recent years.
For an $M$-QAM HM-BICM-ID system, the constellation can be divided into $(\log_2 M)/2$ different layers so as to carry $(\log_2 M)/2$ types of data streams. According to the importance and priorities of these $(\log_2 M)/2$ individual data streams, the layers constructed from the relatively significant labeling bit positions must be processed first in the constellation-design procedure. The remaining layers should then be optimized to enhance the overall system performance. Based on the above discussion, we propose a two-stage design approach to generate a novel type of constellations, called SQ constellations, in order to improve the HM-BICM-ID performance. The detailed design process is described as follows.
1) Gray-Mapped Base Layer: Given an $M$-ary modulation scheme, the labeling bit sequence can be decomposed into $(\log_2 M)/2$ groups sequentially, and every group consists of two labeling bits. The labeling bits within the first group correspond to the MSB positions, which are used to construct the first layer to carry the highest priority data stream. The labeling bits within the second group, third group, ..., last group correspond to the second MSB positions (i.e., second layer), third MSB positions (i.e., third layer), ..., LSB positions (i.e., last layer), respectively, which are used to carry the second highest priority data stream, third highest priority data stream, ..., lowest priority data stream. Therefore, one should first process the first layer so as to ensure the performance of the highest priority data stream. To be specific, the constellation is first divided equally into four quadrants, called layer-1 quadrants. For the first group, the labeling bits of all the $M/4$ labels in a given layer-1 quadrant are set to the same value, while only one labeling bit differs between two adjacent layer-1 quadrants. From the layer-1 perspective, this operation makes the first-group labeling bits of all labels form a Gray mapping, which is able to achieve the maximum AMIs.

2) Structural Enhanced Layers: The labeling bits within the other $(\log_2 M)/2 - 1$ groups, which correspond to the remaining $(\log_2 M)/2 - 1$ layers, can be processed through a structural scheme in sequential order. Specifically, a layer-1 quadrant can be further divided equally into four subquadrants, called layer-2 quadrants. The second-group labeling bits within these four layer-2 quadrants, which are generated from a given layer-1 quadrant, form a fixed structure (e.g., the Gray structure or the anti-Gray structure). From the perspective of the layer-1 quadrants, the second-group labeling bits constitute two Gray mappings and two anti-Gray mappings. In particular, every two adjacent layer-1 quadrants have different second-group constellation structures, each of which includes four virtual labels (i.e., eight labeling bits). Furthermore, the layer-3 quadrants, layer-4 quadrants, ..., layer-$(\log_2 M)/2$ quadrants can be generated by processing the labeling bits of the remaining $(\log_2 M)/2 - 2$ groups in a similar way. For example, the $k$th-group labeling bits within the four layer-$k$ quadrants, which are formed by equally dividing a given layer-$(k-1)$ quadrant, can also constitute two Gray mappings and two anti-Gray mappings.

Based on the above two steps, one can generate the SQ constellations, which not only ensure excellent HM-BICM performance but also realize desirable HM-BICM-ID performance. It should be noted that the number of labeling bits within each group is set to two. Then, the $k$th-group labeling bits have four different settings (i.e., 00, 01, 10, 11) in their corresponding layer-$k$ quadrants. Hence, there are different realizations of the resultant SQ constellation. Nevertheless, all possible realizations have the same constellation structure (i.e., Gray structure or anti-Gray structure) from the quadrant perspective. As an example, the SQ constellation for 16-QAM modulation is given in Fig. 4. To verify the superiority of the proposed SQ constellation, three different types of capacities, i.e., $C_{\mathrm{CM}}$, $C_{\mathrm{BICM}}$, and $C^k_{\mathrm{HM\text{-}BICM}}$ for $k = 1, 2$, will be investigated in the forthcoming capacity analysis.

B. Capacity Analysis
According to Section II-B, the constellation-constrained AMI (i.e., capacity) for a given constellation can be calculated. Let $E_s/N_0$ be the signal-to-noise ratio (SNR) per symbol. The capacities $C_{\mathrm{CM}}$, $C_{\mathrm{BICM}}$, and $C^k_{\mathrm{HM\text{-}BICM}}$ ($k = 1, 2$) for a 16-QAM modulation with the proposed SQ constellation and four state-of-the-art constellations, i.e., the MSED [20], MSED-B [20], M3 [15], and HBM [16] constellations, are presented in Fig. 5. As observed, the SQ constellation exhibits a larger $C_{\mathrm{BICM}}$ than the other four constellations. This indicates that the HM-BICM system with an SQ constellation performs better than the counterparts using the MSED, MSED-B, M3, and HBM constellations. More specifically, one can observe that the layer-1 capacities (i.e., $C^1_{\mathrm{HM\text{-}BICM}}$) of the M3, HBM, and SQ constellations are the same and are larger than those of the MSED and MSED-B constellations. In addition, the layer-2 capacity (i.e., $C^2_{\mathrm{HM\text{-}BICM}}$) of the SQ constellation is the largest among all five constellations, which indicates that the HM-BICM system with an SQ constellation exhibits better layer-2 performance than the counterparts with the other four constellations.
Remark 3: The capacity analysis verifies the superiority of the proposed SQ constellation in the HM-BICM system. Its advantage in HM-BICM-ID systems will be further demonstrated in the next subsection and Section IV.

C. Quadrant-Based Harmonic Mean Analysis
Different from the BICM-ID system, an M-ary HM-BICM-ID system is used to carry (log 2 M)/2 coded-bit streams, and thus each label within a given M-ary HM constellation is divided into (log 2 M)/2 different groups, which correspond to (log 2 M)/2 different layers and are used for mapping (log 2 M)/2 different coded-bit streams in the bit-to-symbol process. Therefore, the selection scheme for the labels and labeling bits in the conventional harmonic mean analysis [46], which is tailored for the BICM-ID systems, is no longer suitable for the HM-BICM-ID systems. Then, we modify the conventional harmonic mean analysis [46] and conceive a novel counterpart, referred to as quadrant-based harmonic mean analysis, so as to verify the superiority of the SQ constellations in the HM-BICM-ID system. Based on a given M-ary HM constellation, the quadrant-based harmonic mean analysis can be used to calculate (log 2 M)/2 different harmonic means of the minimum squared Euclidean distance (HMMSED), which is of great importance to evaluate the asymptotic performance of a constellation in both nonfeedback and iterative-feedback scenarios. In the following, we elaborate on the analysis process.
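1) Nonfeedback HMMSED: Given an $M$-ary HM constellation, there exist $(\log_2 M)/2$ layers, each of which corresponds to one group of labeling bits. Let $\chi_i^{(b)}$ denote the subset of $\chi$ whose labels take the value $b$ in the $i$th labeling-bit position, where $i = 1, 2, \ldots, m$ and $b \in \{0, 1\}$. Then, one can calculate the nonfeedback HMMSED corresponding to the first layer as

$$D_1 = \left( \frac{1}{2^{m+1}} \sum_{i=1}^{2} \sum_{b=0}^{1} \sum_{x \in \chi_i^{(b)}} \frac{1}{\|x - z\|^2} \right)^{-1} \tag{6}$$

where $z \in \chi_i^{(\bar{b})}$ denotes the label closest to $x$, and $\|x - z\|$ denotes the Euclidean distance between the labels $x$ and $z$. For the remaining $(\log_2 M)/2 - 1$ layers, we define $\chi_{i,j}^{(b)}$ as the subset of $\chi_i^{(b)}$ whose labels are located in the $j$th layer-$(k-1)$ quadrant and $\chi_{i,j'}^{(\bar{b})}$ as the subset of $\chi_i^{(\bar{b})}$ whose labels are located in the $j'$th layer-$(k-1)$ quadrant, for $j, j' \in \{1, 2, \ldots, 4^{(k-1)}\}$ and $j' = j$. Accordingly, the nonfeedback HMMSED corresponding to the $k$th layer for $k = 2, 3, \ldots, (\log_2 M)/2$ can be calculated as

$$D_k = \left( \frac{1}{2^{m+1}} \sum_{i=2k-1}^{2k} \sum_{b=0}^{1} \sum_{j=1}^{4^{(k-1)}} \sum_{x \in \chi_{i,j}^{(b)}} \frac{1}{\|x - z_{j'}\|^2} \right)^{-1} \tag{7}$$

where $z_{j'} \in \chi_{i,j'}^{(\bar{b})}$ denotes the label closest to $x$.

2) Iterative-Feedback HMMSED: In the ID scenario, the iterative-feedback HMMSED corresponding to a given $M$-ary HM constellation can be calculated as [46]

$$D = \left( \frac{1}{m 2^{m}} \sum_{i=1}^{m} \sum_{b=0}^{1} \sum_{x \in \chi_i^{(b)}} \frac{1}{\|x - \hat{z}\|^2} \right)^{-1} \tag{8}$$

where $\hat{z} \in \chi_i^{(\bar{b})}$ denotes the label whose $m$ labeling bit positions are the same as those of $x$ except the $i$th position.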
Remark 4: It should be noted that (8), which is used to evaluate the iterative gain of constellations, is obtained from [46], while (6) and (7), which are used to evaluate the nonfeedback performance of constellations, are derived from the conventional harmonic mean based on the HM characteristics.
3) HMMSED Analysis: From the information-theoretic perspective, the nonfeedback HMMSED is tailored for the noniterative decoding scenario, and thus it can be utilized to estimate the intrinsic performance of constellations separately from the decoding algorithm. Besides, it should be noted that a constellation possessing a larger nonfeedback HMMSED can ensure high reliability for the initial soft information output from the decoder, which may help the system achieve more desirable error performance in the ID context. Therefore, the nonfeedback HMMSED not only measures the performance of HM-BICM systems but also plays an important role in the performance of HM-BICM-ID systems. Different from the nonfeedback HMMSED, the iterative-feedback HMMSED is derived under the assumption of error-free feedback [46], i.e., the soft information fed back from the decoder to the demodulator is totally correct. However, since the BP algorithm is a suboptimal decoding scheme [25], error-free feedback cannot be realized in practical communication scenarios. For this reason, the iterative-feedback HMMSED cannot perfectly reflect the constellation performance in HM-BICM-ID systems, and thus it is mainly used as a suboptimal criterion to analyze the performance gain of constellations in the iterative process. Based on the above discussion, a constellation that enables the maximum nonfeedback HMMSED for each layer and a relatively large iterative-feedback HMMSED is preferable in practical applications, because a constellation with these two features allows the system to reach desirable performance after a few outer iterations. To further elaborate on the advantages of the proposed SQ constellation, its nonfeedback HMMSED and iterative-feedback HMMSED are calculated, and four state-of-the-art counterparts, i.e., the MSED [20], MSED-B [20], M3 [15], and HBM [16] constellations, are used for comparison.
In Table I, the two nonfeedback HMMSEDs (i.e., D 1 and D 2 ) and one iterative-feedback HMMSED (i.e., D) are presented for a 16-QAM HM-BICM-ID system. One can easily find that the MSED and MSED-B constellations have the same D 1 , which implies that they exhibit the same layer-1 performance in the nonfeedback scenario. Similar conclusions can be drawn from the M3, HBM, and SQ constellations, which achieve better layer-1 performance than the MSED and MSED-B constellations due to a larger D 1 . Furthermore, in the case of the same D 1 , the SQ constellation achieves better layer-2 performance than the M3 and HBM constellations, while the MSED-B constellation exhibits better layer-2 performance than the MSED constellation. Consequently, the SQ constellation possesses the best performance among all the five constellations for both two layers in the nonfeedback scenario, which is consistent with the capacity analysis. Although the SQ constellation has a relatively smaller iterative-feedback HMMSED (i.e., D) with respect to those of the MSED, MSED-B, M3, and HBM constellations, the former still obtains a significant iterative gain over its nonfeedback counterparts. This phenomenon illustrates that the SQ constellation not only can exhibit outstanding nonfeedback performance for each layer but also can guarantee ideal iterative performance.
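For readers who want to reproduce this kind of figure of merit, the Python sketch below evaluates the conventional harmonic mean of the minimum squared Euclidean distance of [46], before and after error-free feedback, restricted to a chosen set of labeling-bit positions. It is an assumption-laden illustration: the names `gray_16qam` and `hmmsed` are hypothetical, the constellation is a plain uniform Gray 16-QAM rather than any entry of Table I, and the nonfeedback branch searches for the nearest competitor over the whole constellation, so it does not implement the quadrant-based selection rule proposed in this article.

```python
import math

def gray_16qam():
    """Uniform unit-energy Gray 16-QAM; label = (b1, b2, b3, b4) (assumed labeling)."""
    a = 1 / math.sqrt(10)
    level = {(0, 0): -3 * a, (0, 1): -a, (1, 1): a, (1, 0): 3 * a}
    return {(b1, b2, b3, b4): complex(level[(b1, b3)], level[(b2, b4)])
            for b1 in (0, 1) for b2 in (0, 1) for b3 in (0, 1) for b4 in (0, 1)}

def hmmsed(points, positions, feedback):
    """Harmonic mean of squared distances over the given bit positions (0-based)."""
    labels = list(points)
    total, count = 0.0, 0
    for i in positions:
        for x in labels:
            if feedback:
                # Error-free feedback: the only competitor differs from x in bit i alone.
                z = tuple(b ^ 1 if p == i else b for p, b in enumerate(x))
                d2 = abs(points[x] - points[z]) ** 2
            else:
                # No feedback: nearest label whose i-th bit is the complement of x's.
                d2 = min(abs(points[x] - points[z]) ** 2
                         for z in labels if z[i] != x[i])
            total += 1.0 / d2
            count += 1
    return count / total
```

Calling `hmmsed(gray_16qam(), range(4), False)` and `hmmsed(gray_16qam(), range(4), True)` gives the values before and after feedback for this Gray labeling; the small gap between them reflects the limited iterative gain of Gray mapping noted earlier.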
To further illustrate the iterative gain of the SQ constellation in the HM-BICM-ID scenarios, we will resort to a novel EXIT algorithm to achieve this goal.

A. SC P-LDPC Codes
A protograph $G = (\mathcal{V}, \mathcal{C}, \mathcal{E})$, proposed in [47], can be viewed as a small Tanner graph composed of three sets, i.e., a variable node (VN) set $\mathcal{V} = \{v_1, v_2, \ldots, v_{n_p}\}$, a check node (CN) set $\mathcal{C} = \{c_1, c_2, \ldots, c_{m_p}\}$, and an edge set $\mathcal{E} = \{e_{i,j}\}$ for $i = 1, 2, \ldots, m_p$ and $j = 1, 2, \ldots, n_p$. In particular, each edge $e_{i,j} \in \mathcal{E}$ connects a CN $c_i \in \mathcal{C}$ with a VN $v_j \in \mathcal{V}$, and parallel edges are allowed. Alternatively, a protograph of code rate $R = (n_p - m_p)/n_p$ can be represented by an $m_p \times n_p$ base matrix $\mathbf{B} = (b_{i,j})$, where $b_{i,j}$ is the number of edges connecting $c_i$ with $v_j$. Furthermore, a large protograph corresponding to a P-LDPC code, referred to as the derived graph, can be generated by applying a "lifting" operation to a given protograph [25], [48].
By replicating a given protograph $L$ ($L \ge 2$) times, a tail-biting SC (TB-SC) protograph can be constructed by coupling these identical replicas into a single coupled chain, where $L$ represents the coupling length. From the matrix viewpoint, the base matrix $B$ of size $m_p \times n_p$ is decomposed into $w+1$ submatrices (i.e., $B_0, B_1, \ldots, B_w$) by an edge-spreading scheme. All matrices (i.e., the base matrix and the submatrices) have the same size and satisfy $\sum_{k=0}^{w} B_k = B$. A TB-SC protograph can then be generated, whose base matrix has size $m_p L \times n_p L$.
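A TB-SC base matrix of this kind can be assembled as a block-circulant array, in which replica $c$ places $B_k$ in block row $(c + k) \bmod L$. The sketch below assumes this standard tail-biting layout and, for the (3, 6) example, an edge spreading $B_0 = B_1 = B_2 = [1\ 1]$ with $w = 2$; both the spreading and the function name `tb_sc_base_matrix` are illustrative assumptions.

```python
import numpy as np

def tb_sc_base_matrix(submatrices, L):
    """Assemble a tail-biting SC base matrix from B_0, ..., B_w (block-circulant layout)."""
    subs = [np.asarray(Bk, dtype=int) for Bk in submatrices]
    m_p, n_p = subs[0].shape
    B_tb = np.zeros((m_p * L, n_p * L), dtype=int)
    for c in range(L):                      # replica (block-column) index
        for k, Bk in enumerate(subs):
            r = (c + k) % L                 # wrap-around realizes the tail-biting
            B_tb[r * m_p:(r + 1) * m_p, c * n_p:(c + 1) * n_p] += Bk
    return B_tb

# Assumed edge spreading of B = [3 3] into w + 1 = 3 submatrices.
B_sub = [np.array([[1, 1]])] * 3
B_tb = tb_sc_base_matrix(B_sub, L=12)
print(B_tb.shape)
print(B_tb.sum(axis=0), B_tb.sum(axis=1))
```

Because the chain is closed, every VN keeps degree 3 and every CN keeps degree 6, so the coupled protograph has the same design rate as the underlying (3, 6) protograph.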

B. MS-EXIT Algorithm
To investigate the convergence performance of SC P-LDPC-coded HM-BICM-ID systems, an asymptotic-performance analysis tool, called the MS-EXIT algorithm, is conceived in this subsection. This algorithm can be used to analyze the convergence of the MI exchanged between the hierarchical demodulator and the SC P-LDPC decoder. Specifically, the MS-EXIT algorithm can be used to calculate the decoding threshold of an SC P-LDPC code, which is defined as the minimum SNR per information bit (i.e., the minimum $E_b/N_0$) that allows the code to realize error-free transmission in an HM-BICM-ID system. The principle of the algorithm is shown in Fig. 6, and several concepts need to be introduced first.
In an AWGN scenario, the LLRs of the coded bits are assumed to follow a Gaussian distribution, i.e., $l \sim \mathcal{N}(\sigma_{ch}^2/2, \sigma_{ch}^2)$, where $\sigma_{ch}^2/2$ and $\sigma_{ch}^2$ represent the mean and variance, respectively. Then, the MI between the coded bits and their corresponding LLRs is measured by the function $J(\cdot)$, whose inverse function $J^{-1}(\cdot)$ and its corresponding closed-form approximation are provided in [49]. Besides, several types of MIs used in the MS-EXIT algorithm are defined as follows.
1) $I_{Ad}^k$ and $I_{Ed}^k$ represent the a priori MI and extrinsic MI of the hierarchical demodulator, respectively.
2) $I_{Av}^k(i, j)$ and $I_{Ac}^k(i, j)$ represent the a priori MI passed from $c_i$ to $v_j$ and from $v_j$ to $c_i$, respectively.
3) $I_{Ev}^k(i, j)$ and $I_{Ec}^k(i, j)$ represent the extrinsic MI passed from $v_j$ to $c_i$ and from $c_i$ to $v_j$, respectively.
4) $I_{APP}^k(j)$ represents the a posteriori MI of $v_j$.
Here, $k \in \{1, 2, \ldots, (\log_2 M)/2\}$ represents the index of the coded-bit stream located at the $k$th layer.
The maximum number of outer iterations between the hierarchical demodulator and the decoder is denoted by $T_1$, while the maximum number of inner iterations within the decoder is denoted by $T_2$. Moreover, MI is exchanged from the VNs to the CNs and vice versa in each inner iteration. Based on these foundations, the MS-EXIT algorithm proceeds as follows.
1) Initialization: Before the algorithm is executed, an initial $E_b/N_0$ and $(\log_2 M)/2$ types of initial a priori MIs (i.e., $I_{Ad}^k$ for $k = 1, 2, \ldots, (\log_2 M)/2$) are set as the input parameters of the hierarchical demodulator.
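For readers who want to evaluate the MI under this Gaussian LLR model numerically, the following minimal sketch integrates the widely used form $J(\sigma) = 1 - \mathbb{E}[\log_2(1 + e^{-l})]$ with $l \sim \mathcal{N}(\sigma^2/2, \sigma^2)$ and inverts it by bisection. This is assumed to be the function meant by $J(\cdot)$ here; the closed-form approximation of [49] is not reproduced, and the grid size and search interval are arbitrary choices.

```python
import numpy as np

def J(sigma, num=4001):
    """MI between a bit and its LLR l ~ N(sigma^2/2, sigma^2), by numerical integration."""
    if sigma < 1e-6:
        return 0.0
    mean = sigma ** 2 / 2.0
    xi = np.linspace(mean - 10.0 * sigma, mean + 10.0 * sigma, num)
    pdf = np.exp(-(xi - mean) ** 2 / (2.0 * sigma ** 2)) / np.sqrt(2.0 * np.pi * sigma ** 2)
    # log2(1 + exp(-xi)), evaluated stably through logaddexp
    integrand = pdf * np.logaddexp(0.0, -xi) / np.log(2.0)
    return float(1.0 - np.sum(integrand) * (xi[1] - xi[0]))

def J_inv(mi, lo=0.0, hi=40.0, iters=60):
    """Invert J by bisection, using the fact that J increases with sigma."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if J(mid) < mi:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for s in (0.5, 1.0, 2.0, 4.0):
    print(s, J(s), J_inv(J(s)))
```

The bisection is slow compared with a closed-form approximation, but it avoids hard-coding fitted coefficients and is accurate enough for checking individual values.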

2) Calculating $(\log_2 M)/2$ Types of Extrinsic MIs for Hierarchical Demodulator: After a Monte Carlo simulation, the hierarchical demodulator outputs a sequence of coded bits and their corresponding extrinsic LLRs. Then, the extrinsic MI $I_{Ed}^k$ output from the hierarchical demodulator, for $k = 1, 2, \ldots, (\log_2 M)/2$, is calculated as in (11), where $u_j^k = (-1)^{v_j^k} \in \{+1, -1\}$ and $l_j^k$ denote the $j$th converted binary symbol and its corresponding extrinsic LLR, respectively, and $v_j^k \in \{0, 1\}$ denotes the $j$th coded bit within the $k$th coded-bit stream, for $j = 1, 2, \ldots, n$.

3) Passing $(\log_2 M)/2$ Types of Extrinsic MIs From Hierarchical Demodulator to Layer-$k$ Decoder: The channel MIs for these $(\log_2 M)/2$ decoders are set to $I_{ch}^k(j) = I_{Ed}^k$, where $k = 1, 2, \ldots, (\log_2 M)/2$ and $j = 1, 2, \ldots, n_p$. As a special case, $I_{ch}^k(j)$ equals zero if $v_j$ is a punctured VN in the protograph.

6) Updating $(\log_2 M)/2$ Types of A Priori MIs for Hierarchical Demodulator: Using the a priori MI $I_{Av}^k(i, j)$, the average extrinsic MI passed from the layer-$k$ decoder to the hierarchical demodulator, denoted by $I_{Ad}^k$, can be calculated for $k = 1, 2, \ldots, (\log_2 M)/2$. This $I_{Ad}^k$ is regarded as the updated a priori MI for the demodulator in the next outer iteration. It should be noted that a parallel-to-serial operation is needed to process these $(\log_2 M)/2$ different types of MIs for a coded-bit stream of length $((\log_2 M)/2)n$ in the hierarchical demodulator, because the entire coded-bit stream in the hierarchical demodulator is formed by combining $(\log_2 M)/2$ different coded-bit streams of length $n$.

7) Calculating $(\log_2 M)/2$ Types of A Posteriori MIs for VNs: Exploiting $I_{Av}^k(i, j)$ and $I_{ch}^k(j)$, the a posteriori MI $I_{APP}^k(j)$ can be computed for $k = 1, 2, \ldots, (\log_2 M)/2$ and $j = 1, 2, \ldots, n_p$. Owing to the difference in convergence performance among these $(\log_2 M)/2$ types of coded-bit streams, the a posteriori MIs should be calculated in ascending order of the layer index. Specifically, for $j = 1, 2, \ldots, n_p$ and $z = 1, 2, \ldots, (\log_2 M)/2 - 1$, the MI for the $z$th coded-bit stream $I_{APP}^z(j)$ should be computed prior to the MI for the $(z+1)$th coded-bit stream $I_{APP}^{z+1}(j)$.
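Step 2) relies on estimating an MI value from Monte Carlo samples of coded bits and their extrinsic LLRs. A minimal sketch of such an estimate is given below. It uses the common sample-average estimator $1 - \frac{1}{n}\sum_j \log_2(1 + e^{-u_j l_j})$, which is assumed (not confirmed) to be the form of (11), together with the LLR convention $l = \ln[P(v=0)/P(v=1)]$; the function name `extrinsic_mi` and the synthetic Gaussian LLRs are illustrative only and do not model the hierarchical demodulator.

```python
import numpy as np

def extrinsic_mi(bits, llrs):
    """Sample-average MI estimate from coded bits v_j and their extrinsic LLRs l_j."""
    bits = np.asarray(bits)
    llrs = np.asarray(llrs, dtype=float)
    u = 1.0 - 2.0 * bits                      # u_j = (-1)^{v_j}
    # log2(1 + exp(-u*l)) computed stably through logaddexp
    return float(1.0 - np.mean(np.logaddexp(0.0, -u * llrs)) / np.log(2.0))

# Synthetic check: consistent Gaussian LLRs with standard deviation sigma.
rng = np.random.default_rng(1)
n, sigma = 200_000, 2.0
v = rng.integers(0, 2, size=n)
llr = (1.0 - 2.0 * v) * sigma ** 2 / 2.0 + sigma * rng.standard_normal(n)
mi = extrinsic_mi(v, llr)
print(mi)
```

In an MS-EXIT style analysis, one such estimate would be formed per coded-bit stream, i.e., separately for each layer index $k$.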
Note also the following.
1) The MS-EXIT algorithm is designed to investigate the convergence performance of SC P-LDPC-coded $M$-ary QAM HM-BICM-ID systems, and thus contains an iterative procedure between the hierarchical demodulator and the $(\log_2 M)/2$ SC P-LDPC decoders. Importantly, this algorithm can also be used for the HM-BICM counterparts by setting the number of outer iterations to zero.

C. Convergence-Performance Analysis
1) Decoding Thresholds: To further verify the superiority of the proposed SQ constellations, we analyze the convergence performance of HM-BICM-ID systems with different constellations by using the proposed MS-EXIT algorithm. More specifically, it is assumed that all the data streams used in such systems are generated by the (3, 6) TB-SC P-LDPC code. Unless otherwise stated, the coupling length adopted for the (3, 6) TB-SC P-LDPC code is set to $L = 12$ in this article. The protograph structure of such a code is given in Fig. 7, where the VNs and CNs are denoted by filled circles and circles with a plus sign, respectively. Furthermore, the MSED [20], MSED-B [20], M3 [15], and HBM [16] constellations are considered as benchmarks. Referring to Table II, two types of decoding thresholds, i.e., $\tau_1$ and $\tau_2$, which correspond to the layer-1 and layer-2 data streams, respectively, are presented for the 16-QAM HM-BICM-ID systems with five different constellations.
Under the condition of $T_1 = 0$, the SC P-LDPC-coded HM-BICM-ID system exhibits the same layer-1 decoding threshold $\tau_1$ when the MSED and MSED-B constellations are adopted. This is consistent with the capacity analysis in Section III-B and the HMMSED analysis in Section III-C, i.e., the MSED and MSED-B constellations have the same $C_{\text{HM-BICM}}^{1}$ and $D_1$. Similar conclusions can be drawn for the M3, HBM, and SQ constellations. From the perspective of the layer-2 data stream, the five constellations have five different layer-2 decoding thresholds $\tau_2$, which are also consistent with the capacity analysis and HMMSED analysis (i.e., $C_{\text{HM-BICM}}^{2}$ and $D_2$). In particular, the SC P-LDPC-coded HM-BICM-ID system with the SQ constellation exhibits the smallest layer-2 decoding threshold (i.e., $\tau_2 = 6.35$ dB) among the five constellations, and thus achieves the best layer-2 performance.
By setting the maximum number of outer iterations to $T_1 = 8$, the SC P-LDPC-coded HM-BICM-ID systems with the MSED and MSED-B constellations exhibit nearly the same decoding thresholds for both layers. On the other hand, the layer-1 decoding thresholds $\tau_1$ of the M3, HBM, and SQ constellations in the case of $T_1 = 8$ are the same as those in the case of $T_1 = 0$. This is because the first layer of these three constellations is generated by the Gray-mapped base structure, and thus no iterative gain can be obtained by exploiting ID. For the second layer, the SC P-LDPC-coded HM-BICM-ID system with the SQ constellation obtains an additional threshold gain over the counterparts with the other four existing constellations. Hence, the SQ constellation enables such systems to achieve the best layer-2 performance in ID scenarios.
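The notion of a decoding threshold can be made concrete with a simplified, single-stream sketch. It is not the MS-EXIT algorithm: the hierarchical demodulator is replaced by a plain BPSK/AWGN channel with $\sigma_{ch}^2 = 8 R E_b/N_0$, the inner recursion uses the standard protograph EXIT update rules (an assumption, not taken from this article), and it is run on the uncoupled (3, 6) protograph $B = [3\ 3]$. All function names are hypothetical.

```python
import numpy as np

# Tabulate J(sigma) once by numerical integration, then interpolate in both directions.
_SIG = np.linspace(1e-3, 12.0, 1200)

def _j_exact(s, num=2001):
    m = s * s / 2.0
    x = np.linspace(m - 10.0 * s, m + 10.0 * s, num)
    pdf = np.exp(-(x - m) ** 2 / (2.0 * s * s)) / np.sqrt(2.0 * np.pi * s * s)
    return 1.0 - np.sum(pdf * np.logaddexp(0.0, -x)) * (x[1] - x[0]) / np.log(2.0)

_TAB = np.maximum.accumulate(np.clip([_j_exact(s) for s in _SIG], 0.0, 1.0))

def J(sigma):
    s = np.asarray(sigma, dtype=float)
    return np.interp(s.ravel(), _SIG, _TAB, left=0.0, right=1.0).reshape(s.shape)

def J_inv(mi):
    t = np.asarray(mi, dtype=float)
    return np.interp(t.ravel(), _TAB, _SIG, left=0.0, right=_SIG[-1]).reshape(t.shape)

def pexit_converges(B, sigma_ch, max_iter=500, tol=1e-6):
    """Standard protograph EXIT recursion; True if every a posteriori MI reaches 1."""
    B = np.asarray(B, dtype=float)
    mask = B > 0
    s_ch2 = J_inv(J(sigma_ch) * np.ones(B.shape[1])) ** 2   # channel term per VN
    I_Av = np.zeros_like(B)                                 # CN-to-VN MI, starts at 0
    for _ in range(max_iter):
        # VN update: combine channel and all incoming edges except the outgoing one.
        s2 = J_inv(I_Av) ** 2
        tot = (B * s2).sum(axis=0) + s_ch2
        I_Ev = np.where(mask, J(np.sqrt(np.maximum(tot - s2, 0.0))), 0.0)
        # CN update via the usual duality approximation.
        t2 = J_inv(1.0 - I_Ev) ** 2
        tot_c = (B * t2).sum(axis=1, keepdims=True)
        I_Av = np.where(mask, 1.0 - J(np.sqrt(np.maximum(tot_c - t2, 0.0))), 0.0)
        # A posteriori MI of each VN.
        I_app = J(np.sqrt((B * J_inv(I_Av) ** 2).sum(axis=0) + s_ch2))
        if np.all(I_app > 1.0 - tol):
            return True
    return False

def threshold_db(B, rate, lo=0.0, hi=4.0, steps=12):
    """Bisection on Eb/N0 (dB) for the smallest value at which the recursion converges."""
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        sigma_ch = np.sqrt(8.0 * rate * 10.0 ** (mid / 10.0))   # BPSK/AWGN assumption
        if pexit_converges(B, sigma_ch):
            hi = mid
        else:
            lo = mid
    return hi

B36 = np.array([[3, 3]])
thr = threshold_db(B36, rate=0.5)
print(thr)
```

Under these assumptions the result is expected to land close to the roughly 1.1 dB threshold commonly quoted for regular (3, 6) codes on the binary-input AWGN channel. The thresholds $\tau_1$ and $\tau_2$ of the HM-BICM-ID system would instead require the demodulator MI of each layer as the channel input.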

2) Extrinsic MIs of Demodulator: To gain further insight, we investigate the extrinsic MIs of the four labeling bits output from the demodulator of the SQ-constellation-aided 16-QAM HM-BICM-ID system. One can find from Fig. 8 that the extrinsic MIs of the first and second labeling bits, which constitute the first layer, keep the same value in both cases (i.e., $T_1 = 0$ and $T_1 = 8$). This implies that introducing outer iterations cannot improve the layer-1 performance of such an HM-BICM-ID system. In contrast, the second layer is constructed by the third and fourth labeling bits, whose extrinsic MIs can be improved by increasing the number of outer iterations $T_1$. Accordingly, the second layer attains a desirable performance gain by exploiting ID, and thus the overall performance of such an HM-BICM-ID system is further improved.

V. SIMULATION RESULTS
To demonstrate the merit of the proposed design, we provide various bit-error-rate (BER) simulation results on the SC P-LDPC-coded $M$-ary QAM HM-BICM-ID systems over AWGN channels. In particular, all the data streams used in such systems are generated by the (3, 6) TB-SC P-LDPC code, which has a code rate of 1/2 and a codeword length of 2400. In addition, the maximum numbers of outer iterations $T_1$ and inner iterations $T_2$ are set to 8 and 25, respectively. Unless otherwise mentioned, we assume 16-QAM modulation and a standard distance rate $r_d = 1$ for all the constellations in simulations (a constellation with $r_d = 1$ is referred to as a standard constellation).

A. BER Performance for the 16-QAM HM-BICM-ID Systems
1) BER Performance for Standard Constellations: For the 16-QAM standard constellations, Fig. 9 depicts the BER curves of both the layer-1 and layer-2 data streams in the SC P-LDPC-coded HM-BICM-ID systems with five different types of constellations, i.e., the MSED, MSED-B, M3, HBM, and proposed SQ constellations. It is observed in Fig. 9(a) and (b) that the SQ constellation not only possesses the same excellent layer-1 performance as the M3 and HBM constellations, which exhibit a remarkable gain of about 2.2 dB over the MSED and MSED-B constellations at a BER of $2 \times 10^{-6}$, but also achieves better layer-2 performance than the other four constellations. Specifically, the SQ constellation achieves gains of about 0.4, 0.9, 1.1, and 1.4 dB over the HBM, MSED-B, M3, and MSED constellations, respectively, at a BER of $4 \times 10^{-6}$. Referring to Fig. 9(c) and (d), the relative performance among the MSED, MSED-B, M3, HBM, and proposed SQ constellations in the case of $T_1 = 8$ is similar to that in the case of $T_1 = 0$. As a result, the proposed SQ constellation appears to be the best mapping scheme among the five when ID is considered. The simulated BER curves of the five constellations agree well with their corresponding analytical results in Section IV-C.
Furthermore, Fig. 10 compares the BER performance of the Gray constellation and the SQ constellation.$^{4}$ As observed, although the SQ-constellation-aided (3, 6) TB-SC P-LDPC code achieves the same layer-1 performance as the Gray-constellation-aided counterpart [see Fig. 10(a)], the former achieves better layer-2 performance than the latter [see Fig. 10(b)]. In particular, at a BER of $6 \times 10^{-6}$, the former requires an SNR of only about 6 dB, while the latter needs an SNR of about 6.2 dB.

[Figure caption: BER curves of the layer-1 and layer-2 data streams in the SC P-LDPC-coded HM-BICM-ID systems with the Gray and proposed SQ constellations. The distance rate $r_d = 1/2$.]

2) BER Performance for Nonstandard Constellations: To further demonstrate the superiority of the SQ constellation, BER simulation results for nonstandard constellations (i.e., the distance rate $r_d = 1/2$) are provided in Fig. 11, where the maximum number of outer iterations is set to $T_1 = 8$. Referring to Fig. 11(a) for the layer-1 performance, at $E_b/N_0 = 4.2$ dB, the SQ-constellation-aided (3, 6) TB-SC P-LDPC code achieves a BER of $1 \times 10^{-6}$, while the HBM, M3, MSED-B, and MSED constellations achieve BERs of $1 \times 10^{-5}$, $6 \times 10^{-4}$, $3 \times 10^{-2}$, and $5 \times 10^{-2}$, respectively. Moreover, at a BER of $2 \times 10^{-5}$, the SQ constellation obtains gains of about 0.2, 0.7, 1.8, and 2.4 dB over the HBM, M3, MSED-B, and MSED constellations, respectively. A similar phenomenon can be observed in Fig. 11(b), which shows the layer-2 performance. These observations verify the effectiveness of the proposed SQ constellation in the case of nonstandard constellations.
Fig. 12 presents the BER curves of the Gray constellation and the proposed SQ constellation in 16-QAM HM-BICM-ID systems with $T_1 = 8$. The proposed SQ constellation outperforms the Gray constellation for both the layer-1 and layer-2 data streams. Specifically, at a BER of $1 \times 10^{-6}$, the layer-1 and layer-2 performance of the former attain gains of about 1.1 dB and 0.3 dB, respectively, over those of the latter. Consequently, the proposed SQ constellation keeps a desirable balance between the layer-1 and layer-2 performance, while the Gray constellation cannot do so.
Remark 5: We have also performed simulations with other values of $r_d$ (i.e., $r_d = 1/3$) and have found that the proposed SQ constellation also outperforms the HBM, M3, MSED-B, MSED, and Gray constellations in the HM-BICM-ID systems.

B. BER Performance for the 64-QAM HM-BICM-ID Systems
To illustrate the feasibility of the proposed two-stage design approach, we construct a 64-ary SQ constellation, whose structure is given in Fig. 13, to process three data streams with different priorities. In Fig. 14(a), (b), and (c), we measure the performance of the proposed 64-ary SQ constellation and two state-of-the-art counterparts, namely, the 64-ary Gray constellation [14] and the 64-ary vectored modulated (VM) constellation [18]. As observed, the 64-ary SQ constellation not only possesses the same excellent layer-1 performance as the 64-ary Gray and 64-ary VM constellations but also achieves better layer-2 and layer-3 performance. Specifically, at a BER of $1 \times 10^{-5}$, the (3, 6) TB-SC P-LDPC code with the 64-ary SQ constellation achieves gains of about 0.1 dB and 0.2 dB for the layer-2 and layer-3 data streams, respectively, over the other two constellations. This implies that the SQ constellations are well suited to HM-BICM-ID systems, especially in multilayer scenarios (i.e., $k > 2$).

VI. CONCLUSION
In this article, we have presented a comprehensive study on the design and analysis of the SC P-LDPC-coded HM-BICM-ID systems. We have first proposed an information-theoretic methodology to calculate $(\log_2 M)/2$ types of AMIs for the HM-BICM scenarios. Based on the AMI analysis, we have evaluated the capacity limits (i.e., the maximum achievable rates) for all the individual coded-bit streams in the HM-BICM scenarios. We have also conceived a two-stage design approach to generate a novel type of constellations, called SQ constellations, for the HM-BICM-ID scenarios. Additionally, we have put forward the quadrant-based harmonic mean analysis to demonstrate that the SQ constellations not only guarantee the largest HMMSEDs for all layers but also realize desirable iterative performance. To gain further insight, we have developed an MS-EXIT algorithm and analyzed the convergence performance for the HM-BICM-ID scenarios. Simulation results have indicated that the proposed SC P-LDPC-coded HM-BICM-ID systems are superior to the state-of-the-art counterparts in the cases of both standard and nonstandard distance rates. Owing to these benefits, the proposed SC P-LDPC-coded HM-BICM-ID systems can be viewed as a promising alternative for future wireless-communication scenarios with different Quality-of-Service (QoS) requirements, e.g., the 6G-enabled IoT. As future work, we will extend this study to more general and practical IoT scenarios (e.g., different coded-bit streams have different codeword lengths and code rates, and the numbers of labeling bits assigned to different layers are different). Besides, we plan to exploit the intrinsic UEP characteristics of HM to trigger the wave-like convergence of SC P-LDPC codes, so as to further enhance the system performance.