Protograph-Based Interleavers for Punctured Turbo Codes

A method to design efficient puncture-constrained interleavers for turbo codes (TCs) is introduced. Resulting TCs profit from a joint optimization of puncturing pattern and interleaver to achieve an improved error rate performance. First, the puncturing pattern is selected based on the constituent code Hamming distance spectrum and on the TC extrinsic information exchange under uniform interleaving. Then, the interleaver function is defined via a layered design process taking account of several design criteria, such as minimum span, correlation girth, and puncturing constraints. We show that applying interleaving with a periodic cross connection pattern that can be assimilated to a protograph improves error-correction performance when compared to the state-of-the-art TCs. An application example is elaborated and compared with the long-term evolution standard: a significant gain in performance can be observed. An additional benefit of the proposed technique resides in the important reduction of the search space for the different interleaver parameters.


I. INTRODUCTION
IN recent years, interest in a wide range of applications such as TV or multimedia content on demand [1], live streaming, or interactive gaming has been continuously growing due to the increased number of nomadic users. Accordingly, the future generations of mobile networks (5G and beyond) [2] call for higher data rates and capacity, with an enhanced quality of service for different receiver scenarios [3] and applications. To meet such requirements, error-correcting codes able to guarantee low error rates (down to a frame error rate (FER) of 10^−5) need to be provided. A current communication system such as long term evolution (LTE) [4] cannot guarantee such error rates. First of all, due to the hybrid automatic repeat request (HARQ) retransmission mechanism, the targeted FER in LTE is around 10^−2. In addition, the rate matching mechanism used to provide rate compatibility in LTE causes undesirable interactions between the code interleaver and the puncturing mechanism for some configurations of block sizes and coding rates, entailing a poor distance spectrum for the code and resulting in a pronounced error floor [5].
The LTE standard adopted a turbo code (TC) as channel code [4]. TCs, introduced by Berrou et al. [6], are certainly one of the most popular channel coding schemes in wireless systems: besides LTE, they have been adopted in the IEEE 802.16 WiMAX (worldwide interoperability for microwave access) [7] and DVB-RCS2 (2nd generation digital video broadcasting - return channel via satellite) [8] standards. In addition to their near-capacity performance, TCs are known to be particularly flexible with respect to information frame length and coding rate. Indeed, encoding information frames of various lengths with different coding rates can be achieved with the same encoder just by modifying the interleaver parameters and the puncturing pattern. Provided that they are able to guarantee lower error rates when they are punctured, TCs remain promising channel coding candidates for 5G and future generations.
The authors are with the Electronics Department, IMT Atlantique, CNRS UMR 6285 Lab-STICC, CS 83818 - 29238 Brest Cedex 3, France (e-mail: ronald.garzonbohorquez@imt-atlantique.fr; charbel.abdelnour@imtatlantique.fr; catherine.douillard@imt-atlantique.fr).
The error rate performance of a TC is closely related to its internal interleaving function. As introduced by Berrou and Glavieux [9], the minimum Hamming distance (d_min) of a TC is not only defined by its constituent encoders, but also affected by the TC interleaver. In the last decade, different interleaver structures have been proposed that are particularly suited for practical implementation and for improving the asymptotic performance of TCs, e.g., quadratic permutation polynomial (QPP) interleavers [10] adopted in LTE [4], dithered relative prime (DRP) interleavers [11], and almost regular permutation (ARP) interleavers [12], adopted in the DVB-RCS/RCS2 [13], [8] and IEEE 802.16 WiMAX [7] standards. The QPP interleaver coefficients are selected based on the maximization of the d_min value of a subset of low-weight input sequences with weights of the form 2n, n being a small positive integer [10]. A similar criterion was used in [14] to select the DRP interleaver parameters. First, a regular interleaver with high scattering properties is identified. Then, the dither vectors of the DRP interleaver are selected in order to maximize the d_min value of a subset of low-weight input patterns. In the case of the ARP interleaver, a first strategy involves selecting the parameters from those providing the best scattering properties and leading to the highest d_min [12]. Another method, proposed in [15], is based on the maximization of the correlation girth. In [16], it was shown that introducing parity puncturing constraints into the interleaver design yields improved performance.
In this study, we investigate the joint optimization of puncturing patterns and interleavers for TCs in order to guarantee low error floors and good convergence thresholds. As a result, a layered construction of TC interleaver is proposed, involving the introduction of connection patterns called protographs, named by analogy with protograph-based (PB) low-density parity-check (LDPC) codes [17]. This work focuses on the ARP interleaver model [12]. A significant reduction of the search space for the different interleaver parameters was achieved with an important improvement in TC error rate performance.
The rest of the paper is organized as follows. In Section II, a description of the considered encoder structure, code interleaver, and puncturing pattern is given. Section III introduces relevant interleaver design criteria. In Section IV, a puncturing pattern selection method is proposed. It is followed by the description of the constraints on the interleaver imposed by the puncturing patterns, leading to the concept of PB interleavers for TCs. Section V describes the proposed layered design method for TC interleavers. Afterwards, a summary of the puncture-constrained interleaver design method is given. Then, in Section VI, the proposed method is applied to design TC interleavers and puncturing patterns for a set of frame parameters included in LTE. Section VII shows the simulated error rate performance of the proposed code and its comparison with the original LTE code. Finally, Section VIII concludes the paper.

II. SYSTEM DESCRIPTION
Among existing trellis termination techniques, tail-biting or circular termination [18] is well suited for TCs. This technique avoids the loss in spectral efficiency due to termination bits and the generation of truncated codewords. Furthermore, with tail-biting termination, the whole information sequence is protected in the same way, avoiding any edge effect. Therefore, in this study we consider circular recursive systematic convolutional (CRSC) codes as constituent codes of the TC, since they apply tail-biting termination. The turbo encoder structure considered in this study is shown in Fig. 1. The information sequence d of size K is encoded by CRSC1, and the corresponding interleaved sequence d′ is encoded by CRSC2. The vectors at the output of the TC, data (d), parity 1 (r_1), and parity 2 (r_2), are punctured using a puncturing mask of period M before being transmitted.

A. Turbo Code Interleaver Model
The interleaver is a key component of TCs. Its role is twofold. First, it has a high impact on the achievable d_min of the TC [9]. Second, due to its scattering properties, it also affects the correlation of exchanged extrinsic information during the iterative decoding process [19]. In this paper, TC interleaving is defined as follows: the interleaver reads the symbols from the data vector d = (d_0, d_1, ..., d_{K−1}) and writes them to the interleaved vector d′ = (d_Π(0), d_Π(1), ..., d_Π(K−1)), where Π denotes the interleaver function. A symbol read out from address Π(i) in d is written to address i in d′. When using CRSC codes as constituent codes of the TC, d and d′ can be represented by circles, as shown in Fig. 2. Three of the most popular interleaver families are the QPP interleavers [10], the DRP interleavers [11], and the ARP interleavers [12]. Our study focuses only on the ARP family. As shown in [20], the ARP interleaver can provide the same interleaving properties as the QPP or the DRP interleavers, guaranteeing d_min values at least as high as these two families of interleavers.

B. The ARP Interleaver
The ARP interleaver structure is derived from the regular interleaver (RI):

    $\Pi(i) = (P \cdot i) \bmod K$    (1)

where P is the RI period, which must be relatively prime to K. Rectangular return to zero (RTZ) error patterns cannot be efficiently avoided with this permutation due to its regular structure [12]. Therefore, a degree of disorder is introduced into the permutation by a vector of shifts S, leading to the ARP function:

    $\Pi(i) = (P \cdot i + S(i \bmod Q)) \bmod K$    (2)

The vector of shifts S has length Q, which represents the introduced degree of disorder. Q is a divisor of K [15].
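As an illustration, the two permutation models can be sketched as follows, assuming the usual forms Π(i) = (P · i) mod K for the RI and Π(i) = (P · i + S(i mod Q)) mod K for the ARP interleaver. The function names and the toy parameters (K = 24, P = 5, Q = 4, and the shift vector) are hypothetical and are not taken from the paper.

```python
from math import gcd

def ri_permutation(K, P):
    """Regular interleaver: Pi(i) = (P * i) mod K, with P coprime to K."""
    if gcd(P, K) != 1:
        raise ValueError("P must be relatively prime to K")
    return [(P * i) % K for i in range(K)]

def arp_permutation(K, P, S):
    """ARP interleaver: Pi(i) = (P * i + S[i mod Q]) mod K, with Q = len(S)."""
    Q = len(S)
    if K % Q != 0:
        raise ValueError("Q must be a divisor of K")
    if gcd(P, K) != 1:
        raise ValueError("P must be relatively prime to K")
    return [(P * i + S[i % Q]) % K for i in range(K)]

def is_permutation(pi):
    """True if pi visits every address 0..K-1 exactly once."""
    return sorted(pi) == list(range(len(pi)))

# Toy parameters chosen for illustration only (hypothetical values).
K, P = 24, 5
S = [0, 6, 12, 10]
ri = ri_permutation(K, P)
arp = arp_permutation(K, P, S)
print(is_permutation(ri), is_permutation(arp))
print(arp[:4])
```

Not every shift vector yields a valid permutation, which is why the sketch checks bijectivity explicitly instead of assuming it.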

C. Puncturing Pattern
The TC coding rate, R, can be increased by puncturing some bits at the TC output. Periodic puncturing is widely used in practice because it can be easily implemented. In this study, a periodic puncturing pattern with period M is considered (see Fig. 1). CRSC codes with coding rate 1/2 are used as constituent codes of the TC. Thus, for each constituent CRSC code, the puncturing mask or pattern is composed of two vectors of length M, corresponding to the puncturing positions in the data (d) and parity vectors (r_1 and r_2). In order to avoid edge effects when applying the puncturing mask, M is assumed to be a divisor of K.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication.

III. INTERLEAVER DESIGN CRITERIA
Two major target criteria have to be considered for the design of a TC interleaver: the Hamming distance spectrum of the resulting TC and the correlation between the channel information and the a priori information at the decoder input. The first terms of the distance spectrum have to be maximized, with multiplicities (i.e., the number of codewords at these distances) as low as possible, while the correlation between the channel and the a priori information should be minimized. Two measurable parameters related to these criteria have been considered in our work: minimum span and correlation girth.

A. Minimum Span
The span value associated with a pair of symbols at positions i and j is defined according to [12], [21] as:

    $S_p(i, j) = f(i, j) + f(\Pi(i), \Pi(j))$    (3)

where f(u, v) evaluates the shortest distance between two symbols (u and v) in a circular vector of size K, also called the Lee distance [22]:

    $f(u, v) = \min(|u - v|, K - |u - v|)$    (4)

Then, the minimum span value S_pmin associated with the interleaver is defined as:

    $S_{p\min} = \min_{i \neq j} S_p(i, j)$    (5)

It was shown in [23] that, when using tail-biting termination, the maximum achievable value for S_pmin is upper bounded by:

    $S_{p\min} \leq \lfloor \sqrt{2K} \rfloor$    (6)

The minimum span value affects the TC distance spectrum. In particular, for random-based interleavers, the increase of minimum span was shown to yield larger d_min values for the resulting TC [21]. Therefore, this value needs to be maximized.
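The span criterion can be evaluated directly from its definition. The sketch below assumes the span of a pair (i, j) is the sum of the Lee distances before and after interleaving, and that the tail-biting upper bound is ⌊√(2K)⌋, which gives 54 for K = 1504 as quoted in Section VI. The exhaustive pair search is quadratic in K and is meant for illustration rather than efficiency; all function names are hypothetical.

```python
from math import isqrt

def lee_distance(u, v, K):
    """Shortest distance between addresses u and v on a circle of size K."""
    d = abs(u - v) % K
    return min(d, K - d)

def min_span(pi):
    """Minimum over all pairs i != j of f(i, j) + f(pi[i], pi[j])."""
    K = len(pi)
    best = 2 * K  # larger than any achievable span
    for i in range(K):
        for j in range(i + 1, K):
            span = lee_distance(i, j, K) + lee_distance(pi[i], pi[j], K)
            best = min(best, span)
    return best

def span_upper_bound(K):
    """Upper bound floor(sqrt(2K)) on the minimum span with tail-biting."""
    return isqrt(2 * K)

# Toy regular interleaver (hypothetical parameters): K = 24, P = 5.
ri = [(5 * i) % 24 for i in range(24)]
print(min_span(ri), span_upper_bound(24))
print(span_upper_bound(1504))
```

For this toy RI the minimum span happens to reach the bound, whereas the identity permutation only achieves a minimum span of 2.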

B. Correlation Girth
In the turbo decoding process, the decoder output at position Π(i) (i.e., corresponding to the symbol at address Π(i) in d) depends on the received symbol at the same position and, due to the recursive nature of the constituent code, it is also influenced by received symbols at positions in the vicinity of Π(i). In addition, the decoder output at position Π(i) also depends on the a priori information provided by the second decoder from position i (i.e., associated with the symbol at address i in d′). The corresponding correlation properties depend on the recursive nature of the second decoder (i.e., positions in the vicinity of i in d′) and on the interleaver [24]. The latter should be designed to reduce the level of correlation between the a priori information and the data sequence of each constituent code [25]. To this end, a correlation graph can be established in the design of TC interleavers. The resulting interleaver should maximize the correlation girth g (i.e., the length of the shortest correlation cycle).
An example of the proposed correlation graph for a TC is shown in Fig. 3. The vertices are the bits of the information sequence. The connections between neighbor bits in the non-interleaved sequence d are represented by blue edges (outer circle). The green dotted edges are the connections between neighbor bits in the interleaved sequence d′, for a given interleaver Π. This graph shows the different correlation cycles in the turbo decoding process. Note that the correlation graph for TCs is a regular graph of degree r = 4 (i.e., each vertex has exactly 4 neighbors) with a number of vertices x = K. An upper bound on the girth value g of a regular graph of degree r can be deduced, based on the Moore bound [26]. Let x(r, g) be the lowest number of vertices in an r-regular graph with girth g. The Moore bound implies that g can be at most proportional to the logarithm of x(r, g):

    $g \leq 2 \log_{r-1}(x(r, g)) + O(1)$    (7)

where O(1) is the error term of the approximation. For the information block sizes considered in this paper, K = 1504 and 4000 bits, the upper bounds on the correlation girth obtained from (7) are 13 and 15.
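A minimal sketch of the correlation graph and of its girth is given below. It assumes that the graph edges join neighbor addresses i and i + 1 in d, and the addresses Π(i) and Π(i + 1) that become neighbors in d′. Coincident edges are merged, which is a simplification of this sketch. The girth is obtained by a breadth-first search from every vertex, and the bound keeps only the leading term 2 log_{r−1}(K) of the Moore-type bound. All names are hypothetical.

```python
from collections import deque
from math import floor, log

def moore_girth_bound(num_vertices, degree=4):
    """Leading term of the bound: floor(2 * log_{r-1}(x)), O(1) term dropped."""
    return floor(2 * log(num_vertices) / log(degree - 1))

def correlation_graph(pi):
    """Adjacency sets: neighbors in d, plus neighbors in the interleaved d'."""
    K = len(pi)
    adj = [set() for _ in range(K)]
    for i in range(K):
        j = (i + 1) % K
        for u, v in ((i, j), (pi[i], pi[j])):
            adj[u].add(v)
            adj[v].add(u)
    return adj

def girth(adj):
    """Length of the shortest cycle, via BFS from every vertex."""
    best = float("inf")
    for source in range(len(adj)):
        dist = {source: 0}
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    parent[v] = u
                    queue.append(v)
                elif parent[u] != v:
                    # Non-tree edge: closes a cycle through the BFS tree.
                    best = min(best, dist[u] + dist[v] + 1)
    return best

# Toy regular interleaver (hypothetical parameters): K = 24, P = 5.
ri = [(5 * i) % 24 for i in range(24)]
print(girth(correlation_graph(ri)))
print(moore_girth_bound(1504), moore_girth_bound(4000))
```

The toy regular interleaver yields a girth of only 4, since the steps +1, +5, −1, −5 always close a cycle. This illustrates why a degree of disorder is needed to reach larger girth values.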

IV. INTERLEAVING WITH PUNCTURING CONSTRAINTS
As shown in [27], [28], the puncturing of well-chosen systematic bits can increase the d_min value and reduce the convergence threshold of high coding rate TCs. Thus, in this work, we consider the design of puncturing masks including data puncturing. The proposed puncturing mask selection and puncturing constraints on the interleaver design are described in the following sections.

A. Puncturing Mask Selection
The puncturing mask configuration is defined according to the target code rate R of the TC and to the puncturing period M. The target code rate R is computed as:

    $R = \frac{M}{(1 - D_p) M + 2 U_p}$    (8)

where the data puncturing rate D_p corresponds to the ratio between the number of punctured data bits (systematic bits) and the total number of data bits. It is illustrated by the puncturing mask applied to vector d in Fig. 1. U_p is the number of unpunctured parity bits per constituent CRSC code in a puncturing period. For given R and M, D_p can take M + 1 different values:

    $D_p = \frac{n}{M}, \quad n = 0, 1, \ldots, M$    (9)

However, in practice, the values of D_p are restricted to those ensuring a constituent CRSC code rate R_c smaller than 1, so that the information sequence can be reconstructed from the encoded sequence. From (8), U_p is given by:

    $U_p = \frac{M}{2} \left( \frac{1}{R} - 1 + D_p \right)$    (10)

In this study, U_p is an integer value and the same value applies for both CRSC codes, since only symmetric puncturing masks (i.e., same puncturing pattern for both constituent CRSC codes) are considered.
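As a worked example of the rate relations, the sketch below assumes R = M / ((1 − D_p) M + 2 U_p), which follows from the definitions of D_p and U_p given above, and lists the values of D_p = n/M that lead to an integer U_p. The function names are hypothetical, and the constraint on the constituent code rate R_c is deliberately not applied here.

```python
from fractions import Fraction

def code_rate(M, n_punctured_data, U_p):
    """R = M / ((1 - D_p) * M + 2 * U_p), with D_p = n_punctured_data / M."""
    return Fraction(M, (M - n_punctured_data) + 2 * U_p)

def unpunctured_parities(R, M, n_punctured_data):
    """U_p = (M / 2) * (1 / R - 1 + D_p), returned as an exact fraction."""
    D_p = Fraction(n_punctured_data, M)
    return Fraction(M, 2) * (1 / Fraction(R) - 1 + D_p)

def integer_parity_counts(R, M):
    """Map n (punctured data bits per period) to U_p when U_p is an integer."""
    result = {}
    for n in range(M + 1):
        U_p = unpunctured_parities(R, M, n)
        if U_p.denominator == 1:
            result[n] = int(U_p)
    return result

R, M = Fraction(2, 3), 8
print(unpunctured_parities(R, M, 2))   # D_p = 2/8
print(code_rate(M, 2, 3))
print(integer_parity_counts(R, M))
```

For R = 2/3 and M = 8, the case D_p = 2/8 gives U_p = 3, i.e., 6 data bits and 2 × 3 parity bits transmitted per period of 8 information bits.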
The puncturing mask design proposed in this work involves the following steps:
1) Find the best puncturing pattern for each D_p value: The "fast algorithm for searching trees" (FAST) was introduced in [29] to evaluate the Hamming distance spectrum of unpunctured convolutional codes. In practice, only the first terms of the distance spectrum, i.e., a truncated distance spectrum, are needed. When the CRSC code is punctured by a periodic pattern of period M, the FAST algorithm has to be run M times, each time starting from a different position in the mask, from 0 to M − 1, and the resulting M distance spectra are accumulated to obtain the Hamming distance spectrum of the punctured CRSC code. The best puncturing mask for each D_p value is identified as the one generating the best CRSC Hamming distance spectrum (i.e., highest distance values in the first spectrum terms and minimal number of codewords at these distances). Note that the truncated distance spectrum of convolutional codes is independent of K, provided that K is large enough (in practice, if K is greater than the longest return to zero (RTZ) sequence at the maximum Hamming distance considered in the truncated spectrum). For CRSC codes, an RTZ sequence is defined as any finite input sequence that makes the code leave a given state and return to the same state.
2) Carry out a mutual information exchange analysis to select a restricted set of puncturing masks: In [28], the distribution of the extrinsic information in terms of log-likelihood ratios at the output of a soft-in soft-out (SISO) decoder was plotted and analyzed. It was shown that the distribution of the extrinsic information related to punctured data positions is different from the one related to unpunctured data positions. Thus, in the considered extrinsic information transfer (EXIT) chart [30] analysis, we do not rely on a Gaussian approximation of the a priori message. Rather, we measure the iterative evolution of the a priori mutual information within actual turbo decoding iterations via Monte Carlo simulations. Uniform TC permutations [31] are used to average the effect of the interleaver on the extrinsic information exchange. A similar analysis was used in the past to identify efficient precoding structures for TCs [32]. The average mutual information between the a posteriori log-likelihood ratios L of each constituent SISO decoder and the data frame X is computed as in [33] by:

    $I(L; X) = 1 - E\{\log_2(1 + e^{-L})\} \approx 1 - \frac{1}{N} \sum_{n=1}^{N} \log_2(1 + e^{-x_n L_n})$    (11)

where x_n ∈ {+1, −1} is the n-th data symbol, L_n is the corresponding log-likelihood ratio, and N is the number of observed samples. However, at each decoding iteration, the a priori information at the input of one SISO decoder is taken from the other SISO decoder, not from a virtual AWGN channel. In this modified EXIT chart, the best puncturing mask in terms of convergence performance is the one providing the crossing point (IA, IE) closest to (1, 1). In the selection process, we keep the restricted set of puncturing masks providing crossing points closer to (1, 1) than the mask corresponding to the systematic code (D_p = 0). Afterwards, the error rate performance of the TC is evaluated for the remaining puncturing masks with uniform interleaving. The puncturing mask providing the best tradeoff between performance in the waterfall and error floor regions is finally selected.
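The mutual information measurement used in this analysis can be sketched as a time average over the decoder output log-likelihood ratios. The estimator below assumes the commonly used form I(L; X) ≈ 1 − (1/N) Σ log2(1 + exp(−x_n L_n)), with the mapping bit 0 → x = +1 and bit 1 → x = −1. This mapping and the function name are assumptions of this sketch, not statements about the authors' simulator.

```python
import math

def mutual_information(llrs, bits):
    """Estimate I(L; X) from LLR samples and the known transmitted bits.

    Assumed convention: bit 0 maps to x = +1, bit 1 maps to x = -1, and a
    positive LLR favors x = +1.
    """
    if not llrs or len(llrs) != len(bits):
        raise ValueError("llrs and bits must be non-empty and of equal length")
    total = 0.0
    for llr, bit in zip(llrs, bits):
        z = (1.0 - 2.0 * bit) * llr
        # Numerically stable evaluation of ln(1 + exp(-z)).
        total += max(-z, 0.0) + math.log1p(math.exp(-abs(z)))
    return 1.0 - total / (len(llrs) * math.log(2.0))

# Uninformative LLRs carry no information; reliable and correct LLRs
# carry close to one bit per symbol.
print(mutual_information([0.0, 0.0], [0, 1]))
print(mutual_information([50.0, -50.0], [0, 1]))
```

In a measurement of the kind described above, `llrs` would be collected at the output of one SISO decoder during actual decoding iterations, rather than generated from a Gaussian model.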

B. Data Puncture-Constrained Interleavers
Designing a non-catastrophic puncturing mask with punctured data bits is an easy task for the first (non-interleaved) CRSC code of the TC [34], using for instance the above-mentioned FAST algorithm. However, when designing TC interleavers, data puncturing constraints must be considered to avoid semi-catastrophic or catastrophic puncturing masks in the second (interleaved) CRSC code [34], [5]. Indeed, due to such poor puncturing patterns, the Hamming distance spectrum of the CRSC code can contain a d_min value equal to one or even zero. To avoid poor puncturing patterns in the second CRSC code, a data puncture-constrained (DPC) interleaver must guarantee the same data puncturing pattern in both constituent CRSC codes [34]. For instance, the possible connections made by a DPC interleaver between punctured data positions from d to d′ are shown in Fig. 4, for a data puncturing mask with period M = 8. Note that only the first puncturing period is shown in the figure, but connections between different periods are also admitted.

C. Protograph-Based Interleavers
The following analysis aims to identify additional useful puncturing constraints for the design of TC interleavers. Our approach is based on the observation that, for a punctured code, the reliability of the extrinsic information related to an information symbol depends on different parameters, such as the position of the considered symbol in the puncturing period, whether or not the corresponding parity is punctured, and the number of punctured parities in the case of constituent codes with several parities per information symbol.
During the turbo decoding process, extrinsic information from a given constituent decoder is generated based on its received parity sequence and is sent to the other constituent decoder via the interleaver/deinterleaver as a priori information on data. The extrinsic information computed from unpunctured parity positions is expected to be more reliable than the one generated from punctured parity positions. In order to illustrate this conjecture, we have plotted a conventional EXIT chart for a TC only on positions with punctured parities, only on positions with unpunctured parities, and on the complete frame. As shown in Fig. 5, the EXIT curve obtained from data at unpunctured parity positions shows a wider tunnel opening between the EXIT curves than the one obtained from data at punctured parity positions. Since extrinsic information is used as a priori information on data, a possible strategy for the interleaver construction involves connecting the positions with highly reliable extrinsic information to the positions with unreliable extrinsic information, which are more prone to errors. This connection strategy aims to spread the correction capability of the TC over the whole data block. A particular case involves sorting the data positions in a puncturing period M by increasing order of reliability and connecting via the interleaver the least error-prone data positions of one CRSC code to the most error-prone data positions of the other one. We named the resulting connection graph a protograph, referring to protograph-based LDPC codes [17], since the proposed protograph defines a set of inter-period permutations. The protograph is then defined as follows:
1) Sorting of unpunctured data positions by error-prone level in a puncturing period M: To this end, the unpunctured data positions are punctured in turn with an associated evaluation of the distance spectrum of the resulting CRSC code. They are then sorted according to their distance spectrum: the least error-prone data position is the one with the best distance spectrum (i.e., the highest d_min and the lowest multiplicity; in the case of equal d_min values and multiplicities, the next higher distance is considered) for the resulting CRSC code when punctured, and the most error-prone data position is the one with the poorest distance spectrum (i.e., the lowest d_min and the highest multiplicity) for the resulting CRSC code when punctured. Note that additional data puncturing is only introduced to evaluate the error-prone level of unpunctured data positions and is then removed from the puncturing mask. An example of sorting of unpunctured data positions via this procedure is shown in Fig. 6 for a puncturing period M = 8.
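The sorting rule of step 1 amounts to a lexicographic comparison of truncated distance spectra. The sketch below assumes each spectrum is available as a list of (distance, multiplicity) pairs in increasing order of distance. The spectra used here are made-up placeholder values chosen only to exercise the comparison, not results from the paper, and the function names are hypothetical.

```python
def spectrum_key(spectrum):
    """Sort key: a higher distance is better, then a lower multiplicity,
    then the same comparison on the next spectrum term."""
    key = []
    for distance, multiplicity in spectrum:
        key.extend((-distance, multiplicity))
    return tuple(key)

def sort_by_error_proneness(spectra):
    """Return data positions from least to most error-prone."""
    return sorted(spectra, key=lambda position: spectrum_key(spectra[position]))

# Placeholder spectra (hypothetical): position -> [(distance, multiplicity), ...]
spectra = {
    1: [(3, 2), (4, 10)],
    2: [(2, 1), (3, 5)],
    3: [(3, 2), (4, 7)],
    4: [(3, 1), (4, 20)],
}
print(sort_by_error_proneness(spectra))
```

With these placeholder values, position 4 is ranked as the least error-prone (same minimum distance as positions 1 and 3, but a lower multiplicity), and position 2 as the most error-prone.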

V. LAYERED CONSTRUCTION OF ARP INTERLEAVERS
Different methods to select ARP interleaver parameters have been investigated, e.g., [12], [15]. However, the high coding rate TCs derived from these codes are still subject to error floors detrimental to applications with high reliability requirements. In this paper, we propose an alternative construction method, based on a layered approach, that helps to design TCs with high d_min values. In addition, the proposed approach facilitates the introduction of the above-mentioned puncturing constraints into the interleaver design, as well as the verification of other design criteria such as the minimum span S_pmin and girth g targets. In order to simplify the selection of the interleaver parameters, the interleaver addresses Π(i) are divided into different groups that are incrementally defined. From equation (2), it can be shown that (see Appendix A for the proof):

    $\Pi(i + Q) \bmod Q = \Pi(i) \bmod Q$    (12)

Therefore, Q groups of permutation addresses are identified, each corresponding to a given modulo Q value. The sequences d and d′ are divided into these Q different layers of K/Q bits. The layer index l for the bit Π(i) in sequence d and the layer index l′ for the bit i in sequence d′ are defined by:

    $l = \Pi(i) \bmod Q$    (13)
    $l' = i \bmod Q$    (14)

The interleaver is defined by a group of Q different regular permutations, each linking a layer l in the non-interleaved sequence d to its corresponding layer l′ in the interleaved sequence d′. It can be represented on a circle, as shown in Fig. 8, where each layer is identified by a color (and a line type) and a layer index. The addresses in the non-interleaved order, denoted by Π(i), are at the inner part of the circle, and their corresponding addresses i, in the interleaved order, are at the outer part of the circle.

0090-6778 (c) 2017 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

[Figure labels: non-interleaved addresses; interleaved addresses]

For a bit of layer l′ at address i, the proposed interleaver chooses the shift value S(l′), decomposed into an inter-layer shift T_l′ and an intra-layer shift A_l′ such that:

    $S(l') = T_{l'} + A_{l'} \cdot Q$    (15)

where T_l′ = 0, ..., Q − 1 and A_l′ = 0, ..., (K/Q) − 1. The inter-layer shift T_l′ defines the layer position l (or equivalently the position within the period Q) of the non-interleaved sequence d that will be connected to layer position l′ of the interleaved sequence d′, as shown in Fig. 9. The intra-layer shift A_l′ defines which of the K/Q possible positions within layer l (or equivalently which period) is connected to address i, as illustrated in Fig. 10.
This layered construction simplifies the validation of the S_pmin and g targets in the interleaver design step, since these criteria are verified each time a new layer is placed or, equivalently, each time its corresponding S(l′) value is defined.
In order to simplify the introduction of puncturing constraints into the interleaver design, the disorder degree Q of the ARP interleaver is set as a multiple of the puncturing period M. In this study, Q is set equal to M. In other words, the periodic disorder degree of the interleaver and the puncturing period are identical. Since the PB interleaver constraints define a Q-periodic connection strategy for unpunctured data bits on one side, and since the proposed ARP interleaver is defined by a group of Q different regular permutations on the other side, the T_l′ values are chosen to apply these constraints. Indeed, the validation of puncturing constraints within a puncturing period M = Q is a sufficient condition for their validation in the whole data sequence. Note that for a given layer l′ in d′, the corresponding layer position l in d is obtained by l = (P · l′ + T_l′) mod Q. Thus, a periodic connection pattern, with period Q, is established by the inter-layer shifts T_l′ between d and d′, for a given P.
The Q layers of the interleaver structure can be defined incrementally by choosing their corresponding value S(l′). Note that the initial position of each layer (i.e., before the introduction of the described shifts) is defined by the value of P.
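The layer decomposition can be checked numerically on a small example. The sketch below assumes the ARP form Π(i) = (P · i + S(i mod Q)) mod K, the split S(l′) = T_l′ + A_l′ · Q, and the layer relation l = (P · l′ + T_l′) mod Q. The toy parameters (K = 24, Q = 4, P = 5) and the shift vector are hypothetical, as are the function names.

```python
def split_shift(shift, Q):
    """Decompose S(l') into (T, A) with S(l') = T + A * Q and 0 <= T < Q."""
    return shift % Q, shift // Q

def layer_map(P, S):
    """Layer l of d connected to each layer l' of d': l = (P * l' + T) mod Q."""
    Q = len(S)
    return [(P * lp + S[lp] % Q) % Q for lp in range(Q)]

def arp_permutation(K, P, S):
    """ARP interleaver: Pi(i) = (P * i + S[i mod Q]) mod K."""
    Q = len(S)
    return [(P * i + S[i % Q]) % K for i in range(K)]

# Toy parameters (hypothetical values).
K, P = 24, 5
S = [0, 6, 12, 10]
Q = len(S)

shifts = [split_shift(s, Q) for s in S]
layers = layer_map(P, S)
pi = arp_permutation(K, P, S)

# Every address of layer l' in d' must land in layer layers[l'] of d.
consistent = all(pi[i] % Q == layers[i % Q] for i in range(K))
print(shifts, layers, consistent)
```

Here the inter-layer shifts (0, 2, 0, 2) connect layers 0, 1, 2, 3 of d′ to layers 0, 3, 2, 1 of d. Since this layer map is one-to-one and P is relatively prime to K, the resulting Π is a valid permutation.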

A. Overall Interleaver Construction Method
For a given set of design parameters (S_pmin and g targets, K, R, polynomial generators, and puncturing mask), the proposed interleaver design strategy involves the following steps:
1) Select the candidate values for P: The set of admissible values for P is the group of integers from 1 to K − 1 relatively prime to K. In this set, only the C candidate values for P ensuring an S_pmin value (5) greater than or equal to the S_pmin target, considering an RI structure (1), are selected.
2) Select the Q shift values for each candidate value for P: A detailed description of the different steps of the shift selection, for the C candidates for P, is presented in Appendix B. To summarize, for each candidate value for P, layer l′ is placed by computing a value for S(l′) from (15), fulfilling puncturing constraints if any (see Section IV). For this value, S_pmin and g are evaluated. If they are equal to or higher than the S_pmin and g targets, one can move on to layer l′ + 1. If the S_pmin and g targets are not met, another value for S(l′) has to be evaluated. This process is repeated until the whole group of Q shift values is determined.
3) Select the best ARP interleaver candidate: As a last phase of the design, the best candidate for the TC interleaver is selected from the group of candidates previously generated by comparing their Hamming distance spectra. The truncated TC Hamming distance spectrum can be estimated by different methods, as proposed in [35]. The ARP interleaver candidate with the best TC Hamming distance spectrum is chosen.
Determining a suitable pair of S_pmin and g targets for obtaining good ARP interleaver candidates is not trivial. However, a possible selection strategy involves the following steps: first, evaluate the convergence of the algorithm in Appendix B for values of S_pmin and g set to their corresponding upper bounds (see Sections III-A and III-B). Then, if the algorithm does not converge, progressively reduce S_pmin and g until the algorithm converges to a group of ARP candidate interleavers.
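Step 1 of the construction can be sketched as follows. For a regular interleaver Π(i) = (P · i) mod K, the span of a pair (i, j) depends only on the offset δ = j − i, so the minimum span reduces to a minimum over δ of f(δ) + f(P · δ), where f is the Lee distance on a circle of size K. The shift selection of step 2 (Appendix B) is not reproduced here, and the function names are hypothetical.

```python
from math import gcd

def lee(offset, K):
    """Lee distance of an address offset on a circle of size K."""
    offset %= K
    return min(offset, K - offset)

def ri_min_span(P, K):
    """Minimum span of the regular interleaver Pi(i) = (P * i) mod K."""
    return min(lee(delta, K) + lee(P * delta, K) for delta in range(1, K))

def candidate_periods(K, span_target):
    """Values of P coprime to K whose RI minimum span meets the target."""
    return [
        P
        for P in range(1, K)
        if gcd(P, K) == 1 and ri_min_span(P, K) >= span_target
    ]

# Toy block size (hypothetical): K = 24 with a span target of 6.
print(candidate_periods(24, 6))
```

For this toy block size, only 4 of the 8 admissible values of P survive the span test, which illustrates how the first step prunes the search space before any shift is selected.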

B. Summary of Puncture-Constrained Interleaver Design Method
This section summarizes the proposed method to jointly optimize the TC interleaver with the puncturing pattern:
1) Select the puncturing mask: The best puncturing mask for the constituent CRSC code of the TC, in terms of constituent code distance spectrum and TC extrinsic information exchange, is identified according to Section IV-A.
2) Define the puncturing constraints: The corresponding puncturing constraints that must be fulfilled by the interleaver are determined according to Sections IV-B and IV-C.
3) Generate the candidate interleavers: A group of candidate interleavers validating the different design criteria (e.g., S_pmin and g targets, and proposed puncturing constraints) is generated via the method described in Section V-A. Finally, the candidate interleaver with the best TC Hamming distance spectrum is selected.

VI. APPLICATION EXAMPLES
We have applied the previous design guidelines to two coding rates, R = 2/3 and 4/5, for K = 1504 bits, available in the LTE standard. The constituent code is the CRSC code with feedback and feedforward polynomials 13 and 15 (expressed in octal format), denoted CRSC(1, 15/13)_8. Only the design for code rate 2/3 is detailed hereafter.

A. Puncturing Mask Selection
We consider puncturing periods M of 8 and 16. Table I lists the distance spectrum of the constituent CRSC code for the best puncturing masks at each D_p value. The analysis of the distance spectra shows that D_p values higher than or equal to 6/8 should be avoided, since the puncturing mask becomes catastrophic for the CRSC code. The TC convergence behavior with the different puncturing masks is then analyzed, using uniform interleaving. Fig. 11 shows the modified EXIT chart of the TC evaluated at its signal-to-noise ratio (SNR) decoding threshold for D_p = 0. The puncturing masks providing better TC convergence performance than the D_p = 0 mask correspond to D_p = 2/16 and 2/8. We finally choose the puncturing mask with D_p = 2/8, which displays the best error floor performance in the error rate evaluation.

B. Protograph Construction
For the selected mask, the unpunctured data positions are sorted as explained in Section IV-C. Table II lists the different distance spectra, truncated to distance 4, obtained by including one additional punctured data symbol from positions 1 to 6 in the mask. Fig. 12 shows the corresponding sorting of unpunctured data positions and the resulting protograph for the D_p = 2/8 mask, R = 2/3 and M = 8.

C. Puncture-Constrained Interleaver Design
For K = 1504, S_pmin has an upper bound of 54 as defined in (6). Following the strategy introduced in Section V-A, an S_pmin goal of 80-85% of the S_pmin upper bound and a g goal of 8 were selected to guarantee the convergence of the algorithm described in Appendix B.
1) Selection of the candidate values for P: The maximum achievable value of S p min obtained when testing all the admissible candidate values for P considering a RI structure is 52. Thus, we limited the search to candidates for P leading to S p min values between 45 and 52. 2) Selection of the Q shift values for each candidate for P: The parameters for the ARP interleaver candidates are determined by the algorithm in Appendix B. Three different design configurations have been studied. In the first one (NDP), no data bits are punctured (D p = 0).
In the other two, DPC and PB ARP interleavers are considered for the D_p = 2/8 mask. DPC interleavers have already been studied in the past [34]. In this study, DPC interleavers are designed based on the ARP model for comparison purposes. In order to compare the efficiency of the different configurations in finding large d_min values, the same number of ARP candidates (64,000) is generated by each configuration.
3) Selection of the best ARP interleaver candidate: Table III lists the best ARP interleavers generated for each design configuration. All candidates achieve an S p min value of 45 and a g value of 8. Their respective distance spectrum, truncated to three terms, is estimated and given in Table IV. It is observed that the use of data puncturing allows a larger d_min value to be reached. Furthermore, the PB ARP interleaver achieves the largest d_min.

TABLE III
ARP   P       Shift values
NDP   [...]   792   630   829   1010   90     1471   658
DPC   227     495   998   280   1090   734    361    362
PB    651     89    528   852   1501   1396   688    490

TABLE IV
ESTIMATED DISTANCE SPECTRUM, TRUNCATED TO THREE TERMS, OF THE RESULTING TC FOR THE [...]
ARP
NDP   5640    15   16   17   1128   4512   7708
DPC   4324    19   20   21   752    1880   5264
PB    10716   20   21   22   1504   3008   6016

Finally, statistics on the search efficiency of the different configurations for large d_min values are provided in Table V. The configurations are compared in terms of the number of obtained candidates meeting the S p min and g targets. Note that the more constraints are included in the interleaver design (e.g., DPC, PB), the larger the number of candidates meeting these criteria. Indeed, the introduction of these constraints reduces the randomness of the interleaver design process and facilitates the validation of the span and girth criteria. Furthermore, adding the different constraints to the interleaver design increases the percentage of candidates with a d_min of at least 18 and reduces the average time needed to find such candidates.
Thus, the search becomes more efficient due to the introduction of the design constraints. Note that PB candidates are the best in terms of d_min value.

The proposed design guidelines were also applied to a code rate R = 4/5. Table VI lists the best ARP interleavers generated for each design configuration. All candidates achieve an S p min value of 39 and a g value of 8. Their respective distance spectrum is estimated and given in Table VII.
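The span-related steps of the layered design in this section can be sketched as follows. The code is an illustration under stated assumptions, not the algorithm of Appendix B: the span is taken as the sum of the circular distances between two positions before and after interleaving, the admissible values for P are taken as those coprime with K, the first value of each Table III row is read as P, and S(0) = 0 is assumed for the remaining shift values. All function names are hypothetical.

```python
from math import gcd


def circ(x, K):
    """Circular (tail-biting) distance of an index difference x modulo K."""
    x %= K
    return min(x, K - x)


def ri_min_span(K, P):
    """Minimum span of the regular interleaver Pi(i) = P*i mod K.

    For an RI the span depends only on the index difference d.
    """
    return min(circ(d, K) + circ(P * d, K) for d in range(1, K))


def arp(K, P, S):
    """ARP interleaver of (2): Pi(i) = (P*i + S(i mod Q)) mod K, Q = len(S)."""
    Q = len(S)
    return [(P * i + S[i % Q]) % K for i in range(K)]


def min_span(pi, limit):
    """Minimum span of a permutation, capped at `limit`.

    Only pairs with input distance below `limit` are visited, since
    more distant pairs cannot have a span smaller than `limit`.
    """
    K = len(pi)
    best = limit
    for i in range(K):
        for d in range(1, limit):
            j = (i + d) % K
            best = min(best, d + circ(pi[i] - pi[j], K))
    return best


K = 1504

# Step 1: keep the RI periods P whose minimum span lies in the target range.
candidates = [P for P in range(1, K) if gcd(P, K) == 1]
spans = {P: ri_min_span(K, P) for P in candidates}
kept = [P for P in candidates if 45 <= spans[P] <= 52]

# Step 3: check one Table III entry (PB row, assuming P first and S(0) = 0).
S_pb = [0, 89, 528, 852, 1501, 1396, 688, 490]
pi_pb = arp(K, 651, S_pb)
assert sorted(pi_pb) == list(range(K))  # the mapping is a permutation

print(max(spans.values()), len(kept), min_span(pi_pb, 55))
```

The exhaustive search over P is cheap here because the span of an RI depends only on the index difference; the correlation girth and d_min criteria are not covered by this sketch.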

VII. SIMULATED PERFORMANCE RESULTS
The error rate performance of TCs using the interleaver parameters of Tables III and VI is evaluated over the AWGN channel with BPSK modulation and a maximum of 16 decoding iterations of the maximum a posteriori probability (MAP) algorithm. The estimated truncated distance spectra listed in Tables IV and VII are used to compute the truncated union upper bounds (TUBs) [36]. In addition, error rate simulation results for the original LTE [4] TC are included for comparison. Figs. 13 and 14 show the FER and BER performance of the 8-state CRSC(1, 15/13)_8 TC for K = 1504 bits, R = 2/3 and 4/5. We observe that DPC and PB interleavers achieve a substantial asymptotic performance gain compared to the NDP interleaver. The proposed PB interleaver achieves a slightly better asymptotic performance than the DPC interleaver for both code rates. Compared to the LTE TC, the proposed PB interleaver provides a gain of about 0.5 and 0.7 dB at 2·10^-3 of FER for R = 2/3 and 4/5, respectively, and almost 4 decades in error floor in both cases (see Fig. 13).

TABLE VII
ESTIMATED DISTANCE SPECTRUM FOR THE BEST ARP INTERLEAVERS IN AWGN CHANNEL WITH CORRESPONDING [...]
NDP   12220   9    10   11   2350   8554   29516
DPC   10434   11   12   13   1880   5358   16732
PB    4324    11   12   13   752    4794   15134

Fig. 15 shows a comparison of the required SNR at 10^-3 of FER, evaluated over the AWGN channel with QPSK modulation, between the proposed TCs, parity-check (PC) polar codes, and low-density parity-check (LDPC) codes considered during the 3GPP standardization process. The proposed TCs perform a maximum of 8 decoding iterations of the scaled max-log MAP algorithm, PC polar codes use the successive cancellation list-8 decoding algorithm, and LDPC codes use the layered offset min-sum algorithm with 20 decoding iterations. The three families of codes allow the same level of flexibility in terms of coding rate and frame size.
The corresponding performance results are taken from [37] for the proposed TCs, from [38] (R = 1/5) and [39] for PC polar codes, and from [40] and [41] for LDPC codes. For short information block sizes (K around 100 bits), TCs and PC polar codes show similar error rates, while LDPC codes suffer from short correlation cycles, which degrade their performance. As the information frame size increases, the error rates of the three families of codes become equivalent. Note that, as the coding rate decreases, the proposed TCs start to exhibit the best performance. Additionally, performance results for the LDPC codes finally adopted for 5G (from [42]), decoded with 50 iterations of the sum-product algorithm, are included. As can be expected, their performance is better than that obtained with the layered offset min-sum algorithm, especially for the largest frame size. However, even though they use the scaled max-log MAP decoding algorithm, the proposed TCs remain competitive with these codes. Note that these results do not necessarily provide a fair comparison, owing to large disparities in the required decoding complexity; they are included for information purposes.
As observed in the presented application examples, the proposed approach brings a gain both in convergence and in the error floor compared to the LTE code, showing that the association between the rate matching and the QPP interleaver of LTE is sub-optimal. Furthermore, the proposed TCs are competitive in terms of performance with the considered PC polar and LDPC codes for 5G.
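To make the role of the truncated distance spectra in this section more concrete, the sketch below evaluates a truncated union bound on the FER. It is not taken from [36]: it assumes BPSK over the AWGN channel and the textbook form FER ≤ Σ A_d · ½ · erfc(√(d · R · E_b/N_0)), and it assumes that each Table IV row lists three distances followed by their multiplicities A_d. The function name `fer_tub` is hypothetical.

```python
import math


def fer_tub(spectrum, R, ebn0_db):
    """Truncated union bound on the FER for BPSK over the AWGN channel.

    spectrum : list of (distance d, multiplicity A_d) pairs
    R        : code rate
    ebn0_db  : Eb/N0 in dB
    """
    ebn0 = 10 ** (ebn0_db / 10)
    return sum(A * 0.5 * math.erfc(math.sqrt(d * R * ebn0))
               for d, A in spectrum)


R = 2 / 3
# Assumed reading of Table IV: three distances and their multiplicities.
table_iv = {
    "NDP": [(15, 1128), (16, 4512), (17, 7708)],
    "DPC": [(19, 752), (20, 1880), (21, 5264)],
    "PB": [(20, 1504), (21, 3008), (22, 6016)],
}

for ebn0_db in (2.0, 3.0, 4.0):
    bounds = {name: fer_tub(s, R, ebn0_db) for name, s in table_iv.items()}
    print(ebn0_db, {name: f"{b:.2e}" for name, b in bounds.items()})
```

Under this reading, the ordering of the bounds at high SNR follows the ordering of the d_min values, which is consistent with the asymptotic gains reported for the DPC and PB interleavers over the NDP interleaver.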

VIII. CONCLUSION
A new method to design TC interleavers is proposed, which calls for a joint optimization of puncturing patterns and interleaver function. Catastrophic puncturing masks for the constituent codes of the TC are identified early in the selection process by evaluating their respective distance spectra. Then, a modified EXIT chart analysis allows a suitable puncturing mask, with data puncturing, to be identified for the TC in terms of convergence performance. It was shown that significant improvements in the waterfall and error floor regions can be achieved by including puncturing constraints in the interleaver design. The best candidates in terms of d_min are obtained by using the proposed PB interleaver.

Fig. 15. [...] [38] and [39] with the successive cancellation list-8 decoding algorithm, LDPC code of [40] and [41] with 20 decoding iterations of the layered offset min-sum algorithm, and LDPC code of [42] with 50 decoding iterations of the sum-product algorithm.
Finally, the presented method allows puncturing constraints to be introduced easily into the interleaver design, together with the validation of other design criteria such as span and correlation girth. The generation of ARP interleavers meeting the different design criteria is greatly simplified in comparison to previous methods, since the proposed layered construction suitably limits the search space for the different interleaver parameters.
Thanks to the proposed method, enhanced turbo codes were designed and submitted to the 3GPP standardization process for 5G. They are currently being considered for the ultra-reliable and machine type communication scenarios.

APPENDIX A
PROOF OF Π(i + Q) mod Q = Π(i) mod Q
Considering (2), Π(i + Q) can be written:
Π(i + Q) = (P·(i + Q) + S((i + Q) mod Q)) mod K
         = (P·i + S(i mod Q) + QP) mod K
         = (Π(i) + QP) mod K
Then, noting that K is a multiple of Q:
Π(i + Q) mod Q = ((Π(i) + QP) mod K) mod Q
               = (Π(i) + QP) mod Q
               = Π(i) mod Q
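The identity of Appendix A can also be checked numerically. The sketch below builds the ARP interleaver of (2) and verifies that positions Q apart are mapped to the same residue class modulo Q. The parameter values are those of the DPC row of Table III, under the assumptions that the first value is P and that S(0) = 0; the function name `arp` is hypothetical.

```python
def arp(K, P, S):
    """ARP interleaver of (2): Pi(i) = (P*i + S(i mod Q)) mod K, Q = len(S)."""
    Q = len(S)
    if K % Q:
        raise ValueError("K must be a multiple of Q")
    return [(P * i + S[i % Q]) % K for i in range(K)]


K, P = 1504, 227
S = [0, 495, 998, 280, 1090, 734, 361, 362]  # DPC row, assuming S(0) = 0
Q = len(S)
pi = arp(K, P, S)

# Appendix A: Pi(i + Q) mod Q == Pi(i) mod Q for every position i.
assert all(pi[(i + Q) % K] % Q == pi[i] % Q for i in range(K))

# Residue class reached from each of the Q input classes.
classes = [pi[l] % Q for l in range(Q)]
print(classes)  # [0, 2, 4, 1, 6, 5, 3, 7]
```

Because every input class i mod Q is sent to a single output class, the interleaver connections repeat with period Q, which is the periodic cross connection pattern that the protograph describes.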