Additive and Dominance Genomic Analysis for Litter Size in Purebred and Crossbred Iberian Pigs

INGA FOOD S.A., a Spanish company that produces and markets fattened pigs, has produced a hybrid Iberian sow called CASTÚA by crossing the Retinto and Entrepelado varieties. The parental populations are selected on criteria calculated from purebred information, under the assumption that the genetic correlation between purebred and crossbred performance is high; however, this correlation can be less than one because of genotype-by-environment (GxE) interactions or the presence of non-additive genetic effects. This study estimated the additive and dominance variances of the purebred and crossbred populations for litter size, and calculated the additive genetic correlations between purebred and crossbred performance. The dataset consisted of 2030 litters from the Entrepelado population, 1977 litters from the Retinto population, and 1958 litters from the crossbred population. The individuals were genotyped with a GeneSeek® GGP Porcine 70K HD chip. The model of analysis was a ‘biological’ multivariate mixed model that included additive and dominance SNP effects. The estimates of the additive genotypic variance for the total number born (TNB) were 0.248, 0.282 and 0.546 for the Entrepelado, Retinto and Crossbred populations, respectively. The estimates of the dominance genotypic variance were 0.177, 0.172 and 0.262 for the Entrepelado, Retinto and Crossbred populations, respectively. The results for the number born alive (NBA) were similar. The genetic correlations between purebred and crossbred performance for TNB were 0.663 in the Entrepelado and 0.881 in the Retinto populations. After backsolving to obtain estimates of the SNP effects, the additive genetic variance associated with genomic regions containing 30 SNPs was estimated, and four genomic regions that each explained >2% of the additive genetic variance were identified: one on chromosome (SSC) 6, two on SSC8, and one on SSC12.


Introduction
Genes 2022, 13, 12
The Iberian pig is among the porcine populations with the highest meat quality [1]. Historically, Iberian pig production was developed extensively with purebred varieties, which took advantage of the Dehesa environment in southwestern Spain. In recent decades, however, many traditional production systems have been replaced by intensive production systems that use crossbreeding with Duroc populations to improve growth and efficiency [2]. The regulations that govern Iberian pig production [3] require farmers who cross Iberian and Duroc pigs to mate Duroc boars with Iberian sows. Prolificacy, which is lower than that of white pig populations, is the major limitation in the intensive production of crossbred pigs from Iberian dams [4]. The INGA FOOD S.A. company has developed a crossbreeding scheme between two Iberian varieties, Retinto (R) and Entrepelado (E), that has created a hybrid sow called CASTÚA (ER), which shows an important heterosis effect on prolificacy [5]. In addition, the company has been developing a breeding scheme to increase litter size through selection in the parental Retinto and Entrepelado populations.
Theoretically, the optimal strategy for the selection of purebreds for crossbred performance is Recurrent Reciprocal Selection [6]; however, it has not been routinely used in pig breeding because it involves a delay in the generation interval. In fact, purebred parental populations are selected based on selection criteria calculated from purebred phenotypic information, and under the assumption that the genetic correlation between purebred and crossbred performance is high [7]. Those genetic correlations can be imperfect (<1) because of genotype-by-environment (GxE) interactions and the presence of non-additive genetic effects [7].
Genomic information facilitates the analysis of crossbreeding data, even if the genotyped and phenotyped individuals are not directly related [8], through the definition of an additive-dominance genotypic model that captures genotype × environment interactions through genotypic correlations. In addition, the estimates of the additive and dominance genotypic variances can be used to estimate the additive genetic correlation between purebred and crossbred performances. Backsolving, as proposed by Wang et al. [9], provides estimates of the SNP effects and allows the amount of additive genetic variance associated with each genomic region to be calculated for purebred and crossbred performances.
This study estimated the additive and dominance genotypic variances and covariances, which were used to calculate the additive and dominance genetic variances and the genetic correlations between purebred and crossbred performances in the Retinto and Entrepelado populations. In addition, the distribution of the additive genetic variance within the autosomal genome for purebred and crossbred performance was quantified.

Materials and Methods
The phenotypic data included the number of piglets born alive (NBA) and the total number born (TNB) for 306 Entrepelado and 313 Retinto purebred sows, and for 333 crossbred (Entrepelado × Retinto) sows when mated with Entrepelado, Retinto or Duroc boars (Table 1). All of the sows were genotyped with the GeneSeek® GGP Porcine 70K HD chip (Illumina Inc., San Diego, CA, USA). Quality control excluded SNPs with a minor allele frequency < 0.05 or a call rate < 0.90 in the overall population. After filtering, 34,316 SNP markers remained and were used to build the genomic relationship matrices with in-house software developed in the R environment [10]. Missing genotypes were replaced with their expected values.
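The quality control and imputation steps described above can be sketched as follows. This is a minimal illustration in Python rather than the in-house R software used in the study; the function names, the dosage coding (copies of allele A1, with `None` for a missing call) and the use of the mean observed dosage as the "expected" genotype are all assumptions made for the example.

```python
from typing import List, Optional, Tuple

Genotype = Optional[int]  # copies of allele A1 (0, 1 or 2); None marks a missing call


def snp_stats(column: List[Genotype]) -> Tuple[float, float, float]:
    """Return (call rate, minor allele frequency, mean dosage) for one SNP."""
    called = [g for g in column if g is not None]
    call_rate = len(called) / len(column)
    if not called:
        return call_rate, 0.0, 0.0
    mean_dosage = sum(called) / len(called)
    p = mean_dosage / 2.0  # estimated frequency of allele A1
    return call_rate, min(p, 1.0 - p), mean_dosage


def filter_and_impute(genotypes, min_maf=0.05, min_call_rate=0.90):
    """Drop SNPs failing either threshold; fill missing calls with the mean dosage."""
    kept, columns = [], []
    for j in range(len(genotypes[0])):
        column = [row[j] for row in genotypes]
        call_rate, maf, mean_dosage = snp_stats(column)
        if call_rate < min_call_rate or maf < min_maf:
            continue
        kept.append(j)
        columns.append([mean_dosage if g is None else float(g) for g in column])
    # Transpose back to an individuals x SNPs layout.
    imputed = [list(row) for row in zip(*columns)] if columns else [[] for _ in genotypes]
    return kept, imputed


# Hypothetical toy data: 10 sows x 3 SNPs.
toy = [
    [0, 0, 1], [1, 0, None], [2, 0, 1], [1, 0, 0], [0, 0, None],
    [1, 0, 2], [2, 0, 1], [1, 0, 0], [None, 0, 1], [1, 0, 2],
]
kept, imputed = filter_and_impute(toy)
```

In the toy data the second SNP is monomorphic and the third has a call rate of 0.80, so only the first SNP is retained, and its single missing call is replaced by the mean dosage of the called genotypes.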
The model of analysis assumed that the phenotypic values (y; TNB or NBA) are explained by the ‘biological’ additive (u) and dominance (v) genotypic effects of the SNPs; a covariate (c) on the average homozygosity of each sow (f); the systematic effects (b) of order of parity (1, 2, 3, and >3), breed of the service sire (Entrepelado, Retinto, or Duroc) and herd-year-season (122 levels); the sow permanent environmental effects (s), with 2030, 1958 and 1977 levels for the Entrepelado, Retinto and Crossbred populations; and the residuals (e). Phenotypic data were generated in three herds, and herd-year-season effects were defined every 3 months. The model was as follows:
$$\mathbf{y} = \mathbf{X}\mathbf{b} + \mathbf{f}c + \mathbf{T}\mathbf{u} + \mathbf{T}\mathbf{v} + \mathbf{T}\mathbf{s} + \mathbf{e} \qquad (1)$$
where X and T are the corresponding incidence matrices. Following Vitezica et al. [8], u and v can be described in terms of the vectors of additive (a) and dominance (d) SNP genotypic effects as follows:
$$\mathbf{u} = \mathbf{Z}\mathbf{a}, \qquad \mathbf{v} = \mathbf{W}\mathbf{d} \qquad (2)$$
The matrices Z = (z_1, ..., z_m) and W = (w_1, ..., w_m) contain, for each of the m SNP markers, the values 1, 0 and −1 (Z) and 0, 1 and 0 (W) for the SNP genotypes A1A1, A1A2 and A2A2, respectively.
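The genotype coding described above can be illustrated with a short sketch. It assumes that genotypes are stored as the number of copies of allele A1 (2 for A1A1, 1 for A1A2, 0 for A2A2); the function names are hypothetical, and the scaling of the relationship matrix by its average diagonal, tr[MM']/n, is the normalization described by Vitezica et al. [8] and is assumed here rather than taken from the study's software.

```python
def code_genotypes(dosages):
    """Build Z (1, 0, -1) and W (0, 1, 0) from A1 allele counts (2, 1, 0)."""
    Z = [[g - 1 for g in row] for row in dosages]
    W = [[1 if g == 1 else 0 for g in row] for row in dosages]
    return Z, W


def average_homozygosity(row):
    """Proportion of homozygous SNPs for one individual (the covariate f)."""
    return sum(1 for g in row if g != 1) / len(row)


def relationship_matrix(M):
    """Return MM' divided by its average diagonal element, tr[MM']/n."""
    n = len(M)
    cross = [[sum(a * b for a, b in zip(ri, rj)) for rj in M] for ri in M]
    scale = sum(cross[i][i] for i in range(n)) / n
    return [[value / scale for value in row] for row in cross]


# Hypothetical toy data: 3 individuals x 3 SNPs.
dosages = [[2, 1, 0], [1, 1, 2], [0, 2, 1]]
Z, W = code_genotypes(dosages)
f = [average_homozygosity(row) for row in dosages]
G = relationship_matrix(Z)  # additive genomic relationships
D = relationship_matrix(W)  # dominance genomic relationships
```

With this normalization the diagonal of each relationship matrix averages one, so the variance component estimated with it is on the scale of the individual genotypic effects.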
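The covariance across individual genotypic additive (u) and dominance (v) effects is
$$\mathrm{Var}\begin{pmatrix}\mathbf{u}_{E}\\ \mathbf{u}_{R}\\ \mathbf{u}_{ER}\end{pmatrix} = \begin{pmatrix}\sigma^2_{u_E} & \sigma_{u_E u_R} & \sigma_{u_E u_{ER}}\\ \sigma_{u_E u_R} & \sigma^2_{u_R} & \sigma_{u_R u_{ER}}\\ \sigma_{u_E u_{ER}} & \sigma_{u_R u_{ER}} & \sigma^2_{u_{ER}}\end{pmatrix} \otimes \mathbf{G}, \qquad \mathrm{Var}\begin{pmatrix}\mathbf{v}_{E}\\ \mathbf{v}_{R}\\ \mathbf{v}_{ER}\end{pmatrix} = \begin{pmatrix}\sigma^2_{v_E} & \sigma_{v_E v_R} & \sigma_{v_E v_{ER}}\\ \sigma_{v_E v_R} & \sigma^2_{v_R} & \sigma_{v_R v_{ER}}\\ \sigma_{v_E v_{ER}} & \sigma_{v_R v_{ER}} & \sigma^2_{v_{ER}}\end{pmatrix} \otimes \mathbf{D} \qquad (3)$$
where $\mathbf{G} = \mathbf{Z}\mathbf{Z}'/(\{\mathrm{tr}[\mathbf{Z}\mathbf{Z}']\}/n)$ and $\mathbf{D} = \mathbf{W}\mathbf{W}'/(\{\mathrm{tr}[\mathbf{W}\mathbf{W}']\}/n)$ are the additive and dominance genomic relationship matrices [8], and n is the number of genotyped individuals. The variance components were estimated by REML [11] through the EM-REML algorithm using remlf90 software [12] and, in order to obtain the average information matrix, we used one extra iteration with airemlf90. Additive and dominance variance components were calculated in each of the populations (E, R, and ER) as follows:
$$\hat{\sigma}^2_{a_X} = \frac{\hat{\sigma}^2_{u_X}}{\{\mathrm{tr}[\mathbf{Z}\mathbf{Z}']\}/n}, \qquad \hat{\sigma}^2_{d_X} = \frac{\hat{\sigma}^2_{v_X}}{\{\mathrm{tr}[\mathbf{W}\mathbf{W}']\}/n} \qquad (4)$$
The additive ($\sigma^2_A$) and dominance ($\sigma^2_D$) genetic variances of the purebred populations (X = E, R) were calculated as follows:
$$\hat{\sigma}^2_{A_X} = \sum_{i=1}^{m} 2\hat{p}_{Xi}\hat{q}_{Xi}\left[\hat{\sigma}^2_{a_X} + (\hat{q}_{Xi} - \hat{p}_{Xi})^2\,\hat{\sigma}^2_{d_X}\right], \qquad \hat{\sigma}^2_{D_X} = \sum_{i=1}^{m} (2\hat{p}_{Xi}\hat{q}_{Xi})^2\,\hat{\sigma}^2_{d_X} \qquad (5)$$
where $\hat{p}_{Xi}$ and $\hat{q}_{Xi}$ are the raw estimates of the allelic frequencies for A1 and A2, respectively, at the ith SNP marker in the X = {E, R or ER} population. The estimates of the contributions to the additive variance in the crossbred population from the Entrepelado ($\sigma^2_{A_{ER(E)}}$) and Retinto ($\sigma^2_{A_{ER(R)}}$) populations were obtained following [8] as follows:
$$\hat{\sigma}^2_{A_{ER(E)}} = \sum_{i=1}^{m} 2\hat{p}_{Ei}\hat{q}_{Ei}\left[\hat{\sigma}^2_{a_{ER}} + (\hat{q}_{Ri} - \hat{p}_{Ri})^2\,\hat{\sigma}^2_{d_{ER}}\right] \qquad (6)$$
$$\hat{\sigma}^2_{A_{ER(R)}} = \sum_{i=1}^{m} 2\hat{p}_{Ri}\hat{q}_{Ri}\left[\hat{\sigma}^2_{a_{ER}} + (\hat{q}_{Ei} - \hat{p}_{Ei})^2\,\hat{\sigma}^2_{d_{ER}}\right] \qquad (7)$$
Following Vitezica et al. [8], the additive variance in the crossbred population was the average of the two values resulting from Equations (6) and (7), as follows:
$$\hat{\sigma}^2_{A_{ER}} = \tfrac{1}{2}\left(\hat{\sigma}^2_{A_{ER(E)}} + \hat{\sigma}^2_{A_{ER(R)}}\right) \qquad (8)$$
The estimate of the dominance variance of the crossbred population [8] was calculated as follows:
$$\hat{\sigma}^2_{D_{ER}} = \sum_{i=1}^{m} 4\hat{p}_{Ei}\hat{q}_{Ei}\hat{p}_{Ri}\hat{q}_{Ri}\,\hat{\sigma}^2_{d_{ER}} \qquad (9)$$
With these estimates, the heritabilities ($h^2_X$) and dominance ratios ($d^2_X$) in the purebred (X = E, R) and crossbred (X = ER) populations were obtained by:
$$h^2_X = \frac{\hat{\sigma}^2_{A_X}}{\hat{\sigma}^2_{A_X} + \hat{\sigma}^2_{D_X} + \hat{\sigma}^2_{S_X} + \hat{\sigma}^2_{E_X}} \qquad (10)$$
$$d^2_X = \frac{\hat{\sigma}^2_{D_X}}{\hat{\sigma}^2_{A_X} + \hat{\sigma}^2_{D_X} + \hat{\sigma}^2_{S_X} + \hat{\sigma}^2_{E_X}} \qquad (11)$$
where $\hat{\sigma}^2_{S_X}$ and $\hat{\sigma}^2_{E_X}$ are the estimates of the sow permanent environmental and residual variances in the X = {E, R, ER} population.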
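The step from SNP-level variance components and allelic frequencies to genetic variances, heritabilities and dominance ratios can be sketched in a few lines of Python. The formulas coded below are the standard expressions for the additive-dominance genotypic model of Vitezica et al. [8] and are assumed to correspond to the ones used in this study; the function names, argument names and toy values are hypothetical.

```python
def purebred_variances(p, var_a, var_d):
    """Additive and dominance genetic variances within one purebred population.

    p holds the frequency of allele A1 at each SNP, so q - p equals 1 - 2p.
    """
    additive = sum(2 * pi * (1 - pi) * (var_a + (1 - 2 * pi) ** 2 * var_d) for pi in p)
    dominance = sum((2 * pi * (1 - pi)) ** 2 * var_d for pi in p)
    return additive, dominance


def crossbred_variances(p_e, p_r, var_a, var_d):
    """Additive (average of both parental contributions) and dominance variances in the cross."""
    from_e = sum(2 * pe * (1 - pe) * (var_a + (1 - 2 * pr) ** 2 * var_d)
                 for pe, pr in zip(p_e, p_r))
    from_r = sum(2 * pr * (1 - pr) * (var_a + (1 - 2 * pe) ** 2 * var_d)
                 for pe, pr in zip(p_e, p_r))
    dominance = sum(4 * pe * (1 - pe) * pr * (1 - pr) * var_d for pe, pr in zip(p_e, p_r))
    return from_e, from_r, (from_e + from_r) / 2, dominance


def variance_ratios(var_A, var_D, var_S, var_E):
    """Heritability and dominance ratio relative to the total phenotypic variance."""
    total = var_A + var_D + var_S + var_E
    return var_A / total, var_D / total


# Hypothetical toy values: one SNP, SNP-level variances 0.1 (additive) and 0.2 (dominance).
from_e, from_r, additive_er, dominance_er = crossbred_variances([0.5], [0.25], 0.1, 0.2)
h2, d2 = variance_ratios(0.2, 0.1, 0.3, 3.4)
```

In the toy example the Entrepelado-like population contributes more additive variance to the cross because the other population is far from intermediate frequency, which inflates the dominance part of the allele substitution effect.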
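The covariances between purebred and crossbred additive genetic effects in the Entrepelado ($\sigma_{A_E A_{ER(E)}}$) and Retinto ($\sigma_{A_R A_{ER(R)}}$) populations were as follows:
$$\hat{\sigma}_{A_E A_{ER(E)}} = \sum_{i=1}^{m} 2\hat{p}_{Ei}\hat{q}_{Ei}\left[\hat{\sigma}_{a_E a_{ER}} + (\hat{q}_{Ei} - \hat{p}_{Ei})(\hat{q}_{Ri} - \hat{p}_{Ri})\,\hat{\sigma}_{d_E d_{ER}}\right] \qquad (12)$$
$$\hat{\sigma}_{A_R A_{ER(R)}} = \sum_{i=1}^{m} 2\hat{p}_{Ri}\hat{q}_{Ri}\left[\hat{\sigma}_{a_R a_{ER}} + (\hat{q}_{Ri} - \hat{p}_{Ri})(\hat{q}_{Ei} - \hat{p}_{Ei})\,\hat{\sigma}_{d_R d_{ER}}\right] \qquad (13)$$
where $\hat{\sigma}_{a_X a_{ER}}$ and $\hat{\sigma}_{d_X d_{ER}}$ (X = E, R) are the covariances between the SNP effects for purebred and crossbred performance, derived from the corresponding genotypic covariances. Therefore, the genetic correlations between the purebred and crossbred breeding values in the Entrepelado and Retinto populations were computed as follows:
$$r_{A_E A_{ER(E)}} = \frac{\hat{\sigma}_{A_E A_{ER(E)}}}{\sqrt{\hat{\sigma}^2_{A_E}\,\hat{\sigma}^2_{A_{ER(E)}}}} \qquad (14)$$
$$r_{A_R A_{ER(R)}} = \frac{\hat{\sigma}_{A_R A_{ER(R)}}}{\sqrt{\hat{\sigma}^2_{A_R}\,\hat{\sigma}^2_{A_{ER(R)}}}} \qquad (15)$$
The vectors of the SNP additive effects ($\hat{\mathbf{a}}_E$, $\hat{\mathbf{a}}_R$ and $\hat{\mathbf{a}}_{ER}$) were obtained by backsolving [9], as
$$\hat{\mathbf{a}}_X = \mathbf{Z}'(\mathbf{Z}\mathbf{Z}')^{-1}\hat{\mathbf{u}}_X \qquad (16)$$
and the vectors of the SNP dominance effects ($\hat{\mathbf{d}}_E$, $\hat{\mathbf{d}}_R$ and $\hat{\mathbf{d}}_{ER}$) were obtained as follows:
$$\hat{\mathbf{d}}_X = \mathbf{W}'(\mathbf{W}\mathbf{W}')^{-1}\hat{\mathbf{v}}_X \qquad (17)$$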
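To make the purebred-crossbred correlation concrete, the sketch below computes it for one parental population from allelic frequencies and SNP-level (co)variances. It assumes that the allele substitution effect is a + (q − p)d, evaluated with the population's own frequencies for purebred performance and with the other parental population's frequencies for crossbred performance, as in the additive-dominance model of [8]; all names and toy values are hypothetical.

```python
from math import sqrt


def purebred_crossbred_correlation(p_own, p_other,
                                   var_a_pb, var_d_pb,
                                   var_a_cb, var_d_cb,
                                   cov_a, cov_d):
    """Additive genetic correlation between purebred and crossbred performance.

    p_own / p_other: A1 frequencies in the selected and in the other parental population.
    var_*_pb / var_*_cb: SNP-level variances for purebred / crossbred performance.
    cov_a / cov_d: SNP-level covariances between purebred and crossbred effects.
    """
    var_pb = var_cb = cov = 0.0
    for po, pt in zip(p_own, p_other):
        het = 2 * po * (1 - po)   # 2pq in the selected population
        k_own = 1 - 2 * po        # q - p in the selected population
        k_other = 1 - 2 * pt      # q - p in the other parental population
        var_pb += het * (var_a_pb + k_own ** 2 * var_d_pb)
        var_cb += het * (var_a_cb + k_other ** 2 * var_d_cb)
        cov += het * (cov_a + k_own * k_other * cov_d)
    return cov / sqrt(var_pb * var_cb)


# Hypothetical toy case: identical SNP effects in purebreds and crossbreds and
# identical frequencies in both populations give a correlation of one.
r_identical = purebred_crossbred_correlation([0.3, 0.6], [0.3, 0.6],
                                             0.1, 0.2, 0.1, 0.2, 0.1, 0.2)
```

Even with perfectly correlated SNP effects, the correlation falls below one as soon as the two parental populations differ in allelic frequencies and dominance variance is present, which is one way in which non-additive effects lower the purebred-crossbred correlation.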

With those, the additive genetic variances ($\sigma^2_{A_E}(k)$, $\sigma^2_{A_R}(k)$, $\sigma^2_{A_{ER(E)}}(k)$ and $\sigma^2_{A_{ER(R)}}(k)$) explained by the kth segment of the genome were calculated from the estimated SNP effects (Equations (18)–(21)), where n(k) is the number of SNP markers within the kth segment, which was set to 30 after testing segments of 20, 30 and 40 SNP markers. In order to identify the genes within the genomic regions that explained >2.0% of the total genetic variance, we used the BioMart tool (www.ensembl.org, accessed on 10 October 2021).
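One way to obtain the share of additive variance attributable to each genomic segment is sketched below. It assumes that the contribution of SNP i is 2pq times the squared allele substitution effect, with that effect computed from the backsolved additive and dominance SNP effects, and that segments are non-overlapping runs of consecutive SNPs; neither assumption is confirmed by the text, and the function names are hypothetical.

```python
def segment_variance_shares(p_own, p_other, a_hat, d_hat, size=30):
    """Percentage of additive variance explained by consecutive segments of `size` SNPs.

    For purebred performance pass p_other = p_own; for crossbred performance
    pass the A1 frequencies of the other parental population as p_other.
    """
    contributions = [
        2 * po * (1 - po) * (a + (1 - 2 * pt) * d) ** 2
        for po, pt, a, d in zip(p_own, p_other, a_hat, d_hat)
    ]
    total = sum(contributions)
    return [
        100.0 * sum(contributions[start:start + size]) / total
        for start in range(0, len(contributions), size)
    ]


def flag_segments(shares, threshold=2.0):
    """Indices of the segments explaining more than `threshold` percent of the variance."""
    return [k for k, share in enumerate(shares) if share > threshold]


# Hypothetical toy case: 4 SNPs in segments of 2, all variance carried by the first segment.
freqs = [0.5, 0.5, 0.5, 0.5]
shares = segment_variance_shares(freqs, freqs, [1.0, 1.0, 0.0, 0.0], [0.0] * 4, size=2)
highlighted = flag_segments(shares)
```

With 34,316 markers and segments of 30 SNPs there are roughly 1144 segments, so a segment explaining >2% of the variance carries more than twenty times the average share.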

Results and Discussion
The results based on TNB and NBA were similar, which was expected because these two traits have a high genetic correlation [13], and the raw correlation between them in the analyzed dataset was 0.94; therefore, we focused on the results for TNB, and the results for NBA are presented as Supplementary Information (Tables S1-S3 and Figure S1). The REML estimates of the additive genotypic (co)variances are shown in Tables 2 and 3 in TNB and NBA. The additive genotypic variance was higher in the crossbred population than in the purebred populations. This may be due to scale effects, as the phenotypic variation is also greater. In addition, the estimates of the genotypic covariances between the purebreds (Entrepelado and Retinto) and the crossbred population were all high and positive, and they corresponded to additive genotypic correlations of 0.704 (0.259/√(0.248 × 0.546)) between Entrepelado and Crossbred pigs, 0.988 (0.388/√(0.546 × 0.282)) between Retinto and Crossbred pigs, and 0.756 (0.200/√(0.248 × 0.282)) between the two purebreds. These results indicated that the genotype × environment interaction was small, and the additive genotypic correlations were similar to those obtained by Vitezica et al. [8] in white pig populations. The REML estimates of the dominance genotypic (co)variances ranged from 0.170 (Retinto) to 0.265 (Crossbred) (Table 3).
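The genotypic correlations quoted above follow directly from the reported (co)variances, as the short check below illustrates. The numbers are the rounded estimates given in the text; the dictionary layout and function name are only for illustration.

```python
from math import sqrt

# REML estimates of the additive genotypic (co)variances for TNB, as quoted in the text.
variance = {"E": 0.248, "R": 0.282, "ER": 0.546}
covariance = {("E", "ER"): 0.259, ("R", "ER"): 0.388, ("E", "R"): 0.200}


def genotypic_correlation(pair):
    """Correlation = covariance / sqrt(product of the two variances)."""
    x, y = pair
    return covariance[pair] / sqrt(variance[x] * variance[y])


correlations = {pair: genotypic_correlation(pair) for pair in covariance}
for (x, y), r in correlations.items():
    print(f"{x}-{y}: {r:.3f}")
```

The recomputed values agree with the reported correlations to within about 0.001; any small difference presumably reflects the rounding of the published (co)variance estimates.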
The estimates of the dominance genotypic covariances were all positive, and reflected genotypic dominance correlations >0.95. The analysis also provided the REML estimates of the sow permanent environmental and residual variances (Table 4). The residual variance ($\sigma^2_E$) was greater in the crossbred population than in the purebreds, consistent with the greater phenotypic variation. In contrast, the estimate of the sow permanent environmental variance ($\sigma^2_S$) was very low in the crossbred population. The additive and dominance genotypic (co)variances were used to calculate the additive and dominance genetic variances in the purebred populations based on expressions (1) to (5) (Table 3). The estimates of the additive genetic variances were 0.170 (Entrepelado) and 0.150 (Retinto), and the estimates of the dominance genetic variances were 0.074 (Entrepelado) and 0.056 (Retinto). The heritability estimates were calculated using Equation (10); they were 0.052 (Entrepelado) and 0.037 (Retinto), which were within the range of, or slightly lower than, those reported for white pigs [13–15] and for the same [5] or other Iberian [16,17] populations. The dominance ratios were obtained from Equation (11), and were 0.023 for Entrepelado and 0.014 for Retinto. They were smaller than the heritabilities but amounted to approximately 40% of them, a proportion that was higher than those reported for litter size in white pig populations [8,18] and similar to the results of Tusell et al. [19] for other swine traits.
We used Equations (6) and (7) to calculate the contributions of the purebred populations to the additive variance for crossbred performance, which were 0.413 (Entrepelado) and 0.293 (Retinto). Therefore, the additive genetic variance in the crossbred population was the average of the two (0.353), which was higher than the additive genetic variances in the purebred populations. This result was similar to those of Vitezica et al. [8] for litter size and of Tusell et al. [19] for other pig traits. Nevertheless, Xiang et al. [20] found the opposite in a cross between Landrace and Yorkshire breeds (0.86 and 0.54 in purebreds and 0.28 in crossbreds). In the present study, the dominance genetic variance in the crossbred population (0.079) was calculated based on Equation (9), and it was similar to the dominance genetic variance in the purebreds; however, its ratio to the additive genetic variance was lower (22%). Given those variance components, the heritability and dominance ratio estimates in the crossbred population were 0.072 and 0.016, respectively.
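The reported heritabilities, dominance ratios and dominance-to-additive ratios can be cross-checked from the figures quoted in the text. The sketch below assumes that the heritability and the dominance ratio share the same denominator (the total phenotypic variance), so that the dominance ratio equals the dominance variance divided by the total implied by the heritability; the names are hypothetical and the inputs are the rounded published values.

```python
# (additive genetic variance, dominance genetic variance, heritability) for TNB,
# as quoted in the text; the crossbred additive variance is the average of the
# two parental contributions.
reported = {
    "Entrepelado": (0.170, 0.074, 0.052),
    "Retinto": (0.150, 0.056, 0.037),
    "Crossbred": ((0.413 + 0.293) / 2, 0.079, 0.072),
}


def summarize(var_a, var_d, h2):
    """Derive the implied total variance and the dominance ratio from var_a, var_d and h2."""
    total = var_a / h2  # phenotypic variance implied by h2 = var_a / total
    return {
        "dominance_to_additive": var_d / var_a,
        "implied_total": total,
        "dominance_ratio": var_d / total,
    }


summary = {name: summarize(*values) for name, values in reported.items()}
for name, values in summary.items():
    print(name, {key: round(value, 3) for key, value in values.items()})
```

The derived dominance ratios round to the reported 0.023, 0.014 and 0.016, and the dominance-to-additive ratios are about 0.44 and 0.37 in the purebreds against 0.22 in the crossbred population, in line with the percentages given in the text.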
In addition, the additive genetic correlations between purebred and crossbred performances in the Entrepelado and Retinto populations were calculated based on expressions (12) to (15), and were 0.663 in the Entrepelado and 0.881 in the Retinto populations. Those correlations were within the range of the estimates summarized by Wientjes and Calus [7], and suggest that selection on purebred performance will improve crossbred performance more efficiently in Retinto than in Entrepelado pigs.
We used Equations (16) and (17) to calculate the additive and dominance genotypic effects associated with each of the 34,316 SNP markers, which were used in Equations (18)–(21) to calculate the proportion of the additive genetic variance that was explained by segments of 30 consecutive SNPs (Figure 1). The distributions of the additive variance explained by segments of 20 and 40 SNP markers were similar, and are presented as supplementary information (Figures S2 and S3).
Figure 1 presents the distribution of the additive variance along the autosomal chromosomes in the Entrepelado and Retinto populations for purebred and crossbred performance. Four genomic regions can be highlighted, each of which explained >2% of the additive genetic variance in at least one of the populations. The SNPs at the center of each of the genomic regions that explained the highest amount of additive genetic variance, and the genes in the Sus scrofa 11.1 genomic map that were within 1 Mb downstream or upstream, are presented in Table 5.
Table 5. SNPs at the center of each of the four genomic regions that explained >2% of the additive genetic variance in at least one of the populations, and the genes located within 1 Mb downstream or upstream.
Among those genes, several can be proposed as candidate genes to explain the additive genetic variation. The genomic region surrounding bp 7,597,405 in SSC6 included BCO1 (β-Carotene Oxygenase 1), which encodes an enzyme that catalyzes the breakdown of provitamin A and provides retinoids for embryogenesis [21,22]. Furthermore, GCSH (Glycine Cleavage System H) encodes a protein that plays an important role in embryonic viability [23].
Two genomic regions were identified in SSC8 around bp 11,585,865 and bp 137,540,516. Among the genes within those regions, PRDM8 (PR/SET Domain 8) is involved in neurogenesis [24], and FGF5 (Fibroblast Growth Factor 5) is a member of the fibroblast growth factor family that is involved in several biological processes, including embryonic development, cell growth, and morphogenesis [25,26].
The genomic region around bp 46,079,417 in SSC12 contains, among others, GIT1 (G protein-coupled receptor kinase interactor 1), which plays a role in spine morphogenesis [27]; NSRP1 (Nuclear Speckle Splicing Regulatory Protein 1), which is involved in developmental processes and in utero embryonic development [28]; and ANKRD1 (Ankyrin Repeat Domain 1), which is involved in neuron projection development [29].
The Gene Ontology (GO) terms for the biological processes for the proposed candidate genes are presented as Supplementary Table S4.

Conclusions
(1) The additive genetic variance and the heritability were higher in the crossbred than in the purebred populations; (2) the genetic correlation between purebred and crossbred performances was higher in Retinto than in Entrepelado pigs; and (3) the additive genetic variance was heterogeneously distributed throughout the autosomal genome, and four genomic regions in SSC6, SSC8, and SSC12 with several candidate genes were identified.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/genes13010012/s1. Table S1: REML estimates ± the standard error (SE) of the additive genotypic (co)variances for the number born alive (NBA). Table S2: REML estimates ± the standard error (SE) of the dominance genotypic (co)variances for the number born alive (NBA). Table S3: REML estimates ± the standard error (SE) of the permanent environmental and residual variances for the number born alive (NBA). Table S4: GO (Gene Ontology) terms for the biological process of the proposed candidate genes. Figure S1: Distribution of the percentage of the additive genetic variance explained by genomic segments of 30 SNPs within the autosomal genome of the purebred and crossbred performance for the number born alive (NBA) in the Entrepelado and Retinto varieties. Figure S2: Distribution of the percentage of the additive genetic variance explained by genomic segments of 20 SNPs within the autosomal genome of purebred and crossbred performance for the total number born (TNB) in the Entrepelado and Retinto varieties. Figure S3: Distribution of the percentage of the additive genetic variance explained by genomic segments of 40 SNPs within the autosomal genome of the purebred and crossbred performance for the total number born (TNB) in the Entrepelado and Retinto varieties.

Data Availability Statement:
The dataset used in this study will be available upon reasonable request to the corresponding author (lvarona@unizar.es).