Published May 1, 2023 | Version v1
Journal article Open

High-resolution structural variants catalogue in a large-scale whole genome sequenced bovine family cohort data

  • 1. Animal Breeding and Genomics, Wageningen University & Research, Wageningen, the Netherlands
  • 2. Unit of Animal Genomics, GIGA-R & Faculty of Veterinary Medicine, University of Liège, Liège, Belgium
  • 3. GIGA Institute, GIGA Genomics Platform, University of Liège, Liège, Belgium

Description

Background: Structural variants (SVs) are chromosomal segments that differ between genomes, such as deletions, duplications, insertions, inversions and translocations. The genomics revolution enabled the discovery of sub-microscopic SVs via array and whole-genome sequencing (WGS) data, paving the way to unravel the functional impact of SVs. Recent human expression QTL mapping studies demonstrated that SVs play a disproportionately large role in altering gene expression, underlining the importance of including SVs in genetic analyses. Therefore, this study aimed to generate and explore a high-quality bovine SV catalogue by exploiting a unique cattle family cohort dataset (266 samples in total, forming 127 trios).

Results: We curated 13,731 SVs (> 50 bp) segregating in the population, consisting of 12,201 deletions, 1,509 duplications, and 21 multi-allelic CNVs. We validated a subset of these copy number variants (CNVs) using a direct genotyping approach in an independent cohort, which indicated that at least 62% of the CNVs are true variants segregating in the population. Among gene-disrupting SVs, we prioritised two likely high-impact duplications, encompassing the ORM1 and POPDC3 genes, respectively. Liver expression QTL mapping results revealed that these duplications are likely causing altered gene expression, confirming the functional importance of SVs. Although most of the accurately genotyped CNVs are tagged by single nucleotide polymorphisms (SNPs) ascertained in WGS data, most CNVs were not captured by individual SNPs obtained from a 50K genotyping array.
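To make the notion of a CNV being "tagged" by a SNP concrete, the following is a minimal sketch, not the authors' pipeline. It assumes CNV genotypes are coded as copy-number dosages and SNP genotypes as allele dosages (0, 1, 2), and it treats a CNV as tagged when the best squared correlation (r²) with any single SNP reaches a threshold. The function names, the toy data, and the 0.8 threshold are illustrative assumptions; the abstract does not state the criterion used.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length dosage vectors."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("vectors must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # A monomorphic variant carries no LD information.
        return 0.0
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


def best_tag(cnv_dosage, snp_dosages, threshold=0.8):
    """Return (best SNP id, its r2, tagged flag) for one CNV.

    threshold=0.8 is an assumed cut-off, not a value taken from the study.
    """
    best_id, best_r2 = None, 0.0
    for snp_id, dosage in snp_dosages.items():
        r2 = r_squared(cnv_dosage, dosage)
        if r2 > best_r2:
            best_id, best_r2 = snp_id, r2
    return best_id, best_r2, best_r2 >= threshold


if __name__ == "__main__":
    # Hypothetical genotypes for six animals.
    cnv = [0, 1, 2, 1, 0, 2]
    snps = {
        "snp_a": [0, 1, 2, 1, 0, 2],  # perfectly correlated with the CNV
        "snp_b": [1, 1, 0, 2, 1, 1],  # weakly correlated with the CNV
    }
    print(best_tag(cnv, snps))
```

Under this scheme, a CNV counts as untagged on a sparse array simply because no array SNP nearby reaches the threshold, which is why denser WGS-derived SNP sets tag more CNVs than a 50K array.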

Conclusion: We generated a high-quality SV catalogue by exploiting unique whole-genome sequenced bovine family cohort data. Two high-impact duplications upregulating ORM1 and POPDC3 are putative candidates for postpartum feed intake and hoof health traits, thus warranting further investigation. Generally, CNVs were in low LD with SNPs on the 50K array. Hence, it remains crucial to incorporate CNVs via means other than tagging SNPs, such as investigation of tagging haplotypes, direct imputation of CNVs, or direct genotyping as done in the current study. The SV catalogue and the custom genotyping array generated in the current study will serve as valuable resources, accelerating utilisation of the full spectrum of genetic variants in bovine genomes.

Files

12864_2023_Article_9259.pdf

Files (6.1 MB)

md5:bb828bfb527abb9a2aafa9adfd528b2f (2.7 MB)
md5:de22904d9f76820737d4dacc32fcd709 (737.3 kB)
md5:cd60c9e4702ef2daba93ef6e77c9c72c (2.7 MB)
md5:f60baa17de44e4f8225f0e337b910ca9 (16.3 kB)

Additional details

Funding

GPLUSE – Genotype and Environment contributing to the sustainability of dairy cow production systems through the optimal integration of genomic selection and novel management protocols based on the development 613689
European Commission