De novo assembly of 20 chicken genomes reveals the undetectable phenomenon for thousands of core genes on micro-chromosomes and sub-telomeric regions
Creators
Description
The gene numbers and evolutionary rates of birds have been assumed to be much lower than those of mammals, in sharp contrast to the large number of bird species and their morphological diversity. It is therefore necessary to construct a complete avian genome and analyze its evolution. We constructed a chicken pan-genome from 20 de novo assembled genomes with high sequencing depth and identified 1,335 protein-coding genes and 3,011 long noncoding RNAs not found in GRCg6a. The majority of these novel genes were detected in the transcriptomes of most individuals examined but were seldom captured in any single DNA sequencing dataset, whether Illumina or PacBio. Furthermore, unlike in previous pan-genome models, most of these novel genes were overrepresented in chromosomal sub-telomeric regions and on micro-chromosomes, and were surrounded by extremely high proportions of tandem repeats, which strongly impede DNA sequencing. These hidden genes were shown to be shared by all chicken genomes, included many housekeeping genes, and were enriched in immune pathways. Comparative genomics revealed that the novel genes had substitution rates three-fold higher than those of known genes, updating current knowledge of evolutionary rates in birds. Our study provides a framework for constructing a better chicken genome, which will contribute to the understanding of avian evolution and the improvement of poultry breeding.
Files
(3.2 GB)
MD5 checksum | Size |
---|---|
md5:2df17812bd7fc86db52b457dc18062ec | 25.1 MB |
md5:03e8ab3f2347d0b6445c6d400a08c226 | 3.1 GB |
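
The MD5 checksums listed above can be used to confirm that a download completed without corruption. The following is a minimal sketch, not part of the original record: the function names `md5_of_file` and `matches_listed_checksum` are hypothetical, and because the file names are not shown in this listing, the path passed in is whatever name the downloaded file was saved under.

```python
import hashlib

# Checksums copied from the file listing above (the "md5:" prefix is dropped).
LISTED_MD5 = {
    "2df17812bd7fc86db52b457dc18062ec",  # 25.1 MB file
    "03e8ab3f2347d0b6445c6d400a08c226",  # 3.1 GB file
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reading keeps memory use small even for the 3.1 GB file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path):
    """Return True if the file's MD5 equals one of the listed checksums."""
    return md5_of_file(path) in LISTED_MD5
```

Calling `matches_listed_checksum` on the local path of a downloaded file returns `False` when the file is truncated or altered, in which case the download should be repeated.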