Published July 31, 2023 | Version v1
Dataset Open

Repetitive elements of Erebia and Carex and their genomic proportions

  • 1. University of Neuchâtel
  • 2. University of Jaén, University of South Bohemia
  • 3. University of Basel
  • 4. University of South Bohemia
  • 5. University of Seville

Description

Datasets of repetitive elements in the genomes of Erebia and Carex, detected and annotated with RepeatExplorer2 (Novák et al., 2010, 2020) from low-coverage (0.1X) short-read sequencing data. The data accompany the article "Holocentric repeat landscapes: From micro-evolutionary patterns to macro-evolutionary associations with karyotype evolution":

Cornet, C., Mora, P., Augustijnen, H., Nguyen, P., Escudero, M., & Lucek, K. (2023). Holocentric repeat landscapes: From micro-evolutionary patterns to macro-evolutionary associations with karyotype evolution. Molecular Ecology, 00, 1–19. https://doi.org/10.1111/mec.17100

Genus-level analyses covered 47 Erebia and 14 Carex species ("Erebia" and "Carex" folders).
In addition, individuals from different populations of 4 Erebia species ("Erebia cassioides", "Erebia tyndarus", "Erebia nivalis" and "Erebia pronoe" folders) were analysed at the species level.

Subfolders "Individuals" and "Comparative" represent the two modes in which RepeatExplorer2 was run: the individual mode identifies repeats in each sample separately, and the comparative mode identifies repeats in all samples simultaneously, allowing comparisons between individuals and species. 

Files named "CLUSTER_TABLE..." are raw RepeatExplorer2 output giving the total number of reads in each repeat cluster, together with the cluster's annotation.
Files named "COMPARATIVE_ANALYSIS_COUNTS..." are raw RepeatExplorer2 output from comparative mode, giving the number of reads in each cluster for each sample included in the analysis.
Files named "Genome_proportion..." give the genomic proportion of each repeat annotation, calculated as the proportion of reads that share that annotation.
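
The genomic proportion calculation described above can be illustrated with a minimal sketch. It assumes that the denominator is the total number of reads analysed in the run; the record does not state this explicitly, and Cornet et al. (2023) should be consulted for the exact procedure. The function name, the annotation labels and the read counts below are hypothetical and are not taken from the dataset.

```python
from collections import defaultdict


def genome_proportions(clusters, total_reads):
    """Sum reads per annotation and divide by the total number of reads.

    clusters: iterable of (annotation, read_count) pairs, one per cluster.
    total_reads: total number of reads analysed (assumed denominator).
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    reads_by_annotation = defaultdict(int)
    for annotation, read_count in clusters:
        reads_by_annotation[annotation] += read_count
    return {
        annotation: reads / total_reads
        for annotation, reads in reads_by_annotation.items()
    }


# Hypothetical toy input: four clusters, two of which share an annotation.
example_clusters = [
    ("LTR/Ty3_gypsy", 1200),
    ("Satellite", 300),
    ("LTR/Ty3_gypsy", 500),
    ("Unclassified", 1000),
]
example_total_reads = 100_000

proportions = genome_proportions(example_clusters, example_total_reads)
for annotation, proportion in sorted(proportions.items()):
    print(f"{annotation}\t{proportion:.4f}")
```

The same function could be applied per sample to the counts from a comparative run, using each sample's own read total as the denominator.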

Refer to Cornet et al. (2023) in Molecular Ecology for more details on how the data were generated, the downstream analyses and the sample names (see Tables S1, S2 and S3).

References:

Novák, P., Neumann, P., & Macas, J. (2010). Graph-based clustering and characterization of repetitive sequences in next-generation sequencing data. BMC Bioinformatics, 11(1), 378. https://doi.org/10.1186/1471-2105-11-378

Novák, P., Neumann, P., & Macas, J. (2020). Global analysis of repetitive DNA from unassembled sequence reads using RepeatExplorer2. Nature Protocols, 15(11), 3745–3776. https://doi.org/10.1038/s41596-020-0400-y

Notes

Additional funding: ME was supported by FEDER/MICINN – AEI (PID2021-122715NB-I00). PN was supported by the 23-06455S grant of the Czech Science Foundation. PM thanks the University of Jaén for its "Convocatoria de Recualificación del Sistema Universitario Español-Margarita Salas" postdoctoral grant under the "Plan de Recuperación Transformación" program funded by the Spanish Ministry of Universities with European Union's NextGenerationEU funds (grant no. UJAR10MS).

Files

Repeats.zip (6.0 MB)
md5:1c3e1d4a9c1c948075c6bf455ca61689
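
The MD5 checksum listed for Repeats.zip can be used to confirm that a downloaded copy is intact. The following is a minimal sketch using only the Python standard library; the helper names and the assumption that the archive is stored locally as "Repeats.zip" are illustrative and not part of the record.

```python
import hashlib

# Checksum as listed in this record for Repeats.zip.
EXPECTED_MD5 = "1c3e1d4a9c1c948075c6bf455ca61689"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_is_intact(path, expected=EXPECTED_MD5):
    """Compare the digest of a local file with the expected checksum."""
    return md5_of_file(path) == expected.lower()
```

For example, `archive_is_intact("Repeats.zip")` would return True for an uncorrupted download, assuming the file sits in the current working directory.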

Additional details

Funding

Swiss National Science Foundation
The evolution of strong reproductive barriers towards the completion of speciation PCEFP3_202869
Swiss National Science Foundation
Genomic Rearrangements and the Origin of Species 310030_184934