Repetitive elements of Erebia and Carex and their genomic proportions

Cornet, Camille; Mora, Pablo; Augustijnen, Hannah; Nguyen, Petr; Escudero, Marcial; Lucek, Kay

doi:10.5281/zenodo.8199371

Published July 31, 2023 | Version v1

Dataset Open

1. University of Neuchâtel
2. University of Jaén, University of South Bohemia
3. University of Basel
4. University of South Bohemia
5. University of Seville

Datasets of repetitive elements in the genomes of Erebia and Carex, detected and annotated with RepeatExplorer2 (Novák et al., 2010, 2020) from low-coverage (0.1X) short-read sequencing data. The data accompany the article "Holocentric repeat landscapes: from microevolutionary patterns to macroevolutionary associations with karyotype evolution":

Cornet, C., Mora, P., Augustijnen, H., Nguyen, P., Escudero, M., & Lucek, K. (2023). Holocentric repeat landscapes: From micro-evolutionary patterns to macro-evolutionary associations with karyotype evolution. Molecular Ecology, 00, 1–19. https://doi.org/10.1111/mec.17100

The genus-level analyses cover 47 Erebia and 14 Carex species ("Erebia" and "Carex" folders).
In addition, individuals from different populations of 4 Erebia species ("Erebia cassioides", "Erebia tyndarus", "Erebia nivalis" and "Erebia pronoe" folders) were analysed at the species level.

Subfolders "Individuals" and "Comparative" represent the two modes in which RepeatExplorer2 was run: the individual mode identifies repeats in each sample separately, and the comparative mode identifies repeats in all samples simultaneously, allowing comparisons between individuals and species.

Files named "CLUSTER_TABLE..." are the raw output of RepeatExplorer2 and give the overall number of reads in each cluster of repetitive elements, together with the cluster annotations.
Files named "COMPARATIVE_ANALYSIS_COUNTS..." are the raw output of RepeatExplorer2 in comparative mode and give the number of reads in each cluster for each sample included in the analysis.
Files named "Genome_proportion..." contain the genomic proportion of each repeat annotation, calculated as the proportion of reads with the same annotation.
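
The genomic proportion calculation described above can be sketched as follows. This is a minimal illustration, not the authors' script: it assumes that each cluster is available as an (annotation, read count) pair and that the denominator is the total number of reads analysed, which must be supplied by the user. The function name, the annotation labels and the numbers are made up for the example and are not taken from the dataset.

```python
from collections import defaultdict


def genome_proportions(clusters, total_reads):
    """Sum read counts per annotation and divide by the total read count.

    clusters    -- iterable of (annotation, n_reads) pairs, one per cluster
    total_reads -- total number of reads analysed (assumed denominator)
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    reads_per_annotation = defaultdict(int)
    for annotation, n_reads in clusters:
        reads_per_annotation[annotation] += n_reads
    return {a: n / total_reads for a, n in reads_per_annotation.items()}


# Illustrative values only; labels and counts are hypothetical.
example_clusters = [
    ("LTR", 1200),
    ("satellite", 300),
    ("LTR", 500),
    ("unclassified", 1000),
]
proportions = genome_proportions(example_clusters, total_reads=100_000)
for annotation, proportion in sorted(proportions.items()):
    print(f"{annotation}\t{proportion:.4f}")
```

The same function could be applied per sample to the comparative counts by passing one sample's read counts at a time, again under the assumption that the per-sample total read count is known.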

Refer to Cornet et al. (2023) in Molecular Ecology for more details on how the data was generated, the downstream analyses and the sample names (see Tables S1, S2 and S3).

References:

Novák, P., Neumann, P., & Macas, J. (2010). Graph-based clustering and characterization of repetitive sequences in next-generation sequencing data. BMC Bioinformatics, 11(1), 378. https://doi.org/10.1186/1471-2105-11-378

Novák, P., Neumann, P., & Macas, J. (2020). Global analysis of repetitive DNA from unassembled sequence reads using RepeatExplorer2. Nature Protocols, 15(11), 3745–3776. https://doi.org/10.1038/s41596-020-0400-y

Notes

Additional funding: ME was supported by FEDER/MICINN – AEI (PID2021-122715NB-I00). PN was supported by the 23-06455S grant of the Czech Science Foundation. PM thanks the University of Jaén for its "Convocatoria de Recualificación del Sistema Universitario Español-Margarita Salas" postdoctoral grant under the "Plan de Recuperación Transformación" program funded by the Spanish Ministry of Universities with European Union's NextGenerationEU funds (grant no. UJAR10MS).

Files

Name	Size	MD5
Repeats.zip	6.0 MB	1c3e1d4a9c1c948075c6bf455ca61689
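
The MD5 checksum listed for Repeats.zip can be used to check that a download is intact. The sketch below uses only the Python standard library; the function names and the local file path are assumptions chosen for illustration.

```python
import hashlib

# Checksum as listed in this record for Repeats.zip.
EXPECTED_MD5 = "1c3e1d4a9c1c948075c6bf455ca61689"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()
```

For example, `verify_download("Repeats.zip")` would return True for an uncorrupted copy saved under that (assumed) local name.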

Additional details

Funding

Swiss National Science Foundation: The evolution of strong reproductive barriers towards the completion of speciation (grant PCEFP3_202869)
Swiss National Science Foundation: Genomic Rearrangements and the Origin of Species (grant 310030_184934)

