Published February 19, 2026 | Version v1
Dataset
Open
Investigating the topological motifs of inversions in pangenome graphs
Authors/Creators
Description
Simulated datasets used in the study "Investigating the topological motifs of inversions in pangenome graphs"
This repository contains several simulated datasets generated to study inversion representation and detection in pangenome graphs.
All datasets were obtained by mutating human chromosome 21 (excluding the peri-centromeric and telomeric regions) with SNPs and inversions. Each dataset contains simulated haplotype sequences in FASTA format, together with VCF files listing the inversions present in the simulated sequences.
This repository contains 3 datasets:
- simulated_2hap_100INV: simulated haplotypes used to generate graphs with 2 haplotypes, all carrying the same set of 100 inversions but with different levels of SNP polymorphism.
- simulated_10hap: a set of 9 haplotypes used to generate graphs with 10 haplotypes. Each haplotype contains a different subset of the same 100 inversions as in simulated_2hap_100INV and is mutated with 1% SNP divergence.
- simulated_2hap_inversion_density: simulated haplotypes used to generate graphs with 2 haplotypes, with different numbers of inversions.
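The inversion VCF files described above can be inspected with a few lines of Python. The following is a minimal sketch, not the authors' tooling: it assumes that inversions are annotated in the INFO column with `SVTYPE=INV` and an `END` coordinate, which is a common VCF convention but is not confirmed by this record. The function name `read_inversions` and the sample records are hypothetical and serve only as an illustration; check the header of the actual files before relying on these field names.

```python
import io


def read_inversions(handle):
    """Yield (chrom, start, end) for each VCF record annotated SVTYPE=INV.

    Assumption: inversions carry SVTYPE=INV and END=<pos> in the INFO column.
    """
    for line in handle:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, info = fields[0], int(fields[1]), fields[7]
        # Turn "KEY=VALUE;FLAG" into a dictionary; flags get an empty value.
        info_map = dict(
            item.split("=", 1) if "=" in item else (item, "")
            for item in info.split(";")
        )
        if info_map.get("SVTYPE") != "INV":
            continue
        end = int(info_map["END"]) if "END" in info_map else None
        yield chrom, pos, end


# Hypothetical records for illustration; these are not taken from the datasets.
sample_vcf = io.StringIO(
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "chr21\t15000000\tinv1\tN\t<INV>\t.\tPASS\tSVTYPE=INV;END=15002500\n"
    "chr21\t20000000\tsnp1\tA\tG\t.\tPASS\t.\n"
)
inversions = list(read_inversions(sample_vcf))
```

To use it on a real file, replace the `io.StringIO` object with an open text file handle, for example `open(path)` for an uncompressed VCF or `gzip.open(path, "rt")` for a compressed one.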
Reference: Romain, S., Dubois, S., Legeai, F., & Lemaitre, C. (2026). Investigating the topological motifs of inversions in pangenome graphs. bioRxiv, 2026-02. https://www.biorxiv.org/content/10.1101/2025.03.14.643331v2
Files
simulated_2hap_100INV.zip
Additional details
Dates
- Available: 2026-02-19
Software
- Repository URL: https://github.com/SandraLouise/INVPG_annot_paper/