Published May 24, 2026 | Version v1
Dataset Open

First genome assemblies and chromosome-level pseudomolecules of Tunisian durum wheat landraces Chili and Mahmoudi

  • 1. University of Sfax
  • 2. GenoFlow

Description

This dataset contains the first genome assemblies and chromosome-level pseudomolecules for two Tunisian durum wheat (Triticum turgidum subsp. durum, 2n=4x=28, AABB genome) landraces: Chili and Mahmoudi.

Four FASTA files are provided:

1. chili_assembly.fsa : Primary scaffold assembly for landrace Chili, assembled from PacBio HiFi reads with hifiasm v0.25.0 and scaffolded with Hi-C data using YaHS v1.2a.2. NCBI Assembly accession: DBNSEI000000000.

2. mahmoudi_assembly.fsa : Primary scaffold assembly for landrace Mahmoudi, generated using the same pipeline. NCBI Assembly accession: DBNSEJ000000000.

3. chili_all_chromosomes.fa : Chromosome-level pseudomolecules for Chili (14 chromosomes: 1A, 1B, 2A, 2B, 3A, 3B, 4A, 4B, 5A, 5B, 6A, 6B, 7A, 7B, plus unassigned scaffolds as Chr00). Constructed using sourmash v4.9.4 for chromosome assignment and RagTag v2.1.0 for scaffold ordering and orientation against the Svevo.v2 reference (IWGSC RefSeq v2.1).

4. mahmoudi_all_chromosomes.fa : Chromosome-level pseudomolecules for Mahmoudi (same chromosome composition as above). The Mahmoudi assembly contains a homeologous chromosome fusion (2B-3B) that was preserved as a single pseudomolecule.
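As a quick sanity check after download, the pseudomolecule files can be summarized by sequence name and length. The following is a minimal sketch, not part of the dataset's own tooling. It assumes the FASTA files are uncompressed and that the header names contain the chromosome labels listed above (for example `1A` or `Chr1A`); the actual header format is not documented here, so the substring matching is an assumption.

```python
from typing import Dict, List

# Chromosome labels taken from the description above (1A, 1B, ..., 7A, 7B).
# Whether the FASTA headers contain exactly these strings is an assumption.
EXPECTED_CHROMOSOMES = [f"{n}{g}" for n in range(1, 8) for g in "AB"]


def fasta_lengths(path: str) -> Dict[str, int]:
    """Return {sequence_name: length}, reading the file line by line
    so that multi-gigabyte assemblies are never held in memory."""
    lengths: Dict[str, int] = {}
    name = None
    with open(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # The first whitespace-delimited token is used as the name.
                tokens = line[1:].split()
                name = tokens[0] if tokens else ""
                lengths[name] = 0
            elif name is not None:
                lengths[name] += len(line)
    return lengths


def missing_chromosomes(lengths: Dict[str, int],
                        expected: List[str] = EXPECTED_CHROMOSOMES) -> List[str]:
    """List expected labels that appear in no sequence name (substring match)."""
    return [c for c in expected if not any(c in name for name in lengths)]
```

For example, `missing_chromosomes(fasta_lengths("chili_all_chromosomes.fa"))` would return an empty list if every expected label is found. Because matching is by substring, a fused pseudomolecule whose name contains both `2B` and `3B` would satisfy both labels.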

Quality metrics:
- Chili: QV 68.0, k-mer completeness 98.42%, BUSCO 99.4%
- Mahmoudi: QV 68.3, k-mer completeness 98.34%, BUSCO 99.3%
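To put the QV values in context, the sketch below converts them to per-base error rates. It assumes the reported QV is a Phred-scaled consensus quality, QV = -10 · log10(error rate); the record does not name the tool used to compute it, so treat this interpretation as an assumption.

```python
import math


def qv_to_error_rate(qv: float) -> float:
    """Convert a Phred-scaled quality value to a per-base error probability."""
    return 10 ** (-qv / 10)


def error_rate_to_qv(error_rate: float) -> float:
    """Inverse conversion: per-base error probability to Phred-scaled QV."""
    return -10 * math.log10(error_rate)


if __name__ == "__main__":
    # QV values copied from the quality metrics listed above.
    for landrace, qv in {"Chili": 68.0, "Mahmoudi": 68.3}.items():
        rate = qv_to_error_rate(qv)
        print(f"{landrace}: QV {qv} -> {rate:.2e} errors per base")
```

Under this assumption, both assemblies correspond to roughly 1.5 consensus errors per 10 million bases.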

Sequencing data source:
The PacBio HiFi and Hi-C sequencing data used for assembly were obtained from the public Durum Genome Project (OpenDurumGPT) consortium dataset (BioProject: PRJNA1467186).

Reference genome:
Svevo.v2 (Triticum turgidum subsp. durum cv. Svevo, release 2, IWGSC RefSeq v2.1) was used as the reference for chromosome assignment and pseudomolecule construction.

Files

Files (43.0 GB)

Checksum (MD5)                          Size
md5:a18ffc6df23d6117e19e8e474e4eef2a    10.2 GB
md5:e9ae01a2201123f266072f3566359331    11.0 GB
md5:397e7c3f1e94db4381d06549b7d09155    10.8 GB
md5:91b50e4c277913af6369fe29f7a94754    10.9 GB
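The MD5 checksums listed above can be used to confirm that a download completed without corruption. The sketch below is one way to do that with the Python standard library. Because this listing does not pair each checksum with a file name, the helper only checks that a file's digest matches one of the four published values; the function names are illustrative.

```python
import hashlib

# Checksums copied from the file listing above. The mapping from checksum
# to file name is not shown there, so membership in the set is tested instead.
PUBLISHED_MD5 = {
    "a18ffc6df23d6117e19e8e474e4eef2a",
    "e9ae01a2201123f266072f3566359331",
    "397e7c3f1e94db4381d06549b7d09155",
    "91b50e4c277913af6369fe29f7a94754",
}


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file in 1 MiB chunks,
    so files of 10 GB or more do not need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path: str) -> bool:
    """Return True if the file's MD5 equals any of the published checksums."""
    return md5_of_file(path) in PUBLISHED_MD5
```

For example, `matches_published_checksum("chili_assembly.fsa")` should return `True` for an intact download of that file.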

Additional details

Related works

Is supplement to:
Dataset: PRJNA1467186 (Other)
Dataset: DBNSEI000000000 (Other)
Dataset: DBNSEJ000000000 (Other)

References:
Preprint: https://www.biorxiv.org/content/10.64898/2026.05.08.723814v2 (Other)