Published November 16, 2022 | Version v1
Dataset Open

Supporting Data for "Pushing the limits of HiFi assemblies reveals centromere diversity between two Arabidopsis thaliana genomes"

  • 1. Department of Molecular Biology, Max Planck Institute for Biology Tübingen, 72076 Tübingen, Germany.
  • 2. Genomics Technologies, Corteva Agriscience, Johnston, IA 50131, USA.
  • 3. Department of Plant Sciences, University of Cambridge, Cambridge, CB2 3EA, United Kingdom.

Description

This dataset contains supporting files referenced by the following publication:

• Rabanal FA, Gräff M, Lanz C, Fritschi K, Llaca V, Lang M, Carbonell-Bejerano P, Henderson I, Weigel D. Pushing the limits of HiFi assemblies reveals centromere diversity between two Arabidopsis thaliana genomes. Nucleic Acids Research. doi: 10.1093/nar/gkac1115

 

Directory structure:

  • Bionano_optical_maps_based_assemblies: this directory contains the results of Bionano optical-map-based scaffolding for each of the main long-read assemblers analysed in the study, for Arabidopsis thaliana accession Ey15-2 (9994):
    • 9994.CLR_Canu
    • 9994.HiFi_FALCON
    • 9994.HiFi_HiCanu
    • 9994.HiFi_Hifiasm
    • 9994.HiFi_IPA
    • 9994.HiFi_Peregrine

 

  • Col-0_HiFi-Hifiasm_assembly: this directory contains the PacBio HiFi-based chromosome-level assembly (fasta file) and repeat annotation (gff file) of Arabidopsis thaliana accession Col-0 (6909).

 

  • Ey15-2_HiFi-Hifiasm_plus_CLR-Canu_assembly: this directory contains the PacBio HiFi+CLR-based chromosome-level assembly (fasta file) and repeat annotation (gff file) of Arabidopsis thaliana accession Ey15-2 (9994).

 

  • Naish2021_Wang2021_repeat_annotation: this directory contains the repeat annotation (gff files) for the Arabidopsis thaliana Col-0 (6909) assemblies performed by Naish et al. (doi: 10.1126/science.abi7489) and Wang et al. (doi: 10.1016/j.gpb.2021.08.003). 

 

  • TAIR10_masked: this directory contains the repeat-hard-masked version of the TAIR10 Arabidopsis thaliana Col-0 (6909) reference genome that was used for in silico scaffolding of contigs with RagTag (https://github.com/malonge/RagTag).
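
After unpacking, the directory layout listed above can be sanity-checked with a short script. The following is a minimal sketch in Python, not part of the dataset. It assumes that the five directory names appear as path components inside SupportingData_A.thaliana_CLR_vs_HiFi.zip exactly as written in this description; the function names are hypothetical.

```python
import zipfile

# Directory names copied from the description above.
EXPECTED_DIRS = [
    "Bionano_optical_maps_based_assemblies",
    "Col-0_HiFi-Hifiasm_assembly",
    "Ey15-2_HiFi-Hifiasm_plus_CLR-Canu_assembly",
    "Naish2021_Wang2021_repeat_annotation",
    "TAIR10_masked",
]


def missing_directories(member_names, expected=EXPECTED_DIRS):
    """Return the expected directory names that never occur as a path
    component in the given archive member names."""
    seen = set()
    for name in member_names:
        # Zip member names always use "/" as the separator.
        seen.update(part for part in name.split("/") if part)
    return [d for d in expected if d not in seen]


def check_archive(zip_path):
    """Open the zip file (assumed to be the downloaded archive) and
    report which expected directories are missing from it."""
    with zipfile.ZipFile(zip_path) as archive:
        return missing_directories(archive.namelist())
```

Calling `check_archive("SupportingData_A.thaliana_CLR_vs_HiFi.zip")` should return an empty list if all five directories are present. Matching on any path component, rather than only the first, keeps the check valid whether or not the archive wraps everything in a single root folder.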

 

Files

SupportingData_A.thaliana_CLR_vs_HiFi.zip (1.6 GB)
md5:fd3b9bd2f12195aa036a16d2b25d660a
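
The MD5 checksum listed for the archive can be used to confirm that the 1.6 GB download completed intact. The following is a minimal sketch in Python using only the standard library. The function names and the chunk size are illustrative choices, and the file is read in chunks so that the whole archive never has to be held in memory.

```python
import hashlib

# Checksum as listed for SupportingData_A.thaliana_CLR_vs_HiFi.zip.
EXPECTED_MD5 = "fd3b9bd2f12195aa036a16d2b25d660a"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading 1 MiB at a time."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()
```

For example, `verify_download("SupportingData_A.thaliana_CLR_vs_HiFi.zip")` returns `True` only if the local file matches the listed checksum; a `False` result suggests the download should be repeated.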

Additional details

Related works

Is referenced by
Journal article: 10.1093/nar/gkac1115 (DOI)