De novo genome assembly of Gentiana triflora and Gentiana scabra
Description
We performed de novo genome assembly of Gentiana triflora and Gentiana scabra using Nanopore long reads. Both species are typically heterozygous, so haploid individuals were generated for each species following the method described by Doi et al. (2013), and their genomes were sequenced on a Nanopore sequencer. The raw sequencing data in FAST5 format were base-called using Guppy v4.2.2 (Oxford Nanopore Technologies, Oxford, UK). Genome assembly was performed with Flye v2.8.1 (Kolmogorov et al., 2019), followed by two rounds of error correction with Racon v1.4.13 (Vaser et al., 2017). Misassemblies were corrected using Medaka v1.2.1 (https://github.com/nanoporetech/medaka). Further consensus correction was conducted using short reads obtained from the DNBSEQ platform, which were mapped with minimap2 v2.17 (Li, 2018) and used for polishing with HyPo v1.0.3 (Kundu et al., 2019). Finally, artificial contigs were removed using Purge Haplotigs v1.0.4 (Roach et al., 2018).
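The long-read part of this workflow (Flye assembly, two Racon rounds, then Medaka) could be laid out roughly as in the sketch below. The description gives only tool names and versions, so everything else here is an assumption: the file names, the thread count, the use of minimap2 with the `map-ont` preset to produce overlaps for Racon, and the decision to leave the Medaka model at its default. The short-read HyPo step and the Purge Haplotigs step are omitted. The function only builds the command lines and does not run them.

```python
from pathlib import Path


def build_long_read_commands(reads, workdir, threads=16, racon_rounds=2):
    """Return a list of (argv, stdout_path) steps; stdout_path is None
    when the tool writes its own output files.

    All paths and the thread count are hypothetical placeholders.
    """
    work = Path(workdir)
    steps = []

    # Flye writes its final contigs to <out-dir>/assembly.fasta.
    flye_dir = work / "flye"
    steps.append((
        ["flye", "--nano-raw", reads,
         "--out-dir", str(flye_dir), "--threads", str(threads)],
        None,
    ))
    draft = flye_dir / "assembly.fasta"

    # Each Racon round needs fresh read-to-draft overlaps (assumed here
    # to come from minimap2) and prints the polished contigs to stdout.
    for i in range(1, racon_rounds + 1):
        paf = work / f"racon{i}.paf"
        polished = work / f"racon{i}.fasta"
        steps.append((
            ["minimap2", "-x", "map-ont", "-t", str(threads),
             str(draft), reads],
            paf,
        ))
        steps.append((
            ["racon", "-t", str(threads), reads, str(paf), str(draft)],
            polished,
        ))
        draft = polished

    # Medaka consensus on the Racon-polished draft (default model assumed).
    steps.append((
        ["medaka_consensus", "-i", reads, "-d", str(draft),
         "-o", str(work / "medaka"), "-t", str(threads)],
        None,
    ))
    return steps


if __name__ == "__main__":
    for argv, out in build_long_read_commands("reads.fastq", "work"):
        print(" ".join(argv) + (f" > {out}" if out else ""))
```

Returning the steps as data rather than executing them keeps the sketch easy to inspect and adapt; a real run would also need to pin the tool versions listed above.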
Table. Genome assembly statistics of G. triflora and G. scabra
| Genome assembly statistic | Gentiana_triflora_draft_v1.0.fasta | Gentiana_scabra_draft_v1.0.fasta |
| Total length (bp) | 3,657,985,820 | 3,786,908,358 |
| Number of contigs | 4,374 | 9,719 |
| Length of the largest contig (bp) | 13,845,662 | 6,219,710 |
| Average length of contigs (bp) | 836,302 | 389,640 |
| N50 length (bp) | 2,702,773 | 946,422 |
| GC content | 37.69% | 37.42% |
| Percentage of complete genes identified using BUSCO v4.1.4 with embryophyta_odb10 | 95.4% | 94.8% |
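For readers who want to recompute the length-based statistics in the table from the FASTA files, the following is a minimal sketch. It assumes the usual definition of N50 (the length of the contig at which the cumulative length of contigs, sorted from longest to shortest, first reaches half of the total assembly length). The function names are hypothetical, and the tool the authors used to produce the table is not stated.

```python
def fasta_lengths(path):
    """Return the sequence length of every record in a FASTA file."""
    lengths = []
    current = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if current is not None:
                    lengths.append(current)
                current = 0
            elif current is not None:
                current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def assembly_stats(lengths):
    """Compute total length, contig count, largest, mean, and N50."""
    if not lengths:
        raise ValueError("no contigs given")
    total = sum(lengths)
    running = 0
    n50 = None
    for length in sorted(lengths, reverse=True):
        running += length
        # First contig at which the cumulative length reaches half the total.
        if running * 2 >= total:
            n50 = length
            break
    return {
        "total_length": total,
        "num_contigs": len(lengths),
        "largest_contig": max(lengths),
        "mean_length": total / len(lengths),
        "n50": n50,
    }


if __name__ == "__main__":
    # Toy example: total is 32, and the two 8 bp contigs already cover 16,
    # so the N50 is 8.
    print(assembly_stats([2, 2, 2, 3, 3, 4, 8, 8]))
```

As a consistency check on the table itself, the average contig length equals the total length divided by the number of contigs: 3,657,985,820 / 4,374 ≈ 836,302 for G. triflora and 3,786,908,358 / 9,719 ≈ 389,640 for G. scabra.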