Published April 21, 2023 | Version v1
Dataset Open

Dataset for WGS of TiLV using Nanopore

  • 1. WorldFish (Penang)
  • 2. Patriot Biotech Sdn. Bhd (Kuala Lumpur)
  • 3. Centex Shrimp, BIOTEC, NSTDA (Bangkok)
  • 4. Asian Institute of Technology (Bangkok)

Description

This zip file contains the scripts, initial fastq files, assembled genomes (both public and from this study), and intermediate bioinformatics files used in this study.

File Structure and Descriptions

├── 01.Filter.sh : Script to perform read filtering from raw fastq.gz
├── 02.RefGenome_Assembly.sh : Script to perform reference-mapping genome assembly from the trimmed fastq file
├── 03.Cleanup.sh : Script to reorganize folders and clean up intermediate files
├── 04.GapAnalysis.sh : Script to extract individual viral segments and perform QUAST analysis to calculate the number of gaps
├── 05.Phylogenetic.sh : Script to combine TiLV genomes from public databases (available in the Phylogenetic folder) and generate a maximum likelihood tree
├── backup : Raw Fastq (basecalled with Guppy super accuracy mode)
├── Consensus : Assembled genome in fasta format
├── Coverage : Contig coverage information
├── Filtered_Fastq : Quality- and length-filtered fastq used for generating the genome assembly
├── Filtered_Segment.fasta : Viral segments from samples, filtered for high completeness (> 80% of the genome without gaps)
├── GapAnalysis.tsv : Table containing the gap information for each assembled viral segment of each sample
├── Normalised_Bam : BAM alignment files used for variant calling
├── Phylogenetic : Contains crucial whole genome sequences of publicly available viruses downloaded from NCBI
├── primer-schemes : Contains the reference sequence used for reference-based mapping
├── Raw_Bam : Raw BAM alignment files prior to normalisation, used to estimate the read depth observed for each sample and each viral segment
├── ReadDepth.tsv : Table showing the read depth of each sample and its viral segments
├── Sequencing_Stat.tsv : Sequencing statistics of samples before and after length/quality filtering with NanoFilt
└── TILV.tre : Newick file containing the maximum likelihood tree generated with FastTree (-gtr -nt)

8 directories, 10 files
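
The completeness criterion described for Filtered_Segment.fasta (> 80% of the genome without gaps) can be illustrated with a short Python sketch. This is not the authors' 04.GapAnalysis.sh, which relies on QUAST. It assumes that gaps appear as `N` or `-` characters in the consensus FASTA, and the function names `read_fasta`, `completeness` and `filter_segments` are hypothetical helpers introduced only for this example.

```python
import io
from typing import Dict, Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (record_name, sequence) pairs from a FASTA text handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            name, chunks = (header[0] if header else ""), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def completeness(seq: str) -> float:
    """Fraction of positions that are not gaps (assumed here to be N or -)."""
    if not seq:
        return 0.0
    gaps = sum(1 for base in seq.upper() if base in "N-")
    return 1.0 - gaps / len(seq)


def filter_segments(handle: TextIO, threshold: float = 0.80) -> Dict[str, str]:
    """Keep records whose gap-free fraction is strictly above the threshold."""
    return {
        name: seq
        for name, seq in read_fasta(handle)
        if completeness(seq) > threshold
    }


if __name__ == "__main__":
    # Toy input: seg1 has no gaps, seg2 is mostly N and is dropped.
    demo = io.StringIO(">seg1\nACGTACGTAC\n>seg2\nACNNNNNNNN\n")
    for name, seq in filter_segments(demo).items():
        print(name, len(seq), round(completeness(seq), 2))
```

To try it on the real data, pass an open file handle for one of the FASTA files in the Consensus folder instead of the toy input. The gap counts it produces may differ from those in GapAnalysis.tsv if QUAST defines gaps differently.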

Files

TILV_Dataset.zip (518.9 MB)
md5:091ea42de911508b3fa470d17c52c354
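
After downloading, the archive can be checked against the MD5 checksum listed above. The following Python sketch uses only the standard library. The local path `TILV_Dataset.zip` is an assumption about where the file was saved, and `md5_of_file` and `verify` are hypothetical helper names.

```python
import hashlib
import sys

# Checksum as published on the record page for TILV_Dataset.zip.
EXPECTED_MD5 = "091ea42de911508b3fa470d17c52c354"


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: str, expected: str = EXPECTED_MD5) -> bool:
    """True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed download location; pass another path as the first argument.
    archive = sys.argv[1] if len(sys.argv) > 1 else "TILV_Dataset.zip"
    print("checksum OK" if verify(archive) else "checksum MISMATCH")
```

Reading in chunks keeps memory use low for a file of this size. A mismatch usually indicates an incomplete or corrupted download.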