Published April 24, 2020 | Version v1
Dataset Open

Genome sequence of the banana aphid, Pentalonia nigronervosa Coquerel (Hemiptera: Aphididae) and its symbionts

  • 1. John Innes Centre
  • 2. International Institute of Tropical Agriculture

Description

Pentalonia nigronervosa v1 frozen release

Genome assembly: Pentalonia_nigronervosa.v1.scaffolds.fa.gz

BRAKER2 gene models: Pentalonia_nigronervosa.v1.scaffolds.gff

BRAKER2 protein sequences: Pentalonia_nigronervosa.v1.scaffolds.gff.aa.fa

BRAKER2 protein sequences (longest transcript per gene only): Pentalonia_nigronervosa.v1.scaffolds.gff.aa.LTPG.fa

BRAKER2 coding sequences: Pentalonia_nigronervosa.v1.scaffolds.gff.cds.fa

InterProScan functional annotation: Pentalonia_nigronervosa.v1.scaffolds.gff.aa.LTPG.interproscan.tsv

Pentalonia nigronervosa v1 mitochondrial genome: Pentalonia_nigronervosa.v1.mt_genome.fa

Buchnera aphidicola (BPn) scaffolds: Buchnera_aphidicola_BPn.scaffolds.fa

Wolbachia (WolPenNig) scaffolds: Wolbachia_WolPenNig.scaffolds.fa
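
The "longest transcript per gene only" protein set listed above can be approximated from the full protein FASTA. The sketch below is one possible way to do it, not necessarily how the released LTPG file was produced. It assumes BRAKER2-style transcript IDs of the form `g1.t1` (gene ID followed by `.t<number>`); the function name `longest_per_gene` is hypothetical.

```sh
#!/bin/sh
# Keep the longest protein per gene from a FASTA file (or stdin).
# ASSUMPTION: transcript IDs look like "g1.t1", so the gene ID is the
# transcript ID with the trailing ".t<number>" removed.
longest_per_gene() {
    awk '
        # Store the current record if it is the longest seen for its gene.
        function keep_longest(   gene) {
            if (id == "") return
            gene = id
            sub(/\.t[0-9]+$/, "", gene)
            if (!(gene in best_seq)) {
                order[++n] = gene          # remember first-seen gene order
                best_id[gene] = id
                best_seq[gene] = seq
            } else if (length(seq) > length(best_seq[gene])) {
                best_id[gene] = id
                best_seq[gene] = seq
            }
        }
        /^>/ { keep_longest(); id = substr($1, 2); seq = ""; next }
        { seq = seq $0 }                   # join wrapped sequence lines
        END {
            keep_longest()
            for (i = 1; i <= n; i++)
                printf ">%s\n%s\n", best_id[order[i]], best_seq[order[i]]
        }
    ' "$@"
}

# Demo on made-up records (not taken from the dataset):
printf '>g1.t1\nMKV\n>g1.t2\nMKVLAG\n>g2.t1\nMA\n' | longest_per_gene
# prints:
# >g1.t2
# MKVLAG
# >g2.t1
# MA
```

When two transcripts of a gene have the same length, this sketch keeps the one that appears first in the input.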

Myzus cerasi v1.2 frozen release

Genome assembly: Myzus_cerasi.v1.2.scaffolds.fa

BRAKER2 gene models: Myzus_cerasi.v1.2.scaffolds.gff

BRAKER2 protein sequences: Myzus_cerasi.v1.2.scaffolds.gff.aa.fa

BRAKER2 protein sequences (longest transcript per gene only): Myzus_cerasi.v1.2.scaffolds.gff.aa.LTPG.fa

BRAKER2 coding sequences: Myzus_cerasi.v1.2.scaffolds.gff.cds.fa

Aphid orthogroups and species tree

Proteomes included in the analysis: proteomes.tar.gz

Orthogroups: Orthogroups.txt

Gene counts per orthogroup, per species: Orthogroups.GeneCount.csv

Single copy conserved orthogroups used for species tree: Orthogroups_for_concatenated_alignment.txt

Species tree alignment: SpeciesTreeAlignment.fa

Rooted species tree: SpeciesTree_rooted.nwk
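
The per-species gene counts can be used to list candidate single-copy orthogroups, similar in spirit to the set used for the species tree. This is a minimal sketch under stated assumptions: it assumes that Orthogroups.GeneCount.csv is tab-delimited with one header row, the orthogroup ID in the first column, one column per species, and a final "Total" column. Check the file layout before relying on it. The function name `single_copy_orthogroups` is hypothetical, and the list in Orthogroups_for_concatenated_alignment.txt may have been selected with additional criteria.

```sh
#!/bin/sh
# Print IDs of orthogroups with exactly one gene in every species.
# ASSUMPTIONS: tab-delimited input, header on line 1, orthogroup ID in
# column 1, species counts in columns 2..NF-1, a total in the last column.
single_copy_orthogroups() {
    awk -F '\t' '
        NR == 1 { next }                   # skip the header row
        NF > 2 {
            ok = 1
            for (i = 2; i < NF; i++)       # species columns only
                if ($i != 1) { ok = 0; break }
            if (ok) print $1
        }
    ' "$@"
}

# Demo on a made-up table (species names are placeholders):
printf '\tspA\tspB\tTotal\nOG0000000\t1\t1\t2\nOG0000001\t2\t1\t3\n' |
    single_copy_orthogroups
# prints:
# OG0000000
```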

Bash script to run the k-mer-based assembly deduplication pipeline

File: disco_filter_dups.v1.1.sh

This script parses a DISCOVAR de novo assembly and removes scaffolds that are likely to be haplotigs, based on their k-mer content and a self-alignment of the assembly (see manuscript for details).

Before running, whitespace in the scaffold IDs of the input DISCOVAR assembly must be replaced with "_", and the Illumina reads must be decompressed.
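
The scaffold ID requirement can be met with a short `sed` filter. The following is a minimal sketch, not part of the released script; the function name `sanitize_fasta_ids` and the file names in the usage note are hypothetical.

```sh
#!/bin/sh
# Replace every run of whitespace in FASTA header lines with a single "_".
# Sequence lines pass through unchanged. Reads stdin when no file is given.
sanitize_fasta_ids() {
    sed '/^>/ s/[[:space:]][[:space:]]*/_/g' "$@"
}

# Demo on a made-up record (not taken from the dataset):
printf '>flattened_line_1 length=8\nACGTACGT\n' | sanitize_fasta_ids
# prints:
# >flattened_line_1_length=8
# ACGTACGT
```

For a real assembly this might be run as `sanitize_fasta_ids assembly.fa > assembly.clean.fa`, and gzip-compressed reads can be decompressed with, for example, `gzip -dc reads_R1.fastq.gz > reads_R1.fastq`.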

Usage:

sh disco_filter_dups.sh <./path_to_assembly> <./path_to_r1> <./path_to_r2> <homozygous_lower_cov> <homozygous_upper_cov> <nucmer_id_cutoff> <nucmer_cov_cutoff> <assembly_output_prefix> <threads> <./working_dir>
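
Before launching the pipeline, the two input requirements stated above can be checked automatically. The helper below is a hypothetical pre-flight check and is not part of the released script: it reports an error if any scaffold ID contains whitespace, or if a read file starts with the gzip magic bytes (1f 8b).

```sh
#!/bin/sh
# Hypothetical pre-flight check (NOT part of the released script).
# Usage: check_disco_inputs <assembly.fa> <r1> <r2>
# Returns 0 if the inputs look usable, 1 otherwise.
check_disco_inputs() {
    assembly=$1
    shift
    # Any header line containing whitespace violates the ID requirement.
    if grep -q '^>.*[[:space:]]' "$assembly"; then
        echo "error: scaffold IDs in $assembly contain whitespace" >&2
        return 1
    fi
    for reads in "$@"; do
        # gzip files start with the two bytes 1f 8b.
        magic=$(od -An -tx1 -N2 "$reads" | tr -d ' \n')
        if [ "$magic" = "1f8b" ]; then
            echo "error: $reads is gzip-compressed; decompress it first" >&2
            return 1
        fi
    done
    return 0
}
```

A call such as `check_disco_inputs assembly.fa reads_R1.fastq reads_R2.fastq && sh disco_filter_dups.sh ...` would then only start the pipeline when both checks pass (file names here are placeholders).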

Notes

TCM is funded by a BBSRC Future Leader Fellowship (BB/R01227X/1). The described work was supported by a CEPAMs grant (17.03.2) to SH, a Bill and Melinda Gates Foundation grant (OPP1087428) awarded to LT, the BBSRC Institute Strategy Program (BB/P012574/1) award to the John Innes Centre, and the John Innes Foundation. This research was supported in part by the NBI Computing Infrastructure for Science Group, which provides technical support and maintenance to the John Innes Centre's high-performance computing cluster and storage systems.

Files

Files (500.4 MB)

Checksum (MD5)                          Size
md5:94f8a800e1c5226d413a4654d008c420    4.8 kB
md5:8bd4abbb31f0ffb9cf9024fe383b176e    636.4 kB
md5:79afe0691cef94bc19e9bf877c72b273    12.3 MB
md5:0bd59f37e693e218d833e9fbb8aaac10    10.7 MB
md5:9a2f40b009a79d7b868ffb9455b69251    36.7 MB
md5:5b8510b739fee5de6e17d8dab31087b5    53.8 MB
md5:a074ff536edad9b4cb13495fffc6820d    112.2 MB
md5:b6400853fc3156b114e73843d3c2f45e    713.0 kB
md5:77f9acd523fa3ea34a019948722ba75c    4.0 MB
md5:095bc2fa1f2536c64584a474cd7f0f5f    47.2 kB
md5:17f140ab8241df7ddee12b8ad3557bab    15.4 kB
md5:452ee37a4ae5a35ea1a8fd2c531dba5b    106.9 MB
md5:8bf0bb90bbbfb6b2a3b5096a2965abd5    27.3 MB
md5:3302b02a363db08232f02cf13bae38ee    10.9 MB
md5:365bf4a10d0a6a3233df4a397b8f4a1b    9.4 MB
md5:2775566bbea31a36f29f8ce67537b8f8    3.6 MB
md5:c033cd14146c176e4cc34f9e5dc86743    32.5 MB
md5:d4f9acedb344d97962e1ecc557a864d8    52.2 MB
md5:6706605abb143e65c5f8c28f0d8749e6    198 Bytes
md5:17f520f40e562fd626bb40e303c0d451    25.1 MB
md5:d0ee7a922490a2dcbd6c8a5ad77bbb08    1.5 MB

Additional details

Funding

UK Research and Innovation
Evolutionary genomics of host range expansion in aphid crop pests BB/R01227X/1