Dataset Open Access

Draft genome assembly version 1 of the meadow spittlebug Philaenus spumarius (Linnaeus, 1758) (Hemiptera, Aphrophoridae)

Roberto Biello; Thomas C. Mathers; Sam T. Mugford; Qun Liu; Ana S. B. Rodrigues; Ana Carina Neto; Maria Teresa Rebelo; Octávio S. Paulo; Sofia G. Seabra; Saskia A. Hogenhout

We sequenced the genome of the meadow spittlebug, Philaenus spumarius (Linnaeus, 1758), the main insect vector of Xylella fastidiosa Wells et al. 1987 in Europe (Saponari et al., 2014), using 10x Chromium linked-reads. A single P. spumarius adult female from Portugal (Fontanelas, Sintra; GPS location: 38°50'15.75"N; 9°25'20.77"W), collected in September 2018, was selected for genome sequencing. This population was initially surveyed for colour polymorphism in 1988 (Quartau & Borges, 1997) and was later included in phylogeographic and population genomic studies of this species (Rodrigues et al., 2014; Seabra et al., unpublished). It is also geographically close to the population from which the individual used for the first partial genome assembly was collected (Rodrigues et al., 2016). The availability of this previous genetic information contributed to the choice of this population as the source of genomic material for whole genome sequencing. A subset of males from the same collection date was analysed for genitalia morphology to confirm species identification, as the best diagnostic characters are the appendages of the aedeagus (Drosopoulos & Quartau, 2002).

The genomic DNA of the P. spumarius adult from Sintra was extracted using the Illustra Nucleon Phytopure kit (GE Healthcare) according to the manufacturer’s instructions. We assessed the quality and concentration of the DNA using a Femto fragment analyser (Agilent). 10x Chromium library preparation and Illumina genome sequencing (HiSeq X, 150 bp paired-end) were performed by Novogene Bioinformatics Technology Co, Beijing, China, in accordance with standard protocols.

To create the de novo 10x Chromium assembly, we ran Supernova 2.1.1 (Weisenfeld et al., 2017) on the 10x Chromium linked-read data with default parameters, using 1.0 billion reads, corresponding to 56X coverage. To improve the initial Supernova assembly, we performed iterative scaffolding using all of the raw 10x data (2.3 billion reads). We ran two rounds of Scaff10x (https://github.com/wtsi-hpag/Scaff10X), followed by mis-assembly detection and correction with Tigmint (Jackman et al., 2018) and a final round of scaffolding with ARCS (Yeo et al., 2018). The assembly was checked for contamination using the BlobTools pipeline (version 0.9.19; Laetsch and Blaxter, 2017; Kumar et al., 2013), and k-mer content was analysed with the KAT comp tool (Mapleson et al., 2017). These analyses required removing the 10x barcodes from the reads, which we did with the script process_10xReads.py (https://github.com/ucdavis-bioinformatics/proc10xG). We assessed the quality of our draft genome assembly by searching for conserved, single-copy arthropod genes (n=1,066) with Benchmarking Universal Single-Copy Orthologs (BUSCO) v3.0 (Waterhouse et al., 2018).
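The coverage figure quoted above can be checked with simple arithmetic: depth is the total number of sequenced bases divided by the genome size. The sketch below is illustrative only. It assumes that each counted read is a full 150 bp read, and it approximates the genome size by the 2.7 Gb assembly length reported in this record; Supernova's own estimate may differ, for example because barcode bases are trimmed from the reads. The function and constant names are hypothetical.

```python
def coverage(n_reads: float, read_length_bp: int, genome_size_bp: float) -> float:
    """Expected sequencing depth: total sequenced bases divided by genome size."""
    return n_reads * read_length_bp / genome_size_bp


READ_LENGTH_BP = 150    # HiSeq X, 150 bp paired-end (from the text)
GENOME_SIZE_BP = 2.7e9  # ASSUMPTION: approximated by the 2.7 Gb assembly length

# Reads given to Supernova versus all raw reads used for scaffolding.
supernova_cov = coverage(1.0e9, READ_LENGTH_BP, GENOME_SIZE_BP)
all_reads_cov = coverage(2.3e9, READ_LENGTH_BP, GENOME_SIZE_BP)

print(f"Supernova input: {supernova_cov:.0f}X")
print(f"All raw reads:   {all_reads_cov:.0f}X")
```

Under these assumptions, the 1.0 billion reads work out to roughly 56X, in line with the value stated in the text.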

With the above assembly procedure, we obtained a final assembly of 2.7 Gb with a scaffold N50 of 116 Kb (contig N50 = 18 Kb) and a longest scaffold of 3.7 Mb. The length of the assembly was consistent with the genome size estimated by flow cytometry (Rodrigues et al., 2016). The k-mer distribution indicated high heterozygosity, estimated at 2.3%. BlobTools analyses revealed the presence of contigs assigned to Sodalis spp. (Enterobacteriaceae), a symbiont in members of the tribe Philaenini (Koga et al., 2013). These contigs were filtered from the final assembly. The gene completeness assessment showed that 956 (89.6%) of the 1,066 BUSCOs were found as complete copies, with only 26 (2.4%) missing. Of the BUSCOs that were detected, 878 (82.4%) were complete and single-copy, 78 (7.3%) were complete and duplicated, and 84 (7.9%) were fragmented.
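For readers unfamiliar with the summary statistics above, the following minimal sketch shows how an N50 is computed and how the BUSCO categories add up. The scaffold lengths are toy values, not data from this assembly, and the helper names are hypothetical. The BUSCO counts are those reported in the text; percentages are recomputed from the counts, so the last digit can differ slightly from the rounded values quoted above.

```python
def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    together cover at least half of the total assembly length."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no sequence lengths given")


# Toy scaffold lengths (illustrative only, not from this assembly).
toy_lengths = [80, 70, 50, 40, 30, 20]
print("toy N50:", n50(toy_lengths))

# BUSCO counts as reported in the text.
busco = {
    "complete_single": 878,
    "complete_duplicated": 78,
    "fragmented": 84,
    "missing": 26,
}
total = sum(busco.values())
complete = busco["complete_single"] + busco["complete_duplicated"]

print("total BUSCOs searched:", total)
print("complete BUSCOs:", complete)
for name, count in busco.items():
    print(f"{name}: {count} ({100 * count / total:.1f}%)")
```

The four categories sum to the 1,066 arthropod BUSCOs searched, and the single-copy and duplicated counts together give the 956 complete copies.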

In conclusion, the P. spumarius version 1 genome assembly is highly fragmented, due in part to high heterozygosity (2.3%). Nonetheless, the BUSCO results indicate that the assembly is largely complete in terms of gene content and is likely to contain the majority of P. spumarius genes.

Data Availability

Short Illumina linked-reads and the genome assembly are available at the National Center for Biotechnology Information (NCBI) with the BioProject number PRJNA602656. The BioSample is available at NCBI with accession number SAMN13900937.

Acknowledgments

We thank José A. Quartau and Sara E. Silva for help with collection of the field samples of P. spumarius. Financial support for sample collection was obtained from CESAM (UID/AMB/50017/2019), cE3c (UID/BIA/00329/2019) and FCT/MCTES through national funds (Norma Transitoria – DL57/2016/CP1479), and co-funding by FEDER, within the PT2020 Partnership Agreement and Compete 2020. The collaboration between the research groups and overall research is funded from the BRIGIT project by UK Research and Innovation through the Strategic Priorities Fund, by a grant from BBSRC, with support from the Department for Environment, Food and Rural Affairs and the Scottish Government (BB/S016325/1). Additional support was received from a BBSRC Future Leader Fellowship (BB/R01227X/1) to T.C.M., the BBSRC Institute Strategy Programme (BB/P012574/1) and the John Innes Foundation.

Files (2.8 GB)
Name: Pspu_JIC_v1.0.fasta
Size: 2.8 GB
md5: 857f9659b60bf22700ccdca0a9dac72c

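A downloaded copy of the assembly can be checked against the md5 checksum listed above. The sketch below uses only the Python standard library and reads the file in chunks so that the 2.8 GB FASTA does not need to fit in memory. The function names and the chunk size are illustrative choices, not part of the dataset.

```python
import hashlib

# Checksum as given in the file listing of this record.
EXPECTED_MD5 = "857f9659b60bf22700ccdca0a9dac72c"


def md5sum(path, chunk_size=1 << 20):
    """Compute the md5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_MD5):
    """Return True if the file's md5 digest matches the expected checksum."""
    return md5sum(path) == expected.lower()
```

Assuming the file was saved under its original name in the working directory, `verify("Pspu_JIC_v1.0.fasta")` should return `True` for an intact download.
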
  • Drosopoulos, S. & Quartau, J.A. (2002). The spittle bug Philaenus tesselatus Melichar, 1899 (Hemiptera, Auchenorrhyncha, Cercopidae) is a distinct species. Zootaxa, 68: 1-8.

  • Jackman, S.D., Coombe, L., Chu, J., Warren, R.L., Vandervalk, B.P., Yeo, S., Xue, Z., Mohamadi, H., Bohlmann, J., Jones, S.J.M., Birol, I. (2018). Tigmint: Correcting assembly errors using linked reads from large molecules. BMC Bioinformatics, 19.

  • Koga, R., Bennett, G. M., Cryan, J. R., & Moran, N. A. (2013). Evolutionary replacement of obligate symbionts in an ancient and diverse insect lineage. Environmental Microbiology, 15(7), 2073-2081.

  • Kumar, S., Jones, M., Koutsovoulos, G., Clarke, M., Blaxter, M. (2013). Blobology: exploring raw genome data for contaminants, symbionts, and parasites using taxon-annotated GC-coverage plots. Frontiers in Genetics, 4, 1–12.

  • Laetsch, D.R., Blaxter, M.L. (2017). BlobTools: Interrogation of genome assemblies. F1000Research, 6, 1287.

  • Mapleson, D., Accinelli, G.G., Kettleborough, G., Wright, J., Clavijo, B.J. (2017). KAT: A K-mer analysis toolkit to quality control NGS datasets and genome assemblies. Bioinformatics, 33, 574–576.

  • Quartau, J.A., Borges, P.A.V. (1997). On the colour polymorphism of Philaenus spumarius (L.) (Homoptera, Cercopidae) in Portugal. Miscellània Zoològica, 20(2), 19–30.

  • Rodrigues, A.S., Silva, S.E., Pina-Martins, F., Loureiro, J., Castro, M., Gharbi, K., Johnson, K.P., Dietrich, C.H., Borges, P.A., Quartau, J.A., Jiggins, C.D., Paulo, O.S., Seabra, S.G. (2016). Assessing genotype-phenotype associations in three dorsal colour morphs in the meadow spittlebug Philaenus spumarius (L.) (Hemiptera: Aphrophoridae) using genomic and transcriptomic resources. BMC Genetics, 17, 144.

  • Rodrigues, A.S.B., Silva, S.E., Marabuto, E., Silva, D.N., Wilson, M.R., Thompson, V., Yurtsever, S., Halkka, A., Borges, P.A.V., Quartau, J.A., Paulo, O.S., Seabra, S.G. (2014). New mitochondrial and nuclear evidences support recent demographic expansion and an atypical phylogeographic pattern in the spittlebug Philaenus spumarius (Hemiptera, Aphrophoridae). PLoS ONE, 9(6), 1–12.

  • Saponari, M., Loconsole, G., Cornara, D., Yokomi, R.K., Stradis, A.D.E., Boscia, D., Bosco, D., Martelli, G.P., Krugner, R., Porcelli, F. (2014). Infectivity and Transmission of Xylella fastidiosa by Philaenus spumarius (Hemiptera: Aphrophoridae) in Apulia, Italy. Journal of Economic Entomology, 107(4), 1–4.

  • Waterhouse, R.M., Seppey, M., Simao, F.A., Manni, M., Ioannidis, P., Klioutchnikov, G., Kriventseva, E.V., Zdobnov, E.M. (2018). BUSCO applications from quality assessments to gene prediction and phylogenomics. Molecular Biology and Evolution, 35, 543–554.

  • Weisenfeld, N.I., Kumar, V., Shah, P., Church, D.M., Jaffe, D.B. (2017). Direct determination of diploid genome sequences. Genome Research, 27, 757–767.

  • Wells, J.M., Raju, B.C., Hung, H.Y., Weisburg, W. G., Mandelco-Paul, L., Brenner, D.J. (1987). Xylella fastidiosa gen. nov., sp. nov.: gram-negative, xylem-limited, fastidious plant bacteria related to Xanthomonas spp. International Journal of Systematic Bacteriology, 37, 136–143.

  • Yeo, S., Coombe, L., Warren, R.L., Birol, I. (2018). ARCS: Scaffolding genome drafts with linked reads. Bioinformatics, 34, 725–731.
