Dataset Open Access

Draft genome assembly version 1 of the meadow spittlebug Philaenus spumarius (Linnaeus, 1758) (Hemiptera, Aphrophoridae)

Roberto Biello; Thomas C. Mathers; Sam T. Mugford; Qun Liu; Ana S. B. Rodrigues; Ana Carina Neto; Maria Teresa Rebelo; Octávio S. Paulo; Sofia G. Seabra; Saskia A. Hogenhout


MARC21 XML Export

<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nmm##2200000uu#4500</leader>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">Drosopoulos, S. &amp; Quartau, J.A. (2002). The spittle bug Philaenus tesselatus Melichar, 1899 (Hemiptera, Auchenorrhyncha, Cercopidae) is a distinct species. Zootaxa, 68: 1-8.</subfield>
  </datafield>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">Jackman, S.D., Coombe, L., Chu, J., Warren, R.L., Vandervalk, B.P., Yeo, S., Xue, Z., Mohamadi, H., Bohlmann, J., Jones, S.J.M., Birol, I. (2018). Tigmint: Correcting assembly errors using linked reads from large molecules. BMC Bioinformatics, 19.</subfield>
  </datafield>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">Koga, R., Bennett, G. M., Cryan, J. R., &amp; Moran, N. A. (2013). Evolutionary replacement of obligate symbionts in an ancient and diverse insect lineage. Environmental Microbiology, 15(7), 2073-2081.</subfield>
  </datafield>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">Kumar, S., Jones, M., Koutsovoulos, G., Clarke, M., Blaxter, M. (2013). Blobology: exploring raw genome data for contaminants, symbionts, and parasites using taxon-annotated GC-coverage plots. Frontiers in Genetics, 4, 1–12.</subfield>
  </datafield>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">Laetsch, D.R., Blaxter, M.L. (2017). BlobTools: Interrogation of genome assemblies. F1000Research, 6, 1287.</subfield>
  </datafield>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">Mapleson, D., Accinelli, G.G., Kettleborough, G., Wright, J., Clavijo, B.J. (2017). KAT: A K-mer analysis toolkit to quality control NGS datasets and genome assemblies. Bioinformatics, 33, 574–576.</subfield>
  </datafield>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">Quartau, J.A., Borges, P.A.V. (1997). On the colour polymorphism of Philaenus spumarius (L.) (Homoptera, Cercopidae) in Portugal. Miscellània Zoològica, 20.2: 19-30.</subfield>
  </datafield>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">Rodrigues, A.S.B., Silva, S.E., Marabuto, E., Silva, D.N., Wilson, M.R., Thompson, V., Yurtsever, S., Halkka, A., Borges, P.A.V., Quartau, J.A., Paulo, O.S., Seabra, S.G. (2014). New mitochondrial and nuclear evidences support recent demographic expansion and an atypical phylogeographic pattern in the spittlebug Philaenus spumarius (Hemiptera, Aphrophoridae). PLoS ONE, 9(6), 1–12.</subfield>
  </datafield>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">Rodrigues, A.S., Silva, S.E., Pina-Martins, F., Loureiro, J., Castro, M., Gharbi, K., Johnson, K.P., Dietrich, C.H., Borges, P.A., Quartau, J.A., Jiggins, C.D., Paulo, O.S., Seabra, S.G. (2016). Assessing genotype-phenotype associations in three dorsal colour morphs in the meadow spittlebug Philaenus spumarius (L.) (Hemiptera: Aphrophoridae) using genomic and transcriptomic resources. BMC Genetics, 17, 144.</subfield>
  </datafield>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">Saponari, M., Loconsole, G., Cornara, D., Yokomi, R.K., De Stradis, A., Boscia, D., Bosco, D., Martelli, G.P., Krugner, R., Porcelli, F. (2014). Infectivity and Transmission of Xylella fastidiosa by Philaenus spumarius (Hemiptera: Aphrophoridae) in Apulia, Italy. Journal of Economic Entomology, 107(4), 1–4.</subfield>
  </datafield>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">Waterhouse, R.M., Seppey, M., Simao, F.A., Manni, M., Ioannidis, P., Klioutchnikov, G., Kriventseva, E.V., Zdobnov, E.M. (2018). BUSCO applications from quality assessments to gene prediction and phylogenomics. Molecular Biology and Evolution, 35, 543–554.</subfield>
  </datafield>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">Weisenfeld, N.I., Kumar, V., Shah, P., Church, D.M., Jaffe, D.B. (2017). Direct determination of diploid genome sequences. Genome Research, 27, 757–767.</subfield>
  </datafield>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">Wells, J.M., Raju, B.C., Hung, H.Y., Weisburg, W. G., Mandelco-Paul, L., Brenner, D.J. (1987). Xylella fastidiosa gen. nov., sp. nov.: gram-negative, xylem-limited, fastidious plant bacteria related to Xanthomonas spp. International Journal of Systematic Bacteriology, 37, 136–143.</subfield>
  </datafield>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">Yeo, S., Coombe, L., Warren, R.L., Birol, I. (2018). ARCS: Scaffolding genome drafts with linked reads. Bioinformatics, 34, 725–731.</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Hemiptera</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Spittlebug</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Genome Assembly</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Pest</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Xylella fastidiosa</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Insect Vector</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Xylem</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Plant Disease</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Pathogen</subfield>
  </datafield>
  <controlfield tag="005">20200131192051.0</controlfield>
  <datafield tag="500" ind1=" " ind2=" ">
    <subfield code="a">Data Availability

Short Illumina linked-reads and the genome assembly are available at the National Center for Biotechnology Information (NCBI) with the BioProject number PRJNA602656. The BioSample is available at NCBI with accession number SAMN13900937.

Acknowledgments

We thank José A. Quartau and Sara E. Silva for help with collection of the field samples of P. spumarius. Financial support for sample collection was obtained from CESAM (UID/AMB/50017/2019), cE3c (UID/BIA/00329/2019) and FCT/MCTES through national funds (Norma Transitoria – DL57/2016/CP1479), and co-funding by FEDER, within the PT2020 Partnership Agreement and Compete 2020. The collaboration between the research groups and overall research is funded from the BRIGIT project by UK Research and Innovation through the Strategic Priorities Fund, by a grant from BBSRC, with support from the Department for Environment, Food and Rural Affairs and the Scottish Government (BB/S016325/1). Additional support was received from a BBSRC Future Leader Fellowship (BB/R01227X/1) to T.C.M., the BBSRC Institute Strategy Programme (BB/P012574/1) and the John Innes Foundation.</subfield>
  </datafield>
  <controlfield tag="001">3368385</controlfield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Department of Crop Genetics, John Innes Centre, Norwich Research Park, Norwich, United Kingdom</subfield>
    <subfield code="0">(orcid)0000-0002-8637-3515</subfield>
    <subfield code="a">Thomas C. Mathers</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Department of Crop Genetics, John Innes Centre, Norwich Research Park, Norwich, United Kingdom</subfield>
    <subfield code="0">(orcid)0000-0002-8537-5578</subfield>
    <subfield code="a">Sam T. Mugford</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Department of Crop Genetics, John Innes Centre, Norwich Research Park, Norwich, United Kingdom</subfield>
    <subfield code="a">Qun Liu</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Department of Animal Biology/Centre for Ecology, Evolution and Environmental Changes (cE3c), University of Lisbon, Portugal</subfield>
    <subfield code="a">Ana S. B. Rodrigues</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Department of Animal Biology/Centre for Environmental and Marine Studies (CESAM), University of Lisbon, Portugal</subfield>
    <subfield code="a">Ana Carina Neto</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Department of Animal Biology/Centre for Environmental and Marine Studies (CESAM), University of Lisbon, Portugal</subfield>
    <subfield code="a">Maria Teresa Rebelo</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Department of Animal Biology/Centre for Ecology, Evolution and Environmental Changes (cE3c), University of Lisbon, Portugal</subfield>
    <subfield code="0">(orcid)0000-0001-5408-5212</subfield>
    <subfield code="a">Octávio S. Paulo</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Department of Animal Biology/Centre for Ecology, Evolution and Environmental Changes (cE3c), University of Lisbon, Portugal</subfield>
    <subfield code="0">(orcid)0000-0003-1413-2349</subfield>
    <subfield code="a">Sofia G. Seabra</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Department of Crop Genetics, John Innes Centre, Norwich Research Park, Norwich, United Kingdom</subfield>
    <subfield code="0">(orcid)0000-0003-1371-5606</subfield>
    <subfield code="a">Saskia A. Hogenhout</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">2780724952</subfield>
    <subfield code="z">md5:857f9659b60bf22700ccdca0a9dac72c</subfield>
    <subfield code="u">https://zenodo.org/record/3368385/files/Pspu_JIC_v1.0.fasta</subfield>
  </datafield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2020-01-31</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="p">openaire_data</subfield>
    <subfield code="o">oai:zenodo.org:3368385</subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="u">Department of Crop Genetics, John Innes Centre, Norwich Research Park, Norwich, United Kingdom</subfield>
    <subfield code="0">(orcid)0000-0002-5916-884X</subfield>
    <subfield code="a">Roberto Biello</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">Draft genome assembly version 1 of the meadow spittlebug Philaenus spumarius (Linnaeus, 1758) (Hemiptera, Aphrophoridae)</subfield>
  </datafield>
  <datafield tag="536" ind1=" " ind2=" ">
    <subfield code="c">BB/R01227X/1</subfield>
    <subfield code="a">Evolutionary genomics of host range expansion in aphid crop pests</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="u">https://creativecommons.org/licenses/by/4.0/legalcode</subfield>
    <subfield code="a">Creative Commons Attribution 4.0 International</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="a">cc-by</subfield>
    <subfield code="2">opendefinition.org</subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">&lt;p&gt;We sequenced the genome of the meadow spittlebug, &lt;em&gt;Philaenus spumarius &lt;/em&gt;(Linnaeus, 1758), the main insect vector of &lt;em&gt;Xylella fastidiosa &lt;/em&gt;Wells et al. 1987 in Europe (Saponari et al., 2014), using 10x Chromium linked-reads. A single &lt;em&gt;P. spumarius&lt;/em&gt; adult female from Portugal (Fontanelas, Sintra; GPS location: 38&amp;deg;50&amp;#39;15.75&amp;quot;N; 9&amp;deg;25&amp;#39;20.77&amp;quot;W), collected in September of 2018, was selected for genome sequencing. This population was initially surveyed for colour polymorphism in 1988 (Quartau &amp;amp; Borges, 1997) and was later included in phylogeographic and population genomic studies of this species (Rodrigues et al., 2014; Seabra et al., unpublished). It is also geographically close to the population from which the individual used for the first partial genome assembly was collected (Rodrigues et al., 2016). The availability of this previous genetic information contributed to the choice of this population as the source of genomic material for whole genome sequencing. A subset of males from the same collection date were analysed for genitalia morphology to confirm species identification, as the best diagnostic characters are the appendages of the aedeagus (Drosopoulos &amp;amp; Quartau, 2002).&lt;/p&gt;

&lt;p&gt;Genomic DNA of the &lt;em&gt;P. spumarius&lt;/em&gt; adult from Sintra was extracted using the Illustra Nucleon Phytopure kit (GE Healthcare) according to the manufacturer&amp;rsquo;s instructions. We assessed the quality and concentration of the DNA using a Femto fragment analyser (Agilent). 10x Chromium library preparation and Illumina genome sequencing (HiSeq X, 150 bp paired-end) were performed by Novogene Bioinformatics Technology Co, Beijing, China, in accordance with standard protocols.&lt;/p&gt;

&lt;p&gt;To create the &lt;em&gt;de novo&lt;/em&gt; 10x Chromium assembly we ran Supernova 2.1.1 (Weisenfeld et al., 2017) on the 10x Chromium linked-read data with default parameters, using 1.0 billion reads corresponding to 56X coverage. To improve the initial Supernova assembly, we performed iterative scaffolding using all of the 10x raw data (2.3 billion reads). We ran two rounds of Scaff10x (https://github.com/wtsi-hpag/Scaff10X), followed by mis-assembly detection and correction with Tigmint (Jackman et al., 2018) and a final round of scaffolding with ARCS (Yeo et al., 2018). The assembly was checked for contamination using the BlobTools pipeline (version 0.9.19; Laetsch and Blaxter, 2017; Kumar et al., 2013) and k-mer content was analysed with the KAT comp tool (Mapleson et al., 2017). Before these analyses, the 10x barcodes had to be removed from the reads, which we did with the script process_10xReads.py (https://github.com/ucdavis-bioinformatics/proc10xG). We assessed the quality of our draft genome assembly by searching for conserved single-copy arthropod genes (n=1,066) with Benchmarking Universal Single-Copy Orthologs (BUSCO) v3.0 (Waterhouse et al., 2018).&lt;/p&gt;

&lt;p&gt;With the above assembly procedure, we obtained a final assembly of 2.7 Gb with a scaffold N50 of 116 Kb (contig N50 = 18 Kb) and a longest scaffold of 3.7 Mb. The length of the assembly was consistent with the genome size estimated by flow cytometry (Rodrigues et al., 2016). The k-mer distribution indicated high heterozygosity, estimated at 2.3%. BlobTools analyses revealed the presence of contigs assigned to &lt;em&gt;Sodalis &lt;/em&gt;spp. (Enterobacteriaceae), a symbiont in members of the tribe Philaenini (Koga et al., 2013). These contigs were filtered from the final assembly. Gene completeness assessment showed that 956 (89.6%) of the 1,066 BUSCOs were found as complete copies, with only 26 (2.4%) missing. Of the BUSCOs that were detected, 878 (82.4%) were complete and single-copy, 78 (7.3%) were complete and duplicated and 84 (7.9%) were fragmented.&lt;/p&gt;

&lt;p&gt;In conclusion, due in part to high (2.3%) heterozygosity levels, the &lt;em&gt;P. spumarius&lt;/em&gt; version 1 genome assembly is highly fragmented. Nonetheless, with 89.6% of the arthropod BUSCOs recovered as complete copies, the assembly is largely complete in terms of gene content and is likely to contain the majority of the genes of &lt;em&gt;P. spumarius.&lt;/em&gt;&lt;/p&gt;</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="n">doi</subfield>
    <subfield code="i">isVersionOf</subfield>
    <subfield code="a">10.5281/zenodo.3368384</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.5281/zenodo.3368385</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">dataset</subfield>
  </datafield>
</record>
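
The coverage and BUSCO figures quoted in the description (field 520) can be cross-checked with a few lines of arithmetic. The following is a minimal sketch in Python, not part of the original record. It assumes that the 2.7 Gb assembly length can stand in for the genome size and that every read is the full 150 bp. The function names `coverage` and `busco_summary` are hypothetical helpers introduced here for illustration.

```python
# Sanity check of figures quoted in the record description.
# Assumptions (not stated as such in the record): the assembly length
# is used as a proxy for genome size, and all reads are 150 bp long.

READ_LENGTH_BP = 150        # HiSeq X, 150 bp paired-end
READS_USED = 1.0e9          # reads passed to Supernova
ASSEMBLY_SIZE_BP = 2.7e9    # final assembly length (proxy for genome size)


def coverage(reads, read_length_bp, genome_size_bp):
    """Return sequencing depth as total sequenced bases / genome size."""
    return reads * read_length_bp / genome_size_bp


def busco_summary(single, duplicated, fragmented, missing):
    """Recompute BUSCO totals and percentages from the four raw counts."""
    total = single + duplicated + fragmented + missing
    complete = single + duplicated
    return {
        "total": total,
        "complete": complete,
        "complete_pct": 100.0 * complete / total,
        "single_pct": 100.0 * single / total,
        "duplicated_pct": 100.0 * duplicated / total,
        "fragmented_pct": 100.0 * fragmented / total,
        "missing_pct": 100.0 * missing / total,
    }


depth = coverage(READS_USED, READ_LENGTH_BP, ASSEMBLY_SIZE_BP)
print(f"Estimated coverage: {depth:.1f}X")

# Counts as reported in the description: single-copy, duplicated,
# fragmented, missing.
summary = busco_summary(single=878, duplicated=78, fragmented=84, missing=26)
for key, value in summary.items():
    print(f"{key}: {value:.1f}" if key.endswith("_pct") else f"{key}: {value}")
```

Under these assumptions the depth rounds to the 56X quoted in the description, the four BUSCO categories sum to the 1,066 genes searched, and the recomputed percentages agree with the quoted values to within 0.1 percentage point.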