Published March 25, 2020 | Version v1
Dataset Open

Progress Towards Plant Community Transcriptomics: Pilot RNA-Seq Data from 24 Species of Vascular Plants at Harvard Forest

  • 1. Department of Ecology & Evolutionary Biology, University of Arizona, Tucson, AZ 85721

Description

Assembled transcriptomes of 24 vascular plant species from Harvard Forest. Transcriptomes for each species were sequenced and assembled as described below. Additional details are available in the associated manuscript: https://doi.org/10.1101/2020.03.31.018945. Raw reads for each species are available at NCBI SRA under SRP127805 and BioProject PRJNA422719.

Taxon selection and sampling 

The Harvard Forest Flora (Jenkins et al., 2008) was used to select taxa to represent each category (native/invasive, diploid/polyploid). Invasive species status was determined from the Harvard Forest Flora Database (Jenkins and Motzkin, 2009). Putative diploid and neo-polyploid species were identified from chromosome counts obtained from the Chromosome Counts Database (Rice et al., 2015). Congeneric species pairs were selected based on their phylogenetic relatedness. The Harvard Forest Flora Database was used to calculate recent encounter rates of each target species and to locate sampling sites.

Tissue from mature leaves was collected from an individual representing each target species at two time points (July and August) during the 2016 growing season. For perennials, the same individual was sampled at both time points; for annuals, the same population was sampled. Field sampling for plant RNA-seq followed the protocol described in Yang et al. (2017). Leaf tissues were flash frozen in liquid nitrogen in the field and shipped on dry ice to the University of Arizona for RNA extraction.

 

RNA extraction and RNA-seq

Total RNA was extracted from leaf tissue collected at each time point for all species using the Spectrum Plant Total RNA Kit (Sigma-Aldrich Co., St. Louis, MO, USA) following Protocol A. RNA was converted to cDNA by single primer isothermal amplification using Nugen’s Ovation RNA-Seq System (Catalogue # 7102-A01), automated on the Apollo 324 liquid handler (Wafergen). cDNA was quantified on the Nanodrop (Thermo Fisher Scientific) and sheared to approximately 300 bp fragments using the Covaris M220 ultrasonicator. Libraries were generated using Kapa Biosystems’ library preparation kit (KK8201). Fragments were end repaired and A-tailed, and individual indexes and adapters (Bioo, catalogue #520999) were ligated to each sample. The adapter-ligated molecules were cleaned using AMPure beads (Agencourt Bioscience/Beckman Coulter, A63883) and amplified with Kapa’s HIFI enzyme (KK2502). Each library was then analyzed for fragment size on an Agilent TapeStation and quantified by qPCR (KAPA Library Quantification Kit, KK4835) on Thermo Fisher Scientific’s QuantStudio 5 before multiplex pooling (13-16 samples per lane) and paired-end sequencing at 2x150 bp on the Illumina NextSeq500 platform at Arizona State University’s CLAS Genomics Core facility. Raw read quality was assessed using FastQC (Andrews, 2010).

 

De novo transcriptome assembly

Raw sequence reads were processed using the SnoWhite pipeline (Barker et al., 2010a; Dlugosch et al., 2013), which included trimming adapter sequences and bases with a quality score below 20 from the 3' ends of all reads, removing reads that consist entirely of primer and/or adapter fragments using TagDust (Lassmann et al., 2009), and removing polyA/T tails with SeqClean (https://sourceforge.net/projects/seqclean/). The cleaned reads from the two sampling time points were merged by pairs and pooled to assemble a reference de novo transcriptome for each species. All transcriptomes were assembled with SOAPdenovo-Trans v1.03 (Xie et al., 2014) using a k-mer size of 57.
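
To make the cleaning criteria concrete, the following is a minimal Python sketch of 3'-end quality trimming at a Phred threshold of 20 and of polyA/T tail removal. It is an illustration only and is not the SnoWhite or SeqClean implementation: the function names, the Phred+33 quality encoding, and the minimum tail length of 10 bases are assumptions made for this example.

```python
def phred33(qual_string):
    """Decode a FASTQ quality string, assuming Phred+33 encoding (assumption)."""
    return [ord(char) - 33 for char in qual_string]


def trim_3prime(seq, quals, min_q=20):
    """Remove bases from the 3' end while their quality score is below min_q."""
    end = len(seq)
    while end > 0 and quals[end - 1] < min_q:
        end -= 1
    return seq[:end], quals[:end]


def strip_poly_tail(seq, min_len=10):
    """Drop a trailing poly(A) run or a leading poly(T) run of at least min_len bases.

    min_len=10 is an illustrative value, not a setting taken from the record.
    """
    a_run = len(seq) - len(seq.rstrip("A"))
    if a_run >= min_len:
        seq = seq[: len(seq) - a_run]
    t_run = len(seq) - len(seq.lstrip("T"))
    if t_run >= min_len:
        seq = seq[t_run:]
    return seq


# Hypothetical read: the last two bases have low quality ('#' decodes to 2).
trimmed_seq, trimmed_quals = trim_3prime("ACGTACGT", phred33("IIIIII##"))
print(trimmed_seq)
print(strip_poly_tail("ACGT" + "A" * 12))
```

A production pipeline would also handle adapter matching and paired-read bookkeeping, which this sketch leaves out.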

Files (1.1 GB)

Checksum                              Size
md5:b48bcfcc41f39d5bf3dea1080d5c7d03  65.9 MB
md5:83476fb39a0ce2e5a08e0e5773bf747a  54.6 MB
md5:cf77bd789080c509967fde14bab664a2  55.4 MB
md5:ea3b7fbd10d0fc068de561c63ec127c9  48.6 MB
md5:2859d9c599fdb2b883d5b5bd2d3708c4  11.0 MB
md5:2f64c785cdac5a381f104331d42ec462  54.0 MB
md5:25c2deabc4ade08f98d650c2cde53bbd  30.8 MB
md5:6b1816147d1c96af574c4852e1cd711f  71.0 MB
md5:c19c85ff609078f74b2bd0ef48a60348  44.2 MB
md5:8c3c4ea0c5ee38acc38f365fe955472f  123.4 MB
md5:fd1c07613a5fdd27b4ab32aeecc29cf7  48.8 MB
md5:854e1bae00363e71d1c53a26dec20618  36.1 MB
md5:b2bee0f3f3b241719568c96c892a73f3  50.2 MB
md5:2c6074b94921182b93821043107780e4  39.9 MB
md5:a78972daef4f55c4a10d4f9dd05e06ac  42.4 MB
md5:b60cc11ad062196875001b66798dce99  22.9 MB
md5:c1465daed58ac9eecff936d8a8e993ee  24.8 MB
md5:fa6530ee5ac28910b2fbfdf11f326554  36.8 MB
md5:f4df6ba471ce7bd06cb574282255425f  86.6 MB
md5:7ef3fbc887e2728bd4dc87e2cea97086  36.3 MB
md5:4ff0e1873114aca38e00e59b38358359  38.0 MB
md5:2311bcf81ddfc8848a3703c759e9b136  34.5 MB
md5:60eaf407f5592e9d75c70d1ea6efdc94  48.8 MB
md5:cae8aa41e2c58e9319deae48ff5d00db  43.1 MB
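
The MD5 checksums listed above can be used to confirm that a downloaded assembly arrived intact. The sketch below uses Python's standard `hashlib` module. The helper names `md5_of_file` and `matches_record` are introduced here for illustration, and the file name in the commented usage line is hypothetical, since file names are not shown in this listing.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, listed_checksum):
    """Compare a file against a checksum written as 'md5:<hex>' or bare hex."""
    expected = listed_checksum.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


# Usage (hypothetical file name; the checksum is the first one in the list above):
# matches_record("assembly.fasta.gz", "md5:b48bcfcc41f39d5bf3dea1080d5c7d03")
```

Reading in chunks keeps memory use flat regardless of file size, which matters for the larger archives in this record.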

Additional details

Funding

EAGER-NEON: Genomic Plasticity in Response to Variable Environments (grant no. 1550838)
National Science Foundation