Progress Towards Plant Community Transcriptomics: Pilot RNA-Seq Data from 24 Species of Vascular Plants at Harvard Forest

Hannah E. Marx; Stacey A. Jorgensen; Eldridge Wisely; Zheng Li; Katrina M. Dlugosch; Michael S. Barker

doi:10.5281/zenodo.3727313

Published March 25, 2020 | Version v1

Dataset Open


1. Department of Ecology & Evolutionary Biology, University of Arizona, Tucson, AZ 85721

Assembled transcriptomes of 24 vascular plant species from Harvard Forest. Transcriptomes for each species were sequenced and assembled as described below. Additional details are available in the associated manuscript: https://doi.org/10.1101/2020.03.31.018945. Raw reads for each species are available from the NCBI SRA under accession SRP127805 and BioProject PRJNA422719.

Taxon selection and sampling

The Harvard Forest Flora (Jenkins et al., 2008) was used to select taxa to represent each category (native/invasive, diploid/polyploid). Invasive species status was determined from the Harvard Forest Flora Database (Jenkins and Motzkin, 2009). Putative diploid and neo-polyploid species were identified from chromosome counts obtained from the Chromosome Counts Database (Rice et al., 2015). Congeneric species pairs were selected based on their phylogenetic relatedness. The Harvard Forest Flora Database was used to calculate recent encounter rates of each target species and to locate sampling sites.

Tissue from mature leaves was collected from an individual representing each target species at two time points (July and August) during the 2016 growing season. The same individual was sampled at both time points for perennial species, and the same population was sampled for annuals. Field sampling for plant RNA-seq followed the protocol described by Yang et al. (2017). Leaf tissues were flash frozen in liquid nitrogen in the field and shipped on dry ice to the University of Arizona for RNA extraction.

RNA extraction and RNA-seq

Total RNA was extracted from leaf tissue collected at each time point for all species using the Spectrum Plant Total RNA Kit (Sigma-Aldrich Co., St. Louis, MO, USA) following Protocol A. RNA was used to prepare cDNA using Nugen’s Ovation RNA-Seq System via single primer isothermal amplification (Catalogue # 7102-A01), automated on the Apollo 324 liquid handler (Wafergen). cDNA was quantified on the Nanodrop (Thermo Fisher Scientific) and sheared to approximately 300 bp fragments using the Covaris M220 ultrasonicator. Libraries were generated using Kapa Biosystem’s library preparation kit (KK8201). Fragments were end repaired and A-tailed, and individual indexes and adapters (Bioo, catalogue #520999) were ligated to each sample separately. The adapter-ligated molecules were cleaned using AMPure beads (Agencourt Bioscience/Beckman Coulter, A63883) and amplified with Kapa’s HIFI enzyme (KK2502). Each library was then analyzed for fragment size on an Agilent Tapestation and quantified by qPCR (KAPA Library Quantification Kit, KK4835) on Thermo Fisher Scientific’s Quantstudio 5 before multiplex pooling (13-16 samples per lane) and paired-end sequencing at 2x150 bp on the Illumina NextSeq500 platform at Arizona State University’s CLAS Genomics Core facility. Raw read quality was assessed using FastQC (Andrews, 2010).

De novo transcriptome assembly

Raw sequence reads were processed using the SnoWhite pipeline (Barker et al., 2010a; Dlugosch et al., 2013), which included trimming adapter sequences and bases with a quality score below 20 from the 3' ends of all reads, removing reads that are entirely primer and/or adapter fragments using TagDust (Lassmann et al., 2009), and removing polyA/T tails with SeqClean (https://sourceforge.net/projects/seqclean/). The cleaned reads from each sample time point were merged together by pairs, and pooled to assemble a reference de novo transcriptome for each species. All transcriptomes were assembled with SOAPdenovo-Trans v1.03 (Xie et al., 2014) using a k-mer of 57.
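The 3' quality-trimming step described above can be illustrated with a minimal Python sketch. This is not the SnoWhite implementation; it only shows the general idea of removing trailing bases whose Phred score falls below 20. The function name trim_3prime and the Phred+33 quality encoding are assumptions made for this illustration.

```python
from typing import Tuple


def trim_3prime(seq: str, qual: str, min_q: int = 20, offset: int = 33) -> Tuple[str, str]:
    """Trim bases with Phred quality below min_q from the 3' end of a read.

    Assumes FASTQ-style quality strings with a Phred+33 offset (an assumption
    for this sketch, not a detail stated in the dataset description).
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings must have equal length")
    end = len(seq)
    # Walk backwards from the 3' end until a base meets the threshold.
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


if __name__ == "__main__":
    # 'I' encodes Phred 40 and '#' encodes Phred 2 under Phred+33.
    print(trim_3prime("ACGTACGT", "IIIII###"))
```

Only the trailing run of low-quality bases is removed in this sketch; a low-quality base in the middle of a read is left in place, and a read that is low quality throughout is reduced to an empty string.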

Files

Files (1.1 GB)

Name	MD5	Size
Dryopteris_carthusiana-57.scafSeq.gz	b48bcfcc41f39d5bf3dea1080d5c7d03	65.9 MB
Dryopteris_intermedia-57.scafSeq.gz	83476fb39a0ce2e5a08e0e5773bf747a	54.6 MB
Dryopteris_marginalis-57.scafSeq.gz	cf77bd789080c509967fde14bab664a2	55.4 MB
Galium_mollugo-57.scafSeq.gz	ea3b7fbd10d0fc068de561c63ec127c9	48.6 MB
Galium_tinctorium-57.scafSeq.gz	2859d9c599fdb2b883d5b5bd2d3708c4	11.0 MB
Galium_triflorum-57.scafSeq.gz	2f64c785cdac5a381f104331d42ec462	54.0 MB
Hypericum_perforatum-57.scafSeq.gz	25c2deabc4ade08f98d650c2cde53bbd	30.8 MB
Juglans_cinerea-57.scafSeq.gz	6b1816147d1c96af574c4852e1cd711f	71.0 MB
Lonicera_morrowii-57.scafSeq.gz	c19c85ff609078f74b2bd0ef48a60348	44.2 MB
Lysimachia_ciliata-57.scafSeq.gz	8c3c4ea0c5ee38acc38f365fe955472f	123.4 MB
Lysimachia_nummularia-57.scafSeq.gz	fd1c07613a5fdd27b4ab32aeecc29cf7	48.8 MB
Lysimachia_quadrifolia-57.scafSeq.gz	854e1bae00363e71d1c53a26dec20618	36.1 MB
Persicaria_arifolia-57.scafSeq.gz	b2bee0f3f3b241719568c96c892a73f3	50.2 MB
Persicaria_hydropiperoides-57.scafSeq.gz	2c6074b94921182b93821043107780e4	39.9 MB
Persicaria_sagittata-57.scafSeq.gz	a78972daef4f55c4a10d4f9dd05e06ac	42.4 MB
Plantago_lanceolata-57.scafSeq.gz	b60cc11ad062196875001b66798dce99	22.9 MB
Plantago_major-57.scafSeq.gz	c1465daed58ac9eecff936d8a8e993ee	24.8 MB
Plantago_rugelii-57.scafSeq.gz	fa6530ee5ac28910b2fbfdf11f326554	36.8 MB
Polygonum_cilinode-57.scafSeq.gz	f4df6ba471ce7bd06cb574282255425f	86.6 MB
Potentilla_argentea-57.scafSeq.gz	7ef3fbc887e2728bd4dc87e2cea97086	36.3 MB
Potentilla_canadensis-57.scafSeq.gz	4ff0e1873114aca38e00e59b38358359	38.0 MB
Prunus_serotina-57.scafSeq.gz	2311bcf81ddfc8848a3703c759e9b136	34.5 MB
Prunus_virginiana-57.scafSeq.gz	60eaf407f5592e9d75c70d1ea6efdc94	48.8 MB
Reynoutria_japonica-57.scafSeq.gz	cae8aa41e2c58e9319deae48ff5d00db	43.1 MB
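
Each file in the listing above is accompanied by an MD5 checksum, which can be used to confirm that a download completed without corruption. The following is a minimal sketch using only the Python standard library. The helper names md5_of_file and matches_checksum, and the assumption that the file has been saved to the current working directory, are hypothetical and not part of the dataset.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_md5):
    """Compare a file's MD5 digest with the value given in the file listing."""
    return md5_of_file(path) == expected_md5.strip().lower()


if __name__ == "__main__":
    # Hypothetical local path; the checksum is copied from the file listing.
    local_copy = Path("Galium_tinctorium-57.scafSeq.gz")
    expected = "2859d9c599fdb2b883d5b5bd2d3708c4"
    if local_copy.exists():
        print(local_copy.name, "OK" if matches_checksum(local_copy, expected) else "MISMATCH")
```

Reading in chunks keeps memory use low for the larger archives, such as the 123.4 MB Lysimachia_ciliata file.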

Additional details

Cites: Dataset: https://www.ncbi.nlm.nih.gov/bioproject/?term=PRJNA422719 (URL)
Is cited by: Preprint: 10.1101/2020.03.31.018945 (DOI)

U.S. National Science Foundation
EAGER-NEON: Genomic Plasticity in Response to Variable Environments 1550838

