There is a newer version of the record available.

Published January 5, 2024 | Version 0.9
Dataset Open

The North Pacific Eukaryotic Gene Catalog: Raw assemblies from Gradients 1, 2 and 3

  • 1. University of Washington

Description

The North Pacific Subtropical Gyre is bordered on its northern side by the North Pacific Transition Zone, a latitudinal band of strong physical, chemical, and biological gradients and high productivity where warm, nutrient-depleted water from the subtropical gyre mixes with cold, nutrient-rich water from the north. The North Pacific Eukaryotic Gene Catalog consolidates eukaryotic metatranscriptome data from three latitudinal transects of the transition zone and one cruise in the subtropical gyre. Metatranscriptomes were gathered from latitudinally resolved surface samples and diel-resolved temporal studies, with samples taken in duplicate or triplicate and collected on 0.2-100 μm, 0.2-3 μm, and 3-100 μm or 3-200 μm size fractions. These metatranscriptome data were de novo assembled into 175 independent assemblies, totalling 182 million clustered transcript contigs. Assemblies were annotated by taxonomy and function and enumerated by short-read alignment. This catalog provides assembled environmental contigs, their translated peptide sequences, taxonomic and functional annotations, and read counts, with the aim of facilitating continued discoveries about the molecular ecology of microbial eukaryotes in the North Pacific.

This dataset repository is associated with a codebase and documentation repository:
https://github.com/armbrustlab/NPac_euk_gene_catalog
Please see this repository for additional data and project updates.

File contents: this repository contains five .tar.gz compressed tarballs of raw de novo Trinity assemblies of poly-A selected metatranscriptomes from the Gradients 1 through 3 cruises, and a plain-text file with the custom spike-in mRNA standards (CustomStandardSequences.txt).
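Because the tarballs are large, it can be useful to inspect one before unpacking it. The following is a minimal sketch using only Python's standard `tarfile` module. The internal layout of the archives is not described in this record, so the code makes no assumption about member names. The function name `list_archive_members` and the local path are illustrative, not part of the dataset or its codebase.

```python
import tarfile
from pathlib import Path


def list_archive_members(archive_path):
    """Return (name, size_in_bytes) for each regular file in a .tar.gz archive."""
    members = []
    # "r:gz" opens a gzip-compressed tarball for reading; nothing is extracted.
    with tarfile.open(archive_path, mode="r:gz") as archive:
        for member in archive:
            if member.isfile():
                members.append((member.name, member.size))
    return members


if __name__ == "__main__":
    # Assumed local path: the archive must already have been downloaded here.
    archive_path = Path("Gradients1.KOK1606.PA.assemblies.tar.gz")
    if archive_path.exists():
        for name, size in list_archive_members(archive_path):
            print(f"{size:>15,d}  {name}")
```

Listing a gzip-compressed archive still requires reading through it, so this may take a while on the multi-gigabyte files.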

Gradients1.KOK1606.PA.assemblies.tar.gz
- Link to G1PA project github page
- Simons CMAP cruise page and datasets: https://simonscmap.com/catalog/cruises/KOK1606
- Short read processing code: G1PA.process_short_reads.sh
- Trinity assembly code: G1PA.trinity_assemblies.sh

Gradients2.MGL1704.PA.assemblies.tar.gz
- Link to G2PA project github page
- Simons CMAP cruise page and datasets: https://simonscmap.com/catalog/cruises/MGL1704
- Short read processing code: G2PA.process_short_reads.sh
- Trinity assembly code: G2PA.trinity_assemblies.sh

G2_depth.MGL1704.PA.assemblies.tar.gz
- Link to G2PA project github page
- Simons CMAP cruise page and datasets: https://simonscmap.com/catalog/cruises/MGL1704
- Short read processing code: G2PA.RR_DCM.process_short_reads.sh
- Trinity assembly code: G2PA_DCM.trinity_assemblies.sh

Gradients3.KM1906.PA.assemblies.tar.gz
- Link to G3PA project github page
- Simons CMAP cruise page and datasets: https://simonscmap.com/catalog/cruises/KM1906
- Short read processing code: G3PA_UW.process_short_reads.sh
- Trinity assembly code: G3PA_UW.trinity_assemblies.sh

G3_diel.KM1906.PA.assemblies.tar.gz
- Link to G3PA project github page
- Simons CMAP cruise page and datasets: https://simonscmap.com/catalog/cruises/KM1906
- Short read processing code: G3PA_diel.process_short_reads.sh
- Trinity assembly code: G3PA_diel.trinity_assemblies.sh

CustomStandardSequences.txt
- Plain-text FASTA file with the spike-in standards used during mRNA extraction and sequencing prep
- Link to publication of spike-in standards methods: https://www.nature.com/articles/s41564-019-0507-5
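Since CustomStandardSequences.txt is described as a plain-text FASTA file, it can be read with a few lines of standard-library Python. The sketch below is one way to load it into a dictionary. The function name `read_fasta` is illustrative, and the header names of the standards are not listed in this record, so none are assumed.

```python
from pathlib import Path


def read_fasta(path):
    """Parse a FASTA file into a dict mapping header (without '>') to sequence."""
    records = {}
    header = None
    chunks = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                # A new header closes the previous record, if any.
                if header is not None:
                    records[header] = "".join(chunks)
                header = line[1:]
                chunks = []
            elif header is not None:
                # Sequence lines may be wrapped; join them per record.
                chunks.append(line)
    if header is not None:
        records[header] = "".join(chunks)
    return records


if __name__ == "__main__":
    # Assumed local path: the file must already have been downloaded here.
    standards_path = Path("CustomStandardSequences.txt")
    if standards_path.exists():
        for name, sequence in read_fasta(standards_path).items():
            print(f"{name}\t{len(sequence)} nt")
```

If two records share an identical header, the later one overwrites the earlier one in this sketch.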

The 2015 SCOPE Diel metatranscriptome raw assemblies have been released in a previous Zenodo repository, and are not included again in this deposition. We provide the links to the Diel1 resources here:
- Diel1 raw metatranscriptome assembly Zenodo repository: https://zenodo.org/records/5009803
- Dataset DOI: https://doi.org/10.5281/zenodo.5009803
- Associated publication: https://www.frontiersin.org/articles/10.3389/fmicb.2021.682651/full
- Codebase: https://github.com/armbrustlab/diel_eukaryotes
- Simons CMAP cruise page and datasets: https://simonscmap.com/catalog/cruises/KM1513
- Short read processing code: D1PA.process_short_reads.sh
- Trinity assembly code: D1PA.trinity_assemblies.sh



Files (36.9 GB)

MD5 checksum                            Size
md5:2cdf16373c80c4607938aca7f3bba938    13.0 kB
md5:8da993b6ad76d89889a61a102e7b1179    2.1 GB
md5:2bd622b69eb5d5b5d9aed53cf27d151e    5.3 GB
md5:6cf87325f0285025c5e21c83081c7c8c    9.7 GB
md5:757d4a2fd83ff93b0ed2ad94297fd86a    12.8 GB
md5:eadf9de8169060ad4ad90df8e0846039    7.0 GB
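The MD5 checksums listed above can be used to confirm that a download completed intact. The sketch below hashes a file in chunks so that the multi-gigabyte tarballs never need to fit in memory. The helper names `md5_of_file` and `matches_checksum` are illustrative. Pairing the 13.0 kB entry with CustomStandardSequences.txt is an assumption, based on it being the only small plain-text file in the deposit.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes at EOF.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's MD5 to an expected value, with or without an 'md5:' prefix."""
    expected = expected.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected


if __name__ == "__main__":
    # Assumption: the 13.0 kB entry in the listing is CustomStandardSequences.txt.
    expected = "md5:2cdf16373c80c4607938aca7f3bba938"
    path = Path("CustomStandardSequences.txt")
    if path.exists():
        print(path.name, "OK" if matches_checksum(path, expected) else "MISMATCH")
```

A mismatch usually indicates a truncated or corrupted download, in which case the file should be fetched again.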