There is a newer version of the record available.

Published January 1, 2023 | Version 1.0
Dataset Open

MarFERReT: an open-source, version-controlled reference library of marine microbial eukaryote functional genes

  • 1. University of Washington

Description

The emerging field of environmental metatranscriptomics generates large volumes of sequence data about actively transcribed genes in natural environments, and taxonomic annotation of these sequences depends on curated reference sequences. For marine microbial eukaryotes, current reference libraries are limited by gaps in sequenced organism diversity and by barriers to updating libraries with new sequence data; as a result, only approximately half of eukaryotic environmental transcripts can be annotated. Here, we introduce Marine Functional EukaRyotic Reference Taxa (MarFERReT), an updated marine microbial eukaryotic sequence library with version-controlled contents, designed for taxonomic annotation of eukaryotic metatranscriptomes. MarFERReT contains over 30 million protein sequences from 899 marine eukaryotic genomes and transcriptomes, covering 503 species and 323 genera. Continued expansion of MarFERReT as new reference sequences become available will enable up-to-date taxonomic annotations into the future.

Please see the MarFERReT GitHub repository for full code, documentation and updates:
https://github.com/armbrustlab/marferret

Files

MarFERReT.v1.metadata.csv

Files (6.1 GB)

MD5 checksum | Size
md5:66e56f4bf64f2e867f12a149815a4c50 | 228.7 kB
md5:8dfed13849cc2b97c4dafa6bb925c568 | 655.7 MB
md5:9df5de32ba5a62aed449634acf48d4a5 | 5.1 GB
md5:141e1071d80dcdbf031f86d149097d9c | 271.0 MB
md5:9305b80a83087e0e4ed344b24d8c3f31 | 99.6 MB
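
The MD5 values listed above can be used to confirm that a download completed without corruption. The following is a minimal sketch in Python using only the standard library; the function names `md5_of_file` and `matches_listed_checksum` are illustrative and not part of MarFERReT, and the record does not state which checksum belongs to which file name, so that pairing has to be taken from the record page itself.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks.

    Chunked reading keeps memory use small even for multi-gigabyte files.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a file's MD5 with a listed value such as 'md5:0123...'.

    The optional 'md5:' prefix is stripped and case is ignored.
    """
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, `matches_listed_checksum("path/to/downloaded_file", "md5:<value from the list>")` returns `True` only when the local file's digest equals the listed value; both arguments here are placeholders to be replaced with a real local path and the corresponding checksum.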