Published April 22, 2023 | Version 1.0.0
Dataset | Open access

Genome-scale community modelling reveals key metabolic cross-feedings in epipelagic bacterioplankton communities (Supplementary Materials)

  • 1. Nantes Université, École Centrale Nantes, CNRS, LS2N, UMR 6004, F-44000 Nantes, France.

Description

This dataset is a catalog of 19,791 marine prokaryotic genomes, comprising whole-genome sequences of isolates (WGS), single-amplified genomes (SAGs) and metagenome-assembled genomes (MAGs). Of these, 14,472 well-documented genomes were compiled from MarRef v4.0 (N=943, mostly high-quality WGS), MarDB v4.0 (N=12,963), and the aquatic representative genomes of the ProGenomes database v1.0 (N=566). They were complemented by 5,319 MAGs from four studies: Parks et al. 2017 (DOI; N=1,765; downloaded from EBI), Tully et al. 2017/2018 (DOI and DOI; N=2,597; downloaded from EBI), and Delmont et al. 2018 (DOI; N=957; downloaded from Figshare). Because the Parks et al. study also includes genomes reconstructed from non-marine biomes, only the 1,765 genomes matching the keyword pattern “tara|marine|sea|ocean|mediterranean” (case insensitive) were retained. Note that, depending on their study of origin, the included MAGs may have been reconstructed with different assembly and binning methods.
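The keyword selection applied to the Parks et al. genomes can be illustrated with a short Python sketch. Only the pattern and its case insensitivity come from the description above; the metadata field that was searched is not stated, so the sample descriptions and the helper name `is_marine` are hypothetical.

```python
import re

# Pattern quoted in the description; matching is case insensitive.
KEYWORDS = re.compile(r"tara|marine|sea|ocean|mediterranean", re.IGNORECASE)


def is_marine(description: str) -> bool:
    """Return True if any keyword occurs anywhere in the text."""
    return KEYWORDS.search(description) is not None


# Hypothetical sample descriptions, for illustration only.
samples = {
    "genome_a": "Tara Oceans surface water metagenome",
    "genome_b": "Activated sludge bioreactor metagenome",
    "genome_c": "Mediterranean SEA sediment",
}

selected = [name for name, text in samples.items() if is_marine(text)]
print(selected)  # ['genome_a', 'genome_c']
```

Because this sketch matches plain substrings, a short keyword such as "sea" would also match unrelated words like "research". Whether the original selection used word boundaries is not documented here.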

The archive includes:

  • a metadata file, `EcoSysMic_metadata.tsv`, describing the quality and redundancy of the genomes
  • sequences of the 19,791 (redundant) genomes in `All/WGS`
  • companion files in `All/Data` and `dRep95/Data` (see Methods in the associated paper), including
    • predicted CDS and EggNOG functional annotations
    • predicted GTDB taxonomy
    • CarveMe-reconstructed metabolic models and their MEMOTE quality

The 7,658 non-redundant species-level genomes used in the associated paper (delineated by a 95% ANI threshold over 60% of genome length) are flagged in the `is_drep95` column of the metadata file.
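As a minimal sketch, the non-redundant subset could be extracted from the metadata file as follows. The file name and the `is_drep95` column come from the description. The `genome_id` column and the set of values treated as true are assumptions, since the encoding of the flag is not documented here.

```python
import csv
import io

# Assumption: the flag is stored as one of these strings (any letter case).
TRUTHY = {"1", "true", "t", "yes", "y"}


def select_drep95(handle):
    """Return the rows of a tab-separated table whose is_drep95 flag is set."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [row for row in reader if row["is_drep95"].strip().lower() in TRUTHY]


# Stand-in for EcoSysMic_metadata.tsv; "genome_id" is a hypothetical column.
demo = io.StringIO("genome_id\tis_drep95\nG1\tTrue\nG2\tFalse\nG3\tTrue\n")
representatives = select_drep95(demo)
print([row["genome_id"] for row in representatives])  # ['G1', 'G3']
```

With the real file, the handle would come from `open("EcoSysMic_metadata.tsv", newline="")`. If the encoding assumption holds, the selection should contain 7,658 rows.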

Files

Files (27.0 GB)

md5:57137a46b00441e06886675e78b9ff95 (27.0 GB)
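After download, the archive can be checked against the MD5 checksum listed above. The sketch below reads the file in chunks so that a 27.0 GB archive does not have to fit in memory. The archive file name is not given in this record, so the path passed to these hypothetical helpers is left to the reader.

```python
import hashlib

# Checksum as listed in the Files section of this record.
EXPECTED_MD5 = "57137a46b00441e06886675e78b9ff95"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading 1 MiB at a time."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path):
    """Return True if the file's MD5 digest equals the published checksum."""
    return md5_of_file(path) == EXPECTED_MD5
```

A mismatch usually indicates an incomplete or corrupted download, in which case the archive should be fetched again.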

Additional details

Funding

  • France-Génomique – Organisation et montée en puissance d'une Infrastructure Nationale de Génomique (ANR-10-INBS-0009), Agence Nationale de la Recherche
  • AtlantECO – Atlantic ECOsystems assessment, forecasting & sustainability (862923), European Commission
  • OCEANOMICS – Biotechnologies et bioressources pour la valorisation des écosystèmes marins planctoniques (ANR-11-BTBR-0008), Agence Nationale de la Recherche