Red Sea SAR11 and Prochlorococcus Single-cell Genomes Reflect Globally Distributed Pangenomes
Creators
- 1. Atlantic Oceanographic and Meteorological Laboratory, National Oceanic and Atmospheric Administration
Description
The Red Sea is geographically isolated from the rest of the ocean and has a combination of high irradiance, high temperature, and high salinity that is unique in the global ocean; we therefore asked whether it harbors endemic gene content. We sequenced and assembled single-cell genomes of 21 SAR11 cells (subclades Ia, Ib, Id, II) and 5 Prochlorococcus cells (ecotype HLII) from the Red Sea and combined them with globally sourced reference genomes to cluster genes into ortholog groups (OGs) using OrthoMCL (version 2.0). The OrthoMCL configuration settings were percentMatchCutoff=50 and evalueExponentCutoff=-5. This yielded 5272 SAR11 OGs and 10439 Prochlorococcus OGs. This archive contains four files: the protein identifiers associated with each OG (proch_ortholog_groups.txt, sar11_ortholog_groups.txt) and the protein sequence for each protein identifier (proch_protein_sequences.fasta, sar11_protein_sequences.fasta).
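As a minimal sketch of how the two file types might be read together, the Python below parses an ortholog-group listing and a FASTA file into dictionaries. It assumes the group files follow the usual OrthoMCL layout of one group per line ("group_id: member member ...") and that each FASTA header begins with the protein identifier; neither is confirmed by this description, so check the files before relying on it. The function names and the identifiers in the sample data are hypothetical.

```python
from io import StringIO
from typing import Dict, Iterable, List


def parse_ortholog_groups(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Map each OG identifier to its protein identifiers.

    Assumes one group per line: "group_id: member member ...".
    """
    groups: Dict[str, List[str]] = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        group_id, _, members = line.partition(":")
        groups[group_id.strip()] = members.split()
    return groups


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Map each protein identifier (first token of the header) to its sequence."""
    sequences: Dict[str, str] = {}
    current_id = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Store the previous record before starting a new one.
            if current_id is not None:
                sequences[current_id] = "".join(chunks)
            header = line[1:].split()
            current_id = header[0] if header else ""
            chunks = []
        elif current_id is not None:
            chunks.append(line)
    if current_id is not None:
        sequences[current_id] = "".join(chunks)
    return sequences


# Hypothetical sample data, not taken from the archive.
GROUPS_TEXT = "OG_1: taxA|p1 taxB|p2\nOG_2: taxA|p3\n"
FASTA_TEXT = ">taxA|p1 example protein\nMKV\nLAA\n>taxB|p2\nMKI\n>taxA|p3\nMTT\n"

groups = parse_ortholog_groups(StringIO(GROUPS_TEXT))
sequences = parse_fasta(StringIO(FASTA_TEXT))

# Collect the sequences belonging to one ortholog group.
og1_sequences = [sequences[pid] for pid in groups["OG_1"]]
```

To use the real files, pass an open file handle instead of a StringIO object, for example `parse_ortholog_groups(open("sar11_ortholog_groups.txt"))`.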