Published July 14, 2021 | Version v1
Dataset Open

ZooBase: A global synthesis of marine zooplankton species occurrences.

  • 1. Environmental Physics, Institute of Biogeochemistry and Pollutant Dynamics, ETH Zürich, Universitätstrasse 16, 8092 Zürich, Switzerland.
  • 2. Dynamic Macroecology, Landscape Dynamics, Swiss Federal Research Institute WSL, 8903 Birmensdorf, Switzerland.


Description of the methods used to compile the present ZooBase dataset (extract from Section A.2 of the Methods of Benedetti et al., 2021).

A new dataset of global zooplankton species occurrences was compiled in a fashion comparable to that used for phytoplankton (Righetti et al., 2020). Prior to retrieving the occurrence data online, we first identified the taxonomic groups (Order/Class/Family) that comprise the bulk of extant oceanic zooplankton communities: Copelata (i.e. appendicularians), Ctenophora, Cubozoa (i.e. box jellyfish), Euphausiidae (i.e. krill), Foraminifera, Gymnosomata (i.e. sea angels, pteropods), Hydrozoa (i.e. jellyfish), Hyperiidea (i.e. amphipods), Myodocopina (i.e. ostracods), Mysidae (i.e. small pelagic shrimps resembling krill), Neocopepoda, Podonidae and Penilia avirostris (i.e. cladocerans), Sagittoidea (i.e. chaetognaths), Scyphozoa (i.e. jellyfish), Thaliacea (i.e. salps, doliolids and pyrosomes), Thecosomata (i.e. pteropods), and four families of pelagic Polychaeta (i.e. worms) that are often found in the zooplankton and whose species are known to display holoplanktonic life cycles (Tomopteridae, Alciopidae, Lopadorrhynchidae, Typhloscolecidae). The presence data associated with species belonging to these groups were retrieved from OBIS and GBIF between 12/04/2018 and 18/04/2018 using online queries via the R packages RPostgreSQL, robis and rgbif. Since the Neocopepoda infraclass comprises several thousand benthic and parasitic taxa, a preliminary selection of the non-parasitic planktonic species had to be carried out prior to downloading, using the species list of Razouls et al. as a reference. The spatial distributions of the groups cited above were first inspected using GBIF’s and OBIS’s online mapping tools to evaluate the potential number of overlapping observations between the two databases. Because of their relatively low contributions to total observations/diversity and the very high overlap between databases, the occurrences of Cladocera and Polychaeta were retrieved from OBIS only (which usually harbours more occurrences).
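The preliminary selection of planktonic copepods described above amounts to restricting a candidate name list to a reference list before any download. The following is a minimal sketch in base R, not the authors' code: the object names (`candidates`, `planktonic_ref`, `to_download`) are hypothetical, and the short vectors are illustrative placeholders standing in for the real candidate and reference lists.

```r
# Hypothetical candidate copepod names (illustrative placeholders only).
candidates <- c("Calanus finmarchicus", "Lepeophtheirus salmonis",
                "Oithona similis", "Caligus elongatus")

# Hypothetical excerpt standing in for a reference list of planktonic species.
planktonic_ref <- c("Calanus finmarchicus", "Oithona similis",
                    "Temora longicornis")

# Keep only the candidates that also appear in the reference list;
# intersect() preserves the order of its first argument.
to_download <- intersect(candidates, planktonic_ref)
print(to_download)
```

Only the names kept in `to_download` would then be passed on to the online queries.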
In addition to the data collected from OBIS and GBIF, the copepod occurrences from Cornils et al. (2018) and the pteropod occurrences from the MAREDAT initiative (Buitenhuis et al., 2013) were added to the dataset. We discarded records that: (i) had at least one missing spatial coordinate, (ii) were associated with an incomplete sampling date (d/m/y), (iii) were associated with a year of collection earlier than 1800, (iv) were not associated with any sampling depth, (v) were not identified down to the species level. Occurrences associated with grid cells shallower than 10 m were removed (bathymetry data from ETOPOv1 at a 15 min resolution, downloaded using the 'marmap' R package). Finally, every species name was carefully examined and compared to the taxonomic reference list of the World Register of Marine Species (WoRMS) for all taxa. The AphiaID and the Status were retrieved from WoRMS based on the ScientificName. To remove the duplicate occurrences due to the highly overlapping source archives (GBIF and OBIS), a unique occurrenceID was given to each record based on rounded spatial coordinates (closest 0.1°x0.1°), rounded depth layer (10 m depth layers), month and year of the occurrence and the accepted species name (e.g., AphiaID).
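The quality-control criteria (i) to (v) and the duplicate removal can be sketched in base R as follows. This is an illustration under stated assumptions, not the authors' implementation: the toy table, its column names (`aphia_id`, `rank`, `lon`, `lat`, `depth`, `day`, `month`, `year`) and the integer identifiers are hypothetical, depth layers are assumed to be 10 m bins obtained by rounding down, and the bathymetry filter is omitted.

```r
# Toy occurrence table; every value and column name is hypothetical,
# and the aphia_id values are placeholders, not real AphiaIDs.
occ <- data.frame(
  species  = c("Calanus finmarchicus", "Calanus finmarchicus",
               "Clio pyramidata", "Oithona", "Limacina helicina",
               "Salpa fusiformis", "Salpa fusiformis", "Clio pyramidata"),
  aphia_id = c(1, 1, 2, 3, 4, 5, 5, 2),
  rank     = c("Species", "Species", "Species", "Genus",
               "Species", "Species", "Species", "Species"),
  lon      = c(-20.12, -20.14, NA, 5.50, 10.20, 30.00, 30.00, 60.45),
  lat      = c(45.31, 45.33, 10.00, -30.20, 70.10, -40.00, -40.00, -12.08),
  depth    = c(12, 18, 50, 100, 5, NA, 20, 250),
  day      = c(3, 3, 1, 15, 20, 7, NA, 28),
  month    = c(5, 5, 6, 7, 8, 9, 9, 1),
  year     = c(2001, 2001, 1995, 2010, 1790, 1988, 1988, 2015),
  stringsAsFactors = FALSE
)

# Criteria (i) to (v): keep only complete, species-level records from 1800 on.
keep <- !is.na(occ$lon) & !is.na(occ$lat) &                      # (i)
  !is.na(occ$day) & !is.na(occ$month) & !is.na(occ$year) &       # (ii)
  occ$year >= 1800 &                                             # (iii)
  !is.na(occ$depth) &                                            # (iv)
  occ$rank == "Species"                                          # (v)
clean <- occ[keep, ]

# Build a key from coordinates rounded to 0.1 degree, a 10 m depth layer
# (assumed here to be the lower bound of the bin), month, year and taxon ID.
clean$occurrence_id <- paste(round(clean$lon, 1), round(clean$lat, 1),
                             floor(clean$depth / 10) * 10,
                             clean$month, clean$year, clean$aphia_id,
                             sep = "_")

# Keep the first record for each key.
dedup <- clean[!duplicated(clean$occurrence_id), ]
print(dedup[, c("species", "occurrence_id")])
```

In this toy table, the first two records fall in the same 0.1° cell, depth layer, month and year for the same taxon, so only the first is retained.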

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 862923. This output reflects only the author’s view, and the European Union cannot be held responsible for any use that may be made of the information contained therein.


# To read the .txt file in R, use: read.table("ZooBase_Benedetti_et_al._Nat.Comms.2021_final_Zenodo.txt", header = TRUE)



Additional details


AtlantECO – Atlantic ECOsystems assessment, forecasting & sustainability 862923
European Commission


  • Benedetti, F., Vogt, M., Hofmann-Elizondo, U., Righetti, D., Zimmermann, N.E., Gruber, N. Major restructuring of marine plankton assemblages under global warming. Nature Communications. 2021.