Sixty-one thousand Recent planktonic foraminifera from the Atlantic Ocean
- 1. Yale University
- 2. Swedish Museum of Natural History
- 3. University of Kansas
- 4. University of California, Berkeley
Here we provide an extensive image library of Recent microfossils (primarily planktonic foraminifera) from the Atlantic Ocean, with accompanying 2D and 3D coordinate data and morphometric measurements. These data were generated using high-throughput imaging methods (AutoMorph) developed in P.M. Hull's lab at Yale University.
The dataset consists of microfossils from 34 sediment samples, mounted on 155 micropaleontological slides, primarily from the North Atlantic Ocean (metadata_tables.tar.gz: Table 1). All slides are accessioned to the Yale Peabody Museum of Natural History (YPM) Division of Invertebrate Paleontology with unique YPM catalog numbers (metadata_tables.tar.gz: Table 2). Slides of microfossils were imaged at multiple focal heights (z-planes) using a light microscope with an automated stage and processed with the image processing models of AutoMorph as detailed in Table 2. AutoMorph software and tutorials can be accessed here: For an example of a raw slide scan see: Hsiang, Allison Y., Nealson, Kaylea, Elder, Leanne E., Liu, Yusu, & Hull, Pincelli M. (2016). Slide scan example for Automorph. Zenodo.
A total of 124,230 unique objects (primarily microfossils) were identified from the 155 slides of 34 sediment samples and were classified into 16 object categories (metadata_tables.tar.gz: Table 3). Object classification is provided in Table 3 and summarized in Table 4 (metadata_tables.tar.gz). Table 5 in metadata_tables.tar.gz provides a technical validation of the automated 2D morphometric measurements; comparable validation of the 3D morphometric measurements is provided in Hsiang et al. 2016.
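The relationship between the per-object classification (Table 3) and its summary (Table 4) can be sketched as a simple tally. This is a minimal sketch, assuming Table 3 has been exported as a CSV file with a header row. The column name `category` and the `(unclassified)` bucket are hypothetical placeholders, not names taken from the dataset, so adjust them to the actual header.

```python
import csv
from collections import Counter


def tally_categories(csv_path, category_field="category"):
    """Count objects per category in a CSV table.

    `category_field` is a hypothetical column name; rows whose category
    is empty or missing are counted under "(unclassified)".
    """
    counts = Counter()
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            label = (row.get(category_field) or "").strip()
            counts[label or "(unclassified)"] += 1
    return counts
```

Calling `tally_categories(path).most_common()` lists the categories from most to least frequent, which can be compared against the published summary table.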
Images and morphometric data are provided in 12 additional datasets:
1) slide_images.tar.gz contains one image for each slide scanned in this study (155 slides), with a red box around each object extracted using the AutoMorph segment module. Slides are named according to their YPM catalog number, and related sample and site information can be found in Tables 1 and 2 (metadata_tables.tar.gz) and in the YPM database by YPM catalog number.
2) edf_images.tar.gz contains the extended depth of focus images (EDF: a 2D image composite created from multiple z-stacked photographic images) generated by the AutoMorph focus module for each of the 124,230 individual objects identified by segment.
3 & 4) obj_zstacks_part1.tar.gz and obj_zstacks_part2.tar.gz contain the original z-stack images of each object. 2D outlines and shape measurements for each object are extracted using the AutoMorph module run2dmorph.
5) 2d_outline_check.tar.gz provides an overlay of the extracted 2D outline on the object EDF for quality control purposes for all extracted objects (113,847 objects) and a text file of all objects with failed 2D extractions (10,384 objects).
6) 2d_coordinates.tar.gz provides the 2D coordinates of each object in a single csv (all_coordinates.csv) and by slide (155 csv files named according to YPM catalog number), and a text file of all objects with failed 2D extractions (10,384 objects).
7) shape_measurements.csv contains the complete list of all objects in the dataset (124,230 objects) with the 2D and 3D measurements extracted by the AutoMorph routines run2dmorph and run3dmorph when available.
8 & 9) 3d_pdfs_part1.tar.gz and 3d_pdfs_part2.tar.gz provide 3D pdfs of each 3D object extracted (109,207 objects) for quality control purposes and a text file of all objects with failed 3D extractions (15,023 objects). 3D pdfs, meshes and shape measurements were generated by the AutoMorph module run3dmorph. Note that only some pdf viewers are able to display 3D pdfs properly.
10-12) 3d_obj_files_part1.tar.gz, 3d_obj_files_part2.tar.gz, and 3d_obj_files_part3.tar.gz provide the 3D mesh coordinates as obj files for each extracted object and a text file of all objects with failed 3D extractions (15,023 objects).
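Because several of the archives listed above are tens of gigabytes in size, it can be convenient to read a single CSV file out of a .tar.gz without unpacking the whole archive to disk. The following is a minimal sketch using only the Python standard library. The internal directory layout of the archives is not described here, so the member is located by its file name alone; the function names are illustrative and not part of AutoMorph.

```python
import csv
import io
import tarfile


def find_member(archive_path, basename):
    """Return the name of the first regular file in the archive whose
    file name equals `basename`, or None if there is no such member."""
    with tarfile.open(archive_path, "r:gz") as archive:
        for member in archive:
            if member.isfile() and member.name.rsplit("/", 1)[-1] == basename:
                return member.name
    return None


def read_csv_from_archive(archive_path, member_name, max_rows=None):
    """Read rows of a CSV stored inside a .tar.gz as dictionaries.

    Assumes the CSV has a header row and UTF-8 encoding (an assumption,
    not something stated for this dataset). `max_rows` limits how many
    rows are read, which is useful for a first look at a large file.
    """
    rows = []
    with tarfile.open(archive_path, "r:gz") as archive:
        handle = archive.extractfile(member_name)
        if handle is None:
            raise ValueError(f"{member_name} is not a regular file")
        text = io.TextIOWrapper(handle, encoding="utf-8", newline="")
        for index, row in enumerate(csv.DictReader(text)):
            if max_rows is not None and index >= max_rows:
                break
            rows.append(row)
    return rows
```

For example, `find_member("2d_coordinates.tar.gz", "all_coordinates.csv")` followed by `read_csv_from_archive(..., max_rows=5)` would show the header and first few rows. Note that a gzip-compressed tar file has to be scanned sequentially, so locating a member near the end of a large archive still takes time.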
This dataset accompanies the manuscript "Sixty-one thousand Recent planktonic foraminifera from the Atlantic Ocean" submitted to Scientific Data. The manuscript describes important details related to data collection and usage and should be consulted before using the data provided here. One key note about this dataset is repeated here as a precaution: we provide image classifications for only 4/5 of the complete dataset. A random subset (1/5 of the classifications) is excluded as a test set, so that this image database can be used in machine learning.
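When preparing training data, the objects with a published classification need to be separated from those whose classification is withheld. The sketch below shows one way to do this, assuming (hypothetically) that withheld objects appear in the table with an empty classification field. How withheld objects are actually marked should be checked against the manuscript, and the column name `category` is a placeholder.

```python
def split_by_label(rows, category_field="category"):
    """Partition table rows into (labelled, withheld).

    `rows` is an iterable of dictionaries, e.g. from csv.DictReader.
    A row counts as withheld when its category is empty or missing;
    this marker is an assumption, not documented behaviour.
    """
    labelled, withheld = [], []
    for row in rows:
        if (row.get(category_field) or "").strip():
            labelled.append(row)
        else:
            withheld.append(row)
    return labelled, withheld


def withheld_fraction(labelled, withheld):
    """Return the share of rows without a label (0.0 for an empty table)."""
    total = len(labelled) + len(withheld)
    return len(withheld) / total if total else 0.0
```

If the assumption about the marker holds, `withheld_fraction` should come out close to 1/5 for the full table, which gives a quick sanity check before training.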
(300.4 GB)
Size |
467.7 MB |
13.1 GB |
44.2 GB |
38.7 GB |
31.2 GB |
46.1 GB |
37.3 GB |
8.4 GB |
2.5 MB |
41.2 GB |
33.3 GB |
25.4 MB |
6.4 GB |
Additional details
Related works
- Has part
- 10.5281/zenodo.167557 (DOI)