Published October 16, 2024 | Version v1
Poster Open

Unlocking the Power of ED2 model on HPC cluster: A Singularity Container Approach

  • 1. National Center for Supercomputing Applications
  • 2. University of Illinois Urbana-Champaign
  • 3. Climate and Ecosystem Sciences Division, Lawrence Berkeley National Laboratory, Berkeley CA
  • 4. International Institute of Tropical Forestry, USDA Forest Service, Río Piedras, Puerto Rico
  • 5. Department of Organismic and Evolutionary Biology, Harvard University, Cambridge MA

Description

The Ecosystem Demography Biosphere Model (ED2[1]) is an open-source, comprehensive terrestrial biosphere model that integrates hydrology, land-surface biophysics, vegetation dynamics, and carbon biogeochemistry. Since its inception in 2001, researchers have used the model to study a variety of tropical and temperate ecosystems.

ED2 is challenging for new users: it is written in Fortran 90 and depends on the HDF5 libraries, so building it requires specific prior knowledge. Long simulations also demand substantial computational resources, which complicates deployment across computing platforms. Running on a High-Performance Computing (HPC) cluster further requires familiarity with cluster environments and with job submission, both of which take time to learn. To lower these barriers for the broader ecological community, we implemented a Singularity-based containerization of ED2 and developed a Jupyter Notebook to simplify job submission to HPC clusters.
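As a rough illustration of what submitting a containerized run involves, the sketch below generates a Slurm batch script that calls `singularity exec` on an image. It is not the poster's notebook code. The image file `ed2.sif`, the in-container binary name `ed2`, the `-f` namelist flag, the paths, and the partition name are all assumptions chosen for the example.

```python
def build_sbatch_script(image, namelist, run_dir, job_name="ed2-run",
                        walltime="24:00:00", partition=None):
    """Return the text of a Slurm batch script that runs ED2 in a container.

    Every argument is illustrative. `ed2` is a hypothetical name for the
    model binary inside the image, and `-f` is assumed to select the
    namelist file.
    """
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --time={walltime}",
        "#SBATCH --nodes=1",
        "#SBATCH --ntasks=1",
        # %j is expanded by Slurm to the job ID.
        f"#SBATCH --output={run_dir}/slurm-%j.out",
    ]
    if partition:
        lines.append(f"#SBATCH --partition={partition}")
    lines += [
        "",
        "# Bind the run directory so model output persists on the host.",
        f"singularity exec --bind {run_dir}:{run_dir} {image} ed2 -f {namelist}",
        "",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_sbatch_script(
        image="ed2.sif",                 # hypothetical image file
        namelist="/scratch/demo/ED2IN",  # hypothetical namelist path
        run_dir="/scratch/demo",         # hypothetical run directory
        partition="cpu",                 # hypothetical partition name
    ))
```

The generated text could be saved to a file and passed to `sbatch`. Keeping script generation in a plain function means a notebook cell can vary the walltime or partition per cluster without editing a template by hand.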

The containerized version of ED2 can be deployed on a range of HPC platforms. The container includes the ED2 binary and all required libraries, so no additional installation is needed and the model can "run everywhere." The Jupyter notebooks let users modify model configurations and run the model on local machines as well as on different HPC clusters, which simplifies communicating with the cluster and managing Slurm jobs. The notebooks also include features for periodically checking the status of running jobs on the cluster and for transferring output back to the local machine. We provide basic demo visualizations of the output; users can build further visualizations in Python or R as they prefer.
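The job status checking described above might look something like the following sketch, which polls `squeue` until a job reaches a terminal state or leaves the queue. This is an assumption about one reasonable design, not the notebook's actual implementation. The function names, the optional SSH host, and the polling interval are hypothetical.

```python
import subprocess
import time

# Slurm job states after which no further polling is needed.
TERMINAL_STATES = {"COMPLETED", "FAILED", "CANCELLED", "TIMEOUT",
                   "OUT_OF_MEMORY", "NODE_FAIL"}


def parse_state(squeue_output):
    """Return the job state from `squeue -h -o %T` output, or None if empty.

    An empty result usually means the job has already left the queue.
    """
    text = squeue_output.strip()
    return text.splitlines()[0].strip() if text else None


def query_state(job_id, host=None):
    """Ask Slurm for the state of one job, optionally over SSH."""
    cmd = ["squeue", "-h", "-j", str(job_id), "-o", "%T"]
    if host:
        cmd = ["ssh", host] + cmd  # host is a hypothetical login node
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return parse_state(result.stdout)


def wait_for_job(job_id, host=None, interval=60, query=query_state):
    """Poll until the job reaches a terminal state or leaves the queue."""
    while True:
        state = query(job_id, host)
        if state is None or state in TERMINAL_STATES:
            return state
        time.sleep(interval)
```

Once `wait_for_job` returns, the output directory could be copied back to the local machine with a tool such as `rsync` or `scp`. Passing the query function as a parameter keeps the polling loop testable on a machine without Slurm installed.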

We ran the containerized ED2 model on the University of Illinois Campus Cluster, the Stampede clusters at the Texas Advanced Computing Center (TACC), and NCSA Delta, analyzing 20 years of model simulations at multiple sites in tropical and temperate ecosystems. We are working on testing and adding configurations for additional clusters (such as the NASA HEC system). We held a workshop at the ED2 community meeting at Harvard in 2024, where the ED2 community showed significant interest in the combination of notebooks and containers.

 

Files

rse-2024-containerized-ED2.pdf
